Introduction to HaLoop
xiafei.qiu@PCA
How Hadoop works
PageRank in Hadoop
PageRank in Hadoop
Differences
Loop-aware Scheduling
  • Place map and reduce tasks that occur in different iterations but access the same data on the same physical machines.
Scheduling Algorithm
  • The number of reduce tasks should be invariant across iterations, so that the hash function assigning mapper outputs to reducer nodes remains unchanged.
  • The master node maintains a mapping from each slave node to the data partitions that this node processed in the previous iteration.
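The bookkeeping described in the scheduling slides can be sketched in a few lines. This is a minimal illustration, not HaLoop's actual scheduler: the class `LoopAwareScheduler`, its methods, and the node names are hypothetical, and the sketch ignores load limits and node failures, which a real scheduler would have to handle.

```python
from collections import defaultdict


class LoopAwareScheduler:
    """Toy master-side bookkeeping; all names here are hypothetical."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # node -> partitions it processed in the previous iteration
        self.previous = defaultdict(set)
        # node -> partitions assigned so far in the current iteration
        self.current = defaultdict(set)

    def assign(self, partition):
        """Prefer the node that processed this partition last iteration."""
        for node in self.nodes:
            if partition in self.previous[node]:
                self.current[node].add(partition)
                return node
        # No previous owner (e.g. first iteration): pick the least-loaded node.
        node = min(self.nodes, key=lambda n: len(self.current[n]))
        self.current[node].add(partition)
        return node

    def end_iteration(self):
        """Roll the current mapping over so the next iteration can reuse it."""
        self.previous = self.current
        self.current = defaultdict(set)


scheduler = LoopAwareScheduler(["node-a", "node-b"])
first = {p: scheduler.assign(p) for p in range(4)}
scheduler.end_iteration()
second = {p: scheduler.assign(p) for p in range(4)}
```

Here `first` and `second` hold the same partition-to-node mapping, so any data a node cached for a partition in the first iteration is local again in the second.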
Caches
Reducer Input Cache
  • The same key must be hashed to the same reducer in every iteration.
  • The partition function f must be deterministic, remain the same across iterations, and take the tuple t as its only input.
  • The number of reducers remains unchanged.
Reducer Output Cache
  • If two Reduce function calls produce the same output key from two different reducer input keys, both reducer input keys must be in the same partition so that they are sent to the same reduce task.
Mapper Input Cache
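The cache conditions above can be made concrete with a small check. The following is a sketch under stated assumptions: `NUM_REDUCERS`, `partition`, `output_cache_safe`, and the two example reducers are hypothetical names, and the byte-sum hash is a toy stand-in for a real partitioner. It shows a partition function f that depends only on the tuple t, and a test of the reducer output cache condition for a given reduce function.

```python
NUM_REDUCERS = 4  # assumed value; must not change between iterations


def partition(t, num_reducers=NUM_REDUCERS):
    """Toy partition function f: deterministic and dependent only on t."""
    key, _value = t
    # Byte sum instead of built-in hash(): str hashes are salted per process
    # by default, which would break determinism across runs.
    return sum(key.encode("utf-8")) % num_reducers


def output_cache_safe(reduce_fn, grouped_input, f=partition):
    """True if reduce calls emitting the same output key share a partition."""
    seen = {}  # output key -> partition of the input key that produced it
    for in_key, values in grouped_input.items():
        p = f((in_key, None))
        for out_key, _out_value in reduce_fn(in_key, values):
            if seen.setdefault(out_key, p) != p:
                return False
    return True


def sum_per_key(key, values):
    # Output key equals input key, so the condition holds trivially.
    yield key, sum(values)


def sum_to_single_key(key, values):
    # Every input key emits the same output key "total".
    yield "total", sum(values)


grouped = {"a": [1, 2], "b": [3]}
per_key_ok = output_cache_safe(sum_per_key, grouped)
single_key_ok = output_cache_safe(sum_to_single_key, grouped)
```

With this toy hash, "a" and "b" land in different partitions, so `sum_per_key` passes the check while `sum_to_single_key` fails it: two reduce tasks would each hold a partial value for "total", and neither cached output would be complete.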
Inspirations
