140222 how to be a creative parent (for slideshare) by Annita Mau
To make a better world to live in, we need more people who can cope with difficulties and crises, so we need to raise more kids who can stand up to challenges. The first step is to convince parents not to help their kids with everything, and what is more useful than training them to become creative parents? A talk to 90 parents, most barely over 30, at a kindergarten in Tai Po on 22 Feb 2014.
- MapReduce is a programming model for processing large datasets in a distributed manner across clusters of machines. It handles parallelization, load balancing, and hardware failures automatically.
- In MapReduce, the input data is mapped to intermediate key-value pairs, shuffled and sorted by the keys, then reduced to produce the final output. This pattern applies to many large-scale computing problems.
- Google uses MapReduce for tasks like generating map tiles, processing web search logs, and more. It hides the complex distributed systems details from programmers and provides robustness and scalability.
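The map, shuffle/sort, and reduce phases described above can be sketched in a few lines of plain Python. This is a minimal single-process illustration of the pattern (word counting, the canonical example), not how Hadoop or Google's implementation works internally; a real framework runs each phase in parallel across machines and handles failures for you.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit intermediate (key, value) pairs -- here (word, 1)."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle/sort: group all intermediate values by key, sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reduce_phase(groups):
    """Reduce: fold each key's list of values into a final result."""
    return {key: sum(values) for key, values in groups}

docs = ["the cat sat", "the dog sat"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
# counts == {"cat": 1, "dog": 1, "sat": 2, "the": 2}
```

Because the map and reduce functions are pure and operate on independent keys, the framework is free to partition the work across any number of machines, which is what makes the pattern scale.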
This document summarizes Doug Cutting's presentation on using Hadoop for scalable web crawling and indexing with the Nutch project. It describes how Nutch algorithms like crawling, parsing, link inversion, and indexing were converted to MapReduce jobs that can scale to billions of web pages. The document outlines the key Nutch algorithms and how they were adapted to the Hadoop framework using MapReduce.
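Of the Nutch algorithms listed, link inversion fits the MapReduce shape especially naturally. The sketch below is a hypothetical single-process illustration of the idea, not Nutch's actual code: the map step emits an inverted edge (target, source) for every outlink, so after the shuffle the reduce step receives, for each page, the full list of pages that link to it.

```python
from collections import defaultdict

def invert_map(page_url, outlinks):
    """Map: for each outlink, emit the inverted edge (target, source)."""
    for target in outlinks:
        yield (target, page_url)

def invert_reduce(target, sources):
    """Reduce: collect all inlink sources for one target page."""
    return (target, sorted(sources))

# Tiny crawl graph for illustration: page -> list of outlinks
crawl = {
    "a.com": ["b.com", "c.com"],
    "b.com": ["c.com"],
}

# Simulate the shuffle phase: group inverted edges by target key.
shuffled = defaultdict(list)
for page, links in crawl.items():
    for key, value in invert_map(page, links):
        shuffled[key].append(value)

inlinks = dict(invert_reduce(t, s) for t, s in shuffled.items())
# inlinks == {"b.com": ["a.com"], "c.com": ["a.com", "b.com"]}
```

At web scale the same two functions run unchanged over billions of pages, since each target's inlink list can be reduced independently of every other.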