The Future of Hadoop
Now let’s get to the Hadoop ecosystem. Here’s my summary of what I think happened there, and then I’d like you to talk about the three points you wanted to make. Hadoop arose out of the need to process huge amounts of data. Google had been doing it and wrote a paper about a programming model called MapReduce that worked well for processing data at that scale. And the people at Yahoo, Doug Cutting and others, created this infrastructure called Hadoop to enable large amounts of data to be processed.
Now, initially the use case was indexing the web, but as more and more data sources became available, it quickly became clear that lots of other people had this big data problem, and Hadoop was an open source platform that let you handle it. The most valuable thing it did was create a file system, HDFS, that let you store massive amounts of data on commodity hardware, much more cheaply than any other method. Then on top of that, they created various generations of programming environments that let you write programs to sift through all of the data in parallel using the MapReduce paradigm, where a map step processes each piece of the data and produces an intermediate result.
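To make the MapReduce idea concrete, here is a minimal single-machine sketch in Python of the classic word count. It is not Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are invented for illustration, and a real cluster would run the map and reduce steps in parallel across many machines rather than in one loop.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit an intermediate (word, 1) pair for every word."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as the framework would between steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: combine the values collected for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map over every document, then shuffle and reduce the results."""
    intermediate = []
    for doc in documents:  # on a cluster, each document could go to a different node
        intermediate.extend(map_phase(doc))
    return reduce_phase(shuffle(intermediate))


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

The list of `(word, 1)` pairs is the intermediate result; the reduce step only ever sees values that share a key, which is what lets the work be split across machines.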