This article explores the main functionality of a distributed file system and shows how it differs from the traditional file systems found on our computers today. It also examines why a distributed file system is needed and how it empowers the Big Data concept.
The origins of Apache Hadoop date back to 2003. Hadoop is used to develop and run distributed applications that process extremely large amounts of data. It stores data and runs applications on clusters of commodity hardware, providing massive storage for any kind of data, enormous processing power, and the capacity to handle a virtually unlimited number of concurrent tasks or jobs. It was designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the Apache Hadoop software library is designed to detect and handle failures at the application layer.
This article is an introduction to Hadoop. It will help you understand what Hadoop actually is, delving briefly into its history and different versions. For the uninitiated, it also looks at the high-level architecture of Hadoop and its main modules.