Apache Hadoop 3 Quick Start Guide
Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across a cluster of machines, instead of relying on one large computer to store and process the data. This book will get you started with the Hadoop ecosystem and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo-distributed Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how a parallel programming paradigm such as MapReduce can solve many complex data processing problems. The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring. You will then learn about the Hadoop ecosystem and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real-time streaming using Apache Storm and data analytics using Apache Spark. By the end of the book, you will be well versed in the different configurations of a Hadoop 3 cluster.
Approx. 40,000 words