Big Data Analysis with Python
Latest chapter:
Chapter 08: Creating a Full Analysis Report
Processing big data in real time is challenging due to scalability, information inconsistency, and fault tolerance. Big Data Analysis with Python teaches you how to use tools that can control this data avalanche for you. With this book, you'll learn practical techniques to aggregate data into useful dimensions for posterior analysis, extract statistical measurements, and transform datasets into features for other systems. The book begins with an introduction to data manipulation in Python using pandas. You'll then get familiar with statistical analysis and plotting techniques. With multiple hands-on activities in store, you'll be able to analyze data that is distributed on several computers by using Dask. As you progress, you'll study how to aggregate data for plots when the entire data cannot be accommodated in memory. You'll also explore Hadoop (HDFS and YARN), which will help you tackle larger datasets. The book also covers Spark and explains how it interacts with other tools. By the end of this book, you'll be able to bootstrap your own Python environment, process large files, and manipulate data to generate statistics, metrics, and graphs.
Table of contents (73 chapters)
- Cover
- Copyright page
- Preface
- Chapter 1 The Python Data Science Stack
- Introduction
- Python Libraries and Packages
- Using Pandas
- Data Type Conversion
- Aggregation and Grouping
- Exporting Data from Pandas
- Visualization with Pandas
- Summary
- Chapter 2 Statistical Visualizations
- Introduction
- Types of Graphs and When to Use Them
- Components of a Graph
- Seaborn
- Which Tool Should Be Used?
- Types of Graphs
- Pandas DataFrames and Grouped Data
- Changing Plot Design: Modifying Graph Components
- Exporting Graphs
- Summary
- Chapter 3 Working with Big Data Frameworks
- Introduction
- Hadoop
- Spark
- Writing Parquet Files
- Handling Unstructured Data
- Summary
- Chapter 4 Diving Deeper with Spark
- Introduction
- Getting Started with Spark DataFrames
- Writing Output from Spark DataFrames
- Exploring Spark DataFrames
- Data Manipulation with Spark DataFrames
- Graphs in Spark
- Summary
- Chapter 5 Handling Missing Values and Correlation Analysis
- Introduction
- Setting up the Jupyter Notebook
- Missing Values
- Handling Missing Values in Spark DataFrames
- Correlation
- Summary
- Chapter 6 Exploratory Data Analysis
- Introduction
- Defining a Business Problem
- Translating a Business Problem into Measurable Metrics and Exploratory Data Analysis (EDA)
- Structured Approach to the Data Science Project Life Cycle
- Summary
- Chapter 7 Reproducibility in Big Data Analysis
- Introduction
- Reproducibility with Jupyter Notebooks
- Gathering Data in a Reproducible Way
- Code Practices and Standards
- Avoiding Repetition
- Summary
- Chapter 8 Creating a Full Analysis Report
- Introduction
- Reading Data in Spark from Different Data Sources
- SQL Operations on a Spark DataFrame
- Generating Statistical Measurements
- Summary
- Appendix
- Chapter 01: The Python Data Science Stack
- Chapter 02: Statistical Visualizations Using Matplotlib and Seaborn
- Chapter 03: Working with Big Data Frameworks
- Chapter 04: Diving Deeper with Spark
- Chapter 05: Missing Value Handling and Correlation Analysis in Spark
- Chapter 06: Business Process Definition and Exploratory Data Analysis
- Chapter 07: Reproducibility in Big Data Analysis
- Chapter 08: Creating a Full Analysis Report
Updated: 2021-06-11 13:46:55