- Jupyter for Data Science
- Dan Toomey
- 2021-07-08 09:22:28
What this book covers
Chapter 1, Jupyter and Data Science, covers the details of the Jupyter user interface: what objects it works with and what actions it supports. We'll see what the display tells us about the data, what tools are available, and some real-world industry examples coded in R and Python. We will also see some of the ways to share our notebook with other users and, correspondingly, how to protect our notebook with different security mechanisms.
Chapter 2, Working with Analytical Data in Jupyter, covers using Python to scrape a website to gather data for analysis. Then we use Python NumPy, pandas, and SciPy functions for in-depth computations of results. The chapter goes further into pandas and explores manipulating data frames. Lastly, it shows examples of sorting and filtering data frames.
Chapter 3, Data Visualization and Prediction, demonstrates prediction models in Python and R under Jupyter. It then uses Matplotlib for data visualization and interactive plotting (under Python), and covers several graphing techniques available in Jupyter, along with density maps using SciPy. We use histograms to visualize social data. Lastly, we generate a 3D plot in Jupyter.
Chapter 4, Data Mining and SQL Queries, covers the Spark context. We show examples of using Hadoop MapReduce and use SQL with Spark data. Then we combine data frames, operate on the resulting set, import JSON data, and manipulate it with Spark. Lastly, we look at using a pivot to gather information about a data frame.
Chapter 5, R on Jupyter, covers setting up R to be one of the engines available for a notebook. Then we use some rudimentary R to analyze voter demographics for a presidential election and trends in college admissions. Finally, we look at using a predictive model to determine whether flights would be delayed.
Chapter 6, Data Wrangling, teaches reading in CSV files and performing some quick analysis of the data, including visualizations to help understand the data. Next, we consider some of the functions available in the dplyr package. We also use piping to more easily transfer the results of one operation into another operation. Lastly, we look into using the tidyr package to clean up or tidy up our data.
Chapter 7, Jupyter Dashboards, covers visualizing data graphically using glyphs to emphasize important aspects of the data. We use markdown to annotate a notebook page and Shiny to generate an interactive application. We show a way to host notebooks outside of Jupyter.
Chapter 8, Statistical Modeling, teaches converting a JSON file to a CSV file. We evaluate the Yelp cuisine review dataset, determining the top-rated and most-rated firms. We use Python to perform a similar evaluation of Yelp business ratings, finding very similar distributions of the data.
Chapter 9, Machine Learning Using Jupyter, covers several machine learning algorithms in both R and Python to compare and contrast. We use naive Bayes to determine how the data might be used. We apply nearest neighbor in a couple of different ways to see results. We also use decision trees to come up with an algorithm for predictions and a neural net to explain housing prices. Finally, we use a random forest algorithm to do the same.
Chapter 10, Optimizing Jupyter Notebooks, covers deploying your notebook so that others can access it. It shows optimizations you can make to increase your notebook's performance. Then we look at securing the notebook and the mechanisms for sharing it.