
What this book covers

Chapter 1, Jupyter and Data Science, covers the details of the Jupyter user interface: what objects it works with and what actions you can take in Jupyter. We'll see what the display tells us about the data, what tools are available, and some real-life examples from the industry showing R and Python coding. We will also see some of the ways to share our notebook with other users and, correspondingly, how to protect our notebook with different security mechanisms.

Chapter 2, Working with Analytical Data in Jupyter, covers using Python to scrape a website to gather data for analysis. Then we use Python NumPy, pandas, and SciPy functions for in-depth computations of results. The chapter goes further into pandas and explores manipulating data frames. Lastly, it shows examples of sorting and filtering data frames.

Chapter 3, Data Visualization and Prediction, demonstrates prediction models from Python and R under Jupyter. It then uses Matplotlib for data visualization and interactive plotting (under Python), and covers several graphing techniques available in Jupyter as well as density maps with SciPy. We use histograms to visualize social data. Lastly, we generate a 3D plot in Jupyter.

Chapter 4, Data Mining and SQL Queries, covers Spark Context. We show examples of using Hadoop map/reduce and use SQL with Spark data. Then we combine data frames, operate on the resulting set, import JSON data, and manipulate it with Spark. Lastly, we look at using a pivot to gather information about a data frame.

Chapter 5, R on Jupyter, covers setting up R to be one of the engines available for a notebook. Then we use some rudimentary R to analyze voter demographics for a presidential election and trends in college admissions. Finally, we look at using a predictive model to determine whether flights will be delayed.

Chapter 6, Data Wrangling, teaches reading in CSV files and performing some quick analysis of the data, including visualizations to help understand the data. Next, we consider some of the functions available in the dplyr package. We also use piping to pass the results of one operation more easily into another. Lastly, we look into using the tidyr package to tidy up our data.

Chapter 7, Jupyter Dashboards, covers visualizing data graphically using glyphs to emphasize important aspects of the data. We use markdown to annotate a notebook page and Shiny to generate an interactive application. We show a way to host notebooks outside of Jupyter.

Chapter 8, Statistical Modeling, teaches converting a JSON file to a CSV file. We evaluate the Yelp cuisine review dataset, determining the top-rated and most-rated firms. We use Python to perform a similar evaluation of Yelp business ratings, finding very similar distributions of the data.

Chapter 9, Machine Learning Using Jupyter, compares and contrasts several machine learning algorithms in both R and Python. We use naive Bayes to determine how the data might be used. We apply nearest neighbor in a couple of different ways to see results. We also use decision trees to come up with an algorithm for predictions and a neural net to explain housing prices. Finally, we use a random forest algorithm to do the same.

Chapter 10, Optimizing Jupyter Notebooks, covers deploying your notebook so that others can access it. It shows optimizations you can make to increase your notebook's performance. Then we look at securing the notebook and the mechanisms for sharing it.
