Modeling

In the modeling stage, you formalize the discoveries made during exploration into an explicit description of the steps and data structures required to extract the desired meaning from your data. This is the model: a combination of data structures and the code that takes you from the raw data to your information and conclusions.

The modeling process is iterative: through exploration of the data, you select the variables required to support your analysis, organize them as input to analytical processes, execute the model, and determine how well the model supports your original assumptions. It can include a formal modeling of the structure of the data, but it can also combine techniques from various analytic domains, including (but not limited to) statistics, machine learning, and operations research.
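One pass through that loop might look like the following minimal sketch. The column names (`ad_spend`, `units_sold`, `store_id`) and the values are invented for illustration, and the straight-line fit via NumPy's `polyfit` is only a stand-in for whatever analytical process your own problem calls for.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data; names and values are made up for this sketch.
raw = pd.DataFrame({
    "ad_spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "units_sold": [12.0, 19.0, 31.0, 42.0, 48.0],
    "store_id": ["a", "b", "c", "d", "e"],
})

# 1. Select the variables the analysis needs.
model_input = raw[["ad_spend", "units_sold"]]

# 2. Organize them for the analytical step (here: drop incomplete rows).
model_input = model_input.dropna()
x = model_input["ad_spend"].to_numpy()
y = model_input["units_sold"].to_numpy()

# 3. Execute the model: a least-squares straight line (an assumed, simple model).
slope, intercept = np.polyfit(x, y, deg=1)

# 4. Determine how well the model supports the assumption of a linear relationship.
predicted = slope * x + intercept
ss_res = ((y - predicted) ** 2).sum()
ss_tot = ((y - y.mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```

If the fit is poor, you would return to step 1 with different variables or a different model, which is what makes the process iterative.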

To facilitate this, pandas provides extensive data modeling facilities. It is in this step that you move from exploring your data to formalizing the data model in DataFrame objects and ensuring that the processes that create these models are succinct. Additionally, because pandas is based in Python, you can use the full power of the language to write programs that automate the process from beginning to end. The models you create are executable.

From an analytic perspective, pandas provides several capabilities, most notably integrated support for descriptive statistics, which can get you to your goal for many types of problems. And because pandas is Python-based, if you need more advanced analytic capabilities, it is very easy to integrate with other parts of the extensive Python scientific environment.
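As a small illustration of the built-in descriptive statistics, the sketch below summarizes a tiny DataFrame. The columns `height_cm` and `weight_kg` and their values are hypothetical; `describe()` and `corr()` are standard pandas methods.

```python
import pandas as pd

# Hypothetical measurements, invented for this sketch.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "weight_kg": [50.0, 60.0, 70.0, 80.0],
})

# Count, mean, standard deviation, min, quartiles, and max for each numeric column.
summary = df.describe()
print(summary)

# Pearson correlation between the two columns.
corr = df["height_cm"].corr(df["weight_kg"])
print(corr)
```

For many problems, summaries like these are enough to confirm or reject an initial assumption before reaching for a more specialized library.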
