
Reading the data – variations and examples

Before we delve deeper into the realm of data, let us familiarize ourselves with a few terms that will appear frequently from now on.

Data frames

A data frame is the most commonly used object in pandas, Python's data analysis library. It closely resembles a table in a spreadsheet or a SQL database. Structurally, it can be thought of as a dictionary of Series objects that share a common index. Like a spreadsheet, a data frame has index labels (which identify rows) and column labels (which identify columns). It is a 2D structure whose columns may hold the same or different types. Most standard operations that can be applied to a spreadsheet or a SQL table, such as aggregation, filtering, and pivoting, can be applied to data frames using pandas methods.
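As a minimal sketch of the "dictionary of Series" idea, the snippet below builds a small data frame and applies a filter and an aggregation to it. The column names and values are hypothetical, invented purely for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
names = pd.Series(["Ann", "Bob", "Cid"])
ages = pd.Series([34, 28, 45])

# A data frame built from a dictionary of Series:
# the dictionary keys become the column labels.
df = pd.DataFrame({"name": names, "age": ages})

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # each column keeps its own type

# Filtering, much like filtering rows in a spreadsheet.
over_30 = df[df["age"] > 30]

# Aggregation, comparable to AVG() in SQL.
mean_age = df["age"].mean()
```

Because no index was supplied, pandas assigns the default integer index labels 0, 1, and 2 to the rows.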

The following figure illustrates a data frame. We will learn more about working with data frames as the chapter progresses:

Fig. 2.1 A data frame

Delimiters

A delimiter is a special character that separates the columns of a dataset from one another. The most common delimiter, and effectively the default, is the comma (,). A .csv file is so named because it contains comma-separated values. However, a dataset can use any special character as its delimiter, and one needs to know how to handle such cases in order to carry out a thorough exploratory analysis and build a robust predictive model. Later in this chapter, we will learn how to do that.
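To make the idea concrete, here is a small sketch, assuming pandas is used for reading, that parses the same hypothetical records once with a comma delimiter and once with a pipe (|) delimiter. The in-memory strings stand in for files on disk and are not taken from any dataset used in this book.

```python
import io
import pandas as pd

# Hypothetical in-memory text standing in for files on disk.
comma_text = "name,age\nAnn,34\nBob,28\n"
pipe_text = "name|age\nAnn|34\nBob|28\n"

# The comma is the default delimiter of read_csv.
df_comma = pd.read_csv(io.StringIO(comma_text))

# Any other delimiter is passed through the sep parameter.
df_pipe = pd.read_csv(io.StringIO(pipe_text), sep="|")

# Both calls should yield the same two-column data frame.
print(df_comma.equals(df_pipe))
```

If the wrong delimiter is given, pandas typically reads each line as a single column, which is a quick sign that the sep value needs to change.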
