
  • Java Data Analysis
  • John R. Hubbard
  • 212 words
  • 2021-07-02 18:21:46

Summary

This chapter discussed various organizational processes used to prepare data for analysis. When used in computer programs, each data value is assigned a data type, which characterizes the data and determines which operations can be performed on it.

When stored in a relational database, data is organized into tables, in which each row corresponds to one data point and all the data in each column corresponds to a single field of a specified type. The key field or fields have unique values, which allows indexed searching.

A related approach organizes data into key-value pairs. As with the key fields of a relational database table, the keys must be unique. A hash table implements the key-value paradigm with a hash function that determines where each key's associated data is stored.
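The key-value idea can be illustrated with a minimal Java sketch. It uses the standard java.util.HashMap to show that a key maps to at most one value, and a hypothetical helper, bucketIndex, to show one simple way a hash function could pick a storage slot. The class name, the helper, and the sample data are assumptions made for illustration; HashMap's actual internal index computation differs in detail.

```java
import java.util.HashMap;
import java.util.Map;

class KeyValueSketch {
    // Illustrative only: maps a key to a slot in a table with the given
    // number of buckets. floorMod keeps the result non-negative even when
    // hashCode() is negative.
    static int bucketIndex(String key, int buckets) {
        return Math.floorMod(key.hashCode(), buckets);
    }

    // Builds a small map; the second put on "DE" replaces the first value,
    // so the key stays unique.
    static Map<String, String> capitals() {
        Map<String, String> map = new HashMap<>();
        map.put("FR", "Paris");
        map.put("DE", "Bonn");
        map.put("DE", "Berlin"); // same key: the old value is overwritten
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = capitals();
        System.out.println(map.size() + " entries, DE -> " + map.get("DE"));
        System.out.println("slot for \"DE\" in a 16-bucket table: "
                + bucketIndex("DE", 16));
    }
}
```

Because the index is computed directly from the key, a lookup does not need to scan the other entries.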

Data files are formatted according to the specification of their file type. The comma-separated values (CSV) format is one of the most common. Common structured data file types include XML and JSON.
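As a rough sketch of what reading CSV involves, the following Java fragment splits a single line into fields. It assumes the simplest case, with no quoted fields and no commas embedded in values; real CSV files often need a proper parser. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CsvSketch {
    // Splits one CSV line into trimmed fields.
    // Assumption: no quoted fields and no commas inside values.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        for (String field : line.split(",", -1)) {
            fields.add(field.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("Alice, 34, Paris"));
    }
}
```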

The information that describes the structure of the data is called its metadata. That information is required for the automatic processing of the data.

Specific data processes described here include data cleaning and filtering (removing erroneous data), data scaling (adjusting numeric values according to a specified scale), sorting, merging, and hashing.
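Three of these processes can be sketched together in a few lines of Java. The example below is only one possible reading: it treats NaN as the marker for erroneous values, sorts with the standard Arrays.sort, and uses min-max scaling to the range [0, 1] as the specified scale. The class name, method names, and sample values are assumptions made for illustration.

```java
import java.util.Arrays;

class CleanScaleSketch {
    // Filtering: drops NaN entries, treated here as erroneous readings.
    static double[] dropNaN(double[] data) {
        return Arrays.stream(data).filter(x -> !Double.isNaN(x)).toArray();
    }

    // Scaling: linearly maps values onto [0, 1].
    // Returns all zeros when every value is the same (range is zero).
    static double[] minMaxScale(double[] data) {
        double min = Arrays.stream(data).min().orElse(0.0);
        double max = Arrays.stream(data).max().orElse(0.0);
        double range = max - min;
        double[] scaled = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            scaled[i] = (range == 0.0) ? 0.0 : (data[i] - min) / range;
        }
        return scaled;
    }

    public static void main(String[] args) {
        double[] raw = {30.0, Double.NaN, 10.0, 20.0};
        double[] clean = dropNaN(raw);   // removes the NaN entry
        Arrays.sort(clean);              // sorts in ascending order
        System.out.println(Arrays.toString(minMaxScale(clean)));
        // prints [0.0, 0.5, 1.0]
    }
}
```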
