- Hands-On Data Science with SQL Server 2017
- Marek Chmel, Vladimír Mužný
- 2021-06-10 19:13:53
Math and statistics
Statistics and other math skills are essential in several phases of a data science project. Right from the beginning of data exploration, you'll be dividing the features of your data observations into categories:
- Categorical
- Numeric:
  - Discrete
  - Continuous
Categorical values describe the item and represent an attribute of the item. Imagine you have a dataset about cars: car brand would be a typical categorical value, and color would be another.
On the other hand, we have numerical values, which can be split into two different categories: discrete and continuous. Discrete values are counts, such as how many people purchased a product. Continuous values have an infinite number of possible values and use real numbers for the representation. In a nutshell, discrete variables are like points plotted on a chart, and a continuous variable can be plotted as a line.
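To make the three feature types concrete, here is a minimal Python sketch. The car records, the field names, and the `feature_kind` helper are all invented for illustration, and the rule it applies (strings are categorical, integers are discrete, floats are continuous) is only a rough heuristic, not a definitive classification method.

```python
from collections import Counter

# Hypothetical car observations, invented for illustration only.
cars = [
    {"brand": "Skoda", "color": "red", "doors": 5, "length_m": 4.36},
    {"brand": "Volvo", "color": "blue", "doors": 3, "length_m": 4.43},
    {"brand": "Skoda", "color": "blue", "doors": 5, "length_m": 4.69},
]


def feature_kind(values):
    """Rough heuristic: str -> categorical, int -> discrete, float -> continuous."""
    if all(isinstance(v, str) for v in values):
        return "categorical"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "discrete"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "continuous"
    return "unknown"


# Classify every feature by looking at its values across all observations.
kinds = {name: feature_kind([car[name] for car in cars]) for name in cars[0]}

# Categorical features are typically summarized by counting each label.
brand_counts = Counter(car["brand"] for car in cars)

print(kinds)
print(brand_counts)
```

In real data the storage type is not always a reliable guide (a ZIP code stored as an integer is still categorical), so a check like this is a starting point for exploration rather than a final answer.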
Another classification of the data is the measurement-level point of view. We can split data into two primary categories:
- Qualitative:
  - Nominal
  - Ordinal
- Quantitative:
  - Interval
  - Ratio
Nominal variables can't be ordered and only describe an attribute. An example would be the color of a product; this describes how the product looks, but you can't impose an ordering scheme on color and say that red is bigger than green. Ordinal variables describe the feature with a categorical value and also provide an ordering system; for example, education: elementary, high school, university degree, and so on.
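The difference between nominal and ordinal values can be sketched in Python with the standard `enum` module. The `Color` and `Education` classes and the numeric ranks assigned to the education levels are assumptions made for this example; the ranks only encode order, not distance between levels.

```python
from enum import Enum, IntEnum


class Color(Enum):
    """Nominal: labels only, with no meaningful order."""
    RED = "red"
    GREEN = "green"


class Education(IntEnum):
    """Ordinal: the integer ranks encode order, nothing more."""
    ELEMENTARY = 1
    HIGH_SCHOOL = 2
    UNIVERSITY = 3


people = [Education.UNIVERSITY, Education.ELEMENTARY, Education.HIGH_SCHOOL]

# Ordinal values can be sorted and compared.
ranked = sorted(people)
highest = max(people)

# Nominal values cannot: a plain Enum defines no ordering, so "<" raises TypeError.
try:
    Color.RED < Color.GREEN
    colors_comparable = True
except TypeError:
    colors_comparable = False

print([level.name for level in ranked])
print(highest.name, colors_comparable)
```

Note that sorting and taking the maximum make sense for ordinal data, but arithmetic such as averaging the ranks does not, because the gaps between levels are not defined.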
With quantitative values, it's a different story. The major difference between the two is that a ratio variable has a true zero, while an interval variable does not. Imagine the attribute is a length: if the length is 0, you know there's no length. This does not apply to temperature measured in degrees Celsius or Fahrenheit, since 0°C or 0°F does not mark the beginning of the temperature scale (absolute zero, the true beginning of the scale, is -273.15°C or -459.67°F). Measured in kelvins, temperature would actually be a ratio value, since that scale really begins at 0 K. So, as you can see, a number can be either an interval or a ratio value; it depends on the context!
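A short Python sketch shows why the true zero matters. The specific temperatures and lengths are made up for illustration, and `celsius_to_kelvin` is a helper defined here, not a library function.

```python
ABSOLUTE_ZERO_C = -273.15  # absolute zero expressed in degrees Celsius


def celsius_to_kelvin(celsius):
    """Shift a Celsius reading onto the Kelvin scale, which starts at a true zero."""
    return celsius - ABSOLUTE_ZERO_C


# Interval scale: dividing Celsius readings gives a number with no physical meaning.
naive_ratio = 20.0 / 10.0

# Ratio scale: the same two temperatures compared in kelvins.
true_ratio = celsius_to_kelvin(20.0) / celsius_to_kelvin(10.0)

# Length already has a true zero, so its ratio is meaningful as-is.
length_ratio = 4.0 / 2.0

print(f"Celsius 'ratio': {naive_ratio:.2f}")
print(f"Kelvin ratio:    {true_ratio:.4f}")
print(f"Length ratio:    {length_ratio:.2f}")
```

A 4 m object really is twice as long as a 2 m one, but 20°C is not twice as hot as 10°C: on the Kelvin scale the two temperatures differ by only about 3.5 percent. Differences, by contrast, are meaningful on both interval and ratio scales.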