- IBM Watson Projects
- James Miller
- 186 words
- 2021-07-16 17:31:19
Data review
When you have successfully loaded your data into Watson Analytics, you should review it and assess its quality.
The IBM Watson Analytics documentation describes data quality as:
Data quality assesses the degree to which a data set is suitable for analysis. A shorthand representation of this assessment is the data quality score. The score is measured on a scale of 0-100, with 100 representing the highest possible data quality.
Further:
The data quality score for a data set is computed by averaging the data quality score for every column in the data set. Several factors affect the data quality score for an individual field or column.
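The averaging step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `dataset_quality_score` and the per-column scores are hypothetical, and how Watson Analytics derives each column's score is not shown here.

```python
from statistics import mean


def dataset_quality_score(column_scores):
    """Average per-column quality scores (each 0-100) into one data set score.

    `column_scores` maps a column name to its score. The names and values
    are illustrative assumptions, not output from Watson Analytics.
    """
    if not column_scores:
        raise ValueError("at least one column score is required")
    for name, score in column_scores.items():
        if not 0 <= score <= 100:
            raise ValueError(f"score for {name!r} is outside 0-100: {score}")
    return mean(list(column_scores.values()))


# Hypothetical scores for a three-column data set.
example_scores = {"age": 90, "region": 60, "income": 75}
overall = dataset_quality_score(example_scores)
print(overall)
```

Because the data set score is a plain average, one very poor column lowers the overall score by only its share of the total, so it is worth reviewing the per-column scores as well as the headline number.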
The factors that can affect the data quality score include:
- Missing values: Records for which no data are entered.
- Constant values: Some fields have the same value recorded for every record.
- Imbalance: Occurs in a categorical field when records are not equally distributed across categories.
- Influential categories: Those categories that are significantly different from other categories.
- Outliers: Extreme values.
- Skewness: Skewness measures how symmetrically the values of a continuous field are distributed. Skewed fields have lower data quality scores.
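To get a feel for several of these factors before loading a file, you can compute rough equivalents yourself. The sketch below uses only the Python standard library. All function names are hypothetical, and the formulas (the 1.5 x IQR rule for outliers, population skewness, a max/min count ratio for imbalance) are common textbook choices assumed for illustration; they are not the calculations Watson Analytics performs. Influential categories are left out because they depend on how each category relates to other fields.

```python
from collections import Counter
from statistics import mean, pstdev, quantiles


def missing_fraction(values):
    """Share of records with no value (None or empty string)."""
    return sum(v is None or v == "" for v in values) / len(values)


def is_constant(values):
    """True when every non-missing record holds the same value."""
    present = [v for v in values if v is not None and v != ""]
    return len(set(present)) <= 1


def imbalance_ratio(categories):
    """Largest category count divided by the smallest (1.0 = balanced)."""
    counts = Counter(categories)
    return max(counts.values()) / min(counts.values())


def iqr_outliers(numbers):
    """Values beyond 1.5 * IQR from the quartiles (an assumed rule)."""
    q1, _, q3 = quantiles(numbers, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [x for x in numbers if x < low or x > high]


def skewness(numbers):
    """Population skewness; close to 0 for a symmetric field."""
    m, s = mean(numbers), pstdev(numbers)
    if s == 0:
        return 0.0
    return mean([((x - m) / s) ** 3 for x in numbers])


# Hypothetical column: mostly values near 11, plus one extreme value.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(readings))
print(skewness(readings) > 0)
```

A column that shows a high missing fraction, a large imbalance ratio, or strong skew in a check like this is a reasonable candidate for cleanup before you rely on its data quality score.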