- Hands-On Data Science with Anaconda
- Dr. Yuxing Yan, James Yan
- 2021-06-25 21:08:48
UCI machine learning
As of 1/10/2018, the University of California, Irvine (UCI) maintains 413 datasets for machine learning at http://archive.ics.uci.edu/ml/index.php. The following screenshot shows the three most downloaded datasets:

For Iris, the most downloaded dataset, we have the following information:

The beauty of these datasets is that they come with quite detailed information, such as the source, the creator or donor, a description, and even citations.
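To get a quick feel for the Iris data, the sketch below uses the copy that ships with R in the datasets package instead of downloading the file from the repository. Treating the built-in copy as a stand-in for the UCI file is an assumption made for illustration; the two are not guaranteed to match value for value. The name species_means is introduced here only for this example.

```r
# Built-in copy of the Iris data (datasets package, attached by default in R).
# Assumption: used here as a stand-in for the file hosted by UCI.
data(iris)

# Size of the dataset and the number of observations per species
print(dim(iris))
print(table(iris$Species))

# Mean of each of the four measurements within each species
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)
```

Grouping with aggregate() keeps the example in base R, so no extra packages need to be installed.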
The following table shows several potential public data sources for users in the area of data science and business analytics:

Table 3.1: Potential sources of open data for data science and business analytics
After going to https://www.data.gov/, we can see the following topics, including Agriculture, Climate, Consumer, Ecosystems, and Education:

The next table shows the potential sources of open data for users in the area of economics:

Table 3.2: Potential sources of open data for economics
After going to the Federal Reserve Economic Data (FRED) site and clicking Data on the menu, we can see the following entries:

The following table lists sources of free data for users in the areas of finance and accounting:

Table 3.3: Potential sources of open data for finance and accounting
From Professor French's data library, we can download the well-known Fama-French three-factor time series:
> infile<-"http://canisius.edu/~yany/data/ff3monthly.csv"
> x<-read.csv(infile,skip=3)
> head(x,2)
    Date Mkt.RF  SMB   HML   RF
1 192607   2.96 -2.3 -2.87 0.22
2 192608   2.64 -1.4  4.19 0.25
> tail(x,2)
       Date Mkt.RF   SMB   HML   RF
1095 201709   2.51  4.53  3.02 0.09
1096 201710   2.25 -1.94 -0.09 0.09
In the previous code, the input file ff3monthly.csv is a modified copy of F-F_Research_Data_Factor.csv: the second part, which holds the annual data, was removed, and Date was added as the header of the first column. Note that F-F_Research_Data_Factor.csv comes from the ZIP file called F-F_Research_Data_Factor_CSV.zip.
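The two modifications described above can be sketched in R as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to build ff3monthly.csv: the miniature raw_lines layout (a description line, a header row starting with a comma, a monthly block, then a block introduced by a line containing "Annual") is assumed, the helper keep_monthly() is hypothetical, and the single annual row is a dummy placeholder. The two monthly rows are the ones shown in the output above.

```r
# Hypothetical miniature of the raw file layout (an assumption, not the real file).
raw_lines <- c(
  "Description line (placeholder text)",
  "",
  ",Mkt-RF,SMB,HML,RF",
  "192607,2.96,-2.3,-2.87,0.22",
  "192608,2.64,-1.4,4.19,0.25",
  "",
  " Annual Factors: January-December",
  ",Mkt-RF,SMB,HML,RF",
  "1927,0,0,0,0"  # dummy placeholder row, not real data
)

# Hypothetical helper: keep only the monthly block and name the first column Date.
keep_monthly <- function(lines) {
  header_at <- which(startsWith(lines, ","))[1]        # first header row
  annual_at <- grep("Annual", lines, fixed = TRUE)[1]  # start of the annual part
  monthly <- lines[header_at:(annual_at - 1)]
  monthly <- monthly[nzchar(trimws(monthly))]          # drop blank lines
  monthly[1] <- paste0("Date", monthly[1])             # ",Mkt-RF,..." becomes "Date,Mkt-RF,..."
  read.csv(text = monthly)
}

ff3 <- keep_monthly(raw_lines)
print(ff3)
```

Because read.csv() applies its default name checking, the Mkt-RF column comes back as Mkt.RF, which matches the column name in the output shown earlier. Unlike ff3monthly.csv, which is read with skip=3, this sketch drops the leading description lines, so no skip argument is needed.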