- Hands-On Data Science with Anaconda
- Dr. Yuxing Yan, James Yan
- 2021-06-25 21:08:50
Slicing and dicing datasets
Our first example picks all stocks listed on the NYSE from an R dataset called marketCap.RData. The following code loads the dataset and displays its first few rows:
> con<-url("http://canisius.edu/~yany/RData/marketCap.RData")
> load(con)
> head(.marketCap)
The associated output is shown here:
> head(.marketCap)
  Symbol                       Name MarketCap Exchange
1      A Agilent Technologies, Inc. $12,852.3     NYSE
2     AA                 Alcoa Inc. $28,234.5     NYSE
3   AA-P                 Alcoa Inc.     $43.6     AMEX
4    AAC       Ableauctions.Com Inc      $4.3     AMEX
5    AAI     AirTran Holdings, Inc.    $156.9     NYSE
6    AAP     Advance Auto Parts Inc  $3,507.4     NYSE
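There are several ways to choose a subset of the R dataset called .marketCap. Note the leading dot in the name: R objects whose names begin with a dot are hidden from ls() by default, so the dataset will not appear in a plain listing of the workspace.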
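a<-.marketCap[1]         # choose the 1st column (returns a data frame)
b<-.marketCap$Symbol     # another way to choose the 1st column (returns a vector)
c<-.marketCap[,1:2]      # choose the first two columns
d<-subset(.marketCap,.marketCap$Exchange=="NYSE")   # stocks listed on the NYSE
e<-subset(head(.marketCap))                         # the first six rows
# column names are case-sensitive and must match names(.marketCap);
# MarketCap prints as text such as "$12,852.3", so convert it before comparing
cap<-as.numeric(gsub("[$,]","",.marketCap$MarketCap))
f<-subset(.marketCap,cap>200 & cap<=3000)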
A Python dataset is downloadable at http://canisius.edu/~yany/python/marketCap.pkl.
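The same kinds of selection can be sketched in Python with pandas. This is a minimal illustration, not the book's own code: it rebuilds by hand the six rows printed by head(.marketCap) and assumes that marketCap.pkl uses the same four columns, which is not confirmed here. The variable names and the helper column MarketCapNum are hypothetical.

```python
import pandas as pd

# The six rows shown by head(.marketCap), typed in by hand.
# Assumption: marketCap.pkl has the same four columns.
market_cap = pd.DataFrame({
    "Symbol": ["A", "AA", "AA-P", "AAC", "AAI", "AAP"],
    "Name": ["Agilent Technologies, Inc.", "Alcoa Inc.", "Alcoa Inc.",
             "Ableauctions.Com Inc", "AirTran Holdings, Inc.",
             "Advance Auto Parts Inc"],
    "MarketCap": ["$12,852.3", "$28,234.5", "$43.6",
                  "$4.3", "$156.9", "$3,507.4"],
    "Exchange": ["NYSE", "NYSE", "AMEX", "AMEX", "NYSE", "NYSE"],
})

# Convert text such as "$12,852.3" to a number before comparing.
# MarketCapNum is a hypothetical helper column, not part of the dataset.
market_cap["MarketCapNum"] = (
    market_cap["MarketCap"].str.replace(r"[$,]", "", regex=True).astype(float)
)

first_col = market_cap.iloc[:, [0]]    # 1st column, kept as a DataFrame
symbols = market_cap["Symbol"]         # 1st column as a Series
first_two = market_cap.iloc[:, 0:2]    # first two columns
nyse = market_cap[market_cap["Exchange"] == "NYSE"]   # NYSE-listed stocks
mid_cap = market_cap[(market_cap["MarketCapNum"] > 200)
                     & (market_cap["MarketCapNum"] <= 3000)]

print(nyse[["Symbol", "Name"]])
print(len(mid_cap), "of these six rows fall in the (200, 3000] range")
```

Once the pickle file has been downloaded, pd.read_pickle("marketCap.pkl") should be able to stand in for the hand-built frame, assuming the file holds a pandas DataFrame. Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.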