- Machine Learning for Cybersecurity Cookbook
- Emmanuel Tsukerman
- 2021-06-24 12:28:56
How it works...
We begin by reading in our dataset and then standardizing it, as in the recipe on standardizing data (steps 1 and 2). (PCA is sensitive to the scale of each feature, so the data should be standardized before PCA is applied.) We then instantiate a new PCA transformer and use fit_transform to both learn the transformation (fit) and apply it to the dataset (step 3). In step 4, we analyze the transformation. In particular, each element of pca.explained_variance_ratio_ gives the fraction of the total variance accounted for by the corresponding direction. Summed over all directions, these fractions equal 1, which means that all of the variance is accounted for when we consider the full space in which the data lives. However, the first few directions alone account for a large portion of the variance, so we can keep only those and limit our dimensionality. In our example, the first 40 directions account for about 90% of the variance:
sum(pca.explained_variance_ratio_[0:40])
This produces the following output:
This means that we can reduce our number of features from 78 to 40 while preserving about 90% of the variance. It also implies that many of the PE header features are closely correlated, which is understandable, as they were not designed to be independent.
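The steps described above can be sketched end to end as follows. This is a minimal illustration rather than the recipe's own code: the PE header dataset is not reproduced here, so the sketch uses a synthetic matrix of correlated features as a stand-in, and names such as latent, mixing, and n_components_90 are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the PE header data (an assumption for illustration):
# 500 samples with 20 features built from only 5 independent latent factors,
# so the features are strongly correlated with one another.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 20))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 20))

# Standardize the data, then learn and apply the PCA transformation.
X_standardized = StandardScaler().fit_transform(X)
pca = PCA()
X_transformed = pca.fit_transform(X_standardized)

# Cumulative fraction of variance explained by the first k directions.
cumulative = np.cumsum(pca.explained_variance_ratio_)
total_variance_share = cumulative[-1]

# Smallest number of directions whose cumulative fraction reaches 90%.
n_components_90 = int(np.searchsorted(cumulative, 0.90) + 1)

print(f"fraction explained by all directions: {total_variance_share:.4f}")
print(f"directions needed for 90%: {n_components_90} of {X.shape[1]}")
```

Because the synthetic features are driven by a handful of shared factors, far fewer directions than features are needed to reach the 90% threshold, which mirrors the effect seen with the correlated PE header features. As an alternative to computing the cumulative sum by hand, scikit-learn's PCA also accepts a fraction, for example PCA(n_components=0.90, svd_solver="full"), and then keeps just enough directions to explain that share of the variance.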