- Machine Learning for Cybersecurity Cookbook
- Emmanuel Tsukerman
- 2021-06-24 12:28:57
How to do it...
In the following steps, we will see how scikit-learn's K-means clustering algorithm performs on a toy PE malware classification dataset:
- Start by importing and plotting the dataset:
import pandas as pd
import plotly.express as px
df = pd.read_csv("file_pe_headers.csv", sep=",")
fig = px.scatter_3d(
    df,
    x="SuspiciousImportFunctions",
    y="SectionsLength",
    z="SuspiciousNameSection",
    color="Malware",
)
fig.show()
The following screenshot shows the output:

- Extract the features and target labels:
y = df["Malware"]
X = df.drop(["Name", "Malware"], axis=1).to_numpy()
- Next, import scikit-learn's clustering module and fit a K-means model to the data, using one cluster per distinct value of the target label (two here):
from sklearn.cluster import KMeans
estimator = KMeans(n_clusters=len(set(y)))
estimator.fit(X)
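To make the fitting step more concrete, here is a minimal sketch that runs the same estimator on synthetic data instead of `file_pe_headers.csv`. The array `X_toy`, the two blob centers, and the fixed `n_init` and `random_state` values are assumptions chosen for illustration and are not part of the recipe.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the feature matrix: two well-separated blobs
# in three dimensions (an assumption, not the PE header data).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 3))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 3))
X_toy = np.vstack([blob_a, blob_b])

# random_state makes the run repeatable; n_init restarts the algorithm
# from several initializations and keeps the best result.
toy_estimator = KMeans(n_clusters=2, n_init=10, random_state=0)
toy_estimator.fit(X_toy)

print(toy_estimator.cluster_centers_)  # one row per cluster centroid
print(toy_estimator.labels_[:5])       # cluster IDs of the first samples
print(toy_estimator.inertia_)          # within-cluster sum of squared distances
```

After fitting, `cluster_centers_` holds the centroids, `labels_` holds the cluster ID assigned to each training sample, and `inertia_` is the quantity K-means tries to minimize.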
- Predict the cluster assignment of each sample using the fitted model:
y_pred = estimator.predict(X)
df["pred"] = y_pred
df["pred"] = df["pred"].astype("category")
- To see how the algorithm did, plot the predicted clusters:
fig = px.scatter_3d(
    df,
    x="SuspiciousImportFunctions",
    y="SectionsLength",
    z="SuspiciousNameSection",
    color="pred",
)
fig.show()
The following screenshot shows the output:

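The results are not perfect, but the clustering algorithm captured much of the structure in the dataset. Keep in mind that K-means cluster IDs are arbitrary: cluster 0 need not correspond to `Malware` = 0, so compare the two plots by the grouping of the points rather than by matching colors.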
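Comparing two plots by eye can be complemented with a number. The sketch below is one possible way to do that and is not part of the recipe: the helper `best_match_agreement` and the toy lists are hypothetical names introduced here. It counts how many samples agree with the true labels under the most favorable mapping from cluster IDs to labels, and it assumes there are no more clusters than distinct labels, as in the recipe.

```python
from collections import Counter
from itertools import permutations


def best_match_agreement(y_true, y_pred):
    """Fraction of samples that agree under the best cluster-to-label mapping.

    Assumes the number of clusters does not exceed the number of labels.
    """
    labels = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    # Contingency counts: how often each (cluster, label) pair occurs.
    counts = Counter(zip(y_pred, y_true))
    best = 0
    # Try every way of assigning distinct labels to the clusters.
    for perm in permutations(labels, len(clusters)):
        matched = sum(counts[(c, lab)] for c, lab in zip(clusters, perm))
        best = max(best, matched)
    return best / len(y_true)


# Hypothetical labels and cluster IDs; the cluster IDs are flipped
# relative to the labels, which the mapping step corrects for.
y_true_toy = [0, 0, 0, 1, 1, 1]
y_pred_toy = [1, 1, 0, 0, 0, 0]
agreement = best_match_agreement(y_true_toy, y_pred_toy)
print(agreement)  # 5 of the 6 samples agree under the best mapping
```

With the variables from the steps above, this could be called as `best_match_agreement(list(y), list(y_pred))`. scikit-learn's `sklearn.metrics.adjusted_rand_score` is a permutation-invariant alternative that also corrects for chance agreement.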