- Machine Learning for Cybersecurity Cookbook
- Emmanuel Tsukerman
- 150 words
- 2021-06-24 12:29:08
Extracting N-grams
In standard quantitative analysis of text, N-grams are sequences of N consecutive tokens (for example, words or characters). For instance, given the text "The quick brown fox jumped over the lazy dog", if our tokens are words, then the 1-grams are "the", "quick", "brown", "fox", "jumped", "over", "the", "lazy", and "dog". The 2-grams are "the quick", "quick brown", "brown fox", and so on. The 3-grams are "the quick brown", "quick brown fox", "brown fox jumped", and so on. Just as the local statistics of a text allowed us to build a Markov chain for statistical prediction and text generation from a corpus, N-grams allow us to model the local statistical properties of our corpus. Our ultimate goal is to use the counts of N-grams to help predict whether a sample is malicious or benign. In this recipe, we demonstrate how to extract N-gram counts from a sample.
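As a minimal sketch of the idea, and not necessarily the code this recipe goes on to use, the snippet below counts N-grams with only the Python standard library. The helper names extract_ngrams and count_ngrams, and the short byte string standing in for a sample, are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def extract_ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(tokens, n):
    """Count how often each n-gram occurs in the token sequence (hypothetical helper)."""
    return Counter(extract_ngrams(tokens, n))


# Word-level tokens from the example sentence, lowercased so "The" and "the" match.
text = "The quick brown fox jumped over the lazy dog"
words = text.lower().split()

unigram_counts = count_ngrams(words, 1)
bigram_counts = count_ngrams(words, 2)
trigram_counts = count_ngrams(words, 3)

# Byte-level tokens: iterating over a bytes object yields integers, so each
# N-gram is a tuple of byte values. This byte string is made-up illustration
# data, not a real sample.
sample_bytes = b"MZ\x90\x00MZ\x90\x00"
byte_bigram_counts = count_ngrams(sample_bytes, 2)

print(bigram_counts.most_common(3))
print(byte_bigram_counts.most_common(3))
```

The same two helpers work for words, characters, or raw bytes because they only assume that the input is an indexable sequence. For a real sample, the bytes would presumably be read from a file opened in binary mode, and the resulting counts could then serve as features for a classifier.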