- Machine Learning for Cybersecurity Cookbook
- Emmanuel Tsukerman
- 70 characters
- 2021-06-24 12:29:08
Selecting the best N-grams
The number of distinct N-grams grows exponentially in N. Even for a small fixed N, such as N=3, there are 256 × 256 × 256 = 16,777,216 possible byte N-grams. This means that the number of N-gram features is impractically large. Consequently, we must select a smaller subset of N-grams that will be of most value to our classifiers. In this section, we show three different methods for selecting the most informative N-grams.
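To make the scale concrete, here is a minimal sketch in Python that counts byte N-grams and keeps only the most frequent ones. It assumes raw frequency as the selection criterion, which is just one plausible choice and may not be one of the three methods this section goes on to show. The helper names (`byte_ngrams`, `top_k_ngrams`) and the toy byte strings are hypothetical, introduced only for illustration.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int):
    """Yield every contiguous n-byte window of data."""
    for i in range(len(data) - n + 1):
        yield data[i:i + n]


def top_k_ngrams(samples, n=3, k=5):
    """Return the k most frequent byte n-grams across all samples."""
    counts = Counter()
    for sample in samples:
        counts.update(byte_ngrams(sample, n))
    return [gram for gram, _ in counts.most_common(k)]


# Size of the full feature space for byte trigrams (N=3).
FEATURE_SPACE = 256 ** 3

# Toy byte strings standing in for real binaries (hypothetical data).
samples = [b"MZ\x90\x00MZ\x90\x00", b"\x00MZ\x90\x7f"]

# Keep only the two most frequent trigrams instead of all 256**3 features.
selected = top_k_ngrams(samples, n=3, k=2)
print(FEATURE_SPACE, selected)
```

Only the N-grams that actually occur in the samples are ever counted, so memory is bounded by the data rather than by the 256^N theoretical feature space. A selection step like this reduces that observed set further to a fixed budget of k features.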