- Python Machine Learning By Example
- Yuxi (Hayden) Liu
- 2021-07-02 12:41:39
Picking up NLP basics while touring popular NLP libraries
After a short list of real-world applications of NLP, we'll tour the essential stack of Python NLP libraries in this chapter. These packages handle the wide range of NLP tasks mentioned previously, as well as others such as sentiment analysis, text classification, and named entity recognition.
The most famous NLP libraries in Python include the Natural Language Toolkit (NLTK), spaCy, Gensim, and TextBlob. The scikit-learn library also has impressive NLP-related features. Let's take a look at the following popular NLP libraries in Python:
- NLTK: This library (http://www.nltk.org/) was originally developed for educational purposes and is now widely used in industry as well. It is said that you can't talk about NLP without mentioning NLTK. It is one of the most famous and leading platforms for building Python-based NLP applications. You can install it simply by running the following command in the terminal:
pip install -U nltk
If you're using conda, then execute the following command line:
conda install nltk
- spaCy: This library (https://spacy.io/) is a more powerful toolkit for industry use than NLTK. This is mainly for two reasons: first, spaCy is written in Cython, which is much more memory-optimized (now you see where the Cy in spaCy comes from) and excels in NLP tasks; second, spaCy keeps using state-of-the-art algorithms for core NLP problems, such as convolutional neural network (CNN) models for tagging and named entity recognition. However, it may seem advanced for beginners. In case you're interested, here are the installation instructions.
Run the following command line in the terminal:
pip install -U spacy
For conda, execute the following command line:
conda install -c conda-forge spacy
- Gensim: This library (https://radimrehurek.com/gensim/), developed by Radim Rehurek, has been gaining popularity in recent years. It was initially designed in 2008 to generate a list of similar articles given an article, hence the name of this library (generate similar -> Gensim). It was later drastically improved by Radim Rehurek in terms of its efficiency and scalability. Again, we can easily install it via pip by running the following command line:
pip install --upgrade gensim
If you're using conda, run the following command in the terminal:
conda install -c conda-forge gensim
- TextBlob: This library (https://textblob.readthedocs.io/en/dev/) is a relatively new one built on top of NLTK. It simplifies NLP and text analysis with easy-to-use built-in functions and methods, as well as wrappers around common tasks. We can install TextBlob by running the following command line in the terminal:
pip install -U textblob
TextBlob has some useful features that are not currently available in NLTK, such as spell checking and correction, language detection, and translation.
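Once the installs finish, it can be handy to confirm which of these libraries are actually available in the current environment. The following is a minimal sketch that relies only on Python's standard library (importlib.metadata, which assumes Python 3.8 or later). The names NLP_PACKAGES, installed_version, and report_installed are our own, chosen for illustration, and are not part of any of the libraries above:

```python
from importlib import metadata

# Distribution names of the libraries discussed in this section.
# (Hypothetical constant name, used only for this sketch.)
NLP_PACKAGES = ("nltk", "spacy", "gensim", "textblob")


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_installed(packages=NLP_PACKAGES):
    """Map each package name to its version string, or None if not installed."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report_installed().items():
        status = version if version is not None else "not installed"
        print(f"{name:<10} {status}")
```

Running the script prints one line per library with either its version or "not installed", so you can tell at a glance which of the preceding pip or conda commands still need to be run.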