書名： Hands-On Natural Language Processing with Python
作者名： Rajesh Arumugam Rajalingappaa Shanmugamani
本章字?jǐn)?shù)： 217字
更新時(shí)間： 2021-08-13 16:01:38

Recognizing named entities

NER is a type of text annotation task. In NER, words or tokens in a piece of text are labeled or annotated into categories, such as organizations, locations, people, and so on. In effect, NER converts unstructured text data into structured data that can later be used for further analysis. The following screenshot is a visualization from the Google Cloud API. The reader can try out the API with the link provided in the preceding subsection:

The output result in the preceding screenshot shows how the different entities, such as ORGANISATION (Google), PERSON (Sundar Pitchai), EVENT (CONSUMER ELECTRONICS SHOW), and so on, are automatically extracted from the unstructured raw text by NER. The output also gives the sentiment for each label or category, based on sentiment analysis. The reader can experiment with different text using the link provided earlier. When we click on the Categories tab, we can see the following:

The preceding screenshot shows how the system also classifies a particular piece of text into Computer & Electronics, News, and so on, using the recognized named entities in the text. Such a categorization, called topic modeling, is another important NLP task, used to identify the main theme or topic of a sentence or document.

官术网_书友最值得收藏!

Hands-On Natural Language Processing with Python

Recognizing named entities