
Applications of POS tagging

POS tagging finds applications in Named Entity Recognition (NER), sentiment analysis, question answering, and word sense disambiguation. The following code gives an example of word sense disambiguation. In the sentences "I left the room" and "Left of the room", the word left conveys different meanings, and a POS tagger helps to tell the two apart. Here is how these two usages of the same word are tagged:

>>> import nltk
>>> text1 = nltk.word_tokenize("I left the room")
>>> text2 = nltk.word_tokenize("Left of the room")
>>> nltk.pos_tag(text1, tagset='universal')
[('I', 'PRON'), ('left', 'VERB'), ('the', 'DET'), ('room', 'NOUN')]
>>> nltk.pos_tag(text2, tagset='universal')
[('Left', 'NOUN'), ('of', 'ADP'), ('the', 'DET'), ('room', 'NOUN')]
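
To make the disambiguation step concrete, here is a minimal sketch that maps a (word, tag) pair to a sense. It assumes a small hand-written sense table; the names SENSES and sense_of and the gloss strings are hypothetical and are not part of NLTK. The tagged sentences are copied from the session above, so the sketch runs without any tagger.

```python
# Hypothetical sense inventory keyed by (lowercased word, universal POS tag).
# The glosses are illustrative only and do not come from a real lexicon.
SENSES = {
    ("left", "VERB"): "past tense of 'leave' (went away from)",
    ("left", "NOUN"): "the left-hand side",
}


def sense_of(word, tagged_sentence):
    """Return the gloss for the first occurrence of `word`, or None."""
    for token, tag in tagged_sentence:
        if token.lower() == word:
            return SENSES.get((word, tag))
    return None


# Tagged output copied from the session above.
tagged1 = [('I', 'PRON'), ('left', 'VERB'), ('the', 'DET'), ('room', 'NOUN')]
tagged2 = [('Left', 'NOUN'), ('of', 'ADP'), ('the', 'DET'), ('room', 'NOUN')]

print(sense_of("left", tagged1))  # past tense of 'leave' (went away from)
print(sense_of("left", tagged2))  # the left-hand side
```

The lookup only works because the tagger has already assigned different tags to the two occurrences; a real system would draw its senses from a lexical resource rather than a hard-coded dictionary.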

In the first sentence, left is tagged as a verb, whereas in the second it is tagged as a noun. In NER, POS tagging helps in identifying a person, an organization, or a location based on the tags. NLTK provides a built-in trained classifier that identifies entities in text and works on top of POS-tagged sentences, as shown in the following code:

>>> import nltk
>>> example_sent = nltk.word_tokenize("The company is located in South Africa")
>>> example_sent
['The', 'company', 'is', 'located', 'in', 'South', 'Africa']
>>> tagged_sent = nltk.pos_tag(example_sent)
>>> tagged_sent
[('The', 'DT'), ('company', 'NN'), ('is', 'VBZ'), ('located', 'VBN'), ('in', 'IN'), ('South', 'NNP'), ('Africa', 'NNP')]
>>> nltk.ne_chunk(tagged_sent)
Tree('S', [('The', 'DT'), ('company', 'NN'), ('is', 'VBZ'), ('located', 'VBN'), ('in', 'IN'), Tree('GPE', [('South', 'NNP'), ('Africa', 'NNP')])])

The ne_chunk() function uses the trained named entity chunker to identify South Africa as a geopolitical entity (GPE) in the example sentence. So far, we have seen examples using NLTK's built-in taggers. In the next section, we will look at how to develop our own POS tagger.
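
As a rough illustration of why POS tags are useful input for NER, the sketch below groups runs of consecutive proper-noun tags (NNP, NNPS) into candidate names. This is a simplified heuristic for illustration only, not what ne_chunk() does internally, and the helper name proper_noun_spans is hypothetical. The tagged sentence is copied from the session above.

```python
def proper_noun_spans(tagged_sentence):
    """Group runs of consecutive NNP/NNPS tokens into candidate names."""
    spans, current = [], []
    for token, tag in tagged_sentence:
        if tag in ("NNP", "NNPS"):
            current.append(token)
        elif current:
            # A non-proper-noun token closes the current run.
            spans.append(" ".join(current))
            current = []
    if current:
        # Flush a run that extends to the end of the sentence.
        spans.append(" ".join(current))
    return spans


# POS-tagged sentence copied from the session above.
tagged_sent = [('The', 'DT'), ('company', 'NN'), ('is', 'VBZ'),
               ('located', 'VBN'), ('in', 'IN'),
               ('South', 'NNP'), ('Africa', 'NNP')]

print(proper_noun_spans(tagged_sent))  # ['South Africa']
```

Unlike the trained chunker, this heuristic cannot say what kind of entity it found (GPE, person, and so on); it only shows that the tags alone already narrow down where entities are likely to be.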
