Applications of POS tagging

POS tagging finds applications in Named Entity Recognition (NER), sentiment analysis, question answering, and word sense disambiguation. As an example of word sense disambiguation, consider the sentences "I left the room" and "Left of the room": the word left conveys a different meaning in each. A POS tagger helps to differentiate between the two meanings. The following code shows how these two usages of the same word are tagged:

>>> import nltk
>>> text1 = nltk.word_tokenize("I left the room")
>>> text2 = nltk.word_tokenize("Left of the room")
>>> nltk.pos_tag(text1, tagset='universal')
[('I', 'PRON'), ('left', 'VERB'), ('the', 'DET'), ('room', 'NOUN')]
>>> nltk.pos_tag(text2, tagset='universal')
[('Left', 'NOUN'), ('of', 'ADP'), ('the', 'DET'), ('room', 'NOUN')]
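Because the tagger assigns a different tag to left in each sentence, a downstream step can use that tag to choose between senses. The following is only a minimal sketch of the idea, not a full disambiguation method: the sense inventory `LEFT_SENSES` and the helper `sense_of` are hypothetical names introduced for illustration, and the tagged lists are copied from the output above rather than recomputed.

```python
# Hypothetical sense inventory for "left", keyed by universal POS tag.
LEFT_SENSES = {
    "VERB": "past tense of 'leave' (to go away from)",
    "NOUN": "the left-hand side or direction",
}


def sense_of(word, tagged_sentence, senses):
    """Return the sense matching the POS tag of the first occurrence of `word`.

    Returns None if the word does not occur in the sentence, and
    "unknown sense" if its tag is not in the sense inventory.
    """
    for token, tag in tagged_sentence:
        if token.lower() == word:
            return senses.get(tag, "unknown sense")
    return None


# Tagger output copied from the session above.
tagged1 = [('I', 'PRON'), ('left', 'VERB'), ('the', 'DET'), ('room', 'NOUN')]
tagged2 = [('Left', 'NOUN'), ('of', 'ADP'), ('the', 'DET'), ('room', 'NOUN')]

print(sense_of("left", tagged1, LEFT_SENSES))
print(sense_of("left", tagged2, LEFT_SENSES))
```

A lookup like this only separates senses that differ in part of speech; senses that share a tag would need additional context.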

In the first example, the word left is tagged as a verb, whereas in the second example it is tagged as a noun. In NER, POS tags help in identifying entities such as a person, an organization, or a location. NLTK provides a built-in trained classifier that identifies entities in text and works on top of POS-tagged sentences, as shown in the following code:

>>> import nltk
>>> example_sent = nltk.word_tokenize("The company is located in South Africa")
>>> example_sent
['The', 'company', 'is', 'located', 'in', 'South', 'Africa']
>>> tagged_sent = nltk.pos_tag(example_sent)
>>> tagged_sent
[('The', 'DT'), ('company', 'NN'), ('is', 'VBZ'), ('located', 'VBN'), ('in', 'IN'), ('South', 'NNP'), ('Africa', 'NNP')]
>>> nltk.ne_chunk(tagged_sent)
Tree('S', [('The', 'DT'), ('company', 'NN'), ('is', 'VBZ'), ('located', 'VBN'), ('in', 'IN'), Tree('GPE', [('South', 'NNP'), ('Africa', 'NNP')])])
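In the resulting tree, a named entity appears as a labeled subtree, while ordinary tokens remain plain (word, tag) pairs. To collect just the entities, one option is to flatten the tree into IOB-tagged triples (NLTK offers `nltk.chunk.tree2conlltags` for this) and group consecutive tokens. The sketch below assumes that flattened form: the `conll` list is written by hand to mirror the tree above rather than produced by running NLTK, and `iob_to_entities` is a hypothetical helper, not part of NLTK.

```python
def iob_to_entities(triples):
    """Group (word, pos, iob) triples into (entity_text, label) pairs."""
    entities = []
    words, label = [], None
    for word, _pos, iob in triples:
        if iob.startswith("B-"):
            # A new entity begins; store any entity still open.
            if words:
                entities.append((" ".join(words), label))
            words, label = [word], iob[2:]
        elif iob.startswith("I-") and words and iob[2:] == label:
            # Continuation of the entity that is currently open.
            words.append(word)
        else:
            # "O" tag (or a stray I- tag): close any open entity.
            if words:
                entities.append((" ".join(words), label))
            words, label = [], None
    if words:
        entities.append((" ".join(words), label))
    return entities


# Hand-written IOB form of the tree shown above (an assumption, not NLTK output).
conll = [
    ('The', 'DT', 'O'), ('company', 'NN', 'O'), ('is', 'VBZ', 'O'),
    ('located', 'VBN', 'O'), ('in', 'IN', 'O'),
    ('South', 'NNP', 'B-GPE'), ('Africa', 'NNP', 'I-GPE'),
]

print(iob_to_entities(conll))
```

Grouping on the IOB tags keeps multi-word names such as South Africa together as a single entity instead of two separate tokens.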

The ne_chunk() function uses the trained named entity chunker to identify South Africa as a geopolitical entity (GPE) in the example sentence. So far, we have seen examples using NLTK's built-in taggers. In the next section, we will look at how to develop our own POS tagger.