If you have been working with NLTK for some time now, you probably find the task of preprocessing the text a bit cumbersome. In this post, I will walk you through a simple and fun approach for performing repetitive tasks using coroutines. The coroutines concept is a pretty obscure one but very useful indeed. You […]
What is chunking Text chunking, also referred to as shallow parsing, is a task that follows Part-Of-Speech Tagging and that adds more structure to the sentence. The result is a grouping of the words in “chunks”. Here’s a quick example:
(NP Every/DT day/NN)
(NP the/DT corner/NN shop/NN))
In other words, in a shallow parse tree, there’s one maximum level between the […]
What is “NER” NER, short for Named Entity Recognition is probably the first step towards information extraction from unstructured text. It basically means extracting what is a real world entity from the text (Person, Organization, Event etc …). Why do you need this information? You might want to map it against a knowledge base to […]
Part-Of-Speech tagging (or POS tagging, for short) is one of the main components of almost any NLP analysis. The task of POS-tagging simply implies labelling words with their appropriate Part-Of-Speech (Noun, Verb, Adjective, Adverb, Pronoun, …).