natural language processing pipeline

Building an NLP pipeline in NLTK

If you have been working with NLTK for a while, you have probably found text preprocessing a bit cumbersome. In this post, I will walk you through a simple and fun approach for performing repetitive tasks using coroutines. Coroutines are a fairly obscure concept, but a very useful one. David Beazley's presentation on coroutines covers everything you need to follow along (plus much, much more).
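To give a feel for the idea, here is a minimal sketch of a coroutine-based preprocessing pipeline. It is not the code from the full post: the stage names (`tokenize`, `lowercase`, `drop_punctuation`, `collect`) and the `run_pipeline` helper are made up for illustration, and I am assuming NLTK's regex-based `wordpunct_tokenize` so that no extra data download is needed.

```python
from functools import wraps

from nltk.tokenize import wordpunct_tokenize


def coroutine(func):
    """Prime a generator-based coroutine so it can receive .send() right away."""
    @wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first `yield`
        return gen
    return start


@coroutine
def tokenize(target):
    # Hypothetical stage: split raw text into tokens and pass them on.
    while True:
        text = yield
        target.send(wordpunct_tokenize(text))


@coroutine
def lowercase(target):
    # Hypothetical stage: lowercase every token.
    while True:
        tokens = yield
        target.send([t.lower() for t in tokens])


@coroutine
def drop_punctuation(target):
    # Hypothetical stage: keep only alphanumeric tokens.
    while True:
        tokens = yield
        target.send([t for t in tokens if t.isalnum()])


@coroutine
def collect(results):
    # Sink: store each processed token list in the given list.
    while True:
        tokens = yield
        results.append(tokens)


def run_pipeline(texts):
    """Push each text through the chained stages and return the outputs."""
    results = []
    pipeline = tokenize(lowercase(drop_punctuation(collect(results))))
    for text in texts:
        pipeline.send(text)
    return results


processed = run_pipeline(["Hello, World!", "Coroutines are fun."])
```

Each stage only knows about the next one, so adding or reordering steps means changing a single line where the pipeline is assembled.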

text chunking

Text Chunking with NLTK

What is chunking

Text chunking, also referred to as shallow parsing, is a task that follows part-of-speech tagging and adds more structure to the sentence. The result is a grouping of the words into “chunks”. Here’s a quick example:
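As a rough illustration (a sketch of my own, not the example from the full post), the snippet below groups an already tagged sentence into noun-phrase chunks with NLTK's `RegexpParser`. The sentence, its tags, and the one-rule grammar are assumptions chosen to keep things small; a real chunker would usually be trained on annotated data.

```python
from nltk import RegexpParser

# A sentence that has already been part-of-speech tagged (tags written by hand).
tagged_sentence = [
    ("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
    ("jumps", "VBZ"), ("over", "IN"),
    ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]

# Illustrative grammar: a noun phrase is an optional determiner,
# any number of adjectives, and a singular noun.
grammar = "NP: {<DT>?<JJ>*<NN>}"

parser = RegexpParser(grammar)
tree = parser.parse(tagged_sentence)

# Collect the words of every NP subtree as a plain string.
noun_phrases = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees()
    if subtree.label() == "NP"
]
```

Words that do not match the grammar, such as "jumps" and "over", stay outside any chunk and hang directly off the root of the tree.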

Recipe: Text classification using NLTK and scikit-learn

Text classification is probably the most frequently encountered Natural Language Processing task. It can be described as assigning texts to an appropriate bucket. A sports article should go in SPORT_NEWS, and a medical prescription should go in MEDICAL_PRESCRIPTIONS.

To train a text classifier, we need some annotated data. This training data can be obtained in several ways. Suppose you want to build a spam classifier. You would export the contents of your mailbox, then label the emails in the inbox folder as NOT_SPAM and the contents of your spam folder as SPAM.
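To make the spam example concrete, here is a minimal sketch using a scikit-learn `Pipeline` with NLTK's `wordpunct_tokenize` as the tokenizer. The six "emails" are invented toy data, and TF-IDF features with a Multinomial Naive Bayes model are just one reasonable choice I am assuming here, not necessarily what the full recipe uses.

```python
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Invented toy data standing in for an exported mailbox.
train_texts = [
    "win a free prize now",
    "free money click now",
    "claim your free prize",
    "meeting agenda for monday",
    "lunch with the project team",
    "please review the project report",
]
train_labels = ["SPAM", "SPAM", "SPAM", "NOT_SPAM", "NOT_SPAM", "NOT_SPAM"]

classifier = Pipeline([
    # token_pattern=None because the NLTK tokenizer replaces the default regex.
    ("tfidf", TfidfVectorizer(tokenizer=wordpunct_tokenize, token_pattern=None)),
    ("model", MultinomialNB()),
])
classifier.fit(train_texts, train_labels)

predictions = classifier.predict(["free prize money", "project meeting monday"])
```

With real data you would hold out part of the labelled emails to measure accuracy instead of eyeballing a couple of predictions.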

Building a simple inverted index using NLTK

In this example, I want to show how to use some of the tools packed in NLTK to build something pretty awesome. Inverted indexes are a very powerful tool and are one of the building blocks of modern-day search engines.

While building the inverted index, you’ll learn to:
1. Use a stemmer from NLTK
2. Filter words using a stopwords list
3. Tokenize text
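The three steps above can be sketched in a few lines. This is an assumption-laden sketch rather than the post's implementation: `normalize`, `build_index`, and `lookup` are hypothetical helper names, the documents are invented, and the tiny `STOPWORDS` set is a stand-in for NLTK's stopwords corpus so the snippet runs without downloading any data.

```python
from collections import defaultdict

from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

# Stand-in stopword list (assumption); NLTK ships a fuller one as a corpus.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "to", "on"}

stemmer = PorterStemmer()


def normalize(text):
    """Tokenize, drop punctuation and stopwords, then stem what is left."""
    tokens = wordpunct_tokenize(text.lower())
    return [
        stemmer.stem(token)
        for token in tokens
        if token.isalnum() and token not in STOPWORDS
    ]


def build_index(documents):
    """Map each stemmed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in normalize(text):
            index[term].add(doc_id)
    return index


def lookup(index, word):
    """Return the sorted ids of documents that contain the given word."""
    terms = normalize(word)
    if not terms:  # the query was a stopword or punctuation
        return []
    return sorted(index.get(terms[0], set()))


documents = {
    1: "The cat is running in the garden",
    2: "Dogs run and cats sleep",
    3: "A garden of roses",
}
index = build_index(documents)
```

Because queries go through the same `normalize` step as the documents, a search for "running" also finds documents that only contain "run".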

