Recipe: Text classification using NLTK and scikit-learn

Text classification is probably the most commonly encountered Natural Language Processing task. It can be described as assigning each text to an appropriate bucket: a sports article should go in SPORT_NEWS, and a medical prescription should go in MEDICAL_PRESCRIPTIONS.

To train a text classifier, we need some annotated data. This training data can be obtained in several ways. Suppose, for example, that you want to build a spam classifier. You would export the contents of your mailbox, then label the emails in the inbox folder as NOT_SPAM and the emails in your spam folder as SPAM.
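The labeling step described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `label_mailbox` and the sample emails are made up for this example, and a real export would read the messages from disk.

```python
def label_mailbox(inbox_emails, spam_emails):
    """Pair each email body with a label based on the folder it came from."""
    labeled = [(text, "NOT_SPAM") for text in inbox_emails]
    labeled += [(text, "SPAM") for text in spam_emails]
    return labeled


# Hypothetical mailbox contents, standing in for an exported mailbox.
inbox = [
    "Meeting moved to 3pm tomorrow",
    "Here are the slides from yesterday",
    "Can you review my pull request?",
]
spam = [
    "You won a free cruise, click here",
    "Cheap pills, limited time offer",
]

dataset = label_mailbox(inbox, spam)

# Split into parallel lists of texts and labels.
texts = [text for text, _ in dataset]
labels = [label for _, label in dataset]
```

Parallel `texts` and `labels` lists are a convenient shape for training: scikit-learn estimators, for instance, take the inputs and the targets as two separate arguments to `fit`.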

Building a simple inverted index using NLTK

In this example, I want to show how to use some of the tools packed in NLTK to build something pretty awesome. Inverted indexes are a very powerful tool and one of the building blocks of modern-day search engines.

While building the inverted index, you’ll learn to:
1. Use a stemmer from NLTK
2. Filter words using a stopwords list
3. Tokenize text
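The three steps above can be combined into a small inverted index, sketched here under a few assumptions. The stemmer is NLTK's `PorterStemmer` when NLTK is installed; otherwise a crude suffix stripper (a made-up stand-in, not a real stemmer) takes its place. The stopword set and the regex tokenizer are simplified stand-ins for NLTK's stopwords corpus and tokenizers, which need a separate data download. The names `analyze`, `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict

try:
    # NLTK's Porter stemmer works without any extra data download.
    from nltk.stem import PorterStemmer
    stem = PorterStemmer().stem
except ImportError:
    def stem(word):
        """Crude stand-in for a stemmer: strip a few common suffixes."""
        for suffix in ("ing", "es", "s"):
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[: -len(suffix)]
        return word

# Tiny illustrative stopword list (assumption, not NLTK's list).
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def analyze(text):
    """Tokenize, drop stopwords, and stem what is left."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(token) for token in tokens if token not in STOPWORDS]


def build_index(documents):
    """Map each stemmed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = analyze(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


DOCS = [
    "Search engines index the web",
    "The web is huge",
    "Searching an index is fast",
]
index = build_index(DOCS)
```

Because the query goes through the same `analyze` function as the documents, `search(index, "searching")` finds documents 0 and 2 even though only one of them contains that exact word.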

How to convert between verb/noun/adjective/adverb forms using WordNet

While developing NLP applications, you might have run into situations where you needed to get the “closest” adjective to a noun, or maybe you needed to “nounify” a verb. After poking around WordNet, I found a simple and pretty effective way to do this. Keep in mind that it is not error-proof, but for most of my needs, I found it to perform pretty well. We’ll be using the NLTK WordNet wrapper for this. Let’s have a look at the code:
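What follows is a minimal sketch of one way to do this, not necessarily the approach the full post takes. It assumes the WordNet corpus is available locally (for example after `nltk.download("wordnet")`), and it follows WordNet's derivationally related forms from the source word to candidates with the target part of speech. The function name `convert` is hypothetical.

```python
from nltk.corpus import wordnet as wn


def convert(word, from_pos, to_pos):
    """Return words tagged `to_pos` that WordNet derives from `word`.

    `from_pos` and `to_pos` are WordNet tags: "n", "v", "a" or "r".
    Candidates seen through more synsets are listed first.
    """
    counts = {}
    for synset in wn.synsets(word, pos=from_pos):
        for lemma in synset.lemmas():
            for related in lemma.derivationally_related_forms():
                pos = related.synset().pos()
                # WordNet tags satellite adjectives as "s"; treat as "a".
                if pos == "s":
                    pos = "a"
                if pos == to_pos:
                    name = related.name()
                    counts[name] = counts.get(name, 0) + 1
    # Most frequently reached candidates first, ties broken alphabetically.
    return sorted(counts, key=lambda name: (-counts[name], name))
```

A call such as `convert("death", "n", "v")` asks for verbs related to the noun "death". Ranking by how often a candidate is reached is a simple heuristic; it is a guess at the "closest" form, so the results are worth checking by eye.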

NLP-FOR-HACKERS
