tokenization

Introduction to Python NLTK

NLTK (Natural Language Toolkit) is the most popular Python framework for working with human language. There’s some controversy over whether NLTK is appropriate for production environments. Here’s my take on the matter: NLTK doesn’t ship with very powerful pretrained models, as other frameworks such as Stanford CoreNLP do. NLTK is […]

Splitting text into sentences

Few people realise how tricky splitting text into sentences can be. Most NLP frameworks already ship with English models for this task. You might run into issues with the pretrained models if: you are working with a specific genre of text (usually technical) that contains unusual abbreviations; you are working with a language […]
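
To make the abbreviation problem concrete, here is a minimal sketch using only the Python standard library. It is not how NLTK or any pretrained model splits sentences; the abbreviation list, the function names, and the sample text are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical abbreviation list, for illustration only.
# A real project would build this from its own corpus.
ABBREVIATIONS = {"dr", "mr", "mrs", "fig", "e.g", "i.e", "etc"}


def naive_split(text):
    """Split wherever '.', '!' or '?' is followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def abbreviation_aware_split(text, abbreviations=ABBREVIATIONS):
    """Re-join naive chunks that end in a known abbreviation."""
    sentences = []
    buffer = ""
    for chunk in naive_split(text):
        buffer = f"{buffer} {chunk}".strip()
        # Last word of the buffer, without trailing punctuation.
        last_token = buffer.split()[-1].rstrip(".!?").lower()
        if buffer.endswith(".") and last_token in abbreviations:
            # Probably not a real sentence boundary: keep accumulating.
            continue
        sentences.append(buffer)
        buffer = ""
    if buffer:
        sentences.append(buffer)
    return sentences


sample = "Dr. Smith measured the output. See Fig. 3 for details."
print(naive_split(sample))
print(abbreviation_aware_split(sample))
```

The naive version breaks the sample into four fragments because it treats the periods in "Dr." and "Fig." as sentence ends, while the second version returns two sentences. The fix is itself fragile: a sentence that genuinely ends in "etc." gets merged with the next one, which is the kind of ambiguity that makes a fixed rule set hard to get right.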

NLP-FOR-HACKERS
