Author: bogdani

Complete Guide to Word Embeddings

Introduction: We talked briefly about word embeddings (also known as word vectors) in the spaCy tutorial. spaCy ships word vectors with its models. This tutorial goes deep into how to compute them and how they are applied.

Complete Guide to spaCy

Updates 29-Apr-2018: Fixed import in extension code (Thanks Ruben).

spaCy is a relatively new framework in the Python Natural Language Processing environment, but it is quickly gaining ground and will most likely become the de facto library. There are some really good reasons for its popularity:

Complete Guide to Topic Modeling

What is Topic Modeling? Topic modeling, in the context of Natural Language Processing, is described as a method of uncovering hidden structure in a collection of texts. Although that is true, it is also a pretty useless definition. Let’s define topic modeling in more practical terms.

Quick Recipe: Building Word Clouds

What are Word Clouds? Word Clouds are a popular way of displaying how important words are in a collection of texts. Basically, the more frequent a word is, the more space it occupies in the image. One of the uses of Word Clouds is to help us get an intuition about what the collection of […]
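The frequency-to-size idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the recipe from the post itself: the function name `word_sizes`, the simple regex tokenizer, and the `min_size`/`max_size` font range are all hypothetical choices made for this example.

```python
from collections import Counter
import re


def word_sizes(texts, min_size=10, max_size=60):
    """Map each word to a font size that grows linearly with its frequency.

    Hypothetical helper: the most frequent word gets max_size, and every
    other word is scaled by its count relative to that top count.
    """
    words = []
    for text in texts:
        # Naive tokenizer (an assumption): lowercase runs of letters/apostrophes.
        words.extend(re.findall(r"[a-z']+", text.lower()))

    counts = Counter(words)
    if not counts:
        return {}

    top = max(counts.values())
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


sizes = word_sizes(["the cat sat on the mat", "the dog sat"])
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size:.1f}")
```

A real word cloud would typically also drop stopwords such as "the" before sizing, otherwise they dominate the image.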

TextRank for Text Summarization

Summarization is a classic task that has been studied from different perspectives. It consists of picking a subset of a text so that the information conveyed by the subset is as close to the original text as possible. The subset, named the summary, should be human readable. The task is not […]

Language models

If you come from a statistics or machine learning background, you probably don’t need to be convinced that building language models is useful. If not, here’s what language models are and why they are useful.

Natural Language Processing Corpora

One of the reasons it’s so hard to learn, practice and experiment with Natural Language Processing is the lack of available corpora. Building a gold standard corpus is seriously hard work. That’s why resources are so scarce or cost a lot of money. In this post, I’m going to aggregate some cool […]

Introduction to NLTK

NLTK (Natural Language ToolKit) is the most popular Python framework for working with human language. There’s a bit of controversy around the question of whether NLTK is appropriate for production environments. Here’s my take on the matter:

Weighting words using Tf-Idf

Updates 29-Apr-2018: Added string instance check for Python 2.7 and Python 3.6 compatibility (Thanks Greg).

If I ask you “Do you remember the article about electrons in the NY Times?” there’s a better chance you will remember it than if I ask you “Do you remember the article about electrons in the physics books?”. Here’s why: an article […]
