Introduction We talked briefly about word embeddings (also known as word vectors) in the spaCy tutorial. SpaCy has word vectors included in its models. This tutorial will go deep into the intricacies of how to compute them and their different applications.
What is Topic Modeling? Topic modelling, in the context of Natural Language Processing, is described as a method of uncovering hidden structure in a collection of texts. Although that is indeed true it is also a pretty useless definition. Let’s define topic modeling in more practical terms.
Few people realise how tricky splitting text into sentences can be. Most of the NLP frameworks out there already have English models created for this task. You might encounter issues with the pretrained models if:
Simple recipe for text clustering. This sometimes creates issues in scikit-learn because text has sparse features.