Deep Learning is one of those hyper-hyped subjects that everybody is talking about and everybody claims they’re doing. In certain cases, startups just need to mention they use Deep Learning and they instantly get appreciation. Deep Learning is indeed a powerful technology, but it’s not an answer to every problem. It’s also not magic like many people make it look like.
In this post, we’ll be doing a gentle introduction to the subject. You’ll learn what a Neural Network is, how to train it and how to represent text features (in 2 ways). For this purpose, we’ll be using the IMDB dataset. It contains around 25.000 sentiment annotated reviews. Deep Learning models usually require a lot of data to train properly. If you have little data, maybe Deep Learning is not the solution to your problem. In this case, the amount of data is a good compromise: it’s enough to train some toy models and we don’t need to spend days waiting for the training to finish or use GPU.
You can get the dataset from here: Kaggle IMDB Movie Reviews Dataset
Let’s quickly explore the IMDB dataset: