Using Deep Learning AI for Natural Language Processing (NLP), Sentiment Analysis and Emotion AI

Fennaio | blog post

30-11-2021

The possibilities of Deep Learning AI are proving to be endless. Traditionally used for tasks like image classification, Artificial Neural Networks (ANNs) are increasingly being used to learn from numerical and categorical data. Taking it one step further, categorical data can be words, which means sentences, paragraphs and other human-written text can be fed in. The net result is that Deep Learning via an ANN can be trained to interpret language. Here's how it works, what it means, and how it can be used.

Basics of Deep Learning

Deep Learning uses layers of artificial neurons to learn, arranged into Artificial Neural Networks. Data passes through each layer sequentially, and the network makes repeated passes (epochs) over the training data, relatively quickly working out how to fit the data mathematically. This process, known as training, links input data to output designations, or classifications. Once training is complete the network's weights are fixed; you can then send new data into the ANN and it will give a predicted outcome - often highly accurate - based on the patterns it learnt from the data it was trained on.
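The training loop described above can be sketched in plain NumPy. This is an illustrative toy network (one hidden layer, made-up data, mean-squared-error loss), not the internals of any particular library:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 samples, 2 inputs each, binary labels (the XOR problem).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 neurons, randomly initialised.
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
losses = []
for epoch in range(2000):              # each epoch is one pass over the data
    # Forward pass: data flows through each layer sequentially.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    loss = np.mean((p - y) ** 2)
    losses.append(loss)

    # Backward pass: nudge the weights to fit the data better.
    dp = 2 * (p - y) / len(X) * p * (1 - p)
    dW2 = h.T @ dp;              db2 = dp.sum(0)
    dh = dp @ W2.T * (1 - h**2)
    dW1 = X.T @ dh;              db1 = dh.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
```

After training, the weights stay fixed and new inputs are simply pushed through the forward pass to get a prediction.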

Traditional Image Classification using Deep Learning

Deep Learning Artificial Neural Networks accept arbitrarily large inputs at one end, pass them through layers of artificial neurons (in their simplest form, perceptrons), and link them to one or more outcomes. In real terms this could be binary - is the image a cat or a dog? - or multi-class, e.g. is this face happy, sad, angry etc.? This makes ANNs well suited to image recognition: an image is converted into pixels (dozens to potentially millions of pixels per image), the pixels are converted to numbers, each image is linked to a label (cat/dog/happy/sad/angry) and the ANN learns to link the pixel number data with the label.
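Converting an image into ANN-ready numbers is a one-liner. A minimal sketch, assuming a 28x28 greyscale image (the pixel values here are random placeholders):

```python
import numpy as np

# A stand-in 28x28 greyscale image: pixel intensities 0-255.
image = np.random.default_rng(1).integers(0, 256, size=(28, 28))

# Flatten the 2D grid into a single row of 784 numbers and scale
# to the 0-1 range, ready for the input layer of an ANN.
x = image.flatten() / 255.0
print(x.shape)  # (784,)
```

A colour image would simply carry three such numbers per pixel (the RGB channels) instead of one.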

Using Deep Learning for more than Image Classification - numbers

We know that ANNs can be used for image classification by fundamentally passing in pixels in the form of numbers - these could be binary (0/1) for pure black and white, or more likely values on the RGB scale. This means we can send in any number, and the same process applies, e.g. height, weight, income, age - this is known as continuous data. Using the same methodology we can classify ordinary numerical data in an ANN. That's quite straightforward: pass in numbers at one end and link them to one or more outputs.
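Continuous features are usually standardised before being fed in, so that fields on very different scales (height vs income) contribute comparably. A small sketch with made-up figures:

```python
import numpy as np

# Continuous features for three people: height (cm), weight (kg), age (years).
data = np.array([
    [170.0, 65.0, 30.0],
    [182.0, 80.0, 45.0],
    [158.0, 52.0, 22.0],
])

# Standardise each column to zero mean and unit variance, so the
# ANN sees every feature on a comparable scale.
mean = data.mean(axis=0)
std = data.std(axis=0)
inputs = (data - mean) / std
```

Each row of `inputs` is now ready to go straight into the input layer, one node per feature.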

Using Deep Learning for more than Image Classification - categorical data

We know we can pass numbers into Artificial Neural Networks easily. That's useful for a lot of things, but restrictive for many real-world problems that involve learning from categorical data, e.g. colour, postcode, user ID, car type, product category, weather type, patient type. This is known as discrete data: the values aren't on a scale like continuous data. For normal continuous data such as height you may have 170, 171, 172 ... 190cm - it's a scale, easily countable and easily passed into an ANN on one input node. So how do we pass in categorical data, such as blue, red, orange, green for something like car colour?

You can't just have one input for colour and use 1, 2, 3, 4 for each colour respectively, as the ANN would treat the values as a scale, which we know they aren't. We could instead have four car-colour inputs called "blue", "red", "orange", "green" and make them binary, so a red car would be 0, 1, 0, 0. That would work, until you get to hundreds, thousands or millions of values - and that's just for one field.

A far better way to deal with categorical data at scale is embeddings. Embeddings convert categorical data into dense vectors for use in ANNs. Rather than have potentially unlimited inputs for one category, we can use an embedding to encode it into a much smaller input size while keeping distinct values distinct for the ANN's calculations.
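The two encodings compared above can be sketched side by side. The embedding table here is randomly initialised for illustration; in a real ANN its values are learned during training along with the other weights:

```python
import numpy as np

colours = ["blue", "red", "orange", "green"]
index = {c: i for i, c in enumerate(colours)}

# One-hot: one binary input per value - fine for 4 colours,
# unworkable for thousands or millions of distinct values.
def one_hot(colour):
    v = np.zeros(len(colours))
    v[index[colour]] = 1.0
    return v

# Embedding: a lookup table mapping each value to a small dense
# vector - here 2 inputs instead of one per distinct value.
rng = np.random.default_rng(0)
embedding_table = rng.normal(0, 0.1, size=(len(colours), 2))

def embed(colour):
    return embedding_table[index[colour]]

print(one_hot("red"))       # [0. 1. 0. 0.]
print(embed("red").shape)   # (2,)
```

Notice the embedding input size is fixed by the vector dimension, not by how many distinct values the category has.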

Using Deep Learning for more than Image Classification - words, sentences, paragraphs

We know we can use embeddings to send in categorical data. This means we can use the same approach to send in words. With a bit of programming we can split sentences up into words, encode the words using embeddings and then fire it all off through the Artificial Neural Network. This is the basis of Natural Language Processing (NLP) using Deep Learning.
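The split-then-embed step can be sketched as follows. The vocabulary and embedding values are illustrative placeholders; in practice the vocabulary comes from the training text and the embedding values are learned:

```python
import numpy as np

# A tiny illustrative vocabulary: word -> integer id.
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}

# One small dense vector per vocabulary word (learned in a real model).
embedding_dim = 3
rng = np.random.default_rng(0)
embeddings = rng.normal(0, 0.1, size=(len(vocab), embedding_dim))

def encode(sentence):
    # Split the sentence into words, map each word to its id,
    # then look up each id's embedding vector.
    ids = [vocab[w] for w in sentence.lower().split()]
    return embeddings[ids]

x = encode("the cat sat on the mat")
print(x.shape)  # (6, 3): six words, each a 3-dimensional vector
```

The resulting matrix of word vectors is what actually enters the network in place of raw text.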

Natural Language Processing (NLP) in Deep Learning Artificial Neural Networks

We now have our method of passing words and paragraphs into an ANN: using embeddings to encode the words at the input level. All we need to do now is label our words, sentences and paragraphs at the output level and let the ANN do its job of linking the inputs to the outputs.

The magical by-product of using embeddings for encoding words

When I first got eldr AI to learn from words and sentences it was quite exciting, and probably a bit scary at the same time - it was then I realised how powerful AI was.

I then found out about cosine similarity and how, when applied to embeddings, it can work out how closely linked the words from the same input are. Cosine similarity, applied to the embedding vectors a Deep Learning ANN has learnt, measures how similar those vectors are - specifically, how closely they point in the same direction - and this means words can literally be linked by their distance in space.
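Cosine similarity itself is a short formula: the dot product of two vectors divided by the product of their lengths. A sketch with hand-made vectors (real embedding values would come from a trained model):

```python
import numpy as np

def cosine_similarity(a, b):
    # 1.0 = pointing the same way, 0.0 = unrelated, -1.0 = opposite.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative 3-dimensional "embeddings" for three words.
king   = np.array([0.90, 0.80, 0.10])
queen  = np.array([0.85, 0.82, 0.15])
banana = np.array([0.10, 0.05, 0.90])

# Related words end up closer in direction than unrelated ones.
print(cosine_similarity(king, queen) > cosine_similarity(king, banana))  # True
```

Because it compares direction rather than magnitude, cosine similarity works regardless of how long the individual vectors are.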

Examples of NLP use in Deep Learning

Sentiment analysis and Emotion AI are prime examples: label training sentences at the output level (positive/negative, or happy/sad/angry), train the ANN as described above, and it can then predict the sentiment or emotion of new, unseen text.
