Deep Learning – Please Don’t Start Here

Things become easier when you spend enough time with them, but that didn’t stop me from giving up right away.

I’m no AI/ML enthusiast.

I am a do-whatever-it-takes-to-stay-employed enthusiast.

I decided to explore this purely on a whim. Machine Learning is no joke – it’s more math and algorithms than it is development, which is why it took me so long to pen this piece.

Isn’t This About Deep Learning?

Deep Learning is Machine Learning on steroids, and I’m the rookie bodybuilder who decided to take steroids on his very first day at the gym thinking “he can handle it”.

If you’re currently in the process of learning Data Science, do us (and the mental health wards) all a favour and start with statistics. Without statistics, your chances of survival are next to zero. After statistics, you may move on to Python, then Machine Learning, and only then should you explore Deep Learning. You don’t learn to swim by diving headfirst into the ocean during a storm in the middle of a nuclear holocaust.

So why did I do it?

Well, ~~I overestimated my abilities~~ I have no interest whatsoever in pursuing a career in data science. I’m exploring this purely out of curiosity, so I started with the topic that didn’t sound like it was going to bite my head off.

OK FINE, I was interested at first, but no one told me that I had to learn math.

OK FINE, I knew mathematics was involved, but I didn’t think it had enough mathematics to scare the living daylights out of me.

There is only so much trauma I can endure.

But that’s enough about me. Let’s focus on the question that’s on everyone’s mind:

What’s so Deep About ‘Deep’ Learning?

Deep Learning is pretty darn efficient at categorizing stuff that has layers of complexity to it.

We don’t have to tell it what to look for – it figures things out on its own. OK, let me rephrase that – once we provide a large enough dataset for it to work with, it does a pretty good job of figuring stuff out and categorizing it into hierarchies without our input.

In fact, it’s so good at recognizing complex patterns and hierarchies that it’s used in areas such as language, art and medicine.

In short, it’s every Asian parent’s dream child.

What makes it so smart? Deep learning has a unique architecture comprised of neurons – a term plagiarized from the medical community – which it uses to learn and make predictions better than any astrologer ever could.

Oh sorry, I meant borrowed. I’m not sorry about that last bit though.

Let’s continue.

Just like an onion, neural networks are comprised of layers upon layers of neurons.

And just like an onion, it makes me cry.

These neurons are arranged in layers which are then arranged in networks, which is fine and dandy, but is having multiple layers of neurons what makes it deep? I have five sweaters in the cupboard. Would wearing all of them make me look deep? Or would it make me look like an idiot?

Don’t answer that.

Unlike my sweaters, having multiple layers of neurons, which in this context are tiny computational units, helps the model dive deeper into the data. These layers are what enable the model to comprehend complexity and there are many ways to arrange these layers.

We won’t go into too much detail since this is just an introduction. First, we must get an overview and construct a semi-working model in our heads before diving into the nitty gritty details, and that brings us to our next topic which is Models.

Models?

There are plenty of models out there, many of which haven’t responded to my DMs.

However, the models we are talking about today are the not-so-attractive, mathematically drenched abstractions that help us represent real world stuff. The models we are particularly interested in are Neural Network Models. While models are mathematical abstractions jampacked with algorithms, Neural Networks are also mathematical abstractions jampacked with algorithms, except they are modelled after the brain.

But how do they work?

Models accept data and conduct shady business undercover before providing us with a prediction. Here is a pixtorial representation of a simple neural network.

Pixtorial – a term penned by wannabe pixel artist TCT

Neural Networks are comprised of three main sections:

Input Layer
Hidden Layer(s)
Output Layer

It’s important that you understand what each layer does, so pay attention.

Input Layer

This layer receives data and transfers it to subsequent layers. While this may sound simple enough, a fair amount of introspection goes into designing the input layer and the operations it must perform before pushing data to the next stage.

The number of neurons, and the “operations” they must perform, depend entirely on the nature of the data the layer is expected to work with. Imagine we have a large dataset consisting entirely of cat images. These images are fixed to 40×40 pixels, just to make things easier for us. Some of these cats have ears, while others don’t. We would like our Deep Learning Model to effectively label the images based on the “feature” we want it to recognise – which is the ears.

Also, let’s keep the cruelty down to a minimum by adding that the “cats without ears” are a result of the photographer doing an excellent job of capturing their photos.

Photographer: It ain’t my fault, she won’t stay still!

Of course, we haven’t reached a stage where we can develop and train our very own deep learning models, so let’s imagine that we are seasoned data scientists with a massive salary package and walk through this example with a large smug grin on our faces. Since the images are fixed in dimension, we know that the number of neurons required in the Input Layer is 1600.

Why?

Because that is the number of pixels we have in each image: 40 × 40 = 1600.
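To make that count concrete, here is a minimal sketch of unrolling one image into a flat list, one value per input neuron. It assumes a single greyscale value per pixel, and the `image` grid and the `flatten` helper are made up purely for illustration (no real cats were loaded).

```python
# Hypothetical stand-in for one 40x40 greyscale cat photo:
# a list of 40 rows, each holding 40 pixel values (0-255).
WIDTH, HEIGHT = 40, 40
image = [[(row * WIDTH + col) % 256 for col in range(WIDTH)] for row in range(HEIGHT)]


def flatten(pixels):
    """Unroll a 2D grid of pixels into one long list, row by row."""
    return [value for row in pixels for value in row]


input_values = flatten(image)
print(len(input_values))  # 1600, one value per input neuron
```

A colour image would carry three values per pixel, so the count would change. That is one of the details being glossed over here.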

And What of Its Operation?

Images are usually normalized before being served to the model for pattern recognition. They are usually 8-bit, which translates into 256 different shades per channel: red, green and blue for colour images, or a single shade of grey if the image is greyscale. This means each pixel contains values ranging from 0 to 255, so each input neuron will receive a value ranging from 0 to 255.

I glossed over some details regarding pixels and channels as they’re not important to our discussion. We will look at them in some of our working examples in another article.

These values are scaled down to the range 0 to 1 to improve the model’s stability. Large, unscaled values can behave like outliers, hurting the model’s ability to generalize and causing a host of other issues that will only make sense once we get to the mathematics (oh the horror).
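The scaling step itself is the least scary bit of math in this whole business. A minimal sketch, assuming 8-bit pixel values and dividing by 255 (the `scale_pixels` helper and the sample numbers are hypothetical):

```python
def scale_pixels(values, max_value=255):
    """Map raw 8-bit pixel values (0-255) into the range 0-1."""
    return [value / max_value for value in values]


raw = [0, 64, 128, 255]       # made-up sample pixels
scaled = scale_pixels(raw)
print(scaled[0], scaled[-1])  # 0.0 1.0
```

Dividing by the maximum possible value is only one way to normalize. It is used here because it lands exactly on the 0 to 1 range described above.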

All in all, the input layer doesn’t just let stuff in – it performs operations that make it easier for the hidden layers to digest them.

Hidden Layer

Here is where the magic happens – but before that, let me update the gif for you.

As you can see, each neuron in the input layer interacts with each neuron in the hidden layer. This cascades into what you are seeing below:

Unlike the input layer, the Hidden layer plays around with the data and learns from its interactions. Yes, that sentence is incredibly vague, but that’s because the English language is not equipped to explain this adequately – and neither is my brain. My neurons aren’t equipped to perform any of those deep learning shenanigans either, and we have to switch over to mathematics to actually understand what goes on under the hood – which will be covered in a separate article.

Simply put, each neuron takes an input, performs mathematical operations on it and passes the result on to the neurons in the next layer, and so on and so forth.
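For the curious, here is a rough sketch of what those “mathematical operations” commonly look like. This article doesn’t pin them down, so treat everything here as an assumption: a weighted sum plus a bias, squashed by a sigmoid activation. The weights and biases are made-up numbers, whereas a real network would learn them during training.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0-1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def dense_layer(inputs, layer_weights, layer_biases):
    """Every neuron in the layer sees every input value."""
    return [neuron(inputs, w, b) for w, b in zip(layer_weights, layer_biases)]


# Three scaled input values feeding a hidden layer of two neurons.
inputs = [0.0, 0.5, 1.0]
hidden_weights = [[0.2, -0.4, 0.6], [0.0, 0.0, 0.0]]  # made-up, not learned
hidden_biases = [0.1, 0.0]

hidden_output = dense_layer(inputs, hidden_weights, hidden_biases)
print(hidden_output)  # one number per hidden neuron
```

The output list would then become the input to the next layer, which is all “and so on and so forth” really means.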

Remember What I Said About “Giving Up”?

This is where I decided to stop because I realized that I didn’t know enough to effectively learn and use Machine Learning. If I continued, all I’d be learning is terminology. I’d probably develop the skills required to cosplay as a manager but not enough to actually develop anything reliable.

With that being said, I’ll have to conclude here.

What’s Next?

Statistics.

Statistics? On What?

Stay tuned!

September 18, 2023 TCT