Thoughts on Machine Learning

My first blog post of 2017, yay! It was about time. I can't believe almost two months have passed already. The first two weeks of this year were quite hectic for me, with three important coursework deadlines that drained most of my energy. Then came one more, which made me a temporary "NLP word embeddings and LSTM" expert, and after that I finally took a 4-day vacation to incredible Iceland (I'd love to write a post about that trip soon).

However, long story short, I'm in the second half of my second MSc term and I can gladly say I'm much more mature about it now. I sometimes like to personify my Masters. In the first term we were like strangers, pretty much scared/terrified of each other. With time, after submitting my first "wave" of coursework and getting better feedback than I expected, I started to gain more courage and realized "well, maybe this isn't that bad". Some ambitious goals I had set slowly started to materialize, which gave me an even greater feeling of being able to tackle the impossible ("it always seems impossible until it's done" – N. Mandela). I still don't understand how this can be, but I have found, empirically, that whenever I truly wish for something from the bottom of my heart, it happens.

So, I wanted to write this post to lay down some of my current views on machine learning. There is no doubt they will change; I'd just like to keep track of what I think now, how I view the field, and how my knowledge and perspective will develop in the future. Be aware, this is just a high-level reflection, meant to be more humorous than informative. So let's begin!

Many view machine learning as a major, growing branch of artificial intelligence. Some like to call it pure statistics. I say it sits somewhere on the edge between the two. Machine learning algorithms have been developed since the late 50s and evolved over the decades into the neural networks of the 80s and beyond. They looked good on paper, but they were impractical to test: as you already know, there simply wasn't enough computational power to support the theory. A quick aside: after watching 2001: A Space Odyssey recently, I was amazed at how developed the views on artificial intelligence already were in the late 60s.

Fast forward to 2017: thanks to Moore's law, we have incredible computational power, and not only that, we also have today's golden resource: data. Well, when I say "we have", I basically mean that a few lucky big corporations have truly meaningful and useful data. The more data you give to a machine learning algorithm, the better it will generalise to new, unseen data. It's just like a human brain: if it only knows one language, it will be impossible for it to grasp the meaning of words from a completely unrelated one, no matter how well it knows that initial language.
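To make that a bit more concrete, here is a minimal sketch (a toy example of my own, not tied to any real data set) that uses scikit-learn's learning_curve on a synthetic problem, so you can watch validation accuracy creep up as the model sees more training examples:

```python
# Toy illustration of "more data -> better generalisation".
# The data set is synthetic; nothing here comes from a real project.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Train the same model on growing fractions of the data, score it with 5-fold CV.
train_sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, score in zip(train_sizes, val_scores.mean(axis=1)):
    print(f"{n:5d} training examples -> validation accuracy {score:.3f}")
```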

I'll open another parenthesis here for a funny story (which I consider relevant, and which I like): a friend of mine was on a plane reading a book in Romanian. After a while, an Englishman sitting next to my friend asked, "What language is that? Is it Romanian? Because I don't know it." Surprised by such a specific question and hypothesis, my friend replied, "Is Romanian the only language you don't know?" The Englishman laughed and explained how he had reached his conclusion: the text looked a bit Slavic, but also very Latin, and it used the Roman alphabet. I particularly liked how he pondered the matter, connected some dots, made some assumptions, probably ran a few guesses through his mind and then committed to a final prediction. This again makes me wonder how computers will ever be able to "think" like this. But wait a second, they already do. Based on different criteria, but they do a pretty good job. Google Translate and Bing Translate already guess the language you type in, without you having to specify it. The funny thing is that we now see this as normal for a computer, yet quite spectacular for a human. This is the point where I'd probably start to philosophize. It's interesting, isn't it?
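Just for fun, here is a toy sketch of that kind of "connect the dots" guessing. Real systems like Google Translate or Bing Translate are far more sophisticated; the clue lists below are just my own rough assumptions for illustration:

```python
# A toy language guesser: score a sentence against a few hand-picked
# telltale characters and common words per language, then pick the best match.
CLUES = {
    "Romanian": {"ă", "â", "î", "ș", "ț", " și ", " este "},
    "French":   {"é", "è", "ç", " est ", " les "},
    "English":  {" the ", " and ", " is "},
}

def guess_language(text):
    text = " " + text.lower() + " "
    scores = {lang: sum(text.count(clue) for clue in clues)
              for lang, clues in CLUES.items()}
    return max(scores, key=scores.get)

print(guess_language("Aceasta este o carte scrisă în limba română."))  # Romanian
```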

Cool, so why is there so much statistics and mathematics behind machine learning? Well, because of data. Whenever you have a lot of raw data, you want to make some sense of it, so you call upon statistics, which in turn calls upon some maths. A Statistics superhero will come around and say: "Hey! I know some formulas that will give you the mean of this data and its standard deviation; you'll draw some pretty cool distribution graphs and find some pretty cool numbers. Do you want to do this on paper? Or should I call my friend, the Computer superhero? He knows some pretty cool shortcuts." Of course, for a 5-by-6 table of data you'd probably do a decent job on your own. But when you're given 10GB of data, you might have no choice but to call the Computer superhero to the rescue. So he now helps you out with your data set. But his superpowers are limited. He has a lot of memory and does calculations really fast, but he needs a human to give him algorithms (sets of instructions, like recipes), and he will produce a result (the final meal) – which could be bad or good, depending on how skilled you are as a "cook". If you tell him to keep the food in the oven for 2 hours and then 1 more, those instructions will probably ruin your end result. Bottom line: what a machine learning specialist knows is how to be a Michelin-starred chef. He'll know, for a specific data set, which ingredients from the machine learning toolbox complement it nicely, distinguish its patterns and highlight them. He'll also know how to use the hardware at hand (CPUs, GPUs) to make sure he gets the desired, expected, perfect end result in time.
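For the curious, here is roughly what that first call to the Computer superhero looks like in practice – a minimal sketch with randomly generated stand-in numbers, computing just the mean and standard deviation the Statistics superhero was bragging about:

```python
# Basic descriptive statistics on a column of data.
# The values are random stand-ins, not a real data set.
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=170, scale=10, size=1_000_000)  # e.g. a million "height" measurements

print(f"mean:               {data.mean():.2f}")
print(f"standard deviation: {data.std():.2f}")
```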

I'll end this post by paraphrasing a really cool quote I read about being a data scientist nowadays: a data scientist is better at statistics than a software engineer and better at programming than a statistician. It is equally true, however, that he is worse at statistics than a statistician and worse at programming than a software engineer. Different viewpoints, same idea.