First Term of my Masters

My bad, I wasn’t expecting such a long post, but believe me, there is a lot to talk about! So here we go!

My first term, out of three, is done. Well, not entirely. Just the classes. One assignment was handed in today, 3 more are due by mid-January and 1 by the end of January. And of course, 3 exams for this term’s modules in May.

It’s interesting for me to switch now to a different set of 4 modules for my second term. This is, in fact, one of the main reasons the Masters has been so challenging so far: content that could easily span an entire academic year is condensed into 10 weeks, without lowering the level of complexity.

Looking back on the first 2 and a half months of my Masters, I realise my skill set and I have both evolved tremendously. The pressure and the increased pace really do make the impossible possible. It’s a tough road towards the end result, but the feeling you get when you finally grasp the concepts is priceless.

Although the MSc is titled Web Science and Big Data Analytics, it has a strong focus on machine learning. We are able to customise it any way we like, and by this I mean we can select whichever modules we like from all available Computer Science MSc modules.

My 2 core modules this term were Statistical NLP and Complex Networks. As electives, I chose Computer Vision and Supervised Learning. It was an ambitious choice, as I knew these were 2 tough modules, where expectations are high and the content is advanced. But I couldn’t say no to the challenge. The syllabus for both was incredibly interesting, and they indeed turned out to be an excellent choice.

First off, in Computer Vision, we covered the contents of Simon Prince’s amazing book Computer Vision: Models, Learning, and Inference, which I would highly recommend to anyone interested in the subject. We learnt about 2D and 3D image geometry, object identification, tracking, face recognition and texture synthesis, all under probabilistic models. We also covered graphical models, random forests, homographies and particle filters. As William Freeman (MIT) put it, “computer vision and machine learning have gotten married and this book is their child”.
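Homographies were one of my favourite topics. As a toy illustration (my own sketch, not from the course materials), here is how a 3×3 homography maps a 2D point via homogeneous coordinates, in plain Python:

```python
# Apply a 3x3 homography H to a 2D point (x, y):
# lift to homogeneous coordinates (x, y, 1), multiply by H,
# then divide by the last coordinate to get back to 2D.
def apply_homography(H, point):
    x, y = point
    p = (x, y, 1.0)
    xp = sum(H[0][i] * p[i] for i in range(3))
    yp = sum(H[1][i] * p[i] for i in range(3))
    wp = sum(H[2][i] * p[i] for i in range(3))
    return (xp / wp, yp / wp)

# A pure translation by (2, 3), written as a homography.
H = [[1, 0, 2],
     [0, 1, 3],
     [0, 0, 1]]

print(apply_homography(H, (1, 1)))  # -> (3.0, 4.0)
```

The division by the last coordinate is what lets a single matrix express translations and perspective effects that an ordinary 2×2 linear map cannot.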

I was really surprised when I received the grades for our first coursework submission and found out I got an A+, equivalent to above 90%. As I mentioned, there were times when I thought I would never be able to develop what was required, but pushing myself not to give up made the impossible possible. And it feels amazing!

In Supervised Learning we had 3 different lecturers throughout the term, including John Shawe-Taylor, renowned for his book Kernel Methods for Pattern Analysis. This module is equally demanding and has a strong focus on kernel methods. In our 2nd teaching week we covered kernels and regularisation, then moved on to online learning in our 3rd week. The module then introduced us to SVMs, Gaussian processes, decision trees and ensemble learning, multitask learning, learning theory and sparsity methods. All with lots and lots and lots of maths.
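The core idea behind all that maths, the “kernel trick”, can actually be checked numerically: a kernel computes an inner product in a feature space without ever building the feature map. A minimal sketch (my own toy example, not from the lectures) with the degree-2 polynomial kernel in 2D:

```python
import math

# Degree-2 polynomial kernel: k(x, z) = (x . z)^2.
def poly2_kernel(x, z):
    dot_xz = x[0] * z[0] + x[1] * z[1]
    return dot_xz ** 2

# The explicit feature map phi such that k(x, z) = <phi(x), phi(z)>.
def phi(x):
    return (x[0] ** 2, x[1] ** 2, math.sqrt(2) * x[0] * x[1])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, z = (1.0, 2.0), (3.0, 0.5)
# The kernel and the explicit inner product agree.
print(poly2_kernel(x, z), dot(phi(x), phi(z)))  # both equal 16.0
```

For a degree-d polynomial kernel in n dimensions the explicit feature space grows combinatorially, while the kernel itself stays a single dot product, which is exactly why SVMs and Gaussian processes can afford rich feature spaces.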

Statistical Natural Language Processing was really fun. Its assessment is divided into 3 main assignments, making it the only class with no exam. The last one will be a group coursework and we’ll be required to use deep learning, with recurrent neural nets (yuppy yey!). We covered language models, with the traditional n-grams, and we looked at machine translation, tagging and information extraction, such as entity recognition.

I discovered there are some really interesting things to do in natural language processing, now that deep learning algorithms enable more accurate recognition of complex patterns. Languages have different levels of complexity, and training a model to learn a certain language really well is achievable. But there are so many aspects that make this task difficult: jargon, word ordering, neologisms and tricky meanings all need to be accounted for. A really cool project UCL’s NLP research group received funding for is the automated fact-checking project, sponsored by Google, which aims to help curb the spread of fake news via social media and search engines.
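To give a flavour of the traditional n-gram models we started from, here is a maximum-likelihood bigram model on a toy corpus (a sketch of my own, far simpler than anything we were assessed on):

```python
from collections import Counter

# Toy corpus; a real model would train on millions of tokens.
corpus = "the cat sat on the mat the cat ate".split()

# Count bigrams (pairs of adjacent words) and the contexts they start from.
bigrams = Counter(zip(corpus, corpus[1:]))
contexts = Counter(corpus[:-1])

def prob(word, prev):
    """Maximum-likelihood estimate of P(word | prev)."""
    return bigrams[(prev, word)] / contexts[prev]

# "the" appears 3 times as a context and is followed by "cat" twice.
print(prob("cat", "the"))  # -> 0.666...
```

Unseen bigrams get probability zero under this estimate, which is exactly the sparsity problem that smoothing, and more recently neural language models, are designed to fix.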

Last, but not least, in Complex Networks, we studied graph theory, random network models, as well as various ways in which networks’ properties can be analysed to help make informed predictions and decisions. Concepts such as the small-world property (the six degrees of separation), power laws and scale-free networks were covered throughout the term. A very important point made in this class was that links in networks carry meaningful information from the real world. For example, a trade is made between 2 countries (nodes in a network) due to an agreement involving political, social and macro-economic reasons, and so on. New links form for a reason, and old links are broken for a reason. It’s amazing how much predictive power network studies can have.

The reason this links really well with our Masters is that the web gave birth to a whole range of discoveries in the field of network science, due to its unprecedented complexity and scale. As Barabási notes in his book, “the Web is the largest network humanity has ever built. It exceeds in size even the human brain (N ≈ 10¹¹ neurons)”. It is a fascinating network to study, and in the past 2 decades it has produced some of the best papers in network science.
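The degrees of separation between two nodes are just the length of the shortest path between them, which breadth-first search finds directly. A minimal sketch on a made-up five-person friendship graph (names and edges are purely illustrative):

```python
from collections import deque

# A tiny, hypothetical social network as an adjacency list.
graph = {
    "ana":  ["bob", "cris"],
    "bob":  ["ana", "dana"],
    "cris": ["ana", "dana"],
    "dana": ["bob", "cris", "eve"],
    "eve":  ["dana"],
}

def degrees_of_separation(graph, src, dst):
    """Shortest-path length (hops) between src and dst via BFS."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # no path exists

print(degrees_of_separation(graph, "ana", "eve"))  # -> 3
```

The small-world property says that even in huge networks this number stays surprisingly small, growing roughly with the logarithm of the number of nodes rather than with the number of nodes itself.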

I’ll finish off here, as this simply exceeds what I had in mind when I started writing. But hopefully this will prove helpful for anyone interested in pursuing an MSc in the future at UCL. I am happy to share more details of my experience and answer any other questions you might have about particular modules.

In term two, I am very eager to start studying Information Retrieval & Data Mining, Web Economics, Interaction Design and Advanced Topics in Machine Learning, taught by researchers from Google DeepMind.


Facebook Data & Analytics Event

I was super lucky to be invited, a couple of weeks ago, to this amazing event on Tuesday at Facebook’s London HQ. Of course, I couldn’t say no. It was a mix of interesting talks and informal networking around data and analytics.

Also at this event, the Geekettes had their official re-launch, introducing their new London group leaders. Geekettes is a global organisation for women in technology, founded by Jess Erickson and Denise Philipp in Berlin, with 9 hubs around the world. I am very proud to have joined them as a member after this event.

The four talks of the evening revolved around the ways in which Facebook is handling its large amount of data for advertising, product quality and research.

Manohar Paluri, a researcher at Facebook AI, gave a great talk about how they use AI to achieve the company’s mission – connecting everyone. He focused on computer vision and the role it plays in enhancing accessibility. For instance, blind users can benefit from voice-overs describing the photos their friends have shared. Recent research even goes into segmentation of images and videos, allowing users to hover over parts of an image and find out what is in that area: a lake, the sky, a person, a dog, anything! Last month, The Verge published an article on exactly these latest developments.

Another great project he talked about was identifying where people are in the world by applying AI to satellite images; Wired wrote an article about it earlier this year. I particularly liked his view on AI, which concluded his presentation: the aim is to “provide super powers to humans with the help of AI”. It is amazing to realise that things we usually take for granted, like our sight, voice or hearing, represent super powers for others. And AI is great at providing them! This reminded me of another brilliant initiative, by a developer from Microsoft London with whom I had the pleasure to work, that uses AI to help blind people understand their environment.

Serhad Bolukcu then discussed bringing analytical discipline to marketing; he manages the department dedicated to empowering the Facebook pages of small and medium businesses. He presented how they use AI to analyse the vast amount of data they gather, and which tools and algorithms they rely on.

I also really liked Maya Bruhis’ talk on moving fast with stability, an analogy to Facebook’s mantra “move fast and break things”. As the Product Quality lead in Analytics, she is responsible for making sure the platform works properly, that the daily code releases don’t break anything crucial, and that incidents are prevented as much as possible (I had no idea people started calling the police when Facebook was down #facebookdown).

She briefly explained how they achieve these goals and what data sources they use as signals, nicely comparing their work to making sure that Facebook’s “heart rate” stays healthy at all times.

I’ll leave you with some photos I took during the event.