An interview with Xander Steenbrugge, Machine Learning Researcher & YouTuber at “Arxiv Insights”

Source: Deep Learning on Medium


Our interviewee today is Xander Steenbrugge. Xander heads the Applied ML Research at ML6, where he and his team enable large companies to create state-of-the-art Deep Learning solutions to transform their businesses through the power of AI. Xander’s primary focus is to stay on top of all the new developments in the field of AI and spot business and creative opportunities when they arise.

Xander takes a keen interest in explaining complex things, and as a result he started a YouTube channel called Arxiv Insights. He has discussed several machine learning concepts on this channel, with “The learning dynamics behind generalization and overfitting” being my favorite. I also liked his take on Variational AutoEncoders (VAEs), particularly how he explained the motivation behind the development of VAEs and how the re-parameterization trick was incorporated into them. Outside of work, Xander is often seen giving talks and running workshops at conferences. If you want to stay updated with the latest happenings in the machine learning field, definitely follow Xander on Twitter.

I would like to wholeheartedly thank Xander for taking the time to do this interview. I hope this interview serves a purpose towards the betterment of data science and machine learning communities in general 🙂

An interview with Xander Steenbrugge, Head of Applied Machine Learning Research at ML6

Sayak: Hi Xander! Thank you for doing this interview. It’s a pleasure to have you here today.

Xander: Glad to be here!

Sayak: Maybe you could start by introducing yourself — what is your current job and what are your responsibilities over there?

Xander: Well that’s a fairly straightforward question, which in my case doesn’t really have a straightforward answer I’m afraid…

So I’ve been a Machine Learning Consultant at ML6 for the past 5 years, but during that period I quickly realized that I wanted to spend more time focussing on research and experimenting with the latest models & techniques in the ML landscape. As such, I started a Ph.D. programme which is a collaboration between ML6 and the University of Ghent. My current focus there is on applied Reinforcement Learning; specifically, we’re trying to make RL algorithms more robust in physical environments by working mainly on the representation learning part of the learning process (think disentangled features, VAEs, meta-learning, generalization, …).

At the same time I also try to create new videos for my YouTube channel from time to time (a single video takes me about 10 full working days to make, so most of that work happens at the weekends & in my spare time).

Then there is another big project I’ve been working on for the past 4 months: I’m almost finished with my first MOOC on Reinforcement Learning! Together with a team of passionate educators from the States, I’ve been working on this quite intensely since June and I’m glad to say we’re almost done. The course will be a full RL track from zero to hero with videos, quizzes, and hands-on Python assignments in PyTorch. And best of all, the course will be 100% free!

Finally, I’ve also been working on a pretty cool artistic project where I’m using GANs to create a unique style of music visualization. Planning to release some teasers next week!

So when people ask me “what’s your job?” it’s kinda hard to say what that is… Simply put, I’m very passionate about technology and I usually just roll with whatever interests me, I don’t really see any of these things as “work” :p

Sayak: This is something I can absolutely relate to. All the things that I do (which might seem like “work”) are direct results of my passion for the subject. Also, I am really looking forward to all the things you mentioned! I am very curious to know how you became interested in pursuing machine learning.

Xander: I’m actually an Electrical Engineer by education (transistors, microcircuits, semiconductors, … that stuff). But in my master’s, I picked up a thesis on brainwave classification with EEG devices. We were able to improve the existing BCI (brain-computer interface) system at the lab by quite a good margin, to the point where fully incapacitated patients were able to communicate with a computer (e.g. typing letters) by simply imagining certain physical movements (like clenching fists or moving their feet). Even though they could not physically move a single muscle, the brain would still create certain signals and our machine learning system would recognize these and take the appropriate actions. The fact that by using these machine learning methods we were able to read brainwaves and give locked-in patients a way to communicate with the outside world was mind-blowing to me. I was hooked!

Sayak: That was a pretty unconventional start but the project really is a highly impactful one. When you were starting in machine learning what kind of challenges did you face? How did you overcome them?

Xander: Haha, well, my first day at ML6 I discovered that Matlab (which I used for my thesis) is not often used in the industry. Everybody was using Python, so I had to learn that the day I started my first job. My first few projects were relatively traditional old-school ML problems that I solved mainly with scikit-learn. Even though I wasn’t doing any fancy Deep Learning at the time, I learned some very good lessons the hard way: real data is never clean.

My first year as a Data Scientist was a pretty hard reality check: data cleaning, cross-validation, logistic regression, A/B testing, those kinds of things. Note: TensorFlow didn’t even exist at that point, so scikit-learn was like the Swiss Army knife of a data scientist (it actually still is for many tasks…). And it doesn’t matter how fancy your algorithm is: if your data is messy, things won’t work. So …

I also learned the importance of visualizing your data so you know what you’re dealing with. Many people download a dataset and start coding right away, whereas a lot of bugs and problems can be avoided by looking at the data first and sketching out an approach before diving into codeland.

Sayak: I am 101% on board with your statement that scikit-learn is still like the Swiss Army knife of a data scientist. I am also on the same side with respect to data visualization: I generally spend a good amount of time with the data before even importing any machine learning modules! What were some of the capstone projects you did during your formative years?

Xander: Honestly, as a consultant, I’ve worked on so many different projects it’s hard to pick one. I worked for retail, banks, off-shore ocean vessels, energy suppliers, governments, … you name it! The great thing about this kind of consulting is that you learn to become a jack of all trades because every project is different and requires you to learn new skills. The downside is usually that you can’t spend too much time making things REALLY good: if it works, it’s good enough (and sometimes that’s a pity).

Sayak: I see. It must have been rewarding for you! How did you come up with the idea of starting Arxiv Insights?

Xander: So, when I started my Ph.D., I had to do a ton of reading. I was basically reading 5+ papers/day trying to catch up with the cutting edge stuff of the field. Many of these papers required a lot of mental capacity to get through long, dense theoretical sections, complicated math derivations, appendices that seemed to never end, etc. For some of the more “mainstream” papers I regularly found good blogposts that would summarize the main ideas in a more digestible form, but on YouTube: there was nothing.

The bigger channels like Siraj Raval and Two Minute Papers were already out there and they’re great if you want to get a glimpse of ‘the surface of things’, but to me they weren’t very helpful in building a stronger, more foundational understanding of the technical underpinnings of these theoretical ideas. I got a bit frustrated: why can’t I just watch a 20-minute video instead of spending 2 hours trying to plow through this incredibly dense academic research paper?

Then, a light sparked: why don’t I simply do it myself?

The truth of the matter is that many of my initial videos were made right after I had learned those same concepts myself. I was basically learning them through reading and then trying to summarize them for myself through making a video. Turns out: if you have to explain something on YouTube, you really have to understand what you’re talking about.

Sayak: Thank you for sharing that, Xander. Really very thoughtful of you! Your explanations are filled with magical intuitions. But you introduce just the right amount of math in there too. Would you like to share your take on grasping the mathematical aspects of machine learning?

Xander: I’m not a mathematician, so I’ll come forward right away and admit: I don’t fully understand every single mathematical derivation of all the papers I read. And honestly, I feel like that’s not always 100% necessary in order to understand and use the corresponding ideas.

Usually, my flow is as follows: I read through a paper quite quickly & diagonally (e.g. 10 mins for a 9-page paper) until I have a basic understanding of what’s going on. If I decide the idea is worth more time, I dig deeper and do a second read-through of e.g. 30 mins, paying closer attention to derivations, tables, etc. At that point, I usually check to see if there’s an open-source implementation on GitHub: for me, nothing helps to better understand a research paper than simply looking at the Python implementation.

Once I have a solid understanding of the implemented ideas, I then decide if it is worth spending time understanding the math by going through the appendices etc. But as you can see, for me this is often the last step: once my mental model already has a good structure, I fill in the gaps with math.

But I’m pretty sure this is a very personal thing: if you’re super strong at math, this story might be the exact opposite. I’ve simply noticed that I’m pretty fast in ‘getting’ the general idea of a paper intuitively without initially needing any mathematical foundation to build it on.

Sayak: This approach is so helpful, Xander, and I am sure the community will find it extremely effective. Fields like machine learning are evolving rapidly. How do you manage to keep track of the latest relevant happenings?

Xander: Great question! I think this is something many people struggle with. The amount of information out there is just insane, way more than any human brain could ever hope to process in a lifetime. The solution is that you have to start building your own digital filter. And what many people don’t realize is that building such a digital filter takes time, you can’t just do it in a day.

I use several different media channels to get to the information that’s relevant for me, I’ve made a small summary of my most used sources here:

And honestly, Twitter is probably my personal favorite. There are a few things I love about Twitter: first, you only have 140 characters to send your message across, which means: no bullshit, straight to the point: I love that. Secondly, Twitter has this amazing property that it’s very easy to directly contact people you’ve never met. Want to send a personal message to Elon Musk? Forget email or letters, Twitter is your best bet!

Finally, while most media platforms leave you completely at the mercy of a recommendation engine that fills your timeline, Twitter gives you the unique option to manage that timeline by carefully following/unfollowing those information sources you find most valuable. Every couple of months I manually go through all the people I’m following and curate that list to stay below 250 people. One of the biggest risks in our digital age is drowning in information.

Everybody has that moment when you have 18 open Chrome tabs of “things I still need to read”. Sometimes it’s good to just close everything and start fresh, give your brain some rest. As Hinton said: read a lot but not too much, or else you’ll start thinking like everybody else and that’s not optimal.

Sayak: I couldn’t agree more on the Twitter part, Xander. To me, it’s hands-down one of the best channels to share information. Being a practitioner, one thing that I often find myself struggling with is learning a new concept. Would you like to share how you approach that process?

Xander: I’m a big fan of the “do first, think later” principle. Usually, I clone a GitHub repo before I even read the paper. Nowadays, with things like Google Colab for example, it’s so incredibly cheap and easy to just play with things without any downsides. I recently picked up Blender (a 3D computer graphics toolset) and I had two options: start going through the official documentation or simply install the thing and start messing around. Obviously I did the second. I had some idea of what I wanted to achieve (create 3D renders of my 2D GAN-based visual art) and so I simply watched a couple of YouTube tutorials (there are some really great channels out there!) and before I really knew what I was doing, I had a solution to my problem: goal achieved. Now, several weeks later, I’m getting pretty good with Blender and I feel it’s time to start diving into the documentation to fill in the gaps that random exploration usually won’t fill.

Sayak: This is lovely! I like to take a slightly different approach, though. I first get myself a basic introduction to the tool I am about to try out and then I start poking around with it. Any advice for the beginners?

Xander: Don’t see yourself as a beginner. Most of the technological tools we use today are all pretty new: that means that pretty much everybody is a beginner in some way. Sure, some people have 4 years of experience in TensorFlow, but what is 4 years? TensorFlow 2.0 just came out, and that’s new for everyone, so instead of looking at all the catching up you have to do, just skip to the front of the line and hop on the train, it’s moving fast!

The real challenge is not to become good at something, but to become good at quickly picking up new skills. We’re living in a time where progress is not determined by the power of our new tools, but rather by our ability to learn how to use them to create the next set of tools.

Sayak: That is simply spot-on! Thank you so much, Xander, for doing this interview and for sharing your valuable insights. I hope they will be immensely helpful for the community.

Xander: Always glad to share with the community! It’s a great time to be alive you know? I don’t think there was ever a time when young, curious individuals had such potential to deploy their skills and make an impact on the world!