Overview of Udacity Artificial Intelligence Engineer Nanodegree, Term 2

After finishing the first term of the Artificial Intelligence Nanodegree, I enrolled in the second term with a clear expectation: to learn more about deep neural networks, particularly convolutional and recurrent neural networks. Besides, it wouldn’t be logical to stop halfway 🙂

Waiting time 1

I finished the first term 3 weeks before the deadline and had to wait until Udacity unlocked term 2. I used that time to learn other things and do small projects, but I don’t think it was a very efficient use of time. I would really appreciate it if Udacity unlocked the next term as soon as you finish the first one.

I also asked their support about the opposite situation: is it possible to take a longer break between terms and start later with another cohort? The answer was “yes”, without any penalties. So if you cannot complete both terms in a row, it should not be a problem.

Term 2. Deep Learning and Applications


Term 2 consists of 2 parts:

  • Deep Learning and Applications, where you learn more details about CNNs, RNNs and GANs
  • A concentration part, where you dive deeper into the area that interests you most. You choose between Computer Vision, Natural Language Processing and Voice User Interface. As of today you can take only one concentration, which is a pity, since at least 2 concentrations are absolutely doable from my point of view.

Convolutional Neural Networks

The first section on applying deep learning techniques is about Convolutional Neural Networks. The section is very well structured: you get detailed explanations and work through quizzes and small projects during the lessons. The material quality is definitely higher than in the Deep Learning Foundation Nanodegree. The final project is a dog breed detector where you use transfer learning to achieve reasonable results.

You use Keras for the projects, so you avoid the hassle of working with raw TensorFlow and can concentrate on understanding CNNs instead of learning tools.

The materials were just right for me: the explanations were clear, and the provided references were quite useful as well. Alexis Cook is a very good tutor and I enjoyed the section.

After finishing this section I had the impression that I understood the background of CNNs and their applications. The project was somewhat demanding, but I would have preferred to work with less prepared code and research more myself.

Recurrent Neural Networks and LSTM

The second section was about Recurrent Neural Networks (vanilla and LSTM) and their applications. The structure is similar to the CNN section, with explanations, references, small quizzes and projects.

I must say, in terms of explanation quality it is the best section of the whole course, and I can’t imagine it being done better. Before I started, I had some problems understanding the concepts, and the LSTM architecture was not 100% clear to me either. After finishing, I would say I grasped the idea very well. Jeremy Watt is really great at explaining complex things in a simple way.

BUT!!! The final project on sentiment prediction is really a joke. Almost everything is prepared and you need to do very little to pass. It was really, really disappointing. As good as the learning materials were, the final project was just as bad 🙁

Generative Adversarial Networks

The last section was about Generative Adversarial Networks. It consists of a general introduction, deep convolutional GANs and semi-supervised learning. Since the tutor is Ian Goodfellow, you can expect a high-quality explanation, and by the end you have a good understanding of the concepts behind GANs.

Unfortunately, I had already covered all of this material in the Deep Learning Foundation Nanodegree. I was disappointed, since I expected to get new material for the money and not something I had already learned. Anyway, I refreshed my knowledge and moved on to the concentration part.

Waiting time 2 and concentration

I finished all projects 2 weeks earlier than required and had to wait until the concentration part was unlocked. Again, this is not optimal in terms of time, and I do not see any reason why the next part cannot be unlocked right away.

Concentration selection

You dive deeper by solving either Computer Vision, Natural Language Processing or Voice User Interface tasks. All three concentrations are presented to you by a Udacity tutor and a person from the company that helped to prepare the section (an engineer, or the CEO in the case of Affectiva).

The Computer Vision concentration is made in cooperation with Affectiva, and the starter project is about using their SDK for emotion recognition. The Natural Language Processing (NLP) concentration is made in cooperation with IBM, and the starter project is about using IBM Watson to solve NLP tasks. The Voice User Interface concentration is made in cooperation with Amazon, and the starter project is about using the Alexa Skills Kit to create your own voice interface. All starter projects are mostly for fun and are not evaluated.

Originally, I wanted to go for Computer Vision, but after watching the overviews I changed my mind and selected Voice User Interface. The reasons were that it had the best presentation of the three (from my point of view) and that I had read on the Udacity forum that its capstone project is the most demanding.

Voice User Interface concentration

At the start, you develop a voice interface with Amazon Alexa. Alexa is able to recognise speech, understand it and react appropriately. During the project, you create the voice interaction model for your own skill, which runs on Alexa, and an AWS Lambda function to handle the requests. In effect, you learn to use the Amazon ecosystem to solve the task of speech recognition and understanding. The project is not about Deep Learning but rather about using prepared building blocks. Anyway, it was quite interesting and not very simple.
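To give a rough idea of what the Lambda side of such a skill can look like, here is a minimal sketch in plain Python. It is not the code from the Udacity project: the intent name `GetFactIntent` and the spoken texts are made up for illustration, and only the general shape of the Alexa request and response JSON is assumed.

```python
def build_response(speech_text, end_session=True):
    """Wrap plain text in the JSON envelope a skill sends back to Alexa."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point that AWS Lambda calls for each request from the skill."""
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return build_response("Welcome. Ask me for a fact.", end_session=False)
    if request["type"] == "IntentRequest":
        intent_name = request["intent"]["name"]
        if intent_name == "GetFactIntent":  # hypothetical custom intent
            return build_response("Here is your fact.")
        if intent_name == "AMAZON.HelpIntent":  # built-in Alexa intent
            return build_response("You can ask me for a fact.", end_session=False)
    # Fallback for anything this sketch does not handle.
    return build_response("Sorry, I did not understand that.")
```

The voice interaction model (intents and sample utterances) lives on the Alexa side; the function only receives the already recognised intent and decides what to say back.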

After finishing that very practical section, you dive into the problems of speech recognition. You learn about the challenges, signal analysis, data preparation, feature extraction, phonetics, language models, traditional Automatic Speech Recognition systems (with Hidden Markov Models, for instance) and finally Deep Neural Networks in speech recognition.
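As a small illustration of the signal analysis and feature extraction step, here is a sketch that cuts a signal into overlapping frames, applies a Hamming window and computes one log-energy value per frame. It is my own simplified example, not course material: the function names are made up, the 25 ms frame and 10 ms hop at 16 kHz are just commonly used values I assume here, and a real system would go on to compute spectrogram or MFCC features instead of a single energy value.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split a sequence of samples into overlapping frames of equal length."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


def hamming_window(n):
    """Hamming window coefficients that taper the edges of a frame."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def log_energies(samples, frame_len=400, hop_len=160):
    """Return one log-energy value per windowed frame."""
    window = hamming_window(frame_len)
    energies = []
    for frame in frame_signal(samples, frame_len, hop_len):
        energy = sum((s * w) ** 2 for s, w in zip(frame, window))
        energies.append(math.log(energy + 1e-10))  # small constant avoids log(0)
    return energies


# One second of a 440 Hz tone at 16 kHz stands in for real speech here.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
features = log_energies(tone)
```

Each entry in `features` describes one short slice of audio, and a sequence of such per-frame values is the kind of input that both the HMM-based and the neural network models work on.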

The quality of the material and the provided references is great. The project was demanding, though I wish it had been less thoroughly prepared. I didn’t regret my choice for a second, and I would choose it again if I had to.


The main thing: I feel well prepared to move from pure software engineering into the Machine Learning/Artificial Intelligence area. It is absolutely clear to me that I lack professional experience in solving real-world problems, so the first step would be to work somewhere between software engineering and machine learning, to learn more about the problems that can be solved with machine learning and deep learning techniques.

Udacity prepares you very well to be a practitioner, but I wouldn’t expect completing the course to give you a chance to move into pure research. I would highly recommend completing the Deep Learning Foundation Nanodegree first: it helps a lot with the Artificial Intelligence Engineer Nanodegree if you are new to the area. On the other hand, I needed only 4 months at 10–15 hours per week to finish the course, instead of the 6 months advertised.

I’m still not sure about my investment: the two Nanodegrees cost me 2000 USD. But at least I learned a lot of new things over the last year, and I hope it helps me either in my current career path or in moving into a new area. Anyway, I came to understand that Deep Learning techniques are a must even for a pure software engineer who wants to solve emerging problems 🙂

Source: Deep Learning on Medium