AI Learning Path

Technology, especially Software Development, is not only growing horizontally and vertically, but it is also getting very segmented and specialized.
Therefore, it is tough for a software developer to decide what to study and at what depth.
As a Full Stack Software Developer / Consultant, I get many questions about which software technologies and methods are worth investing time, energy and possibly money in for educational materials, courses, and certificates. Well, you can start by asking yourself these five questions.

  1. What is the most exciting software technology out there?
  2. Which vendor implementation has the best chance to beat the competition?
  3. How important is it on my professional profile?
  4. How can I sell it to my clients?
  5. How will it increase my bottom line?

AI and Machine Learning are a paradigm shift and will be the biggest revolution in IT history, but they currently still face many challenges. I started to look at Machine Learning and Computer Vision very carefully around the end of 2015.
My friend Zoltan Feher was visiting us for Christmas. We were brainstorming about a home appliance product that would require computer vision and machine learning, which led me to consider AI to solve the problem. Consequently, this answered question #1. Answering question #2 was much tougher, so I chose a safe and tried product, the MATLAB implementation of a Convolutional Neural Network, not knowing anything about how it worked. I knew for sure that it was not the right one, but at that time nothing else seemed appealing and it would do for the time being. Answering the first two questions was compelling enough for me to decide that AI and Machine Learning were worth my time, energy, and money. Your case might be different and will depend on your entrepreneurial spirit and circumstances.
It is hard to sift through the various tools, products, and educational courses; many were stars in the past but irrelevant by 2015. After long, tedious but very thorough research, I found the learning path that suited me based on my plans. I found my decision validated by many industry leaders and forums. Yours might be different.

In this post, I will discuss the three major AI learning paths, what each involves, and their pros and cons. However, before that, let’s put things into perspective: why use AI, why get into AI, and why now.

Why Use AI?

There are problems that traditional programming cannot solve, but machine learning can. A data scientist can categorize the problems by asking five simple questions. Brandon Rohrer has an excellent post [2] where he explains in great detail what those five items are, but here is a summary.

5 Questions of a data scientist [00:18:19]

  1. Which category? – Classification (binary, multi-class)
  2. Is it weird? – Anomaly
  3. Predict how much/many? – Regression
  4. How is this related (Data structure)? – Clustering, Recommender System
  5. What’s next? – Reinforcement Learning

Let’s look at these in detail.

  1. Classification
    Sometimes we need to classify things like classifying patients as diseased or not. Classification: the output variable takes class labels.
  2. Regression
    Another question we can ask is a kind of predict-the-future problem. For example, the price of a house, which depends on its size and location, is a numerical value that can be continuous. Regression: the output variable takes continuous values.
  3. Anomaly
    Another class of algorithms is anomaly detectors (also known as outlier detection). They flag items that do not conform to an expected pattern or to the other items in a dataset. Typical problems are bank fraud, spam, structural defects, medical conditions, or errors in a text.
  4. Clustering
    Separate the dataset into different groups (clusters) based on similarities.
  5. Recommender System
    Think of how Netflix and Amazon are recommending a movie or a product that you might like.
  6. Reinforcement Learning
    The spookiest of them all! Go is far more complex than chess. AlphaGo beat Lee Sedol, one of the world’s top Go players, in March 2016 and the world number one, Ke Jie, in May 2017 [22]. It got better at the game by playing against instances of itself. That was pretty wild. That’s an example of reinforcement learning.
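To make two of these categories concrete, here is a minimal sketch in plain Python. It is not how a production system would do it: the nearest-centroid classifier and the z-score rule are simple stand-ins chosen for readability, and the patient and transaction numbers are made up for illustration.

```python
from statistics import mean, pstdev


def train_nearest_centroid(samples):
    """samples: list of (features, label) pairs. Returns {label: centroid}."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    # The centroid of a class is the per-feature mean of its samples.
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def classify(centroids, features):
    """Classification: the output is the label of the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def find_anomalies(values, threshold=2.0):
    """Anomaly detection: values more than `threshold` standard deviations
    away from the mean. A crude rule, used here only for illustration."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical patients: (body temperature in Celsius, resting heart rate)
patients = [
    ((36.6, 62.0), "healthy"), ((36.8, 70.0), "healthy"), ((37.0, 66.0), "healthy"),
    ((38.9, 95.0), "diseased"), ((39.4, 102.0), "diseased"), ((38.6, 99.0), "diseased"),
]
centroids = train_nearest_centroid(patients)
print(classify(centroids, (39.0, 97.0)))  # diseased
print(classify(centroids, (36.7, 64.0)))  # healthy

# Hypothetical card transactions in dollars
transactions = [12.5, 9.9, 14.2, 11.0, 13.7, 10.4, 950.0, 12.1]
print(find_anomalies(transactions))  # [950.0]
```

The classifier answers "which category?" with a label, while the anomaly detector answers "is it weird?" by flagging the one transaction that does not fit the rest.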
5 Questions of a Data Scientist

How does data science work?

Algorithm = Recipe
Your data = Ingredients
Computer = Blender
Your answer = Smoothie

Why Get Into AI?

These five questions demonstrate the power of machine learning, and we cannot do any of these with traditional computer science. For example, we can do regression in Excel, but Excel cannot handle planet-scale data; it was just not designed for it (sorry Charles Simonyi, my Hungarian countryman). Therefore we create models that have already distilled all the relevant parts of the data, so that running them on new information is very efficient. Distilling the data into a model is the process of training, which is very computationally expensive and hard.
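The split between expensive training and cheap prediction can be sketched with the regression example. This is only an illustration under simple assumptions: a one-variable least-squares fit, and made-up house data that follows price = 3 * size exactly so the result is easy to check by hand.

```python
def train_linear_model(sizes, prices):
    """Training: read every data point and distill them into two numbers,
    the slope and intercept of price = slope * size + intercept."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(prices) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
    variance = sum((x - mean_x) ** 2 for x in sizes)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, size):
    """Prediction: one multiplication and one addition, however large
    the training set was."""
    slope, intercept = model
    return slope * size + intercept


# Hypothetical data: size in square meters, price in thousands of dollars
sizes = [50, 80, 100, 120, 150]
prices = [150, 240, 300, 360, 450]

model = train_linear_model(sizes, prices)
print(model)                # (3.0, 0.0)
print(predict(model, 110))  # 330.0
```

The training function has to touch all the data, which is what becomes costly at scale. The resulting model is tiny and can be applied to new information almost for free.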

So all of this stuff is mainly things that we cannot do today with traditional computer science.
Therefore, if nothing else this would be a very good reason to get into AI!

Why Now?

The image below describes the onion relationship among AI, Machine Learning, and Artificial Neural Networks (Neural Networks in short).

AI layers

Neural Networks did not work well in the past. They do now and here is why.
Three catalysts are creating and maintaining this explosion:

  1. New Combo of Math
    the spark
  2. Big Data
    the fuel
  3. Massive Computation
    the horsepower

Learning Strategies

There are two distinct approaches to understanding Machine Learning:

  • Top-down
    1. Learn the high-level process of applied machine learning.
    2. Find out how to use a tool well enough to be able to work through problems.
    3. Practice on datasets, a lot.
    4. Transition into the details and theory of machine learning algorithms.
  • Bottom-up
    Start with the theory first.

There are two sides to Machine Learning

  • Practical Machine Learning
    Activities are cleaning data, querying databases, transforming data, and gluing algorithms and libraries together. Also, you need to write custom code to get reliable answers from data to satisfy challenging and ill-defined questions. It’s messy but real.
  • Theoretical Machine Learning
    It is about academic-level math, abstractions, idealized scenarios, limits, and beauty, and it informs what is possible. It is a whole lot neater and cleaner, and removed from the mess of reality.

Based on my research, motivation, determination and goals, I chose the bottom-up theoretical machine learning path. That is much harder than the top-down practical machine learning path but far more rewarding, because it gives a thorough understanding of the theories, models, and hyperparameter tweaking. It also requires an enormous initial investment of your time and energy, and some money.
Learning the practical side of data cleaning and feature detection, and understanding the various models better, will be an iterative process. Along the way you develop the intuition that is very beneficial in the optimization process.

We will do these iterations together in my upcoming posts.

A complete novice who is not sure if studying machine learning is worth the energy that it requires to be good, great, or excellent at it, should start with the top-down practical machine learning path.

These two quotes from Andrew Ng are very appropriate:

“Machine learning has matured to the point by where if you take one class you can actually become pretty good at applying it.”

Familiarity with algebra and probability is certainly helpful, he added, but the only real prerequisite to his course is a basic understanding of programming.

Machine learning is becoming “one of the more highly sought-after skills in Silicon Valley.”

The process can be incredibly intimidating!
The way we will make software in the very near future is not the way we have made software in the past.
We have been studying computer science, and this new thing is an empirical science. It is a different science, and all the buzzwords are different. We have to learn all of these new buzzwords.

AI Buzzwords

The good news is that you do not have to learn them all at once. It is not all or nothing. Let’s go back to the goal. The goal is that you want to make your software intelligent. So there are ways Microsoft can help you without you having to make the full jump into data science. For one thing, there are the Cognitive Services.
Now let’s get back to discussing the various learning paths and levels of immersion into data science.

Three Learning Paths

1. AI Consumer

It is a top-down practical machine learning path. You can be good at it by using the Cognitive Services APIs. These are canned, pre-trained models that let you build apps with robust algorithms using just a few lines of code. Essentially, you become a Cognitive Services consumer. At Build 2017, Microsoft introduced custom services that opened up the capabilities even further. Here is a snapshot of the available services.

Cognitive Services list

You can get started by looking at the documentation and tutorials at the Cognitive Services Documentation.
There is also the Microsoft Virtual Academy with the Azure Developer Workshop (Storage, Cognitive, ML, Stream Analytics, Containers, and Docker) video course.
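
To give a feel for the "few lines of code" claim, here is a hedged sketch of calling the Computer Vision analyze endpoint with nothing but the Python standard library. The region (westus), the API version (v1.0) and the chosen visualFeatures values are assumptions that may not match your subscription or the current service, so check the Cognitive Services documentation before relying on them. The function names are my own and are not part of any SDK.

```python
import json
import urllib.request


def build_analyze_request(image_url, subscription_key, region="westus"):
    """Assemble the endpoint, headers and body of an 'analyze' call.

    Region, API version and visualFeatures are assumptions; adjust them
    to whatever the documentation lists for your subscription.
    """
    endpoint = (
        f"https://{region}.api.cognitive.microsoft.com"
        "/vision/v1.0/analyze?visualFeatures=Description,Tags"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/json",
    }
    body = json.dumps({"url": image_url}).encode("utf-8")
    return endpoint, headers, body


def analyze_image(image_url, subscription_key, region="westus"):
    """Send the request and return the parsed JSON response."""
    endpoint, headers, body = build_analyze_request(
        image_url, subscription_key, region
    )
    request = urllib.request.Request(
        endpoint, data=body, headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling analyze_image with the URL of a publicly reachable image and a valid subscription key should return a JSON document describing the image. No model training is involved on your side, which is the whole point of the AI Consumer path.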

When you outgrow the capabilities and accuracy of Cognitive Services, even with the new custom vision and speech APIs, it is time to roll up your sleeves, get your hands dirty, and learn some data science.

2. Data Scientist

The TopTal definition of a Data Scientist is

A Data Scientist is someone who makes value out of data. Data Scientist duties typically include creating various Machine Learning-based tools or processes within the company, such as recommendation engines or automated lead scoring systems. People within this role should also be able to perform statistical analysis.

How to transition from Developer to Data Scientist? [00:32:20]

In the past, there was a plethora of data science and machine learning tools available, like MATLAB, Octave, and Python packages like scikit-learn. As part of the process, data scientists had to be intimately familiar with the data to be analyzed. MATLAB and Octave now have built-in statistical analysis and plotting capabilities. For Python, there are numerous data analysis and visualization packages available, like NumPy, Matplotlib, pandas, Seaborn, Plotly and Cufflinks (my favorite).

As a data scientist there are many questions to ask:

  • How to organize data for machine learning? [6]
  • How to clean data? [7]
  • How to handle missing values? [8]
  • How does feature engineering work? [9]
  • How to get good quality data? [10]
  • Why visualize data? [11]

You can do all of these steps and more by using Azure Machine Learning Studio. You can get a great taste for data science, so get rocking. Here are the capabilities:

ML Studio

There is a lot of documentation, and there are many tutorials and training videos available, for learning how to become a Data Scientist. If you are serious about it and are pursuing this as a career, I would recommend the Microsoft Professional Program for Data Science online course, a Massive Open Online Course (MOOC). MOOCs are concentrated, long-term courses consisting of many video lectures.

Advice for Data Scientist Learning Path

Start with the Cognitive Services. There’s a lot of good stuff there. They may solve your problems. You might be done. However, my advice is, don’t be done. Even if it is solving all the problems that are immediately in front of you, my advice is to go further down the path. Moreover, I think that Azure ML Studio is an excellent way to do it. It is like VB was 20 years ago. It has amazing visualization capabilities that bring in many non-professionals and make them productive. Start to go through this process. Play with data, visualize data. Now that you have this vocabulary to work with, new people are going to be in your life. Also, check out MicrosoftML [19].
The next step is to become an expert Machine Learning Engineer. Microsoft has stuff that rewards the experts.

AI Learning Path

3. Machine Learning Engineer

The next and final level is the Machine Learning Engineer.
In addition to Data Scientist skills, the Machine Learning Engineer also has to be able to design models, tweak hyperparameters, and package machine learning models to be consumed by data scientists and developers.
Here are the essential skills [17]:

  1. Computer Science Fundamentals and Programming
  2. Probability and Statistics
  3. Data Modeling and Evaluation
  4. Applying Machine Learning Algorithms and Libraries
  5. Software Engineering and System Design
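
Of the tasks mentioned above, hyperparameter tweaking is the easiest to show in a few lines. The following is a toy sketch, not a recipe: it fits y = w * x with plain gradient descent and sweeps one hyperparameter, the learning rate, picking the value with the lowest error on a held-out validation set. The data, the candidate learning rates and the epoch count are all made up for illustration.

```python
def train(points, learning_rate, epochs=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(points)
    for _ in range(epochs):
        gradient = sum(2 * (w * x - y) * x for x, y in points) / n
        w -= learning_rate * gradient
    return w


def mse(points, w):
    """Mean squared error of the model y = w * x on the given points."""
    return sum((w * x - y) ** 2 for x, y in points) / len(points)


# Hypothetical data that follows y = 2 * x
train_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
validation_set = [(4.0, 8.0), (5.0, 10.0)]

# Sweep the learning rate and score each candidate on the validation set.
results = {}
for lr in (0.0001, 0.01, 0.1, 0.5):
    w = train(train_set, lr)
    results[lr] = mse(validation_set, w)

best_lr = min(results, key=results.get)
print(best_lr)
```

On this toy data the smallest learning rate has not converged after 200 epochs and the largest one diverges, which is the typical trade-off the sweep is meant to expose. Real models have many more hyperparameters, but the loop has the same shape.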

There are a lot of MOOCs available to teach you the essential skill set, but I have not found any that teaches everything. Since I have a Master’s degree in Electrical Engineering, I studied most of the math I needed 30 years ago, but I still needed some refresher tutorials. I had to put together my own Machine Learning Engineer curriculum.

Recommended Courses and resources:

Also, check out [13] for a very exhaustive list of resources.

Microsoft Cognitive Toolkit, previously known as CNTK, empowers you, just like Google’s TensorFlow, to develop sophisticated neural network models that run on your local GPU, on Azure, and inside Azure Data Lake and SQL Server 2017 via U-SQL and Python respectively. It takes advantage of massively parallel setups particularly well. I highly recommend checking out the new Cognitive Toolkit course on edX, Deep Learning Explained, by Sayan Pathak, Ph.D., Roland Fernandez and Jonathan Sanito.

Deep Learning Explained

Batch AI Training

Cognitive Toolkit shines with multi-GPU, multi-server cluster training. That is where the rubber meets the road, and you have to be a Machine Learning Engineer to take advantage of this technological marvel. This is now possible with Batch AI Training, consisting of ND- and NC-series Azure nodes with NVIDIA Tesla P40 and P100 GPUs, respectively [18].

Instance     Cores   GPUs      Memory   Network
ND6s         6       1 P40     112 GB   Azure Network
ND12s        12      2 P40     224 GB   Azure Network
ND24s        24      4 P40     448 GB   Azure Network
ND24rs       24      4 P40     448 GB   InfiniBand
NC6s_v2      6       1 P100    112 GB   Azure Network
NC12s_v2     12      2 P100    224 GB   Azure Network
NC24s_v2     24      4 P100    448 GB   Azure Network
NC24rs_v2    24      4 P100    448 GB   InfiniBand

We will be discussing Cognitive Toolkit extensively in upcoming posts.

Please watch the Navigating the AI Revolution session to see the learning paths other developers took to become an AI Consumer, Data Scientist or Machine Learning Engineer.

How fast is AI moving?

In the summer of 2012, the Google Brain project recognized cats in images, which required 16K CPUs and a dozen scientists. In the winter of 2012, Geoffrey Hinton, Alex Krizhevsky and Ilya Sutskever did the same with only 3 GPUs, using a different technique called a Convolutional Neural Network (CNN).
The other advantage is accuracy.
For relatively small amounts of data, traditional ML and neural networks reach about the same accuracy. However, traditional ML then tops out, while neural networks keep getting better with more data, and that is why we keep throwing more data at them.

Deep Learning Accuracy
Deep Learning Accuracy


AI is just unstoppable. It will just keep getting better, better, and better. Also, the things it can do and do well are just increasing.
This paradigm shift will be the biggest revolution in IT history, and it is coming like a tsunami, sweeping everything in its path.

Please join me for a great ride at the top of the wave! Let’s create the future together! [21]


  1. Navigating the AI Revolution – Bill Barnes, Micheleen Harris
  2. Five Questions Data Science Answers – Brandon Rohrer
  3. Data Science and Robots – Brandon Rohrer
  4. How to choose algorithms for Microsoft Azure Machine Learning – Gary Ericson, Larry Franks, Brandon Rohrer
  5. Find an Algorithm that Fits – Brandon Rohrer
  6. How to organize data for machine learning – Brandon Rohrer
  7. How to clean data – Brandon Rohrer
  8. How to handle missing values – Brandon Rohrer
  9. How feature engineering works – Brandon Rohrer
  10. How to get good quality data – Brandon Rohrer
  11. Why visualize data – Brandon Rohrer
  12. Machine learning algorithm cheat sheet for Microsoft Azure Machine Learning Studio – Gary Ericson, Larry Franks, C.J. Gronlund, Brandon Rohrer
  13. Machine Learning for Software Engineers – Nam Vu
  14. Applied Machine Learning Process – Jason Brownlee
  15. 4-Steps to Get Started in Machine Learning – Jason Brownlee
  16. What is Azure Machine Learning Studio? – Gary Ericson, Larry Franks, Paulette McKay
  17. 5 Skills You Need to Become a Machine Learning Engineer – Arpan Chakraborty
  18. More GPUs, more power, more intelligence – Corey Sanders
  19. Introduction to MicrosoftML – Brad Severtson
  20. An Introduction to Statistical Learning – Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
  21. Microsoft: Productivity Future Vision – Dr. Craig Sanderson, Dr. Andrew Phillips, Dr. J Helen Fitton, Dr. Pia Winberg, Dr. Sita Narayan-Dinanauth, Dr. Damien Stringer
  22. Google’s AlphaGo AI defeats the world Go number one Ke Jie – Sam Byford (The Verge)
  23. The best Data Science courses on the Internet – David Venturi
  24. Every single Machine Learning course on the internet – David Venturi
