What is Machine Learning? Here are 4 Tips to Get Started
You might not realize it, but machine learning (ML) has already ingrained itself as a key part of your daily life. Like watching TV? Machine learning has made television more watchable than ever. Enjoy shopping online? You can thank ML for those personalized recommendations. This same technology also helps protect your credit card against fraud and understand your voice commands to, for example, find out when and where the newest hit movie is playing in a theater near you.
As much as machine learning makes a difference in our lives, however, some confusion still lingers. One question that comes up time and again is, “Are machine learning and artificial intelligence (AI) the same?” They are different, it turns out: ML is a subset of AI. Another common question is what counts as “machine learning” as opposed to “deep learning (DL)”. Deep learning is itself a subset of machine learning, so the line between the two is a bit trickier to pin down.
The questions go on and on. We wanted to take a few minutes to provide the answers. In this blog post, we’ll give four tips to help you learn some foundations of machine learning – and what makes it different from AI, DL, and other related acronyms.
1. Know This 1-Sentence Definition of Machine Learning
So, what is machine learning, you might ask? Let’s start out with a simple, one-sentence definition:
Machine Learning (n.) – A subset of artificial intelligence that helps computers perform specific tasks without task-specific instructions.
Pretty cut-and-dried, right? As the name implies, machine learning helps machines learn how to make predictions and decisions without explicit orders from a human. Instead, ML depends on extracting and classifying unique features within its “training data” – a sample of data used to build a machine learning model. After this model is sufficiently trained, it is able to make predictions or take actions – which ideally increase in accuracy – without explicit instructions.
Now, we can start building on this definition.
2. Understand Common Types of Machine Learning Algorithms
Let’s take a look at three broad classes of machine learning algorithms: supervised learning, unsupervised learning, and reinforcement learning. These techniques differ in a few areas, such as the type of data input, output, and problem being solved. We’ll take a deeper look into each class.
In supervised learning, the focus is on using a set of inputs and desired outputs to build a mathematical model. This is done by using “training data”, which is usually represented as a matrix of “training examples”. Each training example, a vector in the matrix, consists of labelled data in the form of inputs and a single, desired output [1].
For example, consider the task of determining whether or not an oil palm plantation [2] appears in a satellite image. In this case, the training input would include satellite images with and without an oil palm plantation. Each training example would have an output label that marked whether or not the image contained a plantation.
In general, outputs in supervised learning could be restricted to a set of values (a classification problem) or a numerical value that falls within a certain range (a regression problem). By iterating over the training data and adjusting the model to reduce its prediction error, a supervised learning algorithm can get closer and closer to a function that correctly predicts outputs for new inputs. If the model improves its accuracy over time, it is thought of as having “learned to perform a given task”.
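To make the regression case concrete, here is a minimal sketch in plain Python. The training examples are made up for illustration (they follow y = 2x + 1 exactly), and the function names, learning rate, and epoch count are assumptions rather than anything prescribed by a particular library. The model is a straight line fitted by gradient descent on the mean squared error.

```python
# Labelled training examples: each pair is (input, desired output).
# Hypothetical data that follows y = 2x + 1 exactly.
training_examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def fit_linear_model(examples, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        # Move the parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_linear_model(training_examples)
print("learned slope and intercept:", round(w, 3), round(b, 3))
print("prediction for a new input x = 10:", round(w * 10.0 + b, 2))
```

With this toy data, the learned slope and intercept should land very close to 2 and 1. A classification problem would follow the same loop, with labels drawn from a fixed set of values and a different error measure.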
Unsupervised learning shares much in common with its supervised counterpart, with one big exception: labelled data. Whereas supervised learning makes use of labelled inputs and outputs, unsupervised learning takes in raw data that hasn’t been labelled, classified, or categorized. Instead, this type of algorithm must find structure in its training data inputs.
One way to reveal these patterns is through a process known as “clustering”. By picking up on unique features within the data, clustering puts inputs into categories to create structure [3]. In some cases, the algorithm might group the data into an overly rigid or loose framework – these relate to overfitting or underfitting a model, respectively. One potential remedy is dimensionality reduction: the process of finding the smallest set of key features in a set of data inputs for a given ML problem. By focusing on the features in the dataset that have the highest impact, the model raises its probability of accurately fitting the data.
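As an illustration of clustering, the sketch below uses k-means, one common clustering algorithm (the post does not single out a specific one, so this choice is an assumption). The points and starting centroids are made up, and the function name is hypothetical.

```python
import math


def k_means(points, initial_centroids, iterations=10):
    """Group unlabelled points around the nearest of k centroids."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Unlabelled 2-D inputs (hypothetical): two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, initial_centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are supplied anywhere; the two groups emerge only from the distances between the points. In practice the starting centroids are usually chosen at random, which is why they are passed in explicitly here to keep the example repeatable.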
Autonomous vehicles that read and react to changing road conditions. Intelligent agents that learn how to play games against human opponents. These are just a couple of examples of reinforcement learning in action. This approach is based on the idea that, given an environment, an agent (i.e., a software program) should take the actions that maximize some cumulative reward (or, equivalently, minimize a cumulative cost) [4]. An example could be a massively multiplayer online role-playing game where the goal is to minimize the amount of damage a player receives in battle.
In reinforcement learning, the environment is often modeled as a Markov Decision Process (MDP) – a mathematical framework that describes the environment in terms of states, actions, transition probabilities, and rewards. Learning then reinforces actions that lead to higher reward and dampens those that lead to lower reward (hence the name, reinforcement learning) [5]. Another method, dynamic programming, is often used within this class. It finds an optimal or near-optimal solution to a complicated problem by splitting it into many smaller sub-problems and recursively combining the best solutions to those sub-problems [6].
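The sketch below applies value iteration, a dynamic programming method, to a tiny made-up Markov Decision Process: a chain of four states in which reaching the last state pays a reward of 1. The environment, its deterministic transitions, the discount factor, and all names are assumptions chosen to keep the example small.

```python
GAMMA = 0.9          # discount factor for future rewards (assumed value)
TERMINAL = 3         # reaching state 3 ends the episode
ACTIONS = ("left", "right")


def step(state, action):
    """Return (next_state, reward) for the deterministic toy chain 0-1-2-3."""
    next_state = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(tolerance=1e-9):
    """Repeatedly back up state values until they stop changing."""
    values = {s: 0.0 for s in range(TERMINAL + 1)}
    while True:
        biggest_change = 0.0
        for s in range(TERMINAL):  # the terminal state keeps value 0
            best = max(r + GAMMA * values[ns]
                       for ns, r in (step(s, a) for a in ACTIONS))
            biggest_change = max(biggest_change, abs(best - values[s]))
            values[s] = best
        if biggest_change < tolerance:
            return values


def greedy_policy(values):
    """Pick, in every state, the action with the best one-step lookahead."""
    def lookahead(s, a):
        ns, r = step(s, a)
        return r + GAMMA * values[ns]
    return {s: max(ACTIONS, key=lambda a: lookahead(s, a))
            for s in range(TERMINAL)}


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

Each state's value is built from the values of the states it can reach, which is the "split into smaller sub-problems" idea in action. Here the resulting policy moves right in every state, and states closer to the goal end up with higher values.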
3. Create Machine Learning Models
Machine learning is more than just algorithms mindlessly crunching data. In fact, a given ML implementation might use multiple algorithms for prediction and decision making. There has to be a way for multiple algorithms to run on complex data sets and easily communicate with each other.
Enter: mathematical models. These models make sure machine learning that seems good in theory doesn’t fall apart in practice. While many machine learning models exist, we’ll cover three common types.
Artificial Neural Network
- Input layer
- Hidden layer(s)
- Output layer
In Brief: Artificial Neural Networks
Based loosely on the human brain, an ANN is a computing system made of nodes and edges that processes input through multiple layers to produce a prediction or decision output [7]. Nodes can take one or more signals as input and generate an output using a non-linear activation function. Edges have weights, and nodes typically have bias terms – all real numbers – that influence signal strength.
How It Works:
- Data enters the network through the input layer.
- Inputs move to the nodes in the hidden layer(s).
- Each hidden node receives one or more signals as input and produces an output based on its activation function.
- Signals are passed through the hidden layer(s) to the output layer.
- Using backpropagation, the weights adjust as the network learns, based on the error between the network’s output and the desired output.
- The process repeats until the objective function of the ANN stops improving or another stopping criterion is met.
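The steps above can be sketched as a very small network written in plain Python. Everything here is an illustrative assumption: the XOR-style training data, the three hidden nodes, the sigmoid activation, the learning rate, and the function names. Real projects would normally rely on an established library instead.

```python
import math
import random


def sigmoid(z):
    """Non-linear activation function that squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_inputs, n_hidden, seed=0):
    """Create small random edge weights and zero biases."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def forward(net, x):
    """Input layer -> hidden layer -> output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    output = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden))
                     + net["b_out"])
    return hidden, output


def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(net, data, learning_rate=0.5, epochs=2000):
    """Backpropagation with one weight update per training example."""
    for _ in range(epochs):
        for x, y in data:
            hidden, output = forward(net, x)
            # Error signal at the output node for the loss 0.5 * (output - y)^2.
            delta_out = (output - y) * output * (1.0 - output)
            # Propagate the error signal back to each hidden node.
            delta_hidden = [delta_out * w * h * (1.0 - h)
                            for w, h in zip(net["w_out"], hidden)]
            for j, h in enumerate(hidden):
                net["w_out"][j] -= learning_rate * delta_out * h
                net["b_hidden"][j] -= learning_rate * delta_hidden[j]
                for i, xi in enumerate(x):
                    net["w_hidden"][j][i] -= learning_rate * delta_hidden[j] * xi
            net["b_out"] -= learning_rate * delta_out


# Hypothetical training data: the XOR function of two inputs.
xor_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
net = init_network(n_inputs=2, n_hidden=3)
loss_before = mean_squared_error(net, xor_data)
train(net, xor_data)
loss_after = mean_squared_error(net, xor_data)
print("loss before training:", loss_before)
print("loss after training:", loss_after)
```

The error after training should be lower than before. How close it gets to zero depends on the random starting weights and the number of epochs, so treat the settings as a starting point for experimentation.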
Support Vector Machine (SVM)
In Brief: Support Vector Machine
A supervised learning method, an SVM uses a set of related techniques for linear classification and regression; in its basic form, it assigns data to one of two categories [8].
How It Works:
- Training examples are created, such that each example is a pair of a data point and one of two category labels.
- Using training data, the SVM algorithm finds the hyperplane that maximizes the distance between it and the nearest data point on each side. In this way, training data falls into one of two categories.
- With new, uncategorized data, the SVM algorithm uses either linear or non-linear classification (for example, by way of a kernel function) to predict which category the input falls into.
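Here is a minimal sketch of the linear case, again in plain Python. It trains by subgradient descent on the regularized hinge loss, which is one simple way to approximate a maximum-margin hyperplane and not the only way SVMs are solved. The data points, learning rate, regularization strength, and function names are all assumptions.

```python
def train_linear_svm(examples, learning_rate=0.1, regularization=0.01, epochs=1000):
    """Approximate a maximum-margin hyperplane w.x + b = 0 for labels +1 / -1."""
    n_features = len(examples[0][0])
    n = len(examples)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        # The regularization term favors a small w, which means a wide margin.
        grad_w = [regularization * wi for wi in w]
        grad_b = 0.0
        for x, y in examples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # the point is inside the margin or misclassified
                for i, xi in enumerate(x):
                    grad_w[i] -= y * xi / n
                grad_b -= y / n
        w = [wi - learning_rate * gi for wi, gi in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Return the category (+1 or -1) for a new data point."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical training examples: (data point, category label).
examples = [((0.0, 0.0), -1), ((1.0, 0.0), -1), ((0.0, 1.0), -1),
            ((3.0, 3.0), 1), ((4.0, 3.0), 1), ((3.0, 4.0), 1)]
w, b = train_linear_svm(examples)
print("hyperplane:", w, b)
print("category for (5, 5):", predict(w, b, (5.0, 5.0)))
print("category for (-1, -1):", predict(w, b, (-1.0, -1.0)))
```

Only the points closest to the boundary keep influencing the hyperplane once training settles, which mirrors the "nearest data point on each side" idea in the steps above.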
Bayesian Network (BN)
- Directed acyclic graph
- Random variables
- Conditional independence
In Brief: Bayesian Network
Using a directed acyclic graph – a set of nodes and directed edges such that no path leads from a node back to itself – a BN uses a set of random variables and given probabilities to represent the relationships between a series of inputs and outputs [9].
Also known as a “belief network” or “causal network”.
How It Works:
- Training data is constructed such that each example contains one or more random variables and associated probabilities of occurring.
- Using the training data, a model is built that represents the set of random variables and their conditional independences, such that no cycles occur in the graph.
- Machine learning algorithms perform inference and learning on the network.
- With new data, the Bayesian Network can provide the probability of any event represented in the network.
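To show what inference on such a network can look like, the sketch below hard-codes a three-variable network (rain and sprinkler both influence wet grass) and answers queries by enumerating every possible assignment. The variables, the probabilities, and the function names are invented for illustration; in practice the probabilities would be learned from training data.

```python
from itertools import product

# Hypothetical probabilities for a network with edges
# rain -> wet and sprinkler -> wet (no cycles).
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET_GIVEN = {  # P(wet grass | rain, sprinkler)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint_probability(rain, sprinkler, wet):
    """Multiply each variable's probability given its parents in the graph."""
    p = P_RAIN if rain else 1.0 - P_RAIN
    p *= P_SPRINKLER if sprinkler else 1.0 - P_SPRINKLER
    p_wet = P_WET_GIVEN[(rain, sprinkler)]
    p *= p_wet if wet else 1.0 - p_wet
    return p


def probability(query, evidence=None):
    """P(query | evidence), summing the joint over all consistent assignments."""
    evidence = evidence or {}
    names = ("rain", "sprinkler", "wet")
    numerator = 0.0
    denominator = 0.0
    for values in product((True, False), repeat=len(names)):
        assignment = dict(zip(names, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # skip assignments that contradict the evidence
        p = joint_probability(**assignment)
        denominator += p
        if all(assignment[k] == v for k, v in query.items()):
            numerator += p
    return numerator / denominator


p_wet = probability({"wet": True})
p_rain_given_wet = probability({"rain": True}, evidence={"wet": True})
print("P(wet):", p_wet)
print("P(rain | wet):", p_rain_given_wet)
```

With these made-up numbers, seeing wet grass raises the probability of rain from 0.2 to roughly 0.74. Enumeration is only practical for tiny networks; larger ones need more efficient inference algorithms.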
4. Figure Out Your Business Use Case(s)
“You have to evaluate the personality of the data,” says Janet George, Fellow and Chief Data Scientist/Officer at Western Digital.
She points out the need to look closely at data: its signals, patterns, and linearity (or nonlinearity). This helps you understand the data that can be passed on to a machine learning model and classify the problem that needs to be solved.
“First of all, figure out the class of problem,” Janet adds. “Is it a classification, clustering, prediction, or other type of problem?” She makes the case that, sometimes, the data might not support the problem that you’re trying to solve. For example, if you’re looking to solve a business problem about predictions, but your potential training data lacks predictive signals, then your problem could be a mismatch right away for using machine learning.
After you’ve figured out whether or not machine learning can help solve your challenge, Janet suggests that you “go to open source and start pulling out the specific algorithm(s) you want to help you.” From there, you can start experimenting with machine learning models [10] and fine-tune them to meet your specific business needs.
Read More on Machine Learning
- Learn about Western Digital’s future in AI and ML
- Find out how to understand access patterns in machine learning
[1] What is supervised learning? https://searchenterpriseai.techtarget.com/definition/supervised-learning
[2] WIDS 2019 Datathon. https://www.widsconference.org/datathon.html
[3] Unsupervised learning. https://www.mathworks.com/discovery/unsupervised-learning.html
[4] Reinforcement learning. https://www.geeksforgeeks.org/what-is-reinforcement-learning/
[5] Reinforcement Learning Demystified: Markov Decision Process (Part 1). https://towardsdatascience.com/reinforcement-learning-demystified-markov-decision-processes-part-1-bf00dda41690
[6] Tutorial for Dynamic Programming. https://www.topcoder.com/community/competitive-programming/tutorials/dynamic-programming-from-novice-to-advanced/
[7] What is an artificial neural network? https://www.digitaltrends.com/cool-tech/what-is-an-artificial-neural-network/
[8] Understanding Support Vector Machine algorithm from examples (along with code). https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
[9] What are Bayesian Networks? http://www.cs.cmu.edu/afs/cs.cmu.edu/project/learn-43/lib/photoz/.g/web/glossary/bayesnet.html
[10] Top 20 Python AI and Machine Learning Open Source Projects. https://www.kdnuggets.com/2018/02/top-20-python-ai-machine-learning-open-source-projects.html