Chapter 1
Your Project
What Are You Trying to Do?
Whether deep learning or machine learning techniques are the better choice depends on your project. While one task alone might be more suited to machine learning, your full application might involve multiple steps that, when taken together, are better suited to deep learning.
In this chapter, we present six common tasks so that you can evaluate which techniques best apply to your project:
- Predicting an output
- Identifying objects
- Moving physically or in a simulation
- Uncovering trends
- Enhancing images or signals
- Responding to speech or text
PREDICT an output based on historical and current data
Example: Use real-time sensor data from a motor to predict remaining useful life for rotating machinery. The Similarity-Based Remaining Useful Life Estimation example uses linear regression.
Applications: Predictive maintenance, financial trading, recommender systems
Input: Sensor data, timestamped financial data, numeric data
Common algorithms: Linear regression, decision trees, support vector machines (SVMs), neural networks, association rules
Typical approach: Machine learning is more common
IDENTIFY objects or actions in image, video, and signal data
Example: Create a computer vision application that can detect vehicles. The Object Detection Using Faster R-CNN Deep Learning example uses a convolutional neural network.
Applications: Advanced driver assistance (ADAS) with object detection, robotics, computer vision perception for image recognition, activity detection, voice biometrics (voiceprint)
Input: Images, videos, signals
Common algorithms: CNNs, clustering, Viola-Jones
Typical approach: Deep learning is more common
MOVE an object physically or in a simulation
Example: Perform robotic path planning to learn the best possible route to a destination. The Reinforcement Learning (Q-Learning) File Exchange submission uses a deep Q network.
Applications: Control systems, robotics in manufacturing, self-driving cars, drones, video games
Input: Mathematical models, sensor data, videos, lidar data
Common algorithms: Reinforcement learning (deep Q networks), artificial neural networks (ANNs), CNNs, recurrent neural networks (RNNs)
Typical approach: Deep learning is more common
UNCOVER trends, sentiments, fraud, or threats
Example: Determine how many topics are present in text data. The Analyze Text Data Using Topic Models example uses the latent Dirichlet allocation (LDA) topic model.
Applications: Natural language processing for safety records, market or medical research, sentiment analysis, cybersecurity, document summarization
Input: Streaming text data, static text data
Common algorithms: RNNs, linear regression, SVMs, naive Bayes, latent Dirichlet allocation, latent semantic analysis, word2vec
Typical approach: Machine learning is more common
ENHANCE images and signals
Example: Create high-resolution images from low-resolution images. The Single Image Super-Resolution Using Deep Learning example uses a very-deep super-resolution (VDSR) neural network.
Applications: Improve image resolution, denoise signals in audio
Input: Images and signal data
Common algorithms: LSTM, CNNs, VDSR neural network
Typical approach: Deep learning is more common
RESPOND to speech or text commands based on context and learned routines
Example: Automatically recognize spoken commands like “on,” “off,” “stop,” and “go.” The Speech Command Recognition Using Deep Learning example uses a CNN.
Applications: Customer care calls, smart devices, virtual assistants, machine translation and dictation
Input: Acoustic data, text data
Common algorithms: RNNs (LSTM algorithms in particular), CNNs, word2vec
Typical approach: Both approaches are used
Guess the Algorithm
A researcher designed a way to take ultra-low-dose CT scans (which reduce the amount of radiation exposure, but also reduce image resolution) and apply image processing techniques to regain image resolution.
What technique did he use?
Dr. Ryohei Nakayama, Ritsumeikan University, used a deep learning algorithm, the convolutional neural network, to reduce radiation exposure by 95% while maintaining a comparable level of diagnostic information.
“To improve the clarity of ultra-low-dose chest CT scans, I applied an approach that uses two CNNs, one targeting the lung areas of the CT images and the other targeting the non-lung area. The image data set that I used to train the CNNs was provided by researchers at Mie University. It consists of 12 image pairs, each of which includes a normal-dose scan and an ultra-low-dose scan of the same tissue. (Because taking a second image means exposing a patient to additional radiation, we had to limit the study to a relatively small subject pool.) Each image in the study was 512 x 512 pixels, and each scan contained 250 images (slices).”
In general, if you have a large data set, deep learning techniques can produce more accurate results than machine learning techniques. Deep learning uses more complex models with more parameters that can be more closely “fit” to the data.
So, how much data is a “large” data set? It depends. Some popular image classification networks available for transfer learning were trained on a data set consisting of 1.2 million images from 1,000 different categories.
If you want to use machine learning and are laser-focused on accuracy, be careful not to overfit your data.
Overfitting happens when your model is fit too closely to your training data and then cannot generalize to a wider data set. The model can’t properly handle new data that doesn’t fit its narrow expectations.
To avoid overfitting from the start, make sure you have plenty of training, validation, and test data. Use the training and validation data first to train the model; the data needs to be representative of your real-world data and you need to have enough of it. Once your model is trained, use test data to check that your model is performing well; the test data should be completely new data.
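The three-way split described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_dataset` and the 60/20/20 proportions are assumptions chosen for the example, not values from the text.

```python
import random


def split_dataset(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle samples and split them into train, validation, and test lists."""
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("val_fraction + test_fraction must be in [0, 1)")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical data set: 100 sample identifiers.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

The test list is set aside before any training happens, so the model never sees it until the final check.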
If you think your model is starting to overfit the data, take a look at:
- Regularization: Penalizes large parameters to help keep the model from relying too heavily on individual data points.
- Dropout probability: In a neural network, randomly drops some units during training so that the model cannot simply memorize the data set.
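Both ideas can be illustrated with a few lines of code. The sketch below is an assumption-laden toy, not a production recipe: `ridge_slope` fits a one-feature line with an L2 penalty (one common form of regularization), and `dropout` applies the "inverted dropout" variant often used when training neural networks, which zeroes activations at random and rescales the survivors. All names and data values are made up for illustration.

```python
import random


def ridge_slope(xs, ys, penalty):
    """Slope w that minimizes sum((y - w*x)**2) + penalty * w**2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def dropout(activations, drop_prob, rng):
    """Zero each activation with probability drop_prob and rescale the rest."""
    keep = 1.0 - drop_prob  # assumes drop_prob < 1
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Toy data, roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
for penalty in (0.0, 1.0, 10.0):
    # A larger penalty pulls the fitted slope toward zero.
    print(penalty, round(ridge_slope(xs, ys, penalty), 3))

print(dropout([1.0, 2.0, 3.0, 4.0], 0.5, random.Random(0)))
```

In both cases the strength of the effect (the penalty or the drop probability) is a setting you would normally tune against the validation data.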
Data scientists often refer to the ability to share and explain results as model interpretability. A model that is easily interpretable has:
- A small number of features that typically are created from some physical understanding of the system
- A transparent decision-making process
Interpretability is important for many health, safety, and financial applications, for example, if you need to:
- Prove that your model complies with government or industry standards
- Explain factors that contributed to a diagnosis
- Show the absence of bias in decision-making
If you must be able to demonstrate the steps the algorithm took to reach a conclusion, focus your attention on machine learning techniques. Decision trees are famously easy to follow down their Boolean paths of “if x, then y.” Traditional statistical techniques such as linear and logistic regression are well accepted. Even random forests are relatively simple to explain if taken one tree at a time.
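To make the "if x, then y" idea concrete, here is a small sketch that walks a decision tree and records each rule it follows. The tree, its feature names (`vibration`, `temperature`), thresholds, and labels are all hypothetical and were invented for this illustration.

```python
def explain(tree, sample):
    """Walk a decision tree and return (prediction, list of rules followed)."""
    path = []
    node = tree
    while "label" not in node:
        feature, threshold = node["feature"], node["threshold"]
        if sample[feature] <= threshold:
            path.append(f"{feature} <= {threshold}")
            node = node["left"]
        else:
            path.append(f"{feature} > {threshold}")
            node = node["right"]
    return node["label"], path


# Hypothetical hand-built tree; a real one would be learned from data.
toy_tree = {
    "feature": "vibration", "threshold": 0.5,
    "left": {"label": "healthy"},
    "right": {
        "feature": "temperature", "threshold": 80,
        "left": {"label": "monitor"},
        "right": {"label": "service"},
    },
}

label, path = explain(toy_tree, {"vibration": 0.9, "temperature": 95})
print(label, path)
```

The returned path is the explanation: each entry is a comparison that a reviewer can check against the input by hand.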
If your project is more suited to a neural network, support vector machine, or other model of similar opacity, you still have options.
Research on interpretability using proxies
Local interpretable model-agnostic explanations (LIME) take a series of individual inputs and outputs to approximate the decision-making.
Another area of research is the use of decision trees as a method to illustrate a more complex model.
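The proxy idea can be sketched in a few lines. The code below is a simplified illustration in the spirit of LIME, not the LIME algorithm or any library's API: it samples inputs near one point of interest, queries an opaque model, and fits a proximity-weighted straight line whose slope serves as a local explanation. The function names, the stand-in `black_box` model, and the kernel width are assumptions.

```python
import math
import random


def local_linear_explanation(model, x0, width=0.2, n_samples=500, seed=0):
    """Fit a proximity-weighted line to model outputs sampled near x0."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0.0, width) for _ in range(n_samples)]
    ys = [model(x) for x in xs]
    # Samples closer to x0 get larger weights (Gaussian kernel).
    ws = [math.exp(-((x - x0) ** 2) / (2 * width ** 2)) for x in xs]
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return slope, intercept


def black_box(x):
    """Stand-in for an opaque model; only its outputs are used above."""
    return math.sin(x) + 0.1 * x ** 2


slope, intercept = local_linear_explanation(black_box, x0=1.0)
print(round(slope, 3), round(intercept, 3))
```

The fitted slope says how strongly the output responds to the input near `x0`; it is only valid locally and says nothing about the model's behavior elsewhere.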
Domain Knowledge
How much do you know about the system where your project sits? If you are working on a controls application, do you understand the related systems that might affect your project, or is your experience more siloed? Domain knowledge can play a part in choosing what data to include in a model and determining the most important features of that data.
What Data Should You Include?
For example, a medical researcher wants to make sense of a large amount of patient data. There could be thousands of features from patient stats, from the characteristics of a disease to DNA traits to environmental elements. If you have a solid understanding of the data, select the features you think will be the most influential and start with a machine learning algorithm. If you have high-dimensional data, try dimensionality reduction techniques such as principal component analysis (PCA) to create a smaller number of features to try to improve results.
Feature Selection
For a model to produce accurate results, you need to make sure it’s using the right data. Feature selection is how you ensure your model is focused on the data with the most predictive power and is not distracted by data that won’t impact decision making. Precise feature selection will result in a faster, more efficient, more interpretable model.
If you have a lot of domain knowledge, use machine learning and manually select the important features of your data.
If you have limited domain knowledge, try automatic feature selection techniques such as neighborhood component analysis, or use a deep learning algorithm such as a CNN, which learns features from the data.
If your data has lots of features, use principal component analysis with machine learning to reduce dimensionality.
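As a rough illustration of dimensionality reduction, the sketch below finds the first principal component of a small data set and projects each row onto it, turning three features into one. It uses power iteration on the covariance matrix so that it runs with the standard library alone; in practice you would more likely call a numerical library's PCA routine. The function names and the toy data are assumptions.

```python
def first_principal_component(rows, n_iter=200):
    """Return (mean, direction) of the first principal component via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Start from an arbitrary vector; repeated multiplication by the covariance
    # matrix converges to the direction of largest variance.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Reduce each row to a single score along the given direction."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]


# Toy data: the second feature is roughly twice the first, the third is nearly constant.
rows = [(2.0, 4.1, 1.0), (3.0, 6.0, 1.1), (4.0, 7.9, 0.9),
        (5.0, 10.2, 1.0), (6.0, 11.8, 1.0)]
mean, direction = first_principal_component(rows)
scores = project(rows, mean, direction)
print([round(x, 3) for x in direction])
print([round(s, 3) for s in scores])
```

Because the first two features move together, a single score per row captures most of the variation, which is the effect PCA relies on.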
For Example
Signal processing engineers are often required to transform their 1D signals to reduce dimensionality (signals are often sampled at high rates, making the raw data too large to process directly) and to expose prominent features specific to the data. One common way is to convert 1D signals into a 2D representation using a transform such as a spectrogram.
This conversion highlights the most prominent frequencies of a signal. This creates an “image” that can then be used as input into a CNN.
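The 1D-to-2D conversion can be sketched as a short-time Fourier transform: cut the signal into overlapping frames, window each frame, and take the magnitude of its frequency content. The version below uses a plain discrete Fourier transform from the standard library so that it is self-contained; a real pipeline would typically use an FFT routine. The sample rate, frame length, hop size, and test tone are assumed values.

```python
import cmath
import math


def spectrogram(signal, frame_len=32, hop=16):
    """Return a list of frames; each frame holds magnitudes for bins 0..frame_len//2."""
    # Hann window reduces leakage between neighboring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            mags.append(abs(coeff))
        frames.append(mags)
    return frames


sample_rate = 64  # Hz, assumed for this example
tone_hz = 16      # frequency of the synthetic test tone
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(128)]

spec = spectrogram(signal)
# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 32
print(len(spec), len(spec[0]), peak_hz)
```

The result is a time-by-frequency grid of magnitudes, which can be treated like a single-channel image when passed to a CNN.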