Chapter 2
Your Data
In this chapter, we take a look at three questions:
- Is your data tabular?
- If your data is nontabular, what type is it?
- Is your data labeled?
Traditional machine learning techniques were designed for tabular data, which is organized into independent rows and columns. In tabular data, each row represents a discrete piece of information (e.g., an employee’s address).
Tabular data can be transformed for use with deep learning models, but this may not be the best place to start.
EmployeeID | AddressLine1 | AddressLine2 | StartDate |
---|---|---|---|
1111 | "5 Maple St" | "" | 01-Jan-2005 00:00:00 |
7654 | "8 Main Ave" | "Apt 13" | 31-Dec-2014 00:00:00 |
80 | "835 High St" | "" | 31-May-2000 00:00:00 |
6424 | "42 Oakridge Rd" | "Unit 4" | 02-Aug-2013 00:00:00 |
Tabular data can be numeric or categorical (though eventually the categorical data would be converted to numeric).
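As a rough illustration of how categorical values might be converted to numeric ones, the sketch below one-hot encodes a single column. The `one_hot_encode` helper and the `departments` column are hypothetical and are not part of the employee table above; other encodings (such as integer codes) are equally possible.

```python
def one_hot_encode(values):
    """Map each categorical value to a list of 0/1 indicators.

    Returns the sorted category list and one encoded row per input value.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column that matches this value
        encoded.append(row)
    return categories, encoded


# Hypothetical categorical column (not taken from the table above).
departments = ["Sales", "Engineering", "Sales", "Support"]
categories, encoded = one_hot_encode(departments)
```

Each row of `encoded` has exactly one nonzero entry, so the column can be passed to algorithms that expect numeric input.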
Images and Video: Deep learning is more common than traditional machine learning for image and video classification problems. Convolutional neural networks are designed to extract features from images, and these features often yield state-of-the-art classification accuracy. Intuitively, the convolutional filters extract progressively higher-level features from images, making it possible to discern high-level differences such as cat versus dog.
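To make the idea of a convolutional filter concrete, here is a minimal sketch of a single filter sliding over a tiny grayscale image. The `convolve2d` helper, the 4x4 image, and the hand-picked `vertical_edge` filter are all assumptions made for illustration; in a convolutional neural network the filter values are learned during training rather than chosen by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep learning frameworks, this computes cross-correlation:
    the kernel is not flipped.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-picked filter that responds to left-to-right brightness increases.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, vertical_edge)
```

The resulting feature map is nonzero only in the column where the dark and bright halves meet, which is the sense in which a filter "extracts" a feature such as an edge.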
Sensor and Signal: Machine learning has been more common, but deep learning is gaining popularity. Traditional approaches involve extracting features from signals and then using these features with a machine learning algorithm. More recently, signals have been passed directly to long short-term memory (LSTM) networks, or converted to images (for example, by calculating the signal’s spectrogram) that are then used with a convolutional neural network. Wavelets provide yet another way to extract features from signals, with techniques like wavelet scattering showing promising results when combined with machine learning algorithms.
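The signal-to-image conversion mentioned above can be sketched as a short-time Fourier transform. The example below is a simplified illustration using only the Python standard library: the `spectrogram` helper, the window and hop sizes, and the synthetic test tone are assumptions, and it uses a plain rectangular window with a direct DFT rather than the windowing and FFT a production tool would apply.

```python
import cmath
import math


def spectrogram(signal, window_size, hop):
    """Return a list of frames; each frame holds DFT magnitudes for bins 0..window_size//2."""
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        window = signal[start:start + window_size]
        magnitudes = []
        for k in range(window_size // 2 + 1):
            # Direct DFT of one window at frequency bin k.
            coefficient = sum(
                sample * cmath.exp(-2j * math.pi * k * n / window_size)
                for n, sample in enumerate(window)
            )
            magnitudes.append(abs(coefficient))
        frames.append(magnitudes)
    return frames


# Hypothetical test signal: a pure tone completing 4 cycles per 32-sample window.
window_size = 32
signal = [math.sin(2 * math.pi * 4 * n / window_size) for n in range(128)]
frames = spectrogram(signal, window_size, hop=16)

# Strongest frequency bin in each time frame.
peak_bins = [frame.index(max(frame)) for frame in frames]
```

Treating `frames` as rows of pixel intensities gives a time-frequency image, which is the kind of input that could then be passed to a convolutional neural network.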
Text: As with sensor data, machine learning has been more common for text, though deep learning is growing in use. Text can be converted to a numerical representation via bag-of-words models and normalization techniques and then used with traditional machine learning techniques such as support vector machines or naive Bayes. Newer techniques use text with recurrent or convolutional neural network architectures. In these cases, text is often transformed into a numeric representation using a word-embedding model such as word2vec.
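A bag-of-words representation can be sketched in a few lines. The `bag_of_words` helper and the sample maintenance notes below are hypothetical; the sketch lowercases and splits on whitespace only, whereas a real pipeline would typically add further normalization such as punctuation and stop-word removal.

```python
from collections import Counter


def bag_of_words(documents):
    """Return a sorted vocabulary and one word-count vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not appear in this document.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical maintenance notes used only for illustration.
documents = ["pump failed again", "pump running", "valve failed"]
vocabulary, vectors = bag_of_words(documents)
```

Each document becomes a fixed-length numeric vector, which is the form that classifiers such as support vector machines or naive Bayes expect.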
Guess the Algorithm
An oil company created a more efficient way to keep track of geo-tagged machine inventory for maintenance scheduling. They set up a machine vision system to identify tags with serial numbers, use optical character recognition to extract numbers, and associate images with inventory.
Which algorithm did they use?
Shell International used an R-CNN as well as a number of image processing techniques to make this project a success. They started with a series of geo-tagged images of serial number labels. They read in these images, and because each image is very large, they extracted region proposals, which were then fed into the CNN proper. Shell then used a VGG-16 pretrained neural network with transfer learning to determine the classes: Tag and No tag.
When extracting region proposals, they needed to apply some preprocessing techniques, starting with fish-eye correction, before passing the regions to the CNN.
Shell needed enough training data to ensure their model was robust, so they expanded the definition of the label to include signs as well. They then performed some data augmentation to further increase the size of the training data set.
Identified tags were then fed into an optical character recognition (OCR) algorithm to extract SAP codes.
If You Have No Labeled Data
Focus on machine learning techniques (in particular, unsupervised learning techniques). Labeling for deep learning can mean annotating objects in an image or, for semantic segmentation, each pixel of an image or video. The process of creating these labels, often referred to as “ground-truth labeling,” can be prohibitively time-consuming.
If You Have Some Labeled Data
Try transfer learning and/or labeling apps if you want to use deep learning. Because transfer learning focuses on training a smaller number of parameters in the deep neural network, it requires a smaller amount of labeled data.
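The claim about parameter counts can be checked with simple arithmetic. The sketch below assumes a small stack of fully connected layers whose sizes are invented for illustration and do not come from any real network; it also assumes the common transfer learning setup in which earlier layers are frozen and only the final layer is retrained.

```python
def dense_parameter_count(inputs, outputs):
    """Weights plus biases for one fully connected layer."""
    return inputs * outputs + outputs


# Hypothetical layer sizes (inputs, outputs); not taken from any real network.
layers = [(1024, 512), (512, 256), (256, 2)]
counts = [dense_parameter_count(i, o) for i, o in layers]

total_parameters = sum(counts)
# Transfer learning assumption: earlier layers stay frozen, only the last is retrained.
trainable_parameters = counts[-1]
```

In this toy setup only 514 of 656,642 parameters are updated, which suggests why far fewer labeled examples may be needed than when training every layer from scratch.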
Another approach for dealing with small amounts of labeled data is to augment that data. For example, it is common with image data sets to augment the training data with various transformations on the labeled images (such as reflection, rotation, scaling, and translation).
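A minimal sketch of image augmentation follows, treating an image as a nested list of pixel values. The helper names, the 2x3 sample image, and the `"tag"` label are hypothetical; only reflection, a 90-degree rotation, and translation are shown, and scaling is omitted for brevity.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def translate_right(image, shift, fill=0):
    """Shift pixels right by `shift` columns, padding the left edge with `fill`."""
    width = len(image[0])
    return [([fill] * shift + row)[:width] for row in image]


def augment(image, label):
    """Return the original plus three transformed copies, all sharing the label."""
    variants = [
        image,
        flip_horizontal(image),
        rotate_90_clockwise(image),
        translate_right(image, 1),
    ]
    return [(variant, label) for variant in variants]


# Hypothetical 2x3 labeled image.
image = [
    [1, 2, 3],
    [4, 5, 6],
]
augmented = augment(image, "tag")
```

One labeled image yields four training examples here. Reusing the label unchanged works for whole-image classification; for bounding-box or pixel labels, the labels would need the same transformation as the image.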
If You Have Lots of Labeled Data
With plenty of labeled data, both machine learning and deep learning are viable options. The more labeled data you have, the more likely it is that deep learning techniques will be more accurate.
Handy Labeling Apps
Image Labeler enables you to label ground truth data in a collection of images. Define rectangular region of interest (ROI) labels, pixel ROI labels, and scene labels, and use these labels to interactively label your ground truth data. Write, import, and use your own custom automation algorithm to automatically label ground truth.
Ground Truth Labeler works in the same way as the Image Labeler app but is specifically for automated driving applications.
Audio Labeler enables you to label ground truth audio data at both the region level and file level. Create label definitions for consistent and fast labeling. Visualize the time-domain waveform during playback and specify regions by drawing directly on the time-domain waveform.