Video length is 32:05

Automated Optical Inspection and Defect Detection with Deep Learning | Deep Learning Webinars 2020, Part 4

From the series: Deep Learning Webinars 2020

Automated inspection and defect detection are critical for high throughput quality control in production systems. They are widely adopted in many industries for detection of flaws on manufactured surfaces such as metallic rails, semiconductor wafers, contact lenses, and so on.

Recent developments in deep learning have significantly improved our ability to detect defects. Learn how to use MATLAB® to develop deep learning-based approaches to detect and localize different types of anomalies. Topics include:

  • Data access and preprocessing techniques, including denoising, registration, and intensity adjustment
  • Semantic segmentation and labeling of defects and abnormalities
  • Defect detection using MobileNetv2, Grad-CAM, and other deep learning techniques
  • Deployment to multiple hardware platforms such as CPUs and GPUs

Published: 20 Oct 2020

Hello, everyone. My name is Harshita Bhurat. I'm the product manager of the image processing and computer vision products. Today we will talk about automated optical inspection.

Let's start by defining what we mean by automated inspection. It is automating the process of visually inspecting manufacturing parts for failures and defects. There are various other terms used for this process, automated defect detection, machine vision, visual inspection, automated inspection, et cetera. For this talk today, we will assume that all of these terms mean the same thing.

Let me give you examples of some of our customers using our tools. Airbus used MATLAB and its solutions to build a robust, end-to-end model to automatically detect multiple types of defects on aircraft components. For them, the biggest advantage of using our tools was interactive prototyping and testing in a very short amount of time.

Next is Musashi. Musashi Seimitsu Industry prototyped an anomaly detection system using deep learning with MATLAB. They used it for inspecting beveled gears and automotive parts. This approach reduced the workload and cost for the manual visual inspection of over a million parts per month.

The next example here is Korea Railroad Research Institute. They used MATLAB to detect surface defects, missing parts, and cracks in all railway facility components, such as rails, sleepers, and fasteners. Their process was made easier and quicker by using labeling apps, and built-in libraries for pre and post processing of data.

The last example here is Kansai Electric Power. They applied deep learning to assess creep damage on steel pipe welds. Evaluating damage inside these pipes is difficult using conventional inspection of the outer surface. So the company's engineers attempted to use images of the strain distribution on the outer surface to evaluate damage inside the pipes. The availability of sample code, seminars, rich documentation, and all of the other deep learning resources in the MATLAB tools helped them get up to speed easily and quickly.

Those are all very complex systems and complex solutions, but for the sake of this talk, let's try and simplify the process. Very simply put, what do all of these systems have in common? This is what any such system would look like: you have a manufacturing belt with parts on it, an inspection camera that captures images of those parts, and then an image analysis system that analyzes the images and classifies each part as good or defective.

Now, if that inspection system took, for example, four images of hex nuts for us, can you find the defective one here? That was fairly simple. Now what if it took more images for us, can you still find the defective ones? Probably. How about now? How about now? I could go on, but you get the point. What we really need is an efficient AI technique to classify all of these images for us and separate the defective nuts from the good ones, and I'm going to show you some of these techniques in MATLAB.

Let's get into the specifics of what the process involves. When I talk about a defect detection workflow, a simplified view results in three main stages: data preparation, algorithms and modeling, and deployment. This is an iterative process. Once you deploy a system, you get more data, you iterate and refine and tune your AI models, and so on. Through this talk, I will go through parts of the workflow, talk about some of the challenges within each stage, and point out how MATLAB can make it all come together efficiently and effectively, to deliver value to you.

So let's start with data preparation. Data preparation and management turns out to be one of the most critical ingredients to success, and it's really, really hard. Today, data comes from multiple sensors from multiple databases. Some of the data is structured, some is unstructured. Some of the data spans across different time intervals. Data can be noisy, or it can be from different domains. Sometimes data doesn't exist, and we need to create synthetic data.

This data needs to be prepared for the next stage, the algorithms and the modeling stage, and it's very time consuming. It's also beyond just data manipulation. Data labeling is required for any kind of supervised learning algorithm, and this can not only be time consuming, but also very error prone.

Specifically, we're going to discuss the four most common questions when it comes to data preparation. How do I access large data that doesn't fit in memory? How do I pre-process data and get the right features? How do I label my data faster? What if I have an imbalanced dataset, or, what if I don't have enough data?

For accessing large data, when training deep neural networks, we're often working with really large amounts of data. Often, we cannot load all of it into memory. One way MATLAB handles this problem is with datastores. Rather than loading all of your data into memory, datastores load in data only as you need it. They act as a pointer to the data. MATLAB has datastore capabilities for images, audio, and files. There are also ways to create custom datastores if none of these fit your data type.
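As a minimal sketch of the idea (the folder name and its subfolder-per-class layout here are hypothetical), creating an image datastore for a labeled set of inspection images looks like this:

    % Point a datastore at a folder of inspection images; labels come from subfolder names.
    imds = imageDatastore('hexnut_images', ...
        'IncludeSubfolders', true, ...
        'LabelSource', 'foldernames');

    img = readimage(imds, 1);     % images are read from disk only when requested
    countEachLabel(imds)          % summary of how many images exist per class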

Another feature that addresses the big data issue is tall arrays. Tall arrays allow you to work with out-of-memory numeric data. In the deep learning space, you can use them when training deep neural networks on numeric arrays.
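As a rough sketch of how tall arrays work, assuming a set of CSV log files with a numeric Score column (both the file pattern and the variable name are made up for illustration):

    ds = tabularTextDatastore('inspection_logs/*.csv');  % out-of-memory tabular data
    t  = tall(ds);                                       % tall table backed by the datastore
    mu = mean(t.Score);                                  % computation is deferred...
    mu = gather(mu);                                     % ...and evaluated in memory-sized chunks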

And the last feature here is bigimage. It is used to read large TIFF files and the image data they contain. The bigimage object represents a really big image as a collection of smaller blocks of data, and these blocks can be loaded and processed independently. You can also process the image at different and multiple resolution levels, like an image pyramid. You can also read, write, and set blocks in arbitrary regions of the image.
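A minimal sketch of that block-based workflow, assuming a large grayscale TIFF scan (the file name is hypothetical):

    bim  = bigimage('wafer_scan.tif');                % blocked, multiresolution TIFF
    bout = apply(bim, 1, @(block) imadjust(block));   % adjust contrast block by block at level 1
    bigimageshow(bout);                               % display without loading everything at once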

So how about processing data, and getting the right features? Most deep learning networks are pretty effective in detecting anomalies, but they're not perfect. One way to increase accuracy is by pre-processing the images in your data set.

MATLAB has a large number of apps to support various pre-processing techniques. Here you see the registration app that lets you explore various algorithms for registering misaligned images. To have the training models focus on the surface of the nuts, you can match the position and the orientation of the images to a reference image. You can then create a mask to eliminate the background.
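Outside the app, a minimal version of that registration-and-masking step might look like the following (the moving and fixed images and the simple threshold-based mask are illustrative, assuming grayscale images):

    % Align a captured image to a reference image, then mask out the background.
    [optimizer, metric] = imregconfig('monomodal');
    registered = imregister(moving, fixed, 'rigid', optimizer, metric);

    mask = imbinarize(fixed);            % crude background mask from the reference image
    mask = imfill(mask, 'holes');
    registered(~mask) = 0;               % keep only the part surface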

Here you see another app for more image processing techniques. This is called the Image Segmenter. In Image Segmenter, you can explore and experiment with various segmentation techniques, like clustering. Here, for example, you see that simply marking the foreground and background segments your image and creates a mask. You can then refine these masks with morphological operations.

There are also other apps for thresholding and region analysis. You can then use these enhanced images, and you can also generate MATLAB code from these apps to apply the same steps, repeatably, to a big dataset.
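The generated code is ordinary image processing code. A sketch of the kind of script the Image Segmenter can produce, assuming a grayscale image I:

    BW = imbinarize(I, 'adaptive');          % initial segmentation
    BW = imopen(BW, strel('disk', 3));       % remove small specks
    BW = imfill(BW, 'holes');                % fill interior holes
    masked = I;
    masked(~BW) = 0;                         % background removed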

In addition to apps, you can also find a number of built-in functions and libraries for other techniques, such as denoising, adjusting image intensity, texture-based and fibermetric filtering, et cetera. MATLAB includes a very large and growing library of built-in algorithms.
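For example, a few of the built-in functions mentioned above, applied to a grayscale image I (purely illustrative):

    J = imnlmfilt(I);            % non-local means denoising
    K = imadjust(I);             % intensity adjustment
    L = fibermetric(I);          % enhance fiber-like (tubular) structures
    E = entropyfilt(I);          % texture filtering via local entropy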

All of this pre-processing affects the predicted scores of a defect detection algorithm. Here you can see a comparison of results with and without pre-processing. All the defective nuts are clustered in the upper left. Without pre-processing, the system was able to identify only three defective nuts, while the fourth was an outlier; with pre-processing, it was able to detect all four of them.

If you look at the histogram of the predicted scores, you can see that with pre-processing, the peak of the normal scores shifts to the right, creating a clear margin of separation from the defective scores.

So our next challenge is about how do I label my data faster? Labeling for deep learning is repetitive, tedious, and time consuming, but necessary. In order to perform deep learning, we need lots of labeled data. This tends to consume a lot of time in workflows like object detection or semantic segmentation. Imagine manually drawing bounding boxes around thousands of images, or having to define the class of every single pixel in these images.

Now, the next logical question is, how do I label my data faster? MATLAB has several interactive tools for accelerating the data labeling process. The first are the Image Labeler and the Video Labeler apps. In this video, we will see how these labeler apps provide automation capabilities to accelerate the labeling process. These apps help you use techniques such as semantic segmentation, in which the regions of an image are classified. Here, for example, we are labeling fire in one frame, and then that label is automatically applied, correctly, through every frame of the video.

There are other labeling apps too, all of which can generate code for re-use. The next one here is the Big Image Labeler on MATLAB Central. This allows you to interactively label large images that do not fit into memory. You don't have to worry about extracting patches, labeling each patch, and reconstructing the image. Instead, you can interactively move around the image and label different parts of it.
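Once labeling is done, the exported labels can be paired with the images for training. A minimal sketch for a semantic segmentation dataset (the folder names, class names, and label IDs are hypothetical):

    classNames = ["defect", "background"];
    labelIDs   = [1, 0];
    imds = imageDatastore('images');                           % raw images
    pxds = pixelLabelDatastore('labels', classNames, labelIDs); % exported pixel labels
    trainingData = combine(imds, pxds);                        % image/label pairs for training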

Now we are done with labeling. You have a dataset that you've spent a lot of time and effort labeling accurately. Suppose you then start training your deep learning model. How can you be confident that the data you have is enough? You may easily find that you need more data to get higher accuracy, to make your model robust to a wider set of images, or to use a different, possibly more complicated, model that simply requires more data and labels to converge.

So instead of creating new, fresh data, a very common strategy consists of generating additional data from your original dataset. This process is called data augmentation. Image applications are a good example where augmenting your data makes sense, and it can yield significant results.

One approach is to use the data transformation commands in MATLAB. These apply random transformations like scale, rotation, translation, warping, et cetera to the dataset. You can also perform color transformations like hue jitter, as well as contrast jitter on gray-scale images. You can write these out as new images and then add them to your dataset.
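A minimal augmentation sketch, assuming an existing image datastore imds and an RGB image I (the ranges and output size below are illustrative, not recommendations):

    augmenter = imageDataAugmenter( ...
        'RandRotation',     [-15 15], ...
        'RandXTranslation', [-5 5], ...
        'RandYTranslation', [-5 5], ...
        'RandScale',        [0.9 1.1]);
    augimds = augmentedImageDatastore([224 224], imds, 'DataAugmentation', augmenter);

    % Color jitter can also be applied per image:
    Ijit = jitterColorHSV(I, 'Contrast', 0.3, 'Hue', 0.05);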

The approaches we discuss here can also be applied to correcting a data imbalance. Another approach is to use GANs, or Generative Adversarial Networks. These networks can create synthetic images from noise. The video is showing how GANs create new face images from random noise. It's using deep learning to improve your deep learning.

Let's move on to the second step in the workflow, the one that perhaps gets the most attention, AI-based algorithms and modeling. Within the MATLAB environment, you have direct access to common algorithms used for classification and prediction, from regression, to deep networks, to clustering. And with MATLAB, you can easily use pre-built models developed by the broader community, such as ResNet-50 for classification, or YOLO v2 for object detection, et cetera.

While algorithms and pre-built models are a good start, they're not enough. Examples are the way engineers learn how to use algorithms and find the best approaches for their specific problems. We provide a host of examples for using and building deep learning models in a wide range of domains. We have over 200 of them in the Deep Learning Toolbox.

Let's start with deep learning for classification. There are two approaches for deep learning. One is training a deep network from scratch. This means gathering a very large labeled dataset, and designing a network architecture that will learn the features and the model. This is good for new applications, or ones with a large number of output categories. However, we don't anticipate most MATLAB users will be doing a lot of deep learning from scratch. Because of the amount of data, and the rate of learning, these networks typically take days or weeks to train.

Hence, we see more people trying out the second method, which is fine tuning a pre-trained model. This is called transfer learning. The idea is that you take an existing network, for example, AlexNet or GoogLeNet, and feed in new data containing previously unknown classes. After making some tweaks to the network, you can now perform a new task. This also has the advantage of needing less data, and hence, less computation time. Training time drops from days or weeks to minutes or hours.
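A minimal transfer-learning sketch. Here I use googlenet for illustration, since its final layer names ('loss3-classifier' and 'output') are documented; other pretrained networks use different layer names:

    net        = googlenet;                 % pretrained on ImageNet
    lgraph     = layerGraph(net);
    numClasses = 2;                         % e.g. good vs. defective

    newFC  = fullyConnectedLayer(numClasses, 'Name', 'new_fc', ...
                 'WeightLearnRateFactor', 10, 'BiasLearnRateFactor', 10);
    newOut = classificationLayer('Name', 'new_output');

    lgraph = replaceLayer(lgraph, 'loss3-classifier', newFC);   % new task-specific head
    lgraph = replaceLayer(lgraph, 'output', newOut);
    % The earlier feature extraction layers are kept and only fine-tuned during training.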

Here, you see the Deep Network Designer app. This app lets you build, visualize, edit, and train deep learning networks. Using this app, you can import and edit networks, or even build new networks. You can drag and drop to add new layers, and then create new connections. You can also view and edit layer properties. You can analyze the network to ensure that the network architecture is defined correctly, and detect problems before training. You can also generate MATLAB code for building and training networks. After you finish designing a network, you can export it to the workspace, where you can save or train the network. As you can imagine, composing the network by dragging and dropping layers can accelerate the deep learning process, especially if it's the first time you're going through this exercise.

Let's talk a little more about transfer learning. A Convolutional Neural Network, or CNN, is a type of deep neural network which can work directly with structured data, such as images, in order to classify them. Here on the slide, we can see a typical architecture for a CNN. We can think of this as a representation of the data as it travels through the network, at a very high level. Some of these layers are going to be detecting features, while other layers are going to be involved in the classification of the input image. And all the layers are going to be trained together, that is, the training process is going to involve adjusting the weights on each of these layers. So the goal is to have each image classified into the correct class.

If you look into the architecture of the CNN more carefully, we are going to start with an input layer whose parameters will be determined by the image dataset. Following this layer, we have several convolution, ReLU, and pooling layers. These layers will be involved in learning features from the images.

Finally, the activations are flattened so that we can pass them through fully connected layers and a softmax layer, which tell us what the image is classified as. For transfer learning, or fine tuning a pre-trained model, we will re-use the feature extraction layers and replace these classification layers.
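As an illustration of that structure, here is a small layer array of the kind described above (the input size and filter counts are made up):

    layers = [
        imageInputLayer([128 128 1])             % input layer sized to the image data
        convolution2dLayer(3, 16, 'Padding', 'same')
        reluLayer
        maxPooling2dLayer(2, 'Stride', 2)
        convolution2dLayer(3, 32, 'Padding', 'same')
        reluLayer
        maxPooling2dLayer(2, 'Stride', 2)
        fullyConnectedLayer(2)                   % two classes: good vs. defective
        softmaxLayer
        classificationLayer];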

We are again going to use the Deep Network Designer for transfer learning. You can import networks and network architectures from TensorFlow-Keras, Caffe, and to and from the ONNX model format. We're going to load pre-trained networks and edit them for transfer learning. Here, in this case, we pick MobileNet-v2, as it's faster, has higher accuracy, and a smaller footprint. You can drag and drop to add new layers, and create new connections. You can view and edit layer properties. You can also analyze the network to ensure that the network architecture is defined correctly, and detect problems before training. You can import image data for classification problems, and select augmentation options.

The app enables you to train image classification networks imported into, or created in, the app. For other types of data, you can construct a network, and then export the network for training. You can visualize and monitor the training progress. You can then edit the training options and re-train the network, if required. Here you can see the plots of accuracy, loss, and validation metrics. After it's finished, you can see that the accuracy is 92.5%. You can also export the trained network and its information into the workspace, or generate code for training.
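A sketch of the training call that the app can also generate for you (the datastores, solver, and hyperparameter values are illustrative):

    options = trainingOptions('adam', ...
        'InitialLearnRate', 1e-4, ...
        'MaxEpochs', 10, ...
        'ValidationData', augimdsVal, ...
        'Plots', 'training-progress');

    trainedNet = trainNetwork(augimdsTrain, lgraph, options);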

We can now run this model as an inference engine. In this video, we use two random images at a time from the original dataset of hex nuts to show how this works. As you can see, the nuts are getting classified as good or bad. You can see that the model is probably doing a good job of classifying them as good or defective, but it's very hard to tell whether it's accurate just by looking at all of these images.
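Running the trained network as an inference engine comes down to a classify call; a minimal sketch (the file name is hypothetical):

    I = imread('hexnut_001.png');
    inputSize = trainedNet.Layers(1).InputSize;
    [label, scores] = classify(trainedNet, imresize(I, inputSize(1:2)));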

That brings us to our next question: is an unknown image being classified correctly? Why does the model misclassify certain images? How do we verify that the model is, in fact, looking at the right features?

Deep learning networks are often considered to be black boxes, with no way of figuring out why they predicted what they did. And when these models fail and give incorrect predictions, they often fail without any warning or explanation. These questions can be answered by explainable AI techniques, such as Class Activation Mapping, known as CAM, or gradient-weighted CAM (Grad-CAM). The CAM techniques give you visual explanations of the predictions of the CNN.

These examples show how to use class activation mapping to investigate and explain the predictions of a deep CNN for image classification. In this example, the red highlighted areas are the features the network uses to classify the image. You can see that the presence of the keyboard helps the model determine this is a picture of a mouse. This may or may not be a valid way to make this decision.

Here, the network classifies this image of a coffee cup as a buckle. The network detects and focuses on the watch wristband and not the coffee cup. How do you determine what is correct, the model or the ground truth? It actually differs from case to case. You can use class activation mapping to identify bias in the training set, and to increase the model accuracy. You could then make the network more robust by collecting and labeling more data.

Now let's use this CAM technique on our model that classifies the hex nuts as good or defective. Here, as you can see, if the highlighted red areas cover the whole surface, the nuts are classified as good, and if the red areas are only on the scratches, the nuts are classified as bad.
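For reference, newer Deep Learning Toolbox releases (R2021a and later) also ship a gradCAM function that produces these maps directly; a minimal sketch, assuming I is already resized to the network input size:

    [label, ~] = classify(trainedNet, I);
    map = gradCAM(trainedNet, I, label);      % class-discriminative localization map
    imshow(I); hold on;
    imagesc(map, 'AlphaData', 0.5);           % overlay the heat map on the image
    colormap jet; hold off;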

Moving on to object detection with deep learning. Object detection using deep learning provides a fast and accurate means to predict the location of an object in an image. Deep learning is a powerful machine learning technique in which the object detector automatically learns which image features are required for the detection task. Several such techniques are available in MATLAB, such as Faster R-CNN, YOLO v2 (You Only Look Once, version 2), and Single Shot Detection, or SSD. Applications for these object detection techniques include image classification, scene understanding, self-driving vehicles, surveillance, and so on.

As an example, here is a prominent algorithm called YOLO v2, which stands for You Only Look Once, version 2. It is a particularly good deep neural network for a few reasons. It can detect objects in real time, which makes it very versatile for tasks such as autonomous driving and traffic monitoring. And it is also faster than some of the previous deep learning algorithms, such as Faster R-CNN. The YOLO v2 model runs a deep learning CNN on an input image to produce network predictions. The detector then decodes these predictions and generates bounding boxes.

You can design a custom YOLO v2 model, layer by layer, in MATLAB. You can start from a pre-trained CNN as a feature extractor, or train from scratch. You need lots of labeled data. The apps that you saw earlier can also be used to label rectangular ROIs for object detection. You can find extensive MATLAB documentation showing how to get started with, train, and detect objects using YOLO v2.
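A sketch of assembling and training a YOLO v2 detector, following the documented pattern (the image size, anchor boxes, feature layer, and training data table are illustrative):

    featureExtractor = resnet50;
    lgraph = yolov2Layers([224 224 3], 1, [64 64; 32 32], ...
                          featureExtractor, 'activation_40_relu');

    options  = trainingOptions('sgdm', 'InitialLearnRate', 1e-3, 'MaxEpochs', 20);
    detector = trainYOLOv2ObjectDetector(trainingData, lgraph, options);

    [bboxes, scores] = detect(detector, I);   % run the detector on a new image
    annotated = insertObjectAnnotation(I, 'rectangle', bboxes, scores);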

Here is an example of a fire detection model built using the same YOLO v2 model.

With all these network architectures and different datasets, it's natural to get overwhelmed. It's hard to know which network architecture or which dataset works best for you. Don't you wish there was some way to run experiments to train and compare these options, and pick the one with the best results? Well, we have an app for that. In the latest release, we have the Experiment Manager app to manage multiple deep learning experiments. You can analyze and compare results and code.

Here is a short overview video highlighting its capabilities. As you can see, using the app, you can explore hyper-parameters. You can monitor training progress. You can define experiments with MATLAB code. You can keep track of your work, and you can also export model results.

Deep learning models have to be incorporated into a larger system to be useful, so let's talk about deployment. For deployment, we've got a unique code generation framework that allows models developed in MATLAB to be deployed anywhere, without having to re-write the original model. This gives you the ability to test and deploy the entire system, and not just the model itself.

MATLAB enables you to deploy your deep learning networks from a single source in MATLAB onto various embedded hardware platforms. These platforms can be NVIDIA GPUs, Intel and ARM CPUs, and Xilinx and Intel SoCs and FPGAs. With the help of MathWorks tools, you can explore and target embedded hardware easily. MATLAB can generate native, optimized code for multiple frameworks. This gives you the flexibility to deploy to lightweight, low-power embedded devices, such as those used in a car, or low-cost rapid prototyping boards, such as the Raspberry Pi, or even edge-based IoT applications, such as a sensor and a controller on a machine in a factory. In almost all applications, you need to deploy your deep learning network together with the pre-processing and the post-processing logic. You can simulate your whole application in MATLAB, and then deploy the whole application to hardware using the coder products.
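As a minimal GPU Coder sketch, assuming a hypothetical entry-point function detect_defect.m that loads the trained network with coder.loadDeepLearningNetwork and calls predict:

    cfg = coder.gpuConfig('lib');                                % generate a static library
    cfg.TargetLang = 'C++';
    cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');  % target cuDNN on NVIDIA GPUs
    codegen -config cfg detect_defect -args {ones(224,224,3,'single')} -report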

Here are two examples of the deployment workflow. On the left, you see MATLAB algorithms deployed to the Xilinx ZCU102 board. And on the right, the same algorithms are deployed on the Jetson Xavier, without re-doing all the work.

We're not going to go into the deployment workflows today. Here are some key resources that are available online if you're interested in these topics. Here you see another example of defect detection algorithms deployed on an ARM Cortex-A microprocessor.

MATLAB can be deployed not only to embedded devices, but also desktop or server environments. This allows you to scale from desktop executables, to cloud based enterprise systems, for example, on AWS or Azure.

Automatic code generation eliminates manual coding errors, and is an enormous value driver for any organization adopting it. The power and flexibility of our code generation and deployment frameworks is unmatched.

This completes our three-step workflow. We learned about data access, including accessing very large amounts of data and very large files. We learned about generating synthetic data, or data augmentation, when we do not have enough data. We learned about labeling this data so it can be used in our deep learning workflows. Next, we learned about AI and deep learning modeling for defect detection. We learned about creating networks from scratch, or fine tuning pre-trained networks. We also learned about deploying all of these defect detection algorithms on embedded, as well as enterprise, systems.

In summary, MATLAB has interactive and easy to use apps that help you explore, iterate, and automate your workflows. It also provides you the flexibility and options to choose from a variety of networks and optimizations based on your data and requirements. MATLAB provides an easy and extensible framework for defect detection, all the way from data access to deployment.

Thank you.