Video length is 2:30

Applied Machine Learning, Part 4: Embedded Systems

From the series: Applied Machine Learning

Walk through several key techniques and best practices for running your machine learning model on embedded devices. 

The video discusses options for making your model faster and reducing its memory footprint, including automatic C/C++ code generation, feature selection, and model reduction.

Published: 18 Jan 2019

The phrase “machine learning” brings to mind complex algorithms that use lots of computation to train a model. But embedded devices are limited in the amount of memory and compute available.

Now, when I say “embedded devices,” I’m referring to objects with a special-purpose computing system, so think of things like a household appliance or sensors in an autonomous vehicle.

Today, we’ll discuss the different factors to keep in mind when preparing your machine learning model for an embedded device. 

Different types of models require different amounts of memory and time in order to make a prediction. For example, single decision trees are fast and require a small amount of memory. Nearest neighbor methods are slower and require more memory, so you might not want to use them for embedded applications.
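To get a feel for that tradeoff, here is a minimal MATLAB sketch (assuming Statistics and Machine Learning Toolbox and its built-in fisheriris dataset); stored object size and timeit are rough proxies for footprint and prediction speed, not a formal benchmark:

    % Compare a single decision tree against a nearest neighbor model.
    load fisheriris                     % meas: 150x4 features, species: labels

    treeMdl = fitctree(meas, species);  % single decision tree
    knnMdl  = fitcknn(meas, species);   % nearest neighbor classifier

    % Stored object size is a rough stand-in for memory footprint.
    % (kNN keeps the training data, so it is typically much larger.)
    treeInfo = whos('treeMdl');
    knnInfo  = whos('knnMdl');
    fprintf('Tree: %d bytes, kNN: %d bytes\n', treeInfo.bytes, knnInfo.bytes);

    % timeit runs the function repeatedly and reports a typical time.
    tTree = timeit(@() predict(treeMdl, meas));
    tKnn  = timeit(@() predict(knnMdl, meas));
    fprintf('Tree: %.3g s, kNN: %.3g s per 150-row batch\n', tTree, tKnn);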

Another thing to keep in mind when determining which models to use on an embedded device is how you will get your model to the device.  

Most embedded systems are programmed in low-level languages such as C.

But machine learning is typically done in high-level interpreted languages such as MATLAB, Python, or R.

If you have to maintain code bases in two different languages, it is going to be very painful to keep them in sync.

MATLAB provides tools that automatically convert a machine learning model to C code, so you don’t need to manually implement the model in C separately.
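Here is a sketch of that workflow (it assumes MATLAB Coder and Statistics and Machine Learning Toolbox; the model, file, and function names are made up for the example):

    % Train a model and save it in a Coder-compatible format.
    load fisheriris
    mdl = fitctree(meas, species);
    saveLearnerForCoder(mdl, 'treeModel');     % writes treeModel.mat

    % Entry-point function, saved as predictTree.m:
    %
    %   function label = predictTree(x) %#codegen
    %   mdl = loadLearnerForCoder('treeModel');
    %   label = predict(mdl, x);
    %   end

    % Generate a C library from the entry point, for a 1x4 double input.
    codegen predictTree -args {zeros(1,4)} -config:lib

Because the C code is generated from the same trained model, there is no second implementation to keep in sync by hand.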

So what if, after converting a model to C, you find out that it isn’t going to meet the requirements of your system? Maybe the memory footprint is too big, or the model takes too long to make predictions?

You could try other types of models and see if the code meets the requirements.  Maybe start with a simple model such as a decision tree.  
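One way to keep a tree small from the start is to cap its complexity at training time, as in this sketch (the split limit here is arbitrary, not a recommendation):

    % Cap tree complexity at training time and check the accuracy cost.
    load fisheriris
    fullTree  = fitctree(meas, species);
    smallTree = fitctree(meas, species, 'MaxNumSplits', 4);

    fprintf('Resubstitution error, full: %.3f, small: %.3f\n', ...
        resubLoss(fullTree), resubLoss(smallTree));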

Alternatively, you could go back earlier in the process and see if you can reduce the number of features in the model. You can use tools such as neighborhood component analysis, which are useful for determining the impact that each feature has on the results. If you see that some features are weighted low, you could drop them from your model, making it more concise.
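Here is a sketch of that idea using fscnca, the NCA-based feature selection function for classification (the regularization value and the 0.1 weight cutoff are illustrative choices):

    % Fit NCA weights; low-weight features contribute little to predictions.
    load fisheriris
    ncaMdl = fscnca(meas, species, 'Lambda', 0.01);

    % Keep only features whose learned weight clears a chosen cutoff.
    keep        = ncaMdl.FeatureWeights > 0.1;
    reducedMeas = meas(:, keep);
    fprintf('Kept %d of %d features\n', nnz(keep), size(meas, 2));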

Certain types of models have different reduction techniques associated with them. For decision trees, you can use pruning techniques, where you drop nodes that provide the smallest accuracy improvement.
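In MATLAB that looks roughly like the sketch below; cvloss can suggest a pruning level from cross-validation, rather than picking one by hand:

    % Prune a trained tree and compare node counts.
    load fisheriris
    fullTree = fitctree(meas, species);

    % cvloss returns the cross-validated best pruning level.
    [~, ~, ~, bestLevel] = cvloss(fullTree, 'Subtrees', 'all');
    prunedTree = prune(fullTree, 'Level', bestLevel);

    fprintf('Nodes before: %d, after: %d\n', ...
        fullTree.NumNodes, prunedTree.NumNodes);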

One other approach is to look at reducing the memory required for storing the model parameters. For example, you could see if the model can be converted to a fixed-point representation that maintains acceptable accuracy.
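As a toy illustration of that last idea, using fi objects from Fixed-Point Designer (the weights and word length here are made up):

    % Quantize example parameters to 16-bit fixed point and measure
    % the quantization error (weights here are made up for illustration).
    w = [0.8231 -1.2045 0.0079 2.4418];

    wFixed = fi(w, 1, 16, 12);   % signed, 16-bit word, 12 fraction bits
    maxErr = max(abs(double(wFixed) - w));
    fprintf('Max quantization error: %g\n', maxErr);

If the error stays acceptable, a 16-bit representation stores the parameters in a quarter of the space that double precision would take.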

Depending on your use case, any of these tactics may be appropriate. Hardware considerations, network connections, and budget are all key factors that will influence design decisions.  

That was just a quick overview of embedding machine learning models. For more information on preparing models for embedded devices, see the links below.