Video length: 4:43.

Applied Machine Learning, Part 3: Hyperparameter Optimization

From the series: Applied Machine Learning

Description

Machine learning is all about fitting models to data. This process typically involves using an iterative algorithm that minimizes the model error. The parameters that control a machine learning algorithm's behavior are called hyperparameters. Depending on the values you select for your hyperparameters, you might get a completely different model. So, by changing the values of the hyperparameters, you can find different, and hopefully better, models.

This video walks through techniques for hyperparameter optimization, including grid search, random search, and Bayesian optimization. It explains why random search and Bayesian optimization are superior to the standard grid search, and it describes how hyperparameters relate to feature engineering in optimizing a model.

Full Transcript

Machine learning is all about fitting models to data. The models consist of parameters, and we find the values of those parameters through the fitting process. This process typically involves some type of iterative algorithm that minimizes the model error. That algorithm has parameters of its own that control how it works, and those are what we call hyperparameters.

In deep learning, we also call the parameters that determine the layer characteristics hyperparameters. Today, we’ll be talking about techniques for both.

So, why do we care about hyperparameters?  Well, it turns out that most machine learning problems are non-convex. This means that depending on the values we select for the hyperparameters, we might get a completely different model. By changing the values of the hyperparameters, we can find different, and hopefully better, models.  

Ok, so we know that we have hyperparameters, and we know we want to tweak them, but how do we do that? Some hyperparameters are continuous, some are binary, and others might take on any number of discrete values. This makes for a tough optimization problem. It is almost always impossible to run an exhaustive search of the hyperparameter space, since it takes too long.  
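To make that concrete with a small sketch (the hyperparameter names, types, and ranges below are illustrative, not from the video), the Statistics and Machine Learning Toolbox lets you describe such a mixed search space with optimizableVariable:

    % Illustrative mixed search space: continuous, discrete, and binary
    % hyperparameters declared for the toolbox's optimizers (names assumed).
    lambda = optimizableVariable('lambda', [1e-5, 1e-1], 'Transform', 'log'); % continuous
    nlearn = optimizableVariable('nlearn', [10, 500], 'Type', 'integer');     % discrete
    stdize = optimizableVariable('standardize', {'true','false'}, ...
        'Type', 'categorical');                                               % binary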

So, traditionally, engineers and researchers have used techniques for hyperparameter optimization like grid search and random search. In this example, I’m using a grid search method to vary 2 hyperparameters – Box Constraint and Kernel Scale – for an SVM model.  As you can see, the error of the resulting model is different for different values of the hyperparameters. After 100 trials, the search has found 12.8 and 2.6 to be the most promising values for these hyperparameters.
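The video doesn't show the underlying code, but a minimal sketch of that kind of grid search, using the toolbox's built-in ionosphere sample data as a stand-in, could look like this:

    % Grid search over BoxConstraint and KernelScale for an SVM.
    load ionosphere                  % stand-in data: X (predictors), Y (labels)
    rng default                      % for reproducibility
    mdl = fitcsvm(X, Y, ...
        'OptimizeHyperparameters', {'BoxConstraint','KernelScale'}, ...
        'HyperparameterOptimizationOptions', struct( ...
            'Optimizer', 'gridsearch', ...  % grid instead of the default bayesopt
            'NumGridDivisions', 10));       % 10 x 10 grid = 100 trials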

Recently, random search has become more popular than grid search. 

 “How could that be?” you may be asking.

Wouldn’t grid search do a better job of evenly exploring the hyperparameter space?  

Let’s imagine you have 2 hyperparameters, “A” and “B”. Your model is very sensitive to “A,” but not sensitive to “B.”  If we did a 3x3 grid search, we would only ever evaluate 3 different values of “A.” But if we did a random search, we would probably get 9 different values of “A”, even though some may be close together. As a result, we have a much better chance of finding a good value for “A.”  In machine learning, we often have many hyperparameters. Some have a big influence over the results, and some don’t.  So random search is typically a better choice.
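Picking up the grid search sketch from above, switching to random search is a one-option change:

    % Random search: every one of the 100 trials draws fresh values for both
    % hyperparameters, so a sensitive hyperparameter is probed at up to 100
    % distinct values instead of the 10 levels of a 10 x 10 grid.
    mdl = fitcsvm(X, Y, ...
        'OptimizeHyperparameters', {'BoxConstraint','KernelScale'}, ...
        'HyperparameterOptimizationOptions', struct( ...
            'Optimizer', 'randomsearch', ...
            'MaxObjectiveEvaluations', 100));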

Grid search and random search are nice because it’s easy to understand what’s going on.  However, they still require many function evaluations. They also don’t take advantage of the fact that, as we evaluate more and more combinations of hyperparameters, we learn how those values affect our results. For that reason, you can use techniques that create a surrogate model – or an approximation of the error as a function of the hyperparameters.

Bayesian optimization is one such technique. Here we see an example of a Bayesian optimization algorithm running, where each dot corresponds to a different combination of hyperparameters. We can also see the algorithm’s surrogate model, shown here as the surface, which it is using to pick the next set of hyperparameters.
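The toolbox exposes this directly through the bayesopt function, which maintains a Gaussian process surrogate like the surface shown in the video. Here is a minimal sketch, again with ionosphere as a stand-in and assumed search ranges:

    % Bayesian optimization of cross-validated SVM loss.
    load ionosphere
    c = cvpartition(numel(Y), 'KFold', 5);   % fixed folds, for comparable trials
    box  = optimizableVariable('box',  [1e-3, 1e3], 'Transform', 'log');
    kern = optimizableVariable('kern', [1e-3, 1e3], 'Transform', 'log');
    lossfun = @(p) kfoldLoss(fitcsvm(X, Y, 'CVPartition', c, ...
        'BoxConstraint', p.box, 'KernelScale', p.kern));
    results = bayesopt(lossfun, [box, kern], ...
        'MaxObjectiveEvaluations', 30, ...
        'PlotFcn', @plotObjectiveModel);     % draws the surrogate surface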

One other really cool thing about Bayesian optimization is that it doesn’t just look at how accurate a model is. It can also take into account how long it takes to train.  There could be sets of hyperparameters that cause the training time to increase by factors of 100 or more, and that might not be so great if we’re trying to hit a deadline. You can configure Bayesian optimization in a number of ways, including expected improvement per second, which penalizes hyperparameter values that are expected to take a very long time to train.
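Reusing lossfun and the variables from the previous sketch, that configuration is a single option:

    % 'expected-improvement-per-second' divides the expected improvement by a
    % second surrogate's prediction of training time, steering the search away
    % from hyperparameter values whose models are very slow to fit.
    results = bayesopt(lossfun, [box, kern], ...
        'AcquisitionFunctionName', 'expected-improvement-per-second');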

Now, the main reason to do hyperparameter optimization is to improve the model. And, although there are other things we could do to improve it, I like to think of hyperparameter optimization as a low-effort, high-compute type of approach. This is in contrast to something like feature engineering, where you put in more effort to create the new features but need less computational time. It's not always obvious which activity is going to have the biggest impact, but the nice thing about hyperparameter optimization is that it lends itself well to "overnight runs," so you can sleep while your computer works.

That was a quick explanation of hyperparameter optimization. For more information, check out the links in the description.


Related Products

  • Statistics and Machine Learning Toolbox

Learn More

Bayesian Optimization Workflow
Model Building and Assessment
Bayesian Optimization Documentation
What Is AutoML?

Related Information
MATLAB for Machine Learning


Up Next:

Part 4: Embedded Systems (2:30)
Walk through several key techniques and best practices for running your machine learning model on embedded devices.

View full series (4 Videos)

Related Videos:

  • Machine Learning Made Easy (34:34)
  • Machine Learning for Predictive Modelling (Highlights) (5:36)
  • Machine Learning for Predictive Modelling (44:37)
  • Machine Learning with MATLAB (41:25)
  • Machine Learning with MATLAB: Getting Started with... (34:31)
