Clean Missing Data

Find, fill, or remove missing data in the Live Editor

Since R2019b

Description

The Clean Missing Data task lets you interactively handle missing data values such as NaN or <missing>. The task automatically generates MATLAB® code for your live script.

Using this task, you can:

  • Find, fill, or remove missing data in a workspace variable.

  • Customize the method for filling data.

  • Define nonstandard missing value indicators.

  • Visualize the missing data and the cleaned data.

Related Functions

Clean Missing Data generates code that uses the ismissing, standardizeMissing, fillmissing, and rmmissing functions.
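As a rough illustration of how these four functions fit together, here is a minimal sketch on a small made-up vector. The sample values and the -99 indicator are assumptions chosen for illustration; the code the task actually generates may differ.

```matlab
% Hypothetical data: NaN is a standard missing value, -99 a nonstandard one.
A = [1 NaN 3 -99 5];

% Convert the nonstandard indicator -99 to the standard missing value (NaN).
A = standardizeMissing(A, -99);   % [1 NaN 3 NaN 5]

% Find missing entries.
tf = ismissing(A);                % logical [0 1 0 1 0]

% Fill missing entries by linear interpolation.
F = fillmissing(A, "linear");     % [1 2 3 4 5]

% Remove missing entries instead of filling them.
R = rmmissing(A);                 % [1 3 5]
```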

[Figure: Clean Missing Data task in the Live Editor]

Open the Task

To add the Clean Missing Data task to a live script in the MATLAB Editor:

  • On the Live Editor tab, select Task > Clean Missing Data.

  • In a code block in the script, type a relevant keyword, such as missing, NaN, fill, or remove. Select Clean Missing Data from the suggested command completions. For some keywords, the task automatically updates one or more corresponding parameters.

Examples

Interactively fill missing values in nonuniformly sampled data.

Create a vector of nonuniform sample points, and evaluate the sine function over the points.

x = [-4*pi:0.1:0 0.1:0.2:4*pi];
A = sin(x);

Inject missing values into A.

A(A < 0.75 & A > 0.5) = missing;

Open the Clean Missing Data task in the Live Editor. To clean the data, select A as the input data and x as the x-axis coordinates of the data.

The Clean Missing Data task can fill or remove missing data. To fill the missing entries using linear interpolation of neighboring nonmissing values, use the Cleaning method field to select Fill missing and Linear interpolation.

The task plots the cleaned data and indicates that the linear interpolation filled 21 missing entries in the input data.

Because the default legend location covers some filled missing entries, specify the legend location as the outside top-right corner of the axes.

legend("Location","northeastoutside")
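For readers who prefer to work outside the task, the same cleaning step can be approximated at the command line. This is a sketch, assuming the task's selections map to a fillmissing call with the "linear" method and x supplied as sample points. The variable names cleanedData and missingIndices are illustrative and may not match the generated code.

```matlab
% Recreate the example data.
x = [-4*pi:0.1:0 0.1:0.2:4*pi];
A = sin(x);
A(A < 0.75 & A > 0.5) = missing;

% Fill missing entries by linear interpolation over the nonuniform points x.
% The second output marks which entries were filled.
[cleanedData, missingIndices] = fillmissing(A, "linear", "SamplePoints", x);

% Count the filled entries and plot the result.
numFilled = nnz(missingIndices);
plot(x, cleanedData, "-", x(missingIndices), cleanedData(missingIndices), "o")
legend("Cleaned data", "Filled missing entries", "Location", "northeastoutside")
```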

Parameters

This task operates on input data contained in a vector, table, or timetable. The data can be of type single, double, duration, calendarDuration, datetime, categorical, string, or char, or a cell array of character vectors.

When providing a table or timetable for the input data, select All supported variables to clean all variables with a supported type. Select All numeric variables to clean all variables of type single or double. To choose specific supported variables to clean, select Specified variables and then select the variables individually.
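At the command line, a comparable variable selection can be expressed with the DataVariables option of fillmissing. The sketch below uses a small hypothetical table, and the "previous" fill method is chosen only for illustration. How the task itself encodes these selections in generated code is not shown here.

```matlab
% Hypothetical table with one numeric and one string variable.
T = table([1; NaN; 3], ["a"; missing; "c"], 'VariableNames', {'Num', 'Str'});

% Clean only numeric variables (comparable to "All numeric variables").
Tnum = fillmissing(T, "previous", "DataVariables", @isnumeric);

% Clean one variable chosen by name (comparable to "Specified variables").
Tsel = fillmissing(T, "previous", "DataVariables", "Str");
```

In Tnum, only Num is filled and Str keeps its missing entry. In Tsel, only Str is filled.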

Specify the method for filling missing data as one of these options.

Method                                        Description
Linear interpolation                          Linear interpolation of neighboring, nonmissing values
Constant value                                Specified scalar value, which is 0 by default
Previous value                                Previous nonmissing value
Next value                                    Next nonmissing value
Nearest value                                 Nearest nonmissing value as defined by the x-axis
Spline interpolation                          Piecewise cubic spline interpolation
Shape-preserving cubic interpolation (PCHIP)  Shape-preserving piecewise cubic spline interpolation
Modified Akima cubic interpolation            Modified Akima cubic Hermite interpolation
Moving median                                 Moving median with specified window size
Moving mean                                   Moving mean with specified window size
K-nearest neighbors                           Mean of nearest neighbors defined by a distance function
Custom function                               Custom fill method, specified as a local function or a function handle
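To make the differences between some of these methods concrete, the following sketch applies the corresponding fillmissing method names to a small made-up vector. The mapping from each task option to a method string is an assumption based on the fillmissing function, and only a subset of the methods is shown.

```matlab
% Hypothetical vector with a two-element gap and a one-element gap.
A = [1 NaN NaN 4 NaN 6];

Fconst = fillmissing(A, "constant", 0);  % [1 0 0 4 0 6]
Fprev  = fillmissing(A, "previous");     % [1 1 1 4 4 6]
Fnext  = fillmissing(A, "next");         % [1 4 4 4 6 6]
Flin   = fillmissing(A, "linear");       % [1 2 3 4 5 6]

% Cubic methods follow the same calling pattern.
Fpchip  = fillmissing(A, "pchip");
Fspline = fillmissing(A, "spline");
Fmakima = fillmissing(A, "makima");
```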

Specify the window type and size when the method for filling missing data is Moving median or Moving mean.

Window      Description
Centered    Specified window length centered about the current point
Asymmetric  Specified window containing the number of elements before the current point and the number of elements after the current point

Window sizes are relative to the X-axis variable units.
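The two window types can be sketched with fillmissing as follows, assuming a centered window corresponds to a scalar window length and an asymmetric window to a two-element vector [nb na]. The data is made up for illustration, and no sample points are given, so the window is measured in elements.

```matlab
% Hypothetical vector with one missing entry at index 3.
A = [1 2 NaN 4 5];

% Centered window of length 3: averages the nonmissing values at indices 2 to 4.
Fcentered = fillmissing(A, "movmean", 3);       % [1 2 3 4 5]

% Asymmetric window: 2 elements before and 0 after the current point,
% so the fill uses only indices 1 to 3.
Fasym = fillmissing(A, "movmean", [2 0]);       % [1 2 1.5 4 5]
```

When sample points are supplied through the SamplePoints option, the window is instead measured in the units of those points.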

Version History

Introduced in R2019b