How to increase the size of a data array?

Question

JP Deka 2024-1-28


https://ww2.mathworks.cn/matlabcentral/answers/2075236-how-to-increase-the-size-of-a-data-array


Suppose I am given a data array y generated using

x = 0:0.1:6;

y = sin(x);

Now, we can evaluate the size of x and from there, we can evaluate the size of y.

But suppose we are given only a dataset of size m, with no independent variable from which it was generated. How can we increase the size of the dataset from m to 2*m?


Dyuman Joshi 2024-1-28

You can concatenate the data.

See the documentation for cat.
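
A minimal sketch of that suggestion, assuming y is a row vector as in the question. The second array, y_new, is a hypothetical stand-in for whatever additional samples are available; it is not part of the original question.

```matlab
x = 0:0.1:6;
y = sin(x);                 % row vector from the question, m elements
y_new = cos(x);             % hypothetical second batch of m samples (assumption)

% Concatenate along dimension 2 (columns) so the result stays a row vector
y_2m = cat(2, y, y_new);
disp(size(y_2m))            % 1-by-2*m
```

For row vectors, cat(2, y, y_new) is equivalent to [y, y_new].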


Answer 1

Yash 2024-2-6


https://ww2.mathworks.cn/matlabcentral/answers/2075236-how-to-increase-the-size-of-a-data-array#answer_1403561


Hi,

As I understand it, you want to increase the size of your dataset from "m" to "2*m" without access to any independent variable. Here are a few approaches you can take:

1. Replication

You can simply replicate the dataset to increase its size. This doesn't provide new information but doubles the number of data points.

x = 0:0.1:6;
y = sin(x);
y_doubled = repmat(y, 1, 2); % Replicates the array y twice along the second dimension

2. Interpolation

If the data points represent samples from a continuous function and are evenly spaced, you could interpolate between the points to create new data points. However, this assumes that the data behaves nicely between the points.

m = length(y); % Original size of the dataset
x_original = 1:m; % Original indices for the dataset
x_interpolated = linspace(1, m, 2*m); % New indices for the interpolated dataset
y_interpolated = interp1(x_original, y, x_interpolated, 'spline'); % Interpolate y to have 2*m points
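
To get a feel for what the "behaves nicely" assumption means, the sketch below compares the interpolated values with the true function. It assumes, purely for illustration, that the data came from sin(x) on 0:0.1:6, which is exactly the information the question says is unavailable in practice. The names idx_new, x_new and max_err are introduced here and are not part of the original answer.

```matlab
x = 0:0.1:6;
y = sin(x);
m = length(y);

idx_new = linspace(1, m, 2*m);                    % fractional sample indices
y_interp = interp1(1:m, y, idx_new, 'spline');    % 2*m interpolated values

% Map the fractional indices back to x, only to measure the error
x_new = x(1) + (idx_new - 1) * 0.1;
max_err = max(abs(y_interp - sin(x_new)));
fprintf('Points: %d, max abs error: %.2e\n', numel(y_interp), max_err);
```

Note that linspace(1, m, 2*m) generally does not land on the original integer indices, so the original samples are not reproduced exactly. If they must be retained, 2*m-1 points give a spacing of exactly 0.5 and keep every original sample.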

3. Data Augmentation

In machine learning, data augmentation techniques are often used to increase the size of the dataset by adding slightly modified copies of already existing data or newly created synthetic data. Techniques include adding noise, scaling, or other transformations that are known to be plausible given the nature of the data.

y_augmented = [y, y + 0.05 * randn(1, m)]; % Append a noisy copy of y, giving a 1-by-2*m row vector (m = length(y) as above)

4. Bootstrapping

Bootstrapping is a statistical method that involves sampling with replacement. It can be used to create a new dataset of size 2*m by resampling the original dataset.

y_bootstrapped = y(randi(m, 1, 2*m)); % Randomly sample from y with replacement
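
A sketch of the same resampling with a fixed random seed so the result is reproducible; the seed value is arbitrary. It also checks two properties worth keeping in mind: a bootstrap sample contains only values already present in y, and it uses at most m distinct originals. The names idx, y_boot, all_from_y and n_unique are introduced here for illustration.

```matlab
rng(0);                                   % arbitrary seed, for reproducibility only
x = 0:0.1:6;
y = sin(x);
m = numel(y);

idx = randi(m, 1, 2*m);                   % 2*m indices drawn uniformly with replacement
y_boot = y(idx);                          % resampled data, order is not preserved

all_from_y = all(ismember(y_boot, y));    % true: no new values are created
n_unique = numel(unique(idx));            % number of distinct originals actually used
fprintf('%d points drawn, %d distinct originals used\n', numel(y_boot), n_unique);
```

Because the sample order is randomized, this approach suits statistics computed over the whole set rather than data where the ordering carries meaning, such as a time series.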

5. Extrapolation

Extrapolation is the process of estimating beyond the original observation range, which is highly speculative and often inaccurate, especially without knowledge of the underlying process.

% This is generally not recommended unless you have a good model of the data
p = polyfit((1:m)', y', 5); % Fit a polynomial of degree 5 to the data
x_extrapolated = (1:2*m)';
y_extrapolated = polyval(p, x_extrapolated); % Evaluate the polynomial at new points
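
To illustrate how quickly such an extrapolation can drift, the sketch below again assumes, for illustration only, that the true function sin(x) is known and compares the fit error inside and outside the original range. It uses the centering-and-scaling output of polyfit (mu) to keep the degree-5 fit well conditioned; the names t, t_ext, err_in and err_out are introduced here.

```matlab
x = 0:0.1:6;
y = sin(x);
m = numel(y);

t = (1:m)';                               % sample indices used as the abscissa
[p, ~, mu] = polyfit(t, y', 5);           % mu = [mean(t); std(t)], used for scaling

t_ext = (1:2*m)';
y_ext = polyval(p, t_ext, [], mu);        % evaluate with the same centering and scaling

x_ext = (t_ext - 1) * 0.1;                % x values the indices correspond to
err_in  = max(abs(y_ext(1:m)     - sin(x_ext(1:m))));
err_out = max(abs(y_ext(m+1:end) - sin(x_ext(m+1:end))));
fprintf('max error inside range: %.3g, outside: %.3g\n', err_in, err_out);
```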

Each of these methods has its own assumptions and potential pitfalls. It's important to choose a method that's appropriate for the characteristics of your data and the requirements of your analysis.

If you would like advice specific to your dataset, please share your code and data so that we can take a closer look.

Hope this helps!
