wrong values in histogram plotting

Hello,
I'm trying to plot a histogram of an array. I have a CSV file with a list of double values, and I want to see how many elements have a value less than or equal to 10% of the maximal value, 20%, 30%, and so on. The code I tried gives wrong statistics: when I count how many elements are at or below 10% of the maximal element, I find 11173940 such elements. I counted them with the following code:
maxElement = max(array);
elementCount = sum(array < maxElement * 0.1);
When I plot the histogram, however, it shows fewer than 180 elements satisfying this condition. This is the code I used (I have many CSV files that I want to read and analyze in the same way, hence the loop over file names):
clear; clc;
dataDir = 'hist_res_rel';
fileList = dir(fullfile(dataDir, '*.csv'));
plotDir = 'plot_dir_rel';
for i = 1:numel(fileList)
    fileName = fileList(i).name;
    % The two characters before '.csv' hold the epoch number
    epoch = fileName(end-5:end-4);
    % Square brackets keep the trailing space that strcat would strip
    if contains(fileName,'a_rel')
        plot_title = ['A Relative Value Change Between Epochs: ', epoch, '-', num2str(str2double(epoch)+10)];
    end
    if contains(fileName,'b_rel')
        plot_title = ['B Relative Value Change Between Epochs: ', epoch, '-', num2str(str2double(epoch)+10)];
    end
    % fullfile inserts the path separator between folder and file name
    rel_val = readmatrix(fullfile(dataDir, fileName));
    rel_val = abs(rel_val);
    Max = max(rel_val);
    p = 0.1;
    % x holds the thresholds 10%, 20%, ..., 100% of Max; y holds the count per interval
    x = zeros(10, 1);
    y = zeros(10, 1);
    for index = 1:10
        percentage = Max * p;
        x(index) = percentage;
        if index == 1
            y(index) = sum(rel_val <= x(index));
        else
            y(index) = sum(rel_val <= x(index) & rel_val > x(index-1));
        end
        p = p + 0.1;
    end
    f = histogram(rel_val, x);
    xticks(x);
    title(plot_title);
    xlabel('Percentage of Relative Change');
    ylabel('Amount of Parameters');
    xticklabels({'0', '10','20','30', '40', '50', '60', '70', '80', '90', '100'});
    % end-4 drops the whole '.csv' extension, including the dot
    saveas(f, strcat(plotDir, '/plot_', fileName(1:end-4), '.jpg'));
end
This is the histogram that I get:
And this is the CSV file that I'm trying to analyze, just to make sure everything works (sorry, it's so large that I had to use an external site for the upload):
Thank you so much for your time and attention, I appreciate your help.

Accepted Answer

Ganesh 2023-12-27
I understand that your histogram is inconsistent with your data. You can resolve this by adding 0 at the start of the variable "x".
histogram() counts the data points that fall between consecutive edges. Because "x" starts at Max*0.1, the first bin drawn covers Max*0.1 to Max*0.2, and every element below Max*0.1 is left out of the plot. Adding 0 at the start makes the first bin run from 0 to Max*0.1, which gives the right result.
x = [0;x] % Add this line before plotting the histogram
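To check the effect of the extra edge without opening a figure, the same comparison can be run with histcounts, which takes the same edge vector as histogram(). This is only a sketch on made-up data: rel_val below is random and stands in for the values read from the CSV file.

```matlab
% Synthetic stand-in for rel_val (hypothetical data, not the asker's CSV).
rng(0);                              % fixed seed so the run is repeatable
rel_val = abs(randn(1000, 1));
Max = max(rel_val);

x = Max * (1:10).' / 10;             % the original edges: 10%, 20%, ..., 100% of Max

countsWithoutZero = histcounts(rel_val, x);       % 9 bins, values below x(1) are not counted
countsWithZero    = histcounts(rel_val, [0; x]);  % 10 bins, first bin covers [0, x(1))

firstBinByHand = sum(rel_val < x(1));             % direct count for comparison

fprintf('bins without 0: %d, bins with 0: %d\n', ...
        numel(countsWithoutZero), numel(countsWithZero));
fprintf('first bin: histcounts = %d, by hand = %d\n', ...
        countsWithZero(1), firstBinByHand);
```

Each bin includes its left edge and excludes its right edge (only the last bin includes both), so a value exactly equal to an edge is counted one bin higher than a `<=` comparison would place it. For continuous data such ties are rare, but they can explain small differences between the manual counts and the plotted bars.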
Kindly refer to the following document for more information and examples on using the "histogram()" function:
Hope this helps!
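As a possible follow-up, the edges can also be built in one step instead of in a loop. The sketch below is an assumption about how the plotting part could be restructured, not the code from the question: rel_val is random stand-in data, and linspace is used because repeatedly adding 0.1 to p ends slightly below 1 in floating point, which can put the last edge just under Max and leave the largest element outside the last bin.

```matlab
% Hypothetical stand-in data; replace with the values read from the CSV file.
rng(1);
rel_val = abs(randn(5000, 1));
Max = max(rel_val);

% 11 edges from 0 to Max give 10 bins of 10% each.
edges = linspace(0, Max, 11);

% For comparison: the factor reached by adding 0.1 nine times to 0.1.
p = 0.1;
for k = 1:9
    p = p + 0.1;
end
fprintf('accumulated p - 1 = %g\n', p - 1);   % negative: p ends just below 1

fig = figure('Visible', 'off');               % off-screen figure for batch runs
h = histogram(rel_val, edges);
xticks(edges);
xticklabels(compose('%d', 0:10:100));         % one label per edge: 0, 10, ..., 100
xlabel('Percentage of Maximal Relative Change');
ylabel('Number of Parameters');

counts = h.Values;                            % bin counts as drawn
close(fig);
```

With 11 edges there are also 11 tick positions, so the 11 labels from 0 to 100 line up with the bin boundaries.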
1 Comment
Elinor Ginzburg 2023-12-27
Thank you so much! Yes, that was the problem. I don't use MATLAB very often, so I'm a bit rusty. Thanks again!

Release

R2022a
