How to process data Bins

Simon Page 2020-9-30
Answered: sanidhyak 2025-7-2
I am trying to change the following binscatter graph so that it no longer expresses each bin count as a percentage of the total number of datapoints. Instead, I want each bin to show its percentage of the datapoints that fall within its own x-bin.
That is, the x-axis is seconds and the y-axis is mm of displacement. Each x-axis bin covers 60 seconds, and for the first x-bin the bin currently shown as 4% should actually be about 90% of the total datapoints in those first 60 seconds. The colorbar should span 0-100% (or alternatively 0-1 instead of percent).
Can anyone help me?
xEdges = min(xVals):60:max(xVals);
yEdges = min(yVals):1:max(yVals);
hHist = histogram2(hAxes,xVals,yVals,xEdges,yEdges,'DisplayStyle','tile', 'Normalization', 'probability');
hist_Counts = histcounts2(xVals,yVals,xEdges,yEdges);
hist_Counts_normalized = (hist_Counts / sum(hist_Counts,'all'))*100;
hHist = histogram2(hAxes,'XBinEdges',xEdges,'YBinEdges',yEdges,'DisplayStyle','tile','BinCounts',hist_Counts_normalized);
colorbar;
xlim([0 2400])
ylim([-6 6])
xBinCenters = hHist.XBinEdges - [0,diff(hHist.XBinEdges)/2];
yBinCenters = hHist.YBinEdges - [0,diff(hHist.YBinEdges)/2];
for i = 2:numel(xBinCenters)
    for j = 2:numel(yBinCenters)
        x_loc = xBinCenters(i);
        y_loc = yBinCenters(j);
        if ~(round(hHist.Values(i-1,j-1),2) < 0.2)
            text(hAxes,x_loc,y_loc,[num2str(round(hHist.Values(i-1,j-1),2)),'%'],'Color',[0 0 0],'FontSize',10,'FontWeight','bold');
        else
            continue
        end
    end
end

Answers (1)

sanidhyak 2025-7-2
I understand that you want the 2D binscatter ("histogram2") to show, for each x-axis bin, the percentage distribution of displacement values within that bin, rather than each bin count as a percentage of the total number of datapoints.
When MATLAB's "histogram2" function is used with "Normalization","probability", the histogram is normalized globally: each bin count is divided by the total number of input datapoints. Your manual normalization, which divides by sum(hist_Counts,'all'), does essentially the same thing. This is why your current implementation shows global percentages rather than per-x-bin percentages.
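As a small illustration of the difference, the sketch below uses a made-up 2-by-2 count matrix (not the data from the question) in which the first x-bin holds only 5 of 100 datapoints. The variable names are hypothetical, and dividing a matrix by a column vector with "./" assumes implicit expansion (R2016b or later).

```matlab
% Made-up counts for illustration: rows are x-bins, columns are y-bins.
counts = [ 4  1;    % x-bin 1: 5 datapoints in total
          15 80];   % x-bin 2: 95 datapoints in total

% Global normalization: every bin is divided by the grand total (100).
globalPct = counts / sum(counts, 'all') * 100;   % [4 1; 15 80]

% Per-x-bin normalization: every row is divided by its own row sum.
perXBinPct = counts ./ sum(counts, 2) * 100;     % about [80 20; 15.79 84.21]
```

The first bin holds 4% of all datapoints but 80% of the datapoints in its own x-bin, which is the kind of gap described in the question.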
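To resolve this, compute the raw counts with "histcounts2" and normalize each x-bin independently. "histcounts2" returns a matrix whose rows correspond to x-bins and whose columns correspond to y-bins, so this is a per-row normalization. Kindly refer to the following corrected implementation: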
% Define bin edges
xEdges = min(xVals):60:max(xVals);
yEdges = min(yVals):1:max(yVals);
% Compute raw 2D histogram
hist_Counts = histcounts2(xVals, yVals, xEdges, yEdges);
% Normalize each x-bin independently
hist_Counts_normalized = zeros(size(hist_Counts));
for i = 1:size(hist_Counts,1)
    row_sum = sum(hist_Counts(i,:));
    if row_sum ~= 0
        hist_Counts_normalized(i,:) = hist_Counts(i,:) / row_sum * 100;
    end
end
% Plot the normalized histogram
hHist = histogram2(hAxes, ...
    'XBinEdges', xEdges, ...
    'YBinEdges', yEdges, ...
    'BinCounts', hist_Counts_normalized, ...
    'DisplayStyle', 'tile');
colorbar;
caxis([0 100]); % Scale colorbar from 0 to 100%
% Add axis limits
xlim([0 2400])
ylim([-6 6])
% Compute bin centers
xBinCenters = hHist.XBinEdges(1:end-1) + diff(hHist.XBinEdges)/2;
yBinCenters = hHist.YBinEdges(1:end-1) + diff(hHist.YBinEdges)/2;
% Annotate bins with values
for i = 1:numel(xBinCenters)
    for j = 1:numel(yBinCenters)
        val = hist_Counts_normalized(i,j);
        if val >= 0.2
            text(hAxes, xBinCenters(i), yBinCenters(j), ...
                [num2str(round(val,2)),'%'], ...
                'Color', [0 0 0], 'FontSize', 10, 'FontWeight', 'bold');
        end
    end
end
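This solution ensures that each time window ("x-bin") reflects the local distribution of displacements ("y-bins"): every vertical column of tiles in the plot, which corresponds to one row of the count matrix, sums to 100%. The only exception is an x-bin with no datapoints, which stays at 0. The "colorbar" and the bin annotations now show localized behavior over time.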
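The per-x-bin loop above can also be written without a loop. The following is a minimal sketch on made-up sample data (the values and the name "rowSums" are hypothetical, chosen only so the result is easy to check by hand); it assumes implicit expansion (R2016b or later) and guards against empty x-bins by replacing a zero row sum with 1.

```matlab
% Made-up sample data for illustration only.
xVals  = [5 10 20 30 70 80 90 100 110];            % seconds
yVals  = [0.5 0.5 0.5 1.5 0.5 1.5 1.5 1.5 1.5];    % mm
xEdges = 0:60:180;   % three x-bins; the last one contains no data
yEdges = 0:1:2;      % two y-bins

% Rows of counts are x-bins, columns are y-bins: [3 1; 1 4; 0 0]
counts = histcounts2(xVals, yVals, xEdges, yEdges);

% Row sums, with zeros replaced by 1 so empty x-bins give 0 instead of NaN.
rowSums = sum(counts, 2);
rowSums(rowSums == 0) = 1;

% Percentage of each x-bin's datapoints per y-bin: [75 25; 20 80; 0 0]
pct = counts ./ rowSums * 100;

% Same result on a 0-1 scale, if a fraction is preferred over percent.
frac = counts ./ rowSums;
```

Passing "pct" as 'BinCounts' gives a colorbar from 0 to 100, while passing "frac" gives one from 0 to 1, in which case the color limits would be [0 1] rather than [0 100].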
For further reference on histogram bin normalization, you may refer to the official MathWorks documentation.
I hope this helps!
