Using accumarray for histcounts2 with >1024 bins in 1 dimension

As the title says, I'm trying to generate bivariate histograms of datasets that will often need more than 1024 bins in one dimension, given the data fidelity / bin width I require. I'm therefore trying to use accumarray in place of histcounts2, but I'm having trouble defining subs.
For full context, I'm starting from an arbitrarily sized sparse array (let's say 100-by-100000). I then extract its nonzero entries into a three-column array (row, column, value) using find(), and finally try to generate a bivariate histogram to visualize the data.
Sidenote: the variable names used here are meant to be useful for the example; they don't reflect the actual variable names in my script.
DataSparse = sparse(100,100000);
DataSparse(randi(100*100000,[474 1])) = (82-36).*rand(474,1)+36;
[DataFull(:,1),DataFull(:,2),DataFull(:,3)] = find(DataSparse);
[~,~,subs] = unique([DataFull(:,2),quant(DataFull(:,3),5)],'rows');
%The columns of the original sparse matrix are already at the minimum
%acceptable fidelity for the data in that dimension, but the
%"height"/z-data/values in the sparse matrix can be dropped to a lower
%resolution.
%I know the subs definition is lacking, but am unsure how to properly
%define it. It's just where I'm at now.
BigHistogram = accumarray(subs,DataFull(:,3),[],@numel,[],1);
%again, this doesn't work to generate the equivalent of
%BigHistogram = histcounts2(DataFull(:,2),DataFull(:,3),'BinWidth',[55 5]),
%but histcounts2 fails to retain the desired resolution if there's
%sufficiently far-flung data in DataSparse.
Help?

Accepted Answer

Steven Lord 2023-2-3
MATLAB limits the number of bins if you specify BinWidth. If you specify a list of edges MATLAB will use that list of edges to determine the bins even if that results in more than 1024 bins.
DataSparse = sparse(100,100000);
DataSparse(randi(100*100000,[474 1])) = (82-36).*rand(474,1)+36;
[DataFull(:,1),DataFull(:,2),DataFull(:,3)] = find(DataSparse);
% Set up bin edge vectors
[min2, max2] = bounds(DataFull(:, 2));
xedges = min2:55:max2;
[min3, max3] = bounds(DataFull(:, 3));
yedges = min3:5:max3;
BigHistogram = histcounts2(DataFull(:,2),DataFull(:,3), ...
    'XBinEdges', xedges, 'YBinEdges', yedges);
whos
  Name               Size               Bytes  Class     Attributes

  BigHistogram    1806x9               130032  double
  DataFull         474x3                11376  double
  DataSparse       100x100000          807704  double    sparse
  cmdout             1x33                  66  char
  max2               1x1                    8  double
  max3               1x1                    8  double
  min2               1x1                    8  double
  min3               1x1                    8  double
  xedges             1x1807             14456  double
  yedges             1x10                  80  double
Note the sizes of xedges and BigHistogram.
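The accumarray route from the question can also be made to work once each point has a bin index in both dimensions. The following is only a minimal sketch and not part of the answer above: it assumes made-up data in place of DataFull(:,2) and DataFull(:,3), uses discretize to obtain the bin indices, and extends each edge vector by one bin width so that the maxima fall inside the last bin. The variable names x, y, ix, iy, counts and ref are illustrative.

```matlab
% Sketch: 2-D histogram counts via discretize + accumarray (assumed approach).
rng(0);                             % reproducible made-up data
x = randi(100000, 474, 1);          % stands in for DataFull(:,2)
y = (82-36).*rand(474,1) + 36;      % stands in for DataFull(:,3)

% Bin edges, extended by one bin width so the maxima are covered.
xedges = min(x):55:(max(x) + 55);
yedges = min(y):5:(max(y) + 5);

% discretize returns, for every point, the index of the bin it falls in.
ix = discretize(x, xedges);
iy = discretize(y, yedges);

% Each row of [ix, iy] is the (row, column) bin of one point;
% accumulating the scalar 1 counts the points per bin.
counts = accumarray([ix, iy], 1, [numel(xedges)-1, numel(yedges)-1]);

% Cross-check against histcounts2 using the same edges.
ref = histcounts2(x, y, xedges, yedges);
assert(isequal(counts, ref))
```

Passing the output size explicitly keeps trailing empty bins in the result, so counts has the same dimensions as the histcounts2 output.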
3 Comments
Steven Lord 2023-2-3
You could test whether the last element of xedges is strictly less than max2. If it is, concatenate that last element plus 55 to the xedges vector. Alternatively, add 55 to max2 and use the result as the third input to colon when you build xedges.
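The two options might look like the sketch below. It uses a small made-up vector x and the 1-D histcounts function purely for illustration; the names x, binWidth, xedges1 and xedges2 are assumptions, and the same edge vectors could be passed to histcounts2 as 'XBinEdges'.

```matlab
% Sketch: make sure the last edge reaches the maximum data value.
x = [3; 58; 120; 170];              % made-up sample data
binWidth = 55;
[xmin, xmax] = bounds(x);

% Option 1: build the edges, then append one more edge only if needed.
xedges1 = xmin:binWidth:xmax;       % 3 58 113 168, so 170 is not covered
if xedges1(end) < xmax
    xedges1(end+1) = xedges1(end) + binWidth;
end

% Option 2: extend the upper limit given to colon by one bin width.
xedges2 = xmin:binWidth:(xmax + binWidth);

counts1 = histcounts(x, xedges1);
counts2 = histcounts(x, xedges2);
```

For this data both options give the same edges. They differ only when the maximum lies exactly on an edge: option 2 then adds one extra, empty bin at the top end, while option 1 adds nothing.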
Gabriel Stanley 2023-2-3
Yeah, I've ended up getting to the second option. Ultimately it's not a big deal if there's an extra empty bin at the top end. Ty.

Release: R2019b
