Extracting data from histogram plots

Question

Hello. I'm trying to process some data from some chemical analyses I did a while ago. I have 3 types of data: particle diameter, nitrogen content (%), and sulfur content (%). I've already managed to organize the particle diameter data into a histogram plot with something like 50 bins. Now, I'd like to figure out the average nitrogen and sulfur content of the particles in each bin. I'm not sure how to do this, though, and I haven't found any obvious tutorials to explain how to do this. Any advice?

Answer 1

Adam Danz 2023-3-10

Edited: Adam Danz 2023-3-11

3 methods to group data and compute mean for each group

Each method deals with empty bins differently.

discretize + splitapply

Use discretize to group each value into the bins used in histogram and then splitapply to compute the mean for each group. Note that each bin must contain at least one data point.

Example: compute the mean of data in bins defined by edges.

rng default  % for reproducibility of this demo
data = rand(1,100)*100;
edges = 0:10:100; 
binID = discretize(data,edges)
binID = 1×100
     9    10     2    10     7     1     3     6    10    10     2    10    10     5     9     2     5    10     8    10     7     1     9    10     7     8     8     4     7     2
a = splitapply(@mean,data,binID)
a = 1×10
    5.3838   15.2780   26.0259   35.6310   46.5284   55.4195   66.1338   75.5438   83.3041   94.1885
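
The same grouping vector can also be used to average a different variable from the one that was binned, which is what the question asks for. Here is a minimal sketch, assuming hypothetical column vectors diameter, nitrogen, and sulfur of equal length (the names and values are made up for illustration) and assuming every bin contains at least one particle:

```matlab
% Hypothetical measurements, one row per particle (illustrative values only)
diameter = [1.2; 1.8; 2.4; 2.9; 3.1; 3.7];
nitrogen = [0.0; 2.0; 1.0; 3.0; 4.0; 0.0];   % percent
sulfur   = [1.0; 1.0; 0.0; 2.0; 0.5; 1.5];   % percent

diamEdges = 1:4;                              % bins: [1,2), [2,3), [3,4]
diamBin   = discretize(diameter, diamEdges);  % bin index of each particle

% Average each content column within the diameter bins
meanN = splitapply(@mean, nitrogen, diamBin);
meanS = splitapply(@mean, sulfur,   diamBin);
```

If the plot was created with histogram, the BinEdges property of the returned object could supply the edges instead of a hand-written vector.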

discretize + groupsummary

Use discretize to group each value into the bins and then groupsummary to compute the mean of each group. When working with vectors, the first two arguments must be column vectors.

Note that the output vector skips empty bins. See additional outputs to groupsummary to identify which bins are represented in the first output.

s = groupsummary(data(:),binID(:),'mean')
s = 10×1
    5.3838
   15.2780
   26.0259
   35.6310
   46.5284
   55.4195
   66.1338
   75.5438
   83.3041
   94.1885
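
The second output of groupsummary lists the group (here, the bin number) that each row of the result belongs to. The following is a minimal sketch with made-up data that uses it to place the means back into a vector with one entry per bin. It assumes every value falls inside the edges, so the bin numbers contain no NaN; vals, valEdges, valBin, and perBin are names chosen here for illustration:

```matlab
vals     = [7; 8; 10; 11; 13; 14];       % illustrative data
valEdges = 0:3:15;                       % 5 bins; bins 1 and 2 stay empty
valBin   = discretize(vals, valEdges);   % -> [3;3;4;4;5;5]

% Second output reports which bin each mean belongs to
[s2, bins] = groupsummary(vals, valBin, 'mean');   % bins -> [3;4;5]

perBin = nan(numel(valEdges)-1, 1);      % one slot per bin, NaN marks empty bins
perBin(bins) = s2;                       % -> [NaN; NaN; 7.5; 10.5; 13.5]
```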

discretize + accumarray

Use discretize to group each value into the bins and then accumarray to compute the mean of all bins.

Note that empty bins are represented by a 0.

m = accumarray(binID(:),data(:),[],@mean)
m = 10×1
    5.3838
   15.2780
   26.0259
   35.6310
   46.5284
   55.4195
   66.1338
   75.5438
   83.3041
   94.1885
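
Because a mean of 0 can also be a legitimate result (for example, a bin whose particles contain no nitrogen), it can help to mark empty bins differently. accumarray accepts a fill value as its fifth argument. Here is a minimal sketch with made-up data that fills empty bins with NaN; x, xEdges, xBin, and mNaN are names chosen here for illustration:

```matlab
x      = [7; 8; 10; 11; 13; 14];   % illustrative data
xEdges = 0:3:15;                   % 5 bins; bins 1 and 2 stay empty
xBin   = discretize(x, xEdges);    % -> [3;3;4;4;5;5]
nBins  = numel(xEdges) - 1;

% Third argument fixes the output size; fifth argument fills empty bins
mNaN = accumarray(xBin, x, [nBins 1], @mean, NaN);  % -> [NaN; NaN; 7.5; 10.5; 13.5]
```

Passing the size explicitly also keeps trailing empty bins in the output, which would otherwise be dropped when the highest-numbered bins contain no data.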

Comparison of these methods when some bins are empty

data = randn(100,1)+10;  % expected range: ~6 : ~13
edges = 0:3:15;  % 5 bins but the first two will be empty
binID = discretize(data, edges);
m = accumarray(binID,data,[],@mean)
m = 5×1
         0
         0
    8.4699
   10.1766
   12.7170
s = groupsummary(data,binID(:),'mean')
s = 3×1
    8.4699
   10.1766
   12.7170
a = splitapply(@mean,data,binID)
Error using splitapply
For N groups, every integer between 1 and N must occur at least once in the vector of group numbers.
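
splitapply can still be used when some bins are empty if the bin numbers are first renumbered consecutively, for example with findgroups. This is a minimal sketch with made-up data, not part of the original comparison; y, yEdges, yBin, G, and occupied are names chosen here for illustration, and it assumes all values fall inside the edges:

```matlab
y      = [7; 8; 10; 11; 13; 14];     % illustrative data
yEdges = 0:3:15;                     % 5 bins; bins 1 and 2 stay empty
yBin   = discretize(y, yEdges);      % -> [3;3;4;4;5;5]

% Renumber the occupied bins as 1..N and remember which bins they were
[G, occupied] = findgroups(yBin);    % G -> [1;1;2;2;3;3], occupied -> [3;4;5]
a2 = splitapply(@mean, y, G);        % -> [7.5; 10.5; 13.5]
```

Like groupsummary, this returns one value per occupied bin, and occupied reports which bin each value belongs to.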

Haley Royer 2023-3-11

Hi again. I've run into another issue. When trying to use splitapply I get the following error

"Group numbers must be a vector of positive integers, and cannot be a sparse vector."

My understanding is that because I have values in my column that are zero, splitapply cannot be used. Some of the particles I'm looking at don't have nitrogen or sulfur, but I still have to average a group such as 0 0 0 5.0 2.5. Any way to get around this?

Adam Danz 2023-3-11

Let's keep it civil here.

As you mentioned, if one of the bins has no values, then splitapply won't work.

I'll add alternatives to my answer.
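
Zeros in the values being averaged are not a problem in themselves; the restriction to positive integers applies to the group numbers, which are the last argument of splitapply. One possible cause of the quoted error, an assumption since the failing call isn't shown, is that the content column ended up in the group-number position. A minimal sketch with hypothetical values (diam, nPct, grp, and avgN are made-up names):

```matlab
diam = [1.2; 1.5; 1.8; 1.1; 1.9; 2.5];   % hypothetical particle diameters
nPct = [0; 0; 0; 5.0; 2.5; 4.0];         % hypothetical nitrogen content (%), zeros included

grp  = discretize(diam, [1 2 3]);        % group numbers come from the binned variable
avgN = splitapply(@mean, nPct, grp);     % -> [1.5; 4.0]
```

The first bin averages 0, 0, 0, 5.0, and 2.5 to 1.5, so the zeros simply count toward the mean.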
