I have six groups (named A to F) of continuous data, and most of the groups follow a non-normal distribution. I plotted the values with a boxplot ('Notch' set to 'on') and ran a Kruskal-Wallis test, which confirmed that the groups do not all come from the same distribution. I then used multcompare to check the significance of each pair of groups. The data are in fdata, the group names in fgroups:
% Notched boxplot; outliers drawn as red dots
boxplot(fdata, 'Notch', 'on', 'Symbol', 'r.');

% Kruskal-Wallis test across all groups
[p, tbl, stats] = kruskalwallis(fdata, fgroups, 'on');
disp(tbl);

% Post-hoc pairwise comparisons
c = multcompare(stats, 'display', 'on');
[ncomp, nccol] = size(c);

disp(' ');
disp(' Comparing groups - showing only significant differences');
for j = 1:ncomp
    if c(j, nccol) <= 0.05   % last column holds the p-value
        disp(['  Group ' fgroups{c(j,1)} ' to ' fgroups{c(j,2)} ...
              ' - p = ' num2str(c(j, nccol))]);
    end
end
Both the printout and the plot of the mean rank sums showed that groups B, D & F were not significantly different from one another. However, in the boxplot the notch for group D clearly does not overlap with the notches for groups B & F, which would indicate that D is significantly different from B & F. When I separated out B, D & F and analysed them on their own, multcompare gave (what I assume to be) the correct answer: D is significantly different from B & F, although B & F are not different from each other.
So what is going on? I note from the plot that multcompare is comparing 'mean rank sums', and that the ranks are computed across all of the groups rather than just between each pair being compared. Obviously, with fewer groups you get different rank sums and therefore a different answer, which doesn't seem right.
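For comparison, here is a minimal sketch of the pairwise-rank alternative I have in mind: running ranksum (Wilcoxon rank-sum / Mann-Whitney) on each pair separately, so the ranks come only from the two groups being compared. This assumes fdata is a matrix with one column per group (matching the boxplot call above) and applies a simple Bonferroni correction; the column layout is an assumption on my part.

```matlab
% Cross-check (sketch): pairwise Wilcoxon rank-sum tests, so the ranks
% are computed only from the two groups in each comparison.
% ASSUMES fdata has one column per group; adjust the indexing if your
% data are instead a single vector with a separate grouping variable.
ng = size(fdata, 2);
npairs = ng * (ng - 1) / 2;
for a = 1:ng-1
    for b = a+1:ng
        praw = ranksum(fdata(:, a), fdata(:, b));
        padj = min(1, praw * npairs);   % Bonferroni-adjusted p-value
        if padj <= 0.05
            disp(['  Group ' fgroups{a} ' vs ' fgroups{b} ...
                  ' - adjusted p = ' num2str(padj)]);
        end
    end
end
```

This is only a cross-check, not a replacement for multcompare; it should at least show whether the pairwise-rank view agrees with the separate B/D/F analysis.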
Of course, it may be that I'm using multcompare incorrectly - please advise.