How to calculate the conditional probability of an event?

13 次查看(过去 30 天)
I have an array similar to this array = [A A B A C A B B B C C A A C]. I want to calculate p(C|A), p(C|B), p(C|C). How can I do this just having this information? I want to know what is the probability of C happening after a previous event A, B or C.

采纳的回答

William
William 2021-4-23
Hello Myriam. It is not clear whether A, B, and C here are text characters, or if they have numeric values. If we assume they have numerical values, like A=1, B=2 and C=3, then you could use
y = diff(array);
P_AC = sum(y==2);
P_BC = sum(y==1);
P_CC = sum(y==0);
  1 个评论
Myriam Moss
Myriam Moss 2021-4-24
编辑:Myriam Moss 2021-4-24
Hi William. Thank you. They are characters.
I'm new to matlab sorry. Could you explain to me your logic, please?

请先登录,再进行评论。

更多回答(2 个)

William
William 2021-4-25
Actually, I believe that p(C|A) would be:
y = strfind(array, 'A');
N_A = length(y);
p_CA = N_AC/N_A;
There is one further thing to consider, though. It may be true that the very last element of array is an 'A'. I don't think this should be counted in N_A because we don't know whether it would have been followed by a 'C' or not. So, if the last element of array is 'A', we should reduce N_A by 1.
y = strfind(array, 'A');
N_A = length(y);
if y(end)==length(array) || y(end)==length(array)-1 % The string might end
N_A = N_A - 1; % with an 'A' or an 'A '
end
p_CA = N_AC/N_A;

William
William 2021-4-25
Myriam,
If A, B and C were variables with the values 1, 2 and 3, then in your example:
array = [1, 1, 2, 1, 3, 1, 2, 2, 2, 3, 3, 1, 1, 3]
The diff() function returns the difference between each value and the next value, so
diff(array) = [0, 1, -2, 2, -2, 1, 0, 0, 1, 0, -2, 0, 2]
Every time an A is followed by a C, the difference is 2. Every time a B is followed by a C, the difference is 1. So, I was suggesting that you count the number of times A is followed by C by counting the number of times that the value 2 appears in diff(array) with a statement like c = sum(diff(array) == 2). Unfortunately, I see now that this does not work correctly for the number of times B is followed by C, because this results in a value of 1 in diff(array), and a value of 1 is also produced when an A is followed by a B.
Since you have said that A, B and C are characters, I assume that you mean that:
array = 'A A B A C A B B B C C A A C';
In this case, maybe a better solution would be:
y = strfind(array, 'A C');
N_AC = length(y);
y = strfind(array, 'B C');
N_BC = length(y);
  1 个评论
Myriam Moss
Myriam Moss 2021-4-25
Thank you William! Now I have the number of times C appears after A and B.
If I define
y = strfind(array, 'C');
N_C = length(y);
If I want p(C|A), for example, I should do:
p_CA = N_AC/N_C , do you agree? :)

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Dimensionality Reduction and Feature Extraction 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by