problem with binary code

Question

FRANCISCO 2013-8-26


https://ww2.mathworks.cn/matlabcentral/answers/85666-problem-with-binary-code

I have a question that, for the moment, has nothing to do with programming in MATLAB but rather with statistics, and I wonder if anyone can help.

My goal is to predict the next digit of a binary string. I have the following sequence of binary digits (the position of each digit is given in parentheses):

s= 1(1) 0(2) 1(3) 1(4) 0(5) 0(6) 1(7) 0(8) 0(9) 1(10) 1(11) 1(12) 1(13) 0(14) 0(15) 0(16) 1(17) 1(18) 1(19) 0(20).

I then built substrings of length 4 by taking equally spaced positions, as follows:

1) 1 (1) 0 (2) 1 (3) 1 (4) --- [1 0 1 1]

2) 1 (1) 1 (3) 0 (5) 1 (7) --- [1 1 0 1]

3) 1 (1) 1 (4) 1 (7) 1 (10) --- [1 1 1 1]

4) 1 (1) 0 (5) 0 (9) 1 (13) --- [1 0 0 1]

5) 1 (1) 0 (6) 1 (11) 0 (16) --- [1 0 1 0]

6) 1 (1) 1 (7) 1 (13) 1 (19) --- [1 1 1 1]

7) 0 (2) 1 (3) 1 (4) 0 (5) --- [0 1 1 0]

8) 0 (2) 1 (4) 0 (6) 0 (8) --- [0 1 0 0]

9) 0 (2) 0 (5) 0 (8) 1 (11) --- [0 0 0 1]

10) 0 (2) 0 (6) 1 (10) 0 (14) --- [0 0 1 0]

11) 0 (2) 1 (7) 1 (12) 1 (17) --- [0 1 1 1]

12) 0 (2) 0 (8) 0 (14) 0 (20) --- [0 0 0 0]

13) 1 (3) 1 (4) 0 (5) 0 (6) --- [1 1 0 0]

14) 1 (3) 0 (5) 1 (7) 0 (9) --- [1 0 1 0]

15) 1 (3) 0 (6) 0 (9) 1 (12) --- [1 0 0 1]

16) 1 (3) 1 (7) 1 (11) 0 (15) --- [1 1 1 0]

17) 1 (3) 0 (8) 1 (13) 1 (18) --- [1 0 1 1]

18) 1 (4) 0 (5) 0 (6) 1 (7) --- [1 0 0 1]

19) 1 (4) 0 (6) 0 (8) 1 (10) --- [1 0 0 1]

20) 1 (4) 1 (7) 1 (10) 1 (13) --- [1 1 1 1]

21) 1 (4) 0 (8) 1 (12) 0 (16) --- [1 0 1 0]

22) 1 (4) 0 (9) 0 (14) 1 (19) --- [1 0 0 1]

23) 0 (5) 0 (6) 1 (7) 0 (8) --- [0 0 1 0]

24) 0 (5) 1 (7) 0 (9) 1 (11) --- [0 1 0 1]

25) 0 (5) 0 (8) 1 (11) 0 (14) --- [0 0 1 0]

26) 0 (5) 0 (9) 1 (13) 1 (17) --- [0 0 1 1]

27) 0 (5) 1 (10) 0 (15) 0 (20) --- [0 1 0 0]

28) 0 (6) 1 (7) 0 (8) 0 (9) --- [0 1 0 0]

29) 0 (6) 0 (8) 1 (10) 1 (12) --- [0 0 1 1]

30) 0 (6) 0 (9) 1 (12) 0 (15) --- [0 0 1 0]

31) 0 (6) 1 (10) 0 (14) 1 (18) --- [0 1 0 1]

32) 1 (7) 0 (8) 0 (9) 1 (10) --- [1 0 0 1]

33) 1 (7) 0 (9) 1 (11) 1 (13) --- [1 0 1 1]

34) 1 (7) 1 (10) 1 (13) 0 (16) --- [1 1 1 0]

35) 1 (7) 1 (11) 0 (15) 1 (19) --- [1 1 0 1]

36) 0 (8) 0 (9) 1 (10) 1 (11) --- [0 0 1 1]

37) 0 (8) 1 (10) 1 (12) 0 (14) --- [0 1 1 0]

38) 0 (8) 1 (11) 0 (14) 1 (17) --- [0 1 0 1]

39) 0 (8) 1 (12) 0 (16) 0 (20) --- [0 1 0 0]

40) 0 (9) 1 (10) 1 (11) 1 (12) --- [0 1 1 1]

41) 0 (9) 1 (11) 1 (13) 0 (15) --- [0 1 1 0]

42) 0 (9) 1 (12) 0 (15) 1 (18) --- [0 1 0 1]

43) 1 (10) 1 (11) 1 (12) 1 (13) --- [1 1 1 1]

44) 1 (10) 1 (12) 0 (14) 0 (16) --- [1 1 0 0]

45) 1 (10) 1 (13) 0 (16) 1 (19) --- [1 1 0 1]

46) 1 (11) 1 (12) 1 (13) 0 (14) --- [1 1 1 0]

47) 1 (11) 1 (13) 0 (15) 1 (17) --- [1 1 0 1]

48) 1 (11) 0 (14) 1 (17) 0 (20) --- [1 0 1 0]

49) 1 (12) 1 (13) 0 (14) 0 (15) --- [1 1 0 0]

50) 1 (12) 0 (14) 0 (16) 1 (18) --- [1 0 0 1]

51) 1 (13) 0 (14) 0 (15) 0 (16) --- [1 0 0 0]

52) 1 (13) 0 (15) 1 (17) 1 (19) --- [1 0 1 1]

53) 0 (14) 0 (15) 0 (16) 1 (17) --- [0 0 0 1]

54) 0 (14) 0 (16) 1 (18) 0 (20) --- [0 0 1 0]

55) 0 (15) 0 (16) 1 (17) 1 (18) --- [0 0 1 1]

56) 0 (16) 1 (17) 1 (18) 1 (19) --- [0 1 1 1]

57) 1 (17) 1 (18) 1 (19) 0 (20) --- [1 1 1 0]
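The list above appears to follow a simple rule: for every start position, take four digits separated by a constant step, for every step that still fits within the 20 digits. A minimal Python sketch of that enumeration is given below; the function name `stepped_substrings` is hypothetical and the rule itself is inferred from the list, not stated in the post.

```python
# The 20 observed bits from the question (position 1 is s[0]).
s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]

def stepped_substrings(bits, length=4):
    """Return (positions, values) for every equally spaced substring.

    Positions are 1-based to match the numbering used in the question.
    """
    n = len(bits)
    result = []
    for start in range(1, n + 1):
        step = 1
        # Keep increasing the step while the last position still fits.
        while start + (length - 1) * step <= n:
            positions = [start + k * step for k in range(length)]
            values = [bits[p - 1] for p in positions]
            result.append((positions, values))
            step += 1
    return result

subs = stepped_substrings(s)
print(len(subs))   # 57
print(subs[0])     # ([1, 2, 3, 4], [1, 0, 1, 1])
print(subs[5])     # ([1, 7, 13, 19], [1, 1, 1, 1])
```

The output order matches the numbering of the list (start position first, then increasing step).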

I've also calculated the relative frequency of each of these substrings:

0 0 0 0 ------ 0.0175438596491228

0 0 0 1 ------ 0.0350877192982456

0 0 1 0 ------ 0.0877192982456140

0 0 1 1 ------ 0.0701754385964912

0 1 0 0 ------ 0.0701754385964912

0 1 0 1 ------ 0.0701754385964912

0 1 1 0 ------ 0.0526315789473684

0 1 1 1 ------ 0.0526315789473684

1 0 0 0 ------ 0.0175438596491228

1 0 0 1 ------ 0.122807017543860

1 0 1 0 ------ 0.0701754385964912

1 0 1 1 ------ 0.0701754385964912

1 1 0 0 ------ 0.0526315789473684

1 1 0 1 ------ 0.0701754385964912

1 1 1 0 ------ 0.0701754385964912

1 1 1 1 ------ 0.0701754385964912
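These relative frequencies are each pattern's count divided by the 57 substrings. The following Python sketch reproduces them under the assumption that the substrings are all equally spaced 4-digit selections from the sequence; the variable names are illustrative only.

```python
from collections import Counter

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
n = len(s)
length = 4

patterns = []
for start in range(n):  # 0-based start index
    step = 1
    # Accept the step while the last selected index is still inside the sequence.
    while start + (length - 1) * step < n:
        patterns.append(tuple(s[start + k * step] for k in range(length)))
        step += 1

counts = Counter(patterns)
total = len(patterns)

# Print pattern, count, and relative frequency in binary order.
for pattern in sorted(counts):
    print(pattern, counts[pattern], counts[pattern] / total)
```

For example, 1 0 0 1 occurs 7 times out of 57, which gives the largest value in the table.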

Now suppose I want to know whether digit 21 of the sequence will be "0" or "1". To do this, I extend the sequence with an unknown X:

s= 1(1) 0(2) 1(3) 1(4) 0(5) 0(6) 1(7) 0(8) 0(9) 1(10) 1(11) 1(12) 1(13) 0(14) 0(15) 0(16) 1(17) 1(18) 1(19) 0(20) X(21)

and then build the substrings that involve X:

1: 1 (18), 1 (19), 0 (20), X (21) --- [1,1,0, X]

2: 0 (15), 1 (17), 1 (19), X (21) --- [0,1,1, X]

3: 1 (12), 0 (15), 1 (18), X (21) --- [1,0,1, X]

4: 0 (9), 1 (13), 1 (17), X (21) --- [0,1,1, X]

5: 0 (6), 1 (11), 0 (16), X (21) --- [0,1,0, X]

6: 1 (3), 0 (9), 0 (15), X (21) --- [1,0,0, X]

Substituting each possible value for X, I get:

X = 1,

1: 1 (18), 1 (19), 0 (20), X (21) --- [1,1,0, 1 ]

2: 0 (15), 1 (17), 1 (19), X (21) --- [0,1,1, 1 ]

3: 1 (12), 0 (15), 1 (18), X (21) --- [1,0,1, 1 ]

4: 0 (9), 1 (13), 1 (17), X (21) --- [0,1,1, 1 ]

5: 0 (6), 1 (11), 0 (16), X (21) --- [0,1,0, 1 ]

6: 1 (3), 0 (9), 0 (15), X (21) --- [1,0,0, 1 ]

X = 0,

1: 1 (18), 1 (19), 0 (20), X (21) --- [1,1,0, 0 ]

2: 0 (15), 1 (17), 1 (19), X (21) --- [0,1,1, 0 ]

3: 1 (12), 0 (15), 1 (18), X (21) --- [1,0,1, 0 ]

4: 0 (9), 1 (13), 1 (17), X (21) --- [0,1,1, 0 ]

5: 0 (6), 1 (11), 0 (16), X (21) --- [0,1,0, 0 ]

6: 1 (3), 0 (9), 0 (15), X (21) --- [1,0,0, 0 ]
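As a rough illustration of the bookkeeping, the Python sketch below builds the six three-digit prefixes that end just before position 21 and looks up how often each completion (X = 0 or X = 1) occurred among the 57 substrings. It only tabulates counts; whether those counts can be read as probabilities for the next digit is a separate question, and the code makes no claim about that.

```python
from collections import Counter

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
n = len(s)
length = 4

# Counts of all equally spaced 4-digit substrings, as in the question.
counts = Counter()
for start in range(n):
    step = 1
    while start + (length - 1) * step < n:
        counts[tuple(s[start + k * step] for k in range(length))] += 1
        step += 1

# Prefixes of the substrings whose last element would be the unknown digit X.
next_pos = n + 1  # 1-based position of X
prefixes = []
step = 1
while next_pos - (length - 1) * step >= 1:
    positions = [next_pos - k * step for k in range(length - 1, 0, -1)]
    prefixes.append(tuple(s[p - 1] for p in positions))
    step += 1

for prefix in prefixes:
    c0 = counts[prefix + (0,)]
    c1 = counts[prefix + (1,)]
    print(prefix, "X=0 seen", c0, "times; X=1 seen", c1, "times")
```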

From here, could someone tell me how to work out the probabilities so that I can predict the next digit in the string?

FRANCISCO 2013-8-30

Sorry, English is not my first language, so I may misunderstand or not express myself clearly. Imagine that we flip a coin 20 times (the binary sequence), recording "heads" as "1" and "tails" as "0". What I want to know is the probability of getting heads or tails on the next toss (toss no. 21). To do this, I formed subsequences of length 4 and looked for the most frequently repeated patterns. I then took the subsequences that contain toss 21, in order to increase my probability of success. What I do not know is how to find the probabilities of those subsequences. For example, let's look at the first pattern containing X:

1: 1 (18) 1 (19) 0 (20), X (21) --- [1,1,0, X]

Substituting X = 1 and X = 0 gives two possible combinations:

[1,1,0, 1]

[1,1,0, 0]

The table of relative frequencies above shows that:

1 1 0 1 ------ 0.0701754385964912

1 1 0 0 ------ 0.0526315789473684

So across all the combinations, 1101 has occurred more often than 1100. What I want to find is the probability that 1101 will occur once again, versus the probability that 1100 will.

dpb 2013-8-30

Edited: dpb 2013-8-30


If it's a fair coin, then the P(H) on the (N+1)th trial is still 0.5 whatever the preceding sequence -- even if the preceding N were all T (or H).

If it's not fair, then estimate the actual bias of p (or q). As noted above, there's insufficient evidence on the result of the number of H above to conclude that p~=0.5 isn't as good a value as any.
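One way to make the "insufficient evidence" point concrete is an exact two-sided binomial test of p = 0.5 against the number of 1s in the 20 observations. This is a sketch of one standard test, not necessarily the check the commenter had in mind; `binom_pmf` is a helper defined here for illustration.

```python
from math import comb

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
n = len(s)
heads = sum(s)  # number of 1s in the sequence

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Two-sided exact p-value: total probability of all outcomes that are
# no more likely than the observed one under p = 0.5.
observed = binom_pmf(heads, n)
p_value = sum(
    binom_pmf(k, n)
    for k in range(n + 1)
    if binom_pmf(k, n) <= observed * (1 + 1e-9)  # small tolerance for ties
)

print(heads, n, round(p_value, 4))  # 11 20 0.8238
```

A p-value this large means that 11 ones in 20 tosses is entirely consistent with a fair coin.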

What other information is there to use? The point is that one random realization of a process is subject to randomness such that another realization from the same process could produce the obverse case from what you have above -- namely that

Pobs(1101) ~ 0.05 
Pobs(1100) ~ 0.07

instead or some other values entirely. I don't see any reason not to use the expected value for either sequence of 1/16 = 0.0625. Note that your values are scattered on either side of that, additional evidence that the underlying assumption of binomial w/ p=q=0.5 is as good a model as any.

ADDENDUM: That the observed values are roughly equal to the expected 1/16 fits w/ the other note just posted: even though the subsequences were selected other than in the order the samples arrived, the generating process looks, w/ this limited sample, as though it is pretty much a fair binomial with p=0.5.

dpb 2013-8-30

Another comment on the "probabilities" you've calculated from sequences. If the process is one of generating a sequence, then the only subsequences that represent the observed output are the contiguous runs n:m. Many of the subsequences you've listed above are not actual sample sequences, because you've arbitrarily selected samples with different steps between them; they would be valid only if the serial dependence you claimed earlier does not in fact exist.

All I can see as valid observations, if you have reason to look at four consecutive samples, are the 17 that you can construct from 1:4, 2:5, ..., 17:20. Everything else is a valid sequence only if there is no correlation at all from one sample to the next (which seems to contradict the earlier assertion that the underlying process is not purely random).
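For comparison, here is a short Python sketch that keeps only the contiguous four-sample windows and tallies them. It assumes nothing beyond the 20 observed digits; the variable names are illustrative.

```python
from collections import Counter

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
length = 4

# Sliding window over consecutive samples only (step of 1 between samples).
windows = [tuple(s[i:i + length]) for i in range(len(s) - length + 1)]
counts = Counter(windows)

print(len(windows))
for pattern, count in sorted(counts.items()):
    print(pattern, count)
```

With so few windows, several of the 16 possible patterns never appear at all, which illustrates how little the sample can say about individual pattern probabilities.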


Answer 1

David Sanchez 2013-8-30


https://ww2.mathworks.cn/matlabcentral/answers/85666-problem-with-binary-code#answer_95510


In your case, the relative frequency of the sequence

0 0 1 1 is 0.0701754385964912

If you take your whole sequence as a projection of what is going to happen in the future, then that same relative frequency is the likelihood (probability) of the pattern occurring again. The likelihood of 0000 is 0.0175438596491228, the likelihood of 0001 is 0.0350877192982456, and so on.


dpb 2013-8-30

But, unless the generator is one that is particularly flawed for some reason like the period is very short or it does have a peculiarity of serial correlation of period N or the like that you can exploit, there's not much to be said other than the particular realization gave you that particular case.

If you're just trying to study a given generator, as mentioned above look for the NIST battery of tests for randomness for ideas on how and what is tested for in general.

Perhaps if you outlined the end objective you're headed toward, rather than focusing on the mechanics, it would lead to a better response. As it is, I just don't see what good looking at it this way is going to do you unless you are trying to qualify the PRNG.
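As an idea of what such a battery contains, the simplest test in the NIST suite is the frequency (monobit) test, sketched below in Python for the 20 digits from the question. This is only an illustration: NIST recommends at least 100 bits for this test, so with 20 bits the result is indicative at best.

```python
from math import erfc, sqrt

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
n = len(s)

# Map 1 -> +1 and 0 -> -1, then sum; a fair source should give a sum near 0.
s_n = sum(2 * b - 1 for b in s)

# Test statistic and p-value of the monobit test.
s_obs = abs(s_n) / sqrt(n)
p_value = erfc(s_obs / sqrt(2))

print(s_n, round(p_value, 3))
# The suite's usual decision rule treats p_value < 0.01 as evidence of non-randomness.
```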

Walter Roberson 2013-8-31

Perhaps you should be checking the autocorrelation.
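A minimal Python sketch of that check, computing the sample autocorrelation of the 20 digits at lags 1 to 6 (the range of steps used in the question), is given below. The function `autocorr` and the choice of the biased estimator are assumptions made for illustration.

```python
s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
n = len(s)
mean = sum(s) / n
var = sum((x - mean) ** 2 for x in s)  # sum of squared deviations

def autocorr(lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    cov = sum((s[t] - mean) * (s[t + lag] - mean) for t in range(n - lag))
    return cov / var

for lag in range(1, 7):
    print(lag, round(autocorr(lag), 3))

# Rough 95% band for an uncorrelated sequence; values inside it give
# no evidence of serial correlation at that lag.
print("band: +/-", round(1.96 / n ** 0.5, 3))
```

With only 20 samples the band is wide, so only a very strong dependence would show up.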
