extracting subsequences of binary string

as would be the code for the following string have the next subsequences ?
STRING
1(1), 0(2), 1(3), 1(4), 0(5), 0(6), 1(7), 0(8), 0(9), 1(10), 1(11), 1(12), 1(13), 0(14), 0(15), 0(16), 1(17), 1(18), 1(19), 0(20)
SUBSEQUENCES
01: 1(01), 0(02), 1(03), 1(04) -> [1,0,1,1],
02: 1(01), 1(03), 0(05), 1(07) -> [1,1,0,1],
03: 1(01), 1(04), 1(07), 1(10) -> [1,1,1,1],
04: 1(01), 0(05), 0(09), 1(13) -> [1,0,0,1],
05: 1(01), 0(06), 1(11), 0(16) -> [1,0,1,0],
06: 1(01), 1(07), 1(13), 1(19) -> [1,1,1,1],
07: 0(02), 1(03), 1(04), 0(05) -> [0,1,1,0],
08: 0(02), 1(04), 0(06), 0(08) -> [0,1,0,0],
09: 0(02), 0(05), 0(08), 1(11) -> [0,0,0,1],
10: 0(02), 0(06), 1(10), 0(14) -> [0,0,1,0],
11: 0(02), 1(07), 1(12), 1(17) -> [0,1,1,1],
12: 0(02), 0(08), 0(14), 0(20) -> [0,0,0,0],
13: 1(03), 1(04), 0(05), 0(06) -> [1,1,0,0],
14: 1(03), 0(05), 1(07), 0(09) -> [1,0,1,0],
15: 1(03), 0(06), 0(09), 1(12) -> [1,0,0,1],
16: 1(03), 1(07), 1(11), 0(15) -> [1,1,1,0],
17: 1(03), 0(08), 1(13), 1(18) -> [1,0,1,1],
18: 1(04), 0(05), 0(06), 1(07) -> [1,0,0,1],
19: 1(04), 0(06), 0(08), 1(10) -> [1,0,0,1],
20: 1(04), 1(07), 1(10), 1(13) -> [1,1,1,1],
21: 1(04), 0(08), 1(12), 0(16) -> [1,0,1,0],
22: 1(04), 0(09), 0(14), 1(19) -> [1,0,0,1],
23: 0(05), 0(06), 1(07), 0(08) -> [0,0,1,0],
24: 0(05), 1(07), 0(09), 1(11) -> [0,1,0,1],
25: 0(05), 0(08), 1(11), 0(14) -> [0,0,1,0],
26: 0(05), 0(09), 1(13), 1(17) -> [0,0,1,1],
27: 0(05), 1(10), 0(15), 0(20) -> [0,1,0,0],
28: 0(06), 1(07), 0(08), 0(09) -> [0,1,0,0],
29: 0(06), 0(08), 1(10), 1(12) -> [0,0,1,1],
30: 0(06), 0(09), 1(12), 0(15) -> [0,0,1,0],
31: 0(06), 1(10), 0(14), 1(18) -> [0,1,0,1],
32: 1(07), 0(08), 0(09), 1(10) -> [1,0,0,1],
33: 1(07), 0(09), 1(11), 1(13) -> [1,0,1,1],
34: 1(07), 1(10), 1(13), 0(16) -> [1,1,1,0],
35: 1(07), 1(11), 0(15), 1(19) -> [1,1,0,1],
36: 0(08), 0(09), 1(10), 1(11) -> [0,0,1,1],
37: 0(08), 1(10), 1(12), 0(14) -> [0,1,1,0],
38: 0(08), 1(11), 0(14), 1(17) -> [0,1,0,1],
39: 0(08), 1(12), 0(16), 0(20) -> [0,1,0,0],
40: 0(09), 1(10), 1(11), 1(12) -> [0,1,1,1],
41: 0(09), 1(11), 1(13), 0(15) -> [0,1,1,0],
42: 0(09), 1(12), 0(15), 1(18) -> [0,1,0,1],
43: 1(10), 1(11), 1(12), 1(13) -> [1,1,1,1],
44: 1(10), 1(12), 0(14), 0(16) -> [1,1,0,0],
45: 1(10), 1(13), 0(16), 1(19) -> [1,1,0,1],
46: 1(11), 1(12), 1(13), 0(14) -> [1,1,1,0],
47: 1(11), 1(13), 0(15), 1(17) -> [1,1,0,1],
48: 1(11), 0(14), 1(17), 0(20) -> [1,0,1,0],
49: 1(12), 1(13), 0(14), 0(15) -> [1,1,0,0],
50: 1(12), 0(14), 0(16), 1(18) -> [1,0,0,1],
51: 1(13), 0(14), 0(15), 0(16) -> [1,0,0,0],
52: 1(13), 0(15), 1(17), 1(19) -> [1,0,1,1],
53: 0(14), 0(15), 0(16), 1(17) -> [0,0,0,1],
54: 0(14), 0(16), 1(18), 0(20) -> [0,0,1,0],
55: 0(15), 0(16), 1(17), 1(18) -> [0,0,1,1],
56: 0(16), 1(17), 1(18), 1(19) -> [0,1,1,1],
57: 1(17), 1(18), 1(19), 0(20) -> [1,1,1,0],

 采纳的回答

N = 20;
n = 4;
A = hankel(1:N-n+1,N-n+1:N);
k = 0:n-1;
idx = [];
for ii = 1:size(A,1)
p = A(ii,:);
while p(end,end) + k(end) <= N
p = [p;p(end,:)+k];
end
idx=[idx;p];
end
or
N = 20;
n = 4;
A = hankel(1:N-n+1,N-n+1:N);
k = 0:n-1;
c = ceil((N - A(:,end) + 1)/k(end));
i2 = cumsum(c);
i1 = i2 - c + 1;
idx = zeros(i2(end),n);
for jj = 1:N-n+1
idx(i1(jj):i2(jj),:) = bsxfun(@plus,A(jj,:),(0:c(jj)-1)'*k);
end
ADD
s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0];
[j1,j2,j2] = unique(s(idx),'rows')
out = [j1, histc(j2,1:max(j2))/i2(end)]; % This row corrected

8 个评论

thank you very much, that command should now be used to calculate the number of times to repeat each subsequence? is to calculate the probability by dividing the number of occurrences of that subsequence by the total number of subsequences. But I'm not sure which command used to count the number of occurrences of each subsequence
see ADD part in my answer
I get the following error:
Error using horzcat CAT arguments dimensions are not consistent.
Do not find me subsequences. j2 be the number of occurrences of each subsequence?
N = 20;
n = 4;
A = hankel(1:N-n+1,N-n+1:N);
k = 0:n-1;
idx = [];
for ii = 1:size(A,1) p = A(ii,:); while p(end,end) + k(end) <= N
p = [p;p(end,:)+k];
end
idx=[idx;p];
end
s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0];
[j1,j2,j2] = unique(s(idx),'rows')
out = [j1, histc(j2,1:max(i2))];
This makes me:
Undefined function or variable 'i2'.
and interpret the outputs? are the number of occurrences of each subsequence?
many thanks
sorry, I have not understood the code. This it does is calculate the number of times to repeat each subsequence?. It calculates the sub but if calculated occurrences each subsequence?. it?
Again correct last row in my code.

请先登录,再进行评论。

更多回答(2 个)

n = 20;
d = 4;
c = zeros(sum([1,floor((d:n-1)/(d-1))]),d); % Allocate space for c
j = 0;
for k = 1:n-d+1
r = 1;
while k+r*(d-1) <= n
j = j+1;
c(j,:) = k:r:k+r*(d-1);
r = r+1;
end
end
The c array will be a 57 x 4 matrix of subsequence indices taken from 1:20.
c =
1 2 3 4
1 3 5 7
1 4 7 10
.....
17 18 19 20
If you replace the line "c(j,:) = k:r:k+r*(d-1);" by
c(j,:) = s(k:r:k+r*(d-1));
where s is your string, this will generate the subsequence of binary strings you are (apparently) asking for.

3 个评论

I have modified the above code so as to allocate the proper size for the c array.
thank you very much, that command should now be used to calculate the number of times to repeat each subsequence? is to calculate the probability by dividing the number of occurrences of that subsequence by the total number of subsequences. But I'm not sure which command used to count the number of occurrences of each subsequence
One question, as I can do with structure for you automatically calculate subsequences of length 4-20? ie, d = 4:20 but applying for so I said why not have the same dimension:
if true
% code
for d=4:20
c(d)=zeros(sum([1,floor((d:n-1)/(d-1))]),d);
j=0;
for k=1:n-d+1
r=1;
while k+r*(d-1)<=n
j=j+1;
c(j,:)=s(k:r:k+r*(d-1));% s es la cadena binaria / me da las subsecuencias
r=r+1;
end
end
end
end

请先登录,再进行评论。

Here is a slightly shorter version:
n = 20;
d = 4;
f2 = cumsum([0,floor((n-1:-1:d-1)/(d-1))]);
f1 = f2(1:end-1)+1;
f2 = f2(2:end);
c = repmat(0:d-1,f2(end),1);
for k = 1:length(f1)
c(f1(k),:) = c(f1(k),:) + k;
c(f1(k):f2(k),:) = cumsum(c(f1(k):f2(k),:),1);
end

类别

帮助中心File Exchange 中查找有关 Creating and Concatenating Matrices 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by