Remove specific data sequence when reading .bin file
How do I remove/skip the specific data sequence [1024 0 2240 -24500] when reading the .bin file in batch? thanks!
filename='rf.bin'; % filename='./rf.bin';
fid=fopen(filename,'r');
dataHeader=fread(fid,123,'uint8'); % skipping the header for .bin
NsperBatch = 1e3; % number of sample per batch
K=100; % Average every K set of values, K=100 in this case
magSpectrumMat=[];
while ~feof(fid)
    magSpectrum=0;
    for k=1:K
        data=fread(fid,NsperBatch*2,'int16','b');
        dataIQ=data(1:2:end)+1i*data(2:2:end);
        dataSpectrum=fftshift(fft(dataIQ));
        magSpectrum=magSpectrum+abs(dataSpectrum).^2;
    end
    magSpectrum = magSpectrum/K;
    magSpectrumMat = [magSpectrumMat magSpectrum];
end
magSpectrumMat_dB=pow2db(magSpectrumMat);
3 Comments
Walter Roberson
2017-10-12
Is there any possibility that it could occur inside the 123 byte header? Is there any possibility it could start inside the 123 byte header but end outside the header?
Accepted Answer
Walter Roberson
2017-10-12
Okay, here it is, with skips accounted for, and with automatic padding in case the data is the wrong size.
As you indicated the words to skip could occur "anywhere" after I asked about that, I assumed that the words to skip might even occur during that 123 byte header.
I assumed that if there were not a full K batch that you wanted to take the mean of what was available in the last partial batch rather than dividing by K specifically.
I vectorized a lot of the computation.
I was not completely sure of the order of data you wanted to output. I think your existing code is putting the results for averaging into column vectors in a matrix; that is the output format I create here.
The below is not tested as I do not happen to have your data file.
NsperBatch = 1e3;
header_size = 123;
K=100; % Average every K set of values, K=100 in this case
pattern_to_skip = int16([1024 0 2240 -24500]); %magic sequence of words to ignore
filename = 'rf.bin'; % filename='./rf.bin';
pattern_to_skip = typecast( swapbytes(pattern_to_skip), 'uint8'); %big endian
PL = length(pattern_to_skip);
fid = fopen(filename,'r');
bytes = reshape( fread(fid, inf, '*uint8'), 1, []); %row vector
fclose(fid);
orig_num_bytes = length(bytes);
skiplocs = strfind(bytes, pattern_to_skip);
for idx = fliplr(skiplocs) %work backwards so earlier match indices stay valid
    bytes(idx:idx+PL-1) = []; %delete bytes
end
postskip_num_bytes = length(bytes);
fprintf('%d groups were skipped\n', (orig_num_bytes - postskip_num_bytes) / PL );
dataHeader = bytes(1:header_size);
bytes = bytes(header_size+1:end);
data_length = length(bytes);
if mod(length(bytes), 2) ~= 0
    fprintf('warning: data is odd number of bytes long, padding\n');
    bytes(end+1) = 0;
end
if mod(length(bytes), 4) ~= 0 %re-measure: the length may have changed after the padding above
    fprintf('warning: data is odd number of words long, padding\n');
    bytes(end+1:end+2) = 0;
end
words = typecast(bytes, 'int16');
all_dataIQ = double( complex( words(1:2:end), words(2:2:end) ) );
num_dataIQ = length(all_dataIQ);
target_num_dataIQ = NsperBatch * ceil( num_dataIQ / NsperBatch);
if num_dataIQ ~= target_num_dataIQ
    fprintf('warning: complex data is not a multiple of %d samples long, padding\n', NsperBatch);
    all_dataIQ(target_num_dataIQ) = 0; %zero fill automatically
end
magSpectra = abs(fftshift( fft( reshape(all_dataIQ, NsperBatch, []) ), 1 )).^2; %do it all at once! shift along dimension 1 only, so the column order is preserved
num_spectra = size(magSpectra, 2);
num_full_batches = floor(num_spectra / K);
num_leftover = num_spectra - K * num_full_batches;
num_batches = num_full_batches + (num_leftover ~= 0);
magSpectrumMat = zeros(NsperBatch, num_batches);
for batch_idx = 1 : num_full_batches
    bstart = (batch_idx - 1) * K + 1;
    bend = bstart + K - 1;
    magSpectrum = mean( magSpectra(:, bstart : bend ), 2 );
    magSpectrumMat(:, batch_idx) = magSpectrum;
end
if num_leftover ~= 0
    magSpectrum = mean( magSpectra(:, end-num_leftover+1 : end), 2 );
    magSpectrumMat(:, end) = magSpectrum;
end
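The deletion step can be sanity-checked on a small synthetic byte stream before running it on the real file. The sketch below is only an illustration: the payload bytes are made up, and it passes char copies of the arrays to strfind so that the search uses text inputs. The pattern bytes are built the same way as above, which assumes a little-endian host.

```matlab
% Synthetic check of the byte-level pattern removal (made-up data, not rf.bin).
pattern_words = int16([1024 0 2240 -24500]);
% Bytes of the pattern as stored big-endian, assuming a little-endian host.
pattern_bytes = typecast(swapbytes(pattern_words), 'uint8');
PL = numel(pattern_bytes);

payload = uint8([1 2 3 4 5 6]);   % hypothetical "good" bytes
bytes = [payload(1:3), pattern_bytes, payload(4:6), pattern_bytes];

% Search on char copies; the uint8 array itself is what gets edited.
skiplocs = strfind(char(bytes), char(pattern_bytes));

% Delete from the last match to the first so earlier indices stay valid.
for idx = fliplr(skiplocs)
    bytes(idx:idx+PL-1) = [];
end

fprintf('%d groups were skipped\n', numel(skiplocs));
```

If the removal works as intended, bytes ends up identical to payload.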
8 Comments
Walter Roberson
2017-10-16
Do not modify the bytes = line. Your header is defined by an odd number of bytes, and if you swap at the time you read them in, you would move byte 123 to the position of byte 124 and would then be ignoring the wrong byte. So you have to scan as bytes and delete the garbage as bytes (unless you are sure the garbage never occurs in the headers), and once you have scrubbed the garbage you need to trim off the first 123 bytes of what is left.
Once you have trimmed off the header, there is a possibility that you need to byte swap: it depends on how the data was stored.
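To illustrate the ordering, here is a minimal sketch with a hypothetical 3-byte header (odd, like the 123-byte one) followed by two int16 values stored big-endian. The file contents and variable names are invented for the example. It trims the header as bytes first and only then converts to int16, swapping when the host is little-endian.

```matlab
% Hypothetical file: 3 header bytes, then 1024 and -24500 stored big-endian.
file_bytes = uint8([170 187 204, 4 0, 160 76]);
header_size = 3;

% Trim the header as bytes, then interpret the remainder as int16.
body = file_bytes(header_size+1:end);
words = typecast(body, 'int16');   % typecast uses the host byte order

% Swap only when the host is little-endian, since the data is big-endian.
[~, ~, endian] = computer;
if strcmp(endian, 'L')
    words = swapbytes(words);
end

% For contrast: pairing bytes from the start of the file, before the odd-sized
% header has been removed, groups the wrong bytes together.
misaligned = typecast(file_bytes(1:6), 'int16');
if strcmp(endian, 'L')
    misaligned = swapbytes(misaligned);
end
```

Here words comes out as int16([1024 -24500]), while misaligned contains neither value because every pair straddles the true word boundaries.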
When you described the data values to remove, I assumed you had read through the data stream and had found those particular numeric values after reading as int16, with the implication that the bytes were in the other order (because the native order on whatever host you are using is little-endian.) But it is possible that you were told the sequence of bytes by someone else who assumed you were using big endian, in which case the byteswap would not be needed... Do you have a sample file known to have the sequence of bytes in it that you could process with byteswap or not on the match for deletion, to check to see which is happening in practice?
If the data is written as big-endian then you would need to byteswap the int16, which you would do by changing
words = typecast(bytes, 'int16');
to
words = swapbytes( typecast(bytes, 'int16') );