How to write a large .dat file in a loop?

Hello,
I am trying to convert a large amount of data from one format (64 separate Neuralynx .ncs files) into another (a single .dat file with all 64 channels combined, in int16 format) for use with the Kilosort package. Kilosort then uses this huge file in some clever way. The .dat file should be 2D: 64 channels x number of samples per channel (about 3 hours at 20 kHz).
Because my data is so large, I am unable to load it all into one matrix and then write it to a file (the total file size for all 64 inputs is at least 30 GB).
So, I have been trying to read each of my input files separately, convert the data type and then write it to a file. In the code below I managed to make it work for a huge .mat file using the matfile function, but I need .dat.
I then tried to do a similar thing using fwrite, where I tried to append each new .ncs readout. All the data gets written to file, but in an enormous 1-D list of numbers, and not in a 64-row array.
I tried to transpose my input in the hope that I would obtain separated rows that way, but that makes no difference.
How can I control how fwrite appends data to my file? Is it at all possible to make a 2D output like this?
Thanks,
Susan
% make .mat file
m = matfile([OutFolder,'\',RatID,'_',RecDate,'.mat'],'Writable',true);
% read .ncs files for all traces and write to .mat and .dat file
for ch = 1:64
    % load data
    InFile = [InFolder,'\','CSC',num2str(ch),'.ncs'];
    [~,~,samples] = readEegDataForKilosort(InFile); % gives 1D array of type double
    int_samples = int16(samples); clear samples;
    % write to .mat file (not actually useful)
    m.WholeRec(1:length(int_samples),ch) = int_samples;
    % write to .dat file
    if ch == 1 % make file for 1st channel
        fileID = fopen([OutFolder,'\',RatID,'_',RecDate,'.dat'],'w');
        fwrite(fileID,int_samples','int16'); % precision matches the int16 data
        fclose(fileID);
    else % append for all next channels
        fileID = fopen([OutFolder,'\',RatID,'_',RecDate,'.dat'],'a');
        fwrite(fileID,int_samples','int16'); % precision matches the int16 data
        fclose(fileID);
    end
    clear int_samples
end
4 comments
dpb 2021-1-30
Edited: dpb 2021-1-30
I'm trying to figure out for sure what the input file you need actually looks like, specifically.
I didn't find a description of that file format at the link; it's probably there, but isn't clear where that is.
Let's talk about something small in size instead...if you had four channels and 3 observations, there would be twelve values. Are these to be arranged as a sequence of three (3) 4-vectors, sequentially in time as
Ch1O1 Ch2O1 Ch3O1 Ch4O1
Ch1O2 Ch2O2 Ch3O2 Ch4O2
Ch1O3 Ch2O3 Ch3O3 Ch4O3
? or as
Ch1O1 Ch1O2 Ch1O3
Ch2O1 Ch2O2 Ch2O3
Ch3O1 Ch3O2 Ch3O3
Ch4O1 Ch4O2 Ch4O3
?
In both cases I've introduced phantom records that would not be in a stream file simply to aid in readability.
The first writes each timestep for all channels, the second writes all timesteps (observations "O") for each channel sequentially.
Or, does the input processor have, by any chance, the ability to tell it which order the data are in?
The second above is what you have written; a stream file will be just a sequence of bytes; to write in the order by timestep/observation you will have to have those data in memory for all channels for each timestep as it is written.
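To make the difference between the two layouts concrete, here is a small sketch using the four-channel, three-observation example. It relies on fwrite writing a matrix column by column (MATLAB stores arrays in column-major order), so a channels-by-observations matrix goes out timestep by timestep, while its transpose goes out channel by channel. The scratch file name and the toy values are made up for illustration.

```matlab
% Toy data: 4 channels (rows) x 3 observations (columns).
% Each value encodes its position: 10*channel + observation.
nCh = 4; nObs = 3;
[obsIdx, chIdx] = meshgrid(1:nObs, 1:nCh);
data = int16(10*chIdx + obsIdx);          % data(c,o) = 10*c + o

tmpFile = [tempname, '.dat'];             % hypothetical scratch file

% Layout 1: fwrite walks down the columns, so all channels of
% observation 1 are written first, then observation 2, and so on.
fid = fopen(tmpFile, 'w');
fwrite(fid, data, 'int16');
fclose(fid);
fid = fopen(tmpFile, 'r');
byTimestep = fread(fid, Inf, '*int16').'; % flat stream as a row vector
fclose(fid);

% Layout 2: writing the transpose puts all observations of channel 1
% first, then all observations of channel 2, and so on.
fid = fopen(tmpFile, 'w');
fwrite(fid, data.', 'int16');
fclose(fid);
fid = fopen(tmpFile, 'r');
byChannel = fread(fid, Inf, '*int16').';
fclose(fid);
delete(tmpFile);

disp(byTimestep)   % 11 21 31 41 12 22 32 42 13 23 33 43
disp(byChannel)    % 11 12 13 21 22 23 31 32 33 41 42 43
```

In both cases the file itself is just twelve int16 values in a row; only the order differs.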
Susan Leemburg 2021-2-1
I've checked with the people from Kilosort, and they told me that I need the data to be intermingled: first sample 1 for all channels, then sample 2... like in the first example.
I also enter my sampling rate and the number of channels in Kilosort, and I'm pretty sure that the individual traces are reconstructed based on those.
I should be able to get the correct kind of output by reading a portion of each channel, build that into a matrix (one channel per column), write the matrix to my .dat file and then repeat and append until I've written all my data, right?
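The plan described above could be sketched roughly as follows. This is only an illustration under assumptions: it uses a samples-by-channels WholeRec variable in a .mat file (as in the question) as the source of the blocks, although any reader that can return part of each channel would do. The block size, the file names and the tiny synthetic data are made up so that the snippet runs on its own.

```matlab
% Synthetic stand-in for the real recording (assumed tiny sizes;
% the real case would be 64 channels and much larger blocks).
nCh = 4; nSamp = 10; blockSize = 3;
matName = [tempname, '.mat'];             % hypothetical file names
datName = [tempname, '.dat'];

% Build a samples-by-channels variable in a MAT-file, one channel
% at a time, as in the question.
m = matfile(matName, 'Writable', true);
for ch = 1:nCh
    m.WholeRec(1:nSamp, ch) = int16(100*ch + (1:nSamp)');
end

% Copy the data block by block into the .dat file.
fid = fopen(datName, 'w');
for first = 1:blockSize:nSamp
    last  = min(first + blockSize - 1, nSamp);
    block = m.WholeRec(first:last, :);    % (samples in block) x nCh
    % Transpose to nCh x samples: fwrite writes column by column, so
    % this emits all channels of one sample, then the next sample.
    fwrite(fid, block.', 'int16');
end
fclose(fid);

% Check: read the stream back as a matrix with nCh rows and compare.
fid = fopen(datName, 'r');
check = fread(fid, [nCh, Inf], '*int16');
fclose(fid);
whole = m.WholeRec;                       % nSamp x nCh (small here)
ok = isequal(check, whole.');             % true if the layout round-trips
disp(ok)

clear m
delete(matName); delete(datName);
```

Because the file is opened once and each fwrite continues where the previous one stopped, no separate append mode is needed inside the loop.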
I also think that a lot of my confusion comes from initially misunderstanding how these particular files work. I thought that, just like for e.g. .mat files and text files, the structure I write into the file with fwrite would come out in the same shape when I read that file back. But that is clearly not the case, at least not without some extra instructions.
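As a small illustration of those extra instructions: a file written with fwrite holds only the raw values, so whoever reads it has to supply the data type and the shape. The sketch below assumes int16 data interleaved by sample and a channel count that is known in advance; the file name and values are made up.

```matlab
nCh = 4; nSamp = 5;                        % assumed known by the reader
data = int16(reshape(1:nCh*nSamp, nCh, nSamp));   % nCh x nSamp toy data

datName = [tempname, '.dat'];              % hypothetical scratch file
fid = fopen(datName, 'w');
fwrite(fid, data, 'int16');                % the shape is not stored
fclose(fid);

% The number of samples can be recovered from the file size.
info = dir(datName);
bytesPerValue = 2;                         % int16
nSampFromFile = info.bytes / (nCh * bytesPerValue);

fid = fopen(datName, 'r');
flat = fread(fid, Inf, '*int16');          % no shape given: one long column
frewind(fid);
shaped = fread(fid, [nCh, Inf], '*int16'); % shape given: nCh x nSamp matrix

% Jump straight to channel c, sample t using the interleaved layout.
c = 3; t = 4;
fseek(fid, ((t-1)*nCh + (c-1)) * bytesPerValue, 'bof');
oneValue = fread(fid, 1, '*int16');
fclose(fid);
delete(datName);

disp(size(flat))      % 20 1
disp(size(shaped))    % 4 5
```

This is presumably also why a reader of such a file needs to be told the number of channels: without it, the stream cannot be cut back into rows.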


Answers (0)

Release: R2019a
