How to write a large .dat file in a loop?
Hello,
I am trying to convert a large amount of data from one format (64 separate Neuralynx .ncs files) into another (a single .dat file with all 64 channels combined, in int16 format) for use with the Kilosort package. Kilosort then uses this huge file in some clever way. The .dat file should be 2D, with 64 channels x number of samples per channel (about 3 hours at 20 kHz).
Because my data is so large, I am unable to load it all into one matrix and then write it to a file (the total filesize for all 64 inputs is at least 30GB).
So, I have been trying to read each of my input files separately, convert the data type and then write it to a file. In the code below I managed to make it work for a huge .mat file using the matfile function, but I need .dat.
I then tried to do a similar thing using fwrite, where I tried to append each new .ncs readout. All the data gets written to file, but in an enormous 1-D list of numbers, and not in a 64-row array.
I tried to transpose my input in the hope that I would obtain separated rows that way, but that makes no difference.
How can I control how fwrite appends data to my file? Is it at all possible to make a 2D output like this?
Thanks,
Susan
% make .mat file
m = matfile([OutFolder,'\',RatID,'_',RecDate,'.mat'],'Writable',true);
% read .ncs files for all traces and write to .mat and .dat file
for ch = 1:64
    % load data
    InFile = [InFolder,'\','CSC',num2str(ch),'.ncs'];
    [~,~,samples] = readEegDataForKilosort(InFile); % gives 1D array of type double
    int_samples = int16(samples); clear samples;
    % write to .mat file (not actually useful)
    m.WholeRec(1:length(int_samples),ch) = int_samples;
    % write to .dat file
    if ch == 1 % make file for 1st channel
        fileID = fopen([OutFolder,'\',RatID,'_',RecDate,'.dat'],'w');
    else % append for all next channels
        fileID = fopen([OutFolder,'\',RatID,'_',RecDate,'.dat'],'a');
    end
    % precision matches the int16 data; 'uint16' cannot represent negative samples
    fwrite(fileID,int_samples','int16');
    fclose(fileID);
    clear int_samples
end
4 Comments
dpb
2021-1-30
Edited: dpb
2021-1-30
I'm trying to figure out for sure what the input file you need actually looks like, specifically.
I didn't find a description of that file format at the link; it's probably there, but isn't clear where that is.
Let's talk about something small in size instead. If you had four channels and three observations, there would be twelve values. Are these to be arranged as a sequence of three 4-vectors, sequentially in time, as
Ch1O1 Ch2O1 Ch3O1 Ch4O1
Ch1O2 Ch2O2 Ch3O2 Ch4O2
Ch1O3 Ch2O3 Ch3O3 Ch4O3
? or as
Ch1O1 Ch1O2 Ch1O3
Ch2O1 Ch2O2 Ch2O3
Ch3O1 Ch3O2 Ch3O3
Ch4O1 Ch4O2 Ch4O3
?
In both cases I've introduced phantom records that would not be in a stream file simply to aid in readability.
The first writes each timestep for all channels; the second writes all timesteps (observations "O") for each channel sequentially.
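To make the two orderings concrete, here is a minimal sketch of the same toy case (4 channels, 3 observations). The variable names and temporary files are made up for illustration. It relies on fwrite writing a matrix in column order, and it shows that the file itself stores no 2-D shape: the shape is imposed by whoever reads the stream back.

```matlab
% Toy data: 4 channels x 3 observations, value = 10*channel + observation
nCh = 4; nObs = 3;
[chIdx, obsIdx] = ndgrid(1:nCh, 1:nObs);
A = int16(10*chIdx + obsIdx);          % A(ch,obs), e.g. A(2,3) is 23

% Ordering 1: one fwrite of the channels-by-observations matrix.
% fwrite goes down the columns, so all channels of observation 1 come first.
f1 = [tempname '.dat'];
fid = fopen(f1, 'w');
fwrite(fid, A, 'int16');
fclose(fid);

% Ordering 2: append one whole channel at a time (as in the question)
f2 = [tempname '.dat'];
fid = fopen(f2, 'w');
for ch = 1:nCh
    fwrite(fid, A(ch,:), 'int16');
end
fclose(fid);

% Read both files back as flat streams
fid = fopen(f1, 'r'); s1 = fread(fid, Inf, 'int16=>int16').'; fclose(fid);
fid = fopen(f2, 'r'); s2 = fread(fid, Inf, 'int16=>int16').'; fclose(fid);
disp(s1)   % values: 11 21 31 41 12 22 32 42 13 23 33 43
disp(s2)   % values: 11 12 13 21 22 23 31 32 33 41 42 43

% The reader supplies the shape: ordering 1 comes back as channels x samples
fid = fopen(f1, 'r');
B = fread(fid, [nCh Inf], 'int16=>int16');
fclose(fid);
disp(isequal(A, B))

delete(f1); delete(f2);
```

If the consuming program reads the file with something like fread(fid,[nCh Inf],'int16') (an assumption about that program, not something stated here), then ordering 1 is the one it needs.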
Or, does the input processor have, by any chance, the ability to tell it which order the data are in?
The second above is what you have written. A stream file is just a sequence of bytes, so to write in order by timestep/observation you will need to have the data for all channels in memory for each timestep as it is written.
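One possible way to get all channels for a block of timesteps into memory without loading the whole recording is a two-pass conversion: keep the per-channel MAT-file from the question as an intermediate store, then read it back in blocks of time and write each block interleaved. The sketch below is only an illustration under stated assumptions: the sizes, chunkLen, and the synthetic samples are placeholders, and the real loop would call readEegDataForKilosort instead.

```matlab
% Toy stand-in for the real recording (assumed sizes, deliberately tiny)
nCh = 4; nSamp = 10; chunkLen = 3;
matName = [tempname '.mat'];
datName = [tempname '.dat'];

% Pass 1: store each channel as one column of a MAT-file variable,
% so only one channel is in memory at a time
m = matfile(matName, 'Writable', true);
for ch = 1:nCh
    samples = (1:nSamp) + 100*ch;      % placeholder for the .ncs readout
    m.WholeRec(1:nSamp, ch) = int16(samples(:));
end

% Pass 2: read a block of timesteps for all channels, write it interleaved
fid = fopen(datName, 'w');
for first = 1:chunkLen:nSamp
    last  = min(first + chunkLen - 1, nSamp);
    block = m.WholeRec(first:last, :); % (samples in block) x nCh
    fwrite(fid, block.', 'int16');     % transposed: nCh x samples, column order
end
fclose(fid);

% Check: the stream reads back as channels x samples
fid = fopen(datName, 'r');
D = fread(fid, [nCh Inf], 'int16=>int16');
fclose(fid);
disp(size(D))

clear m
delete(matName); delete(datName);
```

With real data, chunkLen would be chosen so that one block of chunkLen x 64 int16 values fits comfortably in memory. The second pass reads across all columns of the MAT-file variable, which may be slow, so this trades speed for a small memory footprint.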
Answers (0)