Pre-Allocate structure with String / Datetime fields slows code down considerably
Hi,
I am trying to read through and sort two large .txt files, around 300 MB at the largest.
Originally, for each line I read, I would re-create the matrix like this:
strarray.full = [strarray.full ; new_info]
strarray.newdate = [strarray.newdate ; new_info ]
This slowed down considerably once the files reached around 20 MB. I've seen that preallocating matrices prevents MATLAB from having to re-create the growing matrix on each iteration. So now I have the following:
strarray.newdate = NaT(2000000,1);
strarray.full = strings(2000000,1);
where I have a counter variable ' j ' that is incremented each time something should be added to the matrix.
strarray.full(j,1) = new_info;
strarray.newdate(j,1) = new_info;
When I did this, the code slowed down considerably: it both started off slower and slowed down faster as time progressed. According to the profiler, nearly all the time is spent putting the data into the preallocated matrices.
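A small timing harness can help reproduce the comparison described above outside the full program. The following is only a sketch: the loop count `n` and the sample values `sampleText` and `sampleDate` are made up for illustration, and the measured times will depend on the MATLAB release and machine.

```matlab
% Hypothetical micro-benchmark; n and the sample values are assumptions.
n = 5000;
sampleText = "example line";            % stand-in for one parsed string
sampleDate = datetime(2020, 7, 16);     % stand-in for one parsed timestamp

% Approach A: grow the struct fields by concatenation on every iteration.
grown.full = strings(0, 1);
grown.newdate = NaT(0, 1);
tic
for j = 1:n
    grown.full = [grown.full; sampleText];
    grown.newdate = [grown.newdate; sampleDate];
end
tGrow = toc;

% Approach B: preallocate the struct fields, then assign by index.
pre.full = strings(n, 1);
pre.newdate = NaT(n, 1);
tic
for j = 1:n
    pre.full(j, 1) = sampleText;
    pre.newdate(j, 1) = sampleDate;
end
tPre = toc;

fprintf('grow: %.3f s, preallocate: %.3f s\n', tGrow, tPre);
```

Both approaches produce the same arrays, so any difference in the printed times comes from how the data is stored rather than from what is stored.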
I've got permission to attach the code, but I can't attach the .txt files directly, so I have stripped their format down here.
.txt Format 1:
Datetime2 ~ *string* ~ *string* ~ *string*
*string*
Datetime2 ~ *string* ~ *string* ~ *string*
*string*
*string*
*string*
*string*
Datetime2 ~ *string* ~ *string* ~ *string*
*string*
*string*
.txt Format 2:
datetime1 ~ *string* ~~~ *string* ~~~ *string* ~*~
datetime1 ~ *string* ~~~ *string* ~~~ *string* ~*~
datetime1 ~ *string* ~~~ *string* ~~~ *string* ~*~
Thanks.
Accepted Answer
Fangjun Jiang
2020-7-16
You are not using a struct array. You are putting newdate (a datetime array) and full (a string array) into a scalar struct, strarray (see the code difference below). In that case, I wonder whether using newdate=NaT(2e6,1) and full=strings(2e6,1) directly would be faster. After all, combining these two big arrays into one struct won't help at all.
You can try a struct array, following the pattern below, to see if it helps. I doubt it.
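% s1 is a scalar struct whose field holds a 20-by-1 datetime array.
s1.newdate=NaT(20,1);
s1.newdate(1)
s1.newdate(20)
% s2 is a 1-by-20 struct array; only s2(20).newdate is set here,
% so the newdate field of the other elements is empty ([]).
s2(20).newdate=NaT;
s2(1).newdate
s2(20).newdate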
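The first suggestion in this answer, keeping newdate and full as plain variables rather than struct fields, might look roughly like the sketch below. The capacity, the loop, and the sample values are made up for illustration; the struct is built once at the end, after the unused tail of each array is dropped. Note that a variable named full shadows MATLAB's built-in full function within that workspace.

```matlab
% Hypothetical sketch: capacity, the loop, and the sample values are assumptions.
capacity = 1000;                  % upper bound on the number of records
newdate = NaT(capacity, 1);       % plain datetime array, not a struct field
full = strings(capacity, 1);      % plain string array, not a struct field

j = 0;                            % number of records stored so far
for k = 1:250                     % stand-in for the file-reading loop
    j = j + 1;
    full(j) = "record " + k;
    newdate(j) = datetime(2020, 7, 16) + days(k);
end

% Drop the unused tail, then build the struct once.
strarray.full = full(1:j);
strarray.newdate = newdate(1:j);
```

Whether this is actually faster than indexing into the struct fields is something to measure with the profiler; the sketch only shows the shape of the change.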
5 Comments
Fangjun Jiang
2020-7-16
Not sure if it has anything to do with NaT(). Could you try preallocating it in either of these two ways?
newdate=zeros(N,1);
newdate=repmat(datetime,N,1);
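The two preallocations above create arrays of different classes. The sketch below illustrates the difference; it assumes that the zeros(N,1) variant is meant to hold timestamps as numbers, which the comment does not state. The value of N, the sample timestamp t, and the use of serial date numbers are all choices made for illustration.

```matlab
% Hypothetical sketch; N, the sample timestamp, and the datenum
% representation are assumptions, not part of the original comment.
N = 5;
t = datetime(2020, 7, 16, 12, 0, 0);

% Option 1: a double array. Here the datetime is converted to a serial
% date number before it is stored, and converted back afterwards.
asNumber = zeros(N, 1);
asNumber(1) = datenum(t);
restored = datetime(asNumber(1), 'ConvertFrom', 'datenum');

% Option 2: a datetime array filled with copies of the current time,
% which are then overwritten by index.
asDatetime = repmat(datetime, N, 1);
asDatetime(1) = t;

disp(class(asNumber));     % double
disp(class(asDatetime));   % datetime
```

With the second option, unused slots hold the time at which the array was created rather than NaT, so a counter such as j is still needed to know which entries are real.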
