replace function in tall array. This indicates an internal error. Please contact MathWorks Technical Support.

I'm trying to replace a set of coded strings with meaningful strings in each column of a tall array. The reference page for the replace function says it fully supports tall arrays, but I ran into errors. See the example below:
b = ["a", "6", "1001", "0", "3"]';
b = [b; b];
code = ["a"; "1001"; "3"; "10"];
meaning = ["atest", "1001t", "3b", "102b"]';
a2 = replace(b, code, meaning); % this works fine
a = replace(tall(b), code, meaning); % this throws an error
a1 = replace(tall(b'), code, meaning); % this throws another error asking me to contact technical support
First error message:
Error using tall/replace (line 21)
Incompatible tall array arguments. The first dimension in each tall array must have the same size, or have a size of 1.
It seems to complain about tall(b) because the first dimension is not 1. So I explicitly transposed it with tall(b'), which threw the error below:
Error using tall/replace (line 21)
The operation generated an invalid chunk for output parameter 1. The chunk has size [1 10] where the expected size is [4 10]. This indicates an internal error. Please contact MathWorks Technical Support.
I'm using R2020a.
2 Comments
Sean de Wolski 2020-3-30
What's your end goal? Do you want to write this back to disk with the replacements? Do you want further downstream processing? For further downstream processing, the idea is that gather will never need the entire array in memory at once. For writing, look at tall.write.
Peng Li 2020-3-30
Thanks, Sean. Yes, my goal is to write the tall table to disk after replacing all 500k*60k entries with their specific meanings. Do you mean that if I gather here, it doesn't need the entire array to be in memory?
I did use write to write this tall table to disk, which leads back to my previous question, which I think you also kindly replied to.


Answers (2)

Jyotsna Talluri 2020-3-30
You have to use the gather function to evaluate the unevaluated tall array tall(b):
a2 = replace(b, code, meaning);
a = replace(gather(tall(b)), code, meaning);
Refer to the documentation link for more details

Sean de Wolski 2020-3-30
Edited: Sean de Wolski 2020-3-30
At the very least it's a doc bug because the doc says that tall arrays are fully supported for replace. It does appear to work with scalar values for old and new but the results are not the same because the replace happens sequentially rather than in one shot.
I'd contact tech support for that.
However, using categorical and renamecats, I'm able to get the same result:
b = ["a", "6", "1001", "0", "3"]';
b = [b; b];
code = ["a"; "1001"; "3"; "10"];
meaning = ["atest", "1001t", "3b", "102b"]';
a2 = replace(b, code, meaning); % this works fine
tb = tall(categorical(b, unique([code;b])));
b2 = renamecats(tb,code, meaning);
bg = gather(b2); % Don't call gather on the real data; it is used here only to check the small example.
assert(isequal(string(bg), a2))
write('test.csv', b2); % Change the pattern to what you want for writing
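
The difference between sequential and one-shot replacement mentioned at the top of this answer can be sketched on ordinary in-memory strings. This is only an illustration under stated assumptions, not what tall/replace does internally: the loop and the variable names oneShot and sequential are hypothetical and are not part of the original code.

```matlab
% Small in-memory data taken from the question (no tall arrays involved)
b = ["a"; "6"; "1001"; "0"; "3"];
code = ["a"; "1001"; "3"; "10"];
meaning = ["atest"; "1001t"; "3b"; "102b"];

% One-shot: all codes are passed to replace in a single call
oneShot = replace(b, code, meaning);

% Sequential (hypothetical loop): one scalar old/new pair per call,
% each call operating on the output of the previous one
sequential = b;
for k = 1:numel(code)
    sequential = replace(sequential, code(k), meaning(k));
end

% Show both results side by side, then flag the rows where they differ
disp([oneShot, sequential])
disp(oneShot ~= sequential)
```

In the sequential version, "1001" first becomes "1001t", and the later pass for "10" should then rewrite that result again, because replace matches substrings rather than whole elements. A codebook in which one code is a prefix of another is therefore sensitive to the order of the passes.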
3 Comments
Peng Li 2020-3-31
Just tried this. renamecats works fine as long as code is a subset of the categories of tb, but my codebook may contain codes that are never used in the actual tall table. So building the categories with tb = tall(categorical(b, unique([code;b]))); becomes necessary. This looks a bit cumbersome to me.
Instead, the categorical function is supposed to handle this directly with categorical(b, code, meaning). However, it always throws another error whenever I have only one code and one meaning.
See my second question:
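
For reference, a minimal in-memory sketch of the categorical(b, code, meaning) form mentioned above, using the small example data from this thread. It runs on ordinary string arrays only; whether the same call behaves this way on a tall array is not verified here.

```matlab
% Small in-memory data taken from the question (no tall arrays involved)
b = ["a"; "6"; "1001"; "0"; "3"];
code = ["a"; "1001"; "3"; "10"];
meaning = ["atest"; "1001t"; "3b"; "102b"];

% The second input is the value set, the third input names the categories
c = categorical(b, code, meaning);

disp(c)                 % values converted to their category names
disp(isundefined(c))    % true where the value of b is not in code
disp(categories(c))     % lists all four names, including the unused "102b"
```

Unlike replace, which leaves unmatched strings unchanged, this form turns values that are missing from code (here "6" and "0") into undefined elements. That is one reason the unique([code;b]) value set in the answer above may still be needed when the original values have to be kept.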