"switch" like functionality on GPUarray
I am running code that involves a Markov chain process, and I would like to implement it so that 10,000+ such transitions can be performed simultaneously on a gpuArray.
My Markov chain has 8 states, but it could have an arbitrary number, with arbitrary couplings/transitions. What I'd really like to do is use arrayfun, with the code describing the transition probabilities contained in a standard switch statement (not currently supported on the GPU). Some generalized code is below:
function newstate = junk(oldstate)
    switch oldstate
        case 1
            if (condition1, determined partially by a random number)
                newstate = 6;
            end
            if (condition2, determined partially by a random number)
                newstate = 4;
            end
        case 2
            ....
        case 8
            ...
    end
end
I would then store "oldstates" on the GPU and call: newstates = arrayfun(@junk, oldstates);
I am wondering whether there is a more elegant way to do this than writing serial if and elseif statements in the "junk" function that is passed to arrayfun. If I am way off on this, please let me know a better way.
1 Comment
Adam
2016-2-5
I'm far from an expert in GPU programming, but from what I know you really don't want to be executing switch-type functionality on a GPU even if it were supported. GPU cores are highly optimised for massively parallel calculations in which each core runs effectively the same code on different input data. They are not suited to decision trees that fork onto one of numerous paths, which would leave each core running different code at any given moment.
Accepted Answer
Joss Knight
2016-2-15
Any switch statement can be reformulated as a sequence of if/elseif statements, which GPU arrayfun supports, so you can write the code you want.
Because of the potential cost of branches inside a GPU kernel, as articulated in Adam's comment, it's possible you'll get better performance from conditional execution using masks:
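As a rough illustration of that reformulation, here is a minimal sketch of a per-element transition function written with if/elseif only. The name stepState, the second input u (a uniform random draw per element), and all of the thresholds are assumptions made up for this example; they are not the transition rules from the question.

```matlab
function newstate = stepState(oldstate, u)
% Hypothetical transition rule for one chain element.
% oldstate: current state (scalar), u: uniform random draw in [0,1).
% The probabilities below are placeholders, not the asker's values.
newstate = oldstate;            % default: remain in the current state
if oldstate == 1
    if u < 0.3
        newstate = 6;           % 1 -> 6 with probability 0.3
    elseif u < 0.5
        newstate = 4;           % 1 -> 4 with probability 0.2
    end
elseif oldstate == 2
    if u < 0.1
        newstate = 1;           % 2 -> 1 with probability 0.1
    end
end
% Further elseif branches would cover the remaining states.
end
```

Saved as stepState.m, this could be called with something like oldstates = gpuArray(randi(8, 10000, 1)); u = rand(size(oldstates), 'gpuArray'); newstates = arrayfun(@stepState, oldstates, u);. Passing the random draws in as an input keeps the function deterministic, which makes it easy to check on the CPU before moving it to the GPU.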
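newstates = oldstates;
for s = 1:numstates
    % Select the elements currently in state s
    mask = oldstates == s;
    % One kernel launch per state, applied only to the selected elements
    newstates(mask) = arrayfun(@junk, oldstates(mask));
end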
This incurs the cost of launching one kernel per state, so you'd have to experiment to see whether it is actually faster. If the code in your arrayfun kernel really is as simple as in your example, you'll probably find it's cheaper to use a single kernel. Branching only becomes costly when a lot happens inside each conditional clause (or when there are a lot of mutually exclusive clauses).
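One way the masked loop might be arranged so that each launch runs branch-free code is to give every state its own small rule. The sketch below is only an illustration under assumptions: stepByMask, the rules cell array, and the example probabilities are hypothetical names and values, and the random draws u are assumed to be supplied by the caller.

```matlab
function newstates = stepByMask(oldstates, u, rules)
% Masked update: one arrayfun call per state, each with its own rule.
% oldstates, u: arrays of equal size (gpuArray or ordinary arrays).
% rules: cell array of hypothetical handles; rules{s}(u) returns the
%        next state for an element currently in state s.
newstates = oldstates;                  % states without a rule stay put
for s = 1:numel(rules)
    mask = oldstates == s;              % elements currently in state s
    if any(mask(:))                     % skip states with no elements
        newstates(mask) = arrayfun(rules{s}, u(mask));
    end
end
end
```

For example, with rules = {@(u) 1 + 5*(u < 0.3), @(u) 2 - (u < 0.1)}, state 1 moves to 6 with probability 0.3 and state 2 moves to 1 with probability 0.1 (made-up numbers). Anonymous handles are used here for brevity; named functions on the path can be passed the same way. To compare this against the single-kernel version, gputimeit can time each variant, for instance gputimeit(@() stepByMask(oldstates, u, rules)).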