"switch" like functionality on GPUarray

I am running code that involves a Markov chain process, and I would like to implement it so that 10,000+ transitions can be performed simultaneously on a gpuArray.
My Markov chain has 8 states, but in general it could have an arbitrary number of states with arbitrary couplings/transitions. What I'd really like to do is use arrayfun and keep the code describing the transition probabilities in a standard switch statement (not currently supported on the GPU). Some generalized code is below:
function newstate = junk(oldstate)
% Pseudocode: the conditions are placeholders, not valid MATLAB.
switch oldstate
    case 1
        if (condition1, determined partially by a random number)
            newstate = 6;
        end
        if (condition2, determined partially by a random number)
            newstate = 4;
        end
    case 2
        ....
    case 8
        ...
end
end
I would then store oldstates on the GPU and call: newstates = arrayfun(@junk, oldstates);
Is there a more elegant way to do this than writing a series of if and elseif statements in the junk function that is passed to arrayfun? If I am way off on this, please let me know a better way.
  1 Comment
Adam 2016-2-5
I'm far from an expert in GPU programming, but from what I know you really don't want to be executing switch-type functionality on a GPU, even if it were supported. GPU cores are highly optimised for massively parallel calculations in which every core runs effectively the same code on different input data. They are not suited to decision trees that fork onto one of numerous paths, which would leave each core running different code at any given moment.


Accepted Answer

Joss Knight 2016-2-15
Any switch statement can be reformulated as a sequence of if/elseif statements, which are supported by GPU arrayfun, so you can write the code you want.
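As a rough illustration of that reformulation, the sketch below shows what such a function might look like. The name markovStep, the second input u (a uniform random number supplied per element), and all transition targets and thresholds are hypothetical placeholders, not the asker's actual model; only two of the eight states are filled in.

```matlab
function newstate = markovStep(oldstate, u)
% markovStep  One transition for a single chain (scalar in, scalar out).
%   oldstate : current state, an integer in 1..8
%   u        : uniform random number in [0,1) drawn for this element
% All transition targets and thresholds are made-up placeholders.
newstate = oldstate;                 % default: remain in the current state
if oldstate == 1
    if u < 0.3
        newstate = 6;                % 1 -> 6 with probability 0.3
    elseif u < 0.5
        newstate = 4;                % 1 -> 4 with probability 0.2
    end
elseif oldstate == 2
    if u < 0.1
        newstate = 1;                % 2 -> 1 with probability 0.1
    end
    % further elseif branches for the remaining states would follow here
end
end
```

With the function saved as markovStep.m, a call on the GPU could then look like this (array size chosen to match the 10,000 chains in the question):

```matlab
oldstates = gpuArray(randi(8, 10000, 1));      % 10,000 chains, states 1..8
u         = rand(size(oldstates), 'gpuArray'); % one random draw per chain
newstates = arrayfun(@markovStep, oldstates, u);
```

Passing the random draw in as an argument is only a design choice here: it keeps the function deterministic, so the same handle can be checked on ordinary CPU arrays before moving to the GPU.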
Because of the potential cost of branches inside a GPU kernel, as articulated in Adam's comment, it's possible you'll get better performance with conditional execution using masks:
newstates = oldstates;
for s = 1:numstates
    mask = (oldstates == s);   % elements currently in state s
    % every element in this launch takes the same branch inside junk
    newstates(mask) = arrayfun(@junk, oldstates(mask));
end
This incurs the cost of launching one kernel per state, so you'd have to experiment to see whether it is actually faster. If the code in your arrayfun kernel really is that simple, then you'll probably find it's cheaper to use a single kernel. Branching is only more costly if there's a lot happening inside each conditional clause (or a lot of mutually exclusive clauses).
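A variation on the masked approach, sketched here as an assumption rather than as the code from this answer, drops arrayfun altogether and applies each transition with logical indexing on the whole array. The name maskedStep, the input u (one uniform random number per element), and all transition targets and thresholds are hypothetical placeholders; only two states are filled in.

```matlab
function newstates = maskedStep(oldstates, u)
% maskedStep  Vectorized transition using logical masks instead of branches.
%   oldstates : array of current states (CPU array or gpuArray)
%   u         : array of uniform random numbers, same size as oldstates
% All transition targets and thresholds are made-up placeholders.
newstates = oldstates;                       % default: every chain stays put

% Masks are built from oldstates, so a chain that has just moved into a
% state is not processed a second time within the same step.
in1 = (oldstates == 1);
newstates(in1 & (u < 0.3))              = 6; % 1 -> 6 with probability 0.3
newstates(in1 & (u >= 0.3) & (u < 0.5)) = 4; % 1 -> 4 with probability 0.2

in2 = (oldstates == 2);
newstates(in2 & (u < 0.1))              = 1; % 2 -> 1 with probability 0.1

% masks for the remaining states would follow the same pattern
end
```

Each masked assignment runs as a few element-wise operations over the full array, so the number of kernel launches still grows with the number of states. Whether this is faster than a single arrayfun kernel would need to be measured, for example by timing both variants with gputimeit.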
  1 Comment
Vivek 2016-2-24
Thanks to you and to Adam for these thoughtful comments. Using a mask is elegant. I will attempt both and compare performance.

