"switch"-like functionality on gpuArray

Question

1 vote

I am running code that involves a Markov chain process, and I would like to implement it so that 10,000+ such transitions can be performed simultaneously on a gpuArray.

My Markov chain has 8 states, but it could have an arbitrary number of states with arbitrary couplings/transitions. What I'd really like to do is use arrayfun and keep the code describing the transition probabilities in a standard switch statement (not currently supported on the GPU). Some generalized code is below:

function newstate = junk(oldstate)
switch oldstate
    case 1
        if condition1   % determined partially by a random number
            newstate = 6;
        end
        if condition2   % determined partially by a random number
            newstate = 4;
        end
    case 2
        ....
    case 8
        ...
end
end

I would then store "oldstates" on the GPU and call: newstates = arrayfun(@junk, oldstates);

I am wondering whether there is a more elegant way to do this than writing a series of if and elseif statements in the "junk" function that is passed to arrayfun. If I am way off on this, please point me to a better way.


Adam 2016-2-5

I'm far from an expert in GPU programming, but from what I know you really don't want to execute switch-type logic on a GPU even if it were supported. GPU cores are highly optimised for massively parallel calculations in which each core runs effectively the same code on different input data. They are not suited to decision trees that fork onto one of numerous paths, which would leave each core running different code at any given moment.


Answer 1

Joss Knight 2016-2-15


0 votes

Any switch statement can be reformulated as a sequence of if/elseif statements, which GPU arrayfun supports, so you can write the code you want.
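As a minimal sketch of that reformulation, assume a hypothetical three-state chain. The function name nextState and every probability below are made up for illustration and do not come from the question. The sketch relies on GPU arrayfun accepting scalar calls to rand inside the function, so each element gets its own draw.

```matlab
function newstate = nextState(oldstate)
% Hypothetical 3-state chain written as an if/elseif chain instead of switch.
% All transition probabilities are placeholders.
r = rand();              % one uniform draw per element
newstate = oldstate;     % default: remain in the current state
if oldstate == 1
    if r < 0.3
        newstate = 2;    % 1 -> 2 with probability 0.3
    elseif r < 0.5
        newstate = 3;    % 1 -> 3 with probability 0.2
    end
elseif oldstate == 2
    if r < 0.6
        newstate = 1;    % 2 -> 1 with probability 0.6
    end
elseif oldstate == 3
    if r < 0.1
        newstate = 2;    % 3 -> 2 with probability 0.1
    end
end
end
```

With this saved as nextState.m, a call such as newstates = arrayfun(@nextState, gpuArray(randi(3, 1e4, 1))); would advance 10,000 chains by one step in a single kernel. Assigning newstate before the branches keeps the output defined on every path.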

Because branches inside a GPU kernel can be costly, as Adam's comment explains, you may get better performance from conditional execution using masks.

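newstates = oldstates;
for s = 1:numstates
  % Select the elements currently in state s
  mask = oldstates == s;
  % Update only those elements (one kernel launch per state)
  newstates(mask) = arrayfun(@junk, oldstates(mask));
end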

This incurs the cost of launching one kernel per state, so you'd have to experiment to see whether it is actually faster. If the code in your arrayfun kernel really is that simple, you'll probably find it's cheaper to use a single kernel. Branching only becomes costly when there's a lot happening inside each conditional clause (or there are a lot of mutually exclusive clauses).
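One way to run that experiment is with gputimeit, which times a function handle on the GPU. The script below is only a sketch: it assumes a complete junk.m is on the path, that states are numbered 1 to numstates, and that every state occurs at least once in the test data. The helper name maskedStep is hypothetical.

```matlab
% Sketch: time the single-kernel and masked variants on the same data.
numstates = 8;                                    % assumed number of states
oldstates = gpuArray(randi(numstates, 1e4, 1));   % random starting states

tSingle = gputimeit(@() arrayfun(@junk, oldstates));
tMasked = gputimeit(@() maskedStep(oldstates, numstates));
fprintf('single kernel: %.3g s, masked loop: %.3g s\n', tSingle, tMasked);

function newstates = maskedStep(oldstates, numstates)
% One arrayfun launch per state, applied only to the matching elements.
newstates = oldstates;
for s = 1:numstates
    mask = oldstates == s;
    newstates(mask) = arrayfun(@junk, oldstates(mask));
end
end
```

The timings depend on the GPU, the array size, and how much work each branch of junk does, so it is worth repeating the comparison at the problem size you actually intend to run.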


Vivek 2016-2-24

Thanks to you and to Adam for these thoughtful comments. Using a mask is elegant. I will attempt both and compare performance.
