I have heard many times that if statements in shaders slow down the GPU massively. But I have also heard that texture samples are very expensive.
Which one is more tolerable? Which one has less impact?
I am asking because I need to decide whether to multiply a value by 0 or use an if statement.
@Smorty don’t know for sure but experience tells me to go with mul zero.
@Smorty Because GPUs can’t feasibly do speculative execution, branching is more expensive than a lookup, which can be done in parallel and cached. But of course, it depends on what you’re testing and what you’re sampling.
Testing a single equality is not the same as evaluating a complex function call, and sampling a small texture is not the same as sampling a big one, with or without mipmap levels, aggregation, etc.
@Smorty Link doesn’t load for me and I don’t know the answer in general, but one thing I can say is that _sometimes_ if statements aren’t an issue at all, namely when the condition evaluates to the same thing for all pixels/fragments. E.g. an “if sin(TIME) < 0.0” costs you almost nothing, whereas “if COLOR.r > 0.5” causes execution to diverge and slows you down. But I can’t say how that case compares to a texture lookup; I assume it depends on many things.
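To make the uniform vs. per-pixel distinction concrete, here is a toy cost model in Python. It is only a sketch, not shader code: the `group_cost` function, the group size of 32 and the cost numbers are all made up for illustration. It assumes a group of threads runs in lockstep and pays for every side of the if that at least one of its threads takes.

```python
# Toy cost model (an assumption for illustration, not measured GPU behaviour):
# a lockstep group pays for each branch side that any of its threads takes.

def group_cost(conditions, then_cost, else_cost):
    """Cost of one if/else for a whole lockstep group, in arbitrary units."""
    cost = 0
    if any(conditions):       # at least one thread takes the "then" side
        cost += then_cost
    if not all(conditions):   # at least one thread takes the "else" side
        cost += else_cost
    return cost

GROUP_SIZE = 32  # hypothetical group size

# Uniform condition, like sin(TIME) < 0.0: identical for every thread.
uniform = [True] * GROUP_SIZE

# Per-pixel condition, like COLOR.r > 0.5: threads in the group disagree.
red = [i / GROUP_SIZE for i in range(GROUP_SIZE)]
divergent = [r > 0.5 for r in red]

uniform_cost = group_cost(uniform, then_cost=10, else_cost=10)
divergent_cost = group_cost(divergent, then_cost=10, else_cost=10)
print(uniform_cost, divergent_cost)  # prints "10 20"
```

In this model the uniform condition costs the same as having no if at all, while the per-pixel condition costs as much as running both sides.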
I’ve heard that using mix() instead (or whatever GDShader calls that GLSL function) can be more performant, since it doesn’t branch. Is that true?
> Link doesn’t load for me
This post has no link. It’s a text post.
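On the mix() question, here is a rough sketch of what "branchless" means, written in Python rather than shader code so it can be run anywhere. The `mix` and `step` helpers mirror the GLSL definitions; `shade_branch`, `shade_branchless` and `apply_mask` are hypothetical names for illustration. Note that the branchless form always evaluates both inputs, so whether it is actually faster still depends on how expensive those inputs are.

```python
def mix(a, b, t):
    """Linear blend, mirroring GLSL's mix(a, b, t) = a * (1 - t) + b * t."""
    return a * (1.0 - t) + b * t

def step(edge, x):
    """Mirrors GLSL's step(edge, x): 0.0 if x < edge, else 1.0."""
    return 0.0 if x < edge else 1.0

def shade_branch(red, dark, bright):
    # Branching version: only one of the two values is selected.
    # Uses >= so it agrees with step(), which returns 1.0 at the edge.
    if red >= 0.5:
        return bright
    return dark

def shade_branchless(red, dark, bright):
    # Branchless version: both values are already computed,
    # and the 0.0/1.0 weight from step() selects one of them.
    return mix(dark, bright, step(0.5, red))

def apply_mask(value, enabled):
    # The "multiply by 0" idea: enabled is 0.0 or 1.0.
    return value * enabled

for red in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert shade_branch(red, 0.2, 0.9) == shade_branchless(red, 0.2, 0.9)
```

Both versions give the same result for every input; the difference is only in how the work is scheduled on the hardware.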
I think there is no obvious way of telling, because it depends on how your if statement is constructed and, in the end, on what machine code is generated from your code.
So the best thing would probably be to implement both and measure the results. I would argue that’s how performance optimisation works. Don’t just trust what a forum post tells you.
However, chances are high that both will perform similarly, within a range that doesn’t matter for your use case… without knowing your use case :)
The impact of if statements depends on how you use them. GPUs are massively parallel and give up per-thread control logic to fit more parallel compute. Threads aren’t fully independent: they run in lockstep groups, so when threads in a group disagree on the condition, each thread usually has to wait for both branches.
Pixels that take the then branch sit idle while the others execute the else branch, and vice versa. Nested if statements can make this exponentially worse.
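A small Python sketch of why nesting can get exponentially worse. It is an idealized worst case, not a description of any particular GPU or compiler: it assumes every nesting level has both a then and an else side, and that the threads of one lockstep group hit every combination of outcomes. The function names are made up for illustration.

```python
from itertools import product

def paths_executed(thread_conditions):
    """Number of distinct branch paths a lockstep group has to run through.

    thread_conditions holds one tuple per thread, with one boolean per
    nesting level of the if statements.
    """
    return len(set(thread_conditions))

def worst_case_paths(depth):
    """Worst case: every combination of outcomes occurs within the group."""
    return paths_executed(list(product([False, True], repeat=depth)))

for depth in range(1, 5):
    print(depth, worst_case_paths(depth))
# prints:
# 1 2
# 2 4
# 3 8
# 4 16

# If all threads agree at every level, only one path is executed.
coherent = [(True, False, True)] * 32
print(paths_executed(coherent))  # prints 1
```

So the cost comes from threads disagreeing, not from the if statements as such. In practice a compiler may also turn small branches into selects, which is one more reason to measure.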
Try it out and measure.