Branch Predication

I was reading about make, and came across a paragraph about branch predication and Konrad Zuse, who I'd never heard of. Fascinating.

Branch predication is related to the way the current CPUs will execute branches in an if-then-else in parallel, then discard the result that's ignored. (Well, that's not quite true - they only do that for some conditionals that fit into what the CPU can handle.)

Here's a good description at TechTarget. And here's the article that started the hunt.

The idea is that you parallelize the execution of the branches and the predicate (the condition on which a branch is chosen). According to the Thumb2 description in wikipedia, though, machine instructions generally have more than true-false predicates. You have have less than, greater than, or equal. So you can compare two numbers and do a subtraction. The result will be a positive number, a negative number, or zero. That means you can do three branches.

In pseudocode:

bp (boolean) {
  true { a=b; }
  false { a=c; }
}

bp (a,b) {
  = { ... }
  < { ... }
  > { ... }
}

The boolean form is really just a special case of the comparison.

The predicate is parallelized, so computing the values of a and b can take a long time, and as long as the three conditional branches finish up before a and b are computed, you're going to save time. The speed of calculating each branch must depend on the size of the instruction pipeline, and the code in each branch (whether it calls a function, uses memory, or does something else complex).

What's interesting is the situation way over at the other side, where all the branches are complex, and you're on a multicore machine or a networked machine. When you're talking about seconds instead of nanoseconds, like in a scripting environment, you could use parallelism to dispatch programs, over the network, and have them come back with results at different times, and then you pick the branch after the predicate has been calculated.

The obvious case is to launch a bunch of work on the local machine while you wait for results from a remote computer. You can simulate different scenarios with different parameters, calculate the action you take in each situation, while you wait for a result from the remote computer. When the result comes in, you execute your precomputed action.