How are the FLOPS calculated?
People often use the number of floating point operations per second (FLOPS) as a metric for the speed of a computer. One question that arises is how to compare machines with radically different architectures: what requires only a few operations (or even a single operation) on one machine could require many operations on another. Classic examples are evaluations of functions like exp(x) or sin(x). On GPU and Cell hardware, such functions can often be calculated very quickly, say in one cycle, while on other machines they are often counted as 10-20 operations.

We take a conservative approach to FLOP counting, counting operations such as exp(x) or sqrt(x) as a single FLOP if the hardware supports them natively. This can significantly underestimate the FLOP count (others would count an exp(x) as 10 or 20 FLOPs, for example). Others take a much less conservative approach, and we are considering giving two counts, adding a more traditional (less conservative) count as well.
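
To illustrate the difference between the two conventions, here is a minimal sketch in Python. The per-operation weights are purely illustrative assumptions (not our actual accounting code): the conservative table charges one FLOP for any hardware-supported operation, while the traditional table charges transcendental functions at the 10-20 FLOP rates mentioned above.

    # Illustrative per-operation FLOP weights (assumed values, not real data).
    CONSERVATIVE = {"add": 1, "mul": 1, "exp": 1, "sqrt": 1}    # one FLOP if hardware-supported
    TRADITIONAL  = {"add": 1, "mul": 1, "exp": 15, "sqrt": 10}  # exp/sqrt charged at typical software cost

    def count_flops(trace, weights):
        """Sum the FLOP cost of an operation trace under a given weighting."""
        return sum(weights[op] for op in trace)

    # Example: a kernel performing 4 multiplies, 3 adds, and 1 exp.
    trace = ["mul"] * 4 + ["add"] * 3 + ["exp"]
    print(count_flops(trace, CONSERVATIVE))  # 8  (exp counted as a single FLOP)
    print(count_flops(trace, TRADITIONAL))   # 22 (exp counted as 15 FLOPs)

As the example shows, a single exp(x) in an inner loop can swing the reported total by an order of magnitude, which is why the choice of convention matters when comparing FLOPS figures across machines.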