Avoid branch prediction slowdowns in R_PointOnSide
this might be the largest performance optimization i've done yet. after doing some profiling with oprofile, i found that the biggest bottleneck in SRB2 seems to be R_PointInSubsector, which seems to be responsible for 5% of all CPU time that SRB2 uses according to the samples. after digging deeper, it turns out that the CPU usage was caused by an if-condition in R_PointOnSide whose result seemed to be inconsistent. this has a major negative impact on modern CPUs due to how branch prediction works: when the CPU mispredicts a hard-to-predict branch, it has to roll back the operations in its pipeline, which greatly reduces the number of instructions it can complete per clock cycle.
however, by replacing the if-statement with some bit operations, we can avoid the branch altogether, allowing the CPU to make full use of its pipeline. this dramatically increased performance and dropped the CPU time of R_PointInSubsector to 0.45%.
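
to illustrate the general idea, here is a minimal sketch in C. it is not the actual patch: the `line_t` struct and both function names are made up for the example, and it assumes coordinates are small enough (roughly within ±2^30) that the 64-bit arithmetic cannot overflow. the first function decides the side with a comparison and an if, the second derives the same answer from the sign bit of the difference.

```c
#include <stdint.h>

/* Hypothetical partition line: origin (x, y) and direction (dx, dy). */
typedef struct {
    int32_t x, y, dx, dy;
} line_t;

/* Branchy version: returns 0 for the front side, 1 for the back side. */
int side_branchy(int32_t px, int32_t py, const line_t *l)
{
    int64_t left  = (int64_t)l->dy * ((int64_t)px - l->x);
    int64_t right = ((int64_t)py - l->y) * (int64_t)l->dx;

    if (right < left)
        return 0;
    return 1;
}

/* Branchless version: same result, taken from the sign bit instead. */
int side_branchless(int32_t px, int32_t py, const line_t *l)
{
    int64_t left  = (int64_t)l->dy * ((int64_t)px - l->x);
    int64_t right = ((int64_t)py - l->y) * (int64_t)l->dx;

    /* right - left is negative exactly when right < left (no overflow
       under the stated assumption), so its top bit is 1 in that case. */
    uint64_t sign = (uint64_t)(right - left) >> 63;

    /* Flip the bit: negative difference -> 0, otherwise -> 1. */
    return (int)(sign ^ 1u);
}
```

note that a compiler may already turn a simple comparison like the one in `side_branchy` into branch-free code on its own, so whether a rewrite like this helps is something to confirm with a profiler.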