NPO2 span function optimization
This patch makes non-power-of-two span drawing functions about 3-7 times faster. This is achieved by replacing the modulo operations in the loop with additions and subtractions.
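As a rough illustration of the idea (not the code from this patch), the sketch below compares a span loop that wraps texture coordinates with a modulo on every pixel against one that wraps them once before the loop and then only adds and subtracts. All names here (`draw_span_modulo`, `draw_span_incremental`, `wrap_fixed`, the 16.16 `fixed_t` layout, and the 8-bit row-major texture) are assumptions made for the example.

```c
#include <stdint.h>
#include <stddef.h>

typedef int32_t fixed_t;   /* hypothetical 16.16 fixed-point type */
#define FRACBITS 16

/* Wrap v into [0, period) using a modulo; period must be positive. */
static int64_t wrap_fixed(int64_t v, int64_t period)
{
    int64_t r = v % period;
    return r < 0 ? r + period : r;
}

/* Baseline: two modulo operations per pixel. */
static void draw_span_modulo(uint8_t *dest, size_t count, const uint8_t *tex,
                             int32_t w, int32_t h,
                             fixed_t xpos, fixed_t ypos,
                             fixed_t xstep, fixed_t ystep)
{
    const int64_t wfrac = (int64_t)w << FRACBITS;
    const int64_t hfrac = (int64_t)h << FRACBITS;
    int64_t x = xpos, y = ypos;

    for (size_t i = 0; i < count; i++) {
        int64_t tx = wrap_fixed(x, wfrac) >> FRACBITS;
        int64_t ty = wrap_fixed(y, hfrac) >> FRACBITS;
        dest[i] = tex[ty * w + tx];
        x += xstep;
        y += ystep;
    }
}

/* Sketch of the optimization: wrap once up front, then keep the
 * coordinates in range with additions and subtractions only. */
static void draw_span_incremental(uint8_t *dest, size_t count, const uint8_t *tex,
                                  int32_t w, int32_t h,
                                  fixed_t xpos, fixed_t ypos,
                                  fixed_t xstep, fixed_t ystep)
{
    const int64_t wfrac = (int64_t)w << FRACBITS;
    const int64_t hfrac = (int64_t)h << FRACBITS;
    /* The only modulo operations happen here, outside the loop. */
    int64_t x = wrap_fixed(xpos, wfrac);
    int64_t y = wrap_fixed(ypos, hfrac);

    for (size_t i = 0; i < count; i++) {
        dest[i] = tex[(y >> FRACBITS) * w + (x >> FRACBITS)];

        x += xstep;
        y += ystep;
        /* The while loops also cover steps larger than one texture
         * period; for small steps each runs at most once. */
        while (x >= wfrac) x -= wfrac;
        while (x < 0)      x += wfrac;
        while (y >= hfrac) y -= hfrac;
        while (y < 0)      y += hfrac;
    }
}
```

Both functions should write identical pixels for the same inputs; the second simply trades the per-pixel division behind `%` for compare-and-adjust steps that are usually skipped.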
Timedemos of an affected map show around a 10% improvement in FPS. (Recording a replay of just standing in a room with NPO2 floors and ceilings would probably show more dramatic results.)
Slope drawing is not changed. I'm still working on that and will make a separate merge request when it's ready.
Here are screenshots of the best-case improvements I found. The changes in performance were tested on my AMD Ryzen PC and an old Intel gaming laptop.
31 ms --> 4.4 ms!