Optimize HWR_RenderBatches polygon sorting
When 2.2.11 was released a few days ago, I noticed that I wasn't able to reach my 144 Hz refresh rate, despite having a pretty powerful system. So I went down the rabbit hole of profiling with OProfile to find bottlenecks, and found that HWR_RenderBatches, the function responsible for rendering polygons on-screen in batches, had some performance issues.
In that function, the polygons are sorted to make sure they are all rendered in the correct order. This is done with a call to qsort with a callback function that determines the order in which the polygons should be sorted. After profiling, though, it turned out that the callback function was responsible for 3% (!) of all CPU usage in the entire program, making it the second most CPU-intensive part, behind only R_PointInSubsector at 6.5% (!!).
The optimization works by storing a reference to each polygon in the sort array instead of its index. This eliminates the need to index into the polygon array to retrieve each entry during sorting, which brings the CPU usage of the callback function down to 0.7%.
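
To illustrate the idea, here is a minimal, self-contained sketch of the two approaches. It is not the code from this change: `poly_t`, its `shader` and `texture` sort keys, and every function name below are hypothetical stand-ins, and "reference" is assumed to mean a plain pointer. The only real API used is the standard `qsort`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one batched polygon entry. */
typedef struct {
    int shader;   /* hypothetical primary sort key */
    int texture;  /* hypothetical secondary sort key */
} poly_t;

/* Shared ordering rule: by shader first, then by texture. */
static int compare_keys(const poly_t *a, const poly_t *b)
{
    if (a->shader != b->shader)
        return (a->shader > b->shader) - (a->shader < b->shader);
    return (a->texture > b->texture) - (a->texture < b->texture);
}

/* Index-based approach: the comparator has to reach a table outside
 * its arguments and index into it on every single comparison. */
static const poly_t *g_polys;

static int compare_by_index(const void *p1, const void *p2)
{
    const poly_t *a = &g_polys[*(const size_t *)p1];
    const poly_t *b = &g_polys[*(const size_t *)p2];
    return compare_keys(a, b);
}

/* Pointer-based approach: each sorted element already points at its
 * polygon, so the comparator needs one dereference and no lookup. */
static int compare_by_pointer(const void *p1, const void *p2)
{
    const poly_t *a = *(const poly_t *const *)p1;
    const poly_t *b = *(const poly_t *const *)p2;
    return compare_keys(a, b);
}

/* Fills order[0..n-1] with indices into polys, then sorts them. */
void sort_indices(const poly_t *polys, size_t *order, size_t n)
{
    g_polys = polys;
    for (size_t i = 0; i < n; i++)
        order[i] = i;
    qsort(order, n, sizeof *order, compare_by_index);
}

/* Fills order[0..n-1] with pointers into polys, then sorts them. */
void sort_pointers(const poly_t *polys, const poly_t **order, size_t n)
{
    for (size_t i = 0; i < n; i++)
        order[i] = &polys[i];
    qsort(order, n, sizeof *order, compare_by_pointer);
}
```

Both functions produce the same ordering as long as no two polygons compare equal (`qsort` is not guaranteed to be stable). The difference is only in how much work each comparator call does, which matters because the comparator runs many times per sort.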