Optimizing Game Performance: NVIDIA's Recommendations

TL;DR
NVIDIA recommends optimizing game code for fewer CPU threads to enhance performance. High-end CPUs with over 8 cores can render games faster with fewer threads. Reducing thread count improves boost clocks, but results vary based on workload complexity. Hyper-threading can degrade performance. Hybrid cores add complexity.

NVIDIA advises game developers to optimize their code for fewer CPU threads to achieve better performance. Some CPU-bound games perform worse as core count increases because of the added overhead. High-end desktop systems can see performance gains of up to 15% from reducing thread count, and high-end CPUs with more than 8 cores can render games faster with fewer threads. The reasons for the performance drop vary across titles and systems.

Reducing thread count can improve boost clocks on multi-core CPUs, among other effects:


  • Hardware performance: Higher-core-count CPUs sometimes have lower clock speeds. Reducing the number of threads may enable the active cores to boost to a higher frequency.
  • Hardware resource contention: Reducing the thread count can often decrease the pressure on the memory subsystem, reducing latency and enabling the CPU caches to be more efficient. This is especially true for chiplet-based architectures that do not have a unified L3 cache. Threads executing on different chiplets can cause heavy cache thrashing.
    • Executing threads on both logical cores of a single physical core (hyperthreading or simultaneous multi-threading) can add latency, as both threads must share the physical resources (caches, instruction pipelines, and so on). If a critical thread is sharing a physical core, then its performance may decrease. Targeting physical core counts instead of logical core counts can help to reduce this on larger core count systems.
  • Software resource contention: Locks and atomics can have much higher latency when accessed by many threads concurrently, adding to the memory pressure. False sharing can exacerbate this.
  • OS scheduling issues: An over-subscription of threads to active cores leads to a high number of context switches which can be expensive and can put extra pressure on the CPU memory subsystem.
    • On systems with P/E cores, work is scheduled first to physical P cores, then E cores, and then hyperthreaded logical P cores. Using fewer threads than total physical cores enables background threads, such as OS threads, to execute on the E cores rather than on the sibling logical cores of P cores, where they would disrupt the critical threads running there.
  • Power management: Reducing the number of threads can enable more cores to be parked, saving power and potentially allowing the remaining cores to run at a higher frequency.
    • Core parking has been seen to be sensitive to high thread counts, causing issues with short, bursty threads failing to trigger the heuristic to unpark cores. Having fewer, longer-running threads helps the core parking algorithms.
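
The software resource contention point above mentions false sharing. As a minimal sketch of one common mitigation, the C++17 snippet below gives each worker thread its own counter padded to a full cache line, so that writes from one thread do not invalidate the line another thread is using. The 64-byte line size is an assumption (typical for current x86-64 desktop CPUs, but not guaranteed), and the names PaddedCounter and countWithPaddedSlots are hypothetical, not taken from NVIDIA's guidance.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumed cache line size; verify for the target CPU.
constexpr std::size_t kCacheLine = 64;

// alignas pads and aligns each counter to its own cache line,
// so two workers never write to the same line.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Hypothetical helper: every worker increments only its own slot,
// and the slots are summed once all workers have finished.
std::uint64_t countWithPaddedSlots(unsigned workers, std::uint64_t perWorker) {
    std::vector<PaddedCounter> slots(workers);
    std::vector<std::thread> pool;
    pool.reserve(workers);

    for (unsigned i = 0; i < workers; ++i) {
        pool.emplace_back([&slots, i, perWorker] {
            for (std::uint64_t n = 0; n < perWorker; ++n) {
                slots[i].value.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : pool) {
        t.join();
    }

    std::uint64_t total = 0;
    for (const auto& s : slots) {
        total += s.value.load(std::memory_order_relaxed);
    }
    return total;
}
```

Calling countWithPaddedSlots(4, 1000) runs four workers that each add 1000 to their own slot. Without the alignas, the four counters would likely share one cache line and contend on every increment even though no data is logically shared.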

Hyper-threading can degrade performance because sibling threads compete for the same physical core's resources.

Hybrid cores add complexity; limiting thread usage maintains performance.
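
One way to limit thread usage along these lines is to size the worker pool from physical cores rather than logical cores, leave some cores free for OS and background threads, and cap the result. The C++17 sketch below is only an illustration under stated assumptions: CpuLayout, pickWorkerCount, and guessLayout are hypothetical names, the threads-per-core value is supplied by the caller rather than detected, and any cap (such as 8) is a tuning guess, not a figure prescribed by NVIDIA. On hybrid P/E systems, dividing the logical count by a single SMT factor is inaccurate, so a real engine would query the OS for the actual topology.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical description of the machine. A real engine would fill this
// from an OS topology query instead of relying on assumptions.
struct CpuLayout {
    unsigned logicalCores;    // hardware threads visible to the OS
    unsigned threadsPerCore;  // assumed: 2 with SMT/hyper-threading, else 1
};

// Choose a worker count from physical cores, keep `reservedCores` free for
// OS and background threads, and clamp the result to [1, maxWorkers].
unsigned pickWorkerCount(const CpuLayout& cpu,
                         unsigned reservedCores,
                         unsigned maxWorkers) {
    const unsigned smt = std::max(1u, cpu.threadsPerCore);
    const unsigned physical = std::max(1u, cpu.logicalCores / smt);
    const unsigned available =
        physical > reservedCores ? physical - reservedCores : 1u;
    return std::clamp(available, 1u, std::max(1u, maxWorkers));
}

// Portable fallback: only the logical count is known from the standard
// library, so the SMT factor remains the caller's assumption.
CpuLayout guessLayout(unsigned assumedThreadsPerCore) {
    const unsigned logical = std::thread::hardware_concurrency();  // 0 if unknown
    return CpuLayout{logical == 0 ? 1u : logical, assumedThreadsPerCore};
}
```

For example, pickWorkerCount(CpuLayout{32, 2}, 2, 8) treats the machine as 16 physical cores, reserves 2, and then clamps the remaining 14 down to the cap of 8. The right cap and reserve values depend on the title and should be measured rather than assumed.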
