Plenty of games seem to suffer from this issue where they have a linked list
DMA going while polling the controller. Having a large slice size causes the
serial transfer to complete before the silly busy wait in the BIOS poll routine
returns, resulting in it thinking that the controller is disconnected. Some
games are very sensitive to this (e.g. Newman Haas Racing), to the point that
even using a slice size of 1 is insufficient for avoiding the race, probably
due to the linked list layout.
Therefore, rather than doing a major refactoring to ensure the CPU runs every
DMA block, with the associated performance penalty, we just halt the DMA until
the serial transfers have completed. To reduce the chances of this significantly
affecting timing, we accumulate the ticks that have been "lost", and allow them
to be "used up" when the transfer does happen.
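As a rough illustration of the halt-and-accumulate idea, here is a minimal sketch. The `DmaChannelModel` type, its fields, and the cap on banked ticks are all assumptions made for this example, not the emulator's actual code; the commit message does not say whether the accumulated ticks are capped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using TickCount = std::int32_t;

// Hypothetical model of one DMA channel; names are illustrative only.
struct DmaChannelModel
{
  TickCount lost_ticks = 0;  // ticks withheld while the serial transfer was active
  TickCount max_lost_ticks;  // assumed cap, so a long stall cannot bank an unbounded burst

  explicit DmaChannelModel(TickCount cap) : max_lost_ticks(cap) { assert(cap >= 0); }

  // Returns how many ticks the DMA may consume in this slice.
  TickCount Run(TickCount slice_ticks, bool pad_transfer_active)
  {
    if (pad_transfer_active)
    {
      // Halt the DMA, but remember the time it would have used.
      lost_ticks = std::min(lost_ticks + slice_ticks, max_lost_ticks);
      return 0;
    }

    // Serial transfer is done: run the slice plus whatever was banked.
    const TickCount budget = slice_ticks + lost_ticks;
    lost_ticks = 0;
    return budget;
  }
};
```

With a cap of 1000, two halted slices of 100 ticks followed by a normal slice would give the DMA a 300-tick budget on that third slice.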
Plenty of games seem to suffer from this issue where they have
a linked list DMA going while polling the controller. Using a
too-large slice size will result in the serial timing being off,
and the game thinking the controller is disconnected. To avoid
hurting performance too much in the general case, we reduce the
slice size so that CPU and DMA get equal time only while the
controller is transferring, and otherwise leave it at the larger size.
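The slice selection described above could look something like the following sketch. `SelectSliceTicks` and its parameters are hypothetical names, and the reading of "equal CPU and DMA time" as "the DMA slice is no longer than the CPU time between slices" is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using TickCount = std::int32_t;

// Hypothetical helper: picks the DMA slice length for the next batch.
// max_slice_ticks:          the larger slice used in the general case
// cpu_ticks_between_slices: how long the CPU runs between DMA slices
inline TickCount SelectSliceTicks(bool pad_transferring, TickCount max_slice_ticks,
                                  TickCount cpu_ticks_between_slices)
{
  assert(max_slice_ticks > 0 && cpu_ticks_between_slices > 0);

  // While the controller is transferring, shrink the slice so the DMA gets
  // no more time than the CPU does; otherwise keep the larger slice.
  return pad_transferring ? std::min(max_slice_ticks, cpu_ticks_between_slices)
                          : max_slice_ticks;
}
```

The tick values passed in would be tuning parameters; nothing here implies particular numbers.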
Also gets rid of the delay on the GPU side for writing to VRAM. It doesn't
make sense, and it isn't needed now that we slice the block transfers.
Fixes palette corruption in Vigilante 8, and missing rider in
Championship Motocross 2001 featuring Ricky Carmichael.
* CPU/Recompiler: Use rel32 call where possible for no-args
* JitCodeBuffer: Support using preallocated buffer
* CPU/Recompiler/AArch64: Use bl instead of blr for short branches
* CPU/CodeCache: Allocate recompiler buffer in program space
  This means we don't need 64-bit moves for every call out of the
  recompiler.
* GTE: Don't store as u16 and load as u32
* CPU/Recompiler: Add methods to emit global load/stores
* GTE: Convert class to namespace
* CPU/Recompiler: Call GTE functions directly
* Settings: Turn into a global variable
* GPU: Replace local pointers with global
* InterruptController: Turn into a global pointer
* System: Replace local pointers with global
* Timers: Turn into a global instance
* DMA: Turn into a global instance
* SPU: Turn into a global instance
* CDROM: Turn into a global instance
* MDEC: Turn into a global instance
* Pad: Turn into a global instance
* SIO: Turn into a global instance
* CDROM: Move audio FIFO to the heap
* CPU/Recompiler: Drop ASMFunctions
  No longer needed since we have code in the same 4GB window.
* CPUCodeCache: Turn class into namespace
* Bus: Local pointer -> global pointers
* CPU: Turn class into namespace
* Bus: Turn into namespace
* GTE: Store registers in CPU state struct
  Allows relative addressing on ARM.
* CPU/Recompiler: Align code storage to page size
* CPU/Recompiler: Fix relative branches on A64
* HostInterface: Local references to global
* System: Turn into a namespace, move events out
* Add guard pages
* Android: Fix build
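Several items in the list above (rel32 calls, `bl` instead of `blr`, allocating the recompiler buffer in program space) depend on the call target being within reach of a PC-relative branch. A minimal sketch of such reachability checks follows; the function names are hypothetical and not taken from the recompiler. The encodings themselves are standard: an x86-64 `call rel32` is 5 bytes with a signed 32-bit displacement measured from the end of the instruction, and an AArch64 `bl` has a signed 26-bit immediate scaled by 4 (about +/-128 MiB) measured from the instruction's own address.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical check: can an x86-64 `call rel32` at call_addr reach target?
inline bool FitsRel32Call(std::uint64_t call_addr, std::uint64_t target)
{
  constexpr std::uint64_t kCallLength = 5; // opcode E8 + 4-byte displacement
  // The displacement is relative to the address of the next instruction.
  const std::int64_t disp = static_cast<std::int64_t>(target - (call_addr + kCallLength));
  return disp >= std::numeric_limits<std::int32_t>::min() &&
         disp <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical check: can an AArch64 `bl` at insn_addr reach target?
inline bool FitsBL(std::uint64_t insn_addr, std::uint64_t target)
{
  constexpr std::int64_t kRange = std::int64_t{1} << 27; // 128 MiB
  // The displacement is relative to the address of the bl itself.
  const std::int64_t disp = static_cast<std::int64_t>(target - insn_addr);
  return (disp & 3) == 0 && disp >= -kRange && disp < kRange;
}
```

When a check like this fails, the emitter would have to fall back to loading the full 64-bit address into a register and making an indirect call, which is the cost the buffer placement is meant to avoid.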
Fixes games which have looping linked lists but still expect CD/OTC
reads to work.
Also caps the number of ticks used when looping linked lists are
present, so the DMA doesn't steal as much time from the CPU per batch.
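To illustrate the tick cap, here is a minimal sketch of walking a linked list under a budget. The header layout (upper 8 bits are the word count, lower 24 bits the next address, bit 23 set marks the end) follows the PlayStation linked-list DMA format; everything else, including `WalkLinkedList`, `WalkResult`, and the per-node tick cost, is assumed for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct WalkResult
{
  std::uint32_t next_addr;  // where to resume on the next batch
  std::int32_t ticks_used;  // ticks charged for this batch
  bool finished;            // true if the end marker was reached
};

// Hypothetical walker: processes nodes until the list ends or the budget is spent.
inline WalkResult WalkLinkedList(const std::vector<std::uint32_t>& ram_words,
                                 std::uint32_t addr, std::int32_t max_ticks)
{
  constexpr std::uint32_t kEndMarker = 0x800000u; // bit 23 set terminates the list
  constexpr std::int32_t kHeaderTicks = 10;       // assumed fixed cost per node

  std::int32_t used = 0;
  while (used < max_ticks)
  {
    if (addr & kEndMarker)
      return {addr, used, true};

    const std::uint32_t header = ram_words.at((addr & 0x1FFFFCu) / 4);
    const std::int32_t word_count = static_cast<std::int32_t>(header >> 24);
    used += kHeaderTicks + word_count; // one tick per payload word, assumed
    addr = header & 0xFFFFFFu;
  }

  // Budget exhausted: hand control back so the CPU (and other DMAs) can run.
  return {addr, used, (addr & kEndMarker) != 0};
}
```

A list that loops back on itself never reaches the end marker, so without the budget check the walk would never return; with it, the caller gets `finished == false` and a `next_addr` to resume from.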
Fixes:
- Victory Spike
- Magical Drop III - Yokubari Tokudai-gou!
- Yuukyuu no Eden - The Eternal Eden
- Loading screen in World Cup Golf - Professional Edition
This will cause a slight performance loss. I've left some knobs in which
can be tweaked to mitigate this, but the goal is to be compatible with
all games which require them.