Virtual Jaguar libretro v2.3.0: 158 commits, no new features, the acid suite found real bugs

Polish release. 158 commits since v2.2.0, 263 files changed, +25,240 / -2,805. The acid-test suite from PR #130 (and the 2026-05-03 postscript that retracted three of its findings) called out the DIVL zero-divide trap as a real emulator bug; this release fixes it. The suite itself grew from 142 tests to 56,500+ assertions across 15 binaries. Doom still runs 2x too fast.

What landed

v2.3.0 shipped on 2026-05-06 with pre-built libretro cores for 16 platforms (Linux x86_64 / aarch64 / i686, macOS arm64 / x86_64, Windows x86_64 / i686, iOS arm64, tvOS arm64, Android arm64-v8a / armeabi-v7a / x86_64 / x86, Emscripten WASM, PS Vita, Nintendo Switch).

No new features. This release is performance work, hardware-accuracy fixes, and iOS / Provenance stability. The user-visible feature set matches v2.2.0; the engine underneath is meaningfully closer to the chip.

What the acid suite predicted, what got fixed

The acid-test suite from PR #130 shipped 142 tests in May. The 2026-05-03 postscript retracted three findings as test bugs (GPU/DSP read shadowing, DSP IRQ delivery to 68K, the Doom event-clock framing) and confirmed three as real emulator bugs (the DIVL zero-divide trap, narrow-pixel blitter copies, the BlitterMidsummer2 hang on 1bpp / 2bpp).

v2.3.0 hits one of those three directly:

  • DIVL zero-divide trap fixed in PR #162. The release notes line is "DSP: correct ABS flags, DIV zero-guard, STORE alignment." The DIVL HLE path that landed in v2.2.0 handled the math correctly but didn’t trap on divisor == 0; that’s now closed. The acid test pinned to it (divl_zero_divide.s, in the tests/dsp/ bucket) flips FAIL to PASS automatically; that’s the contract the suite was designed for.
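
For readers who want the shape of a zero-guard, here is a minimal C sketch. It is not the code from PR #162: guarded_divu32 and div_result are hypothetical names, and the sketch only reports the zero-divisor condition instead of asserting what the chip produces. The one firm fact behind it is that integer division by zero is undefined behavior in C, so the host divide must never see a zero divisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
typedef struct {
    uint32_t quotient;
    uint32_t remainder;
    bool     div_by_zero;   /* set when the divisor was zero */
} div_result;

/* Unsigned 32-bit divide that never executes a host divide by zero. */
static div_result guarded_divu32(uint32_t dividend, uint32_t divisor)
{
    div_result r = { 0, 0, false };

    if (divisor == 0) {
        /* Report the condition; the caller decides how to model the trap. */
        r.div_by_zero = true;
        return r;
    }

    r.quotient  = dividend / divisor;
    r.remainder = dividend % divisor;
    return r;
}
```

A caller would check div_by_zero first and route to whatever trap or flag behavior the emulated core is supposed to show, then use quotient and remainder only on the normal path.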

The narrow-pixel blitter copies and the BlitterMidsummer2 hang aren’t explicitly called out as fixed, but the release does include "Blitter: multiple correctness fixes in collapsed inner loop paths (daddmode, zmode, patfill eligibility guards)" plus "always read framebuffer in phrase mode for byte_merge." Whether those resolve the postscript’s specific cases needs a fresh acid-suite run against develop to confirm. The suite will tell us; that’s the point.

Performance wins

The headline perf work, in order of measurable impact:

  • RISC computed-goto dispatch (PR #146) for GCC/Clang. Replaces the switch-statement opcode loop in the GPU and DSP cores with goto *jump_table[opcode]. Each opcode handler then ends in its own indirect jump, so the host branch predictor tracks per-handler history instead of funneling every dispatch through one shared, hard-to-predict branch. This is typically a measurable win on modern host CPUs.
  • Inlined delay-slot execution (PR #147). The Jaguar’s RISC cores have branch-delay slots; the previous path took a function call per slot. Inlined now.
  • Predecode opcode cache (PR #144). Caches the decoded form of each opcode the first time the core sees it, so subsequent executions skip the decode work.
  • Blitter SIMD ADDARRAY cascade (PR #148). Vectorizes the address-array generation inside the blitter inner loop.
  • Blitter ADDRGEN caching (PR #149) hoists the y * width multiplication out of the inner loop where it’s a loop invariant.
  • Inlined ADDRGEN and helpers (PR #152).
  • Fast-path RAM bypass (PR #155) for blitter reads/writes. When the address is known to be in main RAM, skip the full address-dispatch table.
  • Collapsed inner loops for simple copy blits (PR #157) and pattern fill blits (PR #156). The blitter’s general-purpose inner loop has a lot of conditional dispatch; for the common cases, a specialized loop is faster.
  • Branchless COMP_CTRL dbinh + Kogge-Stone maskt prefix (PR #158).
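
The computed-goto item at the top of this list is easier to see in code. What follows is a sketch on a made-up three-opcode toy ISA, not the Jaguar RISC encoding and not the code from PR #146; run_toy_program and the opcode names are hypothetical. It relies on the GCC/Clang labels-as-values extension and falls back to a plain switch elsewhere. The toy assumes the program only contains valid opcodes and ends in OP_HALT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes, chosen only to keep the example short. */
enum { OP_INC = 0, OP_DBL = 1, OP_HALT = 2 };

/* Interprets `code` until OP_HALT and returns the accumulator. */
static uint32_t run_toy_program(const uint8_t *code)
{
    uint32_t acc = 0;
    const uint8_t *pc = code;

    assert(code != NULL);

#if defined(__GNUC__)
    /* One label address per opcode, indexed by opcode value. */
    static void *const jump_table[] = { &&do_inc, &&do_dbl, &&do_halt };

    /* Each handler ends with its own indirect jump to the next handler. */
#define DISPATCH() goto *jump_table[*pc++]

    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_dbl:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
#else
    /* Portable fallback: a single shared dispatch point. */
    for (;;) {
        switch (*pc++) {
        case OP_INC: acc += 1; break;
        case OP_DBL: acc *= 2; break;
        default:     return acc;
        }
    }
#endif
}
```

Feeding it the program { OP_INC, OP_INC, OP_DBL, OP_INC, OP_HALT } walks the accumulator through 1, 2, 4, 5. The structural point is that the switch version has one dispatch branch shared by all opcodes, while the computed-goto version has one per handler.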

Net effect: the blitter and the GPU/DSP RISC interpreter (the two dominant cost centers in any Jaguar emulator on a modern host) are both meaningfully faster. The accurate blitter is still slower than the fast blitter; that gap is narrower now.
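
The ADDRGEN caching item above is a classic loop-invariant hoist. A minimal sketch, assuming a byte-per-pixel linear framebuffer; fill_row_naive and fill_row_hoisted are hypothetical names, not blitter functions from the core:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Naive form: recomputes y * width for every pixel in the row. */
static void fill_row_naive(uint8_t *fb, size_t width, size_t y,
                           size_t x0, size_t count, uint8_t value)
{
    assert(fb != NULL);
    for (size_t i = 0; i < count; i++)
        fb[y * width + x0 + i] = value;
}

/* Hoisted form: the row base is computed once, outside the loop. */
static void fill_row_hoisted(uint8_t *fb, size_t width, size_t y,
                             size_t x0, size_t count, uint8_t value)
{
    assert(fb != NULL);
    uint8_t *row = fb + y * width + x0;   /* loop invariant */
    for (size_t i = 0; i < count; i++)
        row[i] = value;
}
```

An optimizing compiler will often do this on its own for a loop this simple. The manual version matters when the width or base address comes from a call the compiler cannot prove is side-effect free, which is a common situation in emulator register access code.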

Hardware-accuracy fixes

Real chip behavior now matches in places it didn’t:

  • PIT corrected to the full 26.59 MHz system clock (PR #154). It was previously running at half rate, so games that use PIT2 timer interrupts as their tick source were getting events at half the expected wall-clock rate. The acid suite flagged the PIT rate in pit_countdown_rate.s; that’s resolved upstream of the test now.
  • TOM visible window derived from hardware registers (PR #164). The display extents come from VDB, VDE, HDB, HDE instead of hardcoded dimensions. Proper overscan handling. This is the same shape of fix that closed the v2.2.0 "left half of screen" Doom rendering bug, extended into the visible-window code path.
  • DAC resamples I2S output when SCLK changes at runtime (PR #163). Fixes audio pitch drift in games that reconfigure the I2S clock during play. Skyhammer is one of the games that does this; the audio improvement isn’t the full HLE engine fix it needs, but it’s directionally right.
  • AvP alpha noise / red artifact fix (PR #166). The blitter now skips the destination data read when BKGWREN is set. Eliminates the red visual noise on Alien vs Predator.
  • TOM: restore fixed left-edge origin for scanline positioning.
  • Blitter: restore 64-bit register longword swap in BlitterWriteByte; always read framebuffer in phrase mode for byte_merge.
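
To put numbers on the PIT clock fix above, here is a small arithmetic sketch. It assumes the timer fires once every (prescaler + 1) * (divider + 1) input clocks; that model, the function name pit_interrupt_hz, and the register values in the usage note are assumptions for illustration, not taken from the release notes.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTEM_CLOCK_HZ 26590000.0   /* 26.59 MHz, as quoted above */

/* Interrupt rate under the assumed (prescaler + 1) * (divider + 1) model. */
static double pit_interrupt_hz(double clock_hz, uint16_t prescaler,
                               uint16_t divider)
{
    assert(clock_hz > 0.0);
    double clocks_per_tick = ((double)prescaler + 1.0) *
                             ((double)divider + 1.0);
    return clock_hz / clocks_per_tick;
}
```

With a hypothetical prescaler of 0 and divider of 26589, the full clock gives 26,590,000 / 26,590 = 1000 interrupts per second. Feeding the same registers from a half-rate clock gives 500, which is the halved tick rate the fix removes.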

iOS / Provenance stability

The single most user-impactful fix for anyone on iOS:

  • Reset all static state on unload/deinit with NULL checks (PR #160). Fixes crashes on iOS / Provenance when relaunching the core (load a Jaguar ROM, exit, load another, crash). The libretro core had statics that were never zeroed across retro_unload_game / retro_deinit cycles; iOS re-uses the dylib in-process, so the leftover state corrupted the next run.
  • Move externs into proper headers (PR #161).
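
The reset-on-unload pattern from PR #160 looks roughly like this. It is a sketch with hypothetical statics and function names (core_load_game, core_run_frame, core_unload_game), not the real libretro entry points or the core's actual state:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical file-scope state that outlives a single game session. */
static uint8_t *rom_buffer    = NULL;
static size_t   rom_size      = 0;
static uint32_t frame_counter = 0;
static bool     game_loaded   = false;

/* Safe to call repeatedly, and before any game has been loaded. */
static void core_unload_game(void)
{
    if (rom_buffer != NULL) {
        free(rom_buffer);
        rom_buffer = NULL;     /* no dangling pointer for the next load */
    }
    rom_size      = 0;
    frame_counter = 0;
    game_loaded   = false;
}

static bool core_load_game(const uint8_t *data, size_t size)
{
    assert(data != NULL && size > 0);

    core_unload_game();        /* never inherit state from a prior run */

    rom_buffer = malloc(size);
    if (rom_buffer == NULL)
        return false;
    memcpy(rom_buffer, data, size);
    rom_size    = size;
    game_loaded = true;
    return true;
}

static void core_run_frame(void)
{
    if (game_loaded)
        frame_counter++;
}
```

On hosts that unload the library between sessions, leftover statics disappear with the process image. When the library stays mapped in-process, as described above for iOS, nothing resets them unless the unload path does it explicitly.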

This lands the same week Provenance v3 goes into App Store review (iOS + first-ever tvOS release). The Jaguar core has been one of the harder cores to keep stable across re-loads on iOS; v2.3.0 closes the most-reported crash path.

The test suite, an order of magnitude bigger

The acid suite shipped at 142 tests in May. v2.3.0 expands the test infrastructure substantially:

  • 56,500+ automated assertions across 15 test binaries.
  • New harnesses: framebuffer integrity (alpha corruption + screen position), TOM visible window registers, EEPROM lifecycle (write / read / persist across reload), audio pipeline, PIT clock rate verification.
  • Shared test harness library at test/harness/ — common CLI, JSON output, audio / video stats, probe modules for DSP and timing.
  • Per-frame timing diagnostic tool for detecting halfline / cycle anomalies. This is the tool needed to investigate the Doom-too-fast bug; the postscript walked back the original "event-clock divergence" framing as a measurement artifact, but the underlying timing-skew question is still open. The new diagnostic is how it gets answered.
  • Visual regression tests: determinism, frameskip invariance, save-state round-trip, rewind simulation.
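
The save-state round-trip check in the last item has a simple shape: snapshot, run forward, record a framebuffer hash, restore, run forward again, compare. Below is a sketch against a toy deterministic machine. toy_state, its xorshift generator, and the FNV-1a hash are stand-ins chosen for the example; the real harness's state format and hashing are not described in the release notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for emulator state. */
typedef struct {
    uint32_t rng;
    uint32_t frame;
    uint8_t  fb[16];
} toy_state;

static void toy_reset(toy_state *s)
{
    assert(s != NULL);
    s->rng   = 0xACE1u;
    s->frame = 0;
    for (size_t i = 0; i < sizeof s->fb; i++)
        s->fb[i] = 0;
}

/* One deterministic "frame": fills the framebuffer from xorshift32. */
static void toy_run_frame(toy_state *s)
{
    for (size_t i = 0; i < sizeof s->fb; i++) {
        s->rng ^= s->rng << 13;
        s->rng ^= s->rng >> 17;
        s->rng ^= s->rng << 5;
        s->fb[i] = (uint8_t)s->rng;
    }
    s->frame++;
}

/* 32-bit FNV-1a over the framebuffer. */
static uint32_t fb_hash(const toy_state *s)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < sizeof s->fb; i++) {
        h ^= s->fb[i];
        h *= 16777619u;
    }
    return h;
}

/* Returns 1 if replaying from a snapshot reproduces the same frame. */
static int round_trip_matches(unsigned warmup, unsigned tail)
{
    toy_state live, snapshot;

    toy_reset(&live);
    for (unsigned i = 0; i < warmup; i++)
        toy_run_frame(&live);

    snapshot = live;                        /* "save state" */
    for (unsigned i = 0; i < tail; i++)
        toy_run_frame(&live);
    uint32_t expected = fb_hash(&live);

    live = snapshot;                        /* "load state" */
    for (unsigned i = 0; i < tail; i++)
        toy_run_frame(&live);

    return fb_hash(&live) == expected;
}
```

The toy passes by construction because the snapshot copies everything. In a real core the same check fails whenever some piece of state lives outside the serialized blob, which is exactly the class of bug it exists to catch.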

Test code grew faster than emulator code this cycle. That’s the right ratio for an accuracy-focused release.

What’s still broken

Honest scope, direct from the v2.3.0 known-issues list:

  • Doom: still runs at approximately 2x speed. Game music silent. The PIT clock-rate fix in this release moves in the right direction for game-timing accuracy, but the GPU-rendered bus-contention modeling that Doom relies on isn’t here yet. The new per-frame timing diagnostic is the right next investigation tool.
  • Wolfenstein 3D: still no game music. The DSP audio engine state replication that v2.2.0’s Known Issues called out hasn’t landed yet.
  • Skyhammer / Iron Soldier 2: saturated square-wave audio on HLE. The DAC SCLK resampling helps the pitch-drift cluster but not the engine-state cluster.
  • Battle Sphere: menu navigation issues.
  • Jaguar CD: in progress on a separate branch, not in this release.
  • The accurate blitter is still meaningfully slower than the fast blitter on lower-end hardware. Some games (Tempest 2000) may need the fast blitter to hit full speed.

The BlitterMidsummer2 hang on 1bpp / 2bpp blits and the narrow-pixel partial-pixel-mode copies from the postscript aren’t explicitly listed; those need a fresh acid-suite run against develop to confirm whether the collapsed-inner-loop fixes resolved them.

What’s next

v2.4.0 is unscoped at this writing. The visible queue from the v2.3.0 work:

  • The Doom timing investigation, now that there’s a per-frame timing diagnostic.
  • Jaguar CD branch landing.
  • More HLE BIOS coverage for the still-broken cluster (Wolf3D / Skyhammer / Iron Soldier 2 audio engines).
  • Continued expansion of the acid suite — 56,500 assertions is a milestone, not a finish line.

Release is at github.com/libretro/virtualjaguar-libretro/releases/tag/v2.3.0. SHA256 checksums are in SHA256SUMS.txt; verification steps are in SECURITY.md.