felix86 26.02

We have some exciting news this month!

Vulkan X11 thunking support

The native Vulkan userspace driver can now be used on X11 through thunking. We also implemented the signatures for some missing extensions used by DXVK, which means DXVK on Wine can now use Vulkan thunking. Thunking is currently disabled by default; for 64-bit games it can be enabled with FELIX86_ENABLED_THUNKS=vk felix86 --shell.

Because thunked calls go to the native driver instead of recompiled x86-64 code, thunking improves performance and compatibility and reduces time spent on recompilation.

Witcher 3 with DXVK and Vulkan thunking on Milk-V Jupiter

Zink support

The Vulkan extension signatures necessary for Zink were also added. For ease of use, there’s now a zink profile that enables thunking and Zink. Test with FELIX86_PROFILE=zink felix86 --shell.

SpacemiT K3

SpacemiT was kind enough to give us early SSH access to a Linux environment with SpacemiT K3 to test felix86 and run benchmarks!

Hardware

SpacemiT K3 is RVA23 compliant. That gives us vector crypto (important for x86 AES and others), hardware support for unaligned non-atomic accesses (necessary; next month’s post will explain why), Zvfhmin, which we’ll eventually use for F16C, Zvbb, which allows some optimizations in our SIMD code, and Zfa, which also allows some optimizations.

Additionally, RVA23 covers the extensions necessary for felix86 operation, which are RVV 1.0 and Zba/Zbb/Zbs.
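As a rough way to check a board against that baseline, here is a small sketch that parses a Linux-style RISC-V ISA string (the format of the isa field in /proc/cpuinfo) and reports which of the required extensions are absent. This is not how felix86 itself detects extensions. The helper names and the sample strings are made up for illustration, and the sketch assumes that a v in the string means RVV 1.0 and that the string carries no version numbers.

```python
# RVV 1.0 and Zba/Zbb/Zbs, as listed in the post.
REQUIRED = {"v", "zba", "zbb", "zbs"}


def parse_isa(isa: str) -> set[str]:
    """Split an ISA string such as 'rv64imafdcv_zba_zbb' into extension names."""
    base, *multi_letter = isa.strip().lower().split("_")
    if not base.startswith(("rv32", "rv64")):
        raise ValueError(f"not a RISC-V ISA string: {isa!r}")
    extensions = set(base[4:]) | set(multi_letter)  # letters after 'rv64'/'rv32'
    if "b" in extensions:
        # The B extension is shorthand for Zba + Zbb + Zbs.
        extensions |= {"zba", "zbb", "zbs"}
    return extensions


def missing_extensions(isa: str) -> set[str]:
    """Return the required extensions that the ISA string does not advertise."""
    return REQUIRED - parse_isa(isa)


# Hypothetical ISA strings, not copied from real hardware.
sample_ok = "rv64imafdcv_zicsr_zifencei_zba_zbb_zbs"
sample_bad = "rv64imafdc_zicsr_zifencei_zba_zbb"

print(sorted(missing_extensions(sample_ok)))   # []
print(sorted(missing_extensions(sample_bad)))  # ['v', 'zbs']
```

On a real system the input would come from the isa line of /proc/cpuinfo rather than a hard-coded string.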

Benchmarks

The benchmarks we tried ran between roughly 2x and 4x faster on the SpacemiT K3 than on the K1!

First, the x86-64 7z benchmark run through felix86 26.02 on RISC-V hardware:

// SpacemiT K1, x86-64 7z
                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       2081   567    357   2024  |      41069   746    469   3502
23:       2040   579    359   2079  |      42298   780    469   3659
24:       2042   589    373   2196  |      41572   779    468   3648
25:       2049   606    386   2340  |      40578   779    464   3611
----------------------------------  | ------------------------------
Avr:      2053   585    369   2160  |      41379   771    468   3605
Tot:             678    418   2882
// SpacemiT K3, x86-64 7z
                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       7392   579   1243   7192  |     123233   778   1351  10508
23:       7199   590   1244   7336  |     121907   784   1345  10545
24:       7248   609   1280   7794  |     119814   783   1343  10513
25:       7232   622   1328   8258  |     117826   787   1331  10484
----------------------------------  | ------------------------------
Avr:      7268   600   1274   7645  |     120695   783   1343  10513
Tot:             691   1308   9079

As you can see, on SpacemiT K3 the 7z benchmark run through felix86 gets an average speedup of 3.54x in compression and about 2.92x in decompression compared to the same benchmark through felix86 on SpacemiT K1.
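For anyone who wants to reproduce those ratios, here is a minimal sketch that divides the K3 averages by the K1 averages. The numbers are the Avr rows of the two tables; the variable and function names are just for illustration. Results are rounded to two decimals (truncating the decompression ratio instead gives 2.91).

```python
# "Avr" rows from the two 7z tables: (speed in KiB/s, rating in MIPS).
k1 = {"compress": (2053, 2160), "decompress": (41379, 3605)}
k3 = {"compress": (7268, 7645), "decompress": (120695, 10513)}


def speedup(new: float, old: float) -> float:
    """How many times faster the new result is than the old one."""
    return new / old


speed_ratio = {op: round(speedup(k3[op][0], k1[op][0]), 2) for op in k1}
rating_ratio = {op: round(speedup(k3[op][1], k1[op][1]), 2) for op in k1}

print(speed_ratio)   # {'compress': 3.54, 'decompress': 2.92}
print(rating_ratio)  # {'compress': 3.54, 'decompress': 2.92}
```

The speed and rating columns agree with each other, which is a small sanity check on the tables.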

Next, the x86-64 Stockfish benchmark (run as stockfish bench inside felix86 --shell):

// SpacemiT K1, x86-64 Stockfish 17.1, felix86 26.02
Total time (ms) : 99364
Nodes searched  : 2030154
Nodes/second    : 20431
// SpacemiT K3, x86-64 Stockfish 17.1, felix86 26.02
Total time (ms) : 24425
Nodes searched  : 2030154
Nodes/second    : 83117

With SpacemiT K3 the Stockfish benchmark is 4.07x faster.

An interesting observation is that the native RISC-V Stockfish runs slower than the x86-64 Stockfish through felix86:

// SpacemiT K3, RISC-V Stockfish 17.1
Total time (ms) : 42895
Nodes searched  : 2030154
Nodes/second    : 47328

This isn’t too unexpected, as the x86-64 Stockfish is significantly more optimized with assembly routines, and the RISC-V version should catch up in the future. For now, the x86-64 build through felix86 searches about 1.76x as many nodes per second as the native build, which shows how felix86 can translate the optimized x86-64 routines of Stockfish to improve performance.
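The three Stockfish results can be cross-checked with a few lines of arithmetic. This sketch recomputes nodes per second from the node count and total time, then derives the two ratios. The names are illustrative, and the use of floor division is an assumption that happens to reproduce the reported figures.

```python
NODES = 2030154  # identical in all three runs

total_ms = {
    "K1 x86-64 via felix86": 99364,
    "K3 x86-64 via felix86": 24425,
    "K3 native RISC-V": 42895,
}


def nodes_per_second(nodes: int, ms: int) -> int:
    """Nodes searched per second, rounded down."""
    return nodes * 1000 // ms


nps = {name: nodes_per_second(NODES, ms) for name, ms in total_ms.items()}
print(nps)

k3_vs_k1 = round(nps["K3 x86-64 via felix86"] / nps["K1 x86-64 via felix86"], 2)
emulated_vs_native = round(nps["K3 x86-64 via felix86"] / nps["K3 native RISC-V"], 2)

print(k3_vs_k1)            # 4.07
print(emulated_vs_native)  # 1.76
```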

Next, a run of node -e "console.log('hello')":

// SpacemiT K1, x86-64 Node v24.13.0
2.72s user 0.70s system 110% cpu 3.100 total
// SpacemiT K3, x86-64 Node v24.13.0
real    0m1.408s
user    0m1.156s
sys     0m0.400s

Finally, a run of ffmpeg -f lavfi -i testsrc=duration=10:size=1920x1080:rate=30 -c:v libx264 -benchmark -preset medium -f null -

// SpacemiT K1, x86-64 ffmpeg N-122467-gc3d3377fe1-20260116
bench: utime=612.988s stime=6.089s rtime=99.349s
// SpacemiT K3, x86-64 ffmpeg N-122467-gc3d3377fe1-20260116
bench: utime=175.554s stime=1.746s rtime=32.661s

The benchmark completes about three times faster in wall-clock time (rtime drops from 99.3 s to 32.7 s).
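The ffmpeg timings can be turned into more familiar numbers. The sketch below derives the encoding rate in frames per second and the average number of busy cores from the bench lines, assuming the test source produced exactly 10 s x 30 fps = 300 frames. The names are illustrative only.

```python
FRAMES = 10 * 30  # testsrc with duration=10 and rate=30

runs = {
    "K1": {"utime": 612.988, "stime": 6.089, "rtime": 99.349},
    "K3": {"utime": 175.554, "stime": 1.746, "rtime": 32.661},
}


def encode_fps(t: dict) -> float:
    """Frames encoded per second of wall-clock time."""
    return FRAMES / t["rtime"]


def busy_cores(t: dict) -> float:
    """CPU time divided by wall-clock time, i.e. average cores kept busy."""
    return (t["utime"] + t["stime"]) / t["rtime"]


for name, t in runs.items():
    print(f"{name}: {encode_fps(t):.2f} fps, {busy_cores(t):.2f} cores busy")

wall_speedup = round(runs["K1"]["rtime"] / runs["K3"]["rtime"], 2)
cpu_speedup = round(runs["K1"]["utime"] / runs["K3"]["utime"], 2)
print(wall_speedup)  # 3.04
print(cpu_speedup)   # 3.49
```

The user CPU time shrinks more than the wall-clock time does, so the K3 run kept slightly fewer cores busy on average.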

A100 cores

The SpacemiT K3 has 8 X100 RVA23 cores with VLEN=256 and 8 A100 non-RVA23 cores with VLEN=1024. Due to differences in VLEN, the same process can’t be scheduled on X100 and A100 cores. Nevertheless, felix86 can work on the VLEN=1024 cores.

To have child processes scheduled on the A100 cores, write the PID of the current process to /proc/set_ai_thread. In many shells, such as bash or zsh, this can be done with echo $$ > /proc/set_ai_thread.
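The same thing can be done from a launcher script instead of a shell. This is a minimal sketch, assuming only what is described above: writing a PID to /proc/set_ai_thread makes that process's children run on the A100 cores. The function name run_on_a100 and its control_file parameter are hypothetical, and the write will fail on systems that do not expose this file.

```python
import os
import subprocess


def run_on_a100(cmd: list[str], control_file: str = "/proc/set_ai_thread") -> int:
    """Register this process with the kernel, then run cmd as a child.

    Mirrors `echo $$ > /proc/set_ai_thread`, but for the Python process
    instead of the shell. Returns the child's exit code.
    """
    with open(control_file, "w") as f:
        f.write(f"{os.getpid()}\n")
    return subprocess.run(cmd).returncode


# Example call (only meaningful on a SpacemiT K3 system):
# run_on_a100(["felix86", "--shell"])
```

The control_file parameter exists only so the function can be tried against an ordinary file on other machines.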

Running a felix86 instance will now show us we have a VLEN of 1024:

Extensions enabled for the recompiler: g,v1024,c,b,zicond,zfa,zvbb,zvkned

This means it should be possible to start some x86 processes on these cores through felix86 to squeeze some extra compute power out of the system if the X100 cores are fully occupied.

Final thoughts

Performance has significantly improved with the SpacemiT K3, so we’re very excited to see it in boards and test games with it.

It seems that the SpacemiT K3 has a 39-bit address space, the same as the SpacemiT K1. A 48-bit address space would be nice in future SoCs, as it would enable emulation of programs that rely on the larger address space that x86 offers, such as PS4 emulators like shadPS4.
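To put the difference in numbers, here is a small sketch comparing the user-space range available with 39-bit and 48-bit virtual addresses. It assumes the usual layout where user space gets the lower half of the address space. The fixed address used at the end is a made-up example of something a 48-bit x86-64 program might map, not an address taken from shadPS4.

```python
GIB = 1 << 30


def user_space_bytes(va_bits: int) -> int:
    """Size of user space, assuming the lower half of the range is for users."""
    return 1 << (va_bits - 1)


def fits(address: int, va_bits: int) -> bool:
    """Whether a fixed address falls inside user space for this address width."""
    return 0 <= address < user_space_bytes(va_bits)


print(user_space_bytes(39) // GIB)  # 256 (GiB)
print(user_space_bytes(48) // GIB)  # 131072 (GiB, i.e. 128 TiB)

# Hypothetical fixed mapping that is fine with 48-bit addresses.
wanted = 0x7F00_0000_0000
print(fits(wanted, 39))  # False
print(fits(wanted, 48))  # True
```

A guest program that insists on an address like this cannot be given that address on a 39-bit host, no matter how much memory is free.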

The Zacas extension is missing, and it is important for x86-64 emulation on RISC-V. It gives us a way to do a 128-bit CAS (compare-and-swap), which is heavily used by Unity. When it is absent, we use a global lock for all CMPXCHG16B operations, which is not actually atomic with regard to other memory operations. As we get faster cores, the lack of Zacas is likely to lead to crashes in x86-64 games that use CMPXCHG16B. It is a relatively new extension, so it’s understandable that it is not implemented yet.
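To illustrate why a global lock is not enough, here is a toy model in Python. It is not felix86 code: Cell128, cmpxchg16b and the racing_store hook are all invented for this example. The hook simulates another thread doing a plain store (which never takes the lock) between the compare and the write.

```python
import threading

_global_lock = threading.Lock()  # one lock shared by every emulated CMPXCHG16B


class Cell128:
    """Stand-in for a 16-byte guest memory location."""

    def __init__(self, value: int):
        self.value = value


def cmpxchg16b(cell, expected, desired, racing_store=None):
    """Lock-based compare-and-swap. Returns (success, value_seen)."""
    with _global_lock:
        seen = cell.value
        if seen != expected:
            return False, seen
        if racing_store is not None:
            racing_store(cell)  # plain stores do not wait for _global_lock
        cell.value = desired
        return True, seen


def plain_store_99(cell):
    cell.value = 99


cell = Cell128(1)
print(cmpxchg16b(cell, 1, 2), cell.value)  # (True, 1) 2
print(cmpxchg16b(cell, 1, 3), cell.value)  # (False, 2) 2

# A plain store lands in the middle of the emulated CAS and is silently lost.
cell = Cell128(1)
print(cmpxchg16b(cell, 1, 2, racing_store=plain_store_99), cell.value)  # (True, 1) 2
```

With a real 128-bit CAS the store would land either before the CAS (which would then fail) or after it, so the final value would always be 99. The lock only protects CMPXCHG16B operations from each other, which is why a hardware instruction such as the one Zacas provides matters.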

Some unaligned atomic support would be nice, such as Zama16b.

Both Zacas and Zama16b are optional extensions for RVA23, so we hope to see them in a future core.

Overall, a great improvement in performance. Looking forward to seeing how games perform under felix86!

Compatible with RVA23

After a few bug fixes, felix86 now officially supports RVA23 hardware with VLEN=256. We also ran tests on QEMU with cpu=max, and after some more bug fixes there’s support for VLEN=128 with all the extensions that QEMU supports.

Miscellaneous niceties

Install script improvements

The felix86 install script (bash <(curl -s https://install.felix86.com)) now supports installing a specific commit that was made in the last 90 days.

Additionally, some oversights in rootfs installation were addressed. The rootfs is now correctly owned by root, and only the $HOME folder inside the rootfs is owned by the user.

Documentation

We now have documentation at https://felix86.com/docs, for developers and users alike. Check it out!

RSS feed

There’s now an RSS feed available at https://felix86.com/feed.xml.

Bluesky

There’s now a Bluesky account for felix86 at @felix86-emu.


Thanks for reading this post.

Big things are coming next month. Stay tuned!

If you like this project, please give us a star on GitHub: https://github.com/OFFTKP/felix86

Written on February 1, 2026