felix86 26.02
We have some exciting news this month!
Vulkan X11 thunking support
The native Vulkan userspace driver can now be used on X11 with thunking. We also implemented the signatures for some missing extensions used by DXVK, which means DXVK on Wine can now use Vulkan thunking. Vulkan thunking is currently disabled by default; for 64-bit games it can be enabled with FELIX86_ENABLED_THUNKS=vk felix86 --shell.
Thunking improves performance and compatibility and reduces time spent on recompilation.
Witcher 3 with DXVK and Vulkan thunking on Milk-V Jupiter
Zink support
The Vulkan extension signatures necessary for Zink were also added. For ease of use, there’s now a zink profile that enables thunking and Zink. Test with FELIX86_PROFILE=zink felix86 --shell.
SpacemiT K3
SpacemiT was kind enough to give us early SSH access to a Linux environment with SpacemiT K3 to test felix86 and run benchmarks!
Hardware
SpacemiT K3 is RVA23 compliant, which means we get access to vector crypto (important for x86 AES & others), unaligned non-atomic accesses supported in hardware (necessary, see next month’s post as to why), Zvfhmin which we’ll eventually use for F16C, Zvbb which allows for some optimizations in our SIMD code, and Zfa which also allows for some optimizations.
Additionally, RVA23 covers the extensions necessary for felix86 operation, which are RVV 1.0 and Zba/Zbb/Zbs.
Benchmarks
The benchmarks we tried ran a lot faster on the SpacemiT K3 than on the K1!
First, the x86-64 7z benchmark run through felix86 26.02 on RISC-V hardware:
// SpacemiT K1, x86-64 7z
           Compressing            |        Decompressing
Dict   Speed  Usage   R/U  Rating |   Speed  Usage   R/U  Rating
       KiB/s      %  MIPS    MIPS |   KiB/s      %  MIPS    MIPS
22:     2081    567   357    2024 |   41069    746   469    3502
23:     2040    579   359    2079 |   42298    780   469    3659
24:     2042    589   373    2196 |   41572    779   468    3648
25:     2049    606   386    2340 |   40578    779   464    3611
--------------------------------- | ----------------------------
Avr:    2053    585   369    2160 |   41379    771   468    3605
Tot:            678   418    2882
// SpacemiT K3, x86-64 7z
           Compressing            |        Decompressing
Dict   Speed  Usage   R/U  Rating |   Speed  Usage   R/U  Rating
       KiB/s      %  MIPS    MIPS |   KiB/s      %  MIPS    MIPS
22:     7392    579  1243    7192 |  123233    778  1351   10508
23:     7199    590  1244    7336 |  121907    784  1345   10545
24:     7248    609  1280    7794 |  119814    783  1343   10513
25:     7232    622  1328    8258 |  117826    787  1331   10484
--------------------------------- | ----------------------------
Avr:    7268    600  1274    7645 |  120695    783  1343   10513
Tot:            691  1308    9079
As you can see, on SpacemiT K3 the 7z benchmark run through felix86 gets a 3.54x speedup on average in compression and a 2.92x speedup in decompression, compared to the same benchmark through felix86 on SpacemiT K1.
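For anyone who wants to recompute these ratios, here is a minimal sketch that divides the K3 average speed by the K1 average speed. The numbers are copied from the "Avr:" rows of the two tables; the dictionary keys and the speedup helper are names made up for this example.

```python
# Average speeds (KiB/s) copied from the "Avr:" rows of the two 7z tables.
# The key names are illustrative, not something 7z or felix86 prints.
K1_AVG = {"compress_kibs": 2053, "decompress_kibs": 41379}
K3_AVG = {"compress_kibs": 7268, "decompress_kibs": 120695}


def speedup(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


compress_speedup = speedup(K3_AVG["compress_kibs"], K1_AVG["compress_kibs"])
decompress_speedup = speedup(K3_AVG["decompress_kibs"], K1_AVG["decompress_kibs"])

print(f"compression speedup:   {compress_speedup:.2f}x")
print(f"decompression speedup: {decompress_speedup:.2f}x")
```

Using the "Rating" column instead of the "Speed" column gives nearly the same ratios, since the rating is derived from the speed.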
Next, the x86-64 Stockfish benchmark (run as stockfish bench inside felix86 --shell):
// SpacemiT K1, x86-64 Stockfish 17.1, felix86 26.02
Total time (ms) : 99364
Nodes searched : 2030154
Nodes/second : 20431
// SpacemiT K3, x86-64 Stockfish 17.1, felix86 26.02
Total time (ms) : 24425
Nodes searched : 2030154
Nodes/second : 83117
With SpacemiT K3 the Stockfish benchmark is 4.07x faster.
An interesting observation is that the native RISC-V Stockfish runs slower than the x86-64 Stockfish through felix86:
// SpacemiT K3, RISC-V Stockfish 17.1
Total time (ms) : 42895
Nodes searched : 2030154
Nodes/second : 47328
This isn’t too unexpected, as the x86-64 Stockfish is significantly more optimized with assembly routines, and the RISC-V version should catch up in the future. Still, it shows how felix86 can translate the optimized x86-64 routines of Stockfish to improve performance.
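The Stockfish figures can be cross-checked with a few lines of arithmetic. The sketch below assumes nodes/second is nodes searched divided by elapsed time with the fraction dropped, which reproduces all three reported values; the dictionary labels and helper names are invented for this example.

```python
# (total time in ms, nodes searched) copied from the three Stockfish runs.
# The labels are illustrative only.
RUNS = {
    "K1 x86-64 via felix86": (99364, 2030154),
    "K3 x86-64 via felix86": (24425, 2030154),
    "K3 native RISC-V": (42895, 2030154),
}


def nodes_per_second(total_ms: int, nodes: int) -> int:
    """Nodes per second, truncated to an integer (assumed rounding mode)."""
    return nodes * 1000 // total_ms


def ratio(faster: str, slower: str) -> float:
    """How many times more nodes/second `faster` reaches than `slower`."""
    return nodes_per_second(*RUNS[faster]) / nodes_per_second(*RUNS[slower])


k3_vs_k1 = ratio("K3 x86-64 via felix86", "K1 x86-64 via felix86")
emulated_vs_native = ratio("K3 x86-64 via felix86", "K3 native RISC-V")

print(f"K3 vs K1 (both through felix86): {k3_vs_k1:.2f}x")
print(f"felix86 x86-64 vs native RISC-V on K3: {emulated_vs_native:.2f}x")
```

With the figures above this works out to roughly 4.07x for K3 over K1, and roughly 1.76x for the x86-64 build through felix86 over the native RISC-V build on the K3.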
Next, a run of node -e "console.log('hello')":
// SpacemiT K1, x86-64 Node v24.13.0
2.72s user 0.70s system 110% cpu 3.100 total
// SpacemiT K3, x86-64 Node v24.13.0
real 0m1.408s
user 0m1.156s
sys 0m0.400s
Finally, a run of ffmpeg -f lavfi -i testsrc=duration=10:size=1920x1080:rate=30 -c:v libx264 -benchmark -preset medium -f null -
// SpacemiT K1, x86-64 ffmpeg N-122467-gc3d3377fe1-20260116
bench: utime=612.988s stime=6.089s rtime=99.349s
// SpacemiT K3, x86-64 ffmpeg N-122467-gc3d3377fe1-20260116
bench: utime=175.554s stime=1.746s rtime=32.661s
The benchmark completes about three times faster in wall-clock time (rtime drops from 99.349s to 32.661s).
FELIX86_UNSAFE_FLAGS option, which is safe in ABI conforming programs.
A100 cores
The SpacemiT K3 has 8 X100 RVA23 cores with VLEN=256 and 8 A100 non-RVA23 cores with VLEN=1024. Due to differences in VLEN, the same process can’t be scheduled on X100 and A100 cores. Nevertheless, felix86 can work on the VLEN=1024 cores.
To make any child processes get sent to the A100 cores, you need to write the PID of the current process to /proc/set_ai_thread. In many shell programs, such as bash or zsh, this can be done with echo $$ > /proc/set_ai_thread.
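The same step can be done from a launcher script instead of a shell. The following is a minimal sketch, assuming that writing the decimal PID followed by a newline (what echo does) is all the kernel interface expects; the function name and its parameters are hypothetical, and only the /proc/set_ai_thread path comes from the text above.

```python
import os
from pathlib import Path
from typing import Optional

# Path taken from the post; it is only expected to exist on the SpacemiT K3 kernel.
AI_THREAD_CONTROL = Path("/proc/set_ai_thread")


def send_children_to_a100(control_file: Path = AI_THREAD_CONTROL,
                          pid: Optional[int] = None) -> int:
    """Write a PID to the control file, like `echo $$ > /proc/set_ai_thread`.

    Defaults to the PID of the calling process. Returns the PID written.
    """
    if pid is None:
        pid = os.getpid()
    # Assumption: the file accepts the PID as decimal text plus a newline.
    control_file.write_text(f"{pid}\n")
    return pid
```

A launcher would call send_children_to_a100() once and then start its children, for example with subprocess.run(["felix86", "--shell"]). The control_file parameter exists only so the function can be tried against an ordinary file on machines without this interface.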
Running a felix86 instance will now show us we have a VLEN of 1024:
Extensions enabled for the recompiler: g,v1024,c,b,zicond,zfa,zvbb,zvkned
This means it should be possible to start some x86 processes on these cores through felix86 to squeeze some extra compute power out of the system if the X100 cores are fully occupied.
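A script that decides where to schedule work could read the VLEN back out of that log line. This is a rough sketch that assumes the line always has the comma-separated shape shown above, with the vector extension written as v followed by the VLEN in bits; the log format is not a documented interface and may change, and parse_vlen is a name invented here.

```python
import re
from typing import Optional


def parse_vlen(log_line: str) -> Optional[int]:
    """Return the VLEN from a `v<bits>` entry in the extension list, or None."""
    # Keep only the part after the first colon, if the line has one.
    extension_list = log_line.split(":", 1)[-1]
    for extension in extension_list.split(","):
        # fullmatch keeps entries like "zvbb" or "zvkned" from matching.
        match = re.fullmatch(r"v(\d+)", extension.strip())
        if match:
            return int(match.group(1))
    return None


line = "Extensions enabled for the recompiler: g,v1024,c,b,zicond,zfa,zvbb,zvkned"
vlen = parse_vlen(line)
print(f"VLEN reported by felix86: {vlen}")
```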
Final thoughts
Performance has significantly improved with the SpacemiT K3, so we’re very excited to see it in boards and test games with it.
It seems that the SpacemiT K3 has a 39-bit address space, same as the SpacemiT K1. A 48-bit address space would be nice in future SoCs, as it enables emulation of some programs that rely on the larger address space that x86 offers. One example is PS4 emulators like shadPS4.
The Zacas extension is missing, and it is important for x86-64 emulation on RISC-V. It gives us a way to do 128-bit CAS, which is heavily used by Unity. When it is absent, we use a global lock for all CMPXCHG16B operations, which is not actually atomic with regard to other memory operations. As we get faster cores, it is likely that the lack of Zacas will lead to crashes in x86-64 games that use CMPXCHG16B. It is a relatively new extension, so it’s understandable for it to not be implemented yet.
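To illustrate why a global lock falls short of a real 128-bit CAS, here is a conceptual model in Python. It is not felix86's implementation (felix86 emits RISC-V machine code); guest memory is modelled as a bytearray and every name in it is hypothetical.

```python
import threading

# One lock shared by every emulated CMPXCHG16B, as described above.
_GLOBAL_CAS_LOCK = threading.Lock()


def emulated_cmpxchg16b(memory: bytearray, addr: int, expected: int, desired: int):
    """Compare-and-swap the 16 bytes at `addr`.

    Returns (swapped, observed): `observed` is the value that was in memory,
    mirroring how CMPXCHG16B reports the old value on failure.
    """
    with _GLOBAL_CAS_LOCK:
        observed = int.from_bytes(memory[addr:addr + 16], "little")
        if observed != expected:
            return False, observed
        memory[addr:addr + 16] = desired.to_bytes(16, "little")
        return True, observed


def plain_store64(memory: bytearray, addr: int, value: int) -> None:
    """An ordinary 8-byte store. It never takes the lock."""
    memory[addr:addr + 8] = value.to_bytes(8, "little")
```

Two emulated_cmpxchg16b calls exclude each other, but plain_store64 on another thread is free to modify the same 16 bytes between the load and the store inside the locked region. That gap is what makes the operation non-atomic with regard to other memory operations, and it is what a hardware 128-bit CAS from Zacas would close.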
Some unaligned atomic support would be nice, such as Zama16b.
Both Zacas and Zama16b are optional extensions for RVA23, so we hope to see them in a future core.
Overall, a great improvement in performance. Looking forward to seeing how games perform under felix86!
Compatible with RVA23
After a few bug fixes, felix86 now officially supports RVA23 hardware with VLEN=256. We also ran tests on QEMU with cpu=max, and after some more bug fixes there’s support for VLEN=128 with all the extensions that QEMU supports.
Miscellaneous niceties
Install script improvements
The felix86 install script (bash <(curl -s https://install.felix86.com)) now supports installing a specific commit that was made in the last 90 days.
Additionally, some oversights in rootfs installation were addressed. The rootfs is now correctly owned by root, and only the $HOME folder inside the rootfs is owned by the user.
Documentation
We now have documentation at https://felix86.com/docs, for developers and users alike. Check it out!
RSS feed
There’s now an RSS feed available at https://felix86.com/feed.xml.
Bluesky
There’s now a Bluesky account for felix86 at @felix86-emu.
Thanks for reading this post.
Big things are coming next month. Stay tuned!
If you like this project, please give us a star on GitHub: https://github.com/OFFTKP/felix86