felix86 26.07
Two-year anniversary, new features, bug fixes, and initial ptrace support!
Two years of felix86
Today marks the two-year anniversary of felix86.
In these two years, felix86 went from a program that could run nano, to an emulator that could barely run vkcube, to running simple 2D games like VVVVVV and Balatro, to running Windows games and 32-bit games, to now running complex 3D games with good FPS. We hope to keep working on this project for years to come!
Ptrace support
Ptrace functionality was introduced this month. This includes signal and syscall stops, event stops like clone/fork/vfork/execve, accessing registers and debug registers, software and hardware breakpoints (but not watchpoints, as there is no kernel functionality for them on RISC-V yet), single-stepping, attaching and detaching, and more.
With this functionality, software that uses ptrace starts to work. One example is x86 gdb through felix86. Using the host RISC-V gdb was always possible, but that debugs both the emulator and the guest application, while the guest x86 gdb debugs just the guest application. This isn’t too important, however, as gdb has a server protocol meant for use with emulators, which we could implement instead.
Other programs are more interesting. For example, winedbg can now be used in felix86, which is very helpful for debugging Windows applications. We recently discovered that Windows GOG installers would show an error message box. With winedbg we can attach from a separate felix86 instance and get a backtrace, which helps track down where the error comes from.
Getting a backtrace of a Windows program running through Wine is a lot easier with winedbg
Another thing we wanted to get running was proot. However, it was immediately obvious that RISC-V hardware is not ready for it yet.
Non-relocatable executable with entrypoint at 0x600000000000, needs Sv48 or higher
We are also able to run guest strace. This shows that our syscall stops work and it might be useful for debugging in some cases.
Guest strace prints the guest syscalls only
Trap flag support
Part of improving debugger support was implementing the trap flag in x86, which allows for single-stepping instructions.
x86-64-v3 support
The missing instructions for x86-64-v3 were added, namely PEXT/PDEP and MOVBE. There are no instructions for bit extract/deposit in RISC-V, so a polyfill implementation was adapted to RISC-V to provide faster emulation than a naive loop.
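For readers unfamiliar with these instructions, here is a small reference sketch of what PEXT and PDEP compute. It is not the polyfill felix86 uses; it only illustrates the semantics, iterating once per set bit of the mask instead of once per bit position. The function names are chosen for illustration.

```python
MASK64 = (1 << 64) - 1


def pext(src: int, mask: int) -> int:
    """Gather the bits of src selected by mask into the low bits of the result."""
    result = 0
    out_bit = 1                   # next result bit to fill
    mask &= MASK64
    while mask:
        lowest = mask & -mask     # isolate the lowest set bit of the mask
        if src & lowest:
            result |= out_bit
        out_bit <<= 1
        mask &= mask - 1          # clear the bit we just handled
    return result


def pdep(src: int, mask: int) -> int:
    """Scatter the low bits of src to the bit positions selected by mask."""
    result = 0
    in_bit = 1                    # next source bit to consume
    mask &= MASK64
    while mask:
        lowest = mask & -mask
        if src & in_bit:
            result |= lowest
        in_bit <<= 1
        mask &= mask - 1
    return result


print(hex(pext(0x12345678, 0xFF00FF00)))
print(hex(pdep(0x1256, 0xFF00FF00)))
```

PDEP undoes PEXT for the bits covered by the mask, which makes the pair easy to test against each other.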
Forking glibc
The felix86 installer script will now install a custom fork of glibc in the installation path (/opt/felix86) that helps felix86 with emulating rare clone syscalls. The full list of reasons for this fork can be found in the documentation.
Implement runtime 32-bit switching
Previously, felix86 would decide if the recompiler would translate x86 or x86-64 based on the ELF. In recent versions of Wine, 32-bit programs are emulated using a 64-bit version of Wine. When Wine wants to run 32-bit code it performs a special far jump instruction that changes the CS segment to 0x23, which has long mode disabled and 32-bit operands enabled. When 32-bit code is done running, it returns to 64-bit code using a special return instruction that changes the CS segment to 0x33, which has long mode enabled.
These instructions are now implemented in felix86 26.07, which allows it to run 32-bit games with Wine 11, which uses the WoW64 mode by default.
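As a rough illustration of the idea, the sketch below decodes the two selectors and tracks the active decode mode across far transfers. The GuestCpu class and its method names are hypothetical and do not reflect how felix86 is structured; the 0x23/0x33 values are the ones mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selector:
    index: int   # descriptor table index
    table: str   # "GDT" or "LDT"
    rpl: int     # requested privilege level


def decode_selector(sel: int) -> Selector:
    """Split an x86 segment selector into its three fields."""
    return Selector(index=sel >> 3,
                    table="LDT" if sel & 0b100 else "GDT",
                    rpl=sel & 0b11)


# Code segment selectors from the text: 0x23 is 32-bit, 0x33 is 64-bit.
CS_BITNESS = {0x23: 32, 0x33: 64}


class GuestCpu:
    """Hypothetical guest state that only tracks CS and the program counter."""

    def __init__(self) -> None:
        self.cs = 0x33
        self.pc = 0

    @property
    def bitness(self) -> int:
        return CS_BITNESS[self.cs]

    def far_transfer(self, new_cs: int, target: int) -> None:
        """Model a far jump or far return: load CS, then jump to target."""
        if new_cs not in CS_BITNESS:
            raise ValueError(f"unsupported code segment {new_cs:#x}")
        self.cs = new_cs
        # 32-bit code can only address the low 4 GiB.
        self.pc = target & 0xFFFFFFFF if self.bitness == 32 else target


cpu = GuestCpu()
cpu.far_transfer(0x23, 0x00401000)   # enter 32-bit code
print(cpu.bitness, decode_selector(cpu.cs))
cpu.far_transfer(0x33, 0x7F0000001000)  # return to 64-bit code
print(cpu.bitness, decode_selector(cpu.cs))
```

A recompiler would consult something like `bitness` when it starts translating a block, so the same process can mix 32-bit and 64-bit blocks.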
Bug fixes
invalidate_caller_thunk instruction tearing
A really annoying bug is now fixed. When blocks need to be invalidated, due to self-modifying code or memory mappings being modified, the affected blocks are not invalidated immediately. Instead, the first couple of instructions of each block were replaced with new instructions that trampoline to a function that invalidates the block and unlinks the caller. The reason for this design is that invalidating the affected block immediately would also require unlinking every block that links to it, and tracking those would take a lot of memory. Instead, we defer the invalidation to the next jump to this block by the affected thread. Any further jumps only unlink the caller, without invalidating the target block a second time, since it may already have been recompiled from the new code.
The bug came from replacing the first two instructions of the block. Most crashes occurred when the first and second instructions were on different cache lines, but a crash was also possible when they shared a cache line, as the invalidator also flushes the icache. If the affected thread was running the block while we were invalidating it, and executed the old first instruction followed by the new second instruction, it would jump to a bogus location and crash.
This is now fixed by only changing the first instruction of the block. The first instruction is atomically replaced with two compressed instructions, C.LDSP and C.JR, which load a pointer to a trampoline from the stack (which we have full control of in JIT code) and jump to it. It would be better to use C.LD with $gp as the base, since $gp holds our ThreadState*, but C.LD can’t load from $gp. This fix suffices, as the thread can no longer execute half of the trampoline: it executes either the old first instruction or both compressed trampoline instructions.
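To make the trick concrete, the sketch below encodes C.LDSP and C.JR following the RISC-V compressed instruction formats and packs them into one 32-bit word. The register (t0, x5) and the stack offset (8) are assumptions for illustration only; the post does not say which ones felix86 uses. It also assumes the patched slot is a 4-byte instruction that can be overwritten with a single aligned 32-bit store.

```python
import struct


def encode_c_ldsp(rd: int, offset: int) -> int:
    """Encode C.LDSP rd, offset(sp): 64-bit load relative to the stack pointer."""
    if not 1 <= rd <= 31:
        raise ValueError("rd must be x1..x31 (x0 is reserved)")
    if offset % 8 or not 0 <= offset <= 504:
        raise ValueError("offset must be a multiple of 8 in 0..504")
    return ((0b011 << 13)                       # funct3
            | (((offset >> 5) & 0b1) << 12)     # uimm[5]
            | (rd << 7)
            | (((offset >> 3) & 0b11) << 5)     # uimm[4:3]
            | (((offset >> 6) & 0b111) << 2)    # uimm[8:6]
            | 0b10)                             # quadrant 2


def encode_c_jr(rs1: int) -> int:
    """Encode C.JR rs1: jump to the address held in rs1."""
    if not 1 <= rs1 <= 31:
        raise ValueError("rs1 must be x1..x31")
    return (0b1000 << 12) | (rs1 << 7) | 0b10


def trampoline_word(reg: int, offset: int) -> int:
    """Pack both 16-bit instructions into one 32-bit word.

    RISC-V instructions are little-endian, so the instruction that runs
    first (C.LDSP) goes in the low half of the word.
    """
    return encode_c_ldsp(reg, offset) | (encode_c_jr(reg) << 16)


T0 = 5       # hypothetical scratch register
SLOT = 8     # hypothetical stack slot holding the trampoline pointer

word = trampoline_word(T0, SLOT)
print(f"{word:#010x}", struct.pack("<I", word).hex())
```

Because both halves arrive in a single 32-bit store, a thread fetching from that address sees either the old 4-byte instruction or the complete two-instruction sequence.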
This bug would cause rare random crashes in applications that use self-modifying code, such as Steam or Unity games.
Misc. fixes
- Fix auxiliary carry flag for XADD (#566)
- Fix ADC/SBB overflow calculation edge-case for 64-bit operands (#541)
- Fix slight inaccuracy in 128-bit path detection for unsigned divide (#545)
- Use fence.i after generating extension detection code (#547)
- Fix bug in ARCH_GET_FS/ARCH_GET_GS (#556)
Penetration testing
With help from Radically Open Security we identified some vulnerabilities in felix86. In particular, since the emulator can be installed in binfmt_misc with the C flag, we care about bugs in the emulator that could cause arbitrary code execution and thus privilege escalation with setuid(0) binaries.
The bugs found were fixed, such as felix86 not correctly setting AT_SECURE, which is responsible for disabling LD_PRELOAD and similar mechanisms in privileged executables.
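For context, AT_SECURE is an entry in the auxiliary vector that the kernel, or an emulator standing in for it, passes to a new program. The sketch below parses a 64-bit little-endian auxiliary vector and computes a simplified expected value for the flag. The helper names are made up, and the rule shown only compares real and effective IDs; the real kernel also accounts for file capabilities and security modules. This is not felix86's code.

```python
import struct

AT_NULL = 0      # terminates the auxiliary vector
AT_SECURE = 23   # nonzero when the program runs in secure-execution mode


def parse_auxv(raw: bytes) -> dict[int, int]:
    """Parse a 64-bit little-endian auxv blob into {type: value}."""
    entries = {}
    usable = len(raw) - len(raw) % 16          # ignore a trailing partial pair
    for key, value in struct.iter_unpack("<QQ", raw[:usable]):
        if key == AT_NULL:
            break
        entries[key] = value
    return entries


def expected_at_secure(ruid: int, euid: int, rgid: int, egid: int) -> int:
    """Simplified rule: secure mode when real and effective IDs differ."""
    return int(ruid != euid or rgid != egid)


# A synthetic vector such as a loader might see for a setuid-root binary.
blob = struct.pack("<QQQQQQ", 6, 4096, AT_SECURE, 1, AT_NULL, 0)
auxv = parse_auxv(blob)
print(auxv[AT_SECURE] == expected_at_secure(ruid=1000, euid=0, rgid=1000, egid=1000))
```

On a Linux system the same parser can be pointed at the contents of /proc/self/auxv to inspect what the current process received.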
As an additional security measure, felix86 26.07+ will not be installed with the C flag by default to completely prevent present or future bugs of this kind. If you need to run privileged executables you can temporarily install it with the C flag using sudo felix86 --binfmt-misc-setuid or by running a privileged shell: sudo felix86 --shell.
This means that AppImage files won’t work by default, as they rely on the fusermount3 executable. If you understand the risks and aren’t planning on running untrusted code you may install it as it was previously installed with the same --binfmt-misc-setuid argument.
You can read the full report here.
Thanks for reading this post.
If you like this project, please give us a star on GitHub: https://github.com/OFFTKP/felix86
