GSoC 2018 - Vulkan-ize Virglrenderer
Several months ago started the GSoC 2018. Once again, I found a project which got my attention.
Virglrenderer is a library designed to provide QEMU guests with OpenGL acceleration. It is composed of several components:
- a MESA driver, on the guest, which generates Virgl commands
- a lib, on the host, which takes virgl commands and generated OpenGL calls from it.
If you want to read more: 3D Acceleration using VirtIO.
This library was built with OpenGL in mind. Today, Vulkan is correctly supported, and is becoming a new standard. It might be time to bring Vulkan to QEMU’s guests !
To do so, we will need to work on two components:
- A Vulkan ICD. Writting one for MESA sounds like a good idea.
- A Vulkan back-end for Virglrenderer.
Now, we can face the first issue: Vulkan is not designed with abstraction in mind. The time of the old GlBegin/glVertex is kinda dead.
If we want to avoid any unnecessary abstraction, we cannot easily reduce the amount of calls made to the API. Thus, the vast majority of the VK calls will be forwarded to the host. However, there is some area in which we can bend the rules a bit.
The rest of this post contains the same content as the announce email (virgl ML).
- Several Vulkan objects can be created
- Memory can be mapped and altered on the client.
- Changes are written/read to/from the server on flush/invalidation
- Basic features for command buffers are supported.
As a result, a sample compute shader can be ran, and the results can be read.
I only use vtest for now. The client part lies in mesa/srv/virgl.
To compile virglrenderer with vulkan, the option –with-vulkan is needed. Running the server as-is does not enable Vulkan. And for now, Vulkan cannot be used in parallel with OpenGL (Issue #1). To enable Vulkan, the environment variable VTEST_USE_VULKAN must be set.
The client driver is registered as a classic Vulkan ICD. When the loader call icdNegociateLoaderICDInterfaceVersion, the driver connects to the server. On failure, the driver reports as an invalid driver.
Once connected, the ICD will fetch and cache all physical devices. It will also fetch information about queue, memory and so. Physical devices are then exposed as virtual-gpus. Memory areas are showed as-is, except for the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, which is disabled. This forces the application to notify every modification made to a mapped memory.
The object creation part relies heavily on API-Forwarding. For now, I don’t see how I could avoid that.
Once basic objects are created, the client will ask to map some memory. For now, no clever thing is done. The ICD will provide a buffer. On flush, a transfer command is issued. Virglrenderer will then map the corresponding memory region, write/read, and unmap it. A memory manager could be used on the server in the future to avoid mapping/unmapping regions each time a transfer occurs.
Commands and execution
Command pool creation is forwarded to the server. For now, a command buffer is attached to its pool. To retrieve a command buffer from a handle, I need to know from which pool it came from. (Issue #2) Command buffer creation is also forwarded to the server.
Command buffers state is managed on the client. Each vkCmd* call will modify an internal state. Once vkEndCommandBuffer is called, the state is sent to the server. The server will then call corresponding vkCmd* functions to match retrieved the state.
Vulkan entry points are generated at compile time. Heavily inspired from Intel’s entry-point generation. However, since object creation relies on API-Forwarding, I started to work on a code generator for these functions.
Using a json, the interesting informations are outlined. Then a Python script will generate functions used to forward object creation to the vtest pipe. Even-though the Vulkan API seams pretty consistent, some specific cases and time constraints forced me to abandon it.
This script is still available in the mesa/src/virgl/tools and virglrenderer/tools folder, but is lacking features. Also, since I had different needs on both sides of vtest, scripts diverge a lot. The most recent version is the Virglrenderer one. It’s a second iteration, and it might be easier to work with.
In the current state, I use it to generate a skeleton for vtest functions, and then fixes the implementation. In the future, it could save us some time, especially if we use the same protocol for VirtIO commands.
1: (Virglrenderer) Vulkan cannot be used next to OpenGL.
There is no reason for it except a badly though integration of the vulkan initialization part into virglrenderer.
2: (Virglrenderer) Command buffers are scattered into several pools
Command buffers are scattered into several pools the client created. To fetch a command buffer vk-handle, I need to first fetch the corresponding pool from a logical device, then fetch the command buffer. Since VirtIO ant Vtest provides a FIFO, maybe we could drop the command pool creation forwarding. Use only one pool per instance, and thus simplify command buffers lookups.
3: (MESA) Vtest and VirtIO switch is not straightforward right now.
An idea could be to add a level between vgl_vk* functions an vtest. vgl_vk* function would still manage the state of the ICD. the mid-layer would convert handles and payload to a common protocol for both VirtIO and Vtest. (Both could use vgl handles and some metadata). Then, a backend function, which would choose between vtest and virtio.
The handles could be either forwarded as-is (vtest case) Or translated to real virgl handles in the case of a kernel driver which could do a translation, or check them. But the metadata should not change.
4: (Virglrenderer/MESA) vtest error handling is bad.
Each command sends a result payload, and optionally, data. This result payload contains two informations. An error code, and a numerical value. Use as a handle, or a size. On server failure, error-codes should be used.
5: bugs, bugs and bugs.
This project is absolutely NOT usable right now.
My first step should be to rebase this project onto the current virglrenderer version, and rewrite the history. In the mean time, rewrite the initialization part to allow both OpenGL and Vulkan to be ran. Then, fix the vtest/virtio architecture. Add this new mid-layer. Once refactorized, I should work on the error handling for client-server interactions.
Once in a sane state, other issues will have to be addressed.
How to test it
There is a main repo used to build and test it rapidly. In it, a bash script and a dockerfile (+ readme, todo)
The bash script in itself should be enough. But if the compilation fails for a reason, the dockerfile could be used.
The README provided should be enough to make the sample app run.