Innovating the Impossible: Moving Polystream to run in Microsoft Windows Containers
Polystream was founded on a spirit of rebellion and ingenuity. Our command streaming tech rewrites the cloud gaming rule book, delivering 3D content in new and innovative ways. And, as a company, we’re constantly finding smart ways to break the rules, reject outdated traditional solutions, and experiment as we build the impossible!
Earlier this year, everyone at Polystream teamed up for an internal Polystream Jam, challenging ourselves to build amazing things with our Fantom engine and streaming tech in just 48 hours. (Our team, “HHCVB_KHR?”, played with an early prototype to stream a simple Vulkan application via command streaming!)
Now, I’ve been working with some of the team on a new experiment of our own…
I’m Octavian, one of Polystream’s Software Engineers. Recently, I’ve been exploring how to stream a Windows application running from a Docker container – a method that allows for even greater scale while also driving the streaming cost down further than we have before.
This post takes a look at our process, some of the challenges we encountered, and how we overcame them.
How do Docker Windows Containers differ from a Windows Virtual Machine?
First, let’s establish some of the basics – defining how Docker Windows Containers differ from a Windows Virtual Machine.
In theory, if an application runs on your Windows PC, then it should run in a Windows Container. This is certainly true for some applications, but there is a major difference.
Windows Containers only support command-line applications, and only in a non-interactive way. There is no “Desktop” or UI. Once a container has booted, it gives you the “good old MS-DOS” prompt and that’s it.
Most applications have one or more configuration files that control aspects such as window size, number of audio channels, type of input and graphical settings. Typically, when an application starts, it reads these files and then begins initialising. Because the application is running on Microsoft Windows, it has to make Windows API calls during initialisation. For example, an application that initialises audio will first ask the Windows API whether the platform supports audio rendering.
If the PC has an audio device, the Windows API will return True and so the application can then proceed with the initialization. Internally the Windows API will try to do what the caller asks if the hardware supports it.
Most applications that run on today’s PCs under Microsoft Windows have a window, and the user interacts with them via a mouse, controller or other input system. These are considered interactive applications. The whole Desktop space, in which a user can use the mouse to start an application, listen to music, write a document, etc., is an Interactive Session.
A Windows Container provides a non-interactive session. It has no desktop and knows nothing about audio devices, GPUs or even the mouse. It only knows about the keyboard and lets you execute CLI-based programs.
As a result, trying to run a window-based application inside a Windows Container will not work.
But what if we want to run a window-based application in a Windows Container and stream it?
In theory, if the application running in a container asks, using WinApi, to create a 1920×1080 window, the container platform would say something along the lines of “not supported as there is no Desktop”.
We wondered what would happen if we could change what the application gets back as a response from WinApi. Our theory was that the application would continue with its initialization, thinking a 1920×1080 window had been created 😊
The answer to this problem is API virtualization & detouring: intercepting the application’s API calls and redirecting them to our own implementations. This is required for each component an application will use, such as Window Management and Input, Audio, Graphics and Controller support.
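To make the idea concrete, here is a minimal sketch of a detour in Python. It is not our implementation and it does not touch the real Windows API: `ContainerPlatform`, `create_window` and `app_init` are hypothetical stand-ins for the platform, the API call and the application. It only shows the principle that swapping the function behind an API call changes what the application believes about its environment.

```python
# Illustrative only: none of these names are real Windows or Polystream APIs.

class ContainerPlatform:
    """Mimics a container session that has no desktop."""

    def create_window(self, width, height):
        # No desktop exists, so window creation fails.
        return None


def make_virtual_create_window(created):
    """Build a replacement that records the request and reports success."""
    def virtual_create_window(width, height):
        handle = len(created) + 1          # fake, non-zero window handle
        created[handle] = (width, height)  # remember what the app asked for
        return handle
    return virtual_create_window


def app_init(platform):
    """The 'application': it gives up if it cannot get a window."""
    handle = platform.create_window(1920, 1080)
    if handle is None:
        return "init failed: no desktop"
    return f"init ok: window {handle}"


platform = ContainerPlatform()
before = app_init(platform)

# Detour: swap the platform entry point for the virtualized version.
windows = {}
platform.create_window = make_virtual_create_window(windows)
after = app_init(platform)

print(before)  # init failed: no desktop
print(after)   # init ok: window 1
```

The application code is identical in both runs. Only the function it ends up calling has changed.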
The Window Management component:
This handles window creation, window size, minimization and maximization, window position and other events like keyboard and mouse. It also handles the capabilities reported by the platform.
By virtualizing the Windows APIs responsible for this and synchronizing the states and capabilities between Server and Client using Polystream’s command-streaming technology (our secret sauce), any application that starts in a container can now create a window as big as the Client requests (this includes 16:9, 21:9, 32:9, 4K or any possible size or aspect ratio).
When we refer to supported resolutions, desktop size, monitor count, etc., remember that a container doesn’t know anything about resolutions, as it has no valid desktop. The capabilities are gathered on the Client machine and sent to the Server container via command-streaming. As a result, when the application asks about capabilities, we report the ones from the Client machine.
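The capability flow described above can be sketched roughly as follows. This is an illustration under assumptions, not our wire format: the `DisplayCaps` fields, the JSON encoding and the `VirtualDisplayApi` class are all hypothetical names chosen for the example.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DisplayCaps:
    """Hypothetical set of display capabilities gathered on the client."""
    width: int
    height: int
    refresh_hz: int
    monitor_count: int


def encode_caps(caps):
    """Client side: serialise the capabilities for sending to the server."""
    return json.dumps(asdict(caps)).encode("utf-8")


def decode_caps(payload):
    """Server side: rebuild the capabilities from the received bytes."""
    return DisplayCaps(**json.loads(payload.decode("utf-8")))


class VirtualDisplayApi:
    """Server side: answers the application's queries with client values."""

    def __init__(self):
        self._caps = None

    def update_from_client(self, payload):
        self._caps = decode_caps(payload)

    def get_resolution(self):
        if self._caps is None:
            raise RuntimeError("no client capabilities received yet")
        return (self._caps.width, self._caps.height)

    def get_monitor_count(self):
        if self._caps is None:
            raise RuntimeError("no client capabilities received yet")
        return self._caps.monitor_count


# Client gathers its capabilities and sends them across.
payload = encode_caps(DisplayCaps(3440, 1440, 100, 1))

# Server container reports them when the application asks.
display = VirtualDisplayApi()
display.update_from_client(payload)
print(display.get_resolution())     # (3440, 1440)
print(display.get_monitor_count())  # 1
```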
Keyboard and Mouse input:
In a typical Windows application, input can be handled in two ways: via the DirectInput API or via the Window Procedure. A container doesn’t have a mouse pointer, so using SendInput via WinApi will not work. To handle mouse events properly, we must understand how Windows actually generates them and how each of the two modes works. When the user clicks the mouse at X, Y coordinates in an application, the Window Procedure receives a message from the Operating System saying that a mouse event happened at that location. These events arrive as WM_ messages, which the window receives and the Window Procedure function decodes. By grabbing these messages on the client and sending them to the application running in a container, we can replicate the whole input pipeline with 100% accuracy.
Audio:
Windows Audio support consists of different APIs that can be used, such as DirectSound or XAudio. Behind them sits the Windows Core Audio Interface. Since a container doesn’t have an Audio Endpoint, any application that runs in a container will fail to initialise the Audio System and, as a result, will not generate any sound. The solution, again, is to virtualize the Core Audio Interface and the other APIs like XAudio. This way, the application running in the container has full knowledge of the Audio Endpoint and capabilities of the Client machine.
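As a rough illustration of the Window Procedure path, the sketch below packs a WM_ message on the client and replays it on the server. The message identifiers and the low-word/high-word coordinate packing match the Windows headers. Everything else is an assumption made for the example: the `WIRE` layout is a made-up wire format, and `window_procedure` is a stand-in for the application’s real WndProc.

```python
import struct

# Real Windows message identifiers and flags (from WinUser.h).
WM_KEYDOWN = 0x0100
WM_LBUTTONDOWN = 0x0201
MK_LBUTTON = 0x0001

# Hypothetical wire format: message id, wParam, lParam (little-endian).
WIRE = struct.Struct("<IQq")


def make_mouse_lparam(x, y):
    """Pack coordinates like MAKELPARAM: x in the low word, y in the high."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)


def split_mouse_lparam(lparam):
    """Unpack like GET_X_LPARAM / GET_Y_LPARAM (signed 16-bit values)."""
    return struct.unpack("<hh", struct.pack("<I", lparam & 0xFFFFFFFF))


def encode_message(msg, wparam, lparam):
    """Client side: turn a captured message into bytes."""
    return WIRE.pack(msg, wparam, lparam)


def decode_message(data):
    """Server side: recover (msg, wparam, lparam) from the bytes."""
    return WIRE.unpack(data)


def window_procedure(msg, wparam, lparam, log):
    """Stand-in for the application's WndProc inside the container."""
    if msg == WM_LBUTTONDOWN:
        log.append(("click", *split_mouse_lparam(lparam)))
    elif msg == WM_KEYDOWN:
        log.append(("key", wparam))
    return 0


# Client: capture two messages. 0x41 is the virtual-key code for 'A';
# the key message's lParam flags are left at 0 to keep the example short.
packets = [
    encode_message(WM_LBUTTONDOWN, MK_LBUTTON, make_mouse_lparam(640, 360)),
    encode_message(WM_KEYDOWN, 0x41, 0),
]

# Server: replay them into the application's window procedure.
events = []
for packet in packets:
    window_procedure(*decode_message(packet), events)

print(events)  # [('click', 640, 360), ('key', 65)]
```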
Hardware Accelerated Graphics:
There are multiple APIs for graphics rendering, some specific to the Windows platform, like DirectX, and others cross-platform, like OpenGL. To get an application to run in a container, the whole graphics API the application uses needs to be fully virtualized. As with audio, the container application must have full knowledge of the Client’s graphics card capabilities, resolution, etc.
Figuring out next steps
So far so good, but how do all of these components tie together?!
All of these components need to be ready before a Windows application starts and asks the OS about hardware capabilities. We’ve found there are different ways to accomplish this:
- The classic one is to build each virtualized API as a Dynamic-Link Library (DLL), place the DLLs next to the application executable, and rely on the Dynamic-Link Library Search Order mechanism present in Windows. However, this method fails for some of the components and is not 100% reliable.
- Since we are running inside a container, another approach would be to replace the Windows System DLLs with our own. However, this approach has disadvantages as well.
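To illustrate the first option, here is a simplified Python model of the documented search behaviour. It leaves out several real-world details (for example DLL redirection and manifests), and the directory contents and the known-DLL list are made-up examples. One documented behaviour it does capture is that DLLs on the system’s Known DLLs list are always taken from the system copy, so a replacement placed next to the executable is ignored. That is one plausible reason the approach can fail for some components, though we are not claiming it is the only one.

```python
# Simplified model of the standard Windows DLL search order.
SEARCH_ORDER = [
    "application_dir",
    "system_dir",
    "windows_dir",
    "current_dir",
    "path_dirs",
]


def resolve_dll(name, contents, known_dlls):
    """Return the location a DLL would load from, or None if not found."""
    name = name.lower()
    # Known DLLs skip the search entirely: the system copy is used.
    if name in known_dlls:
        return "system_dir"
    for location in SEARCH_ORDER:
        if name in contents.get(location, set()):
            return location
    return None


# Example setup (assumed): replacements sit next to the executable.
contents = {
    "application_dir": {"d3d11.dll", "user32.dll"},  # our replacement DLLs
    "system_dir": {"d3d11.dll", "user32.dll"},       # the genuine system DLLs
}
known = {"user32.dll"}  # example known-DLL list

print(resolve_dll("d3d11.dll", contents, known))   # application_dir
print(resolve_dll("user32.dll", contents, known))  # system_dir
```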
With these less-than-optimal solutions in mind, we decided to try a different approach. Enter…payload injection.
The better way to do it is to use a payload injection method during mainCRTStartup(), before the application starts executing the main() function.
This method comes with a few key advantages:
- Per-application virtualized APIs
- Runtime capabilities are initialised before the main application even starts
- A fully reliable and deterministic outcome (the virtualized APIs are always initialised in the same, correct order, unlike when relying on the Windows Dynamic-Link Library Search Order mechanism)
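The ordering guarantee can be modelled with a short sketch. This is not real injection at mainCRTStartup(): `launch`, `PAYLOAD` and `app_main` are hypothetical names, and the sketch only models the idea that a per-application set of virtualized APIs is installed in a fixed order before the application’s entry point runs.

```python
def launch(app_main, payload):
    """Install each virtualized API in order, then hand control to the app."""
    api_table = {}   # per-application table of virtualized APIs
    init_order = []  # records the order in which components came up
    for name, factory in payload:
        api_table[name] = factory()
        init_order.append(name)
    # Only now does the application's own entry point run.
    return init_order, app_main(api_table)


# Hypothetical payload: the list order is the initialisation order.
PAYLOAD = [
    ("window", lambda: {"size": (1920, 1080)}),
    ("audio", lambda: {"channels": 2}),
    ("graphics", lambda: {"adapter": "client GPU"}),
]


def app_main(api):
    """Stand-in for the application: it needs every component at start-up."""
    required = ["window", "audio", "graphics"]
    missing = [name for name in required if name not in api]
    return "started" if not missing else f"failed: missing {missing}"


order, status = launch(app_main, PAYLOAD)
print(order)   # ['window', 'audio', 'graphics']
print(status)  # started
```

Because the payload is an ordered list owned by the launcher, the same components come up in the same order on every run, whatever else is installed on the machine.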
Docker Windows Container Setup
We’d overcome one challenge already. But it was also essential that we select the right Windows Container base image.
A Windows Server image doesn’t support a lot of the components required by an interactive application. For example, there is no audio support, and DirectX support is minimal. On previous versions of Windows Server, the Media Foundation feature could be installed, which would allow a Windows application to use XAudio or XInput, for example. So, to run an interactive Windows application in a Windows Container, a Windows 10 container image must be used as a starting point. Besides this, other redistributable packages also need to be set up, like Visual C++, DirectX, etc. Once all the prerequisites are in place, injecting and virtualizing the interfaces gives us the ability to fully run an interactive Windows application in a Windows Container.
From starting the application in a Windows Container
To connecting the Client on another PC
To getting the application streamed from the Windows Container to the Client PC
Throughout this process, we encountered and overcame an array of technical challenges, working together to figure out a viable approach. We confirmed that the important, hard work happens at the API level: how each API is virtualized, how data is compressed and sent, and how the server and client machines are kept in sync.
Being able to run the streamed application from a Docker Container in this way provides some key benefits:
- Complete isolation from the machine on which the Docker image runs
- Ability to run multiple concurrent Docker containers on the same cloud machine, with deterministic runs each time
- Ability to scale to many thousands more streams from any machine in the cloud, from any cloud vendor
In short, our experiment was incredibly useful – helping us to explore and iterate on new ways to bring the world closer to the metaverse & innovate the impossible!