The big new feature in the Triton Ocean SDK 4.0 release is support for multi-threaded rendering in OpenGL and DirectX 11. With virtual reality becoming more prevalent in training and simulation, we can no longer ignore the need to render two (or more) views at once as quickly as possible!
While you may continue to use Triton 4 in exactly the same way as you did in Triton 3, taking advantage of its ability to build up display command lists in separate concurrent threads for each view will require some changes to your code. You’ll find some new sample applications in the SDK illustrating multi-threaded rendering in DirectX and OpenGL using several different techniques. And, here’s the new section in our documentation that talks about the details:
Multi-threaded rendering with Triton
Prior to Triton 4.0, to support multi-threaded environments, the user had to specify the following flag in the config file, and do the drawing as usual:
thread-safe = yes
This is a very naive approach, however: it effectively places a top-level mutex around Triton, forcing its per-view work and the rendering itself to run serially.
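To make the cost of a top-level lock concrete, here is a minimal sketch. CoarselyLockedRenderer and MeasureMaxConcurrency are hypothetical stand-ins written for this illustration, not Triton code; the only assumption is that one mutex guards the entire draw call, as described above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a library guarded by a single top-level lock.
// This is NOT Triton code; it only illustrates the serialization effect.
class CoarselyLockedRenderer {
public:
    void DrawView(int /*viewIndex*/) {
        // Every caller, from any thread, queues up on the same mutex.
        std::lock_guard<std::mutex> lock(topLevelMutex_);
        ++active_;
        maxActive_ = std::max(maxActive_, active_);
        // Simulate some per-view work while holding the lock.
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
        --active_;
    }

    // Call only after all drawing threads have been joined.
    int MaxConcurrentDraws() const { return maxActive_; }

private:
    std::mutex topLevelMutex_;
    int active_ = 0;     // draws in progress (only touched under the lock)
    int maxActive_ = 0;  // highest value active_ ever reached
};

// Launches one thread per view and reports how many draws ever overlapped.
int MeasureMaxConcurrency(int viewCount) {
    CoarselyLockedRenderer renderer;
    std::vector<std::thread> threads;
    for (int i = 0; i < viewCount; ++i) {
        threads.emplace_back([&renderer, i] { renderer.DrawView(i); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return renderer.MaxConcurrentDraws();
}
```

However many view threads you launch, the draws never overlap, so the extra threads buy nothing.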
Starting with Triton 4.0, Triton supports fully featured multi-threaded rendering for both DirectX 11 and OpenGL. But to leverage this the application must be carefully designed and the new APIs used correctly and in the correct order.
This is best explained using an example. Let’s say that you have a VR application with left and right views, where each camera is a slightly different frustum corresponding to the left and right eye.
The first order of business is for Triton to generate the wave geometry and wake effects for the ocean. This requires a call to Ocean::UpdateSimulation, which must happen in the main thread, where the DirectX 11 device/immediate context or the OpenGL context is current.
The appearance of local waves may change if the wind conditions differ between views. But for VR, the views are close enough to be considered to be at the same location for our purposes – so you can get away with calling Ocean::UpdateSimulation to update the waves only once per frame. UpdateSimulation() requires a Camera parameter; you can just pass in either the left or right view’s Camera object here, or create a dummy Camera positioned between the two eyepoints.
Following this, call Ocean::UpdateSimulation() and then Ocean::DrawConcurrent() for the left camera, and do the same for the right camera. These subsequent calls to UpdateSimulation() won’t recompute the waves, but will recompute the wake effects specific to each view. Each view’s pair of calls can run in a completely separate thread, and there is no limit on how many views you can render simultaneously (this example only has two).
Following this, you need to call Ocean::PostDrawConcurrent(), in the main thread.
The sequence of calls therefore is:
Ocean::UpdateSimulation(time, main camera/viewing frustum) // in main thread
Ocean::UpdateSimulation(time, left camera)
Ocean::DrawConcurrent(left camera) // can happen in a completely different thread
Ocean::UpdateSimulation(time, right camera)
Ocean::DrawConcurrent(right camera) // can happen in a completely different thread
Ocean::PostDrawConcurrent() // in main thread, after both view threads are done calling DrawConcurrent()
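As a rough illustration of this ordering, the sketch below drives the sequence with std::thread. Camera and OceanStandIn are hypothetical stand-ins that only log which call was made; the real Triton classes take additional parameters (including the context described later), so treat the signatures here as placeholders and check the SDK headers and samples for the actual ones.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins: the real Triton Camera and Ocean classes have
// different, richer signatures. These only record the order of calls.
struct Camera {
    std::string name;
};

class OceanStandIn {
public:
    void UpdateSimulation(double /*time*/, const Camera& cam) { Log("Update:" + cam.name); }
    void DrawConcurrent(const Camera& cam) { Log("Draw:" + cam.name); }
    void PostDrawConcurrent() { Log("PostDraw"); }

    std::vector<std::string> Calls() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return calls_;
    }

private:
    void Log(const std::string& entry) {
        std::lock_guard<std::mutex> lock(mutex_);
        calls_.push_back(entry);
    }

    mutable std::mutex mutex_;
    std::vector<std::string> calls_;
};

// Runs one stereo frame and returns the calls in the order they happened.
std::vector<std::string> RenderStereoFrame(double time) {
    OceanStandIn ocean;
    Camera mainCam{"main"}, left{"left"}, right{"right"};

    // 1. Main thread: update the shared wave simulation once per frame.
    ocean.UpdateSimulation(time, mainCam);

    // 2. One thread per view: per-view update, then record the draw.
    auto renderView = [&ocean, time](const Camera& cam) {
        ocean.UpdateSimulation(time, cam);
        ocean.DrawConcurrent(cam);
    };
    std::thread leftThread([&] { renderView(left); });
    std::thread rightThread([&] { renderView(right); });

    // 3. Wait for every view thread before the post-draw step.
    leftThread.join();
    rightThread.join();

    // 4. Main thread again, after all DrawConcurrent calls have finished.
    ocean.PostDrawConcurrent();
    return ocean.Calls();
}
```

The left and right entries may interleave in any order from run to run; the only fixed points are the main-camera update at the start and PostDrawConcurrent at the end.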
If you have completely disparate views (e.g. separate windows with viewpoints that are many kilometers apart), and these viewpoints may have different wind conditions or different sets of nearby wakes, then the strategy of sharing a single Ocean geometry between your views no longer works well. You can still call Ocean::DrawConcurrent() from separate threads using unique Camera objects to simulate and render the Ocean for each view concurrently, but you’ll want to use a separate Ocean instance for each view in this case.
Besides the Camera, one needs to pass in an additional ‘context’ parameter. Depending on whether we are using an OpenGL or DirectX 11 renderer, this means different things, and we will describe this next.
DirectX 11 multi-threaded rendering:
A key observation about rendering is that generating commands for the GPU and actually executing them on the GPU can effectively be decoupled.
Therefore, the command list generation for each different view can proceed in completely different threads. After all command lists for each view are generated, they can be rapidly executed on the GPU on the main thread.
To this end, DirectX 11 leverages ‘deferred contexts’ and ‘command lists’:
In DirectX 11, you would create a ‘deferred context/command list’ pair for each view/thread in question and pass it in to the call to Ocean::DrawConcurrent, along with the camera for that view/thread. When Ocean::DrawConcurrent is called with a valid DirectX 11 deferred context and camera, Triton will generate the correct set of commands and append them to the context/command list passed in, but will not actually do the rendering. Following this, you ‘execute’ the command list using the appropriate DirectX 11 API.
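The record-then-execute split can be sketched without any graphics API at all. In the portable illustration below, CommandList, RecordViewCommands and RecordThenExecute are hypothetical names invented for this sketch. In real Direct3D 11 code the corresponding steps are ID3D11Device::CreateDeferredContext, ID3D11DeviceContext::FinishCommandList on the deferred context, and ID3D11DeviceContext::ExecuteCommandList on the immediate context; how Triton wires these up is shown in the SDK sample rather than here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// A recorded command is just a callable to be run later on the main thread.
// The vector<string> parameter stands in for the GPU: it logs what was run.
using Command = std::function<void(std::vector<std::string>&)>;
using CommandList = std::vector<Command>;

// Hypothetical stand-in for drawing into a deferred context:
// it appends commands to the list but executes nothing.
void RecordViewCommands(const std::string& viewName, CommandList& out) {
    out.push_back([viewName](std::vector<std::string>& gpuLog) {
        gpuLog.push_back(viewName + ":bind");
    });
    out.push_back([viewName](std::vector<std::string>& gpuLog) {
        gpuLog.push_back(viewName + ":draw");
    });
}

// Records one command list per view in parallel, then executes them in order.
std::vector<std::string> RecordThenExecute(const std::vector<std::string>& views) {
    // One command list per view, so worker threads never share a list.
    std::vector<CommandList> lists(views.size());

    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < views.size(); ++i) {
        workers.emplace_back([&views, &lists, i] {
            RecordViewCommands(views[i], lists[i]);
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }

    // Back on the calling ("main") thread: execute every list serially.
    std::vector<std::string> gpuLog;
    for (const auto& list : lists) {
        for (const auto& command : list) {
            command(gpuLog);
        }
    }
    return gpuLog;
}
```

Because each worker owns its own list, recording needs no locking; only the final execution pass is serialized.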
We have a fully featured sample DirectX11MultiThreadedSample that demonstrates this multi-threaded rendering for DirectX 11.
OpenGL multi-threaded rendering:
OpenGL rendering follows the same paradigm as DirectX 11. Unfortunately, at this time, the OpenGL specification does not include command lists. Vendor-specific APIs/extensions are available, however, the most complete being Nvidia’s:
We however did not want to tie ourselves to a specific vendor/extension, therefore we created our own abstract API along the lines of DirectX 11 context/command list API. In OpenGL a ‘context’ means a completely different thing however (meaning the actual context for the window/offscreen buffer), so we call our command list/contexts for each view/thread an ‘OpenGL Stream’.
Similar to DirectX 11, we create an OpenGL Stream for each view/thread in question using Environment::CreateOpenGLStream. Once the stream is created (and prior to any drawing), you must call Ocean::Initialize with the stream and the camera in question as parameters, indicating your intent to render with this stream and camera. This call must happen where an OpenGL context is current. Following this, the stream pointer and camera are what you pass in as parameters to Ocean::DrawConcurrent. These calls can then proceed in completely different threads (an actual OpenGL context is not even required in each thread, since the Ocean::DrawConcurrent call is effectively just appending commands to the OpenGL stream). Following this, we execute each of the streams using Environment::ExecuteOpenGLStream. Again, execution must happen where there is a current OpenGL context.
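The thread-affinity rules above (initialize and execute where the context is current, record anywhere) can be sketched as follows. StreamStandIn and RunStreamLifecycle are hypothetical, written only for this illustration; they are not Triton’s stream type, and the "context thread" check is an assumption standing in for "an OpenGL context is current on this thread".

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an 'OpenGL Stream'. Not Triton's actual type.
class StreamStandIn {
public:
    explicit StreamStandIn(std::thread::id contextThread)
        : contextThread_(contextThread) {}

    // Must run on the thread where the (simulated) GL context is current.
    void Initialize() {
        RequireContextThread("Initialize");
        initialized_ = true;
    }

    // May run on any thread: it only appends to the stream.
    void Record(const std::string& command) {
        if (!initialized_) {
            throw std::logic_error("Record called before Initialize");
        }
        commands_.push_back(command);
    }

    // Must run on the context thread; returns how many commands were executed.
    std::size_t Execute() {
        RequireContextThread("Execute");
        const std::size_t executed = commands_.size();
        commands_.clear();
        return executed;
    }

private:
    void RequireContextThread(const char* what) const {
        if (std::this_thread::get_id() != contextThread_) {
            throw std::logic_error(std::string(what) + " needs the context thread");
        }
    }

    std::thread::id contextThread_;
    bool initialized_ = false;
    std::vector<std::string> commands_;
};

// Create and initialize on the context thread, record in workers,
// then execute on the context thread. Returns the total commands executed.
std::size_t RunStreamLifecycle(std::size_t viewCount) {
    const std::thread::id contextThread = std::this_thread::get_id();
    std::vector<StreamStandIn> streams(viewCount, StreamStandIn(contextThread));
    for (auto& stream : streams) {
        stream.Initialize();
    }

    // One worker per stream, so no stream is shared between threads.
    std::vector<std::thread> workers;
    for (auto& stream : streams) {
        workers.emplace_back([&stream] { stream.Record("draw ocean"); });
    }
    for (auto& worker : workers) {
        worker.join();
    }

    std::size_t executed = 0;
    for (auto& stream : streams) {
        executed += stream.Execute();
    }
    return executed;
}
```

Recording before initialization throws in this sketch, mirroring the requirement that Ocean::Initialize be called on a stream before any drawing.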
We have a fully featured sample OpenGLMultiThreadedSample that demonstrates this multi-threaded rendering for OpenGL. There are various draw strategies that you can use. Please refer to SampleDeclares.h for additional information. Project files for Visual Studio 2013 and 2015/2017 are included; for Linux, refer to the README_LINUX file for build guidance.
One important thing to note is that you don’t want to actually render/execute the streams in multiple threads, each thread having its own OpenGL context, because internally the GPU is going to serialize the calls anyway, and there is no performance improvement. If anything, performance degrades because of OpenGL context switches and the like. Nonetheless, some rendering engines out there (e.g. OpenSceneGraph) do support multi-context, threaded rendering, and for completeness we have provided a code path/draw strategy that demonstrates how to do multi-threaded rendering when there are actual multiple OpenGL contexts. Please also note that we currently only support shared OpenGL contexts, that is to say contexts that share resources (textures, vertex buffers, etc.), which is ideally what you want to do anyway.
Lastly, and quite importantly, this flag:
thread-safe = yes
is effectively ignored when Ocean::DrawConcurrent is used. Triton internally is completely thread safe as long as you create/pass in the correct DirectX 11 deferred contexts/OpenGL streams, and cameras for the thread(s) in question as demonstrated in the DirectX 11 and OpenGL samples. You would also want to remove any additional mutexes in your application code and adjust the calls to Triton accordingly.