10: The OpenGL Pipeline
03: Vulkan Design
Constraints
COMP5822M: High Performance Graphics
Vulkan Basics
The design of Vulkan is explicit:
It does NOT track state of objects
It does NOT manage memory for you
It does NOT handle synchronization
It will NOT check for errors at runtime
Which means more work for the programmer
Vulkan Extension
OpenGL kept being modded and extended
So it built up lots of cruft
Vulkan assumes it will be extended
Nearly everything is treated as an extension
So you have to check even for the basics
Even presentation on screen is an extension
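Checking "even for the basics" usually comes down to comparing the names you need against the names the implementation reports. A minimal sketch, assuming the available names have already been collected into a list (in real code they would come from vkEnumerateInstanceExtensionProperties); the helper name `allExtensionsAvailable` is hypothetical:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: true if every required extension name is in the available list.
// `available` stands in for the names reported by the implementation.
bool allExtensionsAvailable(const std::vector<std::string>& available,
                            const std::vector<std::string>& required)
{
    return std::all_of(required.begin(), required.end(),
        [&](const std::string& name) {
            return std::find(available.begin(), available.end(), name) != available.end();
        });
}
```

Presentation is a typical case: an application that wants to draw to a window would put VK_KHR_surface in `required` before creating the instance.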
Vulkan Elements
An instance stores library state
A shader implements a core loop
A pipeline controls the shaders
A render pass builds up an image
A command buffer holds an execution instance
A queue represents execution resources
A swap chain captures the output for display
Instance Logic
OpenGL had many state variables
But they were global
And not re-entrant
So handling two renders becomes tricky
An instance abstracts the state
And stores it in a separate object
So we can have multiple renders at once
Shader Construction
Last week, we had three core loops
Each one iterated over a stream of input
So a shader just generalises this
And expresses a single core loop
Each shader is a parallel construct
Many instances run at once
There is an implicit “for each” call for them
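One way to picture the implicit "for each": the shader is the loop body, written for a single element, and something else applies it to the whole stream. A minimal CPU-side sketch with hypothetical names (`Vec2`, `scaleShader`, `runShader`); it runs sequentially, whereas the GPU would run many instances at once:

```cpp
#include <algorithm>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical "shader": the body of one loop iteration, for a single input element.
Vec2 scaleShader(Vec2 v) { return { v.x * 2.0f, v.y * 2.0f }; }

// The implicit "for each": apply the shader to every element of the input stream.
// std::transform runs the iterations one after another; a GPU runs them in parallel.
template <typename Shader>
std::vector<Vec2> runShader(const std::vector<Vec2>& input, Shader shader)
{
    std::vector<Vec2> output(input.size());
    std::transform(input.begin(), input.end(), output.begin(), shader);
    return output;
}
```

Because the shader only ever sees one element, no iteration depends on another, which is what makes the parallel execution safe.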
Pipeline Construction
The shaders still need to be invoked
Their I/O buffers need to be set up
And pools of (hardware) threads to run them
Plus the remaining fixed functionality
All of which is encapsulated in a pipeline
Which allows us to keep several around
It’s just an object with state
Render Pass
These days, we do multipass rendering
E.g. background, fog, characters, &c.
So we run multiple pipelines in sequence
We call each run a subpass
And the whole collection is a render pass
An abstraction of a sequence of related tasks
Batch Processing
In the old days . . .
You wrote your code
You prepared your data
You submitted a job to a queue
And they ran it when they had time
Eventually, you picked up your output
We still do this, but it’s hidden and faster
Jobs in Vulkan
The queue still exists (and is called that)
The job is called a command buffer
A single run of a render pass
Producing output to be picked up
So we have to construct these as well
And submit the command buffer to the queue
Then wait for it to be picked up
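The job/queue relationship can be sketched without any graphics at all. This is a toy model, not the Vulkan API: `Command`, `CommandBuffer` and `Queue` are hypothetical stand-ins, and `drain()` plays the part of the device working through its backlog:

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a command is any callable,
// a command buffer is a recorded list of commands.
using Command = std::function<void()>;
using CommandBuffer = std::vector<Command>;

class Queue {
public:
    // Submitting only enqueues the job; nothing runs yet.
    void submit(CommandBuffer buffer) { pending_.push(std::move(buffer)); }

    // Stand-in for the device executing jobs in submission order.
    void drain()
    {
        while (!pending_.empty()) {
            for (const Command& command : pending_.front()) command();
            pending_.pop();
        }
    }

private:
    std::queue<CommandBuffer> pending_;
};
```

The point to notice is the gap between `submit` and the work actually happening, which is why the real API needs explicit synchronisation before the output is picked up.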
Double Buffering
We render to an image
Then we show it on screen
This is NOT instantaneous
Video hardware reads the image out a little at a time
During which we mustn’t mutate it
Or else it changes halfway through (tearing)
So we keep two images and alternate
Double Buffer Pseudocode
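A minimal sketch of the alternation, using hypothetical names (`Image`, `DoubleBuffer`) and an integer frame number in place of real pixel data:

```cpp
#include <array>
#include <cstddef>
#include <utility>

struct Image { int frameNumber = -1; };  // -1 means "nothing rendered yet"

struct DoubleBuffer {
    std::array<Image, 2> images;
    std::size_t front = 0;  // image currently being displayed
    std::size_t back = 1;   // image currently being rendered into

    // Rendering only ever touches the back image.
    void render(int frameNumber) { images[back].frameNumber = frameNumber; }

    // Exchange roles; in practice this happens on the vertical sync signal.
    void swap() { std::swap(front, back); }

    const Image& displayed() const { return images[front]; }
};
```

A frame loop would then call `render(n)`, wait for the sync signal, and call `swap()`, so the displayed image is never the one being written.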
Double-Buffering
Swaps on vertical interrupt signal (or G-Sync)
Cost is frame render time + 2x display time
Render time to prepare image
First display time is synchronisation lag
Second display time is actual display cost
120Hz refresh means that you may have:
30 ms render + 2 x 8 ms = 46 ms, or about 22 fps
Or 30 fps, but with 16ms frame lag
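The arithmetic can be written out as a small sketch. The function names are hypothetical, the cost model is the one stated above (render time plus two display periods), and the frame-rate function assumes a new image can only appear on a refresh boundary:

```cpp
#include <algorithm>
#include <cmath>

// One display period in milliseconds, e.g. about 8.33 ms at 120 Hz.
double refreshPeriodMs(double refreshHz) { return 1000.0 / refreshHz; }

// Render time + synchronisation lag + actual display time.
double frameLatencyMs(double renderMs, double refreshHz)
{
    return renderMs + 2.0 * refreshPeriodMs(refreshHz);
}

// A frame occupies a whole number of display periods (at least one),
// so the displayed rate is the refresh rate divided by that count.
double displayedFps(double renderMs, double refreshHz)
{
    double periods = std::max(1.0, std::ceil(renderMs / refreshPeriodMs(refreshHz)));
    return refreshHz / periods;
}
```

With a 30 ms render at 120 Hz this gives a latency of about 46.7 ms (using the unrounded 8.33 ms period) and a displayed rate of 30 fps, since 30 ms rounds up to four display periods.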
Double Buffer Problems
What happens if:
Your frame render is too long?
You miss the frame deadline
The previous frame is displayed a second time
Your frame render is too short?
You skip a frame
So the faster the better
It gets worse . . .
What about stereo / VR?
When do you swap images?
When each image is done? (synch issues)
When both images are done? (lag issues)
What if we want to reuse the last frame?
E.g. to copy the background render?
Distorted to account for movement
The Swap Chain
Instead of two buffers being swapped
We have a swap chain
An arbitrary number of renderable buffers
Inevitably, under programmer control
Which means we need to manage them
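A toy model of "an arbitrary number of renderable buffers" is a ring of image indices. The class below is hypothetical and simplifies in one important way: it hands images out strictly round-robin, whereas the real call (vkAcquireNextImageKHR) makes no such ordering promise:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a swap chain: N images, handed out in rotation.
class SwapChain {
public:
    explicit SwapChain(std::size_t imageCount) : imageCount_(imageCount)
    {
        assert(imageCount >= 2);  // fewer than two images cannot avoid tearing
    }

    // Returns the index of the next image to render into.
    // Assumption: strict round-robin order, which real implementations need not follow.
    std::size_t acquireNextImage()
    {
        std::size_t index = next_;
        next_ = (next_ + 1) % imageCount_;
        return index;
    }

    std::size_t imageCount() const { return imageCount_; }

private:
    std::size_t imageCount_;
    std::size_t next_ = 0;
};
```

With `imageCount` set to 2 this degenerates to ordinary double buffering; 3 gives the usual triple buffering.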
Multithreading in Vulkan
Vulkan assumes external synchronisation
No two threads mutate data at once
No implicit semaphores or mutexes
Thread-safety is the programmer’s job
Some parameters marked for external synch
But semaphores and fences are available
Memory Management
Devices can access host or device memory
Application may manage host memory
And is required to manage device memory
I.e. you have to do your own malloc()/free()
We will need to enumerate device memory
Count it, track it, use it, release it
Host Memory Allocation
Performed by an allocator (i.e. a function)
One allocator per thread
No shared memory between threads (please)
All data on a thread in a shared space
No protection per thread necessary
Which ignores accidental interference
Since CPU only has a single memory space
Pooled Memory Allocation
We often create a pool of objects (an array)
Uninitialised or initialised
Then (re-)use whichever one is free
Allocation very lightweight (per thread)
But programmer takes responsibility
Irritating for small applications
Crucial for large ones
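A minimal sketch of such a pool, assuming a fixed capacity and single-threaded use (one pool per thread, as above). The class name `Pool` and its interface are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size pool: all objects are allocated once, up front,
// and then handed out and returned without touching the system allocator.
template <typename T>
class Pool {
public:
    explicit Pool(std::size_t capacity) : slots_(capacity), inUse_(capacity, false) {}

    // Returns a pointer to a free slot, or nullptr if the pool is exhausted.
    T* acquire()
    {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (!inUse_[i]) {
                inUse_[i] = true;
                return &slots_[i];
            }
        }
        return nullptr;
    }

    // The programmer's responsibility: only release objects acquired from this pool.
    void release(T* object)
    {
        std::size_t index = static_cast<std::size_t>(object - slots_.data());
        assert(index < slots_.size() && inUse_[index]);
        inUse_[index] = false;
    }

private:
    std::vector<T> slots_;
    std::vector<bool> inUse_;
};
```

The linear search keeps the sketch short; a free list would make `acquire` constant-time at the cost of a little more bookkeeping.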
System Resources
The library could track all of your resources
But that wouldn’t fit the Vulkan ideal
So the programmer has to track them
By calling functions to find them
But who owns the information?
We need to retrieve a copy of the information
Which means memory allocation
Object Enumeration
Software pattern then develops
Query how many resources there are
Allocate array of structs for their descriptions
Query again to fill in the array
Which makes a lot of the code repetitive
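The three steps above can be sketched against a mock library function. `enumerateDevices` and `DeviceInfo` are hypothetical; they imitate the calling convention (count pointer plus optional output array) rather than any real Vulkan entry point:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

struct DeviceInfo { std::string name; };

// Hypothetical library function using the two-call convention:
//  - if `out` is null, write the number of available items to *count;
//  - otherwise fill at most *count entries and write back how many were written.
void enumerateDevices(std::uint32_t* count, DeviceInfo* out)
{
    static const std::vector<DeviceInfo> devices = { DeviceInfo{"gpu0"}, DeviceInfo{"gpu1"} };
    const std::uint32_t total = static_cast<std::uint32_t>(devices.size());
    if (out == nullptr) {
        *count = total;
        return;
    }
    *count = std::min(*count, total);
    for (std::uint32_t i = 0; i < *count; ++i) out[i] = devices[i];
}

// The repetitive part: query the count, allocate, query again to fill.
std::vector<DeviceInfo> listDevices()
{
    std::uint32_t count = 0;
    enumerateDevices(&count, nullptr);
    std::vector<DeviceInfo> result(count);
    enumerateDevices(&count, result.data());
    result.resize(count);
    return result;
}
```

The caller owns the array throughout, which answers the "who owns the information?" question: the library only ever writes into memory the application allocated.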
Physical & Logical Devices
We can use a physical device in different ways
So we also have logical devices
Which means we need to set them up
Enumerate physical devices
Choose physical device
Use it to instantiate logical device
C-Style OO Design
A class is just a structure with:
The data members
A part code for the object type
A trap table of member function pointers
Supports runtime polymorphism
Member functions take this/self explicitly
Usually as the first parameter
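A minimal sketch of this layout, written in the C subset of C++. All names (`Shape`, `ShapeDispatch`, `TypeCode`) are hypothetical illustrations, not taken from any library:

```cpp
// Code identifying the concrete type of the object.
enum TypeCode { TYPE_CIRCLE = 1, TYPE_SQUARE = 2 };

struct Shape;

// Table of member function pointers; each takes `self` explicitly as the first parameter.
struct ShapeDispatch {
    double (*area)(const Shape* self);
};

// The "class": type code, table pointer, and data members.
struct Shape {
    TypeCode type;
    const ShapeDispatch* dispatch;
    double size;  // radius for a circle, side length for a square
};

static double circleArea(const Shape* self) { return 3.141592653589793 * self->size * self->size; }
static double squareArea(const Shape* self) { return self->size * self->size; }

static const ShapeDispatch circleDispatch = { circleArea };
static const ShapeDispatch squareDispatch = { squareArea };

Shape makeCircle(double radius) { return { TYPE_CIRCLE, &circleDispatch, radius }; }
Shape makeSquare(double side)   { return { TYPE_SQUARE, &squareDispatch, side }; }

// Runtime polymorphism: the call is routed through the object's own table.
double shapeArea(const Shape* self) { return self->dispatch->area(self); }
```

`shapeArea` does not need to know which kind of shape it was given, which is the same job a virtual function table does in C++.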
OO in Vulkan
Vulkan uses this design pattern
The part code is called ‘sType’
There is often a ‘pNext’ pointer
Which allows linked lists for parameters
There is a dispatch table (i.e. trap table)
A struct with a dispatch table is dispatchable
A struct without it is non-dispatchable
Dispatchable Functions
Most Vulkan functions are members
Which means the principal classes are:
Instance
Physical Device
Logical Device
Command Buffer
Queue
Constructors
How do you initialise an object?
By passing parameters to a function
By writing a constructor function
By writing a factory function
All of these are essentially equivalent
Vulkan uses a factory
E.g. vkCreateInstance()
Creation Parameters
How do we know which values to set?
Vulkan has no default values
Everything has to be set explicitly
Ouch! That would mean functions with many parameters
Instead, Vulkan uses parameter structures
One structure with the relevant parameters
So you write code to fill it in
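A sketch of a factory that takes a parameter structure, including a type code and a chain pointer for optional extras. Everything here is hypothetical (`Widget`, `WidgetCreateInfo`, `DebugInfo`, `createWidget`). One deliberate difference from Vulkan: the `sType`/`pNext` pair is wrapped in a `BaseHeader` member rather than written as two direct fields, so that walking the chain stays well-defined C++:

```cpp
#include <cstdint>

enum StructureType : std::uint32_t {
    STRUCTURE_TYPE_WIDGET_CREATE_INFO = 1,
    STRUCTURE_TYPE_DEBUG_INFO = 2
};

// Common first member of every parameter structure.
struct BaseHeader {
    StructureType sType;  // identifies which structure this is
    const void* pNext;    // next structure in the chain, or nullptr
};

struct DebugInfo        { BaseHeader header; bool verbose; };
struct WidgetCreateInfo { BaseHeader header; std::uint32_t width; std::uint32_t height; };
struct Widget           { std::uint32_t width; std::uint32_t height; bool verbose; };

// Factory function: every value comes from the structure, nothing is defaulted silently
// except the optional chained extras.
bool createWidget(const WidgetCreateInfo* info, Widget* out)
{
    if (info == nullptr || out == nullptr) return false;
    if (info->header.sType != STRUCTURE_TYPE_WIDGET_CREATE_INFO) return false;

    out->width = info->width;
    out->height = info->height;
    out->verbose = false;

    // Walk the chain, using the type code to recognise optional structures.
    for (const void* p = info->header.pNext; p != nullptr; ) {
        const BaseHeader* header = static_cast<const BaseHeader*>(p);
        if (header->sType == STRUCTURE_TYPE_DEBUG_INFO) {
            out->verbose = static_cast<const DebugInfo*>(p)->verbose;
        }
        p = header->pNext;
    }
    return true;
}
```

Adding a new optional parameter later means defining one more chained structure, without changing the signature of `createWidget`.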
Development Code Paths
How do we change how the library works?
E.g. add debugging / logging / profiling
We could add extra code to the library
But this always has a runtime overhead
And requires constant redesign of library
And lots of code paths to test and validate
Function Interception
Another code pattern develops
Remember the dispatch table?
Create a new one and use it instead
Each function can be:
Replaced completely
Or called anyway, with extra tracking
This means we add layers of functionality
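A toy version of interception, assuming a dispatch table with a single entry. The names (`DispatchTable`, `realDraw`, `installLoggingLayer`) are hypothetical and the "layer" here only logs; a validation layer would check arguments at the same point:

```cpp
#include <string>
#include <vector>

// A dispatch table with one entry, for illustration.
struct DispatchTable {
    int (*draw)(int vertexCount);
};

// The underlying implementation: pretend it returns the number of triangles drawn.
static int realDraw(int vertexCount) { return vertexCount / 3; }

// State owned by the layer: the table it forwards to, and its call log.
static DispatchTable nextTable = { realDraw };
static std::vector<std::string> callLog;

// Intercepting function: does its extra tracking, then calls the original anyway.
static int loggingDraw(int vertexCount)
{
    callLog.push_back("draw(" + std::to_string(vertexCount) + ")");
    return nextTable.draw(vertexCount);
}

// Builds a new table that routes through the layer and remembers the old one.
DispatchTable installLoggingLayer(const DispatchTable& original)
{
    nextTable = original;
    return DispatchTable{ loggingDraw };
}
```

Code that calls through the original table pays nothing for the layer, which is why the overhead exists only when the layer is actually installed.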
Validation Layers
Vulkan lets you have validation layers
Including a standard debug layer
But this means extra setup code (again)
Indispensable in practice
After All That . . .
Clearly, we need to set a lot of things up
And we have to do it in the right order
Since some objects depend on others
Then we need to have runtime code
And finally, we need to clean up afterwards
Setup Sequence
Create Window
Set up Instance & Validation Layers
Attach Instance to Surface
Set up Device
Set up Output
Set up Pipeline
Set up Queue
Device & Output Setup
Set up Device
Enumerate Physical Devices
(Optional) Enumerate Memory Properties
Choose Physical Device
Create Logical Device
Set up Output
Create Swap Chain
Allocate Images for Swap Chain
Pipeline Setup
Set up Pipeline
Set up Render Pass
Set up Vertex Buffers
Set up Graphics Pipeline
Create / compile shaders
Configure raster & other fixed stages
Set up Framebuffers from Swap Chain
Queue Setup
Set up Queue
Set up Command Pool & Buffers
Set up Synchronisation Primitives
Render Sequence
Synchronise
Choose Image in Swap Chain
Submit Command Buffer to Queue
Present Image to User
Cleanup Sequence
Clean up Swap Chain
Clean up Synchronisation Primitives
Destroy Command Pool
Destroy Logical Device
Release OS-Specific Resources
Destroy Instance
Release Window
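Since objects depend on the ones created before them, cleanup runs roughly in the reverse of setup order. One common way to keep that order straight is to record a destroy action each time something is created and unwind the record at shutdown. A minimal sketch; the `CleanupStack` class is hypothetical, not part of Vulkan:

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: remembers how to destroy each object, in creation order.
class CleanupStack {
public:
    // Call right after creating an object, passing the matching destroy action.
    void push(std::function<void()> destroy) { actions_.push_back(std::move(destroy)); }

    // Runs the destroy actions last-created-first, leaving the stack empty.
    void unwind()
    {
        while (!actions_.empty()) {
            std::function<void()> action = std::move(actions_.back());
            actions_.pop_back();
            action();
        }
    }

private:
    std::vector<std::function<void()>> actions_;
};
```

Pushing actions for the window, instance, logical device and so on as they are created means the window, created first, is released last.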