10: The OpenGL Pipeline
04: Memory, Queues, Barriers, Presentation
COMP5822M: High Performance Graphics
Overview
Memory Management
Host Memory
Device Memory
Shared (Mapped) Memory
Buffers & Images
Queues & Commands
Data Movement
Host Memory
Two choices
Use the default allocator
Good choice for small projects
Create your own alloc()/realloc()/free()
And pass function pointers to Vulkan
Which will call them when needed
The only serious choice for big projects
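As a rough illustration of the second choice, here is a minimal sketch of a tracking allocator. In real Vulkan the function pointers are handed over in a VkAllocationCallbacks struct (pUserData, pfnAllocation, pfnReallocation, pfnFree, plus two internal notification callbacks). To stay self-contained, the sketch leaves out the Vulkan header, the allocation-scope parameter and the reallocation callback. The names AllocStats, BlockHeader, trackedAlloc and trackedFree are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical bookkeeping, reached through the user-data pointer.
struct AllocStats {
    std::size_t liveAllocations = 0;
    std::size_t liveBytes = 0;
};

// Hidden header stored just before every block handed out.
struct BlockHeader {
    void*       raw;   // pointer originally returned by malloc
    std::size_t size;  // size requested by the caller
};

// Same parameter order as Vulkan's allocation callback, minus the scope enum.
// Assumes userData points at an AllocStats and alignment is a power of two.
void* trackedAlloc(void* userData, std::size_t size, std::size_t alignment) {
    if (size == 0) return nullptr;
    if (alignment < alignof(std::max_align_t)) alignment = alignof(std::max_align_t);

    // Over-allocate so there is room for the header and for aligning upwards.
    void* raw = std::malloc(size + alignment + sizeof(BlockHeader));
    if (raw == nullptr) return nullptr;

    std::uintptr_t start   = reinterpret_cast<std::uintptr_t>(raw) + sizeof(BlockHeader);
    std::uintptr_t aligned = (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);

    BlockHeader* header = reinterpret_cast<BlockHeader*>(aligned) - 1;
    header->raw  = raw;
    header->size = size;

    AllocStats* stats = static_cast<AllocStats*>(userData);
    stats->liveAllocations += 1;
    stats->liveBytes += size;
    return reinterpret_cast<void*>(aligned);
}

// Matches the shape of Vulkan's free callback: freeing a null pointer is a no-op.
void trackedFree(void* userData, void* memory) {
    if (memory == nullptr) return;
    BlockHeader* header = static_cast<BlockHeader*>(memory) - 1;
    AllocStats* stats = static_cast<AllocStats*>(userData);
    stats->liveAllocations -= 1;
    stats->liveBytes -= header->size;
    std::free(header->raw);
}
```

The point of routing everything through your own functions is visibility: a non-zero liveAllocations at shutdown points to a leak.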
Device Memory
Device memory is requested from the device
vkAllocateMemory() / vkFreeMemory()
Returns an opaque handle (VkDeviceMemory), not a CPU pointer
Must bind it to the Vulkan resource
I.e. tell Vulkan where it is
Possible to ask for lazy memory allocation
Especially for images
Device Memory Issues
Device may have allocation limit
i.e. maximum number of memory chunks
And of course you need to query this
Must be externally synchronised
Always flush the pipeline & lock first
Never do this during render run
Always pre-allocate & reuse
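One common way to follow the "pre-allocate & reuse" advice is to make a single large device allocation up front and hand out offsets into it. The sketch below models only the offset bookkeeping. In real code the capacity would be the size passed to vkAllocateMemory(), and each returned offset would go to vkBindBufferMemory() or vkBindImageMemory(). LinearSubAllocator and kOutOfMemory are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Sentinel returned when the pool cannot satisfy a request.
constexpr std::uint64_t kOutOfMemory = ~std::uint64_t{0};

// Hands out aligned offsets inside one big, pre-made allocation.
struct LinearSubAllocator {
    std::uint64_t capacity;  // size of the big allocation, in bytes
    std::uint64_t head = 0;  // first unused byte

    explicit LinearSubAllocator(std::uint64_t bytes) : capacity(bytes) {}

    // Returns the offset of the new block, or kOutOfMemory.
    std::uint64_t allocate(std::uint64_t size, std::uint64_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        // Round the head up to the next multiple of the alignment.
        std::uint64_t offset = (head + alignment - 1) & ~(alignment - 1);
        if (offset > capacity || size > capacity - offset) return kOutOfMemory;
        head = offset + size;
        return offset;
    }

    // Reuse the whole pool, e.g. once nothing on the GPU still uses it.
    void reset() { head = 0; }
};
```

This uses one entry of the device's allocation limit no matter how many resources live in the pool. The size and alignment for each resource would come from its memory requirements query.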
Shared (Mapped) Memory
GPU & CPU can share memory
Especially for integrated chips (e.g. Intel)
Pass Vulkan a device memory handle (from a host-visible memory type)
It will give you the equivalent CPU pointer
Using vkMapMemory()/vkUnmapMemory()
Binding Memory
Software pattern is straightforward:
vkGetImageMemoryRequirements()
vkAllocateMemory() (or host equivalent)
vkBindImageMemory()
vkBindImageMemory() requires external synchronisation of the image
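Between querying the requirements and allocating, you have to pick a memory type. The requirements report a bitmask of acceptable memory type indices, and the physical device reports the property flags of each type. The sketch below shows the usual search, using a stand-in MemoryTypes struct in place of VkPhysicalDeviceMemoryProperties so that it compiles without the Vulkan headers. The flag constants use the same values as VkMemoryPropertyFlagBits; the struct and function names are hypothetical.

```cpp
#include <cstdint>

// Same values as the corresponding VkMemoryPropertyFlagBits.
constexpr std::uint32_t kDeviceLocal  = 0x1;
constexpr std::uint32_t kHostVisible  = 0x2;
constexpr std::uint32_t kHostCoherent = 0x4;

// Stand-in for the memory type table reported by the physical device.
struct MemoryTypes {
    std::uint32_t count;
    std::uint32_t propertyFlags[32];  // Vulkan allows at most 32 memory types
};

// Returns the first memory type that the resource accepts (its bit is set in
// allowedTypeBits) and that has every required property, or -1 if none does.
int findMemoryType(const MemoryTypes& types,
                   std::uint32_t allowedTypeBits,
                   std::uint32_t required) {
    for (std::uint32_t i = 0; i < types.count; ++i) {
        bool allowed  = (allowedTypeBits & (1u << i)) != 0;
        bool suitable = (types.propertyFlags[i] & required) == required;
        if (allowed && suitable) return static_cast<int>(i);
    }
    return -1;
}
```

In real code allowedTypeBits would be the memoryTypeBits field of the requirements, and the returned index would go into the allocation info.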
Types of Resources
Essentially, there are only two types of data
Buffers – i.e. linear arrays
Images – 1D, 2D or 3D arrays of texels
No linked lists, no heaps, no priority queues
Because these have horrible access patterns
So GPUs don’t use them
Buffer Flags
Buffers have all sorts of flags:
Source/Destination for transfers
Usable as texel buffers (for shaders)
Usable as uniform storage
Usable as index/vertex buffers
Usable as indirect buffers
Essentially, type information for the buffer
Buffer Sharing Mode
All buffers have a “sharing mode” flag:
EXCLUSIVE – owned by one queue family at a time
CONCURRENT – may be accessed from multiple queue families
Will be lower-performance (sometimes)
Still no guarantees on synchronisation
You have to handle this yourself
So default to EXCLUSIVE
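Vulkan defines the sharing mode in terms of queue families. A small helper can therefore default to EXCLUSIVE and fall back to CONCURRENT only when a resource really is used from more than one family. The sketch below is one simple policy, not the only one: staying EXCLUSIVE and transferring ownership explicitly is the alternative. SharingMode, SharingChoice and chooseSharing are hypothetical stand-ins for the Vulkan enum and create-info fields.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

enum class SharingMode { Exclusive, Concurrent };

struct SharingChoice {
    SharingMode mode;
    std::vector<std::uint32_t> queueFamilies;  // only meaningful for Concurrent
};

// Takes the queue family index of every queue that will touch the resource.
SharingChoice chooseSharing(std::vector<std::uint32_t> families) {
    // Duplicate indices are common, e.g. graphics and present on one family.
    std::sort(families.begin(), families.end());
    families.erase(std::unique(families.begin(), families.end()), families.end());

    if (families.size() <= 1) return {SharingMode::Exclusive, {}};
    return {SharingMode::Concurrent, families};
}
```

Either way, the sharing mode grants no synchronisation: barriers, semaphores or fences are still your job.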
Image Options
Images have even more flags
One flag for each role an image has
Including atomic read/write access
Images can be stored in many formats
Including compressed formats
For now, stick to uncompressed for writing
Use compressed formats for reading only
Image Properties
Images are created for a specific device
i.e. the device is required to create one
Must specify:
Extent (size)
Mipmapping (if desired)
Gamma Encoding (ditto)
Even have the option of sparse storage
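If mipmapping is wanted, the number of levels has to be given when the image is created. For a full chain down to 1x1 the count is floor(log2(max(width, height))) + 1, and each level halves the extent, clamped at 1. A minimal sketch, with hypothetical function names:

```cpp
#include <algorithm>
#include <cstdint>

// Number of mip levels in a full chain for a 2D image: keep halving the
// larger dimension until it reaches zero and count the steps.
std::uint32_t mipLevelCount(std::uint32_t width, std::uint32_t height) {
    std::uint32_t largest = std::max(width, height);
    std::uint32_t levels = 0;
    while (largest > 0) {
        ++levels;
        largest >>= 1;
    }
    return levels;
}

// Size of one dimension at a given mip level; never drops below 1.
// Assumes level is smaller than 32.
std::uint32_t mipExtent(std::uint32_t baseExtent, std::uint32_t level) {
    return std::max<std::uint32_t>(1u, baseExtent >> level);
}
```

For a 1024x512 image this gives 11 levels, the last being 1x1.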
Queues & Commands
A queue is an abstraction of a processor core
And associated compute resources
They are grouped in queue families
Sets of logically identical cores
With no cost to transfer work
Except for synchronisation
Effectively, an abstraction of warps
Command Buffers
In essence, a recorded sequence of operations
But it’s tied to a specified framebuffer
Which means reuse is awkward
Can only reuse with same framebuffer
Net result: assume you release & recreate
Each frame starts it all over
Later on, you can get more sophisticated
Command Buffers, II
Work recorded to command buffer
vkBeginCommandBuffer() … vkEnd…
A sequence of opcodes / instructions
Access must be externally synchronised
So usually one command buffer per thread
i.e. recording actually happens on the CPU
Buffer Submission
Record Command Buffer on CPU
Submit to Vulkan queue for processing
Opcodes / instructions transfer to GPU
Then execute on next available core
Release / Reset Command Buffer when done
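The record / submit / reset cycle can be pictured as a small state machine. The sketch below is a toy model, not Vulkan code. The state names follow the command buffer lifecycle in the Vulkan specification (initial, recording, executable, pending), and the invalid state is left out for brevity. CommandBufferModel and its methods are hypothetical.

```cpp
enum class CmdState { Initial, Recording, Executable, Pending };

// Toy model of one command buffer; each call returns false if it would be
// illegal in the current state.
struct CommandBufferModel {
    CmdState state = CmdState::Initial;

    bool begin()    { return transition(CmdState::Initial, CmdState::Recording); }
    bool end()      { return transition(CmdState::Recording, CmdState::Executable); }
    bool submit()   { return transition(CmdState::Executable, CmdState::Pending); }
    // Called once the GPU has finished, e.g. after waiting on a fence.
    bool complete() { return transition(CmdState::Pending, CmdState::Executable); }

    // Resetting is not allowed while the GPU may still be executing.
    bool reset() {
        if (state == CmdState::Pending) return false;
        state = CmdState::Initial;
        return true;
    }

private:
    bool transition(CmdState from, CmdState to) {
        if (state != from) return false;
        state = to;
        return true;
    }
};
```

The rule that matters most in practice is the one in reset(): wait until the submitted work has completed before you release or reset.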
Data Barriers
One of the biggest synchronisation problems
How do you make sure a read/write is safe?
By using barriers
Synchronisation stages in the pipeline
Force all processing to stall at a given stage
Until previous stage is complete
Types of Barrier
Global Memory Barrier
Specify the two pipeline stages the barrier sits between
Prevents out-of-sequence memory access
But very expensive
Image / Buffer Barrier
Similar barrier for a particular image/buffer
Use sparingly
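A useful rule of thumb for "use sparingly": two accesses to the same resource need ordering only if at least one of them is a write. The sketch below classifies the classic hazards. It is a simplified model that ignores pipeline stages and image layouts; Access, Hazard, classify and needsBarrier are hypothetical names.

```cpp
enum class Access { Read, Write };
enum class Hazard { None, ReadAfterWrite, WriteAfterRead, WriteAfterWrite };

// Classify the hazard between two consecutive accesses to one resource.
Hazard classify(Access previous, Access next) {
    if (previous == Access::Write && next == Access::Read)  return Hazard::ReadAfterWrite;
    if (previous == Access::Read  && next == Access::Write) return Hazard::WriteAfterRead;
    if (previous == Access::Write && next == Access::Write) return Hazard::WriteAfterWrite;
    return Hazard::None;  // read after read is always safe
}

// Some form of ordering is needed whenever a hazard exists.
bool needsBarrier(Access previous, Access next) {
    return classify(previous, next) != Hazard::None;
}
```

Tracking the last access per image or buffer and emitting a barrier only when needsBarrier() is true keeps the barrier count down.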
Presentation
How to present the image to the viewer
Requires co-operation with the OS/Toolkit
Platform-specific extensions
We refer to a surface which we display on
Presentation Steps
1. Check for device with presentation ability
2. Create swap chain with OS/toolkit images
3. Generate image in framebuffer
4. Transfer from framebuffer to swap chain
Essentially, duplicates the image
But can be handled with pointers
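For step 2, one small decision is how many images to request for the swap chain. The surface reports a minimum and a maximum image count, where a maximum of 0 means "no upper limit". A widely used convention, though not a requirement, is to ask for one more than the minimum. The sketch below uses a stand-in SurfaceLimits struct for the two relevant fields of VkSurfaceCapabilitiesKHR; chooseImageCount is a hypothetical helper.

```cpp
#include <cstdint>

// Stand-in for the two fields of the surface capabilities used here.
struct SurfaceLimits {
    std::uint32_t minImageCount;
    std::uint32_t maxImageCount;  // 0 means there is no upper limit
};

// Ask for one image more than the minimum so the application is less likely
// to wait on the presentation engine, but never exceed the maximum.
std::uint32_t chooseImageCount(const SurfaceLimits& limits) {
    std::uint32_t desired = limits.minImageCount + 1;
    if (limits.maxImageCount != 0 && desired > limits.maxImageCount) {
        desired = limits.maxImageCount;
    }
    return desired;
}
```

The implementation may still create more images than requested, so query the actual count after creating the swap chain.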