Lecture 4:
Vulkan, Part 3 (More Resources)
COMP5822M – High Perf. Graphics
– Vulkan objects overview
– Instance, Physical Device, Device
– VK_EXT_debug_utils
– SurfaceKHR + SwapchainKHR
– Memory heaps & types
– VkDeviceMemory
– Sounds easy enough…
– Brief clarification on memory
– Buffers & Images
Example: RTX 2070, Windows
GeForce RTX 2070: 3 Heaps
– heap 0: 8031 MBytes, DEVICE_LOCAL
– heap 1: 16328 MBytes, (no flags)
– heap 2: 214 MBytes, DEVICE_LOCAL
GeForce RTX 2070: 5 memory types
– type 0: from heap 1, (no flags)
– type 1: from heap 0, DEVICE_LOCAL
– type 2: from heap 1, HOST_VISIBLE HOST_COHERENT
– type 3: from heap 1, HOST_CACHED HOST_VISIBLE HOST_COHERENT
– type 4: from heap 2, DEVICE_LOCAL HOST_VISIBLE HOST_COHERENT
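Picking a memory type from a listing like the one above usually means scanning for the first type that has all the wanted property flags. The following is a minimal sketch of that search, not real Vulkan code: `MemoryType`, `rtx2070MemoryTypes()` and `findMemoryType()` are hypothetical names, and the flag constants are local stand-ins that mirror the values of `VK_MEMORY_PROPERTY_*_BIT` so the example compiles without the Vulkan headers. In a real program the table would come from `vkGetPhysicalDeviceMemoryProperties()` and the allowed-type mask from `VkMemoryRequirements::memoryTypeBits`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local stand-ins mirroring the VK_MEMORY_PROPERTY_*_BIT values (assumption:
// real code would use the constants from <vulkan/vulkan.h> instead).
constexpr std::uint32_t DEVICE_LOCAL  = 0x1;
constexpr std::uint32_t HOST_VISIBLE  = 0x2;
constexpr std::uint32_t HOST_COHERENT = 0x4;
constexpr std::uint32_t HOST_CACHED   = 0x8;

// Hypothetical mirror of one memory type entry: its flags and its heap.
struct MemoryType {
    std::uint32_t propertyFlags;
    std::uint32_t heapIndex;
};

// The five memory types from the RTX 2070 listing.
inline std::vector<MemoryType> rtx2070MemoryTypes() {
    return {
        {0, 1},                                           // type 0
        {DEVICE_LOCAL, 0},                                // type 1
        {HOST_VISIBLE | HOST_COHERENT, 1},                // type 2
        {HOST_CACHED | HOST_VISIBLE | HOST_COHERENT, 1},  // type 3
        {DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT, 2}, // type 4
    };
}

// Returns the index of the first type that is allowed by `allowedTypeBits`
// (bit i set means type i is acceptable) and has every flag in `required`.
// Returns -1 if no type matches.
inline int findMemoryType(const std::vector<MemoryType>& types,
                          std::uint32_t allowedTypeBits,
                          std::uint32_t required) {
    assert(types.size() <= 32); // the mask has one bit per type
    for (std::uint32_t i = 0; i < types.size(); ++i) {
        const bool allowed  = (allowedTypeBits & (1u << i)) != 0;
        const bool hasProps = (types[i].propertyFlags & required) == required;
        if (allowed && hasProps) return static_cast<int>(i);
    }
    return -1;
}
```

With this table, asking for `DEVICE_LOCAL` selects type 1 (heap 0), while asking for `HOST_VISIBLE | HOST_COHERENT` selects type 2 (heap 1, system RAM).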
Example: RTX 2070, Windows
Heap 1 (16328 MBytes, no flags):
– System RAM (part of it)
– GPU can use this.
– Less bandwidth on a dGPU (goes over PCIe)
– Still need to allocate it with vkAllocateMemory() and get a VkDeviceMemory handle.
– Can’t just “malloc”/“operator new” memory for this (easily).
Heap 2 (214 MBytes, DEVICE_LOCAL):
– Very limited amount. Didn’t discuss what to use this for.
– Any ideas/suggestions/guesses?
Resources: Buffers and Images
– VkDeviceMemory: “raw” allocation
– Can’t really use directly
– VkBuffer: untyped buffer / array
– VkImage: “Image”
– Formatted and typed pixel array
– 1D, 2D, 3D
– Layers (cube map = 6 layers)
– Sounds easy enough…
– Per-vertex data (positions, normals, …)
– Indices (indexed meshes)
– Uniform data
– Storage buffers (read-write in shaders)
– …
Buffer creation
– Create buffer object
vkCreateBuffer() & VkBufferCreateInfo
– Find memory type
vkGetBufferMemoryRequirements[2]()
– Allocate memory
vkAllocateMemory() & VkMemoryAllocateInfo
– Bind memory to buffer
vkBindBufferMemory[2]() & VkBindBufferMemoryInfo
Buffer Creation
– Must specify how the buffer will be used
– Bitfield of VkBufferUsageFlags, VK_BUFFER_USAGE_*_BIT
– TRANSFER_SRC, TRANSFER_DST: source/destination of copy
– UNIFORM_BUFFER: used as uniform buffer
– VERTEX_BUFFER: used to source per-vertex data
– INDEX_BUFFER: used to source indices
– STORAGE_BUFFER: similar to UNIFORM_BUFFER, larger, read-write
– …
Buffer Creation
– Sharing mode
– Important if buffer is accessed from multiple queues
– EXCLUSIVE: no concurrent access (must manually transfer ownership)
– CONCURRENT: multiple queues can access concurrently
– Only one queue => EXCLUSIVE
“VK_SHARING_MODE_CONCURRENT may result in lower performance access to the buffer or image than VK_SHARING_MODE_EXCLUSIVE.”
Why all these separate steps?
– Flexibility
– Example: Suballocations
– Example: Can alias memory
– I.e., use the same memory for two different buffers/images
– Must not be used at the same time!
– Requires extra synchronization.
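Suballocation works because binding takes an offset into an existing allocation, so one large VkDeviceMemory block can back many buffers. The sketch below shows only the offset bookkeeping, under the assumption of a simple bump allocator that never frees; `LinearSuballocator` is a hypothetical helper, not part of Vulkan. In real code each returned offset would be passed to vkBindBufferMemory() together with the shared memory handle, using the size and alignment reported by vkGetBufferMemoryRequirements().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bump sub-allocator over one large allocation.
// It hands out aligned offsets and never frees individual blocks.
struct LinearSuballocator {
    std::uint64_t capacity; // size of the underlying allocation in bytes
    std::uint64_t head;     // first free byte

    // Returns the offset of the new block, or UINT64_MAX if it does not fit.
    // `alignment` must be a power of two.
    std::uint64_t allocate(std::uint64_t size, std::uint64_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        // Round the current head up to the next multiple of `alignment`.
        const std::uint64_t offset = (head + alignment - 1) & ~(alignment - 1);
        if (offset > capacity || size > capacity - offset) return UINT64_MAX;
        head = offset + size;
        return offset;
    }
};
```

For example, two 100-byte requests with 256-byte alignment from a 1024-byte block land at offsets 0 and 256; the padding in between is the price of the alignment requirement.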
– Images used for
– Textures
– Render targets (color buffer, depth buffer, …)
– …
Image Creation
– Similar to buffer creation
– Need to specify
– Image type: TYPE_1D, TYPE_2D, TYPE_3D
– Size (“extent”), mips, layers
– Sampling options (multisampling)
– Initial Layout
– Usage, sharing mode
Image Creation
– Note: Cube map = TYPE_2D with 6 layers
Cube map by Emil Persson (“Humus”) CC-BY-Attrib 3.0
See http://humus.name
Image Format
– VK_FORMAT_{COMPONENTS}_{SUFFIX}
– Example: VK_FORMAT_R8G8B8A8_SRGB
– Very long list of different formats
– More examples:
– R8G8B8_*: tempting, typically not an option (3 byte alignment!)
– R8G8B8A8_*, B8G8R8A8_*: (what you use instead)
– D32_SFLOAT: depth buffer
Image Format
– Suffixes?
– Interpretation of data (e.g. R8 = 8 bit value)
– _SINT: signed integer, R8: [-128,127]
– _UINT: unsigned integer, R8: [0,255]
– _UNORM: value mapped to [0, 1] (R8: 255 steps between 0 and 1)
– _SNORM: value mapped to [-1, 1] (R8: 254 steps between -1 and 1; -128 also maps to -1)
– _SRGB: value mapped to [0, 1], but with sRGB transformation
– _SFLOAT: data is interpreted as IEEE floating point value
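To make the suffixes concrete, here is a small sketch of how a single 8-bit channel would be interpreted on the CPU for _UNORM, _SNORM and _SRGB. The function names are hypothetical; the GPU performs these conversions itself when a shader reads the image. The sRGB branch uses the standard piecewise sRGB-to-linear transfer function.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// _UNORM: 8-bit unsigned value mapped linearly to [0, 1].
inline float decodeUnorm8(std::uint8_t v) {
    return v / 255.0f;
}

// _SNORM: 8-bit signed value mapped to [-1, 1]; -128 is clamped to -1.
inline float decodeSnorm8(std::int8_t v) {
    return std::max(v / 127.0f, -1.0f);
}

// _SRGB: mapped to [0, 1] like _UNORM, then converted from sRGB to linear.
inline float decodeSrgb8(std::uint8_t v) {
    const float c = v / 255.0f;
    return (c <= 0.04045f) ? c / 12.92f
                           : std::pow((c + 0.055f) / 1.055f, 2.4f);
}
```

The same byte therefore yields different values depending on the suffix: 128 decodes to roughly 0.5 as _UNORM but to a noticeably darker linear value as _SRGB.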
Image Format
– RGBA formats: general purpose
– I.e., not just color data
– Examples: normal maps
– Compressed formats
– Planar formats
– …
Image Tiling
– Memory layout
– VK_IMAGE_TILING_OPTIMAL vs VK_IMAGE_TILING_LINEAR
[Figure: memory layout (Optimal) vs. memory layout (Linear)]
(Possible memory layout, example only! Actual layout unknown, determined by GPU/driver!)
Image Tiling
Figure: https://en.wikipedia.org/wiki/File:Four-level_Z.svg (Author: , License = GFDL)
– Use TILING_OPTIMAL whenever possible
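The difference between the two tilings can be illustrated with texel addressing. The sketch below compares row-by-row (linear) addressing with a Z-order (Morton) curve like the one in the Wikipedia figure referenced above. This is an illustration only: the real OPTIMAL layout is chosen by the GPU/driver and is not exposed, and the function names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Linear tiling: texels are stored row by row.
inline std::uint32_t linearIndex(std::uint32_t x, std::uint32_t y,
                                 std::uint32_t width) {
    return y * width + x;
}

// Spread the lower 16 bits of v so that bit i moves to bit 2*i.
inline std::uint32_t spreadBits(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Z-order (Morton) index: bits of x go to even positions, bits of y to odd.
inline std::uint32_t mortonIndex(std::uint32_t x, std::uint32_t y) {
    return spreadBits(x) | (spreadBits(y) << 1);
}
```

In a 256-texel-wide image, the texel directly below (0, 0) sits 256 elements away with linear addressing but only 2 elements away on the Z-order curve, which is the kind of 2D locality that helps texture caches.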
Image Usage
– VkImageUsageFlags, VK_IMAGE_USAGE_*_BIT
– TRANSFER_SRC, TRANSFER_DST: source/destination of copy
– SAMPLED: used as a texture
– COLOR_ATTACHMENT: color render target
– DEPTH_STENCIL_ATTACHMENT: depth buffer (depth+stencil buffer)
Hardware support
– Only certain format/usage/tiling combinations are valid
– This depends on hardware
– Texture sampling may use special hardware units
– Same for writing to images during rendering (=> blending!)
Required Image support
– Certain combinations of format/usage/tiling are mandatory
– https://www.khronos.org/registry/vulkan/specs/1.3/html/chap34.html#features-required-format-support
– Highlights
– R8G8B8A8_{SRGB,UNORM}: use as texture, color attachment
– Note: not for compute pipeline use!
– D16_UNORM mandatory for depth, but 16-bit depth is a bit low
– At least one of X8_D24_UNORM_PACK32, D32_SFLOAT is supported as a depth attachment
Image Layout
– Images have a “layout”
– Operations require correct layout
– Some operations change the layout (“transition”)
– Can manually transition layout
– Image Barrier
– Initial layout: UNDEFINED (or PREINITIALIZED)
Image Layout
– Layouts:
– UNDEFINED: cannot be used, image contents are undefined
– TRANSFER_DST_OPTIMAL: can be copied to
– TRANSFER_SRC_OPTIMAL: can be copied from
– COLOR_ATTACHMENT_OPTIMAL: can be rendered to
– SHADER_READ_ONLY_OPTIMAL: can be read from in shader (texture)
– DEPTH_STENCIL_ATTACHMENT_OPTIMAL: use as depth/stencil buffer
– PRESENT_SRC_KHR: can be used to present
– …
Image Layout
– Layouts:
– GENERAL: all types of device access, may be suboptimal
Example: Texture
– Create Image
– Memory type with DEVICE_LOCAL
– Tiling: OPTIMAL
– Usage: TRANSFER_DST | SAMPLED
– Layout: UNDEFINED
– Transition image from UNDEFINED to TRANSFER_DST_OPTIMAL
– Staging buffer (TRANSFER_SRC)
– Fill with data, and copy to image (vkCmdCopyBufferToImage())
– Transition image from TRANSFER_DST_OPTIMAL to SHADER_READ_ONLY_OPTIMAL
– Why the extra bookkeeping?
– Optimization hint, may affect performance
– Could affect how images are stored in memory
– AMD: cares about image layouts
– Compress/decompress/rearrange during transitions
– Using wrong layouts/GENERAL listed under “Common Mistakes”
– NVIDIA: doesn’t care
– Lists using GENERAL under “Good Practices”
– I use the correct layouts and avoid GENERAL
– Levels in mipmap pyramid
– Textures: mipmapping is default
– Need to allocate space (1/3 extra)
– Later: need to set up sampler to use “trilinear” filtering as well.
– More when we talk about textures
Src: (https://www.cse.chalmers.se/edu/course/TDA362/texturing.pdf)
Image pyramid: only 33% extra memory.
Task: Show that this is the case geometrically.
Hint: use squares to represent each level and see how many you need to fill a square 3x the size of the original texture.
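The 1/3 figure for the mip pyramid can also be checked numerically. The sketch below counts the levels of a full mip chain and sums the texels over all levels; `mipLevelCount()` and `totalTexels()` are hypothetical helper names, and the halving rule (round down, never below 1) is the usual convention for mip chains.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of levels in a full mip chain: halve the larger side until it is 1.
inline std::uint32_t mipLevelCount(std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    std::uint32_t levels = 1;
    for (std::uint32_t size = std::max(width, height); size > 1; size /= 2) {
        ++levels;
    }
    return levels;
}

// Total number of texels over all levels. Each level halves both sides,
// rounding down but never going below 1.
inline std::uint64_t totalTexels(std::uint32_t width, std::uint32_t height) {
    const std::uint32_t levels = mipLevelCount(width, height);
    std::uint64_t total = 0;
    for (std::uint32_t i = 0; i < levels; ++i) {
        total += static_cast<std::uint64_t>(width) * height;
        width  = std::max<std::uint32_t>(width / 2, 1);
        height = std::max<std::uint32_t>(height / 2, 1);
    }
    return total;
}
```

For a 256x256 texture this gives 9 levels and 87381 texels in total, i.e. 21845 texels on top of the 65536 in the base level, which is just under one third extra.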
Image Views
– Reference to (parts of) a VkImage
– Rendering does not directly use VkImage
– Always goes through a VkImageView
– Sounds easy enough…
Image Views
– vkCreateImageView, VkImageViewCreateInfo
– Whole or part of image
– E.g. can render to a specific mip level or layer
– Swizzle: remap components
– VkImageViewType:
– VK_IMAGE_VIEW_TYPE_1D, 2D, 3D
– CUBE
– 1D_ARRAY, 2D_ARRAY, CUBE_ARRAY
Preview: VkSampler
– VkSampler: how to sample an image
– When used as a texture
Quick summary
– Buffers and Images
– Image views
– Next time: Commands, queues & pipelines.
Thank you for your attention.