Lecture 16:
Real-Time Ray Tracing Overview
COMP5822M – High Perf. Graphics
– Data formats
– Std140 memory layout
– Rendering “tricks”
– LOD
– Billboards/impostors
– Culling
– Acceleration structures
– Ray tracing intro
Acceleration structures
– Accelerate spatial queries
– E.g., for frustum culling?
– Key component for modern ray tracers
(both offline and online)
– Collision detection
Quadtree / Octree
– Quadtree = 2D
– Octree = 3D (but same idea)
– Place objects in leaves?
– Overlapping objects
– Place in smallest node that fits it?
– Place in multiple nodes?
– Extend nodes? (Loose quad-/octree)
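The "place in smallest node that fits it" option can be sketched as follows. This is a minimal illustration, not a production quadtree: the names `Box`, `QuadNode`, `insert` and the `max_depth` limit are assumptions made for this example, and children are created eagerly for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """Axis-aligned 2D box given by its min/max corners."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, other: "Box") -> bool:
        return (self.x0 <= other.x0 and self.y0 <= other.y0 and
                other.x1 <= self.x1 and other.y1 <= self.y1)


@dataclass
class QuadNode:
    bounds: Box
    depth: int = 0
    objects: list = field(default_factory=list)
    children: list = field(default_factory=list)  # empty, or 4 child nodes

    def split(self) -> None:
        b = self.bounds
        mx, my = (b.x0 + b.x1) / 2, (b.y0 + b.y1) / 2
        quads = [Box(b.x0, b.y0, mx, my), Box(mx, b.y0, b.x1, my),
                 Box(b.x0, my, mx, b.y1), Box(mx, my, b.x1, b.y1)]
        self.children = [QuadNode(q, self.depth + 1) for q in quads]


def insert(node: QuadNode, obj: Box, max_depth: int = 4) -> QuadNode:
    """Store obj in the smallest node whose bounds fully contain it."""
    if node.depth < max_depth:
        if not node.children:
            node.split()
        for child in node.children:
            if child.bounds.contains(obj):
                return insert(child, obj, max_depth)
    # Max depth reached, or obj straddles a split line: keep it in this node.
    node.objects.append(obj)
    return node


root = QuadNode(Box(0, 0, 16, 16))
small = insert(root, Box(1, 1, 2, 2))      # fits into a deep node
straddler = insert(root, Box(7, 7, 9, 9))  # overlaps the centre split lines
```

Note how a small object that happens to straddle the centre ends up in the root. That is the weakness that loose quad-/octrees try to address by extending the node bounds.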
kD-Tree
– Binary tree
– Quadtree had fan-out = 4
– Octree: fan-out = 8
– Split along axes in fixed order
– E.g. X->Y->X->Y…
– Similar issues
– Split not always possible (if objects have any size)
– Need to figure out where to split
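A minimal sketch of a kD-tree build with a fixed axis order. It assumes point data, which sidesteps the "objects have a size" issue, and picks the median as the split position. Both choices, as well as the names `KdNode` and `build_kd`, are assumptions for this example rather than the only option.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KdNode:
    point: tuple
    axis: int                       # 0 = X, 1 = Y, ...
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None


def build_kd(points, depth=0):
    """Build a kD-tree over points, cycling through the axes in fixed order."""
    if not points:
        return None
    axis = depth % len(points[0])   # X -> Y -> X -> Y ... for 2D points
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2             # median decides where to split
    return KdNode(pts[mid], axis,
                  build_kd(pts[:mid], depth + 1),
                  build_kd(pts[mid + 1:], depth + 1))


tree = build_kd([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
```

Sorting at every level keeps the sketch short. A real builder would typically avoid the repeated sorts.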
BSP-Tree
– Binary tree (like kD-Tree)
– Arbitrary split planes
– Construction difficult
– NP-hard
– Used in Quake to sort triangles
– And several other games that followed
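Sorting with a BSP tree boils down to an ordered traversal: at each node, visit the side away from the viewer first. The sketch below shows the classic back-to-front order on a tiny hand-built 2D tree. It is not taken from any particular game, and `BspNode`, `back_to_front` and the labels are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BspNode:
    normal: tuple                     # split plane normal (nx, ny)
    offset: float                     # plane: dot(normal, p) == offset
    label: str                        # stands in for geometry on the plane
    front: Optional["BspNode"] = None
    back: Optional["BspNode"] = None


def back_to_front(node, eye, out):
    """Append labels to out in back-to-front order as seen from eye."""
    if node is None:
        return out
    side = node.normal[0] * eye[0] + node.normal[1] * eye[1] - node.offset
    near, far = (node.front, node.back) if side >= 0 else (node.back, node.front)
    back_to_front(far, eye, out)      # far side of the plane first
    out.append(node.label)            # then geometry on the plane itself
    back_to_front(near, eye, out)     # near side last
    return out


# Three parallel planes at x = -2 (C), x = 0 (A) and x = 2 (B).
root = BspNode((1.0, 0.0), 0.0, "A",
               front=BspNode((1.0, 0.0), 2.0, "B"),
               back=BspNode((1.0, 0.0), -2.0, "C"))
order_from_right = back_to_front(root, (5.0, 0.0), [])
order_from_left = back_to_front(root, (-5.0, 0.0), [])
```

The tree is built once. Only the traversal depends on the eye position, which is what makes the structure attractive for sorting static geometry.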
Bounding Volume Hierarchy (BVH)
– Cluster objects into groups
– E.g., by placing them in a box/sphere/…
BVH builds
– Top-down vs bottom-up
– Top-down:
– Subdivide scene into 2 groups
– Subdivide each group into 2 groups
– Continue recursively
– Decide on grouping somehow
– Surface area heuristic (SAH)?
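A minimal top-down build might look like the sketch below. The grouping rule used here is a median split along the longest axis, chosen because it is short. It is a stand-in for "decide on grouping somehow", not the surface area heuristic. The names `AABB`, `BvhNode`, `build_top_down` and the `leaf_size` parameter (assumed to be at least 1) are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AABB:
    lo: tuple
    hi: tuple

    def union(self, other):
        return AABB(tuple(map(min, self.lo, other.lo)),
                    tuple(map(max, self.hi, other.hi)))

    def centroid(self):
        return tuple((a + b) / 2 for a, b in zip(self.lo, self.hi))


@dataclass
class BvhNode:
    box: AABB
    left: Optional["BvhNode"] = None
    right: Optional["BvhNode"] = None
    items: Optional[list] = None      # set for leaves only


def bounds_of(boxes):
    out = boxes[0]
    for b in boxes[1:]:
        out = out.union(b)
    return out


def build_top_down(boxes, leaf_size=2):
    """Recursively subdivide the objects into two groups."""
    box = bounds_of(boxes)
    if len(boxes) <= leaf_size:
        return BvhNode(box, items=list(boxes))
    # Split along the longest axis of this node, at the median centroid.
    axis = max(range(len(box.lo)), key=lambda a: box.hi[a] - box.lo[a])
    ordered = sorted(boxes, key=lambda b: b.centroid()[axis])
    mid = len(ordered) // 2
    return BvhNode(box,
                   build_top_down(ordered[:mid], leaf_size),
                   build_top_down(ordered[mid:], leaf_size))


# Eight unit boxes in a row along X.
boxes = [AABB((i, 0, 0), (i + 1, 1, 1)) for i in range(8)]
root = build_top_down(boxes)
```

An SAH-based builder would replace only the choice of `axis` and `mid`. The recursion stays the same.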
BVH builds
– Bottom-up:
– Grab N objects, group them
– Grab N objects, group them
– … (until all objects are in a group)
– Grab N groups from previous step, group those
– …
– Continue this until a single group remains = root node of BVH
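The bottom-up procedure can be sketched in a few lines. The example assumes the objects are already in a spatially sensible order, that the input is non-empty, and that N is at least 2. Nodes are plain dictionaries, and `union`, `group` and `build_bottom_up` are names made up for this sketch.

```python
def union(a, b):
    """Union of two AABBs, each stored as a (lo, hi) pair of tuples."""
    return (tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1])))


def group(nodes, n):
    """Merge every run of n consecutive nodes into one parent node."""
    parents = []
    for i in range(0, len(nodes), n):
        chunk = nodes[i:i + n]
        box = chunk[0]["box"]
        for c in chunk[1:]:
            box = union(box, c["box"])
        parents.append({"box": box, "children": chunk})
    return parents


def build_bottom_up(boxes, n=2):
    """Group objects, then groups of groups, until one group remains."""
    level = [{"box": b, "children": []} for b in boxes]  # one leaf per object
    while len(level) > 1:
        level = group(level, n)
    return level[0]                                      # root node of the BVH


# Eight unit boxes in a row along X.
boxes = [((i, 0, 0), (i + 1, 1, 1)) for i in range(8)]
root = build_bottom_up(boxes, n=2)
```

The quality of the result depends entirely on the input order. Grouping spatially distant objects gives large, overlapping boxes.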
– How many tests did we make in this example?
– 7 AABB tests + 4 triangle tests
– Would have been cheaper to just test all 8 triangles
– Plus 8 tris in an array => good memory access pattern
– OK. N = 8 is not very large.
– In fact, we’d probably pack at least 8 triangles into a leaf
– One of many considerations in a well-designed BVH
– Others: fan-out, memory layout, …
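Each AABB test counted above is typically a "slab" test. A minimal sketch, with `ray_hits_aabb` and its parameters being names chosen for this example:

```python
import math


def ray_hits_aabb(origin, direction, lo, hi, t_max=math.inf):
    """Slab test: clip the ray interval [0, t_max] against each axis slab."""
    t0, t1 = 0.0, t_max
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            if o < a or o > b:        # parallel to this slab and outside it
                return False
            continue
        near, far = (a - o) / d, (b - o) / d
        if near > far:
            near, far = far, near
        t0, t1 = max(t0, near), min(t1, far)
        if t0 > t1:                   # interval became empty: no hit
            return False
    return True


hit = ray_hits_aabb((0, 0, 0), (1, 0, 0), (2, -1, -1), (3, 1, 1))
miss_offset = ray_hits_aabb((0, 5, 0), (1, 0, 0), (2, -1, -1), (3, 1, 1))
miss_behind = ray_hits_aabb((0, 0, 0), (-1, 0, 0), (2, -1, -1), (3, 1, 1))
```

The test is cheap but not free, which is why a hierarchy only pays off once each skipped subtree holds enough triangles.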
Bounding Volume Hierarchy (BVH)
– Different volumes
– AABB
– …? (kDOP?)
– AABB-BVH: Most common structure?
– Extensive use in ray tracing
– More next time.
– Real-time ray tracing!
[Image caption: “I too do spend a lot of time banging my head against the desk debugging…”]
[Image caption: “RTX OFF” vs. “RTX ON (many hours later)”]
– The following is shamelessly stolen from “Real-Time Ray Tracing of Correct* Soft Shadows”, S. Hill, M. McGuire, E. Heitz (2018)
– It does make a good introduction.
– This was presented just minutes after Jensen Huang (NVIDIA) unveiled RTX a few rooms away.
– If Jensen’s presentation had been delayed, the above presentation would not have taken place (in that form).
– Well … except what?
Problem: With partial occlusion (shadows), the light no longer has a regular shape (and the shape varies!)
The Solution:
¯\_(ツ)_/¯
– Real-time ray tracing? Maybe.
– GPU-based ray tracing has a long history
– As does specialized hardware (e.g., ASICs or FPGAs)
– Now: Standardized in APIs
– May or may not utilize specialized hardware.
– DXR (DirectX Raytracing) in DX12
– Extensions in Vulkan
– VK_KHR_acceleration_structure, VK_KHR_ray_tracing_pipeline, VK_KHR_ray_query
– (old) VK_KHR_ray_tracing, VK_NV_ray_tracing, VK_NVX_ray_tracing
– Not just for shadows of course
– Shadows are just one of the common recurring problems
– Can be used for many other things:
– Reflections, refractions
– Indirect illumination
– Ambient occlusion
– Full real-time path tracing? There are demos out there…
Real-Time Ray Tracing – History
– DXR was announced in March 2018
– RTX/Turing cards in August 2018 (SIGGRAPH)
– Largely based on ideas from OptiX (NVIDIA)
– Many of the same concepts/APIs
– E.g., ray generation programs, hit programs, miss programs, acceleration structures, …
– Vulkan version is again very similar to both
Real-Time Ray Tracing – History
– Long history before that
– Not strictly real-time, but also for off-line rendering
– Much focus on accelerating finding intersections
– => Acceleration structures!
– Additionally: efficiently dealing with parallel traversal etc.
– GPUs like non-diverging control flow and coherent memory accesses
Real-Time Ray Tracing – History
– Many different acceleration structures
– E.g., kD-trees, BSP-trees, octrees, …, BVHs
– Find intersection in O( log(n) )
– But: bad (= random) memory access patterns (rasterization = memory friendly)
– Computational cost not the dominating problem here?
– Divergent execution is an additional hurdle for GPUs
Real-Time Ray Tracing – History
– Furthermore: Dynamic Geometry
– Tracing with acceleration structure: O( log(N) )
– But need an acceleration structure
– Dynamic geometry:
– Update / refit acceleration structures
– Rebuild each frame.
– O( N log(N) ) or worse
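Refitting keeps the tree topology and only recomputes the bounding boxes bottom-up, which is linear in the number of nodes instead of a full rebuild. A minimal sketch, assuming nodes are dictionaries with a `box` stored as a (lo, hi) pair and a `children` list (this layout and the name `refit` are assumptions for this example):

```python
def refit(node):
    """Recompute boxes bottom-up after leaf geometry moved; topology is kept."""
    if node["children"]:
        boxes = [refit(child) for child in node["children"]]
        lo = tuple(min(c) for c in zip(*(b[0] for b in boxes)))
        hi = tuple(max(c) for c in zip(*(b[1] for b in boxes)))
        node["box"] = (lo, hi)
    return node["box"]


leaf_a = {"box": ((0, 0, 0), (1, 1, 1)), "children": []}
leaf_b = {"box": ((2, 0, 0), (3, 1, 1)), "children": []}
root = {"box": ((0, 0, 0), (3, 1, 1)), "children": [leaf_a, leaf_b]}

leaf_b["box"] = ((5, 0, 0), (6, 1, 1))   # the object moved this frame
new_root_box = refit(root)
```

Repeated refits tend to degrade tree quality as objects drift apart, so refitting is usually combined with occasional rebuilds or rebalancing.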
What changed?
– Standardized around AABB BVHs
– Fast algorithms to build (e.g. LBVH class of methods)
– Reasonable ways of refitting + rebalancing
– Traversal reasonably efficient, even in parallel
– See e.g. series of publications 2008++
– For funsies: check how many of the authors are at NVIDIA 🙂
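LBVH-style builders start by sorting primitives along a space-filling curve, commonly using Morton codes. The sketch below shows only that first step, for points normalized to [0, 1], with 10 bits per axis. The function names are assumptions for this example, and the hierarchy construction on top of the sorted codes is not shown.

```python
def expand_bits(v):
    """Spread the lower 10 bits of v so that two zero bits follow each bit."""
    v &= 0x3FF
    v = (v | (v << 16)) & 0x030000FF
    v = (v | (v << 8)) & 0x0300F00F
    v = (v | (v << 4)) & 0x030C30C3
    v = (v | (v << 2)) & 0x09249249
    return v


def morton3d(x, y, z):
    """30-bit Morton code for a point with coordinates in [0, 1]."""
    q = [min(max(int(c * 1024), 0), 1023) for c in (x, y, z)]
    return (expand_bits(q[0]) << 2) | (expand_bits(q[1]) << 1) | expand_bits(q[2])


# Sorting by Morton code places spatially close points near each other.
points = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.12, 0.1, 0.1), (0.88, 0.9, 0.9)]
ordered = sorted(points, key=lambda p: morton3d(*p))
```

Because each code can be computed independently and the sort parallelizes well, this step maps nicely to GPUs.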
Ray Tracing vs. Rasterization
– One large difference to rasterization!
– Rasterization:
– Typical render loop:
bind_object0_resources() // pipeline/shaders, textures, uniforms, …
draw_object0()
bind_object1_resources()
draw_object1()
Ray Tracing vs. Rasterization
– Ray tracing: Ray can hit any object!
– Need all data “bound” at once
bind_all_resources() // all shaders, textures, uniforms, …
draw_scene()
– New: many shaders at once
– Many textures/“uniform” parameters
Ray Tracing vs. Rasterization
– Ray tracing:
– Many shaders “at once”
– Many textures at once
– Many parameters at once
– Rasterization:
– Single shader at once
– …
Vulkan/RTX
– Components:
– Acceleration structures
– Ray Tracing Pipeline + Shader Programs + Shader Binding Table (VK_KHR_ray_tracing_pipeline)
– OR Ray Queries in fragment/vertex/… shaders (VK_KHR_ray_query)
– Acceleration structure common between the two
– Which is why it’s separate (VK_KHR_acceleration_structure)
Acceleration Structures
– Two types of acceleration structures:
– Bottom Level Acceleration Structure (BLAS)
– Top Level Acceleration Structure (TLAS)
– BLAS: mesh/object
– TLAS: instance of mesh/object
– I.e., one BLAS
– Associated transform
– 3×4 matrix (rot. + transl.)
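To illustrate what an instance carries, the sketch below applies a row-major 3×4 affine matrix to a point and lists two instances that share one BLAS. The dictionary layout and the name `teapot_blas` are purely hypothetical. In Vulkan the instance data lives in a dedicated structure, which is not modelled here.

```python
def transform_point(m, p):
    """Apply a row-major 3x4 affine matrix (rotation | translation) to a point."""
    return tuple(row[0] * p[0] + row[1] * p[1] + row[2] * p[2] + row[3]
                 for row in m)


identity = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# 90 degree rotation about Z, plus a translation of (10, 0, 0).
rotated_and_moved = [
    [0.0, -1.0, 0.0, 10.0],
    [1.0,  0.0, 0.0,  0.0],
    [0.0,  0.0, 1.0,  0.0],
]

# Two instances referencing the same BLAS, each with its own transform.
instances = [
    {"blas": "teapot_blas", "transform": identity},
    {"blas": "teapot_blas", "transform": rotated_and_moved},
]

world_positions = [transform_point(inst["transform"], (1.0, 0.0, 0.0))
                   for inst in instances]
```

The geometry is stored once in the BLAS. Moving an instance only changes its matrix, so only the comparatively small TLAS has to be updated.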
Src: https://devblogs.nvidia.com/introduction-nvidia-rtx-directx-ray-tracing/
Acceleration Structures
– List of triangles
– List of bounding boxes
– List of bounding boxes:
– Custom intersection shader
– For non-triangle data
– The triangles in the BLAS are defined from positions only
– I.e., normals etc. are not included in the acceleration structures
– Need to pass those separately (SSBOs)
– It might make sense to keep positions in separate arrays (i.e. planar vertex layouts instead of interleaved ones)
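A small sketch of the difference between an interleaved and a planar vertex layout. The vertex format (position, normal, uv as 32-bit floats) is an assumption for this example.

```python
import struct

# Hypothetical vertices: position (3 floats), normal (3 floats), uv (2 floats).
vertices = [
    (0.0, 0.0, 0.0,  0.0, 0.0, 1.0,  0.0, 0.0),
    (1.0, 0.0, 0.0,  0.0, 0.0, 1.0,  1.0, 0.0),
    (0.0, 1.0, 0.0,  0.0, 0.0, 1.0,  0.0, 1.0),
]

# Interleaved: one buffer, 32 bytes per vertex, positions 32 bytes apart.
interleaved = b"".join(struct.pack("<8f", *v) for v in vertices)

# Planar: tightly packed positions (12 byte stride) for the BLAS build,
# remaining attributes in their own buffer for shading.
positions = b"".join(struct.pack("<3f", *v[0:3]) for v in vertices)
attributes = b"".join(struct.pack("<5f", *v[3:8]) for v in vertices)
```

With the planar layout, the position buffer holds nothing but positions, while the shading code fetches normals and texture coordinates from the second buffer.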
Acceleration Structures
– Vulkan will build the acceleration structures for us
– Involves multiple steps
– Why? Same as always:
– You have to manage resources / allocate memory
– You get to say when and how expensive calculations take place
– VK_KHR_acceleration_structure:
– New: Options to build on CPU
– Options to save/load existing acceleration structure
Acceleration Structures
– Sponza example:
– 25 meshes/objects
– 260k triangles
– 25 BLASes: ~65 ms
– TLAS with 25 instances: 160 μs
– Note: old timings (VK_NV_ray_tracing)
– 5 different shader types
– 3 required
– 2 optional
– Encapsulated in a pipeline!
– Required:
– Ray Generation Shader
– Miss Shader
– Closest Hit Shader
– Optional:
– Intersection Shader
– Any-Hit Shader
[Figure: ray tracing pipeline, entered via traceRayEXT()]
Src: https://devblogs.nvidia.com/introduction-nvidia-rtx-directx-ray-tracing/
Shader Binding Table
– Solution to having many shaders
– Example: multiple closest hit shaders
– Different materials, …
– Shader Binding Table (SBT)
– Connects BLAS/object to a shader
– Can pass additional parameters to the shader
[Figure: SBT records, each a shader identifier followed by optional data (push constants, …)]
Src: https://devblogs.nvidia.com/introduction-nvidia-rtx-directx-ray-tracing/
Shader Binding Table
– SBT begins with Ray Generation Shader
– Then miss shader(s)
– Then groups of closest hit shaders
– …
– (Essentially multiple concatenated arrays)
– If you hate descriptor layouts/pools/sets/writes
– Well, this is the same, except without the training wheels.
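The "concatenated arrays" view can be made concrete by computing where each region starts. The sketch assumes a handle size of 32 bytes, a handle alignment of 32 bytes and a base alignment of 64 bytes. These are example values only, since the real ones have to be queried from the device (VkPhysicalDeviceRayTracingPipelinePropertiesKHR). `sbt_layout` and its parameters are names made up for this example.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def sbt_layout(counts, data_sizes, handle_size=32, handle_align=32, base_align=64):
    """Return {region: (offset, stride, size)} for the concatenated SBT regions."""
    layout, offset = {}, 0
    for region in ("raygen", "miss", "hit"):
        # One record = shader identifier (handle) + optional per-record data.
        stride = align_up(handle_size + data_sizes[region], handle_align)
        size = stride * counts[region]
        offset = align_up(offset, base_align)   # each region starts aligned
        layout[region] = (offset, stride, size)
        offset += size
    return layout


# One ray generation shader, two miss shaders, three hit groups with
# 16 bytes of (hypothetical) per-material data each.
layout = sbt_layout(counts={"raygen": 1, "miss": 2, "hit": 3},
                    data_sizes={"raygen": 0, "miss": 0, "hit": 16})
```

All of this bookkeeping is left to the application, which is what "without the training wheels" refers to: a wrong offset or stride silently selects the wrong shader or data.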
Ray Queries / Inline Ray Tracing
– Alternative to the Ray Tracing Pipeline
– Not supported everywhere
– Bind TLAS as descriptor to fragment/vertex/compute/… shader (uniform)
– Use GLSL ray query API to trace rays
– rayQueryInitializeEXT(), rayQueryProceedEXT(), …
– Simpler? Maybe.