Add true HDR output #7554
Move the pure-math vertex/index generation out of `gropengldeferred.cpp` into graphics/util/primitives so it can be reused by the Vulkan backend. Modernize to use `SCP_vector` instead of `vm_malloc`/`vm_free` for automatic memory management.
Replace direct `ImGui_ImplOpenGL3` calls in game code with backend-agnostic `gr_imgui_new_frame` and `gr_imgui_render_draw_data` function pointers, matching the pattern used by all other `gr_*` functions. This makes it possible for the Vulkan backend to provide its own ImGui implementation.
`bm_close` calls `gf_bm_free_data` for each bitmap slot, which needs the graphics backend (Vulkan texture manager, OpenGL context) to still be alive. Move `bm_close` before the backend cleanup switch in `gr_close`.
`gr_flash_internal` used int vertices with `SCREEN_POS` (`VK_FORMAT_R32G32_SINT`) but the default-material vertex shader expects vec4 float at location 0. OpenGL silently converts via glVertexAttribPointer; Vulkan requires exact type matching. Use float vertices with `POSITION2` format instead. There should be no difference in behavior.
The `SCREEN_POS` vertex format is no longer used after the only use in `gr_flash` was removed. Remove it entirely.
Deduplicate compressed texture block-size mapping and mip-size calculation into two inline helpers in `ddsutils.h`, replacing repeated inline formulas in `ddsutils.cpp` and `gropengltexture.cpp`.
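A minimal sketch of what two such helpers could look like, assuming the usual 4x4 block layout (8 bytes per block for DXT1, 16 bytes for DXT3/DXT5/BC7). The enum and function names here are hypothetical and are not the ones added to `ddsutils.h`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical format tags; the real code keys off its own DDS constants.
enum class BlockFormat { DXT1, DXT3, DXT5, BC7 };

// Bytes occupied by one 4x4 texel block.
inline std::size_t block_size_bytes(BlockFormat fmt)
{
    return fmt == BlockFormat::DXT1 ? 8 : 16;
}

// Size in bytes of a single mip level of a block-compressed texture.
inline std::size_t compressed_mip_size(std::uint32_t base_w, std::uint32_t base_h,
                                       std::uint32_t level, BlockFormat fmt)
{
    assert(level < 32); // shifting by 32 or more would be undefined
    // Each mip halves the dimensions, never dropping below one texel.
    const std::uint32_t w = std::max<std::uint32_t>(1, base_w >> level);
    const std::uint32_t h = std::max<std::uint32_t>(1, base_h >> level);
    // Round up to whole blocks, so even a 1x1 mip occupies one full block.
    const std::size_t blocks_x = (w + 3) / 4;
    const std::size_t blocks_y = (h + 3) / 4;
    return blocks_x * blocks_y * block_size_bytes(fmt);
}
```

Keeping the round-up in one place is what removes the risk of one call site forgetting that small mips still cost a full block.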
Add a render system capability to indicate whether GPU timestamp query handles can be immediately reused after reading. When queries are not reusable, `free_query_object` returns handles to the backend via `gr_delete_query_object` instead of the tracing free list, letting the backend manage its own reset lifecycle. This greatly simplifies query management for Vulkan. Also change shutdown to discard gpu_events for backends where queries aren't reusable (no more frames will be submitted to make them available).
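The branching described above can be sketched roughly as follows. `QueryBackend` and `QueryTracker` are hypothetical stand-ins for the backend and the tracing code; only the idea of routing a freed handle by capability is taken from the commit message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a graphics backend.
struct QueryBackend {
    bool queries_reusable;     // what the new capability would report
    std::vector<int> deleted;  // handles handed back to the backend

    // Stand-in for gr_delete_query_object: the backend now owns the reset lifecycle.
    void delete_query_object(int handle) { deleted.push_back(handle); }
};

// Hypothetical stand-in for the tracing-side bookkeeping.
struct QueryTracker {
    std::vector<int> free_list; // handles that may be reused immediately

    void free_query_object(QueryBackend& backend, int handle)
    {
        assert(handle >= 0);
        if (backend.queries_reusable) {
            // Backend allows reuse right after readback: recycle locally.
            free_list.push_back(handle);
        } else {
            // Backend must reset the query first: give the handle back.
            backend.delete_query_object(handle);
        }
    }
};
```

With this split the tracing code never needs to know when a Vulkan query pool slot has been reset.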
Move `output_uniform_debug_data` before `gr_reset_immediate_buffer` so debug text is rendered while the immediate buffer still contains valid data. The previous ordering read from a buffer that was already reset to offset 0, which is logically wrong for any backend and a hard failure for deferred-submission backends.
`gr_set_proj_matrix` already branches on rendering_to_texture to choose top-left (RTT) vs bottom-left (screen) viewport origin. `gr_end_2d_matrix` should match, but it unconditionally used the bottom-left formula. Add the same `rendering_to_texture` branch so the viewport is restored correctly when rendering to a texture.
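A small sketch of the origin choice, assuming the clip offsets are measured from the top-left corner of the target. The function and parameter names are hypothetical, and the exact formula in `gr_end_2d_matrix` may differ.

```cpp
#include <cassert>

struct Viewport { int x, y, w, h; };

// Hypothetical helper: compute the viewport rectangle to restore.
inline Viewport restore_viewport(int offset_x, int offset_y, int clip_w, int clip_h,
                                 int target_h, bool rendering_to_texture)
{
    assert(clip_w >= 0 && clip_h >= 0);
    // Render targets use a top-left origin, so the offset applies directly.
    // The screen uses a bottom-left origin, so the rectangle is measured up
    // from the bottom edge of the target.
    const int y = rendering_to_texture ? offset_y
                                       : target_h - offset_y - clip_h;
    return Viewport{offset_x, y, clip_w, clip_h};
}
```

Using the bottom-left formula while rendering to a texture would place the viewport at the wrong vertical position whenever the clip region does not cover the whole target.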
Change `bool clipEnabled` to `uint clipEnabled` in the default-material shader UBO. GLSL bool has implementation-defined std140 layout; uint is portable and matches the SPIR-V decompiled output. Add an else-branch writing `gl_ClipDistance[0] = 1.0` when clipping is disabled. Without this, gl_ClipDistance is undefined and some drivers cull geometry unexpectedly.
Memcpy from a `const void*` to `void*` is trivial enough. However, this case was missing, resulting in a false positive compilation error.
Extract shader loading and preprocessing (include/predefine expansion) into code/graphics/shader_preprocess.cpp, so it can be shared with the Vulkan backend.
Bundle Vulkan headers (v1.4.309).
Bundle Vulkan Memory Allocator (v3.2.1).
ddsutils.cpp checked OpenGL-specific GLAD globals to decide whether to decompress DXT textures. When the Vulkan backend was active these variables were never set, so all DXT textures were decompressed to 32bpp RGBA. Replace the GLAD checks with gr_is_capable() queries for the new CAPABILITY_S3TC and existing CAPABILITY_BPTC, making ddsutils backend-agnostic. Add the S3TC capability handler to the OpenGL backend.
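As an illustration only, the backend-agnostic check could take a shape like the one below. The enums and the `CapabilityQuery` callback are hypothetical simplifications standing in for `gr_is_capable()` and the real capability constants.

```cpp
#include <cassert>

// Hypothetical, simplified versions of the capability and format tags.
enum class Capability { S3TC, BPTC };
enum class DdsCompression { None, DXT1, DXT3, DXT5, BC7 };

// Stand-in for gr_is_capable(): answered by whichever backend is active.
using CapabilityQuery = bool (*)(Capability);

// True when the compressed data can be uploaded as-is; false means the
// loader has to fall back to another path (for DXT, decompressing to RGBA).
inline bool can_upload_compressed(DdsCompression format, CapabilityQuery is_capable)
{
    assert(is_capable != nullptr);
    switch (format) {
    case DdsCompression::DXT1:
    case DdsCompression::DXT3:
    case DdsCompression::DXT5:
        return is_capable(Capability::S3TC);
    case DdsCompression::BC7:
        return is_capable(Capability::BPTC);
    case DdsCompression::None:
        return false;
    }
    return false;
}
```

Because the loader only sees the query function, it no longer matters whether OpenGL or Vulkan answers it.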
Extract shader type tables (filenames, descriptions) and
variant tables (type, flag, define, description) into shared
code/graphics/shader_types.{h,cpp}.
Also move FXAA quality preset defines into shader_types so both
backends can share a single implementation.
Implement a Vulkan 1.1 renderer that replaces the previous stub with a fully functional backend, mostly matching the OpenGL backend's rendering capabilities.

Core rendering infrastructure:
- `VulkanMemory`: Custom allocator with sub-allocation from device-local and host-visible memory pools
- `VulkanBuffer`: Per-frame bump allocator for streaming uniform/vertex/index data (persistently mapped, double-buffered, auto-growing)
- `VulkanTexture`: Full texture management including 2D, 2D-array, 3D, and cubemap types with automatic mipmap generation and sampler caching
- `VulkanPipeline`: Lazy pipeline creation from hashed render state, with persistent VkPipelineCache
- `VulkanShader`: GLSL shader loading. Shader code and metadata are shared with OpenGL, with differences guarded by preprocessor conditions
- `VulkanDescriptorManager`: 3-set descriptor layout (Global/Material/PerDraw) with per-frame pool allocation, auto-grow, and batched updates
- `VulkanDeletionQueue`: Deferred resource destruction synchronized to frame-in-flight fences

Design choices:
- Two frames in flight with fence-based synchronization
- Asynchronous texture upload, no `waitIdle` in hot path
- Single command buffer per frame; render passes begun/ended as needed for the multi-pass deferred pipeline
- Per-frame descriptor pools
- All descriptor bindings pre-initialized with fallback resources (zero UBO + 1x1 white texture) so partial updates never leave undefined state
- Streaming data uses a bump allocator (one large VkBuffer per frame)
- Pipeline cache persisted to disk for fast startup on subsequent runs
- Use VMA (Vulkan Memory Allocator) for buffer management

Some notable Vulkan vs OpenGL differences are:
- Depth range is [0,1] not [-1,1]: shadow projection matrices adjusted, shaders that linearize depth need isinf/zero guards at depth boundaries where OpenGL gives finite values
- In Vulkan, all shader outputs must be initialized. Leaving them uninitialized can result in random corruptions, while OpenGL allows leaving them in some cases
- Swap chain is B8G8R8A8: screenshot/save_screen paths swizzle to RGBA
- Vulkan render target is "upside down"; the y-flip for render targets is handled through negative viewport height, as is common
- Texture addressing for AABITMAP/INTERFACE/CUBEMAP forced to clamp (OpenGL's sampler state happens to do this implicitly)
- Render pass architecture requires explicit transitions between G-buffer, shadow, decal, light accumulation, fog, and post-processing passes (OpenGL just switches FBO bindings)
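For readers unfamiliar with the per-frame bump allocator mentioned above, here is a minimal CPU-side sketch of the offset arithmetic. `FrameBumpAllocator` is a hypothetical class, not the `VulkanBuffer` implementation; a real version would also own the mapped `VkBuffer` and grow it when an allocation fails.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: hands out aligned offsets into one large per-frame buffer.
class FrameBumpAllocator {
public:
    // Returned when the request does not fit and the buffer would need to grow.
    static constexpr std::size_t kAllocFailed = static_cast<std::size_t>(-1);

    explicit FrameBumpAllocator(std::size_t capacity) : capacity_(capacity) {}

    // Returns the offset of an aligned region of `size` bytes, or kAllocFailed.
    std::size_t allocate(std::size_t size, std::size_t alignment)
    {
        // Alignment must be a power of two (as Vulkan offset alignments are).
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t start = (offset_ + alignment - 1) & ~(alignment - 1);
        if (start > capacity_ || size > capacity_ - start) {
            return kAllocFailed;
        }
        offset_ = start + size;
        return start;
    }

    // Called once the frame's fence has signalled and its data is no longer read.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::size_t capacity_;
    std::size_t offset_ = 0;
};
```

With two frames in flight, each frame would own one such allocator and reset it only after waiting on that frame's fence, so the GPU never reads memory that is being overwritten.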
Include shadows.sdr unconditionally in main-v.sdr and main-f.sdr (was guarded by #ifndef VULKAN / #ifdef OPENGL). Add shadowUV[4] and shadowPos varyings to Vulkan's VertexOutput. Add shadow_map sampler to Vulkan's fragment declarations. Remove #ifndef VULKAN guard around forward shadow getShadowValue() call. Unify shadow depth write to use VARIANCE_SHADOW_SCALE_INV for both backends.
Write the real shadow map texture to Global Set 0 Binding 2 during model draw calls (was always fallback). Enables forward-pass shadows for the Vulkan backend.
Cloak effect: declare sFramebuffer (scene color copy) at Set 1 Binding 5 in the Vulkan model fragment shader. The texture was already bound by VulkanDraw. Lightshaft cockpit mask: declare cockpit sampler at texture array element 1 in Vulkan lightshaft shader. Currently samples a white fallback (no cockpit depth isolation yet), matching existing Vulkan behavior but unifying the shader code.
Cleaner, and avoids accidentally leaving holes
This is implemented differently in different places and was missing in others.
Replace the bare texture view getters with ready-made structures.
…terial The OpenGL backend checks that blending is disabled when rendering to the gbuffers. The Vulkan backend does not, so the check should happen when we set the model material.
…s, check depth mode for ZBUFFER_TYPE_FULL instead. Just checking for deferred rendering status messed up blending for transparency passes.
m_fillmode is set from gr_set_fillmode(), but it is usually only used to reset the fill mode state outside of the rendering backends, so we should use the material fill mode internally instead.
…clean-up of transparency geometry processing code. Temporarily(?) reduced alpha channel transparency geo pass threshold from 0.95f to 0.75f due to certain models (MVP 4.7.3 Triton) having the entire diffuse texture with alpha values less than 1.0f which ended up putting the entire geometry into the transparency pass. Somehow looked fine in OpenGL but we have to do this for Vulkan. For now?
…tices exceed the unsigned short maximum. Restored the original alpha threshold for transparent geo.
…when rendering shadow maps.
…when it should instead be resuming either the swapchain or scene framebuffer render pass. Fixes shadows when rendering directly to swapchain (techroom, mission briefings).
…e eShaderReadOnlyOptimal to match the scene render pass final layout.
…d number of color attachments when resuming the scene render pass.
…load. Instead of using the load scene render pass after shadow maps, create a resume scene render pass.
…ass. Turns out we did need to bind a deferred render pass after shadow maps. I made a grave assumption thinking vulkan_scene_texture_begin() only bound the scene texture but actually, it selectively binds based on if deferred lighting is enabled or not.
…tProcessing
refactor: modularize lighting and fog logic into standalone VulkanDeferredLighting and VulkanFog subsystems
refactor: modularize G-buffer and MSAA logic into VulkanDeferredGBuffer subsystem
refactor: modularize bloom logic into standalone VulkanBloom subsystem
refactor: modularize VulkanPostProcessing with self-contained shadow and distortion subsystems
refactor: centralize post-processing context and streamline resource management
refactor: texture upload logic and centralize mip-level calculations
Uses rsync to copy files with copy_file_to_target() on non-Windows platforms in order to preserve symlinks. The standard cmake method of file copying doesn't, which duplicates files, thereby making packaged builds larger than necessary.
- enable Vulkan renderer by default
- remove Vulkan SDK install from workflows
- bump prebuilt version
BMagnu left a comment
Check out SDR_FLAG_TONEMAPPING_LINEAR_OUT.
It's basically poor-man's HDR output in the OpenGL pass, currently only used for the OpenXR pass, since the headset swapchain seems to do its own tonemap, or at the very least expects non-SDR input (it's highly possible that proper HDR input is in fact the correct thing to forward here, instead of truly linear data).
This should very likely be merged into this work, even if it means supporting OpenGL to some degree, rather than exist as a weird, second, half-baked HDR pass. Especially since proper handling across not only gameplay but also menus will fix #7181.
I'm not yet fully familiar with most of the Vulkan PR, but I assume that before this change, there is no handling of Cmdline_window_res, which this new buffer could then do in the future as well? I also assume that this PR is going to be the proper place to mirror #7484 for the SDL3 upgrade. If so, we should at least prepare / design this in a way that makes it easy to retrofit.
Note that this is based on taylor's PR (#7553) and should only be merged afterwards.
Summary
Adds optional HDR10 (PQ / ST.2084 + BT.2020) swap chain output to the Vulkan renderer, with paper-white / peak-luminance controls and an in-game calibration screen. When disabled (the default) or when running on a non-HDR display / the OpenGL renderer, behavior is unchanged.
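For reference, the HDR10 encode chain named here (linearize, scale to nits, BT.709 to BT.2020, PQ) can be sketched on the CPU as below. The constants are the published ST.2084 and BT.2087 values; the function names, the `Rgb` struct, and the assumption that linear 1.0 maps to paper white are illustrative and not taken from the shaders in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Rgb { double r, g, b; };

// Extended sRGB decode: the sign is mirrored so values outside [0,1] survive.
inline double srgb_to_linear(double c)
{
    const double a = std::fabs(c);
    const double lin = a <= 0.04045 ? a / 12.92 : std::pow((a + 0.055) / 1.055, 2.4);
    return std::copysign(lin, c);
}

// Linear BT.709 primaries to linear BT.2020 primaries (ITU-R BT.2087 matrix).
inline Rgb bt709_to_bt2020(Rgb c)
{
    return Rgb{0.6274 * c.r + 0.3293 * c.g + 0.0433 * c.b,
               0.0691 * c.r + 0.9195 * c.g + 0.0114 * c.b,
               0.0164 * c.r + 0.0880 * c.g + 0.8956 * c.b};
}

// SMPTE ST.2084 inverse EOTF: absolute luminance in nits to a [0,1] signal.
inline double pq_encode(double nits)
{
    assert(!std::isnan(nits));
    const double m1 = 2610.0 / 16384.0;
    const double m2 = 2523.0 / 4096.0 * 128.0;
    const double c1 = 3424.0 / 4096.0;
    const double c2 = 2413.0 / 4096.0 * 32.0;
    const double c3 = 2392.0 / 4096.0 * 32.0;
    const double y = std::min(std::max(nits / 10000.0, 0.0), 1.0);
    const double yp = std::pow(y, m1);
    return std::pow((c1 + c2 * yp) / (1.0 + c3 * yp), m2);
}

// Assumption: linear 1.0 in the composition buffer corresponds to paper white.
inline Rgb encode_hdr10(Rgb srgb, double paper_white_nits)
{
    const Rgb lin{srgb_to_linear(srgb.r), srgb_to_linear(srgb.g), srgb_to_linear(srgb.b)};
    const Rgb wide = bt709_to_bt2020(lin);
    return Rgb{pq_encode(wide.r * paper_white_nits),
               pq_encode(wide.g * paper_white_nits),
               pq_encode(wide.b * paper_white_nits)};
}
```

Because each row of the primaries matrix sums to one, neutral greys stay neutral after the conversion, which makes a grey ramp a convenient sanity check for a calibration screen.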
What's included
Swap chain & metadata (`VulkanRendererSetup.cpp`)
- `VK_EXT_swapchain_colorspace` (instance) and `VK_EXT_hdr_metadata` (device, when available).
- `A2B10G10R10` swap chain with the `HDR10_ST2084` color space, falling back cleanly to SDR/sRGB otherwise.
- `VkHdrMetadataEXT` (paper white / peak luminance, BT.2020 primaries), re-applied on swap chain recreation (resize / fullscreen toggle).

Frame composition refactor (`VulkanRenderer.cpp`, `VulkanRendererLoop.cpp`)
- `RGBA16F` composition buffer at window resolution. All rendering (scene, UI, ImGui) now targets this buffer.

Tonemap / output shaders (`tonemapping-f.sdr`, `gamma.sdr`)
- `hdr_mode` uniform: `0` = existing SDR tonemap, `1` = HDR scene tonemap (exposure + headroom clamp relative to paper white, stored as extended sRGB in the fp16 composition buffer), `2` = HDR10 output encode (linearize, scale to nits, BT.709 -> BT.2020, PQ encode).
- `Scene_ldr`, `Scene_luminance` widen to fp16 when HDR is active so highlights above paper white survive.

Options (`2d.cpp` / `2d.h`)
- `Graphics.HDR` (bool, requires restart), `Graphics.HDRPaperWhite` (nits, live), `Graphics.HDRPeakLuminance` (nits, live), persisted via the options system.
- `Gr_hdr_output_active` reflects whether the renderer actually negotiated an HDR10 swap chain (distinct from the request flag).

Calibration screen (`ingame_options_ui.cpp` / `.h`)

OpenGL (`gropenglpostprocessing.cpp`)

Notes / limitations