Add Vulkan rendering backend #7233
|
Thanks for the PR! |
|
Okay, played around with it a little. |
|
I played around with it a little bit as well, and using nSight I could at least get as far as seeing that rendering a background with a skybox causes some amount of corruption to get into the main framebuffer - I haven't yet been able to see where it's coming from (as rendering a skybox should be one of the simpler things, just some basic geo and a couple textures, no lighting), but, well, there it is. |
i'm on the hard-light discord server i'm "mara" there. i'm not very active on discord, but happy to discuss.
To be honest i've only been replaying the retail campaign with it. So code paths not exercised there will be less (or even not) tested. Please let me know which mods this happens with!
Thanks for trying. i've installed NVidia Coresight but couldn't get it to report any issues in the level i tried (and i'd been using Vulkan's validation layer as well as RenderDoc during development and it should be clean). Can you send me the level file this happens with? And the messages that i should watch for? |
My testing was done using the latest mediaVPs mod, running both the first mission and using the lab environment. RenderDoc capture available here: https://drive.google.com/file/d/1ficdGUP-e8xfmUjZWAzWZmwpa9aAtFmt/view?usp=drive_link |
|
I was similarly running the MediaVPs' first mission (where I got the artifacts), and I was testing the "Icarus" Cutscene from Blue Planet (which crashes on trying to render the opening movie, skippable with |
|
I'm not in any position to ask, but instead of using the Vulkan lib and headers currently installed on the host system, maybe it's better to use a glad2 loader for Vulkan, in the same way as for OpenGL? |
|
Wow, nice work. Pretty straightforward design, nothing surprising. I have a local WIP DX12 implementation I've been working on and off on in my spare time for general practice and I see a lot of similar decisions you've made here in your VK implementation. I kind of wonder if we need to double buffer the immediate buffer so that we leave alone the one that's in-flight. But maybe it doesn't matter if the fence in the command buffer submission and flip takes care of everything. Or is it the buffer manager that keeps track of the frame num? Surprised that my batching code made it out intact. Also surprised that my render primitives immediate code also made it out intact. Sorry if it caused any headaches. |
|
I'll put this in here for reference in case anyone is interested. I tried to see if I could change it to use the glad2 loader instead. As I expected, since it is using vulkan.hpp, it is using the Vulkan C++ bindings, while the glad2 loader has the C bindings, exactly as with OpenGL. So it's not a huge amount of work to change it, but it is still considerable work to change all the bindings (around 2-3 days), and that is just to get it to compile again, without knowing whether it will still work after that. I also got the current PR version to compile for Android by just adding the missing .hpp Vulkan headers to the Android NDK. Not elegant, as I'm adding stuff to the toolchain, but it will do for now. However, it does not compile for 32 bits (x86/arm32), though it does for x86_64/arm64; not sure if this is also the case for regular builds. On my phone with a Mali-G57 On my Retroid G2 Handheld with a Qualcomm G2 and an Adreno 22 GPU, it fails to init Vulkan because it lacks a transfer queue. I guess it is VK_QUEUE_TRANSFER_BIT? So it's not completely 1.1; it uses an optional extension/feature. |
Thanks. The renderdoc capture should be helpful for reproduction.
It does. This is handled purely in the Vulkan layer. The buffer manager does a double buffering of all dynamic and streaming buffers in
Hahah it wasn't too bad!
Will look into it. It seems it would be way easier to vendor vulkan.hpp instead of switching to using C bindings, so i'll go for that first. |
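To make the double-buffering answer above concrete, here is a minimal CPU-only sketch of a per-frame bump allocator with two backing regions, so the region the GPU may still be reading is left alone while the other one is written. Every name in it (FrameBumpAllocator, beginFrame, allocate) is hypothetical and not taken from the PR, and a real renderer would wait on the frame's fence before reusing a region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: names and layout are not from the PR.
class FrameBumpAllocator {
public:
    static constexpr std::size_t kFramesInFlight = 2;
    static constexpr std::size_t kOutOfSpace = static_cast<std::size_t>(-1);

    explicit FrameBumpAllocator(std::size_t bytesPerFrame)
        : capacity_(bytesPerFrame), storage_(bytesPerFrame * kFramesInFlight) {}

    // Switch to the other region and reset its write cursor.
    // A real renderer would first wait on the fence guarding that region.
    void beginFrame() {
        frame_ = (frame_ + 1) % kFramesInFlight;
        cursor_ = 0;
    }

    // Returns a byte offset into the whole buffer, or kOutOfSpace when the
    // current region is full. `alignment` must be a power of two.
    std::size_t allocate(std::size_t size, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t aligned = (cursor_ + alignment - 1) & ~(alignment - 1);
        if (aligned + size > capacity_) return kOutOfSpace;
        cursor_ = aligned + size;
        return frame_ * capacity_ + aligned;
    }

    std::size_t currentFrame() const { return frame_; }

private:
    std::size_t capacity_;
    std::vector<unsigned char> storage_; // stands in for mapped buffer memory
    std::size_t frame_ = 0;
    std::size_t cursor_ = 0;
};
```

Because each frame writes only into its own region, data from the previous frame stays valid until that frame's fence has signaled. Whether the immediate buffer in the PR goes through the same path is not shown here.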
|
Okay. i've bundled the Vulkan and Vulkan-CPP headers in With this, it should be possible to build it on (or for) platforms without the Vulkan library and headers installed. |
|
Trying to get it to pass the CI now. Will squash all these changes into the main (or otherwise original) commit when done. |
|
i'm not happy where clang-tidy is taking some of these. It first wants to make these functions static (because it could), and now it wants to refer to them by fully qualified class name instead of instance:
- auto* texSlot = texManager->getTextureSlot(handle);
+ auto* texSlot = graphics::vulkan::VulkanTextureManager::getTextureSlot(handle);
- drawManager->stencilClear();
+ graphics::vulkan::VulkanDrawManager::stencilClear();
Which is strictly correct but it's also less readable, and asymmetric with the rest of the API. Will see if Edit: it did. Will look into rendering issues next. |
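A small standalone sketch of the two spellings discussed above, to show that they call the same function. TextureManager here is a hypothetical stand-in, not the PR's VulkanTextureManager. The clang-tidy checks that typically produce this pair of suggestions are readability-convert-member-functions-to-static and readability-static-accessed-through-instance; that these were the ones that fired on this PR is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for a manager class; not the PR's VulkanTextureManager.
struct TextureManager {
    // Touches no instance state, so it can be declared static.
    static int getTextureSlot(int handle) { return handle * 2; }
};

// Instance-style call: legal for a static member, and it reads like the
// rest of an instance-based API.
inline int slotViaInstance(TextureManager* mgr, int handle) {
    return mgr->getTextureSlot(handle);
}

// Qualified call: the same function, with no object expression involved.
inline int slotViaClass(int handle) {
    return TextureManager::getTextureSlot(handle);
}
```

Both helpers return the same value for the same handle, so the choice between them is purely about readability and API symmetry.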
e5c9a34 to 61f890e
|
Hi, Shivansps here, I'm on a different account. I think I know why it says there is no transfer queue on the Adreno driver. I think this if here is wrong: if (!values.transferQueueIndex.initialized && queue.queueFlags & vk::QueueFlagBits::eTransfer) { According to the documentation: "All commands that are allowed on a queue that supports transfer operations are also allowed on a queue that supports either graphics or compute operations. Thus, if the capabilities of a queue family include VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT, then reporting the VK_QUEUE_TRANSFER_BIT capability separately for that queue family is optional." eGraphics (and eCompute) queues all support transfer operations but may not report it. |
Good catch. Yes, the logic there is wrong. "It worked on NVidia" 😊 Will fix. Edit: Mind that the transfer queue is currently unused, as this makes the upload code simpler, due to there being no cross-queue synchronization requirement. In the current design there wouldn't be a benefit to using it, just overhead, as there's (AFAIK) no way to exploit parallelism here. So we could even decide to completely remove checking for it. |
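The rule quoted from the documentation can be sketched without the Vulkan headers as follows. The three constants mirror the values of VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT and VK_QUEUE_TRANSFER_BIT; the function names are hypothetical, and this is not necessarily how the PR ended up fixing the check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit values mirror VkQueueFlagBits from the Vulkan headers.
constexpr std::uint32_t kGraphicsBit = 0x1; // VK_QUEUE_GRAPHICS_BIT
constexpr std::uint32_t kComputeBit  = 0x2; // VK_QUEUE_COMPUTE_BIT
constexpr std::uint32_t kTransferBit = 0x4; // VK_QUEUE_TRANSFER_BIT

// Graphics and compute families accept transfer commands implicitly,
// even when the driver does not report the transfer bit for them.
inline bool supportsTransfer(std::uint32_t queueFlags) {
    return (queueFlags & (kGraphicsBit | kComputeBit | kTransferBit)) != 0;
}

// Returns the index of the first family usable for transfers, or -1 if none.
inline int findTransferFamily(const std::vector<std::uint32_t>& familyFlags) {
    for (std::size_t i = 0; i < familyFlags.size(); ++i) {
        if (supportsTransfer(familyFlags[i])) return static_cast<int>(i);
    }
    return -1;
}
```

A driver that exposes a single graphics family without the transfer bit, as described for the Adreno case, is accepted by this check where a test on the transfer bit alone would reject it.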
|
i've pushed a few rendering corruption fixes. Some wrong assumptions about renderpass state, and Vulkan vs GL differences. The cubemap corruption and random framebuffer noise should be solved now. |
|
Just reporting back here, the change to the transfer queue selection did work. Now the Adreno GPU works and can get into the game. |
|
Today i saw two things:
Changing to C++ types fixes 32-bit compilation: struct VulkanAllocation { While this compiles, it's not going to work, or it is going to have additional issues, as VulkanPipeline.cpp has shifts that go out of range for 32-bit types. |
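As an illustration of the kind of shift problem mentioned above: on x86/arm32, size_t is 32 bits wide, so shifting such a value by 32 or more is undefined behaviour. The sketch below widens explicitly to a 64-bit type before shifting. The key layout and the names packPipelineKey and shaderIdOf are hypothetical; the actual shifts in VulkanPipeline.cpp are not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical key layout: high 32 bits = shader id, low 32 bits = state bits.
inline std::uint64_t packPipelineKey(std::uint32_t shaderId, std::uint32_t stateBits) {
    // Widen before shifting: shifting a 32-bit value by 32 is undefined behaviour.
    return (static_cast<std::uint64_t>(shaderId) << 32) | stateBits;
}

// Recover the shader id from the high half of the key.
inline std::uint32_t shaderIdOf(std::uint64_t key) {
    return static_cast<std::uint32_t>(key >> 32);
}
```

Using fixed-width types keeps the result identical on 32-bit and 64-bit targets.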
|
While not an immediate priority, I'd love to question the following design goal: Long / Medium term, I would like for FSO to ship with shadertool or something to allow it to compile to SPIR-V itself. This gets rid of a lot of issues here. First, we'd be able to keep text-based shaders that can be dual-use for OpenGL and Vulkan. Any incompatibilities can just be put in preprocessor blocks like main-f's prereplace, allowing full dual-use of all shaders. Furthermore, it'd allow table-able postprocessing and shader changes. While currently a full shader replace is necessary for custom shaders, I eventually want this to be properly modular, so being able to modify parts of shaders is a goal, and that for sure requires compilation on-the-fly. Shipping with shadertool and then compiling on load (ideally after game-settings.tbl, especially since the recent Z-Compress changes) all available shader files to SPIR-V shouldn't be that hard either. |
|
@Shivansps @GamingCity @The-E @BMagnu |
|
Please, I don't want to make you waste time on Android testing; it's not even a working platform yet. I'll post if I find out something. If you want to see, I have [Fso_Android_Wrapper](https://github.com/Shivansps/Fso_Android_Wrapper) as the Android test app, Fso-Android-Prebuilts where I have the script and instructions to build the FSO dependencies and FSO itself, and an "android-build-vulkan" branch on my fork where I added this PR to my previous Android work. I did find one problem with Android in VulkanRenderer It seems that if you leave at that and use I changed it to this, which did work: auto supported = deviceValues.surfaceCapabilities.supportedTransforms; I don't know if that's the right fix; it does not seem to do anything on Windows. Keep in mind I used an AI to point me to this and the potential fix, as I did not know if anything in Vulkan could cause this. It told me to check where the preTransform and surface capabilities are set for the transform, and that I should use the eIdentity flag. https://docs.vulkan.org/refpages/latest/refpages/source/VkSurfaceTransformFlagBitsKHR.html |
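A common way to pick a swapchain preTransform is to prefer the identity transform when the surface supports it and otherwise fall back to the surface's current transform. The sketch below shows that selection on plain bitmasks. The constants mirror VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR and VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR; the function name is hypothetical, and since the code in the comment above is truncated, this is not necessarily the exact change that was made.

```cpp
#include <cassert>
#include <cstdint>

// Bit values mirror VkSurfaceTransformFlagBitsKHR.
constexpr std::uint32_t kIdentityBit = 0x1; // VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR
constexpr std::uint32_t kRotate90Bit = 0x2; // VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR

// Prefer identity when the surface supports it; otherwise keep whatever
// transform the surface currently reports.
inline std::uint32_t choosePreTransform(std::uint32_t supportedTransforms,
                                        std::uint32_t currentTransform) {
    return (supportedTransforms & kIdentityBit) ? kIdentityBit : currentTransform;
}
```

On desktop platforms the current transform is usually already identity, which would be consistent with the observation that the change has no visible effect on Windows.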
|
Fair enough re: compiling shaders and large dependencies, but I think it is worth it here. |
|
I may have seen that issue with the Triton on a Radeon iGPU in the past, but that was around 2 years ago when BTA 2 first came out. It would happen with some models but not with others. |
|
Bashed my head against RenderDoc this entire week but this should fix corrupted geometry for real: 77610a8 It looks like it was a matter of checking the transparency buffer indexed vertices count to make sure we weren't exceeding the unsigned short maximum. |
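For context on the unsigned short limit mentioned above: a 16-bit index can address vertices 0 through 65535, so a batch with more vertices than that needs 32-bit indices (or has to be split). The sketch below shows one way to express the check. The names are hypothetical, and the actual fix in 77610a8 may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

enum class IndexWidth { U16, U32 };

// With 16-bit indices the largest addressable vertex is 65535, so at most
// 65536 vertices fit. Larger batches need 32-bit indices.
inline IndexWidth chooseIndexWidth(std::size_t vertexCount) {
    constexpr std::size_t kMaxU16Vertices =
        std::size_t{std::numeric_limits<std::uint16_t>::max()} + 1;
    return vertexCount <= kMaxU16Vertices ? IndexWidth::U16 : IndexWidth::U32;
}
```

If an index silently wraps at 16 bits, it points at the wrong vertex, which typically shows up as stretched or corrupted geometry.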
|
@SamuelCho Yep, that fixed it! 👍 |
|
I think this fixes shadows: SamuelCho@ecc1507 The Vulkan backend tried to directly set the shadow viewport and scissor regions to the command buffer but it would get overridden by the material pipeline config when setting the model material. I guess Claude didn't think it needed the state tracker for viewport and scissor states for shadow map passes. Along with that, the triangle winding order also needed to be reversed for shadows so I put in a little exemption for shadow map rendering when rendering models. I don't think we had to do this in OpenGL so this may be a temp fix until we figure out some more comprehensive way to account for the differences as before. Shadows still don't work in the tech room BTW. There's a hard coded assumption in the Vulkan shadow start function that assumes we require g-buffers to render shadows which is also an incorrect assumption Claude made that needs to be fixed as we still use forward rendering in places. That's going to be the next thing I'll be working on. |
|
Shadows are working for me now in the lab and in mission. 👍 I do get broken shadows in the tech room though. It appears to happen when you view a ship in the lab (Fenris in this case, didn't try others) and then go to the tech room. Any ship that you view will have a fixed position shadow over it that generally matches the shadow cast by the ship you viewed in the lab. (Using shadow quality of medium, in case that makes a difference) I only mention it because I assumed it wouldn't work at all, but then saw a ship half-covered by a shadow. I'm guessing that will be a non-issue when you get tech room shadows fixed but thought I'd point it out in case it indicates a state/rendering bug that would otherwise be hidden. |
|
This should now fix techroom and mission briefing shadows: SamuelCho@c21dbb1 Not the prettiest fix but Vulkan's render pass system makes binding render targets a bit more complicated. It isn't as simple as the push/pop framebuffer state tracking we had for the OpenGL side. So, combined with Claude's understandable generalizations, there's some unnecessary render pass binds happening that need to be looked at again. But at least this gets us closer to stable. |
|
That works great in the techroom and briefings, but it's crashing for me otherwise (lab and in-mission). I am on a Mac, so it's doing Vulkan->Metal translation, and that might be triggering the error. The backtrace certainly seems to indicate that the assertion is in the Metal side. I'll try to confirm that same behavior on a Linux box when I get time. |
|
Hmm, that's too bad. I did notice a discrepancy with the scene render pass load that gets bound after shadow pass rendering. Maybe this'll do the trick? SamuelCho@6189228 |
|
Sorry, the actual assertion message would have been super useful I think. I'd like to blame lack of sleep for not including that, but honestly I just missed it. |
|
Finally got the chance to test on my Linux box as well. Tech room and loadout looks fine, and while it doesn't crash, rendering is completely broken in the lab and mission and models are generally black with some odd sparkles or something visible as you move the models around. If shadows are disabled then rendering is fine on Linux, and it stops crashing on Mac. |
|
Okay, that assertion is what I needed to see. We weren't updating the number of color attachments in the state trackers so let's see if this fixes it: SamuelCho@2b714c6 |
|
No change on either platform I'm afraid. I'm including the debug log from the Mac this time, and it stops at the assertion (which isn't logged). The one on Linux is functionally the same, but continues on past that point without any error or warning messages, and without properly rendering any models. My graphics knowledge pretty much stopped at OpenGL 2, so I realize I'm not much help here. But if there is something you'd like me to try locally or that I should look for let me know. Or just send me a patch with a bunch of printf's to get more info, if that helps. |
|
I finally realized that I wasn't testing with Vulkan validation layers turned on. So hopefully this should let us catch some errors that I normally wouldn't see on my own machine. For posterity, the -gr_debug commandline argument enables the validation layers. Though, I didn't see anything get reported to the log but I was able to see the errors in RenderDoc once I made a capture. Long story short, I made a whoopsy thinking that I could change the expected layout for the Load scene render pass to use after the shadow map pass. I changed it back to normal and made a new render pass for the purposes of resuming after shadow map rendering. I hope this solves the problem. Hopefully the validation layer caught all the problems across platforms. SamuelCho@1305c14 |
|
Still no change. I thought that maybe I just had something in my tree that was b0rked, but gave a Windows test build to someone else and they said it worked fine. So for whatever reason it works on Windows but not Mac or Linux. Keeping in mind that I have no real understanding of this code, nor the effect of any changes made, I made some incremental changes through trial and error. These adjustments got it all working for me: vk_mac_test.patch. Perhaps you can spot some difference there to help narrow down the problem. For reference, if I disable validation and just let it get funky with the current broken behavior, it renders like this (but animated/flashing): |
|
I tried this following build here on Windows, based on the branch Running with shadows enabled in retail, plus the |
|
Update, got a build based on current master and lab now properly renders. |
|
The |
|
That patch also renders just fine on my machine, but unfortunately it threw a validation error:
It was still useful to use as a reference so thanks for that. I tried to compare the render pass info in that gbuffer pass with the scene render pass resume. I think this was the only discrepancy FWIW: SamuelCho@63c7b1e |
|
Edit, given I was using Taylor's custom branch and the |
|
Alright so I made an idiotic mistake in not double checking to see what vulkan_scene_texture_begin() did before doing this scene render pass nonsense. I assumed scene_texture_begin() always just bound the scene framebuffer but no, it actually binds a g-buffer framebuffer if deferred lighting is on. It only binds the scene framebuffer if deferred lighting is off. The latest changes shared here definitely helped me figure out my mistake so thank you for that. So in actuality, after shadow map pass, we have to see if there are three potential framebuffers we need to rebind. I assumed it was just the swapchain and scene framebuffer but we actually need to make sure to check if the g-buffer needs to be bound. So sorry for the wild goose chase: SamuelCho@effb90a |
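The rebind decision described above can be summarized as a three-way choice. The sketch below is a hypothetical model of it: the type and field names are invented for illustration, and the rule that the swapchain is resumed when no scene rendering is active is an assumption, not something confirmed by the commit.

```cpp
#include <cassert>

enum class RenderTarget { Swapchain, Scene, GBuffer };

// Hypothetical snapshot of what was bound before the shadow pass started.
struct PreShadowState {
    bool sceneRenderingActive; // the scene texture pass had been started
    bool deferredLightingOn;   // the deferred path binds the g-buffer instead
};

// Decide which framebuffer to rebind once the shadow map pass has finished.
inline RenderTarget targetToResume(const PreShadowState& s) {
    if (!s.sceneRenderingActive) return RenderTarget::Swapchain;
    return s.deferredLightingOn ? RenderTarget::GBuffer : RenderTarget::Scene;
}
```

Handling only the swapchain and scene cases corresponds to the bug described in the comment: with deferred lighting on, the wrong framebuffer is resumed after the shadow pass.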
|
YES!!!! 👍 That got it working for me! Tested on Mac and Linux with the same results. Great work! The MSAA glitch is still present, but it's possible that it's unrelated and may well have been there for a while. I'll try to bisect that at some point this next week and see what I can find. |
|
Posting this here for the record and for anyone that might be following along: I created some new test builds based on current master (26.0.0-RC3) with all of the Vulkan changes. These are just for Win64 and macOS arm64, where the most testing can be done and the performance improvements can be seen. There is still the odd texture corruption bug in some missions but so far it's been fully playable for me. Windows x64: https://pxo.nottheeye.com/files/test/fs2open/vulkan-test-Win64.zip My test branch is at https://github.com/notimaginative/fs2open.github.com/tree/vulkan-pr-FIXES in case you want to build for your own testing or another platform. You'll need to have the vulkan sdk installed (or relevant packages from your package manager), or use the scp-prebuilt Vulkan PR with |
|
Guys, can we create a new PR that is based on taylor's branch? That would make getting it ready for merge easier. |
|
Maybe taylor should make the new PR since he already has a branch rebased with his and my changes? |
|
Yeah I was planning to do that since laanwj appears to be MIA. I was hoping to get the vulkan prebuilt PR merged first though so that I won't have to rebase again to get the PR checks working. |
|
That's entirely reasonable |
|
This has been superseded by #7553.








Implement a Vulkan 1.1 renderer that replaces the previous stub with a fully functional backend, mostly matching the OpenGL backend's rendering capabilities. The game should be playable with minimal divergence from OpenGL rendering.
This is, most likely, too big to go in all at once, but just filing it here for reference because it's reached a testable state.
Core rendering infrastructure. The code lives under code/graphics/vulkan:
VulkanMemory: Custom allocator with sub-allocation from device-local and host-visible memory pools
VulkanBuffer: Per-frame bump allocator for streaming uniform/vertex/index data (persistently mapped, double-buffered, auto-growing)
VulkanTexture: Full texture management including 2D, 2D-array, 3D, and cubemap types with automatic mipmap generation and sampler caching
VulkanPipeline: Lazy pipeline creation from hashed render state, with persistent VkPipelineCache
VulkanShader: SPIR-V shader loading (main, deferred, effects, post-processing, shadows, decals, fog, MSAA resolve, etc.)
VulkanDescriptorManager: 3-set descriptor layout (Global/Material/PerDraw) with per-frame pool allocation, auto-grow, and batched updates
VulkanDeletionQueue: Deferred resource destruction synchronized to frame-in-flight fences
Design choices:
waitIdle or other CPU-on-GPU blocking in hot path
Some notable Vulkan vs OpenGL differences are:
Preparation patches to common game code (these commits need to go in first):
SCREEN_POS vertex format: Cleanup after previous commit
void *, const void*
What's possibly left to be done:
Unify OpenGL and Vulkan shaders where possible: the only shader shared with OpenGL (defined in the build system's SHADERS_GL_SHARED) is still the default material. Although the Vulkan backend does some things differently, it would definitely be possible to share more code. But i didn't want to accidentally break OpenGL in some way.
Integrate VMA (Vulkan Memory Allocator). Some of the memory handling could be simplified by importing this dependency.
OpenXR anything. This is currently not implemented at all.
Build steps:
To run (with maximum debugging and Vulkan layer validation):
Full disclosure: i used Claude Opus 4.6 while developing this. However, the overall direction and design is my own, and i've paid careful attention to the code.
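As a reference for the VulkanDeletionQueue entry in the component list above, here is a minimal sketch of deferred destruction keyed to completed frames. It is a hypothetical model and not the PR's implementation: the class and method names are invented, and "completed frame" stands in for a frame whose in-flight fence has signaled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical sketch; not the PR's VulkanDeletionQueue.
class DeletionQueue {
public:
    // Record a destroy callback tagged with the frame that last used the resource.
    void enqueue(std::uint64_t lastUsedFrame, std::function<void()> destroy) {
        pending_.push_back({lastUsedFrame, std::move(destroy)});
    }

    // Run every callback whose frame the GPU has finished (its fence signaled).
    // Assumes entries were enqueued in non-decreasing frame order.
    std::size_t collect(std::uint64_t completedFrame) {
        std::size_t destroyed = 0;
        while (!pending_.empty() && pending_.front().frame <= completedFrame) {
            pending_.front().destroy();
            pending_.pop_front();
            ++destroyed;
        }
        return destroyed;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    struct Entry {
        std::uint64_t frame;
        std::function<void()> destroy;
    };
    std::deque<Entry> pending_;
};
```

Calling collect once per frame with the index of the last frame known to be finished releases resources without a device-wide wait, in line with the goal of avoiding CPU-on-GPU blocking in the hot path.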