Optimizing and Sharing Shader Structures

24 August 2023

Reasoning

Here's an example of real code, exposing options to a post-processing stage.

layout(push_constant) uniform PushConstant {
    vec4 viewport;
    vec4 options;
    vec4 transform_ops;
    vec4 ao_options;
    vec4 ao_options2;
    vec4 proj_info;
    mat4 cameraProj;
    mat4 invProj;
};

Even for the person who wrote this code, it's hard to tell what each option does at a glance. This is a great way to create bugs, since it's extremely easy to mix up accessors like ao_options.x and ao_options.y. Ideally, we want these options to be separated, but there's a reason why they're packed in the first place.

Alignment rules

Say you're beginning to explore Phong shading, and you want to expose a position and a color property so you can change them while the program is running. In a 3D environment there are three axes (X, Y and Z), so naturally the position must be a vec3. It also makes sense for the light color to be a vec3: light emitted from a source can't really be "transparent", so we don't need the alpha channel. The GLSL code so far looks like this:

#version 430

out vec4 finalColor;

layout(binding = 0) buffer block {
    vec3 position;
    vec3 color;
} light;

void main() {
    const vec3 dummy = vec3(1) - light.position;
    finalColor = vec4(vec3(1.0, 1.0, 1.0) * light.color, 1.0);
}

(There's no Phong formula here; we only want to make sure the GLSL compiler doesn't optimize anything out.)

When writing the structure on the C++ side, you might write something like this:

struct Light {
    glm::vec3 position;
    glm::vec3 color;
} light;

light.position = {1, 5, 0};
light.color = {3, 2, -1};

For this example I used the debug printf system, which is part of the Vulkan SDK, to confirm the exact values the shader sees. The output is as follows:

Position = (1.000000, 5.000000, 0.000000)
Color = (2.000000, -1.000000, 0.000000)

As you can see, the first value of color is getting chopped off when reading it in the shader. The usual solution to the problem is to use a vec4 instead:

struct Light {
    glm::vec4 position;
    glm::vec4 color;
};

And to confirm, this does indeed fix the issue:

Position = (1.000000, 5.000000, 0.000000)
Color = (3.000000, 2.000000, -1.000000)

But why does it work when we change it to a vec4? This section from the Vulkan specification spells it out for us:

The base alignment of the type of an OpTypeStruct member is defined recursively as follows:

A scalar has a base alignment equal to its scalar alignment.
A two-component vector has a base alignment equal to twice its scalar alignment.
A three- or four-component vector has a base alignment equal to four times its scalar alignment.

The third bullet point hits the nail on the head: vec4 and vec3 have the same alignment! An alternative solution could be to use alignas:

struct Light {
    glm::vec3 position;
    alignas(16) glm::vec3 color;
};

There are plenty more nitty-gritty alignment issues that stem from differences between C++ and GLSL, and this is one of those cases. In my opinion, this shouldn't be necessary for the programmer to handle themselves.
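One way to catch this kind of mismatch early is to check the C++ layout at compile time. The following is a minimal sketch, not part of the original code: Vec3 is a hypothetical stand-in for glm::vec3 (three tightly packed floats), and base_alignment and align_up are hypothetical helper names that encode the three rules quoted above for float scalars and vectors.

```cpp
#include <cstddef>

// Hypothetical stand-in for glm::vec3: three tightly packed floats.
struct Vec3 { float x, y, z; };

// Base alignment in bytes of a float scalar or vector, per the quoted rules:
// scalar = 4, two components = 8, three or four components = 16.
constexpr std::size_t base_alignment(std::size_t components) {
    return components == 1 ? 4 : components == 2 ? 8 : 16;
}

// Round an offset up to the next multiple of the given alignment.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Where the shader expects `color`: after the 12-byte position, aligned to 16.
constexpr std::size_t shader_color_offset =
    align_up(sizeof(Vec3), base_alignment(3));

// The naive C++ struct: color starts right after position.
struct LightPacked {
    Vec3 position;
    Vec3 color;
};

// The alignas variant: color is pushed to the next 16-byte boundary.
struct LightAligned {
    Vec3 position;
    alignas(16) Vec3 color;
};

static_assert(shader_color_offset == 16, "shader reads color at byte 16");
static_assert(offsetof(LightPacked, color) == 12, "C++ packs color at byte 12");
static_assert(offsetof(LightAligned, color) == shader_color_offset,
              "alignas(16) matches the shader layout");
```

With the naive struct the shader starts reading color four bytes too late, which is why the first component went missing in the output above. A check like this turns that silent mismatch into a build error.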

Passing booleans

Another example of esoteric shader rules is when you try passing booleans. Take a look at this C++ structure, which seems okay at first glance:

struct TestBuffer {
    bool a = false;
    bool b = true;
    bool c = false;
    bool d = true;
};

And this is how it's defined in GLSL:

layout(binding = 0) buffer readonly TestBuffer {
    bool a, b, c, d;
};

When sent to the shader, the values of the structure end up like this:

a = 1, b = 0, c = 0, d = 0

This is because SPIR-V doesn't seem to define a physical size for bool, so it could be represented as anything (like an unsigned integer). In this case, you actually want to define them as integers:

layout(binding = 0) buffer readonly TestBuffer {
    int a, b, c, d;
};

This is a little disappointing, because the semantic meaning of a boolean option is lost when you declare them as integers. You can also pack a lot of booleans into the space of one 32-bit integer, which could be a space-saving optimization in the future.
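The C++ side has to change as well, since a C++ bool is typically one byte while each GLSL int takes four. Here is a minimal sketch of both ideas, assuming hypothetical names (TestBufferInts, pack_flags, flag_set) that do not appear in the original code:

```cpp
#include <cstdint>

// C++ mirror of the GLSL block that declares `int a, b, c, d;`.
// Each flag occupies a full 32-bit integer so the offsets match the shader.
struct TestBufferInts {
    std::int32_t a = 0;
    std::int32_t b = 1;
    std::int32_t c = 0;
    std::int32_t d = 1;
};
static_assert(sizeof(TestBufferInts) == 16, "four 32-bit integers, no padding");

// Hypothetical helper for the bit-packing idea: flag N is stored in bit N.
constexpr std::uint32_t pack_flags(bool a, bool b, bool c, bool d) {
    return (a ? 1u : 0u) | (b ? 2u : 0u) | (c ? 4u : 0u) | (d ? 8u : 0u);
}

// Read a single flag back out of the packed integer.
constexpr bool flag_set(std::uint32_t packed, unsigned index) {
    return ((packed >> index) & 1u) != 0u;
}
```

The packed variant would need a matching bit test on the GLSL side, and it trades readability for space, which is part of the motivation for generating this kind of code rather than writing it by hand.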

Sharing structures

The last problem is keeping the structures in sync. There's usually one instance of the structure written in C++ and many copies in GLSL shaders. This is problematic because member order could change, so parts of the structure itself could be undefined and can easily escape notice. Having one definition for all shaders and C++ would be a huge improvement!

Struct compiler

What I ended up with is a new pre-processing step, which I called the "struct compiler". I searched the Internet to see if someone had already made a tool like this, but couldn't find much; maybe shader reflection is more popular. I learned a lot from making this tool anyway. Its main goals are:

Define the shader structures in one, centralized file.
Structures should be written at a higher level, which decouples the actual member order, alignment and packing from the logic. This lets the compiler optimize the structure in the future, maybe beyond what we can reasonably write by hand.
The structure is usable in GLSL and C++.

First you write a .struct file, describing the required members and their types. Here's the same post-processing structure shown at the beginning, but now written in the compiler's custom syntax:

primary PostPushConstant {
    viewport: vec4
    camera_proj: mat4
    inv_proj: mat4
    inv_view: mat4

    enable_aa: bool
    enable_dof: bool

    exposure: float
    display_color_space: int
    tonemapping: int

    ao_radius: float
    ao_r2: float
    ao_rneginvr2: float
    ao_rdotvbias: float
    ao_intensity: float
    ao_bias: float
}

This looks much better, doesn't it? Even without knowing anything else about the actual shader, you can guess which options do what with some accuracy. Here's what it might look like, compiled to C++:

struct PostPushConstant {
    glm::mat4 camera_proj;
    glm::mat4 inv_proj;
    glm::mat4 inv_view;
    glm::vec4 viewport;
    glm::ivec4 enable_aa_enable_dof_display_color_space_tonemapping_;
    glm::vec4 exposure_ao_radius_ao_r2_ao_rneginvr2_;
    glm::vec4 ao_rdotvbias_ao_intensity_ao_bias_;
    ...
};

(Setters like set_exposure() are used instead of accessing the glm::vec4 members manually.)
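The combined member names hint at how scalars of the same type could be grouped into four-component slots. The following is a rough sketch of that grouping, not the tool's actual implementation (which is not shown here): pack_member_names is a hypothetical helper, and the naming scheme (each member name followed by an underscore) is inferred from the generated struct above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Group scalar member names into slots of `slot_width` members and build
// one combined name per slot, each member name followed by an underscore.
std::vector<std::string> pack_member_names(const std::vector<std::string>& members,
                                           std::size_t slot_width = 4) {
    assert(slot_width > 0);  // a slot must hold at least one member
    std::vector<std::string> slots;
    for (std::size_t i = 0; i < members.size(); ++i) {
        if (i % slot_width == 0) {
            slots.emplace_back();  // start a new vec4-sized slot
        }
        slots.back() += members[i] + "_";
    }
    return slots;
}
```

Feeding it the seven float options from the .struct file, in the order exposure, ao_radius, ao_r2, ao_rneginvr2, ao_rdotvbias, ao_intensity, ao_bias, yields the two vec4 names seen in the generated struct. A real packer would also have to sort members by type and alignment first.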

I hooked the generation step into my build system so it runs automatically; all you need to do is include the auto-generated header. To use the structure in GLSL, I created a new directive. The same system that generates the C++ headers also generates the GLSL version of the structure, which is inserted wherever this directive is found:

#use_struct(push_constant, post, post_push_constant)

(The syntax could use some work, but the first argument is the usage, and the second argument is the name of the struct. The third argument is a unique name for the instance.)
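To make the directive's shape concrete, here is a minimal sketch of how a pre-processor might recognize it. This is an assumption about one possible approach, not the tool's real parser; UseStructDirective, trim and parse_use_struct are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// The three arguments of a #use_struct directive, as described above.
struct UseStructDirective {
    std::string usage;     // e.g. push_constant
    std::string name;      // name of the struct
    std::string instance;  // unique name for the instance
};

// Strip leading and trailing spaces and tabs.
std::string trim(const std::string& text) {
    const std::size_t first = text.find_first_not_of(" \t");
    if (first == std::string::npos) {
        return "";
    }
    const std::size_t last = text.find_last_not_of(" \t");
    return text.substr(first, last - first + 1);
}

// Parse one shader source line. Returns std::nullopt unless the line is a
// #use_struct directive with exactly three non-empty arguments.
std::optional<UseStructDirective> parse_use_struct(const std::string& line) {
    const std::string prefix = "#use_struct(";
    const std::string text = trim(line);
    if (text.size() <= prefix.size() ||
        text.compare(0, prefix.size(), prefix) != 0 || text.back() != ')') {
        return std::nullopt;
    }
    assert(text.size() > prefix.size());  // invariant from the check above
    const std::string args =
        text.substr(prefix.size(), text.size() - prefix.size() - 1);

    // Split the argument list on commas.
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = args.find(',', start);
        if (comma == std::string::npos) {
            parts.push_back(trim(args.substr(start)));
            break;
        }
        parts.push_back(trim(args.substr(start, comma - start)));
        start = comma + 1;
    }
    if (parts.size() != 3) {
        return std::nullopt;
    }
    for (const std::string& part : parts) {
        if (part.empty()) {
            return std::nullopt;
        }
    }
    return UseStructDirective{parts[0], parts[1], parts[2]};
}
```

A pre-processor built on this would copy every other line through unchanged and replace each recognized directive with the generated GLSL for the named struct.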

Since the member order and names are undefined, you must access the members by a setter/getter in GLSL and C++. I think this is a worthwhile trade-off for readable code.

vec3 ao_result = pow(ao, vec3(ao_intensity()));
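On the C++ side, the generated accessors could look roughly like the sketch below. It is an illustration under stated assumptions, not the tool's actual output: Vec4 is a hypothetical stand-in for glm::vec4, only two of the options are shown, and the component indices assume that the order of names in a combined member matches the component order.

```cpp
// Hypothetical stand-in for glm::vec4 so the sketch has no dependencies.
struct Vec4 { float v[4]; };

struct PostPushConstant {
    Vec4 exposure_ao_radius_ao_r2_ao_rneginvr2_{};
    Vec4 ao_rdotvbias_ao_intensity_ao_bias_{};

    // Accessors hide which slot and component each option lives in.
    constexpr float exposure() const {
        return exposure_ao_radius_ao_r2_ao_rneginvr2_.v[0];
    }
    constexpr void set_exposure(float value) {
        exposure_ao_radius_ao_r2_ao_rneginvr2_.v[0] = value;
    }

    constexpr float ao_intensity() const {
        return ao_rdotvbias_ao_intensity_ao_bias_.v[1];
    }
    constexpr void set_ao_intensity(float value) {
        ao_rdotvbias_ao_intensity_ao_bias_.v[1] = value;
    }
};

// Calling code never touches the packed members directly.
constexpr PostPushConstant example_options() {
    PostPushConstant options;
    options.set_exposure(1.5f);
    options.set_ao_intensity(2.0f);
    return options;
}

static_assert(example_options().exposure() == 1.5f, "exposure round-trips");
static_assert(example_options().ao_intensity() == 2.0f, "ao_intensity round-trips");
```

Because callers only use the accessors, the compiler stays free to move an option to a different slot or component without breaking any code that uses the structure.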

This tool runs as a pre-processing step offline, before shader compilation begins. The tool's source code is available here, which is taken from one of my personal projects. It's quickly written and I don't recommend using it directly, but I'm confident that this idea is worth pursuing.



Joshua Goins

Software Engineer

Joshua Goins is a Software Engineer at KDAB
