Holoplot UX - Example of modern user interface that could benefit from compressed textures. (KDAB designed UX, photo courtesy of Holoplot.)
It’s every programmer’s worst nightmare. Your beautiful app is running at a snail’s pace, crippled by virtual memory swapping. Even worse, you’ve added one last bitmap resource, and suddenly unrelated chunks of the UX aren’t showing up!
Desktop machines have become powerful enough that programmers rarely worry about performance issues, and today’s embedded systems usually have enough horsepower at their disposal. But in constructing sophisticated user interfaces with more and more elaborate imagery and high-definition backgrounds, we are often sapping one resource that’s always scarce: RAM. That’s especially true for the dedicated video RAM (VRAM) that sits on the GPU.
With ever-increasing screen densities, it’s easy to see why VRAM is in such demand. An iPad with a Retina display is 2048x1536 pixels, and at 4 bytes per pixel, that’s a whopping 12MB of RAM (12,582,912 bytes) for just a static background image. Toss in a few more large bitmaps, masks, or textures, and you can be consuming dozens or hundreds of megabytes of VRAM, not to mention regular RAM.
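The arithmetic behind that figure is easy to reproduce. The sketch below is illustrative only: `uncompressedBytes` and `toMiB` are hypothetical helper names, not part of Qt, and the 4 bytes per pixel corresponds to 32-bit RGBA.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to hold a fully expanded image in memory.
// bytesPerPixel defaults to 4 (32-bit RGBA).
std::size_t uncompressedBytes(std::size_t width, std::size_t height,
                              std::size_t bytesPerPixel = 4)
{
    assert(bytesPerPixel > 0);
    return width * height * bytesPerPixel;
}

// Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
double toMiB(std::size_t bytes)
{
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

Calling `uncompressedBytes(2048, 1536)` gives 12,582,912 bytes, and `toMiB` of that value is exactly 12.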
“Not me! All my images are compressed JPEGs or PNGs,” you say. Yes, but Qt Quick 2 (QQ2) uncompresses images on loading them, so your svelte, skinny PNGs uncompress into giant blocks of RAM on program start. And those bitmaps take an equivalent bite out of VRAM when they’re displayed. Making one-bit deep masks or using a 16- or 8-bit color depth doesn’t really help, as everything gets expanded out into 32-bit RGBA on loading.
What’s an enterprising Qt Quick 2 programmer to do? Take a tip from our game engine friends, that’s what. Modern video games load and manage hundreds of large, high-quality textures, and they get away with it by using custom compressed textures. The textures are directly loaded off disk and left in a compressed format that the GPU can understand. These formats aren’t JPEG or PNG—they’re highly specialized formats that need a time-consuming conversion process. But the GPU can directly read those formats and uncompress them to the display on-the-fly, meaning that your RAM (and VRAM) only takes the hit for the compressed sizes, not the fully expanded size. That can result in dramatic savings of both RAM and VRAM.
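To get a feel for the size of the compressed data, here is a rough sketch that assumes ETC1, one common GPU format that stores every 4x4 pixel block in 8 bytes. Both the choice of ETC1 and the helper name `etc1Bytes` are assumptions made for illustration; other formats use different block sizes, and the real savings depend on the format and on how the driver stores the texture.

```cpp
#include <cassert>
#include <cstddef>

// Size in bytes of an ETC1-compressed image (assumed format for this sketch).
// ETC1 encodes each 4x4 block of pixels in 8 bytes, so partial blocks at the
// right and bottom edges are rounded up to whole blocks.
std::size_t etc1Bytes(std::size_t width, std::size_t height)
{
    assert(width > 0 && height > 0);
    const std::size_t blocksX = (width + 3) / 4;
    const std::size_t blocksY = (height + 3) / 4;
    return blocksX * blocksY * 8;
}
```

For a 2048x1536 image, `etc1Bytes` returns 1,572,864 bytes (1.5MB), one eighth of the 32-bit RGBA size. Note that ETC1 carries no alpha channel, so images that need transparency would call for a different format.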
We don’t want to reinvent all of Qt’s display machinery, so unless we can convince Qt to use compressed textures, any possible savings would be academically nice, but practically impossible. Fortunately, through the QOpenGLTexture class (a KDAB contribution, btw), QQ2 provides all the necessary APIs to let us change the underlying behavior to use compressed textures without mucking around in the internals.
Overriding the QSGTexture class
The first step is to compress the images using a GPU-friendly compression scheme. Unfortunately, there are a number of these formats, many of them proprietary and not well documented, and they aren’t readily exportable from common image-editing tools like GIMP or Photoshop. Thankfully, ARM includes a great, freely available tool in its ARM Mali visual technology suite that handles a number of the most commonly encountered compression formats: the Mali GPU Texture Compression Tool. Using it to compress some sample images at the highest quality setting took over 30 minutes on a decent machine, so you may want to settle for slightly less than perfection!
Loading compressed NVIDIA GPU PKM files for the texture provider
Once you have your images compressed in a GPU-digestible way, the QQ2-related code is pretty straightforward, with QOpenGLTexture doing most of the heavy lifting. Here’s sample code (including all the assorted code snippets) with the tweaks you’ll need. Note that this sample code is a proof of concept rather than a complete, runnable program, so #include <all standard disclaimers> …
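Before a compressed file can be handed to the GPU, its header has to be read to find the texture dimensions. The following is a minimal sketch of that step for PKM files, not the code from the sample. It assumes the commonly documented 16-byte PKM header layout (the magic "PKM ", a two-character version, then five big-endian 16-bit fields), and the names `PkmHeader`, `parsePkmHeader`, and `pkmPayload` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Fields of a PKM header (assumed 16-byte layout, 16-bit values big-endian).
struct PkmHeader {
    char version[2];            // for example '1','0'
    std::uint16_t dataType;     // texture format code stored in the file
    std::uint16_t paddedWidth;  // width rounded up to a multiple of 4
    std::uint16_t paddedHeight; // height rounded up to a multiple of 4
    std::uint16_t width;        // original image width
    std::uint16_t height;       // original image height
};

constexpr std::size_t kPkmHeaderSize = 16;

// Read a big-endian 16-bit value from two consecutive bytes.
static std::uint16_t readBigEndian16(const std::uint8_t *p)
{
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Fill 'out' from the start of 'file'. Returns false if the data is too
// short, lacks the "PKM " magic, or has inconsistent dimensions.
bool parsePkmHeader(const std::vector<std::uint8_t> &file, PkmHeader &out)
{
    if (file.size() < kPkmHeaderSize)
        return false;
    const std::uint8_t *p = file.data();
    if (std::memcmp(p, "PKM ", 4) != 0)
        return false;
    out.version[0] = static_cast<char>(p[4]);
    out.version[1] = static_cast<char>(p[5]);
    out.dataType = readBigEndian16(p + 6);
    out.paddedWidth = readBigEndian16(p + 8);
    out.paddedHeight = readBigEndian16(p + 10);
    out.width = readBigEndian16(p + 12);
    out.height = readBigEndian16(p + 14);
    return out.width <= out.paddedWidth && out.height <= out.paddedHeight;
}

// Pointer to the compressed blocks that follow the header.
const std::uint8_t *pkmPayload(const std::vector<std::uint8_t> &file)
{
    assert(file.size() >= kPkmHeaderSize); // parse the header first
    return file.data() + kPkmHeaderSize;
}
```

In a Qt program, the payload pointer and the dimensions from the header are the pieces you would then pass on to QOpenGLTexture when uploading the compressed data.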
To pull off the compressed texture magic, we need a custom image provider (CompressedTextureImageProvider), a custom texture provider (derived from QQuickTextureFactory), and a custom QSGTexture subclass (CompressedSGTexture in the sample code). Although the compressed texture is nearly the same size as the original PNG, the RAM/VRAM savings for a background image at runtime are about 12MB!
Compressed Texture Factory
The other big advantage of compressed textures is CPU utilization, especially during program initialization. Everyone appreciates faster program start times, and that’s especially true on constrained devices like tablets, mobiles, or embedded devices. Not only are we skipping the decompression step during the load phase, but we’re also minimizing the size of copies between RAM and VRAM.
Tons less memory and faster execution for a sprinkling of calls and an extra step added to the build process. Not a bad payoff!
2 Comments
21 - Nov - 2015
Daniel Kabel
Nice post! But how can I deal with images dynamically loaded in the app? For example an image viewer application which loads images from a network share (displaying a large amount of images in a grid view). I think compressing them is not an option as it takes too much time (?)
21 - Nov - 2015
Andy Gryc
Do you have the option to pre-compress the images in place? That will double the storage requirement but if it's on a network share, that's probably not an issue. Then if the GPU-compressed version is available your app can use it, and if not you'd default to the standard image. Do the pre-compress as a separate background thread, just like creating thumbnails. Not sure if that works for your app (and may not be worth the hassle), but it's the only thing that comes to mind.