Fast Duplicate Tracking

DuplicateTracker Merged to KDToolBox

7 September 2021

Double Lookup

The contains() + insert() anti-pattern performs two look-ups for the same key, because the position that contains() internally has calculated is lost by the time contains() returns and thus has to be re-calculated in insert() again. We know that the position will be the same, but the compiler doesn't. The STL associative containers return a pair (boooooo!) of iterator and bool, with the latter indicating whether the item was inserted (true) or was found to pre-exist (false). Since we want to insert anyway, we can just attempt the insert() and check whether it changed something:

std::unordered_set<~~~> seen;
~~~~
    if (auto [it, inserted] = seen.insert(x); !inserted)
        continue; // was already present
    // no need for seen.insert(x): already done

While this is still quite a mouthful, we've managed to reduce the look-ups by half.

The Qt containers lack this convenient API, of course, instead forcing the user to check whether the size of the container changed, like this:

QSet<~~~~> seen;
~~~~
    const auto oldSize = seen.size();
    seen.insert(x);
    if (seen.size() == oldSize)
        continue; // was already present
    // no need for seen.insert(x): already inserted

It's odd, but works, too.

Allocations, Allocations

While QSet or std::unordered_set are fast once constructed, they both allocate a new node per element inserted. So in the usual case where you don't have (many) duplicates, you're performing O(N) memory allocations, N being the number of candidates to check.

How nice it would be if we could combine the speed of a hash table with the cache-friendliness of, say, a QVarLengthArray. Oh, wait. We can!

In C++17, we finally got some of the Bloomberg Allocators merged and, while quite a few implementations are still behind implementing them, that shouldn't deter us, thanks to feature test macros.

Taking a look at the std::pmr namespace, we find a few template aliases that appear to implant a non-standard allocator, std::pmr::polymorphic_allocator, into the usual range of STL containers, including vector and unordered_set.

If you want to know all the gory details of that class, book one of KDAB's Modern C++ trainings. Suffice to say that std::pmr::polymorphic_allocator implements an allocator that gets memory from a std::pmr::memory_resource, which is an interface you can inherit to implement your own custom allocation strategies.

Enter `monotonic_buffer_resource`

But we don't even need to do that. We just need to plug one of the existing memory_resources, std::pmr::monotonic_buffer_resource, into std::pmr::unordered_set.

The monotonic_buffer_resource class implements a grow-only memory pool over a static char buffer where deallocation is a no-op and allocation just ups a pointer. In case the initial buffer is exhausted, monotonic_buffer_resource allocates a new block and continues to allocate from there. If that's exhausted too, it allocates an even larger block, etc.

This allocator is as fast as it gets, but it isn't thread-safe and it never frees memory (until destroyed itself). But that's perfect for building temporary containers whose lifetimes are bound by a function.

The setup is a bit verbose, due to the flexibility of the std::pmr framework. But it's always the same:

char buffer[1024]; // or whatever size fits
std::pmr::monotonic_buffer_resource res{buffer, sizeof buffer};
std::unordered_set<~~~> seen(N, &res); // N: expected number of elements
~~~~
    if (auto [it, ins] = seen.insert(x); !ins)
        continue;
    // already inserted

Now the first 1 KiB of memory is allocated on the stack, and only then do we go to the heap. Passing the expected number of elements to the unordered_set constructor (that's like a reserve(), not a resize()!) is very important here because, while a re-hash wouldn't take the nodes out of buffer, it would waste memory for the new bucket list because monotonic_buffer_resource doesn't re-use freed-up memory again, thus creating gaps in our precious, buffer.

Tying It All Together

So, we have three things that we need to keep in mind:

Avoid double lookups,
Use std::pmr::monotonic_buffer_resource to avoid memory allocations,
...but fall back to QSet or std::unordered_set when std::pmr isn't available.

Then, it's just a question of writing a good hash function (which has recently become much simpler).

QDuplicateTracker (private API) bound these together for the first time, with a much nicer API:

QDuplicateTracker<~~~> seen;
seen.reserve(N);
~~~~
    if (seen.hasSeen(x))
        continue;
    // already inserted

Once you know which patterns to look for, the use-cases seem endless: Qt changes in topic "qduplicatetracker".

I don't recommend using QDuplicateTracker from Qt 5 anymore though, since, as I mentioned, I have taken this class through another two iterations in customer projects, finally culminating in the addition to KDToolBox. That's the version you should be using.

Alternatively, it should be possible, even in Qt projects, to use the Qt 6 version, which I have recently improved for the Qt 6.3 release next year, simply by copying its header file into your project and replacing the QHasher internal struct (which depends on implementation details of Qt 6's QHash container) with a template argument.

I invite you to study the source-code and post any questions that you may have in the comments below.

Happy duplicate tracking!

Tags:

c++performance qml qt tools

About KDAB

The KDAB Group is a globally recognized provider for software consulting, development and training, specializing in embedded devices and complex cross-platform desktop applications. In addition to being leading experts in Qt, C++ and 3D technologies for over two decades, KDAB provides deep expertise across the stack, including Linux, Rust and modern UI frameworks. With 100+ employees from 20 countries and offices in Sweden, Germany, USA, France and UK, we serve clients around the world.

Marc Mutz

Former KDAB employee

Marc Mutz is a former KDAB employee

Fast Duplicate Tracking

DuplicateTracker Merged to KDToolBox

Double Lookup

Allocations, Allocations

Enter `monotonic_buffer_resource`

Tying It All Together

Related Content

Sign up for the KDAB Newsletter

Qt 5 to Qt 6 Migration Services

Fast Duplicate Tracking

DuplicateTracker Merged to KDToolBox

Double Lookup

Allocations, Allocations

Enter monotonic_buffer_resource

Tying It All Together

Related Content

10 Tips to Make Your QML Code Faster and More Maintainable

Introducing the ConnectionEvaluator in KDBindings

Full Stack Tracing, Part 3

Sign up for the KDAB Newsletter

Qt 5 to Qt 6 Migration Services

Enter `monotonic_buffer_resource`