Comprehensive C++ Hashmap Benchmarks (2022)

(martin.ankerl.com)

34 points | by klaussilveira 5 days ago

4 comments

  • jandrewrogers 17 minutes ago
    Benchmarks are fine but they will only be loosely correlated with the measured performance for any specific use case.

    There is still substantial performance to be gained by creating bespoke hashmap designs at every point of use in code. The high dimensionality of the algorithm optimization space makes it improbable that any specific hashmap algorithm implementation will optimally capture the characteristics of a use case or set of use cases. The variance can be relatively high.

    It isn't uncommon to find several independent hashmap designs inside performance-engineered code bases. The sensitivity to small details makes it difficult to build excellent hashmap abstractions with broad scope.

    • jeffbee 11 minutes ago
      It's also the case that performance of a hashmap, or anything, in a small-scale benchmark may not reflect the performance in a large system that does things other than manage maps. There are side effects like how many icache lines are visited during a map operation. These don't hurt microbenchmarks but they can matter in real systems. It may not be completely pointless to microbenchmark a data structure but it can be ultimately misleading.
  • spacechild1 52 minutes ago
    Note that this benchmark does not include boost::unordered_flat_map. This is an open addressing variant of boost::unordered_map which has only been released in December 2022.

    I wanted to mention this because boost::unordered_flat_map and boost::unordered_flat_set are among the fastest open addressing hash containers in C++ land. Internally, they use lots of cool SIMD tricks. If anyone is interested in the details, here's a nice blog post by the developer: https://bannalia.blogspot.com/2022/11/inside-boostunorderedf...

    • loeg 8 minutes ago
      The F14/Abseil maps are included and use cute SIMD tricks, FWIW (discussed by the blogspot author).
  • hermitcrab 1 hour ago
    Would be interested to hear how the Qt QHash compares.

    https://doc.qt.io/qt-6/qhash.html

    • rurban 57 minutes ago
      Still using linked lists as std::unordered_map. So it won't fly, but keeps ptr stability.
      • hermitcrab 17 minutes ago
        Didn't all the QList<> linked lists become QVector<> is Qt 6?
  • rurban 59 minutes ago
    Not really comprehesive. Doesn't include my favorite https://github.com/greg7mdp/parallel-hashmap which adds thread-safety to performance.
    • aw1621107 55 minutes ago
      For what it's worth, there's this bit from the parallel-hashmap readme:

      > We encourage phmap users to switch to gtl if possible. gtl provides the same functionality as this repository, but requires C++20 or above.

      And the benchmarks do include gtl.