Why is std::unordered_map slow, and can I use it more effectively to alleviate that?

Date: 2023-09-17


Question

I've recently found an odd thing. It seems that calculating Collatz sequence lengths with no caching at all is over 2 times faster than using std::unordered_map to cache all elements.

Note that I did take hints from the question "Is gcc std::unordered_map implementation slow? If so - why?" and tried to use that knowledge to make std::unordered_map perform as well as I could. I used g++ 4.6, which did perform better than recent versions of g++, and I specified a sound initial bucket count, exactly equal to the maximum number of elements the map must hold.
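As a minimal sketch of what pre-sizing looks like in practice: besides passing a bucket count to the constructor, the standard `reserve` member requests enough buckets to hold a given number of elements without rehashing. The helper name `makePresizedCache` is hypothetical and used only for illustration; it is not part of the original benchmark.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

using Cache = std::unordered_map<std::uint_fast64_t, std::uint_fast16_t>;

// Hypothetical helper: build a cache that can take `expected` elements
// without rehashing while it is being filled.
Cache makePresizedCache(std::size_t expected) {
    assert(expected > 0);
    Cache cache;
    cache.reserve(expected);  // enough buckets for `expected` elements
    cache[1] = 1;             // base case of the Collatz recursion
    return cache;
}
```

Pre-sizing only avoids rehashing during insertion. In typical implementations each element still lives in its own heap-allocated node, so this alone may not change lookup cost much.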

In comparison, using std::vector to cache a few elements was almost 17 times faster than no caching at all and almost 40 times faster than using std::unordered_map.
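The std::vector version is not shown in the question, so the following is only a sketch of how such a cache might look, not the author's code. The function name `getCollatzLengthCached` and the cache size `kCacheSize` are assumptions: values below the limit are memoized in a flat array indexed by the value itself, and larger intermediate values are simply recomputed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed cache size: only values below this limit are memoized.
constexpr std::size_t kCacheSize = 1000000;

std::uint_fast16_t getCollatzLengthCached(std::uint_fast64_t val) {
    assert(val >= 1);
    // 0 means "not computed yet"; every real length is at least 1.
    static std::vector<std::uint_fast16_t> cache(kCacheSize, 0);

    if (val == 1) return 1;
    if (val < kCacheSize && cache[val] != 0) return cache[val];

    // One Collatz step, then reuse whatever is already cached further down.
    const std::uint_fast64_t next = (val % 2 == 0) ? val / 2 : 3 * val + 1;
    const auto length =
        static_cast<std::uint_fast16_t>(getCollatzLengthCached(next) + 1);

    if (val < kCacheSize) cache[val] = length;
    return length;
}
```

For example, `getCollatzLengthCached(27)` returns 112, counting both the starting value and the final 1. A lookup here is a single array index with no hashing and no pointer chasing, which is one plausible reason for the large gap reported above.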

Am I doing something wrong, or is this container really that slow, and if so, why? Can it be made to perform faster? Or are hash maps inherently inefficient and best avoided whenever possible in high-performance code?

The problematic benchmark is:

#include <iostream>
#include <unordered_map>
#include <cstdint>
#include <ctime>

std::uint_fast16_t getCollatzLength(std::uint_fast64_t val) {
    // Memoization table seeded with the base case and 2168611 initial buckets.
    static std::unordered_map<std::uint_fast64_t, std::uint_fast16_t> cache({{1, 1}}, 2168611);

    if (cache.count(val) == 0) {
        if (val % 2 == 0)
            cache[val] = getCollatzLength(val / 2) + 1;
        else
            cache[val] = getCollatzLength(3 * val + 1) + 1;
    }
    return cache[val];
}

int main() {
    std::clock_t tStart = std::clock();

    std::uint_fast16_t largest = 0;
    for (int i = 1; i <= 999999; ++i) {
        auto cmax = getCollatzLength(i);
        if (cmax > largest)
            largest = cmax;
    }
    std::cout << largest << '\n';

    std::cout << "Time taken: " << (double)(std::clock() - tStart) / CLOCKS_PER_SEC << '\n';
}

Output: Time taken: 0.761717

Whereas a benchmark with no caching at all:

#include <iostream>
#include <unordered_map>
#include <cstdint>
#include <ctime>

std::uint_fast16_t getCollatzLength(std::uint_fast64_t val) {
    // Walk the sequence directly, counting every term including the final 1.
    std::uint_fast16_t length = 1;
    while (val != 1) {
        if (val % 2 == 0)
            val /= 2;
        else
            val = 3 * val + 1;
        ++length;
    }
    return length;
}

int main() {
    std::clock_t tStart = std::clock();

    std::uint_fast16_t largest = 0;
    for (int i = 1; i <= 999999; ++i) {
        auto cmax = getCollatzLength(i);
        if (cmax > largest)
            largest = cmax;
    }
    std::cout << largest << '\n';

    std::cout << "Time taken: " << (double)(std::clock() - tStart) / CLOCKS_PER_SEC << '\n';
}

Output: Time taken: 0.324586

Recommended answer

The standard library's maps are, indeed, inherently slow (std::map especially, but std::unordered_map as well). Google's Chandler Carruth explains this in his CppCon 2014 talk; in a nutshell: std::unordered_map is cache-unfriendly because it uses linked lists as buckets.

This SO question mentions some efficient hash map implementations - use one of those instead.
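To make the contrast with linked-list buckets concrete, here is a minimal sketch of open addressing with linear probing, a layout that many alternative hash maps use. It is not one of the implementations the answer refers to; the class name `FlatCache`, its fixed capacity, and the hash constant are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-capacity hash map using open addressing. All slots sit
// in one contiguous array, so probing touches neighbouring memory instead of
// following list pointers.
class FlatCache {
public:
    // `capacity` must be a power of two and exceed the number of keys stored.
    explicit FlatCache(std::size_t capacity)
        : slots_(capacity), mask_(capacity - 1) {
        assert(capacity >= 2 && (capacity & (capacity - 1)) == 0);
    }

    // Returns a pointer to the stored value, or nullptr if `key` is absent.
    const std::uint16_t* find(std::uint64_t key) const {
        for (std::size_t i = hash(key) & mask_;; i = (i + 1) & mask_) {
            if (!slots_[i].used) return nullptr;
            if (slots_[i].key == key) return &slots_[i].value;
        }
    }

    // Inserts or overwrites; the table never grows in this sketch.
    void insert(std::uint64_t key, std::uint16_t value) {
        for (std::size_t i = hash(key) & mask_;; i = (i + 1) & mask_) {
            if (!slots_[i].used) {
                // Always keep one empty slot so every probe loop terminates.
                assert(size_ + 1 < slots_.size());
                slots_[i].key = key;
                slots_[i].value = value;
                slots_[i].used = true;
                ++size_;
                return;
            }
            if (slots_[i].key == key) {
                slots_[i].value = value;
                return;
            }
        }
    }

    std::size_t size() const { return size_; }

private:
    struct Slot {
        std::uint64_t key = 0;
        std::uint16_t value = 0;
        bool used = false;
    };

    // Multiplicative hashing to spread consecutive integer keys.
    static std::size_t hash(std::uint64_t key) {
        return static_cast<std::size_t>((key * 0x9E3779B97F4A7C15ULL) >> 32);
    }

    std::vector<Slot> slots_;
    std::size_t mask_;
    std::size_t size_ = 0;
};
```

A caller would construct it with a power-of-two capacity, for example `FlatCache cache(1 << 22);`, then use `insert` and `find` in place of `operator[]` and `count`. The sketch omits growth and deletion, which a production-quality map would need, so it should be read as an illustration of the memory layout rather than a drop-in replacement.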
