C++ uniform_int_distribution always returns min() on first invocation

Date: 2023-10-07

Problem description

In at least one implementation of the standard library, the first invocation of a std::uniform_int_distribution<> does not return a random value, but rather the distribution's min value. That is, given the code:

default_random_engine engine( any_seed() );
uniform_int_distribution< int > distribution( smaller, larger );
auto x = distribution( engine );
assert( x == smaller );

...x will in fact be equal to smaller, whatever the values of any_seed(), smaller, and larger.

To play along at home, you can try a code sample that demonstrates this problem in gcc 4.8.1.
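
The linked sample is not reproduced here. As a rough sketch of what such a demonstration might look like, the helper below (first_draw and print_first_draws are hypothetical names) seeds a fresh engine and returns the first value the distribution produces. The values you see are implementation-defined, so this only shows how to check your own standard library, not what it will print.

```cpp
#include <iostream>
#include <random>

// Hypothetical helper: first value drawn from a freshly seeded engine.
int first_draw(unsigned seed, int smaller, int larger) {
    std::default_random_engine engine(seed);
    std::uniform_int_distribution<int> distribution(smaller, larger);
    return distribution(engine);
}

// Prints the first draw in [0, 10] for a handful of small seeds.
// On an affected implementation every line shows the same value.
void print_first_draws() {
    for (unsigned seed = 0; seed < 10; ++seed) {
        std::cout << "seed " << seed << " -> "
                  << first_draw(seed, 0, 10) << '\n';
    }
}
```

Calling print_first_draws() from main is enough to compare small seeds side by side.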

I trust this is not correct behavior? If it is correct behavior, why would a random distribution return this clearly non-random value?

Recommended answer

Explanation for the observed behavior

This is how uniform_int_distribution maps the random bits to numbers if the range of possible outcomes is smaller than the range of numbers the rng produces:

const __uctype __uerange = __urange + 1; // __urange can be zero
const __uctype __scaling = __urngrange / __uerange;
const __uctype __past = __uerange * __scaling;
do
  __ret = __uctype(__urng()) - __urngmin;
while (__ret >= __past);
__ret /= __scaling;

where __urange is larger - smaller and __urngrange is the difference between the maximum and the minimum value the rng can return. (Code from bits/uniform_int_dist.h in libstdc++ 6.1)
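
To see the mapping in isolation, here is a minimal sketch that re-implements the quoted branch as a free function. The name map_to_range and the use of std::uint64_t for all intermediate values are assumptions made for illustration. This is not the library's interface, and it only covers the case where the requested range is narrower than the engine's range.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper mirroring the downscaling branch quoted above.
// Assumes lo <= hi and (hi - lo) < Engine::max() - Engine::min().
template <class Engine>
std::uint64_t map_to_range(Engine& rng, std::uint64_t lo, std::uint64_t hi) {
    const std::uint64_t urngmin = Engine::min();
    const std::uint64_t urngrange = Engine::max() - urngmin;
    const std::uint64_t uerange = (hi - lo) + 1;        // number of outcomes
    const std::uint64_t scaling = urngrange / uerange;  // width of one bucket
    const std::uint64_t past = uerange * scaling;       // first rejected value
    std::uint64_t ret;
    do {
        ret = static_cast<std::uint64_t>(rng()) - urngmin;
    } while (ret >= past);                              // reject the uneven tail
    return lo + ret / scaling;                          // bucket index, shifted to lo
}
```

Because the result is just the raw engine output divided by the bucket width, any raw value below scaling lands in the first bucket and therefore maps to lo.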

In our case, the rng default_random_engine is a minstd_rand0, which yields __scaling == 195225785 for the range [0,10] you tested with. Thus, if rng() < 195225785, the distribution will return 0.

The first number returned by a minstd_rand0 is

(16807 * seed) % 2147483647

(where seed == 0 gets adjusted to 1, by the way). We can thus see that the first value produced by a minstd_rand0 seeded with a number smaller than 11615 will yield 0 with the uniform_int_distribution< int > distribution( 0, 10 ); you used. (Modulo off-by-one errors on my part. ;) )
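
The threshold can be checked by hand. The sketch below computes the first output from the recurrence above and then walks the seeds upward until the first output no longer falls into the bucket that maps to 0. Both function names are hypothetical, and the bucket width 195225785 is taken from the text rather than queried from the library. Under these assumptions the run of affected seeds extends through 11615 inclusive, which matches the estimate above up to the off-by-one the author mentions.

```cpp
#include <cstdint>
#include <random>

// First output of minstd_rand0 for a given seed, from the recurrence
// x1 = (16807 * seed) % 2147483647, with a zero seed adjusted to 1.
std::uint64_t first_output(std::uint64_t seed) {
    const std::uint64_t m = 2147483647ULL;
    seed %= m;
    if (seed == 0) seed = 1;
    return (16807ULL * seed) % m;
}

// Largest seed in the initial run of seeds whose first output, after
// subtracting the engine minimum of 1, is still below the bucket width
// used for the range [0, 10].
std::uint64_t largest_small_seed_mapping_to_zero() {
    const std::uint64_t scaling = 195225785ULL;  // value quoted in the text
    std::uint64_t seed = 1;
    while (first_output(seed + 1) - 1 < scaling) {
        ++seed;
    }
    return seed;
}
```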

You mentioned the problem going away for bigger seeds: As soon as the seeds get big enough to actually make the mod operation do something, we cannot simply assign a whole range of values to the same output by division, so the results will look better.

No. You introduced significant bias in what is supposed to be a random 32-bit seed by always choosing it small. That bias showing up in the results is not surprising or evil. For random seeds, even your minstd_rand0 will yield a fairly uniformly random first value. (Though the sequence of numbers after that will not be of great statistical quality.)

Case 1: You want random numbers of high statistical quality.

For that, you use a better rng like mt19937 and seed its entire state space. For the Mersenne Twister, that's 624 32-bit integers. (For reference, here is my attempt to do this properly with some helpful suggestions in the answer.)
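
The post the author refers to is not reproduced here, so the following is only one commonly seen way to approach this, not necessarily what that post did. It fills 624 32-bit words from std::random_device and feeds them through std::seed_seq. The function name is hypothetical, and the sketch assumes std::random_device delivers real entropy on your platform, which the standard does not guarantee.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical helper: seeds mt19937 with as many 32-bit words as its
// state holds (std::mt19937::state_size, i.e. 624).
std::mt19937 make_fully_seeded_mt19937() {
    std::array<std::uint32_t, std::mt19937::state_size> seed_data;
    std::random_device rd;  // assumed to be a usable entropy source
    std::generate(seed_data.begin(), seed_data.end(), std::ref(rd));
    std::seed_seq seq(seed_data.begin(), seed_data.end());
    return std::mt19937(seq);
}
```

Note that std::seed_seq mixes its input rather than copying it into the state verbatim, so this is a reasonable approximation of seeding the whole state, not an exact one.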

Case 2: You really want to use those small seeds only.

We can still get decent results out of this. The problem is that pseudo random number generators commonly depend "somewhat continuously" on their seed. To work around this, we discard enough numbers to let the initially similar sequences of output diverge. So if your seed must be small, you can initialize your rng like this:

std::mt19937 rng(smallSeed);
rng.discard(700000);

It is vital that you use a good rng like the Mersenne Twister for this. I do not know of any method to get even decent values out of a poorly seeded minstd_rand0, for example see this train-wreck. Even if seeded properly, the statistical properties of a mt19937 are superior by far.

Concerns about the large state space or slow generation that you sometimes hear about are usually of no concern outside the embedded world. According to boost and cacert.at, the MT is even way faster than minstd_rand0.

You still need to do the discard trick though, even if your results look good to the naked eye without it. It takes less than a millisecond on my system, and you don't seed rngs very often, so there is no reason not to.

Note that I am not able to give you a sharp estimate for the number of discards we need. I took that value from this answer, which links this paper for a rationale. I don't have the time to work through that right now.
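
If you want to check the cost of the discard on your own machine, a small timing sketch along these lines may help. DiscardTiming and time_discard are hypothetical names, the default of 700000 is simply the value used above, and the number you get depends on hardware, compiler, and optimization level.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical result type: elapsed time plus the first value produced
// after the discard, returned so the work cannot be optimized away.
struct DiscardTiming {
    long long microseconds;
    std::uint32_t next_value;
};

// Seeds mt19937 with a small seed, discards n values, and reports how
// long the discard took.
DiscardTiming time_discard(std::uint32_t small_seed,
                           unsigned long long n = 700000) {
    std::mt19937 rng(small_seed);
    const auto start = std::chrono::steady_clock::now();
    rng.discard(n);
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    return DiscardTiming{static_cast<long long>(elapsed.count()),
                         static_cast<std::uint32_t>(rng())};
}
```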
