What is a fast way to select a random row from a large MySQL table?
I'm working in PHP, but I'm interested in any solution, even if it's in another language.
Grab all the ids, pick one at random, and retrieve the full row.
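A minimal sketch of this first approach, written in Python since any language is welcome. It assumes a hypothetical table `items(id, name)`, and it uses an in-memory SQLite database as a stand-in for MySQL so that it runs anywhere; the two SELECT statements are plain SQL that MySQL also accepts.

```python
import random
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical sample data with gaps in the id sequence.
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in (1, 2, 5, 9, 10, 42)],
)


def random_row_from_id_list(conn):
    """Fetch every id, choose one uniformly, then load that full row."""
    ids = [r[0] for r in conn.execute("SELECT id FROM items")]
    if not ids:
        return None  # empty table
    chosen = random.choice(ids)
    return conn.execute(
        "SELECT id, name FROM items WHERE id = ?", (chosen,)
    ).fetchone()


row = random_row_from_id_list(conn)
```

The pick is uniform even when the ids have gaps, but every id has to be transferred to the client first, so the cost grows with the size of the table.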
If you know the ids are sequential with no holes, you can just grab the max and calculate a random id.
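A sketch of the gap-free case, under the same assumptions as before: a hypothetical `items(id, name)` table, in-memory SQLite standing in for MySQL, and ids that start at 1. Only the maximum id and one row are read.

```python
import random
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical data: ids 1..100 with no holes.
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 101)],
)


def random_row_sequential(conn):
    """Pick a random id in [1, MAX(id)] and fetch that row by primary key."""
    max_id = conn.execute("SELECT MAX(id) FROM items").fetchone()[0]
    if max_id is None:
        return None  # empty table
    target = random.randint(1, max_id)  # both ends inclusive
    return conn.execute(
        "SELECT id, name FROM items WHERE id = ?", (target,)
    ).fetchone()


row = random_row_sequential(conn)
```

If the sequence does have a hole, the lookup can land on a missing id and return nothing, which is why this version is only safe when the ids are known to be contiguous.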
If there are holes here and there but the values are mostly sequential, and you don't mind slightly skewed randomness, grab the max value, calculate an id, and select the first row with an id equal to or above the one you calculated. The skew arises because an id that follows a hole has a higher chance of being picked than one that directly follows another id.
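A sketch of the "first id at or above the target" variant, again assuming a hypothetical `items(id, name)` table and in-memory SQLite in place of MySQL. To make the skew visible, the data is deliberately extreme (ids 1, 2, 3 and 10), and the helper `pick_for_target` is enumerated over every possible target instead of being sampled.

```python
import random
import sqlite3
from collections import Counter

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical data with one large hole between 3 and 10.
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in (1, 2, 3, 10)],
)


def pick_for_target(conn, target):
    """Return the first row whose id is equal to or above the target."""
    return conn.execute(
        "SELECT id, name FROM items WHERE id >= ? ORDER BY id LIMIT 1",
        (target,),
    ).fetchone()


def random_row_with_gaps(conn):
    """Pick a random target in [1, MAX(id)] and resolve it to a real row."""
    max_id = conn.execute("SELECT MAX(id) FROM items").fetchone()[0]
    if max_id is None:
        return None  # empty table
    return pick_for_target(conn, random.randint(1, max_id))


# Enumerate every possible target to see how often each id would be chosen.
max_id = conn.execute("SELECT MAX(id) FROM items").fetchone()[0]
hits = Counter(pick_for_target(conn, t)[0] for t in range(1, max_id + 1))
row = random_row_with_gaps(conn)
```

Here `hits` shows id 10 being selected for seven of the ten possible targets (4 through 10), while ids 1, 2 and 3 each get one. With only occasional small holes the bias is far milder, which is the trade-off described above.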
If you order by random (ORDER BY RAND() in MySQL), you're going to have a terrible table scan on your hands, and the word "quick" doesn't apply to such a solution.
Don't do that, and don't order by a GUID either; it has the same problem.