我在我的表中发现了需要删除的重复项.该表包含:
I found duplicates in my table which needed to be removed. The table contains:
一个学生应该只有一个科目、班级和 XP_lvl 的记录.在这种情况下,删除重复项是基于删除所有但保留一个.
One student should have only one record of subject, class and XP_lvl. In this case the removal of duplicities is based on delete all but keep one.
在我的情况下,重复看起来像这样:
In my case duplicates looks like this:
| ID | STUDENT_ID | SUBJECT_ID | CLASS_ID | EXPERTISE_LVL |
|---|---|---|---|---|
| 1 | 1AAA | 55FFE | CLASS808 | 2 |
| 2 | 1AAA | 55FFE | CLASS808 | 2 |
| 3 | 2AAB | 49BB | CLASS890 | 3 |
| 4 | 2AAB | 49BB | CLASS890 | 3 |
| 5 | 2AAB | 49BB | CLASS890 | 4 |
| 6 | 2AAB | 49BB | CLASS890 | 3 |
我通过创建 (CONCAT(STUDENT_ID, CONCAT(SUBJECT_ID, CLASS_ID))) 的唯一 ID,然后通过 count.. >1 识别出所有重复项> 工作正常.
I have identified all the duplicates by creating unique ID of (CONCAT(STUDENT_ID, CONCAT(SUBJECT_ID, CLASS_ID))) and then by having count.. >1 which works fine.
现在我需要识别所有 ID,以便我可以在从查询中删除时使用 ID NOT IN (SELECT...).
Now I need to identify all the ID so I can use ID NOT IN (SELECT...) in my delete from query.
所以我这样做了..
AND ID NOT IN (SELECT UID FROM (
SELECT
min(ID) AS UID,
STUDENT_ID,
SUBJECT_ID,
CLASS_ID
FROM
my_table
GROUP BY
STUDENT_ID,
SUBJECT_ID,
CLASS_ID
HAVING
count(CONCAT(STUDENT_ID, CONCAT(SUBJECT_ID, CLASS_ID))) > 1))
但是,我不能使用 min/max(ID) 来选择要保留的 ID,因为正如您所见,对于学生 2AAB,存在具有不同 XP_LVL 的重复项.
However I cannot use min/max(ID) to choose which ID to keep because as you can see for student 2AAB there are duplicities with different XP_LVL.
在这种情况下,我需要选择最高 XP_LVL 的 ID 来保留和删除所有其他的.
In this case I need to select ID of highest XP_LVL to keep and delete all other.
我尝试使用 RANK、ROWNUM 不同的排序和子选择的负载,但没有想要的结果.
I tried using RANK, ROWNUM different ordering and loaaads of subselects but without desired results.
有谁知道如何有效地做到这一点?我们正在谈论 6k 重复,所以我不能一一做.提前感谢您的帮助.
Does anyone have any idea how to do it effectively ? We are talking about 6k duplicities so I cant do it one by one. Thank you for your help in advance.
您可以使用 ROW_NUMBER 解析函数并使用 ROWID 伪列关联删除:>
You can use ROW_NUMBER analytic function and correlate the deletion using the ROWID pseudo-column:
DELETE FROM my_table
WHERE ROWID IN (
SELECT ROWID
FROM (
SELECT ROW_NUMBER() OVER (
PARTITION BY student_id, subject_id, class_id
ORDER BY expertise_lvl DESC
) AS rn
FROM my_table
)
WHERE rn > 1
)
db<>fiddle 这里
这篇关于SQL - 使用 ID NOT IN 的高级重复删除的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持html5模板网!
获取所有产品、类别和元数据的 SQL 查询 woocommSQL query to get all products, categories and meta data woocommerce/wordpress(获取所有产品、类别和元数据的 SQL 查询 woocommerce/wordpre
我可以在不编写 SQL 查询的情况下找出数据库列表Can I figure out a list of databases and the space used by SQL Server instances without writing SQL queries?(我可以在不编写 SQL 查询的情况下
如何创建对 SQL Server 实例的登录?How to create a login to a SQL Server instance?(如何创建对 SQL Server 实例的登录?)
如何通过注册表搜索知道SQL Server的版本和版本How to know the version and edition of SQL Server through registry search(如何通过注册表搜索知道SQL Server的版本和版本)
为什么会出现“数据类型转换错误"?使用 ExWhy do I get a quot;data type conversion errorquot; with ExecuteNonQuery()?(为什么会出现“数据类型转换错误?使用 ExecuteNonQuery()?)
如何将 DataGridView 中的图像显示到 PictureBox?How to show an image from a DataGridView to a PictureBox?(如何将 DataGridView 中的图像显示到 PictureBox?)