This is very known question to find Duplicate Rows from the Table and Delete Duplicate rows from the table I came to know about "ROW_NUMBER()" of SQL SERVER 2005. Lets see how we can use this feature: There is one Table:
INSERT some data into the table:
Now we have data like: Now, we need to Find the duplicate Rows from the table. So we need to get "A", "C" and "D". Lets see the use of Row_Number():
Output will be like: So we can Delete Duplicate Rows as we have EmpID (Primary Key) I Used: PARTITION BY: which will find same rows and assign ID accordingly. ORDER BY: In which order we want to assign ID. By Default its Ascending |
Learn SQL and database management at SQLYoga for articles, tutorials, and tips to improve your skills and streamline data operations. Join our community!
March 19, 2009
SQL SERVER: Find Duplicate Rows with Row_Number() SQL SERVER 2005
18+ years of Hands-on Experience
MICROSOFT CERTIFIED PROFESSIONAL (Microsoft SQL Server)
Proficient in .NET C#
Hands on working experience on MS SQL, DBA, Performance Tuning, Power BI, SSIS, and SSRS
Subscribe to:
Post Comments (Atom)
If you have am identity column you don't need anything else:
ReplyDeleteCREATE TABLE FindDuplicates(
EmpID INT IDENTITY(1,1),
EmpName VARCHAR(500)
)
----INSERT some data into the table:
INSERT INTO FindDuplicates VALUES('A')
INSERT INTO FindDuplicates VALUES('A')
INSERT INTO FindDuplicates VALUES('B')
INSERT INTO FindDuplicates VALUES('C')
INSERT INTO FindDuplicates VALUES('C')
INSERT INTO FindDuplicates VALUES('D')
INSERT INTO FindDuplicates VALUES('D')
INSERT INTO FindDuplicates VALUES('D')
INSERT INTO FindDuplicates VALUES('E')
INSERT INTO FindDuplicates VALUES('E')
INSERT INTO FindDuplicates VALUES('E')
INSERT INTO FindDuplicates VALUES('E')
select
max(EmpID)
from
FindDuplicates
group by
EmpName
having
count(*)>1
while @@rowcount>0
delete from
FindDuplicates
where
EmpID in
(
select
max(EmpID)
from
FindDuplicates
group by
EmpName
having
count(*)>1
)
select * from FindDuplicates
Hi,
ReplyDeleteYes we can also use the way that you suggested to remove duplicate rows on basis of IDENTITY column.
But when we need to identify the duplicate rows based on another columns too, then it better to use ROW_NUMBER.
You can also find that syntax simplicity with the use of ROW_NUMBER.
Tejas
Hi All,
ReplyDeleteI want to display all duplicate records in the table. My query has to fetch all the records which are duplicate(First Name or Last Name). Also I want the ability to also pull names where there might be a middle initial placed in the end of the first name field, (i.e., "Maria Z. " vs. "Maria") as well.
Please guide me to find this.
Table:
ID FirstName LastName
1 Zach H Hoffman
2 Zach Hoffman
3 Troy Hoffman
4 Shawn Livermore
5 Prem S
6 Jony Hoffman H
7 Zach Modan
I need the query to filter.........
ID FirstName LastName
1 Zach H Hoffman
2 Zach Hoffman
3 Troy Hoffman
6 Jony Hoffman H
7 Zach Modan
I hope this example will give you clear idea.....
Thanks in Advance
Prem
Thanks for this article
ReplyDeleteCan We Do The Following With Identity Column:
ReplyDeleteRemove Duplicates While Keeping One Row Of Each Duplicate(I Mean, On Encountering Duplicate Record, We Should Keep The One Row Out Of Those Duplicate Records).
Hi Singh,
ReplyDeleteYes, we can do the same with the ROW_NUMBER. What we need to do is just write following query:
;With CTE AS(
SELECT ROW_NUMBER() OVER(PARTITION BY EmpName ORDER BY EmpName) As RowID,
EmpID,
EmpName
FROM FindDuplicates
)
DELETE
FROM CTE
WHERE RowID > 1
This above query, will keep row having RowID = 1 and duplicate rows will be removed from the table: "FindDuplicates".
Thanks,
Tejas
SQLYoga.com