Deleting Duplicate Records from a table

First Method: (USING TEMPORARY TABLE)

1. Create a table called 'Employee_DuplicateRecord'.
 
CREATE TABLE Employee_DuplicateRecord(id INT,name varchar(100))

2. Now insert some rows of data in this table

INSERT INTO Employee_DuplicateRecord VALUES (1,'AA')
INSERT INTO Employee_DuplicateRecord VALUES (1,'AA')--duplicate
INSERT INTO Employee_DuplicateRecord VALUES (1,'AA')--duplicate
INSERT INTO Employee_DuplicateRecord VALUES (2,'BB')
INSERT INTO Employee_DuplicateRecord VALUES (2,'BB')--duplicate
INSERT INTO Employee_DuplicateRecord VALUES (3,'CC')
INSERT INTO Employee_DuplicateRecord VALUES (3,'CC')--duplicate
INSERT INTO Employee_DuplicateRecord VALUES (4,'DD')
INSERT INTO Employee_DuplicateRecord VALUES (5,'EE')

3. Now view the records.

SELECT * FROM Employee_DuplicateRecord

It will return 9 rows having 4 duplicate values.

4. Now create a intermediate table -'Employee_DuplicateRecord_Temp'
CREATE TABLE Employee_DuplicateRecord_Temp (id int,name varchar(100))

5. Now select distinct records from first table - 'Employee_DuplicateRecord' And  Insert disticnt records to table - Employee_DuplicateRecord_Temp

INSERT INTO Employee_DuplicateRecord_Temp (id,name) (SELECT DISTINCT(id),name FROM  employee_duplicateRecord)

6. Now Delete all the rows from Main table- Employee_DuplicateRecord

DELETE Employee_DuplicateRecord

7. Now, Insert from intermediate table -'Employee_DuplicateRecord_Temp' to 'Employee_DuplicateRecord'.


INSERT INTO Employee_DuplicateRecord (id,name)
 (SELECT id,name FROM  employee_duplicateRecord_temp)

Now,

 SELECT * FROM Employee_DuplicateRecord


It will return the rows without duplicate.
8. Drop intermediat table - 'Employee_DuplicateRecord_Temp'
DROP 'Employee_DuplicateRecord_Temp' 
 


Second Method: (USING CTE )

1. Create a table -  'Employee_DuplicateRecord_second '

CREATE TABLE Employee_DuplicateRecord_second (id int,name varchar(100))

2. Insert some rows into it.

INSERT INTO Employee_DuplicateRecord_second VALUES (1,'AA')
INSERT INTO Employee_DuplicateRecord_second VALUES (1,'AA')

INSERT INTO Employee_DuplicateRecord_second VALUES (1,'AA')
INSERT INTO Employee_DuplicateRecord_second VALUES (2,'BB')
INSERT INTO Employee_DuplicateRecord_second VALUES (2,'BB')
INSERT INTO Employee_DuplicateRecord_second VALUES (3,'CC')
INSERT INTO Employee_DuplicateRecord_second VALUES (3,'CC')
INSERT INTO Employee_DuplicateRecord_second VALUES (4,'DD')
INSERT INTO Employee_DuplicateRecord_second VALUES (5,'EE')

3. Now , create one CTE (Common Table Expression).In query we are using ROW_NUMBER( ) function.

WITH CTE_DuplicateCount (id,name, DuplicateCount)AS(   SELECT id,name,  ROW_NUMBER() OVER(PARTITION BY id,name ORDER BY id) AS DuplicateCount
  FROM Employee_DuplicateRecord_second)Delete FROM CTE_DuplicateCount where DuplicateCount>1
Here ,
      ROW_NUMBER( ) OVER (PARTITION BY id,name BY id) AS DuplicateCount will give the count of dulplicate rows.If    first time same id and name will come 'duplicatecount'  will return 1,second time then 2 and so on.
Finally we are deleting from the rows from CTE where duplicatecount is greater than 1.
It will eliminate the duplicate rows.

4. Now select from table - Employee_DuplicateRecord_second

SELECT * FROM Employee_DuplicateRecord_second
Deleting Duplicate Records from a table Deleting Duplicate Records from a table Reviewed by kamal kumar das on January 27, 2012 Rating: 5

No comments:

ads 728x90 B
Powered by Blogger.