Removing Duplicates in Excel

We sometime look for easy way to remove duplicate  entries from a large table without any filters or complicated formulas. Starting Office 2007, Microsoft added a direct method to remove duplicates from a range.  There is one shortfall in this method which I will describe later.

  • Use mouse or keyboard arrow keys with “Shift” button pressed to select the area which contains duplicate content.
  • Click on “Data” tab of excel and click “Remove duplicate” button in the multi function tool bar.
  • A dialog box pops up. In the dialog box define the columns of the area to be included in the comparison of individual rows. All cells of the two rows should thus not display the same content due to which the rows become duplicates of each other.
  • In the column field of dialog box, remove the check marks in front of the columns which needs to be ignored during the comparison and click ‘OK’.
  • The duplicates in the columns of excel has been removed.

Note: If all the columns are not included in the comparison(when there are obvious differences between individual row),  Excel always first top most row out of the rows that have been identified as duplicate. This is needs attention when the previously excluded cells are required later.

This method short fall is that, you will not be able to see which cells has duplicates. If you want to know which cells having the duplicate values, go to “Conditional formating” and select “Highlight Cell Rules”. In that, choose “Duplicate Values”.  In the pop up select color which you want to see for the duplicate values. Now, use a filter by color and select the color you have opted for duplicats. It will show all the cells with duplicate values. Once found, you can decide on which data need to be removed and which needs to be retained.

0 comments… add one

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.