In Excel 2003 and earlier versions, you can use an Advanced Filter to remove duplicates. In Excel 2007, there’s a new command on the Ribbon to make it easier to remove duplicates from a list.
Be careful with the Excel 2007 Remove Duplicates feature though – it really removes the duplicates. If you use an Advanced Filter instead, you have the option of hiding duplicates, or creating a unique list in a new location.
How It Works
Update: Jason Morin asked a few questions about the Remove Duplicates feature, and how it works, so I’ll answer the questions here. (Thanks Jason!)
1) Does the new Duplicates capability discern between text strings and numerical values that look the same on the screen?
No, it treats the text strings and numbers the same. If the list has a 10 and a ’10, they’ll be treated as duplicates. There isn’t a settings option that I can see, where you can adjust this. In the Advanced Filter feature, those would be seen as 2 unique items.
2) What about non-visible characters in the cell? Does it consider “Pen” and “Pen ” the same? As a user I would view this as a duplicate, but Excel may not.
No, those won’t be treated as duplicates, because the space character in the second entry makes it different. Advanced Filter would do the same.
3) How does it decide WHICH duplicate to remove in the data set?
The first instance of each item is left, and all subsequent entries are deleted.
4) I assume if I’m working with record set (>1 column), I need to concatenate data from columns to create a unique identifier for each record, then run the Duplicates on the new column I created.
You can use the check marks in the Remove Duplicates dialog box, and select all the columns you want to include. Only if all the included columns are duplicated, will an item be removed. Advanced Filter works the same way, but without the check marks.
In this example, the list in cells A1:A10 contains a few duplicates.
Follow these steps to remove the duplicates.
- Select any cell in the list, or select the entire list
- On the Ribbon’s Data tab, click Remove Duplicates.
Watch the Remove Duplicates Video
Here’s a very short video that shows the steps to remove duplicates in Excel 2007.