Search Not Just Numbers

Tuesday 1 April 2014

Excel Tip: Identifying and removing duplicates

One of the biggest challenges when dealing with lists of data is spotting and/or removing duplicate entries.

Fortunately, the more recent versions of Excel offer some excellent tools for handling this problem.

I thought that this week, I would give you a quick introduction to them.

Highlighting Duplicates

Excel offers a quick facility to highlight duplicate entries in a list.

Simply highlight the list and, from the Home ribbon, select Conditional Formatting, Highlight Cells Rules, Duplicate Values.

You then get two drop-down boxes to choose whether you want to highlight the duplicates or the unique values and what colours you want to highlight them in. It's as simple as that.

This will simply highlight all of the cells that have a duplicate (or don't, if you select unique values).

Removing Duplicates

Often, you don't need to see the duplicates, just get rid of them. Excel also has a tool for that, which is also a little more sophisticated than the previous tool.

It doesn't just look at duplicate cells, but duplicate rows.

To use the tool, highlight all of the columns containing your data (data should be formatted correctly as covered in my earlier post on this subject, or at least with a column for each field and a row for each record).

On the Data ribbon, select Remove Duplicates and you will see the following dialog box:


First of all, check that the tick-box, My data has headers, is correctly ticked or unticked.

In the main window of the dialog box, all of the columns you have selected will be shown. and you can tick which ones must be duplicated to consider the entry a duplicate row.

In this example, with all columns ticked, the rows must be exactly the same before they are deleted. However, if we unticked the value column, then any rows with a duplicate code would be considered duplicates (irrespective of the entry in the Value column).

It is always the first row that is kept, and all subsequent duplicates deleted. This is only relevant if you have not ticked all of the columns, otherwise all of the duplicate rows are the same anyway.

Excel Expert Course

If you enjoyed this post, go to the top of the blog, where you can subscribe for regular updates and get two freebies "The 5 Excel features that you NEED to know" and "30 Chants for Better Charts".

No comments:

Post a Comment