Search Not Just Numbers

Thursday 23 February 2012

99% of Excel users get this wrong - How do you lay out your data?

"Learn the fundamentals of the game and stick to them. Band-Aid remedies never last."
Jack Nicklaus (Champion US golfer) 



When someone comes to me with a problem in an existing spreadsheet, the problem is invariably in the layout of the data. The spreadsheet is built for one purpose and works OK for that until something slightly different is required and it proves almost impossible to get the report that's needed.

If a few simple rules are followed when laying out your data, then producing additional reports from that data, and using it for different purposes, becomes simple, instead of the nightmare it is for many users.

These rules apply to any lists of data, be it monthly financial information, transactional data (such as lists of sales, purchases, payments or receipts), customer or supplier lists. If you are going to store data in your spreadsheet to produce reports from, you need to follow these rules.

At the heart of these rules is the approach - you are not laying out your final report here, you are laying out the data in a format that can be reported from! These are two very different things (see my OAP approach to reporting in Excel).

The rules to follow:

  1. Columns with headings and no gaps
    • Every column should have its own UNIQUE heading, in the first row;
    • There should be no empty columns;
    • These columns represent the fields of a database, e.g. Customer Code, Customer Name, Telephone Number, Email Address, etc.
  2. One row per record and no gaps
    • Every record should have all of its data on one row. E.g. in the above example, one row per customer;
    • There should be no empty rows;
  3. Don't group data by putting it in different columns (THIS IS THE ONE THAT ALMOST EVERYONE GETS WRONG)
    • Don't split out financial or numerical data into separate columns to categorise the data into months, expense categories, customers, agents, etc.
    • Do have one column for the financial or numerical data and create a column for month, expense category, customer or agent, to categorise each row;
    • You can use data validation drop-down lists to select the appropriate category for each row;
    • This one is counter-intuitive because in any report, you will almost certainly will want a column (or row) for each of these categories - but if you do this in the data you will massively restrict what you can do with it.
The benefits:
  • Data following the rules above is perfectly prepared to be analysed using countless tools within Excel, for example: pivot tables, autofilter, SUMIF, COUNTIF, etc.
  • Most changes to the data don't require a change to the data layout. New categories, e.g. expense categories, customers, agents, etc. can just be added to the drop-down lists. Any new entries in these columns will be automatically picked up by pivot-tables, autofilter, etc. with no work involved.If you had to create a new column each time, you would also need to edit every report that used the data.
  • You can choose to analyse the data by any category you want. It takes seconds to edit a pivot table that has a column for each month and change it to a column for each expense category. This is almost impossible if the data was laid out in those columns.
  • You can add additional category columns to the data if needed and these can even be calculated from the data. You might, for example, introduce departments - simply add a department column to the raw data, and your pivot tables can analyse the data by this category as well, or instead of existing categories.
As you can see, if you lay out your data according to these rules, you can do pretty much anything you want with it. The spreadsheet can grow with your business, and with any additional reporting requirements you want to add.

It can take a little bit of time to get your head around point 3, but believe me, you'll be pleased you decided to be among the 1% that get this right.

If you'd prefer me to redesign your spreadsheet for you, just visit www.needaspreadsheet.com and let me know what you need and I will send you a fixed price quote.


If you enjoyed this post, go to the top left corner of the blog, where you can subscribe for regular updates and your free report. If you wish to help me to provide future posts like this, please consider donating using the button in the right hand column.

10 comments:

  1. Great Post. I have worked with Excel for a very long time, and it's a great data analysis tool if used in the correct way. I have seen all the "sins" that you point out in this post, and always encourage Excel users to follow the approach you suggest - in essence, treat Excel like a Database, and leverage its powerful analysis tools. There is one more "rule" that I think should be added to this list - Don't Use Different Tabs in Excel to separate data. The typical example that I see is where analysis is done for say Sales in a tab for January, a tab for February etc. This has two major failings. The big one is that the data is now spread across different tabs,and cannot be (easily) aggregated. So if a user wants Year To Data figures, they now need to summarise data from multiple tabs. It's much easier to have a single tab with a column for the month, and then a pivot the data for a multi month report. In fact, that could even be a formula that calculates the Month (or even week number) based on a transaction date field. The second issue with the multi Tab approach is that the column layout now needs to be maintained multiple times. So picture a scenario where you have data in 12 tabs (one per month) , and now you need to track a new attribute. This means going to each tab and adding the column, hopefully in the correct place in each tab. If you get it incorrect, and at a later stage need to copy the data together, the columns will be misaligned.

    ReplyDelete
  2. Paraic

    I couldn't agree more with your comments about multiple tabs. I see exactly what you've described over and over again.

    Months is the most common one, but I've seen it done with everything - departments, salespeople, products, customers, suppliers, etc.

    Thanks for the contribution.

    ReplyDelete
  3. You have to be careful in these circumstances. Excel is NOT a database and it shouldn't be treated like one. With using Excel effectively, like learning VBA, then consolidating across multiple columns is very easy.

    The major problem found today is people treat Excel like a database and it isn't, and at some point in the future the capabilities of Excel no longer match what they need the software to do.

    ReplyDelete
  4. an example for poin three would be helpful.

    ReplyDelete
  5. I would also like an example for point three - I can't picture what you are describing at all. As with the other comments, I've seen all the sins mentioned. I've even met some excel users who don't use pivots, they summarize the data themselves with formulas - OUCH! I think its an inefficient use of time and restricts you from making adjustments to layout and data based on questions/concerns from your audience. Pivots allow you to make instant adjustments to help answer questions.

    ReplyDelete
    Replies
    1. Isaac and Anonymous

      Regarding point three, imagine a table containing sales data, where we wish to record the profit centre that the sale relates to

      Wrong: A column for each profit centre with the associated sales entered in the columns.

      Right: One profit centre column containing the profit centre for each line and a separate sales column containing the sales figure.

      I hope that helps.

      Delete
  6. These are good tips, though often my cashbook transactions need to be split across more than one category.

    Is there an easy way to cope with split transactions?

    Thanks
    Jon

    ReplyDelete
    Replies
    1. Jon

      The simple way to do it is to split the transaction across two or more rows, then you can allocate the category to each row.

      Delete
  7. Glen
    A good article. The explanation for the single Coln for all profit centres would have eluded me

    What, though if you did want many columns for numerous profit centres
    When I do this I use IF FORMULA
    but have to write the formula in each column
    I feel I should be able to write a formula for 1 column and paste it into many columns.
    The absolute and relative references to anchor cells does not work
    Often for visual reasons it is helpful to Instantly see the outcome of the posted data entry
    I fully accept your solution as it is neater and useful
    Thank you for your blogs and lines of thought

    ReplyDelete
    Replies
    1. If you really need that, you can use the anchoring cells by making the column headings for the cost centres the same as the entries in the profit centre column and using the dollars to manage it.

      E.g.

      Say you have four profit centres, North, South, East and West.

      Say your profit centre column is in column D, containing the relevant entry as per my approach above, and the profit figure is in column C. You could then have four individual columns for the cost centres in columns E,F,G and H. The headings in row 1 for these columns should be exactly the same as the four options in column D, i.e. North, South, East and West.

      If you enter the following in cell E2, you should be able to copy it to all 4 columns and copy it down:

      =IF($D2=E$1,$C2,0)

      You could use "" instead of 0 if you prefer.

      If you need a recap on how the dollar signs are doing this, have a look at my earlier post:

      http://www.notjustnumbers.co.uk/2015/01/excel-tip-greatest-excel-tip-of-all-time.html

      I hope that helps.

      Delete