Search Not Just Numbers

Friday, 18 August 2017

Excel Tip: An introduction to SUMPRODUCT and why you should learn it

Let me start with an apology for leaving it so long without a post. It's been a busy time work-wise and, in addition, I've taken up running to finally lose the excess weight and run the Great North Run half marathon next month. After spending the last 20 years or so doing very little exercise while sitting in front of a computer screen, I decided I had to do something about it - and raise some money for a good cause along the way!

OK! Excuses over! I've got a very powerful but often misunderstood function to tell you about.

On the face of it, SUMPRODUCT does a very simple thing, it multiplies arrays together and sums the results. Unless you're a serious mathematician, you're probably thinking that this is not something you've ever had a desire to do! But bear with me.

First of all, here is a simple explanation of how it works:

=SUMPRODUCT({5,6,4},{7,4,12}) returns 107 because:


You can enter up to 255 arrays like this (as long as they are all the same size) and SUMPRODUCT will multiply them then sum the totals as above. That's basically what SUMPRODUCT does. Now let's look at why this is very useful.

First of all, the SUMPRODUCT function allows you to work with arrays without entering the function using Ctrl+Alt+Enter, which means that you don't run the risk of accidentally clicking into the cell and clicking enter, then wondering why your array formula has stopped working. If you don't know what an array formula is ignore this point, as you don't need to worry about it with SUMPRODUCT. If you do, then you'll know what I'm talking about!

More importantly, the arrays entered into SUMPRODUCT can be formulae that result in arrays. The formula above could have been entered as:

=SUMPRODUCT(A2:A4,C2:C4)

if cells A2, A3 and A4 contained 5,6 and 4 respectively and cells C2, C3 and C4 contained 7, 4 and 12.

But we could also (say) add 1 to each value in the first array before multiplying by entering:

=SUMPRODUCT(A2:A4+1,C2:C4)

resulting in 130  - as ((5+1) x 7) + ((6+1) x 4) + ((4+1) x 12)) = 130.

You're probably still saying, "So what?", but here's where it gets useful.

These formulae can use other Excel functions and, even more usefully, conditions that return arrays of Trues and Falses (converted into 1s and 0s).

Let's say that in the following data, we need to know the total quantity of product A sold in the North:



We could use SUMIFS as follows:

=SUMIFS(C2:C8,A2:A8,"North",B2:B8,"A") which returns 38.

Or we could use:

=SUMPRODUCT(--(A2:A8="North"),--(B2:B8="A"),C2:C8) to return the same answer.

This works as follows...

The formula (A2:A8="North") results in the array:

{TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE} because these are the 7 results of the formula (A2="North" is TRUE, A3="North" is FALSE, etc.)

By placing the double minus (--) in front of the formula, it forces this to return 1s and 0s instead of TRUEs and FALSES, i.e.:

{1,0,0,1,0,0,1}

Our SUMPRODUCT formula therefore arrives at the same answer as the SUMIFS because the three arrays we are multiplying are as follows:



By multiplying the two arrays (A2:A8="North") and (B2:B8="A") after converting them to 1s and 0s, the result will only be a 1 if both conditions are true (as if either returns 0, we will be multiplying the other by zero).

Incidentally, if we don't multiply this by the C2:C8 range we can use this as an alternative to COUNTIFS

i.e. 

=SUMPRODUCT(--(A2:A8="North"),--(B2:B8="A"))

is the same as:

=COUNTIFS(A2:A8,"North",B2:B8,"A")

But why would you want to use this instead of SUMIFS, or COUNTIFS?

Well, SUMPRODUCT is a lot more flexible.

Let's say we wanted to know total sales value, rather than quantity.

We can't do this with SUMIFS without adding a new column (being column C x column D) and applying the formula to that, however with SUMPRODUCT we can just multiply by column D right in the formula:

=SUMPRODUCT(--(A2:A8="North"),--(B2:B8="A"),C2:C8,D2:D8) 

or let's say we wanted total sales of product A for North and South:

=SUMPRODUCT(--(A2:A8="North")--(A2:A8="South"),--(B2:B8="A"),C2:C8,D2:D8) 

Here we have added the two arrays together (-- is a +) - which will return a 1 if  column A is EITHER North or South, as one of the columns will be 1 if this is true whereas if it is not, both columns will be 0.

These are still pretty simple uses of the function, but hopefully this is enough to illustrate that they can go far beyond the capabilities of SUMIFS and COUNTIFS.




If you enjoyed this post, go to the top of the blog, where you can subscribe for regular updates and get two freebies "The 5 Excel features that you NEED to know" and "30 Chants for Better Charts".

Wednesday, 5 April 2017

Excel Tip: Look out for this when you use PivotTable filters

Regular readers and clients will know that I'm a big fan of PivotTables. If you format your data into a table format, you can create all sorts of reports quickly, easily and flexibly using them.

In this (short) post, I want to highlight a little trap to watch out for when you filter a PivotTable - particularly if you are filtering out items, such as blanks for example.

On any of the Row, Column or Report Filter fields you can apply a manual filter by ticking (or unticking) individual entries.


In this field I have filtered out the blanks by unticking "(blank)" and leaving everything else ticked. This is exactly the kind of situation that can lead you to fall into the trap.

To understand the problem, you need to think about how a PivotTable works.

The entries in the PivotTable come from the data itself. So this list only includes the Job Descriptions that currently appear in the data (and possibly some that used to, depending on your settings), so the list above may not be comprehensive in future, as new job descriptions appear in the data.

Let's say a new job description of "Clean Fridge Freezer" appears in the data.

Our intention above was to exclude (blank), but Excel doesn't know this. We could just as easily have meant to only include "Clean Oven", "Clean Oven and hob", "Clean Range" and "Clean Range and Fridge Freezer".

So should "Clean Fridge Freezer" be included in the filter or not? We have not given Excel any clue on this, but Excel needs to do something with it, so there is a default position.

By default, Excel assumes that we have pro-actively ticked the things we want to include - which seems a perfectly reasonable assumption, but if that is true, then "Clean Fridge Freezer" hasn't been ticked, so should not be included.

In our example where we were essentially "unticking to exclude", as opposed to "ticking to include", we need to tell Excel this. Fortunately, this is a simple process.

We just need to change a setting in the Field Settings for this field (Job Description in our example), which can be accessed by right-clicking on the Field itself in the PivotTable (or left-clicking on the field in bottom section of the Field List) and choosing Field Settings.

We then need to tick the box to "Include new items in manual filter":


Once this is ticked, new items will be included by default (just for this field). Note that this setting will not change the selection currently in place, so if you hadn't ticked it when you set the filter up and only discover this when a new item isn't included, you will still need to manually tick the new item. However, all future new entries will be automatically ticked.


If you enjoyed this post, go to the top of the blog, where you can subscribe for regular updates and get two freebies "The 5 Excel features that you NEED to know" and "30 Chants for Better Charts".

Friday, 24 February 2017

Excel Tip: Case-sensitive COUNTIFS and SUMIFS

I have covered both SUMIFS and COUNTIFS in earlier posts.

To recap:

You can use COUNTIFS to count the instances of a value in a range, e.g.
=COUNTIFS(A1:A100,"H") 
will count how many cells with value "H" there are within the range A1:A100. 
You can also add additional pairs of criteria, so that 
=COUNTIFS(A1:A100,"H",B1:B100,3) 
will count how many cells with value "H" there are within the range A1:A100, where the corresponding value in the range B1:B100 is 3. 
SUMIFS works in a similar way, but you enter an additional range as the first argument that is summed, e.g. 
=SUMIFS(C1:C100,A1:A100,"H",B1:B100,3) 
will sum the entries in column C, where the corresponding entry in column A is "H" and B is 3.

Where these are both very useful functions, one particular flaw is that they are not case-sensitive.

So, there is no difference between

=COUNTIFS(A1:A100,"H")

and

=COUNTIFS(A1:A100,"h")

A client recently faced this problem in reporting from a holiday spreadsheet where a full day's holiday is marked with an "H" and a half-day is marked with a "h".

The client wished to total the number of days holiday, which if COUNTIFS was case-sensitive, would be as simple as:

=COUNTIFS(A1:A100,"H")+(0.5*COUNTIFS(A1:A100,"h"))

But unfortunately, we can't use COUNTIFS at all!

We do have a function in Excel that allows us to compare two values, taking account of case, and that is the EXACT function.

=EXACT(A1,B1)

will return TRUE if A1 is exactly the same as B1, so

=EXACT("H","h")

would return FALSE.

Unfortunately, we can't use this with COUNTIFS, because COUNTIFS doesn't allow us to simply insert a condition, as it requires us to enter the conditions in pairs, as above.

We can, however, use the SUMPRODUCT function, which works with arrays, to apply the EXACT function to a whole array (e.g. A1:A100).

=SUMPRODUCT(EXACT(A1:A100,"H")*1)

The array EXACT(A1:A100,"H") will be 100 TRUEs and FALSEs. By multiplying these by 1, they become 1s and 0s. So, this will now calculate the number of capital Hs for us.

To complete our example, we can add this to:

=SUMPRODUCT(EXACT(A1:A100,"h")*0.5)

Giving us:

=SUMPRODUCT(EXACT(A1:A100,"H")*1)+SUMPRODUCT(EXACT(A1:A100,"h")*0.5)

which will return the number of Hs added to half the number of hs, giving us our total of holiday taken.

We can easily add multiple criteria too, as well as change it to the equivalent of SUMIFS.

So,

=COUNTIFS(A1:A100,"H",B1:B100,3)

can be made case-sensitive by using

=SUMPRODUCT(EXACT(A1:A100,"H")*1,(B1:B100=3)*1)

as SUMPRODUCT will multiply the resulting arrays of 1s and 0s, so the resulting array will only show a 1 when both conditions are true.

=SUMIFS(C1:C100,A1:A100,"H",B1:B100,3)

can be made case-sensitive by using

=SUMPRODUCT(EXACT(A1:A100,"H")*1,(B1:B100=3)*1,C1:C100)

as the we have added the array C1:C100, being the values in column C, which are multiplied by the 1s and 0s

So, say, cell A2 contains "H", B2 contains 5 and C2 contains 30.

The second position in each of the arrays are as follows:

First array: A2 does exactly equal H so the result is 1
Second Array: B2 doesn't equal three so the result is 0
Third Array: C2 is 30, so the result is 30

The SUMPRODUCT function multiplies these together so that the 2nd position in the resulting array is:

1 x 0 x 30 = 0

And, say, cell A3 contains "H", B3 contains 3 and C3 contains 15.

The third position in each of the arrays are as follows:

First array: A3 does exactly equal H so the result is 1
Second Array: B3 does equal three so the result is 1
Third Array: C3 is 15, so the result is 15

The SUMPRODUCT function multiplies these together so that the 3rd position in the resulting array is:

1 x 1 x 15 = 15

Hopefully you can see that if both conditions are true, we get the value in column C, but if either are false, we are multiplying by zero, so the array that is summed by the SUMPRODUCT function, will only include the values in column C, where the first two conditions are true. This is exactly the same as a SUMIFS, except it is case-sensitive!






If you enjoyed this post, go to the top of the blog, where you can subscribe for regular updates and get two freebies "The 5 Excel features that you NEED to know" and "30 Chants for Better Charts".

Friday, 27 January 2017

Excel Tip: Setting the print area and page breaks

In this post I want to touch on a topic that I've not written a lot about on this blog - printing.

There are many settings in Excel that you can use to influence what is printed, and I'm sure I will revisit this topic to cover some more of them soon.

In this post, I simply want to focus on how to determine how a single worksheet appears on paper, when you click the Print button.

This is essentially driven by two elements:

  • The Print Area - which determines what section of the sheet is printed
  • The Page Breaks - which determine how this is spread across the physical pages printed.

Print Area

By default, the print area will be what is know as the Used Range of the worksheet. This is from cell A1 to the column of the rightmost cell with anything in it to the row of the cell furthest down the worksheet with anything in it.

This, however, can be changed by highlighting the range that you want as your Print Area and selecting Print Area, Set Print Area from the Page Layout ribbon.

If everything you want to print is not adjacent to each other, you can select multiple ranges by holding the Control key while you select them.

You can even set the Print Area as a dynamic range, but we'll leave that for a future post.

Page Breaks

Excel breaks up the Print Area based upon the settings on the Page Layout ribbon.

Here, you can set the page orientation, paper size, margins, etc. as well as set the scale, or more usefully, set the number of pages wide the Print Area should be considered to be and how many pages long (using the Width and Height settings).

These Width and Height settings are, by default, set to automatic, meaning that they will be determined by the page size and orientation, along with the scale set beneath them.

You can, however, fix them so that, for example, the whole Print Area is resized to fit on 1 page wide and 1 page tall (by setting them both to 1). Obviously, with that setting, there will be no page breaks.

Another typical setting is to set the width to 1 but leave the height as automatic. This will resize so that it is always only 1 page wide, but will insert as many page breaks as required for the height.

Most of the time, this is all you need to determine your page breaks, however, there are also times where you want the breaks to happen in specific places. You can preview where they are and move them around using Page Break Preview.

This is accessed from the View ribbon and shows blue lines where the page breaks are and a watermark telling you the page numbers. These blue lines can be dragged to wherever you need them.

That's about it, and the good thing is that all of these settings stay with the sheet, so you should only need to set them once unless the layout of the sheet changes.





If you enjoyed this post, go to the top of the blog, where you can subscribe for regular updates and get two freebies "The 5 Excel features that you NEED to know" and "30 Chants for Better Charts".