|
Data querying without query language - II
Let
us continue exploring the Criteria Range option to truly appreciate
the power and sophistication of the functionality that Excels
Advanced Filter packs.
Criteria Range option
We have just a block of simple data. That is the List Range. We have the Copy
To area, if we want a copy of the filtered data. Where does the criteria come
in?
Without the criteria range being specified, you can only filter for Unique Rows,
as shown above.
Now, you may not want to filter just for unique rows. You will require further
filtering also. For example, you may require all items that are having quantity
less than 5 or between 5 and 10 and so on.
I am sure you have already used the Auto Filter feature for this. This way,
you get a popup with each heading, which lists all the unique values in the
column. You either select one of them / blanks / non blanks or a custom filter.
Custom filters are available for each column. You typically have the commands
like equal to, greater/lesser than and so on. This allows you to combine two
filter criteria with each other as AND or OR. So far so good.
AutoFilter always works in-place. Copying the filtered data to another
place requires additional manual cut-paste. This is one problem with AutoFilter.
Now consider a complex condition where you want to specify that the Qty should
be either between 3 and 5 or it should be between 100 and 103. How would you
manage that with AutoFilter? Not possible. This is another limitation of the
AutoFilter feature.
Yet another problem with AutoFilter command is the conditions selected for different
columns are always used with the AND clause. There is no way of filtering with
the OR clause.
For example, you cannot ask AutoFilter to show only those rows where the Item
is Pencil OR the Qty is above 10. Impossible.
So how do you manage? In real life data analysis, you will require these things
often. The most obvious choice for techies is to use the ODBC approach where
you can build any query of any complexity using the base fields and nested WHERE
conditions.
But what about end users? They cant learn complex SQL commands anyway.
Now, someone thought about this problemand the result of this thinking
was the simple looking textbox called Criteria Range.
Notice that this textbox does not ask you to enter the actual criteria. So you
cannot enter qty<3 here. It asks you for a range.
As you know, range means a block of cells in Excel. So let us try this. We want
to filter rows where Item is Pencil OR Qty is above 10.
Before we try this, we will need to cancel the unique row filtering we did earlier.
This is simple to do. Just choose Data Filter Show All. Now all
rows are shown as per the original data.
Remember to do this before trying any different filtering method.
Now, follow these steps.
1. Select some other area of the worksheet that is beyond the boundaries of
the base data. This is important because filtering in-place hides the rows.
If the criteria information happens to be on the hidden rows, it would be hidden
too. This may confuse you. So it is best to put criteria information either
above or below the base data, away from it (keep at least 1 blank row).
2. So type the criteria range as follows (shown with yellow background colour).
3. Now go to the Advanced Filter dialog.
4. Check that the base data range is correctly selected in List Range.
5. Now click within the Criteria Range textbox and select our criteria region.
6. For clarity of understanding let us not do filtering in-place. Select an
empty area for filtering output.
7. Now click on Ok.
8. For ease
of understanding, I have colour-coded the examples to be shown. Yellow background
means base data, blue means filter condition and orange means filtered data.
9. Now choose Ok.
How did the filter work here? It is filtering the records where either Pencil
or Paint. This you could have done in AutoFilter with Custom
option. But what if you wanted three or more items? The AutoFilter Custom option
allows only two criteria. Therefore, this is a more flexible and powerful option.
As you can see this has actually worked with two field clauses working with
OR condition. This happened because the Item filter condition was on a different
row than the Qty filter condition.
In short, if you want OR conditions with multiple items on the same field, put
those fields on separate rows.
In general, conditions on the same row of Criteria Range are used as AND. Conditions
on separate rows are used as OR.
Now the next obvious question is how do I manage the AND condition? The answer
is also obvious. Place both field criteria on the same row and try again. It
will now filter data where Item is Pencil AND Qty is above 10. (This is what
AutoFilter does efficiently anyway).
Two conditions with OR on multiple columns
This would be making it very complex.
Let us say, we want to filter records where the Qty is between 4 and 9 OR the
Item is Pen. How do you manage it?
Here is how:
Now, this result is not right. You should have expected the row to either have
the right qty or have item Pen. But the first row is neither. It is Pencil
with Qty of 12. How did this happen? It is a normal behaviour of
Excelnot a problem.
The issue is all the text data you use in filter conditions is used as a starting
point. Therefore Pen finds Pen as well as Pencil.
If you want to specifically find only Pen, then you have to use
the syntax form:
==Pen
Now choose and try selecting the Advanced Filter. You will get the desired results.
Using formula based conditions
You can use conditions that are based upon formulas rather than actual values
within the data itself. There are some rules to be followed here. You can refer
to help to find out more details.
Combining AND as well as OR on the same field
This is an interesting option. So I thought it worth highlighting.
We want to filter based upon quantity. But there are two ranges within which
we want to filter.
Qty should be either between 1 to 5 or between 10 to 15. How do you do this?
To complicate things further, I also have a specific value called 9, which should
also be included.
Here is how the criteria is specified.
Qty Qty
>0 <6
>11 <16
9
Now this is not all. If you explore all this, there are many more methods including
wildcards.
All this is actually SELECT syntax
As you can see all this is possible very easily using standard SQL SELECT syntax.
But end-users cannot be expected to learn this syntax. Therefore, Excel provides
a similar level of functionality to completely novice users by providing the
Criteria Range feature.
As you can see, this allows users to specify filter conditions quickly, visually
and without any confusing syntax. This is a classic example of how complex stuff
can be simplified with little extra thought and effective programming.
So when do you use the ODBC method?
Of course the ODBC method is more flexible and powerful because it has the entire
power of SQL syntax behind it.
1. When you have two or more related data areas to compare, (which are available
in two separate blocks) ODBC is the only method available. The Excel feature
demonstrated above works only on one block of data (table) at a time.
2. Secondly, when you have very complex conditions / computed
columns and so on, which Excel does not support, SQL syntax is the only method
available.
Of course you have a choice. Whether to perform these queries on data that is
dumped in Excel or to do it before the data is put into Excel.
When NOT to use this feature?
This feature works on rows. Therefore, filtering is done row by row.
If you want to view data within a block, but not by row by cell, then
this feature is not useful. What do you use then? For example, you want to see
all quantities that are between 3 and 6 but you dont want to filter anything.
You just want to highlight them.
There are two ways: Conditional formatting and Circle invalid data.
Other ways
Conditional formatting provides a method by which cells satisfying
a particular condition will have a different formatting typically font
colour or style or background colour. Most users know how to handle this. Therefore,
I will not demonstrate this. Use this feature when you want to highlight the
data visually on a permanent basis.
Circle invalid data works when you have a block of data on which
validations have been applied. This is not often used but is very useful when
you want the highlighting to be transient.
Circle invalid data option
| March |
April |
May |
| 458 |
681 |
774 |
| 281 |
596 |
314 |
| 177 |
45 |
418 |
| 81 |
220 |
991 |
| 422 |
468 |
718 |
| 693 |
379 |
225 |
| 481 |
844 |
920 |
| 306 |
733 |
509 |
| 261 |
178 |
716 |
| 394 |
172 |
711 |
| 249 |
778 |
654 |
| 723 |
465 |
569 |
| 648 |
758 |
328 |
| 557 |
522 |
312 |
| 302 |
197 |
275 |
| 659 |
975 |
137 |
Consider the following data. These are some values related to each month. Let
us not worry what these values are. What we want to do is to quickly find values
which are between 950 and above.
Consider that this data is voluminous and it is impossible to manually browse
through it to find out these items. Further, if you do find these items manually,
you will need to make some mark there to identify the items. After you have
finished working, you will have to manually remove all the marks. This is very
tedious and repetitive.
Of course there is a better way available.
1. Highlight the data.
2. Choose Data menu Validations.
3. Open the Allow combo box on the Settings tab
4. Choose Whole Number.
5. Then choose Less Than 950. Note that this condition is REVERSE
of what we wanted (>950).
6. Choose Ok.
Now, Validations is a feature which is to be used primarily when
you want to validate data entry. Thus it is generally used with blank cells
which someone is going to enter data into. In this case we are using this in
retrospect.
Now, all values which are above 950 will be considered as validation errors.
However, because the data entry is already done, Excel does not show any error
message.
Here comes the real good feature.
7. Choose Tools Customize Toolbars tab.
8. Enable Formula Auditing toolbar.
9. On this
toolbar click a button called Circle Invalid Data. Now watch what
happens.
10. This is a very nice and intuitive way. If you remember, auditors used to
encircle invalid entries on paper based reports / ledgers. Excel still retains
that nostalgia while providing great sophistication.
11. Now, how do you remove these circles? Simple. Just click the button Clear
Validation Circles on the Formula Auditing toolbar. Smart isnt it?
12. By the way, this Encircle Invalid data works across the worksheet across
various validations and various data blocks.
13. Thus to use this feature effectively, first consider all blocks of data
which you want to validate. Highlight each block and apply appropriate validation
formulas. And then click the Encircle Invalid Data button. Thats it.
14. Now another finesse. On a large sheet, you may be confused about where exactly
is the invalid data? You may have to scroll and search for the encircled data
in an ad-hoc manner. This may be very cumbersome on a complex sheet. You think
there is a way? Of course. The Microsoft team has thought of this also. Where
do you think the solution to this issue is? The Go To
option!
15. Choose Edit menu Go To. Click on Special. Now observe that there
are many useful things here which you have always ignored. Learn them now.
16. One of these is called Data Validation. This highlights all areas which
have data validations applied to them. This simplifies your searching for the
invalid data.
How to reduce coding using this feature
Now, you may think all this is for end users. But it is not. If you code reports
for end users and you find that such level of filtering is required often by
the users, it is a great idea to deliver your data report in an Excel format.
This way, users can perform any of these filtering actions and data crunching
without even talking to you.
If you do not use Excel as a report writer, each time user wants to have a different
filter, they will ask the developer to change the report. Now each time there
is more coding, more testing, more wastage of time and resources.
Excel based reports completely eliminate this wastage while providing complete
control and flexibility to the end users.
Summary
Remember where we started the article? With a nice tip given by a reader. And
further exploration led us to learn so much more. I never stop getting amazed
by how much there is to learn.
The other amazing thing is that our memory NEVER gets full, NEVER needs to be
upgraded, NEVER heats up our head, NEVER requires more power! Furthermore, it
is ALWAYS ready to absorb more and more knowledge.
So moral of the story is Learn more and enjoy! Your
infrastructure is fully geared up for it.
 |
About the Author:Dr Nitin
Paranjape is the Chairman and MD of Maestros (Mediline). He is a consultant
with many organisations, covering appropriate technology utilisation, business
application of relevant technology, application architecture and audit as
well as knowledge transfer. He has authored more than 650 articles on various
technology-related subjects. He can be contacted at nitin@mediline.co.in |
|