Issue dated - 12th January 2004

-


Previous Issues

CURRENT ISSUE
NEWS ANALYSIS
INDIA NEWS
COLUMNS
TECH FORUM

THE C# COLUMN

BETWEEN THE BYTES
TECHNOLOGY
SPECIALS <NEW>
Symantec Report
Security Headquarters
JobsDB
MINDPRINTS
HMA BANKBIZ
EC SERVICES
ARCHIVES/SEARCH
IT APPOINTMENTS
Openings At Jobstreet.com
WRITE TO US
SUBSCRIBE/RENEW
CUSTOMER SERVICE
ADVERTISE
ABOUT US

 Network Sites
  IT People
  Network Magazine
  Business Traveller
  Exp. Hotelier & Caterer
  Exp. Travel & Tourism
  Exp. Pharma Pulse
  Exp. Healthcare Mgmt.
  Express Textile
 Group Sites
  ExpressIndia
  Indian Express
  Financial Express

 
Front Page > TechSpace > Story Print this Page|  Email this page

Data querying without query language - II

Let us continue exploring the Criteria Range option to truly appreciate the power and sophistication of the functionality that Excel’s Advanced Filter packs.

Criteria Range option

We have just a block of simple data. That is the List Range. We have the Copy To area, if we want a copy of the filtered data. Where does the criteria come in?

Without the criteria range being specified, you can only filter for Unique Rows, as shown above.

Now, you may not want to filter just for unique rows. You will require further filtering also. For example, you may require all items that are having quantity less than 5 or between 5 and 10 and so on.

I am sure you have already used the Auto Filter feature for this. This way, you get a popup with each heading, which lists all the unique values in the column. You either select one of them / blanks / non blanks or a custom filter. Custom filters are available for each column. You typically have the commands like equal to, greater/lesser than and so on. This allows you to combine two filter criteria with each other as AND or OR. So far so good.

AutoFilter always works ‘in-place’. Copying the filtered data to another place requires additional manual cut-paste. This is one problem with AutoFilter.

Now consider a complex condition where you want to specify that the Qty should be either between 3 and 5 or it should be between 100 and 103. How would you manage that with AutoFilter? Not possible. This is another limitation of the AutoFilter feature.

Yet another problem with AutoFilter command is the conditions selected for different columns are always used with the AND clause. There is no way of filtering with the OR clause.

For example, you cannot ask AutoFilter to show only those rows where the Item is Pencil OR the Qty is above 10. Impossible.

So how do you manage? In real life data analysis, you will require these things often. The most obvious choice for techies is to use the ODBC approach where you can build any query of any complexity using the base fields and nested WHERE conditions.

But what about end users? They can’t learn complex SQL commands anyway. Now, someone thought about this problem—and the result of this thinking was the simple looking textbox called Criteria Range.

Notice that this textbox does not ask you to enter the actual criteria. So you cannot enter ‘qty<3’ here. It asks you for a range.

As you know, range means a block of cells in Excel. So let us try this. We want to filter rows where Item is Pencil OR Qty is above 10.

Before we try this, we will need to cancel the unique row filtering we did earlier. This is simple to do. Just choose Data – Filter – Show All. Now all rows are shown as per the original data.

Remember to do this before trying any different filtering method.

Now, follow these steps.

1. Select some other area of the worksheet that is beyond the boundaries of the base data. This is important because filtering in-place hides the rows. If the criteria information happens to be on the hidden rows, it would be hidden too. This may confuse you. So it is best to put criteria information either above or below the base data, away from it (keep at least 1 blank row).

2. So type the criteria range as follows (shown with yellow background colour).

3. Now go to the Advanced Filter dialog.

4. Check that the base data range is correctly selected in List Range.

5. Now click within the Criteria Range textbox and select our criteria region.

6. For clarity of understanding let us not do filtering in-place. Select an empty area for filtering output.

7. Now click on Ok.

8. For ease of understanding, I have colour-coded the examples to be shown. Yellow background means base data, blue means filter condition and orange means filtered data.

9. Now choose Ok.

How did the filter work here? It is filtering the records where either ‘Pencil’ or ‘Paint’. This you could have done in AutoFilter with ‘Custom…’ option. But what if you wanted three or more items? The AutoFilter Custom option allows only two criteria. Therefore, this is a more flexible and powerful option.

As you can see this has actually worked with two field clauses working with OR condition. This happened because the Item filter condition was on a different row than the Qty filter condition.

In short, if you want OR conditions with multiple items on the same field, put those fields on separate rows.

In general, conditions on the same row of Criteria Range are used as AND. Conditions on separate rows are used as OR.

Now the next obvious question is how do I manage the AND condition? The answer is also obvious. Place both field criteria on the same row and try again. It will now filter data where Item is Pencil AND Qty is above 10. (This is what AutoFilter does efficiently anyway).

Two conditions with OR on multiple columns

This would be making it very complex.

Let us say, we want to filter records where the Qty is between 4 and 9 OR the Item is Pen. How do you manage it?

Here is how:

Now, this result is not right. You should have expected the row to either have the right qty or have item Pen. But the first row is neither. It is “Pencil” with Qty of “12”. How did this happen? It is a normal behaviour of Excel—not a problem.

The issue is all the text data you use in filter conditions is used as a starting point. Therefore ‘Pen’ finds ‘Pen’ as well as ‘Pen’cil. If you want to specifically find only ‘Pen’, then you have to use the syntax form:

=”=Pen”

Now choose and try selecting the Advanced Filter. You will get the desired results.

Using formula based conditions

You can use conditions that are based upon formulas rather than actual values within the data itself. There are some rules to be followed here. You can refer to help to find out more details.

Combining AND as well as OR on the same field

This is an interesting option. So I thought it worth highlighting.

We want to filter based upon quantity. But there are two ranges within which we want to filter.

Qty should be either between 1 to 5 or between 10 to 15. How do you do this? To complicate things further, I also have a specific value called 9, which should also be included.

Here is how the criteria is specified.

Qty     Qty

>0       <6
>11     <16
9

Now this is not all. If you explore all this, there are many more methods including ‘wildcards’.

All this is actually SELECT syntax

As you can see all this is possible very easily using standard SQL SELECT syntax. But end-users cannot be expected to learn this syntax. Therefore, Excel provides a similar level of functionality to completely novice users by providing the Criteria Range feature.

As you can see, this allows users to specify filter conditions quickly, visually and without any confusing syntax. This is a classic example of how complex stuff can be simplified with little extra thought and effective programming.

So when do you use the ODBC method?

Of course the ODBC method is more flexible and powerful because it has the entire power of SQL syntax behind it.

1. When you have two or more related data areas to compare, (which are available in two separate blocks) ODBC is the only method available. The Excel feature demonstrated above works only on one block of data (table) at a time.

2. Secondly, when you have very complex conditions / computed columns and so on, which Excel does not support, SQL syntax is the only method available.

Of course you have a choice. Whether to perform these queries on data that is dumped in Excel or to do it before the data is put into Excel.

When NOT to use this feature?

This feature works on rows. Therefore, filtering is done row by row.

If you want to view data within a block, but not by row – by cell, then this feature is not useful. What do you use then? For example, you want to see all quantities that are between 3 and 6 but you don’t want to filter anything. You just want to highlight them.

There are two ways: Conditional formatting and Circle invalid data.

Other ways

‘Conditional formatting’ provides a method by which cells satisfying a particular condition will have a different formatting – typically font colour or style or background colour. Most users know how to handle this. Therefore, I will not demonstrate this. Use this feature when you want to highlight the data visually on a permanent basis.

‘Circle invalid data’ works when you have a block of data on which validations have been applied. This is not often used but is very useful when you want the highlighting to be transient.

Circle invalid data option

March April May
458 681 774
281 596 314
177 45 418
81 220 991
422 468 718
693 379 225
481 844 920
306 733 509
261 178 716
394 172 711
249 778 654
723 465 569
648 758 328
557 522 312
302 197 275
659 975 137

Consider the following data. These are some values related to each month. Let us not worry what these values are. What we want to do is to quickly find values which are between 950 and above.

Consider that this data is voluminous and it is impossible to manually browse through it to find out these items. Further, if you do find these items manually, you will need to make some mark there to identify the items. After you have finished working, you will have to manually remove all the marks. This is very tedious and repetitive.

Of course there is a better way available.

1. Highlight the data.

2. Choose Data menu – Validations.

3. Open the ‘Allow’ combo box on the Settings tab

4. Choose Whole Number.

5. Then choose ‘Less Than 950’. Note that this condition is REVERSE of what we wanted (>950).

6. Choose Ok.

Now, ‘Validations’ is a feature which is to be used primarily when you want to validate data entry. Thus it is generally used with blank cells which someone is going to enter data into. In this case we are using this in retrospect.

Now, all values which are above 950 will be considered as validation errors.

However, because the data entry is already done, Excel does not show any error message.

Here comes the real good feature.

7. Choose Tools – Customize – Toolbars tab.

8. Enable Formula Auditing toolbar.

9. On this toolbar click a button called ‘Circle Invalid Data’. Now watch what happens.

10. This is a very nice and intuitive way. If you remember, auditors used to encircle invalid entries on paper based reports / ledgers. Excel still retains that nostalgia while providing great sophistication.

11. Now, how do you remove these circles? Simple. Just click the button ‘Clear Validation Circles’ on the Formula Auditing toolbar. Smart isn’t it?

12. By the way, this Encircle Invalid data works across the worksheet across various validations and various data blocks.

13. Thus to use this feature effectively, first consider all blocks of data which you want to validate. Highlight each block and apply appropriate validation formulas. And then click the Encircle Invalid Data button. That’s it.

14. Now another finesse. On a large sheet, you may be confused about where exactly is the invalid data? You may have to scroll and search for the encircled data in an ad-hoc manner. This may be very cumbersome on a complex sheet. You think there is a way? Of course. The Microsoft team has thought of this also. Where do you think the solution to this issue is? The Go To… option!

15. Choose Edit menu – Go To. Click on Special. Now observe that there are many useful things here which you have always ignored. Learn them now.

16. One of these is called Data Validation. This highlights all areas which have data validations applied to them. This simplifies your searching for the invalid data.

How to reduce coding using this feature

Now, you may think all this is for end users. But it is not. If you code reports for end users and you find that such level of filtering is required often by the users, it is a great idea to deliver your data report in an Excel format. This way, users can perform any of these filtering actions and data crunching without even talking to you.

If you do not use Excel as a report writer, each time user wants to have a different filter, they will ask the developer to change the report. Now each time there is more coding, more testing, more wastage of time and resources.

Excel based reports completely eliminate this wastage while providing complete control and flexibility to the end users.

Summary

Remember where we started the article? With a nice tip given by a reader. And further exploration led us to learn so much more. I never stop getting amazed by how much there is to learn.

The other amazing thing is that our memory NEVER gets full, NEVER needs to be upgraded, NEVER heats up our head, NEVER requires more power! Furthermore, it is ALWAYS ready to absorb more and more knowledge.

So moral of the story is – Learn more and enjoy! Your infrastructure is fully geared up for it.

About the Author:Dr Nitin Paranjape is the Chairman and MD of Maestros (Mediline). He is a consultant with many organisations, covering appropriate technology utilisation, business application of relevant technology, application architecture and audit as well as knowledge transfer. He has authored more than 650 articles on various technology-related subjects. He can be contacted at nitin@mediline.co.in
<Back to top>


© Copyright 2003: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in
Mumbai by The Business Publications Division of the Indian Express Group of Newspapers.
Please contact our Webmaster for any queries on this site.