|
Tech form
Data querying without query language
I
wrote the article How to eliminate duplicate records in Excel
some time ago (issue dated 20 Nov 2003. Article available at http://www.expresscomputeronline.com/20031020/techspace01.shtml)
I received very good response about this article from readers. However, I also
received a mail from a reader, Mike Musterd, who showed me a much simpler way
of achieving same resultwith just four clicks in Excel no ODBC,
no SQL DISTINCT clause. Just pure Excel.
As I have always been harping on you never stop learning. I really am
thankful to Mike for highlighting this simple but powerful feature.
I started exploring the feature further. Of course it eliminated duplicates
with few clicks. But I continued studying it furtherand then at the end
of an hour, I had explored a feature that is available in just a small dialog
with three text boxes, two option buttons and one checkbox. But the power and
sophisticated functionality it packs is really mind blowing.
So here is the great feature for all of you to notice, explore, use and exploit.
Recap of the previous article
For those of you who have not read the previous article, here is a quick recap.
1. Consider you have some data entered / dumped in Excel
2. You want to find out and eliminate duplicate rows, if any.
3. ODBC provides a driver for Excel data.
4. The Excel file is considered as the Database and named ranges are considered
as tables.
5. So name a range, create an ODBC DSN for the Excel sheet first.
6. Now, use the New database query option of Excel to get all data from the
named range table which could contain duplicates.
7. Now edit the query and use the DISTINCT clause with the SELECT command.
8. Thats it.
That is a fair number of steps.
Problems with this approach
This approach is perfectly usable, technically correct and innovative. The
only problem with it is that a regular user would not be in a position to take
all these steps easily.
Editing queries and ODBC DSN creation is not something that appeals to simple
users. So now we are ready to see the 4-click solution to achieve the same result.
The simpler, yet more sophisticated feature
Let us consider
the same base data we had used last time.
Now follow these steps.
1. Click anywhere within the data. No need to highlight the block of data
2. Open Data menu Filter Advanced Filter. The dialog appears with
the List Range of data block is automatically selected.
3. Now select the check box Unique Records Only. Observe that the
option Filter the list, in-place is already selected by default.
4. Now click OK.
Thats it. Just 4 steps instead of many more!
This is almost magical. Imagine the amount of user focus and clever coding which
goes into putting such features in products. That is why I admire the way Microsoft
has inculcated the ability to add finesse to day-to-day needs year on
year version by version.
Obviously, this functionality is within the reach of even novice users.
Liked it? Use it next time you want to filter data and eliminate duplicates.
More importantly, show this to your colleagures / friends whom you think can
benefit from it.
Even more importantly, try to explore features yourself while you are working.
It is a really rewarding experience.
The magic has just begun
After showing the simpler method, the article snippet should have ended. But
it did not. Look at the dialog again carefully.
We just used the List range in our example. There are other textboxes
called Criteria Range and Copy to.
Copy To option
Copy
to is the simpler one to understand. When you select the Copy to another
location option, the Copy To textbox is enabled. It simply allows you to put
the filter results into another empty area of Excel worksheet rather than doing
the filtering in-place. This is to be used when you want the original data to
be retained as it is.
One important thought: When you think you have completely understood the feature,
stop and explore further. In majority of cases you will find something more!
And soon you will stop convincing yourself that you have completely understood
a feature! When you think you have understood something fully, you stop thinking
further. And effective technology utilisation is all about thinking further.
So dont fall into the trap called Oh, I know this now
Why so much gyan here suddenly? Because, I found another feature related to
the apparently simple Copy To feature.
Consider that you have eliminated the duplicated rows and want to make a copy
of the unique rows. Thats fine. But you dont want all the columns
from the source data. You want only few of them. Lets say only the Qty
column in our case.
How do you manage that? Copying all and then deleting is valid but cumbersome.
Now again someone has thought of this need and created a nice feature for this.
Follow these steps:
1. Before you use the advanced filter, identify the area where you want to copy
the filtered data.
2. Now, just put the labels of the columns that you want in the final copy in
that region.
3. Click somewhere in the original data and choose Advanced Filter.
4. Choose the range for Copy To in such a way that all the column name header(s)
you wrote in the target area are selected.
5. Now choose OK in the Advanced Filter dialog.
6. Thats it. The data will be filtered and only the specified columns
will be copied to the target range.
Is that not clever?
Soon you will start realising that all this is a very simplistic but highly
effective implementation of the SELECT syntax. But more about this later.
What is wrong with In-place filter and AutoFilter?
If you use the Filter Data In Place option, the data is simply filtered
by hiding the unwanted rows same way as AutoFilter does it.
This has some disadvantages. Firstly, when you hide rows of this block of data,
other unrelated data on the same row elsewhere on the worksheet is also hidden.
This may not be desirable.
In fact, the biggest problem with a spreadsheet is that a regular spreadsheet
row does not mean anything. It is the blocks of data that are important. Since
you can hide a row or column at will, it is important to ensure that you enter
data blocks in such a way that they do not have any common rows or columns.
This is a simple conclusion based upon common sense. But often, while designing
or entering data in spreadsheets, we often forget to take this precaution.
Ideally, multiple data blocks should be entered as follows in Excel. This is
a representative diagram. But it will convey the idea.

Each color represents different blocks of data. Of course there will be formulas,
totals and all thatbut each with non-overlapping row and column blocks.
By arranging information in this manner, you never have to use the Shift
cells up/down option. You can easily delete/hide rows or columns for a
given data block without worrying whether it would delete some other data far
away on the same row/column.
Come to think of this it is so simple. But do you know what the real
problem is? We are so busy and savvy that we never end up thinking about such
simple things. And believe me, this thought may be simple but the impact of
this way of data layout management is far from simple. It is profound and highly
productive.
Anyway let us come back to the Advanced Filter. Now the option left is this
simple looking textbox called Criteria Range.
However, the functionality that this tiny text box encompasses
is so much that I will cover it in the next article.
 |
About the Author:Dr
Nitin Paranjape is the Chairman and MD of Maestros (Mediline).
He is a consultant with many organisations, covering appropriate
technology utilisation, business application of relevant technology,
application architecture and audit as well as knowledge transfer.
He has authored more than 650 articles on various technology-related
subjects. He can be contacted at nitin@mediline.co.in |
|