Cashing in on caching
Tech Forum - Dr. Nitin Paranjape
Article
Summary
This article highlights the level of granularity, programmatic control
and sophistication available in the .NET framework for caching. We’re
all familiar with caching as a concept. However, it is no longer
as simple as enabling caching after your application is developed. Now,
we need to consider caching at the architectural level. There are lots
of technologies and lots of options. Using these effectively will lead
to greatly enhanced performance and scalability of your applications.
In an earlier article, I mentioned that one of the ways of ensuring high performance
is to take advantage of every type of caching provided by the available ingredients
of your application. The base OS, the database, the hard disk controller and the CPU itself
all cache information continuously to provide performance gains.
While exploring caching in the .NET framework,
I came across so many areas that I thought they were worth mentioning here.
The idea is not to give an exhaustive hands-on course on caching, but to show
at how many levels you have control over the re-use of recent information, which
is what caching is all about.
What is caching?
Although we are all aware of what caching
is, it is worth revisiting the definition.
In simple terms, we could describe caching
as "the process of storing frequently used data in a medium that is faster
to access than the original medium".
For example, we store data in RAM once
it is read from the hard disk, hoping that the data will be requested again, perhaps by some
other application, and can then be served from RAM rather than from the slower
hard disk. This is what all of us are used to. Various types of applications and devices,
from disk drive controllers to the OS and databases, use this technique for
faster access.
So, why discuss it again? What has changed?
The answer is that a lot has changed. In fact, I would categorise ‘caching’ as
one of the topics that gives one a ‘false sense of knowledge’.
False sense of knowledge
You must have heard of the term ‘false
sense of security’. Once they install a firewall, most people feel their systems
are secure. But that is not so. There is much more to IT security than just installing
firewalls! Similarly, we often suffer from a ‘false sense of knowledge’. Let
me explain…
Every IT professional knows about caching
as explained above. This is the knowledge we gained from our past
experience. Fine. Nothing wrong with that.
But over time, the concept of caching has
expanded to include many more aspects, layers and scenarios. Moreover, caching
was earlier a part of core systems which developers did not have to manage specifically.
For example, database caching is something that the database engine automatically
provides. No active effort is involved. At most, administrators can
tweak the amount of memory reserved for caching.
Because of this apparent familiarity with
a particular concept, we fail to keep track of the enhancements and new capabilities
available in the technology. That is what I call a ‘false sense of knowledge’.
If someone asks, ‘do you know all about caching?’, most will readily say ‘yes’, without
even being aware that what they know is a snapshot of the technology
as it existed many years ago.
In short, caching has become a much more
sophisticated and usable technology that is available to developers as an active
opportunity to create higher-performance applications. It is no longer an OS
or database level default feature alone.
What has changed in caching?
First of all, all the earlier ways of caching
still continue to be used. What has changed is the scope, the context and the
programmability of caching, especially since .NET was introduced.
Caching is now a sophisticated programmable tool. You cannot
just enable caching after an application is written. You have to start
thinking about using caching right from the architectural stage.
This means caching should be thought of
as an active ingredient of your application, not an afterthought.
The change in the baseline concept of caching
is due to the need to manage ‘state’. State means any data that needs to be
preserved for a specific period of time in the context of a specific entity.
For example, the state of user data needs to be maintained during the session
lifetime. Some data may need to be stored permanently (in databases).
Now we can consider caching in a new light.
State management is achieved through caching.
Why cache information?
- To reduce data transfer across boundaries where such transfers are time-
and resource-consuming
- To store data when generating data is complex in the first place
- To minimise retrieval time, when retrieval from the original location
is very slow
Minimise costly transfer across boundaries
What are these boundaries? They could
lie between processes within a single computer, or between computers. In either
case, there is a lot of overhead involved in transferring information.
Making calls between processes requires
using remote procedure calls (RPCs) and data serialisation, both of which can
result in a performance hit. By using a cache to store static or semi-static
data in the consuming process, instead of retrieving it each time from the provider
process, the RPC overhead decreases and the application’s performance improves.
These types of transfers could even require
security validation, which is an additional overhead from a performance point
of view.
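To make the idea concrete, here is a minimal sketch of a cache held in the consuming process. It is written in Python purely for brevity, since the idea is platform-neutral. The class `TTLCache`, the function `fetch_country_name` and the 300-second lifetime are all hypothetical names and values chosen for illustration; the "remote" call is simulated by a local function that counts how often it is invoked.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Keeps results of a costly cross-boundary call for a fixed time."""

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # the costly call into the provider process
        self._ttl = ttl_seconds  # how long an entry is considered fresh
        self._clock = clock      # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # served locally, no boundary crossed
        value = self._fetch(key)  # miss or stale entry: go to the provider
        self._entries[key] = (now, value)
        return value


# Hypothetical stand-in for a remote call; it only counts how often it runs.
calls = {"count": 0}


def fetch_country_name(code: str) -> str:
    calls["count"] += 1
    return {"IN": "India", "US": "United States"}[code]


countries = TTLCache(fetch_country_name, ttl_seconds=300)
countries.get("IN")
countries.get("IN")  # second lookup is answered from the cache
```

The time-to-live suits the static or semi-static data described above: the consumer accepts that a value may be slightly out of date in exchange for skipping the call to the provider.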
Generation of data requires processing
If generation of data requires complex querying,
remote stored procedure calls, heterogeneous database access, etc., the process
is fairly time-consuming. If this type of data is used frequently, you can use
one of two approaches. The first is to eliminate the processing altogether by pre-calculating
the data, for example as a view or a scheduled refresh of a table. The second is caching.
If the data has to be recreated with all the processing, and is reused frequently,
then caching is an obvious choice. If the data changes often, caching
may not be the right option, because we want the final results to be accurate. Accuracy
is more important than performance alone.
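The trade-off between reuse and accuracy can be sketched in a few lines. The example below uses Python's standard `functools.lru_cache` decorator; the `SALES` data and the `regional_total` function are hypothetical stand-ins for data that is expensive to generate.

```python
from functools import lru_cache

# Hypothetical source data; in a real system this would sit behind slow queries.
SALES = {"north": [120, 80, 45], "south": [60, 30]}


@lru_cache(maxsize=128)
def regional_total(region: str) -> int:
    """Stands in for a report that is costly to generate."""
    return sum(SALES[region])


first = regional_total("north")   # computed
second = regional_total("north")  # returned from the cache

# When the underlying data changes, cached results must be discarded,
# otherwise the cache would trade accuracy for speed.
SALES["north"].append(5)
regional_total.cache_clear()
updated = regional_total("north")  # recomputed from the new data
```

Clearing the whole cache on every change is the bluntest possible invalidation strategy. If the data changes very often, the cache is emptied so frequently that it stops paying for itself, which is the situation the paragraph above warns about.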
Minimising slow disk IO
This is the traditional use of caching.
But it is not just for databases. It applies to other files as well, especially
in the .NET context. Loading XML files, schemas and configuration
files is an expensive operation. By using a cache to store the files in the
consuming process instead of reading them from the disk each time, applications
can benefit from performance improvements.
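One possible shape for such a file cache is sketched below in Python. The function `load_xml` and the module-level `_cache` dictionary are hypothetical; the sketch assumes that the file's modification time is a good enough signal that its content has changed, and it uses a throwaway temporary file for the demonstration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# path -> (modification time when parsed, parsed root element)
_cache: Dict[str, Tuple[float, ET.Element]] = {}
parse_count = 0  # only here to show how often the disk is really read


def load_xml(path: str) -> ET.Element:
    """Return the parsed root element, re-reading the file only if it changed."""
    global parse_count
    mtime = os.path.getmtime(path)
    cached = _cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # unchanged on disk: reuse the parsed tree
    root = ET.parse(path).getroot()
    parse_count += 1
    _cache[path] = (mtime, root)
    return root


# Demonstration with a throwaway configuration file.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write("<settings><timeout>30</timeout></settings>")
    config_path = handle.name

timeout = load_xml(config_path).findtext("timeout")
timeout_again = load_xml(config_path).findtext("timeout")  # no second parse
os.remove(config_path)
```

Checking the modification time still touches the file system, but it is far cheaper than reading and parsing the file. Note that modification times have limited resolution on some file systems, so two writes in quick succession may go unnoticed.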
Warning: If caching is used in the wrong
context, or caching uses an inappropriate medium to store the data, it may even
have an adverse impact on performance.
Like any other tool/technology, caching
must be employed only when relevant to the application context.
Where to store cached information
Most often, caching will be done in RAM.
This is obviously because of the lightning speed with which RAM-based data can
be accessed.
However, under certain circumstances it
is necessary to use disk-based caching as well. Here are scenarios where disk
caching is beneficial:
- Offline data usage
- Caching of very large amounts of data, which cannot
fit into available RAM. This should essentially be data that requires
very heavy processing to recreate.
- If the cache has to survive a reboot (in the case of
a computer) or process termination (in the case of processes). RAM content is
lost when the computer reboots, or when the process is terminated.
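The last scenario can be illustrated with a small sketch, again in Python. The `DiskCache` class, the JSON file format and the `build_report` function are assumptions made for the example, not a prescribed design. Creating a second `DiskCache` instance over the same file simulates a process that has been restarted.

```python
import json
import os
import tempfile
from typing import Any, Callable, Dict


class DiskCache:
    """A tiny key/value cache persisted as a JSON file."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._data: Dict[str, Any] = {}
        if os.path.exists(path):
            # Reload whatever an earlier process left behind.
            with open(path, "r", encoding="utf-8") as handle:
                self._data = json.load(handle)

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        if key not in self._data:
            self._data[key] = compute()  # expensive work happens only here
            self._save()
        return self._data[key]

    def _save(self) -> None:
        # Write to a temporary file first, then swap it into place.
        temp_path = self._path + ".tmp"
        with open(temp_path, "w", encoding="utf-8") as handle:
            json.dump(self._data, handle)
        os.replace(temp_path, self._path)


cache_file = os.path.join(tempfile.mkdtemp(), "report_cache.json")
runs = []  # records each time the expensive function actually runs


def build_report() -> dict:
    runs.append(1)
    return {"rows": 3}


DiskCache(cache_file).get_or_compute("report", build_report)
# A fresh instance stands in for the application after a restart.
restored = DiskCache(cache_file).get_or_compute("report", build_report)
```

The disk cache is slower than RAM on every access, so it pays off only when recreating the data is more expensive than reading it back, as noted in the second scenario above.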
Caching in practice
This section contains three lists:
- Caching technologies
- Caching from an architectural perspective
- Caching from a programming perspective
All these lists are inter-related. All
are long, and the permutations are many. Still, I would prefer to
list all these options first.
Why so? Because I want to highlight the
level of sophistication possible in the appropriate use of caching. Considering
the sheer number of available options, you also need to devote adequate
time and thought to using caching effectively while developing applications.
Caching technologies available
- ASP.NET cache
  - a. Session and application objects (also available in ASP)
  - b. Cache object
  - c. Page output cache
  - d. Page fragment cache
- ASP.NET cache used in Win Forms! (Yes, this is possible)
- ASP.NET session state
  - a. In Process
  - b. State Server
  - c. SQL Server
- Remoting Singleton Cache
- Memory mapped files
  - a. Windows NT Service
  - b. Cache Management DLL
- SQL Server
- Static variables
- Client-side caching
  - a. Hidden fields
  - b. Hidden frames
  - c. View State
  - d. Cookies
  - e. Query Strings
- Internet Explorer caching
Caching from an architectural perspective
As you know, applications contain user, business
and data layers from an architectural perspective. Here are the options available
for caching within these application layers.
- User services
  - a. UI Components
  - b. UI Process components
  - c. UI Data
- Business services
  - a. Service interfaces
  - b. Business workflows
  - c. Business components
  - d. Business entities
- Data services
  - a. Data access components
  - b. Helpers
  - c. Service agents
- Security services
- Operational management
  - a. Configuration information
  - b. Metadata
Caching from a programming perspective
- Connection strings
- Data elements
- XML Schemas
- Windows Forms Controls
- Images
- Configuration files
- Security context
As you can see, the lists are mind-boggling.
I am almost sure that no application designed on .NET so far has
considered all these possibilities and chosen the most relevant ones based
upon its specific needs.
I strongly recommend that you go
through the nuances of all these aspects. There is enough material available
in books, on MSDN and on other websites that discusses practically every
aspect of what is highlighted in this article.
The important point is to have the patience
to understand the pros, cons and relevance of each of these available technologies
and apply them in your day-to-day work.
Another important thing: it is possible
that you do not work primarily on the Microsoft platform. No problem. In all other
technologies, caching is equally important and is definitely available as programmatically
controllable functionality. So explore all the types of caching available on your
platform and use them effectively.
About the Author: Dr Nitin
Paranjape is the Chairman and MD of Maestros (Mediline). He is a consultant
with many organisations, covering appropriate technology utilisation, business
application of relevant technology, application architecture and audit as
well as knowledge transfer. He has authored more than 650 articles on various
technology-related subjects. He can be contacted at nitin@mediline.co.in