Cashing in on caching — II
Tech Forum - Dr. Nitin Paranjpe
|
Article summary:
The previous article described various ways of utilising caching within
various application layers for achieving better performance. This article
focuses on some of the methods in greater detail.
|
Before
we go into details of the various types of programmatic caching
technologies, here is a simple but often forgotten tip.
| While using an object that can be cached, first
check for its existence in the cache. If not found, instantiate it explicitly. |
This sounds like common sense, but it is
quite possible (especially in large teams) that an object cached by one developer
is not known to another and everyone instantiates it every time, defeating the
purpose of caching.
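The tip can be sketched in a few lines. This is a conceptual Python illustration, not ASP.NET code: the dictionary `_cache`, the helper `get_or_create` and the loader `load_country_list` are hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict

# Hypothetical in-process cache; a plain dict stands in for the real cache object.
_cache: Dict[str, Any] = {}


def get_or_create(key: str, factory: Callable[[], Any]) -> Any:
    """Return the cached object for key, building it only when it is missing."""
    if key in _cache:          # 1. check the cache first
        return _cache[key]
    value = factory()          # 2. not found: instantiate explicitly
    _cache[key] = value        # 3. store it for every later caller
    return value


calls = []


def load_country_list():
    # Stand-in for an expensive operation such as a database query.
    calls.append(1)
    return ["India", "Japan", "Kenya"]


first = get_or_create("countries", load_country_list)
second = get_or_create("countries", load_country_list)  # no second load
```

If every developer on the team goes through one shared helper of this kind, an object cached by one person is automatically reused by everyone else.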
ASP.NET cache object
This object is similar to the Application
object. Like the application object, the cache object also has an application-specific
life span. It is cleared when the application ends and is re-created when the
application starts. The main difference is that the cache object provides more
functionality required for caching purposes. The caching-specific features are:
a. Expiration policies
b. Dependencies
i. Key-based
ii. File-based
c. Priorities
d. Callbacks
Let us now see what all this means. Most
of these features are based upon real life needs of managing cached items more
effectively. One of the main problems with caching is that of managing the lifespan
of items. Regular caching allows you to invalidate the cache manually. Otherwise
the objects have a default lifespan (session or application). You may want more
refined control over the lifespan.
Time limit:
The simplest control is based upon a definable
time limit. There are two types of time limits:
1. Absolute
2. Sliding
Absolute is simple enough. Just specify
the explicit date and time at which the object will expire. Now, assume you
have kept an absolute limit of 20 minutes based upon your understanding of user
load. Now, suddenly the activity increases and lots of users are using the cached
information. In spite of this, when the absolute time limit is reached, the
cache will expire and the next user will have to wait longer.
Sliding limits solve this problem. When
you use sliding limits, you only specify a time period for which the object
will remain cached. If no user accesses the object during this period, the object
will be released. However, if a user does access the object, the limit is renewed
automatically.
This is similar to object pooling logic
at a conceptual level.
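The difference between the two limits can be made concrete with a small sketch. It is written in Python purely for illustration and does not use the ASP.NET API; the class `ExpiringCache` and its methods are hypothetical, and the clock is injected so that the 20 minute scenario can be replayed without waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringCache:
    """Toy cache supporting absolute and sliding expiration (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, expires_at, sliding window in seconds or None)
        self._items: Dict[str, Tuple[Any, float, Optional[float]]] = {}

    def insert_absolute(self, key: str, value: Any, lifetime: float) -> None:
        # Expires `lifetime` seconds from now, however often it is read.
        self._items[key] = (value, self._clock() + lifetime, None)

    def insert_sliding(self, key: str, value: Any, window: float) -> None:
        # Expires only after `window` seconds without any access.
        self._items[key] = (value, self._clock() + window, window)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at, window = entry
        now = self._clock()
        if now >= expires_at:
            del self._items[key]   # expired: drop it and report a miss
            return None
        if window is not None:
            # Sliding policy: every hit renews the limit.
            self._items[key] = (value, now + window, window)
        return value


now = [0.0]                                   # fake clock, in seconds
cache = ExpiringCache(clock=lambda: now[0])
cache.insert_absolute("report", "A", lifetime=20 * 60)
cache.insert_sliding("prices", "S", window=20 * 60)

now[0] = 15 * 60
hit_abs_15 = cache.get("report")     # still cached
hit_slide_15 = cache.get("prices")   # still cached, and the window is renewed

now[0] = 25 * 60
hit_abs_25 = cache.get("report")     # absolute limit has passed: a miss
hit_slide_25 = cache.get("prices")   # renewed at minute 15, so still cached
```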
How do you decide the interval? This depends
upon the scenario. But most often, cached objects derive data from a database.
Therefore, you should invalidate the cached item based upon your understanding
of the frequency of data updates.
|
Do not keep the interval too short. This will defeat the
purpose of caching because few, if any, requests can be served from the
cache before the item expires.
|
File dependency:
This is a nice feature. The cache item
depends upon a file for its existence. When the file changes, the item is invalidated.
Typically, the files would be XML files. However, you can also use other files
that are relevant to your application.
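As a rough illustration of the idea, the Python sketch below invalidates an item when the file it depends on is modified. It is not how ASP.NET implements the feature: the class `FileDependentCache` is hypothetical, and checking the file's modification time on every read is an assumption made to keep the example short.

```python
import os
import tempfile
from typing import Any, Dict, Optional, Tuple


class FileDependentCache:
    """Toy cache whose items are invalidated when a file on disk changes."""

    def __init__(self) -> None:
        # key -> (value, path, modification time seen at insert)
        self._items: Dict[str, Tuple[Any, str, int]] = {}

    def insert(self, key: str, value: Any, path: str) -> None:
        self._items[key] = (value, path, os.stat(path).st_mtime_ns)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, path, seen_mtime = entry
        try:
            current = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            current = -1           # a deleted file also invalidates the item
        if current != seen_mtime:
            del self._items[key]
            return None
        return value


with tempfile.TemporaryDirectory() as folder:
    config_path = os.path.join(folder, "menu.xml")
    with open(config_path, "w") as f:
        f.write("<menu/>")

    file_cache = FileDependentCache()
    file_cache.insert("menu", "parsed menu", config_path)
    before_change = file_cache.get("menu")

    # Simulate an edit by moving the modification time forward by 10 seconds.
    info = os.stat(config_path)
    os.utime(config_path, ns=(info.st_atime_ns, info.st_mtime_ns + 10_000_000_000))
    after_change = file_cache.get("menu")
```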
Key dependency:
It is possible that some items in the cache
are related to other cached items. This feature allows you to create dependencies
between cached objects. If the parent object expires, the child objects also
expire automatically.
This is useful in scenarios where related
items may not have the same expiry limit. Why is this called key dependency?
Because the dependency is based upon a key rather than the cached item itself.
It is a two-step process. Firstly you make the dependency key based upon a cached
item, and then map it to another dependent item. For syntax, refer to .NET documentation.
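The two-step idea can be sketched as follows. This is a conceptual Python example with hypothetical names (`DependentCache`, `depends_on`); for the real syntax the .NET documentation remains the reference.

```python
from typing import Any, Dict, List, Optional


class DependentCache:
    """Toy cache in which an item can depend on the key of another item."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}
        self._children: Dict[str, List[str]] = {}   # parent key -> dependent keys

    def insert(self, key: str, value: Any, depends_on: Optional[str] = None) -> None:
        if depends_on is not None:
            if depends_on not in self._items:
                raise KeyError(f"parent key {depends_on!r} is not cached")
            # The dependency is recorded against the parent's key, not its value.
            self._children.setdefault(depends_on, []).append(key)
        self._items[key] = value

    def remove(self, key: str) -> None:
        # Removing a parent cascades to everything that depends on its key.
        self._items.pop(key, None)
        for child in self._children.pop(key, []):
            self.remove(child)

    def get(self, key: str) -> Optional[Any]:
        return self._items.get(key)


dep_cache = DependentCache()
dep_cache.insert("catalogue", {"version": 7})
dep_cache.insert("catalogue:top10", ["a", "b"], depends_on="catalogue")
dep_cache.insert("unrelated", 42)

dep_cache.remove("catalogue")   # the dependent "catalogue:top10" goes with it
```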
Managing the cache lifecycle
The type of caching we were used to earlier
was automated. For example, a database query would read data into memory and
automatically cache it. If this was unused for some time, the cached data would
be removed from memory. Next time there was a similar request, the data would
be brought into RAM again and cached. The same is the case with other background
caching mechanisms like disks, processors, IE cache and so on.
Now here we have to manage caching manually,
so to say. We as developers have to decide and implement the following:
1. Decide what needs to be cached.
2. Decide whether the item should
be cached upfront (proactive) or wait for the first call and then cache it
from then on (reactive)
3. Decide and implement the lifespan
of the cached item.
4. Decide if any dependencies are
required
5. Decide how cached items that are
removed due to expiration or dependency are going to be brought back into
cache. Here two approaches are possible:
a. Every usage call to a (supposedly)
cached item should check that the item exists in the cache and that it is valid.
If not, you should call another piece of code that will bring the item back
into cache.
b. Implement an automated mechanism
that will recreate the cached item once it expires.
6. Ensure that caching is neither
excessive nor minimal. Both will reduce performance.
7. Monitor the caching implementation
with real life like loads and fine-tune your implementation. You may even
want to consider deciding upon the caching lifetime, based upon current workload
at runtime.
One important thing to note in this context
is:
| Always check whether a supposedly cached object is
valid when you reference it in your code. |
How to update the cache?
|
You can write code to recreate the cached item when it is
invalidated.
|
Cached items may be removed because of
expiry of allotted lifetime or dependencies. If you want to automatically bring
back the same item into cache when it expires, you will need to know when the
item expires. This notification mechanism is called cache callback. For this
purpose, a CacheItemRemovedCallback delegate is available.
This accepts three parameters:
1. Key – the key (name) under which the
removed item was stored in the cache.
2. Value – the object removed from
the cache.
3. Reason – this specifies the reason
why the item was removed. The possible reasons are :
a. Dependency changed – the item
on which this item depended has changed.
b. Expired – the time limit is over.
c. Removed – item was removed programmatically.
One interesting thing to note here is that an item is also removed, with this
reason, if you add another item into the cache (using the Insert method) under
the same key.
d. Underused – the system decided
to remove this item because of underuse, and also to free
memory.
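The callback mechanism described above can be sketched in Python. The names `CallbackCache`, `RemovedReason` and `refresh` are hypothetical and only mirror the three parameters and four reasons listed; the `expire` method stands in for the timer that would normally expire an item.

```python
from enum import Enum
from typing import Any, Callable, Dict, List, Optional, Tuple


class RemovedReason(Enum):
    # Mirrors the four reasons listed above; the Python names are this sketch's own.
    DEPENDENCY_CHANGED = "dependency changed"
    EXPIRED = "expired"
    REMOVED = "removed"
    UNDERUSED = "underused"


Callback = Callable[[str, Any, RemovedReason], None]


class CallbackCache:
    """Toy cache that notifies a callback whenever an item is removed."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[Any, Optional[Callback]]] = {}

    def insert(self, key: str, value: Any, on_removed: Optional[Callback] = None) -> None:
        # Inserting under an existing key replaces, and therefore removes, the old item.
        if key in self._items:
            self._evict(key, RemovedReason.REMOVED)
        self._items[key] = (value, on_removed)

    def expire(self, key: str) -> None:
        if key in self._items:
            self._evict(key, RemovedReason.EXPIRED)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        return None if entry is None else entry[0]

    def _evict(self, key: str, reason: RemovedReason) -> None:
        value, callback = self._items.pop(key)
        if callback is not None:
            callback(key, value, reason)   # key, value, reason


cb_cache = CallbackCache()
log: List[Tuple[str, Any, RemovedReason]] = []


def refresh(key: str, value: Any, reason: RemovedReason) -> None:
    log.append((key, value, reason))
    if reason is RemovedReason.EXPIRED:
        # Bring the item straight back into the cache with fresh data.
        cb_cache.insert(key, f"{value} (reloaded)", on_removed=refresh)


cb_cache.insert("rates", "v1", on_removed=refresh)
cb_cache.expire("rates")
```

After the simulated expiry, the callback has put a reloaded copy back under the same key, so the next caller finds the item in the cache again.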
Notification-based update
Usually, you cache items that originate
from a database. Caching increases performance because you do not repeat the
database query. But what happens when the data changes? Ideally the cache should
be recreated every time the underlying data changes. This is done using notification-based
updates. To use this feature with SQL Server, you need to install Notification
Services.
Of course, other than this, you could also
implement various notification mechanisms based upon external events to control
the cache lifetime.
Output caching
This is another great feature. When used
in the right context, this can boost performance dramatically. Output means
the page generated from ASPX file. There are two types—full page or parts of
the page (fragments). Full page caching is less useful in real life because
most application-specific pages will need some dynamically generated data.
The concept is simple. ASPX pages will
finally render the output to the client machine. In addition to this, the page
output is also stored in the cache. If another similar request arrives at the
Web server, the page is served from cache. Simple.
Page caching
Dynamic page caching
Most serious applications do not serve
static pages. What do you do for dynamic pages? Now, what are dynamic pages?
Pages whose content will change with repeated requests are dynamic.
Depending on how a page request arrives and
how the output is expected to change, you can cache pages
based upon items such as the parameters passed. If the parameters of a repeat
request are the same, the cached page is served. Otherwise the original page
is executed again.
There is a lot of flexibility in managing
dynamic pages. You can decide whether to cache based upon passed parameters,
headers, specific controls based upon property values or override caching with
custom parameters. A complete description is available in books online.
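Caching by parameter can be illustrated with a short Python sketch. It shows the concept only and is not the ASP.NET mechanism: the class `OutputCache`, the `vary_by` argument and the page function `render_report` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, Mapping, Tuple

RenderFn = Callable[[Mapping[str, str]], str]


class OutputCache:
    """Toy page-output cache keyed on selected request parameters."""

    def __init__(self, render: RenderFn, vary_by: Iterable[str]) -> None:
        self._render = render
        self._vary_by = tuple(sorted(vary_by))
        self._pages: Dict[Tuple[str, ...], str] = {}
        self.render_count = 0

    def handle(self, params: Mapping[str, str]) -> str:
        # Only the listed parameters take part in the cache key.
        cache_key = tuple(params.get(name, "") for name in self._vary_by)
        if cache_key not in self._pages:
            self.render_count += 1
            self._pages[cache_key] = self._render(params)
        return self._pages[cache_key]


def render_report(params: Mapping[str, str]) -> str:
    # Stand-in for the work an ASPX page would do to build its output.
    return f"<h1>Sales for {params['region']}</h1>"


page_cache = OutputCache(render_report, vary_by=["region"])
page_cache.handle({"region": "west", "session": "abc"})
page_cache.handle({"region": "west", "session": "xyz"})   # served from cache
page_cache.handle({"region": "east", "session": "abc"})   # new value: rendered
```

Choosing which parameters take part in the key is the design decision: too few and different users see the wrong page, too many and almost nothing is served from the cache.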
|
Think whether Page Caching can be used even while designing
dynamic pages.
|
Caching pages that are updated periodically
Similarly, there are some pages that are
updated with a known frequency. In such cases, you can cache them with an expiration
policy that matches the update frequency.
For example, if you have a page that generates
a report of a summary of business transactions, and the upload of the data into
the base server occurs at 20 minute intervals, there is a guarantee that within
this 20 minute interval, the page output is going to be exactly the same. Therefore,
you make this page cache live for 20 minutes. After 20 minutes it is re-created
and kept alive for the next 20 minutes.
Web service level caching
Web service calls can also be cached.
Remember, the purpose of caching is to reduce unnecessary processing when repeat
processing is guaranteed to yield the same results. This may apply to certain
Web service method calls as well.
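The same principle applies to any method whose result depends only on its arguments. As a hedged illustration in Python, the standard `functools.lru_cache` decorator remembers results per argument value; the function `get_postal_zone` and its logic are hypothetical stand-ins for a service method.

```python
from functools import lru_cache

lookups = []


@lru_cache(maxsize=128)
def get_postal_zone(postcode: str) -> str:
    # Stand-in for the body of a service method; the logic is hypothetical.
    lookups.append(postcode)
    return "zone-" + postcode[:2]


get_postal_zone("411001")
get_postal_zone("411001")   # identical argument: answered from the cache
get_postal_zone("560001")   # different argument: the method body runs again
```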
Where to store the cached pages?
The obvious answer would be on the
Web server itself. This is of course available. But there are two other interesting
options available.
One is on the client itself! This would
be a very nice thing if the client is likely to ask for pages with similar parameters
repeatedly and the data does not change frequently.
Another nice and well thought out feature
is to store the cached pages on a proxy server so that the proxy server itself
serves the pages and the Web server does not even receive the request!
The location can be specified using the
OutputCacheLocation enumeration. These values are to be used in the @ OutputCache
directive. The possible values are listed at the end of this article.
Fragment caching
When you cannot cache the entire page because
the results are too dynamic, you can still think of using Page Fragment caching.
Often, the entire page output does not change. There could be some elements
that are amenable to caching.
If you can identify such elements that
don’t change often, you can convert these to user controls on the Web form and
specifically cache them.
This is a very powerful feature. As usual,
using this feature cannot be an afterthought. You need to analyse the page contents
to identify fragments that can boost performance based upon caching and design
equivalent user controls upfront.
However, this methodology can also be used
in retrospect for optimising performance of a sluggish page by eliminating reprocessing
of less frequently changing items.
Using ASP.NET caching in NON-ASP applications!
This is an interesting one. Due to the
way we think, it is often possible that we may miss the obvious. All along we
have talked about caching in the Web and ASP.NET context. Therefore, it is quite
natural to never think of this caching when you are not developing web applications.
However, if you apply a little more thought,
you will realise that the cache object is not really specific to ASP.NET based
rendering at all. It is just a general purpose caching methodology that has
been made available within the framework.
Nothing prevents us from using the caching
functionality in other types of applications where there is a base need for
caching.
The purpose of caching is to store state
and avoid repeated processor/disk/database intensive operations. This is a need
that is universal—even in Win form applications.
All that you need to do is add a reference to the System.Web assembly and
access the cache through the System.Web.HttpRuntime.Cache property. That’s it. You have all the
great caching features at your disposal now.
ASP.NET session state
Session state was traditionally tied to
cookies, which carry the session ID. Now it is possible to manage this state even when cookies are disabled.
Further, it is also possible to make this work across a Web farm. This is possible
by using either SQL Server or a Windows Service to store the session state.
This has already been covered in an earlier
article. ("Configuring ASP.NET applications", 11 Nov 2002 issue; URL:
http://www.expresscomputeronline.com/20021111/techspace1.shtml)
View state
View state is a good feature available
in ASP.NET. It is a property that is available for all Web form controls. This
is used to store state across multiple calls to the same page.
This is stored internally in a hidden form
field, in encoded (and optionally hashed) form. The field itself appears in the page
source, but its contents are not readable as plain text when the browser’s
‘View Source’ command is used. So it is safer than traditionally
used hidden fields, although encoding by itself is not encryption.
It is simple and effective. However,
if you use it for complex controls, the performance may degrade.
Proactive vs. reactive caching
When to get the information into the cache
is an important question. If you expect regular usage of the cache, you may
want to populate the cache beforehand rather than waiting for the first request.
This is called proactive loading of state information.
This is useful when you are sure of the
size and update frequency as well as usage of the cache. If you are not sure,
it is better to wait for the first request to populate the cache. This is called
reactive loading.
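The two loading styles can be contrasted in a small Python sketch. The class `LoadingCache` and the loader names are hypothetical; the example only illustrates when the load happens, not any particular framework.

```python
from typing import Any, Callable, Dict


class LoadingCache:
    """Toy cache that can be filled up front or on first request."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]]) -> None:
        self._loaders = loaders
        self._items: Dict[str, Any] = {}

    def warm_up(self, *keys: str) -> None:
        # Proactive: populate the cache before any request arrives.
        for key in keys:
            self._items[key] = self._loaders[key]()

    def get(self, key: str) -> Any:
        # Reactive: load on the first request that misses.
        if key not in self._items:
            self._items[key] = self._loaders[key]()
        return self._items[key]

    def is_cached(self, key: str) -> bool:
        return key in self._items


load_cache = LoadingCache({
    "currencies": lambda: ["INR", "USD"],     # small and used on every page
    "archive": lambda: list(range(1000)),     # large and rarely needed
})

load_cache.warm_up("currencies")              # proactive, at application start
cached_before = (load_cache.is_cached("currencies"), load_cache.is_cached("archive"))

load_cache.get("archive")                     # reactive, on the first request
cached_after = load_cache.is_cached("archive")
```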
Summary
I have covered only a part of the entire
gamut of caching possibilities. My aim is to highlight the importance
of considering caching as a tool for ensuring good performance. The earlier
you start thinking about caching in the application development cycle, the greater
the benefits.
OutputCacheLocation values:
| Value | Where the output may be cached |
| Any | Client, Web server or proxy server |
| Client | Client side |
| Downstream | On a proxy server (or any HTTP caching device) |
| None | No caching allowed |
| Server | Web server |
| ServerAndClient | Server or client machine but NOT on a proxy server |
About the Author: Dr Nitin
Paranjape is the Chairman and MD of Maestros (Mediline). He is a consultant
with many organisations, covering appropriate technology utilisation, business
application of relevant technology, application architecture and audit as
well as knowledge transfer. He has authored more than 650 articles on various
technology-related subjects. He can be contacted at nitin@mediline.co.in