Issue dated - 18th August 2003


Cashing in on caching

Tech Forum - Dr. Nitin Paranjpe

Article Summary
This article highlights the level of granularity, programmatic control and sophistication available in the .NET framework for caching. We’re all familiar with caching as a concept. However, it is no longer as simple as enabling caching after your application is developed. Caching now has to be considered at the architectural level. There are many technologies and many options, and using them effectively can greatly enhance the performance and scalability of your applications.

In an earlier article, I mentioned that one way of ensuring high performance is to take advantage of every type of caching provided by the components of your application. The base OS, the database, the hard disk controller, the CPU itself: all of these cache information constantly to provide performance gains.

While exploring caching in the .NET framework, I came across so many areas that I thought they were worth mentioning here. The idea is not to give an exhaustive hands-on course on caching, but to show at how many levels you have control over the re-use of recent information, which is what caching is all about.

What is caching?

Although we are all aware of what caching is, it is worth revisiting the definition.

In simplistic terms we could refer to caching as "the process of storing frequently used data in a medium that is faster to access than accessing it from the original medium".

For example, we keep data in RAM once it has been read from the hard disk, hoping that the same data will be requested again and can then be served from RAM rather than from the slower hard disk. This is what all of us are used to. Applications and devices of every kind, from disk drive controllers to the OS and databases, use this technique for faster access.

So, why discuss it again? What has changed? The answer is that a lot has changed. In fact, I would categorise ‘caching’ as one of the topics that gives one a ‘false sense of knowledge’.
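The basic idea can be sketched in a few lines. The example below is written in Python rather than against any .NET API, purely as an illustration. The names `ReadThroughCache` and `slow_read` are hypothetical, and the short sleep only stands in for a slow disk or database read.

```python
import time


class ReadThroughCache:
    """Keeps values already fetched in memory, in front of a slower source."""

    def __init__(self, loader):
        self._loader = loader   # function that reads from the slow medium
        self._store = {}        # in-memory copies of values already read
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._loader(key)   # slow path: go to the original medium
        self._store[key] = value
        return value


def slow_read(key):
    """Hypothetical stand-in for a disk or database read."""
    time.sleep(0.01)
    return key.upper()


cache = ReadThroughCache(slow_read)
first = cache.get("report")    # miss: loaded from the slow source
second = cache.get("report")   # hit: served from memory
```

The second request never touches the slow source, which is the whole point of the technique.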

False sense of knowledge

You must have heard the term ‘false sense of security’. Once a firewall is installed, most people feel their systems are secure. But it is not so. There is much more to IT security than just installing firewalls! Similarly, we often suffer from a ‘false sense of knowledge’. Let me explain…

Every IT professional knows about caching as explained above. This is the knowledge we gained from past experience. Fine. Nothing wrong with that.

But over time, the concept of caching has expanded to include many more aspects, layers and scenarios. Moreover, caching was earlier a part of core systems which developers did not have to manage specifically. For example, database caching is something that the database engine automatically provides. No active effort is involved. At the most, the administrators can tweak the amount of memory reserved for caching.

Because of this apparent familiarity with a concept, we fail to keep track of the enhancements and new capabilities available in the technology. That is what I call a ‘false sense of knowledge’. If someone asks, ‘do you know all about caching?’, most will readily say ‘yes’, without even being aware that what they know is a snapshot of the technology as it existed many years ago.

In short, caching has become a much more sophisticated and usable technology that is available to developers as an active opportunity to create higher-performance applications. It is no longer an OS or database level default feature alone.

What has changed in caching?

First of all, all the earlier ways of caching still continue to be used. What has changed is the scope, the context and the programmability of caching, especially since .NET was introduced.

Caching is now a sophisticated programmable tool. You cannot just enable caching after an application is written. You have to start thinking about using caching right from the architectural stage.

This means, caching is to be thought of as an active ingredient of your application—not an afterthought.

The change in the baseline concept of caching is due to the need to manage ‘state’. State means any data that needs to be preserved for a specific period of time in the context of a specific entity. For example, the state of user data needs to be maintained during the session lifetime. Some data may need to be stored permanently (in databases).

Now we can consider caching in a new light.

State management is achieved through caching.
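To make the link between state and caching concrete, here is a minimal sketch of per-user session state held in memory and discarded after a period of inactivity. It is written in Python for illustration and is not the ASP.NET session mechanism. `SessionStore`, the 20-minute timeout and the fake clock are all assumptions made for the example.

```python
class SessionStore:
    """Holds per-session data and drops it after a period of inactivity."""

    def __init__(self, timeout_seconds, clock):
        self._timeout = timeout_seconds
        self._clock = clock      # callable returning the current time in seconds
        self._sessions = {}      # session_id -> (last_access, data dict)

    def data_for(self, session_id):
        now = self._clock()
        entry = self._sessions.get(session_id)
        if entry is None or now - entry[0] > self._timeout:
            data = {}            # new session, or the old one has expired
        else:
            data = entry[1]      # session still alive: reuse its state
        self._sessions[session_id] = (now, data)   # record this access
        return data


# A fake clock (hypothetical) so the example does not have to wait.
now = [0]
store = SessionStore(timeout_seconds=1200, clock=lambda: now[0])

store.data_for("abc")["cart"] = ["book"]
now[0] = 600                                  # 10 minutes later: still alive
cart_mid = store.data_for("abc").get("cart")
now[0] = 1801                                 # over 20 idle minutes: expired
cart_late = store.data_for("abc").get("cart")
```

The state lives only as long as the entity it belongs to, here the user session, remains active.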

Why cache information?

  • To reduce data transfer across boundaries where such transfers are time- and resource-consuming
  • To store data when generating data is complex in the first place
  • To minimise retrieval time, when retrieval from the original location is very slow

Minimise costly transfer across boundaries

What are these boundaries? A boundary could lie between processes within a single computer, or between computers. In either case, there is a lot of overhead involved in transferring information.

Making calls between processes requires using remote procedure calls (RPCs) and data serialisation, both of which can result in a performance hit. By using a cache to store static or semi-static data in the consuming process, instead of retrieving it each time from the provider process, the RPC overhead decreases and the application’s performance improves.

These types of transfers could even require security validation, which is an additional overhead from a performance point of view.
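As a rough illustration of caching in the consuming process, the sketch below wraps a pretend remote call with Python's standard `functools.lru_cache`. The function `fetch_country_list_remote` and its data are hypothetical stand-ins for a real provider process; no actual RPC takes place.

```python
from functools import lru_cache

remote_calls = 0   # counts how often the boundary is actually crossed


def fetch_country_list_remote(region):
    """Hypothetical stand-in for an RPC to a provider process."""
    global remote_calls
    remote_calls += 1
    return ("India", "Nepal") if region == "south-asia" else ()


@lru_cache(maxsize=128)
def get_country_list(region):
    # Only the first call per region pays the RPC and serialisation cost.
    return fetch_country_list_remote(region)


for _ in range(5):
    countries = get_country_list("south-asia")
```

Five requests are made, but the provider is contacted only once. This suits static or semi-static data; data that changes frequently needs an expiry policy as well.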

Generation of data requires processing

If generating the data requires complex querying, remote stored procedure calls, heterogeneous database access and so on, the process is fairly time-consuming. If this type of data is used frequently, you can take one of two approaches. The first is to eliminate the processing altogether by pre-calculating the data, for example as a view or a scheduled refresh of a table. The second is caching. If the data has to be recreated with all the processing each time and is reused frequently, caching is an obvious choice. If the data changes often, however, caching may not be an option, because the final results must be accurate. Accuracy is more important than performance alone.
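One common way to balance reuse against accuracy is to give each cached result a limited lifetime. The sketch below, in Python and for illustration only, caches an expensive result for 60 seconds. `ExpiringCache`, `monthly_totals` and the fake clock are hypothetical names introduced for the example.

```python
import time


class ExpiringCache:
    """Caches computed values for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                 # still fresh: reuse it
        value = compute()                   # expensive work happens here
        self._entries[key] = (now + self._ttl, value)
        return value


# A fake clock (hypothetical) so the example does not have to wait.
fake_now = [0.0]
computations = []


def monthly_totals():
    """Hypothetical stand-in for a complex query."""
    computations.append(1)
    return {"sales": 1200}


report_cache = ExpiringCache(ttl_seconds=60, clock=lambda: fake_now[0])
report_cache.get_or_compute("totals", monthly_totals)   # computed
report_cache.get_or_compute("totals", monthly_totals)   # reused
fake_now[0] = 61.0
report_cache.get_or_compute("totals", monthly_totals)   # expired, recomputed
```

The lifetime is the knob: the shorter it is, the fresher the data and the smaller the performance gain.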

Minimising slow disk IO

This is the traditional use of caching. But it is not just for databases. It could be for other files as well, especially in the .NET context. Loading XML files, schemas and configuration files is an expensive operation. By using a cache to store the files in the consuming process instead of reading them from the disk each time, applications can benefit from performance improvements.
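A cached file is only useful while it matches what is on disk, so a file cache usually checks whether the file has changed before reusing the parsed copy. The sketch below shows one way to do this in Python, using the file's modification time and size as a change signature. The class `FileCache` and the file `settings.json` are hypothetical, and JSON is used instead of XML only to keep the example short.

```python
import json
import os
import tempfile


class FileCache:
    """Parses a file once and reuses the result until the file changes."""

    def __init__(self):
        self._entries = {}   # path -> (signature, parsed content)
        self.loads = 0       # how many times the disk was actually read

    def load_json(self, path):
        stat = os.stat(path)
        signature = (stat.st_mtime_ns, stat.st_size)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == signature:
            return entry[1]              # unchanged on disk: reuse parsed copy
        with open(path, encoding="utf-8") as handle:
            parsed = json.load(handle)   # slow path: read and parse
        self.loads += 1
        self._entries[path] = (signature, parsed)
        return parsed


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "settings.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"timeout": 30}, handle)

    config_cache = FileCache()
    first = config_cache.load_json(path)    # read from disk
    second = config_cache.load_json(path)   # served from the cache

    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"timeout": 120}, handle)
    third = config_cache.load_json(path)    # file changed: read again
```

Checking the signature costs one `os.stat` call per request, which is far cheaper than reading and parsing the file.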

Warning: If caching is used in the wrong context, or caching uses an inappropriate medium to store the data, it may even have an adverse impact on performance.

Like any other tool/technology, caching must be employed only when relevant to the application context.

Where to store cached information

Most often, caching will be done in RAM. This is obviously because of the lightning speed with which RAM-based data can be accessed.

However, under certain circumstances it is necessary to use disk-based caching as well. Here are scenarios where disk caching is beneficial:

  • Offline data usage
  • Caching of very large amounts of data that cannot fit into available RAM. This should be data that requires very heavy processing to recreate.
  • If caching has to survive a reboot (in case of a computer) or process termination (in case of processes). RAM content is lost when the computer reboots, or when the process is terminated.
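The last scenario can be sketched as follows: a small cache that writes its contents to a file so that a later process can pick them up again. This is a Python illustration under simple assumptions (JSON-serialisable values, a single writer). `DiskCache` and the file name `cache.json` are hypothetical.

```python
import json
import os
import tempfile


class DiskCache:
    """Keeps cached values in a file so they survive process termination."""

    def __init__(self, path):
        self._path = path
        self._data = {}
        if os.path.exists(path):
            # A previous process left a cache behind: load it.
            with open(path, encoding="utf-8") as handle:
                self._data = json.load(handle)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value
        # Write to a temporary file first, then swap it in, so that a crash
        # in the middle of a write does not corrupt the existing cache file.
        temp_path = self._path + ".tmp"
        with open(temp_path, "w", encoding="utf-8") as handle:
            json.dump(self._data, handle)
        os.replace(temp_path, self._path)


with tempfile.TemporaryDirectory() as folder:
    cache_file = os.path.join(folder, "cache.json")

    before_restart = DiskCache(cache_file)
    before_restart.put("rates", [1.5, 2.0])

    # A new instance stands in for the application after a restart.
    after_restart = DiskCache(cache_file)
    restored = after_restart.get("rates")
```

Every `put` here rewrites the whole file, which is acceptable for a sketch but too slow for a large cache; that trade-off is exactly why the storage medium must match the context.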

Caching in practice

This section contains three lists.

  1. Caching technologies
  2. Caching from an architectural perspective
  3. Caching from a programming perspective.

All these lists are inter-related. All are long, and the permutations are too many to cover. Still, I would prefer to list all these options first.

Why so? Because I want to highlight the level of sophistication possible in appropriate usage of caching. Considering the sheer number of available options, you also need to devote adequate amounts of time and thought to utilising effective caching while developing applications.

Caching technologies available

  • ASP.NET cache
    • a. Session and application objects (also available in ASP)
    • b. Cache object
    • c. Page output cache
    • d. Page fragment cache
  • ASP.NET cache used in Win Forms! (Yes, this is possible)
  • ASP.NET session state
    • a. In Process
    • b. State Server
    • c. SQL Server
  • Remoting Singleton Cache
  • Memory mapped files
    • a. Windows NT Service
    • b. Cache Management DLL
  • SQL Server
  • Static variables
  • Client-side caching
    • a. Hidden fields
    • b. Hidden frames
    • c. View State
    • d. Cookies
    • e. Query Strings
  • Internet Explorer caching

Caching from an architectural perspective

As you know, applications contain user, business and data layers from an architectural perspective. Here are the options available for caching within these application layers.

  • User services
    • a. UI Components
    • b. UI Process components
    • c. UI Data
  • Business services
    • a. Service interfaces
    • b. Business workflows
    • c. Business components
    • d. Business entities
  • Data services
    • a. Data access components
    • b. Helpers
    • c. Service agents
  • Security services
  • Operational management
    • a. Configuration information
    • b. Metadata

Caching from a programming perspective

  • Connection strings
  • Data elements
    • a. Dataset
    • b. Datareader
  • XML Schemas
  • Windows Forms Controls
  • Images
  • Configuration files
  • Security context

As you can see, the lists are mind-boggling. I am almost sure that no application designed on .NET so far has considered all these possibilities and chosen the most relevant ones based on its specific needs.

I strongly recommend that you go through the nuances of all these aspects. There is enough material available in books, on MSDN and on other websites that discusses practically every aspect highlighted in this article.

The important point is to have the patience to understand the pros, cons and relevance of each of these available technologies and apply these in your day-to-day work.

One more important point. It is possible that you do not work primarily on the Microsoft platform. No problem. In all other technologies, caching is equally important and is definitely available as programmatically controllable functionality. So explore all the types of caching available on your platform and utilise them effectively.

About the Author: Dr Nitin Paranjape is the Chairman and MD of Maestros (Mediline). He is a consultant with many organisations, covering appropriate technology utilisation, business application of relevant technology, application architecture and audit as well as knowledge transfer. He has authored more than 650 articles on various technology-related subjects. He can be contacted at nitin@mediline.co.in


© Copyright 2003: Indian Express Group (Mumbai, India). All rights reserved throughout the world. This entire site is compiled in
Mumbai by The Business Publications Division of the Indian Express Group of Newspapers.