How Rediff’s ‘caching’ in on content
If you are one of the few who think that personalisation
on the Web is all about placing cookies on user PCs and tracking
behaviour, think again. Rediff.com, one of India’s most popular
portals, is about to change all this with the help of indigenously-developed
systems using artificial intelligence to not only recommend other
links but also boost related e-commerce. Srikanth R P has the details.
Around five years ago, when India’s sleepy
population was slowly waking up to the Internet revolution, one
of the country’s top portals, rediff.com, was looking at ways to
make its content stand out. Unlike traditional media like newspapers
and television, Rediff had no precedents to draw on, simply because
the Internet phenomenon in India was just starting out. Creating
compelling content was therefore the key to not only survival but
also competitive advantage.
But creating compelling content was not
enough as each and every person on the Internet has different topics
of interest. Rediff therefore began to look at the various ways
through which the site’s content could be served according to each
user’s interest. One was the familiar route of placing data in cookies
stored on users’ PCs, which could help the site owner monitor the
type of users coming to the site and make changes accordingly. But
if the user was surfing the Net from a cyber cafe, this approach
was not useful.
Customised content
The second option was to follow an approach
like msn.com or excite.com, which try to maintain loyalty by inviting
users to customise their preferences. These sites recognise the
user after he has signed in, and serve pages according to the preferences
defined. While this was an advantage over the cookie method, it
restricted the user to specific areas of interest. Further, a user
had to register himself on the site to avail of personalisation
features.
Rediff wanted to avoid these two methods
and yet understand a user’s preferences without locking him into
any specific format. While the two methods were examples of explicit
personalisation, Rediff wanted to go beyond them and develop a system
that could recommend news items to each user depending on the
article he or she was currently reading.
Explains Zaki Ansari, senior editor, Rediff,
"Our portal targets Indians across the world. But unlike traditional
media, we cannot have different geographic editions. Nor can we
define our content according to various time zones. While the medium
of the Net offers a tremendous advantage, it is also a tremendous
disadvantage if one cannot serve the right content to the right
user. We also wanted our content to be flexible in the sense that
the content would be separate from the design." That meant
that once content was published on the site, Rediff would need the
flexibility of serving it to any device on-the-fly, be it a handheld
or a PC or a mobile phone. Rediff then came up with a concept called
object-oriented journalism, which treats individuals as the primary
objects and classifies them into communities. This means that when
a user is reading a particular story, the site automatically generates
related story links. At the same time, the site also generates links
for a sub-section called ‘People who read this also read.’ The idea
is to serve similar content for a set of people who may be like-minded
in thinking and may like similar sets of stories, Ansari adds.
What is unique about Rediff’s approach is
that all the links are generated automatically using software that
has artificial intelligence capabilities. Rediff’s approach is a
departure from the traditional way a user navigates a site. For
example, even if a particular story is not part of a particular
section, a user can still read the story by way of the recommended
links generated. This encourages lateral surfing and increases the
time a user spends on the site. Ansari claims that the site can
match almost 80 percent of the surfing pattern of an average user.
Since this approach does not put stories into a particular section
(even though Rediff also provides sections like news, cricket and
movies), an average user will surf a greater number of stories.
So when you click on a news link, ‘More clarifications needed for
sending troops to Iraq,’ the system automatically generates recommended
links like ‘Iran enjoys a unique system’ or ‘Congress against sending
troops to Iraq.’
Contextual shopping
This, the company says, helps the site to
place like-minded people in communities even if there is no logical
resemblance between two sections. The same idea can be extended
further. To give another example, if a Hindi movie is released,
Rediff could automatically generate links that show the availability
of VCDs of the movie in Rediff’s shopping section. Similarly, a
ring-tone based on the movie’s songs could be made available to
the user whenever he clicks on news related to the movie. The same
is the case with news links like ‘Sachin hits ton in rain-affected
match.’ Rediff can automatically generate links that show the availability
of books on the master batsman in the shopping section. Additionally,
if a user wants to receive an alert every time there is a development
concerning ‘Infosys,’ Rediff can send one, as the content is XML-based
and can be tailored to suit different requirements.
Once the goals were defined, Rediff started
building the technology for translating intention into reality.
Today, any article published on the site is independent of design,
and can be moulded to fit any device and any form. The organisation
also created an editorial workflow system called the Rediff Backyard,
a browser-based system that allows reporters to file their stories
from any place in the world without worrying about the formatting.
This is a kind of virtual newsroom, and team members can at any
point of time know the various stories filed in various sections.
After a reporter files the report, the system converts the article
into XML format. XML ensures that content and form are well defined
and kept separate from each other. The XML copy contains a description
of the data according to features like headline, byline and date.
Since the data is separate from formatting features like colour,
font and alignment, Rediff can publish it in any design and form.
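To make the separation of content and form concrete, here is a minimal sketch in Python. It is not Rediff's actual schema or code: the element names, the byline and date values, and the functions `parse_article`, `render_html` and `render_text` are all assumptions made for illustration. The point is only that one XML copy, carrying headline, byline, date and body, can feed two quite different presentations.

```python
import textwrap
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article record. Element names are illustrative, and the
# byline and date are placeholders rather than real publication data.
ARTICLE_XML = """
<article>
  <headline>Sachin hits ton in rain-affected match</headline>
  <byline>Staff Reporter</byline>
  <date>2003-01-01</date>
  <body>Placeholder body text for the story, kept free of any colour, font or alignment information.</body>
</article>
"""

def parse_article(xml_text):
    """Return the article fields as a plain dict with no formatting attached."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}

def render_html(article):
    """One possible presentation: a web page fragment for a PC browser."""
    return (
        f"<h1>{escape(article['headline'])}</h1>\n"
        f"<p class=\"meta\">{escape(article['byline'])}, {escape(article['date'])}</p>\n"
        f"<p>{escape(article['body'])}</p>"
    )

def render_text(article, width=40):
    """Another presentation: narrow plain text for a small screen."""
    lines = [article["headline"].upper(),
             f"{article['byline']} | {article['date']}",
             ""]
    lines.extend(textwrap.wrap(article["body"], width=width))
    return "\n".join(lines)

article = parse_article(ARTICLE_XML)
print(render_html(article))
print(render_text(article))
```

Both renderers read the same dictionary, so a new device would need only a new rendering function, not a new copy of the story.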
Intelligent categorisation
The copy then passes through the Indexer,
which tries to understand the meaning of each article and assigns a
weight to every word. For instance, ‘a’ would get the
least weightage because it is a very common word. But a word like
‘Reliance’ will have over 90 percent weightage as it has more importance
compared to other words in an article. The Indexer uses Bayesian
inference to estimate how closely a particular word relates to the
meaning of a particular article. Next comes the Categoriser,
which acts like a human editor. It sorts copies on the basis of
rules written by the Rediff team members. Rediff has tried to follow
the International Press Telecommunications Council’s standard that
classifies an article according to different levels. For example,
a rule can be written to distinguish news articles that contain
a word like, say, Apple. If words like information, technology,
hardware and Mac also appear in the same article, then the inference
can be drawn that the user is probably looking for information pertaining
to Apple Macintosh and not the fruit. As more words and rules are
fed into the Indexer and Categoriser, the system becomes more intelligent.
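The two stages described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Rediff's implementation: the article does not publish the Indexer's Bayesian model, so the sketch swaps in a simple inverse-document-frequency weight, which reproduces only the behaviour described (common words score low, distinctive words score high). The rule format, the category label and the function names are likewise hypothetical, and the label is not a real IPTC code.

```python
import math
import re

def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def word_weights(article, corpus):
    """Weight each word of `article` between 0 and 1.

    Stand-in for the Indexer: a smoothed inverse document frequency, NOT
    the Bayesian method the article mentions. A word found in every
    document of `corpus` scores 0; a word found in none scores 1.
    """
    if not corpus:
        raise ValueError("corpus must contain at least one document")
    n_docs = len(corpus)
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    weights = {}
    for word in set(tokenize(article)):
        doc_freq = sum(1 for words in doc_sets if word in words)
        weights[word] = math.log((1 + n_docs) / (1 + doc_freq)) / math.log(1 + n_docs)
    return weights

# Hypothetical editor-written rule: a trigger word, the context words that
# disambiguate it, how many of them must appear, and the category to assign.
RULES = [
    {
        "trigger": "apple",
        "context": {"information", "technology", "hardware", "mac"},
        "min_hits": 2,
        "category": "technology/apple-macintosh",
    },
]

def categorise(article, rules=RULES, default="uncategorised"):
    """Return the category of the first rule that matches, else `default`."""
    words = set(tokenize(article))
    for rule in rules:
        hits = len(rule["context"] & words)
        if rule["trigger"] in words and hits >= rule["min_hits"]:
            return rule["category"]
    return default

corpus = [
    "a story about Reliance and a merger",
    "a story about cricket",
    "a match and a score",
]
weights = word_weights(corpus[0], corpus)
print(weights["a"], weights["reliance"])
print(categorise("Apple unveils new Mac hardware"))
print(categorise("Apple prices rise in Himachal orchards"))
```

Adding a rule means appending one more dictionary to `RULES`, which loosely mirrors the idea that the system improves as editors feed in more words and rules.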
There is also a related-content engine that
looks for similar or related stories across different sections depending
on the weightage attached to the stories by the Indexer. Finally,
there is the recommendation engine that suggests stories to the
user under the ‘People who read this also read’ category.
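The article does not say how either engine is built, so the following Python sketch shows only one plausible reading. It assumes the related-content engine compares stories by the cosine similarity of their word weights, and that the recommendation engine counts which stories were read in the same visits. The story identifiers, weights, sessions and function names are all invented for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two {word: weight} dictionaries."""
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def related_stories(story_id, weights_by_story, top_n=3):
    """Rank other stories by how closely their word weights match."""
    target = weights_by_story[story_id]
    scored = [(cosine(target, weights), other)
              for other, weights in weights_by_story.items()
              if other != story_id]
    scored = [(score, other) for score, other in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]

def also_read(story_id, sessions, top_n=3):
    """'People who read this also read': count stories sharing a visit."""
    counts = Counter()
    for session in sessions:
        stories = set(session)
        if story_id in stories:
            counts.update(stories - {story_id})
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [story for story, _ in ranked[:top_n]]

# Invented sample data: word weights per story and three reading sessions.
weights_by_story = {
    "iraq-troops": {"iraq": 0.9, "troops": 0.8},
    "congress-iraq": {"iraq": 0.9, "congress": 0.7},
    "sachin-ton": {"sachin": 0.9, "ton": 0.6},
}
sessions = [
    ["iraq-troops", "congress-iraq"],
    ["iraq-troops", "congress-iraq", "sachin-ton"],
    ["sachin-ton"],
]
print(related_stories("iraq-troops", weights_by_story))
print(also_read("iraq-troops", sessions))
```

The two functions answer different questions: the first relies only on what the stories say, while the second relies only on what readers did, which is why a cricket story can surface beside a political one.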
Rediff’s approach is an innovative way of
understanding rapidly changing behavioural trends, and taking the
help of technology to surge ahead in an industry that is still looking
for ways to make money out of content.
