Can Newgen revolutionise document compression?
In a market already dominated by established
compression schemes like CCITT G4, JPEG, JBIG2 and LZW, the viability
of a new product is open to question. But with superior technology
coupled with an ingenious business strategy, document management
and imaging solutions provider Newgen Software is fully geared to
give tough competition to the existing image compression algorithms,
says Shipra Arora
Looking at the enormous growth in volumes of data,
any advancement in document compression technology is a milestone.
What makes compressed images an important business need is the fact
that documents are being constantly archived, communicated and manipulated
in digital format and there is a growing demand for instant access
to high quality documents.
Newgen's recent technology innovation in
colour document compression called Newgen Image File-Format (NIF),
a result of two-and-a-half years of research and development efforts
by the Advanced Image Processing Group, is believed to offer compression
rates of up to 300 times. Hareish Gur, group head & deputy general
manager, Advanced Imaging Group, Newgen Software Technologies says
that NIF will be both a compression scheme and a file format (.NIF),
just like JPEG. What will drive competition, however, is the company's
strategy of piggybacking on competition by integrating the NIF compression
scheme into Adobe's PDF format. These factors can help in establishing
the NIF brand name. The company is even considering patenting the
technology, though it has not taken a decision on this as yet. With
the beta version being released this month, this technology will
be commercially available by early next quarter.
What's new in NIF?
According to Gur, NIF is an open standard format
that allows ultra-high compression ratios for scanned colour and
grayscale office documents without losing text legibility and OCRability
of the scanned document.
But the real benefit will depend on the quality
of images, because it is generally observed that the higher the
compression, the greater the distortion and blockiness in the images.
"NIF will not only offer the advantage of higher compression
but also the ability to do so without losing text and colour resulting
in high resolution," explains Gur.
Some of the business benefits to users are
cost savings in storage and time savings in transferring
colour documents over the Internet. This is even more critical considering
the bandwidth scenario, as a majority of users still dial up for
Internet connectivity. A smaller image size will mean that images
can be easily and instantly transmitted and viewed via standard
Web browsers, making it more efficient to scan, store, download
and email mission-critical documents via corporate intranets
or even the Internet. These are the benefits that the company will
have to explain to users to get them on board, says an industry
expert.
How does it work?
The technology works on multi-layer compression.
The scanned document is separated into multiple layers: a layer
containing high-resolution text (or hard edges); one layer of low-resolution
background; another layer containing colours and soft edges. Then
each layer is compressed separately, using the algorithm that
yields the best results for image size and clarity for that kind of
content. The choice is driven by the technology's own analysis of the
document. The technology
uses JPEG and JPEG 2000 for lossy and CCITT G4 and JBIG2 for lossless
compression.
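To make the layering idea concrete, here is a minimal sketch in Python. It is not Newgen's segmentation algorithm, which the company has not published here. It assumes a grayscale page, uses a plain darkness threshold to find "text", and produces only two of the layers described: a full-resolution bitonal mask and a downsampled background. The function name `split_layers` and its parameters are hypothetical.

```python
def split_layers(page, text_threshold=64, bg_factor=2):
    """Split a grayscale page (rows of 0..255 values, 0 = black) into
    a full-resolution bitonal text mask and a low-resolution background.

    Illustrative only: real segmentation is far more sophisticated
    than a fixed threshold.
    """
    height, width = len(page), len(page[0])

    # Text layer: 1 where the pixel is dark enough to count as text.
    # It keeps the original resolution so characters stay sharp.
    mask = [[1 if px < text_threshold else 0 for px in row] for row in page]

    # Background layer: average the non-text pixels of each
    # bg_factor x bg_factor block, which lowers the resolution.
    background = []
    for top in range(0, height, bg_factor):
        out_row = []
        for left in range(0, width, bg_factor):
            samples = [
                page[y][x]
                for y in range(top, min(top + bg_factor, height))
                for x in range(left, min(left + bg_factor, width))
                if not mask[y][x]
            ]
            # A block made up entirely of text falls back to white.
            out_row.append(sum(samples) // len(samples) if samples else 255)
        background.append(out_row)
    return mask, background


# A tiny 4x4 "page": a dark 2x2 blob of text on a light background.
page = [
    [200, 200, 200, 200],
    [200,  10,  10, 200],
    [200,  10,  10, 200],
    [200, 200, 200, 200],
]
mask, background = split_layers(page)
print(mask)        # bitonal layer, same size as the page
print(background)  # background layer at half the resolution
```

In a scheme of this kind, the bitonal mask would go to a lossless bitonal coder such as CCITT G4 or JBIG2, and the smooth background to a lossy coder such as JPEG, so each layer gets the method that suits it.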
(Graphics compression techniques are of two types:
lossless and lossy. Lossless techniques throw away redundant bits
of information without affecting the quality of the image, whereas lossy
techniques reduce file size at the expense of image quality.)
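The difference can be seen in a few lines of Python. This sketch uses the standard `zlib` module as a stand-in lossless coder and a crude quantiser as a stand-in lossy step. Neither is one of the schemes named in this article, and the sample data and the `quantize` helper are invented for illustration.

```python
import zlib

# Invented sample data: a repetitive run of 8-bit pixel values.
data = bytes([10, 10, 10, 11, 200, 200, 201, 201] * 64)

# Lossless: the redundancy is squeezed out, and decompression
# returns exactly the bytes that went in.
packed = zlib.compress(data)
restored = zlib.decompress(packed)
print(len(data), "bytes ->", len(packed), "bytes, identical:", restored == data)


def quantize(samples, step=16):
    """Crude lossy step: snap every value down to a multiple of `step`.
    Fine detail (10 versus 11, 200 versus 201) is discarded for good."""
    return bytes((s // step) * step for s in samples)


lossy = quantize(data)
print("lossy copy identical to original:", lossy == data)
```

The lossless round trip gives back the original data exactly, while the quantised copy can never be turned back into it. That trade is what a lossy scheme makes in return for a smaller file.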
Based on end-user requirements, these schemes
can be mixed and matched without hindering
interoperability. According to Gur, since all these standards are open,
no proprietary format is imposed and thus
user confidence is boosted. While the encoder for bitonal areas
can encode losslessly, the background layer is compressed lossily.
The text layer is neither reduced in resolution nor subjected
to any lossy operation, resulting in a clear digital document which
retains the quality of the original scanned document at high compression
ratios. The most critical stage in creating a NIF/PDF
file is separating the foreground, background, colour
and other parts via advanced image processing techniques known as
segmentation.
Competitive scenario
There are three major players in the compression
market, namely CCITT G4, JPEG and LZW. While CCITT G4 is a black
& white (B&W) compression standard, the latter two are colour
compression standards. As per industry estimates, almost 90 percent
of the compression market is still dominated by the B&W standard
because of the higher costs and prohibitive file sizes involved in colour
compression.
While on one hand NIF will be facing competition
from the B&W document compression market, on the colour front
it will have JPEG and LZW compression schemes to contend with. The
90 percent B&W market is also a potential market which NIF can
target and try to move towards colour. What will work to NIF's
advantage is the increasing adoption of colour document imaging
technologies by the business world, with the choice of storing colour
documents at the same cost as the commonly used B&W standard (CCITT G4-compressed
TIFF). According to Gur, the size of a NIF-compressed office document
will be almost the same as that of a CCITT G4-compressed document.
On the colour front, JPEG has the disadvantage
of being a lossy compression scheme, which means that content or
information is lost during the compression process. The file size is smaller
but the quality suffers. As a result the JPEG scheme is not very
readily used in document compression and medical imaging. Apart
from this, the JPEG compression scheme is only supported by file
formats like .JPG, .JPEG and .JFIF. Similarly there are only three
file formats, namely .GIF, .TIF and .PDF, which use LZW
compression. On the other hand, the first release of NIF itself will
support file formats like .NIF (its own file format), .PDF, .TIFF,
.BMP, .PNG, .GIF, which means that all these formats can be opened
with the same viewer. This is done by converting the various formats
into .NIF or NIF compressed .PDF formats.
Vis-à-vis the LZW compression scheme, NIF
is an open standards-based scheme. This means it will be available
for all to read and implement and will create a fair, competitive
market for implementations of the standard, so that customers are not
locked into a particular vendor or group and
end-user choice is maximised. Being an open standard, NIF will be free for all
to implement, with no royalty or fee. Newgen will be making a restricted
version of NIF available as freeware on the Internet for individual
users. However, the SDK and advanced-level viewer will be priced.
For LZW, on the other hand, the joint patent owners CompuServe and
Unisys have an agreement under which
GIF developers who use CompuServe as a distributor are encouraged to pay a royalty
fee to Unisys. For each registered copy of a program that uses the
LZW compression technology, the developer pays 1.5% of the sale
price of the program to CompuServe, or $0.15, whichever is greater.
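The "whichever is greater" rule is simple arithmetic. The Python sketch below uses only the two figures quoted above (1.5% and $0.15). The function name `lzw_royalty` and the sample prices are made up for illustration.

```python
def lzw_royalty(sale_price_usd):
    """Royalty per registered copy, as quoted in the article:
    1.5% of the sale price or $0.15, whichever is greater."""
    return max(0.015 * sale_price_usd, 0.15)


# Hypothetical sale prices, purely to show where the flat fee applies.
for price in (5.00, 10.00, 50.00):
    print(f"${price:.2f} program -> ${lzw_royalty(price):.2f} royalty per copy")
```

The two terms are equal at a sale price of $10, so cheaper programs pay the flat $0.15 and more expensive ones pay the percentage.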
However, what could make matters a little difficult
for NIF is the fact that the LZW patent expires in June 2003,
making the scheme freely available. This will mean that people
will be able to use .GIF file formats, etc. without paying. But
this does not seem to deter Newgen. The company points out that
the price tag won't drive the competition. "Competition
will largely depend on who offers better compression schemes, both
in terms of compression size and quality," adds Gur. LZW cannot
offer more than 8-bit colour per pixel for .GIF and .PDF. This is
because 24-bit colour compression tends to increase the size of
the image. NIF, however, will be able to offer 24-bit colour at
an optimised size through the segmentation process.
It will also offer the 8-bit colour per pixel choice.
Business strategy
Breaking into the technology domain of CCITT G4,
JPEG and LZW will, however, not be easy. Despite its technological
strength, NIF will find it tough to establish itself among
already established technologies. The failure of a very similar
document compression offering from US-based company LizardTech
(which had acquired the DjVu colour document compression technology
from AT&T Labs) to garner the expected market share illustrates
the tough market scenario and the competition Newgen will have
to face.
It suggests that, more than the technology, the
company has to get its business strategy right. Learning a lesson
from LizardTech's case, Newgen's think tank has tried to build
more flexibility and risk capacity into its NIF strategy. NIF being
an open standard format, Newgen has decided to use it to leverage
Adobe's PDF market share as well, thereby widening the
market it can address. This means that not only will Newgen
be able to address the untapped market, but it will also cater to Adobe's
market. Gur explains that NIF technology complements Adobe's
technology by packaging multiple NIF layers into a PDF file, in
compliance with Adobe's specs for PDF creation. The resulting
NIF-compressed PDF can be opened in Acrobat Reader. This will
provide Newgen access to millions of desktops worldwide having Acrobat
Reader installed on them, which otherwise wasn't possible.
This ready-made market should make the company's
business strategy less risky.
According to an expert, what ails LizardTech is
its proprietary format, due to which it has not been able to take
on PDF's market share head on. Besides, the pricing of DjVu was
also prohibitively high ($20,000 for the SDK and about four cents per
document as a conversion fee). Learning from this, Newgen has kept its
options open. It means that the user has the choice of saving the
file (BMP, TIFF, etc.) either in Newgen's .NIF format or Adobe's
.PDF format (NIF compressed). Though the NIF-compressed PDF files
are slightly larger than the corresponding .NIF files, the conversion
from .NIF to .PDF and vice versa is a fast and lossless process.
Typically, a 25 MB uncompressed BMP file (A4, 300 DPI colour document),
when converted into a NIF-compressed PDF file, will occupy only
about 20 KB more than the corresponding .NIF file (the size of the .NIF
file being around 100 KB). This is where the company feels it scores
on strategy and will hit the competition
the most.
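These figures can be sanity-checked with a little arithmetic. The Python sketch below assumes an A4 page of 210 mm x 297 mm scanned at 300 DPI in 24-bit colour, and treats 1 KB as 1,024 bytes. The 100 KB and 20 KB figures are the ones quoted above, and the helper name `uncompressed_bytes` is hypothetical.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # A4 width and height in millimetres


def uncompressed_bytes(dpi=300, bits_per_pixel=24):
    """Raw pixel data for one A4 page, ignoring any file header."""
    width_px = round(A4_MM[0] / MM_PER_INCH * dpi)
    height_px = round(A4_MM[1] / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8


raw = uncompressed_bytes()   # raw colour scan
nif = 100 * 1024             # ".NIF file being around 100 KB"
pdf = nif + 20 * 1024        # "about 20 KB more" as a NIF-compressed PDF

print(f"raw scan: {raw / 2**20:.1f} MiB")
print(f"NIF ratio: about {raw / nif:.0f}:1")
print(f"NIF-in-PDF ratio: about {raw / pdf:.0f}:1")
```

Under these assumptions the raw scan comes to roughly 25 MB, and the .NIF file works out to a compression ratio in the region of 250:1. That is consistent with the "up to 300 times" claim made for the technology.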
The business strategy is also designed to generate
multiple revenue streams. The company's revenue model comprises
the following:
- Bundling the technology along with scanners
- Releasing viewers
- Releasing a development suite providing a
wide range of tools & APIs for integrating into any third
party application.
The company is also looking at integrating the
technology into its mainline solution OmniDocs, a document management
system, to begin with. The restricted freeware version on the Internet
will enable the company to get the possible users to experiment
with the technology, ultimately leading to adoption of the complete
version.
The software package for viewing, generation and
distribution of lightweight colour document images, which will be
available as NIFView, is an image viewer that supports
opening and saving of files like NIF, PDF, TIFF, BMP, PNG, GIF,
etc. The company's other offering, NIF
SDK, will allow integration of NIF technology into all applications.
It will include a collection of ActiveX controls, automation servers,
applets and platform-independent APIs for viewing, loading,
saving, extracting text layers, annotating, etc. External OCR engines
can be fed the binary text layer directly, thereby enabling faster
and more accurate results than with a thresholded or binary-scanned
image. (OCR engines normally work on binary images only.)
Bundling with scanners and MFDs will be another
important revenue generation model for the company. It is already
in talks with scanner and MFD vendors globally for bundling NIF
technology with their product lines. The company has received positive
responses from various potential partners. "All big companies,
whom we have met and demonstrated the software, are gung-ho about
it and want to bundle our technology with their scanners and MFDs,"
says Gur. According to him, if the vendors pay a royalty, the technology
will be theirs; otherwise it will remain Newgen's brand.
Application Areas
Some of the application areas that the company
will be targeting are advertising, publication, distribution and
scan-to-web applications and back file conversion of colour publications
and documents, workflow applications etc. It also caters to a whole
range of B2C and B2B applications, from financial record storage
and distribution to online publishing, online retailing, web publishing,
e-book publishing. The files in NIF format can be easily put onto
the website or embedded in HTML documents. Newgen will be targeting
segments like BPO, telecom and insurance sectors, which are likely
to be early beneficiaries of NIF document compression technology.
It will also be targeting libraries, SME and SOHO segments.
Final Word
According to an IDC estimate, by 2004 there will
be 19 million flatbed scanners. This is the size of one of the potential
markets for NIF technology. In addition, the increasing thrust
towards colour document imaging by enterprises will pave the way
for NIF technology. However, it's easier said than done. A
lot will depend on how well the company is able to market the technology
and establish the NIF brand strongly among CCITT G4, JPEG and LZW.
Still to come are the counter-strategies adopted by the competitors.