|
Storage Special: Case Study
The airline that’s never grounded: IA’s DR saga
59 aircraft, 66 offices countrywide, 17 offices
overseas, approximately 275 flights flying 20,000 people a day and
210 GB of mass storage. These are the components that go into the
making of domestic air carrier Indian Airlines. IA is probably one
of the few public sector organisations with a disaster recovery
system up-and-running. Rahul Neel Mani reports
 |
| Cost, time and business process continuity
are the major considerations in any DR implementation, says
A K Rastogi |
At Indian Airlines (IA), information technology
is not just supporting business, it’s spearheading it. The continuity
provided by IT is crucial to run the IA business effectively. For
A K Rastogi, director IT at Indian Airlines, planning for business
continuity is a priority focus area at present.
Considering the potentially devastating
impact of any disaster, Rastogi’s aim is to ensure IT infrastructure
at IA is robust enough to withstand any calamity. "IT infrastructure
has become harder to manage and control as it is distributed across
the enterprise," says Rastogi. IA’s disaster recovery plan involves
the integration of various hardware, software and network equipment.
"Various components are configured as per business process continuity
requirements, which depend on the criticality of the business function,"
elaborates Rastogi. He feels that the cost of a DR design would
increase exponentially if one were to lower the recovery time. Therefore,
cost, time and business process continuity are the major considerations
in any DR implementation.
Some business process operations cannot
afford any interruption, and it may be mandatory to have a hot standby
configuration of software, hardware and network equipment, to ensure
uninterrupted flow in the operation. Here, the recovery process
is transparent to the end-users. IA’s DR plan model is based on
this simple but smooth mode of operation.
Says Rastogi, "In case the business can
accept interruptions of shorter duration, alternate or fallback
systems may be configured to take care of DR. The database and library
are maintained on primary or secondary storage on the backup system,
depending on the time available for recovery and cost considerations."
The priority and the cost of the solution selected influence
the decision-making process.
IA’s IT set-up
As Rastogi puts it, Indian Airlines is perhaps
the first commercial organisation in India to have implemented a
DR set-up on a mainframe system, for mission-critical applications.
There are two IBM ES 9000, 9672 R21 mainframe systems with dual
CPUs, placed at two different locations—about 500 metres apart in
the same campus. These are called System A and System B respectively.
Both the locations/systems are connected to all IA offices in India
and abroad, through dedicated communication lines. The two sites
are interconnected through high-speed ESCON (IBM Enterprise Systems
Connection) channel-to-channel fibre optic links. "The system processes
around 2.16 million passenger-related transactions daily. The database
also has a synchronous mirror image at both the sites, ensuring
100 percent database recovery in case of any unforeseen incident,"
says Rastogi. Also, for other on-line business transactions, important
files are captured at both the sites. Both system and network configurations
are designed to ensure complete recovery of core business applications
in the event of a disaster. In case of a disaster, production system
applications can be run on the backup system at Site B. Data synchronisation
is transparently supported by the operating system and transaction
processing middleware.
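To put the daily transaction figure in perspective, a back-of-the-envelope calculation gives the average load on the system. The sketch below assumes the 2.16 million transactions are spread evenly over 24 hours, which the article does not claim; real peak loads would be higher. The function name is illustrative only.

```python
# Figure quoted by Rastogi: around 2.16 million transactions a day.
TRANSACTIONS_PER_DAY = 2_160_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_tps(transactions_per_day: int) -> float:
    """Average transactions per second, assuming an even load
    across the whole day (an assumption, not a measured profile)."""
    return transactions_per_day / SECONDS_PER_DAY


print(average_tps(TRANSACTIONS_PER_DAY))  # 25.0
```

An average of about 25 transactions a second gives a sense of why a synchronous mirror at both sites matters: even a short gap in replication would leave many transactions unprotected.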
At present, System A is being used fully
for passenger services applications, while System B is used for
various batch/online applications. It also acts as a backup to
System A, with all program libraries loaded in it. "Any update to
the program source and libraries is also simultaneously carried
out on both the systems," says Rastogi.
Both the systems are being run round-the-clock
in a controlled environment with a redundant backup of UPSs and
diesel generator sets. The systems provide batch, online and real-time
processing environments. Rastogi says that the resources on both
these systems have been configured for optimum utilisation, keeping
in view the need for backup of mission-critical applications.
"To optimise costs, IA has taken a single
license copy of the mission-critical applications and associated
system software with single operation at any of the sites," explains
Rastogi.
The network
Indian Airlines, like any other IT-savvy
company, has a top-of-the-line nationwide computer network. Systems
A and B both give airline staff access to various applications
over this network. The network is primarily connected by leased
lines. "A mix of dumb terminals and PC workstations are used to
access applications. The system has access worldwide (IC international
stations and CRS systems) through SITA network on a TCP/IP-based
network," says Rastogi.
There are two IBM 3745 front-end communication
processors connected to the hosts to form a redundant network configuration.
The two 3745 communication controllers across the sites are connected
through a fibre optic link. All the devices in the network access
the host via the 3745 processor. "A router-based mesh network has
been implemented, using a mix of 64 Kbps digital links and 9.6 Kbps
analogue links, with a backup on ISDN/PSTN," he says.
Application-wise mainframe storage needs of
IA
Applications and data at IA are stored on
9394 (RAMAC II) storage devices. "9394 is a rack-mounted disk array
subsystem. The basic components are storage rack, the 9395 (DASD)
and cluster cards," Rastogi says. The storage device supports RAID
5 data protection. There are four DASDs per drawer, each drawer having
a formatted capacity of 11.35 GB. There are 10 drawers (40 DASDs)
in System A and 12 drawers (48 DASDs) in System B.
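The drawer counts translate directly into installed capacity per system. The following is a minimal sketch of that arithmetic using only the figures quoted above. It assumes the 11.35 GB formatted capacity per drawer is the usable figure; the article does not say whether RAID 5 overhead has already been deducted. The names used are illustrative.

```python
DASD_PER_DRAWER = 4
DRAWER_CAPACITY_GB = 11.35  # formatted capacity per drawer, as quoted

# Drawer counts per system, as quoted in the article.
DRAWERS = {"System A": 10, "System B": 12}


def system_capacity(drawers: int) -> dict:
    """Return the DASD count and formatted capacity for one system."""
    return {
        "dasd_count": drawers * DASD_PER_DRAWER,
        "capacity_gb": round(drawers * DRAWER_CAPACITY_GB, 2),
    }


capacities = {name: system_capacity(n) for name, n in DRAWERS.items()}
total_gb = round(sum(n * DRAWER_CAPACITY_GB for n in DRAWERS.values()), 2)

for name, info in capacities.items():
    print(name, info)
print("Total formatted capacity (GB):", total_gb)
```

On these figures, System A holds 113.5 GB and System B 136.2 GB, or 249.7 GB across the two sites.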
DR solution at IA
The resources available on the two systems
have been configured to process mission-critical applications. The
passenger services-based applications are processed on System A
and the other non-critical applications are processed on System
B. In the event of any disaster, System B resources are reconfigured
to run mission-critical applications and process non-critical applications
in degraded mode at off-peak times. The total blackout period on
account of the DR implementation is around 40-50 minutes and there
is no loss of data. "The DR system has been activated twice since
the system was implemented, due to unavoidable contingency caused
by weather," says Rastogi.
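The reconfiguration rule described above can be expressed as a simple scheduling decision. The sketch below is purely illustrative and is not IA's actual procedure: the `Workload` class, the function name and the mode labels are hypothetical, and only the rule itself (critical work takes over System B, non-critical work runs degraded at off-peak times) comes from the article.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    critical: bool  # True for mission-critical applications


def plan_after_disaster(workloads):
    """Return (name, mode) pairs for System B once System A is lost.

    Mission-critical work gets full service first; everything else
    is pushed to degraded, off-peak processing.
    """
    critical = [(w.name, "full service") for w in workloads if w.critical]
    deferred = [
        (w.name, "degraded, off-peak only") for w in workloads if not w.critical
    ]
    return critical + deferred


workloads = [
    Workload("passenger services", critical=True),
    Workload("batch/online applications", critical=False),
]
plan = plan_after_disaster(workloads)
for name, mode in plan:
    print(f"{name}: {mode}")
```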
- Business continuity with minimum interruptions.
- Complete database integrity of all applications.
- Customer confidence.
- Recoverable disasters are managed cost effectively without
any extra expenditure on IT infrastructure, except some
additional provisioning for mass storage.
- All critical and non-critical applications are processed
with resource planning for disaster duration.
- Transfer of data from mission critical to non-critical
applications continues for smooth processing.
|
- System level
  - Daily incremental back-up at remote backup site
  - Daily synchronisation of libraries, catalogues and configuration
    files on two systems
  - Weekly DASD-wise physical backup of data and applications
    at remote backup site
- Application level
  - Online disk mirroring/replication of data on second leg
    of database at DASDs
  - Transaction log backup
|
IA’s IT backend
| Application | Database | Disk Space (GB) |
| Passenger services system | IBM TPFDF | 11.1 library, 25 sequential, 83 database |
| Test and development system | IBM TPFDF | 27.7 |
| Management information system | DB2, VSAM | 9.2 |
| Frequent flyer programme | DB2, VSAM | 3.7 |
| Aircraft spares inventory information system | DB2, VSAM | 5.6 |
| Personnel information system | DB2, VSAM | 2 |
| Financial applications | VSAM | 21.9 |
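The per-application figures in the table can be totalled for comparison with the roughly 210 GB of mass storage mentioned at the top of the article. This is a minimal sketch using only the table values. It assumes the three passenger services entries (library, sequential, database) are separate allocations that add up, which the table implies but does not state.

```python
# Disk space per application in GB, taken from the table.
DISK_SPACE_GB = {
    "Passenger services system": [11.1, 25, 83],  # library, sequential, database
    "Test and development system": [27.7],
    "Management information system": [9.2],
    "Frequent flyer programme": [3.7],
    "Aircraft spares inventory information system": [5.6],
    "Personnel information system": [2],
    "Financial applications": [21.9],
}

per_application = {app: round(sum(parts), 1) for app, parts in DISK_SPACE_GB.items()}
total_gb = round(sum(per_application.values()), 1)

for app, gb in per_application.items():
    print(f"{app}: {gb} GB")
print(f"Total: {total_gb} GB")
```

The listed applications come to 189.2 GB, of which passenger services alone accounts for 119.1 GB. The article does not explain the difference between this total and the 210 GB quoted in the introduction.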