Solving storage problems virtually
Virtualisation will help in efficiently utilising storage resources and effectively
managing heterogeneous storage devices, says P P Subramanian
Data storage has taken centre stage in today's business environment, as
information has become the most important asset of organisations. Access to
the right information at the right time is essential to remaining competitive.
Consequently, data storage has evolved significantly over the years, from DAS
(Direct Attached Storage) to networked storage in the form of the SAN (Storage
Area Network). Organisations have benefited greatly from SAN, as it allows them
to utilise capacity better and lets users view disparate storage systems as a
single storage system. However, even though SAN systems have increased capacity
utilisation by simplifying connectivity, they have also introduced another
layer of management for switches, host bus adapters and Fibre Channel-enabled
storage ports. Besides, the lack of SAN standards and differences in operating
platforms and storage devices have created substantial problems of
manageability and interoperability. It is also widely held in the industry that
storage management costs run six to eight times higher than the acquisition
cost of the storage itself. In such a scenario, the industry is very excited
about the promise that virtualisation technology holds.
The use of virtualisation technology promises to reduce
total cost of ownership (TCO) and increase utilisation of existing storage systems.
Virtualisation technology leverages the connectivity provided by SAN, by creating
a layer of abstraction between the SAN and the servers. This abstraction enables
the servers in the SAN to view the physical storage as a common pool of capacity.
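The idea of a common pool can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: the class name StoragePool, the array
names and the capacities are all hypothetical and do not describe any vendor's
product. It shows a server asking the pool for a volume without knowing which
physical array supplies the space.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """Presents several physical arrays as one pool of capacity (illustrative only)."""
    free_gb: dict                                  # array name -> unallocated GB
    volumes: dict = field(default_factory=dict)    # volume -> [(array, GB), ...]

    def total_free(self):
        # The only figure a server needs to see: free capacity across all arrays.
        return sum(self.free_gb.values())

    def allocate(self, volume, size_gb):
        """Carve a volume out of whichever arrays still have space."""
        if size_gb > self.total_free():
            raise ValueError("pool exhausted")
        extents, remaining = [], size_gb
        for array, free in self.free_gb.items():
            take = min(free, remaining)
            if take:
                extents.append((array, take))
                self.free_gb[array] -= take
                remaining -= take
            if remaining == 0:
                break
        self.volumes[volume] = extents


# Hypothetical arrays and sizes.
pool = StoragePool(free_gb={"array_1": 500, "array_2": 300})
pool.allocate("vol_for_server_a", 600)   # spans both arrays, invisibly to the server
print(pool.volumes["vol_for_server_a"], pool.total_free())
```

A 600 GB request is larger than either array's free space on its own, yet it
succeeds because the server deals only with the pool.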
In traditional storage systems, storage was directly attached to a server and
its excess capacity often went unused, as it could not be shared between
servers. To ensure that enough space was always available, each standalone
server was provided with excess capacity. This resulted in huge cost overheads,
as storage utilisation was only 30 to 50 percent. The introduction of SAN
significantly increased capacity utilisation by connecting multiple storage
devices to multiple servers over a Fibre Channel network. In such an
environment, a pool of storage was created and logical units of storage could
be reallocated to servers through SAN management tools without the need to
physically re-cable the storage devices. However, a SAN environment relies
heavily on networking hardware and software, such as hubs, switches and host
bus adapters, to build the Fibre Channel network that connects numerous
servers. This adds another layer of management for the interconnecting Fibre
Channel hardware and software, and creates issues of interoperability and
manageability in heterogeneous storage environments. This is where
virtualisation promises to 'melt down' the various SAN devices into a common
pool of capacity and to simplify their management by masking the complexities
that arise from using heterogeneous storage devices.
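A rough calculation shows why pooling raises utilisation. The figures below are
hypothetical, chosen only so that the direct-attached case falls inside the 30
to 50 percent range mentioned above. The 25 percent headroom for the shared
pool is likewise an assumption.

```python
# Hypothetical figures: data actually stored on each of four servers, in GB.
used_gb = [300, 450, 350, 500]

# DAS: every server gets its own 1,000 GB of disk so that it never runs out.
das_capacity_gb = 1000 * len(used_gb)
das_utilisation = sum(used_gb) / das_capacity_gb

# SAN: one shared pool sized for the combined data plus 25 percent headroom.
san_capacity_gb = sum(used_gb) * 1.25
san_utilisation = sum(used_gb) / san_capacity_gb

print(f"DAS: {das_utilisation:.0%} of {das_capacity_gb} GB in use")
print(f"SAN: {san_utilisation:.0%} of {san_capacity_gb:.0f} GB in use")
```

With these numbers the same 1,600 GB of data occupies 40 percent of 4,000 GB
when direct-attached, and 80 percent of 2,000 GB when pooled.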
The different types of storage environment each take their own approach to
storage, but the essential components remain the same. A storage environment
consists of the application (Microsoft Word or other such applications),
logical (databases), virtual (volumes) and physical (storage disks, tapes, etc)
components. NAS provided an abstraction layer between the application and the
logical data. In the same manner, SAN enabled abstraction between the virtual
resources and the physical storage, while the logical data and the virtual
resources remained connected. Virtualisation takes abstraction a step further
and separates the logical data from the virtual resources. This model
simplifies storage management to a great extent. In addition, virtualisation
promises 'virtual' utilisation of storage capacity beyond 100 percent by
charging users for a 'virtual' amount of storage but physically allocating
only the portion that is required, as it is required.
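Utilisation beyond 100 percent can be illustrated with a toy model in Python.
The class ThinPool, the volume names and the sizes are hypothetical; this is a
sketch of the allocate-on-demand idea and not a description of any product.

```python
class ThinPool:
    """Toy model: volumes promise more capacity than is physically present."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.promised = {}    # volume -> virtual size shown (and charged) to the user
        self.allocated = {}   # volume -> physical GB actually backing it

    def create(self, volume, virtual_gb):
        # Creating a volume consumes no physical space yet.
        self.promised[volume] = virtual_gb
        self.allocated[volume] = 0

    def write(self, volume, gb):
        # Physical space is allocated only when data is written.
        if self.allocated[volume] + gb > self.promised[volume]:
            raise ValueError("write exceeds the volume's virtual size")
        if sum(self.allocated.values()) + gb > self.physical_gb:
            raise RuntimeError("pool is out of physical capacity")
        self.allocated[volume] += gb

    def subscription(self):
        # Above 1.0 means more capacity has been promised than exists.
        return sum(self.promised.values()) / self.physical_gb


pool = ThinPool(physical_gb=1000)
for name in ("finance", "mail", "web"):
    pool.create(name, virtual_gb=500)
pool.write("mail", 120)
print(pool.subscription(), sum(pool.allocated.values()))
```

Here 1,500 GB has been promised against 1,000 GB of disk, a subscription of
150 percent, while only 120 GB is physically in use. The trade-off is that the
pool must be watched and grown before writes exhaust the physical capacity.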
However, when it comes to implementing virtualisation, different vendors have
different opinions about where virtualisation capabilities should reside. Some
say that they should reside at the server level, some prefer the fabric level
(such as in the SAN fabric switches or appliances), and some insist that
virtualisation live at the storage-system level, built into storage arrays and
devices. Each of these approaches has its own set of disadvantages and
limitations, including interoperability, management and performance issues. In
practice, virtualisation must be a coordinated effort, shared among the server,
SAN and storage.
Ideally, virtualisation must be addressed at the access and control level, so
that storage addresses can be remapped and redirected to create a virtual pool
of capacity, and so that the pool can be managed by discovering, provisioning
and maintaining the data path between the application and the storage. However,
in today's scenario capacity utilisation is not a major concern for IT
managers, as costs have come down drastically; management of storage has become
the major concern. IT managers today use capacity as a tool for managing
storage, instead of treating capacity as the end goal of management. For
example, the transaction performance of a revenue-producing application should
not be impacted by conflicting capacity demands of a messaging application.
At the same time, the productivity of employees should not be stifled by
limitations on storage capacity in a messaging application. Therefore, the
productive use of storage virtualisation should focus on managing the
explosive growth of data, rather than on utilising every last megabyte of
storage capacity.
There are many point solutions in the market for implementing virtualisation
across a heterogeneous storage environment. However, they are not complete,
because simply mapping LUN (logical unit number) capacity across heterogeneous
devices without taking into account the differences in performance of the
various devices can lead to higher costs and lower revenue.
Consider Figure-1, for instance. It depicts two server clusters running on two
different operating system platforms. Servers A and A' form a cluster on one
type of operating system, with alternate paths to virtual volume A. Volume A is
virtualised across storage array E and storage array C, sharing portions of
disks W, X, Y and Z. Servers B and B' are in a different cluster with another
operating system that uses alternate paths to virtual volume B. Volume B is
also virtualised across storage arrays E and C and shares portions of disks W,
X, Y and Z.
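A volume spread across two arrays in this way amounts to a lookup table from
virtual addresses to physical ones. The Python sketch below is an assumption
throughout: the article does not say which disk sits in which array, how large
each portion is or where it starts, so the extent table is invented purely to
show the remapping step.

```python
# Each extent: (first virtual block, block count, array, disk, first physical block).
# Which disk sits in which array, and every number here, is an assumption.
VOLUME_A = [
    (0,    1000, "E", "W", 5000),
    (1000, 1000, "C", "Y", 0),
    (2000, 1000, "E", "X", 400),
    (3000, 1000, "C", "Z", 7000),
]


def resolve(extents, virtual_block):
    """Translate a virtual block address into (array, disk, physical block)."""
    for start, count, array, disk, phys_start in extents:
        if start <= virtual_block < start + count:
            return array, disk, phys_start + (virtual_block - start)
    raise ValueError(f"block {virtual_block} is outside the volume")


print(resolve(VOLUME_A, 2500))   # ('E', 'X', 900)
```

The server addresses block 2500 of volume A and never learns that the data
lives on a particular disk in a particular array.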
The first problem with such an architecture is that it needs many more
dedicated ports than would have been needed if the clusters were
direct-attached. This is because the storage ports need to understand the
command language of the particular platform they are connected to. In addition,
the servers must have a path to each physical part of a virtual volume. Another
problem with a heterogeneous storage system is that some of the platforms might
be faster than others, which would make virtualisation performance quite
unpredictable. Apart from this, different systems might have different scaling
characteristics as the load increases. These aspects also make recovery
unpredictable and erratic, normally resetting the pending input/output to a
physical device.
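One simplified way to see the performance problem: if a single request touches
portions of a volume on two arrays of different speed, it completes only when
the slower array has answered. The service times below are hypothetical, and
the model ignores queuing, caching and load.

```python
# Hypothetical per-request service times, in milliseconds, for two arrays.
latency_ms = {"array_E": 4.0, "array_C": 11.0}


def striped_read_ms(arrays):
    # The pieces are fetched in parallel, so the read finishes with the slowest one.
    return max(latency_ms[a] for a in arrays)


print(striped_read_ms(["array_E"]))              # 4.0
print(striped_read_ms(["array_E", "array_C"]))   # 11.0
```

The same volume therefore answers in 4 ms or 11 ms depending on which blocks a
request happens to touch, which is what makes its behaviour hard to predict.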
However, when we consider a virtualisation architecture wherein the servers are
direct-attached to the storage, the effect is dramatically different (see
Fig-2).
Fig-2: SAN Virtualisation with Virtual Storage
Ports and Host Storage Domain
This kind of architecture introduces two elements: Virtual Storage Ports and
Host Storage Domains. Virtual Storage Ports support the connection of
heterogeneous platforms to each storage port. A specific mode set per port is
therefore no longer required, and a fabric switch in a SAN can direct servers
on multiple heterogeneous platforms to a single port on the storage system.
Using Host Storage Domains, a mode set can be specified at the storage domain
level, so that each domain can converse with its corresponding server platform.
This feature can also provide separate, secure storage pools for separate host
groups.
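The relationship between a port, its Host Storage Domains and their mode sets
can be sketched as a small data structure. Everything named here is
hypothetical (the port name, the domain names, the mode set labels, the host
identifiers and the LUN numbers); the sketch only illustrates the lookup that
lets one physical port serve two platforms while keeping their storage apart.

```python
from dataclasses import dataclass


@dataclass
class HostStorageDomain:
    name: str
    mode_set: str   # how the port should talk to this platform
    hosts: set      # identifiers of the servers in this host group
    luns: set       # logical units only this group may see


# One physical storage port shared by two platforms (all values are made up).
PORT_1A = [
    HostStorageDomain("unix_cluster", "mode_unix", {"server_a", "server_a2"}, {0, 1}),
    HostStorageDomain("windows_cluster", "mode_windows", {"server_b", "server_b2"}, {2}),
]


def domain_for(port, host):
    """Find the domain, and therefore the mode set and LUNs, for a connecting host."""
    for domain in port:
        if host in domain.hosts:
            return domain
    raise PermissionError(f"{host} is not in any host storage domain on this port")


d = domain_for(PORT_1A, "server_b")
print(d.mode_set, sorted(d.luns))   # mode_windows [2]
```

A host that belongs to no domain is refused outright, which is how the same
mechanism yields separate, secure pools for separate host groups.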
Even though such features enable the access side of virtualisation and remove
many of its issues, additional intelligence should be added to the management
or control side of virtualisation. This is the real challenge of
virtualisation.
One way to tackle this challenge is to provide a storage management framework
based on open standards such as CIM (Common Information Model) and SOAP (Simple
Object Access Protocol). Such an open approach gives enterprise storage buyers
the choice of going in for best-of-breed solutions and lends itself to
collaboration among multiple storage vendors.
While this kind of approach to virtualisation will simplify the management of
storage, it will not eliminate the need for continued research into developing
scalable, reliable and high-performance storage systems.
The author is the country manager of Hitachi Data Systems
India. He can be reached at subbu.subramanian@hds.com