Automating storage tiering
Data sizes have gone up by an order of magnitude over the
past five years and the rate of growth is accelerating. Automated storage tiering
can help optimize cost and performance by automatically moving data from expensive
FC/SSD to SATA as per pre-set policies.
By Manjari Juneja
Data centers face the challenge of provisioning colossal amounts of storage while
keeping things affordable and meeting ever-increasing performance demands.
With power consumption growing, space to house storage equipment running short
and environmental concerns mounting, the concept of Automated Storage Tiering (AST) is
starting to catch on. Here, the capacity to be provisioned is divided into separate
pools of storage space with various cost/performance attributes. At the top
you will find the Tier 1 pool, which is the most expensive and the highest performing
of the lot. The bottom tier is occupied by SATA-based storage arrays.
Storage tiering refers to establishing a hierarchy of storage components. After
all, there's no point in storing rarely accessed data on your fastest and
most expensive drives. In short, it is a solution to address data growth and
cost management.
Although the concept seems quite simple and it is easy enough
to create tiers, the challenge lies in marking data and moving it between tiers.
The data is usually moved up and down between tiers based on access history,
age of data, criticality of data etc. This is where things start to get complex
as tiering cannot be done manually but needs an automated mechanism if IT administrators
are not to be overwhelmed.
High-end disk systems contain tens to thousands of terabytes
of data, often stored on thousands of volumes. Manually trying to determine
which data should reside on what drive tier is not a trivial task. Shifting
patterns of how data is accessed over time mean that data appropriately moved
to one drive tier today might best be moved to a different tier tomorrow. Which
data is "hot" (as in frequently accessed) and which is "cold"
(as in rarely accessed) can change faster than people can react. Further, the
ultra high-performance of solid-state drives (SSDs) comes at substantially higher
cost than that of mechanical disk drives. [FC drives are also quite costly and
when techniques such as short stroking are employed to improve performance,
they end up costing almost as much as SSDs from a capacity utilization standpoint - Editor.]
It is vital that only the hottest data (that doesn't reside
in the system's limited-capacity electronic cache) should reside on SSDs
or short-stroked FC disks.
Aman Munglani, Principal Research Analyst, Gartner, said, "AST is a nascent
technology. As of now only two vendors provide this technology; development
is happening and other players are also trying to get into it. Its penetration
will be low in the beginning but it will move up and over the next one to one-and-a-half
years it will see tremendous growth. The major benefits here include a
reduction in indirect costs including manpower; human intervention is minimal
here."
AST is a solution to the problems of getting the best performance and capacity
profile from tiered storage. Although businesses can benefit from AST, the technology
involves extensive and continuous movement of data. This translates to additional
IO overhead affecting the performance of primary applications. However, since
most legacy storage systems are still in the process of developing in-band storage
virtualization techniques, such technologies represent a good mid-term approach.
Most disk systems place entire volumes on one tier and cannot place different
portions of a volume on different tiers based on how frequently those portions
are accessed. This lack of granular tier management results in high tier-to-tier
data movement overhead and wasted investment.
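To see why granularity matters, consider a minimal sketch that compares how much fast-tier capacity one volume consumes under whole-volume placement versus sub-volume placement. The extent size, the "hot" cutoff and the sample access counts are all assumptions made up for illustration; they do not come from any vendor's implementation.

```python
from dataclasses import dataclass
from typing import List

EXTENT_GB = 1     # assumed extent size, purely illustrative
HOT_READS = 100   # assumed reads-per-window cutoff for "hot"


@dataclass
class Volume:
    name: str
    extent_reads: List[int]  # access count per extent in the last window


def fast_tier_gb_whole_volume(vol: Volume) -> int:
    """Whole-volume placement: one hot extent drags every extent up."""
    if any(reads >= HOT_READS for reads in vol.extent_reads):
        return len(vol.extent_reads) * EXTENT_GB
    return 0


def fast_tier_gb_sub_volume(vol: Volume) -> int:
    """Sub-volume placement: only the hot extents are promoted."""
    return sum(EXTENT_GB for reads in vol.extent_reads if reads >= HOT_READS)


# Hypothetical volume: two busy extents, eight nearly idle ones.
db = Volume("db01", [500, 450, 3, 0, 1, 2, 0, 0, 4, 1])
print(fast_tier_gb_whole_volume(db))  # 10
print(fast_tier_gb_sub_volume(db))    # 2
```

In this made-up case, whole-volume placement puts 10 GB on the expensive tier to serve 2 GB of genuinely hot data, which is the wasted investment and extra data movement the paragraph above describes.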
Sanjay Lulla, Director - Technology Solutions, India & SAARC,
EMC Corporation, said, "We have recognized that while the scale
and scope of storage is increasing, administrators are compelled to reduce their
labor requirements and still somehow maintain complete control over complex,
rapidly growing infrastructure. Dynamic workloads make it difficult to provide
predictable and consistent performance levels. What administrators need is an
underlying infrastructure that provides full visibility, is self-managing and
is able to quickly adapt to change in tiered environments, viz. AST. It meets
these challenges by simplifying the management of the storage infrastructure
and helping organizations realize the full benefits of tiering. It leverages
technology designed to automate the dynamic allocation and relocation of data
across different storage types based on the changing performance requirements
of applications. For organizations to take full advantage of multi-tier environments,
we created FAST (Fully Automated Storage Tiering), which automates the movement
of data within a storage system as a replacement to the significant amount of
manual storage administrative tasks. FAST represents the next evolution in storage
tiering as the new kind of storage intelligence allows users to move content
between various tiers of storage in a non-disruptive fashion ensuring application
availability at all times."
Anand Naik, Director, Systems Engineering, Symantec, commented, "Ensuring
the availability of information and managing it is a vital activity for businesses
today. Therefore, it is imperative to manage information properly in order to
reduce costs and enable companies to stop provisioning fresh storage [even when
utilization is on the lower side]. Storage is definitely at an inflection point
and companies are slowly moving toward shared Ethernet and AST from FC-centric
storage. This allows data and application managers to store data themselves.
Tiering storage ranks among IT's top five initiatives over the next two years,
along with storage virtualization and data reduction technologies. Dynamic Storage
Tiering (DST) is a multi-tier storage value proposition that enables the allocation
of file storage space from different storage tiers according to predetermined
rules or policies."
How it works
Data has to be placed on various tiers based on policies. The policies could
be defined based on performance, business continuity, security, compliance,
etc. Based on the above policies, the data located on enterprise storage systems
would be mapped as hot and cold data. The data would be then moved onto the
various tiers. There are a couple of ways of moving data:
- Online movement: The data is constantly moved in
and out of various tiers based on user interaction. At any point of time,
data would be tiered. However, this could have some performance impact as
it runs in parallel to when the user data is being accessed or written onto
the system.
- Planned movement: One could move data at
a pre-determined frequency between tiers.
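The classify-then-move flow described above can be sketched as a single planned pass over a catalogue of data items. This is only an illustration: the `Item` fields, the tier names and the thresholds (50 reads a week for hot, 30 idle days for cold) are hypothetical and are not taken from any product mentioned in this article.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Item:
    name: str
    tier: str               # "fast" (SSD/FC) or "sata"
    reads_last_week: int
    days_since_access: int


def classify(item: Item, hot_reads: int = 50, cold_days: int = 30) -> str:
    """Label an item hot, cold or warm using assumed policy thresholds."""
    if item.reads_last_week >= hot_reads:
        return "hot"
    if item.days_since_access >= cold_days:
        return "cold"
    return "warm"


def planned_pass(items: List[Item]) -> List[Tuple[str, str, str]]:
    """One scheduled pass: promote hot data, demote cold data, leave warm data alone."""
    moves = []
    for item in items:
        target = {"hot": "fast", "cold": "sata"}.get(classify(item), item.tier)
        if target != item.tier:
            moves.append((item.name, item.tier, target))
            item.tier = target
    return moves


catalogue = [
    Item("orders.db", "sata", reads_last_week=900, days_since_access=0),
    Item("2008-archive", "fast", reads_last_week=0, days_since_access=400),
    Item("wiki", "sata", reads_last_week=10, days_since_access=2),
]
for name, src, dst in planned_pass(catalogue):
    print(f"{name}: {src} -> {dst}")
```

Running `planned_pass` from a nightly scheduler would correspond to planned movement; calling the same classification on every access would approximate online movement, at the cost of competing with user I/O.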
Tiering solves the biggest challenge, that of cost management, and it addresses
the issue of data growth as well. When data grows, one has to ensure
that the performance of the system is still good. Because tiering puts
data in the right tier, it results in better system performance.
Moreover, you could use inexpensive SATA disks in the lower tiers and add costly
SSDs/FC disks to the higher tiers ensuring that cost is controlled and that
old hardware continues to eke out a useful existence.
Narayanan B., Project Manager - Storage, American Megatrends India Pvt. Ltd.,
said, "In AMI StorTrends software, automated tiering of storage devices
is done based on the type of storage, RPM, size of drives, RAID levels, number
of drives and background activity. The challenge in storage tiering is to place
the appropriate data in the right tier. This is done by StorTrends StorSmart
that intelligently places data on different tiers according to its value. This
process, known as data classification or Information Lifecycle Management (ILM),
is central to assuring that valuable data is placed on higher tiers and relatively
less valuable data is relegated to lower tiers. In order to get the most optimal
cost performance architecture, the identification of the value of each block
of data becomes crucial. In an AMI StorTrends storage server, with its extensive
Storage Resource Management (SRM) modules, the storage is not merely seen as
a container of data. Rather an important dimension of intelligence is appended
to every block, transitioning blocks of data to information."
You can customize data migration based on the customer environment. Based on
pre-set policies, the management tool will move data from one type of storage
to another. The key benefits are optimization of overall storage cost along
with a performance boost as frequently accessed data is placed on the fastest
disks. In the absence of tiered storage, performance keeps degrading as data
volumes grow. Also the backup window becomes unmanageable.
Pallab Talukdar, CEO, Fujitsu India, said, "Two things are important for
implementing AST. One is the classification of data and the second is the movement
of data. It is necessary to classify data based on its criticality and frequency
of access. Classification is the most important aspect of AST as it defines
and maps the storage media required to store data and has a direct implication
on managing the cost of ownership of data. Once the data is classified and mapped
to the appropriate storage media, an automation system (data mover software)
is required to be implemented as per the policy defined based on classification.
This classification of data requires manual intervention as only business owners
can decide on the criticality and classify data accordingly. However, software
tools are available to assist in the classification of the same."
A big issue that the industry has constantly faced is keeping up with ever-growing
storage requirements, with data growing at over 60% per annum alongside growth in
the number of users and resources. This has usually led to IT administrators
and database management teams spending a significant number of hours managing
performance levels and evaluating storage infrastructure.
AST is a multi-tier storage value proposition that enables
the allocation of file storage space from different storage tiers according
to predetermined rules or policies. It provides a language for describing files,
tiers and circumstances. It also allows you to specify target storage volumes
for file creation (where to allocate blocks for new files) and for file relocation
(where to allocate replacement blocks when migrating a file from one storage
tier to another). By moving files dynamically to different storage tiers in
response to changing business circumstances, AST, also called DST, solves the
problem of tiering.
- Lowers OPEX by automating data movement
between storage types/tiers
- Reduces infrastructure costs by optimizing
the use of existing devices through the alignment of data with appropriate
storage resources
- Improves application performance by moving
less frequently accessed data off high-performance storage
- Cost-effectively and efficiently integrates
large elements of storage and data (such as that derived from a business
merger or acquisition)
- Provides a foundation for an Information
Lifecycle Management (ILM) strategy
Source: Symantec
Bringing down the cost
The biggest problem that storage administrators face today is growth in capacity
requirements. This does not have to mean sacrificing application performance
requirements. While SAS disks and SSDs lend themselves to higher performance
requirements, their intrinsic lower capacities mean that capacity demands cannot
be met with just these media; not while keeping acquisition and maintenance
costs low. Nor can administrators choose high capacity drives (such as 2 TB
SATA drives) without sacrificing on performance. AST comes to the rescue by
allowing administrators to buy limited amounts of fast and expensive media and
mix them with high capacity inexpensive media, thereby bringing down the overall
cost of the storage system. The lower power requirement of such a system also
means that its running cost is significantly lower.
Based on the frequency of usage of data, it is stored on primary, secondary
or tertiary storage thereby reducing the overall cost of storage. The cost per
TB reduces from primary to secondary to tertiary, as does performance. Data
that is accessed infrequently is moved to SATA disks and IT managers can reduce
the amount of expensive SSD/FC/SAS storage in use. As a rule of thumb, 20% of data is accessed 80%
of the time and 80% of the data is accessed only 20% of the time. Therefore,
by properly classifying data and then mapping it to the appropriate storage
media, companies can save considerably.
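A rough worked example shows the scale of the saving. The prices below are placeholder cost units per TB chosen only to make the arithmetic easy to follow (fast media assumed to cost ten times as much as SATA); they are not market figures, and the 100 TB capacity is equally hypothetical.

```python
def blended_cost(total_tb: float, hot_fraction: float,
                 fast_cost_per_tb: float, sata_cost_per_tb: float) -> float:
    """Acquisition cost when hot_fraction of capacity sits on fast media."""
    fast_tb = total_tb * hot_fraction
    sata_tb = total_tb - fast_tb
    return fast_tb * fast_cost_per_tb + sata_tb * sata_cost_per_tb


# Assumed figures: 100 TB total, fast media at 10 units/TB, SATA at 1 unit/TB.
all_fast = blended_cost(100, 1.0, 10, 1)  # everything on SSD/FC
tiered = blended_cost(100, 0.2, 10, 1)    # only the hot 20% on SSD/FC
saving = 1 - tiered / all_fast

print(all_fast, tiered)                   # 1000.0 280.0
print(f"saving: {saving:.0%}")            # saving: 72%
```

Under these assumptions the tiered system costs 20 x 10 + 80 x 1 = 280 units against 1,000 units for an all-fast system, while still keeping the most frequently accessed 20% of data on the fastest media.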
SSD or AST
Solid state disks are high performance drives that use solid
state memory to store persistent data. They emulate a hard drive's interface
and can replace hard drives in most applications. However, the cost of SSDs is an order of
magnitude higher than that of any other storage media, limiting their use to mission-critical
applications. AST boosts application performance across the board. While a separate
SSD tier might address a distinct performance problem, AST can solve this and
other problems to reduce cost while maintaining top notch performance.
SSD has a lot of promise thanks to its high performance.
Nevertheless, commercial adoption hasn't picked up as capacities remain
relatively low and costs high. Using SSD by itself or in large amounts is not
a scalable model especially when data is growing exponentially. Hence, AST is
more suitable for today's environments. SSD remains relevant for areas
where performance is critical and cost is not a consideration but in all other
scenarios AST (including SSD/FC/SAS) will prevail.
Satyaki Mitra, Principal Technology Advisor, NetApp India,
said, "Tiering is a way to manage the migration of data between high performance
(high cost) and high capacity (low cost & low performance) disk storage
systems. Storage systems are going to have large amounts of SSD/Flash, which
are going to be dynamic with serial ATA behind them, and the whole concept of
HSM and tiered storage is going to go away."
Lulla added, "We believe that both approaches are complementary. An organization
can maximize efficiency and productivity by implementing FAST and adding a Tier
Zero with SSDs."
Upgrading to an automated approach
AMI said that the first
step in AST involves choosing a storage controller that is capable of supporting
multiple data tiers. The type of disk in each tier should be chosen carefully
by the administrator keeping in mind the capacity, performance and cost requirements
of that tier. When possible, additional storage technology should be incorporated: technology
that doesn't require significant investments of time and resources to learn
its operations. New solutions that are power or space efficient must be integrated
into the tiered array.
Sriram S., CEO, iValue InfoSolutions Pvt. Ltd., said, "Upgrades
can be done both ways and it depends on the business need and data points. Storage
hardware-based tiering is always going to be more cost-effective than deploying
a separate appliance. Tiered storage by design gives good headroom for data
growth and management."
If the storage system's architecture supports seamless integration of new technology,
an upgrade limited to Flash/SSD technology should be sufficient for an organization
to reap the benefits of this technology. Alternatively, FC drives can be short
stroked. However, if the architecture does not support the infusion of such
new technology or the optimal use of it, then a new storage system may be required
to employ AST.
Minimal human intervention
Existing tiered storage comes with management tools to completely
automate data migration based on set rules. With a storage tiering management
tool in place, human interference is nearly eliminated. Periodic audits are
recommended to fine tune policies for getting the maximum benefit from investments
in tiered storage.
Venkat Naraina, Director - SSE (Storage Software Engineering),
LSI India, said, "There is no human intervention required to move data
between different tiers. Everything is policy driven. However, care needs to
be taken while defining policies."
Sandeep K. Dutta, Vice President - Storage, Systems and Technology Group, IBM
India/SA, said, "The Easy Tier feature for AST on the DS8700 is designed
to minimize human intervention for manual tuning. It provides the capability
of data migration at a sub-LUN level for automated granular data placement.
You can also manually migrate entire volumes with Easy Tier. It is fully automated
and is easy to install." The DS8700 comes with a built-in advisor tool
that reports which data is hot and which is cold to help users assess the potential
value of enabling Easy Tier in their environment.
Moving data out of a specific vendor's solution
The last tier could even be cloud storage. Typically, the different storage
tiers are controlled by a single storage controller and are designed to work
across multiple media types and configurations.
Tiering is always done within the vendor solution for the faster retrieval of
data and the effective functioning of management software. You can always integrate
third-party solutions like archival.
Dutta added, "This is possible with a SAN Volume Controller from IBM which
provides storage virtualization and also has the capability to move data between
two different disk arrays without any application downtime."
"There are various software solutions available that allow data to move
to tiers outside the specific vendor's solution by suitably designing the
solution," said Talukdar.
The underlying data management framework including provisioning, backups, DR
etc. remains identical and agnostic to the mode of access or configuration.
You can make those selections based on today's reality, knowing that you
are not locked-in if there is a need to change in the future.
manjari.juneja@expressindia.com