|
Data center Metrics
The power of metrics
You cannot manage what you cannot measure. Data center managers
rely upon metrics to alert them before things get out of control. With high
uptime and reliability being the watchwords, metrics are an important tool
in the IT manager's arsenal, says Neeraj Gandhi
A data center is an integral part of the overall infrastructure in any enterprise
and an efficiently run operation contributes to a company's growth. The
rise in adoption of enterprise application software such as ERP, CRM and BI
and the consequent flood of information generated by these systems have helped
raise the profile of the data center as this is where all enterprise applications
are hosted. Consequently, when a data center experiences problems, the business
suffers.
As they have a lot to deliver, data center managers are constantly
under pressure. To ease this stress, and ensure that a data center functions
properly, the IT team has to oversee a number of parameters including bandwidth/connectivity,
cooling and supply of power, server horsepower, storage capacity, etc. It becomes
important, in this context, for IT to gauge the performance of the data center.
Organizations increasingly depend upon their IT infrastructure to support mission-critical
activities. It is here that the measurement of performance of different elements
of a data center becomes necessary in order to maintain high performance and
deliver the goods.
|
"SWaP
evaluates the efficiency of a server within the constraints of space and
power consumption. However, efficiency factors of power supplies, air
conditioners etc. need to be factored in for accurate comparisons"
- Arnab Roy
General Manager- Marketing,
Sun Microsystems India Pvt. Ltd
|
|
"Metrics
are required to balance the equation between power and cooling.
It is important to measure these
aspects in order to effectively manage a data center"
- James Mouton
Senior Vice President & GM, Industry Standard Servers, HP
|
|
"Metrics
have helped us consolidate our servers into a single instance and save
on server investment and application development. Our CPU utilization
has increased to 80 percent up from 60 percent"
- T.G Dhandapani
CIO,
TVS Motor Company Ltd
|
To attain this particular objective, data center administrators
resort to different metrics. A metric is the unit of measurement of a particular
characteristic or the performance and efficiency of an element in the data center.
Metrics can apply to individual components or to the data center as a whole.
Monitoring metrics helps ensure that a data center is functioning in a smooth
and consistent manner. This helps address the issue of power and cooling, and
identify potential problems so that they can be mitigated before they get out
of hand.
| Issue #1: Ability to track and assess equipment
availability
For most organizations, the cost of server or network
downtime is significant and internal customers expect network and system
availability of five nines or 99.999 percent. On a daily basis, IT managers
need to be able to assess availability/reliability of equipment and all
external components that support operations, so that they can reduce downtime,
identify and mitigate issues, and provide a secure environment for an
organization's mission-critical equipment.
Meeting the Challenge: Environmental monitoring
solutions provide real-time feedback about critical systems with continuous,
proactive monitoring of all pertinent factors including temperature, amperage
draw, humidity, dew point, and physical security. These solutions allow
administrators to set thresholds for environmental conditions and send
alerts securely via e-mail or text message. In addition, environmental
monitoring systems provide valuable historical reports, alert information,
and logs that allow administrators to identify trends and adapt practices
accordingly. This data can help with statistical analysis, modeling, and
forecasting.
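To put the five-nines target in perspective, the short Python sketch below converts an availability percentage into a yearly downtime budget. It assumes a 365-day year, and the function name is illustrative rather than part of any monitoring product.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Compare a few common availability targets.
for target in (99.5, 99.9, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows {minutes:.1f} minutes of downtime per year")
```

Five nines leaves a little over five minutes of downtime a year, while a 99.5 percent target allows almost 44 hours.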
Issue #2: Ability to measure energy consumption
in the data center
Across industries, rising data center power consumption
and heat are major issues, particularly as organizations are incorporating
blade servers and high-density server racks into their IT infrastructure.
Many organizations are studying how power consumption can be reduced in
the data center. The Green Grid, a newly formed non-profit consortium
of information technology companies, proposes the use of Power Usage Effectiveness
(PUE) and Data Center Efficiency (DCE) metrics, which would enable IT
personnel to estimate the energy efficiency of data centers, compare results
against other data centers, and determine if energy efficiency improvements
need to be made.
Meeting the Challenge: Utilizing PUE and
DCE information, IT personnel can start evaluating their own energy efficiency.
Using these metrics, as well as application-specific data, data center
managers should start considering ways to reduce data center power consumption.
Standalone data centers can also use the EPA Energy Star building performance
rating tool, Portfolio Manager, to rate a facility's energy performance
in comparison to similar facilities (at the whole-building level). Some
answers include transitioning to three-phase power provisioning. Higher
voltage power reduces amperage requirements, allows equipment to operate
more efficiently, and can reduce the amount of hardware required.
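As a rough illustration of the two ratios, PUE is total facility power divided by IT equipment power, and DCE is its reciprocal. The sketch below computes both; the function names and the power readings are invented for illustration and do not come from any real facility.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def dce(total_facility_kw, it_equipment_kw):
    """Data Center Efficiency: IT equipment power / total facility power (1 / PUE)."""
    if total_facility_kw <= 0:
        raise ValueError("total facility power must be positive")
    return it_equipment_kw / total_facility_kw


# Hypothetical readings: 1,000 kW at the utility meter, 500 kW reaching IT equipment.
total_kw, it_kw = 1000.0, 500.0
print(f"PUE = {pue(total_kw, it_kw):.2f}")
print(f"DCE = {dce(total_kw, it_kw):.0%}")
```

In this made-up case the facility draws two watts for every watt delivered to IT equipment, so half of its power goes to cooling, power conversion and other overheads.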
|
By defining a set of metrics, IT can ensure that other departments have full
access to their key applications 24x7x365.
Metrics are important as they enable the measurement of system performance.
A data center is built keeping certain business objectives in mind. Metrics
are required to ascertain whether the data center is meeting those objectives
or not.
T G Dhandapani, CIO, TVS Motor Company Ltd., said, "Metrics are important
in order to measure performance and provide an environment for continuous improvement
that drives people. Any function that adds value in an organization should be
evaluated on quality, cost and delivery through the appropriate metrics. The
data center is no exception [to this rule]."
Kaushik Chandra, CTO, PricewaterhouseCoopers, added, "Metrics are required
to sustain the growth of the data center as it has to expand [keeping pace]
with the business growth. Metrics help us in capacity planning and they are
also important from a security perspective. Having a set of metrics necessarily
means that you have to monitor the same, which can lead to the detection of any
unusual activity in the data center."
Power and cooling in a data center are perhaps the two biggest issues that bedevil
IT organizations. There is a growing need to control costs in order to enable
future expansion and innovation. James Mouton, Senior Vice President & GM,
Industry Standard Servers, HP, said, "Metrics are required to balance the
equation between power and cooling. It is important to measure these aspects
in order to manage the data center effectively. If we cannot measure these variables,
we cannot fix related problems and cut costs, which may otherwise go up. So
it has become a strategic decision for IT users."
| Syska Hennessy, a New York-based engineering, technology
and consulting firm, has proposed a new data centre performance metric that
gives more detailed information for calculating data centre performance.
The Syska Hennessy system examines 11 different aspects of data centre performance
and measures them on a scale from one to 10. These 11 items are power, HVAC,
fire and life safety, security, IT infrastructure, controls and monitoring,
commissioning and testing, operations, maintenance, operations and maintenance
procedures, and disaster preparedness. Syska Hennessy calls its system
"Criticality Levels." According to the company, it is a more complete
way of looking at criticality and defining the targeted reliability levels
for a particular facility. It argues that the systems [metrics] that have
been in place in the past have been ambiguous and limited.
The Uptime Institute created a four-tier rating
system that applied the IT concepts of high availability and concurrent
maintainability to the underlying data centre infrastructure. Tier-1 data
centers are the most basic while a Tier-4 facility is fundamentally immune to planned
and unplanned downtime. The research firm developed its classifications
over a decade ago. According to the institute, the tier system is an overall
conceptual look at the data centre, while the Syska Hennessy proposal
examines more detailed, engineering aspects.
Both the Uptime Institute and Syska Hennessy are for-profit
consulting firms, not formal standards bodies or professional associations.
The companies develop these data centre performance metrics because it
means a good deal of publicity and good standing, and therefore business,
for the consulting firms.
|
Popular metrics
There is no paucity of metrics that are used in Indian data centers. The only
difference is that some are old while others are new. There are even some that
have been designed by enterprises themselves to cater to their specific needs.
The idea here is to track the performance of the different elements that make up a
data center: servers, storage, networking equipment, etc. "There
are various metrics from the data center perspective that are important, such
as environmental, physical access, network performance and server performance
metrics," said Chandra.
Environmental metrics relate to the data center environment; these are statistics
regarding temperature, humidity, etc., which can point to deficiencies in
the cooling environment, a critical factor in modern rack-based data centers.
Similarly, UPS performance data tells you about the quality of raw power being
supplied and the number of interruptions that have occurred, all of which can
help determine the root cause of such incidents.
| Research on data centres indicates that over the
next five years, data centres will run out of power. Orange, Calif.-based
AFCOM itself predicted that over the next five years power failures and
other limits on power availability will halt data centre operations at more
than 90 percent of all companies. Stamford, Conn.-based research firm Gartner
Inc. said that by 2008, half of all data centres will have insufficient
power and cooling capacity to meet demands of high-density equipment. "You
have to remember that electricity is the mother's milk of any data
centre," Fanara said. "It's the inefficiency of that use
in the data centre that creates the problem of running out of power."
Here's a rundown of all the data centre energy
efficiency metrics, labels and benchmarks that should hit the streets
by early next year:
- Energy Star label for servers. Much like
the labels it has for refrigerators, ceiling fans and other household
devices, the EPA hopes to develop an Energy Star label for computer
servers. Fanara said that should be ready by early next year.
- SPEC energy-efficiency benchmark. This
metric would rate performance compared with energy consumed so that
users can determine which servers are best for different kinds of workloads.
Fanara said it will be ready by the end of this year. According to Green
Grid officials at the Next Generation Data Centre conference last month,
the SPEC power benchmark may even be ready by October 2007.
- A data centre efficiency metric. This
standard would measure the energy efficiency of an entire data centre
rather than just individual servers. It would compare total electricity
consumed by the data centre to what is actually getting to the IT equipment,
represented as a ratio. Fanara said that the EPA will endorse one or
more of these metrics by the end of this year.
|
Chandra added that modern data centers by themselves are usually out-of-bounds
for general users. As such, all access needs to be scrutinized and permitted
only when there is a genuine need for it. Access metrics help in this regard.
One of the critical metrics is network performance. Closely monitoring this
helps maintain adequate bandwidth at all times and isolate problems of network
congestion. It also helps in capacity build-up, as any network upgrade requires
considerable planning and lead-time.
There are many other metrics being used across verticals. Enterprises such as
Wipro, TVS Motors, LG, HP, Sun Microsystems and IBM use varied metrics in their
data centers.
Wipro keeps its data center's temperature between 22 and 23 degrees
centigrade and network bandwidth utilization below 70 percent. The acceptable
level of CPU utilization is 80 percent. In the same way, system and network availability
is measured in terms of uptime, with 99.5 percent being the standard; power consumption,
data storage capacity and the total number of mailboxes are also tracked.
Bala Giridhar, Head IT Global, Wipro Technologies, said, "In the past we
were measuring utilization, availability and capacity, which was not enough.
In recent times, we have also started measuring temperature and power-related
parameters as it has become absolutely critical. We also plan to measure power
consumption per rack and the utilization of real estate (raw space) because
the data center has become denser and more complex."
| For the long term, The Green Grid proposes the Datacenter
Performance Efficiency (DCPE) metric. The DCPE is the natural evolution
from PUE and DCE and is described as follows:
DCPE = Useful work / Total facility power
While the DCPE is much more difficult to determine,
experts feel that this is a key strategic focus for the industry. In effect,
this calculation defines the data centre as a black box: power goes
into the box, heat comes out, data goes into and out of the black box,
and a net amount of useful work is done by the black box. This in some
ways parallels the work being done with the EPA and Standard Performance
Evaluation Corporation (SPEC) at the server level in which the SPEC working
group may produce a standard on the performance of a system, and the EPA
provides a process by which to measure power consumed by the server. The
Green Grid hopes to eventually increase the scope of that work to all
IT equipment and will require broad participation from the IT community
to help guide and define this work.
|
- Space, Watts and Performance (SWaP)
- Network bandwidth utilization
- CPU utilization
- Uptime and availability
- Data growth
- Performance per watt
|
Similarly, TVS Motors needed a significant increase in compute
capacity and wanted to keep costs down. The company witnessed a rise in demand
for enhanced applications in ERP and a leap in the number of users due to geographical
expansion (new manufacturing plants and manufacturing facilities set up abroad),
which led to the introduction of a new set of metrics. In the past the uptime
and availability of applications were measured through downtime in percentage
terms; offline backup time was calculated in hours and CPU load utilization
was held at 80 percent. Now the company has developed two sets of metrics: those
impacting its top line growth and those that impact the bottom line. Top line
metrics are used for measuring application response time, which has risen significantly.
Metrics impacting the bottom line are measured by computing system utilization: CPU
utilization (peak and average) and disaster recovery utilization for backup
and testing. Dhandapani said, "Metrics have been taken up keeping in mind
relevancy, adequacy and accuracy. These metrics drive the data center manager
and the CIO to deliver the best possible service to the business."
LG Electronics has to meet an obligation of 99.999 percent uptime. To this end
it tracks elements such as storage, server, LAN, WAN, database and UPS. All
these components fall under the umbrella of Uptime Metrics. For example, average
storage response time is now set at 12 to 14 milliseconds. Server CPU utilization
is kept under 80 percent and WAN link response time is set at 70 to 80 milliseconds.
UPS utilization is kept under 75 percent. Daya Prakash, Head, IT, LG Electronics
India Pvt Ltd, said, "Performance measurement of all these elements of
a data center has helped us support our growing business and bring down TCO."
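Limits of this kind lend themselves to a simple automated check. The Python sketch below is a minimal illustration only: the metric names are hypothetical, the limits are the upper figures quoted above, and treating each figure as an inclusive upper bound is an assumption rather than LG's stated policy.

```python
# Upper limits taken from the figures quoted in the article.
# The metric names are hypothetical labels chosen for this example.
LIMITS = {
    "storage_response_ms": 14,
    "server_cpu_percent": 80,
    "wan_response_ms": 80,
    "ups_utilization_percent": 75,
}


def find_breaches(readings):
    """Return a message for every reading that exceeds its limit.

    Metrics missing from `readings` are skipped rather than reported.
    """
    breaches = []
    for name, limit in LIMITS.items():
        value = readings.get(name)
        if value is not None and value > limit:
            breaches.append(f"{name}: {value} exceeds limit {limit}")
    return breaches


# Hypothetical sample readings from a monitoring poll.
sample = {"storage_response_ms": 13, "server_cpu_percent": 85, "wan_response_ms": 75}
for message in find_breaches(sample):
    print(message)
```

A production monitoring tool would add alerting and trend history; the point here is only that each uptime metric reduces to a reading compared against an agreed limit.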
| A closer look at the big picture brings to light
another reason for the use of metrics, viz., the evolving role of CIOs and
IT heads. They were earlier accountable for infrastructure costs and maintenance
of the data center. Now they have to also look at reducing the operating
costs of the data center.
K S Ganesan, CTO, Microland Ltd., said, "A
CIO now has to clearly design measures and adopt policies to fulfill his
additional responsibility. These could include strategies to reduce energy
costs, measures to optimize the data center, IT budgeting strategy, and
an outsourcing strategy in the case of hosted data center space."
"The use of the right set of metrics has helped
TVS consolidate all of its servers into a single instance and save on
server investment and application development. Also, thanks to the availability
of robust servers, the threshold for CPU utilization for scaling has increased
to 80 percent, up from 60 percent," said T G Dhandapani, CIO, TVS
Motor Company Ltd.
Bala Giridhar, Head IT Global, Wipro Technologies, added,
"The right set of metrics has helped us identify single points of
failure, provide a better environment to the systems and devices [resulting
in] improved reliability. It has also helped us take initiatives like
consolidation and virtualization to get better value for money."
|
Sun Microsystems has introduced SWaP (Space, Wattage and Performance), a metric
that assesses the efficiency and effectiveness of rack-optimized server deployments
in a data center. Arnab Roy, General Manager- Marketing, Sun Microsystems India
Pvt. Ltd, said, "It evaluates the efficiency of the server within the constraints
of space and power consumption. The SWaP tool is accurate in capturing system
power, space and performance. However, efficiency factors of power supplies,
air conditioners, etc. need to be factored in for more accurate comparisons [to
be made]."
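Sun describes SWaP as performance divided by the product of space and power consumption. The sketch below applies that ratio to two hypothetical servers. The performance scores, rack units and wattages are invented for illustration and, in line with Roy's caveat, the calculation ignores power supply and cooling efficiency.

```python
def swap_score(performance, rack_units, watts):
    """SWaP = performance / (space * power).

    `performance` is any benchmark score, as long as the same benchmark
    is used for every server being compared.
    """
    if rack_units <= 0 or watts <= 0:
        raise ValueError("space and power must be positive")
    return performance / (rack_units * watts)


# Hypothetical servers: (benchmark score, height in rack units, power draw in watts)
servers = {
    "server_a": (1000, 2, 400),
    "server_b": (1200, 4, 600),
}

for name, (score, units, watts) in servers.items():
    print(f"{name}: SWaP = {swap_score(score, units, watts):.2f}")
```

Here the second server is faster in absolute terms but scores lower on SWaP, because it takes twice the rack space and draws more power.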
According to Calvin Nicholson, Marketing Manager, Server Technology Inc., other
popular metrics include server efficiency, measured in MIPS (millions of instructions
per second) per watt, and data center efficiency, which is equal to the power consumed by IT
equipment divided by the power used by the facility. Here, IT equipment refers to
the equipment on the raised floor, and the facility's power consumption
is measured at its utility meter.
Using this information and the cost of a kWh, power costs can be determined.
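The arithmetic can be sketched in a few lines of Python. The function names, the sample load, the efficiency figure and the tariff are all assumptions made for illustration, and the cost estimate assumes a constant load around the clock for a 365-day year.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: constant load, non-leap year


def mips_per_watt(mips, watts):
    """Server efficiency: millions of instructions per second per watt."""
    return mips / watts


def facility_power_kw(it_power_kw, data_center_efficiency):
    """Facility power implied by an IT load and a data center efficiency ratio."""
    if not 0 < data_center_efficiency <= 1:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    return it_power_kw / data_center_efficiency


def annual_power_cost(facility_kw, price_per_kwh):
    """Yearly electricity cost for a facility drawing `facility_kw` continuously."""
    return facility_kw * HOURS_PER_YEAR * price_per_kwh


# Hypothetical figures: 500 kW of IT load, 50 percent efficiency, 0.10 per kWh.
facility_kw = facility_power_kw(500.0, 0.5)
print(f"Server efficiency: {mips_per_watt(20000, 400):.0f} MIPS per watt")
print(f"Facility draw: {facility_kw:.0f} kW")
print(f"Annual power cost: {annual_power_cost(facility_kw, 0.10):,.0f}")
```

With these made-up inputs, a 500 kW IT load implies a 1,000 kW draw at the meter and an annual bill of about 876,000 in whatever currency the tariff is quoted.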
| The Environmental Protection Agency (EPA) has said
that as early as next year [2008], data centre managers will be able to
use US government recommendations for making purchasing decisions. Andrew
Fanara, the head of the EPA Energy Star product development team, gave the
keynote address at AFCOM's Data Centre World conference in Dallas recently.
He said that Energy Star, in conjunction with Lawrence Berkeley National
Laboratory (LBNL) and industry groups like the Standard Performance Evaluation
Corp. (SPEC), Green Grid and the Uptime Institute Inc., will have benchmarks
and labels coming out at the end of this year and early next to aid data
centre managers in determining how energy efficient their facilities are
and which vendors offer the most energy-efficient servers. |
Growing importance
Sanjeev Gupta, Product Manager - Site & Facilities Services, IBM India, said,
"Data center infrastructure costs have increased multi-fold over the years.
In order to justify the annual IT spend on data center infrastructure, it is
important to gauge the performance of the data center."
Every business has its own outlook vis-a-vis metrics, and different reasons
for adopting them. Giridhar said, "Service level agreements with business
[heads], scalability in line with business growth, cost optimization, meeting
environmental requirements of devices or systems in the data center, higher
growth, new products with denser electronics requiring higher power and cooling,
are the main drivers for metrics at Wipro."
Another important factor is cost. "Metrics have been adopted to cut down
the cost of power, equipment and people. They are being used to know whether
a company is wasting money or investing in innovation," opined Mouton.
Adherence to security standards like ISO 27001 is another driver, added Chandra.
With data centers increasingly playing a crucial role in business growth, it
is in the interest of all data center managers to adopt metrics to gauge the
performance of their data centers so that they can have better control over
operational costs.
neeraj.gandhi@expressindia.com
|