Route optimisation might sound familiar, but this isn't your
typical QoS technology. Can it shake up the IP routing
market? Doug Allen finds out.
Internet QoS has often seemed like an oxymoron, especially between
networks. But the combination of something old, multi-homing,
and something new, Route Optimisation (RO), could lead to
a profound change in public network business capabilities.
RO is built on multi-homing, where businesses use two or more
diverse routes to multiple ISPs, similar to building highway
off-ramps to overcrowded spots such as an airport or stadium.
When traffic fills up on one road, drivers find an alternate
route to reach their destination. Multi-homing works the same
way: End-user businesses establish multiple roads to their
Internet destinations.
In this article you'll learn about multi-homing, RO,
and how the two work together. You'll learn if you need
a RO-multi-homing solution, and you'll become acquainted
with the major players in this burgeoning industry. Finally,
you'll find out how to choose the solution that best
fits your enterprise and determine whether the solution should
be Customer Premises Equipment (CPE)-based or service-based.
Multi-homing is nothing new. Many businesses multi-home for
redundancy and disaster recovery, including members of the
finance, e-commerce, technology, entertainment, ASP/Managed
Service Provider (MSP), and data-centre world. These companies
must maintain absolute availability for their partners, suppliers,
and customers, meaning they need mission-critical, 24-by-7
Internet availability.
Multi-homing also supports multi-pathing, extra user traffic
from a growing Web presence, and a more stable connection.
But redundancy is the main attraction. Providers such as CoreExpress,
Savvis Communications, and Focal Communications, as well as
many ISPs, have all used multi-homing to offer premium services,
often at premium cost.
Because routing depends on Border Gateway Protocol (BGP) to
find paths and route traffic across the shortest one, multi-homing
only provides incremental ISP links. It doesn't assure
QoS for IP VPNs, for instance, or supply chain, extranet,
or e-commerce apps, or even Voice over IP (VoIP). BGP has
no visibility into backbone performance, congestion, or
load balancing. And given its "hot potato" nature
for routing at peering points, bottlenecks and choke points
are all too common.
Furthermore, BGP4 has slow failover rates when updating Internetwork
routing tables, and traffic is lost in the interim. So, if
you're multi-homing, you've got all this capacity,
but still precious little additional performance, which leads
to stranded bandwidth and unhappy customers.
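The gap between path length and performance can be illustrated with a small sketch. It assumes a heavily simplified model in which the only BGP tie-breaker is the number of autonomous systems in the path; real BGP best-path selection weighs other attributes first. The ISP names, AS numbers, and latency figures are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Route:
    isp: str                   # hypothetical upstream name
    as_path: Tuple[int, ...]   # AS numbers from the documentation range
    latency_ms: float          # measured out-of-band; BGP never sees this


def shortest_as_path(routes: List[Route]) -> Route:
    """Simplified BGP-style choice: fewest AS hops wins, performance ignored."""
    return min(routes, key=lambda r: len(r.as_path))


def lowest_latency(routes: List[Route]) -> Route:
    """What a performance-aware chooser would pick from the same routes."""
    return min(routes, key=lambda r: r.latency_ms)


routes = [
    Route("ISP-A", (64500, 64501), 180.0),          # short path, congested
    Route("ISP-B", (64502, 64503, 64504), 45.0),    # longer path, faster
]

bgp_pick = shortest_as_path(routes)
fast_pick = lowest_latency(routes)
```

In this made-up case the path-length rule selects ISP-A even though ISP-B is four times faster, which is the kind of stranded capacity described above.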
RO, used with multi-homing, is the answer to these limitations.
RO makes better use of bandwidth and lowers costs by monitoring
performance information from across a wide swath of the Internet
(across multiple backbones, sometimes globally, sometimes
more regionally). RO then makes intelligent routing decisions
as to which link, or "home", will perform best with
respect to latency, packet loss, jitter, and so on, for outbound
traffic. You can also load balance outgoing business traffic
across multiple pipes, figure out which link to use in order
to best meet the performance needs of a particular application,
or provide least-cost routing for a particular flow. For instance,
you might want to use a Tier 1 ISP for your IP VPN, but you
can relegate Web surfing or e-mail to cheaper links, giving
the end user more price flexibility.
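As a rough illustration of per-application link selection, the sketch below scores each link as a weighted sum of latency, packet loss, and jitter, then picks the lowest score. It is not any vendor's algorithm; the weights, application profiles, link names, and measurements are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LinkStats:
    name: str          # hypothetical link label
    latency_ms: float
    loss_pct: float
    jitter_ms: float


# Hypothetical weights: VoIP punishes loss and jitter, Web traffic mostly
# cares about latency.
PROFILES: Dict[str, Dict[str, float]] = {
    "voip": {"latency_ms": 1.0, "loss_pct": 50.0, "jitter_ms": 5.0},
    "web": {"latency_ms": 1.0, "loss_pct": 10.0, "jitter_ms": 0.0},
}


def score(link: LinkStats, weights: Dict[str, float]) -> float:
    """Weighted sum of the measured metrics; lower is better."""
    return sum(getattr(link, metric) * w for metric, w in weights.items())


def best_link(links: List[LinkStats], app: str) -> LinkStats:
    """Return the link with the lowest score for the given application."""
    return min(links, key=lambda link: score(link, PROFILES[app]))


links = [
    LinkStats("tier1", latency_ms=40.0, loss_pct=0.1, jitter_ms=2.0),
    LinkStats("tier2", latency_ms=35.0, loss_pct=0.5, jitter_ms=12.0),
]

voip_link = best_link(links, "voip")
web_link = best_link(links, "web")
```

With these invented numbers the jitter-sensitive VoIP flow lands on the Tier 1 link, while Web traffic goes to the Tier 2 link, mirroring the split between premium and cheaper pipes described above.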
Multi-homing in itself doesn't provide optimised routing;
multi-homing is just access pipes that you must manage. RO
products automate this process and reduce human error, a major
problem in managing BGP, and inform the network manager of
routing options. The manager can then either automate the
process so the RO recommendations are executed, or actively
control the process.
Target Market
So, do you need this stuff? Many analysts say yes, if you run
mission-critical apps over the Web with anything from multiple
T1s to multiple OC-3s. Vendors report interest from financial
services, manufacturing firms, and Internet portal companies,
among others. RO lowers costs particularly for enterprises
that pay for usage, as opposed to fixed fees. It can also
slow the drive to migrate to fatter pipes as demand increases,
further cutting costs.
However, a Burton Group report points out that companies might
want to segment their multi-homing services. All employees
might not require multi-homing. "The decisions about
what methods and equipment to use can be reduced to a manageable
level by carefully evaluating the following: First, what areas
within the corporate network require the high availability
provided by multi-homing?" writes William Terrill, an
analyst at Burton Group. "Second, what level of internal
capabilities is available to configure and manage the multi-homed
links? And third, what financial benefits and costs will be
incurred?"
ISPs are another candidate for RO. Using the technology, they
can deliver greater network reliability and QoS and charge
a premium for services tiered accordingly. They can also use
RO to optimise traffic with upstream ISPs, thus adding further
service guarantees to end-users.
The upshot is a market worth $2.6 billion by 2005, up from
$410 million today, according to NetsEdge Research Group,
an analyst group that follows RO closely. Vendor figures are
considerably higher.
The Players
A handful of start-ups have blasted out of the gate with various
services, creating a lot of excitement. These start-ups offer
BGP-based RO as software (netVmg), hardware (RouteScience), and
a service (Sockeye Networks). Other players include Opnix,
Proficient Networks, and the already established Radware, whose
LinkProof product uses Network Address Translation (NAT) as
an alternative to BGP. NSP InterNAP is the established player,
with five years of experience in the field.
These offerings fall into two main camps: CPE (netVmg, RouteScience),
or network service (Sockeye, InterNAP). I'll first look
at CPE-style implementation, using netVmg's Flow Control
Platform as an example. Though software implementations will
obviously differ from hardware, you can abstract some general
points from this approach that apply to both hardware and
software.
netVmg typically assesses a customer's existing network
configuration, crafts performance- and cost-optimisation policies
corresponding to the customer's business rules and processes,
and installs the company's software on off-the-shelf
servers, helping the customer configure them to their requirements.
Then, netVmg installs the servers at the customer premises.
Finally, netVmg performance-tests the platform to ensure that
the flows are routed according to defined policies, and that
QoS and cost-constrained routing are functioning properly.
"Our software only needs to reside at one location, that is, where
the application resides that is being accessed by remote users
or partners," says Ed English, netVmg's vice president
of corporate development. "There is no requirement for
any equipment, software, or change in network equipment or
providers at the remote end of the connection."
Though some products bill themselves as completely customer-configurable,
most require at least some customer and vendor collaboration
to get up and running. One point of contention is the degree
to which the customer must modify the content and applications,
or alter the network architecture to allow RO hardware access
to the existing CPE router and data path.
On the service side, Sockeye is the new fair-haired child
with its GlobalRoute service. Fresh off its first customer
win (Focal Communications), the company provides the same
route-monitoring capability as netVmg, but partners with Akamai
Technologies to access its 13,000 servers on more than 1,000
networks. This gives Sockeye a global view of Internet performance,
which it supplements with local loop and CPE probe measurements
to enable a richer report of IP backbone conditions. Managers
can use this data to make their own routing decisions or have
the changes implemented automatically. Keep in mind that most of these companies
have significant value-added features and differentiators
that bear discussion. However, rather than comparing them
head-to-head, vendor by vendor, you can find the key feature
sets later in the article.
Sockeye's biggest competitor is InterNAP, which connects
a single access line from the customer premises to a proprietary
POP, which then connects to the Tier 1 ISPs. Traffic is routed
along the optimal path. The big difference is
that Sockeye requires multi-homing into the customer premises,
and InterNAP doesn't.
Customer feedback on RO is scarce, though vendors have announced
a few wins (Opnix, for instance, claims a sizeable handful
of wins). InterNAP, with its five-year history, has the most
forthcoming customer base. Mike Apgar, president and CEO of
Speakeasy Network, a national ISP in 120 Metropolitan Service
Areas (MSAs), uses InterNAP for all of Speakeasy's traffic, including
e-mail, VoIP, and video. "InterNAP has provided us with
no less than 100 percent uptime over four years," says
Apgar. "The quality of their bandwidth and the extremely
high level of customer service and support more than outweighs
the cost differential of a more conventional provider, and
they let us relieve our internal staff of the management overhead
that would be necessary to achieve InterNAP's service
level."
Similarly, Doug Cavit, VP of network infrastructure at McAfee.com,
a single-site business with what it claims is the 44th largest
Web site in the world, has used InterNAP for two years for
Internet connectivity and downloads, pushing out over 200Mbits/sec
at peak periods. When asked about cost savings and maintaining
control, Cavit replies, "It has been performing very
well, and we have dodged major backbone outage issues and
have been able to consolidate our expenses from multiple vendors
to a very few. InterNAP has an excellent NOC with great engineering
talent and depth, but still allows us a degree of control;
we still monitor and control all of our vendors but we've
had to staff up less so it's been more of a workforce
augmentation tool."
Feature Set
Below are some things to look for in a RO solution, CPE- or
service-based.
- First, look for backbone-performance measurement techniques.
  The product should measure all multi-homed links, not just
  the active link. When switching from an active route that
  has degraded to another route, the chosen backbone must
  be optimal for the traffic flow, not randomly chosen.
- Sometimes it's beneficial to use both active and passive
  measurement. If you don't like pinging, used in active
  measurement, look for a solution using active and passive
  measurement, where Internet measurement traffic from servers
  and routers is sent to vendor probes that link to the RO
  platform. The product should allow customer preference to
  dictate how data is gathered.
- Some players push proactive monitoring capabilities that
  anticipate where congestion is likely to occur across the
  links and correlate that with the customer's data flow.
  Customer traffic is re-routed accordingly before performance
  bogs down, helping offset slow-acting reactive measurement
  schemes.
- Response time, or how quickly the solution detects congestion
  on an active path and shifts that traffic to the optimal
  route, varies by vendor. The gating factor is reliance on both
  a local CPE device and a provider NOC, which delays routing changes.
  Your evaluation should include proven results based on per-flow,
  per-application QoS, measured by the end user.
- Customers must have full control over RO policies. If the
  cheapest inactive path can't meet the Service Level
  Agreement (SLA), route traffic to the cheapest backbone
  that can. Policies must embrace business rules that relate
  users, user groups, and their IP addresses to specific cost
  and QoS parameters. You should centralise administration
  under a network manager, but that person might want to give
  certain end users policy-modification privileges.
- A Web GUI is a must for network management, along with a
  Command Line Interface (CLI) or an equivalent program to
  set policies and weighted link usage as it relates to cost.
  SNMP monitoring, covering environmental, power, disk, memory,
  and CPU utilisation, is also crucial.
- The RO platform must support all traffic types you plan
  to use. Web, VPN, and VoIP support is essential. If you
  have more granular traffic needs or plan to add more esoteric
  protocols, you'll need to ask about those specifically.
- Make sure your RO platform provides flexible load-balancing
  customisation options. You might want to load balance based
  on actual link usage. Setting up a simple link preference
  policy might not be granular enough or as cost-effective.
- Detailed reporting that tracks actual loss, latency, and
  jitter performance, both for active paths and projected
  performance, is necessary for each ISP link. Compare the
  inbound and outbound bandwidth usage and cost of the ISPs.
  Also consider per-user, per-flow performance measurement
  that tracks QoS across multiple networks from the customer's
  perspective. Reports should highlight performance to selected
  destination prefixes across each link. Fault isolation,
  diagnosis, and route change activity are also essential.
- Management must be secure, with authentication and authorisation
  that meets your existing level of security.
- A solution portfolio might also be important. If you want
  more support, look for a vendor who will get to know your
  network and offer handholding and expertise in BGP routing;
  Web architecture and performance; and VPN and Operations
  Support System (OSS) billing issues that inevitably develop
  with multiple ISPs.
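The policy rule in the list above (send traffic over the cheapest path that still meets the SLA, with policies tied to user groups) can be sketched roughly as follows. This is a minimal illustration, not a description of any product: the group names, SLA thresholds, link names, and prices are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Sla:
    max_latency_ms: float
    max_loss_pct: float


@dataclass(frozen=True)
class Link:
    name: str
    monthly_cost: float   # hypothetical currency units
    latency_ms: float     # latest measured values
    loss_pct: float


# Hypothetical business rules relating user groups to QoS requirements.
GROUP_SLAS: Dict[str, Sla] = {
    "finance": Sla(max_latency_ms=60.0, max_loss_pct=1.0),
    "guest": Sla(max_latency_ms=200.0, max_loss_pct=5.0),
}


def meets(link: Link, sla: Sla) -> bool:
    """True if the link's measured performance is within the SLA limits."""
    return link.latency_ms <= sla.max_latency_ms and link.loss_pct <= sla.max_loss_pct


def cheapest_compliant(links: List[Link], sla: Sla) -> Optional[Link]:
    """Cheapest link that satisfies the SLA, or None if no link qualifies."""
    compliant = [link for link in links if meets(link, sla)]
    return min(compliant, key=lambda link: link.monthly_cost) if compliant else None


links = [
    Link("budget", monthly_cost=100.0, latency_ms=150.0, loss_pct=2.0),
    Link("mid", monthly_cost=250.0, latency_ms=50.0, loss_pct=0.5),
    Link("premium", monthly_cost=600.0, latency_ms=30.0, loss_pct=0.1),
]

finance_link = cheapest_compliant(links, GROUP_SLAS["finance"])
guest_link = cheapest_compliant(links, GROUP_SLAS["guest"])
```

Here the finance group skips the cheapest link because it misses the latency limit and lands on the mid-priced one, while guest traffic stays on the budget link. Returning None when nothing qualifies leaves the fallback decision to the network manager rather than guessing.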
Product or Service?
With this list in mind, you can begin to evaluate products
and services, based on an intimate understanding of your network's
needs. CPE-based vendors contend their standalone products
measure end-user experience better, leading to improved routing
decisions. Another article of faith is that customer premises
gear isn't slowed down by working with a provider NOC
offsite, so RO products can react faster than a service. (This
hasn't been verified independently.)
NetVmg contends its software is non-intrusive and configured
to highly customer-specific policies, whereas service solutions
are implemented in a more cookie-cutter fashion. "Our
customers are telling us that they want intelligence and control
at the edge of the network; they don't want to completely
rely on their service providers to do this for them, nor do
they want to be connected to some remote, foreign system feeding
instructions to their equipment," says English.
Sockeye's partnership with Akamai gives the company access
to performance data across the globe, and takes local loop
data via customer probes for a fuller picture. Sockeye and
InterNAP both point out that their architecture provides them
a much broader view of the Internet than a CPE vendor could.
This allows smarter routing decisions and better reporting,
potentially giving customers contract-negotiating leverage
over their various ISPs, because end users can turn to competitors
more easily and compare price, QoS, and SLAs. CPE vendors
claim the opposite: that going with a single service-based
provider reduces end-user leverage. Neither claim has been
proven yet.
Sockeye and InterNAP also claim a service is cheaper up front
(true) and reduces maintenance chores and eases scalability
problems (likely, but remains to be seen). Services work with
existing network devices and can reduce management tweaking,
leading to a shorter deployment cycle. Hidden CPE costs can
include network overhead, BGP engineering personnel, and carrier
relations staff, as well as account license and monthly maintenance
fees. These are usually factored into service solutions. SLAs
are also important. InterNAP offers network availability,
monthly 55 millisecond (ms) round-trip latency, and under
1 percent packet loss. Other vendors' figures weren't
available at press time.
Though a thorough, independent evaluation is essential to
verify all vendor claims, InterNAP says it offers a proven
solution for 900 customers, keeps RO maintenance costs in
the network, and is network agnostic. It also offers round-trip
optimisation, so bi-directional traffic flows are optimised.
How? InterNAP says it can influence incoming traffic because
it's a major customer of the Tier 1 ISPs, who can help
implement BGP's local preference and community attributes.
Peter Christy, co-principal of NetsEdge Research, suggests
that if a customer has the necessary BGP and carrier expertise,
a CPE solution makes sense, because a service provider might
not customise or optimise the solution as much as the end
user would like. "Long term, I think services will make
more sense, but that might be several years away," Christy
says. "A service has the potential advantage of greater
visibility into the network. Sockeye, for example, using Akamai's
view of the Internet, can anticipate that some route will
be problematic and avoid it, whereas a simple product with
no connection to the rest of the world has to learn there
is an issue, either by probing (spending on extra bandwidth),
or by the failure of an initial data transmission."
Trust, But Verify
While multi-homing means more ISP fees, it should lead to
lower total network costs, considering RO's ability to
use cheaper pipes, routing by cost and QoS. You must also
consider extra installation, local loop, and staffing costs
for additional links, though they're offset by savings
from automating IT staff chores. Multi-homing also contributes
to the complexity of routing tables and could cause BGP breakdowns
on your network. Thorough RO testing is required to prevent
such problems.
If this technology can deliver, you'll be able to maintain
performance while lowering costs. With RO, you can set up
links and route to lower-price ISPs, eliminating the need
to buy big fat pipes from Tier 1s that are usually far more
expensive than a decent Tier 2 ISP. With RO and load balancing,
customers can use less bandwidth over two or more ISPs, lowering
costs. Essentially, you can pick lower-cost providers, depending
on the application involved, and save the big guns for the
mission-critical, bandwidth-sensitive stuff, such as a videoconference
over an IP VPN.
Look for more vendors to come into this space and check out
customer references as more betas are completed. This market
will surely go through a "prove-it-to-me" period
and some consolidation, but routing optimisation will be one
of the few big-splash technologies in 2002 and 2003.
www.networkmagazine.com