Telecommunication Services and Service Management Challenges
Abstract: Trends in telecommunications networks including network
convergence, requirements for QoS and service level agreements, and open
service architectures are impacting the service management systems and processes.
New results in three areas of IP service management are described. The
architecture of a new platform for service management is presented. This
is the first reported service assurance platform to use ASP technology
as its infrastructure. A new performance management suite is described.
This suite currently supports measurement and reporting of web and stream
servers and VoIP softswitches. Finally, a recent result in customer care
automation for processing large volumes of email sent to a customer care
center is reviewed.
Keywords: Service Management, Network Management, IP Services,
1 Introduction
Large telecommunications service providers are facing significant changes
in service definition and service management due to network technology
convergence, deregulation and the growing importance of IP networking.
Service management functions impacted by these changes include service
fulfillment, service provisioning, and service assurance. Service providers
have a considerable investment in business processes and operations support
systems (OSS) for their existing network. Both processes and systems were
developed over many years to support a range of service offerings including
voice (PSTN), centrex, wireless voice, point-to-point data circuits (T1,
T3, etc.), and broadband (SONET, ATM, Frame Relay).
OSS requirements for IP service management must include some new considerations.
First, the network technology and the marketplace change at an increasingly
rapid pace, meaning that the end-to-end service management suite must be
easily adapted. This is a particular challenge since the suite of systems
is quite large, their lifecycle spans many years, and the network technologies
are diverse. Second, there is increased competitive emphasis on quality
of service (QoS) and the use of service level agreements (SLA) is becoming
more critical. This is one of the most important areas of evolution of
the Internet protocols and related management platforms, and there is much
that can be done using existing technology. Third, service ordering must
be available through web interfaces and electronic data interchange (EDI)
interfaces. There has been considerable progress in this area during the
past five years including the availability of ecommerce platforms and enterprise
integration middleware, and the rise of extensible markup language (XML)
as an interchange format. Fourth, an open service architecture is proposed
for voice services mediated by softswitch technology. This is expected to
lead to voice services which involve both third party and service provider
support, complicating the service management function.
The demarcation line between the service provider and the customer is
changing as services move to the application layer. In the residential
context, residential gateways are being developed by which home networks
and services will be mediated. In the business context, service offerings
must integrate with the enterprise network. Further, a single service provider
is unlikely to control the entire facility needed to implement the service,
making it difficult to offer end-to-end service guarantees.
Trends in service management have been widely discussed [see Mitra
(00)]. Related work addresses the impact of QoS [see Jiang
(00)], the impact of artificial intelligence techniques
[see Fuller (99)], the role of network convergence
[see Moyer (01), Modarressi (00),
Garg (98)], and the emerging area of policy-based management
[see Wright (99)]. This paper surveys
recent work we have performed in IP service management for a large service
provider and a Tier-1 ISP. In particular we have developed a new service
management platform based on Java Application Service Platform (ASP) technology.
We have also developed a performance management suite for IP services.
Finally, we have developed a system for automatic processing of a large
volume of customer care email.
In the next section we describe in more detail the concepts of IP services.
Subsequent sections describe, respectively, the new management platform,
the performance management suite, and the customer care automation. The
paper concludes with a summary section.
2 The Nature of IP Services
We consider IP Services to refer to session and application layer behavior
and protocols that rely on IP internetworking for end-to-end connectivity
and provide generic functionality. Examples of well-known IP services include:
Web hosting is the most prevalent such service in use today. Large scale
deployment of other services is underway, driven by economies of scale
of the Internet and related technologies, improved capacity of backbone
and access networks, and the evolution of internet protocols towards scalable
QoS-capable transport. Services are typically implemented and operated
by organizations that are not network service providers (NSPs). In many
cases the use of standard protocols and data formats leads to interoperability
of service implementations and device-independent access.
One difficulty in defining an IP service is
the gray line between generic services (e.g. DNS and LDAP) and specialized
services built from these building blocks. For example, DNS entry modification
is used as a load-balancing component in web server and web caching systems.
LDAP is used in some softswitch implementations. The fact that many services
are built as an aggregation of various systems, dedicated switching devices,
and software components, each with its own management and configuration
interfaces, complicates the creation of large-scale services. Today this
is dealt with by vendor-specific monitoring and configuration tools. From
the service provider perspective, proprietary management tools and techniques
are more costly to operate and integrate with other operations processes
and operations support systems.
Efficient configuration and management of such services is critical
for ensuring rapid time to market and uniform service quality as the service
availability expands. From a monitoring perspective, aggregation of fault
information relative to an end-to-end service view is needed for a number
of reasons, including simplifying operations processes, providing a quantitative
relationship between service degradation and customer billing and SLAs,
and scalability for all aspects of OAM&P for the service.
Figure 1: Simplified caching system for a content distribution network
Two examples of emerging IP services are content distribution networks
(CDN) and softswitch technology. CDNs [see Fig. 1] are designed for improved
delivery performance of web content, streaming media, and mp3 files. CDNs
are frequently implemented as dedicated high performance networks with
content caching at the terminating points. The caches improve response
time by reducing retrieval time compared to direct access to the origin
server. For large loads, a cluster of caches can be implemented. The techniques
for load distribution for cache clusters are similar to those used for
web server clustering. However, some cache architectures have special protocols
(e.g., ICP) for cache-to-cache lookups. From an end-to-end perspective,
management of the CDN requires an integrated view of the caches, the load
distribution component, the network, and the origin servers. Content management
is an area of growing importance in CDNs.
"Softswitch" is a widely used term for emerging IP-based communications
systems that employ open standards to create integrated networks with a
decoupled service intelligence capable of carrying voice, video and data
traffic [see Huitema (99)]. Softswitch is representative
of the migration from circuit switching to packet/frame/cell-based networks
for voice and video communications. The standards-based approach taken
to date suggests that many voice applications will emerge that leverage
the softswitch platform.
The basic components of a softswitch configuration [see Fig.
2] include the Call Manager, Gatekeeper, and Gateway, which work together
to permit end-to-end signaling, authorization, and connectivity for IP
to PSTN and IP to IP telephony. Other components include aggregation equipment,
billing record creation, and different types of signaling and media gateways.
From an end-to-end perspective, management of the softswitch-enabled telephony
service requires an integrated view of the signaling and transmission components.
Additionally, gateway mediated service means that calls may involve two
or more management domains, and in the case of IP to PSTN, different management
systems and processes for different portions of the connection.
Figure 2: Simplified softswitch configuration
As these examples illustrate, end-to-end management is vital to IP service
management as it has been in traditional network management. The use of
event correlation for service assurance in such networks is discussed in
previous work [see Buford (01)].
3 Management Framework
The use of modular platforms is well established in the network management
industry. This approach is intended to address the needs of both service
providers and enterprises to incrementally upgrade and scale the management
platform as the network functionality evolves and grows.
In recent years there has been a trend to provide CORBA (Common Object
Request Broker Architecture) and message-oriented middleware interfaces
for integration with other operations support systems. It has also become
common to provide web-based application clients [see Tsai
(98), Ahn (99)] because of the wide use of web
There have also been extensive efforts to define standard network models
that would promote interoperability between vendor products and simplify
the effort to support new types of network elements. Although consortium
and standards for network models have proliferated, the goals of interoperability
and simplified upgrades have only modest results.
An important area in network element research is to support mobile code
in the software platform of the device, referred to as active networks.
Although limited commercial adoption of this approach has occurred to date,
the impact of active networks on network management is receiving considerable
research interest [see Schwartz (00)].
Figure 3: Architecture of the Noventra service management
system using application service platform (ASP) infrastructure
3.1 Noventra(TM) ASP-Based Management Platform Architecture
Noventra is the first network management platform to use a Java Application
Server Platform (ASP) for its infrastructure. Many of the features of today's
Java ASPs were standardized by the Object Management Group (OMG) but never
fully realized in the vendor products. The Enterprise Java Bean (EJB) products
available today provide object-to-relational containers, server clustering
and load balancing, web-server integration, Java applet integration, name
servers, reliable messaging,
integrated development environments including Unified Modeling Language
(UML), and extensible markup language (XML) support. New EJBs can be dynamically
loaded, simplifying upgrades of the application. These benefits have
to date come at the cost of working with a new and evolving software technology
that has a performance handicap when compared with more established approaches.
The Noventra architecture [see Fig. 3] is designed
for efficient fault management including event collection, persistence,
subscription, and forwarding operations. It also provides for network topology
viewing and selection. Network elements can be securely configured, using
SNMP version 3 security features, or a patented secure remote shell technique
for host-based configuration of network elements.
Events can be received from various sources including SNMP traps, host
log file monitors, and Java remote method invocation (RMI) sources. The
GRACE correlation engine [see Jakobson (00)] integrates
with Noventra using the CORBA Notification Service. QoS threshold events
detected for IP services [see section 4] can also be received.
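The collection, persistence, subscription, and forwarding operations described above can be illustrated with a minimal in-memory sketch. This is not the Noventra implementation, which is built on a Java ASP; the `Event` and `EventBus` names, the source labels, and the field layout are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class Event:
    source: str   # hypothetical labels, e.g. "snmp-trap", "logfile", "rmi", "qos"
    element: str  # network element or service the event refers to
    message: str


class EventBus:
    """Collects events, keeps them, and forwards them to subscribers."""

    def __init__(self) -> None:
        self.store: List[Event] = []  # stand-in for persistent storage
        self.subscribers: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, source: str, handler: Callable[[Event], None]) -> None:
        """Register a handler for all events from one source type."""
        self.subscribers[source].append(handler)

    def publish(self, event: Event) -> None:
        """Persist the event, then forward it to matching subscribers."""
        self.store.append(event)
        for handler in self.subscribers[event.source]:
            handler(event)


bus = EventBus()
forwarded: List[Event] = []
bus.subscribe("snmp-trap", forwarded.append)
bus.publish(Event("snmp-trap", "router-1", "linkDown"))
bus.publish(Event("logfile", "host-7", "disk full"))
```

Both events are stored, but only the SNMP trap reaches the subscriber, since nothing subscribed to the log file source. A correlation engine would attach as one more subscriber in this model.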
4 Performance Measurement of Services
Performance measurement of networks and services is used for:
- Service Level Agreement (SLA) measurement and tracking [see Muller (99)]
- Service benchmarking against other service providers
- Performance reporting for infrastructure capacity planning
At the transport layer, measurement can be either end-to-end or by segment
[see Jiang (00)]. Performance statistics can be
aggregated, or packet and cell level measurements can be captured. Although
measurement can be made at the network element, usually probes or agents
are used to avoid impacting the performance of the network. Probe techniques
are typically transparent to the network, but require insertion at the
network links at which measurement is needed. Since the probe sees every
bit moving over the physical media, there is a great deal of flexibility
in selecting which flows and protocols to measure. When segment by segment
measurements are made of an end-to-end flow, synchronization of the measurements
is an important issue.
Figure 4: Topological placement of measurement agents is
intended to cover the usage pattern of the user community
In cases where intermediate networks may be under control of multiple
service providers, such as many internet services, agent-based measurement
is useful. Agents are distributed topologically in the network [see Fig.
4], and measure the service by either generating service requests or
intercepting (via an instrumented proxy) actual client requests.
Characterization of the performance of the Internet continues to be
a challenging area [see Paxson (98), Murray (01)]
due to lack of instrumentation of the infrastructure, the many domains
that need to participate, and the growth and evolution of the network.
There are many research projects and commercial products for measurement,
analysis, reporting and visualization.
We have developed an agent architecture for IP service measurement.
Several types of agents have been created, which measure the delivery of
web servers and caches (SiteRadar(TM)), streaming media servers
(StreamMeter(TM)), and VoIP (voice over IP) softswitches (CallMeter(TM)).
The architecture also includes an SNMP (Simple Network Management Protocol)
poller for collecting performance statistics through MIB (Management Information
Base) queries. The agents reside on low-cost computing platforms that can
be placed at multiple sites throughout the network. The placement of the
agents should reflect the service traffic that is being measured.
4.1 Agent Architecture
The agent architecture [see Fig. 5] is modular and
scalable. Each agent is configurable to produce synthetic transactions
and record protocol level service measurements. The agents produce periodic
reports encoded in XML, which are automatically uploaded to a master server
and stored in the performance database. All transfers to the master are
encrypted using SSL (Secure Sockets Layer). A web-based interface is used
to present performance reports.
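As an illustration of the XML-encoded periodic report, the following sketch builds a report on the agent side and decodes it on the master side. The element and attribute names (`report`, `measurement`, `target`, `metric`, `value`) and the agent identifier are assumptions; the actual report schema is not given here. The encrypted upload step is omitted.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_report(agent_id: str, samples: List[Dict[str, str]]) -> str:
    """Encode one reporting period's measurements as an XML string."""
    root = ET.Element("report", {"agent": agent_id})
    for sample in samples:
        # Each measurement becomes one element whose attributes hold the values.
        ET.SubElement(root, "measurement", sample)
    return ET.tostring(root, encoding="unicode")


def parse_report(xml_text: str) -> List[Dict[str, str]]:
    """Master-side decoding of a report back into a list of dictionaries."""
    root = ET.fromstring(xml_text)
    return [dict(m.attrib) for m in root.findall("measurement")]


samples = [
    {"target": "http://www.example.com/", "metric": "response_ms", "value": "182"},
    {"target": "http://www.example.com/", "metric": "response_ms", "value": "240"},
]
report = build_report("agent-east-1", samples)
decoded = parse_report(report)
```

Keeping every value as a string attribute makes the round trip lossless; a master server would convert values to numeric types when loading the performance database.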
The architecture is designed to be integrated with Noventra(TM)
and other management platforms. QoS events, such as a performance threshold
being exceeded, can be triggered and forwarded.
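A threshold check of this kind can be sketched as follows. The `ThresholdEvent` structure, the service and metric names, and the 300 ms limit are hypothetical; the sketch shows only the comparison step that would precede forwarding an event to a management platform.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ThresholdEvent:
    service: str
    metric: str
    value: float
    limit: float


def check_threshold(service: str, metric: str, value: float,
                    limit: float) -> Optional[ThresholdEvent]:
    """Return an event when the measured value exceeds the limit, else None."""
    if value > limit:
        return ThresholdEvent(service, metric, value, limit)
    return None


# Hypothetical limit: 300 ms response time for a web service.
readings = [182.0, 240.0, 415.0]
events: List[ThresholdEvent] = []
for reading in readings:
    event = check_threshold("web-hosting", "response_ms", reading, 300.0)
    if event is not None:
        events.append(event)
```

Only the third reading produces an event. A production system would likely add hysteresis or a minimum duration so that a single outlier does not trigger an alarm; that refinement is outside this sketch.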
4.2 Web Service Management
SiteRadar(TM) uses geographically distributed measurement
agents to measure the response time of web sites and web caches and distinguish
the contributions of networks, servers, and cache systems to the download
time. SiteRadar comprises both user and administrator web-based interfaces,
geographically distributed monitoring agents, an agent manager, and a database
that can be accessed by a report engine for querying and viewing reports.
Each agent has a list of sites to measure; these would include benchmark
sites as well as sites covered by an SLA.
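One way to separate contributions to download time is to timestamp the phases of a fetch and report each phase separately. The sketch below assumes an agent records five timestamps per fetch; the phase names and the `Timestamps` structure are hypothetical and are not taken from SiteRadar.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Timestamps:
    """Times in seconds recorded by an agent during one page fetch."""
    start: float
    dns_done: float
    connected: float
    first_byte: float
    done: float


def phase_durations(t: Timestamps) -> Dict[str, float]:
    """Split total download time into consecutive, non-overlapping phases."""
    return {
        "dns": t.dns_done - t.start,          # name resolution
        "connect": t.connected - t.dns_done,  # TCP connection set-up
        "wait": t.first_byte - t.connected,   # server or cache think time
        "transfer": t.done - t.first_byte,    # content delivery
    }


sample = Timestamps(start=0.0, dns_done=0.25, connected=0.5, first_byte=1.25, done=2.0)
phases = phase_durations(sample)
total = sum(phases.values())
```

Under this assumed breakdown, the connect and transfer phases mostly reflect the network, while the wait phase mostly reflects the server or cache. Comparing the wait phase for a cached and an uncached fetch of the same object gives a rough view of the cache's contribution.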
4.3 VoIP SLA Management
CallMeter(TM) creates a synthetic call load to collect QoS information
for VoIP services, including end-to-end per-packet (round-trip) delay,
jitter, and packet loss. Each agent makes periodic calls to other agents,
and each pair of agents in a call records the performance statistics. The
flow in each direction is measured. The sampled approach provides a profile
of QoS for voice calls on the network where the agents are connected. The
current implementation supports H.323 and has been
tested against several commercial softswitches. Support for SIP (Session
Initiation Protocol) is planned.
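The per-packet statistics named above can be computed from packet send and receive times. The sketch below uses the running interarrival jitter estimator defined for RTP in RFC 3550 as one plausible choice; the text does not state which jitter definition CallMeter uses, and the packet tuple layout and sample values are assumptions. Call set-up signaling is not modeled.

```python
from typing import List, Optional, Tuple

# Each received packet: (sequence number, send time, receive time), times in ms.
Packet = Tuple[int, float, float]


def loss_fraction(packets: List[Packet], sent_count: int) -> float:
    """Fraction of sent packets that never arrived."""
    received = len({seq for seq, _, _ in packets})
    return (sent_count - received) / sent_count


def interarrival_jitter(packets: List[Packet]) -> float:
    """Running jitter estimate in the style of RFC 3550.

    For consecutive packets, D is the change in transit time, and the
    estimate is updated as J = J + (|D| - J) / 16.
    """
    jitter = 0.0
    previous_transit: Optional[float] = None
    for _, sent, received in packets:
        transit = received - sent
        if previous_transit is not None:
            jitter += (abs(transit - previous_transit) - jitter) / 16.0
        previous_transit = transit
    return jitter


# Four packets were sent; packet 3 was lost and packet 4 arrived 16 ms late.
packets: List[Packet] = [(1, 0.0, 40.0), (2, 20.0, 60.0), (4, 60.0, 116.0)]
jitter = interarrival_jitter(packets)
loss = loss_fraction(packets, sent_count=4)
```

Because the estimator uses only differences in transit time, a constant clock offset between the two agents cancels out. Measuring absolute one-way delay would require synchronized clocks, which is one reason to report round-trip delay instead.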
4.4 Streaming Service Measurement
StreamMeter(TM) agents request stream delivery using
RTSP (Real Time Streaming Protocol). Per-packet measurements are made
for streaming servers supporting RTP (Real-time Transport Protocol); measurements
of Real's proprietary encoded streaming servers can be made at
an aggregate level.
Figure 5: Modular agent architecture for measurement agent
5 Customer Care Automation
Email is emerging as an important channel for ISP customer care. In
general, some amount of automatic processing for routing and ticketing
can be done by keyword analysis. Specific information (e.g., URLs, IP
addresses, etc.) can be gathered to facilitate subsequent diagnostics.
System utilities can also be automatically invoked to save time for Level
2 and 3 customer-care specialists. The results of the utilities can be
stored in the ticket with the complaint. Further automation could be obtained
by the use of natural language processing of incoming complaints, but there
is limited use of NLP for customer care email processing today.
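Keyword-based routing and extraction of diagnostic details can be sketched with a lookup table and two regular expressions. The queue names, keyword lists, and patterns below are illustrative assumptions, not the rules used by any deployed system.

```python
import re
from typing import Dict, List

# Hypothetical routing table: queue name -> keywords that suggest it.
ROUTING_KEYWORDS: Dict[str, List[str]] = {
    "abuse": ["spam", "unsolicited"],
    "security": ["port scan", "intrusion", "virus"],
}

# Deliberately simple patterns; the IPv4 one does not check octet ranges.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def route(text: str, default: str = "general") -> str:
    """Pick the first queue whose keyword appears in the message text."""
    lowered = text.lower()
    for queue, keywords in ROUTING_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return queue
    return default


def extract_details(text: str) -> Dict[str, List[str]]:
    """Collect URLs and IPv4 addresses to attach to the ticket."""
    return {
        "urls": URL_PATTERN.findall(text),
        "ips": IPV4_PATTERN.findall(text),
    }


complaint = "I keep getting spam from 192.0.2.15 advertising http://example.com/offer today."
queue = route(complaint)
details = extract_details(complaint)
```

Substring matching is crude and will misroute some messages, which is the limitation that natural language processing would address.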
The Customer Support Center (CSC) for a Tier-1 ISP [see Bowie
(01)] receives about 30,000 email complaints per month. The majority
are complaints forwarded by individuals on the Internet who have received
UCE (unsolicited commercial email), or spam, from sources that may be in
the ISP's network. Another significant portion is from customers who have detected
a security issue. The ISP needs to validate each complaint and take corrective
action if the problem is within the ISP's network. A system, SpamCheck(TM),
was developed to replace the manual process.
The SpamCheck system is concerned with categorizing the complaint so
that it can be properly ticketed and remedies taken. There is an extensive
set of business rules for this processing. Each complaint email typically contains multiple embedded
or forwarded emails. Special processing beyond that normally needed for
internet email is needed, and this processing is not done by any commercial
email tools or spam analyzers today. There are many structural variations
in the incoming email, and a portion of the embedded spam has frequently
been manipulated by the spammer to make analysis and therefore tracing
difficult. SpamCheck also performs keyword analysis.
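The embedded-message step can be illustrated with Python's standard `email` package. This is a simplified sketch, not SpamCheck's algorithm: it assumes the spam headers were pasted as plain text into the complaint body, and the `find_embedded_message` and `received_chain` helpers, the header list, and the sample message are all hypothetical.

```python
import re
from email import message_from_string
from email.message import Message
from typing import List, Optional

# Header names that commonly begin a pasted header block (an assumed list).
HEADER_LINE = re.compile(r"^(Received|Return-Path|From|Message-ID):", re.IGNORECASE)


def find_embedded_message(body: str) -> Optional[Message]:
    """Parse the first pasted header block found in a complaint body."""
    lines = body.splitlines()
    for index, line in enumerate(lines):
        if HEADER_LINE.match(line):
            return message_from_string("\n".join(lines[index:]))
    return None


def received_chain(message: Message) -> List[str]:
    """Return Received headers, most recent hop first, whitespace collapsed."""
    return [" ".join(value.split()) for value in message.get_all("Received", [])]


COMPLAINT = """\
From: user@example.org
To: abuse@isp.example
Subject: Fwd: spam complaint

Please stop this sender. Headers follow.

Received: from mail.isp.example (mail.isp.example [192.0.2.15])
    by mx.example.org; Mon, 5 Mar 2001 10:00:00 -0500
Received: from unknown (HELO forged.example) (198.51.100.7)
    by mail.isp.example; Mon, 5 Mar 2001 09:59:58 -0500
From: offers@forged.example
Subject: Great offer

Buy now.
"""

outer = message_from_string(COMPLAINT)
embedded = find_embedded_message(outer.get_payload())
hops = received_chain(embedded) if embedded is not None else []
```

The last hop in the chain is the one closest to the spam's origin, so it is the natural starting point for checking whether the source lies inside the ISP's network. Detecting forged or manipulated `Received` lines requires additional consistency checks that this sketch does not attempt.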
Figure 6: SpamCheck receives email service complaints, identifies
embedded email structure, checks for header manipulation, extracts key
fields for analysis and reporting, and correlates related occurrences to
create a single ticketing event
Automation of email complaint processing is complicated by a number
of factors including: 1) complaints concatenate multiple text paragraphs
and RFC 822 headers, 2) some spammers camouflage their messages to circumvent
detection or correlation, 3) spam categorization depends in part on analysis
of the content of the spam message. The system is designed to work correctly
on 90% of the cases, and provides facilities for manual review of each
A high level view of the system is shown [see Fig. 6].
On the left is an email forwarded by the ISP email server
6 Summary
OSSes and associated business processes represent a significant part
of the cost of providing network services. Manual processes, though flexible
and relatively easy to implement, do not scale. On the other hand, OSSes
may be well-designed for the original function but are frequently difficult
to change, particularly in the face of unexpected network evolution and
later in the lifecycle of the system. As long as the telecommunications
industry faces rapid technology evolution, these will remain challenges
for service management.
Acknowledgements
Contributors to the Noventra management framework include: J. McCann,
T. Collins, Y. Liu, and I. Behar. Contributors to the service measurement
suite include: J. Zhang, S. Chang, H. Shang, and M. Li. Contributors to
the customer care tool set include: X. Huang and M. Mady. Collaboration
with G. Jakobson is appreciated. The following are trademarks of Verizon:
SpamCheck, SiteRadar, StreamMeter, CallMeter, Noventra.
References
[Ahn (99)] Ahn, S. J., Yoo, S. K., Chung, J. W.:
"Design and Implementation of a Web-Based Internet Performance Management
System Using SNMP MIB-II"; IJNM (Intl. J. of Network Management),
9 (1999), 309-321.
[Baentsch (97)] Baentsch, M., Baum, L., Molter,
G., Rothkugel, S., Sturm, P.: "World Wide Web Caching: The Application
Level View of the Internet"; IEEE Communications Magazine, (1997),
[Bowie (01)] Bowie, D., Buford, J., Huang, X.,
Mady, M.: "Automated Email Ticketing For ISP Customer-Care";
Technical Report, Verizon Laboratories, March 2001.
[Buford (01)] Buford, J., Jakobson, G.: "Managing
Dynamic IP Services: Correlation-Based Scenarios and Architecture";
Proc. Enterprise Networking 2001. Atlanta (2001).
[Fuller (99)] Fuller, W.: "Network Management Using
Expert Diagnostics"; IJNM (Intl. J. of Network Management), 9 (1999),
[Garg (98)] Garg, V., Ness-Cohn, D., Powers, T.,
and Schenkel, L.: "Directions for Element Managers and Network Managers";
IEEE Communications Magazine, 10 (1998), 132-138.
[Huitema (99)] Huitema, C., Cameron, J., Mouchtaris,
P., Smyk, D.: "Architecture for Internet Telephony Service for Residential
Customers"; Bellcore Technical Report, (1999).
[Jakobson (00)] Jakobson, G., Weissman, M., Brenner,
L., Lafond, C., Matheus, C.: "Building Next Generation Event Correlation
Services"; Proc. NOMS 2000, Honolulu (2000).
[Jiang (00)] Jiang, Y., Tham, C.-K., Ko, C.-C.:
"Challenges and Approaches in Providing QoS Monitoring"; IJNM
(Intl. J. of Network Management), 10 (2000), 323-334.
[Mitra (00)] Mitra, D., Sahin, K. E., Sethi, R.,
Silberschatz, A.: "New Directions in Service Management"; Bell
Labs Technical Journal, Jan-Mar (2000), 17-34.
[Modarressi (00)] Modarressi, A., Mohan, S.: "Control
and Management in Next Generation Networks: Challenges and Opportunities";
IEEE Communications Magazine, 10 (2000), 94-102.
[Moyer (01)] Moyer, S., Umar, A.: "The Impact
of Network Convergence on Telecommunications Software"; IEEE Communications
Magazine, 1 (2001), 78-84.
[Muller (99)] Muller, N.: "Managing Service
Level Agreements"; IJNM (Intl. J. of Network Management), 9 (1999),
[Murray (01)] Murray, M., Claffy, K.C.: "Measuring
the Immeasurable: Global Internet Measurement Infrastructure"; PAM
2001 (Workshop on Passive and Active Measurements), Amsterdam (2001).
[Papavassiliou (00)] Papavassiliou, S., and Pace,
M.: "From Service Configuration Through Performance Monitoring to
Fault Detection: Implementing An Integrated and Automated Network Maintenance
Platform For Enhancing Wide Area Transaction Access Services"; IJNM
(Intl. J. of Network Management), 10 (2000), 241-269.
[Paxson (98)] Paxson, V., Mahdavi, J.: "An
Architecture for Large-Scale Internet Measurement"; IEEE Communications
Magazine, 8 (1998), 48-54.
[Schwartz (00)] Schwartz, B., Jackson, A.,
Strayer, W.T., Zhou, W., Rockwell, R. D., Partridge, C.: "Smart Packets:
Applying Active Networks to Network Management"; ACM Trans. on Computer
Systems, 18, 1 (2000), 67-88.
[Tsai (98)] Tsai, C.-W., Chang, R.-S.: "SNMP
through WWW"; IJNM (Intl. J. of Network Management), 8 (1998), 104-119.
[Wang (99)] Wang, J.: "A Survey of Web Caching
Schemes for the Internet"; Computer Communications Review, (1999),
[Wright (99)] Wright, M.: "Using Policies
for Effective Network Management"; IJNM (Intl. J. of Network Management),
9 (1999), 118-125.