Network management and troubleshooting in the multimedia network


Tom Nolle
07.21.2007

Whatever the specific technology used in today's voice, data and video service provider networks, it is almost certain to be a packet architecture. Packet networks break information into small pieces -- packets -- which then share transmission resources. This process of sharing makes packet networks more efficient in their use of capacity, given that much network traffic is bursty in nature. However, it also creates potential collisions of resource demands. Packet networks normally employ a form of topology discovery and adaptive routing designed to improve resiliency, and this can make it difficult to manage network capacity and to determine the exact nature of problems when they occur.

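Sustaining a multimedia packet network, like any form of network management, is based on the classical FCAPS acronym: Fault, Configuration, Accounting, Performance, and Security. The "C" formally stands for Configuration; this tip reads it as Capacity, because capacity planning is the concern here. The order of consideration, however, doesn't match the acronym.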

Capacity and performance management
Capacity and performance management are the classical preemptive network management tasks, and security management is increasingly being added to the list. The purpose of all three is to create a stable framework for service delivery that will provide customers with good experiences under normal conditions. Each of these is divided into two phases, planning and management.

In the planning phase, network operations uses historical data, estimates of growth in existing services or demand for new ones, and other factors to establish a baseline plan. The plan must include metrics for comparing current network behavior against it, so that operators can determine whether its goals are being met. Taking that measurement is the network manager's central task.

The key tools for planning in the "CPS" part of the acronym are threshold measurements of certain network conditions, which would typically include the load levels of trunks and devices and the traffic generated at certain interface points. The purpose of these measurements is to test the conditions against the network plan and generate threshold alerts where there are indications of an unexpected condition. These alerts can be indications of an emerging issue (trunk loads may rise to a threshold level indicating that additional resources are required) or of an acute problem (trunk or device loads are approaching the level where performance is impacted). Thus, these planning and management tasks may also generate a need for troubleshooting.
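The two kinds of threshold alert described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names (Thresholds, classify_load, check_trunks) and the 70% and 90% utilization levels in the usage note are hypothetical, not values from any particular management system or network plan.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Thresholds:
    """Hypothetical utilization thresholds (fractions of capacity) from the plan."""
    planning: float  # above this, additional resources should be planned
    acute: float     # above this, performance is likely to be impacted


def classify_load(utilization: float, t: Thresholds) -> str:
    """Classify one utilization sample against the plan's thresholds."""
    if utilization >= t.acute:
        return "acute"
    if utilization >= t.planning:
        return "emerging"
    return "normal"


def check_trunks(samples: Dict[str, float], t: Thresholds) -> List[Tuple[str, str]]:
    """Return (trunk, level) alerts for every trunk that is not 'normal'."""
    alerts = []
    for trunk, utilization in sorted(samples.items()):
        level = classify_load(utilization, t)
        if level != "normal":
            alerts.append((trunk, level))
    return alerts
```

Calling check_trunks with a Thresholds(planning=0.70, acute=0.90) plan and samples of 0.50, 0.75 and 0.95 would flag the second trunk as an emerging issue and the third as an acute problem that may need troubleshooting.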

Good capacity and performance management plans and monitoring also recognize that the network may not always be operating in its optimum state, and that alternate states that reflect the balance of business goals should be defined to respond to failures or congestion. These are often called "failure modes." Failure mode planning is critical for multimedia networks that serve voice, data, and video because these three media are likely to have major differences in economic value and tolerance to out-of-specification network operations. One of the challenges of networks with fully adaptive behavior (routing, spanning tree bridging) is that they may not support explicit failure modes, but simply adapt to conditions in a variable way.

Troubleshooting and problem isolation
Where an acute load condition is detected or a fault is reported, the process shifts to fault management, popularly called "troubleshooting" or "problem isolation." The primary purpose of fault management, simply put, is to fix the problem. An equally important goal is to sustain network operation in a valid failure mode while the problem is resolved.

When a fault occurs, network operations personnel should first ensure that the network has entered a valid failure mode, meaning that the remaining network resources have been allocated according to business priorities. Video is usually the form of traffic with the greatest economic value and performance sensitivity, so failure modes should protect video performance, perhaps by blocking new video delivery requests so that current ones are sustained. Voice can be treated similarly. Most data applications are tolerant of some of the delay and packet loss associated with the resource congestion likely in failure modes, so data can often be given the lowest priority.
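A failure mode of this kind can be pictured as a strict-priority split of whatever capacity survives the fault, plus an admission check for new sessions. The Python below is a minimal sketch, assuming a single shared capacity figure in Mbps and a fixed video, voice, data ordering; the function names and the numbers in the usage note are hypothetical, and real networks would enforce this through their own QoS and admission-control mechanisms.

```python
from typing import Dict, Sequence


def allocate_failure_mode(
    capacity_mbps: float,
    demands: Dict[str, float],
    priority: Sequence[str] = ("video", "voice", "data"),
) -> Dict[str, float]:
    """Grant each traffic class its demand in priority order until capacity runs out."""
    remaining = capacity_mbps
    allocation = {}
    for traffic_class in priority:
        granted = min(demands.get(traffic_class, 0.0), remaining)
        allocation[traffic_class] = granted
        remaining -= granted
    return allocation


def admit_new_session(
    traffic_class: str,
    session_mbps: float,
    allocation: Dict[str, float],
    in_use: Dict[str, float],
) -> bool:
    """Admit a new session only if it fits in the class's allocation.

    Refusing the session protects the sessions already in progress.
    """
    used = in_use.get(traffic_class, 0.0)
    return used + session_mbps <= allocation.get(traffic_class, 0.0)
```

With 100 Mbps left after a fault and demands of 60 (video), 20 (voice) and 50 (data), video and voice are served in full and data absorbs the shortfall. A new 5 Mbps video request is refused if 58 Mbps of the video allocation is already in use.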

Once the network has settled on a failure mode, the network operations troubleshooting process is directed at finding the underlying problem. This can be approached through either reconstruction or status analysis.

Reconstruction means simply going back through alert messages to find the early period of the failure and then analyzing the events as they unfolded. It requires a log of network events with very accurate time stamps on alert messages, so that they can be processed in sequence. This approach is often the only way to find problems that arise from "soft" faults like congestion. It normally involves moving forward to the point where the problem is clearly visible, and then backtracking to determine what caused this pivotal event set.
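As a rough sketch of reconstruction, the Python below orders logged events by time stamp, moves forward to the first clearly visible problem, and then backtracks over a fixed window to collect the events that preceded it. The event layout (a dict with "time", "severity" and "msg" keys), the use of a "critical" severity to mark the pivotal event, and the five-minute window are all assumptions made for illustration, not the format of any real alert log.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

Event = Dict[str, object]


def reconstruct(
    events: List[Event],
    window: timedelta = timedelta(minutes=5),
) -> Tuple[Optional[Event], List[Event]]:
    """Find the pivotal event and the events in the window leading up to it."""
    # Accurate time stamps matter here: events may be logged out of order.
    ordered = sorted(events, key=lambda e: e["time"])

    # Move forward to the first point where the problem is clearly visible.
    pivot = next((e for e in ordered if e["severity"] == "critical"), None)
    if pivot is None:
        return None, []

    # Backtrack: collect everything logged in the window before the pivot.
    start = pivot["time"] - window
    precursors = [e for e in ordered if start <= e["time"] < pivot["time"]]
    return pivot, precursors
```

The precursor list is only a starting point for analysis; an operator would still need to judge which of those events actually caused the pivotal one.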

The problem with a reconstruction approach is that it may be time-consuming. When a problem is "hard," meaning that its state persists even when the network is in failure mode, status analysis may be a better approach. Examples of this sort of problem are a trunk failure (fiber cut) or a node fault (power, equipment).

Soft problems like congestion are usually attributable either to an abnormal line/node condition (that may be causing retransmission and loss of effective throughput, for example) or to excessive load. The latter may be caused by unexpected traffic or even by an attempted security breach, virus, etc. Where the problem is caused by abnormal resource behavior, isolating and fixing the failing resource is the goal. Where it is caused by traffic overload, it may be necessary to flow-manage some sources or, in the case of security violations, cut them off.

Hard problems should be diagnosed remotely to the greatest extent possible. For line problems, this means testing from both device endpoints, and for device problems testing each interface to determine the scope of the problem. The facilities to perform these tests vary widely; use whatever is available to its fullest extent.
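For device problems, the per-interface results can be summarized to show the scope of the fault. The following Python is a small sketch under the assumption that each remote test yields a simple pass/fail per interface; the function name, the interface names and the three scope labels are hypothetical, and the tests themselves depend on whatever facilities the equipment offers.

```python
from typing import Dict, List, Tuple


def fault_scope(interface_results: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Summarize per-interface test results (True = passed) into a fault scope."""
    failed = sorted(name for name, passed in interface_results.items() if not passed)
    if not failed:
        return "no fault found", failed
    if len(failed) == len(interface_results):
        # Every interface fails: suspect the device itself (power, equipment).
        return "device-wide", failed
    # Only some interfaces fail: suspect those interfaces or their lines.
    return "interface-level", failed
```

A device-wide result points toward the node, while an interface-level result narrows attention to specific ports or the lines attached to them, which can then be tested from the far end as well.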

Dispatching someone to a location to fix a device, or to reconfigure or restart it manually, is a last resort, since it is costly and prolongs the period of abnormal network operation. Where it is necessary, be sure to fully isolate the device first, then undertake the repair, recommission and test, and only then bring the device back to operational status. In some extreme cases it may be necessary to bring devices or trunks back online during off periods to avoid adaptive reconfiguration and performance instability.

Any time a problem must be managed, it must be carefully logged. Effective network operations depend on analyzing the response to problems and tuning procedures so that downtime and cost are minimized if the problem recurs.

About the author: Tom Nolle is president of CIMI Corporation, a strategic consulting firm specializing in telecommunications and data communications since 1982. He is a member of the IEEE, ACM and the IPsphere Forum, and the publisher of Netwatcher, a journal in advanced telecommunications strategy issues. Tom is actively involved in LAN, MAN and WAN issues for both enterprises and service providers and also provides technical consultation to equipment vendors on standards, markets and emerging technologies.

All Rights Reserved, Copyright 2007 - 2008, TechTarget