Troubleshooting networks is more important than ever. As time goes on,
services continue to be added to networks, and each added service introduces
new variables. This increases the complexity of the network itself as well as
the difficulty of troubleshooting it. Organizations increasingly depend on
network administrators and network engineers having strong troubleshooting
skills.
Effective troubleshooting begins with a methodology that breaks the process
into manageable pieces. This permits a systematic approach, minimizes
confusion, and cuts down on time otherwise wasted on trial-and-error
troubleshooting.

Network engineers, administrators, and support personnel realize that
troubleshooting is the process that takes the greatest percentage of their
time. One of the primary goals of this module is to present efficient
troubleshooting techniques in order to shorten overall troubleshooting time
when working in a production environment.

Two extreme approaches to troubleshooting almost always result in
disappointment, delay, or failure. On one extreme is the theorist, or rocket
scientist, approach. On the other is the impractical, or caveman, approach.
Because both are extremes, the better approach lies somewhere in the middle and
uses elements of both.

The rocket scientist analyzes and re-analyzes the situation until the exact
root cause of the problem has been identified and corrected with surgical
precision. This sometimes requires using a high-end protocol analyzer to
collect a huge sample, possibly megabytes, of the network traffic while the
problem is present. The sample is then inspected in minute detail. While this
process is fairly reliable, few companies can afford to have their networks
down for the hours, or days, that this exhaustive analysis can take.

The caveman’s first instinct is to start swapping cards, cables, hardware, and
software until the network miraculously begins operating again. This does not
mean that the network is working properly, just that it is operating.
Unfortunately, the troubleshooting section in some manuals actually recommends
caveman-style procedures as a way to avoid providing more technical
information. While this approach may change the symptoms faster, it is not very
reliable, and the root cause of the problem may still be present. In fact, the
parts used for swapping may include marginal or failed parts that were swapped
out during prior troubleshooting episodes.

Analyze the network as a whole rather than in a piecemeal
fashion. One technician following a logical sequence will almost always be more
successful than a gang of technicians, each with their own theories and methods
for troubleshooting.
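
The idea of one technician following a logical sequence can be illustrated
with a minimal sketch. It assumes each troubleshooting step can be expressed
as a yes/no check and that the checks run in a fixed order, stopping at the
first failure. The function name first_failure, the check descriptions, and
their placeholder results are all hypothetical and are not taken from this
module.

```python
from typing import Callable, List, Optional, Tuple

# Each check is a (description, function) pair. The function returns True
# when that part of the network looks healthy. All names are hypothetical.
Check = Tuple[str, Callable[[], bool]]


def first_failure(checks: List[Check]) -> Optional[str]:
    """Run the checks in their fixed order and return the first that fails."""
    for description, check in checks:
        if not check():
            return description
    return None  # every check passed


# Placeholder checks standing in for real tests; the hard-coded results
# simulate a fault at the third step.
example_checks: List[Check] = [
    ("cable and link status", lambda: True),
    ("IP addressing", lambda: True),
    ("default gateway reachable", lambda: False),
    ("name resolution", lambda: True),
]

print(first_failure(example_checks))
```

Because the order is fixed, the sketch always reports the same first failing
step for the same fault, which is the property that ad hoc part swapping
lacks.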