Baselines and Performance Metrics
Key Takeaways
- A baseline captures normal behavior for a specific link, device, or time so deviations stand out; refresh it after major changes.
- Bandwidth is capacity and latency is delay — they are independent, and a high-bandwidth link can still have high latency.
- Jitter (variation in latency) and packet loss are the metrics most directly tied to poor voice and video quality.
- Interface errors, CRC errors, and discards point to physical, duplex, or congestion problems rather than application bugs.
- Wireless health is judged by RSSI, SNR, and channel utilization; aim for an RSSI of roughly -67 dBm or better and an SNR of at least 20 dB for data.
Monitoring data only becomes actionable when you can compare it against normal. A baseline is a documented record of normal behavior for a specific network, link, device, application, or time window. With a baseline, a reading of 95 percent utilization is alarming; without one, the same number is just a number.
Why baselines matter
Baselines drive five activities the exam cares about: troubleshooting, alert threshold tuning, capacity planning, change validation, and anomaly detection. If a WAN link normally runs 40 percent at peak and suddenly hits 95 percent, the gap is the clue. Baselines must capture business cycles — a school, clinic, warehouse, and 24x7 SaaS office all peak at different hours — and must be refreshed after major changes such as a new application rollout, a circuit upgrade, a wireless redesign, or a routing change. A stale baseline produces false alarms or, worse, masks a real regression.
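As a minimal sketch of the idea, a per-hour baseline can be built by averaging historical samples for each hour of the day and then subtracting it from a current reading. The function names and the sample data below are hypothetical, and a real monitoring tool would likely use percentiles and separate weekday and weekend profiles.

```python
from collections import defaultdict
from statistics import mean


def build_hourly_baseline(samples):
    """Average utilization per hour of day.

    samples: iterable of (hour_of_day, utilization_percent) pairs.
    Returns a dict mapping hour -> mean utilization for that hour.
    """
    by_hour = defaultdict(list)
    for hour, util in samples:
        by_hour[hour].append(util)
    return {hour: mean(values) for hour, values in by_hour.items()}


def deviation_from_baseline(baseline, hour, current):
    """Gap between a current reading and normal, in percentage points."""
    return current - baseline[hour]


# Hypothetical history: three 2 p.m. samples and two 3 a.m. samples.
history = [(14, 38.0), (14, 40.0), (14, 42.0), (3, 5.0), (3, 7.0)]
baseline = build_hourly_baseline(history)
gap = deviation_from_baseline(baseline, 14, 95.0)
print(f"2 p.m. baseline: {baseline[14]}%, current gap: {gap} points")
```

Keying the baseline by hour is what lets the same 95 percent reading be judged differently at 2 p.m. and at 3 a.m.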
Core wired metrics
| Metric | Meaning | Typical cause or symptom when abnormal |
|---|---|---|
| Bandwidth | Maximum capacity of a path (e.g., 1 Gbps) | Link is undersized for demand |
| Utilization | Percent of capacity in use | Congestion when sustained near limit |
| Latency | One-way or round-trip delay | Long paths, queuing, slow links |
| Jitter | Variation in latency | Choppy voice and video |
| Packet loss | Packets that never arrive | Retransmissions, dropped calls |
| CRC / frame errors | Frames failing checksum | Bad cable, optic, EMI, or duplex mismatch |
| Discards | Frames dropped intentionally | Full queues, congestion, policy |
| CPU / memory | Device resource use | Control-plane or management instability |
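To make the jitter row concrete, the sketch below measures jitter as the mean absolute difference between consecutive latency samples. This is a simplified illustration, not the smoothed estimator that RTP implementations use, and the sample values are invented.

```python
from statistics import mean


def mean_jitter_ms(latencies_ms):
    """Mean absolute change between consecutive latency samples."""
    if len(latencies_ms) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return mean(diffs)


# Two hypothetical paths with almost the same average latency.
steady = [40, 41, 40, 41, 40]
choppy = [10, 70, 10, 70, 40]

for name, samples in (("steady", steady), ("choppy", choppy)):
    print(name, "avg:", mean(samples), "ms, jitter:", mean_jitter_ms(samples), "ms")
```

Both paths average about 40 ms, but only the second would sound choppy on a voice call, which is why average latency alone is not enough.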
Bandwidth and latency are independent. A 10 Gbps satellite link can still have 600 ms latency. The exam loves the trap that "more bandwidth fixes latency"; it does not, because added capacity shortens serialization and queuing time but not propagation delay. Likewise, a duplex mismatch produces late collisions and CRC errors on a link that still answers pings, so check error counters and throughput, not just reachability.
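A rough back-of-the-envelope model shows why. Assuming delivery time is one-way latency plus serialization time (size divided by bandwidth), and ignoring queuing and protocol overhead, a single small packet crosses a slow low-latency link far sooner than a fast high-latency one. The function name and the LAN figures are assumptions for illustration.

```python
def delivery_time_s(size_bytes, bandwidth_bps, one_way_latency_s):
    """Simplified delivery time: latency plus serialization delay."""
    serialization_s = (size_bytes * 8) / bandwidth_bps
    return one_way_latency_s + serialization_s


PACKET_BYTES = 1500

# 10 Gbps satellite link, treating the 600 ms figure as one-way delay.
satellite = delivery_time_s(PACKET_BYTES, 10e9, 0.600)

# Hypothetical 100 Mbps LAN link with 5 ms of delay.
lan = delivery_time_s(PACKET_BYTES, 100e6, 0.005)

print(f"satellite: {satellite * 1000:.3f} ms, LAN: {lan * 1000:.3f} ms")
```

On the satellite path nearly all of the time is latency, so adding bandwidth changes almost nothing for interactive traffic.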
Wireless metrics
Wi-Fi is a shared medium, so one bad client degrades a whole cell. Key indicators:
| Metric | What it indicates | Healthy guideline |
|---|---|---|
| RSSI | Received signal strength (dBm, negative) | -67 dBm or better for voice/data |
| SNR | Signal-to-noise ratio (dB) | 20 dB+ for reliable data, 25 dB+ for voice |
| Channel utilization | Airtime consumed on the channel | Below ~50% before contention hurts |
| Retries | Frames resent due to loss/interference | High retries signal interference |
| Client count / roaming | Density and mobility | Overloaded AP if count spikes |
Remember RSSI is negative: -50 dBm is stronger than -75 dBm. A client at -80 dBm with high retries will feel slow even though the wired uplink is idle.
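The guidelines in the table can be turned into a quick check. SNR in dB is the received signal level minus the noise floor, both in dBm. The sketch below assumes a noise floor of -95 dBm, which is a hypothetical value; real environments vary, and the function names are illustrative.

```python
def snr_db(rssi_dbm, noise_floor_dbm):
    """SNR in dB is signal level minus noise floor (both in dBm)."""
    return rssi_dbm - noise_floor_dbm


def assess_client(rssi_dbm, noise_floor_dbm=-95):
    """Compare one client against the guideline values in the table above."""
    snr = snr_db(rssi_dbm, noise_floor_dbm)
    return {
        "snr_db": snr,
        "rssi_ok": rssi_dbm >= -67,  # closer to zero is stronger
        "data_ok": snr >= 20,
        "voice_ok": snr >= 25,
    }


weak = assess_client(-80)
strong = assess_client(-55)
print("weak client:", weak)
print("strong client:", strong)
```

Note that the comparison is `>=` on a negative number: -55 passes the -67 dBm check because it is closer to zero.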
Thresholds, alerts, and trends
Thresholds should reflect operational impact and duration, not raw values alone, to avoid alert fatigue.
| Alert pattern | Example rule |
|---|---|
| Static threshold | WAN utilization above 90% for 15 minutes |
| Baseline deviation | Traffic is 3x normal for this hour |
| Rate of change | DHCP scope filling unusually fast |
| Composite | High latency AND packet loss AND interface errors |
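The first two patterns and the composite pattern can be sketched in a few lines. The example assumes one utilization sample per minute, so 15 consecutive samples stand in for 15 minutes; the function names and sample data are hypothetical and not taken from any particular monitoring product.

```python
def sustained_above(samples, threshold, min_consecutive):
    """Static threshold with duration: True once enough samples in a row exceed it."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


def baseline_deviation(current, normal, factor=3.0):
    """Baseline deviation: True when current is at least `factor` times normal."""
    return normal > 0 and current >= factor * normal


def composite(*conditions):
    """Composite rule: fire only when every condition is true."""
    return all(conditions)


# One sample per minute (assumed polling interval).
brief_spike = [50] * 10 + [99] * 5 + [50] * 10
real_congestion = [50] * 5 + [95] * 15

print(sustained_above(brief_spike, 90, 15))      # five-minute spike is ignored
print(sustained_above(real_congestion, 90, 15))  # fifteen-minute run fires
```

Requiring a run of consecutive samples is what keeps a short spike from paging anyone.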
Trend analysis forecasts when a resource will need attention: a WAN circuit projected to exceed capacity next quarter, an access switch with rising PoE budget consumption, or a DHCP scope nearing exhaustion. This is proactive capacity planning rather than reactive firefighting.
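One simple way to produce such a forecast is to fit a straight line through recent peak readings and see where it crosses the planning limit. The sketch below uses an ordinary least-squares fit on hypothetical weekly peaks; it assumes growth is roughly linear, which real traffic often is not, so treat the result as a planning hint.

```python
def weeks_until_limit(weekly_peaks, limit):
    """Estimate weeks from the latest sample until a linear trend reaches `limit`.

    Returns None when there are too few samples or the trend is flat or falling.
    """
    n = len(weekly_peaks)
    if n < 2:
        return None
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(weekly_peaks) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, weekly_peaks)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    crossing_week = (limit - intercept) / slope
    return max(0.0, crossing_week - (n - 1))


# Hypothetical peak WAN utilization (percent) over five weeks.
peaks = [60, 62, 64, 66, 68]
print(weeks_until_limit(peaks, 90))
```

With peaks rising two points per week, the line reaches 90 percent about eleven weeks after the latest sample.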
Practical scenario
After a cloud backup agent is deployed, users report afternoon slowness. The baseline shows WAN utilization is normally 45 percent at 2 p.m., but monitoring now reports 98 percent for two hours. Flow data confirms backup traffic dominates. Remedies include rescheduling the backup, applying QoS to protect interactive traffic, upgrading bandwidth, or traffic shaping — the baseline made the diagnosis fast.
Reading errors versus discards
The exam distinguishes errors from discards because they have different causes. Errors are frames that arrived damaged — they failed the cyclic redundancy check, or were runts, giants, or alignment errors — and they almost always trace to a physical problem: a kinked or out-of-spec cable, a dirty fiber connector, electromagnetic interference, or a duplex mismatch where one side is half duplex. Discards, by contrast, are intact frames the device chose to drop, usually because an output queue was full during congestion or a policy such as an access control list or quality-of-service drop matched.
So rising errors say "check the physical layer," while rising discards say "check for congestion or policy." Treating them as interchangeable leads to replacing good cable when the real fix is more bandwidth or smarter queuing.
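That rule of thumb can be written as a small triage helper. The sketch assumes you already have the increase in each counter between two polls of the same interface; the function name and hint strings are illustrative only.

```python
def triage_interface(delta_errors, delta_discards):
    """Map counter increases between two polls to a first place to look."""
    hints = []
    if delta_errors > 0:
        hints.append("physical layer: check cable, optic, EMI, duplex settings")
    if delta_discards > 0:
        hints.append("congestion or policy: check queues, QoS, ACLs")
    return hints or ["no new errors or discards"]


print(triage_interface(delta_errors=120, delta_discards=0))
print(triage_interface(delta_errors=0, delta_discards=4500))
```

Working from deltas rather than lifetime totals matters, because counters accumulate since the last reset and an old burst of errors can look like a current fault.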
Interpreting utilization over time
A single utilization reading is nearly meaningless; the shape over time is what counts. A link that briefly touches 100 percent during a backup window is healthy, while a link that sits at 85 percent every business hour is a capacity problem waiting to surface as latency and loss. This is why thresholds combine a level with a duration — "above 90 percent for 15 minutes" filters out harmless spikes. Sampling interval matters too: averaging over five minutes can hide one-second microbursts that still overflow a switch buffer and cause discards.
When users complain of slowness but the five-minute graph looks fine, drop to a shorter interval or a packet capture to expose the burst.
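A small calculation shows how averaging hides a burst. The data below is invented: a five-minute window of per-second utilization that sits at 20 percent except for a single second at line rate.

```python
def summarize(per_second_util):
    """Return (average, peak) utilization for a window of per-second samples."""
    average = sum(per_second_util) / len(per_second_util)
    return average, max(per_second_util)


# 300 one-second samples: 299 quiet seconds and one microburst.
window = [20.0] * 299 + [100.0]
average, peak = summarize(window)
print(f"five-minute average: {average:.1f}%, one-second peak: {peak:.0f}%")
```

The five-minute graph would show a value barely above 20 percent, while the one-second view shows the link saturated for the moment that caused the discards.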
Common exam traps
- High bandwidth does not guarantee low latency; they are separate metrics.
- A single metric rarely proves root cause — correlate utilization, errors, loss, logs, and user impact.
- Errors are damaged frames (physical layer); discards are intact frames dropped by congestion or policy.
- RSSI is negative: closer to zero is stronger.
- Baselines are not permanent; refresh them after major changes and seasonal shifts.
Voice calls sound choppy even though average latency is within target. Which metric is most directly associated with the problem?
Which readings most directly point to a physical or link-layer fault rather than an application problem? Choose two.
A wireless client reports an RSSI of -80 dBm with frequent retries while another shows -55 dBm. What is the correct interpretation?
Why should a baseline be refreshed after a major application migration?