Baselines and Performance Metrics
Key Takeaways
- Baselines define normal network behavior so operators can detect anomalies, plan capacity, and tune alert thresholds.
- Common performance metrics include bandwidth, utilization, latency, jitter, packet loss, errors, discards, CPU, memory, and wireless signal quality.
- Metrics must be interpreted in context because a value that is normal for one link, site, or application may be abnormal for another.
- Thresholds should avoid excessive noise while still alerting before users experience major impact.
- Trend data helps teams identify capacity needs before links, devices, or wireless cells become saturated.
Monitoring becomes more useful when current values can be compared with known-good behavior. A baseline describes what normal looks like for a specific network, device, link, application, or time period.
Why Baselines Matter
Baselines support troubleshooting, alert tuning, capacity planning, change validation, and anomaly detection. If a WAN link is normally 40 percent utilized at peak time and suddenly runs at 95 percent, the gap from the baseline tells the team that something has changed and where to start looking. Without a baseline, the same 95 percent reading could be either routine or a symptom, and there is no quick way to tell which.
Baselines should capture business cycles. A school, clinic, warehouse, retail site, and software office may all have different busy periods. Baselines should also be refreshed after major changes such as new applications, link upgrades, wireless redesigns, or routing changes.
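The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the function names, the sample history, and the tolerance of three standard deviations are all assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_hourly_baseline(samples):
    """Group (hour_of_day, utilization_percent) samples by hour and
    return {hour: (average, standard_deviation)}."""
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {hour: (mean(values), pstdev(values)) for hour, values in by_hour.items()}


def is_anomalous(baseline, hour, value, tolerance=3.0, min_spread=1.0):
    """Flag a reading that differs from the hourly average by more than
    `tolerance` standard deviations. `min_spread` (an assumption) keeps a
    very flat history from making every small change look abnormal."""
    average, spread = baseline[hour]
    return abs(value - average) > tolerance * max(spread, min_spread)


# Hypothetical history: peak-hour (14:00) and overnight (02:00) utilization.
history = [(14, 38.0), (14, 40.0), (14, 42.0), (14, 40.0), (2, 5.0), (2, 7.0)]
baseline = build_hourly_baseline(history)

print(is_anomalous(baseline, 14, 95.0))  # far above the 40 percent norm
print(is_anomalous(baseline, 14, 41.0))  # within normal variation
```

Keeping a separate baseline per hour is one simple way to respect business cycles; a real deployment might also key on day of week or site.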
Core Network Metrics
| Metric | Meaning | Common concern |
|---|---|---|
| Bandwidth | Maximum available capacity of a path | Link may be undersized |
| Utilization | Percentage of capacity currently used | Congestion when sustained near limits |
| Latency | Time for traffic to travel between points | Slow response, voice or video issues |
| Jitter | Variation in latency | Voice and video quality problems |
| Packet loss | Packets that do not arrive | Retransmissions, poor call quality, application failures |
| Errors | Frames or packets with physical or link-layer problems | Cabling, optics, duplex, or hardware issues |
| Discards | Packets dropped intentionally or due to queue pressure | Congestion or policy behavior |
| CPU and memory | Device resource use | Control plane or management instability |
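Several of these metrics are derived from raw measurements rather than read directly. The sketch below shows one way to compute utilization from two interface octet counter readings, and average latency, jitter, and packet loss from a series of probe results. The function names and sample values are hypothetical, counter wrap is ignored, and jitter is calculated here as the mean difference between consecutive round-trip times, which is a simplification of the smoothed estimators that real tools use.

```python
from statistics import mean


def utilization_percent(octets_start, octets_end, interval_seconds, link_bps):
    """Percentage of link capacity used between two octet counter readings.
    Assumes the counter did not wrap during the interval."""
    bits_transferred = (octets_end - octets_start) * 8
    return 100.0 * bits_transferred / (interval_seconds * link_bps)


def probe_summary(rtts_ms):
    """Summarize probe results. Each entry is a round-trip time in
    milliseconds, or None if the probe was lost."""
    received = [rtt for rtt in rtts_ms if rtt is not None]
    loss_percent = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    avg_latency = mean(received) if received else None
    # Simplified jitter: average change between consecutive replies.
    changes = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = mean(changes) if changes else 0.0
    return {"avg_latency_ms": avg_latency, "jitter_ms": jitter, "loss_percent": loss_percent}


# 375,000,000 octets in 60 seconds on a 100 Mbps link.
print(utilization_percent(0, 375_000_000, 60, 100_000_000))

# Five probes, one lost; latency swings between 20 ms and 30 ms.
print(probe_summary([20, 30, None, 20, 30]))
```

The probe example shows why average latency alone can mislead: a 25 ms average looks acceptable, yet the 10 ms swings and 20 percent loss would be audible on a voice call.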
Wireless Metrics
| Metric | What it indicates |
|---|---|
| RSSI | Received signal strength |
| SNR | Signal strength relative to the noise floor |
| Channel utilization | Airtime consumption on a channel |
| Client count | Number of clients associated to an AP or radio |
| Retries | Wireless frames sent again due to loss or interference |
| Roaming events | Movement between APs |
Wi-Fi is a shared medium, so wireless performance depends on more than uplink capacity. Weak signal, a high retry rate, or a busy channel can degrade the user experience even when the wired uplink is not saturated.
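As a rough illustration of how these wireless metrics combine, the sketch below computes SNR as RSSI minus the noise floor (both in dBm) and lists which conditions look unhealthy for a client. The cutoff values are placeholders assumed for this example only; acceptable limits vary by vendor, band, and application.

```python
def snr_db(rssi_dbm, noise_floor_dbm):
    """SNR in dB is the received signal strength minus the noise floor."""
    return rssi_dbm - noise_floor_dbm


def flag_client(rssi_dbm, noise_floor_dbm, retry_percent, channel_util_percent,
                min_snr_db=20, max_retry_percent=15, max_channel_util_percent=70):
    """Return the reasons a client may have a poor experience.
    All three limits are illustrative assumptions, not standard values."""
    reasons = []
    if snr_db(rssi_dbm, noise_floor_dbm) < min_snr_db:
        reasons.append("low SNR")
    if retry_percent > max_retry_percent:
        reasons.append("high retries")
    if channel_util_percent > max_channel_util_percent:
        reasons.append("busy channel")
    return reasons


print(snr_db(-65, -90))                # 25 dB
print(flag_client(-80, -90, 25, 40))   # weak signal and many retries
print(flag_client(-60, -92, 5, 30))    # nothing flagged
```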
Thresholds and Alerts
Thresholds should be based on operational impact and historical behavior. A short utilization spike may be harmless, while sustained packet loss on a voice VLAN may be urgent. Alert rules should include severity, duration, affected service, and escalation path.
| Alert pattern | Example |
|---|---|
| Static threshold | WAN utilization above 90 percent for 15 minutes |
| Baseline deviation | Traffic is 3 times normal for this time of day |
| Rate of change | DHCP scope usage increasing unusually fast |
| Composite condition | High latency plus packet loss plus interface errors |
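The alert patterns in the table can be expressed as small checks. The following is a sketch under stated assumptions: samples arrive at a fixed polling interval (so 15 minutes at 5-minute polling is three consecutive samples), and the function names and data are made up for illustration rather than taken from any monitoring product.

```python
def sustained_above(samples, threshold, min_consecutive):
    """Static threshold with duration: true only if the value stays above
    the threshold for `min_consecutive` samples in a row."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


def deviates_from_baseline(current, baseline, factor=3.0):
    """Baseline deviation: current value is at least `factor` times normal."""
    return baseline > 0 and current >= factor * baseline


def composite_alert(latency_high, loss_high, errors_high):
    """Composite condition: fire only when all symptoms appear together."""
    return latency_high and loss_high and errors_high


# WAN utilization polled every 5 minutes (hypothetical values).
brief_spike = [95, 50, 96, 97, 80]
sustained = [95, 50, 96, 97, 98]

print(sustained_above(brief_spike, 90, 3))   # False: never 3 in a row
print(sustained_above(sustained, 90, 3))     # True: 15 minutes above 90
print(deviates_from_baseline(current=300, baseline=90))
```

Requiring consecutive samples is what keeps a short, harmless spike from paging anyone while still catching sustained congestion.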
Capacity and Trend Analysis
Trend analysis looks at metric history to forecast when a resource will need attention. It can identify a WAN circuit that will exceed normal capacity next quarter, an access switch with rising PoE load, a DHCP scope nearing exhaustion, or an AP that regularly carries too many clients.
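One simple way to forecast is to fit a straight line to past readings and estimate when it crosses a limit. The sketch below uses an ordinary least-squares fit in plain Python. The monthly figures, the 90 percent limit, and the function names are assumptions for illustration, and real growth is rarely perfectly linear, so the result should be treated as a planning estimate only.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(limit, xs, ys):
    """Periods after the last sample until the fitted trend reaches `limit`.
    Returns None when the trend is flat or falling."""
    slope, intercept = linear_fit(xs, ys)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - xs[-1])


# Hypothetical monthly peak WAN utilization, in percent.
months = [0, 1, 2, 3, 4, 5]
peak_utilization = [50, 54, 58, 62, 66, 70]

print(periods_until(90, months, peak_utilization))  # months until 90 percent
```

The same approach applies to PoE load, DHCP scope usage, or AP client counts by swapping in the relevant history and limit.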
Practical Scenario
After a new cloud backup agent is deployed, users report slow internet access each afternoon. Monitoring shows WAN utilization is normally 45 percent at that time, but now reaches 98 percent for two hours. Flow data confirms backup traffic dominates the link. The fix may involve backup scheduling, QoS, bandwidth upgrade, or traffic engineering.
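The flow-data step in this scenario amounts to totaling bytes per application and ranking the results. A minimal sketch follows, assuming flow records have already been exported and reduced to hypothetical (application, bytes) pairs; real flow records carry many more fields.

```python
from collections import Counter


def traffic_share(flows):
    """Total bytes per application and return (application, percent_of_total)
    pairs, largest first."""
    totals = Counter()
    for application, byte_count in flows:
        totals[application] += byte_count
    grand_total = sum(totals.values())
    return [(app, 100.0 * count / grand_total) for app, count in totals.most_common()]


# Hypothetical afternoon flow records, in megabytes.
flows = [("backup", 700), ("web", 200), ("backup", 50), ("voice", 50)]

for application, percent in traffic_share(flows):
    print(f"{application}: {percent:.0f}%")
```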
Common Exam Traps
| Trap | Better exam reasoning |
|---|---|
| "High bandwidth means low latency." | Bandwidth and latency are different metrics. |
| "A single metric proves root cause." | Correlate utilization, errors, loss, logs, and user impact. |
| "All alerts should fire immediately." | Duration and severity reduce noise from brief harmless spikes. |
| "Baselines never change." | Refresh baselines after major changes and business shifts. |
Quick Drill
Choose the metric:
- Variation in delay that affects voice calls: jitter.
- Frames damaged on a copper link: errors.
- Packets dropped because queues are full: discards or packet loss.
- Airtime used on a Wi-Fi channel: channel utilization.
- Whether current traffic is abnormal for 2 p.m.: baseline comparison.
Voice calls sound choppy even though average latency is acceptable. Which metric is most directly associated with variation in delay?
Which metrics can indicate physical or link-layer problems? Choose two.
Why should baselines be refreshed after a major application migration?