Asset Inventory, Baselines, SLAs, SOPs, and Runbooks
Key Takeaways
- Asset inventory (often a CMDB) records what exists, where it is, who owns it, its support status, software version, and criticality — every other operation depends on it.
- A baseline is documented normal behavior (configuration, performance, traffic, wireless), measured during representative business cycles, used to detect drift and anomalies — it is not a hard maximum.
- An SLA defines service expectations (availability, response/resolution time, hours, escalation); SLOs are measurable targets and OLAs are internal inter-team agreements.
- Standard Operating Procedures standardize routine repeatable work; runbooks give task-specific steps with commands, validation, decision points, rollback, and escalation contacts.
- Operational records must be version controlled, owned, reviewed, and tested — an untested runbook can fail during a real outage.
Operations Records You Must Distinguish
Network operations gets easier when a team can quickly answer four questions: what devices exist, what normal looks like, what service level is promised, and what procedure to follow. N10-009 objective 3.1 lists asset inventory, baselines, SLAs, and procedural documents together, and the exam loves to make you pick the one record that resolves a scenario. Because Domain 3 accounts for roughly one in five questions, knowing these distinctions cold is high-yield.
Asset Inventory and the CMDB
An asset inventory is the authoritative record of the devices and related components that make up the network. It may live in a Configuration Management Database (CMDB), an IPAM platform, an endpoint management system, or a ticketing system.
| Inventory field | Example |
|---|---|
| Asset ID | NET-SW-IDF2-004 |
| Device type | Access switch |
| Manufacturer / model | Vendor model and hardware revision |
| Serial number | Vendor serial for support/RMA |
| Location | Building, floor, room, rack, RU |
| Owner | Network operations team |
| Support status | Covered by contract through a stated date |
| Software version | Current OS or firmware build |
| Criticality | Core, distribution, access, lab, or spare |
Inventory quality matters because patch planning, contract renewal, vulnerability response, incident triage, capacity planning, and decommissioning all start with knowing what exists. If you cannot find a serial number during an outage, opening a vendor support case or RMA is delayed at exactly the moment time matters most.
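To make the inventory fields concrete, here is a minimal Python sketch of a record and two lookups an operator might need during an outage. Only the asset ID comes from the table above; the class name, field names, helper functions, model, serial number, support date, and version are hypothetical placeholders, not the schema of any particular CMDB.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Asset:
    """One inventory record; fields mirror the table above."""
    asset_id: str
    device_type: str
    model: str
    serial_number: str
    location: str
    owner: str
    support_end: date      # last day covered by the support contract
    software_version: str
    criticality: str       # e.g. "core", "distribution", "access", "lab", "spare"


def find_asset(inventory: list[Asset], asset_id: str) -> Optional[Asset]:
    """Return the record for asset_id, or None if it is not inventoried."""
    return next((a for a in inventory if a.asset_id == asset_id), None)


def is_supported(asset: Asset, today: date) -> bool:
    """True while the support contract still covers the device."""
    return today <= asset.support_end


# Hypothetical record: every value except the asset ID is made up.
inventory = [
    Asset(
        asset_id="NET-SW-IDF2-004",
        device_type="Access switch",
        model="EXAMPLE-48P rev B",
        serial_number="SN-EXAMPLE-0001",
        location="Building A, floor 2, IDF2, rack 1, RU 20",
        owner="Network operations team",
        support_end=date(2026, 12, 31),
        software_version="1.2.3",
        criticality="access",
    )
]

failed = find_asset(inventory, "NET-SW-IDF2-004")
if failed is not None:
    # Serial number and contract status are what a vendor case usually asks for.
    print(failed.serial_number, is_supported(failed, date(2026, 6, 1)))
```

A lookup that returns `None` is itself useful information: it means the device was never inventoried, which is a records problem to fix after the incident.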
Baselines: Normal, Not Maximum
A baseline is a documented normal state. It can describe configuration, performance, traffic, or wireless behavior, and it lets teams recognize drift or abnormal behavior.
| Baseline type | Examples |
|---|---|
| Configuration baseline | Approved firmware, NTP servers, SNMP settings, syslog destination, AAA method |
| Performance baseline | Typical CPU, memory, interface utilization, latency, jitter, packet loss |
| Traffic baseline | Normal application flows, busy-hour bandwidth, expected protocols |
| Wireless baseline | Normal RSSI, SNR, channel utilization, client counts per AP |
Baselines must be measured during representative cycles. A baseline captured at midnight does not represent a call center at 10 a.m., and a sample taken during a known outage must never be treated as healthy. The classic exam trap is calling a baseline a hard threshold; it is a comparison reference, so a router idling at 20-35% CPU that suddenly sits at 92% is anomalous only because the baseline told you what normal was.
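The comparison idea can be sketched in a few lines of Python. This is one possible approach, assuming the baseline is summarized as a mean and standard deviation and that "anomalous" means more than `k` standard deviations away. The sample values, the function names, and the default `k = 3.0` are illustrative assumptions, not values from the exam objectives or any monitoring product.

```python
from statistics import mean, stdev


def build_baseline(samples: list[float]) -> tuple[float, float]:
    """Summarize representative samples as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value: float, baseline: tuple[float, float], k: float = 3.0) -> bool:
    """Flag a reading more than k standard deviations from the baseline mean."""
    avg, sd = baseline
    return abs(value - avg) > k * sd


# Hypothetical business-hours CPU samples (percent) in the 20-35 range.
business_hours_cpu = [22.0, 25.0, 31.0, 28.0, 34.0, 20.0, 27.0, 30.0]
baseline = build_baseline(business_hours_cpu)

print(is_anomalous(92.0, baseline))  # reading far outside the normal range
print(is_anomalous(33.0, baseline))  # reading within normal variation
```

The function never compares against a fixed ceiling. The same 92% reading would be unremarkable on a device whose samples normally sit near 90%, which is the distinction between a baseline and a hard threshold.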
SLAs, SLOs, and OLAs
A Service Level Agreement (SLA) is a formal commitment, commonly defining availability targets, support hours, response time, resolution targets, maintenance windows, reporting, and escalation.
| Term | Meaning |
|---|---|
| SLA | Agreement with a customer or service consumer |
| SLO (Service Level Objective) | Measurable target used to gauge service quality |
| OLA (Operating Level Agreement) | Internal agreement between supporting teams |
| Maintenance window | Approved time for planned service impact |
| Escalation path | Who is contacted when a target is at risk |
For example, a circuit SLA might promise 99.9% availability (roughly 8.76 hours of downtime allowed per year) and a one-hour provider response. The SLO is the internal metric you track against it; the OLA is what your monitoring team owes your WAN team.
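The downtime figure follows from simple arithmetic: the unavailable fraction multiplied by the hours in the period. The sketch below assumes a 365-day year (8,760 hours) and ignores exclusions such as approved maintenance windows, which a real SLA may treat differently; the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime budget in hours for a given availability percentage."""
    return (1 - availability_pct / 100) * period_hours


def measured_availability_pct(downtime_hours: float,
                              period_hours: float = HOURS_PER_YEAR) -> float:
    """Availability percentage actually achieved, given observed downtime."""
    return (1 - downtime_hours / period_hours) * 100


# Each extra "nine" cuts the yearly downtime budget by a factor of ten.
for pct in (99.0, 99.9, 99.99):
    print(f"{pct}% -> {allowed_downtime_hours(pct):.2f} h/year")
```

Passing a different `period_hours` (for example, 30 * 24 for a 30-day month) gives the budget for a shorter reporting period, which is how many SLAs are actually measured.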
SOPs and Runbooks
Standard Operating Procedures describe repeatable routine work. Runbooks are task-specific and usually include exact commands, checks, decision points, rollback steps, and escalation contacts.
| Document | Best use |
|---|---|
| SOP | Monthly firewall rule review process |
| Runbook | Steps to fail over a WAN circuit |
| Checklist | Pre-change validation items |
| Escalation matrix | Who to contact by severity and system |
A strong runbook lists prerequisites, required access, expected output, verification steps, rollback instructions, and the evidence to attach to a ticket — and it must be tested. A runbook that has never been practiced often fails the first time it is needed in production.
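One way to enforce that checklist is to treat a runbook as structured data and reject drafts with empty sections. The sketch below is illustrative only: the section names, the `last_tested` field, and the sample WAN failover draft are assumptions made for the example, not a standard runbook format.

```python
# Sections a runbook should fill in, following the list above.
REQUIRED_SECTIONS = (
    "prerequisites",
    "required_access",
    "steps",
    "expected_output",
    "verification",
    "rollback",
    "escalation_contacts",
    "ticket_evidence",
    "last_tested",
)


def missing_sections(runbook: dict) -> list[str]:
    """Return required sections that are absent or empty."""
    return [name for name in REQUIRED_SECTIONS if not runbook.get(name)]


# Hypothetical draft: rollback is empty and the runbook was never tested.
draft = {
    "title": "Fail over WAN circuit",
    "prerequisites": ["Approved change ticket"],
    "required_access": ["Edge router admin role"],
    "steps": ["Confirm backup circuit is up", "Shift traffic to backup circuit"],
    "expected_output": ["Backup interface carries production traffic"],
    "verification": ["Latency and loss match the performance baseline"],
    "rollback": [],
    "escalation_contacts": ["Provider NOC", "On-call network lead"],
    "ticket_evidence": ["Before and after interface counters"],
}

print(missing_sections(draft))
```

A check like this could run whenever the runbook changes in version control, so a draft with no rollback plan or no recorded test date is caught during review rather than during an outage.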
Practical Scenario
A monitoring alert shows high latency across a WAN link. The team uses the inventory to identify the circuit and provider, compares current metrics to the baseline to confirm the anomaly, checks the SLA for the provider's response obligation, opens a ticket, and follows the runbook to collect evidence and escalate once thresholds are exceeded. Each record plays a distinct, non-interchangeable role.
Common Exam Traps
| Trap | Better exam reasoning |
|---|---|
| "Inventory is only for accounting." | It drives supportability, security, troubleshooting, and lifecycle. |
| "A baseline is a maximum limit." | A baseline is normal behavior used for comparison. |
| "An SLA is the same as a runbook." | An SLA sets expectations; a runbook lists operator steps. |
| "SOPs can live in one engineer's notes." | Procedures need shared access, ownership, and version control. |
Quick Drill
- Decide whether interface utilization is unusual: performance baseline.
- Find the serial number of a failed switch: asset inventory.
- Know whether a provider must respond within one hour: SLA.
- Replace a failed firewall using approved steps: runbook.
- Define the recurring access-rule review process: SOP.
A router CPU normally runs between 20 and 35 percent during business hours. Today it is steady at 92 percent. Which operational record makes that comparison meaningful?
Which document is most likely to contain exact commands, validation checks, decision points, and rollback steps for failing over a WAN connection?
A network team wants a measurable internal target, such as resolving access-switch tickets within four business hours, that it tracks against a customer commitment. Which term describes that internal measurable target?