2.3 EN 50600 / ISO-IEC 22237, EPI DCOS & Availability Concepts
Key Takeaways
- EN 50600 (European) and ISO/IEC 22237 (international) classify data centres separately on Availability Class 1-4 and Physical Protection Class 1-4, unlike the single Uptime Tier number.
- EPI DCOS (Data Centre Operations Standard) grades operational maturity — people, process, policy, and procedure — complementing design ratings such as Tier and TIA-942.
- Availability = MTBF / (MTBF + MTTR); it improves by raising MTBF (more reliable/redundant components) or lowering MTTR (faster restore).
- Five nines (99.999%) equals about 5.26 minutes of downtime per year; three nines (99.9%) equals about 8.76 hours.
- Redundancy ladder: N (none), N+1 (one spare), N+2 (two spares), 2N (two full systems), 2(N+1) (two systems each with a spare, often loosely written 2N+1).
EN 50600, ISO/IEC 22237, EPI DCOS and Availability Concepts
Beyond the American frameworks, the CDCP exam expects familiarity with the European and international data centre standards, EPI's own operations standard, and the availability mathematics that all of them rely on.
EN 50600 and ISO/IEC 22237
EN 50600 is the European multi-part standard Information technology — Data centre facilities and infrastructures. Its international counterpart, ISO/IEC 22237, is closely aligned and largely harmonised part-for-part. Unlike the single-number Uptime Tier, EN 50600 classifies a facility along several independent dimensions:
- Availability Classes 1-4: increasing resilience of power and cooling, from Class 1 (no redundancy) to Class 4 (fault tolerant).
- Protection Classes 1-4: increasing physical security, from minimal to highly protected.
- Energy efficiency enablement, Granularity Levels 1-3: how finely energy use is measured, from facility-level metering down to individual elements, which feeds the associated KPIs.
The series is organised into parts: EN 50600-1 (general concepts and classification), -2-1 (building construction), -2-2 (power distribution), -2-3 (environmental control), -2-4 (telecommunications cabling), -2-5 (security), -3-1 (management and operational information), and the -4-x KPI parts. The -4-x KPIs align with ISO/IEC 30134: PUE (30134-2), REF (30134-3), CUE (30134-8), and WUE (30134-9).
The important exam point is that EN 50600 / ISO/IEC 22237 separate availability from physical security, so a site can be, for example, Availability Class 3 but Protection Class 2. The single Uptime Tier number cannot express that distinction.
EPI Data Centre Operations Standard (DCOS)
Where Tier and TIA-942 mostly grade facility design, the EPI Data Centre Operations Standard (DCOS) grades how a data centre is operated. DCOS is a maturity- and audit-based framework covering the people, processes, policies, and procedures that keep a site running: organisation and staffing, training and competence, maintenance management, capacity and change management, incident and problem management, health and safety, and documentation such as SOPs, MOPs, and EOPs. Facilities are assessed against DCOS maturity levels, and EPI certifies both organisations and individual auditors. DCOS is complementary to a design rating: a Tier IV or Rated-4 facility can still suffer outages if operations are immature, which is exactly why operational standards like DCOS — and Uptime's TCOS and M&O Stamp — exist alongside topology ratings.
Availability Mathematics: MTBF, MTTR and 'Nines'
All of these availability classes ultimately rest on one formula:
Availability = MTBF / (MTBF + MTTR)
where MTBF (Mean Time Between Failures) is the average operating time between failures and MTTR (Mean Time To Repair/Restore) is the average time to restore service after a failure. Availability is equivalently uptime / (uptime + downtime). Two levers improve it: increase MTBF (more reliable, more redundant components) or decrease MTTR (faster detection, spares on hand, trained staff, well-written MOPs).
Worked example: if a system has an MTBF of 1,000 hours and an MTTR of 4 hours, availability = 1000 / (1000 + 4) = 0.99602, or about 99.6%. To translate an availability percentage into annual downtime, multiply the unavailability by 8,760 hours per year. For 99.982% (the Tier III figure), unavailability = 0.018%, so 0.00018 × 8,760 = 1.5768 hours, or about 1.6 hours/year, which matches the commonly quoted Tier III reference value.
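The formula and the downtime conversion can be checked with a few lines of Python. This is a minimal sketch, not part of any standard: the function names `availability` and `annual_downtime_hours` are illustrative, and the "levers" comparison simply reuses the worked-example figures with MTBF doubled or MTTR halved.

```python
HOURS_PER_YEAR = 8760  # non-leap year, as used in the text


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(avail: float) -> float:
    """Expected downtime per year: unavailability times 8,760 hours."""
    return (1.0 - avail) * HOURS_PER_YEAR


baseline = availability(1000, 4)       # worked example from the text
longer_mtbf = availability(2000, 4)    # lever 1: double MTBF
faster_repair = availability(1000, 2)  # lever 2: halve MTTR

print(f"baseline:       {baseline:.5f}")
print(f"double MTBF:    {longer_mtbf:.5f}")
print(f"halve MTTR:     {faster_repair:.5f}")
print(f"99.982% uptime: {annual_downtime_hours(0.99982):.2f} h/year down")
```

Doubling MTBF and halving MTTR give the same result here, because availability depends only on the ratio of MTTR to MTBF. Which lever is cheaper to pull in practice is an operational question, not a mathematical one.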
Learn the 'nines' shorthand:
| Availability | 'Nines' | Downtime per year |
|---|---|---|
| 99% | two nines | ~3.65 days |
| 99.9% | three nines | ~8.76 hours |
| 99.99% | four nines | ~52.6 minutes |
| 99.999% | five nines | ~5.26 minutes |
'Five nines' (99.999%) is the classic carrier-grade target, about five minutes of downtime a year.
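Rather than memorising the table, it can be regenerated from the rule "unavailability × time per year". The sketch below assumes a 365-day year; the helper names `downtime_minutes_per_year` and `describe` are illustrative only.

```python
MINUTES_PER_YEAR = 8760 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(nines: int) -> float:
    """Annual downtime in minutes for an availability of `nines` nines."""
    unavailability = 10.0 ** -nines  # e.g. 3 nines -> 0.001
    return unavailability * MINUTES_PER_YEAR


def describe(minutes: float) -> str:
    """Pick a readable unit: days, hours, or minutes."""
    if minutes >= 24 * 60:
        return f"{minutes / (24 * 60):.2f} days"
    if minutes >= 60:
        return f"{minutes / 60:.2f} hours"
    return f"{minutes:.2f} minutes"


for n in range(2, 6):
    print(f"{n} nines: ~{describe(downtime_minutes_per_year(n))}")
```

Each extra nine divides the permitted downtime by ten, which is why the step from four to five nines (roughly 53 minutes down to roughly 5) is so demanding operationally.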
Why Operations Move the Number
Availability classes and Tier ratings describe designed resilience, but industry outage studies consistently attribute a large share of real incidents to human error and process failures rather than equipment. That is the whole rationale for operational frameworks such as EPI DCOS and Uptime's TCOS: a Class 4 or Tier IV design reduces the risk of equipment-driven failures (raising effective MTBF), but only mature operations keep MTTR low and prevent self-inflicted outages (a mistaken EPO press, an untested MOP, a missed maintenance window). In practice, EN 50600 Availability Classes track the same redundancy ladder used throughout this chapter: roughly, Class 1 corresponds to no redundancy, Class 2 to N+1 components, Class 3 to concurrent maintainability, and Class 4 to fault tolerance. The vocabulary you learned for Uptime Tiers therefore transfers directly to the European scheme.
Redundancy Terminology: N, N+1, N+2, 2N, 2N+1
Redundancy notation underpins every availability class. N is the capacity required to carry the full design load — the 'need'. Everything else adds spare capacity or independent systems:
- N: exactly enough, no redundancy; any failure causes an outage.
- N+1: one spare unit beyond need (for example four UPS modules where three carry the load). Tolerates one component failure.
- N+2: two spare units; tolerates two simultaneous component failures, or one failure while another unit is already in maintenance.
- 2N: two complete, independent systems, each able to carry 100% of the load; eliminates single points of failure across the whole path (the basis of Tier IV).
- 2(N+1), often loosely written 2N+1: two independent systems, each of which internally holds an extra spare; the highest common commercial redundancy. Read strictly, 2N+1 means two full systems plus one additional spare unit.
A subtle exam trap: N+1 protects a component, not a path. If all N+1 modules share one bus or one distribution path, that shared path is still a single point of failure — which is why Tier II (N+1, single path) is not concurrently maintainable, while 2N and redundant-path designs are. Match the redundancy scheme to the resilience goal: N+1 for basic redundancy, and 2N or 2(N+1) for concurrent maintainability and fault tolerance.
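To make the notation concrete, the sketch below counts how many units each scheme installs for a given need N. It is an illustration only: the function name `installed_units` is hypothetical, and the count says nothing about whether the units share a distribution path, which is the trap described above.

```python
def installed_units(scheme: str, n: int) -> int:
    """Total units installed when the load needs `n` units (the 'need')."""
    if n < 1:
        raise ValueError("need (N) must be at least 1")
    counts = {
        "N": n,                   # exactly enough, no spare
        "N+1": n + 1,             # one spare unit
        "N+2": n + 2,             # two spare units
        "2N": 2 * n,              # two complete systems
        "2(N+1)": 2 * (n + 1),    # two systems, each with its own spare
    }
    return counts[scheme]


need = 3  # e.g. three UPS modules carry the full design load
for scheme in ("N", "N+1", "N+2", "2N", "2(N+1)"):
    total = installed_units(scheme, need)
    print(f"{scheme:>7}: {total} installed, {total - need} beyond need")
```

With a need of three modules, N+1 gives the four-module example from the list, while 2(N+1) installs eight. The extra units only deliver concurrent maintainability or fault tolerance if they also sit on independent paths.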
A system has a Mean Time Between Failures (MTBF) of 2,000 hours and a Mean Time To Repair (MTTR) of 2 hours. What is its approximate availability?
Which statement correctly describes the EN 50600 / ISO/IEC 22237 classification approach?
A UPS installation is described as 2(N+1). What does this configuration provide?