2.3 High Availability (FGCP)
Key Takeaways
- FortiGate High Availability uses the FortiGate Clustering Protocol (FGCP), which forms a cluster of identical units that present shared virtual MAC addresses (one per interface) and keep a synchronized configuration.
- Active-passive HA has one primary unit processing all traffic with the others on standby; active-active also distributes (load-balances) sessions across members.
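- The primary unit is elected by comparing, in order: connected monitored interfaces, then HA uptime, then device priority, then serial number. This is the order with override disabled (the default); enabling override moves device priority ahead of HA uptime.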
- Dedicated heartbeat interfaces carry cluster keepalives and configuration synchronization; losing all heartbeat links can cause a split-brain condition.
- Session pickup synchronizes the session table between members so established sessions survive a failover with minimal disruption.
The FortiGate Clustering Protocol (FGCP)
High Availability (HA) removes the FortiGate as a single point of failure by grouping two or more units into a cluster. FortiOS implements this with the FortiGate Clustering Protocol (FGCP). To form an FGCP cluster, the units must be identical models running the same FortiOS firmware, and they must share the same HA group name and password.
Key FGCP behaviors:
- The cluster presents a virtual MAC address on each interface, so attached switches and hosts see one logical device. After a failover, the new primary advertises the same virtual MACs, which lets neighboring devices keep using their existing ARP entries.
- One unit is elected primary (also called master); the others are secondary (subordinate).
- The primary's configuration is synchronized to all secondary units automatically.
- HA can run in active-passive or active-active mode.
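As a rough illustration of the requirements above, a minimal HA configuration might look like the following sketch. The group name and password are placeholders invented for this example, and the same group name, password, and mode would be entered on every unit. Lines starting with # are annotations for the reader.

```
config system ha
    # Placeholder group name; must match on all cluster members
    set group-name "EXAMPLE-HA"
    # Placeholder password; must match on all cluster members
    set password <shared-password>
    # a-p = active-passive, a-a = active-active
    set mode a-p
end
```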
Active-Passive vs Active-Active
In active-passive (A-P) mode, the primary unit processes all traffic while secondary units stay synchronized but do not process traffic, ready to take over if the primary fails. This is the most common and predictable HA design.
In active-active (A-A) mode, the primary still owns the virtual MACs and management, but it load-balances sessions — by default, proxy-based UTM/content-inspection sessions — across all cluster members so they share the processing load. Routed/firewall-only sessions are still typically handled by the primary.
| Characteristic | Active-Passive | Active-Active |
|---|---|---|
| Traffic processing | Primary only; others standby | Sessions distributed across members |
| Primary purpose | Redundancy / failover | Redundancy plus load sharing |
| Complexity | Lower, more predictable | Higher |
| Typical use | Most deployments | Heavy content-inspection workloads |
| Throughput benefit | None beyond a single unit | Can exceed a single unit for inspected traffic |
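Switching an existing cluster to active-active could be sketched as follows. The load-balance-all line is optional and is shown only to illustrate how load balancing can be extended beyond the default proxy-based sessions; whether that helps depends on the workload.

```
config system ha
    set mode a-a
    # Optional: also load-balance non-proxy TCP sessions (disabled by default)
    set load-balance-all enable
end
```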
HA Cluster Election Factors
When a cluster forms (or recovers), FGCP elects the primary unit by comparing the members against an ordered list of factors. The comparison stops at the first factor that produces a winner. With override disabled (the default), the order is:
- Connected monitored interfaces: The unit with the most monitored (link-monitored) interfaces up is preferred, because it can serve the most connectivity.
- HA uptime: The unit that has been up longest in the cluster is preferred, but only when the difference exceeds the uptime difference margin (ha-uptime-diff-margin, 300 seconds by default). This keeps a currently stable primary in place.
- Device priority: The unit with the highest device priority is preferred.
- Serial number: As the final tie-breaker, the unit with the highest serial number becomes primary.
With HA override enabled, device priority is compared before HA uptime, so the order becomes: connected monitored interfaces, device priority, HA uptime, serial number.
Override is important to understand for the exam: with override disabled (the default), a recovered higher-priority unit will not preempt the current primary, avoiding a second disruptive failover. With override enabled, the highest-priority unit reclaims the primary role after it recovers, provided it has at least as many monitored interfaces up as the other members.
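The settings that feed the election can be sketched as below. The priority value and interface names are placeholders for illustration; a real design would choose them per unit and per topology.

```
config system ha
    # Higher value wins the priority comparison (placeholder value)
    set priority 200
    # Default is disable; enable makes priority outrank HA uptime
    set override disable
    # Placeholder interface names; their link state counts in the election
    set monitor "port1" "port2"
end
```

Priority and override are set per unit and are not synchronized across the cluster, so each member is configured individually.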
Heartbeat Interfaces
Heartbeat interfaces are the dedicated links between cluster members that carry:
- HA heartbeat / keepalive packets used to detect that members are alive and to run the election.
- Configuration and session synchronization traffic.
Best practice is to use at least two heartbeat interfaces for redundancy, ideally with a direct cable between units. Each heartbeat interface has a configurable priority. If all heartbeat links fail, the secondary cannot see the primary and may also become primary — a split-brain condition where two units claim the primary role and the same virtual MACs, causing network disruption.
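A two-link heartbeat setup along the lines of this best practice might be configured as in the sketch below. The interface names and priority values are placeholders, not recommendations.

```
config system ha
    # Two heartbeat interfaces, each followed by its heartbeat priority
    # (interface names and priority values are placeholders)
    set hbdev "port3" 50 "port4" 50
end
```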
Session Pickup, Synchronization, and Failover
By default FGCP synchronizes the configuration but not the live session table. Enabling session pickup (set session-pickup enable) synchronizes the session table between members so that established sessions survive a failover instead of being dropped and re-established.
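In context, the setting sits under the HA configuration, as in this minimal sketch. The second option is an optional extra shown for illustration; it extends synchronization to connectionless sessions at the cost of more synchronization traffic.

```
config system ha
    # Synchronize the TCP session table to the secondary units
    set session-pickup enable
    # Optional: also synchronize connectionless (UDP/ICMP) sessions
    set session-pickup-connectionless enable
end
```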
Details to remember:
- Session pickup mainly benefits long-lived TCP sessions; short sessions complete before failover matters.
- Synchronizing sessions adds overhead, so session pickup is enabled deliberately, not by default.
- During failover, a secondary detects the primary is gone (missed heartbeats or failed monitored interfaces), promotes itself to primary, and sends gratuitous ARP to update neighboring switches with the virtual MAC locations.
- Useful verification command: diagnose sys ha status (and get system ha status) shows roles, priorities, mode, and synchronization state.
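A verification pass after a configuration change or a failover test might run commands such as the following. Output is omitted here because it varies by model and FortiOS version; the checksum command is an addition to the two named above and is included on the assumption that confirming configuration sync is part of the check.

```
# Cluster mode, member roles, priorities, and sync state
get system ha status
diagnose sys ha status
# Compare configuration checksums across members; matching values indicate the units are in sync
diagnose sys ha checksum cluster
```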
FGCP failover is typically sub-second to a few seconds; with session pickup enabled, users often experience no noticeable interruption.
An NSE 4 candidate must explain the FGCP primary election order. Which sequence is correct?
A FortiGate HA cluster runs in active-passive mode without session pickup. The primary unit fails. What happens to long-lived TCP sessions that were active at the time of failover?
In an HA cluster, HA override is disabled. A unit with higher device priority fails, the lower-priority unit becomes primary, and later the higher-priority unit recovers. What is the expected behavior?