SQL Server FCI Part 5 of 13: Creating & Validating the Failover Cluster (CNO, Quorum Witness)

Plumbing is in. Storage is connected. Time to merge two standalone servers into a single Windows Failover Cluster — the brain that orchestrates failover for SQL Server (and any other clustered role we add). Five moves: install the feature on both nodes, run validation (Microsoft’s “is this supported?” gate), create the cluster, verify it, and configure the Quorum witness disk explicitly.

Step 1 — install Failover Clustering on both nodes

Add Roles and Features wizard on Node-01 with the Features pane open and the Failover Clustering checkbox ticked along with its dependent management tools
Server Manager > Manage > Add Roles and Features > Features > Failover Clustering. Tick. Includes management tools. Install. Reboot if asked.

Node-01: Server Manager > Manage > Add Roles and Features > Features > Failover Clustering. Tick. The wizard auto-selects management tools. Install.

Same wizard on Node-02 mirroring the Failover Clustering selection so both nodes have the cluster service ready
Repeat on Node-02 exactly. Same feature, same options. Both nodes need the cluster bits before validation can run.

Repeat exactly on Node-02. The cluster service must exist on both before validation can run.

PowerShell shortcut if you prefer: Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools on each node.

Reboot may be requested. Both nodes online before continuing.
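
If you'd rather not click through the wizard twice, here's a rough PowerShell equivalent, assuming PS remoting is enabled and you're running as a domain admin (node names as used throughout this series):

  # Install the feature on both nodes in one pass.
  $nodes = "Node-01", "Node-02"
  Invoke-Command -ComputerName $nodes -ScriptBlock {
      Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
  }

  # Confirm it landed on both before moving on to validation.
  Invoke-Command -ComputerName $nodes -ScriptBlock {
      Get-WindowsFeature -Name Failover-Clustering
  } | Select-Object PSComputerName, Name, InstallState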

Step 2 — validate the configuration

The validation wizard runs ~30 tests covering Inventory, Network, Storage, and System Configuration. Microsoft only supports clusters that have passed validation — meaning zero failures. Warnings are usually acceptable.

Skip this step and you’re running an unsupported cluster. If you ever open a Microsoft support case, the first thing they ask for is the validation report. No report, no support.

Failover Cluster Manager console with the Validate Configuration link highlighted in the centre Actions pane, the wizard that runs Microsoft’s official compatibility check before the cluster is created
Failover Cluster Manager (FCM) > Validate Configuration. Don’t skip. Microsoft only supports clusters that have passed validation — if you ever open a support case, this is the first thing they ask for.

From Node-01, open Failover Cluster Manager (Tools menu). Click Validate Configuration.

Validate a Configuration Wizard Select Servers step with Node-01 and Node-02 added by Browse picker, both fully qualified to infotechninja.local
Add both Node-01 and Node-02. Use full FQDNs.

Add both nodes by FQDN.

Testing Options step with Run all tests (recommended) selected to exercise inventory, network, storage, and system configuration checks
Run all tests. The shorter test suites miss subtle issues; the full run takes 5-15 min and is worth every second.

Run all tests. The shorter suites skip storage IO tests — which is exactly where most issues hide.

Confirmation step listing every test about to run; the wizard will take several minutes to complete because storage tests do real read/write IO against the shared LUNs
Confirmation. Click Next. Coffee time.

Confirm and start. The wizard performs real read/write IO against your shared disks — takes 5-15 minutes.

Validation Report summary screen with green check marks for all tests; the “Create the cluster now using the validated nodes” option is intentionally UNTICKED so the next wizard runs cleanly
Report. Look for Failures — those must be fixed before proceeding. Warnings are usually fine in lab. Untick “Create the cluster now using the validated nodes” — we’ll do that step manually for clarity.

When complete, review the report. Common findings:

  • Warnings about heartbeat redundancy — expected in a 1-NIC heartbeat lab; production should have 2 heartbeat NICs.
  • Warnings about “persistent reservation” on storage — usually a hint to enable SCSI-3 persistent reservations on the SAN; iSCSI Target supports this.
  • Failures — STOP. Read each, fix, rerun.

Untick “Create the cluster now using the validated nodes.” We do that step manually for clarity. Finish.
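
The same run can also be kicked off from PowerShell if you prefer (a sketch; use the full FQDNs, just like the wizard):

  # Full test suite against both nodes -- equivalent to "Run all tests (recommended)".
  Test-Cluster -Node "Node-01.infotechninja.local", "Node-02.infotechninja.local"
  # Review the HTML report it generates and fix every Failure before creating the cluster.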

Step 3 — create the cluster

Failover Cluster Manager Action pane with Create Cluster link highlighted, the wizard that creates the actual cluster object including its CNO in Active Directory
FCM Action pane > Create Cluster.

Create Cluster Wizard Select Servers step with Node-01 and Node-02 added, ready to commit the cluster definition
Add both nodes again.

Add both nodes.

Access Point for Administering the Cluster step with Cluster Name set to ITN-CL-01 and the cluster IP set to 10.15.1.45 on the Domain network, this name and IP get registered in DNS for client access
Cluster Name: ITN-CL-01. Cluster IP: 10.15.1.45 (free IP on the public/domain subnet). The cluster gets its own AD computer object (CNO) and DNS entry — clients eventually connect to ITN-CL-01.infotechninja.local.

The Access Point step:

  • Cluster Name: ITN-CL-01 — this becomes the CNO (Cluster Name Object) in AD and a DNS entry. Clients eventually connect to ITN-CL-01.infotechninja.local.
  • IP Address: 10.15.1.45 — a free IP on the Public/Domain subnet. The cluster claims this IP; failover keeps it on whichever node owns the cluster role.

AD permission: the user creating the cluster needs Create Computer Objects on the OU where the CNO will live. Default Computers container works for labs; locked-down environments often delegate this to a specific OU. Get this wrong and the wizard fails with an opaque AD error.

Confirmation step before creation with Add all eligible storage to the cluster ticked, automatically pulling the iSCSI Data and Quorum disks into the cluster
Tick Add all eligible storage to the cluster. The wizard pulls in the SAN LUNs automatically.

Confirmation: tick Add all eligible storage to the cluster. The wizard pulls in the iSCSI LUNs automatically.

Summary page after successful creation showing the cluster object created, both nodes joined, both disks added, and the cluster registered in AD as the CNO
Created. Cluster object exists in AD, both nodes joined, both disks added.

Created. The cluster object exists in AD, both nodes joined, both disks owned by the cluster.
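
For reference, a rough PowerShell equivalent of the whole creation step, using the same name and IP as above. Leaving out -NoStorage mirrors the "Add all eligible storage" tick box:

  # Creates the CNO, joins both nodes, claims 10.15.1.45, and pulls in eligible shared disks.
  New-Cluster -Name "ITN-CL-01" -Node "Node-01", "Node-02" -StaticAddress "10.15.1.45"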

Step 4 — verify the cluster

Failover Cluster Manager Nodes view confirming both Node-01 and Node-02 are listed with status Up and the cluster is healthy
Nodes view: both Up. Cluster healthy.

FCM > Nodes. Both nodes Up.

Cluster main pane showing Current Host Server set to Node-01 indicating it currently owns the cluster role and shared resources
Current Host Server: Node-01. The active node. Failover would flip this to Node-02.

Cluster main pane: Current Host Server is Node-01. That’s the active node. Failover would flip this to Node-02.
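
The same checks from PowerShell, run on either node (a sketch):

  Get-ClusterNode -Cluster "ITN-CL-01"        # both nodes should report State = Up
  Get-ClusterGroup -Cluster "ITN-CL-01"       # OwnerNode of "Cluster Group" is the current host server
  Get-ClusterResource -Cluster "ITN-CL-01"    # cluster name, IP, and both disks should be Online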

Visual confirmation: cluster disk icons

File Explorer on Node-01 showing the SQL-Data and Quorum-Witness drives mounted with the distinctive cluster disk icon (drive with a small chain link), the visual cue that Cluster Service now owns the disks
File Explorer on Node-01 — the disks now show the cluster disk icon (drive with a small chain link). Cluster Service owns them now. On Node-02, these drives are not visible — only the active node sees the shared storage, which is correct.

Open File Explorer on Node-01 (the active node). The drives have the distinctive cluster disk icon — a hard drive with a small chain link overlay. That’s Windows telling you Cluster Service owns these disks now, not the local OS.

On Node-02, these drives don’t appear in File Explorer. Correct. Only the active node sees the shared storage.

Step 5 — configure the Quorum witness explicitly

The cluster wizard auto-picks a quorum config. For 2-node clusters, it usually picks Node Majority + a witness, but the “auto” choice depends on disk availability at create time. Be explicit.

Quorum exists to prevent split-brain — the scenario where the heartbeat link breaks but both nodes can still see storage. Without a tiebreaker, both nodes try to take ownership simultaneously and the data corrupts. The witness disk is the third “vote” that decides the winner.

Failover Cluster Manager right-click context menu on the cluster name showing More Actions > Configure Cluster Quorum Settings, the entry point for explicit witness configuration
Right-click the cluster name > More Actions > Configure Cluster Quorum Settings.

Right-click cluster name > More Actions > Configure Cluster Quorum Settings.

Configure Cluster Quorum Wizard with Select the quorum witness option chosen, the path that lets us pick a specific disk rather than letting the wizard auto-select
Choose Select the quorum witness. We want explicit control, not the wizard’s auto-pick.

Select the quorum witness.

Quorum witness type step with Configure a disk witness selected (vs file share witness or cloud witness) since we built a dedicated 2 GB iSCSI quorum LUN
Configure a disk witness. We have one ready — the 2 GB Quorum LUN.

Configure a disk witness. Other options: file share witness (good for stretched clusters) or cloud witness (Azure storage account, good for cloud-only or mixed). For a single-site lab with shared iSCSI storage, disk witness is the standard.

Disk selection step with the 2 GB Cluster Disk explicitly checked, ensuring the witness role binds to the right LUN rather than the wizard guessing
Select the 2 GB Cluster Disk. The wizard handles the bind.

Pick the 2 GB Cluster Disk explicitly. Don’t let the wizard accidentally use SQL-Data.

Final summary page confirming the disk witness has been configured; the cluster now has explicit tie-breaker logic for the 2-node split-brain scenario
Done. Disk Witness configured. Tie-breaker logic explicit. The cluster can now lose Node-01 OR Node-02 OR the heartbeat link without split-brain.

Done. The cluster now has explicit tie-breaker logic.
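
If you'd rather script the witness step, a sketch of the same configuration. The disk resource name ("Cluster Disk 2" below) is a placeholder; check what Get-ClusterResource actually calls your 2 GB LUN first:

  # List the disk resources and note the name of the 2 GB one.
  Get-ClusterResource | Format-Table Name, ResourceType, OwnerGroup, State

  # Bind the witness to that disk ("Cluster Disk 2" is a placeholder name).
  Set-ClusterQuorum -DiskWitness "Cluster Disk 2"

  # Confirm: QuorumResource should now point at the 2 GB disk.
  Get-ClusterQuorum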

Things that bite people in this part

Validation failures with no useful message

Common causes: ICMP blocked on the storage subnet (firewall rule), nodes at different OS patch levels, missing CNO permission on the target AD OU. Read the full HTML report — click into each failure for the verbose explanation.
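
Before rerunning, a couple of quick reachability checks from Node-01 can save a cycle (a sketch; repeat against each node's storage-subnet address too):

  Test-Connection -ComputerName Node-02 -Count 2        # ICMP -- several validation tests rely on it
  Test-NetConnection -ComputerName Node-02 -Port 445    # SMB between the nodes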

Pre-staged CNO

Locked-down environments often pre-stage the CNO (an admin creates a disabled computer object in AD ahead of time). The cluster create wizard then needs Modify permission on that object — not Create. If pre-staging, ensure the right delegation.
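
If you do pre-stage, a minimal sketch of the first half (the OU path here is an assumption, not something from this series); the permission grant itself is usually done in ADUC or dsacls by whoever owns AD:

  Import-Module ActiveDirectory
  # Create the CNO disabled; the cluster create wizard enables it when it takes ownership.
  New-ADComputer -Name "ITN-CL-01" -Path "OU=Clusters,DC=infotechninja,DC=local" -Enabled $false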

Witness picked the wrong disk

Auto-configuration doesn’t always bind the witness to the disk you intended; it can land on SQL-Data instead of the dedicated 2 GB Quorum LUN. Always configure explicitly.

Cluster IP conflicts

10.15.1.45 must be free. Check DHCP scope exclusions and IPAM, or just ping 10.15.1.45 from anywhere on the subnet; if it answers, pick a different IP.
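
Two quick checks (a sketch). Note that a host with ICMP blocked will still look free, so trust DHCP/IPAM records over ping alone:

  Test-Connection -ComputerName 10.15.1.45 -Count 2 -Quiet        # $false (no reply) is the answer you want
  Resolve-DnsName -Name 10.15.1.45 -ErrorAction SilentlyContinue  # a stale PTR record here is also a red flag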

SAN reboot during cluster create

If the iSCSI Target VM is rebooted during cluster creation, the disks may go offline and the wizard panics. Don’t touch the SAN during create.

Checking validation later via PowerShell

You can re-run validation any time without recreating the cluster: Test-Cluster -Node Node-01, Node-02 -Include Network, Storage. Useful before adding a third node or changing storage.

What’s next

The cluster is alive. Part 6 installs SQL Server itself on Node-01 as the first node of an FCI — this is the moment SQL gets aware of the cluster and creates its virtual network name. See the full series at SQL Server Clustering pathway.