
The documentation tells us what happens when the session timeout for a synchronous Availability Group replica expires:

"Even if a disconnected replica is configured for synchronous-commit mode, transactions won't wait for that replica to reconnect and resynchronize."

This documentation tells us a little bit about what happens when the connection recovers:

"When the secondary replica reconnects with the primary replica, they resume synchronous-commit mode."

However, I cannot find any specifics. Suppose that:

  1. I have an Availability Group on Windows Server Failover Clustering with one primary and one synchronous secondary.
  2. My quorum is always perfect.
  3. REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT is not set.
  4. At 08:00, the session timeout for the synchronous replica expires.
  5. At 08:15, the connection recovers and the replica is 15 minutes behind.

At what point will commits on the primary need to wait on the once-synchronous secondary hardening them? I see three possibilities:

  • A: Commits on the primary begin waiting immediately and will be stalled until the synchronous replica has caught up with the lost 15 minutes of hardening.
  • B: Until the secondary has caught up with the lost 15 minutes, it is effectively asynchronous. It will become synchronous again, causing commits on the primary to wait, only after it has caught up.
  • C: Any of the above, but including a manual step from an administrator to tell the replica that it can start accepting data again.

I would prefer B, but I cannot find any documentation saying that B is the case.

1 Answer

At what point will commits on the primary need to wait on the once-synchronous secondary hardening them?

B: Until the secondary has caught up with the lost 15 minutes, it is effectively asynchronous. It will become synchronous again, causing commits on the primary to wait, only after it has caught up.

The secondary is technically still synchronous commit, but the harden policy won't change until the secondary is "close enough" to the primary. You can see this by running your test scenario while subscribed to the relevant extended events, for example hadr_db_partner_set_sync_state. It has a commit policy column, which should change to WaitForHarden for the partner when it switches back to synchronous commit and the primary waits on its log hardens.
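A minimal sketch of how one might capture that event during the test, assuming you have permission to create event sessions on the primary. The session name ag_commit_policy_watch is hypothetical, and the ring buffer target is only chosen for convenience in a short test:

```sql
-- Hypothetical session name; only the event name comes from the answer above.
CREATE EVENT SESSION [ag_commit_policy_watch] ON SERVER
ADD EVENT sqlserver.hadr_db_partner_set_sync_state
ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE = OFF);
GO

ALTER EVENT SESSION [ag_commit_policy_watch] ON SERVER STATE = START;
GO

-- Reproduce the session timeout and the reconnect, then read back
-- whatever the ring buffer captured as a single XML document.
SELECT CAST(t.target_data AS xml) AS captured_events
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
    ON t.event_session_address = s.address
WHERE s.name = N'ag_commit_policy_watch'
  AND t.target_name = N'ring_buffer';
GO

-- Clean up once the test is done.
DROP EVENT SESSION [ag_commit_policy_watch] ON SERVER;
GO
```

Run it on the primary, since that is where the commit policy for the partner is tracked. The built-in AlwaysOn_health session may already be collecting this event, so check there first if you would rather not create a new session.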

This should be documented here, but is very subtle.

Example

Below is a database in an AG that is currently synchronous commit and the AG is healthy. Committing a transaction will give you the following, showing that the commit policy is WaitForHarden among other data.

healthy sync commit

Assume a synchronous commit replica in the topology hits a session timeout for whatever reason. What happens? The primary changes its local commit policy for that secondary. The replica stays sync commit (commit_policy_target shows WaitForHarden), but it no longer participates in commit acknowledgement (Commit_Policy shows DoNothing).

session timeout hit

Eventually the replica reconnects to the primary and synchronization starts again. At some point the databases should (hopefully) get very close in their hardened log block values. Once this occurs, the commit policy is changed back to WaitForHarden and the commit partner once again participates in commit acknowledgements.

commit policy back to waitforharden
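If you would rather poll than trace, the DMVs expose a similar picture. This is a sketch, run on the primary, and it rests on an assumption: that is_commit_participant and synchronization_state_desc track the same transition as the commit policy described above. I would expect the secondary's databases to show SYNCHRONIZING with is_commit_participant = 0 while catching up, then SYNCHRONIZED with is_commit_participant = 1 afterwards, but verify that against your own test.

```sql
-- One row per database per replica, as seen from the primary.
SELECT
    ar.replica_server_name,
    ar.availability_mode_desc,          -- configured mode, stays SYNCHRONOUS_COMMIT
    DB_NAME(drs.database_id) AS database_name,
    drs.synchronization_state_desc,     -- SYNCHRONIZING while catching up
    drs.is_commit_participant,          -- 1 when commits wait on this database
    drs.last_hardened_lsn,
    drs.log_send_queue_size             -- KB of log not yet sent to the secondary
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
ORDER BY ar.replica_server_name, database_name;
```

Running this every few seconds during the 08:15 reconnect in your scenario should show the log send queue draining before the state flips back.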

  • Excellent. Is this documented anywhere? Commented yesterday
