Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Probe fom index less than match #148

Closed
joshuazh-x opened this issue Jan 31, 2024 · 4 comments
Closed

Probe fom index less than match #148

joshuazh-x opened this issue Jan 31, 2024 · 4 comments
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@joshuazh-x
Copy link
Contributor

joshuazh-x commented Jan 31, 2024

I'm learning probe mechanism and notice there is chance for replication starting from an index less than match index in probe state, unlike that in replicate state. For example, a rejected probe response got delayed and is received when the leader falls into another probe state. This is a rare case of cause.

I'm not sure if this is by design. But to my understanding, allowing such a smaller index implies we want to tolerate loss of already replicated log entries.

func (pr *Progress) MaybeDecrTo(rejected, matchHint uint64) bool {
	if pr.State == StateReplicate {
		// The rejection must be stale if the progress has matched and "rejected"
		// is smaller than "match".
		if rejected <= pr.Match {
			return false
		}
		// Directly decrease next to match + 1.
		//
		// TODO(tbg): why not use matchHint if it's larger?
		pr.Next = pr.Match + 1
		return true
	}

	// The rejection must be stale if "rejected" does not match next - 1. This
	// is because non-replicating followers are probed one entry at a time.
	if pr.Next-1 != rejected {
		return false
	}

	pr.Next = max(min(rejected, matchHint+1), 1)
	pr.MsgAppFlowPaused = false
	return true
}
@pav-kv
Copy link
Contributor

pav-kv commented Jan 31, 2024

Hi @joshuazh-x. The probe/replicate state distinction are heuristics. Regardless of what they do, it does not impact raft safety properties. Basically, probe/replicate try to guess what's the good MsgApp message to send. But ultimately, the follower gates these messages into the storage and rejects if they are inconsistent with the log state.

But to my understanding, allowing such a smaller index implies we want to tolerate loss of already replicated log entries.

Could you explain a little what you mean by loss? Do you mean we would be sending entries that are already in Match a second time?

@pav-kv
Copy link
Contributor

pav-kv commented Jan 31, 2024

Are you saying that this line sets pr.Next to be less than pr.Match?

pr.Next = max(min(rejected, matchHint+1), 1)

If so, that would not be by design, rather a bug. The invariant is that Next > Match.

This clause should probably check with pr.Match instead of pr.Next-1, just like the StateReplicate clause above. At the moment this code assumes there is only one probe in flight (in which case pr.Next-1 == pr.Match, and the two checks are equivalent), but this is generally wrong because messages can be delayed, reordered, and duplicated.

We need a test that demonstrates the bug, and a fix.

@pav-kv pav-kv added bug Something isn't working good first issue Good for newcomers labels Jan 31, 2024
@joshuazh-x
Copy link
Contributor Author

Hi @joshuazh-x. The probe/replicate state distinction are heuristics. Regardless of what they do, it does not impact raft safety properties. Basically, probe/replicate try to guess what's the good MsgApp message to send. But ultimately, the follower gates these messages into the storage and rejects if they are inconsistent with the log state.

But to my understanding, allowing such a smaller index implies we want to tolerate loss of already replicated log entries.

Could you explain a little what you mean by loss? Do you mean we would be sending entries that are already in Match a second time?
I meant we don't need to replicate entries preceding match index, The only necessity to do so is when follower would have lost replicated log entries during the same term, which is impossible.

@joshuazh-x joshuazh-x reopened this Feb 1, 2024
@pav-kv
Copy link
Contributor

pav-kv commented Feb 6, 2024

Fixed by #149.

@pav-kv pav-kv closed this as completed Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants