- 2 Minute Streaming
- Posts
- Kafka Acks & Min Insync Replicas Explained
Kafka Acks & Min Insync Replicas Explained
uncovering the commonly-confused min insync replicas setting
Problem
There are three Kafka configs whose interplay is often misunderstood:
min.insync.replicas
(Broker/Topic config)acks
(Producer config)replication.factor
(Topic config)
For some reason people keep mistaking that this results in some quorum write functionality in Kafka.
No such thing exists.
Let’s dissect the configs one by one. Starting from the simplest:
Replication Factor (RF)
Kafka’s replication protocol is such that for every partition:
N replicas of the data exist
these replicas live on brokers, and there is always:
one leader broker
N-1 follower brokers
replication.factor
denotes what
N is. Usually 3.
This is set separately for each topic.
replication factor of one. the replica set is [1].
replication factor of three. the replica set is [1,2,3].
In-Sync Replicas (ISR)
Producer clients only write to the leader broker — the followers asynchronously replicate the data.
It’s a distributed system — how do you know the followers are keeping up with the leader (i.e have the latest data)?
Enter in-sync replicas.
💡 An in-sync replica (ISR) is a broker that has the latest data for a given partition
a leader is always an in-sync replica (by definition)
a follower is an in-sync replica only if it has fully caught up to the partition it’s following
if it falls behind, it’s no longer an in-sync replica
❓ when do we determine it “fell behind”:
1. when the follower hasn’t consumed up to the log’s last offset in the last
replica.lag.time.max.ms
2. when it loses ZooKeeper/KRaft quorum connectivity (
zookeeper.session.timeout.ms
/broker.session.timeout.ms
)
Acks 🫡
Your producer sends data to Kafka. When do you expect a response?
after all in-sync replicas persist the record?
after the leader persists the record?
immediately once sent?
The trade-off is between durability and speed. Different workloads have different requirements, so naturally - this is configurable.
💡 acks (acknowledgements) is a producer client config denoting the number of brokers that acknowledge receipt of a record before the producer considers the write as successful.
It supports three values - 0, 1 and all.
acks=0 - the producer won’t even wait for a response from the broker, it immediately considers the write successful once sent.
acks=1 - the producer considers the write successful when the leader acknowledges the record. The leader broker will know to respond the moment it persists the record to disk.
acks=all - the producer considers it successful when all of the in-sync replicas persist the record. The leader broker will respond once all the in-sync replicas persist the record.
The acks setting is a good way to configure your preferred trade-off between durability guarantees and performance.
☝️Minimum In-Sync Replica
Best to start with a FAQ:
If you have three in-sync replicas, acks=all
and min.insync.replicas=1
, is it true the broker won’t wait for the other two replicas before replying?
No. It will wait for all the in-sync replicas.
If you use acks=all
and there’s only one in-sync replica (the leader) - isn’t that equal to acks=1?
No. Because this request is now prone to fail validation, whereas acks=1
will always pass under this circumstance.
💡 min.insync.replicas is a broker config that denotes the minimum number of in-sync replicas a broker requires to acknowledge a record before responding to a producer configured with acks=all.
If there are fewer in-sync replicas than the minimum, the broker will respond with an error and not persist the record. ❌
This setting denotes validation - the MINIMUM number of in-sync replicas a broker requires to ack a record, in order to NOT fail the produce request.
And it only applies on Produce requests with acks=all
.
If you want me to name it explicitly so you never mistake it, I’d call it:
min.insync.replicas.to.not.fail.an.acks.all.produce.request
For posterity, here are the errors you might hit if you don’t pass this validation:
Server-side Error:
[2023-05-03 19:34:42,890] ERROR [ReplicaManager broker=2] Error processing append operation on partition testche-2 (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.NotEnoughReplicasException: The size of the current ISR Set(2) is insufficient to satisfy the min.isr requirement of 2 for partition testche-2
Client-side Error:
[2023-05-03 19:34:42,789] WARN [Producer clientId=console-producer] Got error produce response with correlation id 19 on topic-partition testc-2, retrying (0 attempts left). Error: NOT_ENOUGH_REPLICAS (org.apache.kafka.clients.producer.internals.Sender)
[2023-05-03 19:34:42,891] ERROR Error when sending message to topic testche with key: null, value: 1 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.NotEnoughReplicasException: Messages are rejected since there are fewer in-sync replicas than required.
Putting It All Together
Acks is about answering the question "when do you expect a response?".
it is a trade-off between durability (in motion) and speed
RF is about how many times to copy the data (durability at rest)
Min ISR is about how to validate your high-durability produce requests (acks=ALL)
Apache®, Apache Kafka®, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.