Improving performance of Kafka Producer
As we know, Kafka uses an asynchronous publish/subscribe model. When our producer calls send(), the call returns a future right away. That future offers methods to check the status of the record while it is in flight.
Records are first collected into batches, and once a batch is ready, the producer sends it to the broker. The broker receives the batch, stores it, and responds with an acknowledgment that the write is complete, which in turn completes the future.
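The send-then-acknowledge flow can be illustrated with a toy model. This is only a sketch: ToyProducer is a hypothetical stand-in written for this post, not a real Kafka client, and its flush() simply pretends that the broker acknowledged every buffered record.

```python
from concurrent.futures import Future


class ToyProducer:
    """Hypothetical stand-in for a producer; it never talks to a broker."""

    def __init__(self):
        # (record, future) pairs waiting in the current batch
        self._pending = []

    def send(self, record):
        # Returns immediately: the record is only buffered at this point.
        future = Future()
        self._pending.append((record, future))
        return future

    def flush(self):
        # Pretend the broker acknowledged the whole batch and resolve each
        # future with the position of its record inside that batch.
        for offset, (record, future) in enumerate(self._pending):
            future.set_result({"offset": offset, "value": record})
        self._pending.clear()


producer = ToyProducer()
future = producer.send(b"hello")
print(future.done())   # False: the record is still buffered
producer.flush()
print(future.done())   # True: the simulated acknowledgment arrived
```

Real clients expose a similar idea (a future or a delivery callback per record), but the exact method names differ between libraries.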
For latency and throughput, two parameters are particularly important in Kafka performance tuning: batch.size and linger.ms.
batch.size measures batch size in total bytes rather than in number of messages. It controls how many bytes of data to collect, per partition, before sending messages to the Kafka broker. A larger value generally helps throughput, so set it as high as makes sense without exceeding available memory. The default value is 16384 bytes.
However, a larger batch might never get full. In that case the producer still sends the data eventually, on the basis of other triggers such as the linger time in milliseconds. Setting the batch size too high can waste memory, because the producer allocates a buffer of the specified size for each batch, but it does not by itself add latency.
Moreover, if our producer is sending all the time, we are probably getting the best throughput possible. If the producer is often idle, we might not be writing enough data to warrant the current allocation of resources.
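To make the byte-based limit concrete, here is a small sketch that counts how many batches a stream of records would fill for a single partition. The function count_batches is hypothetical, and the model ignores per-record and per-batch overhead, so real numbers will differ somewhat.

```python
def count_batches(record_sizes, batch_size=16384):
    """Count the batches needed for records of the given sizes (in bytes)."""
    batches = 0
    used = 0  # bytes already placed in the current batch
    for size in record_sizes:
        # Close the current batch if the next record would not fit.
        if used > 0 and used + size > batch_size:
            batches += 1
            used = 0
        used += size
    if used > 0:
        batches += 1  # the last, possibly partial, batch
    return batches


records = [1024] * 64  # 64 records of 1 KiB each
print(count_batches(records, batch_size=16384))  # 4
print(count_batches(records, batch_size=65536))  # 1
```

With the same data, a batch size four times larger means a quarter of the batches, which is where the throughput gain comes from.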
linger.ms sets the maximum time to buffer data in asynchronous mode. For example, a setting of 100 collects up to 100 ms worth of messages to send at once. The buffering adds message delivery latency, but it improves throughput. A batch that reaches batch.size is sent immediately, regardless of this setting.
With linger.ms set to 0, which has long been the default, the producer does not wait: it sends the buffer any time data is available.
Alternatively, we can set linger.ms to 5 and send more messages in one batch, rather than sending immediately.
This reduces the number of requests sent, but it adds up to 5 milliseconds of latency to records sent, even when the load on the system is too light to fill a batch.
So, to trade higher latency for higher throughput in our producer, increase linger.ms.
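The trade-off can be sketched with a simplified simulation of one partition. The function simulate and its parameters are hypothetical; the model assumes a batch is sent either when it is full or when the linger time has elapsed since its first record, and it ignores network and broker time.

```python
def simulate(arrival_ms, linger_ms, batch_bytes=16384, record_bytes=100):
    """Return (requests_sent, max_added_wait_ms) for the given arrival times."""
    requests = 0
    max_wait = 0
    batch_start = None  # arrival time of the first record in the open batch
    used = 0            # bytes in the open batch
    for t in arrival_ms:
        # The linger time ran out before this record arrived: send the batch.
        if batch_start is not None and t - batch_start >= linger_ms:
            requests += 1
            max_wait = max(max_wait, linger_ms)
            batch_start, used = None, 0
        if batch_start is None:
            batch_start = t
        used += record_bytes
        # A full batch is sent immediately, without waiting for the linger time.
        if used >= batch_bytes:
            requests += 1
            max_wait = max(max_wait, t - batch_start)
            batch_start, used = None, 0
    if batch_start is not None:
        # The last open batch goes out when its linger time expires.
        requests += 1
        max_wait = max(max_wait, linger_ms)
    return requests, max_wait


arrivals = list(range(20))  # one small record per millisecond
print(simulate(arrivals, linger_ms=0))  # (20, 0)
print(simulate(arrivals, linger_ms=5))  # (4, 5)
```

In this light-load scenario, a 5 ms linger cuts the request count from 20 to 4 at the cost of up to 5 ms of extra waiting per record.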
Conclusion
To improve the performance of a Kafka producer, you need to configure the batch.size and linger.ms properties.
batch.size – The upper limit, specified in bytes, on how much data the Kafka producer will attempt to collect into one batch before sending.
linger.ms – How long the producer will wait before sending, in order to allow more messages to accumulate in the same batch.
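As a rough illustration, the two settings might be collected like this. The values are examples rather than recommendations, and the broker address is a placeholder. The keys are the standard producer property names; how they are passed in depends on the client library (some Python clients accept a dict like this one, but check your client's documentation).

```python
# Illustrative producer settings; tune the numbers for your own workload.
producer_config = {
    "bootstrap.servers": "localhost:9092",  # assumption: placeholder broker address
    "batch.size": 32768,  # bytes per batch; the default is 16384
    "linger.ms": 5,       # wait up to 5 ms for more records before sending
}

for key, value in sorted(producer_config.items()):
    print(f"{key}={value}")
```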
It is also good to configure the compression.type and acks properties.
acks - The acknowledgment that the producer gets from a Kafka broker to confirm that the message has been successfully committed. The acks setting is the number of acknowledgments the producer requires before it considers a send successful.
There are 3 different levels of acks.
- acks = 0: the producer sends the message but doesn't wait for any ack from the broker. It's the real "fire and forget". It provides the highest throughput, but the producer cannot tell whether a message was lost.
- acks = 1: the producer waits for the ack. This ack is sent by the broker that hosts the leader replica, as soon as the leader has written the message.
- acks = -1 (also written as acks = all): the producer waits for the ack. It is sent by the leader broker as above, but only after the message has been replicated to all in-sync replica followers on the other brokers. Of course, in this case throughput tends to decrease as you increase the replication factor, because the message needs to be copied by more brokers before the leader returns the ack to the producer. The related min.insync.replicas setting defines the minimum number of in-sync replicas that must be available for such a write to be accepted.
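The three levels can be summarized in a toy decision function. This is a simplified model for illustration only: leader_response is a hypothetical name, and the in-sync replica count is assumed to include the leader.

```python
def leader_response(acks, in_sync_replicas, min_insync_replicas=1):
    """Describe what the producer observes for one write under each acks level."""
    if acks == 0:
        # Fire and forget: the producer never waits for a response.
        return "not awaited"
    if acks == 1:
        # The leader answers as soon as it has written the record itself.
        return "acked by leader"
    if acks in (-1, "all"):
        # The write is refused when too few replicas are in sync.
        if in_sync_replicas < min_insync_replicas:
            return "rejected: not enough in-sync replicas"
        return f"acked after {in_sync_replicas} in-sync replicas have the record"
    raise ValueError(f"unsupported acks value: {acks!r}")


print(leader_response(0, in_sync_replicas=3))
print(leader_response(1, in_sync_replicas=3))
print(leader_response("all", in_sync_replicas=3, min_insync_replicas=2))
print(leader_response("all", in_sync_replicas=1, min_insync_replicas=2))
```

The last call shows the durability side of the trade-off: with acks = all, a write fails rather than being accepted by too few replicas.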
compression.type - The compression type for the data the producer sends. By default, compression is not enabled in the Kafka producer. Compression enables faster transfer not only from producer to broker but also during replication. In exchange for some CPU time, it helps throughput and disk utilization, and it can lower latency when the network is the bottleneck.
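Because the producer compresses whole batches, batching and compression reinforce each other. The sketch below shows the effect using Python's zlib as a stand-in for a real codec; the record contents are made up for the example, and Kafka's actual batch format and codecs will give different numbers.

```python
import json
import zlib

# Made-up, repetitive records, similar in shape to typical event data.
records = [
    json.dumps({"user_id": i, "event": "page_view", "path": "/products/list"}).encode()
    for i in range(200)
]

raw_bytes = sum(len(r) for r in records)
# Compressing each record separately: little redundancy to exploit.
one_by_one = sum(len(zlib.compress(r)) for r in records)
# Compressing all records together, as happens with a full batch.
as_batch = len(zlib.compress(b"".join(records)))

print("raw:", raw_bytes)
print("compressed one by one:", one_by_one)
print("compressed as one batch:", as_batch)
```

The batch compresses far better than the individual records because the repeated field names and values are shared across the whole batch.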
When to do it?
- If records are arriving faster than the Kafka producer can send them.
- If you have a huge amount of data going to your topic and it is a real burden on your Kafka producer.
- If the producer has become a bottleneck.