Multi-threaded Apache Kafka Consumer
Why do we need multi-thread consumer model?
Suppose we implement a notification module which allow users to subscribe for notifications from other users, other applications.
Our module reads messages which will be written by other users, applications to a Kafka clusters. In this case, we can get all notifications of the others written to a Kafka topic and our module will create a consumer to subscribe to that topic.
Everything seems to be fine at the beginning. However, what will happen if the number of notifications produced by other applications, users is increased fast and exceed the rate that can be processed by our module?
All the messages/notifications that haven’t been processed by our module, are still in the Kafka topic. However, things get more danger when the number of messages is too much. Some of them will be lost when the retention policy is met (Note that Kafka retention policy can be time-based, partition size-based, key-based). And more important, when our notification module falls very far behind the income notifications/messages, it is not a true “notification” module anymore.
It’s time to think about the multi-thread consumer model.
Multi-threaded Apache Kafka consumer model
There are 2 possible models.
- Multiple consumers with their own threads - model 1
- Single consumer, multiple worker processing threads - model 2
Comments
Post a Comment