Additional configuration

You learned a few mandatory parameters at the beginning. The Kafka consumer has many more properties, and in most cases the defaults do not need to be changed. A few of them, however, can help you improve the performance and availability of your consumers:

  • enable.auto.commit: If this is set to true, the consumer automatically commits message offsets at a fixed interval, which you define with auto.commit.interval.ms. However, it is usually better to set it to false so that you control when offsets are committed. This helps you avoid both duplicate processing and missed data.
  • fetch.min.bytes: This is the minimum amount of data, in bytes, that the Kafka server must return for a fetch request. If less data than this is available, the server waits for enough data to accumulate before sending it to the consumer. Setting the value higher than the default of one byte increases server throughput but also increases the latency of the consumer application.
  • request.timeout.ms: This is the maximum amount of time that the consumer waits for a response to a request before resending it, or before failing the request once the maximum number of retries is reached.
  • auto.offset.reset: This property is used when the consumer does not have a valid offset for the partition it is reading from. It accepts the following values:
    • latest: The consumer starts reading from the latest message available in the partition at the time the consumer started.
    • earliest: The consumer starts reading from the beginning of the partition, which means that it reads all the data in the partition.
    • none: An exception is thrown to the consumer.
  • session.timeout.ms: The consumer sends heartbeats to the consumer group coordinator to signal that it is alive and to prevent a rebalance from being triggered. Heartbeats must arrive within the configured period of time. For example, if the timeout is set to 10 seconds, the consumer can wait up to 10 seconds before sending a heartbeat to the group coordinator; if it fails to do so, the group coordinator treats it as dead and triggers a rebalance.
  • max.partition.fetch.bytes: This is the maximum amount of data that the server returns per partition. The memory available to the consumer for ConsumerRecord objects must be larger than numberOfPartitions * valueSet, where valueSet is the configured value. This means that if we have 10 partitions and 1 consumer, and max.partition.fetch.bytes is set to 2 MB, then the consumer needs 10 * 2 = 20 MB for consumer records.
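The memory arithmetic for max.partition.fetch.bytes can be sketched in a few lines of plain Java. This is only an illustration of the calculation described above: the class and method names (FetchMemoryEstimate, worstCaseFetchBytes) are hypothetical and are not part of the Kafka client API.

```java
final class FetchMemoryEstimate {

    // Worst-case number of bytes one consumer may need to buffer:
    // every assigned partition can return up to max.partition.fetch.bytes.
    // Hypothetical helper for illustration, not a Kafka API.
    static long worstCaseFetchBytes(int assignedPartitions, long maxPartitionFetchBytes) {
        return Math.multiplyExact((long) assignedPartitions, maxPartitionFetchBytes);
    }

    public static void main(String[] args) {
        long twoMb = 2L * 1024 * 1024;                 // max.partition.fetch.bytes = 2 MB
        long bytes = worstCaseFetchBytes(10, twoMb);   // 10 partitions, 1 consumer
        System.out.println(bytes / (1024 * 1024) + " MB"); // prints "20 MB"
    }
}
```

If several consumers share the partitions, pass the number of partitions assigned to one consumer rather than the topic total.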

Remember that before setting max.partition.fetch.bytes, we must know how long the consumer takes to process the data; otherwise, the consumer may not be able to send heartbeats to the consumer group in time, and a rebalance will be triggered. The solution could be to increase the session timeout or to decrease the partition fetch size so that the consumer can process each batch quickly enough.
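As a rough sketch, the properties discussed in this section could be collected in a java.util.Properties object like this. All values are illustrative assumptions rather than recommendations; the broker address, group name, and the class name ConsumerConfigSketch are hypothetical. The keys are written as plain strings so that the example compiles without the Kafka client library.

```java
import java.util.Properties;

final class ConsumerConfigSketch {

    // Builds an example consumer configuration. Every value is an assumption
    // chosen for illustration; tune them for your own workload.
    static Properties consumerProperties() {
        Properties props = new Properties();

        // Mandatory parameters (placeholder values, assumed for this sketch).
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "demo-group");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // Commit offsets manually instead of on a timer.
        props.setProperty("enable.auto.commit", "false");
        // Wait for at least 1 KB per fetch: more throughput, more latency.
        props.setProperty("fetch.min.bytes", "1024");
        // Wait up to 30 seconds for a response to a request.
        props.setProperty("request.timeout.ms", "30000");
        // With no valid offset, start from the beginning of the partition.
        props.setProperty("auto.offset.reset", "earliest");
        // Coordinator treats the consumer as dead after 10 seconds of silence.
        props.setProperty("session.timeout.ms", "10000");
        // At most 2 MB returned per partition.
        props.setProperty("max.partition.fetch.bytes", String.valueOf(2 * 1024 * 1024));

        return props;
    }

    public static void main(String[] args) {
        // With the Kafka client on the classpath, this object could be passed
        // to the KafkaConsumer constructor; here we only list the settings.
        consumerProperties().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Keeping the settings in one method makes it easier to adjust session.timeout.ms and max.partition.fetch.bytes together when processing time changes.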