What is the use of __consumer_offsets and _schema topics in Kafka?

After setting up a Kafka broker cluster and creating a few topics, we found that the following two topics had been created automatically:

__consumer_offsets
_schema

What is the importance and use of these topics?

Solutions/Answers:

Solution 1:

__consumer_offsets stores the committed offsets for each topic partition, per consumer group (group ID).
It is a compacted topic, so older records are periodically cleaned up and only the latest committed offset for each key is retained.
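
As a rough illustration of what compaction means here, the following toy sketch keeps only the newest record per key. It is not Kafka's implementation; the key layout (group, topic, partition), the function name compact, and the sample data are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical key layout: (group id, topic, partition). Value: committed offset.
OffsetKey = Tuple[str, str, int]
OffsetRecord = Tuple[OffsetKey, int]


def compact(log: List[OffsetRecord]) -> List[OffsetRecord]:
    """Keep only the most recent record for each key, preserving log order."""
    last_index: Dict[OffsetKey, int] = {}
    for i, (key, _) in enumerate(log):
        last_index[key] = i  # a later record replaces the earlier position
    return [rec for i, rec in enumerate(log) if last_index[rec[0]] == i]


# Made-up commits: the "billing" group commits three times for orders/0.
offsets_log = [
    (("billing", "orders", 0), 10),
    (("billing", "orders", 1), 7),
    (("billing", "orders", 0), 25),
    (("audit", "orders", 0), 3),
    (("billing", "orders", 0), 40),
]
compacted = compact(offsets_log)
```

After compaction, only one record per (group, topic, partition) remains, which is why a consumer group that restarts can still find its last committed position without the topic growing forever.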

_schemas is not a default Kafka topic (at least in Kafka 0.8 and 0.9). It is created by Confluent's Schema Registry. See: Confluent Schema Registry, github.com/confluentinc/schema-registry (thanks @serejja)

Solution 2:

__consumer_offsets: Every consumer group maintains its offset per topic partition. Since v0.9, the committed offsets for every consumer group are stored in this internal topic (prior to v0.9 this information was stored in ZooKeeper). When the offset manager receives an OffsetCommitRequest, it appends the request to a special compacted Kafka topic named __consumer_offsets. The offset manager sends a successful offset commit response to the consumer only after all replicas of the offsets topic have received the offsets.

_schemas: This is an internal topic used by the Schema Registry, which is a distributed storage layer for Avro schemas. All schema-related information (schemas, subjects with their versions, metadata, and compatibility configuration) is appended to this topic. The Schema Registry both produces to this topic (e.g. when a new schema is registered under a subject) and consumes from it.
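
The produce-then-consume pattern described above can be pictured as replaying a log to rebuild an in-memory view. The sketch below is only a toy model under that assumption: the record shape ((subject, version), schema text), the function name replay, and the sample subjects are invented for illustration and are much simpler than the records the real Schema Registry writes.

```python
from typing import Dict, List, Tuple

# Simplified, hypothetical record: ((subject, version), schema text).
SchemaRecord = Tuple[Tuple[str, int], str]


def replay(records: List[SchemaRecord]) -> Dict[str, Dict[int, str]]:
    """Rebuild a subject -> {version: schema} view by reading the log in order."""
    view: Dict[str, Dict[int, str]] = {}
    for (subject, version), schema in records:
        view.setdefault(subject, {})[version] = schema
    return view


# Made-up registrations appended to the toy log.
schemas_log = [
    (("orders-value", 1), '{"type": "string"}'),
    (("users-value", 1), '{"type": "long"}'),
    (("orders-value", 2), '{"type": "bytes"}'),
]
view = replay(schemas_log)
latest_orders_version = max(view["orders-value"])
```

Because the state lives in the topic, a registry instance that restarts can rebuild the same view by reading the topic from the beginning.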
