What happens if Kafka files are deleted?

This is definitely not the way to do it, and it should probably be handled by the cleanup policy, but that's not the point. Let's imagine the files in log.dirs have been deleted. What's the impact?
Would the broker crash?
Would the offsets start over at 0 after restarting the service?
Would anything need to be done to fix it?

Solutions/Answers:

Solution 1:

If you delete the files from log.dirs, the data is gone but the topic still exists in the ZooKeeper metadata. The broker won't crash (at least not in the Kafka 1.1.1 test shown below). Once you restart the brokers, they will see the topic as empty and you can produce new data to it.

If you also delete the topic from the ZooKeeper metadata, the topic is removed from the broker as well.

To check the offsets, you can use the command below:

// Before deleting the log.dirs directory for topic 'test1'
kafka_2.12-1.1.1 % bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test1
test1:0:6

// After deleting the directory and restarting the broker
kafka_2.12-1.1.1 % bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test1
test1:0:0
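
To make the before/after comparison less manual, here is a minimal sketch, assuming each GetOffsetShell run above was redirected to a file (for example before.txt and after.txt, both hypothetical names). It relies only on the topic:partition:offset line format shown in the output above; offsets_regressed is a helper name made up for this example, not part of Kafka.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved GetOffsetShell outputs
# (one "topic:partition:offset" line per partition) and print every
# partition whose latest offset is lower in the second file.
offsets_regressed() {
    before_file=$1
    after_file=$2
    awk -F: '
        # First file: remember the offset of each topic:partition.
        NR == FNR { before[$1 ":" $2] = $3; next }
        # Second file: report partitions whose offset went backwards.
        {
            key = $1 ":" $2
            if ((key in before) && ($3 + 0 < before[key] + 0))
                printf "%s went from %s to %s\n", key, before[key], $3
        }
    ' "$before_file" "$after_file"
}
```

With the two outputs shown above saved to before.txt and after.txt, calling offsets_regressed before.txt after.txt would print "test1:0 went from 6 to 0". Partitions whose offset stayed the same or grew are not reported.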

Solution 2:

In fact, it depends on how many brokers you have in your cluster, and on how many of them you delete the files from at the same time. Fortunately, if you delete the files from one broker in a 3-broker cluster and your topics have a replication factor of 3, you will not lose anything: the files will be recreated on that broker from the remaining replicas.
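
One way to watch that recovery is to ask Kafka which partitions are still under-replicated after the broker comes back. The sketch below is a thin wrapper around kafka-topics.sh as shipped with the 1.1.x releases used in this thread. The script path, the ZooKeeper address localhost:2181, and the helper name under_replicated are assumptions for illustration, so adjust them to your installation.

```shell
#!/bin/sh
# Assumed defaults; override them in the environment if your layout differs.
KAFKA_TOPICS=${KAFKA_TOPICS:-bin/kafka-topics.sh}
ZOOKEEPER=${ZOOKEEPER:-localhost:2181}

# Hypothetical helper: list the partitions of a topic whose in-sync
# replica set is currently smaller than its full replica list.
under_replicated() {
    "$KAFKA_TOPICS" --zookeeper "$ZOOKEEPER" --describe \
        --under-replicated-partitions --topic "$1"
}
```

Running under_replicated test1 right after restarting the emptied broker should list the affected partitions. Once the broker has copied the data back from the other replicas, the same call should print nothing.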
