1. Overview
Seeking in Kafka is similar to locating stored data on a disk before reading. Before reading data from a partition, we must first seek to the correct position.
A Kafka consumer offset is a unique, steadily increasing number that marks the position of an event record in a partition. Each consumer in the group keeps its own offset for each partition to track progress.
Consumers may need to process messages at different positions in the partition for reasons such as replaying events or skipping to the latest message.
In this tutorial, let’s explore Spring Kafka API methods to retrieve messages at various positions within a partition.
2. Seek Using Java API
In most cases, a consumer reads messages from the beginning of a partition and continues to listen for new ones. However, there are situations where we might need to read from a specific position, time, or relative position.
Let’s explore an API that offers different endpoints to retrieve records from a partition by specifying an offset, or by reading from the beginning or the end.
2.1. Seek by Offset
The Kafka consumer that Spring Kafka creates for us provides a seek() method, which positions the reader at a given offset within a partition.
Let’s first explore seeking by offset within a partition by taking the partition and offset value:
@GetMapping("partition/{partition}/offset/{offset}")
public ResponseEntity<Response> getOneByPartitionAndOffset(@PathVariable("partition") int partition,
  @PathVariable("offset") long offset) {
    try (KafkaConsumer<String, String> consumer =
      (KafkaConsumer<String, String>) consumerFactory.createConsumer()) {
        TopicPartition topicPartition = new TopicPartition(TOPIC_NAME, partition);
        // Assign the partition manually, then move the position to the requested offset
        consumer.assign(Collections.singletonList(topicPartition));
        consumer.seek(topicPartition, offset);
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
        Iterator<ConsumerRecord<String, String>> recordIterator = records.iterator();
        // Return only the first record found at the requested position
        if (recordIterator.hasNext()) {
            ConsumerRecord<String, String> consumerRecord = recordIterator.next();
            Response response = new Response(consumerRecord.partition(),
              consumerRecord.offset(), consumerRecord.value());
            return new ResponseEntity<>(response, HttpStatus.OK);
        }
    }
    return new ResponseEntity<>(HttpStatus.NOT_FOUND);
}
Here the API exposes an endpoint partition/{partition}/offset/{offset}, which passes the topic, partition, and offset to the seek() method, positioning the consumer to retrieve messages at the specified location. The response model includes the partition, offset, and the message content:
public record Response(int partition, long offset, String value) { }
For simplicity, the API retrieves only one record at the specified position. However, we can modify it to retrieve all messages starting from that offset. Note that it doesn’t handle cases where the given offset is unavailable.
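One way to handle an unavailable offset is to compare the requested value with the partition's bounds before seeking. The sketch below is only an illustration: OffsetRangeCheck and isWithinLogRange() are hypothetical names that are not part of the tutorial's code, and in a real endpoint the two bounds would come from the consumer's beginningOffsets() and endOffsets() methods. It also assumes a partition without gaps, so an offset inside the range is not guaranteed to hold a record on compacted or transactional topics.

```java
public class OffsetRangeCheck {

    // Hypothetical helper: checks whether the requested offset lies between
    // the first retained offset (inclusive) and the end offset (exclusive).
    // The end offset is the offset that the next produced record would receive.
    public static boolean isWithinLogRange(long requestedOffset, long beginningOffset, long endOffset) {
        return requestedOffset >= beginningOffset && requestedOffset < endOffset;
    }

    public static void main(String[] args) {
        // Bounds matching the five test messages of this tutorial: offsets 0 to 4 exist
        System.out.println(isWithinLogRange(2, 0, 5));
        System.out.println(isWithinLogRange(5, 0, 5));
    }
}
```

Without such a check, seeking to an out-of-range offset usually makes the next poll() fall back to the consumer's auto.offset.reset policy, so the endpoint could return a record from a different offset than the one requested.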
To test this, as a first step, let’s add a method that runs before all tests, producing 5 simple messages into the specified topic:
@BeforeAll
static void beforeAll() {
    // set producer config for the broker
    testKafkaProducer = new KafkaProducer<>(props);
    int partition = 0;
    IntStream.range(0, 5)
      .forEach(m -> {
          String key = String.valueOf(new Random().nextInt());
          String value = "Message no : %s".formatted(m);
          ProducerRecord<String, String> record = new ProducerRecord<>(TOPIC_NAME,
            partition,
            key,
            value
          );
          try {
              testKafkaProducer.send(record).get();
          } catch (InterruptedException | ExecutionException e) {
              throw new RuntimeException(e);
          }
      });
}
Here, we configure the producer and send five messages to partition 0. Each message has a random key and the value "Message no : m", where m is an integer ranging from 0 to 4.
Next, let’s add a test which invokes the above endpoint by passing partition 0 and offset 2:
@Test
void givenKafkaBrokerExists_whenSeekByPartition_thenMessageShouldBeRetrieved() {
    this.webClient.get()
      .uri("/seek/api/v1/partition/0/offset/2")
      .exchange()
      .expectStatus()
      .isOk()
      .expectBody(String.class)
      .isEqualTo("{\"partition\":0,\"offset\":2,\"value\":\"Message no : 2\"}");
}
By invoking this API endpoint, we can see that the third message, located at offset 2, is received successfully.
2.2. Seek by Beginning
The seekToBeginning() method positions the consumer at the start of the partition, allowing it to retrieve messages starting from the first one.
Next, let’s add an endpoint that returns the first message at the beginning of the partition:
@GetMapping("partition/{partition}/beginning")
public ResponseEntity<Response> getOneByPartitionToBeginningOffset(@PathVariable("partition") int partition) {
    try (KafkaConsumer<String, String> consumer =
      (KafkaConsumer<String, String>) consumerFactory.createConsumer()) {
        TopicPartition topicPartition = new TopicPartition(TOPIC_NAME, partition);
        consumer.assign(Collections.singletonList(topicPartition));
        consumer.seekToBeginning(Collections.singleton(topicPartition));
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
        Iterator<ConsumerRecord<String, String>> recordIterator = records.iterator();
        if (recordIterator.hasNext()) {
            ConsumerRecord<String, String> consumerRecord = recordIterator.next();
            Response response = new Response(consumerRecord.partition(),
              consumerRecord.offset(), consumerRecord.value());
            return new ResponseEntity<>(response, HttpStatus.OK);
        }
    }
    return new ResponseEntity<>(HttpStatus.NOT_FOUND);
}
Here, the API provides the endpoint partition/{partition}/beginning, passing the topic and partition to the seekToBeginning() method. This positions the consumer to read messages from the start of the partition. The response includes the partition, offset, and message content.
Next, let’s add a test to retrieve the message at the beginning of partition 0. Note that the @BeforeAll section of the test ensures that the producer pushes five messages to the test topic:
@Test
void givenKafkaBrokerExists_whenSeekByBeginning_thenFirstMessageShouldBeRetrieved() {
    this.webClient.get()
      .uri("/seek/api/v1/partition/0/beginning")
      .exchange()
      .expectStatus()
      .isOk()
      .expectBody(String.class)
      .isEqualTo("{\"partition\":0,\"offset\":0,\"value\":\"Message no : 0\"}");
}
We can retrieve the first message stored at offset 0 by invoking this API endpoint.
2.3. Seek by End
The seekToEnd() method positions the consumer at the end of the partition, allowing it to retrieve any future messages that are appended.
Next, let’s create an endpoint that seeks the offset position at the end of the partition:
@GetMapping("partition/{partition}/end")
public ResponseEntity<Long> getOneByPartitionToEndOffset(@PathVariable("partition") int partition) {
    try (KafkaConsumer<String, String> consumer =
      (KafkaConsumer<String, String>) consumerFactory.createConsumer()) {
        TopicPartition topicPartition = new TopicPartition(TOPIC_NAME, partition);
        consumer.assign(Collections.singletonList(topicPartition));
        consumer.seekToEnd(Collections.singleton(topicPartition));
        return new ResponseEntity<>(consumer.position(topicPartition), HttpStatus.OK);
    }
}
This API offers the endpoint partition/{partition}/end, passing the topic and partition to the seekToEnd() method. This positions the consumer to read messages from the end of the partition.
Since there are no messages left to read after seeking to the end, this API instead returns the consumer’s current position within the partition. Let’s add a test to verify this:
@Test
void givenKafkaBrokerExists_whenSeekByEnd_thenLastMessageShouldBeRetrieved() {
    this.webClient.get()
      .uri("/seek/api/v1/partition/0/end")
      .exchange()
      .expectStatus()
      .isOk()
      .expectBody(Long.class)
      .isEqualTo(5L);
}
Using seekToEnd() moves the consumer to the offset where the next message would be written, placing it one position beyond the last available message. When we invoke this API endpoint, the response returns the last offset plus one, which is 5 for our five test messages.
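The relationship between the end position and the last record can be written as a small calculation. This is a minimal sketch under simplifying assumptions: LastRecordOffset and lastRecordOffset() are hypothetical names, and the partition is assumed to have no gaps (no compaction and no transaction markers), so the last record sits directly before the end position.

```java
import java.util.OptionalLong;

public class LastRecordOffset {

    // Hypothetical helper: derives the offset of the last record from the
    // partition bounds. An empty result means the partition holds no records.
    public static OptionalLong lastRecordOffset(long beginningOffset, long endOffset) {
        return endOffset > beginningOffset
          ? OptionalLong.of(endOffset - 1)
          : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // With the five test messages, the end position is 5 and the last record is at offset 4
        System.out.println(lastRecordOffset(0, 5).getAsLong());
    }
}
```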
2.4. Seek by Implementing ConsumerSeekAware Class
Besides reading messages at specific positions using consumer APIs, we can extend the AbstractConsumerSeekAware class in Spring Kafka. This class allows consumers to dynamically control the seeking in Kafka partitions. It offers methods for seeking specific offsets or timestamps during partition assignment, giving finer control over message consumption.
In addition to the seek methods above, AbstractConsumerSeekAware supports seeking to a specific timestamp and seeking to a relative position.
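To give an intuition for timestamp-based seeking, the Kafka consumer's offsetsForTimes() lookup resolves a timestamp to the earliest offset whose record timestamp is greater than or equal to it. The sketch below models only that rule in plain Java. TimestampSeekModel and offsetForTime() are hypothetical names, and the list index stands in for the offset, which assumes a partition that starts at offset 0 and has no gaps.

```java
import java.util.List;
import java.util.OptionalLong;

public class TimestampSeekModel {

    // Hypothetical model: returns the earliest offset whose timestamp is at or
    // after the target, or empty when every record is older than the target.
    public static OptionalLong offsetForTime(List<Long> recordTimestamps, long targetTimestamp) {
        for (int offset = 0; offset < recordTimestamps.size(); offset++) {
            if (recordTimestamps.get(offset) >= targetTimestamp) {
                return OptionalLong.of(offset);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Three records with made-up timestamps at offsets 0, 1, and 2
        List<Long> timestamps = List.of(100L, 200L, 300L);
        System.out.println(offsetForTime(timestamps, 150L));
    }
}
```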
Let’s explore the relative position seeking in this section:
void seekRelative(java.lang.String topic, int partition, long offset, boolean toCurrent)
The seekRelative() method in Spring Kafka allows consumers to seek to a position relative to the current offset, the beginning, or the end of a partition. Each parameter has a specific role:
- topic: The name of the Kafka topic from which to read messages
- partition: The partition number within the topic where the seek will occur
- offset: The number of positions to move relative to the reference point. This can be positive or negative
- toCurrent: A boolean value. If true, the method seeks relative to the current offset. If false, it seeks relative to the beginning of the partition when the offset is positive, and relative to the end of the partition when the offset is negative
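The way these parameters combine can be sketched as a small function. According to the Spring Kafka reference documentation, when toCurrent is false the sign of the offset selects the reference point: the beginning for a positive offset and the end for a negative one. The code below is a simplified model of that rule, not Spring Kafka's implementation. SeekRelativeModel and resolveTarget() are hypothetical names, and handling of out-of-range results is deliberately left out.

```java
public class SeekRelativeModel {

    // Hypothetical model of how seekRelative() picks its target offset.
    // endOffset is the position after the last record (last offset + 1).
    public static long resolveTarget(long beginningOffset, long endOffset, long currentPosition,
      long offset, boolean toCurrent) {
        long referencePoint;
        if (toCurrent) {
            referencePoint = currentPosition;   // rewind or fast-forward from the current position
        } else if (offset >= 0) {
            referencePoint = beginningOffset;   // non-negative offset: count from the beginning
        } else {
            referencePoint = endOffset;         // negative offset: count back from the end
        }
        return referencePoint + offset;
    }

    public static void main(String[] args) {
        // Five records at offsets 0 to 4: an offset of -1 with toCurrent = false resolves to 4
        System.out.println(resolveTarget(0, 5, 2, -1, false));
    }
}
```

With the five test messages of this tutorial, an offset of -1 and toCurrent set to false therefore point at offset 4, the last record in the partition.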
Let’s add a custom listener that seeks to the latest message within the partition by using the seekRelative() API:
@Component
class ConsumerListener extends AbstractConsumerSeekAware {

    public static final Map<String, String> MESSAGES = new HashMap<>();

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        assignments.keySet()
          .forEach(tp -> callback.seekRelative(tp.topic(), tp.partition(), -1, false));
    }

    @KafkaListener(id = "test-seek", topics = "test-seek-topic")
    public void listen(ConsumerRecord<String, String> in) {
        MESSAGES.put(in.key(), in.value());
    }
}
The seekRelative() method is called in the onPartitionsAssigned method to manually adjust the consumer’s position when it receives a partition assignment.
The offset value of -1 tells the consumer to move one position backward from the reference point. In this case, since toCurrent is false and the offset is negative, the consumer seeks relative to the end of the partition. This means the consumer moves one position back from the end, which places it on the last available message.
An in-memory hash map tracks the read messages for testing, storing the received messages as strings.
Finally, let’s add a test to verify that the system retrieves the message at offset 4 successfully by checking the map:
@Test
void givenKafkaBrokerExists_whenMessagesAreSent_ThenLastMessageShouldBeRetrieved() {
    Map<String, String> messages = ConsumerListener.MESSAGES;
    Assertions.assertEquals(1, messages.size());
    // The producer generates random keys, so we look the message up by its value
    Assertions.assertTrue(messages.containsValue("Message no : 4"));
}
The @BeforeAll section of the test ensures that the producer pushes 5 messages to the test topic. The seeking configuration successfully retrieves the last message in the partition.
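Because the listener consumes records on a container thread, the map may still be empty at the moment the test reads it. One way to make the check less timing-sensitive is to poll for the expected state before asserting. The helper below is a minimal sketch of that idea. WaitFor and waitFor() are hypothetical names that are not part of the tutorial's code, and the 50 ms polling interval is an arbitrary choice.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

public class WaitFor {

    // Hypothetical helper: polls the condition until it holds or the timeout
    // elapses, and returns whether the condition was eventually satisfied.
    public static boolean waitFor(BooleanSupplier condition, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting
                Thread.currentThread().interrupt();
                return false;
            }
        }
        // One last check in case the condition became true right at the deadline
        return condition.getAsBoolean();
    }
}
```

In the test above, we could call waitFor(() -> !ConsumerListener.MESSAGES.isEmpty(), Duration.ofSeconds(10)) before the assertions. Since two threads touch the map in that case, a ConcurrentHashMap would be a safer choice than a plain HashMap.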
3. Conclusion
In this tutorial, we explored how Kafka consumers can seek to specific positions in partitions using Spring Kafka.
We first examined seeking with consumer APIs, which is useful when precise control over the reading position in a partition is needed. This method works best for scenarios such as replaying events, skipping certain messages, or applying custom logic based on offsets.
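Next, we looked at seeking while using a listener, which is more suitable for continuously consuming messages. With this approach, offsets are committed automatically after the records are processed. Note that recent Spring Kafka versions set the enable.auto.commit property to false unless we configure it explicitly, so the listener container commits the offsets according to its AckMode instead of the Kafka client committing them at regular intervals.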
As always, the source code for the examples is available over on GitHub.