public class VerifiableSinkTask extends SinkTask
VerifiableSourceTask
that consumes records and logs information about each to stdout. This
allows validation of processing of messages by sink tasks on distributed workers even in the face of worker restarts
and failures. This task relies on the offset management provided by the Kafka Connect framework and therefore can detect
bugs in its implementation.Modifier and Type | Field and Description |
---|---|
static String |
ID_CONFIG |
static String |
NAME_CONFIG |
context, TOPICS_CONFIG, TOPICS_REGEX_CONFIG
Constructor and Description |
---|
VerifiableSinkTask() |
Modifier and Type | Method and Description |
---|---|
void |
flush(Map<TopicPartition,OffsetAndMetadata> offsets)
Flush all records that have been
SinkTask.put(Collection) for the specified topic-partitions. |
void |
put(Collection<SinkRecord> records)
Put the records in the sink.
|
void |
start(Map<String,String> props)
Start the Task.
|
void |
stop()
Perform any cleanup to stop this task.
|
String |
version()
Get the version of this task.
|
close, initialize, onPartitionsAssigned, onPartitionsRevoked, open, preCommit
public static final String NAME_CONFIG
public static final String ID_CONFIG
public String version()
Task
Connector
class's version.public void start(Map<String,String> props)
SinkTask
public void put(Collection<SinkRecord> records)
SinkTask
SinkTask.flush(Map)
or SinkTask.preCommit(Map)
to ensure that offsets are only committed for records
that have been written to the downstream system (hence avoiding data loss during failures).
If this operation fails, the SinkTask may throw a RetriableException
to
indicate that the framework should attempt to retry the same call again. Other exceptions will cause the task to
be stopped immediately. SinkTaskContext.timeout(long)
can be used to set the maximum time before the
batch will be retried.
public void flush(Map<TopicPartition,OffsetAndMetadata> offsets)
SinkTask
SinkTask.put(Collection)
for the specified topic-partitions.flush
in class SinkTask
offsets
- the current offset state as of the last call to SinkTask.put(Collection)
, provided for
convenience but could also be determined by tracking all offsets included in the
SinkRecord
s passed to SinkTask.put(java.util.Collection<org.apache.kafka.connect.sink.SinkRecord>)
. Note that the topic, partition and offset
here correspond to the original Kafka topic partition and offset, before any
transformations
have been applied. These can be tracked by the task
through the SinkRecord.originalTopic()
, SinkRecord.originalKafkaPartition()
and SinkRecord.originalKafkaOffset()
methods.public void stop()
SinkTask
SinkTask.put(Collection)
has returned) and a final SinkTask.flush(Map)
and offset
commit has completed. Implementations of this method should only need to perform final cleanup operations, such
as closing network connections to the sink system.