Class VerifiableSinkTask

java.lang.Object
org.apache.kafka.connect.sink.SinkTask
org.apache.kafka.connect.tools.VerifiableSinkTask
All Implemented Interfaces:
Task

public class VerifiableSinkTask extends SinkTask
Counterpart to VerifiableSourceTask that consumes records and logs information about each to stdout. This allows validation of processing of messages by sink tasks on distributed workers even in the face of worker restarts and failures. This task relies on the offset management provided by the Kafka Connect framework and therefore can detect bugs in its implementation.
  • Field Details

  • Constructor Details

    • VerifiableSinkTask

      public VerifiableSinkTask()
  • Method Details

    • version

      public String version()
      Description copied from interface: Task
      Get the version of this task. Usually this should be the same as the corresponding Connector class's version.
      Returns:
      the version, formatted as a String
    • start

      public void start(Map<String,String> props)
      Description copied from class: SinkTask
      Start the Task. This should handle any configuration parsing and one-time setup of the task.
      Specified by:
      start in interface Task
      Specified by:
      start in class SinkTask
      Parameters:
      props - initial configuration
    • put

      public void put(Collection<SinkRecord> records)
      Description copied from class: SinkTask
      Put the records in the sink. This should either write them to the downstream system or batch them for later writing. If this method returns before the records are written to the downstream system, the task must implement SinkTask.flush(Map) or SinkTask.preCommit(Map) to ensure that offsets are only committed for records that have been written to the downstream system (hence avoiding data loss during failures).

      If this operation fails, the SinkTask may throw a RetriableException to indicate that the framework should attempt to retry the same call again. Other exceptions will cause the task to be stopped immediately. SinkTaskContext.timeout(long) can be used to set the maximum time before the batch will be retried.

      Specified by:
      put in class SinkTask
      Parameters:
      records - the collection of records to send
    • flush

      public void flush(Map<TopicPartition,OffsetAndMetadata> offsets)
      Description copied from class: SinkTask
      Flush all records that have been SinkTask.put(Collection) for the specified topic-partitions.
      Overrides:
      flush in class SinkTask
      Parameters:
      offsets - the current offset state as of the last call to SinkTask.put(Collection), provided for convenience but could also be determined by tracking all offsets included in the SinkRecords passed to SinkTask.put(java.util.Collection<org.apache.kafka.connect.sink.SinkRecord>). Note that the topic, partition and offset here correspond to the original Kafka topic partition and offset, before any transformations have been applied. These can be tracked by the task through the SinkRecord.originalTopic(), SinkRecord.originalKafkaPartition() and SinkRecord.originalKafkaOffset() methods.
    • stop

      public void stop()
      Description copied from class: SinkTask
      Perform any cleanup to stop this task. In SinkTasks, this method is invoked only once outstanding calls to other methods have completed (e.g., SinkTask.put(Collection) has returned) and a final SinkTask.flush(Map) and offset commit has completed. Implementations of this method should only need to perform final cleanup operations, such as closing network connections to the sink system.
      Specified by:
      stop in interface Task
      Specified by:
      stop in class SinkTask