Class SourceTask

java.lang.Object
org.apache.kafka.connect.source.SourceTask
All Implemented Interfaces:
Task

public abstract class SourceTask extends Object implements Task
SourceTask is a Task that pulls records from another system for storage in Kafka.
  • Field Details

    • TRANSACTION_BOUNDARY_CONFIG

      public static final String TRANSACTION_BOUNDARY_CONFIG
      The configuration key that determines how source tasks will define transaction boundaries when exactly-once support is enabled.
    • context

      protected SourceTaskContext context
  • Constructor Details

    • SourceTask

      public SourceTask()
  • Method Details

    • initialize

      public void initialize(SourceTaskContext context)
      Initialize this SourceTask with the specified context object.
    • start

      public abstract void start(Map<String,String> props)
      Start the Task. This should handle any configuration parsing and one-time setup of the task.
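
      The configuration parsing described for start might be sketched as follows. This is an illustration only: the class is a plain stand-in that does not extend SourceTask, so it compiles without the Kafka Connect dependency, and the keys "topic" and "batch.size" and the default of 100 are hypothetical. Real connectors commonly delegate this work to ConfigDef and AbstractConfig instead of parsing the map by hand.

```java
import java.util.Map;

// Hypothetical stand-in: in a real connector this logic would sit in an
// override of SourceTask.start(Map<String,String>).
class StartConfigSketch {
    // Illustrative keys and default; none of these are defined by Kafka Connect.
    static final String TOPIC_KEY = "topic";
    static final String BATCH_SIZE_KEY = "batch.size";
    static final int DEFAULT_BATCH_SIZE = 100;

    private String topic;
    private int batchSize;

    public void start(Map<String, String> props) {
        // Required setting: fail fast so a misconfigured task never starts polling.
        topic = props.get(TOPIC_KEY);
        if (topic == null || topic.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required setting: " + TOPIC_KEY);
        }

        // Optional setting: fall back to the default, reject non-numeric values.
        String raw = props.get(BATCH_SIZE_KEY);
        try {
            batchSize = (raw == null) ? DEFAULT_BATCH_SIZE : Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid " + BATCH_SIZE_KEY + ": " + raw, e);
        }
        if (batchSize <= 0) {
            throw new IllegalArgumentException(BATCH_SIZE_KEY + " must be positive: " + batchSize);
        }
    }

    String topic() { return topic; }

    int batchSize() { return batchSize; }
}
```

      Validating everything inside start keeps poll() free of configuration checks.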
      Specified by:
      start in interface Task
      Parameters:
      props - initial configuration
    • poll

      public abstract List<SourceRecord> poll() throws InterruptedException
      Poll this source task for new records. If no data is currently available, this method should block, but it should also return control to the caller regularly (by returning null) so that the task can transition to the PAUSED state when requested to do so.

      The task will be stopped on a separate thread, and when that happens this method is expected to unblock, quickly finish up any remaining processing, and return.
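
      The blocking contract above (wait for data, but hand control back regularly by returning null) might be sketched as follows. This is an illustration under stated assumptions, not the framework's implementation: the class is a plain stand-in that does not extend SourceTask, String stands in for SourceRecord, and the buffer, the enqueue helper, and the 100 ms bound are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in: a real task would extend SourceTask and return
// List<SourceRecord> from poll(). String stands in for SourceRecord here.
class BoundedPollSketch {
    // Assumed upper bound on how long a single poll() call may block.
    private static final long MAX_WAIT_MS = 100;

    // In a real task, a reader for the external system would fill this buffer.
    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();

    // Illustrative helper used to feed the buffer.
    public void enqueue(String record) {
        buffer.add(record);
    }

    public List<String> poll() throws InterruptedException {
        // Block only for a bounded time so the caller regains control regularly.
        String first = buffer.poll(MAX_WAIT_MS, TimeUnit.MILLISECONDS);
        if (first == null) {
            return null; // no data yet
        }
        // Data arrived: return it together with anything else already buffered.
        List<String> batch = new ArrayList<>();
        batch.add(first);
        buffer.drainTo(batch);
        return batch;
    }
}
```

      A short bound keeps the task responsive to pause and stop requests; a longer one reduces the number of empty calls.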

      Returns:
      a list of source records, or null if no records are currently available
      Throws:
      InterruptedException
    • commit

      public void commit() throws InterruptedException
      This method is invoked periodically when offsets are committed for this source task. Note that the offsets being committed won't necessarily correspond to the latest offsets returned by this source task via poll(). See also commitRecord(SourceRecord, RecordMetadata), which allows finer-grained tracking of records that have been successfully delivered.

      SourceTasks are not required to implement this functionality; Kafka Connect will record offsets automatically. This hook is provided for systems that also need to store offsets internally in their own system.

      Throws:
      InterruptedException
    • stop

      public abstract void stop()
      Signal this SourceTask to stop. In SourceTasks, this method only needs to signal to the task that it should stop trying to poll for new data and interrupt any outstanding poll() requests. The task is not required to have fully stopped when this method returns. Note that this method may be invoked from a different thread than poll() and commit().

      For example, if a task uses a Selector to receive data over the network, this method could set a flag that will force poll() to exit immediately and invoke wakeup() to interrupt any ongoing requests.
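
      The flag-plus-wakeup approach in the example above might be sketched with java.nio.channels.Selector as follows. This is an illustration under stated assumptions: the class is a plain stand-in that does not extend SourceTask, waitForData represents the blocking part of poll(), and the 30 second select timeout and the close helper are hypothetical. Reading from the selected channels is omitted.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in: a real task would extend SourceTask, and
// waitForData() would be the blocking part of its poll() implementation.
class SelectorStopSketch {
    // Assumed select timeout; long enough that only wakeup() ends the wait early.
    private static final long SELECT_TIMEOUT_MS = 30_000;

    // Written by stop() on one thread and read by the polling thread.
    private final AtomicBoolean stopping = new AtomicBoolean(false);
    private final Selector selector;

    SelectorStopSketch() throws IOException {
        selector = Selector.open();
    }

    // Returns true if channels are ready to read, false on timeout or stop.
    public boolean waitForData() throws IOException {
        if (stopping.get()) {
            return false; // stop was requested before we started waiting
        }
        int ready = selector.select(SELECT_TIMEOUT_MS); // returns early on wakeup()
        return !stopping.get() && ready > 0;
    }

    // Only signals: sets the flag, then interrupts any in-progress select().
    public void stop() {
        stopping.set(true);
        selector.wakeup();
    }

    // Illustrative cleanup, meant to be called once polling has finished.
    public void close() throws IOException {
        selector.close();
    }
}
```

      Setting the flag before calling wakeup() means a polling thread that wakes up always observes the stop request. If wakeup() runs while no select is in progress, the next select call returns immediately, so the signal is not lost.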

      Specified by:
      stop in interface Task
    • commitRecord

      @Deprecated public void commitRecord(SourceRecord record) throws InterruptedException

      Commit an individual SourceRecord when the callback from the producer client is received. This method is also called when a record is filtered by a transformation, and thus will never be ACK'd by a broker.

      This is an alias for commitRecord(SourceRecord, RecordMetadata) for backwards compatibility. The default implementation of commitRecord(SourceRecord, RecordMetadata) just calls this method. It is not necessary to override both methods.

      SourceTasks are not required to implement this functionality; Kafka Connect will record offsets automatically. This hook is provided for systems that also need to store offsets internally in their own system.

      Parameters:
      record - SourceRecord that was successfully sent via the producer or filtered by a transformation
      Throws:
      InterruptedException
    • commitRecord

      public void commitRecord(SourceRecord record, RecordMetadata metadata) throws InterruptedException

      Commit an individual SourceRecord when the callback from the producer client is received. This method is also called when a record is filtered by a transformation, or when the record fails to be produced and "errors.tolerance" is set to "all". In both of those cases the record will never be ACK'd by a broker, and metadata will be null.

      SourceTasks are not required to implement this functionality; Kafka Connect will record offsets automatically. This hook is provided for systems that also need to store offsets internally in their own system.

      The default implementation just calls commitRecord(SourceRecord), which is a no-op by default. It is not necessary to override both methods.
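
      For a source system that tracks acknowledgments itself, the per-record hook and the periodic commit() hook might be combined as sketched below. This is an illustration under stated assumptions: the class is a plain stand-in that does not extend SourceTask, a long position stands in for the SourceRecord, Object stands in for RecordMetadata, and UpstreamAcker is a hypothetical client for the source system. The sketch further assumes that the task numbers its records sequentially from 0 in the order it returns them from poll().

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in: a real task would extend SourceTask and override
// commitRecord(SourceRecord, RecordMetadata) and commit().
class AckTrackingSketch {
    // Hypothetical client for the source system; not a Kafka Connect type.
    interface UpstreamAcker {
        void acknowledgeUpTo(long position);
    }

    private final UpstreamAcker acker;
    private final Set<Long> handledOutOfOrder = new HashSet<>();
    private long nextExpected = 0; // lowest position not yet handled
    private long lastAcked = -1;   // highest position already acknowledged upstream

    AckTrackingSketch(UpstreamAcker acker) {
        this.acker = acker;
    }

    // Synchronized as a precaution, on the assumption that the two hooks may
    // run on different threads.
    public synchronized void commitRecord(long position, Object metadata) {
        // A null metadata means the record was filtered or dropped. It will never
        // reach a broker, so it still counts as handled for acknowledgment purposes.
        handledOutOfOrder.add(position);
        // Advance only over a gap-free prefix, so nothing unhandled is skipped.
        while (handledOutOfOrder.remove(nextExpected)) {
            nextExpected++;
        }
    }

    public synchronized void commit() {
        long ackable = nextExpected - 1;
        if (ackable > lastAcked) {
            acker.acknowledgeUpTo(ackable);
            lastAcked = ackable;
        }
    }
}
```

      Acknowledging only the contiguous prefix is a deliberately conservative choice: it never acknowledges a record whose callback has not arrived, even if callbacks for later records arrive first.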

      Parameters:
      record - SourceRecord that was successfully sent via the producer, filtered by a transformation, or dropped on producer exception
      metadata - the RecordMetadata returned from the broker, or null if the record was filtered or if producer exceptions are ignored
      Throws:
      InterruptedException