RFI-File-Monitor

The Official Manual

Ceph S3 Bucket Monitor

Purpose

Use this engine to monitor the contents of an S3 bucket for new files. The bucket must be hosted on a Ceph cluster, as this engine relies extensively on its API for configuring and handling the bucket events.

Ceph can send bucket events to HTTP, AMQP 0.9.1 and Kafka push-endpoints. Currently this engine only supports AMQP 0.9.1, which is implemented by, for example, RabbitMQ and Apache Qpid.

This engine marks all files as Saved when passing them to the Queue manager.

Options

  • Ceph Endpoint: the URL of the Ceph endpoint to connect to. This must start with https://
  • Bucket Name: the name of the bucket that will be monitored for changes.
  • Access Key: the access key belonging to the RadosGW user that will be used by the engine.
  • Secret Key: the secret key that is associated with the access key.
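The https:// requirement on the Ceph Endpoint option can be checked before any connection is attempted. The following is a minimal sketch using only the Python standard library. The function name is hypothetical and this is not necessarily how the engine itself validates the value.

```python
from urllib.parse import urlsplit


def is_valid_ceph_endpoint(url: str) -> bool:
    """Return True if url uses the https scheme and names a host.

    Hypothetical helper for illustration only; the engine's own
    validation may differ.
    """
    parts = urlsplit(url)
    return parts.scheme == "https" and bool(parts.hostname)
```

For example, is_valid_ceph_endpoint("https://ceph.example.org") is accepted, while the same host with http:// or without a scheme is rejected. The host name here is a placeholder.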

The following options are available when clicking the Advanced Settings button:

  • Process existing files in bucket: turn this option on to add existing objects to the queue manager before starting the bucket monitor.
  • Allowed filename patterns: enter one or more filename patterns, e.g. *.txt, *.csv (always include the asterisk), to process only files that match. Any other file written to the bucket will be ignored.
  • Ignored filename patterns: enter one or more filename patterns, e.g. *.txt, *.csv (always include the asterisk), to exclude files that match. Any other file written to the bucket will be processed.
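The effect of the two pattern options can be illustrated with shell-style matching from the Python standard library. This is a sketch under stated assumptions, not the engine's actual code: the function name is hypothetical, patterns are assumed to be matched against the last component of the object key, and the allowed list is assumed to take precedence when both lists are given.

```python
from fnmatch import fnmatchcase


def should_process(key, allowed=(), ignored=()):
    """Decide whether an object key passes the filename filters.

    Assumptions (not confirmed by the manual): matching is done on the
    part of the key after the last '/', and a non-empty allowed list
    takes precedence over the ignored list.
    """
    name = key.rsplit("/", 1)[-1]
    if allowed:
        # Only names matching at least one allowed pattern get through.
        return any(fnmatchcase(name, pattern) for pattern in allowed)
    if ignored:
        # Names matching any ignored pattern are dropped.
        return not any(fnmatchcase(name, pattern) for pattern in ignored)
    # No filters configured: every object is processed.
    return True
```

With allowed=["*.txt", "*.csv"], the key data/run1.txt is processed and data/run1.log is skipped. With ignored=["*.tmp"], only keys ending in .tmp are skipped.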

Use the radio buttons to select the appropriate push-endpoint and provide appropriate values for the required parameters.

AMQP 0.9.1

  • Hostname: the hostname of the AMQP 0.9.1 broker.
  • Username: the username to use when establishing a connection with the broker.
  • Password: the password to use when establishing a connection with the broker.
  • Producer Port: the port on the broker instance that Ceph (the producer) will connect to.
  • Consumer Port: the port on the broker instance that this machine (the consumer) will connect to.
  • Consumer Use SSL: if enabled, the connection to the broker from the local machine will be SSL encrypted. Usually port 5671 is used when SSL is enabled.
  • CA certificate: if the broker SSL certificates have been self-signed, set this value to the path of a file containing the PEM certificate of the Certificate Authority.
  • Exchange: the exchange to be used on the broker. Note that Ceph currently requires all its connections to the same push-endpoint to use the same exchange!
  • Vhost: the vhost to use on the broker.
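To show how the producer-side options above fit together, the sketch below assembles an AMQP push-endpoint URI and a set of topic attributes of the kind Ceph accepts when a notification topic is created. The function names are hypothetical. The attribute names (push-endpoint, amqp-exchange, amqp-ack-level) follow the Ceph bucket notification documentation as far as I am aware, and the "broker" ack level is an assumed choice. Percent-encoding of the user name, password and vhost follows the common AMQP URI convention, so check that your Ceph version decodes it as expected.

```python
from urllib.parse import quote


def amqp_push_endpoint(hostname, username, password, producer_port, vhost="/"):
    """Build an amqp:// URI from the Hostname, Username, Password,
    Producer Port and Vhost options (hypothetical helper)."""
    user = quote(username, safe="")
    secret = quote(password, safe="")
    # The default vhost "/" becomes "%2F" under the AMQP URI convention.
    encoded_vhost = quote(vhost, safe="")
    return f"amqp://{user}:{secret}@{hostname}:{producer_port}/{encoded_vhost}"


def topic_attributes(push_endpoint, exchange):
    """Collect topic attributes for Ceph (names assumed from the Ceph
    documentation; the ack level is an illustrative assumption)."""
    return {
        "push-endpoint": push_endpoint,
        "amqp-exchange": exchange,
        "amqp-ack-level": "broker",
    }
```

A call such as topic_attributes(amqp_push_endpoint("broker.example.org", "ceph", "s3cret", 5672), "rfi-exchange") uses placeholder values throughout. Note that only the Producer Port appears in this URI. The Consumer Port, Consumer Use SSL and CA certificate options apply to the connection made from the local machine.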

Exported File Format

S3Object

Author

Tom Schoonjans