-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding readme for s3 source [KCON-9] #376
Conversation
s3-source-connector/README.md
Outdated
@@ -1,6 +1,6 @@ | |||
# Aiven's S3 Source Connector for Apache Kafka | |||
|
|||
This is a source Apache Kafka Connect connector that stores Apache Kafka messages in an AWS S3 bucket. | |||
This is a source Apache Kafka Connect connector that stores AWS S3 bucket objects in Apache Kafka. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a source Apache Kafka Connect connector that stores AWS S3 bucket objects in Apache Kafka. | |
This is a source Apache Kafka Connect connector that adds AWS S3 bucket objects onto an Apache Kafka topic. |
|
||
Example object name : customertopic-00001-1734445664111.txt | ||
|
||
## Data Format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This header seems incorrect for what is below.
Is this just # Configuration ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, it is just configuration, but under it there are relevant sections.
And similar to readme in sink.
s3-source-connector/README.md
Outdated
### S3 Object Names | ||
|
||
S3 connector stores series of files in the specified bucket. | ||
Each object is named using pattern `[<aws.s3.prefix>]<topic>-<partition>-<start_offset>-<extension>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think we have it set yet, but there should be a configuration for setting object names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have rephrased it. But we are still following the pattern [<aws.s3.prefix>]--<start_offset>- right ?
s3-source-connector/README.md
Outdated
is deprecated and will be replaced with new one during a certain transition period (within 2-3 releases). Most of the | ||
> configuration parameters remain same. | ||
|
||
List of deprecated configuration parameters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets remove the deprecated config params and not support them in the S3 Source Connector so we dont have any legacy we need to clean up in the future.
s3-source-connector/README.md
Outdated
# Possible values 'none' or 'all'. Default being 'none' | ||
# If 'none', then any errors during data transformations or reading objects would cause the connector to fail and shutdown. | ||
# If 'all', then any errors during data transformations or reading objects would have no impact and the connector and | ||
# worker task will continue with next records. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just add errors are logged and ignored for 'all'
# A comma-separated list of topic partitions where the connector's offset storage reader | ||
# can read the stored offsets for those partitions. If not mentioned, s3 objects will be read again if | ||
# available in the bucket | ||
topic.partitions=1,2,3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seem incorrect for the source connector. We should just read all the objects (ideally just once) is this a sink only config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for source connector, and based on these partitions, the offset storage reader reads the map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments, I think this wont be perfect when we merge it there will be a few more updates as we get things done done for release.
Yea indeed. Pushed changes. We can always keep updating. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thank you for the updates.
KCON-9 Read me to configure aws s3 source connector