-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x-pack/filebeat/input/awss3: relax queue_url constraints for non-standard endpoints #35520
Conversation
Pinging @elastic/security-external-integrations (Team:Security-External Integrations) |
if err != nil { | ||
return fmt.Errorf("invalid queue_url: %w", err) | ||
} | ||
if strings.HasSuffix(u.Host, ".amazonaws.com") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other AWS domains besides just amazonaws.com
that you should include. See https://github.com/elastic/beats/blob/main/x-pack/filebeat/input/awss3/input.go#L350
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I was looking for a list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is complicated because of all the ways in which a user can set the AWS region. I think those are:
default_region
in input configAWS_DEFAULT_REGION
env varAWS_REGION
env var- extracted from the
queue_url
region_name
in input config (proposed)
My personal preference in behavior is to require an explicit region name to be configured either through env (which is part of the AWS SDK) or via the existing default_region config option. This would have the most clear semantics IMO -- you must set it or else it won't work and there's no guessing about the region being used. But our own SDK config wrapper injects a "default default" region of us-east-1 so we can't detect if the user did not set a region.
But we can't change that behavior without a breaking change, so if we make region_name
take precedence over queue_url parsing I still that will be clear.
One thing I want to be sure of here is that we have tested this before merging to ensure that it works to pull data from Localstack. @bhapas can help with that step.
@@ -78,6 +79,13 @@ func (c *config) Validate() error { | |||
return fmt.Errorf("number_of_workers <%v> must be greater than 0", c.NumberOfWorkers) | |||
} | |||
|
|||
if c.QueueURL != "" { | |||
region, _ := getRegionFromQueueURL(c.QueueURL, c.AWSConfig.Endpoint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to have this validation or could we relax it? As a user, I think the semantics of explicitly setting a region name would be my preference in my config. (It's just a shame that we cannot rely on the AWS SDK config for this because it injects a default region so we can't tell if the user explicitly configured it).
This will steer users that are having trouble with their combination of queue_url and endpoint toward the region_name which is documented as for testing purposes. Not sure this is bad but I wanted to call it out. A common problem is that users get an error from getRegionFromQueueURL until they learn that they need to set endpoint
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy to relax this if we emit a log warning, though I don't think that is possible at this point, but is in input.go with some additional cost per start. WDYT?
@@ -308,9 +311,11 @@ func getRegionFromQueueURL(queueURL string, endpoint string) (string, error) { | |||
return "", fmt.Errorf(queueURL + " is not a valid URL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a unit test for this function. It has zero test coverage. When it was added it had some, but I must have lost them in a refactoring. Can you please add back the test case from 7b729da.
@@ -257,6 +257,12 @@ configuring multiline options. | |||
|
|||
URL of the AWS SQS queue that messages will be received from. (Required when `bucket_arn` and `non_aws_bucket_name` are not set). | |||
|
|||
[float] | |||
==== `region_name` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using region
is more consistent with everything in the AWS realm.
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Before merging, @bhapas can you please test that this works with Localstack based SQS and S3.
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This pull request is now in conflicts. Could you fix it? 🙏
|
/test |
This comment was marked as outdated.
This comment was marked as outdated.
Currently, the queue_url option is used to obtain the region name for the SQS receiver. This prevents pointing the input at other sources for testing, so add a region_name option to provide an alternative way to provide the region name for non-AWS endpoints. To avoid confusion, prevent using the region_name option in conjunction with amazonaws.com endpoints and document this.
Also improve comprehensability of test cases.
Make getRegionFromQueueURL aware of configured region name, this means that it will return either a non-zero region name or a non-nil error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What does this PR do?
Currently, the queue_url option is used to obtain the region name for the SQS receiver. This prevents pointing the input at other sources for testing, so add a region_name option to provide an alternative way to provide the region name for non-AWS endpoints.
To avoid confusion, log a warning if the user configures the region option in conjunction with amazonaws.com endpoints such that they disagree.
Why is it important?
This is needed for supporting tests of inputs using s3.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Author's Checklist
How to test this PR locally
Related issues
Use cases
Screenshots
Logs