Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Skip functions/layers pointing to S3 location in providers #2756

Merged
merged 12 commits into from
Mar 30, 2021

Conversation

aahung
Copy link
Contributor

@aahung aahung commented Mar 22, 2021

Which issue(s) does this change fix?

Why is this change necessary?

Use the newly added integration test template file as example, this template should be totally buildable:

"tests/integration/testdata/buildcmd/template-with-s3-code.yaml"

AWSTemplateFormatVersion : '2010-09-09'
Transform: AWS::Serverless-2016-10-31


Resources:
  IgnoreServerlessFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.8
      CodeUri: s3://some/path
      Timeout: 600

  IgnoreServerlessFunction2:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.8
      CodeUri:
        Bucket: some-bucket
        Key: some-key
      Timeout: 600

  IgnoreLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      Handler: app.handler
      Runtime: python3.8
      Code: s3://some/path
      Timeout: 600

  IgnoreLambdaFunction2:
    Type: AWS::Lambda::Function
    Properties:
      Handler: app.handler
      Runtime: python3.8
      Code:
        S3Bucket: some-bucket
        S3Key: some-key
      Timeout: 600

  IgnoreServerlessLayer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      ContentUri: s3://some/path
      CompatibleRuntimes:
        - python3.8

  IgnoreServerlessLayer2:
    Type: AWS::Serverless::LayerVersion
    Properties:
      ContentUri:
        Bucket: some-bucket
        Key: some-key
      CompatibleRuntimes:
        - python3.8

  IgnoreLambdaLayer:
    Type: AWS::Lambda::LayerVersion
    Properties:
      Content: s3://some/path
      CompatibleRuntimes:
        - python3.8

  IgnoreLambdaLayer2:
    Type: AWS::Lambda::LayerVersion
    Properties:
      Content:
        S3Bucket: some-bucket
        S3Key: some-key
      CompatibleRuntimes:
        - python3.8

  # the two functions below should still be built even all layers are not available
  ServerlessFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: main.handler
      Runtime: python3.8
      CodeUri: Python
      Timeout: 600
      Layers:
        - !Ref IgnoreServerlessLayer
        - !Ref IgnoreServerlessLayer2
        - !Ref IgnoreLambdaLayer
        - !Ref IgnoreLambdaLayer2

  LambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      Handler: main.handler
      Runtime: python3.8
      Code: Python
      Timeout: 600
      Layers:
        - !Ref IgnoreServerlessLayer
        - !Ref IgnoreServerlessLayer2
        - !Ref IgnoreLambdaLayer
        - !Ref IgnoreLambdaLayer2

However, currently sam will quit with this error because it will still try to build functions with S3 uri:

$ sam build --use-container --skip-pull-image --cached --parallel -t tests/integration/testdata/buildcmd/template-with-s3-code.yaml

...

Building codeuri: /Users/~/Projects/aws-sam-cli/tests/integration/testdata/buildcmd runtime: python3.8 metadata: {} functions: ['IgnoreServerlessFunction', 'IgnoreServerlessFunction2', 'IgnoreLambdaFunction2']
Requested to skip pulling images ...

Mounting /Users/xinhol/Projects/aws-sam-cli/tests/integration/testdata/buildcmd as /tmp/samcli/source:ro,delegated inside runtime container

Build Failed
Running PythonPipBuilder:ResolveDependencies
Error: PythonPipBuilder:ResolveDependencies - Requirements file not found: /tmp/samcli/source/requirements.txt

How does it address the issue?

When Code, CodeUri, Content, or ContentUri has string value like s3://... or dict value like {"Bucket":.. or {"S3Bucket..., it will be ignored in providers.

Also, I used a single _extract_codeuri() to replace both _extract_lambda_function_code() and _extract_sam_function_codeuri(), which are not consistent regarding the behaviors (one print out the warning while the other does not).

What side effects does this change have?

N/A

Checklist

  • Add input/output type hints to new functions/methods
  • Write design document (Do I need to write a design document?)
  • Write unit tests
  • Write/update functional tests
  • Write/update integration tests
  • make pr passes
  • make update-reproducible-reqs if dependencies were changed
  • Write documentation

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@aahung aahung force-pushed the skip-s3-in-providers branch 3 times, most recently from da68d27 to 7ab4bb2 Compare March 22, 2021 23:26
@aahung aahung marked this pull request as ready for review March 22, 2021 23:34
@aahung aahung requested a review from sriram-mv March 22, 2021 23:34
@aahung aahung changed the title fix: Skip functions/layers pointing to s3 location in providers fix: Skip functions/layers pointing to S3 location in providers Mar 22, 2021
@aahung aahung requested review from jfuss and mndeveci March 22, 2021 23:34
@aahung aahung force-pushed the skip-s3-in-providers branch from 7ab4bb2 to c1b072c Compare March 22, 2021 23:36
Comment on lines +399 to +408
@staticmethod
def _locate_layer_from_ref(
stack: Stack, layer: Dict, use_raw_codeuri: bool = False, ignore_code_extraction_warnings: bool = False
) -> Optional[LayerVersion]:
Copy link
Contributor Author

@aahung aahung Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain: pylint complains about _parse_layer_info() being too complex, here I extract some logic to _locate_layer_from_ref()

Comment on lines -140 to -146
LOG.warning(
"Lambda function '%s' has specified S3 location for CodeUri which is unsupported. "
"Using default value of '%s' instead",
name,
SamBaseProvider.DEFAULT_CODEURI,
)
return SamBaseProvider.DEFAULT_CODEURI
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain: now the warning is being handled by SamFunctionProvider and SamLayerProviders.

@aahung aahung force-pushed the skip-s3-in-providers branch from c1b072c to 76395ec Compare March 23, 2021 00:10
@@ -27,6 +27,13 @@ class SamBaseProvider:
CLOUDFORMATION_STACK = AWS_CLOUDFORMATION_STACK
DEFAULT_CODEURI = "."

CODE_PROPERTY_KEYS = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sam package already has a bunch of these constants for package-able code paths, would it make sense to refactor that, so thats its usable across the code base?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea

Copy link
Contributor Author

@aahung aahung Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have another PR ready for this purpose (coming right after this PR is merged)

codeuri = SamBaseProvider._extract_codeuri(layer_properties, code_property_key)

if codeuri and not use_raw_codeuri:
LOG.debug("--base-dir is presented not, adjusting uri %s relative to %s", codeuri, stack.location)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some grammar issue on this sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

"requirements.txt",
}

# @pytest.mark.flaky(reruns=3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment that might need to be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again 🤦

)

overrides = self._make_parameter_override_arg({})
self._verify_invoke_built_functions(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The template is buildable because s3 based locations are explicitly skipped, what does the local invoke do here?

Copy link
Contributor Author

@aahung aahung Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. I don't think functions containing these remote layers should be local invocable at all. What do you think, disallow their invocation? @sriram-mv in #2760 I tried to prevent these functions from being invoked, but after discussing with Mehmet, that change might be risky. So I removed it. these two functions are still invocable so we should still test invoke them.

Copy link
Contributor Author

@aahung aahung Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented this integ test base in #2760, will rebase this PR once #2760 is merged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe only test building? I feel that testing invoke might be unnecessary here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no strong opinion here, the invoke is added based on the current sam-cli behavior, we might not need this in the future. here removed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the invoke just skip? what is our current behavior? could we add a comment on what the behavior we are trying to test. This will be useful for future deep dives.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented!

@aahung aahung requested review from CoshUS and qingchm March 24, 2021 21:57
@aahung aahung force-pushed the skip-s3-in-providers branch from b425b03 to f13ad32 Compare March 24, 2021 23:15
Copy link
Contributor

@qingchm qingchm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open question: should we add any special handling in local invoke/start api if the template contains s3 locations? I mean, any warnings?

@aahung
Copy link
Contributor Author

aahung commented Mar 25, 2021

Open question: should we add any special handling in local invoke/start api if the template contains s3 locations? I mean, any warnings?

good question. I think we should. As right now, customers will see a universal warning "AWS::Serverless::LayerVersion XXX has specified S3 location for ContentUri which is unsupported." I think it is good enough to raise the concern.

But for other cases (like #2760 pass-up example), we don't have the warning yet, but we have LOG.debug in place.

else:
LOG.debug(
'layer "%s" is not recognizable, '
"it might be using intrinsic functions that we don't support yet. Skipping.",
str(layer),

@qingchm
Copy link
Contributor

qingchm commented Mar 25, 2021

#2760

To me the easier solution here would be during the build time if we have the is_s3 function returned true, we can log a message warning about local invoke might fail. The more complex approach would be to add the message at invoke time so that they know why the invoke didn't work. For me personally I prefer the first option (kind of aligns with the layer behaviour), but as I said this is an open question so more opinions are strongly welcomed

codeuri = resource_properties.get(code_property_key, SamBaseProvider.DEFAULT_CODEURI)

if isinstance(codeuri, dict):
return SamBaseProvider.DEFAULT_CODEURI

return cast(str, codeuri)

@staticmethod
def _is_code_uri_s3(codeuri: Optional[Any]) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: isn't Optional[Any] == Any?

Copy link
Contributor

@qingchm qingchm Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can codeuri be anything other than None, dict, str? Shouldn't we use something like Optional[Union[str, Dict]]?

Copy link
Contributor Author

@aahung aahung Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: isn't Optional[Any] == Any?

yeah right.

I changed to Optional[Union[str, Dict]]

)

@staticmethod
def _warn_code_extraction(resource_type: str, resource_name: str, code_property: str) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels slightly weird to have the logging specifically refactored out. Doesn't seem very meaningful on its own.

Would it make sense to (e.g.) instead have a method whose sole purpose is to check whether a resource uses an S3-based CodeUri? Something like that should also allow simplifying the 3 very similar use cases in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do something like we did in #2691 (we had similar discussion before).

Copy link
Contributor

@hoffa hoffa Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah so I think the whole flow in these methods feels weird as we're essentially dealing with two very similar abstractions (AWS::Serverless::Function and AWS::Lambda::Function) with vaguely-defined operations (e.g. methods that deal with both abstractions) but still keep repeating similar logic for both.

Not in this scope so I'm happy with this for now, but ideally I'm thinking we would have a well-defined abstraction to deal with functions, e.g. by wrapping them or performing a translation beforehand so we only deal with a subset of well-defined resource types with well-defined operations, so that we don't need all this if and isinstance ceremony scattered across the codebase.

@aahung
Copy link
Contributor Author

aahung commented Mar 25, 2021

#2760

To me the easier solution here would be during the build time if we have the is_s3 function returned true, we can log a message warning about local invoke might fail. The more complex approach would be to add the message at invoke time so that they know why the invoke didn't work. For me personally I prefer the first option (kind of aligns with the layer behaviour), but as I said this is an open question so more opinions are strongly welcomed

That's a good point. For this PR, it is a bit out of the scope.

@staticmethod
def _warn_code_extraction(resource_type: str, resource_name: str, code_property: str) -> None:
LOG.warning(
"%s '%s' has specified S3 location for %s which is unsupported.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This warning feels a bit ambiguous here. Shouldn't we specify the scope of this "unsupported"?

Base automatically changed from develop to elbayaaa-develop March 25, 2021 21:28
Base automatically changed from elbayaaa-develop to develop March 25, 2021 21:38
@@ -129,6 +129,14 @@ def _extract_functions(
if resource_metadata:
resource_properties["Metadata"] = resource_metadata

if resource_type in [SamFunctionProvider.SERVERLESS_FUNCTION, SamFunctionProvider.LAMBDA_FUNCTION]:
code_property_key = SamBaseProvider.CODE_PROPERTY_KEYS[resource_type]
if SamBaseProvider._is_code_uri_s3(resource_properties.get(code_property_key)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we rename this method? _is_code_uri_s3, we explicitly take in the key to look up in the function. it could be CodeUri, ContentUri, Code, Content.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. Renamed to

@staticmethod
def _is_s3_location(location: Optional[Union[str, Dict]]) -> bool:
    """
    the input could be:
    - CodeUri of Serverless::Function
    - Code of Lambda::Function
    - ContentUri of Serverless::LayerVersion
    - Content of Lambda::LayerVersion
    """
    return (isinstance(location, dict) and ("S3Bucket" in location or "Bucket" in location)) or (
        isinstance(location, str) and location.startswith("s3://")
    )

@aahung aahung merged commit fd25093 into aws:develop Mar 30, 2021
moelasmar pushed a commit to moelasmar/aws-sam-cli that referenced this pull request Jul 1, 2021
…2756)

* refactor: Extract logics into _locate_layer_from_ref()

* providers no longer extract functions/layers with s3 uri
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants