Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BVT for the agent #2719

Merged
merged 16 commits into from
Jan 6, 2023
Merged

Add BVT for the agent #2719

merged 16 commits into from
Jan 6, 2023

Conversation

narrieta
Copy link
Member

@narrieta narrieta commented Jan 3, 2023

This PR includes a new agent-bvt test that replaces the current BVT in DCR. There is still a extension I need to add (LAD), but I will do that on a separate PR. The current BVT also includes other tests that are targeted towards provisioning. Those will be added as a different test suite at a later stage.

The PR also includes multiple improvements in logging, reporting test results, organization of the tests, etc. It also defines some of the patterns the tests will follow. I documented those in the test wiki and will also point them out as comments in this PR.

@@ -56,7 +56,7 @@ jobs:
PYLINTOPTS: "--rcfile=ci/3.6.pylintrc --ignore=tests_e2e"

- python-version: 3.9
PYLINTOPTS: "--rcfile=ci/3.6.pylintrc --ignore=azure_models.py,BaseExtensionTestClass.py"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed those files, which were simple prototypes.

@@ -74,7 +74,8 @@ def run(agent_family, output_directory, log):
os.makedirs(bin_path)
log.info("Created {0} directory".format(target_path))

args = ["python3", "setup.py", "bdist_egg", "--dist-dir={0}".format(bin_path)]
Copy link
Member Author

@narrieta narrieta Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows invoking makepkg without cd'ing to the WALinuxAgent directory

return self._node


class AgentTestSuite(TestSuite):
Copy link
Member Author

@narrieta narrieta Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I derived AgentTestSuite fromLISA's TestSuite again, since it simplifies the calling pattern for tests and allows the use of some metadata. I moved all its data members into the AgentTestContext and replaced self._log with the global Logger.

"""
self._set_context(node)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I improved the way we report test results. See the latest test run for an example.

@@ -41,4 +41,4 @@ lisa \
--log_path "$HOME/logs" \
--working_path "$HOME/logs" \
-v subscription_id:"$SUBSCRIPTION_ID" \
-v identity_file:"$HOME/.ssh/id_rsa.pub"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have been the private key, not the public key

@@ -1,33 +0,0 @@
# Create a base class
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace by logging.py

@@ -1,135 +0,0 @@
import os
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer needed

try:
return operation()
except Exception as e:
# TODO: Do we need to retry on msrestazure.azure_exceptions.CloudError?
Copy link
Member Author

@narrieta narrieta Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the logic in the prototype was retrying on CloudError and HttpResponseError.

The latter, though, is raised in perfectly valid scenarios (e.g. trying to retrieve the instance view of an extension that not installed, or passing an incorrect value to an extension parameter) and should not be retried (maybe some of its derived classes should be retried).

So far I have not seen any instances of CloudError, so I just added a TODO note here.


#
# This module includes facilities to execute some operations on virtual machines and scale sets (list extensions, restart, etc).
#
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module has not been tested and is not used currently. Similar functionality was implemented in the prototype's azure_models.py, and I decided to port it as well, since for sure some tests will need it.

).wait(timeout=_TIMEOUT))


class VmssExtension(_VmExtensionBaseClass):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class has not been tested and is not used currently. Similar functionality was implemented in the prototype's azure_models.py, and I decided to port it as well, since for sure some tests will need it.

finally:
self._clean_up()

# Fail the entire test suite if any test failed
assert_that(failed).described_as("One or more tests failed").is_length(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we do anything with the test names we're appending to the failed list? It looks like we're just checking the length of the list here, should we change failed to a counter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message in the assertion failure includes the list of list of tests and the count, so we'll have a summary of the tests that failed.

Also, we output them as part of 'results' in the summary (we append failed tests both to 'failed' and 'results')

2023-01-03T00:36:02Z.676 [INFO] **************************************** [Test Results] ****************************************
2023-01-03T00:36:02Z.676 [INFO] 
2023-01-03T00:36:02Z.676 [INFO] 	[Passed] ExtensionOperationsBvt
2023-01-03T00:36:02Z.676 [INFO] 	[Failed] RunCommandBvt
2023-01-03T00:36:02Z.676 [INFO] 	[Passed] VmAccessBvt

def execute_script_on_node(self, script_path: Path, parameters: str = "", sudo: bool = False) -> int:
"""
Executes the given script on the test node; if 'sudo' is True, the script is executed using the sudo command.
"""
custom_script_builder = CustomScriptBuilder(script_path.parent, [script_path.name])
custom_script = self._node.tools[custom_script_builder]
custom_script = self.context.node.tools[custom_script_builder]

if parameters == '':
command_line = f"{script_path}"
else:
command_line = f"{script_path} {parameters}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check if parameters == ''? Can we get rid of the first case and just have an extra whitespace at the end of command_line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding the extra space was intentional. A minor detail, but later on we surround the command line with [] and the extra space looks odd.


class Paths:
# E1101: Instance of 'list' has no '_path' member (no-member)
DEFAULT_TEST_SOURCE_DIRECTORY = Path(tests_e2e.__path__._path[0]) # pylint: disable=E1101
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# pylint: disable=E1101
Are these comments suppressing pylint warnings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they are. I've been adding the warning as an extra comment one line above the problematic code. In this case the warning is "# E1101: Instance of 'list' has no '_path' member (no-member)"; somehow pylint seems to think path is a list (it is not)

console_handler = logging.StreamHandler()
log.addHandler(console_handler)

log .setLevel(logging.INFO)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra space here? log .setLevel(logging.INFO)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will remove it, thanks


log: logging.Logger = logging.getLogger("lisa")

if not log.hasHandlers():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is hasHandlers() doing here?

Copy link
Member Author

@narrieta narrieta Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handlers do the actual log output. We attempt to check whether we are running under lisa by doing "logging.getLogger("lisa")".

If the "lisa" logger does not exist (i.e. the test is being run from the command-line, not from a LISA Test Suite), getLogger() will return a new Logger with no handlers. In this case, we throw away that Logger and create a new one named "waagent" and then add a handler to output to stdout.

@@ -0,0 +1,53 @@
#!/usr/bin/env python3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will this module be used for/ what other utilities will we add to this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be used for shell-related functionality. At this point I do not anticipate other functionality that running a command.

Performs an enable operation on the extension.

NOTE: 'force_update' is not a parameter of the actual ARM API. It is provided for convenience: If set to True,
the 'force_update_tag' can be left unspecified and this method will generate a random tag.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the force_update_tag used for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the extension settings do not change, invoking the extension won't generate a new goal state (because the model has not changed). To work around that, CRP exposes the force_update_tag, which is an arbitrary string with no meaning other than providing a way to specify unique values on different calls to enable (our code uses GUIDs)

# BVT for RunCommand.
#
# Note that there are two incarnations of RunCommand (which are actually two different extensions):
# Microsoft.CPlat.Core.RunCommandHandlerLinux and Microsoft.OSTCExtensions.VMAccessForLinux. This
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change this
Microsoft.OSTCExtensions.VMAccessForLinux
to
Microsoft.OSTCExtensions.RunCommand

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will update this, thanks

maddieford
maddieford previously approved these changes Jan 4, 2023
self.stderr: str = stderr


def run_command(command: Any, shell=False) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it's retuning can we rename to run_command_output or something like that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think run_command is appropriate, the function is not different to other functions that return a value so adding "_output" is not needed.

result = f"[Failed] {test.__name__}"

log.info("")
log.exception("******************** %s\n", result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we log error msg too. I think test name is not enough for debugging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.exception logs the exception message and traceback

except SystemExit: # Bad arguments
pass
except: # pylint: disable=bare-except
log.exception("Test failed")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add error msg too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.exception does that

@codecov
Copy link

codecov bot commented Jan 5, 2023

Codecov Report

Merging #2719 (fe57b35) into develop (a3a41bd) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop    #2719   +/-   ##
========================================
  Coverage    71.96%   71.96%           
========================================
  Files          104      104           
  Lines        15765    15765           
  Branches      2244     2244           
========================================
  Hits         11345    11345           
  Misses        3909     3909           
  Partials       511      511           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@narrieta narrieta merged commit 2bd03c9 into Azure:develop Jan 6, 2023
@narrieta narrieta deleted the bvt branch January 6, 2023 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants