-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add automatic logging of e2e flakes, and automatic restarts #10654
Conversation
Hi, @nithusha21, this pull request does not have a "CHANGELOG: ..." label as mentioned in the PR checkbox list. Assigning @nithusha21to add the required label. PRs without this label will not be merged. If you are unsure of which label to add, please ask the reviewers for guidance. Thanks! |
Thanks @nithusha21! I don't think my review is required here -- so unassigning myself. |
@nithusha21 since we still have to fix the backend tests, can this be made a draft? |
We need to import `google.oauth2` after we import `python_utils` because importing `python_utils` sets `sys.path` as needed for `google.oauth2`. This commit replicates changes from oppia#11018 that fix the linter to recognize google.oauth2 as a third-party library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a pass and left a few comments. Most of the comments are related to code structure, if required we can resolve that comment after getting this PR merged as this PR will help every other contributors. Let me know if you have any question/suggestion. Thanks!
app_dev.yaml
Outdated
- third_party/python_libs/google/oauth2/ | ||
- third_party/python_libs/google/api_core/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are adding 4 new lib but removing 2 dirs here? Is that expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. The path was incorrect.
requirements.in
Outdated
@@ -27,3 +27,7 @@ requests-toolbelt==0.9.1 | |||
google-cloud-tasks==1.5.0 | |||
googleapis_common_protos==1.52.0 | |||
webapp2==3.0.0b1 | |||
google-auth==1.21.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: Nit: Should we add this in sorted order* (Though it's only sorted till line 22)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -30,10 +33,12 @@ graphy==1.0.0 # via -r requirements.in, googleappenginemapreduce | |||
grpc-google-iam-v1==0.12.3 # via google-cloud-tasks | |||
grpcio==1.32.0 # via google-api-core, googleapis-common-protos, grpc-google-iam-v1 | |||
html5lib==1.0.1 # via -r requirements.in | |||
httplib2==0.18.1 # via google-api-python-client, google-auth-httplib2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also remove this from the build container? (i.e, add this in -ski-file list of app.yaml)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
idna==2.10 # via requests | ||
mock==3.0.5 # via googleappenginemapreduce | ||
mox==0.5.3 # via googleappenginemapreduce | ||
mutagen==1.43.0 # via -r requirements.in | ||
oauthlib==3.1.0 # via requests-oauthlib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here and elsewhere*
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
os.getcwd(), 'third_party', 'python_libs') | ||
sys.path.insert(0, _PYTHON_LIBS_PATH) | ||
|
||
import googleapiclient.discovery # isort:skip pylint: disable=wrong-import-position, wrong-import-order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should move the google spreadsheet related code to a different file? (We can do it later if required but make sure to create an issue for the same!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #11038 to resolve all the below comments as I don't have time right now to perform this refactor. I will do that when I am back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating ths issue, I'm working on it and I'll create a separate PR.
""" | ||
sheet_id = os.getenv('FLAKY_E2E_TEST_SHEET_ID') | ||
flaky_tests_list = [] | ||
if sheet_id is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if sheet_id is None:
return []
restult = ....
[To avoid lots of code inside if branch!]
scripts/run_e2e_tests.py
Outdated
|
||
import googleapiclient.discovery # isort:skip pylint: disable=wrong-import-position, wrong-import-order | ||
|
||
MAXIMUM_RUNS = 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe MAX_RETRY_COUNT or something around that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
flaky_tests_list = [] | ||
google_auth_decode_password = os.getenv('GOOGLE_AUTH_DECODE_PASSWORD') | ||
if google_auth_decode_password is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move the code 647-660 to a function get_flkay_test_data_sheet?
"""Run tests, rerunning at most MAXIMUM_RUNS times if they flake.""" | ||
for _ in python_utils.RANGE(MAXIMUM_RUNS): | ||
flake_state = run_tests(args=args) | ||
if flake_state != 'flake': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we structure it this way:
get_flkay_test_data_sheet:
data = get_flkay_test_data_sheet()
return get_flaky_tests_data_from_sheets(data)
is_test_output_flaky(output):
logs = get_flkay_test_logs()
for log in logs:
if is_output_similar_to_log(output, log)
return True
return False
main:
return_code = 1
Loop max_tries:
output, return_code = run_test()
if is_test_output_faky(output):
update_falkey_test_count() [OR do it this inside the is_test_output_flkay() call]
print 'The test result looks flaky, rerunning the tests (%<attempt number>)'
else:
break
sys.exist(return_code)
[This is just to make the code structured and readable!]
Unassigning @DubeySandeep since the review is done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Unassigning @vojtechjelinek since they have already approved the PR. |
I'm seeing an import error again, but only on TravisCI:
|
This works on my local machine (or at least, it gets past the point on Travis where it's currently stuck). Per @kevintab95 I'm going to try closing and reopening this PR in order to (hopefully) break the cache (assuming caching is the issue). @U8NWXD, if this persists, maybe you could just test out (on a separate branch) what happens for a single Travis test, and maybe add logs as needed to get more insight into what's going on? |
Hi @nithusha21. Due to recent changes in the "develop" branch, this PR now has a merge conflict. Please follow this link if you need help resolving the conflict, so that the PR can be merged. Thanks! |
Hi @nithusha21. Due to recent changes in the "develop" branch, this PR now has a merge conflict. Please follow this link if you need help resolving the conflict, so that the PR can be merged. Thanks! |
Closing -- superseded by #11056. |
Overview
Essential Checklist
PR Pointers