Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #6888: Run tests split by classes #7040

Merged
merged 18 commits into from
Jul 1, 2019
Merged

Conversation

ankita240796
Copy link
Contributor

@ankita240796 ankita240796 commented Jun 28, 2019

Explanation

Fixes #6888: Splits running backend tests by test classes instead of test files. This will help to speed up the backend tests hopefully (Currently the tests take 3 hours to run, so once backend tests finish running on this PR, we can check if splitting by classes is doing a help here)

Checklist

  • The PR title starts with "Fix #bugnum: ", followed by a short, clear summary of the changes. (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
  • The PR explanation includes the words "Fixes #bugnum: ..." (or "Fixes part of #bugnum" if the PR only partially fixes an issue).
  • The linter/Karma presubmit checks have passed.
    • These should run automatically, but if not, you can manually trigger them locally using python scripts/pre_commit_linter.py and bash scripts/run_frontend_tests.sh.
  • The PR is made from a branch that's not called "develop".
  • The PR has an appropriate "CHANGELOG: ..." label (If you are unsure of which label to add, ask the reviewers for guidance).
  • The PR follows the style guide.
  • The PR addresses the points mentioned in the codeowner checks for the files/folders changed. (See the codeowner's wiki page.)
  • The PR is assigned to an appropriate reviewer.
    • If you're a new contributor, please ask on Gitter for someone to assign a reviewer and don't tick this checkbox.
    • If you're not sure who the appropriate reviewer is, please assign to the issue's "owner" -- see the "talk-to" label on the issue. Do not only request the review but also add the reviewer as an assignee.

@ankita240796
Copy link
Contributor Author

Yes!!! I think this fixes the issue since it took only 30 minutes to complete the backend tests!

@apb7
Copy link
Contributor

apb7 commented Jun 28, 2019

Hi @ankita240796, this looks good but is there a way to actually reduce the running time of the test causing the issue? Backend tests usually used to take around 19 minutes on CircleCI. Thanks!

@codecov
Copy link

codecov bot commented Jun 28, 2019

Codecov Report

Merging #7040 into develop will decrease coverage by 0.09%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop    #7040     +/-   ##
==========================================
- Coverage    97.46%   97.37%   -0.1%     
==========================================
  Files          379      378      -1     
  Lines        63138    62835    -303     
==========================================
- Hits         61537    61185    -352     
- Misses        1601     1650     +49
Impacted Files Coverage Δ
core/domain/exp_jobs_one_off_test.py 100% <100%> (ø) ⬆️
extensions/answer_summarizers/models_test.py 99.15% <100%> (ø) ⬆️
core/jobs.py 85.28% <0%> (-14.72%) ⬇️
core/controllers/cron.py 35.55% <0%> (-0.71%) ⬇️
core/jobs_test.py 99.37% <0%> (-0.63%) ⬇️
core/tests/test_utils.py 96.63% <0%> (-0.2%) ⬇️
core/controllers/review_tests.py 100% <0%> (ø) ⬆️
core/storage/user/gae_models.py 100% <0%> (ø) ⬆️
core/storage/user/gae_models_test.py 100% <0%> (ø) ⬆️
core/controllers/review_tests_test.py 100% <0%> (ø) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0cc83ad...a2d0d7a. Read the comment docs.

@ankita240796
Copy link
Contributor Author

Hi @apb7, I don’t think we can reduce the time further. Since the other fix we can have is to split the test file into multiple test file one per storage model but that will basically be the same as running the test split by classes. The prod jobs test will take some increase in time since the jobs are being run for each test function.

Also, I am not sure why the coverage is being shown as decreased here since I haven’t made changes to the file which are being reported. Will merging develop fix that issue?

Thanks!

@apb7
Copy link
Contributor

apb7 commented Jun 28, 2019

Hi @apb7, I don’t think we can reduce the time further. Since the other fix we can have is to split the test file into multiple test file one per storage model but that will basically be the same as running the test split by classes. The prod jobs test will take some increase in time since the jobs are being run for each test function.

Okay, thanks for the explanation. This looks good then.

Also, I am not sure why the coverage is being shown as decreased here since I haven’t made changes to the file which are being reported. Will merging develop fix that issue?

Yes, that should help. /cc @lilithxxx

Thanks!

@ankita240796
Copy link
Contributor Author

Hi @apb7, I just checked the branch is already updated. It was naming issue in the test files. I have fixed that now, PTAL. Thanks!

@@ -2178,7 +2178,7 @@ def test_no_action_is_performed_for_deleted_exploration(self):
self, exp_jobs_one_off.TranslatorToVoiceArtistOneOffJob)


class TestDeleteStateIdMappingModelsOneOffJob(test_utils.GenericTestBase):
class DeleteStateIdMappingModelsOneOffJobTests(test_utils.GenericTestBase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are other files as well which follow this convention. Can you do a grep for class Test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I checked that already but they are mock classes or base classes and are not being used in running the test.

Copy link
Contributor

@apb7 apb7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @ankita240796!

@apb7 apb7 assigned brianrodri and DubeySandeep and unassigned apb7 Jun 28, 2019
Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @ankita240796! Left some notes, PTAL.

@@ -211,7 +211,24 @@ def _get_all_test_targets(test_path=None, include_load_tests=True):
"""
def _convert_to_test_target(path):
"""Remove the .py suffix and replace all slashes with periods."""
return os.path.relpath(path, os.getcwd())[:-3].replace('/', '.')
classes = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class_names

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -211,7 +211,24 @@ def _get_all_test_targets(test_path=None, include_load_tests=True):
"""
def _convert_to_test_target(path):
"""Remove the .py suffix and replace all slashes with periods."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the docstring with the correct args, returns, etc.

Please also change the names of the function / args / variables to reflect what this method is now doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for line in lines:
if line.startswith('class'):
start_index = line.find(' ') + 1
end_index = line.find('Tests(')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fragile. It just takes someone to not name their class properly -- and then the test won't run, but no one will notice it.

Instead, could you do a check based on what the class is a subclass of? You might be able to use stuff like inspect.getmro (grep for that in the codebase) to do this more semantically, but I think the correct definition of test classes is "is a subclass of the base test class", not "is named XXX".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually tried using importlib to import module from the test_target_path but I get errors there as follows:

Checking if node.js is installed in /home/ankita/Desktop/opensrc/oppia_tools
Checking whether Google App Engine is installed in /home/ankita/Desktop/opensrc/oppia_tools/google_appengine_1.9.67/google_appengine
Checking whether google-cloud-sdk is installed in /home/ankita/Desktop/opensrc/oppia_tools/google-cloud-sdk-251.0.0/google-cloud-sdk
Traceback (most recent call last):
  File "scripts/backend_tests.py", line 395, in <module>
    main()
  File "scripts/backend_tests.py", line 290, in main
    include_load_tests=include_load_tests)
  File "scripts/backend_tests.py", line 246, in _get_all_test_targets
    _convert_to_test_target(os.path.join(base_path, root)))
  File "scripts/backend_tests.py", line 219, in _convert_to_test_target
    python_module = importlib.import_module(test_target_path)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/home/ankita/Desktop/opensrc/oppia/utils_test.py", line 23, in <module>
    from core.tests import test_utils
  File "/home/ankita/Desktop/opensrc/oppia/core/tests/test_utils.py", line 30, in <module>
    from core.domain import collection_services
  File "/home/ankita/Desktop/opensrc/oppia/core/domain/collection_services.py", line 33, in <module>
    from core.domain import exp_services
  File "/home/ankita/Desktop/opensrc/oppia/core/domain/exp_services.py", line 38, in <module>
    from core.domain import email_subscription_services
  File "/home/ankita/Desktop/opensrc/oppia/core/domain/email_subscription_services.py", line 19, in <module>
    from core.domain import email_manager
  File "/home/ankita/Desktop/opensrc/oppia/core/domain/email_manager.py", line 22, in <module>
    from core.domain import config_domain
  File "/home/ankita/Desktop/opensrc/oppia/core/domain/config_domain.py", line 22, in <module>
    import schema_utils
  File "/home/ankita/Desktop/opensrc/oppia/schema_utils.py", line 32, in <module>
    from core.domain import html_cleaner  # pylint: disable=relative-import
  File "/home/ankita/Desktop/opensrc/oppia/core/domain/html_cleaner.py", line 24, in <module>
    import bleach
ImportError: No module named bleach

Is there some path issues and do I need to insert path for the modules?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly -- see core/tests/gae_suite.py.

But if this becomes too tricky/messy, then perhaps a lexical analysis of the "Abc" in class XYZ(Abc) might be fine -- maybe ensure it is from a given set of strings. That's more fragile though, so better to avoid if possible, but it might be a good middle ground if it proves too hard to use getmro().

Copy link
Contributor Author

@ankita240796 ankita240796 Jun 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried adding the paths as in tests/gae_suite.py but I still get errors for modules being imported.

Also, for lexical analysis, the classes should inherit from a fixed class but in many files, we have new base classes and then test classes as a subclass of them. Would it be not better if we could have a test to ensure that all test class names should end in Test/Tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, for lexical analysis, the classes should inherit from a fixed class but in many files, we have new base classes and then test classes as a subclass of them. Would it be not better if we could have a test to ensure that all test class names should end in Test/Tests?

That could also work, but how would you do that? Such a test would still need to identify which classes are "test classes".

And yeah, that is the drawback of lexical analysis, we'd probably have to maintain a list of "allowed subclasses" or something, but that is indeed icky.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should use metaclasses to keep track of subclasses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually fixed now. I added paths similar to that in core/tests/gae_suite.py. I missed pylint path earlier since it was not present in gae_suite.py but test files like pylint_extensions_tests require pylint import as well and so adding that to path fixed the issue!

@seanlip
Copy link
Member

seanlip commented Jun 28, 2019

(Also, yay on 3 hours --> 30 minutes!!)

@ankita240796
Copy link
Contributor Author

Hi @seanlip @brianrodri, I have addressed all the review comments. I am sorry for the delay here, I was actually travelling yesterday and was not able to fix this due to that. PTAL, Thanks!

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woohoo! Thanks @ankita240796 :) LGTM on the general approach, but I think you'll need to figure out why the backend tests are failing. OK to merge once you've got them to pass. Thanks!

@seanlip seanlip mentioned this pull request Jun 30, 2019
8 tasks
@ankita240796
Copy link
Contributor Author

ankita240796 commented Jun 30, 2019

Hi @seanlip, I am trying to check the coverage issue locally but I cannot reproduce it. Is it due to some cache issue on ci? Thanks!

@seanlip
Copy link
Member

seanlip commented Jun 30, 2019

TBH, I don't know, sorry...

/cc @apb7 @oppia/dev-workflow-team -- is there a way to clear the cache on CircleCI?

@seanlip
Copy link
Member

seanlip commented Jun 30, 2019

@ankita240796, one thought -- if you can open a separate PR with exactly the same changes as this one, and that PR passes backend tests on CircleCI, then I think I can merge this one. That might be a backup solution...though we should figure out the cache clearing if possible.

@ankita240796
Copy link
Contributor Author

Hi @YashJipkate, I remember you required clearing the ci cache in one of your PRs. If that's the case, can you please guide me here. Thanks!

@ankita240796
Copy link
Contributor Author

@ankita240796, one thought -- if you can open a separate PR with exactly the same changes as this one, and comment out the other travis lines except the backend tests, and that PR passes backend tests, then I think I can merge this one. That might be a backup solution...

Sure @seanlip, I will do that now. Thanks!

@anubhavsinha98
Copy link
Contributor

Hi @ankita240796, I am not sure maybe look you can look at https://support.circleci.com/hc/en-us/articles/115015426888-Clear-project-cache

@apb7
Copy link
Contributor

apb7 commented Jun 30, 2019

Hi @ankita240796 and @seanlip, I checked out the complete log. This could be caused by cache issues. Ankita, here are the instructions to clear cache on CircleCI. These can be done in this PR only. Ping me in case you have any doubts. Thanks!

@ankita240796
Copy link
Contributor Author

Hi @anubhavsinha98 @apb7, Thanks for the help. I have commented out the lines to clear the cache and we can wait to see if it works as expected.

@ankita240796
Copy link
Contributor Author

It still fails. So, I don't think this is a cache issue. I am trying to search if I can find a fix for this. Thanks!

OPPIA_TOOLS_DIR = os.path.join(CURR_DIR, '..', 'oppia_tools')
THIRD_PARTY_DIR = os.path.join(CURR_DIR, 'third_party')
DIRS_TO_ADD_TO_SYS_PATH = [
os.path.join(OPPIA_TOOLS_DIR, 'coverage-4.5.3', 'coverage', 'cmdline'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this directory cannot be found. I wonder if it is actually being downloaded and installed correctly? Does the backend test script run the setup/install-third-party scripts first?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could try doing a listdir to the console to see what's in the folder -- see whether coverage.cmdline exists at all and maybe cat its contents to the terminal.

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ankita240796! Let's get this in to unblock the build, and file an issue for immediate fixing that tries to do this more robustly.

@seanlip seanlip removed their assignment Jul 1, 2019
@seanlip
Copy link
Member

seanlip commented Jul 1, 2019

Merging since this passed all checks at ca06f27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run backend tests split on basis of test classes instead of files
6 participants