-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix part of #1366: Cron job to keep history of dashboard statistics #2173
Fix part of #1366: Cron job to keep history of dashboard statistics #2173
Conversation
@wxyxinyu @seanlip @AllanYangZhou I have tried to implement a cron job that keeps a copy of weekly statistics of a creator. But I'm not sure how to test this (both manually + writing tests). Do you have any ideas about how could I proceed with testing ? UPDATE: |
Current coverage is 48.01%@@ develop #2173 diff @@
==========================================
Files 185 185
Lines 15829 15857 +28
Methods 2717 2720 +3
Messages 0 0
Branches 2688 2692 +4
==========================================
+ Hits 7609 7613 +4
- Misses 8220 8244 +24
Partials 0 0
|
else: | ||
return None | ||
|
||
def save_last_dashboard_stats(user_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can pass in the entire list of users, and then run the update over all of them, and do put_multi() to store all of them at once. That way you don't need to call get() and put() for every user (which will take a long time).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think it makes more sense to create a mapreduce job to do this instead. (Do a one-off job, not a continuous job, and let the cron start the job). It might not be possible to actually load all the users to memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a one-off mapreduce job. But I couldn't find any classmethod which can be used to kick off this job from cron.py
[start_compurtation()
fulfills the very same purpose, but only for continuous jobs]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: 2 lines above top-level function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Would you also need to get the total number of plays in this cron job? (We were going to experiment with changing the change in rating to change in number of ratings.) Also, how did you test this manually? Just checking, because I don't find any tests for cron jobs either. I think testing the get/set functions for the data should be fine, especially since your cron has very little logic inside the get(). |
"""Save statistics for creator dashboard of a user by appending to a list | ||
keyed by datetime.utcnow().date(). | ||
|
||
This method is called through a cron job which is run weekly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function should know (and assume) nothing about the cron job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed reference about cron job.
@wxyxinyu I can add tests for number of plays as well. Just to confirm - it will require that I stop the job and re-start it, right ? UPDATE: |
def get(self): | ||
"""Handles GET requests.""" | ||
user_jobs_one_off_test.DashboardStatsOneOffJobTests.run_one_off_job() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@seanlip I don't think this would work. Couldn't find any classmethod that can be used here to kick off this one-off job.
[start_computation
fulfills the same purpose, but only for continuous jobs!]
Any suggestions ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have existing one-off jobs in the codebase. How do they work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that these are started by pressing a button on the admin page.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I knew that these can be manually tested through the admin panel. But my attempts of testing this manually falied. It kept throwing a wierd error -
No module named webtest
Not sure why this error was being thrown. I tried restarting the server a few times but it did not seem to go away. So I decided to commit without testing this line, and thus added a line comment here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, got it
user_jobs_one_off.DashboardStatsOneOffJob.enqueue(
user_jobs_one_off.DashboardStatsOneOffJob.create_new())
works!
@@ -86,6 +87,15 @@ def get(self): | |||
email_services.send_mail_to_admin(email_subject, email_message) | |||
|
|||
|
|||
class AppendLastDashboardStatsToList(base.BaseHandler): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CronDashboardStatsHandler? (Follow existing naming conventions, unless there's a good reason not to.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay. Renamed
average_ratings=average_ratings).put() | ||
is_average_ratings = True | ||
else: | ||
is_average_ratings = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if you set average_ratings as None here? Then you don't need the is_average_ratings variable or the branching below in line 440.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This thought had crossed my mind during writing this code. But I went for creating a separate flag for this instead)
Done.
PTAL |
Please review. |
average_ratings = None | ||
|
||
if not mr_model: | ||
if average_ratings: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you still have the branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a cleaner way to do this is to follow the get_or_create()
pattern in other models (for example, core.storage.activity.ActivityReferencesModel). Then, you just call get_or_create, and update the model you get, so you don't need to branch on the if not mr_model
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used get_or_create
Hi @526avijitgupta, did a full review -- PTAL. Thanks! |
LGTM. Thanks @526avijitgupta! @wxyxinyu do you have any other comments? If not I think this is mergeable once the travis checks pass. |
"""Save statistics for creator dashboard of a user by appending to a list | ||
keyed by a datetime string. | ||
""" | ||
model = user_models.UserStatsModel.get(user_id, strict=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use get_or_create() here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I just have one really small thing, and this will be good to merge. |
@wxyxinyu Addressed the review comment. Should be ready to merge now I think! |
…2173) * Cron job for recording creator stats * Fixes * Write getter and tests for weekly dasboard stats * Minor changes to model * Fix initial round of review comments * Fix call to one-off job through cron * Removed extra whiteline * Fix accidental line deletion * Partially addressed review comments * Add sample test for multiple weeks * Add test to check format of datetime string * Add schema version to user dashboard stats * Lint fix * Fix brackets around conditional * Fix continuous job and modify tests accordingly * Address review comments * Create new stats model if not already present * Address review comments * Use get_or_create for model * Reused get_or_create in method
…ics (oppia#2173) * Cron job for recording creator stats * Fixes * Write getter and tests for weekly dasboard stats * Minor changes to model * Fix initial round of review comments * Fix call to one-off job through cron * Removed extra whiteline * Fix accidental line deletion * Partially addressed review comments * Add sample test for multiple weeks * Add test to check format of datetime string * Add schema version to user dashboard stats * Lint fix * Fix brackets around conditional * Fix continuous job and modify tests accordingly * Address review comments * Create new stats model if not already present * Address review comments * Use get_or_create for model * Reused get_or_create in method
Tasks to be completed as a part of this PR: