Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export schedule dispatch #5643

Merged
merged 17 commits into from
Sep 21, 2018
Merged

Export schedule dispatch #5643

merged 17 commits into from
Sep 21, 2018

Conversation

tjiang11
Copy link
Contributor

Sets up automatic backups utilizing the GCP import export service, as datastore admin backups are being deprecated.

@codecov-io
Copy link

codecov-io commented Sep 12, 2018

Codecov Report

Merging #5643 into develop will decrease coverage by 0.04%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5643      +/-   ##
===========================================
- Coverage    45.83%   45.79%   -0.05%     
===========================================
  Files          511      513       +2     
  Lines        29826    29872      +46     
  Branches      4514     4519       +5     
===========================================
+ Hits         13672    13679       +7     
- Misses       16154    16193      +39
Impacted Files Coverage Δ
...s_and_skills_dashboard/TopicsAndSkillsDashboard.js 9.52% <0%> (-4.43%) ⬇️
...ploration_editor/feedback_tab/ThreadDataService.js 20.38% <0%> (-0.37%) ⬇️
...topics_and_skills_dashboard/SkillsListDirective.js 2.32% <0%> (-0.31%) ⬇️
...v/head/pages/creator_dashboard/CreatorDashboard.js 20% <0%> (-0.26%) ⬇️
...itor/subtopics_editor/SubtopicsListTabDirective.js 0.96% <0%> (-0.18%) ⬇️
...d/pages/skill_editor/SkillEditorNavbarDirective.js 1.92% <0%> (-0.08%) ⬇️
...ts/top_navigation_bar/TopNavigationBarDirective.js 0.85% <0%> (-0.01%) ⬇️
...shboard/TopicsAndSkillsDashboardNavbarDirective.js 4.54% <0%> (ø) ⬆️
...es/topic_editor/questions/QuestionsTabDirective.js 0.86% <0%> (ø) ⬆️
...plates/dev/head/components/SkillCreationService.js 5.88% <0%> (ø) ⬆️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f112523...3a13290. Read the comment docs.

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tjiang11! Looks pretty good, left a few comments.

export/app.yaml Outdated
version: "latest"

handlers:
- url: /cloud-datastore-export
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this handle URLs with stuff after /cloud-datastore-export too? Like, it should 404 on /cloud-datastore-export/1?

Asking because I noticed such URL requests get sent here by dispatch.yaml but then don't seem to get handled here if I'm reading correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm actually I think maybe dispatch.yaml just shouldn't include anything after /cloud-datastore-export, I don't see why it would need to. I'll change dispatch.yaml

dispatch.yaml Outdated
@@ -0,0 +1,4 @@
dispatch:
# Send requests for exports to the cloud datastore admin service.
- url: "*/cloud-datastore-export/*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use underscores instead of hyphens, same as we do in other places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


output_url_prefix = self.request.get('output_url_prefix')
assert output_url_prefix and output_url_prefix.startswith('gs://')
if '/' not in output_url_prefix[5:]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does 5: come from? Maybe make the origin of the "5" more explicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick question, this code came from the tutorial on the online docs. I forgot to ask about accreditation, since when I tried to accredit it to Google in the copyright notice, I couldn't pass the lint checks.

Copy link
Member

@seanlip seanlip Sep 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, just leave a comment in the file with a link to the source.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment

}
headers = {
'Content-Type': 'application/json',
'Authorization': 'Bearer ' + access_token
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer interpolation: 'Bearer %s' % access_token

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

deadline=60,
headers=headers)
if result.status_code == httplib.OK:
logging.info(result.content)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's going to be lots of logs, maybe log a summary at the end of how many data items were successfully exported instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can you tell this is going to be a lot of logs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's one per request, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it just be one log then, from one request?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, you're right! I am not sure what I was thinking here, for some reason I thought there was a "for" loop. Please ignore this comment.

import webapp2


class Export(webapp2.RequestHandler):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExportToCloudDatastoreHandler

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


class Export(webapp2.RequestHandler):

def get(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guard this by a cron header check? See cron.py for how to do this -- @acl_decorators.can_perform_cron_tasks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having some trouble with this, when I try running the cron job it keeps saying it can't find the module core.domain? I tried putting an init.py in the export folder but it seems to say the same thing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I wouldn't have expected that. I wonder whether it's because app engine is regarding export/ as its own folder that's isolated from the world.

Perhaps take a look at pythonpath in that file to see what's in the pythonpath, and contrast it to what you get when logging pythonpath in a file in core?

If they truly are separated then it's fine to add a decorators.py file in export/ as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look into the pythonpath, and it does seem like it's different, since the first thing listed is referring to the cloud-datastore-admin app specifically. I went ahead and made a separate acl_decorators.

@tjiang11
Copy link
Contributor Author

Note that the current state does not properly work.

# Look for slash in the portion of the bucket URL that comes
# after 'gs://'. If not present, then only a bucket name has been
# provided and we append a trailing slash.
if '/' not in output_url_prefix[5:]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll still want to do something about the "5:". Maybe extract 'gs://' into a named constant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple more things, otherwise LGTM. Thanks!

export/app.yaml Outdated

libraries:
- name: webapp2
version: "latest"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a specific version. (Don't want it to change under us.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

import json
import logging

import oppia.export.acl_decorators as acl_decorators
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from oppia.export import acl_decorators

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@seanlip seanlip merged commit 701ea1e into develop Sep 21, 2018
@seanlip seanlip deleted the export-schedule-dispatch branch September 21, 2018 02:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants