Set up Observable notebook for data exploration #166

Closed
kallewesterling opened this issue Oct 11, 2022 · 9 comments

Comments

@kallewesterling
Collaborator

No description provided.

@kmcdono2
Member

kmcdono2 commented Oct 13, 2022

Starting things:

For small samples (when deciding which variations of the pipeline to use on a larger corpus of articles), and also for reviewing the output of any corpus:

  • info about the sample itself: # of titles, publication date span, # of articles, average OCR quality [replace with something actually useful]
  • how many toponyms total and of each type (location, building, street, etc.) + statistics about the confidence score for each
  • which toponyms could not be resolved, or have a confidence score below a certain threshold?
  • average # of toponyms per article per title (or other variables)
  • for all resolved toponyms, an interactive map (filterable by confidence score, or limited to one or more types of place)
  • spatial "center" of resolved toponyms (e.g. where do they cluster)
  • [how to also bring in OCR quality to sense-check/visualize results]
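
As a rough illustration of the summary statistics listed above, here is a minimal Python sketch. It is not the notebook's code: the record layout (`text`, `type`, `confidence`, `lat`, `lon`), the sample values, the `summarize` helper, and the 0.5 threshold are all assumptions made for the example. The "center" is a plain arithmetic mean of coordinates, which is only a reasonable approximation for toponyms clustered in a small region.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical toponym records; field names and values are illustrative only.
toponyms = [
    {"text": "Manchester", "type": "location", "confidence": 0.9, "lat": 53.48, "lon": -2.24},
    {"text": "Town Hall", "type": "building", "confidence": 0.4, "lat": 53.50, "lon": -2.20},
    {"text": "High Street", "type": "street", "confidence": None, "lat": None, "lon": None},
    {"text": "Leeds", "type": "location", "confidence": 0.7, "lat": 53.80, "lon": -1.55},
]


def summarize(records, threshold=0.5):
    """Summarize toponyms: counts and confidence stats per type,
    unresolved or low-confidence items, and a naive spatial center."""
    by_type = defaultdict(list)
    for record in records:
        by_type[record["type"]].append(record)

    type_stats = {}
    for place_type, items in by_type.items():
        scores = [r["confidence"] for r in items if r["confidence"] is not None]
        type_stats[place_type] = {
            "count": len(items),
            "mean_confidence": mean(scores) if scores else None,
            "median_confidence": median(scores) if scores else None,
        }

    # A toponym counts as resolved here only if it has coordinates.
    resolved = [r for r in records if r["lat"] is not None and r["lon"] is not None]

    # Flag anything unresolved, unscored, or below the confidence threshold.
    flagged = [
        r["text"]
        for r in records
        if r["lat"] is None or r["confidence"] is None or r["confidence"] < threshold
    ]

    # Naive center: arithmetic mean of latitude and longitude.
    center = None
    if resolved:
        center = (mean(r["lat"] for r in resolved), mean(r["lon"] for r in resolved))

    return {
        "total": len(records),
        "by_type": type_stats,
        "flagged": flagged,
        "center": center,
    }


summary = summarize(toponyms)
```

Calling `summarize(toponyms, threshold=0.8)` would flag more toponyms, which is one way to compare how different pipeline choices affect the output.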

@kmcdono2
Member

  • Potential for this notebook to show people the impact of different choices when using the pipeline, as well as for internal use

@kallewesterling
Collaborator Author

kallewesterling commented Oct 17, 2022

First draft available here, based on sample data from @npedrazzini (only one article, I think, and so on). We can easily scale this up once we have more data.

@kmcdono2
Member

Awesome @kallewesterling ! Will explore tomorrow first thing!!

@kmcdono2
Member

One Q about the below viz - I'm never sure whether the label I see when I hover is for the individual grey circle or the parent circle. Is there a way to make that clearer?

[Screenshot: Screen Shot 2022-10-19 at 09 40 36]

@kallewesterling
Collaborator Author

> One Q about the below viz - I'm never sure whether the label I see when I hover is for the individual grey circle or the parent circle. Is there a way to make that clearer?

Yes, I just need to create some kind of "tooltip" for hovering. The current solution is just a temporary one.

@kallewesterling
Collaborator Author

kallewesterling commented Nov 15, 2022

[Image: drawing; full-resolution version linked in the original comment]

@kmcdono2
Member

Moving future work on this to #179
