adding directory with some of the WANE explorations for gov info collection
tsuomela committed Mar 5, 2016
1 parent 8ce2d00 commit 6cad511
Showing 12 changed files with 3,469,152 additions and 0 deletions.
407 changes: 407 additions & 0 deletions gov-info/.ipynb_checkpoints/day3-adding-polish-checkpoint.ipynb

Large diffs are not rendered by default.

3,643 changes: 3,643 additions & 0 deletions gov-info/.ipynb_checkpoints/gephi-export-trial-checkpoint.ipynb

Large diffs are not rendered by default.

840 changes: 840 additions & 0 deletions gov-info/.ipynb_checkpoints/wane-bipartite-graphs-checkpoint.ipynb

Large diffs are not rendered by default.

9 changes: 9 additions & 0 deletions gov-info/README.md
@@ -0,0 +1,9 @@
# Working with the Gov-Info files

This directory contains IPython notebooks, WANE files, and GML files that were used to work with a partial set of Internet Archive data for the parl.gc.ca domain.

The WANE files are a derived dataset containing extracted named entities and URLs.

The Python notebooks were developed to extract the named entities and URLs from the JSON in the WANE files and to transform those items into a graph using the networkx Python module.
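
The extraction step can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: it assumes each WANE record is one JSON object per line with a `url` field and a `named_entities` object mapping entity types to lists of names, and the sample records and the `wane_to_graph` helper are hypothetical.

```python
import json

import networkx as nx


def wane_to_graph(lines):
    """Build a URL-to-entity graph from WANE-style JSON lines.

    Assumed record layout (not confirmed by this README):
    {"url": "...", "named_entities": {"persons": [...], "locations": [...]}}
    """
    graph = nx.Graph()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        url = record["url"]
        graph.add_node(url, kind="url")
        for entity_type, names in record.get("named_entities", {}).items():
            for name in names:
                # Each entity becomes a node linked to the page that mentions it.
                graph.add_node(name, kind=entity_type)
                graph.add_edge(url, name)
    return graph


# Hypothetical sample records, for illustration only.
sample = [
    '{"url": "http://www.parl.gc.ca/a", "named_entities": {"persons": ["Jane Doe"], "locations": ["Ottawa"]}}',
    '{"url": "http://www.parl.gc.ca/b", "named_entities": {"persons": [], "locations": ["Ottawa"]}}',
]
graph = wane_to_graph(sample)
print(graph.number_of_nodes(), graph.number_of_edges())
```

With a real WANE file, the same function could be called on an open file handle, since iterating over a text file yields one line at a time.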

After the files were transformed into graphs, the graphs were saved in the GML file format for import into Gephi, where they could be visualized and used for exploratory data analysis.
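
Saving a graph for Gephi might look like the sketch below. The node names, the `kind` attribute, and the output file name are assumptions made for illustration; only `nx.write_gml` and `nx.read_gml` are actual networkx functions.

```python
import os
import tempfile

import networkx as nx

# A tiny hypothetical graph standing in for one built from WANE data.
graph = nx.Graph()
graph.add_node("http://www.parl.gc.ca/a", kind="url")
graph.add_node("Ottawa", kind="locations")
graph.add_edge("http://www.parl.gc.ca/a", "Ottawa")

# Write the graph to a GML file, which Gephi can open directly.
gml_path = os.path.join(tempfile.mkdtemp(), "wane-sample.gml")
nx.write_gml(graph, gml_path)

# Read the file back to check that nodes and attributes survived the round trip.
reloaded = nx.read_gml(gml_path)
print(sorted(reloaded.nodes(data="kind")))
```

Node attributes such as `kind` are written into the GML file, so they can be used in Gephi to color or filter nodes by entity type.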