forked from pypi/warehouse
docs: move BigQuery to user docs (pypi#17162)
* docs: move BigQuery to user docs

Signed-off-by: William Woodruff <william@trailofbits.com>

* docs: APIs and Datasets

Signed-off-by: William Woodruff <william@trailofbits.com>

---------

Signed-off-by: William Woodruff <william@trailofbits.com>
Showing 4 changed files with 49 additions and 30 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,26 +1,10 @@ | ||
BigQuery Datasets | ||
================= | ||
|
||
We use BigQuery to serve our public datasets. PyPI offers two tables whose | ||
data is sourced from projects on PyPI. The tables and their associated data are licensed | ||
under the `Creative Commons License <https://creativecommons.org/licenses/by/4.0/>`_. | ||
.. important:: | ||
|
||
Download Statistics Table | ||
------------------------- | ||
This API documentation has been migrated to a new page in | ||
the `user documentation <https://docs.pypi.org/>`_: | ||
|
||
The download statistics table allows you to learn more about the download patterns of | ||
packages hosted on PyPI. This table is populated through the `Linehaul | ||
project <https://github.com/pypa/linehaul-cloud-function/>`_ by streaming download logs from PyPI | ||
to BigQuery. For more information on analyzing PyPI package downloads, see the `Python | ||
Package Guide <https://packaging.python.org/guides/analyzing-pypi-package-downloads/>`_ | ||
* `BigQuery Datasets <https://docs.pypi.org/api/bigquery/>`_ | ||
|
||
Project Metadata Table | ||
---------------------- | ||
|
||
We also have a table that provides access to distribution metadata | ||
as outlined by the `core metadata specifications <https://packaging.python.org/specifications/core-metadata/>`_. | ||
The table is meant to be a data dump of metadata from every | ||
release on PyPI, which means that the rows in this BigQuery table | ||
are immutable and are not removed even if a release or project is deleted. | ||
This data is accessible under the | ||
``bigquery-public-data.pypi.distribution_metadata`` public dataset on BigQuery. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# BigQuery Datasets | ||
|
||
We use BigQuery to serve our public datasets. PyPI offers two tables whose | ||
data is sourced from projects on PyPI. The tables and their associated data are licensed | ||
under the [Creative Commons License]. | ||
|
||
## Download Statistics Table | ||
|
||
*Table name*: `bigquery-public-data.pypi.file_downloads` | ||
|
||
The download statistics table allows you to learn more about the download patterns of | ||
packages hosted on PyPI. | ||
|
||
This table is populated through the [Linehaul project] by streaming download | ||
logs from PyPI to BigQuery. For more information on analyzing PyPI package | ||
downloads, see the [Python Package Guide]. | ||
|
||
## Project Metadata Table | ||
|
||
*Table name*: `bigquery-public-data.pypi.distribution_metadata` | ||
|
||
We also have a table that provides access to distribution metadata | ||
as outlined by the [core metadata specifications]. | ||
|
||
The table is meant to be a data dump of metadata from every | ||
release on PyPI, which means that the rows in this BigQuery table | ||
are immutable and are not removed even if a release or project is deleted. | ||
|
||
[Creative Commons License]: https://creativecommons.org/licenses/by/4.0/ | ||
[Linehaul project]: https://github.com/pypa/linehaul-cloud-function/ | ||
[Python Package Guide]: https://packaging.python.org/guides/analyzing-pypi-package-downloads/ | ||
[core metadata specifications]: https://packaging.python.org/specifications/core-metadata/ |
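
As a rough illustration of how the `bigquery-public-data.pypi.file_downloads` table named above might be queried, the sketch below builds a BigQuery Standard SQL string for counting a project's recent downloads. It only constructs the query text and does not contact BigQuery. The `file.project` and `timestamp` column names are assumptions taken from the example in the Python Packaging User Guide, not from this commit, and `build_download_count_query` is a hypothetical helper introduced purely for illustration.

```python
import re

# Table name as given in the documentation above.
DOWNLOADS_TABLE = "bigquery-public-data.pypi.file_downloads"


def build_download_count_query(project: str, days: int = 30) -> str:
    """Return a query counting downloads of `project` over the last `days` days.

    Assumption: the table exposes `file.project` and `timestamp` columns.
    """
    # Accept only characters that can appear in a PyPI project name, so the
    # value can be embedded in the SQL string literal without escaping.
    if not re.fullmatch(r"[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?", project):
        raise ValueError(f"invalid project name: {project!r}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return (
        "SELECT COUNT(*) AS num_downloads\n"
        f"FROM `{DOWNLOADS_TABLE}`\n"
        f"WHERE file.project = '{project}'\n"
        "  AND DATE(timestamp) BETWEEN "
        f"DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY)\n"
        "  AND CURRENT_DATE()"
    )


query = build_download_count_query("pytest")
print(query)
```

The resulting string could be pasted into the BigQuery console or handed to a BigQuery client library. Restricting the date range is probably wise, since the table receives a row for every streamed download log and is therefore large.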