revised complete readmes
rumachan committed Dec 18, 2024
1 parent c5f4870 commit 52cbb1a
Showing 3 changed files with 63 additions and 40 deletions.
72 changes: 37 additions & 35 deletions README.md
@@ -1,30 +1,38 @@
# Data Tutorials

The purpose of this repository is to provide a home for several data tutorials written to improve the accesibility of the different GeoNet data sets. This repository is an easy way to access the tutorials, provides versioning and allows the users to suggest changes or improvements.
The purpose of this repository is to provide a home for several data tutorials written to improve the accessibility of the different GeoNet data sets. This repository is an easy way to access the tutorials, provides versioning and allows users to suggest changes or improvements.

The tutorials in this repository are mostly [Jupyter notebook](https://jupyter.org/) files. They demonstrate some simple ways to retrieve and work with data from different GeoNet services such as: FDSN, FITS, Tilde, etc. Most are written in the Python programming language. Older versions of some notebooks were written in the R programming language. We are no longer supporting these, but the notebooks are still available, although we make no guarantee about their current usability. To access these notebooks, please use this [github commit](https://github.com/GeoNet/data-tutorials/tree/5609561894b924211da975d1794eb00b5fcff99d).
The tutorials in this repository are mostly [Jupyter notebook](https://jupyter.org/) files. They demonstrate some simple ways to retrieve and work with data from different GeoNet services such as: FDSN, Tilde, etc. Most are written in the Python programming language. Older versions of some notebooks were written in the R programming language. We are no longer supporting these, but the notebooks are still available, although we make no guarantee about their current usability. To access these old R notebooks, please use this [github commit](https://github.com/GeoNet/data-tutorials/tree/5609561894b924211da975d1794eb00b5fcff99d).

**All notebooks use Python 3. We do not support Python 2.7.**

Tutorials are reviewed every 3 - 6 months. We confirm that they still run, and make any necessary adjustments so that they remain a valuable, working resource for GeoNet's data users.
Tutorials are reviewed every 3 - 6 months, or sooner if circumstances require. During review, we confirm that notebooks still run, and make any necessary adjustments so that they remain a valuable, working resource for GeoNet's data users.

Tutorials are organised by data access method, rather than data type. Within the folder for each data access method is a file README.md. This file contains most of the general material about data accessed by that method. This frees up individual notebooks to concentrate on data access and use, and reduces the maintenance required for each notebook. When you are using a particular notebook, it is therefore important that you refer to the README.md file in the same folder as the notebook.
Tutorials are mostly organised by data access method, but in some cases we may also provide tutorials for a specific data type.

This repository also hosts scripts and codes used for GeoNet's data blogs, when applicable. These are news stories focussed on GeoNet data and how to use
and understand it. They were first published in June 2022 and are accessible through the [GeoNet News web page](https://www.geonet.org.nz/news). While data blogs are not tutorials, the material often contains
code excerpts and examples that our data users will find helpful, such as Jupyter notebooks and shell scripts. The material in the repository is that used by the blog's authors to prepare the blog at the time it was written. In contrast to data tutorials, with blogs we make no effort to review and keep up to date Jupyter notebooks, shell scripts, or any other code-like material. Also, we do not provide the environment and software versions that we may have used in preparing blog material. In many cases, the python environement described below may work if you want to run a Jupyter notebook used to generate material for a blog.
Within the folder for each data access method is a file README.md. This file contains most of the general material about data accessed by that method. This frees up individual notebooks to concentrate on data access and use, and reduces the maintenance required for each notebook. When you are using a particular notebook, it is therefore important that you refer to the README.md file in the same folder as the notebook.

This repository also hosts scripts and code used for some of GeoNet's data blogs. More information is available at the bottom of this README.

## Summary of Tutorials

### By data access method

| Data access method | Description |
| ------------- | ------------- |
| [AWS Open Data](./AWS_Open_Data) | A file README.md describing GeoNet's data available through AWS Open Data |
| [FDSN](./FDSN) | Demonstrates how to access data through GeoNet's different FDSN web services (Dataselect, Station and Event). These tutorials are applicable to **seismic**, **acoustic-infrasound**, and **tsunami gauge (full sample rate)** data sets. |
| [FITS](./FITS) | Shows how to retrieve and use data from [FITS](https://fits.geonet.org.nz/api-docs/). FITS is used to access **daily GNSS position data**, **manually collected volcano data**, and **volcano data logger data (limited cases)**. FITS is in the process of being replaced by [Tilde](https://tilde.geonet.org.nz/), and will later be deprecated. |
| [Tilde](./Tilde) | Shows how to retrieve data from GeoNet's [Tilde API](https://tilde.geonet.org.nz/v3/api-docs/). These turorails apply to **DART**, **envirosensor**, and **tsunami gauge (down-sampled)** data. Tutorials cover Tilde's data, stats, and data summary APIs.
| [FITS](./FITS) | Shows how to retrieve and use data from [FITS](https://fits.geonet.org.nz/api-docs/). FITS is used to access **daily GNSS position data**, and is in the process of being replaced by [Tilde](https://tilde.geonet.org.nz/). |
| [Tilde](./Tilde) | Shows how to retrieve data from GeoNet's [Tilde API](https://tilde.geonet.org.nz/v3/api-docs/). These tutorials apply to **DART**, **envirosensor**, and **tsunami gauge (down-sampled)** data. Tutorials cover Tilde's data, stats, and data summary APIs. Further examples using Tilde are available in the volcano tutorials.|

### By data type

| Data type | Description |
| ------------- | ------------- |
| [Volcano](./Volcano) | Demonstrates how to access data and use **various manually collected volcano data**, **envirosensor data**, **scanDOAS data**, **daily GNSS position data**, **data aggregation**, and **multi-domain (cross-domain) data**, largely using [the Tilde API](https://tilde.geonet.org.nz/). Most volcano data types are covered.|

## How to Run Tutorials
The file [**environment.yml**](environment.yml) ensures that you have the correct Python environment to run the data tutorials. It allows you to install the correct Python packages, and where appropriate package versions. We use this environment when writing and reviewing notebooks. If you use a different environment, tutorials may work, but we cannot guarantee it.
The file [**environment.yml**](./environment.yml) ensures that you have the correct Python environment to run the data tutorials. It allows you to install the correct Python packages and, where appropriate, package versions. We use this environment when writing and reviewing notebooks. If you use a different environment, tutorials may work, but we cannot guarantee it. When we review tutorials we also review the environment, and may periodically move it to more recent Python and module versions so that we benefit from recent updates.

### 1. Python environment manager
We use [miniforge](https://github.com/conda-forge/miniforge) to manage our Python environments. Other environment managers such as [Anaconda](https://www.anaconda.com/) are also suitable. Both support Windows, Mac, and Linux. If you do not yet have a python environment manager, we recommend you install one as you need a specific environment for the notebooks to work correctly.
@@ -33,52 +41,46 @@ We use [miniforge](https://github.com/conda-forge/miniforge) to manage our Pytho
You have three options.
#### a) Clone the git data-tutorials repository
`git clone https://github.com/GeoNet/data-tutorials.git`

You can also use `SSH` but you will need to add a [SSH key](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account) to your account and terminal environment.
#### b) Copy the zip file containing the code
Click on the green `<>Code` icon near the top of the page and then `Download ZIP`.
This will create a file `data-tutorials-main.zip` on your computer, which you will need to uncompress to access the notebooks. All common file compression tools will uncompress the ZIP file.
#### c) Copy and paste
Navigate to a section of a notebook you are interested in and copy-paste the code you want to use into a notebook on your computer.

### 3. Install the environment file
Create an environment called `GeoNet` from the specifications in the file `environment.yml`.

`conda env create -f environment.yml`

And then activate this environment.

`conda activate GeoNet`

Install the Python kernel in this environment.

`conda install -c conda-forge ipykernel`

`python -m ipykernel install --user --name=GeoNet`
### 3. Installing the Python environment and using JupyterLab
For comprehensive instructions, consult the official [conda documentation for managing Python environments](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).

We recommend using [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/index.html) rather than the older Jupyter Notebook.

If you don't have JupyterLab installed, here are [detailed instructions](https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html).

Open JupyterLab

From the Linux command line
Create an environment called `geonet-tutorial` from the specifications in the file `environment.yml`.

`jupyter lab`
`conda env create -f environment.yml`

To reopen jupyter notebooks when opening a new command prompt, navigate to your working directory and run,
Activate this environment and make it available to JupyterLab.

`conda activate base`
```
conda activate geonet-tutorial
python -m ipykernel install --user --name=geonet-tutorial
```

`conda activate GeoNet`
You can install JupyterLab in your `base` conda environment, run it from that environment and access the `geonet-tutorial` environment from within JupyterLab. Alternatively, you can install JupyterLab in the `geonet-tutorial` environment.

`jupyter notebook`
To open JupyterLab from the Linux or macOS command line, type `jupyter lab`. On Windows, open the command prompt from the Start menu and type `jupyter lab`.
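
Once the environment is active, a quick check like the one below can confirm that it meets the minimum versions listed in `environment.yml`. This is a minimal sketch and not part of the repository: the helper names `parse_version` and `check_environment` are illustrative, and the minimums are copied by hand from `environment.yml` as of this commit, so they need updating whenever that file changes.

```python
import sys
from importlib import metadata

# Assumption: minimums copied by hand from environment.yml in this commit.
MINIMUMS = {"matplotlib": (3, 8), "pandas": (2, 2), "obspy": (1, 4)}
PYTHON_MINIMUM = (3, 12)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. "post1"
        parts.append(int(digits))
    return tuple(parts)


def check_environment(minimums=MINIMUMS, python_minimum=PYTHON_MINIMUM):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < tuple(python_minimum):
        found = ".".join(str(n) for n in sys.version_info[:2])
        wanted = ".".join(str(n) for n in python_minimum)
        problems.append(f"Python {found} is older than {wanted}")
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if parse_version(installed) < tuple(minimum):
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{name} {installed} is older than {wanted}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment looks fine" if not issues else "\n".join(issues))
```

Run it in a notebook cell using the `geonet-tutorial` kernel, or as a script after `conda activate geonet-tutorial`. The version comparison only looks at the leading numeric components, which is enough for "at least this version" checks but is not a full version parser.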

### 4. Running tutorials as standalone scripts
Tutorials are only available as Jupyter notebooks. To run a notebook as a standalone Python script, open it in JupyterLab and export it as an [Executable Script](https://jupyterlab.readthedocs.io/en/stable/user/export.html).
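
If you prefer to do the conversion without opening JupyterLab, `jupyter nbconvert --to script <notebook>.ipynb` does the same job from the command line. The sketch below shows roughly what such an export involves, using only the Python standard library. It is an illustration under stated assumptions, not the method JupyterLab uses: the function name `notebook_to_script` is hypothetical, and it assumes a version 4 notebook file, where code lives in a top-level `cells` list with `cell_type` and `source` fields.

```python
import json
from pathlib import Path


def notebook_to_script(notebook_path, script_path):
    """Write the code cells of a version 4 notebook to a plain Python file.

    Returns the number of code cells written.
    """
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):
            # the notebook format allows source as a list of lines
            source = "".join(source)
        # IPython magics (%) and shell escapes (!) are not valid Python,
        # so comment them out rather than dropping them silently
        lines = [
            "# " + line if line.lstrip().startswith(("%", "!")) else line
            for line in source.splitlines()
        ]
        chunks.append("\n".join(lines))
    Path(script_path).write_text("\n\n".join(chunks) + "\n", encoding="utf-8")
    return len(chunks)
```

For example, `notebook_to_script("my_notebook.ipynb", "my_notebook.py")` writes a script next to a notebook of that (placeholder) name. The magic-detection is deliberately simple and will also comment out an ordinary continuation line that happens to start with `%`, so check the output before relying on it.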

## Data Blogs

Data blogs are stored in the folder [Data_Blog](./Data_Blog). Within this is a
sequentially numbered folder including a simplified version of the blog's subject. For example, the folder *blog_01_val* contains the
first blog published and the subject was Volcanic Alert Level (VAL) data.
Data blogs are news stories focussed on GeoNet data and how to use
and understand it. They were first published in June 2022 and are accessible through the [GeoNet News web page](https://www.geonet.org.nz/news). While data blogs are not tutorials, the material sometimes contains
code excerpts and examples that our data users will find helpful, such as Jupyter notebooks and shell scripts. The material in the repository is that used by the blog's authors to prepare the blog at the time it was written. Unlike data tutorials, we make no effort to review or update the Jupyter notebooks, shell scripts, or other code-like material from blogs. Also, we do not provide the environment and software versions that we used in preparing blog material. In many cases, the Python environment in [environment.yml](./environment.yml) may work if you want to run a Jupyter notebook used to generate material for a blog. If you are having difficulties, please ask us.

Code from data blogs is stored in the folder [Data_Blog](./Data_Blog). Within this, each blog has a
sequentially numbered folder whose name includes a shortened version of the blog's subject. For example, the folder *blog_01_val* contains blog number 01, and the subject was Volcanic Alert Level (VAL) data. Not all blogs have code, so not all folder numbers appear in the repository.

If you want an up to date list of published blogs, go to the [News section on our web page](https://www.geonet.org.nz/news), filter for Data Blog and then hit the Search button.
21 changes: 21 additions & 0 deletions Volcano/README.md
@@ -0,0 +1,21 @@
# Working with Volcano Data

The files in this folder are Jupyter notebooks written in Python. They demonstrate some simple ways to access and use volcano data. Most data access is via [GeoNet's Tilde Time Series API](https://tilde.geonet.org.nz/) using the [Python programming language](https://www.python.org/). Because there is a wide variety of volcano data types, we have provided specialised tutorials.

## Python ##

**All notebooks use Python 3. We do not support Python 2.7.**

## The Tilde API ##
When using Tilde, the volcano-specific notebooks only use the Tilde Data Endpoint. Refer to the [Tilde README file](../Tilde/README.md) for a more complete description of Tilde endpoints and response messages.

## Notebooks ##

| File | Description |
|------|-------------|
| [Envirosensor](./Volcano_data_envirosensor.ipynb) | Demonstrates how to retrieve and graph multiGas data, fumarole and water temperature data, water level data, self potential and ground temperature data, wind data, and rainfall data.|
| [GNSS](./Volcano_data_gnss.ipynb) | Demonstrates how to retrieve and graph GNSS data. These data are currently delivered by [FITS](https://fits.geonet.org.nz/api-docs/) but will move to Tilde |
| [Manually collected](./Volcano_data_manualcollect.ipynb) | Demonstrates how to retrieve and graph water chemistry data, airborne gas emission rate data, soil gas data, lake levelling (Lake Taupō) data, and how to create spreadsheet-like output files for water chemistry data.|
| [ScanDOAS](./Volcano_data_scandoas.ipynb) | Demonstrates how to retrieve and graph scanDOAS data, including working with data from multiple sensors.|
| [Data aggregation](./Volcano_data_aggregation.ipynb) | Demonstrates how to use Tilde's data aggregation functions with volcano data.|
| [Multi-domain data](./Volcano_data_multidomain.ipynb) | Demonstrates how to retrieve and graph data from more than one data domain. Covers Tilde data with [Volcanic Alert Level data](https://doi.org/10.21420/we5s-1n52), and Tilde data with [historic volcanic activity data](https://doi.org/10.21420/bw31-2x60).|
10 changes: 5 additions & 5 deletions environment.yml
@@ -1,13 +1,13 @@
name: GeoNet
name: geonet-tutorial
channels:
- conda-forge
dependencies:
- python=3.10
- python[version='>=3.12']
- cartopy
- numpy
- matplotlib[version='>=3.6']
- pandas
- matplotlib[version='>=3.8']
- pandas[version='>=2.2']
- geopandas
- obspy=1.4
- obspy[version='>=1.4']
- tabulate
- ipykernel
