Skip to content

Commit

Permalink
fixes in hyper p sampling, update examples, remove cov matrixes in da…
Browse files Browse the repository at this point in the history
…ta, estimate covs
  • Loading branch information
hvasbath committed Jan 5, 2018
1 parent 7eb95e7 commit f7eb163
Show file tree
Hide file tree
Showing 4 changed files with 154 additions and 43 deletions.
52 changes: 31 additions & 21 deletions data/examples/FullMT/config_geometry.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
--- !beat.BEATconfig
name: FullMT
name: /home/vasyurhm/BEATS/FullMT
date: dummy
event: !pf.Event
lat: 29.07
Expand All @@ -26,20 +26,26 @@ event: !pf.Event
moment: 7.223676349339025e+19
magnitude: 7.20583885303153
duration: 22.0
project_dir: /home/vasyurhm/Software/Seismosoft/beat/data/examples/FullMT
project_dir: /home/vasyurhm/BEATS/FullMT
problem_config: !beat.ProblemConfig
mode: geometry
source_type: MTSource
stf_type: HalfSinusoid
n_sources: 1
datatypes: [seismic]
hyperparameters:
h_any_P_Z: !beat.heart.Parameter
name: h_any_P
h_any_P_T: !beat.heart.Parameter
name: h_any_P_T
form: Uniform
lower: [-1.0]
lower: [-3.0]
upper: [6.0]
testvalue: [2.5]
testvalue: [1.5]
h_any_P_Z: !beat.heart.Parameter
name: h_any_P_Z
form: Uniform
lower: [-3.0]
upper: [5.0]
testvalue: [1.0]
priors:
depth: !beat.heart.Parameter
name: depth
Expand Down Expand Up @@ -116,13 +122,13 @@ problem_config: !beat.ProblemConfig
seismic_config: !beat.SeismicConfig
datadir: ./
blacklist: []
calc_data_cov: false
pre_stack_cut: false
calc_data_cov: true
pre_stack_cut: true
waveforms:
- !beat.WaveformFitConfig
include: true
name: any_P
channels: [Z]
channels: [Z, T]
filterer: !beat.heart.Filter
lower_corner: 0.01
upper_corner: 0.1
Expand All @@ -132,10 +138,11 @@ seismic_config: !beat.SeismicConfig
arrival_taper: !beat.heart.ArrivalTaper
a: -20.0
b: -10.0
c: 180.0
d: 210.0
c: 250.0
d: 270.0
gf_config: !beat.SeismicGFConfig
store_superdir: /tmp/GF/FullMT/
store_superdir: /home/vasyurhm/BEATS/GF
reference_model_idx: 0
n_variations: [0, 1]
error_depth: 0.1
error_velocities: 0.1
Expand All @@ -150,42 +157,45 @@ seismic_config: !beat.SeismicConfig
21.64 6.23 3.6 2.8 1283. 600.
mantle
21.64 7.95 4.45 3.2 1449. 600.
source_depth_min: 8.0
source_depth_max: 8.0
source_depth_min: 0.0
source_depth_max: 15.0
source_depth_spacing: 1.0
source_distance_radius: 1000.0
source_distance_spacing: 1.0
nworkers: 10
nworkers: 25
reference_location: !beat.heart.ReferenceLocation
lat: 29.07
lon: 34.73
elevation: 0.0
depth: 0.0
station: AqabaMT
code: qseis
sample_rate: 1.0
rm_gfs: true
sampler_config: !beat.SamplerConfig
name: SMC
progressbar: true
parameters: !beat.SMCConfig
n_chains: 1000
n_chains: 400
n_steps: 100
n_jobs: 1
n_jobs: 4
tune_interval: 10
coef_variation: 1.0
stage: '0'
stage: 0
proposal_dist: MultivariateNormal
check_bnd: true
update_covariances: false
rm_flag: true
hyper_sampler_config: !beat.SamplerConfig
name: Metropolis
progressbar: true
parameters: !beat.MetropolisConfig
n_jobs: 1
n_stages: 10
n_stages: 5
n_steps: 25000
stage: '0'
stage: 0
tune_interval: 50
proposal_dist: Normal
thin: 2
thin: 5
burn: 0.5
rm_flag: false
Binary file modified data/examples/FullMT/seismic_data.pkl
Binary file not shown.
116 changes: 108 additions & 8 deletions docs/examples.rst
Original file line number Diff line number Diff line change
@@ -1,28 +1,36 @@


*******************************
Scenarios in beat/data/examples
-------------------------------
*******************************
In the following tutorials we will use synthetic example data to get familiar with the basic functionality of BEAT.


Regional Full Moment Tensor
^^^^^^^^^^^^^^^^^^^^^^^^^^^
---------------------------
Clone project
^^^^^^^^^^^^^
This setup is comprised of 20 seismic stations that are randomly distributed within distances of 40 to 1000 km compared to a reference event.
To copy the scenario (including the data) to a directory outside of the package source directory please edit the target path and execute::

cd /path/to/beat/data/examples/
beat clone FullMT /absolute/path/to/your/directory/FullMT --copy_data

This will create a BEAT project directory with a configuration file and some synthetic example data.

Calculate Greens Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^
The station-event geometry determines the grid of Greens Functions (GFs) that will need to be calculated next.

In the geometry_config.yaml under: seismic_config gf_config store_superdir- the path to where the Greens Functions are supposed to be stored needs to be filled!
In the config_geometry.yaml under: seismic_config gf_config store_superdir- the path needs to be defined to where the Greens Functions are supposed to be stored!
This directory is refered to as the $GF_path in the rest of the text.::

beat build_gfs FullMT --datatypes='seismic'

This will create an empty Greens Function store named AqabaMT_ak135_1.000Hz_0 in the $GF_path. Under $GF_path/AqabaMT_ak135_1.000Hz_0/config it is always recommended to cross-check again the velocity model and the specificationos of the store.
Dependend on the case study there are crucial parameters that often need to be changed from the default values: the spatial grid dimensions, the sample rate and the wave phases (body waves and/or surface waves) to be calculated.

In the geometry_config.yaml under seismic config we find the gf_config, which holds the major parameters for GF calculation::
In the config_geometry.yaml under seismic config we find the gf_config, which holds the major parameters for GF calculation::

gf_config: !beat.SeismicGFConfig
store_superdir: /home/vasyurhm/BEATS/GF/
Expand Down Expand Up @@ -67,7 +75,7 @@ The 'n_workers' variable defines the number of CPUs to use in parallel for the G

For our use-case these specifications are fine for now.

Under waveforms in the geometry_config.yaml there are the seismic phases defined the GFs are going to be calculated for.::
Under waveforms in the config_geometry.yaml there are the seismic phases defined the GFs are going to be calculated for.::

- !beat.WaveformFitConfig
include: true
Expand All @@ -85,7 +93,7 @@ Under waveforms in the geometry_config.yaml there are the seismic phases defined
c: 50.0
d: 55.0

In this case the GFs are going to be calculated for the P body waves. We can add additional WaveformFitConfig(s) if we want to include more phases. Like in our case of a regional setup we would like to include surface waves. For the build GFs command only the existence of the WaveformFitConfig and the name is of importance and we can ignore the other parameters so far. So lets add this to the geometry_config.yaml file below the any_P WaveformFitConfig. Note: both entries have to have the same indentation!::
In this case the GFs are going to be calculated for the P body waves. We can add additional WaveformFitConfig(s) if we want to include more phases. Like in our case of a regional setup we would like to include surface waves. For the build GFs command only the existence of the WaveformFitConfig and the name is of importance and we can ignore the other parameters so far. So lets add this to the config_geometry.yaml file below the any_P WaveformFitConfig. Note: both entries have to have the same indentation!::

- !beat.WaveformFitConfig
include: true
Expand Down Expand Up @@ -117,12 +125,104 @@ Checking again the store config under $GF_path/AqabaMT_ak135_1.000Hz_0/config sh
id: slowest
definition: '0.8'

Finally, we are good to start the GF calculations!
Finally, we are good to start the GF calculations!::

beat build_gfs FullMT --datatypes='seismic' --force --execute

Depending on the number of CPUs that have been assigned to the job this may take few minutes.


Next we can use the fomosto tool together with snuffler to inspect if the GFs look reasonable. To plot the 10 elementary GF components in a depth of 8km at a distance of 500km::

fomosto view $GF_path/AqabaMT_ak135_1.000Hz_0 --extract='8k,500k'

This looks reasonably well.
TODO: Insert picture here ...

Sample the solution space
^^^^^^^^^^^^^^^^^^^^^^^^^
Once we are confident that the GFs are reasonable we may continue to define the optimization specific setup variables.
First of all we check again the WaveformFitConfig for the waves we want to optimize.
In this case we want to optimize the whole waveform from P until the end of the surface waves.
As the wavetrains are very close in the very near field we do not want to have overlapping time windows, which is why we deactivate one of the WaveformFitConfigs, by setting
include=False on the `slowest` WaveformfitConfig.

Also there we may define a distance range of stations taken into account,
define a bandpass filter and a time window with respect to the arrival time of the respective wave.
Therefore, stations that are used to optimize the P-wave do not necessarily need to be used to optimize the surface waves by defining different distance ranges.
Similarly, different filters and arrival time windows maybe defined as well.

The optimization is done in the R, T, Z rotated coordinate system to better tune, which part of the waves are optimized. That is particularly important if the S-wave
is going to be used as one would typically use only SH waves that are the S-waves in the T-component.
For P-waves one would like to use the Z(vertical) component and for surface waves both components.

Finally, we fix the depth prior to 8km (upper and lower) as we only calculated GFs for that depth.::

priors:
depth: !beat.heart.Parameter
name: depth
form: Uniform
lower: [8.0]
upper: [8.0]
testvalue: [8.0]

Of course, in a real case this would not be fixed.
Also we may inspect the data:

beat check FullMT --what='traces'

Now that we checked the optimization setup we are good to go.

Firstly, we fix the source parameters to some random value and only optimize for the hyperparameters (HPs).
How many different random source parameters are choosen and the sampling repeated is determined by the hyper_sampler_config parameters n_stages (default:10) ::

beat sample FullMT --hypers

This reduces the initial search space from 40 orders of magnitude to usually 5 to 10 orders. Checking the config_geometry.yaml, the HPs parameter bounds show something like::

hyperparameters:
h_any_P_T: !beat.heart.Parameter
name: h_any_P_T
form: Uniform
lower: [-4.0]
upper: [5.0]
testvalue: [0.5]
h_any_P_Z: !beat.heart.Parameter
name: h_any_P_Z
form: Uniform
lower: [-4.0]
upper: [5.0]
testvalue: [0.5]

At this point the bounds could be relaxed again as well by manually editing the configuration file, or the step could be entirely skipped.
Now that we have an initial guess on the hyperparameters we can run the optimization using the default sampling algorithm, a Sequential Monte Carlo sampler.
The sampler can effectively exploit the parallel architecture of nowadays computers. The 'n_jobs' number should be set to as many CPUs as possible in the configuration file.
Note: n_chains divided by n_jobs MUST yield a whole number! An error is going to be thrown is this is not the case!::

sampler_config: !beat.SamplerConfig
name: SMC
progressbar: true
parameters: !beat.SMCConfig
n_chains: 500
n_steps: 100
n_jobs: 1
tune_interval: 10
coef_variation: 1.0
stage: 0
proposal_dist: MultivariateNormal
check_bnd: true
update_covariances: false
rm_flag: true

Dependend on the hardware, sampler specifications and number of jobs that have been defined this calculation is going to take several hours.
Therefore, in order to avoid crashes or in the case of remote connection via ssh it is very much recommended to use something like `screen`
to detach the terminal where the process is running.

beat sample FullMT

Summarize the results
^^^^^^^^^^^^^^^^^^^^^

Plotting
^^^^^^^^


29 changes: 15 additions & 14 deletions src/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -172,26 +172,16 @@ class Composite(Object):
"""
Class that comprises the rules to formulate the problem. Has to be
used by an overarching problem object.
Parameters
----------
hypers : boolean
determines whether to initialise Composites with hyper parameter model
"""

def __init__(self, hypers=False):
def __init__(self):

self.input_rvs = {}
self.fixed_rvs = {}
self.name = None
self._like_name = None
self.config = None

if hypers:
self._llks = []
for t in range(self.n_t):
self._llks.append(shared(num.array([1.]), borrow=True))

def get_hyper_formula(self, hyperparams):
"""
Get likelihood formula for the hyper model built. Has to be called
Expand Down Expand Up @@ -236,7 +226,7 @@ class GeodeticComposite(Composite):

def __init__(self, gc, project_dir, event, hypers=False):

super(GeodeticComposite, self).__init__(hypers=hypers)
super(GeodeticComposite, self).__init__()

self.event = event

Expand Down Expand Up @@ -285,6 +275,11 @@ def __init__(self, gc, project_dir, event, hypers=False):

self.config = gc

if hypers:
self._llks = []
for t in range(self.n_t):
self._llks.append(shared(num.array([1.]), borrow=True))

@property
def n_t(self):
return len(self.datasets)
Expand Down Expand Up @@ -677,7 +672,7 @@ class SeismicComposite(Composite):

def __init__(self, sc, event, project_dir, hypers=False):

super(SeismicComposite, self).__init__(hypers=hypers)
super(SeismicComposite, self).__init__()

logger.debug('Setting up seismic structure ...\n')
self.name = 'seismic'
Expand Down Expand Up @@ -775,9 +770,15 @@ def __init__(self, sc, event, project_dir, hypers=False):

self.wavemaps.append(wmap)
else:
logger.info('The waveform defined in "%s" config is not '
logger.info(
'The waveform defined in "%s" config is not '
'included in the optimization!' % wc.name)

if hypers:
self._llks = []
for t in range(self.n_t):
self._llks.append(shared(num.array([1.]), borrow=True))

def get_unique_stations(self):
sl = [wmap.stations for wmap in self.wavemaps]
us = []
Expand Down

0 comments on commit f7eb163

Please sign in to comment.