Update docstrings of ess, autocorr and bpv plots #2185

Merged
merged 16 commits into from
Jan 5, 2023
40 changes: 20 additions & 20 deletions arviz/plots/autocorrplot.py
@@ -23,53 +23,53 @@ def plot_autocorr(
backend_kwargs=None,
show=None,
):
"""Bar plot of the autocorrelation function for a sequence of data.
r"""Bar plot of the autocorrelation function (ACF) for a sequence of data.

Useful in particular for posteriors from MCMC samples which may display correlation.
The ACF plots are helpful as a convergence diagnostic for posteriors from MCMC
samples which display autocorrelation.

Parameters
----------
data : InferenceData
Any object that can be converted to an :class:`arviz.InferenceData` object
refer to documentation of :func:`arviz.convert_to_dataset` for details
var_names : list of str, optional
Variables to be plotted, if None all variables are plotted. Prefix the
variables by ``~`` when you want to exclude them from the plot. Vector-value
stochastics are handled automatically.
filter_vars : {None, "like", "regex"}, default=None
If `None` (default), interpret var_names as the real variables names. If "like",
interpret var_names as substrings of the real variables names. If "regex",
interpret var_names as regular expressions on the real variables names. A la
``pandas.filter``.
Variables to be plotted. Prefix the variables by ``~`` when you want to exclude
them from the plot. See :ref:`this section <common_var_names>` for usage examples.
filter_vars : {None, "like", "regex"}, default None
If `None` (default), interpret `var_names` as the real variables names. If "like",
interpret `var_names` as substrings of the real variables names. If "regex",
interpret `var_names` as regular expressions on the real variables names. See
:ref:`this section <common_filter_vars>` for usage examples.
max_lag : int, optional
Maximum lag to calculate autocorrelation. Defaults to 100 or num draws,
whichever is smaller.
combined : bool, default=False
Maximum lag to calculate autocorrelation. By default, the plot displays the
first 100 lags or the total number of draws, whichever is smaller.
combined : bool, default False
Flag for combining multiple chains into a single chain. If False, chains will be
plotted separately.
grid : tuple
grid : tuple, optional
Number of rows and columns. Defaults to None, the rows and columns are
automatically inferred.
figsize : (float, float), optional
Figure size. If None it will be defined automatically.
Note this is not used if ``ax`` is supplied.
textsize : float
textsize : float, optional
Text size scaling factor for labels, titles and lines. If None it will be autoscaled based
on ``figsize``.
on `figsize`.
labeller : Labeller, optional
Class providing the method ``make_label_vert`` to generate the labels in the plot titles.
Read the :ref:`label_guide` for more details and usage examples.
ax : 2D array-like of matplotlib_axes or bokeh_figure, optional
A 2D array of locations into which to plot the densities. If not supplied, Arviz will create
its own array of plot areas (and return it).
backend : str, optional
Select plotting backend {"matplotlib","bokeh"}. Default "matplotlib".
backend : {"matplotlib", "bokeh"}, default "matplotlib"
Select plotting backend.
backend_config : dict, optional
Currently specifies the bounds to use for bokeh axes. Defaults to value set in ``rcParams``.
backend_kwargs : dict, optional
These are kwargs specific to the backend being used, passed to
:func:`matplotlib.pyplot.subplots` or
:func:`bokeh.plotting.figure`.
:func:`matplotlib.pyplot.subplots` or :class:`bokeh.plotting.figure`.
For additional documentation check the plotting method of the backend.
show : bool, optional
Call backend show function.
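
As an illustration of the `max_lag` default described in this docstring (100 or the number of draws, whichever is smaller), here is a minimal pure-Python sketch of a sample autocorrelation function. The helper name `autocorrelation` is hypothetical and the direct-sum estimator is an assumption made for readability; it is not taken from ArviZ, which may compute the ACF differently.

```python
def autocorrelation(draws, max_lag=None):
    """Direct-sum sample ACF for one chain (hypothetical helper, not ArviZ code)."""
    n_draws = len(draws)
    if max_lag is None:
        # Default described in the docstring: 100 or the number of draws,
        # whichever is smaller.
        max_lag = min(100, n_draws)
    mean = sum(draws) / n_draws
    centered = [x - mean for x in draws]
    # Lag-0 sum of squares, used to normalize so that acf[0] == 1.
    variance_sum = sum(c * c for c in centered)
    acf = []
    for lag in range(max_lag):
        cov_sum = sum(centered[i] * centered[i + lag] for i in range(n_draws - lag))
        acf.append(cov_sum / variance_sum)
    return acf


# A chain that flips sign at every draw is strongly anti-correlated at lag 1.
alternating_chain = [1.0, -1.0] * 50
acf = autocorrelation(alternating_chain)
print(len(acf), acf[0], acf[1])
```

The sketch assumes a non-constant chain; a constant chain would give a zero denominator.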

79 changes: 41 additions & 38 deletions arviz/plots/bpvplot.py
@@ -36,58 +36,60 @@ def plot_bpv(
group="posterior",
show=None,
):
"""
Plot Bayesian p-value for observed data and Posterior/Prior predictive.
r"""Plot Bayesian p-value for observed data and Posterior/Prior predictive.

Parameters
----------
data : InferenceData
:class:`arviz.InferenceData` object containing the observed and
posterior/prior predictive data.
kind : str, default "u_value"
Type of plot to display ("p_value", "u_value", "t_stat").
For "p_value" we compute p := p(y* ≤ y | y). This is the probability of the data y being
larger or equal than the predicted data y*. The ideal value is 0.5 (half the predictions
below and half above the data).
For "u_value" we compute pi := p(yi* ≤ yi | y). i.e. like a p_value but per observation yi.
This is also known as marginal p_value. The ideal distribution is uniform. This is similar
to the LOO-pit calculation/plot, the difference is than in LOO-pit plot we compute
pi = p(yi* r ≤ yi | y-i ), where y-i, is all other data except yi.
For "t_stat" we compute := p(T(y)* ≤ T(y) | y) where T is any test statistic. See t_stat
argument below for details of available options.
kind : {"u_value", "p_value", "t_stat"}, default "u_value"
Specify the kind of plot:

* The ``kind="p_value"`` argument computes :math:`p := p(y* \leq y | y)`.
This is the probability of the data y being greater than or equal to the predicted data y*.
The ideal value is 0.5 (half the predictions below and half above the data).
* The ``kind="u_value"`` argument computes :math:`p_i := p(y_i* \leq y_i | y)`,
i.e. like a p_value but per observation :math:`y_i`. This is also known as marginal
p_value. The ideal distribution is uniform. This is similar to the LOO-PIT
calculation/plot, the difference is that in the LOO-PIT plot we compute
:math:`p_i = p(y_i* \leq y_i | y_{-i})`, where :math:`y_{-i}`
is all other data except :math:`y_i`.
* The ``kind="t_stat"`` argument computes :math:`:= p(T(y)* \leq T(y) | y)`
where T is any test statistic. See ``t_stat`` argument below for details
of available options.

t_stat : str, float, or callable, default "median"
Test statistics to compute from the observations and predictive distributions.
Allowed strings are "mean", "median" or "std".
Alternative a quantile can be passed as a float (or str) in the
interval (0, 1). Finally a user defined function is also
Allowed strings are "mean", "median" or "std". Alternatively, a quantile can be passed
as a float (or str) in the interval (0, 1). Finally, a user-defined function is also
accepted, see examples section for details.
bpv : bool, default True
If True add the Bayesian p_value to the legend when ``kind = t_stat``.
plot_mean : bool, default True
Whether or not to plot the mean test statistic.
reference : str, default "analytical"
How to compute the distributions used as reference for u_values or p_values. Allowed values
are "analytical" and "samples". Use `None` to do not plot any reference.
Defaults to "samples".
reference : {"analytical", "samples", None}, default "analytical"
How to compute the distributions used as reference for ``kind="u_value"``
or ``kind="p_value"``. Use `None` to not plot any reference.
mse : bool, default False
Show scaled mean square error between uniform distribution and marginal p_value
distribution.
n_ref : int, default 100
Number of reference distributions to sample when ``reference=samples``.
hdi_prob : float, optional
Probability for the highest density interval for the analytical reference distribution when
computing u_values. Should be in the interval (0, 1]. Defaults to the
``kind="u_value"``. Should be in the interval (0, 1]. Defaults to the
rcParam ``stats.hdi_prob``.
color : str, optional
Matplotlib color
grid : tuple, optional
Number of rows and columns. Defaults to None, the rows and columns are
Number of rows and columns. By default, the rows and columns are
automatically inferred.
figsize : (float, float), optional
Figure size. If None it will be defined automatically.
textsize : float, optional
Text size scaling factor for labels, titles and lines. If None it will be
autoscaled based on ``figsize``.
Text size scaling factor for labels, titles and lines. If None it will be autoscaled based
on `figsize`.
data_pairs : dict, optional
Dictionary containing relations between observed data and posterior/prior predictive data.
Dictionary structure:
@@ -101,14 +103,15 @@ def plot_bpv(
Labeller : Labeller, optional
Class providing the method ``make_pp_label`` to generate the labels in the plot titles.
Read the :ref:`label_guide` for more details and usage examples.
var_names : list of variable names
Variables to be plotted, if `None` all variable are plotted. Prefix the variables by ``~``
when you want to exclude them from the plot.
filter_vars : {None, "like", "regex"}, optional, default=None
If `None` (default), interpret var_names as the real variables names. If "like",
interpret var_names as substrings of the real variables names. If "regex",
interpret var_names as regular expressions on the real variables names. A la
``pandas.filter``.
var_names : list of str, optional
Variables to be plotted. If `None`, all variables are plotted. Prefix the variables by ``~``
when you want to exclude them from the plot. See :ref:`this section <common_var_names>`
for usage examples.
filter_vars : {None, "like", "regex"}, default None
If `None` (default), interpret `var_names` as the real variables names. If "like",
interpret `var_names` as substrings of the real variables names. If "regex",
interpret `var_names` as regular expressions on the real variables names. See
:ref:`this section <common_filter_vars>` for usage examples.
coords : dict, optional
Dictionary mapping dimensions to selected coordinates to be plotted.
Dimensions without a mapping specified will include all coordinates for
@@ -136,12 +139,12 @@ def plot_bpv(
and ``reference=analytical``).
backend_kwargs : dict, optional
These are kwargs specific to the backend being used, passed to
:func:`matplotlib.pyplot.subplots` or
:func:`bokeh.plotting.figure`. For additional documentation
check the plotting method of the backend.
group : {"prior", "posterior"}, optional
Specifies which InferenceData group should be plotted. Defaults to 'posterior'.
Other value can be 'prior'.
:func:`matplotlib.pyplot.subplots` or :class:`bokeh.plotting.figure`.
For additional documentation check the plotting method of the backend.
group : {"posterior", "prior"}, default "posterior"
Specifies which InferenceData group should be plotted. If "posterior", the values
in the `posterior_predictive` group are compared to the ones in `observed_data`. If "prior",
the same comparison is made with the values in the `prior_predictive` group.
show : bool, optional
Call backend show function.
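
The quantities described under `kind` can be sketched in a few lines of pure Python. This is only an illustration of the formulas in the docstring, with predictive draws given as a list of replicated datasets; the function names `bayesian_p_value` and `u_values` and the toy data are assumptions, not ArviZ code.

```python
from statistics import mean


def bayesian_p_value(y_obs, y_pred_draws, stat=mean):
    """Fraction of predictive draws with T(y*) <= T(y), as in kind="t_stat"."""
    t_obs = stat(y_obs)
    hits = sum(stat(draw) <= t_obs for draw in y_pred_draws)
    return hits / len(y_pred_draws)


def u_values(y_obs, y_pred_draws):
    """Per-observation p_i = p(y_i* <= y_i | y), as in kind="u_value"."""
    n_draws = len(y_pred_draws)
    return [
        sum(draw[i] <= y_i for draw in y_pred_draws) / n_draws
        for i, y_i in enumerate(y_obs)
    ]


# Toy data (made up for illustration): 3 observations, 4 predictive draws.
y_obs = [1.0, 2.0, 3.0]
y_pred_draws = [[0, 2, 4], [2, 1, 3], [1, 3, 5], [3, 4, 1]]

p_mean = bayesian_p_value(y_obs, y_pred_draws, stat=mean)
p_max = bayesian_p_value(y_obs, y_pred_draws, stat=max)  # user-defined statistic
u = u_values(y_obs, y_pred_draws)
print(p_mean, p_max, u)
```

Passing a callable such as `max` mirrors the docstring's note that `t_stat` may be a user-defined function; values near 0.5 (or a roughly uniform set of u-values) indicate predictions that sit evenly around the data.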

92 changes: 47 additions & 45 deletions arviz/plots/essplot.py
@@ -37,56 +37,70 @@ def plot_ess(
show=None,
**kwargs,
):
"""Plot quantile, local or evolution of effective sample sizes (ESS).
r"""Generate quantile, local, or evolution ESS plots.

The local and the quantile ESS plots are recommended for checking
that there are enough samples for all the explored regions of the
parameter space. Checking local and quantile ESS is particularly
relevant when working with HDI intervals as opposed to ESS bulk,
which is suitable for point estimates.

Parameters
----------
idata : obj
idata : InferenceData
Any object that can be converted to an :class:`arviz.InferenceData` object
Refer to documentation of :func:`arviz.convert_to_dataset` for details
var_names : list of variable names, optional
var_names : list of str, optional
Variables to be plotted. Prefix the variables by ``~`` when you want to exclude
them from the plot.
filter_vars : {None, "like", "regex"}, optional, default=None
If `None` (default), interpret var_names as the real variables names. If "like",
interpret var_names as substrings of the real variables names. If "regex",
interpret var_names as regular expressions on the real variables names. A la
``pandas.filter``.
kind : str, optional
Options: ``local``, ``quantile`` or ``evolution``, specify the kind of plot.
relative : bool
them from the plot. See :ref:`this section <common_var_names>` for usage examples.
filter_vars : {None, "like", "regex"}, default None
If `None` (default), interpret `var_names` as the real variables names. If "like",
interpret `var_names` as substrings of the real variables names. If "regex",
interpret `var_names` as regular expressions on the real variables names. See
:ref:`this section <common_filter_vars>` for usage examples.
kind : {"local", "quantile", "evolution"}, default "local"
Specify the kind of plot:

* The ``kind="local"`` argument generates the ESS' local efficiency for
estimating small-interval probability of a desired posterior.
* The ``kind="quantile"`` argument generates the ESS' local efficiency
for estimating quantiles of a desired posterior.
* The ``kind="evolution"`` argument generates the estimated ESS
with an increasing number of iterations of a desired posterior.

relative : bool, default False
Show relative ess in plot ``ress = ess / N``.
coords : dict, optional
Coordinates of var_names to be plotted. Passed to :meth:`xarray.Dataset.sel`.
grid : tuple
Number of rows and columns. Defaults to None, the rows and columns are
Coordinates of `var_names` to be plotted. Passed to :meth:`xarray.Dataset.sel`.
grid : tuple, optional
Number of rows and columns. By default, the rows and columns are
automatically inferred.
figsize : tuple, optional
figsize : (float, float), optional
Figure size. If None it will be defined automatically.
textsize : float, optional
Text size scaling factor for labels, titles and lines. If None it will be autoscaled based
on figsize.
rug : bool
Plot rug plot of values diverging or that reached the max tree depth.
rug_kind : bool
on `figsize`.
rug : bool, default False
Add a `rug plot <https://en.wikipedia.org/wiki/Rug_plot>`_ for a specific subset of values.
rug_kind : str, default "diverging"
Variable in sample stats to use as rug mask. Must be a boolean variable.
n_points : int
n_points : int, default 20
Number of points for which to plot their quantile/local ess or number of subsets
in the evolution plot.
extra_methods : bool, optional
Plot mean and sd ESS as horizontal lines. Not taken into account in evolution kind
min_ess : int
extra_methods : bool, default False
Plot mean and sd ESS as horizontal lines. Not taken into account if ``kind = 'evolution'``.
min_ess : int, default 400
Minimum number of ESS desired. If ``relative=True`` the line is plotted at
``min_ess / n_samples`` for local and quantile kinds and as a curve following
the ``min_ess / n`` dependency in evolution kind.
labeller : Labeller, optional
Class providing the method ``make_label_vert`` to generate the labels in the plot titles.
Read the :ref:`label_guide` for more details and usage examples.
ax : 2D array-like of matplotlib axes or bokeh figures, optional
ax : 2D array-like of matplotlib_axes or bokeh_figure, optional
A 2D array of locations into which to plot the densities. If not supplied, Arviz will create
its own array of plot areas (and return it).
extra_kwargs : dict, optional
If evolution plot, extra_kwargs is used to plot ess tail and differentiate it
If evolution plot, `extra_kwargs` is used to plot ess tail and differentiate it
from ess bulk. Otherwise, passed to extra methods lines.
text_kwargs : dict, optional
Only taken into account when ``extra_methods=True``. kwargs passed to ax.annotate
@@ -99,11 +113,11 @@ def plot_ess(
:func:`~matplotlib.axes.Axes.plot` or to :class:`~bokeh.plotting.figure.line`
rug_kwargs : dict
kwargs passed to rug plot.
backend : str, optional
Select plotting backend {"matplotlib","bokeh"}. Default "matplotlib".
backend_kwargs : bool, optional
backend : {"matplotlib", "bokeh"}, default "matplotlib"
Select plotting backend.
backend_kwargs : dict, optional
These are kwargs specific to the backend being used, passed to
:func:`matplotlib.pyplot.subplots` or :func:`bokeh.plotting.figure`.
:func:`matplotlib.pyplot.subplots` or :class:`bokeh.plotting.figure`.
For additional documentation check the plotting method of the backend.
show : bool, optional
Call backend show function.
@@ -114,22 +128,19 @@ def plot_ess(

Returns
-------
axes: matplotlib axes or bokeh figures
axes : matplotlib_axes or bokeh_figure

See Also
--------
ess: Calculate estimate of the effective sample size.
ess : Calculate estimate of the effective sample size.

References
----------
* Vehtari et al. (2019) see https://arxiv.org/abs/1903.08008

Examples
--------
Plot local ESS. This plot, together with the quantile ESS plot, is recommended to check
that there are enough samples for all the explored regions of parameter space. Checking
local and quantile ESS is particularly relevant when working with HDI intervals as
opposed to ESS bulk, which is relevant for point estimates.
Plot local ESS.

.. plot::
:context: close-figs
@@ -141,15 +152,6 @@ def plot_ess(
... idata, kind="local", var_names=["mu", "theta"], coords=coords
... )

Plot quantile ESS and exclude variables with partial naming

.. plot::
:context: close-figs

>>> az.plot_ess(
... idata, kind="quantile", var_names=['~thet'], filter_vars="like", coords=coords
... )
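
The `var_names` / `filter_vars` semantics documented in these docstrings (exact names by default, substrings with "like", regular expressions with "regex", and a ``~`` prefix to exclude) can be sketched as follows. The helper `select_var_names` is hypothetical and written only from the docstring text; ArviZ's own selection logic may differ in details such as error handling.

```python
import re


def select_var_names(all_names, var_names=None, filter_vars=None):
    """Hypothetical sketch of the documented var_names/filter_vars behavior."""
    if var_names is None:
        return list(all_names)

    # Names prefixed with "~" are exclusions; the rest are inclusions.
    excluded = [name[1:] for name in var_names if name.startswith("~")]
    included = [name for name in var_names if not name.startswith("~")]

    def matches(pattern, name):
        if filter_vars is None:
            return pattern == name  # real variable names
        if filter_vars == "like":
            return pattern in name  # substring match
        if filter_vars == "regex":
            return re.search(pattern, name) is not None
        raise ValueError(f"unknown filter_vars: {filter_vars!r}")

    # With only exclusions, start from every variable.
    if included:
        kept = [n for n in all_names if any(matches(p, n) for p in included)]
    else:
        kept = list(all_names)
    return [n for n in kept if not any(matches(p, n) for p in excluded)]


names = ["mu", "theta", "tau"]
print(select_var_names(names, ["~thet"], filter_vars="like"))
print(select_var_names(names, ["^t"], filter_vars="regex"))
```

The first call corresponds to the `var_names=['~thet'], filter_vars="like"` pattern used in the example above: every variable whose name contains "thet" is dropped.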

Plot ESS evolution as the number of samples increases. When the model is converging properly,
both lines in this plot should be roughly linear.
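
The `relative` and `min_ess` parameters documented above reduce to simple arithmetic, sketched here in plain Python. The function names and the sample numbers are assumptions for illustration; only the formulas (`ress = ess / N`, `min_ess / n_samples`, and the `min_ess / n` curve for the evolution kind) come from the docstring.

```python
def relative_ess(ess, n_samples):
    """Relative ESS as described for ``relative=True``: ress = ess / N."""
    return ess / n_samples


def min_ess_line(min_ess, n_samples, relative=False):
    """Height of the reference line for the local and quantile kinds."""
    return min_ess / n_samples if relative else min_ess


def min_ess_evolution_curve(min_ess, draw_counts):
    """Relative reference curve for the evolution kind: min_ess / n per subset."""
    return [min_ess / n for n in draw_counts]


n_samples = 4000  # e.g. 4 chains x 1000 draws (made-up numbers)
print(relative_ess(1000, n_samples))
print(min_ess_line(400, n_samples, relative=True))
print(min_ess_evolution_curve(400, [500, 1000, 2000, 4000]))
```

With the default `min_ess=400`, the relative threshold shrinks as more draws are used, which is why the evolution kind shows a curve rather than a horizontal line.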
