Identify bottlenecks at the parameter-family level
Source: R/assess.R
identify_bottlenecks_family.Rd
Parameters are grouped into families defined by the prefix before the first
"[" in their name (for example, "beta[1]" and "beta[2]"
belong to the family "beta"). For each family, the function computes
median efficiency metrics and derived quantities that help identify
bottlenecks.
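The grouping rule described above can be sketched in base R. The helper name `family_of` is hypothetical and only illustrates the "prefix before the first `[`" convention; it is not claimed to be the package's internal implementation.

```r
# Hypothetical helper (not part of the package): derive the family name
# by dropping everything from the first "[" to the end of the name.
family_of <- function(param_names) {
  sub("\\[.*$", "", param_names)
}

params <- c("beta[1]", "beta[2]", "sigma", "theta[1, 2]")
families <- family_of(params)

# Group the parameter names by family.
by_family <- split(params, families)
print(by_family)
```

Names without an index, such as "sigma", form a single-member family under this rule.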
Usage
identify_bottlenecks_family(
  samples,
  runtime_s,
  ess_threshold = 1000,
  sampler_params = NULL,
  model = NULL,
  mcmc_conf = NULL,
  ignore_patterns = c("^lifted_", "^logProb_"),
  strict_sampler_only = TRUE,
  auto_configure = TRUE,
  rhat_threshold = 1.01,
  ess_per_s_min = 0
)
Arguments
- samples
An object containing MCMC samples; typically an object of class mcmc.list, mcmc, matrix, or data.frame.
- runtime_s
Numeric scalar. Wall-clock runtime of the MCMC run in seconds.
- ess_threshold
Numeric scalar. Target ESS per family (default is 1000).
- sampler_params
Optional character vector of parameter names to keep when defining families. Parameters not in this vector are ignored.
- model
A nimbleModel (compiled or uncompiled).
- mcmc_conf
Optional MCMC configuration (from configureMCMC). If NULL, a fresh configuration is built internally.
- ignore_patterns
Character vector of regular expressions for node or family names to exclude from the bottleneck search.
- strict_sampler_only
Logical; if TRUE, only nodes actually sampled by a user-level sampler are considered.
- auto_configure
Logical; if TRUE, the function will configure a baseline MCMC when mcmc_conf is missing.
- rhat_threshold
Numeric scalar kept for API symmetry (not used in the ranking).
- ess_per_s_min
Numeric scalar. Optional CE threshold (ESS per second) used to flag families below this value. Use 0 to deactivate.
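To illustrate what the default ignore_patterns would match, here is a minimal base R sketch. It assumes the patterns are combined with a logical OR and tested against each name with `grepl()`; how the function applies them internally is not confirmed by this page, and the node names are made up for the example.

```r
# Default patterns from the Usage section.
ignore_patterns <- c("^lifted_", "^logProb_")

# Illustrative node names (hypothetical).
node_names <- c("beta[1]", "lifted_sqrt_tau", "logProb_y[1]", "sigma")

# Assumption: a name is ignored when it matches any of the patterns.
combined <- paste(ignore_patterns, collapse = "|")
ignored <- grepl(combined, node_names)

kept_nodes <- node_names[!ignored]
print(kept_nodes)
```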
Value
A list or data.frame describing bottleneck families and their structural and computational load.
A list with components:
- type
Character string, either "ok" or "degenerate_only".
- details
List with components ce, ae, time, and degenerate summarising the diagnostics.
- per_family
Data frame (or tibble) of metrics by family.
- summary
Single-row data frame (or tibble) with global summaries across families.
- top3
Data frame containing the three worst families according to the main ranking criterion.
Details
Group parameters into families and rank them by median efficiency metrics.
For each family, the following median metrics are computed:
- AE_med = median(AE) (low values are worse),
- CE_med = median(CE) (low values are worse; CE is ESS per second),
- ESS_med = median(ESS),
- Rhat_med = median(Rhat, with na.rm = TRUE).
From these, the following diagnostics are derived:
- slow_node_time = ess_threshold / CE_med (seconds needed to reach the target ESS; higher is worse),
- meet_target = logical flag, TRUE when slow_node_time <= runtime_s.
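The derived quantities can be reproduced by hand on a toy table. The sketch below uses made-up ESS values and assumes CE is obtained as ESS divided by runtime_s; the page only states that CE is ESS per second, so the exact computation inside the function may differ.

```r
# Toy per-parameter diagnostics (values are illustrative only).
per_param <- data.frame(
  family = c("beta", "beta", "sigma"),
  ESS = c(400, 600, 2000),
  stringsAsFactors = FALSE
)
runtime_s <- 100
ess_threshold <- 1000

# Assumption: CE (ESS per second) = ESS / wall-clock runtime.
per_param$CE <- per_param$ESS / runtime_s

# Median metrics by family.
per_family <- aggregate(cbind(ESS, CE) ~ family, data = per_param, FUN = median)
names(per_family) <- c("family", "ESS_med", "CE_med")

# Seconds needed to reach the target ESS, and whether the run was long enough.
per_family$slow_node_time <- ess_threshold / per_family$CE_med
per_family$meet_target <- per_family$slow_node_time <= runtime_s

print(per_family)
```

Here the "beta" family has CE_med = 5, so it would need 200 seconds to reach an ESS of 1000 and misses the target in a 100-second run, while "sigma" needs 50 seconds and meets it.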
Families with degenerate metrics (non-finite or non-positive ESS, AE,
or CE) are reported in the degenerate component of details and
excluded from the ranking.
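The degeneracy rule can be written as a simple filter. The following is a sketch on an invented family-level table, using a hypothetical helper `is_usable`; it shows the stated criterion (finite and strictly positive ESS, AE, and CE), not the package's actual code.

```r
# Invented family-level metrics; NaN and 0 mark degenerate entries.
metrics <- data.frame(
  family  = c("alpha", "beta", "gamma", "delta"),
  ESS_med = c(500, NaN, 300, 800),
  AE_med  = c(0.05, 0.10, 0.02, 0.08),
  CE_med  = c(5, 2, 0, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE only for finite, strictly positive values.
is_usable <- function(x) is.finite(x) & x > 0

keep <- is_usable(metrics$ESS_med) &
  is_usable(metrics$AE_med) &
  is_usable(metrics$CE_med)

degenerate <- metrics[!keep, ]  # reported separately
rankable <- metrics[keep, ]     # eligible for the ranking

print(degenerate$family)
print(rankable$family)
```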
When sampler_params is provided, only parameters whose names are
included in sampler_params are used to form families (typically
stochastic nodes that are actually sampled).
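As a rough illustration of this restriction, the base R sketch below keeps only the names listed in sampler_params before forming families. The parameter names are invented, and exact matching with `%in%` is an assumption about how membership is tested.

```r
# Invented column names from an MCMC sample matrix.
all_params <- c("beta[1]", "beta[2]", "mu", "deviance", "pred[1]")

# Names the user wants to keep when defining families.
sampler_params <- c("beta[1]", "beta[2]", "mu")

# Assumption: membership is tested by exact name matching.
kept_params <- all_params[all_params %in% sampler_params]

# Families are then formed from the retained names only.
kept_families <- unique(sub("\\[.*$", "", kept_params))
print(kept_families)
```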