Generates a comprehensive diagnostic panel of MCMC bottlenecks using
efficiency and convergence metrics computed per node or node family.
The function can optionally restrict the analysis to nodes that are
actually sampled (i.e. have a sampler assigned in the NIMBLE
configuration), identified automatically via conf.mcmc$getSamplers().
It produces publication-ready figures of:
- Median Algorithmic Efficiency (AE = ESS/iter) by node family;
- Median Computational Efficiency (CE = ESS/s) by node family;
- Worst targets by CE (lowest ESS/s);
- Median or worst \(\hat{R}\) (Gelman-Rubin) by node or family.
The function saves each plot both as PDF and PNG in the specified output directory. Bar widths and spacing are optimized for compact presentation.
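The summaries listed above can be approximated by hand from a diagnostics table. The following is a minimal base R sketch, not the function's actual implementation: the toy values are invented, and `n_iter` (the number of retained iterations) is a hypothetical quantity that is not an argument of plot_bottlenecks.

```r
# Toy diagnostics table; all values are made up for illustration only.
diag_tbl <- data.frame(
  target      = c("alpha", "beta[1]", "beta[2]", "sigma"),
  Family      = c("alpha", "beta", "beta", "sigma"),
  ESS         = c(800, 200, 400, 50),
  ESS_per_sec = c(40, 10, 20, 2.5),
  Rhat        = c(1.00, 1.02, 1.01, 1.10),
  stringsAsFactors = FALSE
)
n_iter <- 1000  # hypothetical number of retained iterations (assumption)

# Algorithmic efficiency: effective samples per iteration.
diag_tbl$AE <- diag_tbl$ESS / n_iter

# Median AE and median CE (ESS/s) per node family.
family_eff <- aggregate(cbind(AE, ESS_per_sec) ~ Family,
                        data = diag_tbl, FUN = median)

# Worst targets by CE: lowest ESS/s first, truncated to top_k rows.
top_k <- 2L
worst_ce <- head(diag_tbl[order(diag_tbl$ESS_per_sec), ], top_k)

# Targets whose Rhat exceeds the reference threshold.
rhat_ref <- 1.05
not_converged <- diag_tbl$target[diag_tbl$Rhat > rhat_ref]
```

Medians are used here because the description speaks of median efficiency per family; a single very slow node then shows up in the worst-targets table rather than dragging down its whole family.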
Usage
plot_bottlenecks(
diag_tbl,
out_dir = "outputs/diagnostics",
top_k = 20L,
rhat_ref = 1.05,
sampled_only = FALSE,
conf.mcmc = NULL,
samples_ml = NULL,
make_esss_targets = TRUE,
make_esss_families = TRUE,
make_time_families = TRUE,
make_rhat_hist_targets = TRUE,
make_rhat_worst_targets = TRUE,
make_rhat_median_families = TRUE,
make_hist_ae_families = FALSE,
make_hist_ce_families = FALSE
)
Arguments
- diag_tbl
data.frame or tibble containing diagnostics per target node. Must include columns such as target, Family, ESS, ESS_per_sec, and optionally Rhat.
- out_dir
Character string; path to output directory for saving figures (default: "outputs/diagnostics"). Will be created recursively if missing.
- top_k
Integer; number of worst or best nodes to display (default: 20L).
- rhat_ref
Numeric; reference threshold for Gelman-Rubin \(\hat{R}\) (default: 1.05).
- sampled_only
Logical; if TRUE, restricts plots to nodes that have an explicit sampler in conf.mcmc and are present in samples_ml. Default: FALSE.
- conf.mcmc
NIMBLE MCMC configuration object, typically produced by configureMCMC(model, ...) or stored as build_fn()$conf. Used to extract sampler-attached target nodes when sampled_only = TRUE.
- samples_ml
Optional MCMC list (as returned by runMCMC(..., nchains > 1)), used to match sampler targets with actual sample column names.
- make_esss_targets
Logical; if TRUE, produces barplot of worst targets by computational efficiency (default: TRUE).
- make_esss_families
Logical; if TRUE, produces barplot of median algorithmic efficiency (AE) by node family (default: TRUE).
- make_time_families
Logical; if TRUE, produces barplot of median computational efficiency (CE) by node family (default: TRUE).
- make_rhat_hist_targets
Logical; if TRUE, produces barplot of median Rhat per family (default: TRUE).
- make_rhat_worst_targets
Logical; if TRUE, produces barplot of the worst Rhat targets (default: TRUE).
- make_rhat_median_families
Logical; if TRUE, produces an alias of the median Rhat-by-family plot (default: TRUE).
- make_hist_ae_families
Logical; if TRUE, draw AE histograms by family.
- make_hist_ce_families
Logical; if TRUE, draw CE histograms by family.
Value
Invisibly returns a named list of ggplot objects:
- bar_family_algorithmic_eff: Median AE by family;
- bar_family_computational_eff: Median CE by family;
- bar_target_CE: Worst targets by CE;
- rhat_family_template: Median Rhat by family;
- rhat_worst_targets: Worst targets by Rhat.
Each plot is also saved in out_dir as both .pdf and .png.
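Saving one plot in both formats can be sketched as follows. This assumes ggplot2 is installed and uses a temporary directory in place of out_dir; the file name and the toy data are illustrative choices, not necessarily what plot_bottlenecks writes.

```r
library(ggplot2)

# Stand-in for out_dir; created recursively if it does not exist yet.
out_dir <- file.path(tempdir(), "outputs", "diagnostics")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Toy family-level summary (made-up values).
df <- data.frame(Family = c("alpha", "beta", "sigma"),
                 CE     = c(40, 15, 2.5))

# Horizontal bar chart with narrow bars, ordered by CE.
p <- ggplot(df, aes(x = reorder(Family, CE), y = CE)) +
  geom_col(width = 0.6) +
  coord_flip() +
  labs(x = "Node family", y = "Median CE (ESS/s)")

# Write the same plot once per format; ggsave picks the device
# from the file extension.
for (ext in c("pdf", "png")) {
  ggsave(file.path(out_dir, paste0("bar_family_computational_eff.", ext)),
         plot = p, width = 6, height = 4)
}
```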
Details
This function is a core visualization tool for diagnosing performance bottlenecks in large hierarchical Bayesian models (e.g., SAM-like or GEREM-type stock assessment models). It integrates runtime, efficiency, and convergence metrics in a standardized panel of plots, suitable for benchmarking, model comparison, or publication figures.
When sampled_only = TRUE, it automatically extracts the list of stochastic
nodes (targets) from conf.mcmc$getSamplers() and intersects them with the
variable names present in samples_ml. This ensures only stochastically
sampled nodes are visualized, excluding lifted or deterministic intermediates.
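The intersection step can be illustrated without a NIMBLE model. In this sketch the sampler list and the MCMC list are plain R mock-ups: each sampler entry is assumed to expose a `target` field, as the objects returned by conf.mcmc$getSamplers() are expected to, and the node names are invented. With a real configuration, block-sampler targets may first need expanding to scalar node names, which is not shown here.

```r
# Mocked stand-in for conf.mcmc$getSamplers(): a list of sampler
# configurations, each carrying a `target` field (assumption).
samplers <- list(
  list(name = "RW",       target = "sigma"),
  list(name = "RW_block", target = c("beta[1]", "beta[2]")),
  list(name = "RW",       target = "tau")  # sampled but not monitored
)

# Mocked stand-in for samples_ml: one matrix of draws per chain.
cols <- c("sigma", "beta[1]", "beta[2]", "lifted_d1_over_sigma")
samples_ml <- list(
  chain1 = matrix(0, nrow = 5, ncol = 4, dimnames = list(NULL, cols)),
  chain2 = matrix(0, nrow = 5, ncol = 4, dimnames = list(NULL, cols))
)

# Collect every node that has a sampler attached.
sampler_targets <- unique(unlist(lapply(samplers, function(s) s$target)))

# Keep only the targets that also appear as sample columns.
sample_cols   <- colnames(samples_ml[[1]])
sampled_nodes <- intersect(sampler_targets, sample_cols)

# Restrict a diagnostics table to the sampled nodes.
diag_tbl <- data.frame(target = cols, ESS = c(50, 200, 400, 900),
                       stringsAsFactors = FALSE)
diag_sampled <- diag_tbl[diag_tbl$target %in% sampled_nodes, , drop = FALSE]
```

Here the lifted node is dropped because no sampler targets it, and `tau` is dropped because it was never monitored, so neither would appear in the plots.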