Diagnose model structure, dependencies, and sampler time (parameter and family levels)
Source: R/diagnostics.R
diagnose_model_structure.Rd
Inspect a NIMBLE model to extract the universe of nodes, classify stochastic
vs deterministic nodes, compute downstream dependencies per node, map
configured samplers to their target nodes, and (optionally) profile
per-sampler run time. Produces tidy tables and publication-quality figures
at both the parameter level and the family level. A "family" is the base
variable name with any indices replaced by empty square brackets, for example
"beta[]" for the nodes "beta[1]" and "beta[2]".
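As a rough illustration of the family naming rule, the base R sketch below collapses indexed node names to a family label. The helper name family_of is hypothetical and the regular expression is an assumption; the package's own implementation may differ.

```r
# Hypothetical helper: map a node name to its family label by replacing
# the trailing index block with empty square brackets.
family_of <- function(nodes) {
  sub("\\[.*\\]$", "[]", nodes)
}

# Nodes without indices are left unchanged.
family_of(c("beta[1]", "beta[2, 3]", "sigma"))
# "beta[]" "beta[]" "sigma"
```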
Usage
diagnose_model_structure(
  model,
  include_data = FALSE,
  removed_nodes = NULL,
  ignore_patterns = c("^lifted_", "^logProb_"),
  make_plots = TRUE,
  output_dir = NULL,
  save_csv = FALSE,
  node_of_interest = NULL,
  sampler_times = NULL,
  sampler_times_unit = "seconds",
  auto_profile = TRUE,
  profile_niter = 3000L,
  profile_burnin = 100L,
  profile_thin = 1L,
  profile_seed = NULL,
  np = 0.1,
  by_family = TRUE,
  family_stat = c("median", "mean", "sum"),
  time_normalize = c("none", "per_node"),
  only_family_plots = FALSE,
  ...
)
Arguments
- model
  A compiled or uncompiled nimbleModel (required).
- include_data
  Logical; include data nodes when enumerating nodes (default FALSE).
- removed_nodes
  Character vector of nodes to exclude explicitly (default NULL).
- ignore_patterns
  Character vector of regular expressions used to exclude nodes (for example "^lifted_", "^logProb_").
- make_plots
  Logical; if TRUE, generate ggplot objects and optionally save them (default TRUE).
- output_dir
  Directory where figures/CSVs will be saved; if NULL, nothing is written to disk (default NULL).
- save_csv
  Logical; if TRUE, write CSV exports (dependencies per node, family tables) (default FALSE).
- node_of_interest
  Optional character scalar naming a node to highlight or subset (reserved for user logic) (default NULL).
- sampler_times
  Optional numeric vector of per-sampler times aligned with nimble::configureMCMC(model)$getSamplers().
- sampler_times_unit
  Character label for time axis (for example "seconds", "ms") (default "seconds").
- auto_profile
  Logical; if TRUE and sampler_times is NULL, profile sampler times automatically via a short MCMC run (default TRUE).
- profile_niter
  Integer; iterations used by the auto-profiler (default 3000L).
- profile_burnin
  Integer; burn-in iterations for auto-profiler (kept for API compatibility; not used internally) (default 100L).
- profile_thin
  Integer; thinning for auto-profiler (kept for API compatibility; not used internally) (default 1L).
- profile_seed
  Optional integer seed for reproducibility (default NULL).
- np
  Proportion in (0, 1] used elsewhere for "worst sampler" selection (kept for API compatibility) (default 0.1).
- by_family
  Logical; if TRUE, compute and plot family-level summaries in addition to parameter-level summaries (default TRUE).
- family_stat
  One of c("median", "mean", "sum"); summary statistic for family-level aggregation (default "median").
- time_normalize
  One of c("none", "per_node"); if "per_node", divide family time by the number of distinct nodes in the family (default "none").
- only_family_plots
  Logical; if TRUE, only family-level figures are exported (parameter-level plots are not written) (default FALSE).
- ...
  Additional arguments forwarded to nimble::configureMCMC().
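To make the effect of ignore_patterns and removed_nodes concrete, here is a minimal base R sketch of the kind of filtering these arguments describe. The node names are made up for illustration, and the way the two filters are combined is an assumption rather than the function's actual code.

```r
# Hypothetical node names, similar to what a NIMBLE model might expose.
nodes <- c("beta[1]", "beta[2]", "sigma",
           "lifted_sqrt_sigma", "logProb_y[1]", "mu[1]")

ignore_patterns <- c("^lifted_", "^logProb_")
removed_nodes   <- "mu[1]"

# Assumption: a node is dropped if it matches any pattern
# or appears in the explicit removal list.
ignored <- grepl(paste(ignore_patterns, collapse = "|"), nodes)
kept    <- nodes[!ignored & !(nodes %in% removed_nodes)]

print(kept)
```

Patterns are matched as regular expressions, while removed_nodes is compared by exact name, so "mu[1]" does not need its brackets escaped.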
Value
A named list containing:
- dependencies_df: data.frame of (node, dependency) pairs.
- dep_counts: data.frame with per-parameter downstream dependency counts.
- samplers_df: data.frame listing samplers and their target nodes (list-column).
- per_param_times: data.frame with per-parameter aggregated sampler time.
- deps_df: tidy data.frame used for parameter-level dependency plotting.
- sampler_df: tidy data.frame used for parameter-level time plotting.
- fam_deps_df: family-level dependency summary (one statistic per family).
- fam_time_df: family-level time summary (optionally normalized).
- plots: list of ggplot objects (some may be NULL): plot_dependencies, plot_sampler_time, plot_combined, plot_dependencies_family, plot_sampler_time_family, plot_combined_family.
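The family-level time summary can be pictured with the following base R sketch, which aggregates per-parameter times by family using family_stat and optionally applies the "per_node" normalization. The function summarise_family, the column names, and the order of operations (summarise first, then divide by the node count) are assumptions for illustration, not the package's implementation.

```r
# Hypothetical per-parameter times (column names are assumptions).
per_param <- data.frame(
  node = c("beta[1]", "beta[2]", "beta[3]", "sigma"),
  time = c(0.2, 0.4, 0.9, 0.5),
  stringsAsFactors = FALSE
)
per_param$family <- sub("\\[.*\\]$", "[]", per_param$node)

summarise_family <- function(df,
                             family_stat = c("median", "mean", "sum"),
                             time_normalize = c("none", "per_node")) {
  # match.arg() picks the first choice when the argument is left at its default.
  family_stat    <- match.arg(family_stat)
  time_normalize <- match.arg(time_normalize)
  stat_fun <- switch(family_stat,
                     median = stats::median, mean = mean, sum = sum)

  out <- aggregate(time ~ family, data = df, FUN = stat_fun)
  if (time_normalize == "per_node") {
    # Divide each family's time by its number of distinct nodes.
    n_nodes  <- tapply(df$node, df$family, function(x) length(unique(x)))
    out$time <- out$time / as.numeric(n_nodes[as.character(out$family)])
  }
  out
}

fam_median  <- summarise_family(per_param)
fam_sum_per <- summarise_family(per_param, "sum", "per_node")
```

With these made-up inputs, the default call reports the median time within each family, while the second call reports the family total divided by the family size.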
Details
Key features:
- Node filtering by regular-expression patterns (ignore_patterns) and an explicit removal list (removed_nodes).
- Downstream dependency counts per node and per family.
- Per-sampler times aggregated to parameters and families.
- Optional auto-profiling of samplers via a short MCMC run.
- Backward compatible: the original per-parameter plots are still produced by default.
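The aggregation of per-sampler times to parameters might look like the base R sketch below. The data are invented, and the rule for samplers with several target nodes (splitting the time evenly across targets) is purely an assumption; the function may attribute block-sampler time differently.

```r
# Hypothetical sampler table with a list-column of target nodes.
samplers <- data.frame(
  sampler = c("RW", "RW", "RW_block"),
  time    = c(0.2, 0.3, 0.6),
  stringsAsFactors = FALSE
)
samplers$targets <- list("sigma", "beta[1]", c("beta[1]", "beta[2]"))

# Assumption: a sampler's time is split evenly across its target nodes.
long <- do.call(rbind, lapply(seq_len(nrow(samplers)), function(i) {
  tg <- samplers$targets[[i]]
  data.frame(node = tg,
             time = samplers$time[i] / length(tg),
             stringsAsFactors = FALSE)
}))

# Sum the contributions of all samplers that touch each node.
per_param_times <- aggregate(time ~ node, data = long, FUN = sum)
print(per_param_times)
```

Here "beta[1]" receives time from both its own sampler and half of the block sampler, which is why a per-parameter table can differ from a simple per-sampler listing.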