Inspects a NIMBLE model to enumerate its nodes, classify them as stochastic or deterministic, count downstream dependencies per node, map configured samplers to their target nodes, and (optionally) profile per-sampler run time. Produces tidy tables and publication-quality figures at both the parameter level and the family level. A "family" is the base variable name with any indices collapsed to empty square brackets; for example, "beta[1]" and "beta[2]" both belong to the family "beta[]".
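The family naming rule can be illustrated with a minimal base R sketch. The helper name node_family() is hypothetical and is not part of this package; the function's internal implementation may differ.

```r
# Hypothetical helper (not exported by the package): collapse an indexed
# node name to its family label.
node_family <- function(nodes) {
  # Replace the first "[" and everything after it with "[]";
  # names without indices are returned unchanged.
  sub("\\[.*$", "[]", nodes)
}

nodes <- c("beta[1]", "beta[2]", "sigma", "mu[1, 3]")
families <- node_family(nodes)
families
```

Under this rule, scalar nodes such as "sigma" form a family of their own, while every element of an indexed variable maps to the same label.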

Usage

diagnose_model_structure(
  model,
  include_data = FALSE,
  removed_nodes = NULL,
  ignore_patterns = c("^lifted_", "^logProb_"),
  make_plots = TRUE,
  output_dir = NULL,
  save_csv = FALSE,
  node_of_interest = NULL,
  sampler_times = NULL,
  sampler_times_unit = "seconds",
  auto_profile = TRUE,
  profile_niter = 3000L,
  profile_burnin = 100L,
  profile_thin = 1L,
  profile_seed = NULL,
  np = 0.1,
  by_family = TRUE,
  family_stat = c("median", "mean", "sum"),
  time_normalize = c("none", "per_node"),
  only_family_plots = FALSE,
  ...
)

Arguments

model

A compiled or uncompiled nimbleModel (required).

include_data

Logical; include data nodes when enumerating nodes (default FALSE).

removed_nodes

Character vector of nodes to exclude explicitly (default NULL).

ignore_patterns

Character vector of regular expressions; nodes whose names match any pattern are excluded (default c("^lifted_", "^logProb_")).

make_plots

Logical; if TRUE, generate ggplot objects and optionally save them (default TRUE).

output_dir

Directory where figures/CSVs will be saved; if NULL, nothing is written to disk (default NULL).

save_csv

Logical; if TRUE, write CSV exports of the per-node dependencies and the family tables (default FALSE).

node_of_interest

Optional character scalar naming a node to highlight or subset (reserved for user logic) (default NULL).

sampler_times

Optional numeric vector of per-sampler times, in the same order as the samplers returned by nimble::configureMCMC(model)$getSamplers() (default NULL).

sampler_times_unit

Character label for time axis (for example "seconds", "ms") (default "seconds").

auto_profile

Logical; if TRUE and sampler_times is NULL, profile sampler times automatically via a short MCMC run (default TRUE).

profile_niter

Integer; iterations used by the auto-profiler (default 3000L).

profile_burnin

Integer; burn-in iterations for auto-profiler (kept for API compatibility; not used internally) (default 100L).

profile_thin

Integer; thinning for auto-profiler (kept for API compatibility; not used internally) (default 1L).

profile_seed

Optional integer seed for reproducibility (default NULL).

np

Proportion in (0, 1] used elsewhere for "worst sampler" selection (kept for API compatibility) (default 0.10).

by_family

Logical; if TRUE, compute and plot family-level summaries in addition to parameter-level summaries (default TRUE).

family_stat

One of c("median", "mean", "sum"); summary statistic for family-level aggregation (default "median").

time_normalize

One of c("none", "per_node"); if "per_node", divide family time by the number of distinct nodes in the family (default "none").

only_family_plots

Logical; if TRUE, only family-level figures are exported (parameter-level plots are not written) (default FALSE).

...

Additional arguments forwarded to nimble::configureMCMC().
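How removed_nodes and ignore_patterns might combine to filter the node list can be sketched in base R as follows. The node names are made up for illustration, and the package's internal filtering logic is assumed rather than confirmed.

```r
# Illustrative node names (not taken from a real model).
all_nodes <- c("beta[1]", "beta[2]", "lifted_d1_over_sigma",
               "logProb_y[1]", "sigma")
ignore_patterns <- c("^lifted_", "^logProb_")
removed_nodes <- c("sigma")

# TRUE for every node that matches at least one regular expression.
ignored <- Reduce(`|`, lapply(ignore_patterns, grepl, x = all_nodes))

# Keep nodes that match no pattern and are not explicitly removed.
kept <- all_nodes[!ignored & !(all_nodes %in% removed_nodes)]
kept
```

Here the patterns are anchored with "^", so only names that start with "lifted_" or "logProb_" are dropped.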

Value

A named list containing:

  • dependencies_df: data.frame of (node, dependency) pairs.

  • dep_counts: data.frame with per-parameter downstream dependency counts.

  • samplers_df: data.frame listing samplers and their target nodes (list-column).

  • per_param_times: data.frame with per-parameter aggregated sampler time.

  • deps_df: tidy data.frame used for parameter-level dependency plotting.

  • sampler_df: tidy data.frame used for parameter-level time plotting.

  • fam_deps_df: family-level dependency summary (one statistic per family).

  • fam_time_df: family-level time summary (optionally normalized).

  • plots: list of ggplot objects (some may be NULL): plot_dependencies, plot_sampler_time, plot_combined, plot_dependencies_family, plot_sampler_time_family, plot_combined_family.
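Because some entries of plots may be NULL, downstream code may want to drop them before printing or saving. A minimal sketch, using a stand-in list with character placeholders instead of real ggplot objects:

```r
# Stand-in for res$plots; strings replace the ggplot objects for illustration.
plots <- list(
  plot_dependencies        = "ggplot placeholder",
  plot_sampler_time        = NULL,
  plot_combined            = NULL,
  plot_dependencies_family = "ggplot placeholder"
)

# Keep only the entries that were actually produced.
available <- Filter(Negate(is.null), plots)
names(available)
```

With a real result, the same Filter() call could be applied to res$plots before looping over the figures.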

Details

Key features:

  • Robust filtering of nodes (ignored patterns, removal list).

  • Downstream dependency counts per node and per family.

  • Per-sampler times aggregated to parameters and families.

  • Optional auto-profiling of samplers via a short MCMC run.

  • Backward compatible: the original per-parameter plots are still produced by default.
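The aggregation of per-parameter times to families, with family_stat and time_normalize = "per_node", can be sketched in base R as below. The timings are invented for illustration, and the column names are assumptions that need not match the returned data frames.

```r
# Invented per-parameter times (seconds); not real profiling output.
per_param <- data.frame(
  node = c("beta[1]", "beta[2]", "beta[3]", "sigma"),
  time = c(0.25, 0.5, 0.75, 0.125)
)
per_param$family <- sub("\\[.*$", "[]", per_param$node)

family_stat <- "sum"  # one of "median", "mean", "sum"

# Aggregate times within each family using the chosen statistic.
fam_time <- tapply(per_param$time, per_param$family, match.fun(family_stat))

# Count the distinct nodes in each family.
n_nodes <- tapply(per_param$node, per_param$family,
                  function(x) length(unique(x)))

fam <- data.frame(
  family  = names(fam_time),
  time    = as.numeric(fam_time),
  n_nodes = as.integer(n_nodes[names(fam_time)])
)

# time_normalize = "per_node": divide family time by its node count.
fam$time_per_node <- fam$time / fam$n_nodes
fam
```

Normalizing per node makes large families such as "beta[]" comparable with scalar parameters, which would otherwise look cheap only because they contain a single node.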

Examples

if (FALSE) { # \dontrun{
res <- diagnose_model_structure(
  model            = my_nimble_model,
  make_plots       = TRUE,
  output_dir       = "outputs/diagnostics",
  save_csv         = TRUE,
  by_family        = TRUE,
  family_stat      = "median",
  time_normalize   = "per_node",
  only_family_plots = FALSE
)
} # }