Statistical estimation strategies

Overview

run_fb4() supports four strategies for estimating the proportion of maximum consumption (p):

Strategy	Input required	Output
`"binary_search"`	target final weight	point estimate of p
`"mle"`	observed final weights	p + SE + 95 % CI
`"bootstrap"`	observed final weights	p + SD + bootstrap CI
`"hierarchical"`	individual mark-recapture data	population mean and SD of p

Every strategy takes a Bioenergetic object as input and returns an fb4_result object with the same structure.

Shared setup

The Bioenergetic object below is used by all four strategies. It represents an adult Chinook salmon cohort (Oncorhynchus tshawytscha) over a 100-day growing season.

data(fish4_parameters)

chinook_db <- fish4_parameters[["Oncorhynchus tshawytscha"]]
stage      <- names(chinook_db$life_stages)[1]
sp_params  <- chinook_db$life_stages[[stage]]
sp_info    <- chinook_db$species_info

days <- 1:100
temp_data <- data.frame(
  Day         = days,
  Temperature = round(10 + 5 * sin(2 * pi * (days - 60) / 365), 2)
)
diet_data <- list(
  proportions = data.frame(Day = days, Alewife = 0.7, Shrimp = 0.3),
  prey_names  = c("Alewife", "Shrimp"),
  energies    = data.frame(Day = days, Alewife = 4900, Shrimp = 3200)
)

bio <- Bioenergetic(
  species_params      = sp_params,
  species_info        = sp_info,
  environmental_data  = list(temperature = temp_data),
  diet_data           = diet_data,
  simulation_settings = list(initial_weight = 1800, duration = 100)
)
bio$species_params$predator$ED_ini <- 4500
bio$species_params$predator$ED_end <- 5500

Observed final weights simulated around a target of 3 000 g are used by the likelihood and bootstrap strategies:

set.seed(42)
obs_weights <- rnorm(30, mean = 3000, sd = 80)

Strategy 1 — Binary search

Binary search finds the p that produces a specified target weight. Use it when a point estimate is enough and you have a known growth target.

res_bs <- run_fb4(bio,
                  strategy  = "binary_search",
                  fit_to    = "Weight",
                  fit_value = 3000)

cat("p estimate  :", round(res_bs$summary$p_value, 4), "\n")
#> p estimate  : 0.9095
cat("Final weight:", round(res_bs$summary$final_weight, 1), "g\n")
#> Final weight: 3000 g

Strategy 2 — Maximum Likelihood Estimation

MLE finds the p that maximises the likelihood of observing the supplied weights, assuming a log-normal distribution around the predicted weight. It returns a standard error and a 95 % confidence interval via the profile likelihood.

res_mle <- run_fb4(bio,
                   strategy         = "mle",
                   fit_to           = "Weight",
                   observed_weights = obs_weights)

cat("p estimate :", round(res_mle$summary$p_value, 4), "\n")
#> p estimate : 0.9106
cat("SE         :", round(res_mle$summary$p_se,    4), "\n")
#> SE         : 0.0055
cat("Converged  :", res_mle$summary$converged, "\n")
#> Converged  : TRUE

Strategy 3 — Bootstrap

Bootstrap resamples the observed weights and re-fits the model at each replicate, producing a distribution of p estimates without assuming a particular form for the weight distribution.

res_boot <- run_fb4(bio,
                    strategy         = "bootstrap",
                    fit_to           = "Weight",
                    observed_weights = obs_weights,
                    n_bootstrap      = 100,
                    upper            = 1,
                    parallel         = FALSE)

cat("p mean :", round(res_boot$summary$p_mean, 4), "\n")
#> p mean : 0.9111
cat("p SD   :", round(res_boot$summary$p_sd,   4), "\n")
#> p SD   : 0.0052

Strategy 4 — Hierarchical model (mark-recapture)

Biological context

The hierarchical strategy is designed for mark-recapture bioenergetics: individual fish are tagged, weighed at release (initial weight), and re-weighed at recapture (final weight) after a fixed monitoring period. The model estimates a p value for each individual, then pools those estimates into a population-level distribution of feeding levels.

This answers two biological questions: - What is the average feeding level (p) of the population under the observed environmental conditions? - How variable is individual feeding behaviour within that population?

Simulating a mark-recapture data set

First we use binary search to find the p that is consistent with this bio object (initial weight 1 800 g, 100 days). That value becomes the centre of the simulated population distribution.

res_ref <- run_fb4(bio,
                   strategy  = "binary_search",
                   fit_to    = "Weight",
                   fit_value = 2500)
ref_p <- res_ref$summary$p_value
cat("Reference p (target 2 500 g):", round(ref_p, 4), "\n")
#> Reference p (target 2 500 g): 0.7528

set.seed(123)
n_fish <- 20

# Cohort: individual initial weights varying ± 15 % around 1 800 g
initial_weights <- round(runif(n_fish, min = 1800 * 0.85, max = 1800 * 1.15))

# Each fish has its own feeding level drawn from a population distribution
# centred on the reference p estimated above (sigma_p = 0.06)
true_p <- pmax(0.3, pmin(1.5, rnorm(n_fish, mean = ref_p, sd = 0.06)))

# Simulate final weights: run individual direct simulations + measurement noise
final_weights <- vapply(seq_len(n_fish), function(i) {
  bio_i <- bio
  bio_i$simulation_settings$initial_weight <- initial_weights[i]
  res_i <- run_fb4(bio_i, strategy = "direct", p_value = true_p[i])
  round(res_i$summary$final_weight * rnorm(1, mean = 1, sd = 0.02), 1)
}, numeric(1))

mr_data <- data.frame(
  individual_id  = sprintf("fish_%02d", seq_len(n_fish)),
  initial_weight = initial_weights,
  final_weight   = final_weights
)

head(mr_data, 6)
#>   individual_id initial_weight final_weight
#> 1       fish_01           1685       2620.3
#> 2       fish_02           1956       2720.2
#> 3       fish_03           1751       2565.9
#> 4       fish_04           2007       2790.0
#> 5       fish_05           2038       2686.7
#> 6       fish_06           1555       2581.8

Running the hierarchical model

res_hier <- run_fb4(
  x                = bio,
  strategy         = "hierarchical",
  backend          = "tmb",
  fit_to           = "Weight",
  observed_weights = mr_data
)

cat(sprintf("Population mean p  (mu_p)   : %.4f\n", res_hier$summary$mu_p_estimate))
#> Population mean p  (mu_p)   : 0.7447
cat(sprintf("Population SD  p  (sigma_p) : %.4f\n", res_hier$summary$sigma_p_estimate))
#> Population SD  p  (sigma_p) : 0.0847
cat(sprintf("Number of individuals       : %d\n",   res_hier$summary$n_individuals))
#> Number of individuals       : 20
cat(sprintf("Converged                   : %s\n",   res_hier$summary$converged))
#> Converged                   : TRUE

The data were generated with a theoretical population mean of p = 0.753 (the binary-search estimate for a 2 500 g target) and SD = 0.06. Because only 20 fish were drawn, the realised sample mean is 0.746 — this is the value mu_p should recover. sigma_p reflects both the true individual variation (SD = 0.06) and the 2 % measurement noise added to the final weights.

Assumptions and limitations

For mu_p and sigma_p to be biologically meaningful, the study design must satisfy a few conditions.

All fish in the data set must belong to the same species and life stage and must have experienced the same environmental conditions (temperature, diet). If you mix life stages or fish from different habitats, mu_p becomes a statistical average with no clear ecological interpretation.

The model also assumes a fixed monitoring period: every individual is tracked for the same n_days. Variable recapture intervals are not currently supported.

On the statistical side, individual p values are modelled on the log scale, so the implied population distribution is approximately log-normal. This guarantees p > 0 and works well when individual variation is moderate. The model accepts a covariate matrix (covariates_matrix) if you want to explain part of the variation in p by body size, sex, or tagging location; without covariates, all unexplained variation accumulates in sigma_p.

Finally, the hierarchical model requires backend = "tmb" — automatic differentiation via TMB is needed for the mixed-effects likelihood. The pure-R backend does not support this strategy.

Comparing strategies

Strategies 1–3: single-population estimates

Binary search, MLE, and bootstrap all answer the same question — what p produces the observed growth of a representative fish? — and can therefore be compared numerically. All three use the same bio object (initial_weight = 1800 g, target or observed weights ≈ 3 000 g).

Strategies 1–3 fitted to the same data (Chinook salmon, 100-day simulation, target weight ≈ 3 000 g).
Strategy	p estimate	SE / SD	Uncertainty type
binary_search	0.9095	NA	none (point estimate)
mle	0.9106	0.0055	SE (asymptotic)
bootstrap	0.9111	0.0052	SD (non-parametric)

Strategy 4: population-level distribution

The hierarchical model answers a different question: not the p of a representative individual, but the distribution of p across a population of tagged fish. Its output (mu_p, sigma_p) is not numerically comparable to the single estimates above because the input data differ (each fish has its own initial and final weight) and the parameter being estimated is a population mean, not an individual feeding level.

Hierarchical model results for the simulated mark-recapture cohort (n = 20 fish, true mu_p = 0.753 ).
Parameter	Estimated value
mu_p (population mean)	0.7447
sigma_p (population SD)	0.0847
n_individuals	20
converged	TRUE

References

Deslauriers D, Chipps SR, Breck JE, Rice JA, Madenjian CP (2017). Fish Bioenergetics 4.0: An R-Based Modeling Application. Fisheries 42(11):586–596.

Chipps SR, Wahl DH (2008). Bioenergetics modeling in the 21st century: reviewing new insights and revisiting old constraints. Transactions of the American Fisheries Society 137(1):298–313.

Hans Ttito

2026-03-29