Skip to content
Sindbad.ParameterOptimization Module
julia
ParameterOptimization

The ParameterOptimization module provides tools for optimizing SINDBAD models, including parameter estimation, model calibration, and cost function evaluation. It integrates various optimization algorithms and utilities to streamline the optimization workflow for SINDBAD experiments.

Purpose

This module is designed to support optimization tasks in SINDBAD, such as calibrating model parameters to match observations or minimizing cost functions. It leverages multiple optimization libraries and provides a unified interface for running optimization routines.

Dependencies

Related (SINDBAD ecosystem)

  • ErrorMetrics: Metric implementations for cost evaluation.

  • TimeSamplers: Temporal helpers used by some workflows.

  • OmniTools: Shared helpers and table utilities.

External (third-party)

  • StableRNGs: Reproducible random number generation for stochastic workflows.

Internal (within Sindbad)

  • Sindbad.Setup

  • Sindbad.Simulation

  • Sindbad.Types

  • SindbadTEM

Optional dependencies (extensions / weakdeps)

Some optimizer backends are enabled via Julia extensions (see root Project.toml and ext/):

  • CMAEvolutionStrategySindbadCMAEvolutionStrategyExt

  • OptimizationSindbadOptimizationExt

Other packages listed under [weakdeps] may be used by experimental workflows but are not required for the base module to load.

Included Files

  • handleDataForCost.jl: Helpers for aligning forcing/observation/model outputs for cost evaluation.

  • getCost.jl: Cost extraction and convenience wrappers.

  • optimizer.jl: Core optimization logic (algorithm selection + option normalization).

  • cost.jl: Cost functions for evaluating model–observation mismatch.

  • prepOpti.jl: Prepares inputs and bookkeeping for optimization runs.

  • optimizeTEM.jl: Optimization routines for single sites and spatial workflows.

  • sensitivityAnalysis.jl: Sensitivity analysis utilities.

Note

  • The package integrates multiple optimization libraries, allowing users to choose the most suitable algorithm for their problem.

  • Designed to be modular and extensible, enabling users to customize optimization workflows for specific use cases.

  • Supports both gradient-based and derivative-free optimization methods, ensuring flexibility for different types of cost functions.

Examples

julia
julia> using Sindbad

julia> # Run parameter optimization from a configuration file
julia> # result = runExperimentOpti("path/to/experiment_config.json")

julia> # Calculate cost between model output and observations
julia> # cost_result = runExperimentCost("path/to/experiment_config.json")

Functions

combineMetric

Sindbad.ParameterOptimization.combineMetric Function
julia
combineMetric(metric_vector::AbstractArray, ::MetricSum)
combineMetric(metric_vector::AbstractArray, ::MetricMinimum)
combineMetric(metric_vector::AbstractArray, ::MetricMaximum)
combineMetric(metric_vector::AbstractArray, percentile_value::T)

combines the metric from all constraints based on the type of combination.

Arguments:

  • metric_vector: a vector of metrics for variables

methods for combining the metric

  • ::MetricSum: return the total sum as the metric.

  • ::MetricMinimum: return the minimum of the metric_vector as the metric.

  • ::MetricMaximum: return the maximum of the metric_vector as the metric.

  • percentile_value::T: percentile_value^th percentile of metric of each constraint as the overall metric

Examples

julia
julia> using Sindbad

julia> metric_vec = [0.5, 0.7, 0.3]
3-element Vector{Float64}:
 0.5
 0.7
 0.3

julia> combineMetric(metric_vec, MetricSum())
1.5

julia> combineMetric(metric_vec, MetricMinimum())
0.3

julia> combineMetric(metric_vec, MetricMaximum())
0.7
Code
julia
function combineMetric end

function combineMetric(metric_vector::AbstractArray, ::MetricSum)
    return sum(metric_vector)
end

function combineMetric(metric_vector::AbstractArray, ::MetricSum)
    return sum(metric_vector)
end

function combineMetric(metric_vector::AbstractArray, ::MetricMinimum)
    return minimum(metric_vector)
end

function combineMetric(metric_vector::AbstractArray, ::MetricMaximum)
    return maximum(metric_vector)
end

function combineMetric(metric_vector::AbstractArray, percentile_value::T) where {T<:Real}
    return percentile(metric_vector, percentile_value)
end

cost

Sindbad.ParameterOptimization.cost Function
julia
cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, cost_method<: CostMethod)

Calculate the cost for a parameter vector.

Arguments

  • parameter_vector: Vector of parameter values to be optimized

  • 'default_values': Default values for model parameters

  • selected_models: Collection of selected models for simulation

  • space_forcing: Forcing data for the main simulation period

  • space_spinup_forcing: Forcing data for the spin-up period

  • loc_forcing_t: Time-specific forcing data

  • output_array: Array to store simulation outputs

  • space_output: Spatial output configuration

  • space_land: Land surface characteristics

  • tem_info: Temporal information for simulation

  • observations: Observed data for comparison

  • parameter_updater: Function to update parameters

  • cost_options: Options for cost function calculation

  • multi_constraint_method: Method for handling multiple constraints

  • parameter_scaling_type: Type of parameter scaling

  • sindbad_cost_method <: CostMethod: a type parameter indicating cost calculation method

Returns

Cost value representing the difference between model outputs and observations

sindbad_cost_method:

CostMethod

Abstract type for cost calculation methods in SINDBAD

Available methods/subtypes:

  • CostModelObs: cost calculation between model output and observations

  • CostModelObsLandTS: cost calculation between land model output and time series observations

  • CostModelObsMT: multi-threaded cost calculation between model output and observations

  • CostModelObsPriors: cost calculation between model output, observations, and priors. NOTE THAT THIS METHOD IS JUST A PLACEHOLDER AND DOES NOT CALCULATE PRIOR COST PROPERLY YET

Examples

julia
julia> using Sindbad

julia> # Calculate cost for a parameter vector
julia> # cost_value = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())
Code
julia
function cost end

function cost(parameter_vector, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObs)
    @debug parameter_vector
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    runTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output, space_land, tem_info)
    cost_vector = metricVector(output_array, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function cost(parameter_vector, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObs)
    @debug parameter_vector
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    runTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output, space_land, tem_info)
    cost_vector = metricVector(output_array, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function cost(parameter_matrix, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, cost_out::Vector, ::CostModelObsMT)
    @debug "parameter_matrix:: ", size(parameter_matrix)
    parameter_set_size = size(parameter_matrix, 2)
    done_params=1
    Threads.@threads for parameter_index in eachindex(1:parameter_set_size)
        idx = Threads.threadid()
        parameter_vector = parameter_matrix[:, parameter_index]
        @debug parameter_vector
        updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
        coreTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output[idx], space_land, tem_info)
        cost_vector = metricVector(space_output[idx], observations, cost_options)
        cost_metric = combineMetric(cost_vector, multi_constraint_method)
        cost_out[parameter_index] = cost_metric
        @debug "Parameter column:: ", idx, round(100 * done_params/parameter_set_size,digits=2), parameter_set_size, cost_metric, cost_vector
        done_params += 1
    end
    return cost_out
end

function cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObsPriors)
    # prior has to be calculated before the parameters are backscaled and models are updated
    cost_prior = metric(MSE(), parameter_vector, parameter_vector, default_values)
    cost_metric = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())
    cost_metric = cost_metric + cost_prior
    @debug cost_vector, cost_metric
    return cost_metric
end

function cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    cost_metric = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())
    return cost_metric
end

function costLand end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, ::Nothing, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

costLand

Sindbad.ParameterOptimization.costLand Function
julia
costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)

costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, _, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)

Calculates the cost of SINDBAD model simulations for a single location by comparing model outputs as collections of SINDBAD land with observations using specified metrics and constraints.

In the first variant, the land_time_series is preallocated for computational efficiency. In the second variant, the runTEM stacks the land using map function and the preallocations is not necessary.

Arguments:

  • parameter_vector::AbstractArray: A vector of model parameter values to be optimized.

  • selected_models: A tuple of selected SINDBAD models in the given model structure, the parameters of which are optimized.

  • forcing: A forcing NamedTuple containing the time series of environmental drivers for the simulation.

  • spinup_forcing: A forcing NamedTuple for the spinup phase, used to initialize the model to a steady state.

  • loc_forcing_t: A forcing NamedTuple for a single location and a single time step.

  • land_timeseries: A preallocated vector to store the land state for each time step during the simulation.

  • land_init: The initial SINDBAD land NamedTuple containing all fields and subfields.

  • tem_info: A nested NamedTuple containing necessary information for running SINDBAD TEM, including helpers, models, and spinup configurations.

  • observations: A NamedTuple or vector of arrays containing observational data, uncertainties, and masks for calculating performance metrics.

  • parameter_updater: A function to update model parameters based on the parameter_vector.

  • cost_options: A table specifying how each observation constraint should be used to calculate the cost or performance metric.

  • multi_constraint_method: A method for combining the vector of costs into a single cost value or vector, as required by the optimization algorithm.

  • parameter_scaling_type: Specifies the type of scaling applied to the parameters during optimization.

Returns:

  • cost_metric: A scalar or vector representing the cost, calculated by comparing model outputs with observations using the specified metrics and constraints.

Note

  • The function updates the selected models using the parameter_vector and parameter_updater.

  • It runs the SINDBAD TEM simulation for the specified location using runTEM.

  • The model outputs are compared with observations using metricVector, which calculates the performance metrics.

  • The resulting cost vector is combined into a single cost value or vector using combineMetric and the specified multi_constraint_method.

Examples:

  1. Calculating cost for a single location:
julia
cost = costLand(parameter_vector, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
  1. Using a custom multi-constraint method:
julia
custom_method = CustomConstraintMethod()
cost = costLand(parameter_vector, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, custom_method, parameter_scaling_type)
  1. Handling observational uncertainties:
  • Observations can include uncertainties and masks to refine the cost calculation, ensuring robust model evaluation.
Code
julia
function costLand end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, ::Nothing, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

getCostVectorSize

Sindbad.ParameterOptimization.getCostVectorSize Function
julia
getCostVectorSize(algo_options, parameter_vector, ::ParameterOptimizationMethod || GSAMethod)

Calculates the size of the cost vector required for a specific optimization or sensitivity analysis method.

Arguments:

  • algo_options: A NamedTuple or dictionary containing algorithm-specific options (e.g., population size, number of trajectories).

  • parameter_vector: A vector of parameters used in the optimization or sensitivity analysis.

  • ::ParameterOptimizationMethod: The optimization or sensitivity analysis method. Supported methods include:

    • CMAEvolutionStrategyCMAES: Covariance Matrix Adaptation Evolution Strategy.

    • GSAMorris: Morris method for global sensitivity analysis.

    • GSASobol: Sobol method for global sensitivity analysis.

    • GSASobolDM: Sobol method with Design Matrices.

Returns:

  • An integer representing the size of the cost vector required for the specified method.

Notes:

  • For CMAEvolutionStrategyCMAES, the size is determined by the population size or a default formula based on the parameter vector length.

  • For GSAMorris, the size is calculated as the product of the number of trajectories and the length of the design matrix.

  • For GSASobol, the size is determined by the number of parameters and the number of samples.

  • For GSASobolDM, the size is equivalent to that of GSASobol.

Code
julia
function getCostVectorSize end

function getCostVectorSize(algo_options, parameter_vector, ::CMAEvolutionStrategyCMAES)
    cost_vector_size = Threads.nthreads()
    if hasproperty(algo_options, :multi_threading)
        if algo_options.multi_threading
            if hasproperty(algo_options, :popsize)
                cost_vector_size = algo_options.popsize
            else
                cost_vector_size = 4 + floor(Int, 3 * log(length(parameter_vector)))
            end
        end
    end
    return cost_vector_size
end

function getCostVectorSize(algo_options, parameter_vector, ::CMAEvolutionStrategyCMAES)
    cost_vector_size = Threads.nthreads()
    if hasproperty(algo_options, :multi_threading)
        if algo_options.multi_threading
            if hasproperty(algo_options, :popsize)
                cost_vector_size = algo_options.popsize
            else
                cost_vector_size = 4 + floor(Int, 3 * log(length(parameter_vector)))
            end
        end
    end
    return cost_vector_size
end

function getCostVectorSize(algo_options, __precompile__, ::GSAMorris)
    default_opt = sindbadDefaultOptions(GSAMorris())
    num_trajectory = default_opt.num_trajectory
    len_design_mat = default_opt.len_design_mat
    if hasproperty(algo_options, :num_trajectory)
        num_trajectory = algo_options.num_trajectory
    end
    if hasproperty(algo_options, :len_design_mat)
        len_design_mat = algo_options.len_design_mat
    end
    cost_vector_size = num_trajectory * len_design_mat
    return cost_vector_size
end

function getCostVectorSize(algo_options, parameter_vector, ::GSASobol)
    default_opt = sindbadDefaultOptions(GSASobol())
    samples = default_opt.samples
    nparam = length(parameter_vector)
    norder = length(algo_options.method_options.order) - 1
    if hasproperty(algo_options, :samples)
        samples = algo_options.samples
    end
    cost_vector_size = samples * (norder * nparam + 2)
    return cost_vector_size
end

function getCostVectorSize(algo_options, parameter_vector, ::GSASobolDM)
    return getCostVectorSize(algo_options, parameter_vector, GSASobol())
end

getData

Sindbad.ParameterOptimization.getData Function
julia
getData(model_output::LandWrapper, observations, cost_option)
getData(model_output::NamedTuple, observations, cost_option)
getData(model_output::AbstractArray, observations, cost_option)

Arguments:

  • model_output: a collection of SINDBAD model output time series as a time series of stacked land NT or as a preallocated array.

  • observations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/loss

  • cost_option: information for a observation constraint on how it should be used to calculate the loss/metric of model performance

Code
julia
function getDataWithoutNaN end

function getDataWithoutNaN(y, yσ, ŷ, idxs)
    y_view = @view y[idxs] 
    yσ_view = @view yσ[idxs] 
    ŷ_view = @view ŷ[idxs] 
    return (y_view, yσ_view, ŷ_view)
end

function getDataWithoutNaN(y, yσ, ŷ, idxs)
    y_view = @view y[idxs] 
    yσ_view = @view yσ[idxs] 
    ŷ_view = @view ŷ[idxs] 
    return (y_view, yσ_view, ŷ_view)
end

function getDataWithoutNaN(y, yσ, ŷ)
    @debug sum(is_invalid_number.(y)), sum(is_invalid_number.(yσ)), sum(is_invalid_number.(ŷ))
    idxs = (.!isnan.(y .*.* ŷ)) # TODO this has to be run because LandWrapper produces a vector. So, dispatch with the inefficient versions without idxs argument
    return y[idxs], yσ[idxs], ŷ[idxs]
end

getDataWithoutNaN

Sindbad.ParameterOptimization.getDataWithoutNaN Function
julia
getDataWithoutNaN(y, yσ, ŷ, idxs)
getDataWithoutNaN(y, yσ, ŷ)

return model and obs data excluding for the common NaN or for the valid pixels idxs.

Arguments:

  • y: observation data

  • : observational uncertainty data

  • ŷ: model simulation data/estimate

  • idxs: indices of valid data points

Examples

julia
julia> using Sindbad

julia> y = [1.0, NaN, 3.0, 4.0]
4-element Vector{Float64}:
   1.0
 NaN
   3.0
   4.0

julia>= [0.1, 0.2, 0.1, 0.1]
4-element Vector{Float64}:
 0.1
 0.2
 0.1
 0.1

julia>= [1.1, 2.0, 2.9, 4.1]
4-element Vector{Float64}:
 1.1
 2.0
 2.9
 4.1

julia> y_clean, yσ_clean, ŷ_clean = getDataWithoutNaN(y, yσ, ŷ)
([1.0, 3.0, 4.0], [0.1, 0.1, 0.1], [1.1, 2.9, 4.1])
Code
julia
function getDataWithoutNaN end

function getDataWithoutNaN(y, yσ, ŷ, idxs)
    y_view = @view y[idxs] 
    yσ_view = @view yσ[idxs] 
    ŷ_view = @view ŷ[idxs] 
    return (y_view, yσ_view, ŷ_view)
end

function getDataWithoutNaN(y, yσ, ŷ, idxs)
    y_view = @view y[idxs] 
    yσ_view = @view yσ[idxs] 
    ŷ_view = @view ŷ[idxs] 
    return (y_view, yσ_view, ŷ_view)
end

function getDataWithoutNaN(y, yσ, ŷ)
    @debug sum(is_invalid_number.(y)), sum(is_invalid_number.(yσ)), sum(is_invalid_number.(ŷ))
    idxs = (.!isnan.(y .*.* ŷ)) # TODO this has to be run because LandWrapper produces a vector. So, dispatch with the inefficient versions without idxs argument
    return y[idxs], yσ[idxs], ŷ[idxs]
end

getModelOutputView

Sindbad.ParameterOptimization.getModelOutputView Function
julia
 getModelOutputView(_dat::AbstractArray{<:Any,N}) where N
Code
julia
function getModelOutputView(_dat::AbstractArray{<:Any,N}) where N
    dim = 1
    inds = map(size(_dat)) do _
        ind = dim == 2 ? 1 : Colon()
        dim += 1
        ind
    end
    @view _dat[inds...]
end

globalSensitivity

Sindbad.ParameterOptimization.globalSensitivity Function
julia
globalSensitivity(cost_function, method_options, p_bounds, ::GSAMethod; batch=true)

Performs global sensitivity analysis using the specified method.

Arguments:

  • cost_function: A function that computes the cost or output of the model based on input parameters.

  • method_options: A set of options specific to the chosen sensitivity analysis method.

  • p_bounds: A vector or matrix specifying the bounds of the parameters for sensitivity analysis.

  • ::GSAMethod: The sensitivity analysis method to use.

  • batch: A boolean flag indicating whether to perform batch processing (default: true).

Returns:

  • A results object containing the sensitivity indices and other relevant outputs for the specified method.

algorithm:

GSAMethod

Abstract type for global sensitivity analysis methods in SINDBAD

Available methods/subtypes:

  • GSAMorris: Morris method for global sensitivity analysis

  • GSASobol: Sobol method for global sensitivity analysis

  • GSASobolDM: Sobol method with derivative-based measures for global sensitivity analysis


Extended help

Notes:

  • The function internally calls the gsa function from the GlobalSensitivity.jl package with the specified method and options.

  • The cost_function should be defined to compute the model output based on the input parameters.

  • The method_options argument allows fine-tuning of the sensitivity analysis process for each method.

Code
julia
function globalSensitivity end

function globalSensitivity(cost_function, method_options, p_bounds, ::GSAMorris; batch=true)
    results = gsa(cost_function, Morris(; method_options...), p_bounds, batch=batch)
    return results
end

function globalSensitivity(cost_function, method_options, p_bounds, ::GSAMorris; batch=true)
    results = gsa(cost_function, Morris(; method_options...), p_bounds, batch=batch)
    return results
end

function globalSensitivity(cost_function, method_options, p_bounds, ::GSASobol; batch=true)
    sampler = getproperty(Sindbad.ParameterOptimization.GlobalSensitivity, Symbol(method_options.sampler))(; method_options.sampler_options..., method_options.method_options... )
    results = gsa(cost_function, sampler, p_bounds; method_options..., batch=batch)
    return results
end

function globalSensitivity(cost_function, method_options, p_bounds, ::GSASobolDM; batch=true)
    sampler = getproperty(Sindbad.ParameterOptimization.GlobalSensitivity, Symbol(method_options.sampler))(; method_options.sampler_options...)
    samples = method_options.samples
    lb = first.(p_bounds)
    ub = last.(p_bounds)
    A, B = QuasiMonteCarlo.generate_design_matrices(samples, lb, ub, sampler)
    results = gsa(cost_function, Sobol(; method_options.method_options...), A, B; method_options..., batch=batch)
    return results
end

metricVector

Sindbad.ParameterOptimization.metricVector Function
julia
metricVector(model_output::LandWrapper, observations, cost_options)
metricVector(model_output, observations, cost_options)

returns a vector of metrics for variables in cost_options.variable.

Arguments:

  • observations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/loss

  • model_output: a collection of SINDBAD model output time series as a time series of stacked land NT

  • cost_options: a table listing each observation constraint and how it should be used to calculate the loss/metric of model performance

Examples

julia
julia> using Sindbad

julia> # Calculate metrics for all cost options
julia> # metric_vec = metricVector(model_output, observations, cost_options)
Code
julia
function metricVector end

function metricVector(model_output, observations, cost_options)
    loss_vector = map(cost_options) do cost_option
        @debug "***cost for $(cost_option.variable)***"
        lossMetric = cost_option.cost_metric
        (y, yσ, ŷ) = getData(model_output, observations, cost_option)
        (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
        @debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ)
        # (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
        metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
        if isnan(metr)
            metr = oftype(metr, 1e19)
        end
        @debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
        metr
    end
    @debug "\n-------------------\n"
    return loss_vector
end

function metricVector(model_output, observations, cost_options)
    loss_vector = map(cost_options) do cost_option
        @debug "***cost for $(cost_option.variable)***"
        lossMetric = cost_option.cost_metric
        (y, yσ, ŷ) = getData(model_output, observations, cost_option)
        (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
        @debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ)
        # (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
        metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
        if isnan(metr)
            metr = oftype(metr, 1e19)
        end
        @debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
        metr
    end
    @debug "\n-------------------\n"
    return loss_vector
end

function metricVector(model_output::LandWrapper, observations, cost_options)
    loss_vector = map(cost_options) do cost_option
        @debug "$(cost_option.variable)"
        lossMetric = cost_option.cost_metric
        (y, yσ, ŷ) = getData(model_output, observations, cost_option)
        @debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ), size(idxs)
        (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ) ## cannot use the valids because LandWrapper produces vector
        metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
        if isnan(metr)
            metr = oftype(metr, 1e19)
        end
        @debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
        metr
    end
    @debug "\n-------------------\n"
    return loss_vector
end

optimizeTEM

Sindbad.ParameterOptimization.optimizeTEM Function
julia
optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)

Optimizes TEM parameters by minimizing the cost function between model outputs and observations.

Arguments:

  • forcing: a forcing NT that contains the forcing time series set for ALL locations

  • observations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/loss

  • info: a SINDBAD NT that includes all information needed for setup and execution of an experiment

Returns:

  • parameter_table: A table containing optimized parameter values

Examples

julia
julia> using Sindbad

julia> # Optimize TEM parameters
julia> # optimized_params = optimizeTEM(forcing, observations, info)
Code
julia
function optimizeTEM end

function optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)
    # get the subset of parameters table that consists of only optimized parameters
    opti_helpers = prepOpti(forcing, observations, info, info.optimization.run_options.cost_method)

    # run the optimizer
    optim_para = optimizer(opti_helpers.cost_function, opti_helpers.default_values, opti_helpers.lower_bounds, opti_helpers.upper_bounds, info.optimization.optimizer.options, info.optimization.optimizer.method)

    optim_para = backScaleParameters(optim_para, opti_helpers.parameter_table, info.optimization.run_options.parameter_scaling)

    # update the parameter table with the optimized values
    opti_helpers.parameter_table.optimized .= optim_para
    return opti_helpers.parameter_table
end

function optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)
    # get the subset of parameters table that consists of only optimized parameters
    opti_helpers = prepOpti(forcing, observations, info, info.optimization.run_options.cost_method)

    # run the optimizer
    optim_para = optimizer(opti_helpers.cost_function, opti_helpers.default_values, opti_helpers.lower_bounds, opti_helpers.upper_bounds, info.optimization.optimizer.options, info.optimization.optimizer.method)

    optim_para = backScaleParameters(optim_para, opti_helpers.parameter_table, info.optimization.run_options.parameter_scaling)

    # update the parameter table with the optimized values
    opti_helpers.parameter_table.optimized .= optim_para
    return opti_helpers.parameter_table
end

function optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)
    incubes = (forcing.data..., observations.data...)
    indims = (forcing.dims..., observations.dims...)
    forcing_vars = collect(forcing.variables)
    outdims = output.parameter_dim
    out = output.land_init
    obs_vars = collect(observations.variables)

    params = mapCube(optimizeYax, (incubes...,); out=out, tem=tem, optim=optim, forcing_vars=forcing_vars, obs_vars=obs_vars, indims=indims, outdims=outdims, max_cache=max_cache)
    return params
end

optimizeTEMYax

Sindbad.ParameterOptimization.optimizeTEMYax Function
julia
optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)

Optimizes the Terrestrial Ecosystem Model (TEM) parameters for each pixel by mapping over the YAXcube(s).

Arguments

  • forcing::NamedTuple: Input forcing data for the TEM model

  • output::NamedTuple: Output configuration settings

  • tem::NamedTuple: TEM model parameters and settings

  • optim::NamedTuple: Optimization parameters and settings

  • observations::NamedTuple: Observed data for model calibration

Keywords

  • max_cache::Float64=1e9: Maximum cache size for optimization process

Returns

Optimized TEM parameters cube

Code
julia
function optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)
    incubes = (forcing.data..., observations.data...)
    indims = (forcing.dims..., observations.dims...)
    forcing_vars = collect(forcing.variables)
    outdims = output.parameter_dim
    out = output.land_init
    obs_vars = collect(observations.variables)

    params = mapCube(optimizeYax, (incubes...,); out=out, tem=tem, optim=optim, forcing_vars=forcing_vars, obs_vars=obs_vars, indims=indims, outdims=outdims, max_cache=max_cache)
    return params
end

optimizer

Sindbad.ParameterOptimization.optimizer Function
julia
optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, algorithm <: ParameterOptimizationMethod)

Optimize model parameters using various optimization algorithms.

Arguments:

  • cost_function: A function handle that takes a parameter vector as input and calculates a cost/loss (scalar or vector).

  • default_values: A vector of default parameter values to initialize the optimization.

  • lower_bounds: A vector of lower bounds for the parameters.

  • upper_bounds: A vector of upper bounds for the parameters.

  • algo_options: A set of options specific to the chosen optimization algorithm.

  • algorithm: The optimization algorithm to be used.

Returns:

  • optim_para: A vector of optimized parameter values.

algorithm:

ParameterOptimizationMethod

Abstract type for optimization methods in SINDBAD

Available methods/subtypes:

  • BayesOptKMaternARD5: Bayesian Optimization using Matern 5/2 kernel with Automatic Relevance Determination from BayesOpt.jl

  • CMAEvolutionStrategyCMAES: Covariance Matrix Adaptation Evolution Strategy (CMA-ES) from CMAEvolutionStrategy.jl

  • EvolutionaryCMAES: Evolutionary version of CMA-ES optimization from Evolutionary.jl

  • OptimBFGS: Broyden-Fletcher-Goldfarb-Shanno (BFGS) from Optim.jl

  • OptimLBFGS: Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) from Optim.jl

  • OptimizationBBOadaptive: Black Box Optimization with adaptive parameters from Optimization.jl

  • OptimizationBBOxnes: Black Box Optimization using Natural Evolution Strategy (xNES) from Optimization.jl

  • OptimizationBFGS: BFGS optimization with box constraints from Optimization.jl

  • OptimizationFminboxGradientDescent: Gradient descent optimization with box constraints from Optimization.jl

  • OptimizationFminboxGradientDescentFD: Gradient descent optimization with box constraints using forward differentiation from Optimization.jl

  • OptimizationGCMAESDef: Global CMA-ES optimization with default settings from Optimization.jl

  • OptimizationGCMAESFD: Global CMA-ES optimization using forward differentiation from Optimization.jl

  • OptimizationMultistartOptimization: Multi-start optimization to find global optimum from Optimization.jl

  • OptimizationNelderMead: Nelder-Mead simplex optimization method from Optimization.jl

  • OptimizationQuadDirect: Quadratic Direct optimization method from Optimization.jl


Extended help

Notes:

  • The function supports a wide range of optimization algorithms, each tailored for specific use cases.

  • Some methods do not require bounds for optimization, while others do.

  • The cost_function should be defined by the user to calculate the loss based on the model output and observations. It is defined in cost.jl.

  • The algo_options argument allows fine-tuning of the optimization process for each algorithm.

  • Some algorithms (e.g., BayesOptKMaternARD5, OptimizationBBOxnes) require additional configuration steps, such as setting kernels or merging default and user-defined options.

Examples

julia
julia> using Sindbad

julia> # Optimize using CMA-ES algorithm
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, CMAEvolutionStrategyCMAES())

julia> # Optimize using BFGS algorithm
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, OptimBFGS())

julia> # Optimize using Black Box Optimization (xNES)
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, OptimizationBBOxnes())

Implementation Details:

  • The function internally calls the appropriate optimization library and algorithm based on the algorithm argument.

  • Each algorithm has its own implementation details, such as handling bounds, configuring options, and solving the optimization problem.

  • The results are processed to extract the optimized parameter vector (optim_para), which is returned to the user.

Code
julia
function optimizer(::Any, default_values::Any, ::Any, ::Any, ::Any, x::ParameterOptimizationMethod)
    @warn "
    Optimizer `$(nameof(typeof(x)))` not implemented. 
    
    To implement a new optimizer:
    
    - First add a new type as a subtype of `ParameterOptimizationMethod` in `src/Types/ParameterOptimizationTypes.jl`. 
    
    - Then, add a corresponding method:
      - if it can be implemented as an internal Sindbad method without additional dependencies, implement the method in `src/ParameterOptimization/optimizer.jl`.     
      - if it requires additional dependencies, implement the method in `ext/<extension_name>/ParameterOptimizationOptimizer.jl` extension.

    As a fallback, this function will return the default values as the optimized parameters.

    "
    return default_values
end

prepOpti

Sindbad.ParameterOptimization.prepOpti Function
julia
prepOpti(forcing, observations, info, cost_method::CostModelObs)

Prepares optimization setup including cost functions, parameter bounds, and initial values.

Arguments:

  • forcing: Forcing data for the simulation

  • observations: Observation data for cost calculation

  • info: Experiment configuration NamedTuple

  • cost_method: Cost calculation method type

Returns:

  • A NamedTuple containing optimization helpers including cost function, parameter bounds, and default values

Examples

julia
julia> using Sindbad

julia> # Prepare optimization setup
julia> # opti_helpers = prepOpti(forcing, observations, info, CostModelObs())

Prepares optimization parameters, settings, and helper functions based on the provided inputs.

Arguments:

  • forcing: Input forcing data used for the optimization process.

  • observations: Observed data used for comparison or calibration during optimization.

  • info: A SINDBAD NamedTuple containing all information needed for setup and execution of the experiment.

  • cost_method: The method used to calculate the cost function.

Returns:

  • A NamedTuple opti_helpers containing:
    • parameter_table: Processed model parameters for optimization.

    • cost_function: A function to compute the cost for optimization.

    • cost_options: Options and settings for the cost function.

    • default_values: Default parameter values for the models.

    • lower_bounds: Lower bounds for the parameters.

    • upper_bounds: Upper bounds for the parameters.

    • run_helpers: Helper information for running the optimization.

cost_method:

CostMethod

Abstract type for cost calculation methods in SINDBAD

Available methods/subtypes:

  • CostModelObs: cost calculation between model output and observations

  • CostModelObsLandTS: cost calculation between land model output and time series observations

  • CostModelObsMT: multi-threaded cost calculation between model output and observations

  • CostModelObsPriors: cost calculation between model output, observations, and priors. NOTE THAT THIS METHOD IS JUST A PLACEHOLDER AND DOES NOT CALCULATE PRIOR COST PROPERLY YET


Extended help

Notes:

  • The function processes the input data and configuration to set up the optimization problem.

  • It prepares model parameters, cost options, and helper functions required for the optimization process.

  • Depending on the cost_method, the cost function is customized to handle specific data types or computation methods.

Code
julia
function prepOpti end

function prepOpti(forcing, observations, info)
    return prepOpti(forcing, observations, info, CostModelObs())
end

function prepOpti(forcing, observations, info)
    return prepOpti(forcing, observations, info, CostModelObs())
end

function  prepOpti(forcing, observations, info, ::CostModelObsMT; algorithm_info_field=:optimizer)
    algorithm_info = getproperty(info.optimization, algorithm_info_field)
    opti_helpers = prepOpti(forcing, observations, info, CostModelObs())
    run_helpers = opti_helpers.run_helpers
    cost_vector_size = getCostVectorSize(getproperty(algorithm_info, :options), opti_helpers.default_values, getproperty(algorithm_info, :method))
    cost_vector = Vector{eltype(opti_helpers.default_values)}(undef, cost_vector_size)
    
    space_index = 1 # the parallelization of cost computation only runs in single pixel runs

    cost_function = x -> cost(x, opti_helpers.default_values, info.models.forward, run_helpers.space_forcing[space_index], run_helpers.space_spinup_forcing[space_index], run_helpers.loc_forcing_t, run_helpers.output_array, run_helpers.space_output_mt, deepcopy(run_helpers.space_land[space_index]), run_helpers.tem_info, observations, opti_helpers.parameter_table, opti_helpers.cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling, cost_vector, info.optimization.run_options.cost_method)

    opti_helpers = (; opti_helpers..., cost_function=cost_function, cost_vector=cost_vector)
    return opti_helpers
end

function  prepOpti(forcing, observations, info, ::CostModelObsLandTS)
    opti_helpers = prepOpti(forcing, observations, info, CostModelObs())
    run_helpers = opti_helpers.run_helpers

    cost_function = x -> costLand(x, info.models.forward, run_helpers.loc_forcing, run_helpers.loc_spinup_forcing, run_helpers.loc_forcing_t, run_helpers.land_time_series, run_helpers.loc_land, run_helpers.tem_info, observations, opti_helpers.parameter_table, opti_helpers.cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling)

    opti_helpers = (; opti_helpers..., cost_function=cost_function)
    
    return opti_helpers
end

function  prepOpti(forcing, observations, info, cost_method::CostModelObs)
    run_helpers = prepTEM(forcing, info)

    parameter_helpers = prepParameters(info.optimization.parameter_table, info.optimization.run_options.parameter_scaling)
    
    parameter_table = parameter_helpers.parameter_table
    default_values = parameter_helpers.default_values
    lower_bounds = parameter_helpers.lower_bounds
    upper_bounds = parameter_helpers.upper_bounds

    cost_options = prepCostOptions(observations, info.optimization.cost_options, cost_method)

    cost_function = x -> cost(x, default_values, info.models.forward, run_helpers.space_forcing, run_helpers.space_spinup_forcing, run_helpers.loc_forcing_t, run_helpers.output_array, run_helpers.space_output, deepcopy(run_helpers.space_land), run_helpers.tem_info, observations, parameter_table, cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling, cost_method)

    opti_helpers = (; parameter_table=parameter_table, cost_function=cost_function, cost_options=cost_options, default_values=default_values, lower_bounds=lower_bounds, upper_bounds=upper_bounds, run_helpers=run_helpers)
    
    return opti_helpers
end

prepParameters

Sindbad.ParameterOptimization.prepParameters Function
julia
prepParameters(parameter_table, parameter_scaling)

Prepare model parameters for optimization by processing default and bounds of the parameters to be optimized.

Arguments

  • parameter_table: Table of the parameters to be optimized

  • parameter_scaling: Scaling method/type for parameter optimization

Returns

A tuple containing processed parameters ready for optimization

Code
julia
function prepParameters(parameter_table, parameter_scaling)
    
    default_values, lower_bounds, upper_bounds = scaleParameters(parameter_table, parameter_scaling)

    parameter_helpers = (; parameter_table=parameter_table, default_values=default_values, lower_bounds=lower_bounds, upper_bounds=upper_bounds)
    return parameter_helpers
end