SINDBAD

julia

ParameterOptimization

The ParameterOptimization module provides tools for optimizing SINDBAD models, including parameter estimation, model calibration, and cost function evaluation. It integrates various optimization algorithms and utilities to streamline the optimization workflow for SINDBAD experiments.

Purpose

This module is designed to support optimization tasks in SINDBAD, such as calibrating model parameters to match observations or minimizing cost functions. It leverages multiple optimization libraries and provides a unified interface for running optimization routines.

Dependencies

Related (SINDBAD ecosystem)

ErrorMetrics: Metric implementations for cost evaluation.
TimeSamplers: Temporal helpers used by some workflows.
OmniTools: Shared helpers and table utilities.

External (third-party)

StableRNGs: Reproducible random number generation for stochastic workflows.

Internal (within Sindbad)

Sindbad.Setup
Sindbad.Simulation
Sindbad.Types
SindbadTEM

Optional dependencies (extensions / weakdeps)

Some optimizer backends are enabled via Julia extensions (see root Project.toml and ext/):

CMAEvolutionStrategy → SindbadCMAEvolutionStrategyExt
Optimization → SindbadOptimizationExt

Other packages listed under [weakdeps] may be used by experimental workflows but are not required for the base module to load.

Included Files

handleDataForCost.jl: Helpers for aligning forcing/observation/model outputs for cost evaluation.
getCost.jl: Cost extraction and convenience wrappers.
optimizer.jl: Core optimization logic (algorithm selection + option normalization).
cost.jl: Cost functions for evaluating model–observation mismatch.
prepOpti.jl: Prepares inputs and bookkeeping for optimization runs.
optimizeTEM.jl: Optimization routines for single sites and spatial workflows.
sensitivityAnalysis.jl: Sensitivity analysis utilities.

Note

The package integrates multiple optimization libraries, allowing users to choose the most suitable algorithm for their problem.
Designed to be modular and extensible, enabling users to customize optimization workflows for specific use cases.
Supports both gradient-based and derivative-free optimization methods, ensuring flexibility for different types of cost functions.

Examples

julia

julia> using Sindbad

julia> # Run parameter optimization from a configuration file
julia> # result = runExperimentOpti("path/to/experiment_config.json")

julia> # Calculate cost between model output and observations
julia> # cost_result = runExperimentCost("path/to/experiment_config.json")

Functions

combineMetric

Sindbad.ParameterOptimization.combineMetric Function

julia

combineMetric(metric_vector::AbstractArray, ::MetricSum)
combineMetric(metric_vector::AbstractArray, ::MetricMinimum)
combineMetric(metric_vector::AbstractArray, ::MetricMaximum)
combineMetric(metric_vector::AbstractArray, percentile_value::T)

combines the metric from all constraints based on the type of combination.

Arguments:

metric_vector: a vector of metrics for variables

methods for combining the metric

::MetricSum: return the total sum as the metric.
::MetricMinimum: return the minimum of the metric_vector as the metric.
::MetricMaximum: return the maximum of the metric_vector as the metric.
percentile_value::T: percentile_value^th percentile of metric of each constraint as the overall metric

Examples

julia

julia> using Sindbad

julia> metric_vec = [0.5, 0.7, 0.3]
3-element Vector{Float64}:
 0.5
 0.7
 0.3

julia> combineMetric(metric_vec, MetricSum())
1.5

julia> combineMetric(metric_vec, MetricMinimum())
0.3

julia> combineMetric(metric_vec, MetricMaximum())
0.7

Code

julia

function combineMetric end

function combineMetric(metric_vector::AbstractArray, ::MetricSum)
    return sum(metric_vector)
end

function combineMetric(metric_vector::AbstractArray, ::MetricSum)
    return sum(metric_vector)
end

function combineMetric(metric_vector::AbstractArray, ::MetricMinimum)
    return minimum(metric_vector)
end

function combineMetric(metric_vector::AbstractArray, ::MetricMaximum)
    return maximum(metric_vector)
end

function combineMetric(metric_vector::AbstractArray, percentile_value::T) where {T<:Real}
    return percentile(metric_vector, percentile_value)
end

cost

Sindbad.ParameterOptimization.cost Function

julia

cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, cost_method<: CostMethod)

Calculate the cost for a parameter vector.

Arguments

parameter_vector: Vector of parameter values to be optimized
'default_values': Default values for model parameters
selected_models: Collection of selected models for simulation
space_forcing: Forcing data for the main simulation period
space_spinup_forcing: Forcing data for the spin-up period
loc_forcing_t: Time-specific forcing data
output_array: Array to store simulation outputs
space_output: Spatial output configuration
space_land: Land surface characteristics
tem_info: Temporal information for simulation
observations: Observed data for comparison
parameter_updater: Function to update parameters
cost_options: Options for cost function calculation
multi_constraint_method: Method for handling multiple constraints
parameter_scaling_type: Type of parameter scaling
sindbad_cost_method <: CostMethod: a type parameter indicating cost calculation method

Returns

Cost value representing the difference between model outputs and observations

sindbad_cost_method:

CostMethod

Abstract type for cost calculation methods in SINDBAD

Available methods/subtypes:

CostModelObs: cost calculation between model output and observations
CostModelObsLandTS: cost calculation between land model output and time series observations
CostModelObsMT: multi-threaded cost calculation between model output and observations
CostModelObsPriors: cost calculation between model output, observations, and priors. NOTE THAT THIS METHOD IS JUST A PLACEHOLDER AND DOES NOT CALCULATE PRIOR COST PROPERLY YET

Examples

julia

julia> using Sindbad

julia> # Calculate cost for a parameter vector
julia> # cost_value = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())

Code

julia

function cost end

function cost(parameter_vector, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObs)
    @debug parameter_vector
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    runTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output, space_land, tem_info)
    cost_vector = metricVector(output_array, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function cost(parameter_vector, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObs)
    @debug parameter_vector
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    runTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output, space_land, tem_info)
    cost_vector = metricVector(output_array, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function cost(parameter_matrix, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, cost_out::Vector, ::CostModelObsMT)
    @debug "parameter_matrix:: ", size(parameter_matrix)
    parameter_set_size = size(parameter_matrix, 2)
    done_params=1
    Threads.@threads for parameter_index in eachindex(1:parameter_set_size)
        idx = Threads.threadid()
        parameter_vector = parameter_matrix[:, parameter_index]
        @debug parameter_vector
        updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
        coreTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output[idx], space_land, tem_info)
        cost_vector = metricVector(space_output[idx], observations, cost_options)
        cost_metric = combineMetric(cost_vector, multi_constraint_method)
        cost_out[parameter_index] = cost_metric
        @debug "Parameter column:: ", idx, round(100 * done_params/parameter_set_size,digits=2), parameter_set_size, cost_metric, cost_vector
        done_params += 1
    end
    return cost_out
end

function cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObsPriors)
    # prior has to be calculated before the parameters are backscaled and models are updated
    cost_prior = metric(MSE(), parameter_vector, parameter_vector, default_values)
    cost_metric = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())
    cost_metric = cost_metric + cost_prior
    @debug cost_vector, cost_metric
    return cost_metric
end

function cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    cost_metric = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())
    return cost_metric
end

function costLand end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, ::Nothing, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

costLand

Sindbad.ParameterOptimization.costLand Function

julia

costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)

costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, _, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)

Calculates the cost of SINDBAD model simulations for a single location by comparing model outputs as collections of SINDBAD land with observations using specified metrics and constraints.

In the first variant, the land_time_series is preallocated for computational efficiency. In the second variant, the runTEM stacks the land using map function and the preallocations is not necessary.

Arguments:

parameter_vector::AbstractArray: A vector of model parameter values to be optimized.
selected_models: A tuple of selected SINDBAD models in the given model structure, the parameters of which are optimized.
forcing: A forcing NamedTuple containing the time series of environmental drivers for the simulation.
spinup_forcing: A forcing NamedTuple for the spinup phase, used to initialize the model to a steady state.
loc_forcing_t: A forcing NamedTuple for a single location and a single time step.
land_timeseries: A preallocated vector to store the land state for each time step during the simulation.
land_init: The initial SINDBAD land NamedTuple containing all fields and subfields.
tem_info: A nested NamedTuple containing necessary information for running SINDBAD TEM, including helpers, models, and spinup configurations.
observations: A NamedTuple or vector of arrays containing observational data, uncertainties, and masks for calculating performance metrics.
parameter_updater: A function to update model parameters based on the parameter_vector.
cost_options: A table specifying how each observation constraint should be used to calculate the cost or performance metric.
multi_constraint_method: A method for combining the vector of costs into a single cost value or vector, as required by the optimization algorithm.
parameter_scaling_type: Specifies the type of scaling applied to the parameters during optimization.

Returns:

cost_metric: A scalar or vector representing the cost, calculated by comparing model outputs with observations using the specified metrics and constraints.

Note

The function updates the selected models using the parameter_vector and parameter_updater.
It runs the SINDBAD TEM simulation for the specified location using runTEM.
The model outputs are compared with observations using metricVector, which calculates the performance metrics.
The resulting cost vector is combined into a single cost value or vector using combineMetric and the specified multi_constraint_method.

Examples:

Calculating cost for a single location:

julia

cost = costLand(parameter_vector, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)

Using a custom multi-constraint method:

julia

custom_method = CustomConstraintMethod()
cost = costLand(parameter_vector, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, custom_method, parameter_scaling_type)

Handling observational uncertainties:

Observations can include uncertainties and masks to refine the cost calculation, ensuring robust model evaluation.

Code

julia

function costLand end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, ::Nothing, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
    updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
    land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_init, tem_info)
    cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
    cost_metric = combineMetric(cost_vector, multi_constraint_method)
    @debug cost_vector, cost_metric
    return cost_metric
end

getCostVectorSize

Sindbad.ParameterOptimization.getCostVectorSize Function

julia

getCostVectorSize(algo_options, parameter_vector, ::ParameterOptimizationMethod || GSAMethod)

Calculates the size of the cost vector required for a specific optimization or sensitivity analysis method.

Arguments:

algo_options: A NamedTuple or dictionary containing algorithm-specific options (e.g., population size, number of trajectories).
parameter_vector: A vector of parameters used in the optimization or sensitivity analysis.
::ParameterOptimizationMethod: The optimization or sensitivity analysis method. Supported methods include:
- CMAEvolutionStrategyCMAES: Covariance Matrix Adaptation Evolution Strategy.
- GSAMorris: Morris method for global sensitivity analysis.
- GSASobol: Sobol method for global sensitivity analysis.
- GSASobolDM: Sobol method with Design Matrices.

Returns:

An integer representing the size of the cost vector required for the specified method.

Notes:

For CMAEvolutionStrategyCMAES, the size is determined by the population size or a default formula based on the parameter vector length.
For GSAMorris, the size is calculated as the product of the number of trajectories and the length of the design matrix.
For GSASobol, the size is determined by the number of parameters and the number of samples.
For GSASobolDM, the size is equivalent to that of GSASobol.

Code

julia

function getCostVectorSize end

function getCostVectorSize(algo_options, parameter_vector, ::CMAEvolutionStrategyCMAES)
    cost_vector_size = Threads.nthreads()
    if hasproperty(algo_options, :multi_threading)
        if algo_options.multi_threading
            if hasproperty(algo_options, :popsize)
                cost_vector_size = algo_options.popsize
            else
                cost_vector_size = 4 + floor(Int, 3 * log(length(parameter_vector)))
            end
        end
    end
    return cost_vector_size
end

function getCostVectorSize(algo_options, parameter_vector, ::CMAEvolutionStrategyCMAES)
    cost_vector_size = Threads.nthreads()
    if hasproperty(algo_options, :multi_threading)
        if algo_options.multi_threading
            if hasproperty(algo_options, :popsize)
                cost_vector_size = algo_options.popsize
            else
                cost_vector_size = 4 + floor(Int, 3 * log(length(parameter_vector)))
            end
        end
    end
    return cost_vector_size
end

function getCostVectorSize(algo_options, __precompile__, ::GSAMorris)
    default_opt = sindbadDefaultOptions(GSAMorris())
    num_trajectory = default_opt.num_trajectory
    len_design_mat = default_opt.len_design_mat
    if hasproperty(algo_options, :num_trajectory)
        num_trajectory = algo_options.num_trajectory
    end
    if hasproperty(algo_options, :len_design_mat)
        len_design_mat = algo_options.len_design_mat
    end
    cost_vector_size = num_trajectory * len_design_mat
    return cost_vector_size
end

function getCostVectorSize(algo_options, parameter_vector, ::GSASobol)
    default_opt = sindbadDefaultOptions(GSASobol())
    samples = default_opt.samples
    nparam = length(parameter_vector)
    norder = length(algo_options.method_options.order) - 1
    if hasproperty(algo_options, :samples)
        samples = algo_options.samples
    end
    cost_vector_size = samples * (norder * nparam + 2)
    return cost_vector_size
end

function getCostVectorSize(algo_options, parameter_vector, ::GSASobolDM)
    return getCostVectorSize(algo_options, parameter_vector, GSASobol())
end

getData

Sindbad.ParameterOptimization.getData Function

julia

getData(model_output::LandWrapper, observations, cost_option)
getData(model_output::NamedTuple, observations, cost_option)
getData(model_output::AbstractArray, observations, cost_option)

Arguments:

model_output: a collection of SINDBAD model output time series as a time series of stacked land NT or as a preallocated array.
observations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/loss
cost_option: information for a observation constraint on how it should be used to calculate the loss/metric of model performance

Code

julia

function getDataWithoutNaN end

function getDataWithoutNaN(y, yσ, ŷ, idxs)
    y_view = @view y[idxs] 
    yσ_view = @view yσ[idxs] 
    ŷ_view = @view ŷ[idxs] 
    return (y_view, yσ_view, ŷ_view)
end

function getDataWithoutNaN(y, yσ, ŷ, idxs)
    y_view = @view y[idxs] 
    yσ_view = @view yσ[idxs] 
    ŷ_view = @view ŷ[idxs] 
    return (y_view, yσ_view, ŷ_view)
end

function getDataWithoutNaN(y, yσ, ŷ)
    @debug sum(is_invalid_number.(y)), sum(is_invalid_number.(yσ)), sum(is_invalid_number.(ŷ))
    idxs = (.!isnan.(y .* yσ .* ŷ)) # TODO this has to be run because LandWrapper produces a vector. So, dispatch with the inefficient versions without idxs argument
    return y[idxs], yσ[idxs], ŷ[idxs]
end

getDataWithoutNaN

Sindbad.ParameterOptimization.getDataWithoutNaN Function

julia

getDataWithoutNaN(y, yσ, ŷ, idxs)
getDataWithoutNaN(y, yσ, ŷ)

return model and obs data excluding for the common NaN or for the valid pixels idxs.

Arguments:

y: observation data
yσ: observational uncertainty data
ŷ: model simulation data/estimate
idxs: indices of valid data points

Examples

julia

julia> using Sindbad

julia> y = [1.0, NaN, 3.0, 4.0]
4-element Vector{Float64}:
   1.0
 NaN
   3.0
   4.0

julia> yσ = [0.1, 0.2, 0.1, 0.1]
4-element Vector{Float64}:
 0.1
 0.2
 0.1
 0.1

julia> ŷ = [1.1, 2.0, 2.9, 4.1]
4-element Vector{Float64}:
 1.1
 2.0
 2.9
 4.1

julia> y_clean, yσ_clean, ŷ_clean = getDataWithoutNaN(y, yσ, ŷ)
([1.0, 3.0, 4.0], [0.1, 0.1, 0.1], [1.1, 2.9, 4.1])

Code

julia

function getDataWithoutNaN end

function getDataWithoutNaN(y, yσ, ŷ, idxs)
    y_view = @view y[idxs] 
    yσ_view = @view yσ[idxs] 
    ŷ_view = @view ŷ[idxs] 
    return (y_view, yσ_view, ŷ_view)
end

function getDataWithoutNaN(y, yσ, ŷ, idxs)
    y_view = @view y[idxs] 
    yσ_view = @view yσ[idxs] 
    ŷ_view = @view ŷ[idxs] 
    return (y_view, yσ_view, ŷ_view)
end

function getDataWithoutNaN(y, yσ, ŷ)
    @debug sum(is_invalid_number.(y)), sum(is_invalid_number.(yσ)), sum(is_invalid_number.(ŷ))
    idxs = (.!isnan.(y .* yσ .* ŷ)) # TODO this has to be run because LandWrapper produces a vector. So, dispatch with the inefficient versions without idxs argument
    return y[idxs], yσ[idxs], ŷ[idxs]
end

getModelOutputView

Sindbad.ParameterOptimization.getModelOutputView Function

julia

 getModelOutputView(_dat::AbstractArray{<:Any,N}) where N

Code

julia

function getModelOutputView(_dat::AbstractArray{<:Any,N}) where N
    dim = 1
    inds = map(size(_dat)) do _
        ind = dim == 2 ? 1 : Colon()
        dim += 1
        ind
    end
    @view _dat[inds...]
end

globalSensitivity

Sindbad.ParameterOptimization.globalSensitivity Function

julia

globalSensitivity(cost_function, method_options, p_bounds, ::GSAMethod; batch=true)

Performs global sensitivity analysis using the specified method.

Arguments:

cost_function: A function that computes the cost or output of the model based on input parameters.
method_options: A set of options specific to the chosen sensitivity analysis method.
p_bounds: A vector or matrix specifying the bounds of the parameters for sensitivity analysis.
::GSAMethod: The sensitivity analysis method to use.
batch: A boolean flag indicating whether to perform batch processing (default: true).

Returns:

A results object containing the sensitivity indices and other relevant outputs for the specified method.

algorithm:

GSAMethod

Abstract type for global sensitivity analysis methods in SINDBAD

Available methods/subtypes:

GSAMorris: Morris method for global sensitivity analysis
GSASobol: Sobol method for global sensitivity analysis
GSASobolDM: Sobol method with derivative-based measures for global sensitivity analysis

Extended help

Notes:

The function internally calls the gsa function from the GlobalSensitivity.jl package with the specified method and options.
The cost_function should be defined to compute the model output based on the input parameters.
The method_options argument allows fine-tuning of the sensitivity analysis process for each method.

Code

julia

function globalSensitivity end

function globalSensitivity(cost_function, method_options, p_bounds, ::GSAMorris; batch=true)
    results = gsa(cost_function, Morris(; method_options...), p_bounds, batch=batch)
    return results
end

function globalSensitivity(cost_function, method_options, p_bounds, ::GSAMorris; batch=true)
    results = gsa(cost_function, Morris(; method_options...), p_bounds, batch=batch)
    return results
end

function globalSensitivity(cost_function, method_options, p_bounds, ::GSASobol; batch=true)
    sampler = getproperty(Sindbad.ParameterOptimization.GlobalSensitivity, Symbol(method_options.sampler))(; method_options.sampler_options..., method_options.method_options... )
    results = gsa(cost_function, sampler, p_bounds; method_options..., batch=batch)
    return results
end

function globalSensitivity(cost_function, method_options, p_bounds, ::GSASobolDM; batch=true)
    sampler = getproperty(Sindbad.ParameterOptimization.GlobalSensitivity, Symbol(method_options.sampler))(; method_options.sampler_options...)
    samples = method_options.samples
    lb = first.(p_bounds)
    ub = last.(p_bounds)
    A, B = QuasiMonteCarlo.generate_design_matrices(samples, lb, ub, sampler)
    results = gsa(cost_function, Sobol(; method_options.method_options...), A, B; method_options..., batch=batch)
    return results
end

metricVector

Sindbad.ParameterOptimization.metricVector Function

julia

metricVector(model_output::LandWrapper, observations, cost_options)
metricVector(model_output, observations, cost_options)

returns a vector of metrics for variables in cost_options.variable.

Arguments:

observations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/loss
model_output: a collection of SINDBAD model output time series as a time series of stacked land NT
cost_options: a table listing each observation constraint and how it should be used to calculate the loss/metric of model performance

Examples

julia

julia> using Sindbad

julia> # Calculate metrics for all cost options
julia> # metric_vec = metricVector(model_output, observations, cost_options)

Code

julia

function metricVector end

function metricVector(model_output, observations, cost_options)
    loss_vector = map(cost_options) do cost_option
        @debug "***cost for $(cost_option.variable)***"
        lossMetric = cost_option.cost_metric
        (y, yσ, ŷ) = getData(model_output, observations, cost_option)
        (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
        @debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ)
        # (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
        metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
        if isnan(metr)
            metr = oftype(metr, 1e19)
        end
        @debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
        metr
    end
    @debug "\n-------------------\n"
    return loss_vector
end

function metricVector(model_output, observations, cost_options)
    loss_vector = map(cost_options) do cost_option
        @debug "***cost for $(cost_option.variable)***"
        lossMetric = cost_option.cost_metric
        (y, yσ, ŷ) = getData(model_output, observations, cost_option)
        (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
        @debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ)
        # (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
        metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
        if isnan(metr)
            metr = oftype(metr, 1e19)
        end
        @debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
        metr
    end
    @debug "\n-------------------\n"
    return loss_vector
end

function metricVector(model_output::LandWrapper, observations, cost_options)
    loss_vector = map(cost_options) do cost_option
        @debug "$(cost_option.variable)"
        lossMetric = cost_option.cost_metric
        (y, yσ, ŷ) = getData(model_output, observations, cost_option)
        @debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ), size(idxs)
        (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ) ## cannot use the valids because LandWrapper produces vector
        metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
        if isnan(metr)
            metr = oftype(metr, 1e19)
        end
        @debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
        metr
    end
    @debug "\n-------------------\n"
    return loss_vector
end

optimizeTEM

Sindbad.ParameterOptimization.optimizeTEM Function

julia

optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)

Optimizes TEM parameters by minimizing the cost function between model outputs and observations.

Arguments:

forcing: a forcing NT that contains the forcing time series set for ALL locations
observations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/loss
info: a SINDBAD NT that includes all information needed for setup and execution of an experiment

Returns:

parameter_table: A table containing optimized parameter values

Examples

julia

julia> using Sindbad

julia> # Optimize TEM parameters
julia> # optimized_params = optimizeTEM(forcing, observations, info)

Code

julia

function optimizeTEM end

function optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)
    # get the subset of parameters table that consists of only optimized parameters
    opti_helpers = prepOpti(forcing, observations, info, info.optimization.run_options.cost_method)

    # run the optimizer
    optim_para = optimizer(opti_helpers.cost_function, opti_helpers.default_values, opti_helpers.lower_bounds, opti_helpers.upper_bounds, info.optimization.optimizer.options, info.optimization.optimizer.method)

    optim_para = backScaleParameters(optim_para, opti_helpers.parameter_table, info.optimization.run_options.parameter_scaling)

    # update the parameter table with the optimized values
    opti_helpers.parameter_table.optimized .= optim_para
    return opti_helpers.parameter_table
end

function optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)
    # get the subset of parameters table that consists of only optimized parameters
    opti_helpers = prepOpti(forcing, observations, info, info.optimization.run_options.cost_method)

    # run the optimizer
    optim_para = optimizer(opti_helpers.cost_function, opti_helpers.default_values, opti_helpers.lower_bounds, opti_helpers.upper_bounds, info.optimization.optimizer.options, info.optimization.optimizer.method)

    optim_para = backScaleParameters(optim_para, opti_helpers.parameter_table, info.optimization.run_options.parameter_scaling)

    # update the parameter table with the optimized values
    opti_helpers.parameter_table.optimized .= optim_para
    return opti_helpers.parameter_table
end

function optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)
    incubes = (forcing.data..., observations.data...)
    indims = (forcing.dims..., observations.dims...)
    forcing_vars = collect(forcing.variables)
    outdims = output.parameter_dim
    out = output.land_init
    obs_vars = collect(observations.variables)

    params = mapCube(optimizeYax, (incubes...,); out=out, tem=tem, optim=optim, forcing_vars=forcing_vars, obs_vars=obs_vars, indims=indims, outdims=outdims, max_cache=max_cache)
    return params
end

optimizeTEMYax

Sindbad.ParameterOptimization.optimizeTEMYax Function

julia

optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)

Optimizes the Terrestrial Ecosystem Model (TEM) parameters for each pixel by mapping over the YAXcube(s).

Arguments

forcing::NamedTuple: Input forcing data for the TEM model
output::NamedTuple: Output configuration settings
tem::NamedTuple: TEM model parameters and settings
optim::NamedTuple: Optimization parameters and settings
observations::NamedTuple: Observed data for model calibration

Keywords

max_cache::Float64=1e9: Maximum cache size for optimization process

Returns

Optimized TEM parameters cube

Code

julia

function optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)
    incubes = (forcing.data..., observations.data...)
    indims = (forcing.dims..., observations.dims...)
    forcing_vars = collect(forcing.variables)
    outdims = output.parameter_dim
    out = output.land_init
    obs_vars = collect(observations.variables)

    params = mapCube(optimizeYax, (incubes...,); out=out, tem=tem, optim=optim, forcing_vars=forcing_vars, obs_vars=obs_vars, indims=indims, outdims=outdims, max_cache=max_cache)
    return params
end

optimizer

Sindbad.ParameterOptimization.optimizer Function

julia

optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, algorithm <: ParameterOptimizationMethod)

Optimize model parameters using various optimization algorithms.

Arguments:

cost_function: A function handle that takes a parameter vector as input and calculates a cost/loss (scalar or vector).
default_values: A vector of default parameter values to initialize the optimization.
lower_bounds: A vector of lower bounds for the parameters.
upper_bounds: A vector of upper bounds for the parameters.
algo_options: A set of options specific to the chosen optimization algorithm.
algorithm: The optimization algorithm to be used.

Returns:

optim_para: A vector of optimized parameter values.

algorithm:

ParameterOptimizationMethod

Abstract type for optimization methods in SINDBAD

Available methods/subtypes:

BayesOptKMaternARD5: Bayesian Optimization using Matern 5/2 kernel with Automatic Relevance Determination from BayesOpt.jl
CMAEvolutionStrategyCMAES: Covariance Matrix Adaptation Evolution Strategy (CMA-ES) from CMAEvolutionStrategy.jl
EvolutionaryCMAES: Evolutionary version of CMA-ES optimization from Evolutionary.jl
OptimBFGS: Broyden-Fletcher-Goldfarb-Shanno (BFGS) from Optim.jl
OptimLBFGS: Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) from Optim.jl
OptimizationBBOadaptive: Black Box Optimization with adaptive parameters from Optimization.jl
OptimizationBBOxnes: Black Box Optimization using Natural Evolution Strategy (xNES) from Optimization.jl
OptimizationBFGS: BFGS optimization with box constraints from Optimization.jl
OptimizationFminboxGradientDescent: Gradient descent optimization with box constraints from Optimization.jl
OptimizationFminboxGradientDescentFD: Gradient descent optimization with box constraints using forward differentiation from Optimization.jl
OptimizationGCMAESDef: Global CMA-ES optimization with default settings from Optimization.jl
OptimizationGCMAESFD: Global CMA-ES optimization using forward differentiation from Optimization.jl
OptimizationMultistartOptimization: Multi-start optimization to find global optimum from Optimization.jl
OptimizationNelderMead: Nelder-Mead simplex optimization method from Optimization.jl
OptimizationQuadDirect: Quadratic Direct optimization method from Optimization.jl

Extended help

Notes:

The function supports a wide range of optimization algorithms, each tailored for specific use cases.
Some methods do not require bounds for optimization, while others do.
The cost_function should be defined by the user to calculate the loss based on the model output and observations. It is defined in cost.jl.
The algo_options argument allows fine-tuning of the optimization process for each algorithm.
Some algorithms (e.g., BayesOptKMaternARD5, OptimizationBBOxnes) require additional configuration steps, such as setting kernels or merging default and user-defined options.

Examples

julia

julia> using Sindbad

julia> # Optimize using CMA-ES algorithm
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, CMAEvolutionStrategyCMAES())

julia> # Optimize using BFGS algorithm
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, OptimBFGS())

julia> # Optimize using Black Box Optimization (xNES)
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, OptimizationBBOxnes())

Implementation Details:

The function internally calls the appropriate optimization library and algorithm based on the algorithm argument.
Each algorithm has its own implementation details, such as handling bounds, configuring options, and solving the optimization problem.
The results are processed to extract the optimized parameter vector (optim_para), which is returned to the user.

Code

julia

function optimizer(::Any, default_values::Any, ::Any, ::Any, ::Any, x::ParameterOptimizationMethod)
    @warn "
    Optimizer `$(nameof(typeof(x)))` not implemented. 
    
    To implement a new optimizer:
    
    - First add a new type as a subtype of `ParameterOptimizationMethod` in `src/Types/ParameterOptimizationTypes.jl`. 
    
    - Then, add a corresponding method:
      - if it can be implemented as an internal Sindbad method without additional dependencies, implement the method in `src/ParameterOptimization/optimizer.jl`.     
      - if it requires additional dependencies, implement the method in `ext/<extension_name>/ParameterOptimizationOptimizer.jl` extension.

    As a fallback, this function will return the default values as the optimized parameters.

    "
    return default_values
end

prepOpti

Sindbad.ParameterOptimization.prepOpti Function

julia

prepOpti(forcing, observations, info, cost_method::CostModelObs)

Prepares optimization setup including cost functions, parameter bounds, and initial values.

Arguments:

forcing: Forcing data for the simulation
observations: Observation data for cost calculation
info: Experiment configuration NamedTuple
cost_method: Cost calculation method type

Returns:

A NamedTuple containing optimization helpers including cost function, parameter bounds, and default values

Examples

julia

julia> using Sindbad

julia> # Prepare optimization setup
julia> # opti_helpers = prepOpti(forcing, observations, info, CostModelObs())

Prepares optimization parameters, settings, and helper functions based on the provided inputs.

Arguments:

forcing: Input forcing data used for the optimization process.
observations: Observed data used for comparison or calibration during optimization.
info: A SINDBAD NamedTuple containing all information needed for setup and execution of the experiment.
cost_method: The method used to calculate the cost function.

Returns:

A NamedTuple opti_helpers containing:
- parameter_table: Processed model parameters for optimization.
- cost_function: A function to compute the cost for optimization.
- cost_options: Options and settings for the cost function.
- default_values: Default parameter values for the models.
- lower_bounds: Lower bounds for the parameters.
- upper_bounds: Upper bounds for the parameters.
- run_helpers: Helper information for running the optimization.

cost_method:

CostMethod

Abstract type for cost calculation methods in SINDBAD

Available methods/subtypes:

CostModelObs: cost calculation between model output and observations
CostModelObsLandTS: cost calculation between land model output and time series observations
CostModelObsMT: multi-threaded cost calculation between model output and observations
CostModelObsPriors: cost calculation between model output, observations, and priors. NOTE THAT THIS METHOD IS JUST A PLACEHOLDER AND DOES NOT CALCULATE PRIOR COST PROPERLY YET

Extended help

Notes:

The function processes the input data and configuration to set up the optimization problem.
It prepares model parameters, cost options, and helper functions required for the optimization process.
Depending on the cost_method, the cost function is customized to handle specific data types or computation methods.

Code

julia

function prepOpti end

function prepOpti(forcing, observations, info)
    return prepOpti(forcing, observations, info, CostModelObs())
end

function prepOpti(forcing, observations, info)
    return prepOpti(forcing, observations, info, CostModelObs())
end

function  prepOpti(forcing, observations, info, ::CostModelObsMT; algorithm_info_field=:optimizer)
    algorithm_info = getproperty(info.optimization, algorithm_info_field)
    opti_helpers = prepOpti(forcing, observations, info, CostModelObs())
    run_helpers = opti_helpers.run_helpers
    cost_vector_size = getCostVectorSize(getproperty(algorithm_info, :options), opti_helpers.default_values, getproperty(algorithm_info, :method))
    cost_vector = Vector{eltype(opti_helpers.default_values)}(undef, cost_vector_size)
    
    space_index = 1 # the parallelization of cost computation only runs in single pixel runs

    cost_function = x -> cost(x, opti_helpers.default_values, info.models.forward, run_helpers.space_forcing[space_index], run_helpers.space_spinup_forcing[space_index], run_helpers.loc_forcing_t, run_helpers.output_array, run_helpers.space_output_mt, deepcopy(run_helpers.space_land[space_index]), run_helpers.tem_info, observations, opti_helpers.parameter_table, opti_helpers.cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling, cost_vector, info.optimization.run_options.cost_method)

    opti_helpers = (; opti_helpers..., cost_function=cost_function, cost_vector=cost_vector)
    return opti_helpers
end

function  prepOpti(forcing, observations, info, ::CostModelObsLandTS)
    opti_helpers = prepOpti(forcing, observations, info, CostModelObs())
    run_helpers = opti_helpers.run_helpers

    cost_function = x -> costLand(x, info.models.forward, run_helpers.loc_forcing, run_helpers.loc_spinup_forcing, run_helpers.loc_forcing_t, run_helpers.land_time_series, run_helpers.loc_land, run_helpers.tem_info, observations, opti_helpers.parameter_table, opti_helpers.cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling)

    opti_helpers = (; opti_helpers..., cost_function=cost_function)
    
    return opti_helpers
end

function  prepOpti(forcing, observations, info, cost_method::CostModelObs)
    run_helpers = prepTEM(forcing, info)

    parameter_helpers = prepParameters(info.optimization.parameter_table, info.optimization.run_options.parameter_scaling)
    
    parameter_table = parameter_helpers.parameter_table
    default_values = parameter_helpers.default_values
    lower_bounds = parameter_helpers.lower_bounds
    upper_bounds = parameter_helpers.upper_bounds

    cost_options = prepCostOptions(observations, info.optimization.cost_options, cost_method)

    cost_function = x -> cost(x, default_values, info.models.forward, run_helpers.space_forcing, run_helpers.space_spinup_forcing, run_helpers.loc_forcing_t, run_helpers.output_array, run_helpers.space_output, deepcopy(run_helpers.space_land), run_helpers.tem_info, observations, parameter_table, cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling, cost_method)

    opti_helpers = (; parameter_table=parameter_table, cost_function=cost_function, cost_options=cost_options, default_values=default_values, lower_bounds=lower_bounds, upper_bounds=upper_bounds, run_helpers=run_helpers)
    
    return opti_helpers
end

prepParameters

Sindbad.ParameterOptimization.prepParameters Function

julia

prepParameters(parameter_table, parameter_scaling)

Prepare model parameters for optimization by processing default and bounds of the parameters to be optimized.

Arguments

parameter_table: Table of the parameters to be optimized
parameter_scaling: Scaling method/type for parameter optimization

Returns

A tuple containing processed parameters ready for optimization

Code

julia

function prepParameters(parameter_table, parameter_scaling)
    
    default_values, lower_bounds, upper_bounds = scaleParameters(parameter_table, parameter_scaling)

    parameter_helpers = (; parameter_table=parameter_table, default_values=default_values, lower_bounds=lower_bounds, upper_bounds=upper_bounds)
    return parameter_helpers
end

TEM

Modules

Extensions

Functions

combineMetric

cost

costLand

getCostVectorSize

getData

getDataWithoutNaN

getModelOutputView

globalSensitivity

metricVector

optimizeTEM

optimizeTEMYax

optimizer

prepOpti

prepParameters

Functions ​

combineMetric ​

cost ​

costLand ​

getCostVectorSize ​

getData ​

getDataWithoutNaN ​

getModelOutputView ​

globalSensitivity ​

metricVector ​

optimizeTEM ​

optimizeTEMYax ​

optimizer ​

prepOpti ​

prepParameters ​

Functions

combineMetric

cost

costLand

getCostVectorSize

getData

getDataWithoutNaN

getModelOutputView

globalSensitivity

metricVector

optimizeTEM

optimizeTEMYax

optimizer

prepOpti

prepParameters