Sindbad.ParameterOptimization Module
ParameterOptimizationThe ParameterOptimization module provides tools for optimizing SINDBAD models, including parameter estimation, model calibration, and cost function evaluation. It integrates various optimization algorithms and utilities to streamline the optimization workflow for SINDBAD experiments.
Purpose
This module is designed to support optimization tasks in SINDBAD, such as calibrating model parameters to match observations or minimizing cost functions. It leverages multiple optimization libraries and provides a unified interface for running optimization routines.
Dependencies
Related (SINDBAD ecosystem)
ErrorMetrics: Metric implementations for cost evaluation.TimeSamplers: Temporal helpers used by some workflows.OmniTools: Shared helpers and table utilities.
External (third-party)
StableRNGs: Reproducible random number generation for stochastic workflows.
Internal (within Sindbad)
Sindbad.SetupSindbad.SimulationSindbad.TypesSindbadTEM
Optional dependencies (extensions / weakdeps)
Some optimizer backends are enabled via Julia extensions (see root Project.toml and ext/):
CMAEvolutionStrategy→SindbadCMAEvolutionStrategyExtOptimization→SindbadOptimizationExt
Other packages listed under [weakdeps] may be used by experimental workflows but are not required for the base module to load.
Included Files
handleDataForCost.jl: Helpers for aligning forcing/observation/model outputs for cost evaluation.getCost.jl: Cost extraction and convenience wrappers.optimizer.jl: Core optimization logic (algorithm selection + option normalization).cost.jl: Cost functions for evaluating model–observation mismatch.prepOpti.jl: Prepares inputs and bookkeeping for optimization runs.optimizeTEM.jl: Optimization routines for single sites and spatial workflows.sensitivityAnalysis.jl: Sensitivity analysis utilities.
Note
The package integrates multiple optimization libraries, allowing users to choose the most suitable algorithm for their problem.
Designed to be modular and extensible, enabling users to customize optimization workflows for specific use cases.
Supports both gradient-based and derivative-free optimization methods, ensuring flexibility for different types of cost functions.
Examples
julia> using Sindbad
julia> # Run parameter optimization from a configuration file
julia> # result = runExperimentOpti("path/to/experiment_config.json")
julia> # Calculate cost between model output and observations
julia> # cost_result = runExperimentCost("path/to/experiment_config.json")Functions
combineMetric
Sindbad.ParameterOptimization.combineMetric Function
combineMetric(metric_vector::AbstractArray, ::MetricSum)
combineMetric(metric_vector::AbstractArray, ::MetricMinimum)
combineMetric(metric_vector::AbstractArray, ::MetricMaximum)
combineMetric(metric_vector::AbstractArray, percentile_value::T)combines the metric from all constraints based on the type of combination.
Arguments:
metric_vector: a vector of metrics for variables
methods for combining the metric
::MetricSum: return the total sum as the metric.::MetricMinimum: return the minimum of themetric_vectoras the metric.::MetricMaximum: return the maximum of themetric_vectoras the metric.percentile_value::T:percentile_value^thpercentile of metric of each constraint as the overall metric
Examples
julia> using Sindbad
julia> metric_vec = [0.5, 0.7, 0.3]
3-element Vector{Float64}:
0.5
0.7
0.3
julia> combineMetric(metric_vec, MetricSum())
1.5
julia> combineMetric(metric_vec, MetricMinimum())
0.3
julia> combineMetric(metric_vec, MetricMaximum())
0.7Code
function combineMetric end
function combineMetric(metric_vector::AbstractArray, ::MetricSum)
return sum(metric_vector)
end
function combineMetric(metric_vector::AbstractArray, ::MetricSum)
return sum(metric_vector)
end
function combineMetric(metric_vector::AbstractArray, ::MetricMinimum)
return minimum(metric_vector)
end
function combineMetric(metric_vector::AbstractArray, ::MetricMaximum)
return maximum(metric_vector)
end
function combineMetric(metric_vector::AbstractArray, percentile_value::T) where {T<:Real}
return percentile(metric_vector, percentile_value)
endcost
Sindbad.ParameterOptimization.cost Function
cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, cost_method<: CostMethod)Calculate the cost for a parameter vector.
Arguments
parameter_vector: Vector of parameter values to be optimized'default_values': Default values for model parameters
selected_models: Collection of selected models for simulationspace_forcing: Forcing data for the main simulation periodspace_spinup_forcing: Forcing data for the spin-up periodloc_forcing_t: Time-specific forcing dataoutput_array: Array to store simulation outputsspace_output: Spatial output configurationspace_land: Land surface characteristicstem_info: Temporal information for simulationobservations: Observed data for comparisonparameter_updater: Function to update parameterscost_options: Options for cost function calculationmulti_constraint_method: Method for handling multiple constraintsparameter_scaling_type: Type of parameter scalingsindbad_cost_method <: CostMethod: a type parameter indicating cost calculation method
Returns
Cost value representing the difference between model outputs and observations
sindbad_cost_method:
CostMethod
Abstract type for cost calculation methods in SINDBAD
Available methods/subtypes:
CostModelObs: cost calculation between model output and observationsCostModelObsLandTS: cost calculation between land model output and time series observationsCostModelObsMT: multi-threaded cost calculation between model output and observationsCostModelObsPriors: cost calculation between model output, observations, and priors. NOTE THAT THIS METHOD IS JUST A PLACEHOLDER AND DOES NOT CALCULATE PRIOR COST PROPERLY YET
Examples
julia> using Sindbad
julia> # Calculate cost for a parameter vector
julia> # cost_value = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())Code
function cost end
function cost(parameter_vector, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObs)
@debug parameter_vector
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
runTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output, space_land, tem_info)
cost_vector = metricVector(output_array, observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
@debug cost_vector, cost_metric
return cost_metric
end
function cost(parameter_vector, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObs)
@debug parameter_vector
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
runTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output, space_land, tem_info)
cost_vector = metricVector(output_array, observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
@debug cost_vector, cost_metric
return cost_metric
end
function cost(parameter_matrix, _, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, cost_out::Vector, ::CostModelObsMT)
@debug "parameter_matrix:: ", size(parameter_matrix)
parameter_set_size = size(parameter_matrix, 2)
done_params=1
Threads.@threads for parameter_index in eachindex(1:parameter_set_size)
idx = Threads.threadid()
parameter_vector = parameter_matrix[:, parameter_index]
@debug parameter_vector
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
coreTEM!(updated_models, space_forcing, space_spinup_forcing, loc_forcing_t, space_output[idx], space_land, tem_info)
cost_vector = metricVector(space_output[idx], observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
cost_out[parameter_index] = cost_metric
@debug "Parameter column:: ", idx, round(100 * done_params/parameter_set_size,digits=2), parameter_set_size, cost_metric, cost_vector
done_params += 1
end
return cost_out
end
function cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, ::CostModelObsPriors)
# prior has to be calculated before the parameters are backscaled and models are updated
cost_prior = metric(MSE(), parameter_vector, parameter_vector, default_values)
cost_metric = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())
cost_metric = cost_metric + cost_prior
@debug cost_vector, cost_metric
return cost_metric
end
function cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
cost_metric = cost(parameter_vector, default_values, selected_models, space_forcing, space_spinup_forcing, loc_forcing_t, output_array, space_output, space_land, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type, CostModelObs())
return cost_metric
end
function costLand end
function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
@debug cost_vector, cost_metric
return cost_metric
end
function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
@debug cost_vector, cost_metric
return cost_metric
end
function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, ::Nothing, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_init, tem_info)
cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
@debug cost_vector, cost_metric
return cost_metric
endcostLand
Sindbad.ParameterOptimization.costLand Function
costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, _, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)Calculates the cost of SINDBAD model simulations for a single location by comparing model outputs as collections of SINDBAD land with observations using specified metrics and constraints.
In the first variant, the land_time_series is preallocated for computational efficiency. In the second variant, the runTEM stacks the land using map function and the preallocations is not necessary.
Arguments:
parameter_vector::AbstractArray: A vector of model parameter values to be optimized.selected_models: A tuple of selected SINDBAD models in the given model structure, the parameters of which are optimized.forcing: A forcing NamedTuple containing the time series of environmental drivers for the simulation.spinup_forcing: A forcing NamedTuple for the spinup phase, used to initialize the model to a steady state.loc_forcing_t: A forcing NamedTuple for a single location and a single time step.land_timeseries: A preallocated vector to store the land state for each time step during the simulation.land_init: The initial SINDBAD land NamedTuple containing all fields and subfields.tem_info: A nested NamedTuple containing necessary information for running SINDBAD TEM, including helpers, models, and spinup configurations.observations: A NamedTuple or vector of arrays containing observational data, uncertainties, and masks for calculating performance metrics.parameter_updater: A function to update model parameters based on theparameter_vector.cost_options: A table specifying how each observation constraint should be used to calculate the cost or performance metric.multi_constraint_method: A method for combining the vector of costs into a single cost value or vector, as required by the optimization algorithm.parameter_scaling_type: Specifies the type of scaling applied to the parameters during optimization.
Returns:
cost_metric: A scalar or vector representing the cost, calculated by comparing model outputs with observations using the specified metrics and constraints.
Note
The function updates the selected models using the
parameter_vectorandparameter_updater.It runs the SINDBAD TEM simulation for the specified location using
runTEM.The model outputs are compared with observations using
metricVector, which calculates the performance metrics.The resulting cost vector is combined into a single cost value or vector using
combineMetricand the specifiedmulti_constraint_method.
Examples:
- Calculating cost for a single location:
cost = costLand(parameter_vector, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)- Using a custom multi-constraint method:
custom_method = CustomConstraintMethod()
cost = costLand(parameter_vector, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, custom_method, parameter_scaling_type)- Handling observational uncertainties:
- Observations can include uncertainties and masks to refine the cost calculation, ensuring robust model evaluation.
Code
function costLand end
function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
@debug cost_vector, cost_metric
return cost_metric
end
function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_timeseries, land_init, tem_info)
cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
@debug cost_vector, cost_metric
return cost_metric
end
function costLand(parameter_vector::AbstractArray, selected_models, forcing, spinup_forcing, loc_forcing_t, ::Nothing, land_init, tem_info, observations, parameter_updater, cost_options, multi_constraint_method, parameter_scaling_type)
updated_models = updateModels(parameter_vector, parameter_updater, parameter_scaling_type, selected_models)
land_wrapper_timeseries = runTEM(updated_models, forcing, spinup_forcing, loc_forcing_t, land_init, tem_info)
cost_vector = metricVector(land_wrapper_timeseries, observations, cost_options)
cost_metric = combineMetric(cost_vector, multi_constraint_method)
@debug cost_vector, cost_metric
return cost_metric
endgetCostVectorSize
Sindbad.ParameterOptimization.getCostVectorSize Function
getCostVectorSize(algo_options, parameter_vector, ::ParameterOptimizationMethod || GSAMethod)Calculates the size of the cost vector required for a specific optimization or sensitivity analysis method.
Arguments:
algo_options: A NamedTuple or dictionary containing algorithm-specific options (e.g., population size, number of trajectories).parameter_vector: A vector of parameters used in the optimization or sensitivity analysis.::ParameterOptimizationMethod: The optimization or sensitivity analysis method. Supported methods include:CMAEvolutionStrategyCMAES: Covariance Matrix Adaptation Evolution Strategy.GSAMorris: Morris method for global sensitivity analysis.GSASobol: Sobol method for global sensitivity analysis.GSASobolDM: Sobol method with Design Matrices.
Returns:
- An integer representing the size of the cost vector required for the specified method.
Notes:
For
CMAEvolutionStrategyCMAES, the size is determined by the population size or a default formula based on the parameter vector length.For
GSAMorris, the size is calculated as the product of the number of trajectories and the length of the design matrix.For
GSASobol, the size is determined by the number of parameters and the number of samples.For
GSASobolDM, the size is equivalent to that ofGSASobol.
Code
function getCostVectorSize end
function getCostVectorSize(algo_options, parameter_vector, ::CMAEvolutionStrategyCMAES)
cost_vector_size = Threads.nthreads()
if hasproperty(algo_options, :multi_threading)
if algo_options.multi_threading
if hasproperty(algo_options, :popsize)
cost_vector_size = algo_options.popsize
else
cost_vector_size = 4 + floor(Int, 3 * log(length(parameter_vector)))
end
end
end
return cost_vector_size
end
function getCostVectorSize(algo_options, parameter_vector, ::CMAEvolutionStrategyCMAES)
cost_vector_size = Threads.nthreads()
if hasproperty(algo_options, :multi_threading)
if algo_options.multi_threading
if hasproperty(algo_options, :popsize)
cost_vector_size = algo_options.popsize
else
cost_vector_size = 4 + floor(Int, 3 * log(length(parameter_vector)))
end
end
end
return cost_vector_size
end
function getCostVectorSize(algo_options, __precompile__, ::GSAMorris)
default_opt = sindbadDefaultOptions(GSAMorris())
num_trajectory = default_opt.num_trajectory
len_design_mat = default_opt.len_design_mat
if hasproperty(algo_options, :num_trajectory)
num_trajectory = algo_options.num_trajectory
end
if hasproperty(algo_options, :len_design_mat)
len_design_mat = algo_options.len_design_mat
end
cost_vector_size = num_trajectory * len_design_mat
return cost_vector_size
end
function getCostVectorSize(algo_options, parameter_vector, ::GSASobol)
default_opt = sindbadDefaultOptions(GSASobol())
samples = default_opt.samples
nparam = length(parameter_vector)
norder = length(algo_options.method_options.order) - 1
if hasproperty(algo_options, :samples)
samples = algo_options.samples
end
cost_vector_size = samples * (norder * nparam + 2)
return cost_vector_size
end
function getCostVectorSize(algo_options, parameter_vector, ::GSASobolDM)
return getCostVectorSize(algo_options, parameter_vector, GSASobol())
endgetData
Sindbad.ParameterOptimization.getData Function
getData(model_output::LandWrapper, observations, cost_option)
getData(model_output::NamedTuple, observations, cost_option)
getData(model_output::AbstractArray, observations, cost_option)Arguments:
model_output: a collection of SINDBAD model output time series as a time series of stacked land NT or as a preallocated array.observations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/losscost_option: information for a observation constraint on how it should be used to calculate the loss/metric of model performance
Code
function getDataWithoutNaN end
function getDataWithoutNaN(y, yσ, ŷ, idxs)
y_view = @view y[idxs]
yσ_view = @view yσ[idxs]
ŷ_view = @view ŷ[idxs]
return (y_view, yσ_view, ŷ_view)
end
function getDataWithoutNaN(y, yσ, ŷ, idxs)
y_view = @view y[idxs]
yσ_view = @view yσ[idxs]
ŷ_view = @view ŷ[idxs]
return (y_view, yσ_view, ŷ_view)
end
function getDataWithoutNaN(y, yσ, ŷ)
@debug sum(is_invalid_number.(y)), sum(is_invalid_number.(yσ)), sum(is_invalid_number.(ŷ))
idxs = (.!isnan.(y .* yσ .* ŷ)) # TODO this has to be run because LandWrapper produces a vector. So, dispatch with the inefficient versions without idxs argument
return y[idxs], yσ[idxs], ŷ[idxs]
endgetDataWithoutNaN
Sindbad.ParameterOptimization.getDataWithoutNaN Function
getDataWithoutNaN(y, yσ, ŷ, idxs)
getDataWithoutNaN(y, yσ, ŷ)return model and obs data excluding for the common NaN or for the valid pixels idxs.
Arguments:
y: observation datayσ: observational uncertainty dataŷ: model simulation data/estimateidxs: indices of valid data points
Examples
julia> using Sindbad
julia> y = [1.0, NaN, 3.0, 4.0]
4-element Vector{Float64}:
1.0
NaN
3.0
4.0
julia> yσ = [0.1, 0.2, 0.1, 0.1]
4-element Vector{Float64}:
0.1
0.2
0.1
0.1
julia> ŷ = [1.1, 2.0, 2.9, 4.1]
4-element Vector{Float64}:
1.1
2.0
2.9
4.1
julia> y_clean, yσ_clean, ŷ_clean = getDataWithoutNaN(y, yσ, ŷ)
([1.0, 3.0, 4.0], [0.1, 0.1, 0.1], [1.1, 2.9, 4.1])Code
function getDataWithoutNaN end
function getDataWithoutNaN(y, yσ, ŷ, idxs)
y_view = @view y[idxs]
yσ_view = @view yσ[idxs]
ŷ_view = @view ŷ[idxs]
return (y_view, yσ_view, ŷ_view)
end
function getDataWithoutNaN(y, yσ, ŷ, idxs)
y_view = @view y[idxs]
yσ_view = @view yσ[idxs]
ŷ_view = @view ŷ[idxs]
return (y_view, yσ_view, ŷ_view)
end
function getDataWithoutNaN(y, yσ, ŷ)
@debug sum(is_invalid_number.(y)), sum(is_invalid_number.(yσ)), sum(is_invalid_number.(ŷ))
idxs = (.!isnan.(y .* yσ .* ŷ)) # TODO this has to be run because LandWrapper produces a vector. So, dispatch with the inefficient versions without idxs argument
return y[idxs], yσ[idxs], ŷ[idxs]
endgetModelOutputView
Sindbad.ParameterOptimization.getModelOutputView Function
getModelOutputView(_dat::AbstractArray{<:Any,N}) where NCode
function getModelOutputView(_dat::AbstractArray{<:Any,N}) where N
dim = 1
inds = map(size(_dat)) do _
ind = dim == 2 ? 1 : Colon()
dim += 1
ind
end
@view _dat[inds...]
endglobalSensitivity
Sindbad.ParameterOptimization.globalSensitivity Function
globalSensitivity(cost_function, method_options, p_bounds, ::GSAMethod; batch=true)Performs global sensitivity analysis using the specified method.
Arguments:
cost_function: A function that computes the cost or output of the model based on input parameters.method_options: A set of options specific to the chosen sensitivity analysis method.p_bounds: A vector or matrix specifying the bounds of the parameters for sensitivity analysis.::GSAMethod: The sensitivity analysis method to use.batch: A boolean flag indicating whether to perform batch processing (default:true).
Returns:
- A
resultsobject containing the sensitivity indices and other relevant outputs for the specified method.
algorithm:
GSAMethod
Abstract type for global sensitivity analysis methods in SINDBAD
Available methods/subtypes:
GSAMorris: Morris method for global sensitivity analysisGSASobol: Sobol method for global sensitivity analysisGSASobolDM: Sobol method with derivative-based measures for global sensitivity analysis
Extended help
Notes:
The function internally calls the
gsafunction from the GlobalSensitivity.jl package with the specified method and options.The
cost_functionshould be defined to compute the model output based on the input parameters.The
method_optionsargument allows fine-tuning of the sensitivity analysis process for each method.
Code
function globalSensitivity end
function globalSensitivity(cost_function, method_options, p_bounds, ::GSAMorris; batch=true)
results = gsa(cost_function, Morris(; method_options...), p_bounds, batch=batch)
return results
end
function globalSensitivity(cost_function, method_options, p_bounds, ::GSAMorris; batch=true)
results = gsa(cost_function, Morris(; method_options...), p_bounds, batch=batch)
return results
end
function globalSensitivity(cost_function, method_options, p_bounds, ::GSASobol; batch=true)
sampler = getproperty(Sindbad.ParameterOptimization.GlobalSensitivity, Symbol(method_options.sampler))(; method_options.sampler_options..., method_options.method_options... )
results = gsa(cost_function, sampler, p_bounds; method_options..., batch=batch)
return results
end
function globalSensitivity(cost_function, method_options, p_bounds, ::GSASobolDM; batch=true)
sampler = getproperty(Sindbad.ParameterOptimization.GlobalSensitivity, Symbol(method_options.sampler))(; method_options.sampler_options...)
samples = method_options.samples
lb = first.(p_bounds)
ub = last.(p_bounds)
A, B = QuasiMonteCarlo.generate_design_matrices(samples, lb, ub, sampler)
results = gsa(cost_function, Sobol(; method_options.method_options...), A, B; method_options..., batch=batch)
return results
endmetricVector
Sindbad.ParameterOptimization.metricVector Function
metricVector(model_output::LandWrapper, observations, cost_options)
metricVector(model_output, observations, cost_options)returns a vector of metrics for variables in cost_options.variable.
Arguments:
observations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/lossmodel_output: a collection of SINDBAD model output time series as a time series of stacked land NTcost_options: a table listing each observation constraint and how it should be used to calculate the loss/metric of model performance
Examples
julia> using Sindbad
julia> # Calculate metrics for all cost options
julia> # metric_vec = metricVector(model_output, observations, cost_options)Code
function metricVector end
function metricVector(model_output, observations, cost_options)
loss_vector = map(cost_options) do cost_option
@debug "***cost for $(cost_option.variable)***"
lossMetric = cost_option.cost_metric
(y, yσ, ŷ) = getData(model_output, observations, cost_option)
(y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
@debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ)
# (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
if isnan(metr)
metr = oftype(metr, 1e19)
end
@debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
metr
end
@debug "\n-------------------\n"
return loss_vector
end
function metricVector(model_output, observations, cost_options)
loss_vector = map(cost_options) do cost_option
@debug "***cost for $(cost_option.variable)***"
lossMetric = cost_option.cost_metric
(y, yσ, ŷ) = getData(model_output, observations, cost_option)
(y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
@debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ)
# (y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ, cost_option.valids)
metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
if isnan(metr)
metr = oftype(metr, 1e19)
end
@debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
metr
end
@debug "\n-------------------\n"
return loss_vector
end
function metricVector(model_output::LandWrapper, observations, cost_options)
loss_vector = map(cost_options) do cost_option
@debug "$(cost_option.variable)"
lossMetric = cost_option.cost_metric
(y, yσ, ŷ) = getData(model_output, observations, cost_option)
@debug "size y, yσ, ŷ", size(y), size(yσ), size(ŷ), size(idxs)
(y, yσ, ŷ) = getDataWithoutNaN(y, yσ, ŷ) ## cannot use the valids because LandWrapper produces vector
metr = metric(lossMetric, ŷ, y, yσ) * cost_option.cost_weight
if isnan(metr)
metr = oftype(metr, 1e19)
end
@debug "$(cost_option.variable) => $(nameof(typeof(lossMetric))): $(metr)"
metr
end
@debug "\n-------------------\n"
return loss_vector
endoptimizeTEM
Sindbad.ParameterOptimization.optimizeTEM Function
optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)Optimizes TEM parameters by minimizing the cost function between model outputs and observations.
Arguments:
forcing: a forcing NT that contains the forcing time series set for ALL locationsobservations: a NT or a vector of arrays of observations, their uncertainties, and mask to use for calculation of performance metric/lossinfo: a SINDBAD NT that includes all information needed for setup and execution of an experiment
Returns:
parameter_table: A table containing optimized parameter values
Examples
julia> using Sindbad
julia> # Optimize TEM parameters
julia> # optimized_params = optimizeTEM(forcing, observations, info)Code
function optimizeTEM end
function optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)
# get the subset of parameters table that consists of only optimized parameters
opti_helpers = prepOpti(forcing, observations, info, info.optimization.run_options.cost_method)
# run the optimizer
optim_para = optimizer(opti_helpers.cost_function, opti_helpers.default_values, opti_helpers.lower_bounds, opti_helpers.upper_bounds, info.optimization.optimizer.options, info.optimization.optimizer.method)
optim_para = backScaleParameters(optim_para, opti_helpers.parameter_table, info.optimization.run_options.parameter_scaling)
# update the parameter table with the optimized values
opti_helpers.parameter_table.optimized .= optim_para
return opti_helpers.parameter_table
end
function optimizeTEM(forcing::NamedTuple, observations, info::NamedTuple)
# get the subset of parameters table that consists of only optimized parameters
opti_helpers = prepOpti(forcing, observations, info, info.optimization.run_options.cost_method)
# run the optimizer
optim_para = optimizer(opti_helpers.cost_function, opti_helpers.default_values, opti_helpers.lower_bounds, opti_helpers.upper_bounds, info.optimization.optimizer.options, info.optimization.optimizer.method)
optim_para = backScaleParameters(optim_para, opti_helpers.parameter_table, info.optimization.run_options.parameter_scaling)
# update the parameter table with the optimized values
opti_helpers.parameter_table.optimized .= optim_para
return opti_helpers.parameter_table
end
function optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)
incubes = (forcing.data..., observations.data...)
indims = (forcing.dims..., observations.dims...)
forcing_vars = collect(forcing.variables)
outdims = output.parameter_dim
out = output.land_init
obs_vars = collect(observations.variables)
params = mapCube(optimizeYax, (incubes...,); out=out, tem=tem, optim=optim, forcing_vars=forcing_vars, obs_vars=obs_vars, indims=indims, outdims=outdims, max_cache=max_cache)
return params
endoptimizeTEMYax
Sindbad.ParameterOptimization.optimizeTEMYax Function
optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)Optimizes the Terrestrial Ecosystem Model (TEM) parameters for each pixel by mapping over the YAXcube(s).
Arguments
forcing::NamedTuple: Input forcing data for the TEM modeloutput::NamedTuple: Output configuration settingstem::NamedTuple: TEM model parameters and settingsoptim::NamedTuple: Optimization parameters and settingsobservations::NamedTuple: Observed data for model calibration
Keywords
max_cache::Float64=1e9: Maximum cache size for optimization process
Returns
Optimized TEM parameters cube
Code
function optimizeTEMYax(forcing::NamedTuple, output::NamedTuple, tem::NamedTuple, optim::NamedTuple, observations::NamedTuple; max_cache=1e9)
incubes = (forcing.data..., observations.data...)
indims = (forcing.dims..., observations.dims...)
forcing_vars = collect(forcing.variables)
outdims = output.parameter_dim
out = output.land_init
obs_vars = collect(observations.variables)
params = mapCube(optimizeYax, (incubes...,); out=out, tem=tem, optim=optim, forcing_vars=forcing_vars, obs_vars=obs_vars, indims=indims, outdims=outdims, max_cache=max_cache)
return params
endoptimizer
Sindbad.ParameterOptimization.optimizer Function
optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, algorithm <: ParameterOptimizationMethod)Optimize model parameters using various optimization algorithms.
Arguments:
cost_function: A function handle that takes a parameter vector as input and calculates a cost/loss (scalar or vector).default_values: A vector of default parameter values to initialize the optimization.lower_bounds: A vector of lower bounds for the parameters.upper_bounds: A vector of upper bounds for the parameters.algo_options: A set of options specific to the chosen optimization algorithm.algorithm: The optimization algorithm to be used.
Returns:
optim_para: A vector of optimized parameter values.
algorithm:
ParameterOptimizationMethod
Abstract type for optimization methods in SINDBAD
Available methods/subtypes:
BayesOptKMaternARD5: Bayesian Optimization using Matern 5/2 kernel with Automatic Relevance Determination from BayesOpt.jlCMAEvolutionStrategyCMAES: Covariance Matrix Adaptation Evolution Strategy (CMA-ES) from CMAEvolutionStrategy.jlEvolutionaryCMAES: Evolutionary version of CMA-ES optimization from Evolutionary.jlOptimBFGS: Broyden-Fletcher-Goldfarb-Shanno (BFGS) from Optim.jlOptimLBFGS: Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) from Optim.jlOptimizationBBOadaptive: Black Box Optimization with adaptive parameters from Optimization.jlOptimizationBBOxnes: Black Box Optimization using Natural Evolution Strategy (xNES) from Optimization.jlOptimizationBFGS: BFGS optimization with box constraints from Optimization.jlOptimizationFminboxGradientDescent: Gradient descent optimization with box constraints from Optimization.jlOptimizationFminboxGradientDescentFD: Gradient descent optimization with box constraints using forward differentiation from Optimization.jlOptimizationGCMAESDef: Global CMA-ES optimization with default settings from Optimization.jlOptimizationGCMAESFD: Global CMA-ES optimization using forward differentiation from Optimization.jlOptimizationMultistartOptimization: Multi-start optimization to find global optimum from Optimization.jlOptimizationNelderMead: Nelder-Mead simplex optimization method from Optimization.jlOptimizationQuadDirect: Quadratic Direct optimization method from Optimization.jl
Extended help
Notes:
The function supports a wide range of optimization algorithms, each tailored for specific use cases.
Some methods do not require bounds for optimization, while others do.
The
cost_functionshould be defined by the user to calculate the loss based on the model output and observations. It is defined in cost.jl.The
algo_optionsargument allows fine-tuning of the optimization process for each algorithm.Some algorithms (e.g.,
BayesOptKMaternARD5,OptimizationBBOxnes) require additional configuration steps, such as setting kernels or merging default and user-defined options.
Examples
julia> using Sindbad
julia> # Optimize using CMA-ES algorithm
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, CMAEvolutionStrategyCMAES())
julia> # Optimize using BFGS algorithm
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, OptimBFGS())
julia> # Optimize using Black Box Optimization (xNES)
julia> # optim_para = optimizer(cost_function, default_values, lower_bounds, upper_bounds, algo_options, OptimizationBBOxnes())Implementation Details:
The function internally calls the appropriate optimization library and algorithm based on the
algorithmargument.Each algorithm has its own implementation details, such as handling bounds, configuring options, and solving the optimization problem.
The results are processed to extract the optimized parameter vector (
optim_para), which is returned to the user.
Code
function optimizer(::Any, default_values::Any, ::Any, ::Any, ::Any, x::ParameterOptimizationMethod)
@warn "
Optimizer `$(nameof(typeof(x)))` not implemented.
To implement a new optimizer:
- First add a new type as a subtype of `ParameterOptimizationMethod` in `src/Types/ParameterOptimizationTypes.jl`.
- Then, add a corresponding method:
- if it can be implemented as an internal Sindbad method without additional dependencies, implement the method in `src/ParameterOptimization/optimizer.jl`.
- if it requires additional dependencies, implement the method in `ext/<extension_name>/ParameterOptimizationOptimizer.jl` extension.
As a fallback, this function will return the default values as the optimized parameters.
"
return default_values
endprepOpti
Sindbad.ParameterOptimization.prepOpti Function
prepOpti(forcing, observations, info, cost_method::CostModelObs)Prepares optimization setup including cost functions, parameter bounds, and initial values.
Arguments:
forcing: Forcing data for the simulationobservations: Observation data for cost calculationinfo: Experiment configuration NamedTuplecost_method: Cost calculation method type
Returns:
- A NamedTuple containing optimization helpers including cost function, parameter bounds, and default values
Examples
julia> using Sindbad
julia> # Prepare optimization setup
julia> # opti_helpers = prepOpti(forcing, observations, info, CostModelObs())Prepares optimization parameters, settings, and helper functions based on the provided inputs.
Arguments:
forcing: Input forcing data used for the optimization process.observations: Observed data used for comparison or calibration during optimization.info: A SINDBAD NamedTuple containing all information needed for setup and execution of the experiment.cost_method: The method used to calculate the cost function.
Returns:
- A NamedTuple
opti_helperscontaining:parameter_table: Processed model parameters for optimization.cost_function: A function to compute the cost for optimization.cost_options: Options and settings for the cost function.default_values: Default parameter values for the models.lower_bounds: Lower bounds for the parameters.upper_bounds: Upper bounds for the parameters.run_helpers: Helper information for running the optimization.
cost_method:
CostMethod
Abstract type for cost calculation methods in SINDBAD
Available methods/subtypes:
CostModelObs: cost calculation between model output and observationsCostModelObsLandTS: cost calculation between land model output and time series observationsCostModelObsMT: multi-threaded cost calculation between model output and observationsCostModelObsPriors: cost calculation between model output, observations, and priors. NOTE THAT THIS METHOD IS JUST A PLACEHOLDER AND DOES NOT CALCULATE PRIOR COST PROPERLY YET
Extended help
Notes:
The function processes the input data and configuration to set up the optimization problem.
It prepares model parameters, cost options, and helper functions required for the optimization process.
Depending on the
cost_method, the cost function is customized to handle specific data types or computation methods.
Code
function prepOpti end
function prepOpti(forcing, observations, info)
return prepOpti(forcing, observations, info, CostModelObs())
end
function prepOpti(forcing, observations, info)
return prepOpti(forcing, observations, info, CostModelObs())
end
function prepOpti(forcing, observations, info, ::CostModelObsMT; algorithm_info_field=:optimizer)
algorithm_info = getproperty(info.optimization, algorithm_info_field)
opti_helpers = prepOpti(forcing, observations, info, CostModelObs())
run_helpers = opti_helpers.run_helpers
cost_vector_size = getCostVectorSize(getproperty(algorithm_info, :options), opti_helpers.default_values, getproperty(algorithm_info, :method))
cost_vector = Vector{eltype(opti_helpers.default_values)}(undef, cost_vector_size)
space_index = 1 # the parallelization of cost computation only runs in single pixel runs
cost_function = x -> cost(x, opti_helpers.default_values, info.models.forward, run_helpers.space_forcing[space_index], run_helpers.space_spinup_forcing[space_index], run_helpers.loc_forcing_t, run_helpers.output_array, run_helpers.space_output_mt, deepcopy(run_helpers.space_land[space_index]), run_helpers.tem_info, observations, opti_helpers.parameter_table, opti_helpers.cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling, cost_vector, info.optimization.run_options.cost_method)
opti_helpers = (; opti_helpers..., cost_function=cost_function, cost_vector=cost_vector)
return opti_helpers
end
function prepOpti(forcing, observations, info, ::CostModelObsLandTS)
opti_helpers = prepOpti(forcing, observations, info, CostModelObs())
run_helpers = opti_helpers.run_helpers
cost_function = x -> costLand(x, info.models.forward, run_helpers.loc_forcing, run_helpers.loc_spinup_forcing, run_helpers.loc_forcing_t, run_helpers.land_time_series, run_helpers.loc_land, run_helpers.tem_info, observations, opti_helpers.parameter_table, opti_helpers.cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling)
opti_helpers = (; opti_helpers..., cost_function=cost_function)
return opti_helpers
end
function prepOpti(forcing, observations, info, cost_method::CostModelObs)
run_helpers = prepTEM(forcing, info)
parameter_helpers = prepParameters(info.optimization.parameter_table, info.optimization.run_options.parameter_scaling)
parameter_table = parameter_helpers.parameter_table
default_values = parameter_helpers.default_values
lower_bounds = parameter_helpers.lower_bounds
upper_bounds = parameter_helpers.upper_bounds
cost_options = prepCostOptions(observations, info.optimization.cost_options, cost_method)
cost_function = x -> cost(x, default_values, info.models.forward, run_helpers.space_forcing, run_helpers.space_spinup_forcing, run_helpers.loc_forcing_t, run_helpers.output_array, run_helpers.space_output, deepcopy(run_helpers.space_land), run_helpers.tem_info, observations, parameter_table, cost_options, info.optimization.run_options.multi_constraint_method, info.optimization.run_options.parameter_scaling, cost_method)
opti_helpers = (; parameter_table=parameter_table, cost_function=cost_function, cost_options=cost_options, default_values=default_values, lower_bounds=lower_bounds, upper_bounds=upper_bounds, run_helpers=run_helpers)
return opti_helpers
endprepParameters
Sindbad.ParameterOptimization.prepParameters Function
prepParameters(parameter_table, parameter_scaling)Prepare model parameters for optimization by processing default and bounds of the parameters to be optimized.
Arguments
parameter_table: Table of the parameters to be optimizedparameter_scaling: Scaling method/type for parameter optimization
Returns
A tuple containing processed parameters ready for optimization
Code
function prepParameters(parameter_table, parameter_scaling)
default_values, lower_bounds, upper_bounds = scaleParameters(parameter_table, parameter_scaling)
parameter_helpers = (; parameter_table=parameter_table, default_values=default_values, lower_bounds=lower_bounds, upper_bounds=upper_bounds)
return parameter_helpers
end