boar.core package

Submodules

boar.core.FitParams module

class boar.core.FitParams.FOMparam(func, name='', val=1.0, std=0, relRange=1, display_name='', unit='', optim_type='linear', axis_type=None)[source]

Bases: object

update_FOMparam(FOM_list)[source]

Update the FOM_param object with lims and startVal based on FOM_list: set lims to the minimum and maximum of FOM_list, startVal to the mean, relRange to 1, and lim_type to absolute.

Parameters:

FOM_list (list) – list of FOMs
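As a rough illustration of the update described above, the following sketch derives the limits and start value from a list of figures of merit. The function name `summarize_foms` and the returned dictionary are hypothetical; the keys only mirror the attribute names mentioned in the description, and this is not the library's implementation.

```python
from statistics import mean


def summarize_foms(fom_list):
    """Return the settings that the description above derives from a list of FOMs."""
    if not fom_list:
        raise ValueError("fom_list must contain at least one value")
    return {
        "lims": [min(fom_list), max(fom_list)],  # limits span the observed FOMs
        "startVal": mean(fom_list),              # start value is the mean FOM
        "relRange": 1,
        "lim_type": "absolute",
    }


settings = summarize_foms([0.2, 0.5, 0.8])
```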

class boar.core.FitParams.Fitparam(name='', startVal=None, val=1.0, relRange=0, lims=[], std=0, p0m=None, d='', display_name='', unit='', range_type='log', lim_type='absolute', optim_type='linear', axis_type=None, val_type='float', rescale=True, stepsize=None)[source]

Bases: object

boar.core.funcs module

boar.core.funcs.callable_name(any_callable: Callable[[...], Any]) → str[source]

Returns the name of a callable object

Parameters:

any_callable (Callable[..., Any]) – Callable object

Returns:

Name of the callable object

Return type:

str
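One way such a lookup can work is sketched below. The helper `describe_callable` and its fallback rules are assumptions made for illustration; the library function may treat partials or callable instances differently.

```python
import functools


def describe_callable(any_callable):
    """Best-effort name lookup for functions, partials and callable instances."""
    # functools.partial objects have no __name__; look at the wrapped function.
    if isinstance(any_callable, functools.partial):
        return describe_callable(any_callable.func)
    try:
        return any_callable.__name__
    except AttributeError:
        # Instances of classes defining __call__ fall back to the class name.
        return type(any_callable).__name__


class Doubler:
    def __call__(self, x):
        return 2 * x


names = [
    describe_callable(len),
    describe_callable(functools.partial(pow, 2)),
    describe_callable(Doubler()),
]
```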

boar.core.funcs.gauss(x, a, b, xc)[source]
boar.core.funcs.gauss_sk(x, a, b, c, s=0.0001)[source]
boar.core.funcs.gaussian_pulse_norm(t, tpulse, width)[source]

Returns a Gaussian pulse

Parameters:
  • t (1-D sequence of floats) – time axis (unit: s)

  • tpulse (float) – center of the pulse (unit: s)

  • width (float) – width of the pulse (unit: s)

Returns:

Vector containing the Gaussian pulse

Return type:

1-D sequence of floats
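A minimal sketch of such a pulse is given below. It assumes that `width` is the standard deviation and that the pulse is normalized to a peak value of 1; neither convention is stated above, so the library may use a different one. The name `gaussian_pulse` is hypothetical.

```python
import math


def gaussian_pulse(t, tpulse, width):
    """Gaussian pulse with peak value 1 at t = tpulse (assumed convention)."""
    return [math.exp(-((ti - tpulse) ** 2) / (2.0 * width ** 2)) for ti in t]


t = [i * 1e-9 for i in range(11)]          # time axis from 0 ns to 10 ns
pulse = gaussian_pulse(t, tpulse=5e-9, width=1e-9)
```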

boar.core.funcs.get_flux_density(P, wl, nu, A, alpha)[source]

From the measured power, repetition rate, and pump area, get the flux in photons/cm2 and the approximate volume density in photons/cm3 per pulse

Parameters:
  • P (float) – total CW power of pulse in W

  • wl (float) – excitation wavelength in nm

  • nu (float) – repetition rate in s-1

  • A (float) – effective pump area in cm2

  • alpha (float) – penetration depth in cm

Returns:

  • flux (float) – flux in photons per cm2

  • density (float) – average volume density in photons/cm3
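The conversion can be sketched with the usual relations: the energy per pulse is P / nu, and the energy per photon is h·c / wavelength. The sketch assumes that P is the time-averaged power and that the volume density is the flux divided by the penetration depth alpha; the function name `photon_flux_and_density` is hypothetical and the library may round or order things differently.

```python
H = 6.62607015e-34   # Planck constant in J s
C = 2.99792458e8     # speed of light in m/s


def photon_flux_and_density(P, wl, nu, A, alpha):
    """Photons per cm2 and per cm3 for one pulse (illustrative sketch)."""
    pulse_energy = P / nu                      # J per pulse from average power
    photon_energy = H * C / (wl * 1e-9)        # J per photon, wl converted nm -> m
    flux = pulse_energy / photon_energy / A    # photons per cm2 per pulse
    density = flux / alpha                     # spread over the penetration depth
    return flux, density


flux, density = photon_flux_and_density(P=1e-3, wl=500, nu=1e3, A=1.0, alpha=1e-5)
```

With these made-up inputs (1 mW, 500 nm, 1 kHz, 1 cm2), the flux comes out at roughly 2.5e12 photons/cm2 per pulse.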

boar.core.funcs.get_unique_X(X, xaxis, X_dimensions)[source]

Get the unique values of the independent variable (X) in the dataset

Parameters:
  • X (ndarray) – the experimental dimensions

  • xaxis (str, optional) – the name of the independent variable

  • X_dimensions (list, optional) – names of the X columns

Returns:

  • X_unique (ndarray) – the unique values of the independent variable

  • X_dimensions_uni (list) – the names of the columns of X_unique

Raises:

ValueError – if xaxis is not in X_dimensions
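One plausible reading of this function is sketched below: drop the column named by xaxis and keep the unique rows of what remains. This interpretation, the helper name `unique_without_xaxis`, and the column names in the example are assumptions; plain Python lists stand in for the ndarray.

```python
def unique_without_xaxis(X, xaxis, X_dimensions):
    """Drop the xaxis column and return the unique remaining rows."""
    if xaxis not in X_dimensions:
        raise ValueError(f"{xaxis!r} is not in X_dimensions")
    keep = [i for i, name in enumerate(X_dimensions) if name != xaxis]
    seen = []
    for row in X:
        reduced = tuple(row[i] for i in keep)
        if reduced not in seen:      # preserve first-seen order
            seen.append(reduced)
    return seen, [X_dimensions[i] for i in keep]


X = [(0.0, 1.0), (0.5, 1.0), (0.0, 2.0), (0.5, 2.0)]   # columns: Vext, Gfrac
X_unique, dims_unique = unique_without_xaxis(X, "Vext", ["Vext", "Gfrac"])
```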

boar.core.funcs.get_unique_X_and_xaxis_values(X, xaxis, X_dimensions)[source]

Get the values of the independent variable (X) in the dataset for each unique value of the other dimensions

Parameters:
  • X (ndarray) – the experimental dimensions

  • xaxis (str, optional) – the name of the independent variable

  • X_dimensions (list, optional) – the names of the columns of X

Returns:

xs – the values of the independent variable for each unique value of the other dimensions

Return type:

list of ndarrays
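The grouping described above can be sketched as follows, using plain lists instead of ndarrays. The helper name `xaxis_values_per_condition` and the example column names are hypothetical, and the ordering of the groups (first appearance) is an assumption.

```python
def xaxis_values_per_condition(X, xaxis, X_dimensions):
    """Group the xaxis values by the unique combination of the other columns."""
    if xaxis not in X_dimensions:
        raise ValueError(f"{xaxis!r} is not in X_dimensions")
    ix = X_dimensions.index(xaxis)
    groups = {}                      # dicts keep insertion order
    for row in X:
        key = tuple(v for i, v in enumerate(row) if i != ix)
        groups.setdefault(key, []).append(row[ix])
    return list(groups.values())


X = [(0.0, 1.0), (0.5, 1.0), (0.0, 2.0), (0.5, 2.0), (1.0, 2.0)]
xs = xaxis_values_per_condition(X, "Vext", ["Vext", "Gfrac"])
```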

boar.core.funcs.polynom(x, a, gamma)[source]
boar.core.funcs.sci_notation(number, sig_fig=2)[source]

Make proper scientific notation for graphs

Parameters:
  • number (float) – Number to put in scientific notation.

  • sig_fig (int, optional) – Number of significant digits, by default 2.

Returns:

output – String containing the number in scientific notation

Return type:

str
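A possible formatting routine is sketched below. It assumes the output is a LaTeX-style string of the form `a \times 10^{b}` and that sig_fig counts the digits after the decimal point of the mantissa; the exact string produced by the library is not shown above, and `to_sci_notation` is a hypothetical name.

```python
def to_sci_notation(number, sig_fig=2):
    """Format number as 'a \\times 10^{b}' with sig_fig decimals in a."""
    mantissa, exponent = f"{number:.{sig_fig}e}".split("e")
    return rf"{mantissa} \times 10^{{{int(exponent)}}}"


label = to_sci_notation(12345.678, sig_fig=2)
small_label = to_sci_notation(0.00012, sig_fig=1)
```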

boar.core.funcs.sigmoid(x, a, b, xc)[source]

boar.core.optimization module

class boar.core.optimization.MultiObjectiveOptimizer(params=None, targets=None, warmstart=None, Path2OldXY=None, SaveOldXY2file=None, res_dir='temp', parallel=True, verbose=False)[source]

Bases: BoarOptimizer

LH(X, beta_scaled, N, gpr, fscale)[source]

Compute the positive log likelihood from the negative log likelihood.

Be careful here! The loss and threshold used here are the ones defined as arguments of optimize_sko_parallel and not the ones defined in the targets, so for the calculation of the MSE to be consistent, all targets need to have the same loss and threshold!

Parameters:
  • X (ndarray) – data array of size (n, m): n = number of data points, m = number of dimensions

  • beta_scaled (float) – 1 / minimum of the scaled surrogate function. It is multiplied with fscale to give 1 / the unscaled minimum, where the unscaled minimum is the MSE of the best fit. If there are no systematic deviations and the noise in y is Gaussian, this MSE should yield the variance of the Gaussian distribution of target values.

  • N (integer) – number of data points

  • gpr (scikit-optimize estimator) – a trained regressor which has a .predict() method

  • fscale (float) – scaling factor to keep the surrogate function between 0 and 100 (yields best results in BO but here we need the unscaled surrogate function, that is, MSE)

Returns:

the likelihood

Return type:

float

LLH(X, beta_scaled, N, gpr, fscale)[source]

Return the negative log likelihood -ln(p(t|w,beta)) where t are the measured target values, w is the set of model parameters, and beta is the target uncertainty.

Be careful here! The loss and threshold used here are the ones defined as arguments of the optimize_sko_parallel and not the ones defined in the targets so for the calculation of the MSE to be consistent, we need all targets to have the same loss and threshold!

For reference check: Christopher M. Bishop: Pattern Recognition and Machine Learning, Springer Information Science and statistics 2006, Chapter 1.2.5 pg 29

Parameters:
  • X (ndarray) – Data array of size (n,m): n=number of data points, m=number of dimensions

  • beta_scaled (float) – 1 / minimum of the scaled surrogate function. It is multiplied with fscale to give 1 / the unscaled minimum, where the unscaled minimum is the MSE of the best fit. If there are no systematic deviations and the noise in y is Gaussian, this MSE should yield the variance of the Gaussian distribution of target values.

  • N (integer) – number of data points

  • gpr (scikit-optimize estimator) – a trained regressor which has a .predict() method

  • fscale (float) – scaling factor to keep the surrogate function between 0 and 100 (yields best results in BO but here we need the unscaled surrogate function, that is, MSE)

Returns:

the negative log likelihood

Return type:

float
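For Gaussian noise with precision beta, the negative log likelihood from the cited section of Bishop can be written in terms of the MSE, since the sum of squared residuals equals N times the MSE. The sketch below evaluates that expression directly from an MSE value; it does not reproduce the surrogate model, beta_scaled, or fscale handling of the method, and the function name and numbers are made up for illustration.

```python
import math


def negative_log_likelihood(mse, n_points, beta):
    """-ln p(t | w, beta) for Gaussian noise with precision beta."""
    sum_sq = n_points * mse                       # sum of squared residuals
    return (0.5 * beta * sum_sq
            - 0.5 * n_points * math.log(beta)
            + 0.5 * n_points * math.log(2.0 * math.pi))


mse_best = 0.04                 # MSE at the best fit (made-up number)
beta = 1.0 / mse_best           # precision implied by the best-fit MSE
nll_best = negative_log_likelihood(mse_best, n_points=50, beta=beta)
nll_worse = negative_log_likelihood(4 * mse_best, n_points=50, beta=beta)
```

A parameter set with a larger MSE gets a larger negative log likelihood, which is what makes the gridded posterior peak at the best fit.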

cost_from_old_xy(old_xy, targets, fscale, obj_type='MSE', loss='linear', threshold=1000)[source]

Calculate the cost function from old data points

Parameters:
  • yfs (1D-array) – array of size (n,) with the model function values from old data points

  • y (1D-array) – data array of size (n,) to fit

  • fscale (float) – a scaling factor to keep y between 0 and 100 so the length scales can be compared

  • weight (int, optional) – weight array of size (n,) to weight the data points, by default 1

  • obj_type (str, optional) –

    objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’

    ‘MSE’: mean squared error
    ‘RMSE’: root mean squared error
    ‘MSLE’: mean squared log error
    ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))
    ‘MAE’: mean absolute error
    ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)
    ‘larry’: mean squared error legacy version
    ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments

  • loss (str, optional) – type of loss function to use, can be [‘linear’, ‘soft_l1’, ‘huber’, ‘cauchy’, ‘arctan’], by default ‘linear’

  • threshold (int, optional) – critical value above which loss sets in, by default 1000.

Returns:

array of size (n,) with the loss function values

Return type:

1D-array
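A few of the objective types listed above can be sketched in plain Python as follows. The function name `objective` is hypothetical, only four of the listed types are covered, and the nRMSE range is taken over both y and yf together, which is one reading of the formula given above.

```python
import math


def objective(y, yf, obj_type="MSE"):
    """Scalar misfit between data y and model values yf."""
    residuals = [b - a for a, b in zip(y, yf)]
    mse = sum(r * r for r in residuals) / len(residuals)
    if obj_type == "MSE":
        return mse
    if obj_type == "RMSE":
        return math.sqrt(mse)
    if obj_type == "MAE":
        return sum(abs(r) for r in residuals) / len(residuals)
    if obj_type == "nRMSE":
        span = max(max(y), max(yf)) - min(min(y), min(yf))
        return math.sqrt(mse) / span
    raise ValueError(f"unsupported obj_type: {obj_type}")


y, yf = [1.0, 2.0, 3.0], [1.0, 2.0, 5.0]
scores = {name: objective(y, yf, name) for name in ("MSE", "RMSE", "MAE", "nRMSE")}
```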

do_grid_posterior(step, fig, axes, gs, lb, ub, pf, beta_scaled, N, gpr, fscale, Nres, logscale, vmin, min_prob=0.01, clear_axis=False, True_values=None, points=None)[source]

Obtain the posterior probability distribution p(w|y) by brute-force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation.

Parameters:
  • step (int) – Number of zooming steps

  • fig (matplotlib.figure.Figure) – Figure to plot on

  • axes (list) – List of axes to plot on

  • gs (matplotlib.gridspec.GridSpec) – Gridspec to plot on

  • lb (list) – Lower bounds of the grid

  • ub (list) – Upper bounds of the grid

  • pf (list) – List of parameters

  • N (integer) – number of datasets

  • gpr (scikit-optimize estimator) – trained regressor

  • fscale (float) – scaling factor

  • Nres (integer) – Sampling resolution. Number of data points per dimension.

  • logscale (boolean) – display in log scale?

  • vmin (float) – lower cutoff (in terms of exp(vmin) if logscale==True)

  • zoom (int, optional) – number of times to zoom in, by default 1.

  • min_prob (float, optional) – minimum probability to consider when zooming in; the zoom covers the region of parameter space with a probability higher than min_prob, by default 0.01.

  • clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.

  • True_values (dict, optional) – dictionary of true values of the parameters, by default None

  • points (array, optional) – array of explored points in the parameter space during the optimization, by default None

load_old_xy()[source]

Load old xy data from file from self.SaveOldXY2file

marginal_posterior_1D(x_name, pf=None, lb=None, ub=None, fig=None, ax=None, True_values=None, gpr=None, N=None, beta_scaled=None, fscale=None, Nres=None, Ninteg=100000.0, vmin=None, min_prob=None, points=None, logscale=False, show_plot=True, clear_axis=False, xlabel_pos='bottom', ylabel_pos='left', **kwargs)[source]

calculate and plot the marginal posterior probability distribution p(w|y) for parameter x_name by integrating over the other parameters

Parameters:
  • x_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated

  • lb (float, optional) – lower bound of the parameter x_name, if None we use the main boundaries, by default None

  • ub (float, optional) – upper bound of the parameter x_name, if None we use the main boundaries, by default None

  • fig (matplotlib figure, optional) – figure to plot the marginal posterior probability distribution, if None we create a new figure, by default None

  • ax (matplotlib axis, optional) – axis to plot the marginal posterior probability distribution, if None we create a new axis, by default None

  • True_values (dict, optional) – dictionary with the true values of the parameters, by default None

  • gpr (sklearn regressor, optional) – regressor to calculate the likelihood, if None we use the self.gpr, by default None

  • N (int, optional) – number of samples to calculate the likelihood, if None we use the self.N, by default None

  • beta_scaled (float, optional) – scaling factor for the likelihood, if None we use the self.beta_scaled, by default None

  • fscale (float, optional) – scaling factor for the likelihood, if None we use the self.fscale, by default None

  • Nres (int, optional) – number of points to calculate the marginal posterior probability distribution, by default None

  • Ninteg (int, optional) – number of points to marginalize the prob, by default 1e5

  • vmin (float, optional) – minimum value of the marginal posterior probability distribution, only used if logscale = True as for linscale the min probability is 0, by default None

  • min_prob (float, optional) – value used for the cut-off probability when zooming in; note that this is currently not in use, by default None

  • points (array, optional) – array with the points to plot the marginal posterior probability distribution, by default None

  • logscale (bool, optional) – if True we plot the marginal posterior probability distribution in log scale, by default False

  • show_plot (bool, optional) – if True we show the plot, by default True

  • clear_axis (bool, optional) – if True we clear the axis, by default False

  • xlabel_pos (str, optional) – position of the xlabel, by default ‘bottom’

  • ylabel_pos (str, optional) – position of the ylabel, by default ‘left’

  • '**kwargs'

    additional arguments to pass to the plot function, by default None
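The idea of marginalizing over the other parameters can be illustrated on a small two-parameter grid: summing the joint probability over the second parameter leaves the marginal of the first. The helper `marginalize` and the numbers are made up, and the method itself works from the surrogate model rather than from a precomputed table.

```python
def marginalize(joint, axis):
    """Sum a 2-D probability grid over one axis and renormalize."""
    if axis == 0:
        summed = [sum(col) for col in zip(*joint)]   # integrate out rows
    else:
        summed = [sum(row) for row in joint]         # integrate out columns
    total = sum(summed)
    return [s / total for s in summed]


# joint[i][j] ~ p(a_i, b_j | y) on a regular grid (made-up numbers)
joint = [
    [0.05, 0.10, 0.05],
    [0.10, 0.40, 0.10],
    [0.05, 0.10, 0.05],
]
p_a = marginalize(joint, axis=1)   # marginal over b, one value per a_i
```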

marginal_posterior_2D(x_name, y_name, pf=None, lb=None, ub=None, fig=None, ax=None, True_values=None, gpr=None, N=None, beta_scaled=None, fscale=None, Nres=None, Ninteg=100000.0, vmin=None, min_prob=None, points=None, logscale=False, show_plot=True, clear_axis=False, xlabel_pos='bottom', ylabel_pos='left', **kwargs)[source]

calculate and plot the marginal posterior probability distribution p(w|y) for the parameters x_name and y_name by integrating over the other parameters

Parameters:
  • x_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated on the x-axis

  • y_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated on the y-axis

  • lb (float, optional) – lower bound of the parameter x_name, if None we use the main boundaries, by default None

  • ub (float, optional) – upper bound of the parameter x_name, if None we use the main boundaries, by default None

  • fig (matplotlib figure, optional) – figure to plot the marginal posterior probability distribution, if None we create a new figure, by default None

  • ax (matplotlib axis, optional) – axis to plot the marginal posterior probability distribution, if None we create a new axis, by default None

  • True_values (dict, optional) – dictionary with the true values of the parameters, by default None

  • gpr (sklearn regressor, optional) – regressor to calculate the likelihood, if None we use the self.gpr, by default None

  • N (int, optional) – number of samples to calculate the likelihood, if None we use the self.N, by default None

  • beta_scaled (float, optional) – scaling factor for the likelihood, if None we use the self.beta_scaled, by default None

  • fscale (float, optional) – scaling factor for the likelihood, if None we use the self.fscale, by default None

  • Nres (int, optional) – number of points to calculate the marginal posterior probability distribution, by default None

  • Ninteg (int, optional) – number of points to marginalize the prob, by default 1e5

  • vmin (float, optional) – minimum value of the marginal posterior probability distribution, only used if logscale = True as for linscale the min probability is 0, by default None

  • min_prob (float, optional) – value used for the cut-off probability when zooming in; note that this is currently not in use, by default None

  • points (array, optional) – array with the points to plot the marginal posterior probability distribution, by default None

  • logscale (bool, optional) – if True we plot the marginal posterior probability distribution in log scale, by default False

  • show_plot (bool, optional) – if True we show the plot, by default True

  • clear_axis (bool, optional) – if True we clear the axis, by default False

  • xlabel_pos (str, optional) – position of the xlabel, by default ‘bottom’

  • ylabel_pos (str, optional) – position of the ylabel, by default ‘left’

  • '**kwargs'

    additional arguments to pass to the plot function, by default None

  • show_points (bool, optional) – if True we show the points, by default True

obj_func_sko(*p, params, targets, fscale, obj_type='MSE', loss='linear', threshold=1000)[source]

Objective function directly returning the loss function for use with the Bayesian Optimizer

Parameters:
  • '*p'

    the parameters as passed by the Bayesian Optimizer

  • params (list) – list of Fitparam() objects

  • model (callable) – Model function yf = f(X) to compare to y

  • X (ndarray) – data array of size (n, m): n = number of data points, m = number of dimensions

  • y (1D-array) – data array of size (n,) to fit

  • fscale (float) – a scaling factor to keep y between 0 and 100 so the length scales can be compared

  • weight (1D-array, optional) – weight array of size (n,) to weight the data points, by default 1

  • obj_type (str, optional) –

    objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’

    ‘MSE’: mean squared error
    ‘RMSE’: root mean squared error
    ‘MSLE’: mean squared log error
    ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))
    ‘MAE’: mean absolute error
    ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)
    ‘larry’: mean squared error legacy version
    ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments

  • loss (str, optional) – type of loss function to use, can be [‘linear’, ‘soft_l1’, ‘huber’, ‘cauchy’, ‘arctan’], by default ‘linear’

  • threshold (int, optional) – critical value above which loss sets in, by default 1000. Choose it so that the loss function affects only outliers. If threshold is set too low, the value is suppressed even for non-outliers, which falsifies the mean squared error and thus the log likelihood, the Hessian, and the error bars.

Returns:

array of size (n,) with the loss function values

Return type:

1D-array
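The loss names listed above match those of `scipy.optimize.least_squares`, whose documented definitions are used in the sketch below. How the method applies threshold is not spelled out here, so the scaling (divide by threshold, apply the loss, multiply back) is an assumption made in analogy to SciPy's f_scale, and `robust_loss` is a hypothetical helper.

```python
import math

# rho(z) as documented for scipy.optimize.least_squares
RHO = {
    "linear": lambda z: z,
    "soft_l1": lambda z: 2.0 * (math.sqrt(1.0 + z) - 1.0),
    "huber": lambda z: z if z <= 1.0 else 2.0 * math.sqrt(z) - 1.0,
    "cauchy": lambda z: math.log1p(z),
    "arctan": lambda z: math.atan(z),
}


def robust_loss(z, loss="linear", threshold=1000):
    """Damp a non-negative cost z once it exceeds roughly `threshold`."""
    return threshold * RHO[loss](z / threshold)


small = robust_loss(10.0, "huber", threshold=1000)     # below threshold
large = robust_loss(1.0e6, "huber", threshold=1000)    # far above threshold
```

With the Huber loss, a cost well below the threshold passes through unchanged while a cost far above it is strongly compressed, which is why a threshold that is too low would also distort ordinary points.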

optimize_sko_parallel(n_jobs=4, n_yscale=20, n_BO=10, n_initial_points=10, n_BO_warmstart=5, n_jobs_init=None, obj_type='MSE', loss='linear', threshold=1000, kwargs=None, base_estimator='GP', show_objective_func=True, kwargs_plot_obj=None, show_posterior=True, kwargs_posterior=None, verbose=True)[source]

Multi-objective optimization of the parameters of the model using the scikit-optimize package

Parameters:
  • n_jobs (int, optional) – number of parallel jobs to run, by default 4

  • n_yscale (int, optional) – number of points used to estimate the scaling factor yscale, by default 20

  • n_BO (int, optional) – number of points to run in the Bayesian optimization, by default 10

  • n_initial_points (int, optional) – number of initial points to run, by default 10

  • n_BO_warmstart (int, optional) – number of points to run in the Bayesian optimization after warmstart, by default 5

  • n_jobs_init (int, optional) – number of parallel jobs to run for the initial points, by default None; if None, n_jobs is used

  • obj_type (str, optional) –

    objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’

    ‘MSE’: mean squared error
    ‘RMSE’: root mean squared error
    ‘MSLE’: mean squared log error
    ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))
    ‘MAE’: mean absolute error
    ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)
    ‘larry’: mean squared error legacy version
    ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments

  • loss (str, optional) – type of loss function to use, can be [‘linear’, ‘soft_l1’, ‘huber’, ‘cauchy’, ‘arctan’], by default ‘linear’

  • threshold (int, optional) – critical value above which loss sets in, by default 1000. Choose it so that the loss function affects only outliers. If threshold is set too low, the value is suppressed even for non-outliers, which falsifies the mean squared error and thus the log likelihood, the Hessian, and the error bars.

  • suggest_only (bool, optional) – only suggest the next point without evaluating it, by default False

  • kwargs (dict, optional) –

    dictionary of keyword arguments to check for the improvement of the model, by default None, including:

    max_loop_no_improvement (int, optional) – maximum number of loops without improvement, by default 10

    check_improvement (str or None, optional) – check for improvement, can be either None, ‘relax’, or ‘strict’, by default None. If None, no check is performed. If ‘relax’, the check tests whether abs(fun_new - fun_best)/fun_new > ftol or norm(dx) < xtol*(xtol + norm(x)). If ‘strict’, the check tests only whether abs(fun_new - fun_best)/fun_new > ftol.

    ftol (float, optional) – monitor the change in the minimum of the objective function value, by default 1e-3

    xtol (float, optional) – monitor the change of the fitting results, by default 1e-3

    initial_point_generator (str, optional) – type of initial point generator, can be [‘random’, ‘sobol’, ‘halton’, ‘hammersly’, ‘lhs’, ‘grid’], by default ‘lhs’

    acq_func (str, optional) – type of acquisition function, can be [‘LCB’, ‘EI’, ‘PI’, ‘gp_hedge’], by default ‘gp_hedge’

    acq_optimizer (str, optional) – type of acquisition optimizer, can be [‘auto’, ‘sampling’, ‘lbfgs’], by default ‘auto’

    acq_func_kwargs (dict, optional) – dictionary of keyword arguments for the acquisition function, by default {}

    acq_optimizer_kwargs (dict, optional) – dictionary of keyword arguments for the acquisition optimizer, by default {}

    switch2exploit (bool, optional) – switch to exploitation after reaching max_loop_no_improvement loops without improvement and reset the counter, by default True

  • show_objective_func (bool, optional) – plot the objective function, by default True

  • kwargs_plot_obj (dict, optional) –

    dictionary of keyword arguments for plot_objective_function, by default None, including:

    zscale (str, optional) – type of scaling to use for the objective function, can be [‘linear’, ‘log’], by default ‘log’

    show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

  • show_posterior (bool, optional) – calculate and show the posterior distribution, by default True

  • kwargs_posterior (dict, optional) –

    dictionary of keyword arguments for the posterior function, by default None, including:

    Nres (integer, optional) – sampling resolution, that is, the number of data points per dimension, by default 30

    Ninteg (integer, optional) – number of points for the marginalization over the other parameters when full_grid = False, by default 100

    full_grid (boolean, optional) – if True, use a full grid for the posterior, by default False

    randomize (boolean, optional) – if True, calculate the posterior for all the dimensions but draw the marginalization points randomly instead of using a coarse grid, by default False

    logscale (boolean, optional) – display in log scale, by default True

    vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100

    zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0

    min_prob (float, optional) – minimum probability to consider when zooming in; the zoom covers the region of parameter space with a probability higher than min_prob, by default 1e-40

    clear_axis (boolean, optional) – clear the axis before plotting the zoomed-in data, by default False

    True_values (dict, optional) – dictionary of true values of the parameters, by default None

    show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

  • verbose (bool, optional) – display progress and results, by default True

Returns:

dictionary with the optimized parameters (‘popt’) and the corresponding covariance (‘pcov’) and standard deviation (‘std’) values

Return type:

dict
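The two conditions quoted for the check_improvement option can be written out as below. The sketch evaluates the inequalities literally as given in the parameter description; what the optimizer does with the result, and which point enters norm(x), are not stated there, so taking the new point for norm(x) is an assumption and `improvement_condition` is a hypothetical name.

```python
import math


def improvement_condition(fun_new, fun_best, x_new, x_best,
                          mode="strict", ftol=1e-3, xtol=1e-3):
    """Evaluate the 'relax' or 'strict' condition as written in the description."""
    f_change = abs(fun_new - fun_best) / fun_new > ftol
    if mode == "strict":
        return f_change
    if mode == "relax":
        dx = math.dist(x_new, x_best)       # norm(dx)
        x_norm = math.hypot(*x_new)         # norm(x), new point assumed
        return f_change or dx < xtol * (xtol + x_norm)
    raise ValueError(f"unknown mode: {mode}")


changed = improvement_condition(0.90, 1.00, [1.0, 2.0], [1.0, 2.0], mode="strict")
unchanged = improvement_condition(1.00, 1.00, [1.0, 2.0], [1.0, 2.0], mode="strict")
small_step = improvement_condition(1.00, 1.00, [1.0, 2.0], [1.0, 2.0], mode="relax")
```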

plot_objective_function(rrr, r, axis_type, pnames_display, kwargs_plot_obj={})[source]

Plot the objective function as a contour plot using the skopt plot_objective function

Parameters:
  • rrr (skopt.optimizer.OptimizeResult) – result of the optimization

  • pnames_display (list) – list of strings with the display names of the parameters

  • kwargs_plot_obj (dict, optional) – kwargs for the plot_objective function, by default {}

posterior(params, lb_main, ub_main, beta_scaled, N, gpr, fscale, kwargs_posterior, points=None)[source]

Obtain the posterior probability distribution p(w|y) by brute-force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation.

Parameters:
  • params (list of fitparameter objects) – list of fitparameters

  • lb_main (list of floats) – lower bound for the fitparameters

  • ub_main (list of floats) – upper bound for the fitparameters

  • beta_scaled (float) – 1/minimum of the scaled surrogate function

  • N (integer) – number of datasets

  • gpr (scikit-optimize estimator) – trained regressor

  • fscale (float) – scaling factor

  • kwargs_posterior (dict) –

    dictionary of keyword arguments for the posterior function, including:

    Nres (integer, optional) – sampling resolution, that is, the number of data points per dimension, by default 30

    Ninteg (integer, optional) – number of points for the marginalization over the other parameters when full_grid = False, by default 100

    full_grid (boolean, optional) – if True, use a full grid for the posterior, by default False

    randomize (boolean, optional) – if True, calculate the posterior for all the dimensions but draw the marginalization points randomly instead of using a coarse grid, by default False

    logscale (boolean, optional) – display in log scale, by default True

    vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100

    zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0

    min_prob (float, optional) – minimum probability to consider when zooming in; the zoom covers the region of parameter space with a probability higher than min_prob, by default 1e-40

    clear_axis (boolean, optional) – clear the axis before plotting the zoomed-in data, by default False

    True_values (dict, optional) – dictionary of true values of the parameters, by default None

    show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

  • points (array, optional) – array of explored points in the parameter space during the optimization, by default None

Returns:

  • Contour plots for each pair of fit parameters

  • list of float, list of float – the mean and the square root of the second central moment (generalized standard deviation for arbitrary probability distribution)
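The gridding idea and the two returned moments can be illustrated in one dimension. The sketch below assumes a flat prior and takes an arbitrary negative log likelihood function instead of the trained surrogate; the helper name and the toy likelihood are made up for illustration.

```python
import math


def grid_posterior_moments(grid, neg_log_likelihood):
    """Mean and standard deviation of p(w|y) on a 1-D grid with a flat prior."""
    nll = [neg_log_likelihood(w) for w in grid]
    shift = min(nll)                                  # avoid underflow in exp
    weights = [math.exp(-(v - shift)) for v in nll]
    total = sum(weights)
    prob = [wt / total for wt in weights]             # normalized posterior
    mean = sum(p * w for p, w in zip(prob, grid))
    var = sum(p * (w - mean) ** 2 for p, w in zip(prob, grid))
    return mean, math.sqrt(var)                       # sqrt of 2nd central moment


# Toy negative log likelihood: Gaussian centred at 2.0 with sigma 0.1.
grid = [1.0 + 0.002 * i for i in range(1001)]         # 1.0 ... 3.0
mean, std = grid_posterior_moments(grid, lambda w: 0.5 * ((w - 2.0) / 0.1) ** 2)
```

For this toy case the grid estimate recovers the centre and width of the Gaussian; for a skewed or multimodal posterior the same two numbers are still defined, which is the sense of the generalized standard deviation mentioned above.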

randomize_grid_posterior(params, lb_main, ub_main, beta_scaled, N, gpr, fscale, kwargs_posterior, points=None, True_values=None)[source]

Obtain the posterior probability distribution p(w|y) by brute-force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation.

Parameters:
  • params (list of fitparameter objects) – list of fitparameters

  • lb_main (list of floats) – lower bound for the fitparameters

  • ub_main (list of floats) – upper bound for the fitparameters

  • beta_scaled (float) – 1/minimum of the scaled surrogate function

  • N (integer) – number of datasets

  • gpr (scikit-optimize estimator) – trained regressor

  • fscale (float) – scaling factor

  • kwargs_posterior (dict) –

    dictionary of keyword arguments for the posterior function, including:

    Nres (integer, optional) – sampling resolution, that is, the number of data points per dimension, by default 30

    Ninteg (integer, optional) – number of points for the marginalization over the other parameters when full_grid = False, by default 100

    full_grid (boolean, optional) – if True, use a full grid for the posterior, by default False

    logscale (boolean, optional) – display in log scale, by default True

    vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100

    zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0

    min_prob (float, optional) – minimum probability to consider when zooming in; the zoom covers the region of parameter space with a probability higher than min_prob, by default 1e-40

    clear_axis (boolean, optional) – clear the axis before plotting the zoomed-in data, by default False

    True_values (dict, optional) – dictionary of true values of the parameters, by default None

    show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

  • points (array, optional) – array of explored points in the parameter space during the optimization, by default None

  • True_values (dict, optional) – dictionary of true values of the parameters, by default None

Returns:

  • Contour plots for each pair of fit parameters

  • list of float, list of float – the mean and the square root of the second central moment (generalized standard deviation for arbitrary probability distribution)

save_old_xy()[source]

Save the old X and y values to self.SaveOldXY2file

single_point(X, y, params, n_jobs=4, base_estimator='GP', n_initial_points=100, show_objective_func=True, kwargs_plot_obj=None, axis_type=[], show_posterior=True, kwargs_posterior=None)[source]

Do a single Gaussian Process Regression on the X,y data

Parameters:
  • X (ndarray) – data array of size (n, m): n = number of data points, m = number of dimensions

  • y (1D-array) – data array of size (n,) to fit

  • params (list) – list of Fitparam() objects

  • base_estimator (str, optional) – base estimator for the Gaussian Process Regression, by default ‘GP’

  • n_initial_points (int, optional) – number of initial points to use for the Gaussian Process Regression, by default 100

  • show_objective_func (bool, optional) – whether to plot the objective function or not, by default True

  • kwargs_plot_obj (dict, optional) – kwargs arguments for the plot_objective_function function, by default None

  • axis_type (list, optional) – list of strings with the type of axis to use for each dimension. Either ‘linear’ or ‘log’, by default []

  • verbose (bool, optional) – whether to display progress and results or not, by default True

  • zscale (str, optional) – scale to use for the z axis of the contour plots. Either ‘linear’ or ‘log’, by default ‘linear’

  • show_plots (bool, optional) – whether to show the plots or not, by default True

boar.core.optimization_botorch module

class boar.core.optimization_botorch.MooBOtorch(params=None, targets=None, parameter_constraints=None, warmstart=None, Path2OldXY=None, SaveOldXY2file=None, res_dir='temp', evaluate_custom=None, parallel=True, verbose=False)[source]

Bases: BoarOptimizer

BoTorchOpti(n_jobs=[4, 4], n_step_points=[5, 10], models=['Sobol', 'GPEI'], obj_type='MSE', loss='linear', threshold=1, model_kwargs_list=None, model_gen_kwargs_list=None, use_CUDA=True, is_MOO=False, use_custom_func=False, suggest_only=False, global_stopping_strategy=None, show_posterior=True, kwargs_posterior=None, verbose=True)[source]

Optimize the model using the Ax/Botorch library. Uses the Expected Hypervolume Improvement (EHVI) algorithm.

Parameters:
  • n_jobs (list, optional) – number of parallel jobs for each step, by default [4,4]

  • n_step_points (list, optional) – number of points to sample for each step, by default [5, 10]

  • models (list, optional) – list of models to use for each step, by default [‘Sobol’,’GPEI’]

  • obj_type (str, optional) – type of objective function to be used, by default ‘MSE’

  • loss (str, optional) – loss function to be used, by default ‘linear’

  • threshold (float, optional) – threshold for the loss function, by default 1

  • model_kwargs_list (list, optional) – list of dictionaries of model kwargs to use for each step, by default None. Can contain: ‘surrogate’: surrogate model to use; ‘botorch_acqf_class’: BoTorch acquisition function class to use.

  • model_gen_kwargs_list (list, optional) – list of dictionaries of model generation kwargs to use for each step, by default None

  • use_CUDA (bool, optional) – whether to use CUDA or not, by default True

  • is_MOO (bool, optional) – whether to use multi-objective optimization or enforce single-objective optimization, by default False

  • use_custom_func (bool, optional) – use a custom evaluation function instead of the default one, this is useful when the same model is used for different targets, by default False

  • suggest_only (bool, optional) – only suggest the next point without evaluating it, by default False

  • global_stopping_strategy (class, optional) – global stopping strategy based on BaseGlobalStoppingStrategy, see https://ax.dev/tutorials/gss.html, by default None

  • show_posterior (bool, optional) – calculate & show posterior distribution, by default True

  • kwargs_posterior (dict) –

    dictionary of keyword arguments for the posterior function, by default None, including:

    Nres (integer, optional) – Sampling resolution. Number of data points per dimension, by default 30

    Ninteg (integer, optional) – Number of points for the marginalization over the other parameters when full_grid = False, by default 100

    full_grid (boolean, optional) – If True, use a full grid for the posterior, by default False

    randomize (boolean, optional) – If True, calculate the posterior for all the dimensions but draw the marginalization points randomly instead of using a coarse grid, by default False

    logscale (boolean, optional) – display in log scale?, by default True

    vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100

    zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0

    min_prob (float, optional) – minimum probability to consider when zooming in; we zoom in on the region of parameter space with a probability higher than min_prob, by default 1e-40.

    clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.

    True_values (dict, optional) – dictionary of true values of the parameters, by default None

    show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

  • verbose (bool, optional) – whether to print the optimization steps or not, by default True

Returns:

the AxClient object

Return type:

AxClient

Raises:

ValueError – if n_jobs, n_step_points and models are not lists of the same length
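The length requirement behind this ValueError can be pictured with a minimal sketch. The helper name check_step_lists is hypothetical and not part of BOAR; it only illustrates that n_jobs, n_step_points and models describe the same sequence of optimization steps, with one entry per step.

```python
def check_step_lists(n_jobs, n_step_points, models):
    """Hypothetical helper: validate the per-step settings and pair them up."""
    for name, value in (("n_jobs", n_jobs),
                        ("n_step_points", n_step_points),
                        ("models", models)):
        if not isinstance(value, list):
            raise ValueError(f"{name} must be a list")
    if not (len(n_jobs) == len(n_step_points) == len(models)):
        raise ValueError(
            "n_jobs, n_step_points and models must be lists of the same length"
        )
    # One (model, number of points, number of parallel jobs) tuple per step.
    return list(zip(models, n_step_points, n_jobs))


steps = check_step_lists([4, 4], [5, 10], ["Sobol", "GPEI"])
```

With the default arguments this yields two steps: a Sobol step with 5 points followed by a GPEI step with 10 points, each using 4 parallel jobs.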

ConvertParams(params)[source]

Convert the params to the format required by the Ax/Botorch library

Parameters:

params (list of Fitparam() objects) – list of Fitparam() objects

Returns:

list of dictionaries with the following keys:

‘name’: string: the name of the parameter

‘type’: string: ‘range’ or ‘fixed’

‘bounds’: list of float: the lower and upper bounds of the parameter

Return type:

list of dict
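As an illustration of the conversion described above, the following sketch builds such dictionaries from simplified parameter objects. ToyParam is a hypothetical stand-in for Fitparam() that carries only the fields used here. Treating relRange == 0 as "fixed" and storing the fixed value under a ‘value’ key (the usual Ax convention) are assumptions, not confirmed details of ConvertParams.

```python
from dataclasses import dataclass, field


@dataclass
class ToyParam:
    """Hypothetical stand-in for Fitparam with only the fields used here."""
    name: str
    val: float = 1.0
    relRange: float = 1.0
    lims: list = field(default_factory=list)


def convert_params(params):
    """Sketch: turn parameter objects into Ax-style parameter dictionaries."""
    converted = []
    for p in params:
        if p.relRange == 0:
            # Assumption: a parameter with relRange == 0 is held fixed.
            converted.append({"name": p.name, "type": "fixed", "value": p.val})
        else:
            converted.append({
                "name": p.name,
                "type": "range",
                "bounds": [float(p.lims[0]), float(p.lims[1])],
            })
    return converted


ax_params = convert_params([
    ToyParam("mu", lims=[1e-4, 1e-2]),
    ToyParam("k", val=2.0, relRange=0),
])
```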

LH_torch(X, beta, N, gpr, fscale=None)[source]

Compute the positive log likelihood from the negative log likelihood. Be careful here! The loss and threshold used here are the ones defined as arguments of optimize_sko_parallel, not the ones defined in the targets, so for the calculation of the MSE to be consistent all targets need to have the same loss and threshold!

Parameters:
  • X (ndarray) – X Data array of size(n,m): n=number of data points, m=number of dimensions

  • beta (float) – 1 / minimum of scaled surrogate function

  • N (integer) – number of data points

  • gpr (regressor) – a trained regressor which has a .predict() method

  • fscale (float) – scaling factor to keep the surrogate function between 0 and 100 (yields best results in BO but here we need the unscaled surrogate function, that is, MSE), Not used here but might be later, default=None

Returns:

the positive log likelihood

Return type:

float
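To make the link between the surrogate (an MSE) and the log likelihood concrete, the sketch below assumes independent Gaussian noise with precision beta, in which case the log likelihood equals -(beta/2) * N * MSE up to an additive constant. This is a textbook relation used for illustration; it is not confirmed to be the exact expression used in LH_torch. ToySurrogate is a hypothetical stand-in for a trained regressor with a .predict() method.

```python
class ToySurrogate:
    """Hypothetical regressor: predicts an MSE that grows away from x = 2."""

    def predict(self, X):
        return [(row[0] - 2.0) ** 2 + 0.5 for row in X]


def positive_log_likelihood(X, beta, N, gpr):
    # Assumed Gaussian-noise relation: log L = -(beta / 2) * N * MSE + const.
    return [-0.5 * beta * N * mse for mse in gpr.predict(X)]


gpr = ToySurrogate()
X = [[1.0], [2.0], [3.0]]
beta = 1.0 / min(gpr.predict(X))  # "1 / minimum of scaled surrogate function"
llh = positive_log_likelihood(X, beta, N=10, gpr=gpr)
```

The point with the smallest predicted MSE (x = 2 in this toy case) receives the largest log likelihood.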

do_grid_posterior(step, fig, axes, gs, lb, ub, pf, beta_scaled, N, gpr, fscale, Nres, logscale, vmin, min_prob=0.01, clear_axis=False, True_values=None, points=None)[source]

Obtain the posterior probability distribution p(w|y) by brute force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation.

Parameters:
  • step (int) – Number of zooming steps

  • fig (matplotlib.figure.Figure) – Figure to plot on

  • axes (list) – List of axes to plot on

  • gs (matplotlib.gridspec.GridSpec) – Gridspec to plot on

  • lb (list) – Lower bounds of the grid

  • ub (list) – Upper bounds of the grid

  • pf (list) – List of parameters

  • N (integer) – number of datasets

  • gpr (scikit-optimize estimator) – trained regressor

  • fscale (float) – scaling factor

  • Nres (integer) – Sampling resolution. Number of data points per dimension.

  • logscale (boolean) – display in log scale?

  • vmin (float) – lower cutoff (in terms of exp(vmin) if logscale==True)

  • zoom (int, optional) – number of times to zoom in, by default 1.

  • min_prob (float, optional) – minimum probability to consider when zooming in; we zoom in on the region of parameter space with a probability higher than min_prob, by default 1e-40.

  • clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.

  • True_values (dict, optional) – dictionary of true values of the parameters, by default None

  • points (array, optional) – array of explored points in the parameter space during the optimization, by default None

Returns:

_description_

Return type:

_type_

evaluate(px, obj_type, loss, threshold=1, is_MOO=False)[source]

Evaluate the target at a given set of parameters

Parameters:
  • px (list) – list of parameters values

  • obj_type (str) – type of objective function to be used, see self.obj_func_metric

  • loss (str) – loss function to be used, see self.lossfunc

  • threshold (float, optional) – threshold for the loss function, by default 1

  • is_MOO (bool, optional) – whether to use multi-objective optimization or enforce single-objective optimization, by default False

Returns:

the model output

Return type:

float

abstract evaluate_custom(px, obj_type, loss, threshold=1, is_MOO=False)[source]

Create a custom evaluation function that can be used with the Ax/Botorch library. It needs to be implemented by the user and should return a dictionary with the following format: {‘metric_name’:metric_value}

Parameters:
  • px (list) – list of parameters values

  • obj_type (str) – type of objective function to be used, see self.obj_func_metric

  • loss (str) – loss function to be used, see self.lossfunc

  • threshold (float, optional) – threshold for the loss function, by default 1

  • is_MOO (bool, optional) – whether to use multi-objective optimization or enforce single-objective optimization, by default False
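A user implementation might look like the following sketch. It is written as a plain function with the same arguments as the method above (in a subclass it would take self as well). The two-parameter linear model, the data and the metric name ‘MSE_toy’ are all hypothetical; only the returned {‘metric_name’: metric_value} shape comes from the description above.

```python
def evaluate_custom(px, obj_type, loss, threshold=1, is_MOO=False):
    """Sketch of a user-supplied evaluation returning {'metric_name': value}."""
    slope, offset = px            # hypothetical two-parameter model
    X = [0.0, 1.0, 2.0]           # hypothetical measurement grid
    y = [1.0, 3.0, 5.0]           # hypothetical measured data
    yf = [slope * x + offset for x in X]
    mse = sum((a - b) ** 2 for a, b in zip(yf, y)) / len(y)
    # obj_type, loss, threshold and is_MOO are accepted for signature
    # compatibility; this toy example always reports a plain MSE.
    return {"MSE_toy": mse}


result = evaluate_custom([2.0, 1.0], obj_type="MSE", loss="linear")
```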

expected_minimum_BOAR(triedX, gpr, n_random_starts=20, random_state=None)[source]

Compute the minimum over the predictions of the last surrogate model.

Note that this is useful only for single-objective optimization and only if the goal is to minimize the surrogate function.

This was adapted from the scikit-optimize package. The original code can be found here: [scikit-optimize](https://scikit-optimize.github.io/stable/index.html) in the file scikit-optimize/skopt/utils.py

Parameters:
  • ax_client (AxClient) – AxClient object.

  • n_random_starts (int, default=20) – Number of points to sample randomly before fitting the surrogate model. If n_random_starts=0, then the initial point is taken as the best point seen so far (usually the last point in the GP model).

  • random_state (int, RandomState instance or None, optional (default=None)) – Set random state to something other than None for reproducible results.

Returns:

  • x (ndarray, shape (n_features,)) – The point which minimizes the surrogate function.

  • fun (float) – The surrogate function value at the minimum.

get_model(estimator='GPEI', use_CUDA=True)[source]

Get the model

Parameters:
  • estimator (str, optional) – Estimator to use. The default is ‘GPEI’.

  • use_CUDA (bool, optional) – Use CUDA. The default is True.

Raises:

ValueError – If the estimator is not implemented yet.

Returns:

  • model (class) – Model class.

  • tkwargs (dict) – Dictionary of keyword arguments for the model.

  • opt (str) – type of optimization either ‘random’, ‘single’ or ‘multi’

makeobjectives(targets, obj_type='MSE', threshold=1000, is_MOO=False)[source]

Convert the targets to the format required by the Ax/Botorch library

Parameters:
  • targets (list of dict) –

    list of dictionaries with the following keys:

    ‘model’: a pointer to a function y = f(X) where X has m dimensions

    ‘data’: dictionary with keys

        ‘X’: ndarray with shape (n,m) where n is the number of evaluations for X

        ‘y’: ndarray with shape (n,)

        ‘X_dimensions’: list of string: the names of the dimensions in X

        ‘X_units’: list of string: the units of the dimensions in X

        ‘y_dimension’: string: the name of the dimension y

        ‘y_unit’: string: the unit of the dimension y

    ‘weight’: float: the weight of the target

    ‘loss’: string: the loss function to be used

    ‘threshold’: float: the threshold for the loss function

  • obj_type (str, optional) – the type of objective function to be used, by default ‘MSE’

  • loss (str, optional) – the loss function to be used, by default ‘linear’

  • threshold (float, optional) – the threshold for the loss function, by default 1000

Returns:

list of Metric() objects

Return type:

list of Metric() objects
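For orientation, a single target with the keys listed above could be assembled as in the sketch below. The model, the data values and the names ‘time’ and ‘signal’ are hypothetical. Nested lists stand in for the ndarrays to keep the example dependency-free, and nesting the dimension and unit entries under ‘data’ is an assumed reading of the key list.

```python
def toy_model(X, params):
    """Hypothetical model y = f(X): a straight line in the first dimension."""
    slope, offset = params
    return [slope * row[0] + offset for row in X]


target = {
    "model": toy_model,
    "data": {
        "X": [[0.0], [1.0], [2.0]],      # shape (n, m) with n = 3, m = 1
        "y": [1.0, 3.0, 5.0],            # shape (n,)
        "X_dimensions": ["time"],
        "X_units": ["s"],
        "y_dimension": "signal",
        "y_unit": "a.u.",
    },
    "weight": 1.0,
    "loss": "linear",
    "threshold": 1000,
}
targets = [target]
```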

marginal_posterior_1D(x_name, pf=None, lb=None, ub=None, fig=None, ax=None, True_values=None, gpr=None, N=None, beta_scaled=None, fscale=None, Nres=None, Ninteg=100000.0, vmin=None, min_prob=None, points=None, logscale=False, show_plot=True, clear_axis=False, xlabel_pos='bottom', ylabel_pos='left', **kwargs)[source]

calculate and plot the marginal posterior probability distribution p(w|y) for parameter x_name by integrating over the other parameters

Parameters:
  • x_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated

  • lb (float, optional) – lower bound of the parameter x_name, if None we use the main boundaries, by default None

  • ub (float, optional) – upper bound of the parameter x_name, if None we use the main boundaries, by default None

  • fig (matplotlib figure, optional) – figure to plot the marginal posterior probability distribution, if None we create a new figure, by default None

  • ax (matplotlib axis, optional) – axis to plot the marginal posterior probability distribution, if None we create a new axis, by default None

  • True_values (dict, optional) – dictionary with the true values of the parameters, by default None

  • gpr (sklearn regressor, optional) – regressor to calculate the likelihood, if None we use the self.gpr, by default None

  • N (int, optional) – number of samples to calculate the likelihood, if None we use the self.N, by default None

  • beta_scaled (float, optional) – scaling factor for the likelihood, if None we use the self.beta_scaled, by default None

  • fscale (float, optional) – scaling factor for the likelihood, if None we use the self.fscale, by default None

  • Nres (int, optional) – number of points to calculate the marginal posterior probability distribution, by default None

  • Ninteg (int, optional) – number of points to marginalize the prob, by default 1e5

  • vmin (float, optional) – minimum value of the marginal posterior probability distribution, only used if logscale = True as for linscale the min probability is 0, by default None

  • min_prob (float, optional) – value used for the cut off probability when zooming in, note that for now this is not in use, by default None

  • points (array, optional) – array with the points to plot the marginal posterior probability distribution, by default None

  • logscale (bool, optional) – if True we plot the marginal posterior probability distribution in log scale, by default False

  • show_plot (bool, optional) – if True we show the plot, by default True

  • clear_axis (bool, optional) – if True we clear the axis, by default False

  • xlabel_pos (str, optional) – position of the xlabel, by default ‘bottom’

  • ylabel_pos (str, optional) – position of the ylabel, by default ‘left’

  • **kwargs (dict, optional) – additional arguments to pass to the plot function, by default None
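The idea of integrating over the other parameters can be sketched on a small two-parameter grid. This is a simplified illustration, not BOAR's implementation: a plain sum over a regular grid stands in for the marginalization over Ninteg points, and the toy log posterior is hypothetical.

```python
import math


def marginal_1d(log_post):
    """Sum a grid of log posterior values over the second axis and normalize.

    log_post[i][j] is the log posterior at grid point (x_i, y_j).
    """
    shift = max(max(row) for row in log_post)   # avoid underflow in exp()
    marg = [sum(math.exp(v - shift) for v in row) for row in log_post]
    total = sum(marg)
    return [m / total for m in marg]


xs = [0.0, 1.0, 2.0]
ys = [-1.0, 0.0, 1.0]
# Hypothetical log posterior peaked at x = 1, y = 0.
log_post = [[-(x - 1.0) ** 2 - y ** 2 for y in ys] for x in xs]
p_x = marginal_1d(log_post)
```

The resulting p_x sums to one and peaks at the middle grid point, as expected for this symmetric toy posterior.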

marginal_posterior_2D(x_name, y_name, pf=None, lb=None, ub=None, fig=None, ax=None, True_values=None, gpr=None, N=None, beta_scaled=None, fscale=None, Nres=None, Ninteg=100000.0, vmin=None, min_prob=None, points=None, logscale=False, show_plot=True, clear_axis=False, xlabel_pos='bottom', ylabel_pos='left', **kwargs)[source]

calculate and plot the marginal posterior probability distribution p(w|y) for parameters x_name and y_name by integrating over the other parameters

Parameters:
  • x_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated on the x-axis

  • y_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated on the y-axis

  • lb (float, optional) – lower bound of the parameter x_name, if None we use the main boundaries, by default None

  • ub (float, optional) – upper bound of the parameter x_name, if None we use the main boundaries, by default None

  • fig (matplotlib figure, optional) – figure to plot the marginal posterior probability distribution, if None we create a new figure, by default None

  • ax (matplotlib axis, optional) – axis to plot the marginal posterior probability distribution, if None we create a new axis, by default None

  • True_values (dict, optional) – dictionary with the true values of the parameters, by default None

  • gpr (sklearn regressor, optional) – regressor to calculate the likelihood, if None we use the self.gpr, by default None

  • N (int, optional) – number of samples to calculate the likelihood, if None we use the self.N, by default None

  • beta_scaled (float, optional) – scaling factor for the likelihood, if None we use the self.beta_scaled, by default None

  • fscale (float, optional) – scaling factor for the likelihood, if None we use the self.fscale, by default None

  • Nres (int, optional) – number of points to calculate the marginal posterior probability distribution, by default None

  • Ninteg (int, optional) – number of points to marginalize the prob, by default 1e5

  • vmin (float, optional) – minimum value of the marginal posterior probability distribution, only used if logscale = True as for linscale the min probability is 0, by default None

  • min_prob (float, optional) – value used for the cut off probability when zooming in, note that for now this is not in use, by default None

  • points (array, optional) – array with the points to plot the marginal posterior probability distribution, by default None

  • logscale (bool, optional) – if True we plot the marginal posterior probability distribution in log scale, by default False

  • show_plot (bool, optional) – if True we show the plot, by default True

  • clear_axis (bool, optional) – if True we clear the axis, by default False

  • xlabel_pos (str, optional) – position of the xlabel, by default ‘bottom’

  • ylabel_pos (str, optional) – position of the ylabel, by default ‘left’

  • **kwargs (dict, optional) –

    additional arguments to pass to the plot function, by default None

    show_points (bool, optional) – if True we show the points, by default True

plot_all_objectives(ax_client, **kwargs)[source]

Plot all objectives

Parameters:
  • ax_client (AxClient() object) – AxClient() object

  • kwargs (dict, optional) –

    keyword arguments for the plot, by default {}

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘objectives’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

Return type:

None

plot_density(ax_client, **kwargs)[source]

Plot the density of the points in the search space

Parameters:
  • ax_client (AxClient() object) – AxClient() object

  • kwargs (dict, optional) –

    keyword arguments for the plot, by default {}

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘objectives’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

Raises:

ValueError – axis_type must be either log or linear

plot_hypervolume(hv_list=None, **kwargs)[source]

Plot the hypervolume trace

Parameters:
  • hv_list (list, optional) – list of hypervolumes, by default None

  • kwargs (dict, optional) –

    keyword arguments for the plot, by default {}

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘objectives’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

    logscale (bool, optional) – use logscale, by default False

posterior(params, lb_main, ub_main, beta_scaled, N, gpr, fscale, kwargs_posterior, points=None)[source]

Obtain the posterior probability distribution p(w|y) by brute force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation.

Parameters:
  • params (list of fitparameter objects) – list of fitparameters

  • lb_main (list of floats) – lower bound for the fitparameters

  • ub_main (list of floats) – upper bound for the fitparameters

  • beta_scaled (float) – 1/minimum of the scaled surrogate function

  • N (integer) – number of datasets

  • gpr (scikit-optimize estimator) – trained regressor

  • fscale (float) – scaling factor

  • kwargs_posterior (dict) –

    dictionary of keyword arguments for the posterior function, including:

    Nres (integer, optional) – Sampling resolution. Number of data points per dimension, by default 30

    Ninteg (integer, optional) – Number of points for the marginalization over the other parameters when full_grid = False, by default 100

    full_grid (boolean, optional) – If True, use a full grid for the posterior, by default False

    randomize (boolean, optional) – If True, calculate the posterior for all the dimensions but draw the marginalization points randomly instead of using a coarse grid, by default False

    logscale (boolean, optional) – display in log scale?, by default True

    vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100

    zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0

    min_prob (float, optional) – minimum probability to consider when zooming in; we zoom in on the region of parameter space with a probability higher than min_prob, by default 1e-40.

    clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.

    True_values (dict, optional) – dictionary of true values of the parameters, by default None

    show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False

    savefig (boolean, optional) – save the figure, by default False

    savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’

    savefig_dir (str, optional) – directory to save the figure, by default self.res_dir

    figext (str, optional) – extension of the figure, by default ‘.png’

    figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)

    figdpi (int, optional) – dpi of the figure, by default 300

  • points (array, optional) – array of explored points in the parameter space during the optimization, by default None

Returns:

  • Contour plots for each pair of fit parameters

  • list of float, list of float – the mean and the square root of the second central moment (generalized standard deviation for arbitrary probability distribution)
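The two returned statistics can be illustrated for one parameter on a one-dimensional grid. The grid and the unnormalized posterior values below are hypothetical, and the sketch does not reproduce how posterior() builds its grid; it only shows the mean and the square root of the second central moment of a gridded distribution.

```python
import math


def mean_and_std(grid, prob):
    """Mean and sqrt of the second central moment of a gridded distribution."""
    total = sum(prob)
    weights = [p / total for p in prob]          # normalize to sum to one
    mean = sum(w * x for w, x in zip(weights, grid))
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, grid))
    return mean, math.sqrt(var)


grid = [0.0, 1.0, 2.0, 3.0, 4.0]
prob = [1.0, 4.0, 6.0, 4.0, 1.0]   # hypothetical unnormalized posterior
mean, std = mean_and_std(grid, prob)
```

Because the definition uses only moments, it applies to skewed or multimodal distributions as well, which is why the text calls it a generalized standard deviation.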

class boar.core.optimization_botorch.SimpleThresholdGlobalStoppingStrategy(min_trials: int, inactive_when_pending_trials: bool = True, threshold: float = 0.1)[source]

Bases: BaseGlobalStoppingStrategy

A GSS that stops when we observe a point better than threshold. Taken from : https://ax.dev/tutorials/gss.html

should_stop_optimization(experiment: Experiment) Tuple[bool, str][source]

Check if the best seen is better than self.threshold.
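The stopping rule can be sketched without Ax as follows. This is not the Ax implementation: the function works on a plain list of observed objective values instead of an Experiment, and the minimize flag is an assumption about what "better than threshold" means for a given objective.

```python
from typing import List, Tuple


def should_stop(objective_values: List[float], threshold: float = 0.1,
                min_trials: int = 5, minimize: bool = True) -> Tuple[bool, str]:
    """Sketch of a threshold-based global stopping rule."""
    if len(objective_values) < min_trials:
        return False, "Not enough completed trials yet."
    best = min(objective_values) if minimize else max(objective_values)
    better = best < threshold if minimize else best > threshold
    if better:
        return True, f"Best value {best} is better than threshold {threshold}."
    return False, ""


stop, message = should_stop([1.0, 0.5, 0.05], min_trials=3)
```

As in the signature above, the function returns a (bool, str) pair: the decision and a short human-readable reason.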

boar.core.optimizer module

class boar.core.optimizer.BoarOptimizer[source]

Bases: object

Provides a default class for the different optimizers in BOAR. This class is not intended to be used directly, but rather to be inherited by the different optimizer classes. It provides the basic functionality for the optimizer classes, such as the different objective functions, the functions to handle the Fitparam() objects, and the functions to handle the different plotting options.

format_func(value, tick_number)[source]

Format function for the x and y axis ticks to be passed to axo[ii,jj].xaxis.set_major_formatter(plt.FuncFormatter(format_func)) to get the logarithmic ticks looking good on the plot

Parameters:
  • value (float) – value to convert

  • tick_number (int) – tick position

Returns:

string representation of the value in scientific notation

Return type:

str

invert_lossfunc(z0, loss, threshold=1000)[source]

Invert the loss function to get the mean squared error values back

Parameters:
  • z0 (1D-array) – data array of size (n,) with the loss function values

  • loss (str) – type of loss function to use, can be [‘linear’,’soft_l1’,’huber’,’cauchy’,’arctan’]

  • threshold (int, optional) – critical value above which loss sets in, by default 1000. Choose it carefully so that the loss function affects only outliers. If the threshold is set too low, then c<z0 even for non-outliers, falsifying the mean squared error and thus the log likelihood, the Hessian and the error bars.

Returns:

array of size (n,) with the mean squared error values

Return type:

1D-array

lossfunc(z0, loss, threshold=1000)[source]

Define the different loss functions that can be used to calculate the objective function value.

Parameters:
  • z0 (1D-array) – data array of size (n,) with the mean squared error values

  • loss (str) – type of loss function to use, can be [‘linear’,’soft_l1’,’huber’,’cauchy’,’arctan’]

  • threshold (int, optional) – critical value above which loss sets in, by default 1000. Choose it carefully so that the loss function affects only outliers. If the threshold is set too low, then c<z0 even for non-outliers, falsifying the mean squared error and thus the log likelihood, the Hessian and the error bars.

Returns:

array of size (n,) with the loss function values

Return type:

1D-array
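How a loss function and its inverse fit together can be sketched for three of the listed loss types. The formulas follow the conventions of scipy.optimize.least_squares (soft_l1: 2*(sqrt(1+z)-1), cauchy: ln(1+z)). Scaling z0 by the threshold is an assumption, and the sketch works on scalars rather than arrays, so it should not be read as BOAR's exact implementation.

```python
import math


def lossfunc(z0, loss, threshold=1000):
    """Sketch: apply a robust loss to a (mean) squared error value z0."""
    z = z0 / threshold                      # assumed scaling by the threshold
    if loss == "linear":
        rho = z
    elif loss == "soft_l1":
        rho = 2.0 * (math.sqrt(1.0 + z) - 1.0)
    elif loss == "cauchy":
        rho = math.log1p(z)
    else:
        raise ValueError(f"loss '{loss}' is not covered by this sketch")
    return threshold * rho


def invert_lossfunc(z0, loss, threshold=1000):
    """Sketch: recover the squared error value from a loss value z0."""
    r = z0 / threshold
    if loss == "linear":
        z = r
    elif loss == "soft_l1":
        z = (1.0 + r / 2.0) ** 2 - 1.0
    elif loss == "cauchy":
        z = math.expm1(r)
    else:
        raise ValueError(f"loss '{loss}' is not covered by this sketch")
    return threshold * z


roundtrip = invert_lossfunc(lossfunc(2500.0, "soft_l1"), "soft_l1")
```

For values well below the threshold the robust losses stay close to the linear one, which is the reason for choosing the threshold so that only outliers are affected.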

obj_func_curvefit(X, *p, params, model)[source]

Objective function in the form expected by scipy.optimize.curve_fit

Parameters:
  • X (ndarray) – X Data array of size(n,m): n=number of data points, m=number of dimensions

  • *p (ndarray) – list of float; values of the fit parameters as supplied by the optimizer

  • params (list) – list of Fitparam() objects

  • model (callable) – Model function yf = f(X) to compare to y

Returns:

array of size (n,) with the model values

Return type:

1D-array

obj_func_metric(target, yf, obj_type='MSE')[source]

Method to calculate the objective function value. Different objective functions can be used, see below.

Parameters:
  • target (target object) – target object with data and weight

  • yf (array) – model output

  • obj_type (str, optional) –

    objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘MAPE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’

    ‘MSE’: mean squared error

    ‘RMSE’: root mean squared error

    ‘MSLE’: mean squared log error

    ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))

    ‘MAE’: mean absolute error

    ‘MAPE’: mean absolute percentage error

    ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)

    ‘larry’: mean squared error legacy version

    ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments

Returns:

objective function value

Return type:

float

Raises:

ValueError – if obj_type is not in [‘MSE’, ‘RMSE’, ‘MSLE’,’nRMSE’,’MAE’,’larry’,’nRMSE_VLC’]
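A few of the listed metrics can be written out directly from their definitions. The sketch below is a simplified stand-in, not BOAR's implementation: it takes plain lists y and yf instead of a target object, ignores the target weight, and covers only four of the metric names.

```python
import math


def metric(y, yf, obj_type="MSE"):
    """Sketch of some objective function values for data y and model output yf."""
    n = len(y)
    mse = sum((a - b) ** 2 for a, b in zip(yf, y)) / n
    if obj_type == "MSE":
        return mse
    if obj_type == "RMSE":
        return math.sqrt(mse)
    if obj_type == "MAE":
        return sum(abs(a - b) for a, b in zip(yf, y)) / n
    if obj_type == "nRMSE":
        # RMSE / (max(y, yf) - min(y, yf)), taken over both data and model.
        span = max(max(y), max(yf)) - min(min(y), min(yf))
        return math.sqrt(mse) / span
    raise ValueError(f"obj_type '{obj_type}' is not covered by this sketch")


y = [1.0, 2.0, 3.0, 4.0]
yf = [1.0, 2.0, 3.0, 8.0]
values = {name: metric(y, yf, name) for name in ("MSE", "RMSE", "MAE", "nRMSE")}
```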

obj_func_scipy(*p, params, targets, obj_type='MSE', loss='linear')[source]

Objective function directly returning the loss function for use with the Bayesian Optimizer

Parameters:
  • '*p'

    the parameters as passed by the Bayesian Optimizer

  • params (list) – list of Fitparam() objects

  • model (callable) – Model function yf = f(X) to compare to y

  • X (ndarray) – X Data array of size(n,m): n=number of data points, m=number of dimensions

  • y (1D-array) – data array of size (n,) to fit

  • obj_type (str, optional) –

    objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’

    ‘MSE’: mean squared error

    ‘RMSE’: root mean squared error

    ‘MSLE’: mean squared log error

    ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))

    ‘MAE’: mean absolute error

    ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)

    ‘larry’: mean squared error legacy version

    ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments

  • loss (str, optional) – type of loss function to use, can be [‘linear’,’soft_l1’,’huber’,’cauchy’,’arctan’], by default ‘linear’

Returns:

array of size (n,) with the loss function values

Return type:

1D-array

optimize_basin_hopping(obj_type='MSE', loss='linear', kwargs=None)[source]

Use basin_hopping to optimize a function y = f(x) where x can be multi-dimensional. See scipy.optimize.basin_hopping documentation for more information.

Parameters:

kwargs (dict, optional) – Keyword arguments for basin_hopping, see scipy.optimize.basin_hopping documentation for more information, by default None. If no kwargs is provided, use: kwargs = {‘niter’: 100, ‘T’: 1.0, ‘stepsize’: 0.5, ‘interval’: 50, ‘minimizer_kwargs’: None, ‘take_step’: None, ‘accept_test’: None, ‘callback’: None, ‘niter_success’: None, ‘seed’: None, ‘disp’: False, ‘niter_success’: 10}

Returns:

Dictionary with the optimized parameters (‘x’) and the corresponding function value (‘fun’).

Return type:

dict

optimize_curvefit(kwargs=None)[source]

Use curve_fit to optimize a function y = f(x) where x can be multi-dimensional. Use this function if the loss function is deterministic. Do not use this function if the loss function has uncertainty (e.g. from numerical simulation); in this case, use optimize_sko.

Parameters:

kwargs (dict, optional) – keyword arguments for curve_fit, see scipy.optimize.curve_fit documentation for more information, by default None. If no kwargs is provided, use: kwargs = {‘ftol’:1e-8, ‘xtol’:1e-6, ‘gtol’: 1e-8, ‘diff_step’:0.001, ‘loss’:’linear’, ‘max_nfev’:5000}

Returns:

dictionary with the optimized parameters (‘popt’) and the corresponding covariance (‘pcov’) and standard deviation (‘std’) values

Return type:

dict

optimize_dual_annealing(obj_type='MSE', loss='linear', kwargs=None)[source]

Use dual_annealing to optimize a function y = f(x) where x can be multi-dimensional. See scipy.optimize.dual_annealing documentation for more information.

Parameters:

kwargs (dict, optional) – Keyword arguments for dual_annealing, see scipy.optimize.dual_annealing documentation for more information, by default None. If no kwargs is provided, use: kwargs = {‘maxiter’: 100, ‘local_search_options’: None, ‘initial_temp’: 5230.0, ‘restart_temp_ratio’: 2e-05, ‘visit’: 2.62, ‘accept’: -5.0, ‘maxfun’: 1000, ‘no_local_search’: False, ‘x0’: None, ‘bounds’: None, ‘args’: ()}

Returns:

Dictionary with the optimized parameters (‘x’) and the corresponding function value (‘fun’).

Return type:

dict

params_r(params)[source]

Prepare the starting guess and bounds for the optimizer, considering the following settings of the Fitparam objects:

optim_type:

    if ‘linear’: abstract the order of magnitude (number between 1 and 10)

    if ‘log’: bring to decadic logarithm (discarding the negative sign, if any!)

lim_type:

    if ‘absolute’: respect user settings in Fitparam.lims

    if ‘relative’: respect user settings in Fitparam.relRange:

        if Fitparam.range_type==’linear’: interpret Fitparam.relRange as factor

        if Fitparam.range_type==’log’: interpret Fitparam.relRange as order of magnitude

relRange: if zero, the Fitparam is not included in the starting guess and bounds (but is still available to the objective function)

Parameters:

params (list) – list of Fitparam() objects

Returns:

set of lists with:

p0 = initial guesses

lb = lower bounds

ub = upper bounds

Return type:

set of lists
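The two optim_type rescalings described above can be sketched as follows. The function name to_optimizer_units and its return convention are hypothetical, and the sketch covers only the rescaling of a single value, not the handling of bounds or of lim_type. A value of zero is not handled.

```python
import math


def to_optimizer_units(val, optim_type="linear"):
    """Sketch of the rescaling described for optim_type (not BOAR's code)."""
    if optim_type == "linear":
        # Split off the order of magnitude so the optimizer sees a number
        # whose absolute value lies between 1 and 10.
        order = 10.0 ** math.floor(math.log10(abs(val)))
        return val / order, order
    if optim_type == "log":
        # Decadic logarithm; a negative sign, if any, is discarded.
        return math.log10(abs(val)), None
    raise ValueError("optim_type must be 'linear' or 'log'")


scaled, order = to_optimizer_units(3.2e-5, "linear")
log_val, _ = to_optimizer_units(-1e-3, "log")
```

Here 3.2e-5 is passed to the optimizer as roughly 3.2 with the factor 1e-5 kept aside, while -1e-3 is passed as -3 in the logarithmic case.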

params_w(x, params, std=[], which='val')[source]

Method to interact with Fitparam objects. Used by all obj_funcs to write the desired parameter from the optimizer, so that the model receives the parameters in the physically correct units. The Fitparam objects are in a nested list at self.include_params

Parameters:
  • x (1D-sequence of floats) – fit parameters as requested by optimizer

  • params (list) – list of Fitparam() objects

  • std (list, optional) – Contains the 95% confidence interval of the parameters, by default []

  • which (str, optional) – ‘val’: x => Fitparam.val; ‘startVal’: x => Fitparam.startVal. Defaults to Fitparam.val (which is used by the obj_funcs to pass to the model function), by default ‘val’

Module contents