boar.core package
Submodules
boar.core.FitParams module
boar.core.funcs module
- boar.core.funcs.callable_name(any_callable: Callable[[...], Any]) → str[source]
Returns the name of a callable object
- Parameters:
any_callable (Callable[..., Any]) – Callable object
- Returns:
Name of the callable object
- Return type:
str
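One way such a lookup can work is sketched below. This is an illustration under stated assumptions, not the boar implementation: `callable_name_sketch` is a hypothetical name, and the fallback rules for `functools.partial` objects and callable instances are assumed.

```python
import functools
from typing import Any, Callable


def callable_name_sketch(any_callable: Callable[..., Any]) -> str:
    """Return a readable name for a callable (illustrative, not boar's code)."""
    # functools.partial objects have no __name__, so use the wrapped function
    if isinstance(any_callable, functools.partial):
        return any_callable.func.__name__
    # plain functions, builtins and classes expose __name__ directly
    if hasattr(any_callable, "__name__"):
        return any_callable.__name__
    # callable instances: fall back to the name of their class
    return type(any_callable).__name__
```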
- boar.core.funcs.gaussian_pulse_norm(t, tpulse, width)[source]
Returns a gaussian pulse
- Parameters:
t (1-D sequence of floats) – time axis (unit: s)
tpulse (float) – center of the pulse (unit: s)
width (float) – width of the pulse (unit: s)
- Returns:
Vector containing the gaussian pulse
- Return type:
1-D sequence of floats
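The normalization convention is not stated in this entry. The sketch below assumes that `width` is the standard deviation and that the pulse is normalized to unit area; both are assumptions, and `gaussian_pulse_sketch` is a hypothetical stand-in rather than the library function.

```python
import numpy as np


def gaussian_pulse_sketch(t, tpulse, width):
    """Unit-area Gaussian centered at tpulse (assumed convention)."""
    t = np.asarray(t, dtype=float)
    # assumption: width is the standard deviation of the Gaussian
    norm = width * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * ((t - tpulse) / width) ** 2) / norm
```

With this convention the integral of the pulse over time is 1, so multiplying by a photon density gives a generation profile with the right total.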
- boar.core.funcs.get_flux_density(P, wl, nu, A, alpha)[source]
From the measured power, repetition rate, and area, get the flux in photons/cm2 and the approximate density in photons/cm3 per pulse
- Parameters:
P (float) – total CW power of pulse in W
wl (float) – excitation wavelength in nm
nu (float) – repetition rate in s-1
A (float) – effective pump area in cm2
alpha (float) – penetration depth in cm
- Returns:
flux (float) – flux in photons per cm2
density (float) – average volume density in photons/cm3
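The conversion can be worked through with the photon energy hc/λ. The following is a sketch of that arithmetic using the units listed above, not the library source; `flux_density_sketch` is a hypothetical name, and the density is assumed to be the flux divided by the penetration depth.

```python
PLANCK_H = 6.62607015e-34  # Planck constant in J s
LIGHT_C = 299792458.0      # speed of light in m/s


def flux_density_sketch(P, wl, nu, A, alpha):
    """Return (photons/cm2, photons/cm3) per pulse under the stated assumptions."""
    energy_per_pulse = P / nu                        # J per pulse
    photon_energy = PLANCK_H * LIGHT_C / (wl * 1e-9)  # J per photon, wl given in nm
    photons_per_pulse = energy_per_pulse / photon_energy
    flux = photons_per_pulse / A                      # photons per cm2
    density = flux / alpha                            # assumed: uniform over depth alpha
    return flux, density
```

For example, 1 mW at 500 nm and a 1 kHz repetition rate on 0.1 cm2 gives roughly 2.5e13 photons/cm2 per pulse.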
- boar.core.funcs.get_unique_X(X, xaxis, X_dimensions)[source]
Get the unique values of the independent variable (X) in the dataset
- Parameters:
X (ndarray) – the experimental dimensions
xaxis (str, optional) – the name of the independent variable
X_dimensions (list, optional) – names of the X columns
- Returns:
X_unique (ndarray) – the unique values of the independent variable
X_dimensions_uni (list) – the names of the columns of X_unique
- Raises:
ValueError – if xaxis is not in X_dimensions
- boar.core.funcs.get_unique_X_and_xaxis_values(X, xaxis, X_dimensions)[source]
Get the values of the independent variable (X) in the dataset for each unique value of the other dimensions
- Parameters:
X (ndarray) – the experimental dimensions
xaxis (str, optional) – the name of the independent variable
X_dimensions (list, optional) – the names of the columns of X
- Returns:
xs – the values of the independent variable for each unique value of the other dimensions
- Return type:
list of ndarrays
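One plausible reading of the two functions above is sketched here: drop the `xaxis` column, take the unique rows of what remains, and collect the `xaxis` values that belong to each unique row. The helper name `split_by_other_dims` and the exact return layout are assumptions made for illustration.

```python
import numpy as np


def split_by_other_dims(X, xaxis, X_dimensions):
    """Group the xaxis values by the unique rows of the remaining columns."""
    X = np.asarray(X)
    if xaxis not in X_dimensions:
        raise ValueError(f"{xaxis!r} is not in X_dimensions")
    idx = X_dimensions.index(xaxis)
    # remaining experimental dimensions once the independent variable is removed
    others = np.delete(X, idx, axis=1)
    other_names = [name for name in X_dimensions if name != xaxis]
    X_unique, inverse = np.unique(others, axis=0, return_inverse=True)
    inverse = np.ravel(inverse)  # the shape of inverse differs between NumPy versions
    # one array of xaxis values per unique combination of the other dimensions
    xs = [X[inverse == k, idx] for k in range(len(X_unique))]
    return X_unique, other_names, xs
```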
- boar.core.funcs.sci_notation(number, sig_fig=2)[source]
Make proper scientific notation for graphs
- Parameters:
number (float) – Number to put in scientific notation.
sig_fig (int, optional) – Number of significant digits, by default 2.
- Returns:
output – String containing the number in scientific notation
- Return type:
str
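The exact output format is not shown in this entry. The sketch below assumes a matplotlib mathtext string of the form `$a \times 10^{b}$` and treats `sig_fig` as the total number of significant digits; `sci_notation_sketch` is a hypothetical name.

```python
def sci_notation_sketch(number, sig_fig=2):
    """Format a number as a mathtext string such as $1.2 \\times 10^{3}$."""
    # the 'e' format keeps one digit before the point, so use sig_fig - 1 decimals
    decimals = max(sig_fig - 1, 0)
    mantissa, exponent = f"{number:.{decimals}e}".split("e")
    # int() strips the sign padding and leading zeros of the exponent ("+03" -> 3)
    return rf"${mantissa} \times 10^{{{int(exponent)}}}$"
```

Such a string can be passed directly to matplotlib labels or annotations, for example `ax.set_title(sci_notation_sketch(1234.5))`.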
boar.core.optimization module
- class boar.core.optimization.MultiObjectiveOptimizer(params=None, targets=None, warmstart=None, Path2OldXY=None, SaveOldXY2file=None, res_dir='temp', parallel=True, verbose=False)[source]
Bases:
BoarOptimizer
- LH(X, beta_scaled, N, gpr, fscale)[source]
Compute the positive log likelihood from the negative log likelihood.
Be careful here! The loss and threshold used here are the ones defined as arguments of optimize_sko_parallel and not the ones defined in the targets, so for the calculation of the MSE to be consistent, we need all targets to have the same loss and threshold!
- Parameters:
X (ndarray) – Data array of size (n,m): n=number of data points, m=number of dimensions
beta_scaled (float) – 1 / minimum of the scaled surrogate function. It is multiplied with fscale to give 1 / the unscaled minimum, which is the MSE of the best fit. If there are no systematic deviations, and if the noise in y is Gaussian, then this should yield the variance of the Gaussian distribution of target values
N (integer) – number of data points
gpr (scikit-optimize estimator) – a trained regressor which has a .predict() method
fscale (float) – scaling factor to keep the surrogate function between 0 and 100 (yields best results in BO but here we need the unscaled surrogate function, that is, MSE)
- Returns:
the likelihood
- Return type:
float
- LLH(X, beta_scaled, N, gpr, fscale)[source]
Return the negative log likelihood -ln(p(t|w,beta)) where t are the measured target values, w is the set of model parameters, and beta is the target uncertainty.
Be careful here! The loss and threshold used here are the ones defined as arguments of the optimize_sko_parallel and not the ones defined in the targets so for the calculation of the MSE to be consistent, we need all targets to have the same loss and threshold!
For reference check: Christopher M. Bishop: Pattern Recognition and Machine Learning, Springer Information Science and statistics 2006, Chapter 1.2.5 pg 29
- Parameters:
X (ndarray) – Data array of size (n,m): n=number of data points, m=number of dimensions
beta_scaled (float) – 1 / minimum of the scaled surrogate function. It is multiplied with fscale to give 1 / the unscaled minimum, which is the MSE of the best fit. If there are no systematic deviations, and if the noise in y is Gaussian, then this should yield the variance of the Gaussian distribution of target values
N (integer) – number of data points
gpr (scikit-optimize estimator) – a trained regressor which has a .predict() method
fscale (float) – scaling factor to keep the surrogate function between 0 and 100 (yields best results in BO but here we need the unscaled surrogate function, that is, MSE)
- Returns:
the negative log likelihood
- Return type:
float
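For Gaussian noise with precision beta, the negative log likelihood in the cited Bishop section can be written in terms of the MSE as (N/2)(beta·MSE − ln beta + ln 2π). The sketch below evaluates that expression for a given MSE; it is a standalone illustration of the formula, not the method above, which obtains the MSE from the surrogate `gpr` and rescales it with `fscale`. The name `neg_log_likelihood` is hypothetical.

```python
import math


def neg_log_likelihood(mse, beta, N):
    """-ln p(t|w,beta) for Gaussian noise, written in terms of the MSE."""
    # sum of squared residuals = N * mse, so
    # -ln p = beta/2 * N * mse - N/2 * ln(beta) + N/2 * ln(2*pi)
    return 0.5 * N * (beta * mse - math.log(beta) + math.log(2.0 * math.pi))
```

The expression is minimized with respect to beta at beta = 1/MSE, which is why the inverse of the best-fit MSE serves as the estimate of the noise precision.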
- cost_from_old_xy(old_xy, targets, fscale, obj_type='MSE', loss='linear', threshold=1000)[source]
Calculate the cost function from old data points
- Parameters:
yfs (1D-array) – array of size (n,) with the model function values from old data points
y (1D-array) – data array of size (n,) to fit
fscale (float) – a scaling factor to keep y between 0 and 100 so the length scales can be compared
weight (int, optional) – weight array of size (n,) to weight the data points, by default 1
obj_type (str, optional) –
objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’
- ‘MSE’: mean squared error
- ‘RMSE’: root mean squared error
- ‘MSLE’: mean squared log error
- ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))
- ‘MAE’: mean absolute error
- ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)
- ‘larry’: mean squared error legacy version
- ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments
loss (str, optional) – type of loss function to use, can be [‘linear’, ‘soft_l1’, ‘huber’, ‘cauchy’, ‘arctan’], by default ‘linear’
threshold (int, optional) – critical value above which loss sets in, by default 1000.
- Returns:
array of size (n,) with the loss function values
- Return type:
1D-array
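The loss names match those of `scipy.optimize.least_squares`, whose documentation defines the functions rho(z) used below. How boar applies `threshold` is not stated here; the sketch assumes the cost is divided by the threshold before rho is applied and multiplied back afterwards, so treat the scaling as an assumption and `robust_loss_sketch` as a hypothetical name.

```python
import numpy as np


def robust_loss_sketch(cost, loss="linear", threshold=1000):
    """Damp non-negative cost values above threshold (assumed scaling)."""
    z = np.asarray(cost, dtype=float) / threshold
    if loss == "linear":
        rho = z
    elif loss == "soft_l1":
        rho = 2.0 * (np.sqrt(1.0 + z) - 1.0)
    elif loss == "huber":
        # identity up to the threshold, square-root growth beyond it
        rho = np.where(z <= 1.0, z, 2.0 * np.sqrt(z) - 1.0)
    elif loss == "cauchy":
        rho = np.log1p(z)
    elif loss == "arctan":
        rho = np.arctan(z)
    else:
        raise ValueError(f"unknown loss {loss!r}")
    return threshold * rho
```

With this scaling, values below the threshold pass through the ‘huber’ loss unchanged, while larger values are compressed.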
- do_grid_posterior(step, fig, axes, gs, lb, ub, pf, beta_scaled, N, gpr, fscale, Nres, logscale, vmin, min_prob=0.01, clear_axis=False, True_values=None, points=None)[source]
Obtain the posterior probability distribution p(w|y) by brute-force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation
- Parameters:
step (int) – Number of zooming steps
fig (matplotlib.figure.Figure) – Figure to plot on
axes (list) – List of axes to plot on
gs (matplotlib.gridspec.GridSpec) – Gridspec to plot on
lb (list) – Lower bounds of the grid
ub (list) – Upper bounds of the grid
pf (list) – List of parameters
N (integer) – number of datasets
gpr (scikit-optimize estimator) – trained regressor
fscale (float) – scaling factor
Nres (integer) – Sampling resolution. Number of data points per dimension.
logscale (boolean) – display in log scale?
vmin (float) – lower cutoff (in terms of exp(vmin) if logscale==True)
zoom (int, optional) – number of times to zoom in, by default 1.
min_prob (float, optional) – minimum probability to consider when zooming in: we zoom in on the region of parameter space with a probability higher than min_prob, by default 0.01.
clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.
True_values (dict, optional) – dictionary of true values of the parameters, by default None
points (array, optional) – array of explored points in the parameter space during the optimization, by default None
- marginal_posterior_1D(x_name, pf=None, lb=None, ub=None, fig=None, ax=None, True_values=None, gpr=None, N=None, beta_scaled=None, fscale=None, Nres=None, Ninteg=100000.0, vmin=None, min_prob=None, points=None, logscale=False, show_plot=True, clear_axis=False, xlabel_pos='bottom', ylabel_pos='left', **kwargs)[source]
calculate and plot the marginal posterior probability distribution p(w|y) for parameter x_name by integrating over the other parameters
- Parameters:
x_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated
lb (float, optional) – lower bound of the parameter x_name, if None we use the main boundaries, by default None
ub (float, optional) – upper bound of the parameter x_name, if None we use the main boundaries, by default None
fig (matplotlib figure, optional) – figure to plot the marginal posterior probability distribution, if None we create a new figure, by default None
ax (matplotlib axis, optional) – axis to plot the marginal posterior probability distribution, if None we create a new axis, by default None
True_values (dict, optional) – dictionary with the true values of the parameters, by default None
gpr (sklearn regressor, optional) – regressor to calculate the likelihood, if None we use the self.gpr, by default None
N (int, optional) – number of samples to calculate the likelihood, if None we use the self.N, by default None
beta_scaled (float, optional) – scaling factor for the likelihood, if None we use the self.beta_scaled, by default None
fscale (float, optional) – scaling factor for the likelihood, if None we use the self.fscale, by default None
Nres (int, optional) – number of points to calculate the marginal posterior probability distribution, by default None
Ninteg (int, optional) – number of points to marginalize the prob, by default 1e5
vmin (float, optional) – minimum value of the marginal posterior probability distribution, only used if logscale = True as for linscale the min probability is 0, by default None
min_prob (float, optional) – value used for the cut-off probability when zooming in, note that this is currently not used, by default None
points (array, optional) – array with the points to plot the marginal posterior probability distribution, by default None
logscale (bool, optional) – if True we plot the marginal posterior probability distribution in log scale, by default False
show_plot (bool, optional) – if True we show the plot, by default True
clear_axis (bool, optional) – if True we clear the axis, by default False
xlabel_pos (str, optional) – position of the xlabel, by default ‘bottom’
ylabel_pos (str, optional) – position of the ylabel, by default ‘left’
'**kwargs' –
additional arguments to pass to the plot function, by default None
- marginal_posterior_2D(x_name, y_name, pf=None, lb=None, ub=None, fig=None, ax=None, True_values=None, gpr=None, N=None, beta_scaled=None, fscale=None, Nres=None, Ninteg=100000.0, vmin=None, min_prob=None, points=None, logscale=False, show_plot=True, clear_axis=False, xlabel_pos='bottom', ylabel_pos='left', **kwargs)[source]
calculate and plot the marginal posterior probability distribution p(w|y) for parameters x_name and y_name by integrating over the other parameters
- Parameters:
x_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated on the x-axis
y_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated on the y-axis
lb (float, optional) – lower bound of the parameter x_name, if None we use the main boundaries, by default None
ub (float, optional) – upper bound of the parameter x_name, if None we use the main boundaries, by default None
fig (matplotlib figure, optional) – figure to plot the marginal posterior probability distribution, if None we create a new figure, by default None
ax (matplotlib axis, optional) – axis to plot the marginal posterior probability distribution, if None we create a new axis, by default None
True_values (dict, optional) – dictionary with the true values of the parameters, by default None
gpr (sklearn regressor, optional) – regressor to calculate the likelihood, if None we use the self.gpr, by default None
N (int, optional) – number of samples to calculate the likelihood, if None we use the self.N, by default None
beta_scaled (float, optional) – scaling factor for the likelihood, if None we use the self.beta_scaled, by default None
fscale (float, optional) – scaling factor for the likelihood, if None we use the self.fscale, by default None
Nres (int, optional) – number of points to calculate the marginal posterior probability distribution, by default None
Ninteg (int, optional) – number of points to marginalize the prob, by default 1e5
vmin (float, optional) – minimum value of the marginal posterior probability distribution, only used if logscale = True as for linscale the min probability is 0, by default None
min_prob (float, optional) – value used for the cut-off probability when zooming in, note that this is currently not used, by default None
points (array, optional) – array with the points to plot the marginal posterior probability distribution, by default None
logscale (bool, optional) – if True we plot the marginal posterior probability distribution in log scale, by default False
show_plot (bool, optional) – if True we show the plot, by default True
clear_axis (bool, optional) – if True we clear the axis, by default False
xlabel_pos (str, optional) – position of the xlabel, by default ‘bottom’
ylabel_pos (str, optional) – position of the ylabel, by default ‘left’
'**kwargs' –
additional arguments to pass to the plot function, by default None
show_points (bool, optional) – if True we show the points, by default True
- obj_func_sko(*p, params, targets, fscale, obj_type='MSE', loss='linear', threshold=1000)[source]
Objective function directly returning the loss function for use with the Bayesian Optimizer
- Parameters:
'*p' –
the parameters as passed by the Bayesian Optimizer
params (list) – list of Fitparam() objects
model (callable) – Model function yf = f(X) to compare to y
X (ndarray) – Data array of size (n,m): n=number of data points, m=number of dimensions
y (1D-array) – data array of size (n,) to fit
fscale (float) – a scaling factor to keep y between 0 and 100 so the length scales can be compared
weight (1D-array, optional) – weight array of size (n,) to weight the data points, by default 1
obj_type (str, optional) –
objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’
- ‘MSE’: mean squared error
- ‘RMSE’: root mean squared error
- ‘MSLE’: mean squared log error
- ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))
- ‘MAE’: mean absolute error
- ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)
- ‘larry’: mean squared error legacy version
- ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments
loss (str, optional) – type of loss function to use, can be [‘linear’, ‘soft_l1’, ‘huber’, ‘cauchy’, ‘arctan’], by default ‘linear’
threshold (int, optional) – critical value above which loss sets in, by default 1000. Select it carefully so that the loss function affects only outliers. If threshold is set too low, the value will be suppressed even for non-outliers, falsifying the mean squared error and thus the log likelihood, the Hessian and the error bars.
- Returns:
array of size (n,) with the loss function values
- Return type:
1D-array
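The simpler objective types listed above can be written out directly from their definitions. The sketch below covers ‘MSE’, ‘RMSE’, ‘MAE’ and ‘nRMSE’ only and leaves out weighting, `fscale` and the robust loss; `objective_sketch` is a hypothetical helper, not the method documented here.

```python
import numpy as np


def objective_sketch(y, yf, obj_type="MSE"):
    """Evaluate a few of the listed objective types for data y and model yf."""
    y = np.asarray(y, dtype=float)
    yf = np.asarray(yf, dtype=float)
    mse = np.mean((yf - y) ** 2)
    if obj_type == "MSE":
        return mse
    if obj_type == "RMSE":
        return np.sqrt(mse)
    if obj_type == "MAE":
        return np.mean(np.abs(yf - y))
    if obj_type == "nRMSE":
        # RMSE/(max(y,yf)-min(y,yf)): normalize by the joint range of data and model
        span = max(y.max(), yf.max()) - min(y.min(), yf.min())
        return np.sqrt(mse) / span
    raise ValueError(f"obj_type {obj_type!r} is not covered by this sketch")
```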
- optimize_sko_parallel(n_jobs=4, n_yscale=20, n_BO=10, n_initial_points=10, n_BO_warmstart=5, n_jobs_init=None, obj_type='MSE', loss='linear', threshold=1000, kwargs=None, base_estimator='GP', show_objective_func=True, kwargs_plot_obj=None, show_posterior=True, kwargs_posterior=None, verbose=True)[source]
Multi-objective optimization of the parameters of the model using the scikit-optimize package
- Parameters:
n_jobs (int, optional) – number of parallel jobs to run, by default 4
n_yscale (int, optional) – number of points used to estimate the scaling factor yscale, by default 20
n_BO (int, optional) – number of points to run in the Bayesian optimization, by default 10
n_initial_points (int, optional) – number of initial points to run, by default 10
n_BO_warmstart (int, optional) – number of points to run in the Bayesian optimization after warmstart, by default 5
n_jobs_init (int, optional) – number of parallel jobs to run for the initial points, by default None. If None, n_jobs is used
obj_type (str, optional) –
objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’
- ‘MSE’: mean squared error
- ‘RMSE’: root mean squared error
- ‘MSLE’: mean squared log error
- ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))
- ‘MAE’: mean absolute error
- ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)
- ‘larry’: mean squared error legacy version
- ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments
loss (str, optional) – type of loss function to use, can be [‘linear’, ‘soft_l1’, ‘huber’, ‘cauchy’, ‘arctan’], by default ‘linear’
threshold (int, optional) – critical value above which loss sets in, by default 1000. Select it carefully so that the loss function affects only outliers. If threshold is set too low, the value will be suppressed even for non-outliers, falsifying the mean squared error and thus the log likelihood, the Hessian and the error bars.
suggest_only (bool, optional) – only suggest the next point without evaluating it, by default False
kwargs (dict, optional) –
dictionary of keyword argument to check for the improvement of the model, by default None including:
- max_loop_no_improvement (int, optional) – maximum number of loops without improvement, by default 10
- check_improvement (bool, optional) – check for improvement, can be either None, ‘relax’, or ‘strict’, by default None. If None, no check is performed. If ‘relax’, the check is performed by checking if abs(fun_new - fun_best)/fun_new > ftol or if norm(dx) < xtol*(xtol + norm(x)). If ‘strict’, the check is performed by checking if abs(fun_new - fun_best)/fun_new > ftol only.
- ftol (float, optional) – monitor the change in the minimum of the objective function value, by default 1e-3
- xtol (float, optional) – monitor the change of the fitting results, by default 1e-3
- initial_point_generator (str, optional) – type of initial point generator, can be [‘random’, ‘sobol’, ‘halton’, ‘hammersly’, ‘lhs’, ‘grid’], by default ‘lhs’
- acq_func (str, optional) – type of acquisition function, can be [‘LCB’, ‘EI’, ‘PI’, ‘gp_hedge’], by default ‘gp_hedge’
- acq_optimizer (str, optional) – type of acquisition optimizer, can be [‘auto’, ‘sampling’, ‘lbfgs’], by default ‘auto’
- acq_func_kwargs (dict, optional) – dictionary of keyword arguments for the acquisition function, by default {}
- acq_optimizer_kwargs (dict, optional) – dictionary of keyword arguments for the acquisition optimizer, by default {}
- switch2exploit (bool, optional) – switch to exploitation after reaching max_loop_no_improvement loops without improvement and reset the counter, by default True
show_objective_func (bool, optional) – plot the objective function, by default True
kwargs_plot_obj (dict, optional) –
dictionary of keyword arguments for plot_objective_function, by default None including:
- zscale (str, optional) – type of scaling to use for the objective function, can be [‘linear’, ‘log’], by default ‘log’
- show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
show_posterior (bool, optional) – calculate & show posterior distribution, by default True
kwargs_posterior (dict) –
dictionary of keyword arguments for posterior function, by default None including:
- Nres (integer, optional) – Sampling resolution. Number of data points per dimension, by default 30
- Ninteg (integer, optional) – Number of points for the marginalization over the other parameters when full_grid = False, by default 100
- full_grid (boolean, optional) – If True, use a full grid for the posterior, by default False
- randomize (boolean, optional) – If True, calculate the posterior for all the dimensions but draw the marginalization points randomly and do not use a coarse grid, by default False
- logscale (boolean, optional) – display in log scale?, by default True
- vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100
- zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0
- min_prob (float, optional) – minimum probability to consider when zooming in: we zoom in on the region of parameter space with a probability higher than min_prob, by default 1e-40.
- clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.
- True_values (dict, optional) – dictionary of true values of the parameters, by default None
- show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
verbose (bool, optional) – display progress and results, by default True
- Returns:
dictionary with the optimized parameters (‘popt’) and the corresponding covariance (‘pcov’) and standard deviation (‘std’) values
- Return type:
dict
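The nested keyword dictionaries can be assembled as plain Python dicts before the call. The keys and defaults below are taken from the parameter list above; the chosen values are illustrative, and `mo` is a hypothetical, already constructed MultiObjectiveOptimizer whose construction is not shown because the expected `params` and `targets` structures are not described in this section.

```python
# Stopping criteria and skopt settings (keys as documented above)
kwargs = {
    "max_loop_no_improvement": 10,
    "check_improvement": "relax",
    "ftol": 1e-3,
    "xtol": 1e-3,
    "initial_point_generator": "lhs",
    "acq_func": "gp_hedge",
    "acq_optimizer": "auto",
    "switch2exploit": True,
}

# Plot options for the objective function and for the posterior
kwargs_plot_obj = {"zscale": "log", "show_points": False, "savefig": False}
kwargs_posterior = {"Nres": 30, "Ninteg": 100, "full_grid": False,
                    "logscale": True, "vmin": 1e-100, "zoom": 0}

call_arguments = dict(
    n_jobs=4, n_yscale=20, n_BO=10, n_initial_points=10,
    obj_type="MSE", loss="linear", threshold=1000,
    kwargs=kwargs, kwargs_plot_obj=kwargs_plot_obj,
    kwargs_posterior=kwargs_posterior, verbose=True,
)

# Hypothetical call, left commented out because `mo` is not defined here:
# result = mo.optimize_sko_parallel(**call_arguments)
# result["popt"], result["pcov"], result["std"]
```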
- plot_objective_function(rrr, r, axis_type, pnames_display, kwargs_plot_obj={})[source]
Plot the objective function as a contour plot using the skopt plot_objective function
- Parameters:
rrr (skopt.optimizer.OptimizeResult) – result of the optimization
pnames_display (list) – list of strings with the display names of the parameters
kwargs_plot_obj (dict, optional) – kwargs for the plot_objective function, by default {}
- posterior(params, lb_main, ub_main, beta_scaled, N, gpr, fscale, kwargs_posterior, points=None)[source]
Obtain the posterior probability distribution p(w|y) by brute-force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation
- Parameters:
params (list of fitparameter objects) – list of fitparameters
lb_main (list of floats) – lower bound for the fitparameters
ub_main (list of floats) – upper bound for the fitparameters
beta_scaled (float) – 1/minimum of the scaled surrogate function
N (integer) – number of datasets
gpr (scikit-optimize estimator) – trained regressor
fscale (float) – scaling factor
kwargs_posterior (dict) –
dictionary of keyword arguments for posterior function including:
- Nres (integer, optional) – Sampling resolution. Number of data points per dimension, by default 30
- Ninteg (integer, optional) – Number of points for the marginalization over the other parameters when full_grid = False, by default 100
- full_grid (boolean, optional) – If True, use a full grid for the posterior, by default False
- randomize (boolean, optional) – If True, calculate the posterior for all the dimensions but draw the marginalization points randomly and do not use a coarse grid, by default False
- logscale (boolean, optional) – display in log scale?, by default True
- vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100
- zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0
- min_prob (float, optional) – minimum probability to consider when zooming in: we zoom in on the region of parameter space with a probability higher than min_prob, by default 1e-40.
- clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.
- True_values (dict, optional) – dictionary of true values of the parameters, by default None
- show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
points (array, optional) – array of explored points in the parameter space during the optimization, by default None
- Returns:
Contour plots for each pair of fit parameters
list of float, list of float – the mean and the square root of the second central moment (generalized standard deviation for arbitrary probability distribution)
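The two returned statistics, the mean and the square root of the second central moment, can be computed from any probability sampled on a grid. The sketch below does this for a single parameter on a 1-D grid; it illustrates the definition only and is not the gridding code of this method. `grid_moments` is a hypothetical name.

```python
import numpy as np


def grid_moments(w, p):
    """Mean and sqrt of the second central moment of p sampled on grid w."""
    w = np.asarray(w, dtype=float)
    p = np.asarray(p, dtype=float)
    p = p / p.sum()                       # normalize so the weights sum to 1
    mean = np.sum(w * p)
    second_central = np.sum((w - mean) ** 2 * p)
    return mean, np.sqrt(second_central)
```

For a Gaussian-shaped posterior this reproduces the usual standard deviation; for skewed or multimodal posteriors it remains defined, which is why the text above calls it a generalized standard deviation.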
- randomize_grid_posterior(params, lb_main, ub_main, beta_scaled, N, gpr, fscale, kwargs_posterior, points=None, True_values=None)[source]
Obtain the posterior probability distribution p(w|y) by brute-force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation
- Parameters:
params (list of fitparameter objects) – list of fitparameters
lb_main (list of floats) – lower bound for the fitparameters
ub_main (list of floats) – upper bound for the fitparameters
beta_scaled (float) – 1/minimum of the scaled surrogate function
N (integer) – number of datasets
gpr (scikit-optimize estimator) – trained regressor
fscale (float) – scaling factor
kwargs_posterior (dict) –
dictionary of keyword arguments for posterior function including:
- Nres (integer, optional) – Sampling resolution. Number of data points per dimension, by default 30
- Ninteg (integer, optional) – Number of points for the marginalization over the other parameters when full_grid = False, by default 100
- full_grid (boolean, optional) – If True, use a full grid for the posterior, by default False
- logscale (boolean, optional) – display in log scale?, by default True
- vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100
- zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0
- min_prob (float, optional) – minimum probability to consider when zooming in: we zoom in on the region of parameter space with a probability higher than min_prob, by default 1e-40.
- clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.
- True_values (dict, optional) – dictionary of true values of the parameters, by default None
- show_points (boolean, optional) – show the explored points in the parameter space during the optimization, by default False
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
points (array, optional) – array of explored points in the parameter space during the optimization, by default None
True_values (dict, optional) – dictionary of true values of the parameters, by default None
- Returns:
Contour plots for each pair of fit parameters
list of float, list of float – the mean and the square root of the second central moment (generalized standard deviation for arbitrary probability distribution)
- single_point(X, y, params, n_jobs=4, base_estimator='GP', n_initial_points=100, show_objective_func=True, kwargs_plot_obj=None, axis_type=[], show_posterior=True, kwargs_posterior=None)[source]
Do a single Gaussian Process Regression on the X,y data
- Parameters:
X (ndarray) – Data array of size (n,m): n=number of data points, m=number of dimensions
y (1D-array) – data array of size (n,) to fit
params (list) – list of Fitparam() objects
base_estimator (str, optional) – base estimator for the Gaussian Process Regression, by default ‘GP’
n_initial_points (int, optional) – number of initial points to use for the Gaussian Process Regression, by default 100
show_objective_func (bool, optional) – whether to plot the objective function or not, by default True
kwargs_plot_obj (dict, optional) – kwargs arguments for the plot_objective_function function, by default None
axis_type (list, optional) – list of strings with the type of axis to use for each dimension. Either ‘linear’ or ‘log’, by default []
verbose (bool, optional) – whether to display progress and results or not, by default True
zscale (str, optional) – scale to use for the z axis of the contour plots. Either ‘linear’ or ‘log’, by default ‘linear’
show_plots (bool, optional) – whether to show the plots or not, by default True
boar.core.optimization_botorch module
- class boar.core.optimization_botorch.MooBOtorch(params=None, targets=None, parameter_constraints=None, warmstart=None, Path2OldXY=None, SaveOldXY2file=None, res_dir='temp', evaluate_custom=None, parallel=True, verbose=False)[source]
Bases:
BoarOptimizer
- BoTorchOpti(n_jobs=[4, 4], n_step_points=[5, 10], models=['Sobol', 'GPEI'], obj_type='MSE', loss='linear', threshold=1, model_kwargs_list=None, model_gen_kwargs_list=None, use_CUDA=True, is_MOO=False, use_custom_func=False, suggest_only=False, global_stopping_strategy=None, show_posterior=True, kwargs_posterior=None, verbose=True)[source]
Optimize the model using the Ax/BoTorch library. Uses the Expected Hypervolume Improvement (EHVI) algorithm
- Parameters:
n_jobs (list, optional) – number of parallel jobs for each step, by default [4,4]
n_step_points (list, optional) – number of points to sample for each step, by default [5, 10]
models (list, optional) – list of models to use for each step, by default [‘Sobol’,’GPEI’]
obj_type (str, optional) – type of objective function to be used, by default ‘MSE’
loss (str, optional) – loss function to be used, by default ‘linear’
threshold (float, optional) – threshold for the loss function, by default 1
model_kwargs_list (list, optional) – list of dictionaries of model kwargs to use for each step, by default None. Can contain: ‘surrogate’: surrogate model to use; ‘botorch_acqf_class’: BoTorch acquisition function class to use.
model_gen_kwargs_list (list, optional) – list of dictionaries of model generation kwargs to use for each step, by default None
use_CUDA (bool, optional) – whether to use CUDA or not, by default True
is_MOO (bool, optional) – whether to use multi-objective optimization or enforce single-objective optimization, by default False
use_custom_func (bool, optional) – use a custom evaluation function instead of the default one, this is useful when the same model is used for different targets, by default False
suggest_only (bool, optional) – only suggest the next point without evaluating it, by default False
global_stopping_strategy (class, optional) – global stopping strategy based on BaseGlobalStoppingStrategy, see https://ax.dev/tutorials/gss.html, by default None
show_posterior (bool, optional) – calculate & show posterior distribution, by default True
kwargs_posterior (dict) –
dictionary of keyword arguments for the posterior function, by default None, including:
- Nres (integer, optional) – sampling resolution, i.e. the number of data points per dimension, by default 30
- Ninteg (integer, optional) – number of points for the marginalization over the other parameters when full_grid = False, by default 100
- full_grid (boolean, optional) – if True, use a full grid for the posterior, by default False
- randomize (boolean, optional) – if True, calculate the posterior for all the dimensions but draw the marginalization points randomly instead of using a coarse grid, by default False
- logscale (boolean, optional) – display in log scale, by default True
- vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100
- zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0
- min_prob (float, optional) – minimum probability to consider when zooming in; we zoom in on the region of parameter space with a probability higher than min_prob, by default 1e-40
- clear_axis (boolean, optional) – clear the axis before plotting the zoomed-in data, by default False
- True_values (dict, optional) – dictionary of true values of the parameters, by default None
- show_points (boolean, optional) – show the points explored in the parameter space during the optimization, by default False
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
verbose (bool, optional) – whether to print the optimization steps or not, by default True
- Returns:
the AxClient object
- Return type:
AxClient
- Raises:
ValueError – if n_jobs, n_step_points and models are not lists of the same length
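The three step lists are read index by index, so each optimization step takes its model, its number of points, and its number of parallel jobs from the same position. A minimal sketch of the length check behind the ValueError follows; the helper name `check_step_lists` is hypothetical and not part of BOAR.

```python
def check_step_lists(n_jobs, n_step_points, models):
    """Pair up the per-step settings, or raise if the lists disagree in length.

    Hypothetical helper for illustration only; not part of the BOAR API.
    """
    if not (len(n_jobs) == len(n_step_points) == len(models)):
        raise ValueError(
            "n_jobs, n_step_points and models must be lists of the same length"
        )
    # One (model, points, jobs) tuple per optimization step.
    return list(zip(models, n_step_points, n_jobs))


# The documented defaults describe two steps: Sobol sampling, then GPEI.
steps = check_step_lists([4, 4], [5, 10], ["Sobol", "GPEI"])
```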
- ConvertParams(params)[source]
Convert the params to the format required by the Ax/Botorch library
- Parameters:
params (list of Fitparam() objects) – list of Fitparam() objects
- Returns:
- list of dictionaries with the following keys:
‘name’: string: the name of the parameter
‘type’: string: ‘range’ or ‘fixed’
‘bounds’: list of float: the lower and upper bounds of the parameter
- Return type:
list of dict
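As an illustration of the returned format, the sketch below builds such dictionaries from a stand-in parameter class. `DemoParam` and its attribute names are hypothetical and do not reproduce the real Fitparam() class. The ‘value’ key for fixed parameters follows the Ax convention and is an assumption here, since only ‘name’, ‘type’ and ‘bounds’ are listed above.

```python
from dataclasses import dataclass


@dataclass
class DemoParam:
    """Hypothetical stand-in for a Fitparam() object (illustration only)."""
    name: str
    val: float
    lims: tuple
    fixed: bool = False


def convert_params_sketch(params):
    """Build one Ax-style parameter dictionary per parameter."""
    converted = []
    for p in params:
        if p.fixed:
            # Assumption: fixed parameters carry a single value instead of bounds.
            converted.append({"name": p.name, "type": "fixed", "value": p.val})
        else:
            converted.append(
                {
                    "name": p.name,
                    "type": "range",
                    "bounds": [float(p.lims[0]), float(p.lims[1])],
                }
            )
    return converted


ax_params = convert_params_sketch(
    [
        DemoParam("mobility", 1e-3, (1e-4, 1e-2)),
        DemoParam("thickness", 100.0, (50.0, 200.0), fixed=True),
    ]
)
```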
- LH_torch(X, beta, N, gpr, fscale=None)[source]
Compute the positive log likelihood from the negative log likelihood. Be careful here! The loss and threshold used here are the ones defined as arguments of optimize_sko_parallel, not the ones defined in the targets, so for the calculation of the MSE to be consistent all targets need to have the same loss and threshold.
- Parameters:
X (ndarray) – X Data array of size(n,m): n=number of data points, m=number of dimensions
beta (float) – 1 / minimum of scaled surrogate function
N (integer) – number of data points
gpr (regressor) – a trained regressor which has a .predict() method
fscale (float) – scaling factor to keep the surrogate function between 0 and 100 (yields best results in BO but here we need the unscaled surrogate function, that is, MSE), Not used here but might be later, default=None
- Returns:
the positive log likelihood
- Return type:
float
- do_grid_posterior(step, fig, axes, gs, lb, ub, pf, beta_scaled, N, gpr, fscale, Nres, logscale, vmin, min_prob=0.01, clear_axis=False, True_values=None, points=None)[source]
Obtain the posterior probability distribution p(w|y) by brute-force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation.
- Parameters:
step (int) – Number of zooming steps
fig (matplotlib.figure.Figure) – Figure to plot on
axes (list) – List of axes to plot on
gs (matplotlib.gridspec.GridSpec) – Gridspec to plot on
lb (list) – Lower bounds of the grid
ub (list) – Upper bounds of the grid
pf (list) – List of parameters
N (integer) – number of datasets
gpr (scikit-optimize estimator) – trained regressor
fscale (float) – scaling factor
Nres (integer) – Sampling resolution. Number of data points per dimension.
logscale (boolean) – display in log scale?
vmin (float) – lower cutoff (in terms of exp(vmin) if logscale==True)
zoom (int, optional) – number of times to zoom in, by default 1.
min_prob (float, optional) – minimum probability to consider when zooming in; we zoom in on the region of parameter space with a probability higher than min_prob, by default 0.01.
clear_axis (boolean, optional) – clear the axis before plotting the zoomed in data, by default False.
True_values (dict, optional) – dictionary of true values of the parameters, by default None
points (array, optional) – array of explored points in the parameter space during the optimization, by default None
- Returns:
_description_
- Return type:
_type_
- evaluate(px, obj_type, loss, threshold=1, is_MOO=False)[source]
Evaluate the target at a given set of parameters
- Parameters:
px (list) – list of parameters values
obj_type (str) – type of objective function to be used, see self.obj_func_metric
loss (str) – loss function to be used, see self.lossfunc
threshold (float, optional) – threshold for the loss function, by default 1
is_MOO (bool, optional) – whether to use multi-objective optimization or enforce single-objective optimization, by default False
- Returns:
the model output
- Return type:
float
- abstract evaluate_custom(px, obj_type, loss, threshold=1, is_MOO=False)[source]
Create a custom evaluation function that can be used with the Ax/BoTorch library. It needs to be implemented by the user and should return a dictionary with the following format: {‘metric_name’: metric_value}
- Parameters:
px (list) – list of parameters values
obj_type (str) – type of objective function to be used, see self.obj_func_metric
loss (str) – loss function to be used, see self.lossfunc
threshold (float, optional) – threshold for the loss function, by default 1
is_MOO (bool, optional) – whether to use multi-objective optimization or enforce single-objective optimization, by default False
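A minimal sketch of what a user-supplied evaluation could look like, written as a plain function rather than a method so that it runs on its own. The data, the straight-line model, and the metric name ‘MSE_line’ are all hypothetical. The remaining arguments are accepted only to mirror the documented signature and are not used here.

```python
def evaluate_custom_sketch(px, obj_type="MSE", loss="linear", threshold=1, is_MOO=False):
    """Return one entry per metric in the {'metric_name': metric_value} format."""
    # Hypothetical data set and model: a straight line y = a*x + b.
    x = [0.0, 1.0, 2.0, 3.0]
    y = [1.0, 3.0, 5.0, 7.0]
    a, b = px
    yf = [a * xi + b for xi in x]
    # Mean squared error between the data and the model output.
    mse = sum((yi - yfi) ** 2 for yi, yfi in zip(y, yf)) / len(y)
    return {"MSE_line": mse}


perfect = evaluate_custom_sketch([2.0, 1.0])
shifted = evaluate_custom_sketch([2.0, 0.0])
```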
- expected_minimum_BOAR(triedX, gpr, n_random_starts=20, random_state=None)[source]
Compute the minimum over the predictions of the last surrogate model.
Note that this is useful only for single-objective optimization, and only if the goal is to minimize the surrogate function.
This was adapted from the scikit-optimize package. The original code can be found here: [scikit-optimize](https://scikit-optimize.github.io/stable/index.html) in the file scikit-optimize/skopt/utils.py
- Parameters:
ax_client (AxClient) – AxClient object.
n_random_starts (int, default=20) – Number of points to sample randomly before fitting the surrogate model. If n_random_starts=0, then the initial point is taken as the best point seen so far (usually the last point in the GP model).
random_state (int, RandomState instance or None, optional (default=None)) – Set random state to something other than None for reproducible results.
- Returns:
x (ndarray, shape (n_features,)) – The point which minimizes the surrogate function.
fun (float) – The surrogate function value at the minimum.
- get_model(estimator='GPEI', use_CUDA=True)[source]
Get the model
- Parameters:
estimator (str, optional) – Estimator to use. The default is ‘GPEI’.
use_CUDA (bool, optional) – Use CUDA. The default is True.
- Raises:
ValueError – If the estimator is not implemented yet.
- Returns:
model (class) – Model class.
tkwargs (dict) – Dictionary of keyword arguments for the model.
opt (str) – type of optimization either ‘random’, ‘single’ or ‘multi’
- makeobjectives(targets, obj_type='MSE', threshold=1000, is_MOO=False)[source]
Convert the targets to the format required by the Ax/Botorch library
- Parameters:
targets (list of dict) –
list of dictionaries with the following keys:
- ‘model’: a pointer to a function y = f(X) where X has m dimensions
- ‘data’: dictionary with keys
  - ‘X’: ndarray with shape (n,m) where n is the number of evaluations for X
  - ‘y’: ndarray with shape (n,)
  - ‘X_dimensions’: list of string: the names of the dimensions in X
  - ‘X_units’: list of string: the units of the dimensions in X
  - ‘y_dimension’: string: the name of the dimension y
  - ‘y_unit’: string: the unit of the dimension y
- ‘weight’: float: the weight of the target
- ‘loss’: string: the loss function to be used
- ‘threshold’: float: the threshold for the loss function
obj_type (str, optional) – the type of objective function to be used, by default ‘MSE’
loss (str, optional) – the loss function to be used, by default ‘linear’
threshold (float, optional) – the threshold for the loss function, by default 1000
- Returns:
list of Metric() objects
- Return type:
list of Metric() objects
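For orientation, a target dictionary with the keys listed above might be assembled as in the sketch below. The model, the values, and the units are hypothetical, plain lists stand in for the ndarrays, and the placement of the dimension and unit entries inside ‘data’ is an assumption.

```python
def line_model(X, a=2.0, b=1.0):
    """Hypothetical model y = f(X); each row of X holds one x value."""
    return [a * row[0] + b for row in X]


X = [[0.0], [1.0], [2.0]]  # shape (n, m) with n = 3 and m = 1

target = {
    "model": line_model,
    "data": {
        "X": X,
        "y": line_model(X),  # synthetic data generated from the model itself
        "X_dimensions": ["x"],
        "X_units": ["s"],
        "y_dimension": "signal",
        "y_unit": "a.u.",
    },
    "weight": 1.0,
    "loss": "linear",
    "threshold": 1000,
}
targets = [target]
```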
- marginal_posterior_1D(x_name, pf=None, lb=None, ub=None, fig=None, ax=None, True_values=None, gpr=None, N=None, beta_scaled=None, fscale=None, Nres=None, Ninteg=100000.0, vmin=None, min_prob=None, points=None, logscale=False, show_plot=True, clear_axis=False, xlabel_pos='bottom', ylabel_pos='left', **kwargs)[source]
calculate and plot the marginal posterior probability distribution p(w|y) for parameter x_name by integrating over the other parameters
- Parameters:
x_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated
lb (float, optional) – lower bound of the parameter x_name, if None we use the main boundaries, by default None
ub (float, optional) – upper bound of the parameter x_name, if None we use the main boundaries, by default None
fig (matplotlib figure, optional) – figure to plot the marginal posterior probability distribution, if None we create a new figure, by default None
ax (matplotlib axis, optional) – axis to plot the marginal posterior probability distribution, if None we create a new axis, by default None
True_values (dict, optional) – dictionary with the true values of the parameters, by default None
gpr (sklearn regressor, optional) – regressor to calculate the likelihood, if None we use the self.gpr, by default None
N (int, optional) – number of samples to calculate the likelihood, if None we use the self.N, by default None
beta_scaled (float, optional) – scaling factor for the likelihood, if None we use the self.beta_scaled, by default None
fscale (float, optional) – scaling factor for the likelihood, if None we use the self.fscale, by default None
Nres (int, optional) – number of points to calculate the marginal posterior probability distribution, by default None
Ninteg (int, optional) – number of points to marginalize the prob, by default 1e5
vmin (float, optional) – minimum value of the marginal posterior probability distribution, only used if logscale = True as for linscale the min probability is 0, by default None
min_prob (float, optional) – value used for the cut-off probability when zooming in; note that for now this is not in use, by default None
points (array, optional) – array with the points to plot the marginal posterior probability distribution, by default None
logscale (bool, optional) – if True we plot the marginal posterior probability distribution in log scale, by default False
show_plot (bool, optional) – if True we show the plot, by default True
clear_axis (bool, optional) – if True we clear the axis, by default False
xlabel_pos (str, optional) – position of the xlabel, by default ‘bottom’
ylabel_pos (str, optional) – position of the ylabel, by default ‘left’
**kwargs (dict, optional) – additional arguments to pass to the plot function, by default None
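The marginalization itself can be pictured as averaging the likelihood over randomly drawn values of the other parameters, with Ninteg controlling the number of draws. The sketch below does this for a single other parameter with uniform sampling. It is a simplified illustration and not the BOAR implementation; `log_like` is a hypothetical callable standing in for the surrogate-based log likelihood.

```python
import math
import random


def marginal_posterior_1d_sketch(log_like, x_grid, other_bounds, n_integ=2000, seed=0):
    """Marginalize exp(log_like(x, w)) over w by uniform random sampling."""
    rng = random.Random(seed)
    lo, hi = other_bounds
    # Reuse the same w samples for every x so the curve stays smooth.
    w_samples = [rng.uniform(lo, hi) for _ in range(n_integ)]
    marginal = [
        sum(math.exp(log_like(x, w)) for w in w_samples) / n_integ for x in x_grid
    ]
    # Normalize so the values sum to one over the grid.
    total = sum(marginal)
    return [m / total for m in marginal]


def demo_log_like(x, w):
    """Hypothetical log likelihood: independent Gaussians centred at x=1, w=-2."""
    return -0.5 * ((x - 1.0) / 0.3) ** 2 - 0.5 * ((w + 2.0) / 0.5) ** 2


x_grid = [i * 0.1 for i in range(21)]  # 0.0 to 2.0 in steps of 0.1
p_x = marginal_posterior_1d_sketch(demo_log_like, x_grid, (-4.0, 0.0))
x_best = x_grid[p_x.index(max(p_x))]  # grid point with the highest marginal probability
```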
- marginal_posterior_2D(x_name, y_name, pf=None, lb=None, ub=None, fig=None, ax=None, True_values=None, gpr=None, N=None, beta_scaled=None, fscale=None, Nres=None, Ninteg=100000.0, vmin=None, min_prob=None, points=None, logscale=False, show_plot=True, clear_axis=False, xlabel_pos='bottom', ylabel_pos='left', **kwargs)[source]
calculate and plot the marginal posterior probability distribution p(w|y) for the parameters x_name and y_name by integrating over the other parameters
- Parameters:
x_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated on the x-axis
y_name (str) – name of the parameter for which the marginal posterior probability distribution is calculated on the y-axis
lb (float, optional) – lower bound of the parameter x_name, if None we use the main boundaries, by default None
ub (float, optional) – upper bound of the parameter x_name, if None we use the main boundaries, by default None
fig (matplotlib figure, optional) – figure to plot the marginal posterior probability distribution, if None we create a new figure, by default None
ax (matplotlib axis, optional) – axis to plot the marginal posterior probability distribution, if None we create a new axis, by default None
True_values (dict, optional) – dictionary with the true values of the parameters, by default None
gpr (sklearn regressor, optional) – regressor to calculate the likelihood, if None we use the self.gpr, by default None
N (int, optional) – number of samples to calculate the likelihood, if None we use the self.N, by default None
beta_scaled (float, optional) – scaling factor for the likelihood, if None we use the self.beta_scaled, by default None
fscale (float, optional) – scaling factor for the likelihood, if None we use the self.fscale, by default None
Nres (int, optional) – number of points to calculate the marginal posterior probability distribution, by default None
Ninteg (int, optional) – number of points to marginalize the prob, by default 1e5
vmin (float, optional) – minimum value of the marginal posterior probability distribution, only used if logscale = True as for linscale the min probability is 0, by default None
min_prob (float, optional) – value used for the cut-off probability when zooming in; note that for now this is not in use, by default None
points (array, optional) – array with the points to plot the marginal posterior probability distribution, by default None
logscale (bool, optional) – if True we plot the marginal posterior probability distribution in log scale, by default False
show_plot (bool, optional) – if True we show the plot, by default True
clear_axis (bool, optional) – if True we clear the axis, by default False
xlabel_pos (str, optional) – position of the xlabel, by default ‘bottom’
ylabel_pos (str, optional) – position of the ylabel, by default ‘left’
**kwargs (dict, optional) –
additional arguments to pass to the plot function, by default None, including:
- show_points (bool, optional) – if True we show the points, by default True
- plot_all_objectives(ax_client, **kwargs)[source]
Plot all objectives
- Parameters:
ax_client (AxClient() object) – AxClient() object
kwargs (dict, optional) –
keyword arguments for the plot, by default {}, including:
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘objectives’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
- Return type:
None
- plot_density(ax_client, **kwargs)[source]
Plot the density of the points in the search space
- Parameters:
ax_client (AxClient() object) – AxClient() object
kwargs (dict, optional) –
keyword arguments for the plot, by default {}, including:
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘objectives’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
- Raises:
ValueError – axis_type must be either log or linear
- plot_hypervolume(hv_list=None, **kwargs)[source]
Plot the hypervolume trace
- Parameters:
hv_list (list, optional) – list of hypervolumes, by default None
kwargs (dict, optional) –
keyword arguments for the plot, by default {}, including:
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘objectives’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
- logscale (bool, optional) – use logscale, by default False
- posterior(params, lb_main, ub_main, beta_scaled, N, gpr, fscale, kwargs_posterior, points=None)[source]
Obtain the posterior probability distribution p(w|y) by brute-force gridding, where w is the set of model parameters and y is the data. For each fit parameter, return the mean and standard deviation.
- Parameters:
params (list of fitparameter objects) – list of fitparameters
lb_main (list of floats) – lower bound for the fitparameters
ub_main (list of floats) – upper bound for the fitparameters
beta_scaled (float) – 1/minimum of the scaled surrogate function
N (integer) – number of datasets
gpr (scikit-optimize estimator) – trained regressor
fscale (float) – scaling factor
kwargs_posterior (dict) –
dictionary of keyword arguments for the posterior function, including:
- Nres (integer, optional) – sampling resolution, i.e. the number of data points per dimension, by default 30
- Ninteg (integer, optional) – number of points for the marginalization over the other parameters when full_grid = False, by default 100
- full_grid (boolean, optional) – if True, use a full grid for the posterior, by default False
- randomize (boolean, optional) – if True, calculate the posterior for all the dimensions but draw the marginalization points randomly instead of using a coarse grid, by default False
- logscale (boolean, optional) – display in log scale, by default True
- vmin (float, optional) – lower cutoff (in terms of exp(vmin) if logscale==True), by default 1e-100
- zoom (int, optional) – number of times to zoom in, only used if full_grid = True, by default 0
- min_prob (float, optional) – minimum probability to consider when zooming in; we zoom in on the region of parameter space with a probability higher than min_prob, by default 1e-40
- clear_axis (boolean, optional) – clear the axis before plotting the zoomed-in data, by default False
- True_values (dict, optional) – dictionary of true values of the parameters, by default None
- show_points (boolean, optional) – show the points explored in the parameter space during the optimization, by default False
- savefig (boolean, optional) – save the figure, by default False
- savefig_name (str, optional) – name of the file to save the figure, by default ‘posterior.png’
- savefig_dir (str, optional) – directory to save the figure, by default self.res_dir
- figext (str, optional) – extension of the figure, by default ‘.png’
- figsize (tuple, optional) – size of the figure, by default (5*nb_params,5*nb_params)
- figdpi (int, optional) – dpi of the figure, by default 300
points (array, optional) – array of explored points in the parameter space during the optimization, by default None
- Returns:
Contour plots for each pair of fit parameters
list of float, list of float – the mean and the square root of the second central moment (generalized standard deviation for arbitrary probability distribution)
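The returned mean and generalized standard deviation can be illustrated on a one-dimensional grid. The sketch below assumes a hypothetical log likelihood callable and a uniform prior over [lb, ub]. It shows only the moment calculation and omits the zooming, plotting, and multi-parameter handling of the real method.

```python
import math


def grid_posterior_stats(log_like, lb, ub, n_res=30):
    """Mean and square root of the second central moment of a 1-D gridded posterior."""
    step = (ub - lb) / (n_res - 1)
    grid = [lb + i * step for i in range(n_res)]
    log_p = [log_like(w) for w in grid]
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    log_p_max = max(log_p)
    p = [math.exp(v - log_p_max) for v in log_p]
    # Normalize the probabilities over the grid.
    norm = sum(p)
    p = [v / norm for v in p]
    mean = sum(w * pw for w, pw in zip(grid, p))
    var = sum((w - mean) ** 2 * pw for w, pw in zip(grid, p))
    return mean, math.sqrt(var)


def gaussian_log_like(w):
    """Hypothetical log likelihood: Gaussian centred at 2.0 with width 0.5."""
    return -0.5 * ((w - 2.0) / 0.5) ** 2


mean, std = grid_posterior_stats(gaussian_log_like, 0.0, 4.0, n_res=201)
```

For a Gaussian likelihood well inside the bounds, the result approaches the centre and width of the Gaussian; for skewed or multimodal posteriors the second central moment is still defined, which is why it is described as a generalized standard deviation.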
- class boar.core.optimization_botorch.SimpleThresholdGlobalStoppingStrategy(min_trials: int, inactive_when_pending_trials: bool = True, threshold: float = 0.1)[source]
Bases:
BaseGlobalStoppingStrategy
A GSS that stops when we observe a point better than threshold. Taken from : https://ax.dev/tutorials/gss.html
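The stopping rule can be summarized without any Ax objects. The function below is a simplified stand-in that takes the completed objective values and the number of pending trials directly; it is not the Ax class interface. Whether "better" means lower or higher depends on the optimization direction, so `minimize` is an assumed extra argument.

```python
def should_stop(completed_values, n_pending, min_trials, threshold=0.1,
                inactive_when_pending_trials=True, minimize=True):
    """Return True once a completed trial is better than the threshold."""
    # Optionally stay inactive while trials are still running.
    if inactive_when_pending_trials and n_pending > 0:
        return False
    # Never stop before the minimum number of completed trials.
    if len(completed_values) < min_trials:
        return False
    if minimize:
        return min(completed_values) < threshold
    return max(completed_values) > threshold


stop_now = should_stop([0.5, 0.05], n_pending=0, min_trials=2)
stop_while_pending = should_stop([0.5, 0.05], n_pending=1, min_trials=2)
```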
boar.core.optimizer module
- class boar.core.optimizer.BoarOptimizer[source]
Bases:
object
Provides a default class for the different optimizers in BOAR. This class is not intended to be used directly, but rather to be inherited by the different optimizer classes. It provides the basic functionality for the optimizer classes, such as the different objective functions, the functions to handle the Fitparam() objects, and the functions to handle the different plotting options.
- format_func(value, tick_number)[source]
Format function for the x and y axis ticks to be passed to axo[ii,jj].xaxis.set_major_formatter(plt.FuncFormatter(format_func)) to get the logarithmic ticks looking good on the plot
- Parameters:
value (float) – value to convert
tick_number (int) – tick position
- Returns:
string representation of the value in scientific notation
- Return type:
str
- invert_lossfunc(z0, loss, threshold=1000)[source]
Invert the loss function to get the mean squared error values back
- Parameters:
z0 (1D-array) – data array of size (n,) with the loss function values
loss (str) – type of loss function to use, can be [‘linear’,’soft_l1’,’huber’,’cauchy’,’arctan’]
threshold (int, optional) – critical value above which loss sets in, by default 1000. Select it wisely so that the loss function affects only outliers. If threshold is set too low, then c<z0 even for non-outliers, falsifying the mean squared error and thus the log likelihood, the Hessian and the error bars.
- Returns:
array of size (n,) with the mean squared error values
- Return type:
1D-array
- lossfunc(z0, loss, threshold=1000)[source]
Define the different loss functions that can be used to calculate the objective function value.
- Parameters:
z0 (1D-array) – data array of size (n,) with the mean squared error values
loss (str) – type of loss function to use, can be [‘linear’,’soft_l1’,’huber’,’cauchy’,’arctan’]
threshold (int, optional) – critical value above which loss sets in, by default 1000. Select it wisely so that the loss function affects only outliers. If threshold is set too low, then c<z0 even for non-outliers, falsifying the mean squared error and thus the log likelihood, the Hessian and the error bars.
- Returns:
array of size (n,) with the loss function values
- Return type:
1D-array
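To show how lossfunc and invert_lossfunc relate, the sketch below uses the loss shapes defined for scipy.optimize.least_squares, which share the same five names. Applying the threshold as threshold * rho(z0 / threshold) is an assumption and may differ from the BOAR implementation; the function names are hypothetical.

```python
import math


def rho(u, loss):
    """Loss shapes as defined for scipy.optimize.least_squares."""
    if loss == "linear":
        return u
    if loss == "soft_l1":
        return 2.0 * (math.sqrt(1.0 + u) - 1.0)
    if loss == "huber":
        return u if u <= 1.0 else 2.0 * math.sqrt(u) - 1.0
    if loss == "cauchy":
        return math.log1p(u)
    if loss == "arctan":
        return math.atan(u)
    raise ValueError(f"unknown loss: {loss}")


def rho_inverse(r, loss):
    """Inverse of rho for each loss shape."""
    if loss == "linear":
        return r
    if loss == "soft_l1":
        return (r / 2.0 + 1.0) ** 2 - 1.0
    if loss == "huber":
        return r if r <= 1.0 else ((r + 1.0) / 2.0) ** 2
    if loss == "cauchy":
        return math.expm1(r)
    if loss == "arctan":
        return math.tan(r)
    raise ValueError(f"unknown loss: {loss}")


def lossfunc_sketch(z0, loss, threshold=1000):
    # Assumed scaling: values well below the threshold pass almost unchanged.
    return [threshold * rho(z / threshold, loss) for z in z0]


def invert_lossfunc_sketch(z, loss, threshold=1000):
    # Undo the scaling to recover the mean squared error values.
    return [threshold * rho_inverse(v / threshold, loss) for v in z]


z0 = [0.5, 10.0, 5000.0]
recovered = invert_lossfunc_sketch(lossfunc_sketch(z0, "cauchy"), "cauchy")
```

In this sketch, values far below the threshold are nearly unchanged by every loss, while the outlier at 5000 is damped; this is the behaviour the threshold description above asks the user to preserve.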
- obj_func_curvefit(X, *p, params, model)[source]
Objective function as desired by scipy.curve_fit
- Parameters:
X (ndarray) – X Data array of size(n,m): n=number of data points, m=number of dimensions
*p (ndarray) – list of float; values of the fit parameters as supplied by the optimizer
params (list) – list of Fitparam() objects
model (callable) – Model function yf = f(X) to compare to y
- Returns:
array of size (n,) with the model values
- Return type:
1D-array
- obj_func_metric(target, yf, obj_type='MSE')[source]
Method to calculate the objective function value. Different objective functions can be used, see below.
- Parameters:
target (target object) – target object with data and weight
yf (array) – model output
obj_type (str, optional) – objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘MAPE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’
- ‘MSE’: mean squared error
- ‘RMSE’: root mean squared error
- ‘MSLE’: mean squared log error
- ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))
- ‘MAE’: mean absolute error
- ‘MAPE’: mean absolute percentage error
- ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)
- ‘larry’: mean squared error legacy version
- ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments
- Returns:
objective function value
- Return type:
float
- Raises:
ValueError – if obj_type is not in [‘MSE’, ‘RMSE’, ‘MSLE’,’nRMSE’,’MAE’,’larry’,’nRMSE_VLC’]
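A few of the listed metrics are simple enough to write out. The sketch below covers ‘MSE’, ‘RMSE’, ‘MAE’ and ‘nRMSE’ only, takes the data and model output directly instead of a target object, and ignores the target weight. Reading max(y,yf) and min(y,yf) as the extremes over both arrays combined is an assumption.

```python
import math


def obj_func_metric_sketch(y, yf, obj_type="MSE"):
    """Compute a subset of the documented metrics for data y and model output yf."""
    n = len(y)
    mse = sum((a - b) ** 2 for a, b in zip(y, yf)) / n
    if obj_type == "MSE":
        return mse
    if obj_type == "RMSE":
        return math.sqrt(mse)
    if obj_type == "MAE":
        return sum(abs(a - b) for a, b in zip(y, yf)) / n
    if obj_type == "nRMSE":
        # Assumption: the range is taken over data and model output together.
        combined = list(y) + list(yf)
        return math.sqrt(mse) / (max(combined) - min(combined))
    raise ValueError(f"obj_type {obj_type!r} is not covered by this sketch")


y = [1.0, 2.0, 3.0, 4.0]
yf = [1.0, 2.0, 3.0, 6.0]
results = {name: obj_func_metric_sketch(y, yf, name) for name in ("MSE", "RMSE", "MAE", "nRMSE")}
```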
- obj_func_scipy(*p, params, targets, obj_type='MSE', loss='linear')[source]
Objective function directly returning the loss function for use with the Bayesian Optimizer
- Parameters:
*p – the parameters as passed by the Bayesian Optimizer
params (list) – list of Fitparam() objects
model (callable) – Model function yf = f(X) to compare to y
X (ndarray) – X Data array of size(n,m): n=number of data points, m=number of dimensions
y (1D-array) – data array of size (n,) to fit
obj_type (str, optional) – objective function type, can be [‘MSE’, ‘RMSE’, ‘MSLE’, ‘nRMSE’, ‘MAE’, ‘larry’, ‘nRMSE_VLC’], by default ‘MSE’
- ‘MSE’: mean squared error
- ‘RMSE’: root mean squared error
- ‘MSLE’: mean squared log error
- ‘nRMSE’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf)))
- ‘MAE’: mean absolute error
- ‘RAE’: relative absolute error sum(abs(yf-y))/sum(mean(y)-y)
- ‘larry’: mean squared error legacy version
- ‘nRMSE_VLC’: normalized root mean squared error (RMSE/(max(y,yf)-min(y,yf))) for each experiment separately and then averaged over all experiments
loss (str, optional) – type of loss function to use, can be [‘linear’,’soft_l1’,’huber’,’cauchy’,’arctan’], by default ‘linear’
- Returns:
array of size (n,) with the loss function values
- Return type:
1D-array
- optimize_basin_hopping(obj_type='MSE', loss='linear', kwargs=None)[source]
Use basin_hopping to optimize a function y = f(x) where x can be multi-dimensional. See scipy.optimize.basin_hopping documentation for more information.
- Parameters:
kwargs (dict, optional) – Keyword arguments for basin_hopping, see scipy.optimize.basin_hopping documentation for more information, by default None. If no kwargs is provided, use: kwargs = {‘niter’: 100, ‘T’: 1.0, ‘stepsize’: 0.5, ‘interval’: 50, ‘minimizer_kwargs’: None, ‘take_step’: None, ‘accept_test’: None, ‘callback’: None, ‘niter_success’: None, ‘seed’: None, ‘disp’: False, ‘niter_success’: 10}
- Returns:
Dictionary with the optimized parameters (‘x’) and the corresponding function value (‘fun’).
- Return type:
dict
- optimize_curvefit(kwargs=None)[source]
Use curve_fit to optimize a function y = f(x), where x can be multi-dimensional. Use this function if the loss function is deterministic. Do not use it if the loss function has uncertainty (e.g. from numerical simulation); in that case, use optimize_sko.
- Parameters:
kwargs (dict, optional) – keyword arguments for curve_fit, see scipy.optimize.curve_fit documentation for more information, by default None. If no kwargs is provided, use: kwargs = {‘ftol’: 1e-8, ‘xtol’: 1e-6, ‘gtol’: 1e-8, ‘diff_step’: 0.001, ‘loss’: ‘linear’, ‘max_nfev’: 5000}
- Returns:
dictionary with the optimized parameters (‘popt’) and the corresponding covariance (‘pcov’) and standard deviation (‘std’) values
- Return type:
dict
- optimize_dual_annealing(obj_type='MSE', loss='linear', kwargs=None)[source]
Use dual_annealing to optimize a function y = f(x) where x can be multi-dimensional. See scipy.optimize.dual_annealing documentation for more information.
- Parameters:
kwargs (dict, optional) – Keyword arguments for dual_annealing, see scipy.optimize.dual_annealing documentation for more information, by default None. If no kwargs is provided, use: kwargs = {‘maxiter’: 100, ‘local_search_options’: None, ‘initial_temp’: 5230.0, ‘restart_temp_ratio’: 2e-05, ‘visit’: 2.62, ‘accept’: -5.0, ‘maxfun’: 1000, ‘no_local_search’: False, ‘x0’: None, ‘bounds’: None, ‘args’: ()}
- Returns:
Dictionary with the optimized parameters (‘x’) and the corresponding function value (‘fun’).
- Return type:
dict
- params_r(params)[source]
Prepare the starting guess and bounds for the optimizer, considering the settings of the Fitparam() objects:
- optim_type:
  if ‘linear’: abstract the order of magnitude (number between 1 and 10)
  if ‘log’: bring to decadic logarithm (discarding the negative sign, if any!)
- lim_type:
  if ‘absolute’: respect user settings in Fitparam.lims
  if ‘relative’: respect user settings in Fitparam.relRange:
    if Fitparam.range_type==‘linear’: interpret Fitparam.relRange as factor
    if Fitparam.range_type==‘log’: interpret Fitparam.relRange as order of magnitude
- relRange: if zero, the Fitparam is not included in the starting guess and bounds (but is still available to the objective function)
- Parameters:
params (list) – list of Fitparam() objects
- Returns:
- set of lists with:
p0 = initial guesses
lb = lower bounds
ub = upper bounds
- Return type:
set of lists
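The optim_type rules above can be made concrete with a small example. The helper below is a hypothetical illustration of the two scalings as described, not the BOAR code: it handles neither bounds nor the relRange logic, and it assumes a non-zero value.

```python
import math


def scale_start_value(val, optim_type):
    """Scale a starting value as described for optim_type (val must be non-zero)."""
    if optim_type == "linear":
        # Strip the order of magnitude, e.g. 3.2e-7 becomes roughly 3.2.
        exponent = math.floor(math.log10(abs(val)))
        return val / 10 ** exponent
    if optim_type == "log":
        # Decadic logarithm of the magnitude; a negative sign is discarded.
        return math.log10(abs(val))
    raise ValueError("optim_type must be 'linear' or 'log'")


p_linear = scale_start_value(3.2e-7, "linear")
p_log = scale_start_value(-1e5, "log")
```

Both scalings bring parameters that span many orders of magnitude onto a comparable numeric range before they are handed to the optimizer.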
- params_w(x, params, std=[], which='val')[source]
Method to interact with Fitparam objects. Used by all Obj_funcs to write the desired parameters from the optimizer so the model can have the parameters in the physically correct units. The Fitparam objects are in a nested list at self.include_params.
- Parameters:
x (1D-sequence of floats) – fit parameters as requested by optimizer
params (list) – list of Fitparam() objects
std (list, optional) – Contains the 95% confidence interval of the parameters, by default []
which (str, optional) – ‘val’: x => Fitparam.val; ‘startVal’: x => Fitparam.startVal. Defaults to Fitparam.val (which is used by the obj_funcs to pass to the model function), by default ‘val’