skopt module

Scikit-Optimize, or skopt, is a simple and efficient library to minimize (very) expensive and noisy black-box functions. It implements several methods for sequential model-based optimization. skopt is reusable in many contexts and accessible.

Install

pip install scikit-optimize

Getting started

Find the minimum of the noisy function f(x) over the range -2 < x < 2 with skopt:

import numpy as np
from skopt import gp_minimize

def f(x):
    return (np.sin(5 * x[0]) * (1 - np.tanh(x[0] ** 2)) *
            np.random.randn() * 0.1)

res = gp_minimize(f, [(-2.0, 2.0)])

For more, read our introduction to Bayesian optimization and the other examples.
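
The returned res behaves like a scipy OptimizeResult; a minimal sketch of inspecting it (attribute names as listed in the Returns sections further down this page):

print("best x:", res.x)                     # location of the minimum found
print("best f(x):", res.fun)                # objective value at that location
print("evaluations:", len(res.func_vals))   # one entry per call to f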

Development

The library is still experimental and under heavy development.

The development version can be installed through:

git clone https://github.com/scikit-optimize/scikit-optimize.git
cd scikit-optimize
pip install -r requirements.txt
python setup.py develop

Run the tests by executing nosetests in the top level directory.

"""
Scikit-Optimize, or `skopt`, is a simple and efficient library to
minimize (very) expensive and noisy black-box functions. It implements
several methods for sequential model-based optimization. `skopt` is reusable
in many contexts and accessible.

[![Build Status](https://travis-ci.org/scikit-optimize/scikit-optimize.svg?branch=master)](https://travis-ci.org/scikit-optimize/scikit-optimize)

## Install

```
pip install scikit-optimize
```

## Getting started

Find the minimum of the noisy function `f(x)` over the range `-2 < x < 2`
with `skopt`:

```python
import numpy as np
from skopt import gp_minimize

def f(x):
    return (np.sin(5 * x[0]) * (1 - np.tanh(x[0] ** 2)) *
            np.random.randn() * 0.1)

res = gp_minimize(f, [(-2.0, 2.0)])
```

For more read our [introduction to bayesian optimization](https://scikit-optimize.github.io/notebooks/bayesian-optimization.html)
and the other [examples](https://github.com/scikit-optimize/scikit-optimize/tree/master/examples).


## Development

The library is still experimental and under heavy development.

The development version can be installed through:

    git clone https://github.com/scikit-optimize/scikit-optimize.git
    cd scikit-optimize
    pip install -r requirements.txt
    python setup.py develop

Run the tests by executing `nosetests` in the top level directory.
"""

from . import acquisition
from . import benchmarks
from . import callbacks
from . import learning
from . import optimizer
from . import plots
from . import space
from .optimizer import dummy_minimize
from .optimizer import forest_minimize
from .optimizer import gbrt_minimize
from .optimizer import gp_minimize
from .optimizer import Optimizer
from .utils import load, dump


__version__ = "0.3"


__all__ = (
    "acquisition",
    "benchmarks",
    "callbacks",
    "learning",
    "optimizer",
    "plots",
    "space",
    "gp_minimize",
    "dummy_minimize",
    "forest_minimize",
    "gbrt_minimize",
    "Optimizer",
    "dump",
    "load",
)
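
Everything listed in __all__ above is importable directly from the package top level; a minimal sketch:

from skopt import (dummy_minimize, forest_minimize, gbrt_minimize,
                   gp_minimize, Optimizer, dump, load)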

Functions

def dummy_minimize(func, dimensions, n_calls=100, x0=None, y0=None, random_state=None, verbose=False, callback=None)

Random search by uniform sampling within the given bounds.

Parameters

  • func [callable]: Function to minimize. Should take an array of parameters and return the function value.

  • dimensions [list, shape=(n_dims,)]: List of search space dimensions. Each search dimension can be defined either as

    • a (lower_bound, upper_bound) tuple (for Real or Integer dimensions),
    • a (lower_bound, upper_bound, prior) tuple (for Real dimensions),
    • a list of categories (for Categorical dimensions), or
    • an instance of a Dimension object (Real, Integer or Categorical).
  • n_calls [int, default=100]: Number of calls to func to find the minimum.

  • x0 [list, list of lists or None]: Initial input points.

    • If it is a list of lists, use it as a list of input points.
    • If it is a list, use it as a single initial input point.
    • If it is None, no initial input points are used.
  • y0 [list, scalar or None]: Evaluation of initial input points.

    • If it is a list, then it corresponds to evaluations of the function at each element of x0 : the i-th element of y0 corresponds to the function evaluated at the i-th element of x0.
    • If it is a scalar, then it corresponds to the evaluation of the function at x0.
    • If it is None and x0 is provided, then the function is evaluated at each element of x0.
  • random_state [int, RandomState instance, or None (default)]: Set random state to something other than None for reproducible results.

  • verbose [boolean, default=False]: Control the verbosity. It is advised to set the verbosity to True for long optimization runs.

  • callback [callable, list of callables, optional] If callable then callback(res) is called after each call to func. If list of callables, then each callable in the list is called.

Returns

  • res [OptimizeResult, scipy object]: The optimization result returned as an OptimizeResult object. Important attributes are:

    • x [list]: location of the minimum.
    • fun [float]: function value at the minimum.
    • x_iters [list of lists]: location of function evaluation for each iteration.
    • func_vals [array]: function value for each iteration.
    • space [Space]: the optimisation space.
    • specs [dict]: the call specifications.
    • rng [RandomState instance]: State of the random state at the end of minimization.

    For more details related to the OptimizeResult object, refer to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html
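
A minimal usage sketch (the objective and bounds below are illustrative, not from the skopt examples):

import numpy as np
from skopt import dummy_minimize

def objective(x):
    # Hypothetical noisy quadratic over two real dimensions.
    return (x[0] - 0.3) ** 2 + (x[1] + 0.5) ** 2 + 0.01 * np.random.randn()

res = dummy_minimize(objective,
                     [(-2.0, 2.0), (-2.0, 2.0)],  # one (lower_bound, upper_bound) per dimension
                     n_calls=50, random_state=1)
print(res.x, res.fun)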

def dummy_minimize(func, dimensions, n_calls=100, x0=None, y0=None,
                   random_state=None, verbose=False, callback=None):
    """Random search by uniform sampling within the given bounds.

    Parameters
    ----------
    * `func` [callable]:
        Function to minimize. Should take an array of parameters and
        return the function value.

    * `dimensions` [list, shape=(n_dims,)]:
        List of search space dimensions.
        Each search dimension can be defined either as

        - a `(lower_bound, upper_bound)` tuple (for `Real` or `Integer`
          dimensions),
        - a `(lower_bound, upper_bound, prior)` tuple (for `Real`
          dimensions),
        - a list of categories (for `Categorical` dimensions), or
        - an instance of a `Dimension` object (`Real`, `Integer` or
          `Categorical`).

    * `n_calls` [int, default=100]:
        Number of calls to `func` to find the minimum.

    * `x0` [list, list of lists or `None`]:
        Initial input points.

        - If it is a list of lists, use it as a list of input points.
        - If it is a list, use it as a single initial input point.
        - If it is `None`, no initial input points are used.

    * `y0` [list, scalar or `None`]:
        Evaluation of initial input points.

        - If it is a list, then it corresponds to evaluations of the function
          at each element of `x0` : the i-th element of `y0` corresponds
          to the function evaluated at the i-th element of `x0`.
        - If it is a scalar, then it corresponds to the evaluation of the
          function at `x0`.
        - If it is None and `x0` is provided, then the function is evaluated
          at each element of `x0`.

    * `random_state` [int, RandomState instance, or None (default)]:
        Set random state to something other than None for reproducible
        results.

    * `verbose` [boolean, default=False]:
        Control the verbosity. It is advised to set the verbosity to True
        for long optimization runs.

    * `callback` [callable, list of callables, optional]
        If callable then `callback(res)` is called after each call to `func`.
        If list of callables, then each callable in the list is called.

    Returns
    -------
    * `res` [`OptimizeResult`, scipy object]:
        The optimization result returned as an OptimizeResult object.
        Important attributes are:

        - `x` [list]: location of the minimum.
        - `fun` [float]: function value at the minimum.
        - `x_iters` [list of lists]: location of function evaluation for each
           iteration.
        - `func_vals` [array]: function value for each iteration.
        - `space` [Space]: the optimisation space.
        - `specs` [dict]: the call specifications.
        - `rng` [RandomState instance]: State of the random state
           at the end of minimization.

        For more details related to the OptimizeResult object, refer to
        http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html
    """
    # Save call args
    specs = {"args": copy.copy(inspect.currentframe().f_locals),
             "function": inspect.currentframe().f_code.co_name}

    # Check params
    rng = check_random_state(random_state)
    space = Space(dimensions)

    if x0 is None:
        x0 = []
    elif not isinstance(x0[0], list):
        x0 = [x0]

    if not isinstance(x0, list):
        raise ValueError("`x0` should be a list, got %s" % type(x0))

    n_init_func_calls = 0
    if len(x0) > 0 and y0 is not None:
        if isinstance(y0, Iterable):
            y0 = list(y0)
        elif isinstance(y0, numbers.Number):
            y0 = [y0]
        else:
            raise ValueError("`y0` should be an iterable or a scalar, got %s"
                             % type(y0))
        if len(x0) != len(y0):
            raise ValueError("`x0` and `y0` should have the same length")

        if not all(map(np.isscalar, y0)):
            raise ValueError("`y0` elements should be scalars")

    elif len(x0) > 0 and y0 is None:
        y0 = []
        n_calls -= len(x0)
        n_init_func_calls = len(x0)

    elif len(x0) == 0 and y0 is not None:
        raise ValueError("`x0`cannot be `None` when `y0` is provided")

    else:  # len(x0) == 0 and y0 is None
        y0 = []

    callbacks = check_callback(callback)
    if verbose:
        callbacks.append(VerboseCallback(
            n_init=n_init_func_calls, n_total=n_calls))

    X = x0
    y = y0

    # Random search
    X = X + space.rvs(n_samples=n_calls, random_state=rng)
    first = True
    result = None

    for i in range(len(y0), len(X)):
        y_i = func(X[i])

        if first:
            first = False
            if not np.isscalar(y_i):
                raise ValueError("`func` should return a scalar")

        y.append(y_i)
        result = create_result(X[:i + 1], y, space, rng, specs)
        if eval_callbacks(callbacks, result):
            break

    y = np.array(y)
    return create_result(X, y, space, rng, specs)

def dump(res, filename, store_objective=True, **kwargs)

Store an skopt optimization result into a file.

Parameters

  • res [OptimizeResult, scipy object]: Optimization result object to be stored.

  • filename [string or pathlib.Path]: The path of the file in which it is to be stored. The compression method corresponding to one of the supported filename extensions ('.z', '.gz', '.bz2', '.xz' or '.lzma') will be used automatically.

  • store_objective [boolean, default=True]: Whether the objective function should be stored. Set store_objective to False if your objective function (.specs['args']['func']) is unserializable (i.e. if an exception is raised when trying to serialize the optimization result).

    Notice that if store_objective is set to False, a deep copy of the optimization result is created, potentially leading to performance problems if res is very large. If the objective function is not critical, one can delete it before calling skopt.dump() and thus avoid deep copying of res.

  • **kwargs [other keyword arguments]: All other keyword arguments will be passed to joblib.dump.
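
A minimal sketch of persisting a result whose objective cannot be pickled (the lambda objective and file name are illustrative):

from skopt import gp_minimize, dump

res = gp_minimize(lambda x: x[0] ** 2, [(-1.0, 1.0)], n_calls=15)

# Lambdas cannot be pickled, so skip storing the objective itself.
dump(res, "result.gz", store_objective=False)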

def dump(res, filename, store_objective=True, **kwargs):
    """
    Store an skopt optimization result into a file.

    Parameters
    ----------
    * `res` [`OptimizeResult`, scipy object]:
        Optimization result object to be stored.

    * `filename` [string or `pathlib.Path`]:
        The path of the file in which it is to be stored. The compression
        method corresponding to one of the supported filename extensions ('.z',
        '.gz', '.bz2', '.xz' or '.lzma') will be used automatically.

    * `store_objective` [boolean, default=True]:
        Whether the objective function should be stored. Set `store_objective`
        to `False` if your objective function (`.specs['args']['func']`) is
        unserializable (i.e. if an exception is raised when trying to serialize
        the optimization result).

        Notice that if `store_objective` is set to `False`, a deep copy of the
        optimization result is created, potentially leading to performance
        problems if `res` is very large. If the objective function is not
        critical, one can delete it before calling `skopt.dump()` and thus
        avoid deep copying of `res`.

    * `**kwargs` [other keyword arguments]:
        All other keyword arguments will be passed to `joblib.dump`.
    """
    if store_objective:
        dump_(res, filename, **kwargs)

    elif 'func' in res.specs['args']:
        # If the user does not want to store the objective and it is indeed
        # present in the provided object, then create a deep copy of it and
        # remove the objective function before dumping it with joblib.dump.
        res_without_func = deepcopy(res)
        del res_without_func.specs['args']['func']
        dump_(res_without_func, filename, **kwargs)

    else:
        # If the user does not want to store the objective and it is already
        # missing in the provided object, dump it without copying.
        dump_(res, filename, **kwargs)

def forest_minimize(func, dimensions, base_estimator='ET', n_calls=100, n_random_starts=10, acq_func='EI', acq_optimizer='auto', x0=None, y0=None, random_state=None, verbose=False, callback=None, n_points=10000, xi=0.01, kappa=1.96, n_jobs=1)

Sequential optimisation using decision trees.

A tree based regression model is used to model the expensive to evaluate function func. The model is improved by sequentially evaluating the expensive function at the next best point. Thereby finding the minimum of func with as few evaluations as possible.

The n_calls total evaluations are performed as follows. If x0 is provided but not y0, the elements of x0 are evaluated first, followed by n_random_starts random evaluations. Finally, n_calls - len(x0) - n_random_starts evaluations are made guided by the surrogate model. If x0 and y0 are both provided, then n_random_starts random evaluations are made first, and n_calls - n_random_starts subsequent evaluations are made guided by the surrogate model.

Parameters

  • func [callable]: Function to minimize. Should take an array of parameters and return the function value.

  • dimensions [list, shape=(n_dims,)]: List of search space dimensions. Each search dimension can be defined either as

    • a (lower_bound, upper_bound) tuple (for Real or Integer dimensions),
    • a (lower_bound, upper_bound, prior) tuple (for Real dimensions),
    • a list of categories (for Categorical dimensions), or
    • an instance of a Dimension object (Real, Integer or Categorical).

    NOTE: The upper and lower bounds are inclusive for Integer dimensions.

  • base_estimator [string or Regressor, default="ET"]: The regressor to use as surrogate model. Can be either

    • "RF" for random forest regressor
    • "ET" for extra trees regressor
    • instance of regressor with support for return_std in its predict method

    The predefined models are initialized with good defaults. If you want to adjust the model parameters, pass your own instance of a regressor which returns the mean and standard deviation when making predictions.

  • n_calls [int, default=100]: Number of calls to func.

  • n_random_starts [int, default=10]: Number of evaluations of func with random initialization points before approximating the func with base_estimator.

  • acq_func [string, default="EI"]: Function to minimize over the forest posterior. Can be either

    • "LCB" for lower confidence bound.
    • "EI" for negative expected improvement.
    • "PI" for negative probability of improvement.
  • x0 [list, list of lists or None]: Initial input points.

    • If it is a list of lists, use it as a list of input points.
    • If it is a list, use it as a single initial input point.
    • If it is None, no initial input points are used.
  • y0 [list, scalar or None]: Evaluation of initial input points.

    • If it is a list, then it corresponds to evaluations of the function at each element of x0 : the i-th element of y0 corresponds to the function evaluated at the i-th element of x0.
    • If it is a scalar, then it corresponds to the evaluation of the function at x0.
    • If it is None and x0 is provided, then the function is evaluated at each element of x0.
  • random_state [int, RandomState instance, or None (default)]: Set random state to something other than None for reproducible results.

  • verbose [boolean, default=False]: Control the verbosity. It is advised to set the verbosity to True for long optimization runs.

  • callback [callable, optional]: If provided, then callback(res) is called after each call to func.

  • n_points [int, default=10000]: Number of points to sample when minimizing the acquisition function.

  • xi [float, default=0.01]: Controls how much improvement one wants over the previous best values. Used when the acquisition is either "EI" or "PI".

  • kappa [float, default=1.96]: Controls how much of the variance in the predicted values should be taken into account. If set to be very high, then we are favouring exploration over exploitation and vice versa. Used when the acquisition is "LCB".

  • n_jobs [int, default=1]: The number of jobs to run in parallel for fit and predict. If -1, then the number of jobs is set to the number of cores.

Returns

  • res [OptimizeResult, scipy object]: The optimization result returned as an OptimizeResult object. Important attributes are:

    • x [list]: location of the minimum.
    • fun [float]: function value at the minimum.
    • models: surrogate models used for each iteration.
    • x_iters [list of lists]: location of function evaluation for each iteration.
    • func_vals [array]: function value for each iteration.
    • space [Space]: the optimization space.
    • specs [dict]: the call specifications.

    For more details related to the OptimizeResult object, refer to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html
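
A quick sketch over a mixed search space (the objective and space are made up for illustration):

from skopt import forest_minimize

def objective(params):
    n_estimators, learning_rate, kernel = params
    # Hypothetical score to minimize, standing in for e.g. a cross-validation loss.
    penalty = 0.0 if kernel == "rbf" else 0.1
    return (learning_rate - 0.1) ** 2 + (n_estimators - 80) ** 2 / 1e4 + penalty

space = [(10, 200),                   # Integer dimension (bounds inclusive)
         (1e-3, 1.0, "log-uniform"),  # Real dimension with a prior
         ["rbf", "linear", "poly"]]   # Categorical dimension

res = forest_minimize(objective, space, base_estimator="ET",
                      n_calls=30, random_state=0)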

def forest_minimize(func, dimensions, base_estimator="ET",
                    n_calls=100, n_random_starts=10,
                    acq_func="EI", acq_optimizer="auto",
                    x0=None, y0=None, random_state=None, verbose=False,
                    callback=None, n_points=10000, xi=0.01, kappa=1.96,
                    n_jobs=1):
    """Sequential optimisation using decision trees.

    A tree based regression model is used to model the expensive to evaluate
    function `func`. The model is improved by sequentially evaluating
    the expensive function at the next best point. Thereby finding the
    minimum of `func` with as few evaluations as possible.

    The total number of evaluations, `n_calls`, are performed like the
    following. If `x0` is provided but not `y0`, then the elements of `x0`
    are first evaluated, followed by `n_random_starts` evaluations.
    Finally, `n_calls - len(x0) - n_random_starts` evaluations are
    made guided by the surrogate model. If `x0` and `y0` are both
    provided then `n_random_starts` evaluations are first made then
    `n_calls - n_random_starts` subsequent evaluations are made
    guided by the surrogate model.

    Parameters
    ----------
    * `func` [callable]:
        Function to minimize. Should take an array of parameters and
        return the function value.

    * `dimensions` [list, shape=(n_dims,)]:
        List of search space dimensions.
        Each search dimension can be defined either as

        - a `(lower_bound, upper_bound)` tuple (for `Real` or `Integer`
          dimensions),
        - a `(lower_bound, upper_bound, prior)` tuple (for `Real`
          dimensions),
        - a list of categories (for `Categorical` dimensions), or
        - an instance of a `Dimension` object (`Real`, `Integer` or
          `Categorical`).

         NOTE: The upper and lower bounds are inclusive for `Integer`
         dimensions.

    * `base_estimator` [string or `Regressor`, default=`"ET"`]:
        The regressor to use as surrogate model. Can be either

        - `"RF"` for random forest regressor
        - `"ET"` for extra trees regressor
        - instance of regressor with support for `return_std` in its predict
          method

        The predefined models are initialized with good defaults. If you
        want to adjust the model parameters pass your own instance of
        a regressor which returns the mean and standard deviation when
        making predictions.

    * `n_calls` [int, default=100]:
        Number of calls to `func`.

    * `n_random_starts` [int, default=10]:
        Number of evaluations of `func` with random initialization points
        before approximating the `func` with `base_estimator`.

    * `acq_func` [string, default=`"EI"`]:
        Function to minimize over the forest posterior. Can be either

        - `"LCB"` for lower confidence bound.
        - `"EI"` for negative expected improvement.
        - `"PI"` for negative probability of improvement.

    * `x0` [list, list of lists or `None`]:
        Initial input points.

        - If it is a list of lists, use it as a list of input points.
        - If it is a list, use it as a single initial input point.
        - If it is `None`, no initial input points are used.

    * `y0` [list, scalar or `None`]:
        Evaluation of initial input points.

        - If it is a list, then it corresponds to evaluations of the function
          at each element of `x0` : the i-th element of `y0` corresponds
          to the function evaluated at the i-th element of `x0`.
        - If it is a scalar, then it corresponds to the evaluation of the
          function at `x0`.
        - If it is None and `x0` is provided, then the function is evaluated
          at each element of `x0`.

    * `random_state` [int, RandomState instance, or None (default)]:
        Set random state to something other than None for reproducible
        results.

    * `verbose` [boolean, default=False]:
        Control the verbosity. It is advised to set the verbosity to True
        for long optimization runs.

    * `callback` [callable, optional]
        If provided, then `callback(res)` is called after each call to `func`.

    * `n_points` [int, default=10000]:
        Number of points to sample when minimizing the acquisition function.

    * `xi` [float, default=0.01]:
        Controls how much improvement one wants over the previous best
        values. Used when the acquisition is either `"EI"` or `"PI"`.

    * `kappa` [float, default=1.96]:
        Controls how much of the variance in the predicted values should be
        taken into account. If set to be very high, then we are favouring
        exploration over exploitation and vice versa.
        Used when the acquisition is `"LCB"`.

    * `n_jobs` [int, default=1]:
        The number of jobs to run in parallel for `fit` and `predict`.
        If -1, then the number of jobs is set to the number of cores.

    Returns
    -------
    * `res` [`OptimizeResult`, scipy object]:
        The optimization result returned as an OptimizeResult object.
        Important attributes are:

        - `x` [list]: location of the minimum.
        - `fun` [float]: function value at the minimum.
        - `models`: surrogate models used for each iteration.
        - `x_iters` [list of lists]: location of function evaluation for each
           iteration.
        - `func_vals` [array]: function value for each iteration.
        - `space` [Space]: the optimization space.
        - `specs` [dict]: the call specifications.

        For more details related to the OptimizeResult object, refer to
        http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html
    """
    rng = check_random_state(random_state)

    # Default estimator
    if isinstance(base_estimator, str):
        if base_estimator not in ("RF", "ET"):
            raise ValueError(
                "Valid strings for the base_estimator parameter"
                " are: 'RF' or 'ET', not '%s'" % base_estimator)

        if base_estimator == "RF":
            base_estimator = RandomForestRegressor(n_estimators=100,
                                                   min_samples_leaf=3,
                                                   n_jobs=n_jobs,
                                                   random_state=rng)

        elif base_estimator == "ET":
            base_estimator = ExtraTreesRegressor(n_estimators=100,
                                                 min_samples_leaf=3,
                                                 n_jobs=n_jobs,
                                                 random_state=rng)

    return base_minimize(func, dimensions, base_estimator,
                         n_calls=n_calls, n_points=n_points,
                         n_random_starts=n_random_starts,
                         x0=x0, y0=y0, random_state=random_state,
                         acq_func=acq_func,
                         xi=xi, kappa=kappa, verbose=verbose,
                         callback=callback, acq_optimizer="sampling")

def gbrt_minimize(func, dimensions, base_estimator=None, n_calls=100, n_random_starts=10, acq_func='EI', acq_optimizer='auto', x0=None, y0=None, random_state=None, verbose=False, callback=None, n_points=10000, xi=0.01, kappa=1.96, n_jobs=1)

Sequential optimization using gradient boosted trees.

Gradient boosted regression trees are used to model the (very) expensive to evaluate function func. The model is improved by sequentially evaluating the expensive function at the next best point. Thereby finding the minimum of func with as few evaluations as possible.

The n_calls total evaluations are performed as follows. If x0 is provided but not y0, the elements of x0 are evaluated first, followed by n_random_starts random evaluations. Finally, n_calls - len(x0) - n_random_starts evaluations are made guided by the surrogate model. If x0 and y0 are both provided, then n_random_starts random evaluations are made first, and n_calls - n_random_starts subsequent evaluations are made guided by the surrogate model.

Parameters

  • func [callable]: Function to minimize. Should take an array of parameters and return the function value.

  • dimensions [list, shape=(n_dims,)]: List of search space dimensions. Each search dimension can be defined either as

    • a (lower_bound, upper_bound) tuple (for Real or Integer dimensions),
    • a (lower_bound, upper_bound, "prior") tuple (for Real dimensions),
    • a list of categories (for Categorical dimensions), or
    • an instance of a Dimension object (Real, Integer or Categorical).
  • base_estimator [GradientBoostingQuantileRegressor]: The regressor to use as surrogate model

  • n_calls [int, default=100]: Number of calls to func.

  • n_random_starts [int, default=10]: Number of evaluations of func with random initialization points before approximating the func with base_estimator.

  • acq_func [string, default="EI"]: Function to minimize over the forest posterior. Can be either

    • "LCB" for lower confidence bound.
    • "EI" for negative expected improvement.
    • "PI" for negative probability of improvement.
  • x0 [list, list of lists or None]: Initial input points.

    • If it is a list of lists, use it as a list of input points.
    • If it is a list, use it as a single initial input point.
    • If it is None, no initial input points are used.
  • y0 [list, scalar or None]: Evaluation of initial input points.

    • If it is a list, then it corresponds to evaluations of the function at each element of x0 : the i-th element of y0 corresponds to the function evaluated at the i-th element of x0.
    • If it is a scalar, then it corresponds to the evaluation of the function at x0.
    • If it is None and x0 is provided, then the function is evaluated at each element of x0.
  • random_state [int, RandomState instance, or None (default)]: Set random state to something other than None for reproducible results.

  • verbose [boolean, default=False]: Control the verbosity. It is advised to set the verbosity to True for long optimization runs.

  • callback [callable, optional]: If provided, then callback(res) is called after each call to func.

  • n_points [int, default=10000]: Number of points to sample when minimizing the acquisition function.

  • xi [float, default=0.01]: Controls how much improvement one wants over the previous best values. Used when the acquisition is either "EI" or "PI".

  • kappa [float, default=1.96]: Controls how much of the variance in the predicted values should be taken into account. If set to be very high, then we are favouring exploration over exploitation and vice versa. Used when the acquisition is "LCB".

  • n_jobs [int, default=1]: The number of jobs to run in parallel for fit and predict. If -1, then the number of jobs is set to the number of cores.

Returns

  • res [OptimizeResult, scipy object]: The optimization result returned as an OptimizeResult object. Important attributes are:

    • x [list]: location of the minimum.
    • fun [float]: function value at the minimum.
    • models: surrogate models used for each iteration.
    • x_iters [list of lists]: location of function evaluation for each iteration.
    • func_vals [array]: function value for each iteration.
    • space [Space]: the optimization space.
    • specs [dict]: the call specifications.
    • rng [RandomState instance]: State of the random state at the end of minimization.

    For more details related to the OptimizeResult object, refer to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html
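
A sketch of warm-starting from previously evaluated points via x0/y0, matching the evaluation-budget description above (the points and objective are illustrative):

from skopt import gbrt_minimize

def objective(x):
    return (x[0] - 1.5) ** 2

x0 = [[0.0], [1.0], [2.0]]   # points evaluated in an earlier run
y0 = [2.25, 0.25, 0.25]      # their known objective values

# n_random_starts random points are evaluated first, then the surrogate guides
# the remaining n_calls - n_random_starts evaluations.
res = gbrt_minimize(objective, [(-2.0, 3.0)], x0=x0, y0=y0,
                    n_calls=25, n_random_starts=5, random_state=0)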

def gbrt_minimize(func, dimensions, base_estimator=None,
                  n_calls=100, n_random_starts=10,
                  acq_func="EI", acq_optimizer="auto",
                  x0=None, y0=None, random_state=None, verbose=False,
                  callback=None, n_points=10000, xi=0.01, kappa=1.96,
                  n_jobs=1):
    """Sequential optimization using gradient boosted trees.

    Gradient boosted regression trees are used to model the (very)
    expensive to evaluate function `func`. The model is improved
    by sequentially evaluating the expensive function at the next
    best point. Thereby finding the minimum of `func` with as
    few evaluations as possible.

    The total number of evaluations, `n_calls`, are performed like the
    following. If `x0` is provided but not `y0`, then the elements of `x0`
    are first evaluated, followed by `n_random_starts` evaluations.
    Finally, `n_calls - len(x0) - n_random_starts` evaluations are
    made guided by the surrogate model. If `x0` and `y0` are both
    provided then `n_random_starts` evaluations are first made then
    `n_calls - n_random_starts` subsequent evaluations are made
    guided by the surrogate model.

    Parameters
    ----------
    * `func` [callable]:
        Function to minimize. Should take an array of parameters and
        return the function value.

    * `dimensions` [list, shape=(n_dims,)]:
        List of search space dimensions.
        Each search dimension can be defined either as

        - a `(lower_bound, upper_bound)` tuple (for `Real` or `Integer`
          dimensions),
        - a `(lower_bound, upper_bound, "prior")` tuple (for `Real`
          dimensions),
        - a list of categories (for `Categorical` dimensions), or
        - an instance of a `Dimension` object (`Real`, `Integer` or
          `Categorical`).

    * `base_estimator` [`GradientBoostingQuantileRegressor`]:
        The regressor to use as surrogate model

    * `n_calls` [int, default=100]:
        Number of calls to `func`.

    * `n_random_starts` [int, default=10]:
        Number of evaluations of `func` with random initialization points
        before approximating the `func` with `base_estimator`.

    * `acq_func` [string, default=`"EI"`]:
        Function to minimize over the forest posterior. Can be either

        - `"LCB"` for lower confidence bound.
        - `"EI"` for negative expected improvement.
        - `"PI"` for negative probability of improvement.

    * `x0` [list, list of lists or `None`]:
        Initial input points.

        - If it is a list of lists, use it as a list of input points.
        - If it is a list, use it as a single initial input point.
        - If it is `None`, no initial input points are used.

    * `y0` [list, scalar or `None`]:
        Evaluation of initial input points.

        - If it is a list, then it corresponds to evaluations of the function
          at each element of `x0` : the i-th element of `y0` corresponds
          to the function evaluated at the i-th element of `x0`.
        - If it is a scalar, then it corresponds to the evaluation of the
          function at `x0`.
        - If it is None and `x0` is provided, then the function is evaluated
          at each element of `x0`.

    * `random_state` [int, RandomState instance, or None (default)]:
        Set random state to something other than None for reproducible
        results.

    * `verbose` [boolean, default=False]:
        Control the verbosity. It is advised to set the verbosity to True
        for long optimization runs.

    * `callback` [callable, optional]
        If provided, then `callback(res)` is called after each call to `func`.

    * `n_points` [int, default=10000]:
        Number of points to sample when minimizing the acquisition function.

    * `xi` [float, default=0.01]:
        Controls how much improvement one wants over the previous best
        values. Used when the acquisition is either `"EI"` or `"PI"`.

    * `kappa` [float, default=1.96]:
        Controls how much of the variance in the predicted values should be
        taken into account. If set to be very high, then we are favouring
        exploration over exploitation and vice versa.
        Used when the acquisition is `"LCB"`.

    * `n_jobs` [int, default=1]:
        The number of jobs to run in parallel for `fit` and `predict`.
        If -1, then the number of jobs is set to the number of cores.

    Returns
    -------
    * `res` [`OptimizeResult`, scipy object]:
        The optimization result returned as an OptimizeResult object.
        Important attributes are:

        - `x` [list]: location of the minimum.
        - `fun` [float]: function value at the minimum.
        - `models`: surrogate models used for each iteration.
        - `x_iters` [list of lists]: location of function evaluation for each
           iteration.
        - `func_vals` [array]: function value for each iteration.
        - `space` [Space]: the optimization space.
        - `specs` [dict]: the call specifications.
        - `rng` [RandomState instance]: State of the random state
           at the end of minimization.

        For more details related to the OptimizeResult object, refer to
        http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html
    """
    # Check params
    rng = check_random_state(random_state)

    # Default estimator
    if base_estimator is None:
        gbrt = GradientBoostingRegressor(n_estimators=30, loss="quantile")
        base_estimator = GradientBoostingQuantileRegressor(base_estimator=gbrt,
                                                           n_jobs=n_jobs,
                                                           random_state=rng)

    return base_minimize(func, dimensions, base_estimator,
                         n_calls=n_calls, n_points=n_points,
                         n_random_starts=n_random_starts,
                         x0=x0, y0=y0, random_state=random_state, xi=xi,
                         kappa=kappa, acq_func=acq_func, verbose=verbose,
                         callback=callback, acq_optimizer="sampling")

def gp_minimize(func, dimensions, base_estimator=None, n_calls=100, n_random_starts=10, acq_func='gp_hedge', acq_optimizer='lbfgs', x0=None, y0=None, random_state=None, verbose=False, callback=None, n_points=10000, n_restarts_optimizer=5, xi=0.01, kappa=1.96, noise='gaussian', n_jobs=1)

Bayesian optimization using Gaussian Processes.

If every function evaluation is expensive, for instance when the parameters are the hyperparameters of a neural network and the function evaluation is the mean cross-validation score across ten folds, optimizing the hyperparameters by standard optimization routines would take forever.

The idea is to approximate the function using a Gaussian process. In other words, the function values are assumed to follow a multivariate Gaussian, whose covariance is given by a GP kernel between the parameters. The next parameter to evaluate can then be chosen cheaply by optimizing an acquisition function over this Gaussian prior, which is much quicker to evaluate than the function itself.

The n_calls total evaluations are performed as follows. If x0 is provided but not y0, the elements of x0 are evaluated first, followed by n_random_starts random evaluations. Finally, n_calls - len(x0) - n_random_starts evaluations are made guided by the surrogate model. If x0 and y0 are both provided, then n_random_starts random evaluations are made first, and n_calls - n_random_starts subsequent evaluations are made guided by the surrogate model.

Parameters

  • func [callable]: Function to minimize. Should take an array of parameters and return the function value.

  • dimensions [list, shape=(n_dims,)]: List of search space dimensions. Each search dimension can be defined either as

    • a (lower_bound, upper_bound) tuple (for Real or Integer dimensions),
    • a (lower_bound, upper_bound, "prior") tuple (for Real dimensions),
    • a list of categories (for Categorical dimensions), or
    • an instance of a Dimension object (Real, Integer or Categorical).

    NOTE: The upper and lower bounds are inclusive for Integer dimensions.

  • base_estimator [a Gaussian process estimator]: The Gaussian process estimator to use for optimization. By default, a Matern kernel is used with the following hyperparameters tuned.

    • All the length scales of the Matern kernel.
    • The covariance amplitude that each element is multiplied with.
    • Noise that is added to the matern kernel. The noise is assumed to be iid gaussian.
  • n_calls [int, default=100]: Number of calls to func.

  • n_random_starts [int, default=10]: Number of evaluations of func with random initialization points before approximating the func with base_estimator.

  • acq_func [string, default="gp_hedge"]: Function to minimize over the gaussian prior. Can be either

    • "LCB" for lower confidence bound.
    • "EI" for negative expected improvement.
    • "PI" for negative probability of improvement.
    • "gp_hedge" Probabilistically choose one of the above three acquisition functions at every iteration. The weightage given to these gains can be set by \eta through acq_func_kwargs.
      • The gains g_i are initialized to zero.
      • At every iteration,
        • Each acquisition function is optimised independently to propose a candidate point X_i.
        • Out of all these candidate points, the next point X_best is chosen by softmax(\eta g_i)
        • After fitting the surrogate model with (X_best, y_best), the gains are updated such that g_i -= \mu(X_i)

    Reference: https://dslpitt.org/uai/papers/11/p327-hoffman.pdf

  • acq_optimizer [string, "sampling" or "lbfgs", default="lbfgs"]: Method to minimize the acquisition function. The fit model is updated with the optimal value obtained by optimizing acq_func with acq_optimizer.

    The acq_func is computed at n_points sampled randomly.

    • If set to "sampling", then the point among these n_points where the acq_func is minimum is the next candidate minimum.
    • If set to "lbfgs", then
      • The n_restarts_optimizer points at which the acquisition function is lowest are taken as start points.
      • "lbfgs" is run for 20 iterations with these points as initial points to find local minima.
      • The best of these local minima is used to update the prior.
  • x0 [list, list of lists or None]: Initial input points.

    • If it is a list of lists, use it as a list of input points.
    • If it is a list, use it as a single initial input point.
    • If it is None, no initial input points are used.
  • y0 [list, scalar or None] Evaluation of initial input points.

    • If it is a list, then it corresponds to evaluations of the function at each element of x0 : the i-th element of y0 corresponds to the function evaluated at the i-th element of x0.
    • If it is a scalar, then it corresponds to the evaluation of the function at x0.
    • If it is None and x0 is provided, then the function is evaluated at each element of x0.
  • random_state [int, RandomState instance, or None (default)]: Set random state to something other than None for reproducible results.

  • verbose [boolean, default=False]: Control the verbosity. It is advised to set the verbosity to True for long optimization runs.

  • callback [callable, list of callables, optional] If callable then callback(res) is called after each call to func. If list of callables, then each callable in the list is called.

  • n_points [int, default=10000]: Number of points to sample to determine the next "best" point. Useless if acq_optimizer is set to "lbfgs".

  • n_restarts_optimizer [int, default=5]: The number of restarts of the optimizer when acq_optimizer is "lbfgs".

  • kappa [float, default=1.96]: Controls how much of the variance in the predicted values should be taken into account. If set to be very high, then we are favouring exploration over exploitation and vice versa. Used when the acquisition is "LCB".

  • xi [float, default=0.01]: Controls how much improvement one wants over the previous best values. Used when the acquisition is either "EI" or "PI".

  • noise [float, default="gaussian"]:

    • Use noise="gaussian" if the objective returns noisy observations. The noise of each observation is assumed to be iid with mean zero and a fixed variance.
    • If the variance is known before-hand, this can be set directly to the variance of the noise.
    • Set this to a value close to zero (1e-10) if the function is noise-free. Setting to zero might cause stability issues.
  • n_jobs [int, default=1] Number of cores to run in parallel while running the lbfgs optimizations over the acquisition function. Valid only when acq_optimizer is set to "lbfgs." Defaults to 1 core. If n_jobs=-1, then number of jobs is set to number of cores.

Returns

  • res [OptimizeResult, scipy object]: The optimization result returned as an OptimizeResult object. Important attributes are:

    • x [list]: location of the minimum.
    • fun [float]: function value at the minimum.
    • models: surrogate models used for each iteration.
    • x_iters [list of lists]: location of function evaluation for each iteration.
    • func_vals [array]: function value for each iteration.
    • space [Space]: the optimization space.
    • specs [dict]: the call specifications.
    • rng [RandomState instance]: State of the random state at the end of minimization.

    For more details related to the OptimizeResult object, refer to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html
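
A compact sketch combining a noisy objective, the "gp_hedge" default, and an early-stopping callback (the stopping threshold is arbitrary; a callback returning True stops the loop, mirroring the eval_callbacks check visible in the dummy_minimize source above):

import numpy as np
from skopt import gp_minimize

def objective(x):
    return np.sin(3 * x[0]) + 0.1 * np.random.randn()

def on_step(res):
    # Returning True stops the optimization early.
    return res.fun < -0.9

res = gp_minimize(objective, [(-2.0, 2.0)],
                  acq_func="gp_hedge",   # default: hedge between LCB, EI and PI
                  noise=0.1 ** 2,        # known noise variance (see `noise` above)
                  n_calls=30, callback=on_step, random_state=3)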

def gp_minimize(func, dimensions, base_estimator=None,
                n_calls=100, n_random_starts=10,
                acq_func="gp_hedge", acq_optimizer="lbfgs", x0=None, y0=None,
                random_state=None, verbose=False, callback=None,
                n_points=10000, n_restarts_optimizer=5, xi=0.01, kappa=1.96,
                noise="gaussian", n_jobs=1):
    """Bayesian optimization using Gaussian Processes.

    If every function evaluation is expensive, for instance
    when the parameters are the hyperparameters of a neural network
    and the function evaluation is the mean cross-validation score across
    ten folds, optimizing the hyperparameters by standard optimization
    routines would take forever.

    The idea is to approximate the function using a Gaussian process.
    In other words, the function values are assumed to follow a multivariate
    Gaussian, whose covariance is given by a GP kernel between the
    parameters. The next parameter to evaluate can then be chosen cheaply
    by optimizing an acquisition function over this Gaussian prior, which
    is much quicker to evaluate than the function itself.

    The total number of evaluations, `n_calls`, are performed like the
    following. If `x0` is provided but not `y0`, then the elements of `x0`
    are first evaluated, followed by `n_random_starts` evaluations.
    Finally, `n_calls - len(x0) - n_random_starts` evaluations are
    made guided by the surrogate model. If `x0` and `y0` are both
    provided then `n_random_starts` evaluations are first made then
    `n_calls - n_random_starts` subsequent evaluations are made
    guided by the surrogate model.

    Parameters
    ----------
    * `func` [callable]:
        Function to minimize. Should take an array of parameters and
        return the function value.

    * `dimensions` [list, shape=(n_dims,)]:
        List of search space dimensions.
        Each search dimension can be defined either as

        - a `(lower_bound, upper_bound)` tuple (for `Real` or `Integer`
          dimensions),
        - a `(lower_bound, upper_bound, "prior")` tuple (for `Real`
          dimensions),
        - a list of categories (for `Categorical` dimensions), or
        - an instance of a `Dimension` object (`Real`, `Integer` or
          `Categorical`).

         NOTE: The upper and lower bounds are inclusive for `Integer`
         dimensions.

    * `base_estimator` [a Gaussian process estimator]:
        The Gaussian process estimator to use for optimization.
        By default, a Matern kernel is used with the following
        hyperparameters tuned.
        - All the length scales of the Matern kernel.
        - The covariance amplitude that each element is multiplied with.
        - Noise that is added to the matern kernel. The noise is assumed
          to be iid gaussian.

    * `n_calls` [int, default=100]:
        Number of calls to `func`.

    * `n_random_starts` [int, default=10]:
        Number of evaluations of `func` with random initialization points
        before approximating the `func` with `base_estimator`.

    * `acq_func` [string, default=`"gp_hedge"`]:
        Function to minimize over the gaussian prior. Can be either

        - `"LCB"` for lower confidence bound.
        - `"EI"` for negative expected improvement.
        - `"PI"` for negative probability of improvement.
        - `"gp_hedge"` Probabilistically choose one of the above three
          acquisition functions at every iteration. The relative weight
          given to these gains can be set by `\eta` through `acq_func_kwargs`.
            - The gains `g_i` are initialized to zero.
            - At every iteration,
                - Each acquisition function is optimised independently to propose a
                  candidate point `X_i`.
                - Out of all these candidate points, the next point `X_best` is
                  chosen by `softmax(\eta g_i)`
                - After fitting the surrogate model with `(X_best, y_best)`,
                  the gains are updated such that `g_i -= \mu(X_i)`

          Reference: https://dslpitt.org/uai/papers/11/p327-hoffman.pdf

    * `acq_optimizer` [string, `"sampling"` or `"lbfgs"`, default=`"lbfgs"`]:
        Method to minimize the acquisition function. The fit model
        is updated with the optimal value obtained by optimizing `acq_func`
        with `acq_optimizer`.

        The `acq_func` is computed at `n_points` sampled randomly.

        - If set to `"sampling"`, then the point among these `n_points`
          where the `acq_func` is minimum is the next candidate minimum.
        - If set to `"lbfgs"`, then
              - The `n_restarts_optimizer` points at which the acquisition
                function is lowest are taken as start points.
              - `"lbfgs"` is run for 20 iterations with these points as initial
                points to find local minima.
              - The best of these local minima is used to update the prior.

    * `x0` [list, list of lists or `None`]:
        Initial input points.

        - If it is a list of lists, use it as a list of input points.
        - If it is a list, use it as a single initial input point.
        - If it is `None`, no initial input points are used.

    * `y0` [list, scalar or `None`]
        Evaluation of initial input points.

        - If it is a list, then it corresponds to evaluations of the function
          at each element of `x0` : the i-th element of `y0` corresponds
          to the function evaluated at the i-th element of `x0`.
        - If it is a scalar, then it corresponds to the evaluation of the
          function at `x0`.
        - If it is None and `x0` is provided, then the function is evaluated
          at each element of `x0`.

    * `random_state` [int, RandomState instance, or None (default)]:
        Set random state to something other than None for reproducible
        results.

    * `verbose` [boolean, default=False]:
        Control the verbosity. It is advised to set the verbosity to True
        for long optimization runs.

    * `callback` [callable, list of callables, optional]
        If callable then `callback(res)` is called after each call to `func`.
        If list of callables, then each callable in the list is called.

    * `n_points` [int, default=10000]:
        Number of points to sample to determine the next "best" point.
        Useless if acq_optimizer is set to `"lbfgs"`.

    * `n_restarts_optimizer` [int, default=5]:
        The number of restarts of the optimizer when `acq_optimizer`
        is `"lbfgs"`.

    * `kappa` [float, default=1.96]:
        Controls how much of the variance in the predicted values should be
        taken into account. If set to be very high, then we are favouring
        exploration over exploitation and vice versa.
        Used when the acquisition is `"LCB"`.

    * `xi` [float, default=0.01]:
        Controls how much improvement one wants over the previous best
        values. Used when the acquisition is either `"EI"` or `"PI"`.

    * `noise` [float, default="gaussian"]:
        - Use noise="gaussian" if the objective returns noisy observations.
          The noise of each observation is assumed to be iid with
          mean zero and a fixed variance.
        - If the variance is known before-hand, this can be set directly
          to the variance of the noise.
        - Set this to a value close to zero (1e-10) if the function is
          noise-free. Setting to zero might cause stability issues.

    * `n_jobs` [int, default=1]
        Number of cores to run in parallel while running the lbfgs
        optimizations over the acquisition function. Valid only
        when `acq_optimizer` is set to "lbfgs."
        Defaults to 1 core. If `n_jobs=-1`, then number of jobs is set
        to number of cores.

    Returns
    -------
    * `res` [`OptimizeResult`, scipy object]:
        The optimization result returned as an OptimizeResult object.
        Important attributes are:

        - `x` [list]: location of the minimum.
        - `fun` [float]: function value at the minimum.
        - `models`: surrogate models used for each iteration.
        - `x_iters` [list of lists]: location of function evaluation for each
           iteration.
        - `func_vals` [array]: function value for each iteration.
        - `space` [Space]: the optimization space.
        - `specs` [dict]: the call specifications.
        - `rng` [RandomState instance]: State of the random state
           at the end of minimization.

        For more details related to the OptimizeResult object, refer to
        http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html
    """
    # Check params
    rng = check_random_state(random_state)

    dim_types = [check_dimension(d) for d in dimensions]
    # `dim_types` already holds Dimension instances, so check them directly.
    is_cat = all([isinstance(d, Categorical) for d in dim_types])
    if is_cat:
        transformed_dims = [check_dimension(d, transform="identity")
                            for d in dimensions]
    else:
        transformed_dims = []
        for dim_type, dim in zip(dim_types, dimensions):
            if isinstance(dim_type, Categorical):
                transformed_dims.append(
                    check_dimension(dim, transform="onehot")
                    )
            # To make sure that GP operates in the [0, 1] space
            else:
                transformed_dims.append(
                    check_dimension(dim, transform="normalize")
                    )

    space = Space(transformed_dims)
    # Default GP
    if base_estimator is None:
        cov_amplitude = ConstantKernel(1.0, (0.01, 1000.0))

        if is_cat:
            other_kernel = HammingKernel(
                length_scale=np.ones(space.transformed_n_dims))
            acq_optimizer = "sampling"
        else:
            other_kernel = Matern(
                length_scale=np.ones(space.transformed_n_dims),
                length_scale_bounds=[(0.01, 100)] * space.transformed_n_dims,
                nu=2.5)

        # Construct the default GP only when no estimator was supplied;
        # `cov_amplitude` and `other_kernel` are defined just above.
        base_estimator = GaussianProcessRegressor(
            kernel=cov_amplitude * other_kernel,
            normalize_y=True, random_state=rng, alpha=0.0, noise=noise,
            n_restarts_optimizer=2)

    return base_minimize(
        func, dimensions, base_estimator=base_estimator,
        acq_func=acq_func,
        xi=xi, kappa=kappa, acq_optimizer=acq_optimizer, n_calls=n_calls,
        n_points=n_points, n_random_starts=n_random_starts,
        n_restarts_optimizer=n_restarts_optimizer,
        x0=x0, y0=y0, random_state=random_state, verbose=verbose,
        callback=callback, n_jobs=n_jobs)

def load(filename, **kwargs)

Reconstruct a skopt optimization result from a file persisted with skopt.dump.

Notice that the loaded optimization result can be missing the objective function (.specs['args']['func']) if dump was called with store_objective=False.

Parameters

  • filename [string or pathlib.Path]: The path of the file from which to load the optimization result.

  • **kwargs [other keyword arguments]: All other keyword arguments will be passed to joblib.load.

Returns

  • res [OptimizeResult, scipy object]: Reconstructed OptimizeResult instance.
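
A round-trip sketch pairing skopt.dump and skopt.load (the file name and objective are illustrative):

from skopt import dummy_minimize, dump, load

res = dummy_minimize(lambda x: x[0] ** 2, [(-1.0, 1.0)], n_calls=10)

# The lambda objective is not picklable, so it is not stored.
dump(res, "checkpoint.z", store_objective=False)
res_loaded = load("checkpoint.z")
print(res_loaded.x, res_loaded.fun)
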
def load(filename, **kwargs):
    """
    Reconstruct a skopt optimization result from a file
    persisted with skopt.dump.

    Notice that the loaded optimization result can be missing
    the objective function (`.specs['args']['func']`) if `skopt.dump`
    was called with `store_objective=False`.

    Parameters
    ----------
    * `filename` [string or `pathlib.Path`]:
        The path of the file from which to load the optimization result.

    * `**kwargs` [other keyword arguments]:
        All other keyword arguments will be passed to `joblib.load`.

    Returns
    -------
    * `res` [`OptimizeResult`, scipy object]:
        Reconstructed OptimizeResult instance.
    """
    return load_(filename, **kwargs)

Classes

class Optimizer

Run bayesian optimisation loop.

An Optimizer represents the steps of a bayesian optimisation loop. To use it you need to provide your own loop mechanism. The various optimisers provided by skopt use this class under the hood.

Use this class directly if you want to control the iterations of your bayesian optimisation loop.

Parameters

  • dimensions [list, shape=(n_dims,)]: List of search space dimensions. Each search dimension can be defined either as

    • a (lower_bound, upper_bound) tuple (for Real or Integer dimensions),
    • a (lower_bound, upper_bound, "prior") tuple (for Real dimensions),
    • a list of categories (for Categorical dimensions), or
    • an instance of a Dimension object (Real, Integer or Categorical).
  • base_estimator [sklearn regressor]: Should inherit from sklearn.base.RegressorMixin. In addition, the predict method should have an optional return_std argument, which returns std(Y | x) along with E[Y | x].

  • n_random_starts [int, default=10]: Number of evaluations of func with random initialization points before approximating the func with base_estimator. While random points are being suggested no model will be fit to the observations.

  • acq_func [string, default="EI"]: Function to minimize over the posterior distribution. Can be either

    • "LCB" for lower confidence bound.
    • "EI" for negative expected improvement.
    • "PI" for negative probability of improvement.
    • "gp_hedge" Probabilistically choose one of the above three acquisition functions at every iteration.
      • The gains g_i are initialized to zero.
      • At every iteration,
        • Each acquisition function is optimised independently to propose a candidate point X_i.
        • Out of all these candidate points, the next point X_best is chosen by $softmax(\eta g_i)$
        • After fitting the surrogate model with (X_best, y_best), the gains are updated such that $g_i -= \mu(X_i)$
  • acq_optimizer [string, "sampling" or "lbfgs", default="lbfgs"]: Method to minimize the acquisition function. The fit model is updated with the optimal value obtained by optimizing acq_func with acq_optimizer.

    • If set to "sampling", then acq_func is optimized by computing acq_func at n_points randomly sampled points.
    • If set to "lbfgs", then acq_func is optimized by
      • Sampling n_restarts_optimizer points randomly.
      • "lbfgs" is run for 20 iterations with these points as initial points to find local minima.
      • The optimal of these local minima is used to update the prior.
  • random_state [int, RandomState instance, or None (default)]: Set random state to something other than None for reproducible results.

  • acq_func_kwargs [dict]: Additional arguments to be passed to the acquisition function.

  • acq_optimizer_kwargs [dict]: Additional arguments to be passed to the acquisition optimizer.
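
As a small numeric illustration of the gp_hedge selection rule referenced above (the gains are made up, eta = 1), this mirrors the softmax step performed in tell() below:

    import numpy as np

    gains = np.array([0.0, -0.4, 0.1])    # hypothetical gains g_i for EI, LCB, PI
    eta = 1.0
    logits = gains - gains.max()          # numerically stable softmax
    exp_logits = np.exp(eta * logits)
    probs = exp_logits / exp_logits.sum()
    chosen = np.random.multinomial(1, probs).argmax()   # index of acquisition to use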

Attributes

  • Xi [list]: Points at which objective has been evaluated.
  • yi [list]: Values of objective at corresponding points in Xi.
  • models [list]: Regression models used to fit observations and compute acquisition function.
  • space An instance of skopt.space.Space. Stores parameter search space used to sample points, bounds, and type of parameters.
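
A minimal ask/tell sketch (the estimator choice and the objective below are assumptions for illustration; any regressor exposing return_std in predict would do):

    from skopt import Optimizer
    from skopt.learning import ExtraTreesRegressor

    def objective(x):
        # Hypothetical objective.
        return (x[0] - 0.5) ** 2

    opt = Optimizer(dimensions=[(-2.0, 2.0)],
                    base_estimator=ExtraTreesRegressor(),
                    acq_optimizer="sampling",   # tree models have no analytic gradient
                    n_random_starts=5,
                    random_state=3)

    for _ in range(15):
        x = opt.ask()             # next point to evaluate
        y = objective(x)
        res = opt.tell(x, y)      # record the observation, refit the model

    print(res.x, res.fun)
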
class Optimizer(object):
    """Run bayesian optimisation loop.

    An `Optimizer` represents the steps of a bayesian optimisation loop. To
    use it you need to provide your own loop mechanism. The various
    optimisers provided by `skopt` use this class under the hood.

    Use this class directly if you want to control the iterations of your
    bayesian optimisation loop.

    Parameters
    ----------
    * `dimensions` [list, shape=(n_dims,)]:
        List of search space dimensions.
        Each search dimension can be defined either as

        - a `(upper_bound, lower_bound)` tuple (for `Real` or `Integer`
          dimensions),
        - a `(upper_bound, lower_bound, "prior")` tuple (for `Real`
          dimensions),
        - as a list of categories (for `Categorical` dimensions), or
        - an instance of a `Dimension` object (`Real`, `Integer` or
          `Categorical`).

    * `base_estimator` [sklearn regressor]:
        Should inherit from `sklearn.base.RegressorMixin`.
        In addition, the `predict` method should have an optional `return_std`
        argument, which returns `std(Y | x)` along with `E[Y | x]`.

    * `n_random_starts` [int, default=10]:
        Number of evaluations of `func` with random initialization points
        before approximating the `func` with `base_estimator`. While random
        points are being suggested no model will be fit to the observations.

    * `acq_func` [string, default=`"EI"`]:
        Function to minimize over the posterior distribution. Can be either

        - `"LCB"` for lower confidence bound.
        - `"EI"` for negative expected improvement.
        - `"PI"` for negative probability of improvement.
        - `"gp_hedge"` Probabilistically choose one of the above three
          acquisition functions at every iteration.
            - The gains `g_i` are initialized to zero.
            - At every iteration,
                - Each acquisition function is optimised independently to
                  propose a candidate point `X_i`.
                - Out of all these candidate points, the next point `X_best` is
                  chosen by $softmax(\eta g_i)$
                - After fitting the surrogate model with `(X_best, y_best)`,
                  the gains are updated such that $g_i -= \mu(X_i)$

    * `acq_optimizer` [string, `"sampling"` or `"lbfgs"`, default=`"lbfgs"`]:
        Method to minimize the acquisition function. The fit model
        is updated with the optimal value obtained by optimizing `acq_func`
        with `acq_optimizer`.

        - If set to `"sampling"`, then `acq_func` is optimized by computing
          `acq_func` at `n_points` randomly sampled points.
        - If set to `"lbfgs"`, then `acq_func` is optimized by
              - Sampling `n_restarts_optimizer` points randomly.
              - `"lbfgs"` is run for 20 iterations with these points as initial
                points to find local minima.
              - The optimal of these local minima is used to update the prior.

    * `random_state` [int, RandomState instance, or None (default)]:
        Set random state to something other than None for reproducible
        results.

    * `acq_func_kwargs` [dict]:
        Additional arguments to be passed to the acquisition function.

    * `acq_optimizer_kwargs` [dict]:
        Additional arguments to be passed to the acquisition optimizer.

    Attributes
    ----------
    * `Xi` [list]:
        Points at which objective has been evaluated.
    * `yi` [list]:
        Values of objective at corresponding points in `Xi`.
    * `models` [list]:
        Regression models used to fit observations and compute acquisition
        function.
    * `space`
        An instance of `skopt.space.Space`. Stores parameter search space used
        to sample points, bounds, and type of parameters.

    """
    def __init__(self, dimensions, base_estimator,
                 n_random_starts=10, acq_func="gp_hedge",
                 acq_optimizer="lbfgs",
                 random_state=None, acq_func_kwargs=None,
                 acq_optimizer_kwargs=None):
        # Arguments that are just stored not checked
        self.acq_func = acq_func
        self.rng = check_random_state(random_state)
        self.acq_func_kwargs = acq_func_kwargs

        if self.acq_func == "gp_hedge":
            self.cand_acq_funcs_ = ["EI", "LCB", "PI"]
            self.gains_ = np.zeros(3)
        else:
            self.cand_acq_funcs_ = [self.acq_func]

        if acq_func_kwargs is None:
            acq_func_kwargs = dict()
        self.eta = acq_func_kwargs.get("eta", 1.0)

        if acq_optimizer_kwargs is None:
            acq_optimizer_kwargs = dict()

        self.n_points = acq_optimizer_kwargs.get("n_points", 10000)
        self.n_restarts_optimizer = acq_optimizer_kwargs.get(
            "n_restarts_optimizer", 5)
        n_jobs = acq_optimizer_kwargs.get("n_jobs", 1)

        self.space = Space(dimensions)
        self.models = []
        self.Xi = []
        self.yi = []

        self._cat_inds = []
        self._non_cat_inds = []
        for ind, dim in enumerate(self.space.dimensions):
            if isinstance(dim, Categorical):
                self._cat_inds.append(ind)
            else:
                self._non_cat_inds.append(ind)
        self._check_arguments(base_estimator, n_random_starts, acq_optimizer)

        self.n_jobs = n_jobs

    def _check_arguments(self, base_estimator, n_random_starts, acq_optimizer):
        """Check arguments for sanity."""
        if not is_regressor(base_estimator):
            raise ValueError(
                "%s has to be a regressor." % base_estimator)
        self.base_estimator = base_estimator

        if n_random_starts < 0:
            raise ValueError(
                "Expected `n_random_starts` >= 0, got %d" % n_random_starts)
        self._n_random_starts = n_random_starts

        if acq_optimizer == "auto":
            warnings.warn("The 'auto' option for the acq_optimizer will be "
                          "removed in 0.4.")
            acq_optimizer = "lbfgs"
        self.acq_optimizer = acq_optimizer
        if self.acq_optimizer not in ["lbfgs", "sampling"]:
            raise ValueError(
                "Expected acq_optimizer to be 'lbfgs' or 'sampling', "
                "got %s" % acq_optimizer)

    def ask(self):
        """Suggest next point at which to evaluate the objective.

        Returns a random point for the first `n_random_starts` calls, after
        that `base_estimator` is used to determine the next point.
        """
        if self._n_random_starts > 0:
            self._n_random_starts -= 1
            # this will not make a copy of `self.rng` and hence keep advancing
            # our random state.
            return self.space.rvs(random_state=self.rng)[0]

        else:
            if not self.models:
                raise RuntimeError("Random evaluations exhausted and no "
                                   "model has been fit.")

            next_x = self._next_x
            min_delta_x = min([self.space.distance(next_x, xi)
                               for xi in self.Xi])
            if abs(min_delta_x) <= 1e-8:
                warnings.warn("The objective has been evaluated "
                              "at this point before.")

            # return point computed from last call to tell()
            return next_x

    def tell(self, x, y, fit=True):
        """Record an observation (or several) of the objective function.

        Provide values of the objective function at points suggested by `ask()`
        or other points. By default a new model will be fit to all
        observations. The new model is used to suggest the next point at
        which to evaluate the objective. This point can be retrieved by calling
        `ask()`.

        To add observations without fitting a new model set `fit` to False.

        To add multiple observations in a batch pass a list-of-lists for `x`
        and a list of scalars for `y`.

        Parameters
        ----------
        * `x` [list or list-of-lists]:
            Point at which objective was evaluated.
        * `y` [scalar or list]:
            Value of objective at `x`.
        * `fit` [bool, default=True]
            Fit a model to observed evaluations of the objective. A model will
            only be fitted after `n_random_starts` points have been queried
            irrespective of the value of `fit`.
        """
        # if y isn't a scalar it means we have been handed a batch of points
        if (isinstance(y, Iterable) and all(isinstance(point, Iterable)
                                            for point in x)):
            if not np.all([p in self.space for p in x]):
                raise ValueError("Not all points are within the bounds of"
                                 " the space.")
            self.Xi.extend(x)
            self.yi.extend(y)

        elif isinstance(x, Iterable) and isinstance(y, Number):
            if x not in self.space:
                raise ValueError("Point (%s) is not within the bounds of"
                                 " the space (%s)."
                                 % (x, self.space.bounds))
            self.Xi.append(x)
            self.yi.append(y)

        else:
            raise ValueError("Type of arguments `x` (%s) and `y` (%s) "
                             "not compatible." % (type(x), type(y)))

        if fit and self._n_random_starts == 0:
            transformed_bounds = np.array(self.space.transformed_bounds)
            est = clone(self.base_estimator)

            with warnings.catch_warnings():
                warnings.simplefilter("ignore")
                est.fit(self.space.transform(self.Xi), self.yi)

            if hasattr(self, "next_xs_") and self.acq_func == "gp_hedge":
                self.gains_ -= est.predict(np.vstack(self.next_xs_))
            self.models.append(est)

            X = self.space.transform(self.space.rvs(
                n_samples=self.n_points, random_state=self.rng))
            self.next_xs_ = []
            for cand_acq_func in self.cand_acq_funcs_:
                values = _gaussian_acquisition(
                    X=X, model=est, y_opt=np.min(self.yi),
                    acq_func=cand_acq_func,
                    acq_func_kwargs=self.acq_func_kwargs)
                # Find the minimum of the acquisition function by randomly
                # sampling points from the space
                if self.acq_optimizer == "sampling":
                    next_x = X[np.argmin(values)]

                # Use BFGS to find the minimum of the acquisition function, the
                # minimization starts from `n_restarts_optimizer` different
                # points and the best minimum is used
                elif self.acq_optimizer == "lbfgs":
                    x0 = X[np.argsort(values)[:self.n_restarts_optimizer]]

                    with warnings.catch_warnings():
                        warnings.simplefilter("ignore")
                        results = Parallel(n_jobs=self.n_jobs)(
                            delayed(fmin_l_bfgs_b)(
                                gaussian_acquisition_1D, x,
                                args=(est, np.min(self.yi), cand_acq_func,
                                      self.acq_func_kwargs),
                                bounds=self.space.transformed_bounds,
                                approx_grad=False,
                                maxiter=20)
                            for x in x0)

                    cand_xs = np.array([r[0] for r in results])
                    cand_acqs = np.array([r[1] for r in results])
                    next_x = cand_xs[np.argmin(cand_acqs)]

                # lbfgs should handle this but just in case there are
                # precision errors.
                if not self.space.is_categorical:
                    next_x = np.clip(
                        next_x, transformed_bounds[:, 0],
                        transformed_bounds[:, 1])
                self.next_xs_.append(next_x)

            if self.acq_func == "gp_hedge":
                logits = np.array(self.gains_)
                logits -= np.max(logits)
                exp_logits = np.exp(self.eta * logits)
                probs = exp_logits / np.sum(exp_logits)
                next_x = self.next_xs_[np.argmax(np.random.multinomial(1,
                                                                       probs))]
            else:
                next_x = self.next_xs_[0]

            # note the need for [0] at the end
            self._next_x = self.space.inverse_transform(
                next_x.reshape((1, -1)))[0]

        # Pack results
        return create_result(self.Xi, self.yi, self.space, self.rng,
                             models=self.models)

    def run(self, func, n_iter=1):
        """Execute ask() + tell() `n_iter` times"""
        for _ in range(n_iter):
            x = self.ask()
            self.tell(x, func(x))

        return create_result(self.Xi, self.yi, self.space, self.rng,
                             models=self.models)

Ancestors (in MRO)

Methods

def __init__(

self, dimensions, base_estimator, n_random_starts=10, acq_func='gp_hedge', acq_optimizer='lbfgs', random_state=None, acq_func_kwargs=None, acq_optimizer_kwargs=None)

Initialize self. See help(type(self)) for accurate signature.

def __init__(self, dimensions, base_estimator,
             n_random_starts=10, acq_func="gp_hedge",
             acq_optimizer="lbfgs",
             random_state=None, acq_func_kwargs=None,
             acq_optimizer_kwargs=None):
    # Arguments that are just stored not checked
    self.acq_func = acq_func
    self.rng = check_random_state(random_state)
    self.acq_func_kwargs = acq_func_kwargs
    if self.acq_func == "gp_hedge":
        self.cand_acq_funcs_ = ["EI", "LCB", "PI"]
        self.gains_ = np.zeros(3)
    else:
        self.cand_acq_funcs_ = [self.acq_func]
    if acq_func_kwargs is None:
        acq_func_kwargs = dict()
    self.eta = acq_func_kwargs.get("eta", 1.0)
    if acq_optimizer_kwargs is None:
        acq_optimizer_kwargs = dict()
    self.n_points = acq_optimizer_kwargs.get("n_points", 10000)
    self.n_restarts_optimizer = acq_optimizer_kwargs.get(
        "n_restarts_optimizer", 5)
    n_jobs = acq_optimizer_kwargs.get("n_jobs", 1)
    self.space = Space(dimensions)
    self.models = []
    self.Xi = []
    self.yi = []
    self._cat_inds = []
    self._non_cat_inds = []
    for ind, dim in enumerate(self.space.dimensions):
        if isinstance(dim, Categorical):
            self._cat_inds.append(ind)
        else:
            self._non_cat_inds.append(ind)
    self._check_arguments(base_estimator, n_random_starts, acq_optimizer)
    self.n_jobs = n_jobs

def ask(

self)

Suggest next point at which to evaluate the objective.

Returns a random point for the first n_random_starts calls, after that base_estimator is used to determine the next point.

def ask(self):
    """Suggest next point at which to evaluate the objective.
    Returns a random point for the first `n_random_starts` calls, after
    that `base_estimator` is used to determine the next point.
    """
    if self._n_random_starts > 0:
        self._n_random_starts -= 1
        # this will not make a copy of `self.rng` and hence keep advancing
        # our random state.
        return self.space.rvs(random_state=self.rng)[0]
    else:
        if not self.models:
            raise RuntimeError("Random evaluations exhausted and no "
                               "model has been fit.")
        next_x = self._next_x
        min_delta_x = min([self.space.distance(next_x, xi)
                           for xi in self.Xi])
        if abs(min_delta_x) <= 1e-8:
            warnings.warn("The objective has been evaluated "
                          "at this point before.")
        # return point computed from last call to tell()
        return next_x

def run(

self, func, n_iter=1)

Execute ask() + tell() n_iter times

def run(self, func, n_iter=1):
    """Execute ask() + tell() `n_iter` times"""
    for _ in range(n_iter):
        x = self.ask()
        self.tell(x, func(x))
    return create_result(self.Xi, self.yi, self.space, self.rng,
                         models=self.models)
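
For example, with opt and objective as in the earlier ask/tell sketch (both illustrative):

    res = opt.run(objective, n_iter=10)    # ten ask()/tell() rounds
    print(res.x, res.fun)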

def tell(

self, x, y, fit=True)

Record an observation (or several) of the objective function.

Provide values of the objective function at points suggested by ask() or other points. By default a new model will be fit to all observations. The new model is used to suggest the next point at which to evaluate the objective. This point can be retrieved by calling ask().

To add observations without fitting a new model set fit to False.

To add multiple observations in a batch pass a list-of-lists for x and a list of scalars for y.

Parameters

  • x [list or list-of-lists]: Point at which objective was evaluated.
  • y [scalar or list]: Value of objective at x.
  • fit [bool, default=True]: Fit a model to observed evaluations of the objective. A model will only be fitted after n_random_starts points have been queried irrespective of the value of fit.
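
A minimal batch-update sketch, reusing opt and objective from the earlier ask/tell example (both illustrative):

    xs = opt.space.rvs(n_samples=4)        # any points inside the search space
    ys = [objective(x) for x in xs]
    opt.tell(xs, ys)                       # list-of-lists plus a list of scalars
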
def tell(self, x, y, fit=True):
    """Record an observation (or several) of the objective function.
    Provide values of the objective function at points suggested by `ask()`
    or other points. By default a new model will be fit to all
    observations. The new model is used to suggest the next point at
    which to evaluate the objective. This point can be retrieved by calling
    `ask()`.
    To add observations without fitting a new model set `fit` to False.
    To add multiple observations in a batch pass a list-of-lists for `x`
    and a list of scalars for `y`.
    Parameters
    ----------
    * `x` [list or list-of-lists]:
        Point at which objective was evaluated.
    * `y` [scalar or list]:
        Value of objective at `x`.
    * `fit` [bool, default=True]
        Fit a model to observed evaluations of the objective. A model will
        only be fitted after `n_random_starts` points have been queried
        irrespective of the value of `fit`.
    """
    # if y isn't a scalar it means we have been handed a batch of points
    if (isinstance(y, Iterable) and all(isinstance(point, Iterable)
                                        for point in x)):
        if not np.all([p in self.space for p in x]):
            raise ValueError("Not all points are within the bounds of"
                             " the space.")
        self.Xi.extend(x)
        self.yi.extend(y)
    elif isinstance(x, Iterable) and isinstance(y, Number):
        if x not in self.space:
            raise ValueError("Point (%s) is not within the bounds of"
                             " the space (%s)."
                             % (x, self.space.bounds))
        self.Xi.append(x)
        self.yi.append(y)
    else:
        raise ValueError("Type of arguments `x` (%s) and `y` (%s) "
                         "not compatible." % (type(x), type(y)))
    if fit and self._n_random_starts == 0:
        transformed_bounds = np.array(self.space.transformed_bounds)
        est = clone(self.base_estimator)
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            est.fit(self.space.transform(self.Xi), self.yi)
        if hasattr(self, "next_xs_") and self.acq_func == "gp_hedge":
            self.gains_ -= est.predict(np.vstack(self.next_xs_))
        self.models.append(est)
        X = self.space.transform(self.space.rvs(
            n_samples=self.n_points, random_state=self.rng))
        self.next_xs_ = []
        for cand_acq_func in self.cand_acq_funcs_:
            values = _gaussian_acquisition(
                X=X, model=est, y_opt=np.min(self.yi),
                acq_func=cand_acq_func,
                acq_func_kwargs=self.acq_func_kwargs)
            # Find the minimum of the acquisition function by randomly
            # sampling points from the space
            if self.acq_optimizer == "sampling":
                next_x = X[np.argmin(values)]
            # Use BFGS to find the minimum of the acquisition function, the
            # minimization starts from `n_restarts_optimizer` different
            # points and the best minimum is used
            elif self.acq_optimizer == "lbfgs":
                x0 = X[np.argsort(values)[:self.n_restarts_optimizer]]
                with warnings.catch_warnings():
                    warnings.simplefilter("ignore")
                    results = Parallel(n_jobs=self.n_jobs)(
                        delayed(fmin_l_bfgs_b)(
                            gaussian_acquisition_1D, x,
                            args=(est, np.min(self.yi), cand_acq_func,
                                  self.acq_func_kwargs),
                            bounds=self.space.transformed_bounds,
                            approx_grad=False,
                            maxiter=20)
                        for x in x0)
                cand_xs = np.array([r[0] for r in results])
                cand_acqs = np.array([r[1] for r in results])
                next_x = cand_xs[np.argmin(cand_acqs)]
            # lbfgs should handle this but just in case there are
            # precision errors.
            if not self.space.is_categorical:
                next_x = np.clip(
                    next_x, transformed_bounds[:, 0],
                    transformed_bounds[:, 1])
            self.next_xs_.append(next_x)
        if self.acq_func == "gp_hedge":
            logits = np.array(self.gains_)
            logits -= np.max(logits)
            exp_logits = np.exp(self.eta * logits)
            probs = exp_logits / np.sum(exp_logits)
            next_x = self.next_xs_[np.argmax(np.random.multinomial(1,
                                                                   probs))]
        else:
            next_x = self.next_xs_[0]
        # note the need for [0] at the end
        self._next_x = self.space.inverse_transform(
            next_x.reshape((1, -1)))[0]
    # Pack results
    return create_result(self.Xi, self.yi, self.space, self.rng,
                         models=self.models)

Instance variables

var Xi

var acq_func

var acq_func_kwargs

var eta

var models

var n_jobs

var n_points

var n_restarts_optimizer

var rng

var space

var yi