HyperParameter Tuning — Hyperopt Bayesian Optimization for (Xgboost and Neural network)

Source: Deep Learning on Medium

Hyperparameters: These are settings chosen before training that control how an algorithm learns. Unlike model weights, they are not learned from the data.

Typical hyperparameters for a machine learning model: learning rate, alpha, max depth, column sampling ratio, gamma, and so on.

Typical hyperparameters for a deep learning model: units (number of units), layers (number of layers), dropout ratio, kernel regularizers, activation function, and so on.

Hyperparameter optimization is the selection of the best hyperparameter values for a machine learning / deep learning algorithm. Often, we end up tuning the model manually, retraining it over various ranges of parameters until a good fit is obtained. Automated hyperparameter tuning searches those ranges systematically and returns the best-performing configuration, which makes it good practice when building an ML/DL model.

In this section we discuss one of the most successful hyperparameter tuning tools, HYPEROPT.

Optimization here means finding the minimum of a cost function, which in turn yields better overall performance of the model on both the train set and the test set.

HYPEROPT: It is a powerful Python library for searching for the best parameters within a set of values. It implements three algorithms for minimizing the cost function:

  1. Random Search
  2. TPE (Tree-structured Parzen Estimator)
  3. Adaptive TPE
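To make the idea of minimizing a cost function concrete, here is a minimal sketch of random search written with only the standard library. It is not hyperopt's implementation; the `cost` function and the `random_search` helper are hypothetical names chosen for illustration. Random search samples every candidate independently, whereas TPE uses the results of earlier trials to decide where to sample next.

```python
import random


def cost(x):
    # Toy cost function (an assumption for illustration); its minimum is at x = 2.
    return (x - 2.0) ** 2


def random_search(cost_fn, low, high, n_trials, seed=0):
    # Hypothetical helper: draw candidates uniformly and keep the best one seen.
    rng = random.Random(seed)
    best_x, best_loss = None, float("inf")
    for _ in range(n_trials):
        x = rng.uniform(low, high)  # each sample ignores all previous results
        loss = cost_fn(x)
        if loss < best_loss:
            best_x, best_loss = x, loss
    return best_x, best_loss


best_x, best_loss = random_search(cost, -10.0, 10.0, n_trials=200)
print(best_x, best_loss)
```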

Steps for using hyperopt with a machine learning algorithm (XGBoost):

Step 1: Define the search space, i.e. the required range of values:

Commonly used hyperopt expressions for defining the search space are:

  • hp.choice(label, options): Returns one of the options, which should be a list or tuple.
  • hp.randint(label, upper): Returns a random integer in the range [0, upper).
  • hp.uniform(label, low, high): Returns a value drawn uniformly between low and high.
  • hp.quniform(label, low, high, q): Returns round(uniform(low, high) / q) * q, i.e. a value rounded to a multiple of q. The result is still a float, so cast it with int() where an integer is required.
  • hp.normal(label, mu, sigma): Returns a real value that is normally distributed with mean mu and standard deviation sigma.

Step 2: Define objective function:

Step 3: Run Hyperopt function:

Here, ‘best’ holds the hyperparameter values that gave the lowest loss. ‘trials’ is an object that stores the statistical and diagnostic information for every evaluation, such as the hyperparameters tried and the resulting loss. ‘fmin’ is the optimization function that minimizes the loss; its main inputs are the objective function, the search space, the search algorithm, and the maximum number of evaluations, and the ‘trials’ object can be passed in as well. The algorithm used here is ‘tpe.suggest’; the random search alternative is ‘rand.suggest’.

*Re-train the model using the best parameters obtained from hyperopt, then evaluate it against the test set or use it for prediction.*