Original article was published by Dhanoop Karunakaran on Artificial Intelligence on Medium
Line search and Trust region methods: two optimisation strategies
The objective of the optimisation problem is to find the optimal point. The problem aims to construct the sequences of point, X. Each next point in the sequence can be determined by moving some distance in the direction, d. The update rule of such methods can be formulated as below.
There are two optimisation strategies: line search and trust region.
In line search, we first find the best direction d and decide the step size, α to determine how long we need to go in that direction.
As shown above, Gradient accent and decent approaches are the examples of the line search.
In trust region, we first decide the step size, α. We can construct a region by considering the α as the radius of the circle. We can call this region a trust region. The search for the best point(local minimum or local maximum in that region) for the improvement is bounded in that region. Once we have the best point it determines the direction.
The above picture is an example of how trust-region optimisation works. Let’s deep dive into this method.
The idea here is to approximate objective F(x) with a smaller f(x). It means that we can restrict the search to a trusted region to approximate F well. f is often chosen to be a quadratic approximation of F. The quadratic form of the trust-region optimisation, as shown below:
Where ∇f is the gradient(first-order derivative), H is the Hessian(second-order derivative), and c is the certain point in the trust region.
Pseudocode for the Trust-Region optimisation method
- Initialise x, δ
- Find the best point for improvement(local minimum or local maximum)in the trust region by using the quadratic approximation as shown below.
3. Once we have the best point for improvement, check whether the approximation is good or not. If the approximation is good, then increase the δ. Otherwise, decrease the δ
4. Construct the trust-region based on the new point and δ
5. Repeat 2 to 4 until we reach the optimal point.