At every iteration a new approximation to the inverse Hessian must be constructed in order to calculate the next step. The update formula must be consistent with Equation 2.18 and could take the form
    H_{k+1} = (I - \rho_k s_k y_k^T) H_k (I - \rho_k y_k s_k^T) + \rho_k s_k s_k^T,   (4.19)

    \rho_k = \frac{1}{y_k^T s_k},   (4.20)

where s_k = x_{k+1} - x_k is the step taken at iteration k and y_k = g_{k+1} - g_k is the corresponding change in the gradient.
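For concreteness, a minimal NumPy sketch of this update might look as follows; the function name and the dense-matrix representation are our own illustrative choices, not part of the implementation used here.

```python
import numpy as np

def bfgs_update(H, s, y):
    """One BFGS update of the inverse-Hessian approximation H.

    s : step just taken,         x_{k+1} - x_k
    y : change in the gradient,  g_{k+1} - g_k

    Returns H_{k+1} as in Eq. (4.19); it satisfies the secant
    condition H_{k+1} y = s.
    """
    rho = 1.0 / np.dot(y, s)            # Eq. (4.20)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)
```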
Since at every iteration the old Hessian approximation is overwritten with a new one, of order $n^2$ storage locations are needed, where $n$ is the number of degrees of freedom. This is an entirely trivial disadvantage relative to the conjugate gradient methods for any modest value of $n$ [31]. However, for large-scale problems it is advantageous to be able to specify the amount of storage BFGS is allowed to use.
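To make the scaling concrete, consider a purely illustrative case (the numbers are assumptions of ours, not taken from this work): for a chain of images with a total of $n = 3000$ degrees of freedom, the dense inverse-Hessian approximation occupies roughly

    n^2 = 9 \times 10^{6} \ \text{storage locations},

whereas keeping only $m = 10$ correction pairs, as in the limited-memory scheme discussed next, requires about $2mn = 6 \times 10^{4}$.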
There is a version of the BFGS algorithm that allows the amount of storage available to the method to be varied and is therefore particularly suitable for large-scale problems [2]. The difference between this limited-memory BFGS (L-BFGS) algorithm and standard BFGS lies in the Hessian matrix update. L-BFGS stores each correction separately, and the Hessian matrix is never formed explicitly; each correction is a pair of vectors of length $n$, so storing $m$ corrections requires about $2mn$ storage locations [1]. The user can specify the number of corrections $m$ that L-BFGS is allowed to store. At every iteration the product of the approximate inverse Hessian with the gradient is computed according to a recursive formula described by Nocedal [1]. For the first $m$ iterations L-BFGS is identical to the BFGS method; after the first $m$ iterations the oldest correction is discarded. Thus, by varying $m$ the cost per iteration can also be controlled, which makes the method very efficient and flexible. In general, because only the most recent corrections are stored, L-BFGS is better able to use additional storage to accelerate convergence. Here we employed a modified version of Nocedal's L-BFGS implementation [3] in which the line search was removed and the maximum step size was limited for each image separately.
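A compact sketch of the recursive formula (Nocedal's two-loop recursion), under the assumption that the $m$ most recent correction pairs are kept in plain Python lists, might look like this; it illustrates the idea only and is not the modified implementation used in this work.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion for the L-BFGS search direction.

    grad           : current gradient
    s_list, y_list : the m most recent step / gradient-difference
                     pairs, oldest first; only these vectors are kept,
                     the inverse Hessian is never formed explicitly.

    Returns -H_k grad, the quasi-Newton descent direction.
    """
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: run through the corrections from newest to oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        alphas.append(alpha)
        q -= alpha * y
    # Scale by an initial guess H_0 = gamma * I (a common choice).
    if s_list:
        gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q
    # Second loop: run back from oldest to newest correction.
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return -r
```

In a driver loop the pairs $(s_k, y_k)$ could be kept in, for example, a `collections.deque(maxlen=m)`, so that once $m$ corrections have accumulated the oldest one is discarded automatically; capping the norm of the resulting step for each image separately would then play the role of the removed line search, in the spirit of the modification described above.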