pytorch-hessianfree

PyTorch implementation of Hessian-free optimisation.

Implements parts of "Training Deep and Recurrent Networks with Hessian-Free Optimization" by Martens and Sutskever (2012):

  • Preconditioner for CG, includes empirical Fisher diagonal (Section 20.11)
  • Gauss-Newton matrix and Hessian matrix (Section 20.5 & 20.6)
  • Martens' CG stopping criteria (Section 20.4)
  • CG backtracking (Section 20.8.7)
  • Tikhonov damping with Levenberg-Marquardt like heuristic (Section 20.8.1 & 20.8.5)
  • Line search (Section 20.8.5)
  • Different batches for calculating curvature and gradient, with the vector b (in A x = b) supplied as a callable (Section 20.12)
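
To make the pieces above concrete, here is a minimal sketch of how a damped curvature-vector product, preconditioned CG with Martens' stopping criterion, and the Levenberg-Marquardt damping heuristic fit together. It is not this repository's code: the names `make_damped_hvp`, `preconditioned_cg` and `update_damping` are hypothetical, the Hessian is used instead of the Gauss-Newton matrix, the preconditioner is a plain diagonal passed in by the caller, and the constants (`eps=5e-4`, the 1/4 and 3/4 thresholds) are the values commonly quoted from the paper.

```python
import torch


def make_damped_hvp(loss, params, damping):
    """Return (flat gradient, v -> (H + damping * I) v) using double backward."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])

    def hvp(v):
        # Differentiating (grad . v) w.r.t. the parameters gives H v.
        hv = torch.autograd.grad(flat_grad @ v, params, retain_graph=True)
        return torch.cat([h.reshape(-1) for h in hv]) + damping * v

    return flat_grad.detach(), hvp


def preconditioned_cg(matvec, b, precond_diag=None, max_iter=50, eps=5e-4):
    """Approximately solve A x = b, where A is only available through matvec."""
    x = torch.zeros_like(b)
    r = b.clone()  # residual b - A x for the initial guess x = 0
    m_inv = torch.ones_like(b) if precond_diag is None else 1.0 / precond_diag
    y = m_inv * r
    p = y.clone()
    ry = r @ y
    phis = []  # history of the quadratic phi(x) = 0.5 x^T A x - b^T x
    for i in range(1, max_iter + 1):
        Ap = matvec(p)
        pAp = p @ Ap
        if pAp <= 0:  # zero or negative curvature: stop with the current x
            break
        alpha = ry / pAp
        x = x + alpha * p
        r = r - alpha * Ap
        y = m_inv * r
        ry_new = r @ y
        # With r = b - A x, phi(x) simplifies to -0.5 * x^T (b + r).
        phis.append((-0.5 * (x @ (b + r))).item())
        # Martens' test: stop once the relative progress in phi over the
        # last k iterations falls below k * eps.
        k = max(10, int(0.1 * i))
        if i > k and phis[-1] < 0 and (phis[-1] - phis[-1 - k]) / phis[-1] < k * eps:
            break
        p = y + (ry_new / ry) * p
        ry = ry_new
    return x


def update_damping(damping, rho):
    """Levenberg-Marquardt style heuristic; rho = actual / predicted reduction."""
    if rho < 0.25:
        return damping * 1.5
    if rho > 0.75:
        return damping * (2.0 / 3.0)
    return damping


# Toy quadratic loss 0.5 w^T A w - c^T w, so the exact Hessian is A.
A = torch.tensor([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]], dtype=torch.double)
c = torch.tensor([1.0, 2.0, 3.0], dtype=torch.double)
w = torch.zeros(3, dtype=torch.double, requires_grad=True)
loss = 0.5 * (w @ A @ w) - c @ w

grad, hvp = make_damped_hvp(loss, [w], damping=0.1)
# Jacobi preconditioner for the toy problem only; A is not available in practice.
step = preconditioned_cg(hvp, -grad, precond_diag=torch.diagonal(A) + 0.1)
```

Here `step` approximately solves (A + 0.1 I) x = -g, the damped Newton system at `w`. In a full optimiser the step would then go through CG backtracking and a line search before `update_damping` adjusts the damping for the next iteration.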

Still to do:

  • Scale-Sensitive damping (Section 20.8.3)

Not fully tested, use with caution!