Sorry for the delayed response.
I mainly used two sources, both of which are also mentioned in the post:
1. Wikipedia
Gauss–Newton algorithm
https://en.wikipedia.org/wiki/Gauss%E2%80%93Newton_algorithm
Newton's method in optimization
https://en.wikipedia.org/wiki/Newton%27s_method_in_optimization
2. Blog post about Hessian Free Optimization
https://andrew.gibiansky.com/blog/machine-learning/hessian-free-optimization/
Wikipedia covered pretty much all the background information I needed, e.g. the derivation of Gauss-Newton from Newton's method, a concrete example, and some other useful material.
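For anyone who doesn't want to click through, that derivation goes roughly like this. This is only a sketch, and the notation is assumed here rather than taken from the post: r is the vector of residuals, J is its Jacobian, and S is the sum of squared residuals.

```latex
\begin{align}
S(\beta) &= \sum_i r_i(\beta)^2 \\
\nabla S &= 2\, J^{\top} r \\
\nabla^2 S &= 2\, J^{\top} J + 2 \sum_i r_i \, \nabla^2 r_i \\
\text{Newton step:}\quad \Delta &= -\left(\nabla^2 S\right)^{-1} \nabla S \\
\text{Gauss-Newton step:}\quad \Delta &\approx -\left(J^{\top} J\right)^{-1} J^{\top} r
\end{align}
```

The last line comes from dropping the second-order term (the sum over r_i times the Hessian of r_i), which is reasonable when the residuals are small or nearly linear, so no second derivatives are needed.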
The second reference is a good source on the problems with Newton's method (or, more precisely, with computing the Hessian matrix).
For the geometric interpretation of the method I also used some other references, but I can't remember which ones were the most useful.