The second derivative of a function is a powerful tool to study convexity or concavity. These concepts can be used to derive estimates involving the function which might be difficult to prove by other means. E.g., a concave function always lies below any of its tangents, and above any of its secants. An example is

\(\dfrac{2x}{\pi} \le \sin(x) \le x, \qquad x \in [0,\pi/2]. \)
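
This is Jordan's inequality: the sine is concave on \([0,\pi/2]\), the upper bound is its tangent at 0, and the lower bound is the secant through \((0,0)\) and \((\pi/2,1)\). A quick numerical spot-check of the inequality on \([0,\pi/2]\) (a sketch, not a proof; the lower bound would fail beyond \(\pi/2\)):

```python
import math

# Spot-check 2x/pi <= sin(x) <= x at many sample points of [0, pi/2].
n = 1000
ok = True
for i in range(n + 1):
    x = (math.pi / 2) * i / n
    # small tolerance guards against rounding at the endpoints
    ok = ok and 2 * x / math.pi <= math.sin(x) + 1e-12 <= x + 1e-12
print("inequality holds at all sample points:", ok)
```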

The second derivative is not much of a help for finding extrema, though my students use it all the time. For real functions on intervals we need only compute the values at all critical points (zeros of the derivative) and at the end points of the interval. Since the derivative cannot change sign in between (or it would have an additional zero), we know all monotonicity intervals, all local extrema, and all global extrema just by comparing these values. This works even for unbounded intervals. So there is absolutely no need for the second derivative. E.g., the function

\(f(x) = \dfrac{x^2-1}{x^2+1} \)

has only one critical point x=0, where f(0)=-1, and tends to 1 as |x| tends to infinity, but never reaches that value. So we see immediately

\(f(\mathbf{R}) = [-1,1[ \)
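
This can be checked numerically; a minimal sketch:

```python
# Sketch: f(x) = (x^2 - 1)/(x^2 + 1) has its minimum f(0) = -1 and
# climbs towards 1 as |x| grows, without ever reaching it.
def f(x):
    return (x * x - 1) / (x * x + 1)

values = [f(x) for x in (0.0, 1.0, 10.0, 1000.0)]
print(values)  # starts at -1.0, increases, stays below 1
```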

It is true, however, that every critical point of a convex function is a global minimum, so once we have found one, we are done. For this, we have to show that the second derivative is nonnegative everywhere. This works for functions defined on open convex sets in higher dimensions too. However, it requires proving that the Hessian matrix is positive semidefinite everywhere. Examples are quadratic functions of the type

\(g(x,y) = ax^2+by^2+cxy+dx+ey+f \)

Computing the Hessian matrix at the critical point alone does not really prove anything, unless we know that the function is of a very special type, like the one above.
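
For such a quadratic (reading b as the coefficient of y^2), the Hessian is a constant matrix, so checking it everywhere reduces to checking the coefficients once; a sketch using the 2x2 definiteness criterion:

```python
# For g(x,y) = a x^2 + b y^2 + c x y + (linear terms), the Hessian is the
# constant matrix [[2a, c], [c, 2b]].  By Sylvester's criterion it is
# positive definite iff its top-left entry and its determinant are positive,
# so convexity of the quadratic can be read off from the coefficients.
def hessian_positive_definite(a, b, c):
    return 2 * a > 0 and 4 * a * b - c * c > 0

print(hessian_positive_definite(1, 1, 0))   # x^2 + y^2 is convex: True
print(hessian_positive_definite(1, -1, 0))  # x^2 - y^2 is a saddle: False
```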

Admittedly, on real intervals the extremum of a function *with only one critical point* can be decided using the second derivative at that point: if it is a local minimum, it must be the global minimum. This argument would apply to the function f above.

But in higher dimensions this is no longer true. It is possible to construct a function u on the plane with only one critical point (grad u(x,y)=0) which is an isolated local minimum, but not a global minimum. To construct such a function, define

\(f(x,y) := x^3-x+y^2 \)

\(h(x,y) := (x^2-y^2,\; 2xy) \)

\(g(x,y) := (e^x,\; y) \)

\(u := f \circ h \circ g \)

The function f has a local minimum at (a,0) and a saddle point at (-a,0), with a=1/sqrt(3), and these are its only critical points. Now, g takes the plane to the right half plane, and h is the complex function z^2, which takes the right half plane bijectively onto the plane cut along the non-positive x-axis. The saddle point (-a,0) lies on that cut, so it has no preimage. Hence u has only one critical point, at (log(a)/2, 0), and this is a local minimum, but not a global minimum: f tends to minus infinity along points of the cut plane with large negative x, so u is unbounded below. Here is a plot of u.
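
The construction can be checked numerically; a sketch, using a = 1/sqrt(3), the positive critical point of x^3 - x:

```python
import math

# Numerical sketch of the construction u = f o h o g.
def f(x, y): return x**3 - x + y**2
def h(x, y): return (x * x - y * y, 2 * x * y)   # the complex map z -> z^2
def g(x, y): return (math.exp(x), y)
def u(x, y): return f(*h(*g(x, y)))

a = 1 / math.sqrt(3)          # positive critical point of x^3 - x
x0 = math.log(a) / 2          # preimage of (a, 0) under h o g
local_min = u(x0, 0.0)

# The gradient of u vanishes there (central finite differences) ...
eps = 1e-6
dx = (u(x0 + eps, 0.0) - u(x0 - eps, 0.0)) / (2 * eps)
dy = (u(x0, eps) - u(x0, -eps)) / (2 * eps)
print(abs(dx) < 1e-6, abs(dy) < 1e-6)

# ... yet u takes far smaller values elsewhere, so the minimum is not global:
print(u(0.0, 10.0) < local_min)
```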

Here is a plot of the level lines of u.

For some functions, it might be possible to prove that f(x) tends to infinity as the norm |x| tends to infinity. Then we can compare all critical points and get the global minimum of the function, without needing the Hessian matrix. An example is the modulus |p(z)| of a complex polynomial p, viewed as a function on the plane. This works for all functions defined everywhere. If the function is not defined everywhere, we need to study the extrema on the boundary as well.
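
For instance, |p(z)| grows without bound as |z| grows, so the global minimum of |p| is attained somewhere in a bounded region; a sketch with the hypothetical sample polynomial p(z) = z^2 + 1, whose minimum 0 is attained at z = i:

```python
import cmath

# |p(z)| -> infinity as |z| -> infinity, so the global minimum of |p|
# is attained in a bounded region.  For p(z) = z^2 + 1 it is 0 at z = i.
def p(z):
    return z * z + 1

print(abs(p(1j)))  # 0.0

# On a large circle |z| = 100, |p| is already huge, so no smaller
# values exist outside; the minimum must lie inside:
ring = min(abs(p(100 * cmath.exp(2j * cmath.pi * k / 24))) for k in range(24))
print(ring > abs(p(1j)))  # True
```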