Proof of the principle theorem [Convex functions]

Proof of the principle theorem

We prove \((i)⇔(ii)\) then \((ii)⇔(iii)\).

\((i)⇒(ii):\) If \(f\) is convex, by definition

\(\hspace{3cm}f(λy+(1-λ)x)≤λf(y)+(1-λ)f(x),∀λ∈[0,1],x,y∈dom(f).\)

After rewriting, we have

\(\hspace{3cm}f(x+λ(y-x))≤f(x)+λ(f(y)-f(x))\)

\(\hspace{3cm}⇒f(y)-f(x)≥((f(x+λ(y-x))-f(x))/λ),∀λ∈(0,1]\)

As \(λ→0\), we get

\(\hspace{4cm}f(y)-f(x)≥∇f^{T}(x)(y-x).\)..............\((1)\)

\((ii)⇒(i):\) Suppose. \((1)\) holds \(∀x,y∈dom(f)\). Take any \( x,y∈dom(f)\) and let

\(\hspace{5cm}z=λx+(1-λ)y.\)

We have

\(\hspace{4cm}f(x)≥f(z)+∇f^{T}(z)(x-z)\)...............\((2)\)

\(\hspace{4cm}f(y)≥f(z)+∇f^{T}(z)(y-z)\)................\((3)\)

Multiplying \((2)\) by \(λ\), \((3)\) by \((1-λ)\) and adding, we get

\(\hspace{1cm}λf(x)+(1-λ)f(y)≥f(z)+∇f^{T}(z)(λx+(1-λ)y-z)\)\(=f(z)=f(λx+(1-λ)y).\)

\((ii)⇔(iii):\) We prove both of these claims first in dimension 1 and then we generalize.

\((ii)⇒(iii):\) (n=1) Let \(x,y∈dom(f),y>x\). We have

\(\hspace{4cm}f(y)≥f(x)+f′(x)(y-x)\)..................\((4)\)

and

\(\hspace{4cm}f(x)≥f(y)+f′(y)(x-y)\)..................\((5)\)

Dividing LHS and RHS by \((y-x)^{2}\) gives

\(\hspace{4cm}((f′(y)-f′(x))/(y-x))≥0,∀x,y,x≠y.\)

As we let \(y→x\), we get

\(\hspace{5cm}f′′(x)≥0,∀x∈dom(f).\)

\((iii)⇒(ii):\) (n=1) Suppose \(f′′(x)≥0,∀x∈dom(f)\). By the mean value version of Taylor's theorem we have

\(\hspace{3cm}f(y)=f(x)+f′(x)(y-x)+(1/2)f′′(z)(y-x)^{2}, for \; some \; z∈[x,y].\)

\(\hspace{3cm}⇒f(y)≥f(x)+f′(x)(y-x).\)

Now to establish \((ii)⇔(iii)\) in general dimension, we recall that convexity is equivalent to convexity along all lines, i.e. \(f:\mathbb{R}^{n}\rightarrow \mathbb{R}\) is convex if \(g(α)=f(x_{0}+αv)\) is convex \(∀x_{0}∈dom(f)\) and \(∀v∈\mathbb{R}^{n}\) and \(∀α \text{ such that } x_{0}+αv∈dom(f)\). Hence, \(f\) is convex if \(∇^{2}f(x)≥0\) for all \(x∈dom(f)\)