Cauchy’s Mean Value Theorem and the Stronger Form of l’Hopital’s Rule

Reference: G B Thomas, Calculus and Analytic Geometry, 9th Indian Edition. 

The stronger form of l’Hopital’s rule is as follows:

Suppose that f(x_{0})=g(x_{0})=0 and the functions f and g are both differentiable on an open interval (a,b) that contains the point x_{0}. Suppose also that g^{'} \neq 0 at every point in (a,b) except possibly x_{0}. Then,

\lim_{x \rightarrow x_{0}} \frac{f(x)}{g(x)} = \lim_{x \rightarrow x_{0}}\frac{f^{'}(x)}{g^{'}(x)}…call this I, provided the limit on the right exists.


The proof of the stronger form of l’Hopital’s rule is based on Cauchy’s mean value theorem, a mean value theorem that involves two functions instead of one. We prove Cauchy’s theorem first and then show how it leads to l’Hopital’s rule.

Cauchy’s Mean Value Theorem:

Suppose the functions f and g are continuous on [a,b] and differentiable throughout (a,b), and suppose also that g^{'} \neq 0 throughout (a,b). Then, there exists a number c in (a,b) at which

\frac{f^{'}(c)}{g^{'}(c)} = \frac{f(b)-f(a)}{g(b)-g(a)}…call this II.

(Note this becomes the ordinary mean value theorem when g(x)=x).

Proof of Cauchy’s Mean Value theorem:

We apply the ordinary mean value theorem twice. First, we use it to show that g(b) \neq g(a). For if g(b)=g(a), then the ordinary Mean Value theorem says that

g^{'}(c) = \frac{g(b)-g(a)}{b-a}=0 for some c between a and b. This cannot happen because g^{'}(x) \neq 0 in (a,b).

We next apply the Mean Value Theorem to the function

F(x) = f(x)-f(a) - \frac{f(b)-f(a)}{g(b)-g(a)}(g(x)-g(a))

This function is continuous and differentiable where f and g are, and note that F(b)=F(a)=0. Therefore, by the ordinary mean value theorem, there is a number c between a and b for which F^{'}(c)=0. In terms of f and g, this says

F^{'}(c) = f^{'}(c) - \frac{f(b)-f(a)}{g(b)-g(a)}(g^{'}(c)) = 0

or \frac{f^{'}(c)}{g^{'}(c)} = \frac{f(b)-f(a)}{g(b)-g(a)} which is equation II above.
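To make the theorem concrete, here is a small numerical sketch. The choice f(x)=x^2, g(x)=x^3 on [1,2] is our own illustrative assumption, not taken from the reference. The code locates the point c by bisection on F'(x), the derivative of the auxiliary function used in the proof.

```python
# Illustrative check of Cauchy's mean value theorem.
# The functions f, g and the interval [1, 2] are assumptions chosen for the example.
def f(x): return x ** 2
def g(x): return x ** 3
def df(x): return 2 * x
def dg(x): return 3 * x ** 2

def cauchy_point(a, b, tol=1e-12):
    """Find c in (a, b) with F'(c) = 0, where
    F'(x) = f'(x) - k * g'(x) and k = (f(b) - f(a)) / (g(b) - g(a))."""
    k = (f(b) - f(a)) / (g(b) - g(a))
    dF = lambda x: df(x) - k * dg(x)
    lo, hi = a, b
    # Bisection needs a sign change of F' on [a, b]; this holds for the example.
    assert dF(lo) * dF(hi) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if dF(lo) * dF(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

c = cauchy_point(1.0, 2.0)
print(c, df(c) / dg(c), (f(2.0) - f(1.0)) / (g(2.0) - g(1.0)))
```

For this example the point can also be found by hand: 2c/(3c^2) = 3/7 gives c = 14/9, which lies in (1,2).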

Proof of the stronger form of L’Hopital’s Rule:
We first establish equation I for the case x \rightarrow x_{0}^{+}. The method needs almost no change to apply to the case x \rightarrow x_{0}^{-}, and the combination of these two cases establishes the result.

Suppose that x lies to the right of x_{0}. Then, g^{'}(x) \neq 0 and we can apply Cauchy’s Mean Value theorem to the closed interval from x_{0} to x. This produces a number c between x and x_{0} such that

\frac{f^{'}(c)}{g^{'}(c)} = \frac{f(x)-f(x_{0})}{g(x)-g(x_{0})}

But, f(x_{0})=g(x_{0})=0, so that

\frac{f^{'}(c)}{g^{'}(c)}= \frac{f(x)}{g(x)}

As x approaches x_{0}, c approaches x_{0} as it lies between x and x_{0}. Therefore,

\lim_{x \rightarrow x_{0}^{+}} \frac{f(x)}{g(x)} = \lim_{x \rightarrow x_{0}^{+}}  \frac{f^{'}(c)}{g^{'}(c)} = \lim_{x \rightarrow x_{0}^{+}} \frac{f^{'}(x)}{g^{'}(x)}.

This establishes l’Hopital’s Rule for the case where x approaches x_{0} from the right. The case where x approaches x_{0} from the left is proved by applying Cauchy’s Mean Value Theorem to the closed interval [x,x_{0}] when x < x_{0}.
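As a quick sanity check of the rule, the following sketch compares f/g with f'/g' near x_{0}=0. The pair f(x)=1-\cos x, g(x)=x^{2} is our own example (an assumption, not from the reference); it satisfies f(0)=g(0)=0 and g'(x)=2x \neq 0 for x \neq 0, and both ratios should approach 1/2.

```python
import math

# Example pair (an assumption for illustration): f(0) = g(0) = 0, g'(x) != 0 for x != 0.
def f(x): return 1 - math.cos(x)
def g(x): return x ** 2
def df(x): return math.sin(x)
def dg(x): return 2 * x

# Approach x0 = 0 from the right and tabulate both ratios.
xs = [10.0 ** (-k) for k in range(1, 4)]
ratios = [(x, f(x) / g(x), df(x) / dg(x)) for x in xs]
for x, r, dr in ratios:
    print(f"x={x:g}  f/g={r:.8f}  f'/g'={dr:.8f}")
```

We stop at x = 10^{-3} on purpose: for much smaller x the subtraction 1-\cos x loses floating point accuracy, which is a numerical issue and not a failure of the theorem.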



Nalin Pithwa

VII. Complete Metric Spaces

Reference: Introductory Real Analysis by Kolmogorov and Fomin. Translated by Richard A. Silverman. Dover Publications. 

Available on Amazon India and Amazon USA. This text book can be studied in parallel with Analysis of Walter Rudin.

7.1. Definition and examples:

The reader is presumably already familiar with the notion of completeness of the real line. (One good simple reference for this could be: Calculus and Analytic Geometry by G B Thomas. Alternatively, you can use Advanced Calculus by Buck and Buck.) The real line is, of course, a simple example of a metric space. We now make the natural generalisation of the notion of completeness to the case of an arbitrary metric space.


A sequence \{ x_{n}\} of points in a metric space R with metric \rho is said to satisfy the Cauchy criterion if given any \epsilon >0, there is an integer N_{\epsilon} such that \rho(x_{n}, x_{n^{'}})<\epsilon for all n,n^{'}> N_{\epsilon}.


A sequence \{ x_{n}\} of points in a metric space R is called a Cauchy sequence (or a fundamental sequence) if it satisfies the Cauchy criterion.


THEOREM 1: Every convergent sequence \{ x_{n}\} is fundamental.

Proof 1:

If \{ x_{n} \} converges to a limit x, then, given any \epsilon>0, there is an integer N_{\epsilon} such that

\rho(x_{n}, x) < \frac{\epsilon}{2} for all n > N_{\epsilon}.

But, then

\rho(x_{n}, x_{n^{'}}) \leq \rho(x_{n},x)+\rho(x_{n^{'}},x)<\epsilon

for all n, n^{'} >N_{\epsilon}. QED.
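The Cauchy criterion can be tried out numerically. The sketch below uses the sequence x_{n}=1/n on the real line, which is our own choice of example. Since |1/n - 1/n'| < 1/\min(n,n'), the index N_{\epsilon} = \lceil 1/\epsilon \rceil works, and the code checks this on a finite window of indices (a finite check is only an illustration, not a proof).

```python
import math

def rho(x, y):
    # Usual distance on the real line.
    return abs(x - y)

def x(n):
    # Example sequence (an assumption for illustration): x_n = 1/n, which converges to 0.
    return 1.0 / n

def n_eps(eps):
    # |x_n - x_m| < 1/min(n, m), so N = ceil(1/eps) satisfies the Cauchy criterion.
    return math.ceil(1.0 / eps)

def check(eps, span=200):
    """Verify rho(x_n, x_m) < eps for all n, m in a window of indices above N_eps."""
    N = n_eps(eps)
    idx = range(N + 1, N + 1 + span)
    return all(rho(x(n), x(m)) < eps for n in idx for m in idx)

for eps in (0.1, 0.01, 0.001):
    print(eps, n_eps(eps), check(eps))
```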


A metric space R is said to be complete if every Cauchy sequence in R converges to an element of R. Otherwise, R is said to be incomplete. 

Example 1:

Let R be the “space of isolated points” (discrete metric space) defined as follows: Define \rho(x,y)=0, if x=y; let \rho(x,y)=1, when x \neq y. Then, the Cauchy sequences in R are just the “stationary sequences,” that is, the sequences \{ x_{n}\} all of whose terms are the same starting from some index n. Every such sequence is obviously convergent to an element of R. Hence, R is complete.

Example 2:

The completeness of the real line R is familiar from elementary analysis.

Example 3:

The completeness of the Euclidean n-space \Re^{n} follows from that of \Re^{1}. In fact, let

x^{(p)} = (x_{1}^{(p)}, x_{2}^{(p)}, \ldots, x_{n}^{(p)}) where p = 1, 2, \ldots

be a fundamental sequence of points in \Re^{n}. Then, given \epsilon >0, there exists an N_{\epsilon} such that

\Sigma_{k=1}^{n} (x_{k}^{(p)}-x_{k}^{(q)})^{2} < \epsilon^{2}

for all p, q > N_{\epsilon}. It follows that

|x_{k}^{(p)}-x_{k}^{(q)}|<\epsilon for k=1,2,\ldots, n for all p,q > N_{\epsilon}, that is, each \{x_{k}^{(p)} \} is a fundamental sequence in \Re^{1}.

Let x = (x_{1}, \ldots, x_{n}) where x_{k} = \lim_{p \rightarrow \infty} x_{k}^{(p)}

Then, obviously \lim_{p \rightarrow \infty} x^{(p)} = x.

This proves the completeness of \Re^{n}. The completeness of the spaces R_{0}^{n} and R_{1}^{n} introduced in earlier examples/blogs is proved in almost the same way. (HW: supply the details). QED.

Example 4:

Let \{ x_{n}(t)\} be a Cauchy sequence in the function space C_{[a,b]} introduced earlier. Then, given any \epsilon >0, there is an N_{\epsilon} such that

|x_{n}(t) - x_{n^{'}}(t)|< \epsilon….I

for all n, n^{'} > N_{\epsilon} and all t \in [a,b]. It follows that the sequence \{ x_{n}(t)\} is uniformly convergent. But the limit of a uniformly convergent sequence of continuous functions is itself a continuous function (see Problem 1 following this Section). Taking the limit as n^{'} \rightarrow \infty in I, we find that

|x_{n}(t) - x(t)|\leq \epsilon for all n > N_{\epsilon} and all t \in [a,b], that is, \{ x_{n}(t)\} converges in the metric space C_{[a,b]} to a function x(t) \in C_{[a,b]}. Hence, C_{[a,b]} is a complete metric space.

Example 5:

Next, let x^{(n)} be a sequence in the space l_{2} so that 

x^ {(n)} = (x_{1}^{(n)}, x_{2}^{(n)}, \ldots, x_{k}^{(n)}, \ldots)

\Sigma_{k=1}^{\infty}(x_{k}^{(n)})^{2} < \infty, where n = 1, 2, \ldots

Suppose further that \{ x^{(n)}\} is a Cauchy sequence. Then, given any \epsilon > 0 there is a N_{\epsilon} such that

\rho^{2}(x^{(n)},x^{(n^{'})}) = \Sigma_{k=1}^{\infty}(x_{k}^{(n)}-x_{k}^{(n^{'})})^{2}< \epsilon…let us call this II.

if n, n^{'} > N_{\epsilon}.

It follows that (x_{k}^{(n)}-x_{k}^{(n^{'})})^{2} < \epsilon (for k =1, 2, \ldots)

That is, for every k the sequence \{ x_{k}^{(n)}\} is fundamental and hence, convergent. Let

x_{k} = \lim_{n \rightarrow \infty} x_{k}^{(n)}, x = (x_{1}, x_{2}, \ldots, x_{k}, \ldots)

Then, as we now show, x is itself a point of l_{2} and moreover, \{ x^{(n)}\} converges to x in the l_{2} metric space, so that l_{2} is a complete metric space.

In fact, the Cauchy criterion here implies that \Sigma_{k=1}^{M}(x_{k}^{(n)}-x_{k}^{(n^{'})})^{2}<\epsilon for any fixed M. …let us call this III.

Holding n fixed in III, and taking the limit as n^{'} \rightarrow \infty, we get

\Sigma_{k=1}^{M}(x_{k}^{(n)}-x_{k})^{2} \leq \epsilon….call this IV.

Since IV holds for arbitrary M, we can in turn take the limit of IV as M \rightarrow \infty, obtaining

\Sigma_{k=1}^{\infty}(x_{k}^{(n)}-x_{k})^{2} \leq \epsilon.

But, as we have learnt earlier in this series of blogs, the convergence of the two series \Sigma_{k=1}^{\infty}(x_{k}^{(n)})^{2} and \Sigma_{k=1}^{\infty}(x_{k}^{(n)}-x_{k})^{2} implies that of the series \Sigma_{k=1}^{\infty}x_{k}^{2}.

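This proves that x \in l_{2}. Moreover, since \epsilon is arbitrarily small, the inequality \Sigma_{k=1}^{\infty}(x_{k}^{(n)}-x_{k})^{2} \leq \epsilon (valid for all n > N_{\epsilon}) implies that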

\lim_{n \rightarrow \infty}\rho( x^{(n)} ,x ) = \lim_{n \rightarrow \infty} \sqrt{\Sigma_{k=1}^{\infty} (x_{k}^{(n)}-x_{k})^{2}} = 0

That is, \{ x^{(n)}\} converges to x in the l_{2} metric space, as asserted.


Example 6.

Consider the space C_{[a,b]}^{2}. To recap: consider the set of all functions continuous on the closed interval [a,b] with the distance metric defined by: \rho(x,y) = (\int_{a}^{b}|x(t)-y(t)|^{2}dt)^{\frac{1}{2}}.

It is easy to show that the space C_{[a,b]}^{2} is incomplete. Take [a,b] = [-1,1]. If

\phi_{n}(t) = -1, if -1 \leq t \leq -\frac{1}{n}

\phi_{n}(t) = nt, if -\frac{1}{n} \leq t \leq \frac{1}{n}

\phi_{n}(t) = 1, if \frac{1}{n} \leq t \leq 1

then \{ \phi_{n}(t)\} is a fundamental sequence in C_{[-1,1]}^{2} since

\int_{-1}^{1}(\phi_{n}(t)-\phi_{n^{'}}(t))^{2} dt \leq \frac{2}{\min \{ n,n^{'} \}}

However, \{ \phi_{n}(t)\} cannot converge to a function in C_{[-1,1]}^{2}. In fact, consider the discontinuous function

\psi(t) = -1, when t < 0

\psi(t) = 1, when t \geq 0.

Then, given any function f \in C_{[-1,1]}^{2}, it follows from Schwarz’s inequality (obviously still valid for piecewise continuous functions) that

(\int_{-1}^{1}(f(t)-\psi(t))^{2}dt)^{\frac{1}{2}} \leq (\int_{-1}^{1}(f(t)-\phi_{n}(t))^{2}dt)^{\frac{1}{2}} + (\int_{-1}^{1}(\phi_{n}(t) - \psi(t))^{2}dt)^{\frac{1}{2}}

But the integral on the left is non-zero, by the continuity of f, and moreover, it is clear that

\lim_{n \rightarrow \infty}\int_{-1}^{1}(\phi_{n}(t)-\psi(t))^{2}dt = 0

Therefore, \int_{-1}^{1}(f(t)-\phi_{n}(t))^{2}dt cannot converge to zero as n \rightarrow \infty.
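The behaviour of the ramps \phi_{n} can be observed numerically. The sketch below approximates the squared distance in C_{[-1,1]}^{2} between \phi_{n} and \phi_{2n} by a midpoint rule and compares it with the bound 2/\min \{ n,n^{'} \}. For contrast it also estimates the maximum of |\phi_{n}-\phi_{2n}| on a grid. The step counts and the choice n^{'}=2n are our own assumptions for illustration.

```python
def phi(n, t):
    """Piecewise-linear ramp: -1 on [-1, -1/n], n*t on [-1/n, 1/n], 1 on [1/n, 1]."""
    return max(-1.0, min(1.0, n * t))

def l2_dist_sq(n, m, steps=20000):
    """Midpoint-rule approximation of the integral of (phi_n - phi_m)^2 over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        total += (phi(n, t) - phi(m, t)) ** 2
    return total * h

def sup_dist(n, m, steps=20000):
    """Grid approximation of max |phi_n - phi_m| on [-1, 1]."""
    h = 2.0 / steps
    return max(abs(phi(n, -1.0 + i * h) - phi(m, -1.0 + i * h))
               for i in range(steps + 1))

for n in (5, 50, 500):
    print(n, l2_dist_sq(n, 2 * n), 2.0 / n, sup_dist(n, 2 * n))
```

In this experiment the integral shrinks with n while the maximum difference stays near 1/2, so the same sequence is fundamental in the integral metric but not in the maximum metric of C_{[-1,1]}.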


7.2 The nested sphere theorem.

A sequence of closed spheres S[x_{1}, r_{1}], S[x_{2}, r_{2}], \ldots, S[x_{n}, r_{n}], \ldots

in a metric space R is said to be nested (or decreasing) if

S[x_{1}, r_{1}] \supset S[x_{2}, r_{2}] \supset \ldots \supset S[x_{n}, r_{n}] \supset \ldots

Using this concept, we can prove a simple criterion for the completeness of R:

THEOREM 2: The Nested Sphere Theorem:

A metric space R is complete if and only if every nested sequence \{ S_{n}\} = \{ S[x_{n}, r_{n}]\} of closed spheres in R such that r_{n} \rightarrow 0 as n \rightarrow \infty has a non empty intersection.


Proof of the nested sphere theorem:

Part I: Assume that R is complete and that \{ S_{n}\} = \{ S[x_{n}, r_{n}] \} is any nested sequence of closed spheres in R such that r_{n} \rightarrow 0 as n \rightarrow \infty. Then the sequence \{ x_{n}\} of centres of the spheres is fundamental because

\rho(x_{n}, x_{n^{'}}) \leq r_{n} for n^{'}>n and r_{n} \rightarrow 0 as n \rightarrow \infty. Therefore, \{ x_{n} \} has a limit. Let

x = \lim_{n \rightarrow \infty} x_{n}.

Then, x \in \bigcap_{n=1}^{\infty}S_{n}

In fact, S_{n} contains every point of the sequence \{ x_{n}\} except possibly the points x_{1}, x_{2}, \ldots, x_{n-1} and hence, x is a contact point of every sphere S_{n}. But, S_{n} is closed, and hence, x \in S_{n} for all n.

Conversely, suppose every nested sequence of closed spheres in R with radii converging to zero has a non empty intersection, and let \{ x_{n}\} be any fundamental sequence in R. Then, \{ x_{n}\} has a limit in R. To see this, use the fact that \{x_{n}\} is fundamental to choose a term x_{n_{1}} of the sequence \{ x_{n} \} such that

\rho(x_{n}, x_{n_{1}}) < \frac{1}{2} for all n \geq n_{1}, and let S_{1} be the closed sphere of radius 1 with centre x_{n_{1}}. Then, choose a term x_{n_{2}} of \{ x_{n}\} such that n_{2} > n_{1} and \rho(x_{n}, x_{n_{2}}) < \frac{1}{2^{2}} for all n \geq n_{2}, and let S_{2} be the closed sphere of radius \frac{1}{2} with centre x_{n_{2}}.

Continue this construction indefinitely, that is, once having chosen terms x_{n_{1}}, x_{n_{2}}, \ldots, x_{n_{k}} (where n_{1}<n_{2}< \ldots < n_{k}), choose a term x_{n_{k+1}} such that n_{k+1}>n_{k} and

\rho(x_{n}, x_{n_{k+1}}) < \frac{1}{2^{k+1}} for all n \geq n_{k+1}.

Let S_{k+1} be the closed sphere of radius \frac{1}{2^{k}} with centre x_{n_{k+1}}, and so on. This gives a nested sequence \{ S_{k}\} of closed spheres with radii converging to zero. By hypothesis, these spheres have a non empty intersection, that is, there is a point x in all the spheres. This point is obviously the limit of the subsequence \{ x_{n_{k}}\} of centres. But, if a fundamental sequence contains a subsequence converging to x, then the sequence itself must converge to x (HW quiz). That is,

\lim_{n \rightarrow \infty}x_{n} =x.
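The role of completeness in the theorem can be seen in the space of rational numbers. The sketch below (our own illustration, using exact rational arithmetic) builds nested closed intervals with rational centres and radii by bisecting [1,2] around the solution of x^{2}=2. The radii tend to zero, yet no rational number lies in every interval, because the only candidate is \sqrt{2}.

```python
from fractions import Fraction

def nested_intervals(steps):
    """Bisect [1, 2], always keeping the half on which x^2 - 2 changes sign.
    Returns (centre, radius) pairs describing closed intervals with rational data."""
    lo, hi = Fraction(1), Fraction(2)
    spheres = []
    for _ in range(steps):
        mid = (lo + hi) / 2
        # mid * mid == 2 is impossible for a rational mid, so one branch always applies.
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid
        spheres.append(((lo + hi) / 2, (hi - lo) / 2))
    return spheres

spheres = nested_intervals(20)
centre, radius = spheres[-1]
print(float(centre), float(radius))
```

Each interval is the closed sphere S[centre, radius] in the metric space of rationals. The sequence is nested with radii 1/2^{k+1}, but its intersection in the rationals is empty, which is consistent with the rationals being incomplete.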


7.3 Baire’s theorem:

We know that a subset A of a metric space R is said to be nowhere dense in R if it is dense in no (open) sphere at all, or equivalently, if every sphere S \subset R contains another sphere S^{'} such that S^{'} \bigcap A = \phi. (Quiz: check the equivalence).

This concept plays an important role in the following:

THEOREM 3: Baire’s Theorem:

A complete metric space R cannot be represented as the union of a countable number of nowhere dense sets.

Proof of Theorem 3: Baire’s Theorem:

Suppose the contrary. Let R = \bigcup_{n=1}^{\infty}A_{n} ….call this VI.

where every set A_{n} is nowhere dense in R. Let S_{0} \subset R be a closed sphere of radius 1. Since A_{1} is nowhere dense in S_{0}, being nowhere dense in R, there is a closed sphere S_{1} of radius less than \frac{1}{2} such that S_{1} \subset S_{0} and S_{1} \bigcap A_{1} =\phi. Since A_{2} is nowhere dense in S_{1}, being nowhere dense in S_{0}, there is a closed sphere S_{2} of radius less than \frac{1}{3} such that S_{2} \subset S_{1} and S_{2} \bigcap A_{2} = \phi, and so on. In this way, we get a nested sequence of closed spheres \{ S_{n}\} with radii converging to zero such that

S_{n} \bigcap A_{n} = \phi, where n=1,2,3,\ldots

By the nested sphere theorem, the intersection \bigcap_{n=1}^{\infty}S_{n} contains a point x. By construction, x cannot belong to any of the sets A_{n}, that is,

x \notin \bigcup_{n=1}^{\infty}A_{n}

It follows that R \neq \bigcup_{n=1}^{\infty} A_{n} contrary to VI.

Hence, the representation VI is impossible.


COROLLARY TO Baire’s theorem:

A complete metric space R without isolated points is uncountable.


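Proof: Every single element set \{ x\} is nowhere dense in R, because R has no isolated points. If R were countable, it would be the union of countably many such sets, contradicting Baire’s theorem.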
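The construction in the proof of Baire’s theorem can be imitated on the interval [0,1] with the nowhere dense sets taken to be single points. The sketch below is our own finite illustration (the list of points is an arbitrary assumption): at step n it passes to a closed subinterval one third as long that misses the n-th listed point, so any point of the final interval differs from every listed point.

```python
from fractions import Fraction

def avoid_all(points):
    """Return nested closed intervals (lo, hi) inside [0, 1];
    the n-th interval does not contain points[n]."""
    lo, hi = Fraction(0), Fraction(1)
    chain = []
    for p in points:
        third = (hi - lo) / 3
        left = (lo, lo + third)
        right = (hi - third, hi)
        # The two outer thirds are disjoint, so p lies in at most one of them.
        lo, hi = right if left[0] <= p <= left[1] else left
        chain.append((lo, hi))
    return chain

# A hypothetical finite "enumeration" to avoid.
listed = [Fraction(k, 7) for k in range(8)] + [Fraction(1, 2), Fraction(1, 3)]
chain = avoid_all(listed)
lo, hi = chain[-1]
print(lo, hi)
```

With an infinite enumeration the same steps give a nested sequence of closed intervals with lengths tending to zero, and the nested sphere theorem supplies a point missed by the enumeration.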


7.4 Completion of a metric space:

As we now show, an incomplete metric space can always be enlarged (in an essentially unique way) to give a complete metric space.

DEFINITION 4: Completion of a metric space:

Given a metric space R with closure [R], a complete metric space R^{*} is called a completion of R if R \subset R^{*} and [R]=R^{*}, that is, if R is a subset of R^{*} everywhere dense in R^{*}.

Example 1.

Clearly, R^{*}=R if R is already complete. (Quiz: homework).

Example 2:

The space of all real numbers is the completion of the space of all rational numbers.


THEOREM 4: Every metric space R has a completion. This completion is unique to within an isometric mapping carrying every point x \in R into itself.

Proof of Theorem 4:

(The proof is somewhat lengthy but quite straight forward).

First, we prove the uniqueness, showing that if R^{*} and R^{**} are two completions of R, then there is a one-to-one mapping x^{**} = \phi(x^{*}) of R^{*} onto R^{**} such that \phi(x)=x for all x \in R and

\rho_{1}(x^{*}, y^{*}) = \rho_{2}(x^{**}, y^{**})….call this VII.

(y^{**}=\phi(y^{*})), where \rho_{1} is the distance metric in R^{*} and \rho_{2} the distance metric in R^{**}. The required mapping \phi is constructed as follows: Let x^{*} be an arbitrary point of R^{*}. Then, by the definition of a completion, there is a sequence \{ x_{n}\} of points of R converging to x^{*}. The points of the sequence \{ x_{n}\} also belong to R^{**}, where they form a fundamental sequence (quiz: why?). Therefore, \{ x_{n}\} converges to a point x^{**} \in R^{**} since R^{**} is complete. It is clear that x^{**} is independent of the choice of the sequence \{ x_{n}\} converging to the point x^{*} (homework quiz: why?). If we set \phi(x^{*})=x^{**}, then \phi is the required mapping. In fact, \phi(x)=x for all x \in R, since if x_{n} \rightarrow x \in R, then obviously x = x^{*} \in R^{*}, x^{**}=x. Moreover, suppose x_{n} \rightarrow x^{*}, y_{n} \rightarrow y^{*} in R^{*}, while x_{n} \rightarrow x^{**}, y_{n} \rightarrow y^{**} in R^{**}. Then, if \rho is the distance in R,

\rho_{1}(x^{*},y^{*}) = \lim_{n \rightarrow \infty} \rho_{1}(x_{n}, y_{n}) = \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n})…call this VIII.

While at the same time, \rho_{2}(x^{**}, y^{**})=\lim_{n \rightarrow \infty}\rho_{2}(x_{n}, y_{n})=\lim_{n \rightarrow \infty}\rho(x_{n}, y_{n})….call this VIII-A. But VIII and VIII-A imply VII.

We must now prove the existence of a completion of R. Given an arbitrary metric space R, we say that two Cauchy sequences \{x_{n} \} and \{\overline{x}_{n} \} in R are equivalent and write \{ x_{n}\} \sim  \{ \overline{x}_{n}\}

if \lim_{n \rightarrow \infty} \rho(x_{n}, \overline{x}_{n})=0

As anticipated by the notation and terminology, \sim is reflexive, symmetric and transitive, that is, \sim is an equivalence relation. Therefore, the set of all Cauchy sequences of points in the space R can be partitioned into classes of equivalent sequences. Let these classes be the points of a new space R^{*}. Then, we define the distance between two arbitrary points x^{*}, y^{*} by the formula

\rho_{1}(x^{*}, y^{*}) = \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n}) ….call this IX.

where \{ x_{n}\} is any “representative” of x^{*} (namely, any Cauchy sequence in the class x^{*}) and \{ y_{n}\} is any representative of y^{*}.

The next step is to verify that IX is indeed a distance metric, that is, to check that the limit in IX exists, is independent of the choice of the sequences \{x_{n} \} \in x^{*}, \{ y_{n} \} \in y^{*}, and satisfies the three properties of a distance metric function. Given any \epsilon >0, it follows from the triangle inequality in R (this can be proved with a little effort: homework quiz) that

|\rho(x_{n}, y_{n}) - \rho(x_{n^{'}},y_{n^{'}})| = |\rho(x_{n},y_{n}) -\rho(x_{n^{'}},y_{n}) + \rho(x_{n^{'}},y_{n}) - \rho(x_{n^{'}}, y_{n^{'}})|

\leq |\rho(x_{n},y_{n}) -\rho(x_{n^{'}}, y_{n}) | + |\rho(x_{n^{'}}, y_{n}) - \rho(x_{n^{'}}, y_{n^{'}}) |

\leq \rho(x_{n}, x_{n^{'}}) + \rho(y_{n}, y_{n^{'}}) \leq \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon….call this X

for all sufficiently large n and n^{'}.

Therefore, the sequence of real numbers \{ s_{n}\} = \{ \rho(x_{n},y_{n})\} is fundamental and hence, has a limit. This limit is independent of the choice of \{x_{n} \} \in x^{*}, \{ y_{n}\} \in y^{*}. In fact, suppose that

\{ x_{n}\}, \{ \overline{x}_{n}\} \in x^{*}, \{ y_{n}\}, \{ \overline{y}_{n} \} \in y^{*}.

Then,

|\rho(x_{n}, y_{n}) - \rho(\overline{x}_{n}, \overline{y}_{n})| \leq \rho(x_{n}, \overline{x}_{n}) + \rho(y_{n}, \overline{y}_{n}) by a calculation analogous to X. But,

\lim_{n \rightarrow \infty} \rho(x_{n}, \overline{x}_{n}) = \lim_{n \rightarrow \infty} \rho(y_{n}, \overline{y}_{n})=0

since \{ x_{n} \} \sim \{ \overline{x}_{n}\} and \{ y_{n} \} \sim \{ \overline{y}_{n}\}, and hence,

\lim_{n \rightarrow \infty} \rho(x_{n}, y_{n}) = \lim_{n \rightarrow \infty} \rho(\overline{x}_{n}, \overline{y}_{n}).

As for the three properties of a metric, it is obvious that

\rho_{1}(x^{*}, y^{*}) = \rho_{1}(y^{*}, x^{*}), and the fact that

\rho_{1}(x^{*}, y^{*}) =0 if and only if x^{*}=y^{*} is an immediate consequence of the definition of equivalent Cauchy sequences.

To verify the triangle inequality in R^{*}, we start from the triangle inequality:

\rho(x_{n}, z_{n}) \leq \rho(x_{n}, y_{n}) + \rho(y_{n}, z_{n})

in the original space R, and then take the limit as n \rightarrow \infty, obtaining

\lim_{n \rightarrow \infty} \rho(x_{n}, z_{n}) \leq \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n}) + \lim_{n \rightarrow \infty} \rho(y_{n}, z_{n})

That is, \rho_{1}(x^{*}, z^{*}) \leq \rho_{1}(x^{*}, y^{*}) + \rho_{1}(y^{*}, z^{*})

We now come to the crucial step of showing that R^{*} is a completion of R. Suppose that with every point x \in R, we associate the class x^{*} \in R^{*} of all Cauchy sequences converging to x. Let

x = \lim_{n \rightarrow \infty} x_{n}, y = \lim_{n \rightarrow \infty} y_{n}

Then, clearly \rho(x,y) = \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n})

(the above too can be proven with a slight effort: HW quiz); while, on the other hand,

\rho_{1}(x^{*}, y^{*}) = \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n}) by definition. Therefore,

\rho(x,y) = \rho_{1}(x^{*}, y^{*}) and hence, the mapping of R into R^{*} carrying x into x^{*} is isometric. Accordingly, we need no longer distinguish between the original space R and its image in R^{*}, in particular between the two metrics \rho and \rho_{1}. In other words, R can be regarded as a subset of R^{*}. The theorem will be proved once we succeed in showing that

(i) R is everywhere dense in R^{*}, that is, [R] = R^{*};

(ii) R^{*} is complete.

Towards that end, given any point x^{*} \in R^{*} and any \epsilon >0, choose a representative of x^{*}, namely a Cauchy sequence \{ x_{n}\} in the class x^{*}. Let N be such that \rho(x_{n}, x_{n^{'}}) < \epsilon for all n, n^{'} >N. Then,

\rho(x_{n}, x^{*}) = \lim_{n^{'} \rightarrow \infty} \rho(x_{n}, x_{n^{'}}) \leq \epsilon if n >N, that is, every neighbourhood of the point x^{*} contains a point of R. It follows that [R] = R^{*}.

Finally, to show that R^{*} is complete, we first note that by the very definition of R^{*}, any Cauchy sequence \{ x_{n}\} consisting of points in R converges to some point in R^{*}, namely to the point x^{*} \in R^{*} defined by \{ x_{n}\}. Moreover, since R is dense in R^{*}, given any Cauchy sequence x_{n}^{*} consisting of points in R^{*}, we can find an equivalent sequence \{ x_{n}\} consisting of points in R. In fact, we need only choose x_{n} to be any point of R such that \rho(x_{n}, x_{n}^{*}) < \frac{1}{n}. The resulting sequence \{ x_{n}\} is fundamental, and, as just shown, converges to a point x^{*} \in R^{*}. But, then the sequence x_{n}^{*} also converges to x^{*}.



If R is the space of all rational numbers, then R^{*} is the space of all real numbers, both equipped with the distance \rho(x,y) = |x-y|. In this way, we can “construct the real number system.” However, there still remains the problem of suitably defining sums and products of real numbers and verifying that the usual axioms of arithmetic are satisfied.
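The idea of a real number as a class of equivalent Cauchy sequences of rationals can be explored with exact arithmetic. The sketch below compares two rational sequences that both approach \sqrt{2}: Newton iterates for x^{2}=2 and decimal truncations. Both sequences and the helper names are our own illustrative choices. The distances \rho(x_{n}, \overline{x}_{n}) shrink, as expected of two representatives of the same point of R^{*}.

```python
from fractions import Fraction
from math import isqrt

def newton(n):
    """n-th Newton iterate for x^2 = 2 starting from 1, as an exact rational."""
    x = Fraction(1)
    for _ in range(n):
        x = (x + 2 / x) / 2
    return x

def truncation(n):
    """The square root of 2 truncated to n decimal places, as an exact rational."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

# rho(x_n, xbar_n) = |x_n - xbar_n| for n = 1, ..., 6
gaps = [abs(newton(n) - truncation(n)) for n in range(1, 7)]
for n, gap in enumerate(gaps, start=1):
    print(n, float(gap))
```

A finite table cannot prove that the limit is zero, but it shows the pattern that the definition of equivalent Cauchy sequences formalises.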


Nalin Pithwa.


Problem Set based on VI. Convergence, open and closed sets.

Problem 1.

Give an example of a metric space R and two open spheres S(x,r_{1}), S(y, r_{2}) in R such that S(x, r_{1}) \subset S(y,r_{2}) although r_{1}> r_{2}.

Problem 2:

Prove that every contact point of a set M is either a limit point of M or is an isolated point of M.

Comment. In particular, [M] can only contain points of the following three types:

a) Limit points of M belonging to M.

b) Limit points of M which do not belong to M.

c) Isolated points of M.

Thus, [M] is the union of M and the set of all its limit points.

Problem 3:

Prove that if x_{n}\rightarrow x and y_{n} \rightarrow y as n \rightarrow \infty then \rho(x_{n},y_{n}) \rightarrow \rho(x,y).

Hint : use the following problem: Given a metric space (X,\rho) prove that |\rho(x,z)-\rho(y,u)| \leq |\rho(x,y)|+|\rho(z,u)|

Problem 4:

Let f be a mapping of one metric space X into another metric space Y. Prove that f is continuous at a point x_{0} if and only if the sequence \{ y_{n}\} = \{ f(x_{n})\} converges to y=f(x_{0}) whenever the sequence x_{n} converges to x_{0}.

Problem 5:

Prove that :

(a) the closure of any set M is a closed set.

(b) [M] is the smallest closed set containing M.

Problem 6:

Is the union of infinitely many closed sets necessarily closed? How about the intersection of infinitely many open sets? Give examples.

Problem 7:

Prove directly that the point \frac{1}{4} belongs to the Cantor set F, although it is not the end point of any of the open intervals deleted in constructing F. Hint: The point \frac{1}{4} divides the interval [0,1] in the ratio 1:3 and so on.

Problem 8:

Let F be the Cantor set. Prove that

(a) the points of the first kind form an everywhere dense subset of F.

(b) the numbers of the form t_{1}+t_{2} where t_{1}, t_{2} \in F fill the whole interval [0,2].

Problem 9:

Given a metric space R, let A be a subset of R, and x \in R. Then, the number \rho(A,x) = \inf_{a \in A}\rho(a,x) is called the distance between A and x. Prove that

(a) x \in A implies \rho(A,x)=0 but not conversely

(b) \rho(A,x) is a continuous function of x (for fixed A).

(c) \rho(A,x)=0 if and only if x is a contact point of A.

(d) [A]=A \bigcup M, where M is the set of all points x such that \rho(A,x)=0.

Problem 10:

Let A and B be two subsets of a metric space R. Then, the number \rho(A,B)= \inf_{a \in A, b \in B}\rho(a,b) is called the distance between A and B. Show that \rho(A,B)=0 if A \bigcap B \neq \phi, but not conversely.

Problem 11:

Let M_{K} be the set of all functions f in C_{[a,b]} satisfying a Lipschitz condition, that is, the set of all f such that |f(t_{1})-f(t_{2})| \leq K|t_{1}-t_{2}| for all t_{1}, t_{2} \in [a,b], where K is a fixed positive number. Prove that:

(a) M_{K} is closed and in fact is the closure of the set of all differentiable functions on [a,b] such that |f^{'}(t)| \leq K;

(b) the set M = \bigcup_{K}M_{K} of all functions satisfying a Lipschitz condition for some K is not closed;

(c) The closure of M is the whole space C_{[a,b]}

Problem 12:

An open set G in n-dimensional Euclidean space R^{n} is said to be connected if any two points x,y \in G can be joined by a polygonal line lying entirely in G. (By a polygonal line we mean a curve obtained by joining a finite number of straight line segments end to end.) For example, the open disk x^{2}+y^{2}<1 is connected, but not the union of the two disks x^{2}+y^{2}<1, (x-2)^{2}+y^{2}<1 (even though they share a contact point). An open subset of an open set G is called a component of G if it is connected and is not contained in a larger connected subset of G. Use Zorn’s lemma to prove that every open set G in R^{n} is the union of no more than countably many pairwise disjoint components.

Comment: In the case n=1, that is, the case of the real line, every connected open set is an open interval, possibly one of the infinite intervals (-\infty, \infty), (a, \infty) or (-\infty, b). Thus, theorem 6 (namely: Every open set G on the real line is the union of a finite or countable system of pairwise disjoint open intervals) on the structure of open sets on the line is tantamount to two assertions:

(i) Every open set on the line is the union of a finite or countable number of components.

(ii) Every open connected set on the line is an open interval.

The first assertion holds for open sets in R^{n} (and, in fact, is susceptible to further generalizations), while the second assertion pertains specifically to the real line.


Happy analysis !!

Nalin Pithwa

II Metric Spaces:

Reference: Introductory Real Analysis Kolmogorov and Fomin, translated by Richard A Silverman, Dover Publications. 

Reference: Analysis by Walter Rudin, Third Edition.


Chapter II Metric Spaces:

V: Basic Concepts:

Section 5.1: Definitions and examples:

One of the most important operations in mathematical analysis is the taking of limits. Here what matters is not so much the algebraic nature of the real numbers (that is, the fact that real numbers form a field), but rather the fact that the distance from one point to another on the real line (or in two- or three-dimensional space) is well-defined and has certain properties. Roughly speaking, a metric space is a set equipped with a distance (or, ‘metric’) which has these same properties. More exactly, we have:

Definition 1: 

By a metric space is meant a pair (X, \rho) consisting of a set X and a distance \rho, that is, a single valued, nonnegative, real function \rho(x,y) defined for all x, y \in X which has the following three properties:

  1. \rho(x,y)=0 if and only if x=y
  2. Symmetry: \rho(x,y)=\rho(y,x)
  3. Triangle Inequality: \rho(x,z) \leq \rho(x,y)+\rho(y,z)

We will often refer to the set X as a “space” and its elements x, y, …, as “points.” Metric spaces are usually denoted by a single letter, like R = (X, \rho), or even by the same letter X as used for the underlying space, in cases where there is no possibility of confusion.

Example 1:

Setting \rho(x,y)=0, if x=y and \rho(x,y)=1, when x \neq y, where x and y are elements of an arbitrary set X, we obviously get a metric space, which might be called a “discrete space” or a “space of isolated points.”

Check: does this satisfy all three axioms of a metric space? Clearly, the first axiom holds. The second axiom also holds because \rho(x,y) and \rho(y,x) are both zero when x = y and both 1 when x \neq y.

Now we have to check: \rho(x,z) \leq \rho(x,y) + \rho(y,z). Case I: if x=z, the LHS is zero, while the RHS is zero (when y=x) or 2 (when y \neq x). In both cases, the inequality holds. Case II: if x \neq z, then the LHS is 1. Since y cannot equal both x and z, at least one of \rho(x,y), \rho(y,z) is 1, so the RHS is 1 or 2. So, we get LHS = RHS = 1, or LHS = 1 less than RHS, which is 2.

So, yes indeed this is a well-defined metric function.
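The case analysis above can also be confirmed by brute force on a small finite set. The following sketch is only a finite check on sample points of our own choosing; the helper name is_metric is hypothetical.

```python
from itertools import product

def rho(x, y):
    # Discrete metric: 0 for equal points, 1 otherwise.
    return 0 if x == y else 1

def is_metric(points, dist):
    """Check the three metric axioms (and nonnegativity) on a finite set of points."""
    for x, y in product(points, repeat=2):
        if dist(x, y) < 0:
            return False
        if (dist(x, y) == 0) != (x == y):   # axiom 1
            return False
        if dist(x, y) != dist(y, x):        # axiom 2, symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if dist(x, z) > dist(x, y) + dist(y, z):  # axiom 3, triangle inequality
            return False
    return True

points = ["a", "b", "c", 1, 2.5, (0, 1)]
print(is_metric(points, rho))
```

Note that the elements of the set can be of any kind, which matches the remark that X is an arbitrary set.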

Some remarks: To think further: Suppose we are given the following function: f(x)=1 when x \in \mathscr{Q} and f(x)=0 when x \in \mathscr{Q}^{'}. What can we say about this function with respect to the above discrete space? What are the limit points of such a function? Is such a function continuous (if so, at which points) in this metric space? Is it dense in this metric space?

Example 2:

The set of all real numbers with distance \rho(x,y) = |x-y| is a metric space, which we denote by R^{1}, the one-dimensional real line.

Check: is this a well-defined metric? Clearly, axiom 1 holds because |x-y|=0 exactly when x=y, and axiom 2 is true because \rho(x,y) = |x-y| = |-(x-y)|=|y-x|=\rho(y,x). Now, we need to check: \rho(x,z) \leq \rho(x,y) + \rho(y,z). Here, LHS is |x-z|=|x-y+y-z| = |(x-y) + (y-z)| \leq |x-y| + |y-z| = \rho(x,y) + \rho(y,z) where we have used the triangle inequality for absolute values. So, axiom 3 holds true.

So, this is indeed a well-defined metric.

Example 3:

The set of all ordered n-tuples x=(x_{1}, x_{2}, \ldots, x_{n}) of real numbers x_{1}, x_{2}, \ldots, x_{n} with distance

\rho(x,y)= \sqrt{\Sigma_{k=1}^{n}(x_{k}-y_{k})^{2}} ….call this relation I

is a metric space denoted by R^{n} and called n-dimensional Euclidean space (or, simply Euclidean n-space). The distance (I) obviously satisfies axioms 1 and 2 of definition of a metric. Moreover, it can be seen that (I) satisfies the third axiom also:

In fact, let x=(x_{1}, x_{2}, \ldots, x_{n}), y=(y_{1}, y_{2}, y_{3}, \ldots, y_{n}), z=(z_{1}, z_{2}, \ldots, z_{n}) be three points in R^{n}.

Further, let a_{k}=x_{k}-y_{k} and b_{k}=y_{k}-z_{k} for k=1,2,\ldots, n.

Then, the triangle inequality takes the form:

\sqrt{\Sigma_{k=1}^{n}(x_{k}-z_{k})^{2}} \leq \sqrt{\Sigma_{k=1}^{n}(x_{k}-y_{k})^{2}} + \sqrt{\Sigma_{k=1}^{n}(y_{k}-z_{k})^{2}}….let us call this Relation II.

Or equivalently,

\sqrt{\Sigma_{k=1}^{n}(a_{k}+b_{k})^{2}} \leq \sqrt{\Sigma_{k=1}^{n}a_{k}^{2}} + \sqrt{\Sigma_{k=1}^{n}b_{k}^{2}}….call this as relation II’

It follows from the Cauchy-Schwarz inequality that (\Sigma_{k=1}^{n}a_{k}b_{k})^{2} \leq \Sigma_{k=1}^{n}a_{k}^{2} \times \Sigma_{k=1}^{n}b_{k}^{2} …call this as relation III.

so that we have now,

\Sigma_{k=1}^{n}(a_{k}+b_{k})^{2} = \Sigma_{k=1}^{n}a_{k}^{2} + 2 \Sigma_{k=1}^{n}a_{k}b_{k} + \Sigma_{k=1}^{n}b_{k}^{2} \leq \Sigma_{k=1}^{n}a_{k}^{2}+2\sqrt{\Sigma_{k=1}^{n}a_{k}^{2}\times \Sigma_{k=1}^{n}b_{k}^{2}} + \Sigma_{k=1}^{n}b_{k}^{2} = (\sqrt{\Sigma_{k=1}^{n}a_{k}^{2}} + \sqrt{\Sigma_{k=1}^{n}b_{k}^{2}})^{2}.

Taking square roots, we get II’ and hence, II.


Example 4:

Take the same set of ordered tuples as in preceding example x = (x_{1}, x_{2}, \ldots, x_{n}), but this time define the distance function by \rho_{1}(x,y)= \Sigma_{k=1}^{n}|x_{k}-y_{k}|….call this as relation IV.

It is clear that this is also a well-defined metric function.

Check: Axiom 1 is obvious, and so is axiom 2 because |a-b|=|b-a|. Axiom 3 holds because |x_{k}-z_{k}| \leq |x_{k}-y_{k}|+|y_{k}-z_{k}| for each k, and summing over k gives the triangle inequality. (More generally, |\pm x_{1} \pm x_{2} \ldots \pm x_{n}| \leq |x_{1}|+|x_{2}|+ \ldots + |x_{n}|.)

Example 5:

Take the same set as in the previous two examples, but now let us define the distance between two points x=(x_{1},x_{2}, \ldots, x_{n}) and y=(y_{1}, y_{2}, \ldots, y_{n}) to be \rho_{0}(x,y)=  \max_{1 \leq k \leq n}|x_{k}-y_{k}|….call this V.

This is also a well-defined metric function.

This space, denoted by R_{0}^{n}, is often as useful as the Euclidean space R^{n}.

Remark: The last three examples show that it is sometimes important to use a different notation for a metric space than for the underlying set of points in the space, since the latter can be “metrized” in a variety of different ways.
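To make the remark concrete, here is a small Python sketch that computes the three distances of Examples 3, 4 and 5 for the same pair of points and spot-checks the triangle inequality on random points. The function names (rho_euclid, rho_1, rho_0, triangle_holds) are my own labels, not notation from the text, and a random spot-check is of course only a sanity test, not a proof.

```python
import math
import random

def rho_euclid(x, y):
    """Relation I: square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def rho_1(x, y):
    """Relation IV: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def rho_0(x, y):
    """Relation V: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

# Same two points of the plane, three different distances.
x, y = (0.0, 0.0), (3.0, 4.0)
distances = (rho_euclid(x, y), rho_1(x, y), rho_0(x, y))
print(distances)  # (5.0, 7.0, 4.0)

def triangle_holds(rho, trials=1000, n=3, seed=0):
    """Spot-check rho(x,z) <= rho(x,y) + rho(y,z) on random points of R^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = ([rng.uniform(-10, 10) for _ in range(n)] for _ in range(3))
        if rho(x, z) > rho(x, y) + rho(y, z) + 1e-9:  # small float tolerance
            return False
    return True

print(all(triangle_holds(r) for r in (rho_euclid, rho_1, rho_0)))
```

The same underlying set of points gives three different metric spaces, which is why the notation for the space has to record the metric as well.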

Example 6: 

The set C_{[a,b]} of all continuous functions defined on the closed interval [a,b] with distance \rho(f,g)=\max_{a \leq t \leq b}|f(t)-g(t)|…call this VI. This is a metric space of great importance in analysis.

Let us verify so:
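One way to carry out the verification is sketched below. It assumes the usual three axioms (distance zero exactly for equal points, symmetry, triangle inequality) and uses h for a third function in C_{[a,b]}, a symbol that does not appear in the text. The maximum exists because a continuous function on a closed interval attains its maximum.

```latex
% Axioms 1 and 2: \max_t |f(t)-g(t)| = 0 exactly when f(t) = g(t) for all t,
% and |f(t)-g(t)| = |g(t)-f(t)| for every t.
% Axiom 3: for every t in [a,b],
\begin{aligned}
|f(t)-h(t)| &\leq |f(t)-g(t)| + |g(t)-h(t)| \\
            &\leq \max_{a \leq s \leq b}|f(s)-g(s)| + \max_{a \leq s \leq b}|g(s)-h(s)| \\
            &= \rho(f,g) + \rho(g,h).
\end{aligned}
% The right side does not depend on t, so taking the maximum over t on the left
% gives \rho(f,h) \leq \rho(f,g) + \rho(g,h).
```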




This metric space and the underlying set of points in the space will both be denoted by C_{[a,b]}. Instead of C_{[0,1]} we just write C. A space like C_{[a,b]} is often called a “function space” to emphasize that its elements are functions.

Example 7:

Let l_{2} be the set of all “infinite” sequences: x=(x_{1}, x_{2}, \ldots, x_{k}, \ldots) of real numbers x_{1}, x_{2}, \ldots, x_{k}, \ldots satisfying the convergence condition:

\Sigma_{k=1}^{\infty}x_{k}^{2} < \infty

Note: the infinite sequence with the general term x_{k} can be written as \{ x_{k}\} or simply as x_{1}, x_{2}, \ldots, x_{k}, \ldots (this notation is familiar from calculus). It can also be written in “point notation” as x=(x_{1}, x_{2}, \ldots, x_{k}, \ldots), that is, as an “ordered \infty-tuple” generalizing the notion of an ordered n-tuple. (In writing \{x_{k}\} we have another use of curly brackets, but the context will always prevent any confusion between the sequence \{x_{k}\} and the set whose only element is x_{k}.)

In l_{2}, the distance between points is defined by

\rho(x,y)=\sqrt{\Sigma_{k=1}^{\infty}(x_{k}-y_{k})^{2}}…call this VII.

Clearly, VII makes sense for all x, y \in l_{2} since it follows from the elementary inequality

(x_{k} \pm y_{k})^{2}\leq 2(x_{k}^{2}+y_{k}^{2}) that the convergence of the two series \Sigma_{k=1}^{\infty}x_{k}^{2} and \Sigma_{k=1}^{\infty}y_{k}^{2} also implies the convergence of the series \Sigma_{k=1}^{\infty}(x_{k}-y_{k})^{2}.

At the same time, we find that if the points (x_{1}, x_{2}, \ldots, x_{k}, \ldots) and (y_{1}, y_{2}, \ldots, y_{k}, \ldots) both belong to l_{2}, then so does the point:

(x_{1}+y_{1}, x_{2}+y_{2}, \ldots, x_{k}+y_{k}, \ldots)

(this follows from the same elementary inequality, taken with the plus sign)

The function VII obviously has the first two defining properties of a distance. To verify the triangle inequality, which takes the form:

\sqrt{\Sigma_{k=1}^{\infty}(x_{k}-z_{k})^{2}} \leq \sqrt{\Sigma_{k=1}^{\infty}(x_{k}-y_{k})^{2}} + \sqrt{\Sigma_{k=1}^{\infty}(y_{k}-z_{k})^{2}} ….call this VIII.

for the metric VII, we first note that all three series converge, for the reason just given. Moreover, the inequality:

\sqrt{\Sigma_{k=1}^{n}(x_{k}-z_{k})^{2}} \leq \sqrt{\Sigma_{k=1}^{n}(x_{k}-y_{k})^{2}} + \sqrt{\Sigma_{k=1}^{n}(y_{k}-z_{k})^{2}}…call this IX, holds for all n, (as proved in Example 3 above). Taking the limit as n \rightarrow \infty in IX, we get VIII, thereby satisfying the triangle inequality in l_{2}. Therefore, l_{2} is a metric space.

Example 8:

As in Example 6, consider the set of all functions continuous on the interval [a,b], but now let us define the metric by the formula:

\rho(x,y) = (\int_{a}^{b}[x(t)-y(t)]^{2}dt)^{\frac{1}{2}}….call this X.

instead of VI.

The resulting metric space will be denoted by C_{[a,b]}^{2}. The first two axioms of the metric clearly hold, and the fact that X satisfies the triangle inequality is an immediate consequence of the following Schwarz’s inequality:

(\int_{a}^{b}x(t)y(t)dt)^{2} \leq \int_{a}^{b}x^{2}(t)dt \times \int_{a}^{b}y^{2}(t)dt

(see Problem 3 in the exercises below), by the continuous analogue of the argument given in example 3 above.

Example 9:

Next consider the set of all bounded infinite sequences of real numbers x=(x_{1},x_{2}, \ldots, x_{k}, \ldots) and let

\rho(x,y) = \sup_{k} {|x_{k}-y_{k}|}….call this XII.

This gives a metric space which we denote by m. The fact that XII satisfies axioms 1 and 2 of a metric space is obvious by the definition of a supremum.

Axiom 3 can be verified as follows:
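A sketch of the verification, under the assumption that axiom 3 is the triangle inequality \rho(x,z) \leq \rho(x,y)+\rho(y,z) for three bounded sequences x, y, z (the index j is an extra dummy index of my own choosing):

```latex
% For every index k,
\begin{aligned}
|x_{k}-z_{k}| &\leq |x_{k}-y_{k}| + |y_{k}-z_{k}| \\
              &\leq \sup_{j}|x_{j}-y_{j}| + \sup_{j}|y_{j}-z_{j}| = \rho(x,y) + \rho(y,z).
\end{aligned}
% The right side does not depend on k, so it is an upper bound for every |x_k - z_k|;
% hence \rho(x,z) = \sup_{k}|x_{k}-z_{k}| \leq \rho(x,y) + \rho(y,z).
```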



Example 10:

As in example 3, consider the set of all ordered n-tuples, x=(x_{1}, x_{2}, \ldots, x_{n}), but now let the metric be given by the more general formula as follows:

\rho_{p}(x,y)=(\Sigma_{k=1}^{n}|x_{k}-y_{k}|^{p})^{\frac{1}{p}}….call this XIII.

where p is a fixed real number greater than or equal to 1. (Examples 3 and 4 correspond to the cases p=2 and p=1, respectively.) This gives a metric space, which we denote by R_{p}^{n}.

It is obvious that \rho_{p}(x,y) = 0 if and only if x=y.

It is obvious that \rho_{p}(x,y)=\rho_{p}(y,x).

But verifying the third axiom (that is, the triangle inequality) for XIII requires a little work, as follows:

Let x=(x_{1}, x_{2}, \ldots, x_{n}), y=(y_{1}, y_{2}, \ldots, y_{n}), z=(z_{1}, z_{2}, \ldots, z_{n}) be three points in R_{p}^{n}, and let:

a_{k}=x_{k}-y_{k}, b_{k}=y_{k}-z_{k} for k=1,2,\ldots, n just as in example 3. Then, the triangle inequality

\rho_{p}(x,z) \leq \rho_{p}(x,y)+\rho_{p}(y,z)

takes the form of Minkowski’s inequality:

(\Sigma_{k=1}^{n}|a_{k}+b_{k}|^{p})^{\frac{1}{p}} \leq (\Sigma_{k=1}^{n}|a_{k}|^{p})^{\frac{1}{p}} + (\Sigma_{k=1}^{n}|b_{k}|^{p})^{\frac{1}{p}}

Call this inequality XIV.

PS: I think a proof of the Minkowski inequality can be found in any standard text on inequalities (the one by B. J. Venkatachala, for example) or on Wikipedia.

The above inequality clearly holds for p=1, and hence we may assume p > 1.

The proof of XIV is in turn based on Holder’s inequality: 

\Sigma_{k=1}^{n}|a_{k}b_{k}| \leq (\Sigma_{k=1}^{n}|a_{k}|^{p})^{\frac{1}{p}} (\Sigma_{k=1}^{n}|b_{k}|^{q})^{\frac{1}{q}}

Call the above as inequality XV.

Where the numbers p>1, q >1 satisfy the condition:

\frac{1}{p} + \frac{1}{q} =1….call this as XVI.

We begin by observing that the inequality XV is homogeneous, that is, if it holds for any two points (a_{1}, a_{2}, \ldots, a_{n}) and (b_{1}, b_{2}, \ldots, b_{n}) then it holds for the two points (\lambda a_{1}, \lambda a_{2}, \ldots, \lambda a_{n}) and (\mu b_{1}, \mu b_{2}, \ldots, \mu b_{n}) where \lambda and \mu are any two real numbers. Therefore, we need only prove XV for the following case:

\Sigma_{k=1}^{n}|a_{k}|^{p}=\Sigma_{k=1}^{n}|b_{k}|^{q}=1….call this relation XVII.

Thus, assuming that XVII holds, we now have to prove that: \Sigma_{k=1}^{n}|a_{k}b_{k}| \leq 1…call this XVIII.

Consider the two areas S_{1} and S_{2}, associated with the curve defined in the \xi\eta-plane and given by the equation:

\eta = \xi^{p-1}, or equivalently by the equation:

\xi = \eta^{q-1}

Then, clearly S_{1}=\int_{0}^{a}\xi^{p-1}d\xi = \frac{a^{p}}{p}, and S_{2}=\int_{0}^{b}\eta^{q-1}d\eta = \frac{b^{q}}{q}

Moreover, it is apparent (if we draw the figure suitably) that S_{1}+S_{2} \geq ab for arbitrary positive a and b. It follows that ab \leq \frac{a^{p}}{p} + \frac{b^{q}}{q}…call this relation (19 or XIX).

Setting a = |a_{k}|, b=|b_{k}|, summing over k from 1 to n, and taking account of (16, or XVI) and (17, or XVII), we get the desired inequality (18, or XVIII). This proves Holder’s inequality (15 or XV). Note that (15 or XV) reduces to Schwarz’s inequality if p=2.
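As a numerical sanity check of Holder's inequality (XV), the following Python sketch evaluates both sides for a few random vectors and exponents. The helper name holder_sides and the sample sizes are illustrative choices of mine; the conjugate exponent q is computed from p using condition XVI.

```python
import random

def holder_sides(a, b, p):
    """Return (lhs, rhs) of inequality XV for an exponent p > 1."""
    q = p / (p - 1)  # conjugate exponent, so that 1/p + 1/q = 1 (condition XVI)
    lhs = sum(abs(ak * bk) for ak, bk in zip(a, b))
    rhs = (sum(abs(ak) ** p for ak in a) ** (1 / p)
           * sum(abs(bk) ** q for bk in b) ** (1 / q))
    return lhs, rhs

rng = random.Random(1)
results = []
for p in (1.5, 2.0, 3.0):
    a = [rng.uniform(-5, 5) for _ in range(6)]
    b = [rng.uniform(-5, 5) for _ in range(6)]
    results.append(holder_sides(a, b, p))

for lhs, rhs in results:
    print(f"{lhs:.4f} <= {rhs:.4f}")

# For p = 2 and proportional vectors the two sides coincide (the Schwarz equality case).
print(holder_sides([1, 2], [2, 4], 2.0))
```

With p = 2 the right-hand side is the product of the two Euclidean lengths, which is the Schwarz case mentioned above.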

It is now an easy matter to prove Minkowski’s inequality (14 or XIV), starting from the identity

(|a|+|b|)^{p} = (|a|+|b|)^{p-1}|a|+(|a|+|b|)^{p-1}|b|.

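In fact, putting a=a_{k}, b=b_{k} and summing over k from 1 to n, we obtain

\Sigma_{k=1}^{n}(|a_{k}|+|b_{k}|)^{p} = \Sigma_{k=1}^{n}(|a_{k}|+|b_{k}|)^{p-1}|a_{k}| + \Sigma_{k=1}^{n}(|a_{k}|+|b_{k}|)^{p-1}|b_{k}|.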

Next, we apply Holder’s inequality (15 or XV) to both sums on the right, bearing in mind that (p-1)q=p:

\Sigma_{k=1}^{n}(|a_{k}|+|b_{k}|)^{p} \leq (\Sigma_{k=1}^{n}(|a_{k}|+|b_{k}|)^{p})^{\frac{1}{q}}([\Sigma_{k=1}^{n}|a_{k}|^{p}]^{\frac{1}{p}}+[\Sigma_{k=1}^{n}|b_{k}|^{p}]^{\frac{1}{p}})

Dividing both sides of this inequality by

(\Sigma_{k=1}^{n}(|a_{k}|+|b_{k}|)^{p})^{\frac{1}{q}}

and using 1-\frac{1}{q}=\frac{1}{p}, we get

(\Sigma_{k=1}^{n}(|a_{k}|+|b_{k}|)^{p})^{\frac{1}{p}} \leq (\Sigma_{k=1}^{n}|a_{k}|^{p})^{\frac{1}{p}} + (\Sigma_{k=1}^{n}|b_{k}|^{p})^{\frac{1}{p}}

which, since |a_{k}+b_{k}| \leq |a_{k}|+|b_{k}|, immediately implies (14 or XIV), thereby proving the triangle inequality in R_{p}^{n}.
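The restriction p \geq 1 is essential. The short Python sketch below (helper names p_norm and minkowski_gap are hypothetical, chosen for this illustration) evaluates the difference between the two sides of Minkowski's inequality for one pair of vectors, once with p = 2 and once with p = 1/2, where the inequality breaks down.

```python
def p_norm(v, p):
    """(sum |v_k|^p)^(1/p), the expression appearing in XIII and XIV."""
    return sum(abs(t) ** p for t in v) ** (1 / p)

def minkowski_gap(a, b, p):
    """Right side minus left side of XIV; non-negative whenever XIV holds."""
    s = [ak + bk for ak, bk in zip(a, b)]
    return p_norm(a, p) + p_norm(b, p) - p_norm(s, p)

a, b = [1.0, 0.0], [0.0, 1.0]
gap_p2 = minkowski_gap(a, b, 2)      # 1 + 1 - sqrt(2), which is positive
gap_half = minkowski_gap(a, b, 0.5)  # 1 + 1 - 4 = -2, so XIV fails for p = 1/2
print(gap_p2, gap_half)
```

So for p = 1/2 the same formula no longer satisfies the triangle inequality, and it does not define a metric.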


Example 11:

Finally, let l_{p} be the set of all infinite sequences x = (x_{1}, x_{2}, \ldots, x_{k}, \ldots) of real numbers satisfying the convergence condition

\Sigma_{k=1}^{\infty}|x_{k}|^{p} < \infty for some fixed number p \geq 1, where the distance between points is defined by

\rho(x,y) = (\Sigma_{k=1}^{\infty}|x_{k}-y_{k}|^{p})^{\frac{1}{p}}….call this (20 or XX)

(the case p=2 has already been considered in Example 7). It follows from Minkowski’s inequality (14 or XIV) that

(\Sigma_{k=1}^{n}|x_{k}-y_{k}|^{p})^{\frac{1}{p}} \leq (\Sigma_{k=1}^{n}|x_{k}|^{p})^{\frac{1}{p}} + (\Sigma_{k=1}^{n}|y_{k}|^{p})^{\frac{1}{p}}….call this (21 or XXI) for any n.

Since the series \Sigma_{k=1}^{\infty}|x_{k}|^{p}, and \Sigma_{k=1}^{\infty}|y_{k}|^{p} converge, by hypothesis, we can take the limit as n \rightarrow \infty in (21 or XXI) obtaining

(\Sigma_{k=1}^{\infty}|x_{k}-y_{k}|^{p})^{\frac{1}{p}} \leq (\Sigma_{k=1}^{\infty}|x_{k}|^{p})^{\frac{1}{p}} + (\Sigma_{k=1}^{\infty}|y_{k}|^{p})^{\frac{1}{p}} < \infty.

This shows that (20 or XX) actually makes sense for arbitrary x, y \in l_{p}. At the same time, we have verified that the triangle inequality holds in l_{p} (the other two properties of a metric space are obviously satisfied). Therefore, l_{p} is a metric space.



If R = (X, \rho) is a metric space and M is any subset of X, then obviously R^{*} = (M, \rho) is again a metric space, called a subspace of the original metric space R. This device gives us infinitely more examples of metric spaces.

Example 12:

For x, y \in R^{1}, define:

12a) \rho(x,y) = (x-y)^{2}

12b) \rho(x,y) = \sqrt{|x-y|}

12c) \rho(x,y) = |x^{2}-y^{2}|

12d) \rho(x,y) = |x-2y|

12e) \rho(x,y) = \frac{|x-y|}{1+|x-y|}

Determine for each of these, whether it is a metric or not.

Solution 12a: 

Axioms 1 and 2 are clearly satisfied. We have to verify if the following holds true:

(x-z)^{2} \leq (x-y)^{2} + (y-z)^{2}. Expanding, the RHS is x^{2} + 2y^{2}+ z^{2} -2xy -2yz whereas the LHS is x^{2}+z^{2}-2xz, so the inequality need not hold. For example, x=0, y=1, z=2 gives LHS = 4 but RHS = 2.

Hence, this is not a metric function.

Solution 12b:

This also satisfies the first two axioms of the definition of a metric.

So, we have to verify if the following is true:

\rho(x,z) \leq \rho(x,y) + \rho(y,z), that is, to prove that:

\sqrt{|x-z|} \leq \sqrt{|x-y|} + \sqrt{|y-z|}.

Consider the following:

|x-z|=|x-y+y-z| \leq |x-y|+|y-z|

Also, (\sqrt{|x-y|} + \sqrt{|y-z|})^{2} = |x-y|+|y-z|+2\sqrt{|x-y|\times|y-z|} \geq |x-y|+|y-z|.

Combining the two, |x-z| \leq (\sqrt{|x-y|} + \sqrt{|y-z|})^{2}, and taking square roots gives the required inequality. So the third axiom holds true in this case, and the given function is a metric.

Solution 12c:

Axiom 2 clearly holds, but axiom 1 fails: \rho(x,y)=|x^{2}-y^{2}|=0 whenever y=-x, so, for example, \rho(1,-1)=0 although 1 \neq -1.

The third axiom does hold:

\rho(x,z) \leq \rho(x,y) + \rho(y,z), that is,

|x^{2}-z^{2}| \leq |x^{2}-y^{2}| + |y^{2}-z^{2}|, which is true by the triangle inequality for real numbers. Nevertheless, since axiom 1 fails, the given function is not a metric on R^{1} (it becomes one if we restrict to the non-negative reals).

Solution 12d:

Neither of the first two axioms holds: \rho(x,x)=|x| \neq 0 for x \neq 0, and \rho(1,0)=1 whereas \rho(0,1)=2. The third axiom also fails: for x=4, y=2, z=1 we get \rho(x,z)=2 but \rho(x,y)+\rho(y,z)=0+0=0. So, the given function is not a metric.

Solution 12e:

The first axiom holds true.

To check the second axiom consider and compare:

\rho(x,y) = \frac{|x-y|}{1+|x-y|} whereas \rho(y,x) = \frac{|y-x|}{1+|y-x|}=\frac{|x-y|}{1+|x-y|} = \rho(x,y) clearly again.

To verify the third axiom, we have to check if the following is true:

\rho(x,z) \leq \rho(x,y) + \rho(y,z), that is, to prove that:

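\frac{|x-z|}{1+|x-z|} \leq \frac{|x-y|}{1+|x-y|} + \frac{|y-z|}{1+|y-z|}. This is always true. Write a=|x-y|, b=|y-z|, c=|x-z|, so that c \leq a+b. The function t \mapsto \frac{t}{1+t} = 1-\frac{1}{1+t} is increasing for t \geq 0, hence \frac{c}{1+c} \leq \frac{a+b}{1+a+b} = \frac{a}{1+a+b} + \frac{b}{1+a+b} \leq \frac{a}{1+a} + \frac{b}{1+b}. Hence, the given function is a metric.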
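A brute-force search over a small grid of integers is a quick way to cross-check hand solutions like these. The Python sketch below looks for a violation of each axiom for the five candidate functions; the names candidates and first_violation and the grid -3, ..., 3 are my own choices. A finite search can only find counterexamples: when it finds none, that is not a proof that the function is a metric.

```python
from itertools import product

candidates = {
    "12a": lambda x, y: (x - y) ** 2,
    "12b": lambda x, y: abs(x - y) ** 0.5,
    "12c": lambda x, y: abs(x * x - y * y),
    "12d": lambda x, y: abs(x - 2 * y),
    "12e": lambda x, y: abs(x - y) / (1 + abs(x - y)),
}

def first_violation(rho, points, tol=1e-12):
    """Return a tuple describing the first axiom violation found, or None."""
    for x, y in product(points, repeat=2):
        if (rho(x, y) == 0) != (x == y):       # axiom 1: zero exactly when x = y
            return ("axiom 1", x, y)
        if abs(rho(x, y) - rho(y, x)) > tol:   # axiom 2: symmetry
            return ("axiom 2", x, y)
    for x, y, z in product(points, repeat=3):
        if rho(x, z) > rho(x, y) + rho(y, z) + tol:  # axiom 3: triangle inequality
            return ("axiom 3", x, y, z)
    return None

points = range(-3, 4)
report = {name: first_violation(rho, points) for name, rho in candidates.items()}
for name, found in report.items():
    print(name, found)
```

Each reported tuple names the axiom and the points at which it fails, so it can be verified by hand.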

Some foundation mathematics

Well-Ordering Principle:

Every non-empty set S of non-negative integers contains a least element; that is, there is some integer a in S such that a \leq b for all b’s belonging to S.

Because this principle plays a role in many proofs related to foundations of mathematics, let us use it to show that the set of positive integers has what is known as the Archimedean property.

Archimedean property:

If a and b are any positive integers, then there exists a positive integer n such that na \geq b.


By contradiction:

Assume that the statement of the theorem is not true, so that for some a and b, we have na < b for every positive integer n. Then, the set S = \{ b-na : n \in Z^{+}\} consists entirely of positive integers. By the Well-Ordering Principle, S will possess a least element, say, b-ma. Notice that b-(m+1)a also lies in S, because S contains all integers of this form. Further, we also have b-(m+1)a=(b-ma)-a<b-ma, contrary to the choice of b-ma as the smallest integer in S. This contradiction arose out of the original assumption that the Archimedean property did not hold; hence, the proof. QED.
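For concrete numbers, the integer n promised by the Archimedean property can simply be searched for. The Python sketch below (the function name archimedean_witness is hypothetical) returns the smallest such n; the loop must stop, because n = b already satisfies ba \geq b when a \geq 1.

```python
def archimedean_witness(a: int, b: int) -> int:
    """Smallest positive integer n with n * a >= b, for positive integers a, b."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    n = 1
    while n * a < b:  # terminates: n = b already gives b * a >= b since a >= 1
        n += 1
    return n

print(archimedean_witness(3, 10))  # 4, since 3 * 3 = 9 < 10 <= 4 * 3 = 12
```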

First Principle of Finite Induction:

Let S be a set of positive integers with the following properties:

a) the integer 1 belongs to S.

b) Whenever the integer k is in S, the next integer k+1 is also in S.

Then, S is the set of all positive integers.

Second Principle of Finite Induction:

Let S be a set of positive integers with the following properties:

a) the integer 1 belongs to S.

b) If k is a positive integer such that 1,2,\ldots k belong to S, then (k+1) must also be in S.

Then, S is the set of all positive integers.

So, in lighter vein, we assume a set of positive integers is given just as Kronecker had observed: “God created the natural numbers, all the rest is man-made.”

More later,

Nalin Pithwa.

Mathematics versus other escapes from reality

Of all escapes from reality, Mathematics is the most successful ever. It is a fantasy that becomes all the more addictive because it works back to improve the same reality we are trying to evade. All other escapes — sex, drugs, hobbies, whatever — are ephemeral by comparison. The mathematician’s feeling of triumph, as he/she forces the world to obey the laws his/her imagination has created feeds on its own success. The world is permanently changed by the workings of his/her mind, and the certainty that his/her creations will endure renews his/her confidence as no other pursuit.

Gian-Carlo Rota, an MIT Mathematician.

Interchanging Limit Processes — by Walter Rudin

As I mentioned earlier, my thesis (Trans. AMS 68, 1950, 278-363) deals with uniqueness questions for series of spherical harmonics, also known as Laplace series. In the more familiar setting of trigonometric series, the first theorem of the kind that I was looking for was proved by Georg Cantor in 1870, based on earlier work of Riemann (1854, published in 1867). Using the notations

A_{n}(x)=a_{n} \cos{nx}+b_{n}\sin{nx},

s_{p}(x)=A_{0}+A_{1}(x)+ \ldots + A_{p}(x), where a_{n} and b_{n} are real numbers. Cantor’s theorem says:

If \lim_{p \rightarrow \infty}s_{p}(x)=0 at every real x, then a_{n}=b_{n}=0 for every n.

Therefore, two distinct trigonometric series cannot converge to the same sum. This is what is meant by uniqueness.

My aim was to prove this for spherical harmonics and (as had been done for trigonometric series) to whittle away at the hypothesis. Instead of assuming convergence at every point of the sphere, what sort of summability will do? Does one really need convergence (or summability) at every point? If not, what sort of sets can be omitted? Must anything else be assumed at these omitted points? What sort of side conditions, if any, are relevant?

I came up with reasonable answers to these questions, but basically the whole point seemed to be the justification of the interchange of some limit processes. This left me with an uneasy feeling that there ought to be more to Analysis than that. I wanted to do something with more “structure”. I could not  have explained just what I meant by this, but I found it later when I became aware of the close relation between Fourier analysis and group theory, and also in an occasional encounter with number theory and with geometric aspects of several complex variables.

Why was it all an exercise in interchange of limits? Because the “obvious” proof of Cantor’s theorem goes like this: for p > n,

\pi a_{n}= \int_{-\pi}^{\pi}s_{p}(x)\cos{nx}dx = \lim_{p \rightarrow \infty}\int_{-\pi}^{\pi}s_{p}(x)\cos {nx}dx, which in turn, equals

\int_{-\pi}^{\pi}(\lim_{p \rightarrow \infty}s_{p}(x))\cos{nx}dx= \int_{-\pi}^{\pi}0 \cos{nx}dx=0 and similarly, for b_{n}. Note that \lim \int = \int \lim was used.

In Riemann’s above mentioned paper, he derives the conclusion of Cantor’s theorem under an additional hypothesis, namely, a_{n} \rightarrow 0 and b_{n} \rightarrow 0 as n \rightarrow \infty. He associates to \sum {A_{n}(x)} the twice integrated series


and then finds it necessary to prove, in some detail, that this series converges and that its sum F is continuous! (Weierstrass had not yet invented uniform convergence.) This is astonishingly different from most of his other publications, such as his paper  on hypergeometric functions in which mind-boggling relations and transformations are merely stated, with only a few hints, or  his  painfully brief paper on the zeta-function.

In Crelle’s J. 73, 1870, 130-138, Cantor showed that Riemann’s additional hypothesis was redundant, by proving that

(*) \lim_{n \rightarrow \infty}A_{n}(x)=0 for all x implies \lim_{n \rightarrow \infty}a_{n}= \lim_{n \rightarrow \infty}b_{n}=0.

He included the statement: This cannot be proved, as is commonly believed, by term-by-term integration.

Apparently, it took a while before this was generally understood. Ten years later, in Math. Annalen 16, 1880, 113-114, he patiently explains the difference between pointwise convergence and uniform convergence, in order to refute a “simpler proof” published by Appell. But then, referring to his second (still quite complicated) proof, the one in Math. Annalen 4, 1871, 139-143, he sticks his neck out and writes: “In my opinion, no further simplification can be achieved, given the nature of the subject.”

That was a bit reckless. 25 years later, Lebesgue’s dominated convergence theorem became part of every analyst’s tool chest, and since then (*) can be proved in a few lines:

Rewrite A_{n}(x) in the form A_{n}(x)=c_{n}\cos {(nx+\alpha_{n})}, where c_{n}^{2}=a_{n}^{2}+b_{n}^{2}. Put

\gamma_{n}=\min \{1, |c_{n}| \}, B_{n}(x)=\gamma_{n}\cos{(nx+\alpha_{n})}.

Then, B_{n}^{2}(x) \leq 1, B_{n}^{2}(x) \rightarrow 0 at every x, so that the D. C.Th., combined with

\int_{-\pi}^{\pi}B_{n}^{2}(x)dx=\pi \gamma_{n}^{2} shows that \gamma_{n} \rightarrow 0. Therefore, |c_{n}|=\gamma_{n} for all large n, and c_{n} \rightarrow 0. Done.

The point of all this is that my attitude was probably wrong. Interchanging limit processes occupied some of the best mathematicians for most of the 19th century. Thomas Hawkins’ book “Lebesgue’s Theory” gives an excellent description of the difficulties that they had to overcome. Perhaps, we should not be too surprised that even a hundred years later many students are baffled by uniform convergence, uniform continuity etc., and that some never get it at all.

In Trans. AMS 70, 1951, 387-403, I applied the techniques of my thesis to another problem of this type, with Hermite functions in place of spherical harmonics.

(Note: The above article is taken from Walter Rudin’s book, “The Way I Remember It”. I hope it helps advanced graduate students in Analysis.)

More later,

Nalin Pithwa


Analysis: Chapter 1: part 11: algebraic operations with real numbers: continued

(iii) Multiplication. 

When we come to multiplication, it is most convenient to confine ourselves to positive numbers (among which we may include zero) in the first instance, and to go back for a moment to the sections of positive rational numbers only which we considered in articles 4-7. We may then follow practically the same road as in the case of addition, taking (c) to be (ab) and (C) to be (AB). The argument is the same, except when we are proving that all rational numbers with at most one exception must belong to (c) or (C). This depends, as in the case of addition, on showing that we can choose a, A, b, and B so that C-c is as small as we please. Here we use the identity

C-c = AB-ab = (A-a)B + a(B-b).

Finally, we include negative numbers within the scope of our definition by agreeing that, if \alpha and \beta are positive, then

(-\alpha)\beta=-\alpha\beta, \alpha(-\beta)=-\alpha\beta, (-\alpha)(-\beta)=\alpha\beta.

(iv) Division. 

In order to define division, we begin by defining the reciprocal \frac{1}{\alpha} of a number \alpha (other than zero). Confining ourselves in the first instance to positive numbers and sections of positive rational numbers, we define the reciprocal of a positive number \alpha by means of the lower class (1/A) and the upper class (1/a). We then define the reciprocal of a negative number -\alpha by the equation 1/(-\alpha)=-(1/\alpha). Finally, we define \frac{\alpha}{\beta} by the equation

\frac{\alpha}{\beta}=\alpha \times (1/\beta).

We are then in a position to apply to all real numbers, rational or irrational, the whole of the ideas and methods of elementary algebra. Naturally, we do not propose to carry out this task in detail. It will be more profitable and more interesting to turn our attention to some special, but particularly important, classes of irrational numbers.

More later,

Nalin Pithwa