Is Math really abstract? I N Herstein answers…

Reference: Chapter 1: Abstract Algebra Third Edition, I. N. Herstein, Prentice Hall International Edition:

For many readers/students of pure mathematics, such a book will be their first contact with abstract mathematics. The subject to be discussed is usually called “abstract algebra,” but the difficulties that the reader may encounter are not so much due to the “algebra” part as they are to the “abstract” part.

On seeing some area of abstract mathematics for the first time, be it in analysis, topology, or what not, there seems to be a common reaction for the novice. This can best be described by a feeling of being adrift, of not having something solid to hang on to. This is not too surprising, for while many of the ideas are fundamentally quite simple, they are subtle and seem to elude one’s grasp the first time around. One way to mitigate this feeling of limbo, or asking oneself “What is the point of all this?” is to take the concept at hand and see what it says in particular concrete cases. In other words, the best road to good understanding of the notions introduced is to look at examples. This is true in all of mathematics.

Can one, with a few strokes, quickly describe the essence, purpose, and background for abstract algebra, for example?

We start with some collection of objects S and endow this collection with an algebraic structure by assuming that we can combine, in one or several ways (usually two) elements of this set S to obtain, once more, elements of this set S. These ways of combining elements of S we call operations on S. Then we try to condition or regulate the nature of S by imposing rules on how these operations behave on S. These rules are usually called axioms defining the particular structure on S. These axioms are for us to define, but the choice made comes, historically in mathematics from noticing that there are many concrete mathematical systems that satisfy these rules or axioms. In algebra, we study algebraic objects or structures called groups, rings, fields.
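To make the idea of "a set S, an operation, and axioms" concrete, here is a small sketch that checks the group axioms by brute force on a finite set. It is an illustration only and not taken from Herstein; the helper name is_group and the choice of the integers modulo 5 are assumptions made for this example.

```python
from itertools import product

def is_group(elements, op):
    """Check the group axioms by brute force on a finite set (illustrative helper)."""
    elements = list(elements)
    # Closure: combining two elements of S must again give an element of S.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (a*b)*c == a*(b*c) for all a, b, c.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Identity: some e with e*a == a*e == a for all a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a has some b with a*b == b*a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

add_mod_5 = lambda a, b: (a + b) % 5
mul_mod_5 = lambda a, b: (a * b) % 5

print(is_group(range(5), add_mod_5))     # integers mod 5 under addition
print(is_group(range(5), mul_mod_5))     # fails: 0 has no multiplicative inverse
print(is_group(range(1, 5), mul_mod_5))  # nonzero residues under multiplication
```

The same set can satisfy the axioms under one operation and fail them under another, which is the sense in which the axioms "regulate the nature of S".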

Of course, one could try many sets of axioms to define new structures. What would we require of such a structure? Certainly we would want that the axioms be consistent, that is, that we should not be led to some nonsensical contradiction computing within the framework of the allowable things the axioms permit us to do. But that is not enough. We can easily set up such algebraic structures by imposing a set of rules on a set S that lead to a pathological or weird system. Furthermore, there may be very few examples of something obeying the rules we have laid down.

Time has shown that certain structures defined by “axioms” play an important role in mathematics (and other areas as well) and that certain others are of no interest. The ones we mentioned earlier, namely, groups, rings, fields, and vector spaces have stood the test of time.

A word about the use of “axioms.” In everyday language, an axiom means a self-evident truth. But we are not using everyday language; we are dealing with mathematics. An axiom is not a universal truth, but one of several rules spelling out a given mathematical structure. The axiom is true in the system we are studying because we have forced it to be true by our choice, that is, by hypothesis. It is a licence, in that particular structure, to do certain things.

We return to something we said earlier about the reaction that many students have on their first encounter with this kind of algebra, namely, a lack of feeling that the material is something they can get their teeth into. Do not be discouraged if the initial exposure leaves you in a bit of a fog. Stick with it, try to understand what a given concept says and, most importantly, look at particular, concrete examples of the concept under discussion.

Follow the same approach in linear algebra, analysis and topology.

Cheers, cheers, cheers,

Nalin Pithwa

Constructing numbers from sets

Continued from previous blog: A fast review of set theory; same reference: A Second Course in Analysis by M Ram Murty, Hindustan Book Agency.

Mathematicians and philosophers of the nineteenth century pondered deeply into the nature of a number. The question of “what is a number?” is not a simple one. But since mathematicians decided to give foundations of mathematics using the axiomatic method and sets as the basic building blocks, we are led to define numbers using sets. We follow Richard Dedekind (1831-1916) and Giuseppe Peano (1858-1932) in the following construction. It was as late as 1888 and 1889 when this construction was described in two papers written independently by Dedekind and Peano.

We construct a sequence of sets to represent the natural numbers. As noted earlier, zero is represented by the empty set. We have already described the construction of the natural numbers using the empty set. For each natural number n, the successor of n is denoted by n+1 (and sometimes as n^{'}) and defined as

n \bigcup \{ n \}

Thus, each natural number n is a set with n elements, namely,

\{ 0,1,2, \ldots, n-1\}

We designate the set of natural numbers by the symbol \mathcal{N}. (It is a matter of personal convenience whether to include zero as a natural number or not. In this discussion, zero is a natural number. In other settings, it may not be. There is no universal convention regarding this and the student is expected to understand depending on the context. Some authors use the term “whole numbers” to indicate that zero is included in the discussion.)
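The successor construction can be tried out directly. The following is a minimal sketch, assuming sets are modelled by Python frozensets; the names successor, zero and naturals are chosen here for illustration.

```python
def successor(n):
    """Return n ∪ {n}, with sets modelled as frozensets."""
    return n | frozenset([n])

zero = frozenset()          # zero is the empty set
naturals = [zero]
for _ in range(5):
    naturals.append(successor(naturals[-1]))

for k, n in enumerate(naturals):
    # The set representing k has exactly k elements: 0, 1, ..., k-1.
    assert len(n) == k
    assert n == frozenset(naturals[:k])
```

The two assertions inside the loop are exactly the claim that each natural number n is the set \{0, 1, \ldots, n-1\}.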

The arithmetic operations on \mathcal{N} are now defined recursively. Addition is defined as a function from \mathcal{N} \times \mathcal{N} to \mathcal{N}:

+ (m,n) = m+n

where m+n is defined recursively by 0+n=n and m^{'}+n = (m+n)^{'}. A similar definition is given for multiplication \times by defining 0 \times n = 0 and

m^{'} \times n = (m \times n) + n

We also define m \times n as simply mn which is the familiar symbology.
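The two recursive definitions translate almost word for word into code. This is only a sketch: for readability it uses Python integers in place of the sets, and the names succ, pred, add and mul are assumptions of this example.

```python
def succ(n):
    return n + 1  # stands in for the successor n ∪ {n}

def pred(n):
    return n - 1  # only called on successors, that is, when n > 0

def add(m, n):
    # 0 + n = n ;  m' + n = (m + n)'
    return n if m == 0 else succ(add(pred(m), n))

def mul(m, n):
    # 0 × n = 0 ;  m' × n = (m × n) + n
    return 0 if m == 0 else add(mul(pred(m), n), n)

print(add(3, 4), mul(3, 4))
```

Note that mul is built only from add, and add only from succ, mirroring the way the text reduces each operation to notions already defined.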

An equivalence relation on a set S is a subset R of S \times S satisfying:

  1. (reflexive axiom) (a,a) \in R \forall {a} \in S.
  2. (symmetry axiom) (a,b) \in R \Longleftrightarrow (b,a) \in R.
  3. (transitive axiom) (a,b) \in R and (b,c) \in R implies (a,c) \in R.

The notion of an equivalence relation is an abstraction of our concept of equality, or at least what we implicitly expect of the notion of equality. It is more suggestive to write the equivalence relation, not as a subset of S \times S as indicated above, but rather more symbolically as \sim, writing a \sim b when (a,b) \in R, so that our axioms become:

  1. (reflexive) a \sim a, \forall {a} \in S.
  2. (symmetry) a \sim b if and only if b \sim a.
  3. (transitive) a \sim b and b \sim c implies a \sim c.

Equivalence relations play a fundamental role in all of mathematics. They allow us to understand aspects of sets by grouping them using certain properties.
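As a small illustration of how an equivalence relation groups the elements of a set, the sketch below checks the three axioms on a finite set and then collects the equivalence classes. The helper names is_equivalence and classes, and the example relation "same remainder modulo 3", are assumptions made for this example.

```python
def is_equivalence(S, related):
    """Check reflexivity, symmetry and transitivity on a finite set S."""
    S = list(S)
    reflexive = all(related(a, a) for a in S)
    symmetric = all(related(b, a) for a in S for b in S if related(a, b))
    transitive = all(related(a, c)
                     for a in S for b in S for c in S
                     if related(a, b) and related(b, c))
    return reflexive and symmetric and transitive

def classes(S, related):
    """Group S into equivalence classes (assumes `related` is an equivalence)."""
    result = []
    for a in S:
        for cls in result:
            if related(a, cls[0]):
                cls.append(a)
                break
        else:
            result.append([a])
    return result

same_remainder = lambda a, b: (a - b) % 3 == 0
print(is_equivalence(range(7), same_remainder))
print(classes(range(7), same_remainder))
print(is_equivalence(range(7), lambda a, b: a <= b))  # not symmetric
```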

To construct negative integers, we define an equivalence relation on \mathcal{N} \times \mathcal{N}. We write

(m,n)\sim (j,k) \Longleftrightarrow m+k=j+n.

Intuitively, we think of (m,n) as m-n so that it becomes evident that our definition is now in terms of concepts that have been defined earlier. This is very similar to how the ancients worked with negative numbers that appeared in an equation. They usually moved them to the other side so that the equation became an equation of non-negative numbers. However, with our set theoretic definition, we have reached a more fundamental and higher level of abstraction. Thus, with our equivalence relation above on pairs of natural numbers, we define the set of integers as the set of equivalence classes of such ordered pairs. It is now easy to see that the following lemma holds:

Lemma 1.1 If (j,k) is an ordered pair of non-negative integers, then exactly one of the following statements holds:

(a) (j,k) is equivalent to (m,0) for a unique positive integer m;

(b) (j,k) is equivalent to (0,m) for a unique positive integer m;

(c) (j,k) is equivalent to (0,0).

Sometimes, we denote by |(j,k)| the equivalence class of (j,k). With this lemma in place, we now denote by m the set of pairs of non-negative integers equivalent to (m,0); by -m the set of pairs equivalent to (0,m); and by 0 the set of pairs equivalent to (0,0). We denote the set of these equivalence classes by \mathcal{Z}. This gives us a set-theoretic construction of the set of integers.

We can define the operations of addition and multiplication by setting:

|(j_{1},  k_{1})|+|(j_{2}, k_{2})|=|(j_{1}+j_{2}, k_{1}+k_{2})|

|(j_{1}, k_{1})| \times |(j_{2}, k_{2})| = |(j_{1}j_{2}+k_{1}k_{2}, j_{1}k_{2}+j_{2}k_{1})|

This latter definition is best understood if we recall that the symbol (j,k) represents j-k so that the left hand side of the above equation is

(j_{1}-k_{1})(j_{2}-k_{2}) = j_{1}j_{2}+k_{1}k_{2}-(j_{1}k_{2}+j_{2}k_{1})

One needs to check that these definitions are “well-defined” in the sense that they are independent of the representatives chosen for the equivalence class. This can be done as exercises.

In this way, we have now extended the notion of addition and multiplication from the set of natural numbers to the set of integers. Subtraction of integers can be defined by

|(j_{1}, k_{1})|-|(j_{2}, k_{2})| = |(j_{1}, k_{1})| +(-1)|(j_{2}, k_{2})|

where -1 represents the equivalence class (0,1). All of these definitions correspond to our usual notion of addition, subtraction and multiplication. Their virtue lies in their pure set-theoretic formulation.
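The definitions above are easy to experiment with. The following sketch represents an integer by any pair (j,k) of non-negative integers and spot-checks, on one example, that addition and multiplication do not depend on the representatives chosen. It is a numerical check and not a proof; the function names are chosen here for illustration.

```python
def equivalent(p, q):
    (m, n), (j, k) = p, q
    return m + k == j + n          # (m,n) ~ (j,k)  iff  m + k = j + n

def canonical(p):
    """Representative of the form (m,0) or (0,m), as in Lemma 1.1."""
    j, k = p
    return (j - k, 0) if j >= k else (0, k - j)

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    (j1, k1), (j2, k2) = p, q
    return (j1 * j2 + k1 * k2, j1 * k2 + j2 * k1)

def sub(p, q):
    return add(p, mul((0, 1), q))   # (0,1) represents -1

# Well-definedness spot check: swap each argument for an equivalent pair.
p, p2 = (2, 5), (7, 10)             # both represent -3
q, q2 = (4, 1), (9, 6)              # both represent  3
assert equivalent(p, p2) and equivalent(q, q2)
assert equivalent(add(p, q), add(p2, q2))
assert equivalent(mul(p, q), mul(p2, q2))
print(canonical(add(p, q)), canonical(mul(p, q)), canonical(sub(p, q)))
```

Here canonical picks out the representative promised by Lemma 1.1, so the three printed pairs can be read off as 0, -9 and -6.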

We can also order the set of integers in the usual way. Thus,

j_{1}+k_{2} < k_{1}+j_{2} \Longleftrightarrow |(j_{1}, k_{1})|< |(j_{2}, k_{2})|

j_{1}+k_{2} \leq k_{1}+j_{2} \Longleftrightarrow |(j_{1}, k_{1})| \leq |(j_{2}, k_{2})|.

This corresponds to our usual notion of “less than” and “less than or equal to”.

Finally, we can define the absolute value on the set of integers by setting

|k|=k, if 0<k

|k|=0, if k=0

|k|=-k, if k<0

We can now construct the rational numbers \mathcal{Q} from the set of integers. We do this by defining an equivalence relation on the set \mathcal{Z} \times \mathcal{Z}^{+} by stating that two pairs (j_{1}, k_{1}) and (j_{2}, k_{2}) are equivalent if and only if j_{1}k_{2}=j_{2}k_{1}. Intuitively, we think of (j_{1}, k_{1}) as representing the “fraction” \frac{j_{1}}{k_{1}} and examining what we would mean by \frac{j_{1}}{k_{1}} = \frac{j_{2}}{k_{2}} by reducing it to notions already defined. The set of rational numbers \mathcal{Q} is then defined as the set of such equivalence classes.

The expected operations of addition and multiplication are now evident:

|(j_{1}, k_{1})|+|(j_{2}, k_{2})| = |(j_{1}k_{2}+j_{2}k_{1}, k_{1}k_{2})|

|(j_{1}, k_{1})||(j_{2}, k_{2})| = |(j_{1}j_{2}, k_{1}k_{2})|

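Again, these definitions are easily verified to be well-defined. Finally, we can now define “division”. If |(j_{1}, k_{1})|, |(j_{2}, k_{2})| \in \mathcal{Q} with j_{2} \neq 0, we define:

\frac{|(j_{1}, k_{1})|}{|(j_{2}, k_{2})|} = |(j_{1}k_{2}, j_{2}k_{1})|

(If j_{2} < 0, the second entry j_{2}k_{1} is negative, so we take the pair (-j_{1}k_{2}, -j_{2}k_{1}) instead, which keeps the second entry in \mathcal{Z}^{+}.)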

These operations satisfy the familiar laws of associativity, commutativity and distributivity. Subtraction of rational numbers can then be written as:

|(j_{1}, k_{1})|-|(j_{2}, k_{2})| = |(j_{1}, k_{1})|+|(-1,1)||(j_{2}, k_{2})|

The ordering of rational numbers can also be written as:

|(j_{1}, k_{1})|< |(j_{2}, k_{2})| \Longleftrightarrow j_{1}k_{2}< j_{2}k_{1}

|(j_{1}, k_{1})|   \leq  |(j_{2}, k_{2})| \Longleftrightarrow j_{1}k_{2} \leq j_{2}k_{1}.

These definitions agree with our usual notions of ordering of the rational numbers.

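Finally, the definition of absolute value can be extended to the rational numbers. Writing [(j,k)] for the equivalence class of (j,k), so that the bars can be reserved for the absolute value, we set:

|[(j,k)]| = [(j,k)] if [(0,1)] < [(j,k)]

|[(j,k)]| = [(0,1)] if [(j,k)] = [(0,1)]

|[(j,k)]| = -[(j,k)] if [(j,k)] < [(0,1)].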

Again, our familiar properties of the absolute value of rational numbers hold. With this foundational construction in place, we can conveniently represent the equivalence class of (j,k) as simply the fraction j/k and continue to work with these numbers as we were hopefully taught from childhood.
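As a sanity check, the pair-based operations can be compared with Python's built-in fractions.Fraction type. This is a sketch under the assumption that a rational is stored as a pair (j,k) with k > 0; the function names are chosen for this example only.

```python
from fractions import Fraction

def equivalent(p, q):
    (j1, k1), (j2, k2) = p, q
    return j1 * k2 == j2 * k1       # (j1,k1) ~ (j2,k2)  iff  j1 k2 = j2 k1

def add(p, q):
    (j1, k1), (j2, k2) = p, q
    return (j1 * k2 + j2 * k1, k1 * k2)

def mul(p, q):
    (j1, k1), (j2, k2) = p, q
    return (j1 * j2, k1 * k2)

def less(p, q):
    (j1, k1), (j2, k2) = p, q
    return j1 * k2 < j2 * k1        # valid because k1, k2 > 0

def as_fraction(p):
    return Fraction(p[0], p[1])

p, q = (1, 2), (-2, 3)
assert as_fraction(add(p, q)) == as_fraction(p) + as_fraction(q)
assert as_fraction(mul(p, q)) == as_fraction(p) * as_fraction(q)
assert less(q, p) == (as_fraction(q) < as_fraction(p))
# Replacing p and q by equivalent pairs gives an equivalent sum.
assert equivalent(add(p, q), add((2, 4), (-4, 6)))
```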

In the next sections/blogs we construct the real numbers from this axiomatic framework.


Hint (generic): keep the meaning of the symbols in mind and meaning of equivalence relations and equivalence classes. Also note that our basic object is a class and a set is a member of a class.

  1. Let |(j_{1}, k_{1})|, |(j_{2}, k_{2})| be two elements of \mathcal{Z}. Show that the addition:

|(j_{1}, k_{1})|+|(j_{2}, k_{2})| = |(j_{1}+j_{2}, k_{1}+k_{2})| is well-defined. That is, prove that for any (j_{1}^{'}, k_{1}^{'}) \in |(j_{1}, k_{1})| and (j_{2}^{'},k_{2}^{'}) \in |(j_{2}, k_{2})|, we have that (j_{1}^{'}+j_{2}^{'}, k_{1}^{'}+k_{2}^{'}) is equivalent to (j_{1}+j_{2}, k_{1}+k_{2}).

2. For j_{1}, j_{2}, k \in \mathcal{Z}, prove the distributive law: (j_{1}+j_{2}).k = j_{1}k+j_{2}k.

3. Show that the relations < and \leq on \mathcal{Z} have the following properties:

(a) |(0,j)|< |(0,0)| for all j \in \mathcal{Z}^{+}

(b) |(0,j)|< |(k,0)| for all j, k \in \mathcal{Z}^{+}

(c) |(0,j)|< |(0,k)|, j ,k \in \mathcal{Z}^{+} if and only if k<j

(d) |(0,0)| < |(j,0)| for all j \in \mathcal{Z}^{+}

(e) |(j,0)|<|(k,0)|, j, k \in \mathcal{Z}_{\geq 0} if and only if j<k.

(f) |(0,j)| \leq |(0,0)| for all j \in \mathcal{Z}_{\geq 0}.

(g) |(0,j)| \leq |(k,0)| for all j,k \in \mathcal{Z}_{\geq 0}

(h) |(0,j)| \leq |(0,k)| for j,k \in \mathcal{Z}_{\geq 0} if and only if k \leq j

(i) |(0,0)| \leq |(j,0)| for all j \in \mathcal{Z}_{\geq 0}

(j) |(j,0)| \leq |(k,0)| where j, k \in \mathcal{Z}_{\geq 0} if and only if j \leq k.


Nalin Pithwa.

Cauchy’s Mean Value Theorem and the Stronger Form of l’Hopital’s Rule

Reference: G B Thomas, Calculus and Analytic Geometry, 9th Indian Edition. 

The stronger form of l’Hopital’s rule is as follows:

Suppose that f(x_{0})=g(x_{0})=0 and the functions f and g are both differentiable on an open interval (a,b) that contains the point x_{0}. Suppose also that g^{'} \neq 0 at every point in (a,b) except possibly x_{0}. Then,

\lim_{x \rightarrow x_{0}} \frac{f(x)}{g(x)} = \lim_{x \rightarrow x_{0}}\frac{f^{'}(x)}{g^{'}(x)}…call this I, provided the limit on the right exists.


The proof of the stronger form of l’Hopital’s rule is based on Cauchy’s mean value theorem, a mean value theorem that involves two functions instead of one. We prove Cauchy’s theorem first and then show how it leads to l’Hopital’s rule.

Cauchy’s Mean Value Theorem:

Suppose the functions f and g are continuous on [a,b] and differentiable throughout (a,b) and suppose also that g^{'} \neq 0 throughout (a,b). Then, there exists a number c in (a,b) at which

\frac{f^{'}(c)}{g^{'}(c)} = \frac{f(b)-f(a)}{g(b)-g(a)}…call this II.

(Note this becomes the ordinary mean value theorem when g(x)=x).

Proof of Cauchy’s Mean Value theorem:

We apply the ordinary mean value theorem twice. First, we use it to show that g(b) \neq g(a). Because if g(b)=g(a), then the ordinary Mean Value theorem says that

g^{'}(c) = \frac{g(b)-g(a)}{b-a}=0 for some c between a and b. This cannot happen because g^{'}(x) \neq 0 in (a,b).

We next apply the Mean Value Theorem to the function

F(x) = f(x)-f(a) - \frac{f(b)-f(a)}{g(b)-g(a)}(g(x)-g(a))

This function is continuous and differentiable where f and g are, and note that F(b)=F(a)=0. Therefore, by the ordinary mean value theorem, there is a number c between a and b for which F^{'}(c)=0. In terms of f and g, this says

F^{'}(c) = f^{'}(c) - \frac{f(b)-f(a)}{g(b)-g(a)}(g^{'}(c)) = 0

or \frac{f^{'}(c)}{g^{'}(c)} = \frac{f(b)-f(a)}{g(b)-g(a)} which is equation II above.
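To see the theorem at work on a concrete pair of functions, the sketch below takes f(x)=x^{3} and g(x)=x^{2} on [1,2] (an example chosen here, not from Thomas) and locates the point c by bisection on F^{'}. For this pair the equation can also be solved by hand: \frac{3c}{2} = \frac{7}{3}, so c = \frac{14}{9}.

```python
def bisect(h, lo, hi, tol=1e-12):
    """Find a root of h in [lo, hi], assuming h(lo) and h(hi) differ in sign."""
    assert h(lo) * h(hi) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

a, b = 1.0, 2.0
f, df = (lambda x: x**3), (lambda x: 3 * x**2)
g, dg = (lambda x: x**2), (lambda x: 2 * x)

ratio = (f(b) - f(a)) / (g(b) - g(a))               # (f(b)-f(a)) / (g(b)-g(a))
c = bisect(lambda x: df(x) - ratio * dg(x), a, b)   # solves F'(c) = 0
print(c, df(c) / dg(c), ratio)
```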

Proof of the stronger form of L’Hopital’s Rule:
We first establish equation I for the case x \rightarrow x_{0}^{+}. The method needs almost no change to apply to the case x \rightarrow x_{0}^{-}, and the combination of these two cases establishes the result.

Suppose that x lies to the right of x_{0}. Then, g^{'}(x) \neq 0 and we can apply Cauchy’s Mean Value theorem to the closed interval from x_{0} to x. This produces a number c between x and x_{0} such that

\frac{f^{'}(c)}{g^{'}(c)} = \frac{f(x)-f(x_{0})}{g(x)-g(x_{0})}

But f(x_{0})=g(x_{0})=0, so that

\frac{f^{'}(c)}{g^{'}(c)}= \frac{f(x)}{g(x)}

As x approaches x_{0}, c approaches x_{0} as it lies between x and x_{0}. Therefore,

\lim_{x \rightarrow x_{0}^{+}} \frac{f(x)}{g(x)} = \lim_{x \rightarrow x_{0}^{+}}  \frac{f^{'}(c)}{g^{'}(c)} = \lim_{x \rightarrow x_{0}^{+}} \frac{f^{'}(x)}{g^{'}(x)}.

This establishes l’Hopital’s Rule for the case where x approaches x_{0} from the right. The case where x approaches x_{0} from the left is proved by applying Cauchy’s Mean Value Theorem to the closed interval [x,x_{0}] when x < x_{0}.
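A quick numerical illustration of the rule, with functions chosen here for the purpose (they are not from the text): take f(x)=e^{x}-1 and g(x)=\sin x with x_{0}=0, so that f(0)=g(0)=0 and g^{'}(x)=\cos x \neq 0 near 0. Both quotients should approach the same value as x approaches 0 from either side.

```python
import math

f, df = (lambda x: math.exp(x) - 1), math.exp
g, dg = math.sin, math.cos

for h in (1e-1, 1e-2, 1e-3, -1e-3):
    x = h                      # x0 = 0, approached from the right and the left
    print(h, f(x) / g(x), df(x) / dg(x))
```

The printed columns give f/g and f^{'}/g^{'} side by side; this is only a numerical check of the statement, not a substitute for the proof.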



Nalin Pithwa

VII. Complete Metric Spaces

Reference: Introductory Real Analysis by Kolmogorov and Fomin. Translated by Richard A. Silverman. Dover Publications. 

Available on Amazon India and Amazon USA. This text book can be studied in parallel with Analysis of Walter Rudin.

7.1. Definition and examples:

The reader is presumably already familiar with the notion of completeness of the real line. (One good simple reference for this could be: Calculus and analytic geometry by G B Thomas. You can also use, alternatively, Advanced Calculus by Buck and Buck.) The real line is, of course, a simple example of a metric space. We now make the natural generalisation of the notion of completeness to the case of an arbitrary metric space.


A sequence \{ x_{n}\} of points in a metric space R with metric \rho is said to satisfy the Cauchy criterion if given any \epsilon >0, there is an integer N_{\epsilon} such that \rho(x_{n}, x_{n^{'}})<\epsilon for all n,n^{'}> N_{\epsilon}.


A sequence \{ x_{n}\} of points in a metric space R is called a Cauchy sequence (or a fundamental sequence) if it satisfies the Cauchy criterion.


THEOREM 1: Every convergent sequence \{ x_{n}\} is fundamental.

Proof 1:

If \{ x_{n} \} converges to a limit x, then, given any \epsilon>0, there is an integer N_{\epsilon} such that

\rho(x_{n}, x) < \frac{\epsilon}{2} for all n > N_{\epsilon}.

But, then

\rho(x_{n}, x_{n^{'}}) \leq \rho(x_{n},x)+\rho(x_{n^{'}},x)<\epsilon

for all n, n^{'} >N_{\epsilon}. QED.


A metric space R is said to be complete if every Cauchy sequence in R converges to an element of R. Otherwise, R is said to be incomplete. 

Example 1:

Let R be the “space of isolated points” (discrete metric space) defined as follows: \rho(x,y)=0 if x=y, and \rho(x,y)=1 if x \neq y. Then, the Cauchy sequences in R are just the “stationary sequences,” that is, the sequences \{ x_{n}\} all of whose terms are the same starting from some index n. Every such sequence is obviously convergent to an element of R. Hence, R is complete.

Example 2:

The completeness of the real line R is familiar from elementary analysis.

Example 3:

The completeness of the Euclidean n-space \Re^{n} follows from that of \Re^{1}. In fact, let

x^{(p)} = (x_{1}^{(p)}, x_{2}^{(p)}, \ldots, x_{n}^{(p)}) where p = 1, 2, \ldots

be a fundamental sequence of points in \Re^{n}. Then, given \epsilon >0, there exists an N_{\epsilon} such that

\Sigma_{k=1}^{n} (x_{k}^{(p)}-x_{k}^{(q)})^{2} < \epsilon^{2}

for all p, q > N_{\epsilon}. It follows that

|x_{k}^{(p)}-x_{k}^{(q)}|<\epsilon for k=1,2,\ldots, n for all p,q > N_{\epsilon}, that is, each \{x_{k}^{(p)} \} is a fundamental sequence in \Re^{1}.

Let x = (x_{1}, \ldots, x_{n}) where x_{k} = \lim_{p \rightarrow \infty} x_{k}^{(p)}

Then, obviously \lim_{p \rightarrow \infty} x^{(p)} = x.

This proves the completeness of \Re^{n}. The completeness of the spaces R_{0}^{n} and R_{1}^{n} introduced in earlier examples/blogs is proved in almost the same way. (HW: supply the details). QED.

Example 4:

Let \{ x_{n}(t)\} be a Cauchy sequence in the function space C_{[a,b]} introduced earlier. Then, given any \epsilon >0, there is an N_{\epsilon} such that

|x_{n}(t) - x_{n^{'}}(t)|< \epsilon….I

for all n, n^{'} > N_{\epsilon} and all t \in [a,b]. It follows that the sequence \{ x_{n}(t)\} is uniformly convergent. But the limit of a uniformly convergent sequence of continuous functions is itself a continuous function (see Problem 1 following this Section). Taking the limit as n^{'} \rightarrow \infty in I, we find that

|x_{n}(t) - x(t)|\leq \epsilon for all n > N_{\epsilon} and all t \in [a,b], that is, \{ x_{n}(t)\} converges in the metric space C_{[a,b]} to a function x(t) \in C_{[a,b]}. Hence, C_{[a,b]} is a complete metric space.

Example 5:

Next, let x^{(n)} be a sequence in the space l_{2} so that 

x^ {(n)} = (x_{1}^{(n)}, x_{2}^{(n)}, \ldots, x_{k}^{(n)}, \ldots)

\Sigma_{k=1}^{\infty}(x_{k}^{(n)})^{2} < \infty, where n = 1, 2, \ldots

Suppose further that \{ x^{(n)}\} is a Cauchy sequence. Then, given any \epsilon > 0 there is a N_{\epsilon} such that

\rho^{2}(x^{(n)},x^{(n^{'})}) = \Sigma_{k=1}^{\infty}(x_{k}^{(n)}-x_{k}^{(n^{'})})^{2}< \epsilon…let us call this II.

if n, n^{'} > N_{\epsilon}.

It follows that (x_{k}^{(n)}-x_{k}^{(n^{'})})^{2} < \epsilon (for k =1, 2, \ldots)

That is, for every k the sequence \{ x_{k}^{(n)}\} is fundamental and hence, convergent. Let

x_{k} = \lim_{n \rightarrow \infty} x_{k}^{(n)}, x = (x_{1}, x_{2}, \ldots, x_{k}, \ldots)

Then, as we now show, x is itself a point of l_{2} and moreover, \{ x^{(n)}\} converges to x in the l_{2} metric space, so that l_{2} is a complete metric space.

In fact, the Cauchy criterion here implies that \Sigma_{k=1}^{M}(x_{k}^{(n)}-x_{k}^{(n^{'})})^{2}<\epsilon for any fixed M. …let us call this III.

Holding n fixed in III, and taking the limit as n^{'} \rightarrow \infty, we get

\Sigma_{k=1}^{M}(x_{k}^{(n)}-x_{k})^{2} \leq \epsilon….call this IV.

Since IV holds for arbitrary M, we can in turn take the limit of IV as M \rightarrow \infty, obtaining

\Sigma_{k=1}^{\infty}(x_{k}^{(n)}-x_{k})^{2} \leq \epsilon.

But, as we have learnt earlier in this series of blogs, the convergence of the two series \Sigma_{k=1}^{\infty}(x_{k}^{(n)})^{2} and \Sigma_{k=1}^{\infty}(x_{k}^{(n)}-x_{k})^{2} implies that of the series \Sigma_{k=1}^{\infty}x_{k}^{2}.

This proves that x \in l_{2}. Moreover, since \epsilon is arbitrarily small, the inequality obtained from IV in the limit M \rightarrow \infty implies that

\lim_{n \rightarrow \infty}\rho( x^{(n)} ,x ) = \lim_{n \rightarrow \infty} \sqrt{\Sigma_{k=1}^{\infty} (x_{k}^{(n)}-x_{k})^{2}} = 0

That is, \{ x^{(n)}\} converges to x in the l_{2} metric space, as asserted.


Example 6.

Consider the space C_{[a,b]}^{2}. To recap: consider the set of all functions continuous on the closed interval [a,b] with the distance metric defined by: \rho(x,y) = (\int_{a}^{b}|x(t)-y(t)|^{2}dt)^{\frac{1}{2}}.

It is easy to show that the space C_{[a,b]}^{2} is incomplete. If

\phi_{n}(t) = -1, if -1 \leq t \leq -\frac{1}{n}

\phi_{n}(t) = nt, if -\frac{1}{n} \leq t \leq \frac{1}{n}

\phi_{n}(t) = 1, if \frac{1}{n} \leq t \leq 1

then \{ \phi_{n}(t)\} is a fundamental sequence in C_{[-1,1]}^{2} since

\int_{-1}^{1}(\phi_{n}(t)-\phi_{n^{'}}(t))^{2} dt \leq \frac{2}{\min \{ n,n^{'} \}}

However, \{ \phi_{n}(t)\} cannot converge to a function in C_{[-1,1]}^{2}. In fact, consider the discontinuous function

\psi(t) = -1, when t < 0

\psi(t) = 1, when t \geq 0.

Then, given any function f \in C_{[-1,1]}^{2}, it follows from Schwarz’s inequality (obviously still valid for piecewise continuous functions) that

(\int_{-1}^{1}(f(t)-\psi(t))^{2}dt)^{\frac{1}{2}} \leq (\int_{-1}^{1}(f(t)-\phi_{n}(t))^{2}dt)^{\frac{1}{2}} + (\int_{-1}^{1}(\phi_{n}(t) - \psi(t))^{2}dt)^{\frac{1}{2}}

But the integral on the left is non-zero, by the continuity of f, and moreover, it is clear that

\lim_{n \rightarrow \infty}\int_{-1}^{1}(\phi_{n}(t)-\psi(t))^{2}dt = 0

Therefore, \int_{-1}^{1}(f(t)-\phi_{n}(t))^{2}dt cannot converge to zero as n \rightarrow \infty.
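The two integral estimates used in Example 6 can be checked numerically. The sketch below approximates the integrals by the midpoint rule; the helper dist_sq and the step count are assumptions of this example, and the output is a numerical illustration rather than a proof. A direct calculation gives \int_{-1}^{1}(\phi_{n}(t)-\psi(t))^{2}dt = \frac{2}{3n}, which is printed alongside for comparison.

```python
def phi(n):
    return lambda t: max(-1.0, min(1.0, n * t))   # -1, then nt, then 1

def psi(t):
    return -1.0 if t < 0 else 1.0

def dist_sq(x, y, a=-1.0, b=1.0, steps=20_000):
    """Midpoint-rule approximation of the integral of (x - y)^2 over [a, b]."""
    h = (b - a) / steps
    return h * sum((x(a + (i + 0.5) * h) - y(a + (i + 0.5) * h)) ** 2
                   for i in range(steps))

for n, m in [(5, 10), (10, 40), (50, 100)]:
    print(n, m, dist_sq(phi(n), phi(m)), 2 / min(n, m))   # value vs. bound
for n in (10, 100):
    print(n, dist_sq(phi(n), psi), 2 / (3 * n))           # tends to 0
```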


7.2 The nested sphere theorem.

A sequence of closed spheres S[x_{1}, r_{1}], S[x_{2}, r_{2}], \ldots, S[x_{n}, r_{n}], \ldots

in a metric space R is said to be nested (or decreasing) if

S[x_{1}, r_{1}] \supset S[x_{2}, r_{2}] \supset \ldots \supset S[x_{n}, r_{n}] \supset \ldots

Using this concept, we can prove a simple criterion for the completeness of R:

THEOREM 2: The Nested Sphere Theorem:

A metric space R is complete if and only if every nested sequence \{ S_{n}\} = \{ S[x_{n}, r_{n}]\} of closed spheres in R such that r_{n} \rightarrow 0 as n \rightarrow \infty has a non empty intersection \bigcap_{n=1}^{\infty} S_{n}.


Proof of the nested sphere theorem:

Part I: Assume that R is complete and that \{ S_{n}\} = \{ S[x_{n}, r_{n}] \} is any nested sequence of closed spheres in R such that r_{n} \rightarrow 0 as n \rightarrow \infty. Then the sequence \{ x_{n}\} of centres of the spheres is fundamental, because x_{n^{'}} \in S_{n^{'}} \subset S_{n} gives

\rho(x_{n}, x_{n^{'}}) \leq r_{n} for n^{'}>n and r_{n} \rightarrow 0 as n \rightarrow \infty. Therefore, \{ x_{n} \} has a limit. Let

x = \lim_{n \rightarrow \infty} x_{n}.

Then, x \in \bigcap_{n=1}^{\infty}S_{n}.

In fact, S_{n} contains every point of the sequence \{ x_{n}\} except possibly the points x_{1}, x_{2}, \ldots, x_{n-1}, and hence x is a limit point of every sphere S_{n}. But S_{n} is closed, and hence x \in S_{n} for all n.

Conversely, suppose every nested sequence of closed spheres in R with radii converging to zero has a non empty intersection, and let \{ x_{n}\} be any fundamental sequence in R. Then, \{ x_{n}\} has a limit in R. To see this, use the fact that \{x_{n}\} is fundamental to choose a term x_{n_{1}} of the sequence \{ x_{n} \} such that

\rho(x_{n}, x_{n_{1}}) < \frac{1}{2} for all n \geq n_{1}, and let S_{1} be the closed sphere of radius 1 with centre x_{n_{1}}. Then, choose a term x_{n_{2}} of \{ x_{n}\} such that n_{2} > n_{1} and \rho(x_{n}, x_{n_{2}}) < \frac{1}{2^{2}} for all n \geq n_{2}, and let S_{2} be the closed sphere of radius \frac{1}{2} with centre x_{n_{2}}.

Continue this construction indefinitely, that is, once having chosen terms x_{n_{1}}, x_{n_{2}}, \ldots, x_{n_{k}} (where n_{1}<n_{2}< \ldots < n_{k}), choose a term x_{n_{k+1}} such that n_{k+1}>n_{k} and

\rho(x_{n}, x_{n_{k+1}}) < \frac{1}{2^{k+1}} for all n \geq n_{k+1}.

Let S_{k+1} be the closed sphere of radius \frac{1}{2^{k}} with centre x_{n_{k+1}}, and so on. This gives a nested sequence \{ S_{k}\} of closed spheres with radii converging to zero. By hypothesis, these spheres have a non empty intersection, that is, there is a point x in all the spheres. This point is obviously the limit of the subsequence \{ x_{n_{k}}\}. But, if a fundamental sequence contains a subsequence converging to x, then the sequence itself must converge to x (HW quiz). That is,

\lim_{n \rightarrow \infty}x_{n} =x.
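A familiar instance of nested closed spheres with radii tending to zero is the sequence of intervals produced by bisection. The sketch below, an illustration chosen here and not taken from Kolmogorov and Fomin, builds closed intervals [lo, hi] with rational endpoints satisfying lo^{2} \leq 2 \leq hi^{2}. On the real line their intersection is the single point \sqrt{2}; viewed as spheres in the space of rational numbers, the same sequence has empty intersection, in line with the theorem and the incompleteness of the rationals.

```python
from fractions import Fraction

lo, hi = Fraction(1), Fraction(2)      # lo^2 <= 2 <= hi^2
spheres = []
for _ in range(30):
    centre, radius = (lo + hi) / 2, (hi - lo) / 2
    spheres.append((centre, radius))   # closed sphere S[centre, radius] = [lo, hi]
    if centre * centre <= 2:
        lo = centre
    else:
        hi = centre

# Each sphere contains the next one, and the radii halve at every step.
for (c1, r1), (c2, r2) in zip(spheres, spheres[1:]):
    assert c1 - r1 <= c2 - r2 and c2 + r2 <= c1 + r1
    assert r2 == r1 / 2
print(float(spheres[-1][0]), float(spheres[-1][1]))
```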


7.3 Baire’s theorem:

We know that a subset A of a metric space R is said to be nowhere dense in R if it is dense in no (open) sphere at all, or equivalently, if every sphere S \subset R contains another sphere S^{'} such that S^{'} \bigcap A = \emptyset. (Quiz: check the equivalence).

This concept plays an important role in the following:

THEOREM 3: Baire’s Theorem:

A complete metric space R cannot be represented as the union of a countable number of nowhere dense sets.

Proof of Theorem 3: Baire’s Theorem:

Suppose the contrary. Let R = \bigcup_{n=1}^{\infty}A_{n} ….call this VI.

where every set A_{n} is nowhere dense in R. Let S_{0} \subset R be a closed sphere of radius 1. Since A_{1} is nowhere dense in S_{0}, being nowhere dense in R, there is a closed sphere S_{1} of radius less than \frac{1}{2} such that S_{1} \subset S_{0} and S_{1} \bigcap A_{1} =\emptyset. Since A_{2} is nowhere dense in S_{1}, being nowhere dense in S_{0}, there is a closed sphere S_{2} of radius less than \frac{1}{3} such that S_{2} \subset S_{1} and S_{2} \bigcap A_{2} = \emptyset, and so on. In this way, we get a nested sequence of closed spheres \{ S_{n}\} with radii converging to zero such that

S_{n} \bigcap A_{n} = \emptyset, where n=1,2,3,\ldots

By the nested sphere theorem, the intersection \bigcap_{n=1}^{\infty}S_{n} contains a point x. By construction, x cannot belong to any of the sets A_{n}, that is,

x \notin \bigcup_{n=1}^{\infty}A_{n}

It follows that R \neq \bigcup_{n=1}^{\infty} A_{n} contrary to VI.

Hence, the representation VI is impossible.


COROLLARY TO Baire’s theorem:

A complete metric space R without isolated points is uncountable.


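Proof: Every single element set \{ x\} is nowhere dense in R, since x is not an isolated point. If R were countable, it would be the union of countably many such sets, contradicting Baire’s theorem.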


7.4 Completion of a metric space:

As we now show, an incomplete metric space can always be enlarged (in an essentially unique way) to give a complete metric space.

DEFINITION 4: Completion of a metric space:

Given a metric space R with closure [R], a complete metric space R^{*} is called a completion of R if R \subset R^{*} and [R]=R^{*}, that is, if R is a subset of R^{*} everywhere dense in R^{*}.

Example 1.

Clearly, R^{*}=R if R is already complete. (Quiz: homework).

Example 2:

The space of all real numbers is the completion of the space of all rational numbers.
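To see why the rationals need completing, consider a Cauchy sequence of rational numbers whose limit is not rational. The sketch below uses Newton's iteration for \sqrt{2}, computed exactly with Python's fractions.Fraction; the choice of iteration and starting value is an assumption made for this example.

```python
from fractions import Fraction

x = Fraction(2)
terms = [x]
for _ in range(6):
    x = (x + 2 / x) / 2          # Newton step for x^2 = 2; stays rational
    terms.append(x)

for prev, cur in zip(terms, terms[1:]):
    # Gap between consecutive terms, and how far cur^2 is from 2.
    print(float(abs(cur - prev)), float(cur * cur - 2))
```

Consecutive terms get closer and closer, and x_{n}^{2} approaches 2, yet no term satisfies x_{n}^{2}=2 exactly, since no rational number does. The limit exists only in the completion.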


THEOREM 4: Every metric space R has a completion. This completion is unique to within an isometric mapping carrying every point x \in R into itself.

Proof of Theorem 4:

(The proof is somewhat lengthy but quite straightforward).

First, we prove the uniqueness, showing that if R^{*} and R^{**} are two completions of R, then there is a one-to-one mapping x^{**} = \phi(x^{*}) of R^{*} onto R^{**} such that \phi(x)=x for all x \in R and

\rho_{1}(x^{*}, y^{*}) = \rho_{2}(x^{**}, y^{**})….call this VII.

(y^{**}=\phi(y^{*})), where \rho_{1} is the distance metric in R^{*} and \rho_{2} the distance metric in R^{**}. The required mapping \phi is constructed as follows: Let x^{*} be an arbitrary point of R^{*}. Then, by the definition of a completion, there is a sequence \{ x_{n}\} of points of R converging to x^{*}. The points of the sequence \{ x_{n}\} also belong to R^{**}, where they form a fundamental sequence (quiz: why?). Therefore, \{ x_{n}\} converges to a point x^{**} \in R^{**} since R^{**} is complete. It is clear that x^{**} is independent of the choice of the sequence \{ x_{n}\} converging to the point x^{*} (homework quiz: why?). If we set \phi(x^{*})=x^{**}, then \phi is the required mapping. In fact, \phi(x)=x for all x \in R, since if x_{n} \rightarrow x \in R, then obviously x = x^{*} \in R^{*}, x^{**}=x. Moreover, suppose x_{n} \rightarrow x^{*}, y_{n} \rightarrow y^{*} in R^{*}, while x_{n} \rightarrow x^{**}, y_{n} \rightarrow y^{**} in R^{**}. Then, if \rho is the distance in R,

\rho_{1}(x^{*},y^{*}) = \lim_{n \rightarrow \infty} \rho_{1}(x_{n}, y_{n}) = \lim_{n \rightarrow \infty}\rho(x_{n}, y_{n})…call this VIII.

While at the same time, \rho_{2}(x^{**}, y^{**})=\lim_{n \rightarrow \infty}\rho_{2}(x_{n}, y_{n})=\lim_{n \rightarrow \infty}\rho(x_{n}, y_{n})….call this VIII-A. But VIII and VIII-A imply VII.

We must now prove the existence of a completion of R. Given an arbitrary metric space R, we say that two Cauchy sequences \{x_{n} \} and \{\overline{x}_{n} \} in R are equivalent and write \{ x_{n}\} \sim  \{ \overline{x}_{n}\}

if \lim_{n \rightarrow \infty} \rho(x_{n}, \overline{x}_{n})=0

As anticipated by the notation and terminology, \sim is reflexive, symmetric and transitive, that is, \sim is an equivalence relation. Therefore, the set of all Cauchy sequences of points in the space R can be partitioned into classes of equivalent sequences. Let these classes be the points of a new space R^{*}. Then, we define the distance between two arbitrary points x^{*}, y^{*} by the formula

\rho_{1}(x^{*}, y^{*}) = \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n}) ….call this IX.

where \{ x_{n}\} is any “representative” of x^{*} (namely, any Cauchy sequence in the class x^{*}) and \{ y_{n}\} is any representative of y^{*}.

The next step is to verify that IX is indeed a distance metric, that is, to check that the limit in IX exists, is independent of the choice of sequences \{x_{n} \} \in x^{*}, \{ y_{n} \} \in y^{*}, and satisfies the three properties of a distance metric function. Given any \epsilon >0, it follows from the triangle inequality in R (this can be proved with a little effort: homework quiz) that

|\rho(x_{n}, y_{n}) - \rho(x_{n^{'}},y_{n^{'}})| = |\rho(x_{n},y_{n}) -\rho(x_{n^{'}},y_{n}) + \rho(x_{n^{'}},y_{n}) - \rho(x_{n^{'}}, y_{n^{'}})|

That is, \leq |\rho(x_{n},y_{n}) -\rho(x_{n^{'}}, y_{n}) | + |\rho(x_{n^{'}}, y_{n}) - \rho(x_{n^{'}}, y_{n^{'}}) |

that is, \leq \rho(x_{n}, x_{n^{'}}) + \rho(y_{n}, y_{n^{'}}) \leq \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon….call this X

for all sufficiently large n and n^{'}.
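
The key step in X is the inequality |\rho(x,y) - \rho(x^{'},y^{'})| \leq \rho(x,x^{'}) + \rho(y,y^{'}), valid in any metric space. As a rough numerical sanity check (not a proof), the sketch below tests it on random points of the plane, with the Euclidean distance standing in for \rho. The function name check_quadrilateral and the choice of the plane are assumptions made only for this illustration.

```python
import math
import random

def rho(p, q):
    """Euclidean distance in the plane, standing in for the metric of R."""
    return math.dist(p, q)

def check_quadrilateral(trials=10_000, seed=0):
    """Return True if |rho(x,y) - rho(x',y')| <= rho(x,x') + rho(y,y') in every trial."""
    rng = random.Random(seed)

    def point():
        return (rng.uniform(-1, 1), rng.uniform(-1, 1))

    for _ in range(trials):
        x, y, x2, y2 = point(), point(), point(), point()
        lhs = abs(rho(x, y) - rho(x2, y2))
        rhs = rho(x, x2) + rho(y, y2)
        if lhs > rhs + 1e-12:  # small tolerance for floating-point rounding
            return False
    return True

print(check_quadrilateral())
```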

Therefore, the sequence of real numbers \{ s_{n}\} = \{ \rho(x_{n},y_{n})\} is fundamental and hence, has a limit. This limit is independent of the choice of \{x_{n} \} \in x^{*}, \{ y_{n}\} \in y^{*}. In fact, suppose that

\{ x_{n}\}, \{ \overline{x}_{n}\} \in x^{*}, \{ y_{n}\}, \{ \overline{y}_{n} \} \in y^{*}.

Then, |\rho(x_{n}, y_{n}) - \rho(\overline{x}_{n}, \overline{y}_{n})| \leq \rho(x_{n}, \overline{x}_{n}) + \rho(y_{n}, \overline{y}_{n}) by a calculation analogous to X. But,

\lim_{n \rightarrow \infty} \rho(x_{n}, \overline{x}_{n}) = \lim_{n \rightarrow \infty} \rho(y_{n}, \overline{y}_{n})=0

since \{ x_{n} \} \sim \{ \overline{x}_{n}\} and \{ y_{n} \} \sim \{ \overline{y}_{n}\}, and hence,

\lim_{n \rightarrow \infty} \rho(x_{n}, y_{n}) = \lim_{n \rightarrow \infty} \rho(\overline{x}_{n}, \overline{y}_{n}).
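
A small worked example may help here. The sketch below, assuming R is the rationals with \rho(x,y)=|x-y|, takes two representatives of a class x^{*} (both converging to 1) and two representatives of a class y^{*} (both converging to 3). The particular sequences are chosen only for illustration. It checks the inequality above for several n and prints the two distances, which both approach 2.

```python
from fractions import Fraction

def rho(a, b):
    """Usual distance on the rationals."""
    return abs(a - b)

# Two representatives of the class x* (both converge to 1)
def x(n):
    return 1 - Fraction(1, n)

def x_bar(n):
    return 1 + Fraction(1, n * n)

# Two representatives of the class y* (both converge to 3)
def y(n):
    return 3 + Fraction(1, n)

def y_bar(n):
    return Fraction(3)

for n in (1, 10, 100, 1000):
    d = rho(x(n), y(n))
    d_bar = rho(x_bar(n), y_bar(n))
    bound = rho(x(n), x_bar(n)) + rho(y(n), y_bar(n))
    # the inequality analogous to X, checked exactly in rational arithmetic
    assert abs(d - d_bar) <= bound
    print(n, float(d), float(d_bar))
```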

As for the three properties of a metric, it is obvious that

\rho_{1}(x^{*}, y^{*}) = \rho_{1}(y^{*}, x^{*}), and the fact that

\rho_{1}(x^{*}, y^{*}) =0 if and only if x^{*}=y^{*} is an immediate consequence of the definition of equivalent Cauchy sequences.

To verify the triangle inequality in R^{*}, we start from the triangle inequality:

\rho(x_{n}, z_{n}) \leq \rho(x_{n}, y_{n}) + \rho(y_{n}, z_{n})

in the original space R, and then take the limit as n \rightarrow \infty, obtaining

\lim_{n \rightarrow \infty} \rho(x_{n}, z_{n}) \leq \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n}) + \lim_{n \rightarrow \infty} \rho(y_{n}, z_{n})

That is, \rho_{1}(x^{*}, z^{*}) \leq \rho_{1}(x^{*}, y^{*}) + \rho_{1}(y^{*}, z^{*})

We now come to the crucial step of showing that R^{*} is a completion of R. Suppose that with every point x \in R, we associate the class x^{*} \in R^{*} of all Cauchy sequences converging to x. Let

x = \lim_{n \rightarrow \infty} x_{n}, y = \lim_{n \rightarrow \infty} y_{n}

Then, clearly \rho(x,y) = \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n})

(the above too can be proven with a slight effort: HW quiz); while, on the other hand,

\rho_{1}(x^{*}, y^{*}) = \lim_{n \rightarrow \infty} \rho(x_{n}, y_{n}) by definition. Therefore,

\rho(x,y) = \rho_{1}(x^{*}, y^{*}) and hence, the mapping of R into R^{*} carrying x into x^{*} is isometric. Accordingly, we need no longer distinguish between the original space R and its image in R^{*}, in particular between the two metrics \rho and \rho_{1}. In other words, R can be regarded as a subset of R^{*}. The theorem will be proved once we succeed in showing that

(1) R is everywhere dense in R^{*}, that is, [R] = R^{*};

(2) R^{*} is complete.

Towards that end, given any point x^{*} \in R^{*} and any \epsilon >0, choose a representative of x^{*}, namely a Cauchy sequence \{ x_{n}\} in the class x^{*}. Let N be such that \rho(x_{n}, x_{n^{'}}) < \epsilon for all n, n^{'} >N. Then,

\rho(x_{n}, x^{*}) = \lim_{n^{'} \rightarrow \infty} \rho(x_{n}, x_{n^{'}}) \leq \epsilon if n >N, that is, every neighbourhood of the point x^{*} contains a point of R. It follows that [R] = R^{*}.
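
The density argument can be illustrated on the rationals. In the sketch below (an illustration under assumptions, not part of the proof), x^{*} is the class of the bisection sequence for x^{2}=2, that is, the point that plays the role of \sqrt{2}. Given \epsilon, the hypothetical helper rational_within returns a term x_{n} of that sequence lying within \epsilon of x^{*}; the check is done exactly by comparing squares, so \sqrt{2} itself is never computed.

```python
from fractions import Fraction

def bisect(n):
    """Left endpoint after n bisection steps for x^2 = 2 on [1, 2]."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(n):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo

def rational_within(eps):
    """Return a rational x_n with x_n <= sqrt(2) < x_n + eps."""
    n = 0
    while Fraction(1, 2**n) >= eps:  # stop at the first n with 2^(-n) < eps
        n += 1
    return bisect(n)

eps = Fraction(1, 1000)
x_n = rational_within(eps)
# x_n^2 <= 2 < (x_n + eps)^2 says exactly that x_n is within eps of sqrt(2)
print(x_n, x_n * x_n <= 2 < (x_n + eps) ** 2)
```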

Finally, to show that R^{*} is complete, we first note that by the very definition of R^{*}, any Cauchy sequence \{ x_{n}\} consisting of points in R converges to some point in R^{*}, namely to the point x^{*} \in R^{*} defined by \{ x_{n}\}. Moreover, since R is dense in R^{*}, given any Cauchy sequence \{ x_{n}^{*} \} consisting of points in R^{*}, we can find an equivalent sequence \{ x_{n}\} consisting of points in R. In fact, we need only choose x_{n} to be any point of R such that \rho(x_{n}, x_{n}^{*}) < \frac{1}{n}. The resulting sequence \{ x_{n}\} is fundamental, and, as just shown, converges to a point x^{*} \in R^{*}. But then, since \rho(x_{n}, x_{n}^{*}) < \frac{1}{n} \rightarrow 0, the sequence \{ x_{n}^{*} \} also converges to x^{*}.
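
The choice of x_{n} can be sketched as a "diagonal" selection. The example below assumes R is the rationals and takes, purely for illustration, the points x_{k}^{*} represented by the sequences m \mapsto b_{m} + \frac{1}{k}, where b_{m} is the bisection sequence for x^{2}=2. For this representative, the n-th term lies within 2^{-n} < \frac{1}{n} of x_{n}^{*}, so x_{n} can be taken to be the n-th term of the n-th representative. The names rep and diagonal are hypothetical.

```python
from fractions import Fraction

def bisect(m):
    """Left endpoint after m bisection steps for x^2 = 2 on [1, 2]."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(m):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo

def rep(k):
    """A representative of the k-th point x_k^*: the sequence m -> bisect(m) + 1/k."""
    return lambda m: bisect(m) + Fraction(1, k)

def diagonal(n):
    """x_n: the n-th term of the chosen representative of x_n^*."""
    return rep(n)(n)

# The rational sequence x_n differs from bisect(n) by exactly 1/n, so the two
# sequences are equivalent and define the same limit point x^* in R^*.
for n in (1, 10, 100):
    print(n, float(diagonal(n)), float(diagonal(n) - bisect(n)))
```

In this example the limit x^{*} is the class of the bisection sequence itself, which is the point of R^{*} corresponding to \sqrt{2}.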



If R is the space of all rational numbers, then R^{*} is the space of all real numbers, both equipped with the distance \rho(x,y) = |x-y|. In this way, we can “construct the real number system.” However, there still remains the problem of suitably defining sums and products of real numbers and verifying that the usual axioms of arithmetic are satisfied.


Nalin Pithwa.
