Exercises: Set theoretic construction of real numbers


  1. If p is a prime number, show that \sqrt{p} is irrational.
  2. Show that \sum_{n=2}^{\infty} \frac{1}{n!}<1
  3. Show that if e_{n} = \pm {1}, then the number \sim_{n=1}^{\infty} \frac{e_{n}}{n!} is an irrational number.
  4. Show that if \{ q_{n}\} and \{ r_{n}\} are two Cauchy sequences, then so are \{ q_{n}+r_{n}\} and \{ q_{n}.r_{n}\}.
  5. If A and B are non empty subsets of \Re that is bounded below. Then, s = \inf{A} if and only if (i) s \leq a for all a \in A and (ii) for any \epsilon>0, A \bigcap [s, s +\epsilon) \neq \phi.

Reference: A Second Course in Analysis by M Ram Murty, Hindustan Book Agency.


Nalin Pithwa

Set theoretic construction of real numbers

Continued from previous blog. Same reference: A Second course in analysis by M Ram Murty, Hindustan Book Agency.

Section 1.3

The rational number is insufficient to “measure” all the lengths that arise in the “real world.” This was discovered by the ancient school of Pythagoras which viewed the world through a strange mix of mathematics and mysticism. The aphorism “all is number” seems to have been the underlying mantra of the Pythagoreans. In our modern digital world, this mantra imposes the universality more than in any earlier age.

Pythagoras seems to have been a contemporaryof the Buddha in India, of Confucius and Lao-Tzu in China. He seems to have travelled widely in Egypt, Babylon and India. The Pythagoreans believed in reincarnation and this was wedded to their strict adherence to vegetarianism for by eating meat they might be eating a friend!They are credited to the discovery of both the words philosophy and mathematics. The word “philosophy” literally means the love of wisdom and the word “mathematics” means “that which is learned.” Philosophy then, for the Pythagoreans was seen as a byproduct of mathematics.

The celebrated theorem of Pythagoras dealing with the length of the hypotenuse of a right angled triangle was known to earlier cultures stretching far back in time to Babylon, China and India. It is possible that Pythagoras learned of this theorem during his extensive peregrinations. But the Pythagoreans could see the mysticism inherent in the theorem, for the rule applied to all right angled triangles and there was an element of universality in it and in mathematics in general.

But when the Pythagoreans declared “all is number” they were referring to whole numbers and it is accurate to say that the concept of number had not been clearly articulated at that time. With the underlying idea that whole numbers govern the universe, their theory of ratio and proportion was in for a rude shock. Legend has it that the Pythagorean who discovered that \sqrt{2} is an irrational number was drowned !

The irrationality of \sqrt{2} represents a major episode in the history of mathematics and we owe this to the unfortunate Pythagorean who lost his life by discovering it. The proof is by contradiction and invokes the rudimentary arithmetical idea of a prime number, a concept first enunciated in Euclid’s Elements. A natural number is called a prime number if it has no proper divisors. Thus, the sequence of prime numbers begins as 2,3,5,7,11, \ldots (The number one is not included in the list of prime numberss. It it called a unit, a concept which generalizes to rings.) Two natural numbers a and b are said to be coprime if they have no common prime factor. When we write a fraction a/b in “lowest termss”, we are writing it such that a and b are coprime. We sometimes write (a,b)=1 to indicate that a and b are coprime. This is not to be confused with the concept of an ordered pair introduced earlier.

The irrationality of \sqrt{2} is established by using the method of contradiction. It is instructive to see all the details. If \sqrt{2} is a rational number, then we can write

\sqrt{2}=\frac{a}{b} such that (a,b)=1

Upon squaring both sides, we see


which implies that a is even since the left hand side of the equation is patently an even number. Writing a=2c, we now get


and so b is even. But this contradicts the coprimality of a and b.

This was the theorem that led to the demise of the heroic Pythagorean for it violated the aphorism “all is number” because the ancients viewed numbers as only whole numbers. They also allowed for rational numbers since they can be represented as ratio and proportion of whole quantities. This innocuous theorem opens the door for the discovery of real numbers. For it signals the lack of the “least upper bound property” in the realm of rational numbers.

The student should keep in mind that so far, we have only constructed the universe of rational numbers and thus any further definitions must be given only in terms of rational numbers. A set A of rational numbers is said to be bounded if there is a rational number M such that |x| \leq M for all x \in A. An upper bound for A is any rational number u such that x \leq u for all x \in A. A lower bound for A is a rational number l such that l \leq x for all x \in A. A rational number v is called a least upper bound for A if v \leq u for every upper bound u of A. A rational number m is called a greatest lower bound for A if l \leq m for every lower bound l of A. Thus, a set A is bounded if and only if it has both an upper bound and a lower bound.

Several questions now arise: given a set A of rational numbers, does it always have a least upper bound? Does it have a greatest lower bound? If so, are they unique?

The theorem discovered by the Pythagoreans can be explained as follows. Nascent in this example are many of the fundamental properties we would expect of the real numbers. Essential to the understanding is the familiar fact that the sequence of rational numbers \frac{1}{n} gets smaller and smaller and tends to zero.

If we consider the set

A = \{ x \in \mathcal{Q}: x^{2} \leq 2\}

then A has no least upper bound. Indeed if u is a least upper bound, x \leq u for any x \in A. By definition, u is a rational number and so cannot first equal \sqrt{2}. If u < \sqrt{2}, then we can choose a natural number N such that

\frac{1}{N} < \sqrt{2}-u

so that u + \frac{1}{N}< \sqrt{2}

contradicting that u is an upper bound for the set A. If u>\sqrt{2}, then we can similarly find N such that

u - \frac{1}{N}> \sqrt{2},

which again contradicts that u is the least upper bound for A. Thus, A does not have a least upper bound in the realm of rational numbers.

Digress: We digress a bit here from Dr. M Ram Murty’s above text and reproduce a nice “little proof”‘ from Rudin’s Analysis text illustrating that the rational number line has gaps or holes in it..


As we know there exists no rational solution to p^{2}=2. We now examine this situation a little more closely. Let A be the set of all positive rationals p such that p^{2}<2 and let B consist of all positive rationals p such that p^{2}>2. We shall show that A contains no largest rational number and B contains no smallest rational number.

More explicitly, for every p \in A we can find a rational q in A such that p<q, and for every p in B, we can find a rational q in B such that q < p.

To do this, we associate with each rational number p>0 the number q such that:

q = p - \frac{p^{2}-2}{p+2}=\frac{2p+1}{p+2}…call this I.

Thus, q^{2}-2 = \frac{2(p^{2}-2)}{(p+2)^{2}}….call this II.

If p is in A, then p^{2}-2<0, (1) above shows that q >p, and II shows that q^{2}<2. Thus, q is in A.

If p is in B, then p^{2}-2>0, (I) shows that 0<q<p, and II shows that q^{2}>2. Thus, q is in B.

Conclusion: the purpose of the above discussion is to show that the rational number system has gaps, in spite of the fact that in between any two rational numbers, there is another: if r, s \in \mathcal{Q}, and if r<s, then r < \frac{r+s}{2}<s. The real number system fills these gaps. This is the principle reason for the fundamental role which it plays in analysis.


Two ways of constructing the real numbers were proposed independently by Richard Dedekind (1831-1916) and Augustin Cauchy (1789-1857). Each method has its virtues. The method of Dedekind, using what are called Dedekind cuts is closer to the axiomatic foundations we have been discussing and is suggested by the example discussing \sqrt{2} above. The method of Cauchy using what are now called Cauchy sequences has a wider applicability. In our example above, an essential role is played by the ordering of the rational numbers. The method of Dedekind cuts uses this idea in a fundamental way. It coincides with our intuitive and visual idea of the real numbers as being “points on the number line stretching from minus infinity to plus infinity.”

A Dedekind cut in Q is a partition of Q into two sets A and B with the following properties:

  1. A and B are both non-empty.
  2. A  \bigcup B = \mathcal{Q}
  3. A is “closed downwards”, that is, if r \in A and q<r, then q \in A;
  4. B is “closed upwards”, that is, if q \in B and q<r, then r \in B.
  5. A contains no greatest element, that is, there is no q \in A such that r \leq q for all r \in A.

The set R of real numbers is then defined as the set of all Dedekind cuts.

For example, the irrational number \sqrt{2} is defined by the Dedekind cut:

\mathcal{Q} = A \bigcup B, and A = \{ x \in \mathcal{Q}: x^{2}<2\}, and B = \mathcal{Q}-A.

For example, the number zero is represented by A \bigcup B with A being the set of all non-zero negative rational numbers and B being the set of all non-zero positive rational numbers. As the Dedekind cut A \bigcup B is uniquely defined by A, with B being the complement of A in the set of rationals, we may as well associate A to each real number, or even think of A as the real number. One would then have to define how one would work with this definition from the perspective of addition, multiplication and so on. This is not difficult to do. For instance, we have

A_{1}< A_{2} \Longleftrightarrow A_{1} \subset A_{2}, and A_{1} \leq A_{2} \subseteq A_{2}.

Addition is simple enough:

A_{1}+ A_{2} = \{ a_{1}+a_{2}: a_{1} \in A_{1}, a_{2} \in A_{2}\}

Multiplication is defined (rather cumbersomely) by setting A_{1}.A_{2} to be

\{ a_{1}.a_{2}: a_{1} \in A_{1}, a_{2} \in A_{2}, a_{1}, a_{2} \geq 0\} \bigcup \{ q \in Q: q \leq 0\} if A_{1}, A_{2} \geq 0

-(A_{1}.(-A_{2})), if A_{1} \geq 0, and A_{2}<0

-((-A_{1}).A_{2}), if A_{1}<0, A_{2} \geq 0

(-A_{1}).(-A_{2}), if A_{1},A_{2}<0

The absolute value function is then

|A| = A, if A>0; |A|=0, if A=0; |A|=-A, if A <0.

One can prove that all the usual operations with numbers satisfy the expected properties of commutativity, associativity, and distributivity.

By contrast, the method of Cauchy sequences begins with a definition of convergence. In some ways, it is motivated by our discussion of \sqrt{2} earlier. If A is the Dedekind cut representing \sqrt{2} , then the least upper bound of A “is” \sqrt{2}. Thus, the number \sqrt{2} can be thought of as a limit of a sequence of rational numbers. But there can be many such sequences and so one needs a more formal approach. The underlying observation is that a limit of a sequence of rational numbers need not be rational.

This is perhaps best illustrated by Euler’s series for e. Students of calculus are familiar with

e = \sum_{j=0}^{\infty}\frac{1}{j!}

and the right hand side can be thought of as a limit of a sequence of rational numbers


The irrationality of e is easily proved by the method of contradiction. Suppose that e=a/b for two coprime natural numbers a and b. Then,

b!e = \sum_{j=0}^{b}\frac{b!}{j!} + \sum_{j=b+1}^{\infty}\frac{b!}{j!} …call this 1.1

and the second term on the right hand side is strictly less than the geometric series

\sum_{k=1}^{\infty} \frac{1}{(b+1)^{k}}=\frac{1}{b}

which is less than or equal to 1. But the left hand side of 1.1 is a positive integer, the first term on the right hand side of 1.1 is a positive integer and the second term is not an integer, which is a contradiction.

What this example shows is that the “limit” of the sequence of partial sums S_{n} (each of which is a rational number) is the irrational number e.

The reader is familiar with the usual notion of convergence. We say a sequence of real numbers x_{n} converges to L if given \epsilon>0 there exists N such that

|x_{n}-L|< \epsilon for all n \geq N

One may want to view real numbers as “limits of sequences of rational numbers.”But in our formal construction of numbers, several difficulties arise with this definition. The first is that as we have so far only constructed the rational numbers, we need to take the x_{n}‘s to be all rational numbers. Secondly, the limit L need not be a rational number as the example with e shows. This motivates the formal definition of a Cauchy sequence of rational numbers.

A sequence q_{n} in \mathcal{Q} is called a Cauchy sequence if given \epsilon>0, there is an N such that

|q_{n}-q_{m}|< \epsilon for all m,n \geq N

Now it is easy to see that any convergent sequence is a Cauchy sequence. For reference, we record this below:

Theorem 1.2: Any convergent sequence q_{n} is a Cauchy sequence.

Proof of 1.2: Suppose that q_{n} converges to L. Then, choosing N such that

|q_{n}-L|< \frac{\epsilon}{2} for all n \geq N

we have for all m,n\geq N, by the triangle inequality

|q_{n}-q_{m}| \leq |q_{m}-L|+|L-q_{n}|<\epsilon.


Later, we will show that every Cauchy sequence converges. Thus, a sequence is Cauchy if and only if it is convergent and the two notions are equivalent. But the advantage of the notion of a Cauchy sequence is that the limit value L is not mentioned in the definition.

The construction of real numbers is now brought about by defining an equivalence relation on the set of all Cauchy sequences of rational numbers. Since we have constructed rational numbers, we are now allowed to construct the set of all Cauchy sequences of rational numbers. On this set, we say two Cauchy sequences \{ q_{n}\} and \{ r_{n} \} are equivalent if for any \epsilon>0, there exists a natural number N such that

|q_{n}-r_{n}|< \epsilon for all n \geq N.

This is easily seen to define an equivalence relation on the set of all Cauchy sequences of rational numbers. The set of real numbers \Re is then defined as the set of all equivalence classes. We will denote the equivalence class of the sequence \{ q_{n}\} by |\{ q_{n}\}|.

We leave as an exercise to verify that if \{ q_{n}\} and \{ r_{n}\} are two Cauchy sequences, then so are \{ q_{n}+r_{n}\} and \{ q_{n}.r_{n}\}. Thus, the usual operations of addition and multiplication are easily defined as:

[\{ q_{n}\}]+[\{ r_{n}\}] = [\{q_{n}+r_{n} \}]

[\{ q_{n}\}].[\{ r_{n}\}] = [\{ q_{n}.r_{n}\}]

One needs to verify that these are well-defined by checking that the definition does not depend on the choice of representative of the equivalence class.

The order relation of the real numbers can also be defined. Thus, viewing real numbers as equivalence classes or Cauchy sequences of rational numbers, we define [\{ q_{n}\}] < [\{ r_{n}\}] if there is an N such that q_{n}<r_{n} for all n \geq N. We also write [\{ q_{n}\}] \leq [\{ r_{n}\}] if either [\{ q_{n}\}] < [\{ r_{n}\}] or [\{ q_{n}\}] = [\{ r_{n}\}]. Again, these order relations have the expected properties.

Finally, the absolute value function can also be defined. We set

$latex|[\{ q_{n}\}]| = [\{ |q_{n}|\}]$

and the reader can easily verify that indeed the right hand side is a Cauchy sequence if \{ q_{n}\} is Cauchy.

The absolute value has the usual properties, the most notable being the triangle inequality

|[\{ q_{n}\}] + [\{ r_{n}\}]| \leq |[\{ q_{n}\}]|+|[\{ r_{n}\}]|

Having given this formal definition of the real numbers using Cauchy sequences, it is convenient to drop the sequence notation and just represent the real numbers as single letters. Thus, if a, b are real numbers, the triangle inequality is the familiar: |a+b| \leq |a|+|b|

It is now convenient to visualize the set of real numbers in the usual way as points of the line stretching from -\infty to +\infty.

By the very construction, the set of real numbers has the property of completeness in that every Cauchy sequence converges. The reader will also observe that the construction depended only on the property of the absolute value viewed as a metric to measure the distance between two rational numbers. This signals a wider applicability of the method of Cauchy sequences valid more generally in metric spaces which we discuss later.

With the construction of the real numbers, we have restored the least upper bound property and can now define that a sequence of numbers (rational or real) x_{n} converges to a number L \in \Re if given any \epsilon >0, there is an N such that

|x_{n}-L|<\epsilon for all n \geq N.

A non-empty subset A of \Re is said to be bounded below if there is a number s \in \Re such that s \leq a for all a \in A. Any such number s is called a lower bound for A. Analogously, A is said to be bounded above if there is an upper bound for A. The set A is said to be bounded if it is both bounded below and bounded above and unbounded if it is not bounded.

The construction of the real numbers using the Dedekind cuts shows that every non-empty subset of \Re that is bounded above has a least upper bound denoted by \sup {A}. The least upper bound is also called the supremum. Every non-empty subset which is bounded below has a greatest lower bound denoted by \inf {A}. The greatest lower bound is also called the infimum. It is clear that if A and B are non-empty subsets of \Re such that A \subseteq B, then \sup{A} \leq  \sup{B}(see exercises). The following theorem gives a convenient characterization of the supremum.

Theorem 1.3 Let A be a non-empty subset of \Re that is bounded above. Then s=\sup {A} if and only if

(i) a \leq s for all a \in A, and

(ii) for any \epsilon>0, A \bigcap (s-\epsilon, s]\neq \phi.

Proof :

Let s=\sup {A}. Then s is an upper bound for A so that (i) holds. Now if (ii) were false, there would be an \epsilon>0 such that A \bigcap (s-\epsilon, s] = \phi. But then s-\epsilon would also be an upper bound for A, a contradiction. Conversely, suppose thatt (i) and (ii) hold. Then (i) implies s is an upper bound for A. If s were not the least upper bound for A, there would bea t <ssuch that a \leq t for alla \in A. Taking \epsilon = \frac{s-t}{2}>0 in (ii) we get that A \bigcap (s-\epsilon, s] \neq \phi. But s-\epsilon= t+\epsilonso that A \bigcap (t+\epsilon, s] \neq \phi which is a contradiction since a \leq t for all a \in A.


Theorem 1.3 can also be stated as follows. If A is a non-empty subset of \Re that is bounded above ands=\sup{A}, then either s \in A or A \bigcap (s-\epsilon, s] \neq \phi for all \epsilon >0. A similar theorem can be written for the infimum of a set.


Nalin Pithwa

Constructing numbers from sets

Continued from previous blog: A fast review of set theory; same reference: A Second Course in Analysis by M Ram Murty, Hindustan Book Agency.

Mathematicians and philosophers of the nineteenth century pondered deeply into the nature of a number. The question of “what is a number?” is not a simple one. But since mathematicians decided to give foundations of mathematics using the axiomatic method and sets as the basic building blocks, we are led to define numbers using sets. We follow Richard Dedekind (1831-1916) and Giuseppe Peano (1858-1932) in the following construction. It was as late as 1888 and 1889 when this construction was described in two papers written independently by Dedekind and Peano.

We construct a sequence of sets to represent the natural numbers. As noted earlier, zero is represented by the empty set. We have already described the construction of the natural numbers using the empty set. For each natural number n, the successor of n is denoted by n+1 (and sometimes as n^{'}) and defined as

n \bigcup \{ n \}

Thus, each natural number is a set with n elements, namely,

\{ 0,1,2, \ldots, n-1\}

We designate the set of natural numbers by the symbol \mathcal{N}. (It is a matter of personal convenience whether to include zero as a natural number or not. In this discussion, zero is a natural number. In other settings, it may not be. There is no universal convention regarding this and the student is expected to understand depending on the context. Some authors use the term “whole numbers” to indicate that zero is included in the discussion.)

The arithmetic operations on \mathcal{N} are now defined recursively. Addition is defined as a function from \mathcal{N} \times \mathcal{N} to \mathcal{N}:

+ (m,n) = m+n

where m+n is defined recursively by 0+n=n and m'+n = (m+n)^{'}. A similar definition is given for multiplication x by defining 0 \times n = 0 and

m^{'} \times n = (m \times n) + n

We also define m \times n as simply mn which is the familiar symbology.

An equivalence relation on a set S is a subset R of S \times S satisfying:

  1. (reflexive axiom) (a,a) \in R \forall {x} \in S.
  2. (symmetry axiom) (a,b) \in R \Longleftrightarrow (b,a) \in R.
  3. (transitive axiom) (a,b) \in R and (b,c) \in R implies (a,c) \in R.

The notion of an equivalence relation is an abstaction of our concept of equality, or at least what we implicitly expect of the notion of equality. It is more suggestive to write the equivalence relationn, not as a subset of S \times S as indicated above, but rather more symbolically as \sim that our axioms become:

  1. (reflexive) a \sim a, \forall {a} \in .
  2. (symmetry) a \sim b if and only if b \sim a.
  3. (transitive) a \sim b and b \sim c implies a \sim c.

Equivalence relations play a fundamental role in all of mathematics. They allow us to understand aspects of sets by grouping them using certain properties.

To construct negative integers, we define an equivalence relation on \mathcal{N} \times \mathcal{N}. We write

(m,n)\sim (j,k) \Longleftrightarrow m+k=j+n.

Intuitively, we think of (m,n) as m-n so that it becomes evident that our definition is now in terms of concepts that have been defined earlier. This is very similar to how the ancients worked with negative numberss that appeared in an equation. They usually moved them to the other side so that the equation became an equation of non-negative numbers. However, with our set theoretic definition, we have reached a more fundamental and higher level of abstraction. Thus, with our equivalence relation above on the natural numbers, we define the set of integers as the set of equivalence classes of such ordered pairs. It is now easy to see that the following lemma holds:

Lemma 1.1 If (j,k) is an ordered pair of non-negative integers, then exactly one of the following statements holds:

(a) (j,k) is equivalent to (m,0) for a unique non-negative integer m;

(b) (j,k) is equivalent to (0,m) for a unique non-negative integer m;

(c) (j,k) is equivalent to (0,0).

Sometimes, we denote by |(j,k)| the equivalence class of (j,k). With this lemma in place, we now denote by m the set of pairs of non-negative integers equivalent to (m,0) ; by -m the set of pairs equivalent to (0,m) and by 0 the set of pairs equivalent to (0,0). We denote these equivalence classes by \mathcal{Z}.This gives us set theoretic construction of the set of integers.

We can define the operations of addition and multiplication by setting:

|(j_{1},  k_{1})|+|(j_{2}, k_{2})|=|(j_{1}+j_{2}, k_{1}+k_{2})|

|(j_{1}, k_{1})| \times |(j_{2}, k_{2})| = |(j_{1}j_{2}+k_{1}k_{2}, j_{1}k_{2}+j_{2}k_{1})|

This latter definition is best understood if we recall that the symbol (j,k) represents j-k so that the left hand side of the above equation is

(j_{1}-k_{1})(j_{2}-k_{2}) = j_{1}j_{2}+k_{1}k_{2}-(j_{1}k_{2}+j_{2}k_{1})

One needs to check that these definitions are “well-defined” in the sense that they are independent of the representatives chosen for the equivalence class. This can be done as exercises.

In this way, we have now extended the notion of addition and multiplication from the set of natural numbers to the set of integers. Subtraction of integers can be defined by

|(j_{1}, k_{1})|-|(j_{2}, k_{2})| = |(j_{1}, k_{1})| +(-1)|(j_{2}, k_{2})|

where -1 represents the equivalence class (0,1). All of these definitions correspond to our usual notion of addition, subtraction and multiplication. Their virtue lies in their pure set-theoretic formulation.

We can also order the set of integers in the usual way. Thus,

j_{1}+k_{2} < k_{1}+j_{2} \Longrightarrow |(j_{1}, k_{1})|< |(j_{2}, k_{2})|


j_{1}+k_{2} \leq k_{1}+j_{2} \Longleftrightarrow |(j_{1}, k_{1})| \leq |(j_{2}, k_{2})|.

This corresponds to our usual notion of “less than” and “less than or equal to”.

Finally, we can define the absolute value on the set of integers by setting

|k|=k, if 0<k

|k|=0, if k=0

|k|=-k, if k<0

We can now construct the rational numbers \mathcal{Q} from the set of integers. We do this by defining an equivalence relation on the set \mathcal{Z} \times \mathcal{Z}^{+} by stating that two pairs (j_{1}, k_{1}) and (j_{2}, k_{2}) are equivalent if and only if j_{1}k_{2}=j_{2}k_{1}. Intuitively, we think of (j_{1}, k_{1}) as representing the “fraction” \frac{j_{1}}{k_{1}} and examining what we would mean by \frac{j_{1}}{k_{1}} = \frac{j_{2}}{k_{2}} by reducing it to notions already defined. The set of rational numbers \mathcal{Q} is then defined as the set of such equivalence classes.

The expected operations of addition and multiplication are now evident:

|(j_{1}, k_{1})|+|j_{2}, k_{2}| = |(j_{1}k_{2}+j_{2}k_{1}, k_{1}k_{2})|

|(j_{1}, k_{1})||(j_{2}, k_{2})| = |(j_{1}j_{2}, k_{1}k_{2})|

Again, these definitions are easily verified to be well-defined. Finally, we can now define “division”.If |(j_{1}, k_{1})|, |(j_{2}, k_{2})| \in \mathcal{Q} with j_{2} \neq 0, we define:

\frac{|(j_{1}, k_{1})|}{|(j_{2}, k_{2})|} = |(j_{1}k_{2}, j_{2}k_{1})|

These operations satisfy the familiar laws of associativity, commutativity and distributivity. Subtraction of rational numbers then can be written as :

|(j_{1}, k_{1})|-|(j_{2}, k_{2})| = |(j_{1}, k_{1})|+(-1,1)|(j_{2}, k_{2})|

The ordering of rational numbers can also be written as:

|(j_{1}, k_{1})|< |(j_{2}, k_{2})| \Longleftrightarrow j_{1}k_{2}< j_{2}k_{1}

|(j_{1}, k_{1})|   \leq  |(j_{2}, k_{2})| \Longleftrightarrow j_{1}k_{2} \leq j_{2}k_{1}.

These definitions agree with out usual notions of ordering of the rational numbers.

Finally, the definition of absolute value can be extended as:

|[(j,,k)]|= |(j,k)| if [(0,1)] < [(j,k)]

|[(j,k)]| = |(0,1)| if |(j,k)|=|(0,1)|

|[(j,k)]| = -|(j,k)|<  |(0,1)|.

Again, our familiar properties of the absolute value of rational numbers hold. With this foundational construction in place, we can conveniently represent the equivalence class of (j,k) as simply the fraction j/k and continue to work with these numbers as we were hopefully taught from childhood.

In the next sections/blogs we construct the real numbers from this axiomatic framework.


Hint (generic): keep the meaning of the symbols in mind and meaning of equivalence relations and equivalence classes. Also note that our basic object is a class and a set is a member of a class.

  1. let |(j_{1}, k_{1})|, |(j_{2}, k_{2})| be two elements of \mathcal{Z}. Show that the addition:

|(j_{1}, k_{1})|+|(j_{2}, k_{2})| = |(j_{1}+j_{2}, k_{1}+k_{2})| is well-defined. That is, prove that for any (j_{1}^{'}, k_{1}^{'}) \in |(j_{1}, k_{1})| and (j_{2}^{'},k_{2}^{'}) \in |(j_{2}, k_{2})|, we have that (j_{1}^{'}+j_{2}^{'}, k_{1}^{'}+k_{2}^{'}) is equivalent to (j_{1}+j_{2}, k_{1}+k_{2}).

2. For j_{1}, j_{2}, k \in \mathcal{Z}, prove the distributive law: (j_{1}+j_{2}).k = j_{1}k+j_{2}k.

3. Show that the relations < and \leq on \mathcal{Z} have the following properties:

(a) |(0,j)|< |(0,0)| for all j \in \mathcal{Z}^{+}

(b) |(0,j)|< |(k,0)| for all j, k \in \mathcal{Z}^{+}

(c) |(0,j)|< |(0,k)|, j ,k \in \mathcal{Z}^{+} if and only if k<j

(d) |(0,0)| < |(j,0)| for all j \in \mathcal{Z}^{+}

(e) |(j,0)|<|(k,0)|, j, k \in \mathcal{Z}_{\geq 0}if and only if j<k.

(f) |(0,j)| \leq |(0,0)| for all j \in \mathcal{Z}_{\geq 0}.

(g) |(0,j)| \leq |(k,0)| for all j,k \in \mathcal{Z}_{\geq 0}

(h) |(0,j)| \leq |(0,k)| for j,k \in \mathcal{Z}_{\geq 0} if and only if k \leq j

(i) |(0,0)| \leq |(j,0)| for all j \in \mathcal{Z}_{\geq 0}

(j) |(j,0)| \leq |(k,0)| where j, k \in \mathcal{Z}_{\geq 0} if and only if j \leq k.


Nalin Pithwa.

An example: continuous map which is not homeomorphic

Reference: Topology by Hocking and Young. Dover Publications.

Let us present an example of a continuous mapping (one-to-one) which is not a homeomorphism:

Let S be the set of all non-negative real numbers with their usual metric topology, and let T be the unit circle in its metric topology. Consider $f: S \rightarrow T$. For each x in S, let $f(x) = (1, \frac{2\pi x^{2}}{(1+x^{2})})$, a point in polar coordinates on T.

It is easily shown that f is continuous and one-to-one.

But the set of all x in S such that $x<1$ is open in S while its image is not open in T. Hence, f is not interior. (meaning that: A transformation $f: S \rightarrow T$ of the space S into the space T is said to be interior if f is continuous and and if the image of every open subset of S is open in T) , and is not a homeomorphism (because of the following theorem: A necessary and sufficient condition that the one-to-one continuous map $f: S \rightarrow T$ of the space S onto the space S be a homeomorphism is that f be interior).


Nalin Pithwa.

Some basic facts about connectedness and compactness

Reference: Hocking and Young’s Topology, Dover Publishers. Chapter 1: Topological Spaces and Functions.

Definition : Separated Space: A topological space is separated if it is the union of two disjoint, non empty, open sets.

Definition: Connected Space: A topological space is connected if it is not separated.

PS: Both separatedness and connectedness are invariant under homeomorphisms.

Lemma 1: A space is separated if and only if it is the union of two disjoint, non empty closed sets.

Lemma 2: A space S is connected if and only if the only sets in S which are both open and closed are S and the empty set.

Theorem 1: The real line E^{1} is connected.

Theorem 2: A subset X of a space S is connected if and only if there do not exist two non empty subsets A and B of X such that X = A \bigcup B, and such that (\overline{A} \bigcap B) \bigcup (A \bigcap \overline{B}) is empty.

Note the above is Prof. Rudin’s definition of connectedness.

Theorem 3: Suppose that C is a connected subset of a space S and that \{ C_{\alpha}\} os a collection of connected subsets of S, each of which intersects C. Then, S = C \bigcup (\bigcup_{\alpha}C_{\alpha}) is connected.

Corollary of above: For each n, E^{n} is connected.

Theorem 4:

Every continuous image of a connected space is connected.

Lemma 3: For n>1, the complement of the origin in E^{n} is connected.

Theorem 5: For each n>0, S^{n} is connected.

Theorem 6: If both f: S \rightarrow T and g: T \rightarrow X are continuous, then the composition gf is also continuous.

Lemma 4: A subset X of a space S is compact if and only if every covering of X by open sets in S contains a finite covering of X.

Theorem 7: A closed interval [a,b] in E^{1} is compact.

Theorem 8: Compactness is equivalent to the finite intersection property.

Theorem 9: A compact space is countably compact.

Theorem 10: Compactness and countable compactness are both invariant under continuous transformations.

Theorem 11: A closed subset of a compact space is compact.


Nalin Pithwa.

Some basic facts about continuity

Reference: (1) Topology by Hocking and Young especially chapter 1 (2) Analysis — Walter Rudin.

Definition 1: A transformation f: S \rightarrow T is continuous provided that if p is a limit point of a subset X of S, then f(p) is a limit point or a point of f(X).

Definition 2: We may also state the continuity requirement on f as follows: if p is a limit point of \overline{X}, then f(p) is a point of \overline{f(X)}.

Theorem 1: If S is a set with the discrete topology and f: S \rightarrow T any transformation of S into a topologized set T, then f is continuous.

Theorem 2: A real-valued function y=f(x) defined on an interval [a,b] is continuous provided that if a \leq x_{0} \leq b and \epsilon >0, then there is a number \delta >0 such that if |x-x_{0}|<\delta, x in [a,b], then |f(x) - f(x_{0})|< \epsilon. (NB: this is same as definition 1 above).

Theorem 3: Let f: S \rightarrow T be a transformation of the space S into the space T. A necessary and sufficient condition that f be continuous is that if O is any open subset of T, then its inverse image f^{-1}(O) is open in S.

Theorem 4:

A necessary and sufficient condition that the transformation f: S \rightarrow T of the space S into the space T be continuous is that if x is a point of S, and V is an open subset of T containing f(x), then there is an open set, U in S containing x and such that f(U) lies in V.

Theorem 5:

A one-to-one transformation f: S \rightarrow T of a space S onto a space T is a homeomorphism, if and only if both f and f^{-1} are continuous.

Theorem 6:

Let f: M \rightarrow N be a transformation of the metric space M with metric d into the metric space N with metric \rho. A necessary and sufficient condition that f be continuous is that if \epsilon is any positive number and x is a point of M, then there is a number \delta >0 such that if d(x,y)< \delta, then \rho(f(x), f(y)) < \epsilon.

Theorem 7:

A necessary and sufficient condition that the one-to-one mapping (that is, a continuous transformation) f: S \rightarrow T of the space S onto the space T be a homeomorphism is that f is interior. (NB: A transformation f: S \rightarrow T of the space S into the space T is said to be interior if f is continuous and if the image of every open set subset of S is open in T).


Nalin Pithwa.

Tutorial Problems I: Topology: Hocking and Young

Reference: Topoology by Hocking and Young, Dover Publications, Inc., NY. Available in Amazon India.

Exercises 1-1:

Show that if S is a set with the discrete topology and f: S \rightarrow T is any transformation of S into a topologized set T, then f is continuous.

Solution 1-1:

Definition A: The set S has a topology (or is topologized) provided that, for every point p in S and every subset X of S, the question : “is p a limit point of X?” can be answered.

Definition B: A topology is said to be a discrete topology when we assume that for no point p in S, and every subset X of S: the answer to the question: “is p a limit point of X?” is NO.

Definition C: A transformation f: S \rightarrow T is continuous provided that if p is a limit point of a subset X of S, then f(p) is a limit point or a point of f(X).

So, the claim is vacuously true. QED.

Exercises 1-2:

A real-valued function y=f(x) defined on an interval [a,b] is continuous provided that if a \leq x_{0} \leq b and \epsilon >0, then there is a number \delta>0 such that if $|x-x_{0}|<\delta$, where x \in [a,b], then |f(x)-f(x_{0})|<\epsilon. Show that this is equivalent to our definition, using definition 1-1.

Solution 1-2:

Definition 1-1: The real number p is a limit point of a set X of real numbers provided that for every positive number \epsilon, there is an element x of the set X such that $0<|p-x|<\epsilon$.

Definition C: A transformation f: S \rightarrow T is continuous provided that if p is a limit point of a subset X of S, then f(p) is a limit point or a point of f(X).

Part 1: Let us assume that given function f is continuous as per definition given just above.

Then, as p is a limit point of X: it means: For any \delta>0, there exists a real number p such that there is an element x \in X such that |p-x|<\delta..

So, also, by definition C, f(p) is a limit point or a point of f(X); this means the following: if f(p) is a point of f(X), there exists some x_{0} \in X such that $f(x_{0} \in f(X)$, and so quite clearly in this case p=x_{0} so that |p-x_{0}|=|x_{0}-x_{0}|<\delta, as \delta is positive.

On the other hand, if f(p) is a limit point of f(X), as per the above definition of continuity, then also for any \epsilon>0, there exists a point y \in f(X) such that |y-f(p)|<\epsilon. So, in this case also the claim is true.

We have proved Part 1. QED.

Now, part II: We assume the definition of continuity given in the problem statement is true. From here, we got to prove definition C as the basic definition given by the authors.

But this is quite obvious as in this case p=x_{0}.

We have proved Part II. QED.

Thus, the two definitions are equivalent.


Nalin Pithwa

Metric space question and solution

Reference: I had blogged this example earlier, but I myself could not fill in the missing gaps at that time. I am trying again with the help of MathWorld Wolfram and of course, the classic, Introductory Real Analysis by Kolmogorov and Fomin, from which it is picked up for my study.

Consider the set C_{[a,b]} of all continuous functions defined on the closed interval [a,b]. Let the distance function (or metric) be defined by the formula:

\rho(x,y) = (\int_{a}^{b}[x(t)-y(t)]^{2}dt)^{1/2} ——– relation I

The resulting metric space will be denoted by C_{[a,b]}^{2}.

The first two properties of a metric space are clearly satisfied by the above function. We need to check for the triangle inequality:

Now I satisfies the triangle inequality because of the following Schwarz’s inequality:

(\int_{a}^{b}x(t)y(t)dt)^{2} \leq \int_{a}^{b}x^{2}(t)dt \int_{a}^{b}y(t)dt —— relation II

In order to get to the above relation II, we need to prove the following:

Prove: (\int_{a}^{b} x(t)y(t)dt)^{2} = \int_{a}^{b}x^{2}(t)dt \int_{a}^{b} y^{2}(t)dt - \frac{1}{2}\int_{a}^{b} \int_{a}^{b}[x(s)y(t)-x(t)y(s)]^{2}dsdt.

From the above, we can deduce Schwarz’s inequality (relation II here in this blog article).

(My own attempts failed to crack it so I had to look at the internet for help. Fortunately, MathWorld Wolfram has given a crisp clear proof…but some parts of the proof are still out of my reach…nevertheless, I am reproducing the proof here for the sake of completeness of my notes…for whatever understanding I can derive at this stage from the proof…):


Weisstein, Eric W. “Schwarz’s Inequality.” From MathWorld–A Wolfram Web Resource. https://mathworld.wolfram.com/SchwarzsInequality.html

Schwarz’s Inequality:

Let \Psi_{1}(x), \Psi_{2}(x) be any two real integrable functions in [a,b], then Schwarz’s inequality is given by :

|< \Psi_{1}, \Psi_{2}>|^{2} \leq < \Psi_{1}|\Psi_{2}> <\Psi_{2}|Psi_{1}>

Written out explicity,

|\int_{a}^{b} \Psi_{1}(x), \Psi_{2}(x)|^{2} \leq \int_{a}^{b}[\Psi_{1}(x)]^{2}dx \int_{a}^{b}[\Psi_{2}(x)]^{2}dx

with equality if and only if \Psi_{1}(x) = \alpha \Psi_{2}(x) with \alpha a constant. Schwarz’s inequality is sometimes also called the Cauchy-Schwarz inequality or Buniakowsky’s inequality.

To derive the inequality, let \Psi(x) be a complex function and \lambda a complex constant such that

\Psi(x) \equiv f(x) + \lambda g(x) for some f and g


\int \overline{\Psi} \Psi dx \geq 0, where \overline{z} is the complex conjugate.

\int \overline{\Psi}\Psi dx = \int \overline{f}f dx + \lambda \int \overline{f} g dx + \overline\lambda \int \overline{g} f dx + \lambda \overline{\lambda} \int \overline{g} g dx

with equality when \Psi(x) = 0

Writing this, in compact notation:

<\overline{f},f> + \lambda <\overline{f},g> + \overline{\lambda} <\overline{g},f> + \lambda \overline{\lambda} <\overline{g},g> \geq 0….relation A

Now, define \lambda \equiv - \frac{<\overline{g}, f>}{<\overline{g},g>}….relation B

and \overline{\lambda} = - \frac{<g, \overline{f}>}{<\overline{g}, g>}…relation C

Multiply A by <\overline{g},g> and then plug in B and C to obtain:

<\overline{f}, f><\overline{g}, g> - <\overline{f},g><\overline{g},f> - <\overline{g},f><g, \overline{f}> +<\overline{g}, f><g, \overline{f}> \geq 0

which simplifies to

<\overline{g},f><\overline{f},g> \leq <\overline{f},f><\overline{g},g>


|<f,g>|^{2} \leq <f,f><g,g>. Bessel’s inequality follows from Schwarz’s inequality. QED.


Nalin Pithwa

VI. Countable sets: My notes.


  1. Introduction to Topology and Modern Analysis by G. F. Simmons, Tata McGraw Hill Publications, India.
  2. Introductory Real Analysis — Kolmogorov and Fomin, Dover Publications, N.Y.(to some extent, I have blogged this topic based on this text earlier. Still, it deserves a mention/revision again).
  3. Topology : James Munkres.

The subject of this section and the next — infinite cardinal numbers — lies at the very foundation of modern mathematics. It is a vital instrument in the day to day work of many mathematicians, and we shall make extensive use of it ourselves (in our beginning studying of math ! :-)). This theory, which was created by the German mathematician Georg Cantor, also has great aethetic appeal, for it begins with ideas of extreme simplicity and develops through natural stages into an elaborate and beautiful structure of thought. In the course of our discussion, we shall answer questions which no one before Cantor’s time thought to ask, and we shall ask a question which no one can answer to this day…(perhaps !:-))

Without further ado, we can say that cardinal numbers are those used in counting, such as the positive integers (or natural numbers) 1, 2, 3, …familiar to us all. But, there is much more to the story than this.

The act of counting is undoubtedly one of the oldest of human activities. Men probably learned to count in a crude way at about the same time as they began to develop articulate speech. The earliest men who lived in communities and domesticated animals must have found it necessary to record the number of goats in the village herd by means of a pile of stones or some similar device. If the herd was counted in each night by removing one stone from the pile for each goat accounted for, then stones left over would have indicated strays, and herdsmen would have gone out to search for them. Names for numbers and symbols for them like our 1, 2, 3, …would have been superfluous. The simple yet profound idea of a one-to-one correspondence between the stones and the goats would have fully met the needs of the situation.

In a manner of speaking, we ourselves use the infinite set

N = \{ 1, 2, 3, \ldots\}

of all positive integers as “pile of stones.” We carry this set around with us as part of our intellectual equipment. Whenever we want to count a set, say, a stack of dollar bills, we start through the set N and tally off one bill against each positive integer as we come to it. The last number we reach, corresponding to the last bill, is what we call the number of bills in the stack. If this last number happens to be 10, then “10” is our symbol for the number of bills in the stack, as it also is for the number of our fingers, and for the number of our toes, and for the number of elements which can be put into one-to-one correspondence with the finite set \{ 1,2,3, \ldots, 10\}. Our procedure is slightly more sophisticated than that of the primitive savage man. We have the symbols 1, 2, 3, …for the numbers which arise in counting; we can record them for future use, and communicate them to other people, and manipulate them by the operations of arithmetic. But the underlying idea, that of the one-to-one correspondence, remains the same for us as it probably was for him.

The positive integers are adequate for our purpose of counting any non-empty finite set, and since outside of mathematics all sets appear to of this kind, they suffice for all non-mathematical counting. But in the world of mathematics, we are obliged to consider many infinite sets, such as the set of positive integers itself, the set of all integers, the set of all rational numbers, the set of all real numbers, the set of all points in a plane, and so on. It is often important to be able to count such sets, and it was Cantor’s idea to do this, and to develop a theory of infinite cardinal numbers, by means of one-to-one correspondence.

In comparing the sizes of two sets, the basic concept is that of numerical equivalence as defined in the previous section. We recall that two non-empty sets are numerically equivalent if there exists a one-to-one mapping of a set onto the other, or — and this amounts to the same thing — if there can be found a one-to-one correspondence between them. To say that two non-empty finite sets are numerically equivalent is of course to say that they have the same number of elements in the ordinary sense. If we count one of them, we simply establish a one-to-one correspondence between its elements and a set of positive integers of the form \{1,2,3, \ldots, n \} and we then say that n is the number of elements possessed by both, or the cardinal number of both. The positive integers are the finite cardinal numbers. We encounter many surprises as we follow Cantor and consider numerical equivalences for infinite sets.

The set N = \{ 1,2,3, \ldots\} of all positive integers is obviously “larger” than the set \{ 2,4,6, \ldots\} of all even positive integers, for it contains this set as a proper subset. It appears on the surface that N has “more” elements. But it is very important to avoid jumping to conclusions when dealing with infinite sets, and we must remember that our criterion in these matters is whether there exists a one-to-one correspondence between the sets (not whether one set is a proper subset or not of the other) . As a matter of fact, consider the “pairing”

1,2,3, \ldots, n, \ldots

2,4,6, \ldots, 2n, \ldots

serves to establish a one-to-one correspondence between these sets, in which each positive integer in the upper row is matched with the even positive integer (its double) directly below it, and these two sets must therefore be regarded as having the same number of elements. This is a very remarkable circumstance, for it seems to contradict our intuition and yet is based only on solid common sense. We shall see below, in Problems 6 and 7-4, that every infinite set is numerically equivalent to a proper subset of itself. Since this property is clearly not possessed by any finite set, some writers even use it as the definition of an infinite set.

In much the same way as above, we can show that N is numerically equivalent to the set of all even integers:

1, 2, 3,4, 5,6, 7, \ldots

0,2,-2,4,-4,4,6,-6, \ldots

Here, our device is start with zero and follow each even positive integer as we come to it by its negative. Similarly, N is numerically equivalent to the set of all integers:

1,2,3,4,5,6,7, \ldots

0,1,-1, 2, -2, 3, -3, \ldots

It is of considerable interest historical interest to note that Galileo had observed in the early seventeenth century that there are precisely as many perfect squares (1,4,9,16,25, etc.) among the positive integers as there are positive integers altogether. This is clear from the “pairing”:

1,2,3,4,5, \ldots

1^{2}, 2^{2}, 3^{2}, 4^{2}, 5^{2}, \ldots

It struck him as very strange that this should be true, considering how sparsely strewn the squares are among all the positive integers. But, the time appears not to have been ripe for the exploration of this phenomenon, or perhaps he had other things on his mind; in any case, he did not follow up his idea.

These examples should make it clear that all that is really necessary in showing that an infinite set X is numerically equivalent to N is that we be able to list the elements of X, with a first, a second, a third, and so on, in such a way that it is completed exhausted by this counting off of its elements. It is for this reason that any infinite set which is numerically equivalent to N is said to be countably infinite. (Prof. Walter Rudin also, in his classic on mathematical analysis, considers a countable set to be either finite or countably infinite. ) We say that a set is countable it is non-empty and finite (in which case it can obviously be counted) or if it is countably infinite.

One of Cantor’s earliest discoveries in his study of infinite sets was that the set of all positive rational numbers (which is very large : it contains N and a great many other numbers besides) is actually countable. We cannot list the positive rational numbers in order of size, as we can the positive integers, beginning with the smallest, then the next smallest, and so on, for there is no smallest, and between any two there are infinitely many others. We must find some other way of counting them, and following Cantor, we arrange them not not in order of size, but according to the size of the sum of numerator and denominator. We begin with all positive rationals whose numerator and denominator sum add up to 2: there is only one \frac{1}{1}=1. Next, we list (with increasing numerators) all those for which this sum is 3: \frac{1}{2}, \frac{2}{1}=2. Next, all those for which the sum is 4: \frac{1}{3}, \frac{2}{2}=1, \frac{3}{1}=3. Next, all those for which this sum is 5: \frac{1}{4}, \frac{2}{3}, \frac{3}{2}, \frac{4}{1}=4. Next, all those for which this sum is 6: \frac{1}{5}, \frac{2}{4}, \frac{3}{3}=1, \frac{4}{2}=2, \frac{5}{1}=5. And, so on. If we now list all these together from the beginning, omitting those already listed when we come to them, we get a sequence like:

1, 1/2, 2, 1/3, 1/4, 2/3, 3/2, 4, 1/5, 5, \ldots

which contains each positive rational number once and only once. (There is a nice schematic representation of this : Cantor’s diagonalization process; please google it). In this figure, the first row contains all positive rationals with numerator 1, the second all with numerator 2, and so on. Our listing amounts to traversing this array of numbers in a zig-zag manner (again, please google), where of course, all those numbers already encountered are left out.

It is high time that we christened the infinite cardinal number we have been discussing, and for this purpose, we use the first letter of the Hebrew alphabet, \bf{aleph_{0}}. We say that \aleph_{0} is the number of elements in any countably infinite set. Our complete list of cardinal numbers so far is

1, 2, 3, \ldots, \aleph_{0}.

We expand this list in the next section.

Suppose now that m and n are two cardinal numbers (finite or infinite). The statement that m is less than n (written m < n) is defined to mean the following: if X and Y are sets with m and n elements, then (i) there exists a one-to-one mapping of X into Y, and (ii) there does not exist a mapping of X onto Y. Using this concept, it is easy to relate our cardinal numbers to one another by means of:

1<2<3< \ldots < \aleph_{0}.

With respect to the finite cardinal numbers, this ordering corresponds to their usual ordering as real numbers.


Nalin Pithwa