Is Math really abstract? I N Herstein answers…

Reference: Chapter 1: Abstract Algebra Third Edition, I. N. Herstein, Prentice Hall International Edition:

For many readers/students of pure mathematics, such a book will be their first contact with abstract mathematics. The subject to be discussed is usually called “abstract algebra,” but the difficulties that the reader may encounter are not so much due to the “algebra” part as they are to the “abstract” part.

On seeing some area of abstract mathematics for the first time,be it in analysis, topology, or what not, there seems to be a common reaction for the novice. This can best be described by a feeling of being adrift, of not having something solid to hang on to. This is not too surprising, for while many of the ideas are fundamentally quite simple, they are subtle and seem to elude one’s grasp the first time around. One way to mitigate this feeling of limbo, or asking oneself “What is the point of all this?” is to take the concept at hand and see what it says in particular concrete cases. In other words, the best road to good understanding of the notions introduced is to look at examples. This is true in all of mathematics.

Can one, with a few strokes, quickly describe the essence, purpose, and background for abstract algebra, for example?

We start with some collection of objects S and endow this collection with an algebraic structure by assuming that we can combine, in one or several ways (usually two) elements of this set S to obtain, once more, elements of this set S. These ways of combining elements of S we call operations on S. Then we try to condition or regulate the nature of S by imposing rules on how these operations behave on S. These rules are usually called axioms defining the particular structure on S. These axioms are for us to define, but the choice made comes, historically in mathematics from noticing that there are many concrete mathematical systems that satisfy these rules or axioms. In algebra, we study algebraic objects or structures called groups, rings, fields.

Of course, one could try many sets of axioms to define new structures. What would we require of such a structure? Certainly we would want that the axioms be consistent, that is, that we should not be led to some nonsensical contradiction computing within the framework of the allowable things the axioms permit us to do. But that is not enough. We can easily set up such algebraic structures by imposing a set of rules on a set S that lead to a pathological or weird system. Furthermore, there may be very few examples of something obeying the rules we have laid down.

Time has shown that certain structures defined by “axioms” play an important role in mathematics (and other areas as well) and that certain others are of no interest. The ones we mentioned earlier, namely, groups, rings, fields, and vector spaces have stood the test of time.

A word about the use of “axioms.” In everyday language, “an axiom means a self-evident truth”. But we are not using every day language; we are dealing with mathematics. An axiom is not a universal truth — but one of several rules spelling out a given mathematical structure. The axiom is true in the system we are studying because we forced it to be true by “force” or “our choice” or “hypothesis”. It is a licence, in that particular structure to do certain things.

We return to something we said earlier about the reaction that many students have on their first encounter with this kind of algebra, namely, a lack of feeling that the material is something they can get their teeth into. Do not be discouraged if the initial exposure leaves you in a bit of a fog.Stick with it, try to understand what a given concept says and most importantly, look at particular, concrete examples of the concept under discussion.

Follow the same approach in linear algebra, analysis and topology.

Cheers, cheers, cheers,

Nalin Pithwa

Difference between identity and equality

A nice way to illustrate this is the following conversation in Lewis Caroll’s Alice in Wonderland:

In a discussion between Alice and the Pigeon, the latter accuses her of being a serpent. In spite of her indignation, Alice has to confess that “little girls eat as many eggs as serpents do.” But, the Pigeon calls “anything with long necks and eating many eggs” as “a serpent.”

In other words, two objects may have same property and are equal in that sense. But, the objects might be different.

So, that brings to mind the next question: what is the difference between equal and equivalence. In my opinion, equivalence measures equality up to a certain property. For example, congruence modulo relation in numbers. Or, congruent triangles in Euclid’s geometry.


Nalin Pithwa

Exercises: Set theoretic construction of real numbers


  1. If p is a prime number, show that \sqrt{p} is irrational.
  2. Show that \sum_{n=2}^{\infty} \frac{1}{n!}<1
  3. Show that if e_{n} = \pm {1}, then the number \sim_{n=1}^{\infty} \frac{e_{n}}{n!} is an irrational number.
  4. Show that if \{ q_{n}\} and \{ r_{n}\} are two Cauchy sequences, then so are \{ q_{n}+r_{n}\} and \{ q_{n}.r_{n}\}.
  5. If A and B are non empty subsets of \Re that is bounded below. Then, s = \inf{A} if and only if (i) s \leq a for all a \in A and (ii) for any \epsilon>0, A \bigcap [s, s +\epsilon) \neq \phi.

Reference: A Second Course in Analysis by M Ram Murty, Hindustan Book Agency.


Nalin Pithwa

Set theoretic construction of real numbers

Continued from previous blog. Same reference: A Second course in analysis by M Ram Murty, Hindustan Book Agency.

Section 1.3

The rational number is insufficient to “measure” all the lengths that arise in the “real world.” This was discovered by the ancient school of Pythagoras which viewed the world through a strange mix of mathematics and mysticism. The aphorism “all is number” seems to have been the underlying mantra of the Pythagoreans. In our modern digital world, this mantra imposes the universality more than in any earlier age.

Pythagoras seems to have been a contemporaryof the Buddha in India, of Confucius and Lao-Tzu in China. He seems to have travelled widely in Egypt, Babylon and India. The Pythagoreans believed in reincarnation and this was wedded to their strict adherence to vegetarianism for by eating meat they might be eating a friend!They are credited to the discovery of both the words philosophy and mathematics. The word “philosophy” literally means the love of wisdom and the word “mathematics” means “that which is learned.” Philosophy then, for the Pythagoreans was seen as a byproduct of mathematics.

The celebrated theorem of Pythagoras dealing with the length of the hypotenuse of a right angled triangle was known to earlier cultures stretching far back in time to Babylon, China and India. It is possible that Pythagoras learned of this theorem during his extensive peregrinations. But the Pythagoreans could see the mysticism inherent in the theorem, for the rule applied to all right angled triangles and there was an element of universality in it and in mathematics in general.

But when the Pythagoreans declared “all is number” they were referring to whole numbers and it is accurate to say that the concept of number had not been clearly articulated at that time. With the underlying idea that whole numbers govern the universe, their theory of ratio and proportion was in for a rude shock. Legend has it that the Pythagorean who discovered that \sqrt{2} is an irrational number was drowned !

The irrationality of \sqrt{2} represents a major episode in the history of mathematics and we owe this to the unfortunate Pythagorean who lost his life by discovering it. The proof is by contradiction and invokes the rudimentary arithmetical idea of a prime number, a concept first enunciated in Euclid’s Elements. A natural number is called a prime number if it has no proper divisors. Thus, the sequence of prime numbers begins as 2,3,5,7,11, \ldots (The number one is not included in the list of prime numberss. It it called a unit, a concept which generalizes to rings.) Two natural numbers a and b are said to be coprime if they have no common prime factor. When we write a fraction a/b in “lowest termss”, we are writing it such that a and b are coprime. We sometimes write (a,b)=1 to indicate that a and b are coprime. This is not to be confused with the concept of an ordered pair introduced earlier.

The irrationality of \sqrt{2} is established by using the method of contradiction. It is instructive to see all the details. If \sqrt{2} is a rational number, then we can write

\sqrt{2}=\frac{a}{b} such that (a,b)=1

Upon squaring both sides, we see


which implies that a is even since the left hand side of the equation is patently an even number. Writing a=2c, we now get


and so b is even. But this contradicts the coprimality of a and b.

This was the theorem that led to the demise of the heroic Pythagorean for it violated the aphorism “all is number” because the ancients viewed numbers as only whole numbers. They also allowed for rational numbers since they can be represented as ratio and proportion of whole quantities. This innocuous theorem opens the door for the discovery of real numbers. For it signals the lack of the “least upper bound property” in the realm of rational numbers.

The student should keep in mind that so far, we have only constructed the universe of rational numbers and thus any further definitions must be given only in terms of rational numbers. A set A of rational numbers is said to be bounded if there is a rational number M such that |x| \leq M for all x \in A. An upper bound for A is any rational number u such that x \leq u for all x \in A. A lower bound for A is a rational number l such that l \leq x for all x \in A. A rational number v is called a least upper bound for A if v \leq u for every upper bound u of A. A rational number m is called a greatest lower bound for A if l \leq m for every lower bound l of A. Thus, a set A is bounded if and only if it has both an upper bound and a lower bound.

Several questions now arise: given a set A of rational numbers, does it always have a least upper bound? Does it have a greatest lower bound? If so, are they unique?

The theorem discovered by the Pythagoreans can be explained as follows. Nascent in this example are many of the fundamental properties we would expect of the real numbers. Essential to the understanding is the familiar fact that the sequence of rational numbers \frac{1}{n} gets smaller and smaller and tends to zero.

If we consider the set

A = \{ x \in \mathcal{Q}: x^{2} \leq 2\}

then A has no least upper bound. Indeed if u is a least upper bound, x \leq u for any x \in A. By definition, u is a rational number and so cannot first equal \sqrt{2}. If u < \sqrt{2}, then we can choose a natural number N such that

\frac{1}{N} < \sqrt{2}-u

so that u + \frac{1}{N}< \sqrt{2}

contradicting that u is an upper bound for the set A. If u>\sqrt{2}, then we can similarly find N such that

u - \frac{1}{N}> \sqrt{2},

which again contradicts that u is the least upper bound for A. Thus, A does not have a least upper bound in the realm of rational numbers.

Digress: We digress a bit here from Dr. M Ram Murty’s above text and reproduce a nice “little proof”‘ from Rudin’s Analysis text illustrating that the rational number line has gaps or holes in it..


As we know there exists no rational solution to p^{2}=2. We now examine this situation a little more closely. Let A be the set of all positive rationals p such that p^{2}<2 and let B consist of all positive rationals p such that p^{2}>2. We shall show that A contains no largest rational number and B contains no smallest rational number.

More explicitly, for every p \in A we can find a rational q in A such that p<q, and for every p in B, we can find a rational q in B such that q < p.

To do this, we associate with each rational number p>0 the number q such that:

q = p - \frac{p^{2}-2}{p+2}=\frac{2p+1}{p+2}…call this I.

Thus, q^{2}-2 = \frac{2(p^{2}-2)}{(p+2)^{2}}….call this II.

If p is in A, then p^{2}-2<0, (1) above shows that q >p, and II shows that q^{2}<2. Thus, q is in A.

If p is in B, then p^{2}-2>0, (I) shows that 0<q<p, and II shows that q^{2}>2. Thus, q is in B.

Conclusion: the purpose of the above discussion is to show that the rational number system has gaps, in spite of the fact that in between any two rational numbers, there is another: if r, s \in \mathcal{Q}, and if r<s, then r < \frac{r+s}{2}<s. The real number system fills these gaps. This is the principle reason for the fundamental role which it plays in analysis.


Two ways of constructing the real numbers were proposed independently by Richard Dedekind (1831-1916) and Augustin Cauchy (1789-1857). Each method has its virtues. The method of Dedekind, using what are called Dedekind cuts is closer to the axiomatic foundations we have been discussing and is suggested by the example discussing \sqrt{2} above. The method of Cauchy using what are now called Cauchy sequences has a wider applicability. In our example above, an essential role is played by the ordering of the rational numbers. The method of Dedekind cuts uses this idea in a fundamental way. It coincides with our intuitive and visual idea of the real numbers as being “points on the number line stretching from minus infinity to plus infinity.”

A Dedekind cut in Q is a partition of Q into two sets A and B with the following properties:

  1. A and B are both non-empty.
  2. A  \bigcup B = \mathcal{Q}
  3. A is “closed downwards”, that is, if r \in A and q<r, then q \in A;
  4. B is “closed upwards”, that is, if q \in B and q<r, then r \in B.
  5. A contains no greatest element, that is, there is no q \in A such that r \leq q for all r \in A.

The set R of real numbers is then defined as the set of all Dedekind cuts.

For example, the irrational number \sqrt{2} is defined by the Dedekind cut:

\mathcal{Q} = A \bigcup B, and A = \{ x \in \mathcal{Q}: x^{2}<2\}, and B = \mathcal{Q}-A.

For example, the number zero is represented by A \bigcup B with A being the set of all non-zero negative rational numbers and B being the set of all non-zero positive rational numbers. As the Dedekind cut A \bigcup B is uniquely defined by A, with B being the complement of A in the set of rationals, we may as well associate A to each real number, or even think of A as the real number. One would then have to define how one would work with this definition from the perspective of addition, multiplication and so on. This is not difficult to do. For instance, we have

A_{1}< A_{2} \Longleftrightarrow A_{1} \subset A_{2}, and A_{1} \leq A_{2} \subseteq A_{2}.

Addition is simple enough:

A_{1}+ A_{2} = \{ a_{1}+a_{2}: a_{1} \in A_{1}, a_{2} \in A_{2}\}

Multiplication is defined (rather cumbersomely) by setting A_{1}.A_{2} to be

\{ a_{1}.a_{2}: a_{1} \in A_{1}, a_{2} \in A_{2}, a_{1}, a_{2} \geq 0\} \bigcup \{ q \in Q: q \leq 0\} if A_{1}, A_{2} \geq 0

-(A_{1}.(-A_{2})), if A_{1} \geq 0, and A_{2}<0

-((-A_{1}).A_{2}), if A_{1}<0, A_{2} \geq 0

(-A_{1}).(-A_{2}), if A_{1},A_{2}<0

The absolute value function is then

|A| = A, if A>0; |A|=0, if A=0; |A|=-A, if A <0.

One can prove that all the usual operations with numbers satisfy the expected properties of commutativity, associativity, and distributivity.

By contrast, the method of Cauchy sequences begins with a definition of convergence. In some ways, it is motivated by our discussion of \sqrt{2} earlier. If A is the Dedekind cut representing \sqrt{2} , then the least upper bound of A “is” \sqrt{2}. Thus, the number \sqrt{2} can be thought of as a limit of a sequence of rational numbers. But there can be many such sequences and so one needs a more formal approach. The underlying observation is that a limit of a sequence of rational numbers need not be rational.

This is perhaps best illustrated by Euler’s series for e. Students of calculus are familiar with

e = \sum_{j=0}^{\infty}\frac{1}{j!}

and the right hand side can be thought of as a limit of a sequence of rational numbers


The irrationality of e is easily proved by the method of contradiction. Suppose that e=a/b for two coprime natural numbers a and b. Then,

b!e = \sum_{j=0}^{b}\frac{b!}{j!} + \sum_{j=b+1}^{\infty}\frac{b!}{j!} …call this 1.1

and the second term on the right hand side is strictly less than the geometric series

\sum_{k=1}^{\infty} \frac{1}{(b+1)^{k}}=\frac{1}{b}

which is less than or equal to 1. But the left hand side of 1.1 is a positive integer, the first term on the right hand side of 1.1 is a positive integer and the second term is not an integer, which is a contradiction.

What this example shows is that the “limit” of the sequence of partial sums S_{n} (each of which is a rational number) is the irrational number e.

The reader is familiar with the usual notion of convergence. We say a sequence of real numbers x_{n} converges to L if given \epsilon>0 there exists N such that

|x_{n}-L|< \epsilon for all n \geq N

One may want to view real numbers as “limits of sequences of rational numbers.”But in our formal construction of numbers, several difficulties arise with this definition. The first is that as we have so far only constructed the rational numbers, we need to take the x_{n}‘s to be all rational numbers. Secondly, the limit L need not be a rational number as the example with e shows. This motivates the formal definition of a Cauchy sequence of rational numbers.

A sequence q_{n} in \mathcal{Q} is called a Cauchy sequence if given \epsilon>0, there is an N such that

|q_{n}-q_{m}|< \epsilon for all m,n \geq N

Now it is easy to see that any convergent sequence is a Cauchy sequence. For reference, we record this below:

Theorem 1.2: Any convergent sequence q_{n} is a Cauchy sequence.

Proof of 1.2: Suppose that q_{n} converges to L. Then, choosing N such that

|q_{n}-L|< \frac{\epsilon}{2} for all n \geq N

we have for all m,n\geq N, by the triangle inequality

|q_{n}-q_{m}| \leq |q_{m}-L|+|L-q_{n}|<\epsilon.


Later, we will show that every Cauchy sequence converges. Thus, a sequence is Cauchy if and only if it is convergent and the two notions are equivalent. But the advantage of the notion of a Cauchy sequence is that the limit value L is not mentioned in the definition.

The construction of real numbers is now brought about by defining an equivalence relation on the set of all Cauchy sequences of rational numbers. Since we have constructed rational numbers, we are now allowed to construct the set of all Cauchy sequences of rational numbers. On this set, we say two Cauchy sequences \{ q_{n}\} and \{ r_{n} \} are equivalent if for any \epsilon>0, there exists a natural number N such that

|q_{n}-r_{n}|< \epsilon for all n \geq N.

This is easily seen to define an equivalence relation on the set of all Cauchy sequences of rational numbers. The set of real numbers \Re is then defined as the set of all equivalence classes. We will denote the equivalence class of the sequence \{ q_{n}\} by |\{ q_{n}\}|.

We leave as an exercise to verify that if \{ q_{n}\} and \{ r_{n}\} are two Cauchy sequences, then so are \{ q_{n}+r_{n}\} and \{ q_{n}.r_{n}\}. Thus, the usual operations of addition and multiplication are easily defined as:

[\{ q_{n}\}]+[\{ r_{n}\}] = [\{q_{n}+r_{n} \}]

[\{ q_{n}\}].[\{ r_{n}\}] = [\{ q_{n}.r_{n}\}]

One needs to verify that these are well-defined by checking that the definition does not depend on the choice of representative of the equivalence class.

The order relation of the real numbers can also be defined. Thus, viewing real numbers as equivalence classes or Cauchy sequences of rational numbers, we define [\{ q_{n}\}] < [\{ r_{n}\}] if there is an N such that q_{n}<r_{n} for all n \geq N. We also write [\{ q_{n}\}] \leq [\{ r_{n}\}] if either [\{ q_{n}\}] < [\{ r_{n}\}] or [\{ q_{n}\}] = [\{ r_{n}\}]. Again, these order relations have the expected properties.

Finally, the absolute value function can also be defined. We set

$latex|[\{ q_{n}\}]| = [\{ |q_{n}|\}]$

and the reader can easily verify that indeed the right hand side is a Cauchy sequence if \{ q_{n}\} is Cauchy.

The absolute value has the usual properties, the most notable being the triangle inequality

|[\{ q_{n}\}] + [\{ r_{n}\}]| \leq |[\{ q_{n}\}]|+|[\{ r_{n}\}]|

Having given this formal definition of the real numbers using Cauchy sequences, it is convenient to drop the sequence notation and just represent the real numbers as single letters. Thus, if a, b are real numbers, the triangle inequality is the familiar: |a+b| \leq |a|+|b|

It is now convenient to visualize the set of real numbers in the usual way as points of the line stretching from -\infty to +\infty.

By the very construction, the set of real numbers has the property of completeness in that every Cauchy sequence converges. The reader will also observe that the construction depended only on the property of the absolute value viewed as a metric to measure the distance between two rational numbers. This signals a wider applicability of the method of Cauchy sequences valid more generally in metric spaces which we discuss later.

With the construction of the real numbers, we have restored the least upper bound property and can now define that a sequence of numbers (rational or real) x_{n} converges to a number L \in \Re if given any \epsilon >0, there is an N such that

|x_{n}-L|<\epsilon for all n \geq N.

A non-empty subset A of \Re is said to be bounded below if there is a number s \in \Re such that s \leq a for all a \in A. Any such number s is called a lower bound for A. Analogously, A is said to be bounded above if there is an upper bound for A. The set A is said to be bounded if it is both bounded below and bounded above and unbounded if it is not bounded.

The construction of the real numbers using the Dedekind cuts shows that every non-empty subset of \Re that is bounded above has a least upper bound denoted by \sup {A}. The least upper bound is also called the supremum. Every non-empty subset which is bounded below has a greatest lower bound denoted by \inf {A}. The greatest lower bound is also called the infimum. It is clear that if A and B are non-empty subsets of \Re such that A \subseteq B, then \sup{A} \leq  \sup{B}(see exercises). The following theorem gives a convenient characterization of the supremum.

Theorem 1.3 Let A be a non-empty subset of \Re that is bounded above. Then s=\sup {A} if and only if

(i) a \leq s for all a \in A, and

(ii) for any \epsilon>0, A \bigcap (s-\epsilon, s]\neq \phi.

Proof :

Let s=\sup {A}. Then s is an upper bound for A so that (i) holds. Now if (ii) were false, there would be an \epsilon>0 such that A \bigcap (s-\epsilon, s] = \phi. But then s-\epsilon would also be an upper bound for A, a contradiction. Conversely, suppose thatt (i) and (ii) hold. Then (i) implies s is an upper bound for A. If s were not the least upper bound for A, there would bea t <ssuch that a \leq t for alla \in A. Taking \epsilon = \frac{s-t}{2}>0 in (ii) we get that A \bigcap (s-\epsilon, s] \neq \phi. But s-\epsilon= t+\epsilonso that A \bigcap (t+\epsilon, s] \neq \phi which is a contradiction since a \leq t for all a \in A.


Theorem 1.3 can also be stated as follows. If A is a non-empty subset of \Re that is bounded above ands=\sup{A}, then either s \in A or A \bigcap (s-\epsilon, s] \neq \phi for all \epsilon >0. A similar theorem can be written for the infimum of a set.


Nalin Pithwa

Constructing numbers from sets

Continued from previous blog: A fast review of set theory; same reference: A Second Course in Analysis by M Ram Murty, Hindustan Book Agency.

Mathematicians and philosophers of the nineteenth century pondered deeply into the nature of a number. The question of “what is a number?” is not a simple one. But since mathematicians decided to give foundations of mathematics using the axiomatic method and sets as the basic building blocks, we are led to define numbers using sets. We follow Richard Dedekind (1831-1916) and Giuseppe Peano (1858-1932) in the following construction. It was as late as 1888 and 1889 when this construction was described in two papers written independently by Dedekind and Peano.

We construct a sequence of sets to represent the natural numbers. As noted earlier, zero is represented by the empty set. We have already described the construction of the natural numbers using the empty set. For each natural number n, the successor of n is denoted by n+1 (and sometimes as n^{'}) and defined as

n \bigcup \{ n \}

Thus, each natural number is a set with n elements, namely,

\{ 0,1,2, \ldots, n-1\}

We designate the set of natural numbers by the symbol \mathcal{N}. (It is a matter of personal convenience whether to include zero as a natural number or not. In this discussion, zero is a natural number. In other settings, it may not be. There is no universal convention regarding this and the student is expected to understand depending on the context. Some authors use the term “whole numbers” to indicate that zero is included in the discussion.)

The arithmetic operations on \mathcal{N} are now defined recursively. Addition is defined as a function from \mathcal{N} \times \mathcal{N} to \mathcal{N}:

+ (m,n) = m+n

where m+n is defined recursively by 0+n=n and m'+n = (m+n)^{'}. A similar definition is given for multiplication x by defining 0 \times n = 0 and

m^{'} \times n = (m \times n) + n

We also define m \times n as simply mn which is the familiar symbology.

An equivalence relation on a set S is a subset R of S \times S satisfying:

  1. (reflexive axiom) (a,a) \in R \forall {x} \in S.
  2. (symmetry axiom) (a,b) \in R \Longleftrightarrow (b,a) \in R.
  3. (transitive axiom) (a,b) \in R and (b,c) \in R implies (a,c) \in R.

The notion of an equivalence relation is an abstaction of our concept of equality, or at least what we implicitly expect of the notion of equality. It is more suggestive to write the equivalence relationn, not as a subset of S \times S as indicated above, but rather more symbolically as \sim that our axioms become:

  1. (reflexive) a \sim a, \forall {a} \in .
  2. (symmetry) a \sim b if and only if b \sim a.
  3. (transitive) a \sim b and b \sim c implies a \sim c.

Equivalence relations play a fundamental role in all of mathematics. They allow us to understand aspects of sets by grouping them using certain properties.

To construct negative integers, we define an equivalence relation on \mathcal{N} \times \mathcal{N}. We write

(m,n)\sim (j,k) \Longleftrightarrow m+k=j+n.

Intuitively, we think of (m,n) as m-n so that it becomes evident that our definition is now in terms of concepts that have been defined earlier. This is very similar to how the ancients worked with negative numberss that appeared in an equation. They usually moved them to the other side so that the equation became an equation of non-negative numbers. However, with our set theoretic definition, we have reached a more fundamental and higher level of abstraction. Thus, with our equivalence relation above on the natural numbers, we define the set of integers as the set of equivalence classes of such ordered pairs. It is now easy to see that the following lemma holds:

Lemma 1.1 If (j,k) is an ordered pair of non-negative integers, then exactly one of the following statements holds:

(a) (j,k) is equivalent to (m,0) for a unique non-negative integer m;

(b) (j,k) is equivalent to (0,m) for a unique non-negative integer m;

(c) (j,k) is equivalent to (0,0).

Sometimes, we denote by |(j,k)| the equivalence class of (j,k). With this lemma in place, we now denote by m the set of pairs of non-negative integers equivalent to (m,0) ; by -m the set of pairs equivalent to (0,m) and by 0 the set of pairs equivalent to (0,0). We denote these equivalence classes by \mathcal{Z}.This gives us set theoretic construction of the set of integers.

We can define the operations of addition and multiplication by setting:

|(j_{1},  k_{1})|+|(j_{2}, k_{2})|=|(j_{1}+j_{2}, k_{1}+k_{2})|

|(j_{1}, k_{1})| \times |(j_{2}, k_{2})| = |(j_{1}j_{2}+k_{1}k_{2}, j_{1}k_{2}+j_{2}k_{1})|

This latter definition is best understood if we recall that the symbol (j,k) represents j-k so that the left hand side of the above equation is

(j_{1}-k_{1})(j_{2}-k_{2}) = j_{1}j_{2}+k_{1}k_{2}-(j_{1}k_{2}+j_{2}k_{1})

One needs to check that these definitions are “well-defined” in the sense that they are independent of the representatives chosen for the equivalence class. This can be done as exercises.

In this way, we have now extended the notion of addition and multiplication from the set of natural numbers to the set of integers. Subtraction of integers can be defined by

|(j_{1}, k_{1})|-|(j_{2}, k_{2})| = |(j_{1}, k_{1})| +(-1)|(j_{2}, k_{2})|

where -1 represents the equivalence class (0,1). All of these definitions correspond to our usual notion of addition, subtraction and multiplication. Their virtue lies in their pure set-theoretic formulation.

We can also order the set of integers in the usual way. Thus,

j_{1}+k_{2} < k_{1}+j_{2} \Longrightarrow |(j_{1}, k_{1})|< |(j_{2}, k_{2})|


j_{1}+k_{2} \leq k_{1}+j_{2} \Longleftrightarrow |(j_{1}, k_{1})| \leq |(j_{2}, k_{2})|.

This corresponds to our usual notion of “less than” and “less than or equal to”.

Finally, we can define the absolute value on the set of integers by setting

|k|=k, if 0<k

|k|=0, if k=0

|k|=-k, if k<0

We can now construct the rational numbers \mathcal{Q} from the set of integers. We do this by defining an equivalence relation on the set \mathcal{Z} \times \mathcal{Z}^{+} by stating that two pairs (j_{1}, k_{1}) and (j_{2}, k_{2}) are equivalent if and only if j_{1}k_{2}=j_{2}k_{1}. Intuitively, we think of (j_{1}, k_{1}) as representing the “fraction” \frac{j_{1}}{k_{1}} and examining what we would mean by \frac{j_{1}}{k_{1}} = \frac{j_{2}}{k_{2}} by reducing it to notions already defined. The set of rational numbers \mathcal{Q} is then defined as the set of such equivalence classes.

The expected operations of addition and multiplication are now evident:

|(j_{1}, k_{1})|+|j_{2}, k_{2}| = |(j_{1}k_{2}+j_{2}k_{1}, k_{1}k_{2})|

|(j_{1}, k_{1})||(j_{2}, k_{2})| = |(j_{1}j_{2}, k_{1}k_{2})|

Again, these definitions are easily verified to be well-defined. Finally, we can now define “division”.If |(j_{1}, k_{1})|, |(j_{2}, k_{2})| \in \mathcal{Q} with j_{2} \neq 0, we define:

\frac{|(j_{1}, k_{1})|}{|(j_{2}, k_{2})|} = |(j_{1}k_{2}, j_{2}k_{1})|

These operations satisfy the familiar laws of associativity, commutativity and distributivity. Subtraction of rational numbers then can be written as :

|(j_{1}, k_{1})|-|(j_{2}, k_{2})| = |(j_{1}, k_{1})|+(-1,1)|(j_{2}, k_{2})|

The ordering of rational numbers can also be written as:

|(j_{1}, k_{1})|< |(j_{2}, k_{2})| \Longleftrightarrow j_{1}k_{2}< j_{2}k_{1}

|(j_{1}, k_{1})|   \leq  |(j_{2}, k_{2})| \Longleftrightarrow j_{1}k_{2} \leq j_{2}k_{1}.

These definitions agree with out usual notions of ordering of the rational numbers.

Finally, the definition of absolute value can be extended as:

|[(j,,k)]|= |(j,k)| if [(0,1)] < [(j,k)]

|[(j,k)]| = |(0,1)| if |(j,k)|=|(0,1)|

|[(j,k)]| = -|(j,k)|<  |(0,1)|.

Again, our familiar properties of the absolute value of rational numbers hold. With this foundational construction in place, we can conveniently represent the equivalence class of (j,k) as simply the fraction j/k and continue to work with these numbers as we were hopefully taught from childhood.

In the next sections/blogs we construct the real numbers from this axiomatic framework.


Hint (generic): keep the meaning of the symbols in mind and meaning of equivalence relations and equivalence classes. Also note that our basic object is a class and a set is a member of a class.

  1. let |(j_{1}, k_{1})|, |(j_{2}, k_{2})| be two elements of \mathcal{Z}. Show that the addition:

|(j_{1}, k_{1})|+|(j_{2}, k_{2})| = |(j_{1}+j_{2}, k_{1}+k_{2})| is well-defined. That is, prove that for any (j_{1}^{'}, k_{1}^{'}) \in |(j_{1}, k_{1})| and (j_{2}^{'},k_{2}^{'}) \in |(j_{2}, k_{2})|, we have that (j_{1}^{'}+j_{2}^{'}, k_{1}^{'}+k_{2}^{'}) is equivalent to (j_{1}+j_{2}, k_{1}+k_{2}).

2. For j_{1}, j_{2}, k \in \mathcal{Z}, prove the distributive law: (j_{1}+j_{2}).k = j_{1}k+j_{2}k.

3. Show that the relations < and \leq on \mathcal{Z} have the following properties:

(a) |(0,j)|< |(0,0)| for all j \in \mathcal{Z}^{+}

(b) |(0,j)|< |(k,0)| for all j, k \in \mathcal{Z}^{+}

(c) |(0,j)|< |(0,k)|, j ,k \in \mathcal{Z}^{+} if and only if k<j

(d) |(0,0)| < |(j,0)| for all j \in \mathcal{Z}^{+}

(e) |(j,0)|<|(k,0)|, j, k \in \mathcal{Z}_{\geq 0}if and only if j<k.

(f) |(0,j)| \leq |(0,0)| for all j \in \mathcal{Z}_{\geq 0}.

(g) |(0,j)| \leq |(k,0)| for all j,k \in \mathcal{Z}_{\geq 0}

(h) |(0,j)| \leq |(0,k)| for j,k \in \mathcal{Z}_{\geq 0} if and only if k \leq j

(i) |(0,0)| \leq |(j,0)| for all j \in \mathcal{Z}_{\geq 0}

(j) |(j,0)| \leq |(k,0)| where j, k \in \mathcal{Z}_{\geq 0} if and only if j \leq k.


Nalin Pithwa.

Axioms of set theory: fast review

Reference: A Second Course in Analysis, M Ram Murty, Hindustan Book Agency.

Chapter 1 Background

Axioms of Set Theory

The concept of number lies at the heart of mathematics. The subject of mathematical analysis, in the formal sense of the word, began when mathematicians initiated the exploration of infinity as a precise mathematical idea. This perhaps is the greatest event of the nineteenth century, on par with the discovery of zero as a mathematical idea. Both zero and infinity are now essential to the understanding of limits and the related processes of real and complex analysis.

Because these two ideas were poorly understood, mathematicians in the nineteenth century introduced the axiomatic method with the theory of sets as the foundation for all mathematics. In this approach, Georg Cantor (1845-1918) was definitely the leading figure. Sadly, his work received considerable opposition at first, but is now universally accepted. It is not an exaggeration to say that research into the foundations of mathematics has given us a larger understanding of the nature of mathematical logic, both its strengths and limitations.

Set theory is the standard foundation of all mathematics. Mathematical proofs use set-theoretic ideas in one form or another. Below, we give a quick overview of the axioms of set theory so that the student is introduced to the relevant vocabulary. One need not become overly preoccupied with these formalities for otherwise a serious study of these axioms will take us into the realm of mathematical logic which is not the province of this short survey. Rather, we indicate the main themes, very much similar to a quick sight-seeing tour of an ancient city. There will undoubtedly be many side streets to explore and since time is of the essence, we need to be content with a one-hour tour highlighting the significant monuments. For a gentle introduction to mathematical logic, we refer the reader to the following text: Hilbert’s Tenth Problem, An introduction to logic, number theory and computing, Student Mathematical Library, Volume 88, American Mathematical Society, Providence, Rhode Island, 2019.

The basic objects of set theory are classes, some of which are sets. An informal definition of a set is an unordered collection of objects. The basic relationship between them is membership indicated by the symbol \epsilon. Thus, $x \in A$ means that x is a member of A. Its negation, x is not a member of A is written as $x \notin A$. Already, with the notion of a set and the relation of membership, we can easily run into logical contradictions if we do not formulate proper axioms for the construction of sets. The famous “Russell paradox” (named after the British philosopher and mathematician Bertrand Russell (1872-1970)) shows that the “set of all sets” is not a set. Indeed, if it were a set, it would be a proper member of itself by definition, which is a contradiction. The paradox is often posed as the following riddle: In the town of Seville lives a barber who shaves everyone who does not shave himself. Does the barber shave himself? The contradiction arises from self-reference and so it was quickly realized that to avoid logical contradictions in mathematics, one must formulate precise axioms for the construction of sets. In a formal system of axiomatic set theory, the notions of “class” and “\epsilon” are undefined terms. There are several well-developed axiomatic systems such as the Zermelo-Fraenkel system and the von Neumann-Godel-Bernays system. Our treatment below follows the latter.

The underlying idea of any these systems is that we construct new sets from old sets in a systematic manner, with the existence of the empty set as a basic axiom. This is how we avoid the Russell paradox.

The words “and” and “not’ are used as propositional connectives. We usually write these words out and do not use symbols for them, though some logicians do. To say that “Proposition P implies Proposition Q” we write “P \Longrightarrow Q“. If $P \Longrightarrow Q$ and $Q \Longrightarrow P$, we write P \Longleftrightarrow Q. The usual logical quantifiers are : \forall which means “for all” and \exists which means “there exists”. The first one is sometimes called the universal quantifier and the second one as the existential quantifier. The statement, “For all x, the proposition F(x) holds” is denoted $\forall (F(x))$ with the parentheses used to designate the proposition. Such a statement is called an existential proposition.Using class variables like x, y, A,B and so on, together with the basic relation of membership $\epsilon$, quantifiers, connectives and parentheses, we can construct “well-formed” formulas. If variables are quantified, they are called “bound” and should only appear after the quantifier. Variables without quantifiers are called “free”. For example,

\forall {x} \exists {y} ((x \in y) \Longrightarrow (x \in z))

is well-formed in which x and y are bound variables and z is free, whereas

\exists{x} ((x \in z) \Longrightarrow \forall{z}(z \in y))

is not well-formed since z precedes its quantifier.

As noted earlier, it is not our goal here to develop axiomatic set theory. On the other hand, the mature mathematician cannot ignore the need for foundations and must have some idea of the basic axioms and notation. So we content ourselves with a cursory overview of the basic axioms and set the stage for both notations and structure of logical proofs.

The essential idea of set theory is that sets are built up in a progressive hierarchy and there are precise rules as to how to form new sets from old sets. We begin with the axiom of extensionality which allows us to determine when two sets are equal.

Axiom I : (extensionality) For any classes A and B, A=B if and only if

\forall{x} (x \in A \Longleftrightarrow x \in B).

This axiom can be taken as the definition of equality.

Axiom 2: (existence of the empty set). There is a set \phi such that

\forall{x}(x \notin O)

The axiom of extensionality implies that the empty set \phi is unique.

Axiom 3: (unordered pairs) For all sets x, y there is a set (x,y) such that

u \in {x,y} \Longleftrightarrow (u =x \hspace{0.1in} or \hspace{0.1in} u=y)

If x=y we obtain the singleton \{ x\}, u \in \{ x \} if and only if u=x. Again this sets are unique by extensionality.

As defined earlier, a variable is free if there are no quantifiers defining it. Otherwise, the variable is said to be bounded. This concept isused in the next axiom and allows us to form new sets from old using well-formed formulas. In fact, the next axiom is more appropriately called an axiom schema or a collection of axioms.

Axiom 4: (selection): Given any well-formed formula P(x) containing one free variable x, there is a class A suchthat for any set x, we have $x \in A$ if and only if P(x) holds.

We often write

A = \{ x: P(x) \}

and read this as “A is the class of all (sets) x such that P(x) holds.”

The distinction between sets and classes in the last axiom is important as noted earlier. Without it, we fall into”Russell’s paradox.” Indeed, let R = \{x: x \notin x \}. If R were a set, then R \in R implies R \notin R and if R \notin R, then R \in R. a contradiction in both cases. Technically, R is not a set but a proper class containing all sets which we call the “universe” and the theory contains no logical contradictions. The previous axioms do allow us to construct new sets from old ones in a systematic manner. Thus, a set may consist of other sets that have already been well-defined.

Axiom 5 (union): For any set x, there is a set \bigcup {x} such that y \in \bigcup{x} if and only if y \in x for some x \in x.

We sometimes write \bigcup_{y \in x}y for \bigcup{x} in the last axiom. Thus,

x \bigcup y = \bigcup \{ x,y\} = \{ z: z \in x \hspace{0.1in} or \hspace{0.1in} z \in y\}.

For any well-formed formula Q(x) with a free variable x, (\forall x \in A)(Q(x)) means

(\forall x)(x \in A \Longrightarrow Q(x)).

The relation of A being a subset of B is defined by the symbol:

(\forall x \in A)(x \in B) and we write this as A \subset B, or equivalently B \supset A.

Axiom 6(power set axiom); For any set x, there is a set 2^{x} such that y \in 2^{x} if and only if y \in x.

The reader should understand that the symbol 2^{x}is pure symbology for the set of all subsets of x and that no exponentiation is meant since x is a set. It will always be clear from the context which meaning is indicated.

For any two classes A and B, we define the intersection as

A \bigcap B = \{ x: x \in A \hspace{0.1in}and \hspace{0.1in}x \in B\} .

If A is any set, then A \bigcap B is a set for any class B by Axiom 6 and the definition of a set. If x is a non-empty set, its intersection is defined by

\bigcap {x} = \{ y; (\forall u \in x) \Longrightarrow y \in u\}.

This is easily seen to be a set. We can also write \bigcap{x} as \bigcap_{y \in x}y.

We can use set theory to define an ordered pair (x,y) as the set of the form \{ \{ x\}, \{x.y\}\}.

Indeed, for any ordered pair Q, its first member is the unique x such that \{ x\} \in Q and its second member is the unique y such that Q = \{ \{x\}, \{ x,y\}\}, where x is the first member.

A relation is any set of ordered pairs. For any relation E, we define the inverse relation by

E^{-1} = \{ (y,x): (x,y) \in E\}.

From the definition of an ordered pair, we are led to the definition of a function as a set f of ordered pairs such that (x,y) \in f and (x,z) \in f implies y=z. We write f(x) =y to mean (x,y) \in f for any given function f. Thus, a function is a special kind of relation. In other words, all functions are relations, but not all relations are functions.

If f is a function, f^{-1} is not necessarily a function. A function f is called one-to-one if and only if f^{-1} is a function.

This now allows us to construct the Cartesian product of two sets X and Y denoted X \times Y, which consists of ordered pairs (x,y) with x \in X and y \in Y.

With the axioms listed so far, we cannot prove the existence of an infinite set. The next axiom allows us to do this along with our earlier axioms.

Axiom 7 (axiom of infinity) There is a set X \neq \phi such that for all x \in N, x \bigcup \{ x\} \in N.

This axiom allows us to construct the natural numbers using the empty set because we now define inductively the sequence of sets

\phi, \{ \phi\}, \{ \phi, \{ \phi\}\}, \ldots

and define the non-negative integers using the “zero-symbol” 0 to designate the empty set and define

1 = \{ \phi\}, 2 = \{ 0,1\}, 3=\{ 0,1,2\}, \ldots

We could also have stated Axiom 7 in logical notation as \exists {N} such that \phi \in N and \forall{x}\in N,

x \bigcup \{ x\} \in N

To prevent x \bigcup {x} from being equal to x, and in general to keep order among sets and prevent unwanted closed cycles for the membership relation, we need a further axiom:

Axiom 8: (axiom of regularity) For any set A \neq \phi, (\exists X \in A) X \bigcup A = \phi. This allows us to prove :

Theorem 1.1: for all x and y, x \notin x and if x \in y, then y \notin x.

Proof 1.1: by Axiom 8, x \bigcap \{ x\} = \phi since x \bigcap \{ x\} = x \bigcap_{y \in x}y. If x \in x, then clearly x \in x \bigcap \{ x\} which implies x \in \phi, a contradiction. This proves the first half of our assertion. For the second half, if x \in y, then by Axiom 8, the set \{ x,y\} has a member disjoint from it. This member cannot be x for otherwise x \in x, a contradiction by the first half of the theorem. If the member is y, then y\bigcap \{ x,y\}= \phi is also a contradiction since x \in y. QED.


  1. Recall that an ordered pair (a,b) can be defined as the set \{ \{ a\}, \{ a,b\}\}. Show that (a,b) = (c,d) if and only if a=c and b=d.

Solution 1: This is clear by definition of equality and definition of ordered pair.

2. Define the ordered triple (a,b,c) to be the ordered pair ((a,b),c) where the ordered pair is defined a usual. Show that

(a,b,c) = (a^{'}, b^{'}, c^{'}) if and only if a=a^{'}, b=b^{'}, c=c^{'}.

Solution: If (a,b,c) = (a^{'}, b^{'}, c^{'}), then by definition of triple,

((a,b), c) = ((a^{'}, b^{'}), c^{'}), which in turn by equality of ordered pairs means that

c = c^{'} and (a,b) = (a^{'}, b^{'}) that is ({a}, \{ a,b\})= (\{ a^{'}\}, \{ a^{'}. b^{'}\}) that is a=a^{'} and b=b^{'}.

3. If x \in y \in z \in w, prove that w \notin x.

Solution: x \in w need not mean w \in x unless w=x.

4. Which of the following are functions and why?

4. (a) ((1,2), (2,3), (3,1)). yes.

4. (b) ((1,2), (2,3), (2,1)). No.

4 (c). ((2,1),(3,1),(1,2)). yes.

4 (d) ((x,y) \in \Re^{2}: x = y^{2}). No.

4 (e) ((x,y) \in \Re^{2}: y = x^{2}). yes.

5. For any relation V (that is, any set of ordered pairs), define the domain of V as

\{ x: (x,y) \in V \hspace{0.1in} for \hspace{0.1in} some \hspace{0.1in} y\}

and the range of V as

\{ y: (x,y) \in V for some x\}

Find the domain and range for each relation in the previous question whether or not it is a function.

4a. Domain is (1,2,3), and same range.

4b. Domain is (1,2) and range is (1,2,3).

4c. Domain is (x: x \geq 0), range is y  \in \Re.

4d. Domain is (x: x \in \Re) and range is y \geq 0/


Nalin Pithwa