Is Math really abstract? I N Herstein answers…

Reference: Chapter 1: Abstract Algebra Third Edition, I. N. Herstein, Prentice Hall International Edition:

For many readers/students of pure mathematics, such a book will be their first contact with abstract mathematics. The subject to be discussed is usually called “abstract algebra,” but the difficulties that the reader may encounter are not so much due to the “algebra” part as they are to the “abstract” part.

On seeing some area of abstract mathematics for the first time, be it in analysis, topology, or what not, there seems to be a common reaction for the novice. This can best be described by a feeling of being adrift, of not having something solid to hang on to. This is not too surprising, for while many of the ideas are fundamentally quite simple, they are subtle and seem to elude one’s grasp the first time around. One way to mitigate this feeling of limbo, or asking oneself “What is the point of all this?”, is to take the concept at hand and see what it says in particular concrete cases. In other words, the best road to good understanding of the notions introduced is to look at examples. This is true in all of mathematics.

Can one, with a few strokes, quickly describe the essence, purpose, and background for abstract algebra, for example?

We start with some collection of objects S and endow this collection with an algebraic structure by assuming that we can combine, in one or several ways (usually two), elements of this set S to obtain, once more, elements of this set S. These ways of combining elements of S we call operations on S. Then we try to condition or regulate the nature of S by imposing rules on how these operations behave on S. These rules are usually called axioms defining the particular structure on S. These axioms are for us to define, but the choice made comes, historically in mathematics, from noticing that there are many concrete mathematical systems that satisfy these rules or axioms. In algebra, we study algebraic objects or structures called groups, rings, fields.
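To make "a set S, an operation, and some axioms" concrete, here is a minimal sketch that brute-force checks a finite set against the usual group axioms (closure, associativity, identity, inverses). These four axioms are my assumption here, since the passage above has not yet spelled them out, and the helper name `is_group` is purely illustrative.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the (assumed) group axioms on a finite set."""
    elements = list(elements)
    # Closure: combining two elements of S gives an element of S again.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (a*b)*c == a*(b*c) for every triple.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Identity: some e with e*a == a*e == a for every a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a has some b with a*b == b*a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

add_mod5 = lambda a, b: (a + b) % 5
mul_mod5 = lambda a, b: (a * b) % 5

print(is_group(range(5), add_mod5))     # integers mod 5 under addition
print(is_group(range(5), mul_mod5))     # fails: 0 has no multiplicative inverse
print(is_group(range(1, 5), mul_mod5))  # non-zero residues under multiplication
```

The same set {0, 1, 2, 3, 4} passes or fails depending on the operation, which is the point of the passage: the structure lives in the rules, not in the bare set.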

Of course, one could try many sets of axioms to define new structures. What would we require of such a structure? Certainly we would want that the axioms be consistent, that is, that we should not be led to some nonsensical contradiction computing within the framework of the allowable things the axioms permit us to do. But that is not enough. We can easily set up such algebraic structures by imposing a set of rules on a set S that lead to a pathological or weird system. Furthermore, there may be very few examples of something obeying the rules we have laid down.

Time has shown that certain structures defined by “axioms” play an important role in mathematics (and other areas as well) and that certain others are of no interest. The ones we mentioned earlier, namely, groups, rings, fields, and vector spaces, have stood the test of time.

A word about the use of “axioms.” In everyday language, an “axiom” means a self-evident truth. But we are not using everyday language; we are dealing with mathematics. An axiom is not a universal truth, but one of several rules spelling out a given mathematical structure. The axiom is true in the system we are studying because we have forced it to be true by our choice, that is, by hypothesis. It is a licence, in that particular structure, to do certain things.

We return to something we said earlier about the reaction that many students have on their first encounter with this kind of algebra, namely, a lack of feeling that the material is something they can get their teeth into. Do not be discouraged if the initial exposure leaves you in a bit of a fog. Stick with it, try to understand what a given concept says and, most importantly, look at particular, concrete examples of the concept under discussion.

Follow the same approach in linear algebra, analysis and topology.

Cheers, cheers, cheers,

Nalin Pithwa

Some basic facts about continuity

Reference: (1) Topology by Hocking and Young especially chapter 1 (2) Analysis — Walter Rudin.

Definition 1: A transformation f: S \rightarrow T is continuous provided that if p is a limit point of a subset X of S, then f(p) is a limit point or a point of f(X).

Definition 2: We may also state the continuity requirement on f as follows: if p is a point of \overline{X}, then f(p) is a point of \overline{f(X)}.

Theorem 1: If S is a set with the discrete topology and f: S \rightarrow T any transformation of S into a topologized set T, then f is continuous.

Theorem 2: A real-valued function y=f(x) defined on an interval [a,b] is continuous provided that if a \leq x_{0} \leq b and \epsilon >0, then there is a number \delta >0 such that if |x-x_{0}|<\delta, x \in [a,b], then |f(x) - f(x_{0})|< \epsilon. (NB: this is the same as Definition 1 above).

Theorem 3: Let f: S \rightarrow T be a transformation of the space S into the space T. A necessary and sufficient condition that f be continuous is that if O is any open subset of T, then its inverse image f^{-1}(O) is open in S.
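On finite spaces the open-set criterion of Theorem 3 can be checked mechanically. The sketch below is only an illustration: the two small topologies and the maps `f` and `g` are made-up examples, not taken from Hocking and Young. The last line also illustrates Theorem 1, since any map out of a discrete space passes the test.

```python
from itertools import combinations

def preimage(f, subset, domain):
    """Inverse image f^{-1}(subset) for a map stored as a dict."""
    return frozenset(x for x in domain if f[x] in subset)

def is_continuous(f, S, open_S, open_T):
    """Theorem 3 criterion: f^{-1}(O) is open in S for every open O in T."""
    open_S = {frozenset(O) for O in open_S}
    return all(preimage(f, O, S) in open_S for O in open_T)

def discrete_topology(S):
    """Every subset of S is open."""
    items = list(S)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

S = {1, 2, 3}
open_S = [set(), {1}, {1, 2}, {1, 2, 3}]   # hypothetical topology on S
open_T = [set(), {"a"}, {"a", "b"}]        # hypothetical topology on T = {"a", "b"}

f = {1: "a", 2: "a", 3: "b"}   # f^{-1}({"a"}) = {1, 2}, which is open in S
g = {1: "b", 2: "a", 3: "a"}   # g^{-1}({"a"}) = {2, 3}, which is not open in S

print(is_continuous(f, S, open_S, open_T))
print(is_continuous(g, S, open_S, open_T))
print(is_continuous(g, S, discrete_topology(S), open_T))
```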

Theorem 4:

A necessary and sufficient condition that the transformation f: S \rightarrow T of the space S into the space T be continuous is that if x is a point of S, and V is an open subset of T containing f(x), then there is an open set, U in S containing x and such that f(U) lies in V.

Theorem 5:

A one-to-one transformation f: S \rightarrow T of a space S onto a space T is a homeomorphism, if and only if both f and f^{-1} are continuous.

Theorem 6:

Let f: M \rightarrow N be a transformation of the metric space M with metric d into the metric space N with metric \rho. A necessary and sufficient condition that f be continuous is that if \epsilon is any positive number and x is a point of M, then there is a number \delta >0 such that if d(x,y)< \delta, then \rho(f(x), f(y)) < \epsilon.
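As a small illustration of the \epsilon-\delta condition in Theorem 6, take M = N = the real line with d(x,y) = \rho(x,y) = |x-y| and the function f(x) = x^{2}; these choices are mine, not the book's. One valid (not optimal) choice is \delta = \min(1, \epsilon/(2|x_{0}|+1)), because |x-x_{0}| < \delta \leq 1 gives |x+x_{0}| < 2|x_{0}|+1. The sketch spot-checks this on sample points; it is a sanity check, not a proof.

```python
def delta_for_square(x0, eps):
    """A delta that works for f(x) = x**2 at x0 (an assumed, non-optimal choice)."""
    return min(1.0, eps / (2 * abs(x0) + 1))

def check_delta(x0, eps, samples=1000):
    """Sample points y with |y - x0| < delta and test |f(y) - f(x0)| < eps."""
    delta = delta_for_square(x0, eps)
    for k in range(1, samples):
        offset = delta * k / samples          # strictly less than delta
        for y in (x0 - offset, x0 + offset):
            if abs(y * y - x0 * x0) >= eps:
                return False
    return True

print(check_delta(3.0, 0.01))
print(check_delta(-2.5, 0.001))
```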

Theorem 7:

A necessary and sufficient condition that the one-to-one mapping (that is, a continuous transformation) f: S \rightarrow T of the space S onto the space T be a homeomorphism is that f is interior. (NB: A transformation f: S \rightarrow T of the space S into the space T is said to be interior if f is continuous and if the image of every open subset of S is open in T).


Nalin Pithwa.

Metric space question and solution

Reference: I had blogged this example earlier, but I myself could not fill in the missing gaps at that time. I am trying again with the help of MathWorld Wolfram and of course, the classic, Introductory Real Analysis by Kolmogorov and Fomin, from which it is picked up for my study.

Consider the set C_{[a,b]} of all continuous functions defined on the closed interval [a,b]. Let the distance function (or metric) be defined by the formula:

\rho(x,y) = (\int_{a}^{b}[x(t)-y(t)]^{2}dt)^{1/2} ——– relation I

The resulting metric space will be denoted by C_{[a,b]}^{2}.
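To get a feel for this metric, here is a rough numerical sketch that approximates relation I with the midpoint rule and spot-checks the triangle inequality on three sample functions. The interval [0,1], the sample functions and the helper name `rho` are assumptions made for illustration only.

```python
import math

def rho(x, y, a, b, n=2000):
    """Approximate (integral_a^b [x(t) - y(t)]^2 dt)^(1/2) by the midpoint rule."""
    h = (b - a) / n
    total = sum((x(a + (i + 0.5) * h) - y(a + (i + 0.5) * h)) ** 2
                for i in range(n))
    return math.sqrt(total * h)

a, b = 0.0, 1.0
x, y = math.sin, math.exp
z = lambda t: t * t

print(rho(x, y, a, b))
# Triangle inequality rho(x, z) <= rho(x, y) + rho(y, z) on this sample:
print(rho(x, z, a, b) <= rho(x, y, a, b) + rho(y, z, a, b))
```

A numerical check like this proves nothing, of course; it only shows what the inequality says for particular functions.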

The first two properties of a metric space are clearly satisfied by the above function. We need to check for the triangle inequality:

Relation I satisfies the triangle inequality because of Schwarz’s inequality:

(\int_{a}^{b}x(t)y(t)dt)^{2} \leq \int_{a}^{b}x^{2}(t)dt \int_{a}^{b}y^{2}(t)dt (relation II)

In order to get to the above relation II, we need to prove the following:

Prove: (\int_{a}^{b} x(t)y(t)dt)^{2} = \int_{a}^{b}x^{2}(t)dt \int_{a}^{b} y^{2}(t)dt - \frac{1}{2}\int_{a}^{b} \int_{a}^{b}[x(s)y(t)-x(t)y(s)]^{2}dsdt.

From the above, we can deduce Schwarz’s inequality (relation II here in this blog article).
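Before attempting the proof, it can help to see the identity hold numerically. The sketch below replaces every integral by a midpoint-rule sum (my choice of discretisation) and compares the two sides for sample functions on [0,1]. Since the double integral is non-negative, the same numbers also illustrate Schwarz's inequality.

```python
import math

def both_sides(x, y, a, b, n=200):
    """Return (lhs, rhs, product) for the identity, using midpoint sums.

    lhs     = (int x y)^2
    rhs     = int x^2 * int y^2 - (1/2) * double integral of [x(s)y(t) - x(t)y(s)]^2
    product = int x^2 * int y^2
    """
    h = (b - a) / n
    ts = [a + (i + 0.5) * h for i in range(n)]
    xs = [x(t) for t in ts]
    ys = [y(t) for t in ts]
    int_xy = sum(p * q for p, q in zip(xs, ys)) * h
    int_xx = sum(p * p for p in xs) * h
    int_yy = sum(q * q for q in ys) * h
    double = sum((xs[i] * ys[j] - xs[j] * ys[i]) ** 2
                 for i in range(n) for j in range(n)) * h * h
    return int_xy ** 2, int_xx * int_yy - 0.5 * double, int_xx * int_yy

lhs, rhs, product = both_sides(math.sin, math.exp, 0.0, 1.0)
print(lhs, rhs)          # the two sides agree up to rounding
print(lhs <= product)    # Schwarz's inequality for this sample
```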

(My own attempts failed to crack it so I had to look at the internet for help. Fortunately, MathWorld Wolfram has given a crisp clear proof…but some parts of the proof are still out of my reach…nevertheless, I am reproducing the proof here for the sake of completeness of my notes…for whatever understanding I can derive at this stage from the proof…):


Weisstein, Eric W. “Schwarz’s Inequality.” From MathWorld–A Wolfram Web Resource.

Schwarz’s Inequality:

Let \Psi_{1}(x), \Psi_{2}(x) be any two real integrable functions in [a,b], then Schwarz’s inequality is given by :

|<\Psi_{1}, \Psi_{2}>|^{2} \leq <\Psi_{1}, \Psi_{1}> <\Psi_{2}, \Psi_{2}>

Written out explicitly,

|\int_{a}^{b} \Psi_{1}(x) \Psi_{2}(x) dx|^{2} \leq \int_{a}^{b}[\Psi_{1}(x)]^{2}dx \int_{a}^{b}[\Psi_{2}(x)]^{2}dx

with equality if and only if \Psi_{1}(x) = \alpha \Psi_{2}(x) with \alpha a constant. Schwarz’s inequality is sometimes also called the Cauchy-Schwarz inequality or Buniakowsky’s inequality.

To derive the inequality, let \Psi(x) be a complex function and \lambda a complex constant such that

\Psi(x) \equiv f(x) + \lambda g(x) for some f and g


Since \overline{\Psi}\Psi = |\Psi|^{2} \geq 0, we have

\int \overline{\Psi} \Psi dx \geq 0, where \overline{z} is the complex conjugate, with equality when \Psi(x) = 0. Expanding the integrand gives

\int \overline{\Psi}\Psi dx = \int \overline{f}f dx + \lambda \int \overline{f} g dx + \overline{\lambda} \int \overline{g} f dx + \lambda \overline{\lambda} \int \overline{g} g dx

Writing this, in compact notation:

<\overline{f},f> + \lambda <\overline{f},g> + \overline{\lambda} <\overline{g},f> + \lambda \overline{\lambda} <\overline{g},g> \geq 0….relation A

Now, define \lambda \equiv - \frac{<\overline{g}, f>}{<\overline{g},g>}….relation B

and \overline{\lambda} = - \frac{<g, \overline{f}>}{<\overline{g}, g>}…relation C

Multiply A by <\overline{g},g> and then plug in B and C to obtain:

<\overline{f}, f><\overline{g}, g> - <\overline{f},g><\overline{g},f> - <\overline{g},f><g, \overline{f}> +<\overline{g}, f><g, \overline{f}> \geq 0

which simplifies to

<\overline{g},f><\overline{f},g> \leq <\overline{f},f><\overline{g},g>


or, equivalently, |<f,g>|^{2} \leq <f,f><g,g>. Bessel’s inequality follows from Schwarz’s inequality. QED.


Nalin Pithwa

VI. Countable sets: My notes.


References:

  1. Introduction to Topology and Modern Analysis by G. F. Simmons, Tata McGraw Hill Publications, India.
  2. Introductory Real Analysis, Kolmogorov and Fomin, Dover Publications, N.Y. (To some extent, I have blogged this topic based on this text earlier. Still, it deserves a mention/revision again.)
  3. Topology : James Munkres.

The subject of this section and the next, infinite cardinal numbers, lies at the very foundation of modern mathematics. It is a vital instrument in the day-to-day work of many mathematicians, and we shall make extensive use of it ourselves (in our beginning study of math ! :-)). This theory, which was created by the German mathematician Georg Cantor, also has great aesthetic appeal, for it begins with ideas of extreme simplicity and develops through natural stages into an elaborate and beautiful structure of thought. In the course of our discussion, we shall answer questions which no one before Cantor’s time thought to ask, and we shall ask a question which no one can answer to this day…(perhaps !:-))

Without further ado, we can say that cardinal numbers are those used in counting, such as the positive integers (or natural numbers) 1, 2, 3, …familiar to us all. But, there is much more to the story than this.

The act of counting is undoubtedly one of the oldest of human activities. Men probably learned to count in a crude way at about the same time as they began to develop articulate speech. The earliest men who lived in communities and domesticated animals must have found it necessary to record the number of goats in the village herd by means of a pile of stones or some similar device. If the herd was counted in each night by removing one stone from the pile for each goat accounted for, then stones left over would have indicated strays, and herdsmen would have gone out to search for them. Names for numbers and symbols for them like our 1, 2, 3, …would have been superfluous. The simple yet profound idea of a one-to-one correspondence between the stones and the goats would have fully met the needs of the situation.

In a manner of speaking, we ourselves use the infinite set

N = \{ 1, 2, 3, \ldots\}

of all positive integers as a “pile of stones.” We carry this set around with us as part of our intellectual equipment. Whenever we want to count a set, say, a stack of dollar bills, we start through the set N and tally off one bill against each positive integer as we come to it. The last number we reach, corresponding to the last bill, is what we call the number of bills in the stack. If this last number happens to be 10, then “10” is our symbol for the number of bills in the stack, as it also is for the number of our fingers, and for the number of our toes, and for the number of elements in any set which can be put into one-to-one correspondence with the finite set \{ 1,2,3, \ldots, 10\}. Our procedure is slightly more sophisticated than that of primitive man. We have the symbols 1, 2, 3, … for the numbers which arise in counting; we can record them for future use, and communicate them to other people, and manipulate them by the operations of arithmetic. But the underlying idea, that of the one-to-one correspondence, remains the same for us as it probably was for him.

The positive integers are adequate for our purpose of counting any non-empty finite set, and since outside of mathematics all sets appear to be of this kind, they suffice for all non-mathematical counting. But in the world of mathematics, we are obliged to consider many infinite sets, such as the set of positive integers itself, the set of all integers, the set of all rational numbers, the set of all real numbers, the set of all points in a plane, and so on. It is often important to be able to count such sets, and it was Cantor’s idea to do this, and to develop a theory of infinite cardinal numbers, by means of one-to-one correspondence.

In comparing the sizes of two sets, the basic concept is that of numerical equivalence as defined in the previous section. We recall that two non-empty sets are numerically equivalent if there exists a one-to-one mapping of one set onto the other, or (and this amounts to the same thing) if there can be found a one-to-one correspondence between them. To say that two non-empty finite sets are numerically equivalent is of course to say that they have the same number of elements in the ordinary sense. If we count one of them, we simply establish a one-to-one correspondence between its elements and a set of positive integers of the form \{1,2,3, \ldots, n \} and we then say that n is the number of elements possessed by both, or the cardinal number of both. The positive integers are the finite cardinal numbers. We encounter many surprises as we follow Cantor and consider numerical equivalences for infinite sets.

The set N = \{ 1,2,3, \ldots\} of all positive integers is obviously “larger” than the set \{ 2,4,6, \ldots\} of all even positive integers, for it contains this set as a proper subset. It appears on the surface that N has “more” elements. But it is very important to avoid jumping to conclusions when dealing with infinite sets, and we must remember that our criterion in these matters is whether there exists a one-to-one correspondence between the sets (not whether one set is a proper subset of the other). As a matter of fact, the “pairing”

1,2,3, \ldots, n, \ldots

2,4,6, \ldots, 2n, \ldots

serves to establish a one-to-one correspondence between these sets, in which each positive integer in the upper row is matched with the even positive integer (its double) directly below it, and these two sets must therefore be regarded as having the same number of elements. This is a very remarkable circumstance, for it seems to contradict our intuition and yet is based only on solid common sense. We shall see below, in Problems 6 and 7-4, that every infinite set is numerically equivalent to a proper subset of itself. Since this property is clearly not possessed by any finite set, some writers even use it as the definition of an infinite set.

In much the same way as above, we can show that N is numerically equivalent to the set of all even integers:

1, 2, 3, 4, 5, 6, 7, \ldots

0, 2, -2, 4, -4, 6, -6, \ldots

Here, our device is to start with zero and follow each even positive integer as we come to it by its negative. Similarly, N is numerically equivalent to the set of all integers:

1,2,3,4,5,6,7, \ldots

0,1,-1, 2, -2, 3, -3, \ldots
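The pairing above can be written as an explicit formula. Here is a small sketch (the function names are mine) of the map from N to the integers together with its inverse, which is what makes it a one-to-one correspondence.

```python
def nat_to_int(n):
    """Send 1, 2, 3, 4, 5, ... to 0, 1, -1, 2, -2, ..."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n // 2 if n % 2 == 0 else -(n // 2)

def int_to_nat(k):
    """Inverse map: positive k goes to 2k, zero and negative k go to 1 - 2k."""
    return 2 * k if k > 0 else 1 - 2 * k

print([nat_to_int(n) for n in range(1, 8)])
# Round trip on a sample of integers: every integer is hit exactly once.
print(all(nat_to_int(int_to_nat(k)) == k for k in range(-50, 51)))
```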

It is of considerable historical interest to note that Galileo had observed in the early seventeenth century that there are precisely as many perfect squares (1,4,9,16,25, etc.) among the positive integers as there are positive integers altogether. This is clear from the “pairing”:

1,2,3,4,5, \ldots

1^{2}, 2^{2}, 3^{2}, 4^{2}, 5^{2}, \ldots

It struck him as very strange that this should be true, considering how sparsely strewn the squares are among all the positive integers. But, the time appears not to have been ripe for the exploration of this phenomenon, or perhaps he had other things on his mind; in any case, he did not follow up his idea.

These examples should make it clear that all that is really necessary in showing that an infinite set X is numerically equivalent to N is that we be able to list the elements of X, with a first, a second, a third, and so on, in such a way that it is completely exhausted by this counting off of its elements. It is for this reason that any infinite set which is numerically equivalent to N is said to be countably infinite. We say that a set is countable if it is non-empty and finite (in which case it can obviously be counted) or if it is countably infinite. (Prof. Walter Rudin, in his classic on mathematical analysis, uses slightly different terminology: he reserves “countable” for sets numerically equivalent to N, and calls a set that is finite or countable “at most countable.”)

One of Cantor’s earliest discoveries in his study of infinite sets was that the set of all positive rational numbers (which is very large: it contains N and a great many other numbers besides) is actually countable. We cannot list the positive rational numbers in order of size, as we can the positive integers, beginning with the smallest, then the next smallest, and so on, for there is no smallest, and between any two there are infinitely many others. We must find some other way of counting them, and following Cantor, we arrange them not in order of size, but according to the size of the sum of numerator and denominator. We begin with all positive rationals whose numerator and denominator add up to 2: there is only one, \frac{1}{1}=1. Next, we list (with increasing numerators) all those for which this sum is 3: \frac{1}{2}, \frac{2}{1}=2. Next, all those for which the sum is 4: \frac{1}{3}, \frac{2}{2}=1, \frac{3}{1}=3. Next, all those for which this sum is 5: \frac{1}{4}, \frac{2}{3}, \frac{3}{2}, \frac{4}{1}=4. Next, all those for which this sum is 6: \frac{1}{5}, \frac{2}{4}, \frac{3}{3}=1, \frac{4}{2}=2, \frac{5}{1}=5. And so on. If we now list all these together from the beginning, omitting those already listed when we come to them, we get a sequence like:

1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2, 4, 1/5, 5, \ldots

which contains each positive rational number once and only once. (There is a nice schematic representation of this : Cantor’s diagonalization process; please google it). In this figure, the first row contains all positive rationals with numerator 1, the second all with numerator 2, and so on. Our listing amounts to traversing this array of numbers in a zig-zag manner (again, please google), where of course, all those numbers already encountered are left out.
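The listing procedure described above (group by the sum of numerator and denominator, go through increasing numerators, skip anything already listed) is easy to turn into a generator. This is only a sketch of that procedure; the function name is my own.

```python
from fractions import Fraction
from itertools import islice

def positive_rationals():
    """Yield each positive rational exactly once, ordered by numerator + denominator."""
    seen = set()
    total = 2                       # current value of numerator + denominator
    while True:
        for num in range(1, total):
            q = Fraction(num, total - num)
            if q not in seen:       # omit values already listed, e.g. 2/2 = 1
                seen.add(q)
                yield q
        total += 1

first = list(islice(positive_rationals(), 11))
print(", ".join(str(q) for q in first))
# → 1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2, 4, 1/5, 5
```

Every positive rational p/q turns up by the time the sum reaches p + q, so the generator gives a position in the list to each one, which is exactly what countability asks for.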

It is high time that we christened the infinite cardinal number we have been discussing, and for this purpose, we use the first letter of the Hebrew alphabet, \aleph_{0}. We say that \aleph_{0} is the number of elements in any countably infinite set. Our complete list of cardinal numbers so far is

1, 2, 3, \ldots, \aleph_{0}.

We expand this list in the next section.

Suppose now that m and n are two cardinal numbers (finite or infinite). The statement that m is less than n (written m < n) is defined to mean the following: if X and Y are sets with m and n elements, then (i) there exists a one-to-one mapping of X into Y, and (ii) there does not exist a mapping of X onto Y. Using this concept, it is easy to relate our cardinal numbers to one another by means of:

1<2<3< \ldots < \aleph_{0}.

With respect to the finite cardinal numbers, this ordering corresponds to their usual ordering as real numbers.


Nalin Pithwa

V. Exercises: Partitions and Equivalence Relations

Reference: Topology and Modern Analysis, G. F. Simmons, Tata McGraw Hill Publications, India.

  1. Let f: X \rightarrow Y be an arbitrary mapping. Define a relation in X as follows: x_{1} \sim x_{2} means that f(x_{1})=f(x_{2}). Prove that this is an equivalence relation and describe the equivalence sets.

Proof : HW. It is easy. Try it. 🙂

2. In the set \Re of real numbers, let x \sim y mean that x-y is an integer. Prove that this is an equivalence relation and describe the equivalence sets.

Proof: HW. It is easy. Try it. 🙂

3. Let I be the set of all integers, and let m be a fixed positive integer. Two integers a and b are said to be congruent modulo m — symbolized by a \equiv b \pmod {m} — if a-b is exactly divisible by m, that is, if a-b is an integral multiple of m. Show that this is an equivalence relation, describe the equivalence sets, and state the number of distinct equivalence sets.

Proof: HW. It is easy. Try it. 🙂

4. Decide which ones of the three properties of reflexivity, symmetry and transitivity are true for each of the following relations in the set of all positive integers: m \leq n, m < n, m|n. Are any of these equivalence relations?

Proof. HW. It is easy. Try it. 🙂

5. Give an example of a relation which is (a) reflexive, but not symmetric or transitive. (b) symmetric but not reflexive or transitive. (c) transitive but not reflexive or symmetric (d) reflexive and symmetric but not transitive (e) reflexive and transitive but not symmetric. (f) symmetric and transitive but not reflexive.

Solutions. (i) You can try to Google (ii) Consider complex numbers (iii) there are many examples given in the classic text “Discrete Mathematics” by Rosen.

6) Let X be a non-empty set and \sim a relation in X. The following purports to be a proof of the statement that if this relation is symmetric and transitive, then it is necessarily reflexive: x \sim y \Longrightarrow y \sim x ; x \sim y and y \sim x \Longrightarrow x \sim x; therefore, x \sim x for every x. In view of problem 5f above, this cannot be a valid proof. What is the flaw in the reasoning? 🙂

7) Let X be a non-empty set. A relation \sim in X is called circular if x \sim y and y \sim z \Longrightarrow z \sim x, and triangular if x \sim y and x \sim z \Longrightarrow y \sim z. Prove that a relation in X is an equivalence relation if and only if it is reflexive and circular if and only if it is reflexive and triangular.

HW: Try it please. Let me know if you need any help.


Nalin Pithwa.

PS: There are more examples on this topic of relations in Abstract Algebra of I. N. Herstein and Discrete Mathematics by Rosen.

V. Partitions and Equivalence Relations: My Notes


References:

  1. Topology and Modern Analysis, G F Simmons, Tata McGraw Hill Publications, India.
  2. Topics in Algebra, I N Herstein.
  3. Abstract Algebra, Dummit and Foote.
  4. Topology by James Munkres.

In the first part of this section, we consider a non-empty set X, and we study decompositions of X into non-empty subsets which fill it out and have no elements in common with one another. We give special attention to the tools (equivalence relation) which are normally used to generate such decompositions.

A partition of X is a disjoint class \{ X_{i} \} of non-empty subsets of X whose union is the full set X itself. The X_{i}’s are called the partition sets. Expressed somewhat differently, a partition of X is the result of splitting it, or subdividing it, into non-empty subsets in such a way that each element of X belongs to one and only one of the given subsets.

If X is the set \{1,2,3,4,5 \}, then \{ \{1,3,5 \}, \{2,4 \} \} and \{ \{1,2,3 \}, \{ 4,5\} \} are two different partitions of X. If X is the set \Re of all real numbers, then we can partition \Re into the set of all rationals and the set of all irrationals, or into the infinitely many closed-open intervals of the form [n, n+1) where n is an integer. If X is the set of all points in the coordinate plane, then we can partition X in such a way that each partition set consists of all points with the same x coordinate (vertical lines), or so that each partition set consists of all points with the same y coordinate (horizontal lines).
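For finite sets, the definition of a partition can be checked directly. The following sketch (the helper name `is_partition` is mine) tests the three requirements: non-empty subsets, pairwise disjoint, union equal to X.

```python
def is_partition(blocks, X):
    """True if `blocks` is a partition of the finite set X."""
    blocks = [set(b) for b in blocks]
    if any(not b for b in blocks):          # every partition set is non-empty
        return False
    union = set().union(*blocks)
    # If the sizes add up to the size of the union, no element is repeated.
    disjoint = sum(len(b) for b in blocks) == len(union)
    return disjoint and union == set(X)

X = {1, 2, 3, 4, 5}
print(is_partition([{1, 3, 5}, {2, 4}], X))
print(is_partition([{1, 2, 3}, {4, 5}], X))
print(is_partition([{1, 2}, {2, 3, 4, 5}], X))   # 2 appears twice
print(is_partition([{1, 2}, {4, 5}], X))         # 3 is missing
```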

Other partitions of each of these sets will readily occur to the reader. In general, there are many different ways in which any given set can be partitioned. These manufactured examples are admittedly rather uninspiring and serve only to make our ideas more concrete. Later in this section we consider some others which are more germane to our present purposes.

A binary relation in the set X is a mathematical symbol or verbal phrase, which we denote by R in this paragraph, such that for each ordered pair (x,y) of elements of X the statement x \hspace{0.1in} R \hspace{0.1in} y is meaningful, in the sense that it can be classified definitely as true or false. For such a binary relation, x \hspace{0.1in} R \hspace{0.1in}y symbolizes the assertion that x is related by R to y, and x \not {R} \hspace{0.1in}y the negation of this, namely, the assertion that x is not related by R to y. Many examples of binary relations can be given, some familiar and others less so, some mathematical and others not. For instance, if X is the set of all integers and R is interpreted to mean “is less than,” which of course is usually denoted by the symbol <, then we clearly have 6<7 and 5 \not < 2. We have been speaking of binary relations, which are so named because they apply only to ordered pairs of elements, rather than to ordered triples, etc. In our work, we drop the qualifying adjective and speak simply of a relation in X, since we shall have occasion to consider only relations of this kind. (NB: Some writers prefer to regard a relation R in X as a subset R of X \times X. From this point of view, x R y and x \not {R} y are simply equivalent ways of writing (x,y) \in R and (x,y) \notin R. This definition has the advantage of being more tangible than our definition, and the disadvantage that few people really think of a relation in this way.)

We now assume that a partition of our non-empty set X is given, and we associate with this partition a relation on X. This relation is defined in the following way: we say that x is equivalent to y, and write this as x \sim y (the symbol \sim is pronounced “wiggle”), if x and y belong to the same partition set. It is obvious that the relation \sim has the following properties:

a) x \sim x for every x (reflexivity)

b) x \sim y \Longrightarrow y \sim x (symmetry)

c) x \sim y \hspace{0.1in} and \hspace{0.1in} y \sim z \Longrightarrow x \sim z (transitivity)

This particular relation in X arose in a special way, in connection with a given partition of X, and its properties are immediate consequences of the definition. Any relation whatever in X which possesses these three properties is called an equivalence relation in X.
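Taking the "subset of X \times X" view of a relation, the three properties can be tested mechanically on a finite set. The sketch below builds the relation "belongs to the same partition set" from a partition and checks it; for contrast it also checks "is less than" on the same set. All helper names are my own.

```python
def relation_from_partition(blocks):
    """All pairs (x, y) with x and y in the same partition set."""
    return {(x, y) for block in blocks for x in block for y in block}

def is_reflexive(R, X):
    return all((x, x) in R for x in X)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

X = {1, 2, 3, 4, 5}
R = relation_from_partition([{1, 3, 5}, {2, 4}])
print(is_reflexive(R, X), is_symmetric(R), is_transitive(R))

less_than = {(x, y) for x in X for y in X if x < y}
print(is_reflexive(less_than, X), is_symmetric(less_than), is_transitive(less_than))
```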

We have just seen that each partition of X has associated with it a natural equivalence relation in X. We now reverse the situation and prove that a given equivalence relation in X determines a natural partition of X.

Let \sim be an equivalence relation in X; that is, assume that it is reflexive, symmetric, and transitive in the sense described above. If x is an element of X, the subset of X defined by [x] = \{ y: y \sim x\} is called the equivalence set of x. The equivalence set of x is thus the set of all elements which are equivalent to x. We show that the class of all distinct equivalence sets forms a partition of X. By reflexivity, x \in [x] for each element x in X, so each equivalence set is non-empty and their union is X. It remains to be shown that any two equivalence sets [x_{1}] and [x_{2}] are either disjoint or identical. We prove this by showing that if [x_{1}] and [x_{2}] are not disjoint, then they must be identical. Suppose that [x_{1}] and [x_{2}] are not disjoint, that is, suppose that they have a common element z. Since z belongs to both equivalence sets, z \sim x_{1} and z \sim x_{2}, and by symmetry x_{1} \sim z. Let y be any element of [x_{1}], so that y \sim x_{1}. Since y \sim x_{1} and x_{1} \sim z, transitivity shows that y \sim z. By another application of transitivity, y \sim z and z \sim x_{2} imply that y \sim x_{2}, so that y is in [x_{2}]. Since y was arbitrarily chosen in [x_{1}], we see by this that [x_{1}] \subseteq [x_{2}]. The same reasoning shows that [x_{2}] \subseteq [x_{1}] and from this we conclude that [x_{1}] = [x_{2}].
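The passage from an equivalence relation to its partition can be sketched for a finite set as follows. The example relation (congruence modulo 3 on the integers 0 to 9) and the function name are assumptions chosen for illustration. Comparing a new element with a single representative of each set is enough precisely because of symmetry and transitivity, as in the argument above.

```python
def equivalence_classes(X, equivalent):
    """Group the elements of X into equivalence sets.

    `equivalent(x, y)` is assumed to be reflexive, symmetric and transitive;
    the result is not meaningful otherwise.
    """
    classes = []
    for x in X:
        for cls in classes:
            representative = next(iter(cls))
            if equivalent(x, representative):
                cls.add(x)
                break
        else:
            classes.append({x})     # x starts a new equivalence set
    return classes

same_mod_3 = lambda x, y: (x - y) % 3 == 0
classes = equivalence_classes(range(10), same_mod_3)
print(classes)
```

The three sets printed are disjoint, non-empty, and together exhaust the integers 0 to 9, so they form a partition.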

The above discussion demonstrates that there is no real distinction (other than a difference in language) between partitions of a set and equivalence relations in the set. If we start with a partition, we get an equivalence relation by regarding elements as equivalent if they belong to the same partition set, and if we start with an equivalence relation, we get a partition by grouping together into subsets all elements which are equivalent to one another. We have here a single mathematical idea, which we have been considering from two different points of view, and the approach we choose in any particular application depends entirely on our own convenience. In practice, it is almost invariably the case that we use equivalence relations (which are usually easy to define) to obtain partitions (which are sometimes difficult to describe fully).

We now turn to several of the more important simple examples of equivalence relations.

Let I be the set of integers. If a and b are elements of this set, we write a = b (and say that a equals b) if a and b are the same integer. Thus, 2+3=5 means that the expression on the left and right are simply different ways of writing the same integer. It is apparent that = used in this sense is an equivalence relation in the set I:

i) a=a for every a

ii) a=b \Longrightarrow b=a

iii) a=b \hspace{0.1in} and \hspace{0.1in} b=c \Longrightarrow a=c.

Clearly, each equivalence set consists of precisely one integer.

Another familiar example is the relation of equality commonly used for fractions. We remind the reader that, strictly speaking, a fraction is merely a symbol of the form a/b, where a and b are integers and b is not zero. The fractions 2/3 and 4/6 are obviously not identical, but nevertheless we consider them to be equal. In general, we say that two fractions a/b and c/d are equal, written \frac{a}{b} = \frac{c}{d}, if ad and bc are equal as integers in the usual sense (see the paragraph above). (HW quiz: show this is an equivalence relation on the set of fractions). An equivalence set of fractions is what we call a rational number. Everyday usage ignores the distinction between fractions and rational numbers, but it is important to recognize that from the strict point of view it is the rational numbers (and not the fractions) which form part of the real number system.

Our final example has a deeper significance, for it provides us with the basic tool for our work of the next two sections.

For the remainder of this section, we consider a relation between pairs of non-empty sets, and each set mentioned (whether we say so explicitly or not) is assumed to be non-empty. If X and Y are two sets, we say that X is numerically equivalent to Y if there exists a one-to-one correspondence between X and Y, that is, if there exists a one-to-one mapping of X onto Y. This relation is reflexive, since the identity mapping i_{X}: X \rightarrow X is one-to-one onto; it is symmetric since if f: X \rightarrow Y is one-to-one onto, then its inverse mapping f^{-1}: Y \rightarrow X is also one-to-one onto; and it is transitive, since if f: X \rightarrow Y and g: Y \rightarrow Z are one-to-one onto, then gf: X \rightarrow Z is also one-to-one onto. Numerical equivalence has all the properties of an equivalence relation, and if we consider it as an equivalence relation in the class of all non-empty subsets of some universal set U, it groups together into equivalence sets all those subsets of U which have the “same number of elements.” After we state and prove the following very useful but rather technical theorem, we shall continue in Sections 6 and 7 with an exploration of the implications of these ideas.

The theorem we have in mind is the Schroeder-Bernstein theorem: if X and Y are two sets each of which is numerically equivalent to a subset of the other, then all of X is numerically equivalent to all of Y. There are several proofs of this classic theorem, some of which are quite difficult. The very elegant proof we give is essentially due to Birkhoff and MacLane.


Assume that f: X \rightarrow Y is a one-to-one mapping of X into Y, and that g: Y \rightarrow X is a one-to-one mapping of Y into X. We want to produce a mapping F: X \rightarrow Y which is one-to-one onto. We may assume that neither f nor g is onto, since if f is, we can define F to be f, and if g is, we can define F to be g^{-1}. Since both f and g are one-to-one, it is permissible to use the mappings f^{-1} and g^{-1} as long as we keep in mind that f^{-1} is defined only on f(X) and g^{-1} is defined only on g(Y).

We obtain the mapping F by splitting both X and Y into subsets which we characterize in terms of the ancestry of their elements. Let x be an element of X. We apply g^{-1} (if we can) to get the element g^{-1}(x) in Y. If g^{-1}(x) exists, we call it the first ancestor of x. The element x itself we call the zeroth ancestor of x. We now apply f^{-1} to g^{-1}(x) if we can, and if (f^{-1}g^{-1})(x) exists, we call it the second ancestor of x. We now apply g^{-1} to (f^{-1}g^{-1})(x) if we can, and if (g^{-1}f^{-1}g^{-1})(x) exists, we call it the third ancestor of x. As we continue this process of tracing back the ancestry of x, it becomes apparent that there are three possibilities:

(1) x has infinitely many ancestors. We denote by X_{i} the subset of X which consists of all elements with infinitely many ancestors.

(2) x has an even number of ancestors; this means that x has a last ancestor (that is, one which itself has no first ancestor) in X. We denote by X_{e} the subset of X consisting of all elements with an even number of ancestors.

(3) x has an odd number of ancestors; this means that x has a last ancestor in Y. We denote by X_{o} the subset of X which consists of all elements with an odd number of ancestors.

The three sets X_{i}, X_{e}, X_{o} form a disjoint class whose union is X. We decompose Y in just the same way into three subsets Y_{i}, Y_{e}, Y_{o}.

It is easy to see that f maps X_{i} onto Y_{i} and X_{e} onto Y_{o}, and that g^{-1} maps X_{o} onto Y_{e}: applying f adds one ancestor and applying g^{-1} removes one, so in both cases the parity of the number of ancestors flips. We complete the proof by defining F in the following piecemeal manner:

F(x) = f(x) if x \in X_{i} \bigcup X_{e}

and F(x) = g^{-1}(x) if x \in X_{o}.
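To see the ancestry construction at work, here is a minimal sketch on a concrete case. The choice X = Y = {0, 1, 2, ...} with f(x) = x + 1 and g(y) = y + 1 is an assumption made purely for illustration, as are the helper names; in this example every element has finitely many ancestors, so X_{i} is empty and the counting loop always stops.

```python
def count_ancestors(x, f_inv, g_inv):
    """Count the ancestors of x in X by alternately applying g^{-1} and f^{-1}.

    f_inv and g_inv return None where the inverse is undefined.
    Assumes the chain is finite (x is not in X_i); otherwise this loops forever.
    """
    count = 0
    current, use_g_inv = x, True
    while True:
        parent = g_inv(current) if use_g_inv else f_inv(current)
        if parent is None:
            return count
        count += 1
        current, use_g_inv = parent, not use_g_inv


# Hypothetical example: X = Y = {0, 1, 2, ...}, f(x) = x + 1, g(y) = y + 1.
def f(x):
    return x + 1

def f_inv(y):
    return y - 1 if y >= 1 else None

def g_inv(x):
    return x - 1 if x >= 1 else None

def F(x):
    """F = f on elements with an even number of ancestors, g^{-1} on the rest."""
    if count_ancestors(x, f_inv, g_inv) % 2 == 0:
        return f(x)
    return g_inv(x)

values = [F(x) for x in range(6)]
```

Here x has exactly x ancestors, so F sends each even number to its successor and each odd number to its predecessor. Neither f nor g is onto (0 is missed by both), yet F is one-to-one onto.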


The Schroeder-Bernstein theorem has great theoretical and practical significance. Its main value for us lies in its role as a tool by means of which we can prove numerical equivalence with a minimum of effort for many specific sets. We put it to work in Section 7.


Nalin Pithwa

IV. Product of Sets: Exercises

Reference: Topology and Modern Analysis, G F Simmons, Tata McGraw Hill Publications, India.


I) The graph of a mapping f: X \rightarrow Y is a subset of the product X \times Y. What properties characterize the graphs of mappings among all subsets of X \times Y?

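Solution I: a subset G of X \times Y is the graph of a mapping of X into Y if and only if, for each x in X, there is exactly one y in Y such that (x, y) \in G. That unique y is then f(x).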

II) Let X and Y be non-empty sets. If A_{1} and A_{2} are subsets of X, and B_{1} and B_{2} are subsets of Y, then prove the following:

(a) (A_{1} \times B_{1}) \bigcap (A_{2} \times B_{2}) = (A_{1} \bigcap A_{2}) \times (B_{1} \bigcap B_{2})

(b) (A_{1} \times B_{1}) - (A_{2} \times B_{2}) = (A_{1}-A_{2}) \times (B_{1}-B_{2}) \bigcup (A_{1} \bigcap A_{2}) \times (B_{1}-B_{2}) \bigcup (A_{1}-A_{2}) \times (B_{1} \bigcap B_{2})

Solution IIa:

TPT: (A_{1} \times B_{1}) \bigcap (A_{2} \times B_{2}) = (A_{1} \bigcap A_{2}) \times (B_{1} \bigcap B_{2})

This follows from the definition of the product and the fact that the coordinates are ordered: (a, b) belongs to both A_{1} \times B_{1} and A_{2} \times B_{2} if and only if a \in A_{1}, a \in A_{2}, b \in B_{1}, and b \in B_{2}, that is, if and only if a \in A_{1} \bigcap A_{2} and b \in B_{1} \bigcap B_{2}.

Solution IIb:

Let (a, b) \in A_{1} \times B_{1}, but (a, b) \notin A_{2} \times B_{2}. Then a \in A_{1} and b \in B_{1}, and at least one of a \notin A_{2}, b \notin B_{2} holds. So the element may belong to (A_{1}-A_{2}) \times (B_{1}-B_{2}); or a may belong to A_{1} \bigcap A_{2} while b belongs to B_{1}-B_{2} (here we need to remember that the element is an ordered pair); or it could be the other way round: a belongs to A_{1}-A_{2} while b belongs to B_{1} \bigcap B_{2}. This gives one inclusion. The same arguments applied in reverse establish the other inclusion. Hence, done.
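Both identities of exercise II can be checked on small finite sets. The sets below are arbitrary choices made for illustration, and `times` is a hypothetical helper name; the check is a sanity test, not a replacement for the proof.

```python
from itertools import product

def times(A, B):
    """Cartesian product of two finite sets, as a set of ordered pairs."""
    return set(product(A, B))

# Arbitrary small sets chosen only for illustration.
A1, A2 = {1, 2, 3}, {2, 3, 4}
B1, B2 = {"p", "q"}, {"q", "r"}

# (a) intersection of rectangles is the rectangle of intersections
lhs_a = times(A1, B1) & times(A2, B2)
rhs_a = times(A1 & A2, B1 & B2)

# (b) difference of rectangles is a union of three rectangles
lhs_b = times(A1, B1) - times(A2, B2)
rhs_b = (times(A1 - A2, B1 - B2)
         | times(A1 & A2, B1 - B2)
         | times(A1 - A2, B1 & B2))
```

For these sets, `lhs_a == rhs_a` and `lhs_b == rhs_b` both hold.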

III) Let X and Y be non-empty sets, and let \bf{A} and \bf{B} be rings of subsets of X and Y, respectively. Show that the class of all finite unions of sets of the form A \times B with A \in \bf{A} and B \in \bf{B} is a ring of subsets of X \times Y.

Solution III:

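The class in question is clearly closed under the formation of finite unions, so it remains to check differences. (Note that \bigcup_{i=1}^{n}(A_{i} \times B_{i}) is in general not equal to (\bigcup_{i=1}^{n}A_{i}) \times (\bigcup_{i=1}^{n}B_{i}), so we must work with the sets A_{i} \times B_{i} one at a time.)

From question IIb above, the difference of two sets of the form A \times B is a union of three sets of the same form, and their sides A_{1}-A_{2}, A_{1} \bigcap A_{2}, B_{1}-B_{2}, B_{1} \bigcap B_{2} again lie in \bf{A} and \bf{B} because these are rings. From question IIa, the intersection of two sets of this form is again of this form, so the intersection of two finite unions of such sets is again a finite union of such sets. Since, for C_{j} \in \bf{A} and D_{j} \in \bf{B},

(\bigcup_{i}(A_{i} \times B_{i})) - (\bigcup_{j}(C_{j} \times D_{j})) = \bigcup_{i} \bigcap_{j} ((A_{i} \times B_{i}) - (C_{j} \times D_{j})),

the difference of two members of the class is again a member of the class.

Hence, done.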


Nalin Pithwa

IV. Product of Sets

Reference: Topology and Modern Analysis — G. F. Simmons, Tata McGraw Hill, India.

We shall often have occasion to weld together the sets of a given class into a single new set called their product (or their Cartesian product). The ancestor of this concept is the coordinate plane of analytic geometry, that is, a plane equipped with the normal rectangular coordinate axes. We give a brief description of this fundamental idea with a view to paving the way for our discussion of product of sets in general.

First, a few preliminary comments about the real line. We have already used this term several times before without any explanation, and of course what we mean by it is an ordinary geometric line whose points have been identified with (or coordinatized by) the set R of all real numbers. We use the letter R to denote the real line as well as the set of all real numbers, and we often speak of real numbers as if they were points on the real line, and of points on the real line as if they were real numbers. Let no one be deceived into thinking that the real line is a simple thing, for its structure is exceedingly intricate. Our present view of it, however, is as naive and uncomplicated as the picture of it. Generally speaking, we assume that the reader is familiar with the simpler properties of the real line: those relating to inequalities (see problems section) and the basic algebraic operations of addition, subtraction, multiplication and division. One of the most significant facts about the real number system is perhaps less well known. This is the so-called least upper bound property. It asserts that every non-empty set of real numbers which has an upper bound has a least upper bound. It is an easy consequence of this fact that a non-empty set of real numbers which has a lower bound has a greatest lower bound. All these matters are developed rigorously on the basis of a small number of axioms, and detailed treatments can often be found in books on elementary abstract algebra.

To construct the coordinate plane, we now proceed as follows. We take two identical replicas of the real line, which we call the x axis and the y axis, and paste them on a plane at right angles to one another in such a way that they cross at the zero point on each. We know the usual picture. Now, let P be a point in the plane. We project P perpendicularly onto points Px and Py on the axes. If x and y are the coordinates of Px and Py on their respective axes, this process leads us from the point P to the uniquely determined ordered pair (x,y) of real numbers, where x and y are called the x coordinate and y coordinate of the point P. We can reverse the process, and starting with the ordered pair of real numbers, we can recapture the point. This is the manner in which we establish the familiar one-to-one correspondence between points P in the plane and ordered pairs (x,y) of real numbers. In fact, we think of a point in the plane (which is a geometric object) and its corresponding ordered pair of real numbers (which is an algebraic object) as being, to all intents and purposes, identical with one another. The essence of analytic geometry lies in the possibility of exploiting this identification by using algebraic tools in geometric arguments and giving geometric interpretations to algebraic calculations.

The conventional attitude towards the coordinate plane in analytic geometry is that the geometry is the focus of interest and the algebra of ordered pairs is only a convenient tool. Here, we reverse this point of view. For us, the coordinate plane is defined to be the set of all ordered pairs (x,y) of real numbers. We can satisfy our desire for visual images by using the usual picture of the plane and by calling such an ordered pair a point, but this geometric language is more a convenience than a necessity.

Our notation for the coordinate plane is R \times R or R^{2}. This symbolism reflects the idea that the coordinate plane is the result of multiplying together two replicas of the real line R.

It is perhaps necessary to comment on one possible source of misunderstanding. When we speak of R^{2} as a plane, we do so only to establish an intuitive bond with the reader’s previous experience in analytic geometry. Our present attitude is that R^{2} is a pure set and has no structure whatsoever, because no structure has as yet been assigned to it. We remarked earlier (with deliberate vagueness) that a space is a set to which has been added some kind of algebraic or geometric structure. Later, we shall convert the set R^{2} into the space of analytic geometry by defining the distance between any two points (x_{1}, y_{1}) and (x_{2}, y_{2}) to be

\sqrt{(x_{1}-x_{2})^{2} + (y_{1}-y_{2})^{2}}.

This notion of distance endows the set R^{2} with a certain “spatial” character, which we shall recognize by calling the resulting space the Euclidean plane instead of the coordinate plane.

We assume that the reader is fully acquainted with the way in which the set C of all complex numbers can be identified (as a set) with the coordinate plane R^{2}. If z is a complex number, and if z has the standard form x+iy where x and y are real numbers, then we identify z with the ordered pair (x,y) and thus, with an element of R^{2}. The complex numbers, however, are much more than merely a set. They constitute a number system with the operations of addition, multiplication, conjugation, etc. When the coordinate plane R^{2} is thought of as consisting of complex numbers and is enriched by the algebraic structure it acquires in this way, it is called the complex plane. The letter C is used to denote either the set C of all complex numbers or the complex plane. Later, we shall make a space out of the complex plane.

Suppose now that X_{1} and X_{2} are two non-empty sets. By analogy with our above discussion, their product X_{1} \times X_{2} is defined to be the set of all ordered pairs (x_{1}, x_{2}) where x_{1} is in X_{1} and x_{2} is in X_{2}. In spite of the arbitrary nature of X_{1} and X_{2}, their product can be represented by a picture similar to the XY plane. The term product is applied to this set, and it is thought of as a result of multiplying together the sets X_{1} and X_{2} for the following reason: if X_{1} and X_{2} are finite sets with m and n elements, then clearly X_{1} \times X_{2} has mn elements. If f: X_{1} \rightarrow X_{2} is a mapping with domain X_{1} and range in X_{2}, its graph is a subset of X_{1} \times X_{2} which consists of all ordered pairs of the form (x_{1}, f(x_{1})). We observe that this is an appropriate generalization of the concept of a graph of a function as it occurs in elementary mathematics.
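The counting remark and the notion of a graph are easy to see on a small example. This is only a sketch: the sets and the mapping `f` below are hypothetical choices, with the mapping stored as a Python dict.

```python
from itertools import product

X1 = {"a", "b", "c"}          # m = 3 elements
X2 = {0, 1}                   # n = 2 elements

# The product X1 x X2: all ordered pairs (x1, x2).
X1_times_X2 = set(product(X1, X2))

# A hypothetical mapping f: X1 -> X2, stored as a dict.
f = {"a": 0, "b": 1, "c": 0}

# Its graph: all pairs (x1, f(x1)), a subset of X1 x X2.
graph = {(x1, f[x1]) for x1 in X1}
```

The product has 3 · 2 = 6 elements, and the graph picks out exactly one pair for each element of X1.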

This definition of the product of two sets extends easily to the definition of the product of n sets for any positive integer n. If X_{1}, X_{2}, X_{3}, \ldots, X_{n} are n sets where n is any positive integer, then their product X_{1} \times X_{2} \times X_{3} \times \ldots \times X_{n} is the set of all ordered n-tuples (x_{1}, x_{2}, \ldots, x_{n}) where x_{i} is in the set X_{i} for each subscript i. If the X_{i}'s are all replicas of a single set X, that is, if

X_{1} = X_{2} = \ldots = X_{n} = X,

then their product is usually denoted by the symbol X^{n}.

These ideas specialize directly to yield the important sets R^{n} and C^{n}. R^{1} is just R, the real line; R^{2} is the coordinate plane; and R^{3} is the set of all ordered triples of real numbers, the set which underlies solid analytic geometry. We assume that the reader is familiar with the manner in which this set arises, through the introduction of a rectangular coordinate system into three dimensional space. We can draw pictures here, just as in the case of the coordinate plane, and use geometric language as much as we please, but it must be understood that the mathematics of this set is the mathematics of ordered triples of real numbers and that pictures are merely an aid to the intuition. Once we fully grasp this point of view, there is no difficulty whatsoever in advancing at once to the study of the set R^{n} of all ordered n-tuples (x_{1}, x_{2}, \ldots, x_{n}) of real numbers for any positive integer n. It is quite true that when n is greater than 3, it is no longer possible to draw the same kind of intuitively rich pictures, but at worst this is merely an inconvenience. We can and do continue to use suggestive geometric language, so all is not lost. The set C^{n} is defined similarly; it is the set of all ordered n-tuples (z_{1}, z_{2}, \ldots, z_{n}) of complex numbers. Both R^{n} and C^{n} are of fundamental importance in analysis and topology.

We emphasized that for the present the coordinate plane is to be considered merely as a set, and not as a space. Similar remarks apply to R^{n} and C^{n}. In due course, we shall impart form and content to each of these sets by suitable definitions. We shall convert them into the Euclidean and unitary n-spaces which underlie and motivate so many developments in pure mathematics, and we shall explore some aspects of their algebraic and topological structures. But as of now (and this is the point of view we insist on) they do not have any structure at all; they are merely sets.

As the reader doubtless suspects, we need not consider only products of finite classes of sets. The needs of topology compel us to extend these ideas to arbitrary classes of sets.

We defined the product X_{1} \times X_{2} \times \ldots \times X_{n} to be the set of all ordered n-tuples (x_{1}, x_{2}, \ldots, x_{n}) such that x_{i} is in X_{i} for each subscript i. To see how to extend this definition, we reformulate it as follows: we have an index set I, consisting of the integers from 1 to n, and corresponding to each index (or subscript) i we have a non-empty set X_{i}. The n-tuple (x_{1}, x_{2}, x_{3}, \ldots, x_{n}) is simply a function (call it x) defined on the index set I, with the restriction that its value x(i)=x_{i} is an element of the set X_{i} for each i in I. Our point of view here is that the function x is completely determined by and is essentially equivalent to the array (x_{1}, x_{2}, x_{3}, \ldots, x_{n}) of its values.

The way is now open for the definition of products in their full generality. Let \{ X_{i}\} be a non-empty class of non-empty sets, indexed by the elements i of an index set I. The sets X_{i} need not be different from one another; indeed, it may happen that they are all identical replicas of a single set, distinguished only by their different indices. The product of the sets X_{i}, written P_{i \in I}X_{i}, is defined to be the set of all functions x defined on I such that x(i) is an element of the set X_{i} for each index i. We call X_{i} the ith coordinate set. When there can be no misunderstanding about the index set, the symbol P_{i \in I}X_{i} is often abbreviated to P_{i}X_{i}. The definition we have just given requires that each coordinate set be non-empty before the product can be formed. It will be useful if we extend this definition slightly by agreeing that if any of the X_{i}'s are empty, then P_{i}X_{i} is also empty.
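For a finite index set with finite coordinate sets, the definition "the product is a set of functions on I" can be modelled directly. In the sketch below each element x of the product is a Python dict from indices to values; the name `general_product` and the example family are assumptions made for illustration, and nothing here touches the infinite case.

```python
from itertools import product

def general_product(family):
    """Return the product of an indexed family {i: X_i} as a list of functions.

    Each element x of the product is a dict mapping every index i to a
    value x[i] in X_i. Works only for finite index sets and finite X_i.
    """
    indices = list(family)
    return [dict(zip(indices, values))
            for values in product(*(family[i] for i in indices))]

# Hypothetical family indexed by I = {"u", "v", "w"}.
family = {"u": {0, 1}, "v": {"a"}, "w": {10, 20, 30}}
elements = general_product(family)

# If any coordinate set is empty, the product is empty.
empty = general_product({"u": {0, 1}, "v": set()})
```

Note that the indices need not be integers, which is the whole point of replacing n-tuples by functions on I. With an empty coordinate set there is no admissible function, so the convention stated above comes out automatically.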

This approach to the idea of a product of a class of sets, by means of functions defined on the index set, is useful mainly in giving the definition. In practice, it is much more convenient to use the subscript notation x_{i} instead of the function notation x(i). We then interpret the product P_{i}X_{i} as made up of elements x, each of which is specified by the exhibited array \{ x_{i}\} of its values in the respective coordinate sets X_{i}. We call x_{i} the ith coordinate of the element x = \{ x_{i}\}.

The mapping p_{i} of the product P_{i}X_{i} onto its ith coordinate set X_{i} which is defined by p_{i}(x) = x_{i} — that is, the mapping whose value at an arbitrary element of the product is the ith coordinate of that element — is called the projection onto the ith coordinate set. The projection p_{i} selects the ith coordinate of each element in its domain. There is clearly one projection for each element of the index set I, and the set of all projections plays an important role in the general theory of topological spaces.
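A projection is just "evaluate the function x at the index i". The following sketch, on a hypothetical two-index family, builds the projections and checks that each one maps the product onto its coordinate set; the helper name `projection` is an assumption.

```python
from itertools import product

# Hypothetical finite family; an element x of the product is a dict i -> x_i.
family = {1: {"a", "b"}, 2: {0, 1, 2}}
indices = sorted(family)
elements = [dict(zip(indices, values))
            for values in product(*(family[i] for i in indices))]

def projection(i):
    """Return the projection p_i, which sends x to its ith coordinate x_i."""
    return lambda x: x[i]

p1, p2 = projection(1), projection(2)

# Each projection maps the product onto its coordinate set.
image_p1 = {p1(x) for x in elements}
image_p2 = {p2(x) for x in elements}
```

Here `image_p1` equals the first coordinate set and `image_p2` the second, which is what "onto" means for p_{i} when every coordinate set is non-empty.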

We will continue with exercises on this topic in a later blog.


Nalin Pithwa

Notes II: Sets and Functions: problems and solutions

Let f: X \rightarrow Y, let A \subseteq X, let B \subseteq Y. Then,

f(A) = \{f(x): x \in A \}

f^{-1}(B) = \{ x: f(x) \in B\}

I. The main properties of the first set mapping are:

1a) f(\phi) = \phi, f(X) \subseteq Y

1b) A_{1} \subseteq A_{2} \Longrightarrow f(A_{1}) \subseteq f(A_{2})

1c) f(\bigcup_{i}A_{i}) = \bigcup_{i}f(A_{i})

1d) f(\bigcap_{i}A_{i}) \subseteq \bigcap_{i} f(A_{i})

Answer I:

1a: obvious by definition of f(A) or f.

1b: obvious by definition of f

1c: Let y_{0} \in f(\bigcup_{i}A_{i}), so that y_{0} = f(x_{0}) for some x_{0} \in \bigcup_{i}A_{i}. Then x_{0} \in A_{i} for at least one i, and so y_{0} = f(x_{0}) \in f(A_{i}). That is, f(\bigcup_{i}A_{i}) \subseteq \bigcup_{i}f(A_{i}). Reversing the arguments, we get the other subset relation. So that f(\bigcup_{i}A_{i}) = \bigcup_{i}f(A_{i}). The image of the union is the union of images.

1d: Let x_{0} \in \bigcap_{i}A_{i} so that x_{0} belongs to each A_{i}. Then f(x_{0}) \in f(A_{i}) for each i, so that f(\bigcap_{i}A_{i}) is clearly a subset of \bigcap_{i}f(A_{i}). The inclusion can be proper when f is not one-to-one.
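A small example makes the difference between 1c and 1d concrete. This is a sketch only: the mapping `f` (stored as a dict) and the sets A1, A2 are hypothetical, chosen so that f is not one-to-one.

```python
def image(f, A):
    """f(A) = {f(x) : x in A} for a mapping f stored as a dict."""
    return {f[x] for x in A}

# Hypothetical mapping on X = {1, 2, 3, 4}; not one-to-one since f(1) = f(3).
f = {1: "a", 2: "b", 3: "a", 4: "c"}
A1, A2 = {1, 2}, {2, 3}

union_ok = image(f, A1 | A2) == image(f, A1) | image(f, A2)        # 1c
inter_subset = image(f, A1 & A2) <= image(f, A1) & image(f, A2)   # 1d
inter_proper = image(f, A1 & A2) != image(f, A1) & image(f, A2)
```

Here the image of the intersection is {"b"}, while the intersection of the images is {"a", "b"}, so the inclusion in 1d is proper for this f.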

II) Now, we want to verify the following useful/famous relations of the second set mapping:

2a) f^{-1}{(\phi)} = \phi, f^{-1}(Y) = X

2b) B_{1} \subseteq B_{2} \Longrightarrow f^{-1}(B_{1}) \subseteq f^{-1}(B_{2})

2c) f^{-1}(\bigcup_{i}B_{i}) = \bigcup_{i}f^{-1}(B_{i})

2d) f^{-1}(\bigcap_{i}B_{i}) = \bigcap_{i}f^{-1}(B_{i})

2e) f^{-1}(B^{'}) = f^{-1}(B)^{'}

Answer 2a: obvious from definition.

Answer 2b: obvious from definition.

Answer 2c: Let x_{0} \in f^{-1}(\bigcup_{i}B_{i}), so that f(x_{0}) \in \bigcup_{i}B_{i}. Then f(x_{0}) belongs to at least one B_{i}, so that x_{0} \in f^{-1}(B_{i}) for some i. Each step can be reversed. In other words, f^{-1}(\bigcup_{i}B_{i}) = \bigcup_{i}f^{-1}(B_{i})

Answer 2d: the same argument as in 2c works, with "at least one B_{i}" replaced by "every B_{i}".

Answer 2e:

TPT: f^{-1}(B^{'}) = f^{-1}(B)^{'}.

Let x_{0} \in f^{-1}(B^{'}).

Hence, f(x_{0}) \in B^{'}.

Hence, f(x_{0}) \in Y-B, since the complement of B is taken in Y.

Hence, f(x_{0}) \notin B.

Hence, x_{0} \notin f^{-1}(B).

Hence, x_{0} \in f^{-1}(B)^{'}, the complement now being taken in X. Hence, f^{-1}(B^{'}) \subseteq f^{-1}(B)^{'}

Reversing the arguments proves the other inclusion.

Hence, f^{-1}(B^{'}) = f^{-1}(B)^{'}. QED.
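The relations 2a to 2e can be checked on a small example. The mapping and sets below are hypothetical choices for illustration, with complements taken in X and Y respectively; a finite check of this kind only illustrates the relations.

```python
def preimage(f, B):
    """f^{-1}(B) = {x : f(x) in B} for a mapping f stored as a dict."""
    return {x for x in f if f[x] in B}

# Hypothetical mapping from X = {1, 2, 3, 4} into Y = {"a", "b", "c", "d"}.
X = {1, 2, 3, 4}
Y = {"a", "b", "c", "d"}
f = {1: "a", 2: "b", 3: "a", 4: "c"}
B1, B2 = {"a", "b"}, {"b", "c", "d"}

check_2a = preimage(f, set()) == set() and preimage(f, Y) == X
check_2c = preimage(f, B1 | B2) == preimage(f, B1) | preimage(f, B2)
check_2d = preimage(f, B1 & B2) == preimage(f, B1) & preimage(f, B2)
check_2e = preimage(f, Y - B1) == X - preimage(f, B1)
```

Unlike the image, the inverse image respects intersections and complements even though this f is neither one-to-one nor onto.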

More Problems :

A) Two mappings f: X \rightarrow Y and g: X \rightarrow Y are said to be equal (and we write them as f=g) if f(x) = g(x) for every x in X. Let f, g, and h be any three mappings of a non-empty set X into itself, and now prove that the multiplication of mappings is associative in the sense that (fg)h = f(gh).

Solution A:

Let f: X \rightarrow X, g: X \rightarrow X, and h: X \rightarrow X be any three mappings of X into itself. Let x_{0} \in X and let h(x_{0}) = x_{1} \in X. Now, we know that ((fg)h)(x_{0}) = (fg)(h(x_{0})). So that now we want to find (fg)(x_{1}) = f(g(x_{1})) = f(x_{2}), where g(x_{1}) = x_{2} \in X.

Now, on the other hand, (f(gh))(x_{0}) means f((gh)(x_{0})) = f(g(h(x_{0}))) = f(g(x_{1})) = f(x_{2}), just as in the LHS. Since x_{0} was arbitrary, (fg)h = f(gh). QED.
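Associativity can also be confirmed exhaustively on a small set. The sketch below enumerates all mappings of a three-element set into itself, stored as dicts; the set X and the helper name `compose` are assumptions for illustration.

```python
from itertools import product

def compose(f, g):
    """Return the product fg, meaning (fg)(x) = f(g(x)), for dict mappings."""
    return {x: f[g[x]] for x in g}

X = [0, 1, 2]

# All 27 mappings of X into itself, each stored as a dict.
all_maps = [dict(zip(X, values)) for values in product(X, repeat=len(X))]

associative = all(
    compose(compose(f, g), h) == compose(f, compose(g, h))
    for f, g, h in product(all_maps, repeat=3)
)
```

All 27 · 27 · 27 triples pass, in line with the pointwise argument given in Solution A.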

B) Let X be a non-empty set. The identity mapping i_{X} on X is the mapping of X onto itself defined by i_{X}(x)=x for every x. Then, i_{X} sends each element of the set X to itself; that is, leaves fixed each element of X. Prove that fi_{X}= i_{X}f=f for any mapping f of X into itself. If f is one-to-one onto, so that its inverse f^{-1} exists, show that ff^{-1} = f^{-1}f=i_{X}. Prove further that f^{-1} is the only mapping of X into itself which has this property; that is, show that if g is a mapping of X into itself such that fg=gf=i_{X}, then g=f^{-1}. Hint: g=gi_{X}=g(ff^{-1})=(gf)f^{-1} = i_{X}f^{-1}=f^{-1}, or

g=i_{X}g=(f^{-1}f)g=f^{-1}(fg) = f^{-1}i_{X}=f^{-1}

B(i) TPT: fi_{X}=i_{X}f=f for any mapping f of X into itself. Proof: Consider x_{0} \in X and let f(x_{0}) = x_{1} \in X. Hence, (fi_{X})(x_{0}) = f(i_{X}(x_{0})) = f(x_{0}) = x_{1}. Now, on the other hand, (i_{X}f)(x_{0}) = i_{X}(f(x_{0})) = i_{X}(x_{1}) = x_{1} also. QED.

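B(ii): Let f be one-to-one onto such that f: X \rightarrow X. Hence, f^{-1} exists. Also, f^{-1}: X \rightarrow X. Let f(x_{0}) = x_{1} \in X. Hence, by definition and because f^{-1} is also one-to-one and onto, f^{-1}(x_{1})=x_{0}. Hence, (f^{-1}f)(x_{0}) = f^{-1}(x_{1}) = x_{0} and (ff^{-1})(x_{1}) = f(x_{0}) = x_{1}. Since x_{0} is arbitrary and every x_{1} \in X arises in this way (f being onto), ff^{-1} = f^{-1}f=i_{X}. QED.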

B(iii) : Prove that such a f^{-1} is unique.

Solution to B(iii): Let there be another function g such that gf=fg=i_{X}. Also, we have ff^{-1}=f^{-1}f=i_{X}. Using the hint, together with associativity from A, g = gi_{X} = g(ff^{-1}) = (gf)f^{-1} = i_{X}f^{-1} = f^{-1}. QED.
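The statements in B can be checked by brute force on a three-element set. This is a sketch under the assumption that mappings are stored as dicts; `compose` and `two_sided_inverses` are hypothetical helper names.

```python
from itertools import permutations, product

def compose(f, g):
    """(fg)(x) = f(g(x)) for mappings stored as dicts."""
    return {x: f[g[x]] for x in g}

X = [0, 1, 2]
identity = {x: x for x in X}

all_maps = [dict(zip(X, values)) for values in product(X, repeat=len(X))]
bijections = [dict(zip(X, values)) for values in permutations(X)]

def two_sided_inverses(f):
    """All mappings g of X into itself with fg = gf = i_X."""
    return [g for g in all_maps
            if compose(f, g) == identity and compose(g, f) == identity]

# B(iii): every one-to-one onto mapping has exactly one two-sided inverse.
unique_inverse = all(len(two_sided_inverses(f)) == 1 for f in bijections)

# B(i): the identity mapping is neutral for multiplication of mappings.
identity_neutral = all(compose(f, identity) == f == compose(identity, f)
                       for f in all_maps)
```

The search runs over all 27 mappings, not just the 6 bijections, so it really does confirm that no other g has the property.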

Question C:

Let X and Y be non-empty sets and f a mapping of X into Y. Prove the following: (i) f is one-to-one if and only if there exists a mapping g of Y into X such that gf=i_{X}; (ii) f is onto if and only if there exists a mapping h of Y into X such that fh=i_{Y}.

Solution C(i) : Given f: X \rightarrow Y such that if x_{0} \neq x_{1}, then f(x_{0}) \neq f(x_{1}). Now, we want a mapping g: Y \rightarrow X such that gf=i_{X}. Let us construct g as follows. If y \in f(X), there is exactly one x \in X with f(x) = y (because f is one-to-one), and we put g(y) = x. If y \notin f(X), we put g(y) = a, where a is any fixed element of X (X is non-empty). Then, for every x_{0} \in X with f(x_{0}) = y_{0}, we have (gf)(x_{0}) = g(f(x_{0})) = g(y_{0}) = x_{0}. So, knowing f to be one-to-one, we can explicitly construct a mapping g: Y \rightarrow X with gf=i_{X}. QED. On the other hand, let it be given that X and Y are two non-empty sets, and that f: X \rightarrow Y and g: Y \rightarrow X are such that gf=i_{X}. That is, for every x^{'} \in X we have (gf)(x^{'}) = i_{X}(x^{'}) = x^{'}, that is, g(f(x^{'})) = x^{'}. Now, if x^{'} and x^{''} are elements of X with f(x^{'}) = f(x^{''}), then x^{'} = g(f(x^{'})) = g(f(x^{''})) = x^{''}. Hence, different inputs give rise to different outputs and so f is one-to-one. QED.

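Solution C(ii) : Given f: X \rightarrow Y and f is onto, that is, f(X) = Y. We want to produce a function h: Y \rightarrow X such that fh=i_{Y}. Note that f need not be one-to-one, so f^{-1} need not exist as a mapping. Instead, for each y \in Y the set f^{-1}(\{y\}) is non-empty because f is onto; choose one element x_{y} from it (for infinite Y this uses the axiom of choice) and put h(y) = x_{y}. Then (fh)(y) = f(x_{y}) = y for every y, that is, fh=i_{Y}. On the other hand, if fh=i_{Y} for some h: Y \rightarrow X, then every y \in Y equals f(h(y)), so f is onto. QED.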
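For finite sets, the two constructions in Question C can be written out directly. The sketch below stores mappings as dicts; the function names `left_inverse` and `right_inverse`, and the example mappings, are assumptions made for illustration. The "choice" of a preimage is made here by taking the smallest one, which is possible only because the example sets are finite and ordered.

```python
def compose(f, g):
    """(fg)(x) = f(g(x)) for mappings stored as dicts."""
    return {x: f[g[x]] for x in g}

def left_inverse(f, X, Y):
    """For one-to-one f: X -> Y, build g: Y -> X with gf = i_X."""
    default = next(iter(X))            # any fixed element of the non-empty X
    g = {y: default for y in Y}        # values off f(X) are arbitrary
    g.update({f[x]: x for x in X})     # on f(X), undo f
    return g

def right_inverse(f, X, Y):
    """For onto f: X -> Y, build h: Y -> X with fh = i_Y."""
    h = {}
    for x in sorted(X):                # keep the first preimage found for each y
        h.setdefault(f[x], x)
    return h

# Hypothetical examples.
X1, Y1 = {1, 2}, {"a", "b", "c"}
f1 = {1: "a", 2: "c"}                  # one-to-one, not onto
g1 = left_inverse(f1, X1, Y1)

X2, Y2 = {1, 2, 3}, {"a", "b"}
f2 = {1: "a", 2: "b", 3: "a"}          # onto, not one-to-one
h2 = right_inverse(f2, X2, Y2)
```

Here `compose(g1, f1)` is the identity on X1 and `compose(f2, h2)` is the identity on Y2, even though neither f1 nor f2 has an inverse mapping.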

Question D:

Let X be a non-empty set and f a mapping of X into itself. Show that f is one-to-one onto if and only if there exists a mapping g of X into itself such that gf=fg=i_{X}. If there exists a mapping g with this property, then there is one and only one such mapping. Why?

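Solution D: From the previous question C, we are done: if gf=i_{X}, then f is one-to-one by C(i), and if fg=i_{X}, then f is onto by C(ii); conversely, if f is one-to-one onto, take g=f^{-1}. Such a g is unique by B(iii).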

Question E:

Let X be a non-empty set, and let f and g be one-to-one mappings of X onto itself. Show that fg is also a one-to-one mapping of X onto itself and that (fg)^{-1}=g^{-1}f^{-1}.

HW. Hint: consider the action of the functions on each element.

Question F:

Let X and Y be non-empty sets and f a mapping of X into Y. If A and B are, respectively, subsets of X and Y, prove the following:

(a) ff^{-1}(B) \subseteq B, and ff^{-1}(B)=B is true for all B if and only if f is onto.

(b) A \subseteq f^{-1}f(A), and A = f^{-1}f(A) is true for all A if and only if f is one-to-one.

(c) f(A_{1} \bigcap A_{2}) = f(A_{1}) \bigcap f(A_{2}) is true for all A_{1} and A_{2} if and only if f is one-to-one.

(d) f(A)^{'} \subseteq f(A^{'}) is true for all A if and only if f is onto.

(e) If f is onto — so that, f(A)^{'} \subseteq f(A^{'}) is true for all A — then, f(A)^{'}=f(A^{'}) is true for all A if and only if f is also one-to-one.

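Solution Fa: If y \in ff^{-1}(B), then y = f(x) for some x \in f^{-1}(B), and so y = f(x) \in B; this gives ff^{-1}(B) \subseteq B. If ff^{-1}(B)=B for all B, put B=Y: then f(X) = ff^{-1}(Y) = Y, so that f is onto. Conversely, if f is onto and y \in B, pick x with f(x) = y; then x \in f^{-1}(B), so y \in ff^{-1}(B), which proves the other subset relationship.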

Solution Fb:

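We need to apply the definition that different inputs give rise to different outputs. If x \in A, then f(x) \in f(A), so x \in f^{-1}f(A); this gives A \subseteq f^{-1}f(A) for any f. If f is one-to-one and x \in f^{-1}f(A), then f(x) = f(a) for some a \in A, which forces x = a \in A; so A = f^{-1}f(A). Conversely, if A = f^{-1}f(A) for all A and f(x_{1}) = f(x_{2}), put A = \{x_{1}\}: then x_{2} \in f^{-1}f(A) = \{x_{1}\}, so x_{2} = x_{1} and f is one-to-one.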


Solution Fc:

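The inclusion f(A_{1} \bigcap A_{2}) \subseteq f(A_{1}) \bigcap f(A_{2}) always holds (see 1d above). If f is one-to-one and y \in f(A_{1}) \bigcap f(A_{2}), then y = f(a_{1}) = f(a_{2}) with a_{1} \in A_{1} and a_{2} \in A_{2}, so a_{1} = a_{2} \in A_{1} \bigcap A_{2} and y \in f(A_{1} \bigcap A_{2}). Conversely, if f is not one-to-one, pick x_{1} \neq x_{2} with f(x_{1}) = f(x_{2}) and put A_{1} = \{x_{1}\}, A_{2} = \{x_{2}\}: then f(A_{1} \bigcap A_{2}) = f(\phi) = \phi, while f(A_{1}) \bigcap f(A_{2}) = \{f(x_{1})\} \neq \phi.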

Solution Fd:

Given that f: X \rightarrow Y and A \subseteq X. Here f(A)^{'} is Y-f(A) and A^{'} is X-A. Suppose f(A)^{'} \subseteq f(A^{'}) for all A. Put A=X. Then f(X)^{'} \subseteq f(X^{'}), that is, Y-f(X) \subseteq f(\phi) = \phi. This implies Y=f(X), so f is onto. Conversely, let f be onto and let y \in f(A)^{'}. Pick x with f(x) = y. Then x \notin A (otherwise y would be in f(A)), so x \in A^{'} and y \in f(A^{'}). QED.
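The "if and only if" statements (a) to (d) of Question F can be tested exhaustively on small sets. The sketch below runs over every mapping between two tiny sets and compares each property with "f is onto" or "f is one-to-one"; the helper names and the choice of sets are assumptions, and part (e) is not covered.

```python
from itertools import combinations, product

def subsets(S):
    """All subsets of the finite set S."""
    items = list(S)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def image(f, A):
    return {f[x] for x in A}

def preimage(f, B):
    return {x for x in f if f[x] in B}

def properties_hold(X, Y):
    """Check (a)-(d) of Question F for every mapping f: X -> Y."""
    xs = sorted(X)
    for values in product(sorted(Y), repeat=len(xs)):
        f = dict(zip(xs, values))
        onto = image(f, X) == Y
        one_to_one = len(set(values)) == len(xs)
        a = all(image(f, preimage(f, B)) == B for B in subsets(Y))
        b = all(preimage(f, image(f, A)) == A for A in subsets(X))
        c = all(image(f, A1 & A2) == image(f, A1) & image(f, A2)
                for A1 in subsets(X) for A2 in subsets(X))
        d = all(Y - image(f, A) <= image(f, X - A) for A in subsets(X))
        if (a, b, c, d) != (onto, one_to_one, one_to_one, onto):
            return False
    return True

result = properties_hold({0, 1, 2}, {0, 1}) and properties_hold({0, 1}, {0, 1, 2})
```

The two calls use sets of different sizes on purpose: from a three-element set to a two-element set no mapping is one-to-one, and in the other direction no mapping is onto, so both sides of each equivalence get exercised.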


Nalin Pithwa.