V. Partitions and Equivalence Relations: My Notes


  1. Topology and Modern Analysis, G F Simmons, Tata McGraw Hill Publications, India.
  2. Toplcs in Algebra, I N Herstein.
  3. Abstract Algebra, Dummit and Foote.
  4. Topology by James Munkres.

In the first part of this section, we consider a non-empty set X, and we study decompositions of X into non-empty subsets which fill it out and have no elements in common with one another. We give special attention to the tools (equivalence relation) which are normally used to generate such decompositions.

A partition of X is a disjoint class \{ X_{i} \} of non-empty subsets of X whose union if the full set X itself. The X_{i}‘s are called the partition sets. Expressed somewhat differently, a partition of X is the result of splitting it, or subdividing it, into non-empty subsets in such a way that each element of X belongs to one and only one of the given subsets. ]

If X is the set \{1,2,3,4,5 \}, then \{1,3,5 \}, \{2,4 \} and \{1,2,3 \} and \{ 4,5\} are two different partitions of X. If X is the set \Re of all real numbers, then we can partition \Re into the set of all rationals and the set of all irrationals, or into the infinitely many closed open intervals of the form [n, n+1) where n is an integer. If X is the set of all points in the coordinate plane, then we can partition X in such a way that each partition set consists of all points with the same x coordinate (vertical lines), or so that each partition set consists of all points with the same y coordinate (horizontal lines).

Other partitions of each of these sets will readily occur to the reader. In general, there are many different ways in which any given set can be partitioned. These manufactored examples are admittedly rather uninspiring and serve only to make our ideas more concrete. Later in this section we consider some others which are more germane to our present purposes.

A binary relation in the set X is a mathematical symbol or verbal phrase, which we denote by R in this paragraph, such that for each ordered pair (x,y) of elements of X the statement x \hspace{0.1in} R \hspace{0.1in} y is meaningful, in the sense that it can be classified definitely as true or false. For such a binary relation, x \hspace{0.1in} R \hspace{0.1in}y symbolizes the assertion that x is related by R to y, and x \not {R} \hspace{0.1in}y the negation of this, namely, the assertion that x is not related by R to y. Many examples of binary relations can be given, some familiar and others less so, some mathematical and others not. For instance, if X is the set of all integers and R is interpreted to mean “is less than,” which of course is usually denoted by the symbol <, then we clearly have 6<7 and 5 \not < 2. We have been speaking of binary relations, which are so named because they apply only to ordered pairs of elements, rather than to ordered triples, etc. In our work, we drop the qualifying adjective and speak simply of a relation in X, since we shall have occasion to consider only relations of this kind. {NB: Some writers prefer to regard a relation R in X as a subset R of X \times X. From this point of view, x R y and x \not {R} y are simply equivalent ways of writing (x,y) \in R and (x,y) \notin R. This definition has the advantage of being more tangible than our definition, and the disadvantage that few people really think of a relation in this way.” )

We now assume that a partition of our non-empty set X is given, and we associate with this partition a relation on X. This relation is defined to be in the following way: we say that x is equivalent to y and write this as x \sim y (the symbol \sim is pronounced “wiggle”.), if x and y belong to the same partition set. It is obvious that the relation \sim has the following properties:

a) x \sim x for every x (reflexivity)

b) x \sim y \Longrightarrow y \sim x (symmetry)

c) x \sim y \hspace{0.1in} and \hspace{0.1in} y \sim z \Longrightarrow x \sim z (transitivity)

This particular relation in X arose in a special way, in connection with a given partition of X, and its properties are immediate consequences of the definition. Any relation whatever in X which possesses these three properties is called an equivalence relation in X.

We have just seen that each partition of X has associated with it a natural equivalence relation in X. We now reverse the situation and prove that a given equivalence relation in X determines a natural partition of X.

Let \sim be an equivalence relation in X; that is, assume that it is reflexive, symmetric, and transitive in the sense described above. If x is an element of X, the subset of X defined by [x] = \{ y: y \sim x\} is called the equivalence set of x. The equivalence set of x is thus the set of all elements which are equivalent to x. We show that the class of all distinct equivalence sets forms a partition of X. By reflexivity, x \in [x] for each element x in X, so each equivalence set is non-empty and their union is X. It remains to be shown that any two equivalence sets [x_{1}] and [x_{2}] are either disjoint or identical. We prove this by showing that if [x_{1}] and [x_{2}] are not disjoint, then they must be identical. Suppose that [x_{1}] and [x_{2}] are not disjoint, that is, suppose that they have a common element z. Since x belongs to both equivalence sets, z \sim x_{1} and z \sim x_{2}, and by symmetry x_{1} \sim z. Let y be any element of x_{1}, so that y \sim x_{1}. Since y \sim x_{1} and x_{1} \sim z, transitivity shows that y \sim z. By another application of transitivity, y \sim z and z \sim x_{2}, imply that y \sim x_{2} so that y is in [x_{2}]. Since y was arbitrarily chosen in [x_{1}], we see by this that [x_{1}] \subseteq [x_{2}]. The same reasoning shows that [x_{2}] \subseteq [x_{1}] and from this we conclude that [x_{1}] = [x_{2}].

The above discussion demonstrates that there is no real distinction (other than a difference in language) between partitions of a set and equivalence relation by regarding elements as equivalent if they belong to the same partition set, and if we start with an equivalence relation, we get a partition by grouping together into subsets all elements which are equivalent to one another. We have here a single mathematical idea, which we have been considering from two different points of view, and the approach we choose in any particular application depends entirely on our own convenience. In practice, it is almost invariably the case that we use equivalence relations (which are usually easy to define) to obtain partitions (which are sometimes difficult to describe fully).

We now turn to several of the more important simple examples of equivalence relations.

Let I be the set of integers. If a and b are elements of this set, we write a = b (and say that a equals b) if a and b are the same integer. Thus, 2+3=5 means that the expression on the left and right are simply different ways of writing the same integer. It is apparent that = used in this sense is an equivalence relation in the set I:

i) a=a for every a

ii) a=b \Longrightarrow b=a

iii) a=b \hspace{0.1in} b=c \Longrightarrow a=c.

Clearly, each equivalence set consists of precisely one integer.

Another familiar example is this relation of equality commonly used for fractions. We remind the reader that, strictly speaking, a fraction is merely a symbol of the form a/b, where a and b are integers and b is not zero. The fractions 2/3 and 4/6 are obviously not identical, but nevertheless we consider them to be equal. In general, we say that two fractions a/b and c/d are equal, written \frac{a}{b} = \frac{c}{d}, if ad and bc are equal as integers in the usual sense (see the paragraph above). (HW quiz: show this is an equivalence relation on the set of fractions). An equivalence set of fractions is what we call a rational number. Every day usage ignores the distinction between fractions and rational numbers, but it is important to recognize that from the strict point of view it is the rational numbers (and not the fractions) which form part of the real number system.

Our final example has a deeper significance, for it provides us with the basic tool for our work of the next two sections.

For the remainder of all this section, we consider a relation between pairs of non-empty sets, and each set mentioned (whether we say so explicitly or not) is assumed to be non-empty. If X and Y are two sets, we say that X is numerically equivalent to Y if there exists a one-to-one correspondence between X and Y, that is, if there exists a one-to-one mapping of X onto Y. This relation is reflexive, since the identity mapping i_{X}: X \rightarrow X is one-to-one onto; it is symmetric since if f: X \rightarrow Y is one-to-one onto, then its inverse mapping f^{-1}: Y \rightarrow X is also one-to-one onto; and it is transitive, since if f: X \rightarrow Y and g: Y \rightarrow Z are one-to-one onto, then gf: X \rightarrow Z is also one-to-one onto. Numerical equivalence has all the properties of an equivalence relation, and if we consider it as an equivalence relation in the class of all non-empty subsets of some universal set U, it groups together into equivalence sets all those subsets of U which have the “same number of elements.” After we state and prove the following very useful but rather technical theorem, we shall continue in Sections 6 and 7 with an exploration of the implications of these ideas.

The theorem we have in mind — the Schroeder-Bernstein theorem: If X and Y are two sets each of which is numerically equivalent to a subset of the other, then all of X is numerically equivalent to all of Y. There are several proofs of this classic theorem, some of which are quite difficult. The very elegant proof we give is essentially due to Birkhoff and MacLane.


Assume that f: X \rightarrow Y is a one-to-one mapping of X into Y, and that g: Y \rightarrow X is a one-to-one mapping of Y into X. We want to produce a mapping F: X \rightarrow Y which is one-to-one onto. We may assume that neither f nor g is onto, since if f is, we can define F to f, and if g is, we can define F to be g^{-1}. Since both f and g are one-to-one, it is permissible to use the mappings f^{-1} and g^{-1} as long as we keep in mind that f^{-1} is defined only on f(X) and g^{-1} is defined only on g(Y). We obtain the mapping F by splitting both X and Y into subsets which we characterize in terms of the ancestry of their elements. Let x be an element of X. We apply g^{-1} (if we can) to get the element g^{-1}(x) in Y. If g^{-1}(x) exists, we call it the first ancestor of x. The element x itself we call the zeroth ancestor of x. We now apply f^{-1} to g^{-1}(x) if we can, and if (f^{-1}g^{-1})(x) exists, we call it the second ancestor of x. We now apply g^{-1} to (f^{-1}g^{-1})(x) if we can, and if (g^{-1}f^{-1}g^{-1})(x) exists, we call it the third ancestor of x. As we continue this process of tracing back the ancestry of x, it becomes apparent that there are three possibilities — (1) x has infinitely many ancestors. We denote by X_{i}, the subset of X, which consists of all elements with infinitely many ancestors (2) x has an even number of ancestors, this means that x has a last ancestor (that is, one which itself has no first ancestor) in X. We denote by X_{e} the subset of X consisting of all elements with an even number of ancestors. (3) x has an odd number of ancestors; this means that x has a last ancestor in Y. We denote by X_{o} the subset of X which consists of all elements with an odd number of ancestors. The three sets X_{i}, X_{e}, X_{o} form a disjoint class whose union is X. We decompose Y in just the same way into three subsets Y_{i}, Y_{e}, Y_{o}. It is easy to see that f maps X_{i} onto Y_{i}, and X_{e} onto Y_{e}, and that g^{-1} maps X_{o} onto Y_{o}, and we complete the proof by defining F in the following piecemeal manner:

F(x) = f(x) if x \in X_{i} \bigcup X_{e}

and F(x) = g^{-1}(x) if x \in X_{o}.


The Schroeder Bernstein theorem has great theoretical and practical significance. It main value for us lies in its role as a tool by means of which we can prove numerical equivalence with a minimum of effort for many specific sets. We put it to work in Section 7.


Nalin Pithwa

IV. Product of Sets: Exercises

Reference: Topology and Modern Analysis, G F Simmons, Tata McGraw Hill Publications, India.


I) The graph of a mapping f: X \rightarrow Y is a subset of the product X \times Y. What properties characterize the graphs of mappings among all subsets of X \times Y?

Solution I: composition.

II) Let X and Y be non-empty sets. If A_{1} and A_{2} are subsets of X, and Y_{1} and Y_{2} are subsets of Y, then prove the following

(a) (A_{1} \times B_{1}) \bigcap (A_{2} \times B_{2}) = (A_{1} \bigcap A_{2}) \times (B_{1} \bigcap B_{2})

(b) (A_{1} \times B_{1}) - (A_{2} \times B_{2}) = (A_{1}-A_{2}) \times (B_{1}-B_{2}) \bigcup (A_{1} \bigcap A_{2}) \times (B_{1}-B_{2}) \bigcup (A_{1}-A_{2}) \times (B_{1} \bigcap B_{2})

Solution IIa:

TPT: (A_{1} \times B_{1}) \times (A_{2} \times B_{2}) = (A_{1} \bigcap A_{2}) \times (B_{1} \bigcap B_{2})

This is by definition of product and the fact that the co-ordinates are ordered and the fact that A_{1} \subseteq X, A_{2} \subseteq X, B_{1} \subseteq Y, and B_{2} \subseteq Y.

Solution IIb:

Let (a_{i}, b_{j}) \in A_{1} \times B_{1}, but (a_{i}, b_{j}) \notin A_{2} \times B_{2}. So, the element may belong to (A_{1}-A_{2}) \times (B_{1} - B_{2}) or it could happen that it belongs to A_{1} \times A_{1}, but to (B_{1}-B_{2}) (here we need to remember that the element is ordered); so, also it could be the other way: it could belong to (A_{1}-A_{2}) but to (B_{1} \bigcap B_{2}) also. The same arguments applied in reverse establish the other subset inequality. Hence, done.

III) Let X and Y be non-empty sets, and let A and B be rings of subsets of X and Y, respectively. Show that the class of all finite unions of sets of the form A \times B with A \in \bf{A} and B \in \bf{B} is a ring of subsets of X \times Y.

Solution III:

\bigcup_{i=1}^{n}A_{i} \times B_{i} = \bigcup_{i=1}^{n}A_{i} \times \bigcup_{i=1}^{n}B_{i}.

From question IIb above, the difference of any two pairs of sets is also in X \times Y.

Hence, done.


Nalin Pithwa

IV. Product of Sets

Reference: Topology and Modern Analysis — G. F. Simmons, Tata McGraw Hill, India.

We shall often have occasion to weld together the sets of a given class into a single new set called their product (or their Cartesian product). The ancestor of this concept is the coordinate plane of analytic geometry, that is, a plane equipped with the normal rectangular coordinate axes. We give a brief description of this fundamental idea with a view to paving the way for our discussion of product of sets in general.

First, a few preliminary comments about the real line. We have already used this term several times before without any explanations, and of course what we mean by it is an ordinary geometric line whose points have been identified with — or coordinatized by — the set R of all real numbers. We use the letter R to denote the real line as well as the set of all real numbers, and we often speak of real numbers as if they were points on the real line, and of points on the real line as if they were real numbers. Let no be deceived into thinking that the real line is a simple thing, for its structure is exceedingly intricate. Our present view of it, however, is as naive and uncomplicated as the picture of it. Generally speaking, we assume that the reader is familiar with the simpler properties of the real line — those relating to inequalities (see problems section) and the basic algebraic operations of addition, subtraction, multiplication and division. One of the most significant facts about the real number system is perhaps less well known. This is the so-called least upper bound property. It asserts that every non empty set of real numbers which has an upper bound has a least upper bound. It is an easy consequence of this fact that a non empty set of real numbers which has a lower bound has a greatest lower bound. All these matters are developed rigorously on the basis of a small number of axioms, and detailed treatments can often be found in books on elementary abstract algebra.

To construct the coordinate plane, we now proceed as follows. We take two identical replicas of the real line, which we call the x axis and the y axis, and paste them on a plane at right angles to one another in such a way that they cross at the zero point on each. We know that usual picture. Now, let P be a point in the plane. We project P perpendicularly onto points Px and Py on the axes. If x and y are the coordinates of Px and Py on their respective axes, this process leads us from the point P to the uniquely determined ordered pair (x,y) of real numbers, where x and y are called the x coordinate and y coordinate of the point P. We can reverse the process, and starting with the ordered pair of real numbers, we can recapture the point. This is the manner in which we establish the familiar one-to-one correspondence between points P in the plane and ordered pairs (x,y) of real numbers. In fact, we think of a point in the plane (which is a geometric object) and its corresponding ordered pair of real numbers (which is an algebraic object) as being — to all intents and purposes — identical with one another. The essence of analytic geometry lies in the possibility of this exploiting this identification by using algebraic tools in geometric arguments and giving geometric interpretations to algebraic calculations.

The conventional attidute towards the coordinate plane in analytic geometry is that the geometry is the focus of interest and the algebra of ordered pairs is only a convenient tool. Here, we reverse this point of view. For us, the coordinate plane is defined to be the set of all ordered pairs (x,y) of real number. We can satisfy our desire for visual images by using the usual picture of the plane and by calling such an ordered pair a point, but this geometric lnaguage is more a convenience than a necessity.

Our notation for the coordinate plane is R \times R or R^{2}. This symbolism reflects the idea that the coordinate plane is the result of multiplying together two replicas of the real line R.

It is perhaps necessary to comment on one possible source of misunderstanding. When we speak of R^{2} as a plane, we do so only to establish an intuitive bond with the reader’s previous experience in analytic geometry. Our present attitude is that R^{2} is a pure set and has no structure whatsoever, because no structure has as yet been assigned to it. We remarked earlier (with deliberate vagueness) that a space is a set to which has been added some kind of algebraic or geometric structure. Later, we shall convert the set R^{2} into the space of analytic geometry by defining the distance between any two points (x_{1}, y_{1}) and (w_{2}, y_{2}) to be


This notion of distance endows the set R^{2} with a certain “spatial” character, which we shall recognize by calling the resulting space the Euclidean plane instead of the coordinate plane.

We assume that the reader is fully acquainted with the way in which the set C of all complex numbers can be identified (as a set) with the coordinate plane R^{2}. If z is a complex number, and if z has the standard form x+iy where x and y are real numbers, then we identify z with the ordered pair (x,y) and thus, with an element of R^{2}. The complex numbers, however, are much more than merely a set. They constitute a number system with the operations of addition, multiplication, conjugation, etc. When the coordinate plane R^{2} is thought of as consisting of complex numbers and is enriched by the algebraic structure it acquires in this way, it is called the complex plane. The letter C is used to denote either the set C of all complex numbers or the complex plane. Later, we shall make a space out of the complex plane.

Suppose now that X_{1} and X_{2} are two non empty sets. By analogy with our above discussion, their product X_{1} \times X_{2} is defined by the set of all ordered pairs (x_{1}, x_{2}) where x_{1} is in X_{1} and x_{2} is in X_{2}. In spite of the arbitrary nature of X_{1} and X_{2}, their product can be represented by a picture similar to the XY plane. The term product is applied to this set, and it is thought of as a result of multiplying together the sets X_{1} and X_{2} for the following reason: if X_{1} and X_{2} are finite sets with m and n elements, then clearly X_{1} \times X_{2} has mn elements. If f: X_{1} \rightarrow X_{2} is a mapping with domain X_{1} and range X_{2}, its graph is a subset of X_{1} \times X_{2} which consists of all ordered pairs of the form (x_{1}, f(x_{1})). We observe that this is an appropriate generalization of the concept of a graph of a function as it occurs in elementary mathematics.

This definition of the product of two sets extends easily to the definiion of product of n sets for any positive integer n. If X_{1}, X_{2}, X_{3}, \ldots, X_{n} are n sets where n is any positive integer, then their product X_{1} \times X_{2} \times X_{3} \times \ldots \times X_{n} is the set of all ordered tuples (x_{1}, x_{2}, \ldots, x_{n}) where x_{i} is in the set X_{i} for each subscript i. If the X_{i}‘s are all replicas of a single set X, that is, if

X_{1} = X_{2} = \ldots = X_{n} = X,

then their product is usually denoted by the symbol X^{n}.

These ideas specialize directly to yield the important sets R^{n} and C^{n}. R is just R, the real line; and R^{2} is the coordinate plane; R^{3} is the set of all ordered triples of real numbers — the set which underlies solid analytic geometry, and we assume that the reader is familiar with the manner in which this set arises, through the introduction of a rectangular coordinate system into three dimensional space. We can draw pictures here, just as in the case of the coordinate plane and use geometric language here as much as we please, but it must be understood that the mathematics of this set is the mathematics of ordered set of triples of real numbers and that pictures are merely an aid to the intuition. Once we fully this grasp this point of view, there is no difficulty whatsoever in advancing at once to the study of the set R^{n}, study of n-tuples of real numbers for any positive integer n. It is quite true that when n is greater than 3, it is no longer possible to draw the same kind of intuitively rich pictures, but at worst this is merely an inconvenience. We can and do continue to use suggestive geometric language, so all is not lost. The set C^{n} is defined similarly; it is the set of all ordered n-tuples (z_{1}, z_{2}, \ldots, z_{n}) of complex numbers. Both R^{n} and C^{n} are of fundamental importance in analysis and topology.

We emphasized that for the present the coordinate plane is to be considered merely as a set, but not as a space. Similar remarks apply to R^{n} and C^{n}. In due conrse, we shall impart form and content to each of these sets by suitable definitions. We shall convert them into Euclidean and n-unitary space which underlie and motivate so many developments in pure mathematics, and we shall explore some aspects of their algebraic and topological structures, But, as of now — and this is the point of view we insist on — they do not have any structure at all; they are merely sets.

As the reader doubtless suspects, we need not consider only products of finite classes of sets. The needs of topology compel us to extend these ideas to arbitrary classes of sets.

We defined the product X_{1} \times X_{2} \times \ldots \times X_{n} to be the set of all ordered n-tuples (x_{1}, x_{2}, \ldots, x_{n}) such that x_{i} is in X_{i} for each subscript i. To see how to extend this definition, we reformulate it as follows: We have an nidex set I, consisting of the integers from 1 to n, and corresponding to each index (or subscript) i we have a non-empty set X_{i}. The n-tuple (x_{1}, x_{2}, x_{3}, \ldots, x_{n}) is simply a function (call it x) defined on the index set I, with the restriction that its value x(i)=x_{i} is an element of the set X_{i} for each i in I. Our point of view here is that the function x is completely determined by and is essentially equivalent to the array (x_{1}, x_{2}, x_{3}, \ldots, x_{n}) of its values.

The way is now open for the definition of products in their full generality. Let \{ X_{i}\} be a non-empty class of non-empty sets, indexed by the elements i of an index set I. The sets X_{i} need not be different from one another; indeed, it may happen that they are all identical replicas of a single set, distinguished only different indices. The product of the sets X_{i}, written P_{i \in I}X_{i} is defined to be the set of all functions x defined on I such that x(i) is an element of the set X_{i} for each index i. We call X_{i} the ith coordinate set. When there can be no misunderstanding about the index set, the symbol P_{i \in I}X_{i} is often abbreviated to P_{i}X_{i}. The definition we have just given requires that each coordinate set be non-empty before the product can be formed. It will be useful if we extend this definition slightly by agreeing that if any of the X_{i}‘s are empty, then P_{i}X_{i} is also empty.

This approach to the idea of a product of a class of sets, by means of functions defined on the index set is useful mainly in giving the definition. In practice, it is much more convenient to use the subscript notation x_{i} instead of the function notation x(i). We then interpret the product P_{i}X_{i} as made up of elements x, each of which is specified by the exhibited array \{ x_{i}\} of it values in the respective coordinate sets X_{i}. We call x_{i} the ith coordinate of the element x = \{ x_{i}\}.

The mapping p_{i} of the product P_{i}X_{i} onto its ith coordinate set X_{i} which is defined by p_{i}(x) = x_{i} — that is, the mapping whose value at an arbitrary element of the product is the ith coordinate of that element — is called the projection onto the ith coordinate set. The projection p_{i} selects the ith coordinate of each element in its domain. There is clearly one projection for each element of the index set I, and the set of all projections plays an important role in the general theory of topological spaces.

We will continue with exercises on this topic in a later blog.


Nalin Pithwa

Notes II: Sets and Functions: problems and solutions

Let f: X \rightarrow Y, let A \subseteq X, let B \subseteq Y. Then,

f(A) = \{f(x): x \in A \}

f^{-1}(B) = \{ x: f(x) \in B\}

I. The main properties of the first set mapping are:

1a) f(\phi) = \phi, f(X) \subseteq Y

1b) A_{1} \subseteq A_{2} \Longrightarrow f(A_{1}) \subseteq A_{2}

1c) f(\bigcup_{i}A_{i}) = \bigcup_{i}f(A_{i})

1d) f(\bigcup_{i}A_{i}) \subseteq \bigcap_{i} f(A_{i})

Answer I:

1a: obvious by definition of f(A) or f.

1b: obvious by definition of f

1c: Let x_{0} \in \bigcup_{i}A_{i} so that for some one i we have f(x) \in f(A_{i}). That is, f(\bigcup_{i}A_{i}) \subseteq \bigcup_{i}f(A_{i}). Reversing the arguments, we get the other subset relation. So that f(\bigcup_{i}A_{i}) = \bigcup_{i}f(A_{i}). The image of the union is the union of images.

1d: Let x_{0} \in \bigcap_{i}A_{i} so that x_{0} belongs to each A_{i}. So that LHS is clearly a subset of \bigcap_{i}f(A_{i}).

II) Now, we want to verify the following useful/famous relations of the second set mapping:

2a) f^{-1}{(\phi)} = \phi, f^{-1}(Y) = X

2b) B_{1} \subseteq B_{2} \Longrightarrow f^{-1}(B_{1}) \subseteq f^{-1}(B_{2})

2c) f^{-1}(\bigcup_{i}B_{i}) = \bigcup_{i}f^{-1}(B_{i})

2d) f^{-1}(\bigcap_{i}B_{i}) = \bigcap_{i}f^{-1}(B_{i})

2e) f^{-1}(B^{'}) = f^{-1}(B)^{'}

Answer 2a: obvious from definition.

Answer 2b: obvious from definition.

Answer 2c: Let y_{0} belong to at least one B_{i} so that x_{0} = f^{-1}(y_{0}) = f^{-1}(B_{i}) for some i. In other words, f^{-1}(\bigcup_{i}B_{i}) = \bigcup_{i}f^{-1}(B_{i})

Answer 2d:

TPT: f^{-1}(B^{'}) = f^{-1}(B)^{'}.

Let y_{0} \in f^{-1}(B^{'}).

Hence, f(y_{0}) \in B^{'}.

Hence, f(y_{0}) \in U-B, where U is the universal set

Hence, f(y_{0}) \in U-f(A)

Hence, f(y_{0}) \in U-B

Hence, f(y_{0}) \in U-f(A)

Hence, f(y_{0}) \in (f(A))^{'}

Hence, y_{0} \in f^{-1}(B)^{'}. Hence, f^{-1}(B^{'}) \subseteq f^{-1}(B)^{'}

Reversing the arguments proves the other subset inequality.

Hence, f^{-1}(B^{'}) = f^{-1}(B)^{'}. QED.

More Problems :

A) Two mappings f: X \rightarrow Y and g: X \rightarrow Y are said to be equal (and we write them as f=g) if f(x) = g(x) for every x in X. Let f, g, and h be any three mappings of a non-empty set X into itself, and now prove that the multiplication of mappings is associative in the sense that (fg)h = f(gh).

Solution A:

Let f: X \rightarrow X, g: X \rightarrow X, and h: X \rightarrow X be any three mappings of X into itself. Let x_{0} \in X and let h(x_{0}) = x_{1} \in X. Now, we know that ((fg)h)(x)= (fg)(h(x)). So that now we want to find (fg)(x_{1}) = f(g(x_{1})) = f(x_{2}) assuming g(x_{1}) \in x_{2} \in X.

Now, on the other hand, (f(gh))(x) means (f)((gh)(x)) = f(g(h(x_{0})) = f(g(x_{1}))=f(x_{2}) just as in LHS. QED.

B) Let X be a non-empty set. The identity mapping i_{X} on X is the mapping of X onto itlsef defined by i_{X}(x)=x for every x. Then, i_{X} sends each element of the set X to itself; that is, leaves fixed each element of X. Prove that fi_{X}= i_{X}f=f for any mapping f of X into itself. If f is one-to-one onto, so that its inverse f^{-1} exists, show that ff^{-1} = f^{-1}f=i_{X}. Prove further that f^{-1} is the only mapping of X into itself which has this property; that is, show that if g is a mapping of X into itself such that fg=gf=i_{X}, then g=f^{-1}. Hint: g=gi_{X}=g(ff^{-1})=(gf)f^{-1} = i_{X}f^{-1}=f^{-1}, or

g=i_{X}g=(f^{-1}f)g=f^{-1}(fg) = f^{-1}i_{X}=f^{-1}

B(i) TPT: fi_{X}=i_{X}f=f for any mapping f of X into itself. Proof: Consider x_{0}\in X and such that f(x_{0}) = x_{1}. Hence, fi_{X}(x_{0}) = f(i_{X}(x_{0}))=f(x_{0}) = x_{1} \in X = f(x_{0}). Now, on the other (i_{X}f)(x_{0}) = i_{X}(f(x_{0}))=i_{X}(x_{1})= x_{1} also. QED.

B(ii): Let f be one-to-one onto such that f: X \rightarrow X. Hence, f^{-1} exists. Also, f^{-1}: X \rightarrow X. Let f(x_{0}) = x_{1} \in X. Hence, by definition and because f^{-1} is also one-to-one and onto, f^{-1}(x_{1})=x_{0}. Hence, clearly, ff^{-1} = f^{-1}f=i_{X}. QED.

B(iii) : Prove that such a f^{-1} is unique.

Solution to B(iii): Let there be another function g such that gf=fg=i_{X}. Also, we have ff^{-1}=f^{-1}f=i_{X}. Use the hint ! 🙂 QED.

Question C:

Let X and Y be non-empty sets and f a mapping of X into Y. Prove the following: (i) f is one-to-one if and only if there exists a mapping g of Y into X such that gf=i_{X} (ii) f is onto if and only if there exists a mapping h of Y into X such that fh=i_{X}.

Solution C(i) : Given f: X \rightarrow Y such that if x_{0} \neq x_{1}, then f(x_{0}) \neq f(x_{1}). Now, we want a mapping g: Y \rightarrow X such that gf=i_{X}. Let us construct a g such that g(y_{0})=x_{0} and g(y_{1}) = x_{1}. Then, (gf)(x_{0}) = g(f(x_{0}))=g(y_{0})=x_{0}. Similarly, gf(x_{1})=x_{1}. Here, we have assumed that f(x_{0})=y_{0} and f(x_{1})=y_{1}. So, knowing f to be one-to-one, we can explicitly construct a mapping g such that g: Y \rightarrow X and gf=i_{X}. QED. On the other hand, let it be given that X and Y are two non-empty sets, that f: X \rightarrow Y and g: Y \rightarrow X such that gf=i_{X}. That is, if some x^{'} \in X, then we have (gf)(x^{'}) = i_{X}(x^{'})=x^{'}. That is, g(f(x^{'})) = x^{'}. This forces, if f(x^{'}) = y^{'}, then g(y^{'})=x^{'}. Now, if there be an element x^{''} \neq x^{'}, it will force f(x^{''})=y^{''} \neq x^{'}. Hence, different inputs give rise to different outputs and so f is one-to-one. QED.

Solution C(ii) : Given f: X \rightarrow Y and f is onto, that is, f(X) = Y. We want to show a function h: Y \rightarrow X such that fh=i_{X}. Since f(X)=Y, with X as domain, Y as codomain equal to range, f is one-to-one, onto. So h=f^{-1}. QED.

Question D:

Let X be a non-empty set and f a mapping of X into itself. Show f is one-to-one onto if and only there exists a mapping g of X into itself such that gf=fg=i_{X}. If there exists a mapping g with the property, then there is one and only one such mapping. Why?

Solution D: From previous question, C we are done.

Question E:

Let X be a non-empty set, and let f and g be one-to-one mappings of X onto itself. Show that fg is also a one-to-one mapping of X onto itself and that (fg)^{-1}=g^{-1}f^{-1}.

HW. Hint: consider the action of the functions on each element.

Question F:

Let X and Y be non-empty sets and f a mapping of X into Y. If A and B are, respectively, subsets of X and Y, prove the following:

(a) ff^{-1}(B) \subseteq B, and ff^{-1}(B)=B is true for all B if and only if f is onto.

(b) A \subseteq f^{-1}f(A), and A = f^{-1}f(A) is true for all A if and only if f is one-to-one.

(c) f(A_{1} \bigcap A_{2}) = f(A_{1}) \bigcap f(A_{2}) is true for all A_{1} and A_{2} if and only if f is one-to-one.

(d) f(A)^{'} \subseteq f(A^{'}) is true for all A if and only if f is onto.

(e) If f is onto — so that, f(A)^{'} \subseteq f(A^{'}) is true for all A — then, f(A)^{'}=f(A^{'}) is true for all A if and only if f is also one-to-one.

Solution Fa: Put B=Y, so that f is onto. To prove the other subset relationship, simply reverse the arguments.

Solution Fb:

We need to apply the definition that different inputs give rise to different outputs.


😦 I hope to blog later)


Solution Fc:

It certainly implies that A_{1} \bigcap A_{2} \neq \phi.


(I hope to blog later).

Solution Fd:

Given that f: X \rightarrow Y and A \subseteq X and B \subseteq Y. Then, f(A)^{'} is Y-f(A), that is, Y-B, that is B^{'}, that is, f(B^{'}). Put A=X, then f is onto. Now, f(X)^{'} \subseteq f(X^{'}), that is, Y-f(X) \subseteq f(\phi). This implies Y=f(X). To prove the other subset relationship, simply reverse the arguments. QED.


Nalin Pithwa.

Notes II: Sets and Functions:

Reference I : Topology and Modern Analysis, G F Simmons, Tata McGraw Hill.

III. Functions:

Many kinds of functions occur in topology in a wide variety of situations. In our work, we shall need the full power of the general concept of a function, and since, its modern meaning is much broader and deeper than its elementary meaning, we discuss this concept in considerable detail and develop its main abstract properties.

Let us begin with a brief inspection of some simple examples. Consider the elementary function


of the real variable x. What do we have in mind when we call this a function and say that y is a function of x? In a nutshell, we are drawing attention to the fact that each real number x has a specific real number y linked to it, which can be calculated according to the rule (or law of correspondence) given by the formula. We have here a process which applied to a real number x does something to it (squares it) to produce another number y (the square of x). Similarly,

y =  x^{2}-3x and y = (x^{2}+1)^{-1}

are two other simple functions of the real variable x, and each is given by a rule in the form of an algebraic expression which specifies the exact manner in which the value of y depends on the value of x.

The rules for the functions we have just mentioned are expressed by formulas. In general, this is possible for functions of a very simple kind or for those which are sufficiently important to deserve a special symbol of their own. Consider for instance the function of the real variable x defined as follows: for each real number x, write x as an infinite decimal (using the scheme of decimal expansion in which infinite chains of 9s are avoided — in which for example, 1/4 is represented as 0.250000….rather than by 0.24999….); then, let y be the 59th digit after the decimal point. There is of course no standard formula for this but nevertheless it is a perfectly respectable function whose rule is given by a verbal description. On the other hand, the function y=\sin{x} of the real variable x is so important that its rule, though fully as complicated as the one just defined is assigned the special symbol sin. When discussing functions in general, we work to allow for all sorts of rules and to talk about them all at once, so we simply employ non-committal notations like y=f(x), y=g(z), and so on.

Each of the functions mentioned above is defined for all real numbers x. The example y = \frac{1}{x} shows that this restriction is much too severe, for this function is defined only for non zero values of x. Similarly, y = \log{x} is defined only for positive values of x and y = \arcsin{x} only for values of x in the interval [-1,1]. Whatever our conception of a function may be, it should certainly be broad enough to include examples like these, which are defined only for some values of the real variable x.

In real analysis, the notion of function is introduced in the following way. Let X be any non-empty set of real numbers. We say that a function y=f(x) is defined on X if the rule f associates a definite real number y with each real number x in X. The specific nature of the rule f is totally irrelevant to the concept of a function. The set X is called the domain of the given function, and the set Y of all the values it assumes is called its range. If we speak of complex numbers here instead of real numbers, we have the notion of function as it is used in complex analysis.

This point of view towards functions is actually more general than is needed for aims of analysis, but it isn’t nearly general enough for our purposes. The sets X and Y above were taken to be sets of numbers. If we now remove even this restriction and allow X and Y to be completely arbitrary non-empty sets, then we arrive at the most inclusive concept of a function. By way of illustration, suppose that X is the set of all squares in a plane and that Y is the set of all circles in the same plane. We can define a function y=f(x) by requiring that the rule f associate with each square x that circle y which is inscribed in it. In general, there is no need at all for either X or Y to be a set of numbers. All that is really necessary for a function is two non-empty sets X and Y and a rule f which is meaningful and unambiguous in assigning to each element x in X a specific element y in Y.

With these preliminary descriptive remarks, we now turn to the rather abstract but very precise ideas they are intended to motivate.

A function consists of three objects: two non-empty sets X and Y (which may be equal but need not be) and a rule f which assigns to each element x in X a single fully determined element y in Y. The y which corresponds in this way to a given x is usually written f(x), and is called the image of x under the rule f, or the value of f at the element x. (It is fun to draw some figures here). This notation is supposed to be suggestive of the idea that the rule f takes the element x and does something to it to produce the element y = f(x). The rule f is often called a mapping or transformation or operator to amplify this concept of it. We then think of f as mapping x’s to y’s, or transforming x’s to y’s, or operating on x’s to produce y’s. The set X is called the domain of the function, and the set of all f(x)’s for all x’s in X is called its range. A function whose range consists of just one element is called a constant function.

We often denote by f: X \rightarrow Y the function with rule f, domain X and range contained in Y. This notation is useful because the essential parts of the function are displayed in a manner which emphasizes that it is a composite object, the central thing being the rule or mapping f. You can try drawing a figure depicting a convenient way of picturing this function. (these notes don’t have my diagrams from the reference book) On the left, X and Y are different sets, and on the right they are equal — in which case we usually refer to f as a mapping of X into itself. If it is clear that from the context what the sets X and Y are, or if there is no real need to specify them explicitly, it is common practice to identify the function f: X \rightarrow Y with the rule f, and to speak of f alone as if it were the function under consideration (without mentioning the sets X and Y).

It sometimes happens that two perfectly definite sets X and Y are under discussion and that a mapping of X into Y arises which has no natural symbol attached to it. If there is no necessity to invent a symbol for this mapping and if it is quite clear what the mapping is, it is often convenient to designate it by x \rightarrow y. Accordingly, the function y=x^{2} mentioned in the beginning of this section can be written as x \rightarrow x^{2} or x \rightarrow y where y is understood to be the square of x.

A function f is called an extension of a function g (and g is called a restriction of f) if the domain of f contains the domain of g and f(x) = g(x) for each x in the domain of y.

Most of mathematical analysis, both classical and modern, deals with functions whose values are real numbers or complex numbers. This is also true of those parts of topology which are concerned with the foundations of analysis. If the range of a function consists of real numbers, we call it a real function, similarly, a complex function is one whose range consists of complex numbers. Obviously, every real function is also complex. We lay very heavy emphasis on real and coomplex functions through out our work.

As a matter of usage, we generally prefer to reserve the term function for real or complex functions and to speak of mappings when dealing with functions whose values are not necessarily numbers.

Consider a mapping f: X \rightarrow Y. When we call f a mapping of X into Y, we mean to suggest by this that the elements f(x) — as x varies over all the elements of X — need not fill up Y; but if it definitely does happen that the range of f equals Y, or if we specifically want to assume this, then we call f a mapping of X onto Y. If two different elements in X always have different images under f, then we call f a one=to-one mapping of X into Y. If f: X \rightarrow Y is both onto and one-to-one, then we can define its inverse mapping f^{-1}: X \rightarrow Y as follows: for each y in Y, we find that unique element x in X such that f(x)=y 9 x exists and is unique since f is onto and one-to-one); we then define x to be f^{-1}(y). The equation x = f^{-1}(y) is the result of solving y=f(x) for x in just the same way as x = \log{y} is the result of solving e^{x} for x. Figure 7 illustrates the concept of the inverse of a mapping.

If f is a one-to-one mapping of X onto Y, it will sometimes be convenient to subordinate the conception of f as a mapping sending x’s over to y’s and to emphasize its role as a link between x’s an y’s. Each x has linked to it (or has corresponding to it) exactly one x = f^{-1}(y). When we focus our attention on this aspect of a mapping which is one-to-one onto correspondence between X and Y, and f^{-1} is a one-to-one correspondence between Y and X.

Now, consider an arbitrary mapping f: X \rightarrow Y. The mapping f which sends each element of X over to an element of Y induces the following important set mappings. If A is a subset of X, then the image f(A) is the subset of Y defined by

f^{-1}(B) = \{ x : f(x) \in \hspace{0.1in} B\}

and the second set mappings pull each B back to its corresponding f^{-1}(B). It is often essential for us to know how these set mappings behave with respect to set inclusion and operations on sets. We develop most of their significant features in the following two paragraphs.

The main properties of the first set mapping are:

f(\phi) = \phi

f(X) \subseteq Y

A_{1} \subseteq A_{2} \Longrightarrow f(A_{1}) \subseteq f(A_{2})

f(\bigcup_{i})A_{i} = \bigcup_{i}f(A_{i})

f(\bigcap_{i}A_{i}) \subseteq \bigcup_{i}f(A_{i})….call these relations I.

The reader should convince himself of the truth of these statements. For instance, to prove (i) we would have to prove first that f(\bigcup_{i}A_{i}) is a subset of \bigcup_{i}f(A_{i}), and second that \bigcup_{i}f(A_{i}) is a subset of f(\bigcup_{i}A_{i}). A proof of the first of these set inclusions might run as follows: an element in f(\bigcup_{i}A_{i}) is the image of some element in \bigcup_{i}A_{i}, therefore, it is the image of an element in some A_{i}, therefore it is some f(A_{i}) and so finally it is in \bigcup_{i}f(A_{i}). The irregularities and gaps which the reader will notice in the above statements are essential features of this set mapping. For example, the image of an intersection need not equal the intersection of the images, because two disjoint sets can easily have images which are not disjoint. Furthermore, without special assumpitions (see Problem 6), nothing can be said about the relation between f(A)^{'} and f(A^{'}).

The second set mapping is much better behaved. Its properties are satisfyingly complete, and can be stated as follows:

f^{-1}(\phi) = \phi and f^{-1}(Y) =X;

B_{1} \subseteq B_{2} \Longrightarrow f^{-1}(B_{1}) \subseteq f^{-1}(B_{2})

f^{-1}(\bigcup_{i}B_{i}) \Longrightarrow \bigcup_{i}f^{-1}(B_{i})….(2)

f^{-1}(\bigcap_{i}B_{i}) = \bigcap_{i}f^{-1}(B_{i})….(3)

f^{-1}(B^{'}) = f^{-1}(B)^{'}….(4)

Again, the reader should verify each of these statements for himself.

We discuss one more concept in this section, that of the multiplication or composition of mappings. If y= f(x)=x^{2}+1 and z = g(y) = \sin{y}

then these two functions can be put together to form a single function defined by x = (gf)(x) = g(x^{2}+1)=\sin{(x^{2}+1)}. One of the most important tools of calculus (the chain rule) explains how to differentiate functions of this kind. This manner of multiplying functions together is of basic importance for us as well, and we formulate it in general as follows. We define the product of these mappings, denoted by gf: X \rightarrow Z by (gf)(x) = g(f(x)). In words, an element x in X is taken by f to the element f(x) in Y,and then g maps f(x) to g(f(x)) in Z. Figure 8 is a picture of this process. We observe that the two mappings involved here are not entirely arbitrary, for the set Y which contains the range of the first equals the domain of the second. More generally, the product of two mappings is meaningful whenever the range of the first is contained in the domain of the second. We have regarded f as the first mapping and y as the second, and in forming their product gf, their symbols have gotten turned around. This is a rather unpleasant phenomenon, for which we blame the occasional perversity of mathematical symbols. Perhaps it will help the reader to keep this straight in his mind if he will remember to read the product gf from right to left: first apply f, then g.


To be continued next blog.


Nalin Pithwa

Notes I: Sets and Functions

Reference 1: Topology and Modern Analysis, G F Simmons, Tata McGraw Hill Publication.

There are certain logical difficulties which arise in the foundations of set theory (see Problem 1). We avoid these difficulties by assuming that each discussion in which a number of sets are involved takes place in the context of a single fixed set. This set is called the universal set. It is denoted by U in this section and the next, and every set mentioned is assumed to contain elements of U. In later chapters, there will always be on the one hand, a given space within which we work, and this will serve without further comment as our universal set. It is often convenient to have available in U a set containing no elements whatsoever; we call this the empty set and denote it by the symbol \phi. A set is said to be finite if it is empty or consists of n elements for some positive integer n; otherwise, it is said to be infinite.

The relation \subseteq is called set inclusion. The following are the properties of set inclusion (it is an equivalence relation):

i) A \subseteq A for every A;

ii) A \subseteq B and B \subseteq A imply A=B.

iii) A \subseteq B and B \subseteq C imply A \subseteq C.

It is quite important to observe that (i) and (ii) above can be combined into a single statement that A=B \longrightarrow A \subseteq B and B \subseteq A. This remark contains a useful principle of proof, namely, that the only way to show that two sets are equal apart from merely inspecting them, is to show that each is a subset of the other.


  1. Perhaps, the most famous of the logical difficulties referred to in the theory of sets is Russell’s paradox. To explain what this is, we begin by observing that a set can easily have elements which are themselves sets., e.g., \{ 1, \{ 2,3\}, 4\}. This means the possibility that a set might well contain itself as one of its elements. We call such a set an abnormal set, and any set which does not have itself as an element we call a normal set. Most sets are notmal,, and if we suspect that abnormal sets are in some way undesirable, we might try to to confine our attention to the set N of all normal sets. Someone is now sure to ask, is N itself normal or abnormal? It is evidently one or the other, as it cannot be both. Show that if N is normal, it must be abnormal. Proof 1: Set N is the set of all normal sets. Clearly, it contains itself, so it is abnormal. Part II: Show that if N is abnormal, then it is normal. Part 2 proof: As N is abnormal, it contains itself as an element. Set N is the set of all normal sets, so it does not contain itself. So, it is normal. We see in this way that each of our two alternatives is self-contradictory, and it seems to be the assumption that such a set N exists which has brought us to this impasse. For further discussion of these matters, refer to Fraenkel’s Abstract Set Theory. Russell’s own account of the discovery can also be found.
  2. The symbol we have used for set inclusion is similar to that used for the familiar order relation on the real line: if x and y are real numbers, x \leq y means that y-x is non negative. The order relation on the real line has all the properties mentioned in the text: (a) x \leq x for every x (b) x \leq y and y \leq x means x=y, and (c) x \leq y and y \leq z means x \leq z. It also has an additional property: (d) For any x and y, either x \leq y or y \leq x. Property d says that any two real numbers are comparable with respect to the relation in question, and it leads us to call the order relation on the real line as a total (or linear) order relation. Show by an example that this property is not possessed by set inclusion. It is for this reason that set inclusion is called a partial order relation. Answer: Disjoint sets.
  3. (a) Let U be the single element set \{ 1\}. There are two subsets, the empty set \phi and \{1\} itself. If A and B are arbitrary subsets of U, there are four possible relations of the form A \subseteq B. Count the number of true relations amongst these. Answer: \phi \subset U, a true relation; U \subset \phi, which is not a true relation; U \subseteq U, which is a true relation; \phi \subseteq \phi, which is debatable. (3b) Let U be the set \{ 1,2\}. There are four subsets, namely, \phi, U, \{ 1 \} and \{ 2\}. If A and B are arbitrary subsets of U, there are 16 possible relations of the form A \subseteq B. Count the number of true ones: Answer: \phi \subseteq U, \phi \subseteq \{ 1\}, \phi \subseteq \{ 2 \}, \{ 1 \} \subseteq U, \{ 2\} \subseteq U, \{ 1 \} \subseteq \{ 1 \}, \{ 2 \} \subseteq \{ 2 \}. So, there are 7 such true relations out of 16. (3c) Now, let U be the set \{ 1,2,3\}. There are 8 subsets, namely, \phi, U, \{ 1\}, \{ 2\}, \{ 3 \}, \{1,2\} , \{ 1,3\}, \{ 2,3\}. There are 64 possible relations of the form A \subseteq B. Count the number of true ones. Answer: \phi \subseteq U, \phi \subseteq \{ 1 \}, \phi \subseteq \{ 2 \}, \phi \subseteq \{ 3 \}, \{ 1 \} \subseteq U, \{ 2 \} \subseteq U, \{ 3 \} \subseteq U, \{ 1, 2\} \subseteq U, \{ 1, 3\} \subseteq U, \{ 2,3\} \subseteq U, \{ 1\} \subseteq \{ 1, 2 \}, \{ 1 \} \subseteq \{ 1,3\}, \{ 2 \} \subseteq \{ 1,2 \}, \{ 2\} \subseteq \{ 2,3\}, \{ 3\} \subseteq \{ 1,3\}, \{ 3\} \subseteq \{ 2,3\}. And, also, except \phi and U, each is a subset of itself. Thus, there are such 30 true relations. (3d) Let U be the set \{ 1,2,3,\ldots, n\} for an arbitrary positive integer n. How many subsets are there ? Answer: 2^{n} subsets. How many possible relations of the form A \subseteq B? Answer: 2^{n} \times 2^{n}, that is, 2^{2n} possible pairings. Can you make an informed guess as to how many of these are true? Answer (trial 1): The empty set \phi is paired with each of the remaining 2^{n}-1 so these give 2^{n}-1 pairings; each set which is not U and \phi can be paired with U as a superset, so these give 2^{n}-2 pairings; except \phi and U, each subset is a subset of itself, so these are total 2^{n}-2 possibilities; each two element subset can be paired with 3-element subsets containing the original two elements and one more, and there are 2^{n}-2 such cases; each two element subset can be paired with 4-element subsets out of which two elements are original and two are chosen out of the remaining 2^{n}-2, that is, there are {2^{n}-2} \choose {2} pairings; …continuing in this way, we get the following final answer: 2(n-3)+\frac{(n-4)(n-1)}{2} + 2^{n}\times (n-3) + (2^{n-2}-1)(2^{2^{n}-n+2})

II. The Algebra of Sets:

In this section, we consider several useful ways in which sets can be combined with one another, and we develop the chief properties of these operations of combination.

As we emphasized above, all the sets we mention in this section are assumed to be subsets of our universal set U. U is the frame of reference, or the universe, for our present discussion. In our later work, the frame of reference in a particular context will naturally depend on what ideas we happen to be considering. If we find ourselves studying sets of real numbers, then U is the set R of real numbers. If we wish to study sets of complex numbers, then we take U to be the set C of all complex numbers. We sometimes want to narrow the frame of reference and to consider (for instance) only subsets of the closed unit interval [0,1], or of the closed unit disc \{z : |z|\leq 1 \} and in these cases we choose U accordingly. Generally speaking, the universal set U is at out disposal, and we are free to select it to fit the needs of the moment. For the present, however, U is to be regarded as a fixed but arbitrary set. This generally allows us to apply the ideas we develop below to any situation which arises in out later work.

It is extremely helpful to the imagination to have a geometric picture available in terms of which we can visualize sets and operations on sets. A convenient way to accomplish this is to represent U by a rectangular area in a plane, and the elements which make up U by the points of this area. Sets can then be pictured by areas within this rectangle, and diagrams can be drawn which illustrate operations on sets and relations between them. For instance, if A and B are sets, then Figure 1 represents the circumstance that A is a subset of B (we think of each set as consisting of all points within the corresponding closed curve). Diagrammatic thought of this kind is admittedly loose and imprecise; nevertheless, the reader will find it invaluable. No mathematics, however, abstract it may appear, is ever carried on without the help of mental images of some kind, and these are often nebulous, personal and difficult to describe.

The first operation we discuss in the algebra of sets is that of forming unions. The union of two sets A and B, written A \bigcup B, is defined to be the set of all elements which are in either A or B (including those which are in both). A \bigcup B is formed by lumping together the elements of A and those of B and regarding them as constituting a single set. In Figure 2, A \bigcup B is indicated by the shaded area. The above can also be expressed symbolically:

A \bigcup B = \{ x : x \in A or x \in B\}.

The operation of forming union is commutative and associative:

A \bigcup B = B \bigcup A and A \bigcup (B \bigcup C) = (A \bigcup B) \bigcup C.

It has the following additional properties: A \bigcup A = A, A \bigcup \phi=A, and A \bigcup U = U.

We also note that A \subseteq B \Leftrightarrow A \bigcup B = B so set inclusion can be expressed in terms of this operation.

Our next operation is that of forming intersections. The intersection of two sets A and B, written A \bigcap B is the set of all elements which are in both A and B. In symbols, A \bigcap B = \{ x: x \in A and x \in B\}.

A \bigcap B is the common part of the sets A and B. In Figure 3, A \bigcap B is represented by the shaded area. If A \bigcap B is non-empty, we express this by saying that A intersects B. If, on the other hand, it happens that A and B have no common part, or equivalently, that A \bigcap B = \phi, then we say that A does not intersect B, or that A and B are disjoint and a class of sets in which all pairs of distinct sets are disjoint is called a disjoint class of sets. The operation of forming intersection is also commutative and associative:

A \bigcap B = B \bigcap A and A \bigcap (B \bigcap C) = (A \bigcap B) \bigcap C.

It has the further properties that

A \bigcap A = A, A \bigcap \phi = \phi, and A \bigcap U=A, and since

A \subseteq B \Leftrightarrow A \bigcap B = A,

We see that set inclusion can also be expressed in terms of forming intersections.

We have now defined two of the fundamental operations on sets and we have seen how each is related to set inclusion. The next obvious step is to see how they are related to one another. The facts here are given by the distributive laws:

A \bigcap (B \bigcup C) = (A \bigcap B) \bigcup (A bigcap C) and

A \bigcup (B \bigcap C) = (A \bigcup B) \bigcap (A \bigcup C)

These properties depend only on simple logic applied to the meaning of symbols involved. For instance, the first of the two distributive laws says that and element is in A and B or C precisely when it is in A and B or A and C. We can convince ourselves of the validity of these laws by drawing pictures (Venn Diagrams). The second distributive law is illustrated in Figure 4 where A \bigcup (B \bigcap C is formed on the left by shading and (A \bigcup B) \bigcap (A \bigcup C) on the right by cross shading. A moment’s consideration by the reader ought to convince him that one gets the same sets in both cases.

The last of our major operations on sets is the formation of complements. The complement of a set A, denoted by A^{'} is the set of all elements not in A. Since the only elements we are considering are those which make up U, it goes without saying — but it ought to be said that A^{['} consists of all elements of U which are not in A. Symbolically: A^{'} = \{ x: x \notin A, x \in U\}

Figure 5 in which A^{'} is shaded indicates this operation. The operation of forming complements has the following obvious properties:

(A^{'})^{'} = A

\phi^{'} = U


A \bigcup A^{'} = U

A \bigcap A^{'} = \phi.

Further, it is related to set inclusion by A \subseteq B \Longleftrightarrow B^{'} \subseteq A^{'}, and to the formation of unions and intersections by

(A \bigcup B)^{'} = A^{'} \bigcap B^{'}

(A \bigcap B)^{'} = A^{'} \bigcup B^{'}.

In words, the above laws, called De Morgan’s Laws, can be beautifully related as: The complement of union of two sets is the intersection of the complements; the complement of the intersection of two sets is the union of the complements.

Or, in yet other words, the first De Morgan law says: an element is not in either of the two sets precisely when it is outside of both; and the second De Morgan law can be expressed as: an element is not in both sets precisely when it is outside of one or the other.

The operations of forming unions and intersections are primarily binary operations, that is, each is a process which applies to a pair of sets and yields a third. We have emphasized this by our use of parentheses to indicate the order in which the operations are to be performed, as in (A_{1} \bigcup A_{2})\bigcup A_{3}, where the parentheses direct us first to unite A_{1} and A_{2}, and then to unite the result of this with A_{3}. Associativity makes it possible to dispense with parentheses in an expression like this and to write A_{1} \bigcup A_{2} \bigcup A_{3} where we understand that these sets are to be united in any order and the order in which we do it is irrelevant (that is, all possible different orders yield the same result). Similar remarks apply to A_{1} \bigcap A_{2} \bigcap A_{3}. Furthermore, if \{ A_{1}, A_{2}, \ldots, A_{n}\} is any finite class of sets, then we can form

A_{1} \bigcup A_{2} \bigcup \ldots \bigcup A_{n}

A_{1} \bigcap A_{2} \bigcap \ldots \bigcap A_{n}

in much the same way without any ambiguity of meaning whatever. In order to shorten the notation, we let I = \{1, 2, 3, \ldots, n \} be the set of subscripts which index the sets under consideration. I is called the index set. We then compress the symbols for the union and intersection just mentioned to \bigcup_{i \in I}A_{i} and \bigcap_{i \in I}A_{i}. For the sake of both brevity and clarity, these sets are often written \bigcup_{i=1}^{n}A_{i} and \bigcap_{i=1}^{n}A_{i}.

These extensions of our ideas and notation don[‘t reach nearly far enough. It is often necessary to form unions and intersections of large (really large!) classes of sets. Let \{A_{i} \} be an entirely arbitrary class of sets indexed by a set I of subscripts. Then,

\bigcup_{i \in I}A_{i} = \{x : x \in A \hspace{0.1in} for \hspace{0.1in} at \hspace{0.1in} least \hspace{0.1in} one \hspace{0.1in} i \in I \}.


\bigcap_{i \in I}A_{i} = \{ x: x \in A_{i} \hspace{0.1in} for \hspace{0.1in} each \hspace{0.1in} i \in I\}

define their unions and intersections.

As above, we usually abbreviate these notations in \bigcup_{i}A_{i} and \bigcap_{i}A_{i} and if the class \{ A_{i}\} consists of a sequence of sets, that is, if \{ A_{i}\} = \{A_{1}, A_{2}, \ldots, A_{n} \}, then their unions and intersections are often written in the form \bigcup_{i=1}^{\infty}A_{i} and \bigcap_{i}A_{i}. Observe that we did not require the class A_{i} to be non-empty. If it does happen that the class is empty, then the above definitions give (remembering that all sets under consideration are subsets of U) \bigcup_{i}A_{i} and \bigcap_{i}A_{i}=U. The second of these facts amounts to the following statement: if we require of an element that it belongs to each set of the class, but there be no set present in the class, then every element satisfies this requirement. (We call this as the school of “vacuous logic” in theory of sets.) Note that if we had not made the explicit assumption that all sets under consideration are to be considered as subsets of U, we would not be able to fix a meaning to the intersection of an empty class of sets. A moments reflection makes it clear that the De Morgan equations are valid for arbitrary unions and intersections: (we can write a little proof also; it is easy):

(\bigcup_{i}A_{i})^{'} = \bigcap_{i}A_{i}^{'}

Complement of arbitrary union is intersection of complements.

(\bigcap_{i}A_{i})^{'} = \bigcup_{i}A_{i}^{'}

Complement of arbitrary intersection is union of complements.


It is instructive to verify the above general De Morgan laws for the case in which the class \{ A_{i}\} is empty. We present the answers below:

Consider the first general DeMorgan law: (\bigcup_{i}A_{i})^{'} = \bigcap_{i}A_{i}^{'}. Now, as \{ A_{i} \} = \phi, so \bigcup_{i}A_{i}=\phi and hence, the LHS is (\bigcup_{i}A_{i})^{'}= (\phi)^{'}=U whereas RHS Is \bigcap_{i}A_{i}^{'}= \bigcap_{i}U=U same as LHS. QED.

Now, consider the second generalized De Morgan Law:

(\bigcap_{i}A_{i})^{'} = \bigcup_{i}A_{i}^{'}, which we want to verify when the class of sets \{ A_{i}\} is empty. Consider \bigcap_{i}A_{i}=\phi and so (\bigcap_{i}A_{i})^{'} = (\phi)^{'}=U, which is our LHS. Now let us see RHS: \bigcup_{i}A_{i}^{'} = \bigcup_{i}(\phi)^{'}=\bigcup_{i}U=U which is same as LHS. QED.


We conclude our treatment of the general theory of sets with a brief discussion of certain special classes of sets which are of considerable importance in topology, logic, and measure theory. We usually denote classes of sets by capital letters in boldface.

First some general remarks which will be useful both now and later, especially in connection with topological spaces. We shall often have occasion to speak of finite unions and finite intersections, by which we mean unions and intersections of finite classes of sets, and by a finite class of sets we always mean one which is empty or consists of n sets for some positive integer n. If we say that a class of sets A is closed under the formation of finite unions, we mean that A contains the union of each of its finite subclasses; and since the empty subclass qualifies as a finite subclass of A , we see that its union, the empty set, is necessarily an element of A. In the same way, a class of sets which is closed under the formation of finite intersections necessarily contains the universal set.

Now, for the special classes of sets mentioned above. For the remainder of this section, we specifically assume that the universal set is non-empty. A Boolean algebra of sets is a non-empty class A of subsets of U which has the following properties:

(i) A \hspace{0.1in}and \hspace{0.1in} B \in {\bf {A}} \Longrightarrow A \bigcup B \in \bf{A}

(ii) A \hspace{0.1in} and \hspace{0.1in} B \in {\bf{A}} \Longrightarrow A \bigcap B \in \bf{A}

(iii) A \in {\bf{A}} \Longrightarrow A^{'} \in \bf{A}.

Since A is assumed to be non-empty, it must contain at least one set A. Then, property iii above shows that A^{'} \in {\bf{A}} also, so that A \bigcap A^{'} = \phi \in {\bf{A}} also. Moreover, A^{'} \bigcup A = U \in {\bf{A}} also. Since the class consisting of only the empty set and the Universal set is also clearly a boolean algebra of sets, these two distinct sets are the only ones which every Boolean algebra of sets must contain. It is equally clear that the class of all subsets of U is also a Boolean algebra of sets. There are many other less trivial kinds, and their applications are manifold in fields of study as diverse as electronics and statistics.

Let A be a Boolean algebra of sets. It is obvious that if \{ A_{1}, A_{2}, \ldots, A_{n}\} is a non-empty finite sub-class of A then

A_{1}\bigcup A_{2} \bigcup \ldots \bigcup A_{n} and A_{1}\bigcap A_{2} \bigcap \ldots \bigcap A_{n} are both sets in A; and since A contains both the empty set and the universal set, it is easy to see that A is a class of sets which is closed under the formation of finite unions, finite intersections and complements. We now go in the other direction, and let A be a class of sets which is closed under the formation of finite unions, finite intersections, and complements. By these assumptions, A automatically contains the empty set and the universal set, so it is non-empty and is easily seen to be a Boolean algebra of sets. We conclude from these remarks that Boolean algebras of sets can be described alternatively as classes of sets which are closed under the formation of finite unions, finite intersections, and complements. It should be emphasized once again that when discussing Boolean algebras of sets, we always assume that the universal set is non-empty.

One final comment. We speak of Boolean algebras of sets because there are other kinds of Boolean algebras than those which consist of sets, and we wish to preserve this distinction. We explore this topic further in our Appendix on Boolean algebras.


  1. If \{ A_{i}\} and \{ B_{j}\} are two classes of sets such that \{ A_{i}\} \subseteq \{ B_{j}\}, prove that \bigcup_{i}A_{i} \subseteq \bigcup_{j}B_{j} and \bigcap_{j}B_{j} \subseteq \bigcap_{i}A_{i}.

Answer 1: Recall that a class of sets means a set of sets. By this definition, the following obviously holds: \bigcup_{i}A_{i} \subseteq \bigcup_{j}B_{j}. But, the second claim (I have yet to prove it for myself): \bigcap_{j}B_{j} \subseteq \bigcap_{i}A_{i}.

2. The difference between two sets A and B, denoted by A-B, is the set of all elements in A and not in B, thus A-B=A \bigcap B^{'}. Show that the following holds true:

A-B = A - (A \bigcap B) = (A \bigcup B) -B

(A-B)-C=A-(B \bigcup C)

A - (B-C) = (A-B) \bigcup (A \bigcap C)

(A \bigcup B) - C = (A-C) \bigcup (B-C)

A-(B \bigcup C) = (A-B) \bigcap (A-C)

3. The symmetric difference of two sets A and B, denoted by A \triangle B, is defined by A \triangle B = (A-B) \bigcup (B-A); it is thus the union of their differences in opposite orders. Prove the following:

A \triangle (B \triangle C) = (A \triangle B) \triangle C….that is, symmetric difference is associative

A \triangle \phi = A

A \triangle A = \phi

A \triangle B = B \triangle A…that is, symmetric difference is commutative

Distributive law like relation: prove: A \bigcap (B \triangle C) = (A \bigcap B) \triangle (A \bigcap C)

4. A ring of sets is a non-empty class A of sets such that if A and B are in A, then A \triangle B and A \bigcap B are also in A. Show that A must also contain the empty set, A \bigcup B, and A-B. Prove that if a non empty class of sets contains the union and difference of any pair of its sets, then it is a ring of sets. Prove that a Boolean algebra of sets is a ring of sets.

5. Show that the class of all finite subsets (including the empty set) of an infinite set is a ring of sets but is not a Boolean algebra of sets.

6. Show that the class of all finite unions of closed open intervals on the real line is a ring of sets but is not a Boolean algebra of sets.

7. Assuming that the universal set U is non-empty, show that Boolean algebras of sets can be described rings of sets which contain U.

Let me host the solutions after a while.


Nalin Pithwa

Mathematical Morsels : IV

The words set and space are often used in loose context to one another. A set is merely an amorphous collection of elements, without coherence or form. When some kind of algebraic or geometric structure is imposed on a set, so that its elements are organized into a systematic whole, then it becomes a space.