Topological Spaces and Groups : Part 1: Fast Review


Topological Transformation Groups by Deane Montgomery and Leo Zippin, Dover Publications, Available in Amazon India.

My motivation:

We present preliminary and somewhat elementary facts of general spaces and groups. Proofs are given in considerable detail and there are examples which may be of help to a reader for whom the subject is new.

(The purpose of this blog is basically improvement in my understanding of the subject. If, by sharing, it helps some student/readers, that would be great. ) (Also, I found that reading/solving just one main text like Topology by James Munkres left me craving for more topological food: I feel I must also know how the founding fathers discovered topology)…

1.0 Introduction:

We use standard set-theoretic symbols : capitals A, B etc. for sets, A \bigcup B for the union of sets (elements in one or both), and A \bigcap B for the intersection (elements in both) etc.

1.1 Spaces (Topological Spaces);

The term space is sometimes used in mathematical literature in a very general sense to denote any collection whose individual objects are called points, but in topology the term space is used only when some further structure is specified for the collection. As the term will be used in this book it has a meaning which is convenient in studying topological groups. The definition is as follows:

DEFINITION: A topological space (or more simply space) is a non-empty set of points certain subsets of which are designated as open and where, moreover, these open sets are subject to the following conditions:

  1. The intersection of any finite number of open sets is open.
  2. The union of any number of open sets is open.
  3. The empty set and the whole space are open.
  4. To each pair of distinct points of space there is associated at least one open set which contains one of the points and does not contain the other.

A space is called discrete if each point is an open set.

Condition (4) is known as the T_{0} separation axiomi in the terminology of Paul Alexandroff and Herr Heinz Hopf. The first three conditions define a topological space in their terminology. The designated system of open sets is the essential part of the topology, and the same set of points can become a topological space in many ways by choosing different systems of subsets designated as open.

1.2. Homeomorphisms

DEFINITION. A homeomorphism is a one-to-one relation between all points of one topological space and all points of a second which puts the open sets of the two spaces in one-one correspondence; the spaces are topologically equivalent.

The notion of homeomorphism is reflexive, symmetric, and transitive so that it is an equivalence relation in a given set of topological spaces.

EXAMPLES of topological spaces:

Let E_{1} denote the set of all real numbers in its customary topology: the open intervals are the sets \{ y: x < y < z\} for every x < z. The open sets are those which are unions of open intervals together with the empty set and the whole space.

Let R_{1} \subset E_{1} denote the set of numbers in the closed interval 0 \leq y \leq 1, where for the moment we take this subset without a topology. This set gives distinct topological spaces as follows:

  1. Topologize R_{1} as customarily : the open sets are the intersections of R_{1} with the open sets of E_{1}.
  2. Topologize R_{1} discretely, that is, let every subset be open.
  3. Topologize R_{1} by the choice : open sets are the null set, the whole space and for each x of R_{1} the set \{ z, x < z \leq 1\}
  4. Topologize R_{1} by the choice: the complement of any finite set is open and the empty set and whole space are open

In the sequel R_{1} will denote the closed unit interval and E_{1} the set of all reals in the customary topology. A set homeomorphic to R_{1} is called an arc. A set homeomorphic to a circle is called a simply closed curve.

1.3 Basis

DEFINITION. A collection \{ Q_{a}\} of open sets of a space is called a basis for open sets if every open set (except possibly the null set) in the space can be represented as a union of sets in \{ Q_{a}\}. It is called a sub-basis if every open set can be represented as a union of finite intersection of sets in \{ Q_{a}\} (except possibly the null set).

A collection \{ Q_{a}\} of open sets of a space S is a basis if and only for every open set Q in S and x \in Q, there is a Q_{a} \in \{ Q_{a} \} such that x \in Q_{a} \subset Q.

If a collection has this property at a particular point x then the collection is called a basis at x.

If a set together with certain subsets are called a sub-basis, then another family of subsets is determined from the sub-basis by taking arbitrary unions and finite intersections. This new family (with the null set added if necessary) then satisfies conditions 1, 2, 3 for open sets of a topological space. Whether 4 will also be satisfied depends on the original family of sets.

EXAMPLE. Let E_{1} denote the space of real numbers in its usual topology. For each pair of rationals r_{1} < r_{2} let \{ r_{1}, r_{2} denote the set of reals r_{1}<x<r_{2}. This countable collection of open sets is a basis.

A space is said to be separable or to satisfy the second countability axiom if it has a countable basis. A space is said to satisfy the “first countability axiom” if it has a countable basis at each point.

EXAMPLE. Let S denote a topological space and let F denote a collection of real valued functions f(x), where x \in S. If f_{0} is a particular element of F then for each positive integer n let Q(f_{0},n)=\{ f \in F: |f(x)-f_{0}(x)| < \frac{1}{n}\} for all x. We may topologize F by choosing the sets Q(f_{0},n) for all f_{0} and n as a sub-basis. The topological space so obtained has a countable basis at each point in many important cases.

1.4 Topology of subsets

Let S be a topological space, T a subset. Let Q \bigcap T be called open in T or open relative to T if Q is open in S. With open sets defined in this way T becomes a topological space and the topology so defined in T is called the induced or relative topology. If S has a countable basis and T \subset S, then T has a countable basis in the induced topology.

DEFINITION. A subset X \subset S is called closed if the complement S - X is open. If X \subset T \subset S then X is called closed in T when T - X is open in T.

Notice that T closed in S and X closed in T implies that X is closed in S. The corresponding assertion for relatively open sets is also true.

It can be seen that finite unions and arbitrary intersections of closed sets are again closed.

DEFINITION. If K \subset S, the intersection of all closed subsets of S which contain K is called the closure of K and is denoted by \overline{K}. If K is closed, K = \overline{K}.

15 Continuous maps

Let S and T be topological spaces and f a map of S into T f: S \rightarrow T, that is, for each x in S, y=f(x) is a point of T. If the inverse of each relatively open set in f(S) is an open set in S, then f is called continuous. In case f(S)=T then f is continuous and V open in T imply f^{-1}(V) is open in S. The map is called an open map if it carries open sets to open sets.

If f is a continuous map of S onto T (that is, f(S)=T) and if f^{-1} is also single valued and continuous, then f and f^{-1} is also single valued and continuous, then f and f^{-1} are homeomorphisms and S and T are homeomorphic or topologically equivalent.

EXAMPLE. The map f(t) = \exp{2\pi it} is a continuous and open map of E_{1} onto a circle (circumference) in the complex plane.

EXAMPLE. Let K denote the cylindrical surface, described in x ,y, z coordinates in three-space by x^{2}+y^{2}=1. Let f_{1} denote the map of K onto E_{1} given by (x,y,z) \rightarrow (0,0,z), let f_{2} the map (x,y,z) \rightarrow (x,y, |z|) of K into K. All three maps are continuous, the first two are open and f_{1} and f_{2} are also closed, that is, they map closed sets into closed sets.

1.6 Topological products

The space of n real variables (x_{1}, x_{2}, \ldots, x_{n}) from -\infty < x_{i} < \infty, where i=1,2,\ldots, n and the cylinder K of the preceding example are instances of topological products.

Let A denote any non-empty set of indices and suppose that to each a \in A there is associated a topological space S_{a}. The totality of functions f defined on A such that f(a) \in S_{a} for each a \in A is called the product of the spaces S_{a}. When topologized as below it will be denoted by PROD S_{n}; we also use the standard symbol \times thus E \times B is the set of ordered pairs (a,b) e \in E, b \in B.

The standard topology for this product space is defined as follows: For each positive integer n, for each choice of n indices a_{1}, a_{2}, \ldots, a_{n} and for each choice of a non empty open set in S_{a_{i}}

U_{a_{i}} \subset S_{a_{i}} for i=1,2, \ldots, n

consider the set of functions f \in PROD \hspace{0.1in} S_{n} for which f(a_{i}) \in U_{a_{i}} for i=1,2, \ldots, n

Let the totality of these sets be a sub-basis for the product. The resulting family of open sets satisfies the definition of space in 1.1


The space E_{n}= E_{1} \times E_{1} \times \ldots \times E_{1}, n copies, is the space of n real variables; here A=\{ 1, 2, 3, \ldots, n\} and each S_{i} is homeomorphic to E_{1} (1.2). Let x_{i} \in S_{i}. Then, (x_{1}, x_{2}, \ldots, x_{n}) are the co-ordinates of a point of E_{n}. It can be verified that the sets U_{m}(x) where m \in \mathcal{N}, of points of E_{n}, whose Euclidean distance from x = (x_{1}, x_{2}, \ldots, x_{n}) is less than \frac{1}{m} form a basis at x. The subset R_{1} \times R_{1}\times \ldots \times R_{1} is an n-cell.


Let A be of arbitrary cardinal power and let each S_{n}, a \in A, be homemorphic to C_{1}, the circumference of a circle. Then PROD S_{n} is a generalized torus. If A consists of n objects, the product space is the n-dimensional torus. For n=2, we get the torus.


Let D=S_{1} \times S_{2} \times \ldots \times S_{n} \times \ldots , where n \in \mathcal{N} where each S_{i} is a pair of points — conveniently regarded as the “same” pair, and designated 0 and 2. This is the Cantor Discontinuum, of Cantor Middle Third set. It is homeomorphic to the subset of the unit interval defined by the convergent series: D: \sum { a_{n} / 3^{n}}, where a_{n}=0, 2. This example will be described in another way in the next section.


Let F_{a} be a closed subset of the topological space S_{a}, and a \in A. Then PROD F_{n} is a closed subset of PROD S_{a}.

Proof: The reader is requested to try. It is quite elementary.

1.7 Compactness:

DEFINITION: A topological space S is compact if every collection of open sets whose union covers S contains a finite subcollection whose union covers S.


The unit interval R_{1} Thus let \{ Q\} denote a collection of open sets covering R_{1}. Let F denote the set of points x \in R such that the interval 0 \leq y \leq x can be covered by a finite subcollection of \{ Q \}. Then F is not empty and is both open and closed. Hence, by the Dedekind cut postulate, or the existence of least upper bounds, or the connectedness of R_{1} it follows that F = R_{1} To illustrate the concept of compactness, consider the open sets W_{n} \subset R_{1}, W_{n}: \frac{1}{3n}< x < \frac{1}{n}, n \in \mathcal{N}. This collection does not cover R_{1}. Let W_{n} be the union of two sets: 0 \leq x < a and 1-a < x \leq 1 for some a, 0 < a < 1. No matter how a > 0 is chosen, there is always some finite number of the W_{n} which together with W_{a} covers R_{1}. Of course, R_{1} minus endpoints is not compact and no finite subcollection of the W_{n} in this example will cover it.

THEOREM. Let S be a compact space and let f: S \rightarrow T be a continuous map of S onto a topological space T Then, T is compact.


Let \{ O _{a} be a covering of T by open sets. Since f is continuous, each f^{-1}(O_{a}) is an open set in S. There is a finite covering of S by sets of the collection \{ f^{-1}(O_{a})\} and this gives a corresponding finite covering of T by sets of O_{a}. This completes the proof.

COROLLARY. If f is a continuous map of S into T then f(S) is a compact subset of T.


Let S be a compact space and \{ D_{a}\} a collection of closed subsets such that \bigcap_{a}D_{a} is empty. Then there is some finite set D_{a_{1}}, \ldots, D_{a_{n}} such that \bigcap_{i}D_{a_{i}} is empty.


The complement of \bigcap_{a}D_{a} is \bigcup_{a}(S-D_{a}); if the intersection set is empty. the union covers S. There is a finite set of indices a_{i} such that S \subset \bigcup_{i}(S - D_{a_{i}}) and conequently \bigcup_{i}D_{a_{i}} is empty for the same finite set of indices.


Let D_{n}, n \in \mathcal{N} be a sequence of non empty closed subsets of the compact space S with D_{n+1} \subset D_{n}. Then, \bigcap_{n}D_{n} is not empty.


The Cantor Middle Third Set D

From R_{1}, “delete” the middle third: \frac{1}{3} < x < \frac{2}{3}. Let D_{1} denote the residue: it is a union of two closed intervals. Let D_{2} denote the closed set in D_{1} complementary to the union of the middle third intervals: \frac{1}{9} < x < \frac{2}{9} and \frac{7}{9} < x < \frac{8}{9}. Continuing inductively, define D_{n} \subset D_{n-1} consisting of 2^{n} closed mutually exclusive intervals Let D = \bigcap_{n}. This is homeomorphic to the space of Example 3 of 1.6.


A lower semi-continuous (upper semi-continuous) real-valued function on a compact space has finite glb (greatest lower bound), lub (least upper bound) and always attains these bounds at some points of space.

This follows from the preceding corollary and the fact that the set where f(x) \leq r is closed, for every r (similarly, f(x) \geq r.


A topological space with the property: “every collection of closed subsets with empty set-intersection has a finite subcollection whose set-intersection is empty” is compact.


The proof, like that of the Theorem of 1.7.1 is based on the duality between open and closed sets.


If a point x of a topological space S belongs to an open subset of S whose closure is compact, then S is called locally compact set at x; S i locally compact if it has this property at every point.


A closed subset of a locally compact space is locally compact in the induced topology. Similarly, a closed subset of a compact space is compact. The union of a finite number of compact subsets is compact.

Proof: HW.

A set U in a topological space is called a neighbourhood of a point z if there is an open set O such that z \in O \subset U; z is called an inner point of U. A set F is covered by a collection \{ U_{i}\} if each point of F is an inner point of some set U_{i}.


A space S is called a Hausdorff space if for every x, y \in S, x \neq y, there exist open sets U and V including x and y respectively such that U \bigcap V = \phi where \phi is the empty set; an equivalent property is the existence of a closed neighbourhood of x not meeting y.

HW: Show that a compact subset of a Hausdorff space is closed.

Lemma: Let S be a compact Hausdorff space, let F be a closed set in S, and x a point not in F. Then there is a closed neighbourhood of x, such that W \bigcap F = \Phi

For each y \in F let U_{y} be a neighbourhood of y and W_{y} a neighbourhood of x, such that U_{y} \bigcap W_{y}= \Phi. There is a covering of F by sets U_{y_{1}},\ldots, U_{y_{n}}. Let W_{x} be the intersection of the associated W_{y_{i}} where i = 1, 2, \ldots, n and let W be the closure of W_{x}. The union of the U_{y_{i}} does not meet W. Then W \bigcap F = \Phi which completes the proof.

THEOREM. Let U be a compact Hausdorff space and let F_{n} be a sequence of closed subsets of U. If U is contained in the union of sets F_{n}, then at least one of the sets F_{n} has inner points.

Proof. Take a sequence C_{1} \supset C_{2} \supset \ldots of non empty compact neighbourhoods such that for each n, (\bigcup_{1}^{n}F_{i}) \bigcap C_{n} = \Phi. This leads to \bigcap C_{n}=\Phi a contradiction. QED.

A set is nowhere dense if its closure has no inner points. A space is said to be of the second category if it cannot be expressed as the union of a countable number of nowhere dense subsets. Hence, a compact Hausdorff space is of the second category. Complete metric spaces, to be defined later, are also of the second category.

1.7.4 THEOREM:

Let S be a locally compact space. There exists a compact space S^{*} and a point z in S^{*} such that S^{*}-z is homeomorphic to S.


Let z denote a “new” point, not in S, and let S^{*} denote the set union of S and z. If Y is a subset of S, let Y^{*} \subset S^{*} denote the union of Y and z. We topologize S^{*}. Any open set in S is also open in S^{*}. In addition, if X is a compact subset of S and Y is the complement S-X, then Y^{*} is open in S^{*}. These open sets are taken as a sub-basis for open sets in S^{*}.

Suppose now that we have some covering of S^{*} by a family of open sets. Then z belongs to one of these open sets, say z \in U^{*}. The complement of U^{*} is a compact subset of S . Hence, the complement is covered by a finite subset of the given covering sets, because of the compactness in S. Together with U^{*}, this gives a finite covering of S^{*}.


1.8 Tychonoff Theorem


Let S_{a} where a \in \{ a \}, be compact spaces and let P be the topological product of the S_{a}. Then, P is compact.


Let P = S_{1} \times S_{2} and let F denote a family of open sets of P covering P. For each point 1x_{1} of S_{1}, the closed subset x_{1} \times S_{2} of P is homeomorphic to S_{2} amd is therefore compact. Each point of x_{1} \times S_{2} belongs to a set in F because F is a covering. Because of the way in which a product is topologized, it follows that each point of x_{1} \times S_{2} belongs to some open set U \times V of P such that U \times V is a subset of some set of F. It follows from its compactness that x_{1} \times S_{2} is contained in the union of a finite number of sets

U_{1} \times V_{1}, \ldots, U_{n} \times V_{n}

each of which is a subset of some set of F. Let U^{'} = \bigcap U_{i}. Then, U^{'} \times S_{2} is covered by a finite number of sets of F.

Since x_{1} is an arbitrary point of S_{1} amd S_{1} is compact, there exist a finite number of open sets of S_{1}

U_{1}^{'}, \ldots, U_{m}^{'}

which cover S_{1} and which are such that there is a finite number of sets of F covering U_{1}^{'} \times S_{2} where i=1, \ldots, m. The totality of sets of F thus indicated is a finite number which covers S_{1} \times S_{2}. This completes the proof for the case of two factors. The case for a finite number of factors follows by a simple induction.


Let R_{n} = R_{1} \times R_{1} \times \ldots R_{1}, n factors. Then, R_{n} is compact and it follows that E_{n} = product of n real lines is locally compact.


To consider the general case, let \{ a\} be an arbitrary collection of at least two indices: let S_{a} be compact topological spaces, let P be the topological product, and let F be a collection of open subsets of F covering P. The proof that P is compact is by contradiction. Accordingly, we shall suppose that no finite subcollection of sets of F covers P.

It was shown by Zermelo that it is possible to well-order the set of all subsets of a given set by the use of an axiom-of-choice of appropriate power, namely the cardinal number of the set of all subsets of the given set. A well-ordering of objects permits them to be inspected systematically.

Using such a well-ordering we can enlarge the given family F to a family F^{*} of open sets, where F^{*} has the following properties:

  1. F^{*} is a covering of P by open sets.
  2. No finite subcollection of P by open sets.
  3. If we adjoin to F^{*} any open subset of P not already in F^{*}, then the enlarged collection does contain a finite subcollection which covers P. Of course, it is in (3) that F^{*} has a property not necessarily true of F

Using this enlarged family the proof for the general case becomes similar to the proof for two factors. Let b denote an arbitrary index in \{ a \} which shall be fixed temporarily, and let P_{b} denote the product of all factors S_{a} excepting S_{b}. Then P is homeomorphic to S_{b} \times P_{b}.

Suppose for a moment that to each point x_{b} \times S_{b} there exists an open U_{b} \subset S_{b} containing x_{b} such that

*) U_{b} \times P_{b} \in F^{*}

There must then exist some finite covering of S_{b} by sets U_{b}^{1}, U_{b}^{2}, \ldots, U_{b}^{n} each satisfying *). The producgt P is covered by the union of U_{b}^{i} \times P_{b}, i=1, \ldots, n. This is impossible by the contradiction of F^{*}. Hence in each S_{b} there is at least one x_{b} which does not satisfy the first sentence of this paragraph.

It follows by the axiom of choice that P contains at least one point x = PROD x_{b} such that if U_{b} is an open set in S_{b} and x_{b} \in U_{b} then *) is false. This holds for each coordinate x_{b} of x. This implies for each coordinate x_{b} of x that if x_{b} is in an open set U_{b} of S_{b} then there is a finite collection of sets in F^{*}

O_{b}^{1}, O_{b}^{2}, \ldots, O_{b}^{n_{b}}

which together with U_{b} \times P_{b} forms a covering of P.

The point x belongs to an open set O_{x} \in F^{*}. There is some open set contained in O_{x} which contains x and is of the form

U_{a_{1}} \times U_{a_{2}} \times \ldots U_{a_{n}} \times P_{a_{1}a_{2}a_{3}\ldots a_{n}}

for some finite set of indices a_{i} and where the last set is the product of all S_{n} with the exception of S_{a_{i}}, where i=1,2, \ldots, n. For each a_{i} there exists a finite collection of sets of F^{*} which together with U_{a_{i}} \times P_{a_{i}} covers P, say these sets are

**) O_{a_{i}}^{1}, O_{a_{i}}^{2}, \ldots, O_{a_{i}}^{n_{i}}, where i=1, 2, \ldots, n

Then P is covered by the union of O_{x} and the sets of **). This contradiction completes the proof.



The infinite-dimensional torus described in 1.6 whose “dimension” equals the cardinal power of the set of indices A is compact. It is a commutative group where the addition of two points is carried out by adding the respective coordinates in each factor S_{n} = C_{1} each of these factors being itself a commutative group. The group addition is continuous in the topology and this defines a topological group (1.11) In fact, this is a universal compact commutative topological group (depending on the cardinal power of the group). See for example the following paper: Discrete Abelian groups and their character groups, Ann. of Math., (2) 36 (1935) pp. 71-85.

The principal theorem of this section is due to Tychonoff (see: Uber einen Funktionenraum Math. Ann. III (1935), pp. 762-766). The present proof is dual to a proof given by Bourbaki (see: Topologie generale, Paris, 1942).

1.9 Metric Spaces


A set S of points is called a metric space if to each pair x, y \in S there is associated a non-negative real number d(x,y) the distance from x to y, satisfying

  1. d(x,y)=0 if and only if x=y.
  2. d(x,y)=d(y,x)
  3. d(x,y) + d(y,z) \geq d(x,z) where x,y,z \in S.

The distance function d(x,y) also called the metric induces a topology in S as follows. For each r >0 let S_{r}(x) denote the sphere of radius r, that is, the set of y \in S such that d(x,y) <r. Now let S_{r}(x), for all positive r and all x \in S constitute a basis for open sets. This choice of basis makes S a topological space. A space is called metrizable if a metric can be defined for it which induces in it the desired topology. It is clear that a metric space has a countable basis at each point x, namely S_{r}(x) where r is rational.


If S_{1} and S_{2} are metric spaces then S_{1} \times S_{2} is a metric space in the metric

d((x_{1}, x_{2}),(y_{1}, y_{2})) = max (d(x_{1},y_{1}), d(x_{2},y_{2}))

where x_{1},y_{1} \in S_{1}, and x_{2}, y_{2} \in S_{2}. The topology determined by this metric is the same as the product topology.


The set F of continuous functions defined on a compact space S with values in a metric space M becomes a metric space by defining for f, g \in F

d(f,g) = lub (x \in S) [d_{M}(f(x), g(x))] where d_{M} is the metric in M. See corollary 2, section 1.7.1


The collection of open sets of a compact metric space S has a countable basis.


For each n \in \mathcal{N} there is a covering of the space by a finite number of open sets each of diameter at most \frac{1}{n}. The countable collection of these sets for all n is a basis.


If S is a compact metric space and E_{1} denotes the real line, then S \times E_{1} is a metrizable locally compact space with a countable basis for open sets.

By Example 1 above, the space is metrizable. If \{ U_{m}\} and \{ V_{m}\} are countable bases in S and E_{1} respectively then \{ U_{m} \times V_{m}\} forms a countable base in S \times E_{1}. If E_{1n} = \{ x \in E_{1}, |x| \leq n\} then S \times E_{1n} is a compact subset of S \times E_{1} and any point of the product is interior to S \times E_{1n} for n large enough. This proves the local compactness.


The following is of interest:If (x,y) is a metric for a space M then the following equivalent metric:

(x,y)^{'} = \frac{(x,y)}{1+(x,y)} \leq 1

is a bounded metric. Properties (1) and (2) above are obviously satisfied. For (3) , one uses the fact that the function \frac{t}{1+t} increases with t. Thus

(x,y)^{'}+(y,z)^{'} \geq \frac{(x,y)}{1+(x,y)+(y,z)} + \frac{(y,z)}{1+(y,z)+(x,y)} = \frac{(x,y)+(y,z)}{1+(x,y)+(y,z)} \geq \frac{(x,z)}{1+(x,z)} = (x,z)^{'}

This has the following consequence:


Let M be a space which is the union of a system M_{a} where a \in \{ a\} of open mutually exclusive sets. Suppose each M_{a} of open mutually exclusive sets. Suppose each M_{a} is a metric space and carries a metric d_{a} bounded by 1. Define a function d(x,y) which is equal to 2 if x and y are not in the same M_{a}; otherwise let d agree with the appropriate d_{a}. Then d is a metric for M.

Proof: HW.


A sequence of points x_{n} in a metric space is said to converge to a point x, symbolically x_{n} \rightarrow x if \lim d(x,x_{n}) =0. A sequence of points x_{n} satisfies the Cauchy convergence criterion if when \epsilon >0 is given there is an N such that for m,n >N d(x_{n},x_{m})<\epsilon. A metric space is called complete if every sequence of points satisfying the Cauchy criterion converges to a point of the space. A subset of a space is called dense (everywhere dense) in the space if every point of space is a limit of some sequence of points of the subsets.

To be continued in next blog,


Nalin Pithwa

Is Math really abstract? I N Herstein answers…

Reference: Chapter 1: Abstract Algebra Third Edition, I. N. Herstein, Prentice Hall International Edition:

For many readers/students of pure mathematics, such a book will be their first contact with abstract mathematics. The subject to be discussed is usually called “abstract algebra,” but the difficulties that the reader may encounter are not so much due to the “algebra” part as they are to the “abstract” part.

On seeing some area of abstract mathematics for the first time,be it in analysis, topology, or what not, there seems to be a common reaction for the novice. This can best be described by a feeling of being adrift, of not having something solid to hang on to. This is not too surprising, for while many of the ideas are fundamentally quite simple, they are subtle and seem to elude one’s grasp the first time around. One way to mitigate this feeling of limbo, or asking oneself “What is the point of all this?” is to take the concept at hand and see what it says in particular concrete cases. In other words, the best road to good understanding of the notions introduced is to look at examples. This is true in all of mathematics.

Can one, with a few strokes, quickly describe the essence, purpose, and background for abstract algebra, for example?

We start with some collection of objects S and endow this collection with an algebraic structure by assuming that we can combine, in one or several ways (usually two) elements of this set S to obtain, once more, elements of this set S. These ways of combining elements of S we call operations on S. Then we try to condition or regulate the nature of S by imposing rules on how these operations behave on S. These rules are usually called axioms defining the particular structure on S. These axioms are for us to define, but the choice made comes, historically in mathematics from noticing that there are many concrete mathematical systems that satisfy these rules or axioms. In algebra, we study algebraic objects or structures called groups, rings, fields.

Of course, one could try many sets of axioms to define new structures. What would we require of such a structure? Certainly we would want that the axioms be consistent, that is, that we should not be led to some nonsensical contradiction computing within the framework of the allowable things the axioms permit us to do. But that is not enough. We can easily set up such algebraic structures by imposing a set of rules on a set S that lead to a pathological or weird system. Furthermore, there may be very few examples of something obeying the rules we have laid down.

Time has shown that certain structures defined by “axioms” play an important role in mathematics (and other areas as well) and that certain others are of no interest. The ones we mentioned earlier, namely, groups, rings, fields, and vector spaces have stood the test of time.

A word about the use of “axioms.” In everyday language, “an axiom means a self-evident truth”. But we are not using every day language; we are dealing with mathematics. An axiom is not a universal truth — but one of several rules spelling out a given mathematical structure. The axiom is true in the system we are studying because we forced it to be true by “force” or “our choice” or “hypothesis”. It is a licence, in that particular structure to do certain things.

We return to something we said earlier about the reaction that many students have on their first encounter with this kind of algebra, namely, a lack of feeling that the material is something they can get their teeth into. Do not be discouraged if the initial exposure leaves you in a bit of a fog.Stick with it, try to understand what a given concept says and most importantly, look at particular, concrete examples of the concept under discussion.

Follow the same approach in linear algebra, analysis and topology.

Cheers, cheers, cheers,

Nalin Pithwa

Chapter 1: Topology, Hocking and Young: A theorem due G.D. Birkhoff

Prove that the collection of all topologies on a given set S constitutes a lattice under the partial ordering of finer/coarser topology of a set.

Proof: NB. This theorem has already been proven by eminent mathematician G. D. Birkhoff.

PS: I am making my own attempt here. Your comments are most welcome !

Part 1: TST: Coarser/finer is a partial ordering of topologies.

To prove axiom 1: this order is reflexive.

Let \{ O_{\alpha}\} and \{ R_{\beta} \} be two collections of subsets of a set S, both satisfying the three defining axioms O_{1}, O_{2} and O_{3} of a topology. Let S have two topologies. We will say that the topology \mathcal{T}_{1} determined by \{ O_{\alpha} \} is a finer topology than the topology \mathcal{T}_{2} determined by \{ R_{\beta} \} if every set R_{\beta} is a union of sets O_{\alpha}, that is, each R_{\beta} is open in the \mathcal{T}_{1} topology. We will denote this situation with the symbol \mathcal{T}_{1} \geq \mathcal{T}_{2}.

Axiom 1: reflexive clearly holds because \mathcal{T}_{1} \geq \mathcal{T}_{1} as set inclusion/containment is reflexive.

Axiom 2: antisymmetric also holds true because clearly \mathcal{T}_{1} \geq \mathcal{T}_{2} and \mathcal{T}_{2} \geq \mathcal{T}_{1} together imply that \mathcal{T}_{2} = \mathcal{T}_{1}. In other words, these two topologies are equivalent, or they give rise to same basis.

Axiom 3: Transitivity holds because set inclusion/containment is transitive. If \mathcal{T}_{1} \geq \mathcal{T}_{2} and \mathcal{T}_{2} \geq \mathcal{T}_{3}, then clearly \mathcal{T}_{1} \geq \mathcal{T}_{2} \geq \mathcal{T}_{3}. In other words, there can be a chain of finer/coarser topologies for a given set.

Part 2:

TST: Under this definition of partial ordering of topologies, the partial ordering forms a lattice.

From axiom 3 proof, we know that there can exist a chain of finer/coarser topologies. So supremum and infimum can exist in a partial ordering of topologies. Hence, such a partial ordering forms a lattice.


Kindly let me know your comments and oblige about my above attempt.


Nalin Pithwa

Solutions to Chapter 1: Topology, Hocking and Young

Exercise 1-4: Prove that the collection of all open half planes is a subbasis for the Euclidean topology of the plane.

Proof 1-4:

Note: In the Euclidean plane, we can take as a basis the collection of all interiors of squares.

Note also that a subcollection \mathcal{B} of all open sets of a topological space S is a SUBBASIS of S provided that the collection of all finite intersections of elements of $\mathcal{B}$ is a basis for S.

Clearly, from all open half planes (x > x_{i} and y > y_{i}), we can create a collection of all interior of squares.

Hence, the collection of all finite intersections of all open-half planes satisfies:

Axiom \mathcal{O_{2}}: the intersection of a finite intersection of open half planes is an open set. (interior of a square).

Axiom \mathcal{O_{1}}; (also). The union of any number of finite intersections of all open half planes is also open set (interior of a square).

Axiom O_{3}: (clearly). \phi and S are open.


Exercise 1-5:

Let S be any infinite set. Show that requiring every infinite subset of S to be open imposes the discrete topology on S.

Proof 1-5:

Case: S is countable. We neglect the subcase that a selected subcase is finite. (I am using Prof. Rudin’s definition of countable). The other subcase is that there exists a subset X_{1} \subset S, where X_{1} is countable. Let X_{1} be open. Hence, S-X_{1} is closed. But S-X_{1} is also countable. Hence, S-X_{1} is also open. Hence, there is no limit point. Hence, the topology is discrete, that is, there is no limit point.

Case: S is uncountable. Consider again a proper subset X_{1} \subset S; hence, X_{1} is open by imposition of hypothesis. Hence, S-X_{1} is closed. But, S-X_{1} \neq \phi and not finite also. Hence, S-X_{1} is infinite. Hence, S-X_{1} is open. Hence, there are no limit points. Hence, it is a discrete topology in this case also.


Your comments/observations are welcome !


Nalin Pithwa.

Tutorial Problems I: Topology: Hocking and Young

Reference: Topoology by Hocking and Young, Dover Publications, Inc., NY. Available in Amazon India.

Exercises 1-1:

Show that if S is a set with the discrete topology and f: S \rightarrow T is any transformation of S into a topologized set T, then f is continuous.

Solution 1-1:

Definition A: The set S has a topology (or is topologized) provided that, for every point p in S and every subset X of S, the question : “is p a limit point of X?” can be answered.

Definition B: A topology is said to be a discrete topology when we assume that for no point p in S, and every subset X of S: the answer to the question: “is p a limit point of X?” is NO.

Definition C: A transformation f: S \rightarrow T is continuous provided that if p is a limit point of a subset X of S, then f(p) is a limit point or a point of f(X).

So, the claim is vacuously true. QED.

Exercises 1-2:

A real-valued function y=f(x) defined on an interval [a,b] is continuous provided that if a \leq x_{0} \leq b and \epsilon >0, then there is a number \delta>0 such that if $|x-x_{0}|<\delta$, where x \in [a,b], then |f(x)-f(x_{0})|<\epsilon. Show that this is equivalent to our definition, using definition 1-1.

Solution 1-2:

Definition 1-1: The real number p is a limit point of a set X of real numbers provided that for every positive number \epsilon, there is an element x of the set X such that $0<|p-x|<\epsilon$.

Definition C: A transformation f: S \rightarrow T is continuous provided that if p is a limit point of a subset X of S, then f(p) is a limit point or a point of f(X).

Part 1: Let us assume that given function f is continuous as per definition given just above.

Then, as p is a limit point of X: it means: For any \delta>0, there exists a real number p such that there is an element x \in X such that |p-x|<\delta..

So, also, by definition C, f(p) is a limit point or a point of f(X); this means the following: if f(p) is a point of f(X), there exists some x_{0} \in X such that $f(x_{0} \in f(X)$, and so quite clearly in this case p=x_{0} so that |p-x_{0}|=|x_{0}-x_{0}|<\delta, as \delta is positive.

On the other hand, if f(p) is a limit point of f(X), as per the above definition of continuity, then also for any \epsilon>0, there exists a point y \in f(X) such that |y-f(p)|<\epsilon. So, in this case also the claim is true.

We have proved Part 1. QED.

Now, part II: We assume the definition of continuity given in the problem statement is true. From here, we got to prove definition C as the basic definition given by the authors.

But this is quite obvious as in this case p=x_{0}.

We have proved Part II. QED.

Thus, the two definitions are equivalent.


Nalin Pithwa

Metric space question and solution

Reference: I had blogged this example earlier, but I myself could not fill in the missing gaps at that time. I am trying again with the help of MathWorld Wolfram and of course, the classic, Introductory Real Analysis by Kolmogorov and Fomin, from which it is picked up for my study.

Consider the set C_{[a,b]} of all continuous functions defined on the closed interval [a,b]. Let the distance function (or metric) be defined by the formula:

\rho(x,y) = (\int_{a}^{b}[x(t)-y(t)]^{2}dt)^{1/2} ——– relation I

The resulting metric space will be denoted by C_{[a,b]}^{2}.

The first two properties of a metric space are clearly satisfied by the above function. We need to check for the triangle inequality:

Now I satisfies the triangle inequality because of the following Schwarz’s inequality:

(\int_{a}^{b}x(t)y(t)dt)^{2} \leq \int_{a}^{b}x^{2}(t)dt \int_{a}^{b}y(t)dt —— relation II

In order to get to the above relation II, we need to prove the following:

Prove: (\int_{a}^{b} x(t)y(t)dt)^{2} = \int_{a}^{b}x^{2}(t)dt \int_{a}^{b} y^{2}(t)dt - \frac{1}{2}\int_{a}^{b} \int_{a}^{b}[x(s)y(t)-x(t)y(s)]^{2}dsdt.

From the above, we can deduce Schwarz’s inequality (relation II here in this blog article).

(My own attempts failed to crack it so I had to look at the internet for help. Fortunately, MathWorld Wolfram has given a crisp clear proof…but some parts of the proof are still out of my reach…nevertheless, I am reproducing the proof here for the sake of completeness of my notes…for whatever understanding I can derive at this stage from the proof…):


Weisstein, Eric W. “Schwarz’s Inequality.” From MathWorld–A Wolfram Web Resource.

Schwarz’s Inequality:

Let \Psi_{1}(x), \Psi_{2}(x) be any two real integrable functions in [a,b], then Schwarz’s inequality is given by :

|< \Psi_{1}, \Psi_{2}>|^{2} \leq < \Psi_{1}|\Psi_{2}> <\Psi_{2}|Psi_{1}>

Written out explicity,

|\int_{a}^{b} \Psi_{1}(x), \Psi_{2}(x)|^{2} \leq \int_{a}^{b}[\Psi_{1}(x)]^{2}dx \int_{a}^{b}[\Psi_{2}(x)]^{2}dx

with equality if and only if \Psi_{1}(x) = \alpha \Psi_{2}(x) with \alpha a constant. Schwarz’s inequality is sometimes also called the Cauchy-Schwarz inequality or Buniakowsky’s inequality.

To derive the inequality, let \Psi(x) be a complex function and \lambda a complex constant such that

\Psi(x) \equiv f(x) + \lambda g(x) for some f and g


\int \overline{\Psi} \Psi dx \geq 0, where \overline{z} is the complex conjugate.

\int \overline{\Psi}\Psi dx = \int \overline{f}f dx + \lambda \int \overline{f} g dx + \overline\lambda \int \overline{g} f dx + \lambda \overline{\lambda} \int \overline{g} g dx

with equality when \Psi(x) = 0

Writing this, in compact notation:

<\overline{f},f> + \lambda <\overline{f},g> + \overline{\lambda} <\overline{g},f> + \lambda \overline{\lambda} <\overline{g},g> \geq 0….relation A

Now, define \lambda \equiv - \frac{<\overline{g}, f>}{<\overline{g},g>}….relation B

and \overline{\lambda} = - \frac{<g, \overline{f}>}{<\overline{g}, g>}…relation C

Multiply A by <\overline{g},g> and then plug in B and C to obtain:

<\overline{f}, f><\overline{g}, g> - <\overline{f},g><\overline{g},f> - <\overline{g},f><g, \overline{f}> +<\overline{g}, f><g, \overline{f}> \geq 0

which simplifies to

<\overline{g},f><\overline{f},g> \leq <\overline{f},f><\overline{g},g>


|<f,g>|^{2} \leq <f,f><g,g>. Bessel’s inequality follows from Schwarz’s inequality. QED.


Nalin Pithwa

VI. Countable sets: My notes.


  1. Introduction to Topology and Modern Analysis by G. F. Simmons, Tata McGraw Hill Publications, India.
  2. Introductory Real Analysis — Kolmogorov and Fomin, Dover Publications, N.Y.(to some extent, I have blogged this topic based on this text earlier. Still, it deserves a mention/revision again).
  3. Topology : James Munkres.

The subject of this section and the next — infinite cardinal numbers — lies at the very foundation of modern mathematics. It is a vital instrument in the day to day work of many mathematicians, and we shall make extensive use of it ourselves (in our beginning studying of math ! :-)). This theory, which was created by the German mathematician Georg Cantor, also has great aethetic appeal, for it begins with ideas of extreme simplicity and develops through natural stages into an elaborate and beautiful structure of thought. In the course of our discussion, we shall answer questions which no one before Cantor’s time thought to ask, and we shall ask a question which no one can answer to this day…(perhaps !:-))

Without further ado, we can say that cardinal numbers are those used in counting, such as the positive integers (or natural numbers) 1, 2, 3, …familiar to us all. But, there is much more to the story than this.

The act of counting is undoubtedly one of the oldest of human activities. Men probably learned to count in a crude way at about the same time as they began to develop articulate speech. The earliest men who lived in communities and domesticated animals must have found it necessary to record the number of goats in the village herd by means of a pile of stones or some similar device. If the herd was counted in each night by removing one stone from the pile for each goat accounted for, then stones left over would have indicated strays, and herdsmen would have gone out to search for them. Names for numbers and symbols for them like our 1, 2, 3, …would have been superfluous. The simple yet profound idea of a one-to-one correspondence between the stones and the goats would have fully met the needs of the situation.

In a manner of speaking, we ourselves use the infinite set

N = \{ 1, 2, 3, \ldots\}

of all positive integers as “pile of stones.” We carry this set around with us as part of our intellectual equipment. Whenever we want to count a set, say, a stack of dollar bills, we start through the set N and tally off one bill against each positive integer as we come to it. The last number we reach, corresponding to the last bill, is what we call the number of bills in the stack. If this last number happens to be 10, then “10” is our symbol for the number of bills in the stack, as it also is for the number of our fingers, and for the number of our toes, and for the number of elements which can be put into one-to-one correspondence with the finite set \{ 1,2,3, \ldots, 10\}. Our procedure is slightly more sophisticated than that of the primitive savage man. We have the symbols 1, 2, 3, …for the numbers which arise in counting; we can record them for future use, and communicate them to other people, and manipulate them by the operations of arithmetic. But the underlying idea, that of the one-to-one correspondence, remains the same for us as it probably was for him.

The positive integers are adequate for our purpose of counting any non-empty finite set, and since outside of mathematics all sets appear to of this kind, they suffice for all non-mathematical counting. But in the world of mathematics, we are obliged to consider many infinite sets, such as the set of positive integers itself, the set of all integers, the set of all rational numbers, the set of all real numbers, the set of all points in a plane, and so on. It is often important to be able to count such sets, and it was Cantor’s idea to do this, and to develop a theory of infinite cardinal numbers, by means of one-to-one correspondence.

In comparing the sizes of two sets, the basic concept is that of numerical equivalence as defined in the previous section. We recall that two non-empty sets are numerically equivalent if there exists a one-to-one mapping of a set onto the other, or — and this amounts to the same thing — if there can be found a one-to-one correspondence between them. To say that two non-empty finite sets are numerically equivalent is of course to say that they have the same number of elements in the ordinary sense. If we count one of them, we simply establish a one-to-one correspondence between its elements and a set of positive integers of the form \{1,2,3, \ldots, n \} and we then say that n is the number of elements possessed by both, or the cardinal number of both. The positive integers are the finite cardinal numbers. We encounter many surprises as we follow Cantor and consider numerical equivalences for infinite sets.

The set N = \{ 1,2,3, \ldots\} of all positive integers is obviously “larger” than the set \{ 2,4,6, \ldots\} of all even positive integers, for it contains this set as a proper subset. It appears on the surface that N has “more” elements. But it is very important to avoid jumping to conclusions when dealing with infinite sets, and we must remember that our criterion in these matters is whether there exists a one-to-one correspondence between the sets (not whether one set is a proper subset or not of the other) . As a matter of fact, consider the “pairing”

1,2,3, \ldots, n, \ldots

2,4,6, \ldots, 2n, \ldots

serves to establish a one-to-one correspondence between these sets, in which each positive integer in the upper row is matched with the even positive integer (its double) directly below it, and these two sets must therefore be regarded as having the same number of elements. This is a very remarkable circumstance, for it seems to contradict our intuition and yet is based only on solid common sense. We shall see below, in Problems 6 and 7-4, that every infinite set is numerically equivalent to a proper subset of itself. Since this property is clearly not possessed by any finite set, some writers even use it as the definition of an infinite set.

In much the same way as above, we can show that N is numerically equivalent to the set of all even integers:

1, 2, 3,4, 5,6, 7, \ldots

0,2,-2,4,-4,4,6,-6, \ldots

Here, our device is start with zero and follow each even positive integer as we come to it by its negative. Similarly, N is numerically equivalent to the set of all integers:

1,2,3,4,5,6,7, \ldots

0,1,-1, 2, -2, 3, -3, \ldots

It is of considerable interest historical interest to note that Galileo had observed in the early seventeenth century that there are precisely as many perfect squares (1,4,9,16,25, etc.) among the positive integers as there are positive integers altogether. This is clear from the “pairing”:

1,2,3,4,5, \ldots

1^{2}, 2^{2}, 3^{2}, 4^{2}, 5^{2}, \ldots

It struck him as very strange that this should be true, considering how sparsely strewn the squares are among all the positive integers. But, the time appears not to have been ripe for the exploration of this phenomenon, or perhaps he had other things on his mind; in any case, he did not follow up his idea.

These examples should make it clear that all that is really necessary in showing that an infinite set X is numerically equivalent to N is that we be able to list the elements of X, with a first, a second, a third, and so on, in such a way that it is completed exhausted by this counting off of its elements. It is for this reason that any infinite set which is numerically equivalent to N is said to be countably infinite. (Prof. Walter Rudin also, in his classic on mathematical analysis, considers a countable set to be either finite or countably infinite. ) We say that a set is countable it is non-empty and finite (in which case it can obviously be counted) or if it is countably infinite.

One of Cantor’s earliest discoveries in his study of infinite sets was that the set of all positive rational numbers (which is very large : it contains N and a great many other numbers besides) is actually countable. We cannot list the positive rational numbers in order of size, as we can the positive integers, beginning with the smallest, then the next smallest, and so on, for there is no smallest, and between any two there are infinitely many others. We must find some other way of counting them, and following Cantor, we arrange them not not in order of size, but according to the size of the sum of numerator and denominator. We begin with all positive rationals whose numerator and denominator sum add up to 2: there is only one \frac{1}{1}=1. Next, we list (with increasing numerators) all those for which this sum is 3: \frac{1}{2}, \frac{2}{1}=2. Next, all those for which the sum is 4: \frac{1}{3}, \frac{2}{2}=1, \frac{3}{1}=3. Next, all those for which this sum is 5: \frac{1}{4}, \frac{2}{3}, \frac{3}{2}, \frac{4}{1}=4. Next, all those for which this sum is 6: \frac{1}{5}, \frac{2}{4}, \frac{3}{3}=1, \frac{4}{2}=2, \frac{5}{1}=5. And, so on. If we now list all these together from the beginning, omitting those already listed when we come to them, we get a sequence like:

1, 1/2, 2, 1/3, 1/4, 2/3, 3/2, 4, 1/5, 5, \ldots

which contains each positive rational number once and only once. (There is a nice schematic representation of this : Cantor’s diagonalization process; please google it). In this figure, the first row contains all positive rationals with numerator 1, the second all with numerator 2, and so on. Our listing amounts to traversing this array of numbers in a zig-zag manner (again, please google), where of course, all those numbers already encountered are left out.

It is high time that we christened the infinite cardinal number we have been discussing, and for this purpose, we use the first letter of the Hebrew alphabet, \bf{aleph_{0}}. We say that \aleph_{0} is the number of elements in any countably infinite set. Our complete list of cardinal numbers so far is

1, 2, 3, \ldots, \aleph_{0}.

We expand this list in the next section.

Suppose now that m and n are two cardinal numbers (finite or infinite). The statement that m is less than n (written m < n) is defined to mean the following: if X and Y are sets with m and n elements, then (i) there exists a one-to-one mapping of X into Y, and (ii) there does not exist a mapping of X onto Y. Using this concept, it is easy to relate our cardinal numbers to one another by means of:

1<2<3< \ldots < \aleph_{0}.

With respect to the finite cardinal numbers, this ordering corresponds to their usual ordering as real numbers.


Nalin Pithwa

V. Partitions and Equivalence Relations: My Notes


  1. Topology and Modern Analysis, G F Simmons, Tata McGraw Hill Publications, India.
  2. Toplcs in Algebra, I N Herstein.
  3. Abstract Algebra, Dummit and Foote.
  4. Topology by James Munkres.

In the first part of this section, we consider a non-empty set X, and we study decompositions of X into non-empty subsets which fill it out and have no elements in common with one another. We give special attention to the tools (equivalence relation) which are normally used to generate such decompositions.

A partition of X is a disjoint class \{ X_{i} \} of non-empty subsets of X whose union if the full set X itself. The X_{i}‘s are called the partition sets. Expressed somewhat differently, a partition of X is the result of splitting it, or subdividing it, into non-empty subsets in such a way that each element of X belongs to one and only one of the given subsets. ]

If X is the set \{1,2,3,4,5 \}, then \{1,3,5 \}, \{2,4 \} and \{1,2,3 \} and \{ 4,5\} are two different partitions of X. If X is the set \Re of all real numbers, then we can partition \Re into the set of all rationals and the set of all irrationals, or into the infinitely many closed open intervals of the form [n, n+1) where n is an integer. If X is the set of all points in the coordinate plane, then we can partition X in such a way that each partition set consists of all points with the same x coordinate (vertical lines), or so that each partition set consists of all points with the same y coordinate (horizontal lines).

Other partitions of each of these sets will readily occur to the reader. In general, there are many different ways in which any given set can be partitioned. These manufactored examples are admittedly rather uninspiring and serve only to make our ideas more concrete. Later in this section we consider some others which are more germane to our present purposes.

A binary relation in the set X is a mathematical symbol or verbal phrase, which we denote by R in this paragraph, such that for each ordered pair (x,y) of elements of X the statement x \hspace{0.1in} R \hspace{0.1in} y is meaningful, in the sense that it can be classified definitely as true or false. For such a binary relation, x \hspace{0.1in} R \hspace{0.1in}y symbolizes the assertion that x is related by R to y, and x \not {R} \hspace{0.1in}y the negation of this, namely, the assertion that x is not related by R to y. Many examples of binary relations can be given, some familiar and others less so, some mathematical and others not. For instance, if X is the set of all integers and R is interpreted to mean “is less than,” which of course is usually denoted by the symbol <, then we clearly have 6<7 and 5 \not < 2. We have been speaking of binary relations, which are so named because they apply only to ordered pairs of elements, rather than to ordered triples, etc. In our work, we drop the qualifying adjective and speak simply of a relation in X, since we shall have occasion to consider only relations of this kind. {NB: Some writers prefer to regard a relation R in X as a subset R of X \times X. From this point of view, x R y and x \not {R} y are simply equivalent ways of writing (x,y) \in R and (x,y) \notin R. This definition has the advantage of being more tangible than our definition, and the disadvantage that few people really think of a relation in this way.” )

We now assume that a partition of our non-empty set X is given, and we associate with this partition a relation on X. This relation is defined to be in the following way: we say that x is equivalent to y and write this as x \sim y (the symbol \sim is pronounced “wiggle”.), if x and y belong to the same partition set. It is obvious that the relation \sim has the following properties:

a) x \sim x for every x (reflexivity)

b) x \sim y \Longrightarrow y \sim x (symmetry)

c) x \sim y \hspace{0.1in} and \hspace{0.1in} y \sim z \Longrightarrow x \sim z (transitivity)

This particular relation in X arose in a special way, in connection with a given partition of X, and its properties are immediate consequences of the definition. Any relation whatever in X which possesses these three properties is called an equivalence relation in X.

We have just seen that each partition of X has associated with it a natural equivalence relation in X. We now reverse the situation and prove that a given equivalence relation in X determines a natural partition of X.

Let \sim be an equivalence relation in X; that is, assume that it is reflexive, symmetric, and transitive in the sense described above. If x is an element of X, the subset of X defined by [x] = \{ y: y \sim x\} is called the equivalence set of x. The equivalence set of x is thus the set of all elements which are equivalent to x. We show that the class of all distinct equivalence sets forms a partition of X. By reflexivity, x \in [x] for each element x in X, so each equivalence set is non-empty and their union is X. It remains to be shown that any two equivalence sets [x_{1}] and [x_{2}] are either disjoint or identical. We prove this by showing that if [x_{1}] and [x_{2}] are not disjoint, then they must be identical. Suppose that [x_{1}] and [x_{2}] are not disjoint, that is, suppose that they have a common element z. Since x belongs to both equivalence sets, z \sim x_{1} and z \sim x_{2}, and by symmetry x_{1} \sim z. Let y be any element of x_{1}, so that y \sim x_{1}. Since y \sim x_{1} and x_{1} \sim z, transitivity shows that y \sim z. By another application of transitivity, y \sim z and z \sim x_{2}, imply that y \sim x_{2} so that y is in [x_{2}]. Since y was arbitrarily chosen in [x_{1}], we see by this that [x_{1}] \subseteq [x_{2}]. The same reasoning shows that [x_{2}] \subseteq [x_{1}] and from this we conclude that [x_{1}] = [x_{2}].

The above discussion demonstrates that there is no real distinction (other than a difference in language) between partitions of a set and equivalence relation by regarding elements as equivalent if they belong to the same partition set, and if we start with an equivalence relation, we get a partition by grouping together into subsets all elements which are equivalent to one another. We have here a single mathematical idea, which we have been considering from two different points of view, and the approach we choose in any particular application depends entirely on our own convenience. In practice, it is almost invariably the case that we use equivalence relations (which are usually easy to define) to obtain partitions (which are sometimes difficult to describe fully).

We now turn to several of the more important simple examples of equivalence relations.

Let I be the set of integers. If a and b are elements of this set, we write a = b (and say that a equals b) if a and b are the same integer. Thus, 2+3=5 means that the expression on the left and right are simply different ways of writing the same integer. It is apparent that = used in this sense is an equivalence relation in the set I:

i) a=a for every a

ii) a=b \Longrightarrow b=a

iii) a=b \hspace{0.1in} b=c \Longrightarrow a=c.

Clearly, each equivalence set consists of precisely one integer.

Another familiar example is this relation of equality commonly used for fractions. We remind the reader that, strictly speaking, a fraction is merely a symbol of the form a/b, where a and b are integers and b is not zero. The fractions 2/3 and 4/6 are obviously not identical, but nevertheless we consider them to be equal. In general, we say that two fractions a/b and c/d are equal, written \frac{a}{b} = \frac{c}{d}, if ad and bc are equal as integers in the usual sense (see the paragraph above). (HW quiz: show this is an equivalence relation on the set of fractions). An equivalence set of fractions is what we call a rational number. Every day usage ignores the distinction between fractions and rational numbers, but it is important to recognize that from the strict point of view it is the rational numbers (and not the fractions) which form part of the real number system.

Our final example has a deeper significance, for it provides us with the basic tool for our work of the next two sections.

For the remainder of all this section, we consider a relation between pairs of non-empty sets, and each set mentioned (whether we say so explicitly or not) is assumed to be non-empty. If X and Y are two sets, we say that X is numerically equivalent to Y if there exists a one-to-one correspondence between X and Y, that is, if there exists a one-to-one mapping of X onto Y. This relation is reflexive, since the identity mapping i_{X}: X \rightarrow X is one-to-one onto; it is symmetric since if f: X \rightarrow Y is one-to-one onto, then its inverse mapping f^{-1}: Y \rightarrow X is also one-to-one onto; and it is transitive, since if f: X \rightarrow Y and g: Y \rightarrow Z are one-to-one onto, then gf: X \rightarrow Z is also one-to-one onto. Numerical equivalence has all the properties of an equivalence relation, and if we consider it as an equivalence relation in the class of all non-empty subsets of some universal set U, it groups together into equivalence sets all those subsets of U which have the “same number of elements.” After we state and prove the following very useful but rather technical theorem, we shall continue in Sections 6 and 7 with an exploration of the implications of these ideas.

The theorem we have in mind — the Schroeder-Bernstein theorem: If X and Y are two sets each of which is numerically equivalent to a subset of the other, then all of X is numerically equivalent to all of Y. There are several proofs of this classic theorem, some of which are quite difficult. The very elegant proof we give is essentially due to Birkhoff and MacLane.


Assume that f: X \rightarrow Y is a one-to-one mapping of X into Y, and that g: Y \rightarrow X is a one-to-one mapping of Y into X. We want to produce a mapping F: X \rightarrow Y which is one-to-one onto. We may assume that neither f nor g is onto, since if f is, we can define F to f, and if g is, we can define F to be g^{-1}. Since both f and g are one-to-one, it is permissible to use the mappings f^{-1} and g^{-1} as long as we keep in mind that f^{-1} is defined only on f(X) and g^{-1} is defined only on g(Y). We obtain the mapping F by splitting both X and Y into subsets which we characterize in terms of the ancestry of their elements. Let x be an element of X. We apply g^{-1} (if we can) to get the element g^{-1}(x) in Y. If g^{-1}(x) exists, we call it the first ancestor of x. The element x itself we call the zeroth ancestor of x. We now apply f^{-1} to g^{-1}(x) if we can, and if (f^{-1}g^{-1})(x) exists, we call it the second ancestor of x. We now apply g^{-1} to (f^{-1}g^{-1})(x) if we can, and if (g^{-1}f^{-1}g^{-1})(x) exists, we call it the third ancestor of x. As we continue this process of tracing back the ancestry of x, it becomes apparent that there are three possibilities — (1) x has infinitely many ancestors. We denote by X_{i}, the subset of X, which consists of all elements with infinitely many ancestors (2) x has an even number of ancestors, this means that x has a last ancestor (that is, one which itself has no first ancestor) in X. We denote by X_{e} the subset of X consisting of all elements with an even number of ancestors. (3) x has an odd number of ancestors; this means that x has a last ancestor in Y. We denote by X_{o} the subset of X which consists of all elements with an odd number of ancestors. The three sets X_{i}, X_{e}, X_{o} form a disjoint class whose union is X. We decompose Y in just the same way into three subsets Y_{i}, Y_{e}, Y_{o}. It is easy to see that f maps X_{i} onto Y_{i}, and X_{e} onto Y_{e}, and that g^{-1} maps X_{o} onto Y_{o}, and we complete the proof by defining F in the following piecemeal manner:

F(x) = f(x) if x \in X_{i} \bigcup X_{e}

and F(x) = g^{-1}(x) if x \in X_{o}.


The Schroeder Bernstein theorem has great theoretical and practical significance. It main value for us lies in its role as a tool by means of which we can prove numerical equivalence with a minimum of effort for many specific sets. We put it to work in Section 7.


Nalin Pithwa

IV. Product of Sets: Exercises

Reference: Topology and Modern Analysis, G F Simmons, Tata McGraw Hill Publications, India.


I) The graph of a mapping f: X \rightarrow Y is a subset of the product X \times Y. What properties characterize the graphs of mappings among all subsets of X \times Y?

Solution I: composition.

II) Let X and Y be non-empty sets. If A_{1} and A_{2} are subsets of X, and Y_{1} and Y_{2} are subsets of Y, then prove the following

(a) (A_{1} \times B_{1}) \bigcap (A_{2} \times B_{2}) = (A_{1} \bigcap A_{2}) \times (B_{1} \bigcap B_{2})

(b) (A_{1} \times B_{1}) - (A_{2} \times B_{2}) = (A_{1}-A_{2}) \times (B_{1}-B_{2}) \bigcup (A_{1} \bigcap A_{2}) \times (B_{1}-B_{2}) \bigcup (A_{1}-A_{2}) \times (B_{1} \bigcap B_{2})

Solution IIa:

TPT: (A_{1} \times B_{1}) \times (A_{2} \times B_{2}) = (A_{1} \bigcap A_{2}) \times (B_{1} \bigcap B_{2})

This is by definition of product and the fact that the co-ordinates are ordered and the fact that A_{1} \subseteq X, A_{2} \subseteq X, B_{1} \subseteq Y, and B_{2} \subseteq Y.

Solution IIb:

Let (a_{i}, b_{j}) \in A_{1} \times B_{1}, but (a_{i}, b_{j}) \notin A_{2} \times B_{2}. So, the element may belong to (A_{1}-A_{2}) \times (B_{1} - B_{2}) or it could happen that it belongs to A_{1} \times A_{1}, but to (B_{1}-B_{2}) (here we need to remember that the element is ordered); so, also it could be the other way: it could belong to (A_{1}-A_{2}) but to (B_{1} \bigcap B_{2}) also. The same arguments applied in reverse establish the other subset inequality. Hence, done.

III) Let X and Y be non-empty sets, and let A and B be rings of subsets of X and Y, respectively. Show that the class of all finite unions of sets of the form A \times B with A \in \bf{A} and B \in \bf{B} is a ring of subsets of X \times Y.

Solution III:

\bigcup_{i=1}^{n}A_{i} \times B_{i} = \bigcup_{i=1}^{n}A_{i} \times \bigcup_{i=1}^{n}B_{i}.

From question IIb above, the difference of any two pairs of sets is also in X \times Y.

Hence, done.


Nalin Pithwa