Group theory: very, very brief historical origins

Reference: Abstract Algebra, Second Edition, Dummit and Foote. John Wiley and Sons Inc.

The modern treatment of abstract algebra begins with the disarmingly simple abstract definition of a group. This simple definition quickly leads to difficult questions involving the structure of such objects. There are many specific examples of groups and the power of the abstract point of view becomes apparent when results for all of these examples are obtained by proving a single result for the abstract group.

The notion of a group did not simply spring into existence, however, but is rather the culmination of a long period of mathematical investigation, with the first formal definition of an abstract group in the form in which we use it appearing in 1882. The definition of an abstract group has its origins in extremely old problems in algebraic equations, number theory, and geometry, and arose because very similar techniques were found to be applicable in a variety of situations. As Otto Hölder (1859-1937) observed, one of the essential characteristics of mathematics is that after applying a certain algorithm or method of proof one then considers the scope and limits of the method. As a result, properties possessed by a number of interesting objects are frequently abstracted and the question raised: can one determine all the objects possessing these properties? Attempting to answer such a question also frequently adds considerable understanding of the original objects under consideration. It is in this fashion that the definition of an abstract group evolved into what is, for us, the starting point of abstract algebra.

We illustrate with a few of the disparate situations in which the ideas later formalized into the notion of an abstract group were used.

  1. In number theory the very object of study, the set of integers, is an example of a group. Consider, for example, what we refer to as “Euler’s Theorem”, one extremely simple example of which is that a^{40} has last two digits 01 if a is any integer divisible by neither 2 nor 5. This was proved in 1761 by Leonhard Euler (1707-1783) using “group-theoretic” ideas of Joseph Louis Lagrange (1736-1813), long before the first formal definition of a group. From our perspective, one now proves “Lagrange’s Theorem”, applying these techniques abstracted to an arbitrary group, and then recovers Euler’s theorem (and many others) as a special case.
  2. Investigations into the question of rational solutions to algebraic equations of the form y^{2}=x^{3}-2x (there are infinitely many, for example (0,0), (-1,1), (2,2), (\frac{9}{4}, \frac{-21}{8}), (\frac{-1}{169}, \frac{239}{2197})) showed that connecting any two solutions by a straight line and computing the intersection of this line with the curve y^{2}=x^{3}-2x produces another solution. Such “Diophantine equations,” among others, were considered by Pierre de Fermat (1601-1655) (this one was solved by him in 1644), by Euler, by Lagrange around 1777, and others. In 1730 Euler raised the question of determining the indefinite integral \int \frac{dx}{\sqrt{1-x^{4}}} of the “lemniscatic differential” \frac{dx}{\sqrt{1-x^{4}}}, used in determining the arc length along an ellipse (the question had also been considered by Gottfried Wilhelm Leibniz (1646-1716) and Johann Bernoulli (1667-1748)). In 1752 Euler proved a “multiplication formula” for such elliptic integrals (using ideas of G. C. di Fagnano (1682-1766), received by Euler in 1751), which shows how two elliptic integrals give rise to a third, bringing into existence the theory of elliptic functions in analysis. In 1834 Carl Gustav Jacob Jacobi (1804-1851) observed that the work of Euler on solving certain Diophantine equations amounted to writing the multiplication formula for certain elliptic integrals. Today the curve above is referred to as an “elliptic curve” and these questions are viewed as two different aspects of the same thing: the fact that this geometric operation on points can be used to give the set of points on an elliptic curve the structure of a group. The study of the “arithmetic” of these groups is an active area of current research.
  3. By 1824 it was known that there are formulas giving the roots of quadratic, cubic and quartic equations (extending the familiar quadratic formula for the roots of ax^{2}+bx+c=0). In 1824, however, Niels Henrik Abel (1802-1829) proved that such a formula for the roots of a quintic is impossible. The proof is based on the idea of examining what happens when the roots are permuted amongst themselves (for example, interchanging two of the roots). The collection of such permutations has the structure of a group (called, naturally enough, a “permutation group”). This idea culminated in the beautiful work of Evariste Galois (1811-1832) in 1830-32, working with explicit groups of “substitutions.” Today this work is referred to as Galois Theory. Similar explicit groups were being used in geometry as collections of geometric transformations (translations, reflections, etc.) by Arthur Cayley (1821-1895) around 1851, Camille Jordan (1838-1922) around 1867, Felix Klein (1849-1925) around 1870, etc., and the application of groups to geometry is still extremely active in current research into the structure of 3-space, 4-space, etc. The same group arising in the study of the solvability of the quintic arises in the study of the rigid motions of an icosahedron in geometry and in the study of elliptic functions in analysis.
  4. The precursors of today’s groups can be traced back many years, even before the groups of “substitutions” of Galois. The formal definition of an abstract group which is our starting point appeared in 1882 in the work of Walter Dyck (1856-1934), an assistant to Felix Klein, and also in the work of Heinrich Weber (1842-1913) in the same year. It is frequently the case in mathematical research that an idea finds specific application before being extracted and presented as an item of interest in its own right (for example, Galois used the notion of a quotient group implicitly in his investigations in 1830, and the definition of an abstract quotient group is due to Hölder in 1889). It is important to realize, with or without the historical context, that the reason the abstract definitions are made is because it is useful to isolate specific characteristics and consider what structure is imposed on an object having these characteristics. The notion of the structure of an algebraic object (which is made more precise by the concept of an isomorphism, which considers when two apparently different objects are in some sense the same) is a major theme that recurs in algebra.
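The instance of Euler’s Theorem mentioned in item 1 is easy to check by machine. Here is a minimal sketch in Python (the range tested is my own choice; the claim itself is from the text): a^{40} \equiv 1 \pmod{100} whenever a is divisible by neither 2 nor 5.

```python
from math import gcd

# Euler's theorem, specialized: phi(100) = 40, so a^40 ≡ 1 (mod 100)
# whenever a is coprime to 100, i.e. divisible by neither 2 nor 5.
assert all(pow(a, 40, 100) == 1 for a in range(1, 1000) if gcd(a, 10) == 1)
```

The three-argument built-in pow performs modular exponentiation, so even large powers are cheap.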
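The chord construction in item 2 can also be made concrete. The sketch below (function names are my own; exact rational arithmetic comes from Python's fractions module) draws the line through two known solutions of y^{2} = x^{3} - 2x and recovers the third intersection point, which is again a rational solution.

```python
from fractions import Fraction as F

def on_curve(x, y):
    # the curve y^2 = x^3 - 2x from the text
    return y * y == x**3 - 2 * x

def chord_third_point(p, q):
    # Substituting the chord y = y1 + m(x - x1) into y^2 = x^3 - 2x
    # gives a cubic in x whose three roots sum to m^2, so the third
    # intersection has x3 = m^2 - x1 - x2.
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)
    x3 = m * m - x1 - x2
    y3 = y1 + m * (x3 - x1)
    return (x3, y3)

p, q = (F(-1), F(1)), (F(2), F(2))
r = chord_third_point(p, q)  # a new rational solution: (-8/9, 28/27)
assert on_curve(*r)
```

Negating the y-coordinate of the third intersection point is what turns this construction into the group law on the elliptic curve.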

Hope you enjoyed!

Cheers,

Nalin Pithwa

Generalized associative law, gen comm law etc.

Reference : Algebra by Hungerford, Springer Verlag, GTM.

Let G be a semigroup. Given a_{1}, a_{2}, a_{3}, \ldots, a_{n} \in G, with n \geq 3, it is intuitively plausible that there are many ways of inserting parentheses in the expression a_{1}a_{2}\ldots a_{n} so as to yield a “meaningful” product in G of these n elements in this order. Furthermore, it is plausible that any two such products can be proved equal by repeated use of the associative law. A necessary prerequisite for further study of groups and rings is a precise statement and proof of these conjectures and related ones.

Given any sequence (a_{1}, a_{2}, \ldots, a_{n}) of elements of a semigroup G, define inductively a meaningful product of a_{1}, a_{2}, \ldots, a_{n} (in this order) as follows: If n=1, the only meaningful product is a_{1}. If n>1, then a meaningful product is defined to be any product of the form (a_{1}\ldots a_{m})(a_{m+1}\ldots a_{n}) where m<n and (a_{1}\ldots a_{m}) and (a_{m+1} \ldots a_{n}) are meaningful products of m and n-m elements respectively. (To show that this statement is well-defined requires a version of the Recursion Theorem.) Note that for each n \geq 3 there may be many meaningful products of a_{1}, a_{2}, \ldots, a_{n}. For each n \in \mathcal{N}^{*} we single out a particular meaningful product by defining inductively the standard n product \prod_{i=1}^{n}a_{i} of a_{1}, \ldots, a_{n} as follows:

\prod_{i=1}^{1}a_{i}=a_{1}, and for n>1, \prod_{i=1}^{n}a_{i}=(\prod_{i=1}^{n-1}a_{i})a_{n}

The fact that this definition defines for each n \in \mathcal{N}^{*} a unique element of G (which is clearly a meaningful product) is a consequence of the Recursion Theorem.

Theorem: Generalized Associative Law:

If G is a semigroup and a_{1}, a_{2}, \ldots, a_{n} \in G, then any two meaningful products of a_{1}, a_{2}, \ldots, a_{n} in this order are equal.

Proof:

We use induction on n to show that any meaningful product of a_{1}, a_{2}, \ldots, a_{n} is equal to the standard n product \prod_{i=1}^{n}a_{i}. This is certainly true for n=1, 2. For n>2, by definition, a_{1}a_{2} \ldots a_{n} = (a_{1}a_{2}\ldots a_{m})(a_{m+1} \ldots a_{n}) for some m<n. Therefore by induction and associativity:

(a_{1}a_{2} \ldots a_{n}) = (a_{1}a_{2} \ldots a_{m})(a_{m+1} \ldots a_{n}) = (\prod_{i=1}^{m}a_{i})(\prod_{i=1}^{n-m}a_{m+i})

= (\prod_{i=1}^{m}a_{i})((\prod_{i=1}^{n-m-1}a_{m+i})a_{n}) = ((\prod_{i=1}^{m}a_{i})(\prod_{i=1}^{n-m-1}a_{m+i}))a_{n} = (\prod_{i=1}^{n-1}a_{i})a_{n} = \prod_{i=1}^{n}a_{i}

QED.
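The theorem can be made tangible computationally. The sketch below (all names and the choice of 2x2 integer matrices are mine) recursively enumerates every meaningful product of a list of elements, following the inductive definition above, and checks that all parenthesizations agree in an associative operation.

```python
def mat_mul(A, B):
    # 2x2 integer matrices as ((a,b),(c,d)): an associative, noncommutative semigroup
    return ((A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]),
            (A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]))

def meaningful_products(elems, op):
    # every meaningful product of elems[0], ..., elems[n-1] in this order:
    # split at each m < n and combine a left product with a right product
    if len(elems) == 1:
        return [elems[0]]
    out = []
    for m in range(1, len(elems)):
        for left in meaningful_products(elems[:m], op):
            for right in meaningful_products(elems[m:], op):
                out.append(op(left, right))
    return out

a = ((1, 1), (0, 1))
b = ((1, 0), (1, 1))
c = ((2, 0), (0, 3))
d = ((0, 1), (1, 0))
values = meaningful_products([a, b, c, d], mat_mul)
assert len(values) == 5       # the five parenthesizations of four elements
assert len(set(values)) == 1  # all equal, as the theorem asserts
```

The number of meaningful products of n elements is the Catalan number C_{n-1}, which is why four elements give five parenthesizations.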

Corollary: Generalized Commutative Law:

If G is a commutative semigroup and a_{1}, \ldots, a_{n} \in G, then for any permutation i_{1}, i_{2}, \ldots, i_{n} of 1, 2, \ldots, n, we have a_{1}a_{2}\ldots a_{n} = a_{i_{1}}a_{i_{2}}\ldots a_{i_{n}}.

Proof: Homework.

Definition:

Let G be a semigroup with a \in G and n \in \mathcal{N}^{*}. The element a^{n} \in G is defined to be the standard n product \prod_{i=1}^{n}a_{i} with a_{i}=a for 1 \leq i \leq n. If G is a monoid, a^{0} is defined to be the identity element e. If G is a group, then for each n \in \mathcal{N}^{*}, a^{-n} is defined to be (a^{-1})^{n} \in G.

It can be shown that this exponentiation is well-defined. By definition, then, a^{1}=a, a^{2}=aa, a^{3}=(aa)a=aaa, \ldots, a^{n}=a^{n-1}a, and so on. Note that it is possible that even if n \neq m, we may have a^{n} = a^{m}.
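The last remark, that a^{n} = a^{m} can occur with n \neq m, already happens in small finite monoids. A minimal sketch (the helper name and the mod-5 example are my own choices):

```python
def power(a, n, op):
    # a^n for n >= 1, computed as the standard n product a(a)(a)...(a)
    result = a
    for _ in range(n - 1):
        result = op(result, a)
    return result

mul_mod5 = lambda x, y: (x * y) % 5  # a commutative monoid on {0, 1, 2, 3, 4}
# 2^5 = 32 ≡ 2 (mod 5), so a^5 = a^1 even though 5 ≠ 1
assert power(2, 5, mul_mod5) == power(2, 1, mul_mod5) == 2
```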

Regards.

Nalin Pithwa

A nontrivial example of a monoid

Reference : Algebra 3rd Edition, Serge Lang. AWL International Student Edition.

We assume that the reader is familiar with the terminology of elementary topology. Let M be the set of homeomorphism classes of compact (connected) surfaces. We shall define an addition in M. Let S, S^{'} be compact surfaces. Let D be a small disc in S, and D^{'} a small disc in S^{'}. Let C, C^{'} be the circles which form the boundaries of D and D^{'} respectively. Let D_{0}, D_{0}^{'} be the interiors of D and D^{'} respectively, and glue S-D_{0} to S^{'}-D_{0}^{'} by identifying C with C^{'}. It can be shown that the resulting surface is independent, up to homeomorphism, of the various choices made in the preceding construction. If \sigma, \sigma^{'} denote the homeomorphism classes of S and S^{'} respectively, we define \sigma + \sigma^{'} to be the class of the surface obtained by the preceding gluing process. It can be shown that this addition defines a monoid structure on M, whose unit element is the class of the ordinary 2-sphere. Furthermore, if \tau denotes the class of the torus, and \pi denotes the class of the projective plane, then every element \sigma of M has a unique expression of the form

\sigma = n \tau + m\pi

where n is an integer greater than or equal to 0 and m is zero, one, or two. We have the relation 3\pi=\tau+\pi.

This shows that there are interesting examples of monoids and that monoids exist in nature.
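Because every class has a unique normal form n\tau + m\pi with m \in \{0, 1, 2\}, this monoid can be modelled directly. The sketch below (all names are mine) represents a class by the pair (n, m) and applies the relation 3\pi = \tau + \pi as a rewriting rule.

```python
def normalize(n, m):
    # reduce n*tau + m*pi to the unique form with m in {0, 1, 2},
    # using 3*pi = tau + pi, i.e. (n, m) -> (n + 1, m - 2)
    while m >= 3:
        n, m = n + 1, m - 2
    return (n, m)

def add_classes(s, t):
    # connected-sum addition on homeomorphism classes
    return normalize(s[0] + t[0], s[1] + t[1])

sphere = (0, 0)      # the unit element
torus = (1, 0)       # tau
proj_plane = (0, 1)  # pi
assert add_classes(sphere, torus) == torus   # the sphere is the unit
assert normalize(0, 3) == (1, 1)             # 3*pi = tau + pi
klein_bottle = add_classes(proj_plane, proj_plane)  # 2*pi
assert add_classes(klein_bottle, proj_plane) == (1, 1)
```

The last assertion restates the classical fact that the connected sum of a Klein bottle and a projective plane is homeomorphic to the connected sum of a torus and a projective plane.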

Hope you enjoyed!

Regards,

Nalin Pithwa

Algebra is symbolic manipulation, though painstaking and conscientious :-)

Of course, I have oversimplified the meaning of algebra. 🙂

Here is an example. Let me know what you think. (Reference: Algebra 3rd Edition by Serge Lang).

Let G be a commutative monoid, and x_{1}, x_{2}, \ldots, x_{n} be elements of G. Let \psi be a bijection of the set of integers (1,2, \ldots, n) onto itself. Then,

\prod_{v=1}^{n}x_{\psi(v)} = \prod_{v=1}^{n}x_{v}

Proof by mathematical induction:

PS: if one gets scared by the above notation, one can expand it and see its meaning. Try that.

It is clearly true for n=1. We assume it for n-1. Let k be an integer such that \psi(k)=n. Then,

\prod_{v=1}^{n}x_{\psi(v)} = \prod_{v=1}^{k-1}x_{\psi(v)} \cdot x_{\psi(k)} \cdot \prod_{v=1}^{n-k}x_{\psi(k+v)}

= \prod_{v=1}^{k-1}x_{\psi(v)} \cdot \prod_{v=1}^{n-k}x_{\psi(k+v)} \cdot x_{\psi(k)}

Define a map \phi of (1,2, \ldots, n-1) into itself by the rule:

\phi(v)=\psi(v) if v<k

\phi(v) = \psi(v+1) if v \geq k

Then,

\prod_{v=1}^{n} x_{\psi(v)} = \prod_{v=1}^{k-1}x_{\phi(v)} \cdot \prod_{v=1}^{n-k}x_{\phi(k-1+v)} \cdot x_{n} = (\prod_{v=1}^{n-1}x_{\phi(v)}) \cdot x_{n}

By induction, \prod_{v=1}^{n-1}x_{\phi(v)} = x_{1}\ldots x_{n-1}, and hence \prod_{v=1}^{n}x_{\psi(v)} = x_{1}\ldots x_{n}, as desired.
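For a concrete sanity check of this identity, one can brute-force all permutations in a specific commutative monoid, say the integers under multiplication (the choice of elements is my own):

```python
from itertools import permutations
from math import prod

xs = [2, 3, 5, 7]  # elements of the commutative monoid (Z, *, 1)
base = prod(xs)
# every reordering x_{psi(1)} * ... * x_{psi(n)} gives the same element
assert all(prod(xs[i] for i in sigma) == base
           for sigma in permutations(range(len(xs))))
```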

Some remarks: As a student, I used to think many a time that this proof is obvious. But it would be difficult to write it down. I think this cute little proof is a good illustration of “how to prove obvious things in algebra.” 🙂

Regards,

Nalin Pithwa

Wisdom of Hermann Weyl w.r.t. Algebra

Important though the general concepts and propositions may be with which the modern and industrious passion for axiomatizing and generalizing has presented us, in algebra perhaps more than anywhere else, nevertheless I am convinced that the special problems in all their complexity constitute the stock and core of mathematics, and that to master their difficulties requires on the whole hard labour.

—- Prof. Hermann Weyl.

Notes II: Sets and Functions:

Reference I : Topology and Modern Analysis, G F Simmons, Tata McGraw Hill.

III. Functions:

Many kinds of functions occur in topology in a wide variety of situations. In our work, we shall need the full power of the general concept of a function, and since its modern meaning is much broader and deeper than its elementary meaning, we discuss this concept in considerable detail and develop its main abstract properties.

Let us begin with a brief inspection of some simple examples. Consider the elementary function

y=x^{2}

of the real variable x. What do we have in mind when we call this a function and say that y is a function of x? In a nutshell, we are drawing attention to the fact that each real number x has a specific real number y linked to it, which can be calculated according to the rule (or law of correspondence) given by the formula. We have here a process which, applied to a real number x, does something to it (squares it) to produce another number y (the square of x). Similarly,

y =  x^{2}-3x and y = (x^{2}+1)^{-1}

are two other simple functions of the real variable x, and each is given by a rule in the form of an algebraic expression which specifies the exact manner in which the value of y depends on the value of x.

The rules for the functions we have just mentioned are expressed by formulas. In general, this is possible for functions of a very simple kind or for those which are sufficiently important to deserve a special symbol of their own. Consider for instance the function of the real variable x defined as follows: for each real number x, write x as an infinite decimal (using the scheme of decimal expansion in which infinite chains of 9s are avoided, in which, for example, 1/4 is represented as 0.250000… rather than by 0.24999…); then let y be the 59th digit after the decimal point. There is of course no standard formula for this, but nevertheless it is a perfectly respectable function whose rule is given by a verbal description. On the other hand, the function y=\sin{x} of the real variable x is so important that its rule, though fully as complicated as the one just defined, is assigned the special symbol sin. When discussing functions in general, we want to allow for all sorts of rules and to talk about them all at once, so we simply employ non-committal notations like y=f(x), y=g(z), and so on.
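The verbally-defined function above can in fact be computed. A small sketch (the function name is mine; exact rationals are used so that the expansion avoiding infinite chains of 9s comes out automatically):

```python
from fractions import Fraction
from math import floor

def digit_after_point(x, k):
    # k-th digit after the decimal point of a nonnegative rational x, in the
    # decimal expansion avoiding infinite chains of 9s (so 1/4 -> 0.250000...)
    return floor(x * 10**k) % 10

assert digit_after_point(Fraction(1, 4), 59) == 0  # 0.250000... -> digit 59 is 0
assert digit_after_point(Fraction(1, 7), 2) == 4   # 1/7 = 0.142857...
```

Taking the floor of x \cdot 10^{k} automatically selects the terminating expansion 0.250000… rather than 0.24999…, matching the convention in the text.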

Each of the functions mentioned above is defined for all real numbers x. The example y = \frac{1}{x} shows that this restriction is much too severe, for this function is defined only for nonzero values of x. Similarly, y = \log{x} is defined only for positive values of x and y = \arcsin{x} only for values of x in the interval [-1,1]. Whatever our conception of a function may be, it should certainly be broad enough to include examples like these, which are defined only for some values of the real variable x.

In real analysis, the notion of function is introduced in the following way. Let X be any non-empty set of real numbers. We say that a function y=f(x) is defined on X if the rule f associates a definite real number y with each real number x in X. The specific nature of the rule f is totally irrelevant to the concept of a function. The set X is called the domain of the given function, and the set Y of all the values it assumes is called its range. If we speak of complex numbers here instead of real numbers, we have the notion of function as it is used in complex analysis.

This point of view towards functions is actually more general than is needed for the aims of analysis, but it isn’t nearly general enough for our purposes. The sets X and Y above were taken to be sets of numbers. If we now remove even this restriction and allow X and Y to be completely arbitrary non-empty sets, then we arrive at the most inclusive concept of a function. By way of illustration, suppose that X is the set of all squares in a plane and that Y is the set of all circles in the same plane. We can define a function y=f(x) by requiring that the rule f associate with each square x that circle y which is inscribed in it. In general, there is no need at all for either X or Y to be a set of numbers. All that is really necessary for a function is two non-empty sets X and Y and a rule f which is meaningful and unambiguous in assigning to each element x in X a specific element y in Y.

With these preliminary descriptive remarks, we now turn to the rather abstract but very precise ideas they are intended to motivate.

A function consists of three objects: two non-empty sets X and Y (which may be equal but need not be) and a rule f which assigns to each element x in X a single fully determined element y in Y. The y which corresponds in this way to a given x is usually written f(x), and is called the image of x under the rule f, or the value of f at the element x. (It is fun to draw some figures here). This notation is supposed to be suggestive of the idea that the rule f takes the element x and does something to it to produce the element y = f(x). The rule f is often called a mapping or transformation or operator to amplify this concept of it. We then think of f as mapping x’s to y’s, or transforming x’s to y’s, or operating on x’s to produce y’s. The set X is called the domain of the function, and the set of all f(x)’s for all x’s in X is called its range. A function whose range consists of just one element is called a constant function.

We often denote by f: X \rightarrow Y the function with rule f, domain X and range contained in Y. This notation is useful because the essential parts of the function are displayed in a manner which emphasizes that it is a composite object, the central thing being the rule or mapping f. You can try drawing a figure depicting a convenient way of picturing this function. (These notes don’t have my diagrams from the reference book.) On the left, X and Y are different sets, and on the right they are equal, in which case we usually refer to f as a mapping of X into itself. If it is clear from the context what the sets X and Y are, or if there is no real need to specify them explicitly, it is common practice to identify the function f: X \rightarrow Y with the rule f, and to speak of f alone as if it were the function under consideration (without mentioning the sets X and Y).

It sometimes happens that two perfectly definite sets X and Y are under discussion and that a mapping of X into Y arises which has no natural symbol attached to it. If there is no necessity to invent a symbol for this mapping and if it is quite clear what the mapping is, it is often convenient to designate it by x \rightarrow y. Accordingly, the function y=x^{2} mentioned in the beginning of this section can be written as x \rightarrow x^{2} or x \rightarrow y where y is understood to be the square of x.

A function f is called an extension of a function g (and g is called a restriction of f) if the domain of f contains the domain of g and f(x) = g(x) for each x in the domain of g.

Most of mathematical analysis, both classical and modern, deals with functions whose values are real numbers or complex numbers. This is also true of those parts of topology which are concerned with the foundations of analysis. If the range of a function consists of real numbers, we call it a real function; similarly, a complex function is one whose range consists of complex numbers. Obviously, every real function is also complex. We lay very heavy emphasis on real and complex functions throughout our work.

As a matter of usage, we generally prefer to reserve the term function for real or complex functions and to speak of mappings when dealing with functions whose values are not necessarily numbers.

Consider a mapping f: X \rightarrow Y. When we call f a mapping of X into Y, we mean to suggest by this that the elements f(x), as x varies over all the elements of X, need not fill up Y; but if it definitely does happen that the range of f equals Y, or if we specifically want to assume this, then we call f a mapping of X onto Y. If two different elements in X always have different images under f, then we call f a one-to-one mapping of X into Y. If f: X \rightarrow Y is both onto and one-to-one, then we can define its inverse mapping f^{-1}: Y \rightarrow X as follows: for each y in Y, we find that unique element x in X such that f(x)=y (x exists and is unique since f is onto and one-to-one); we then define x to be f^{-1}(y). The equation x = f^{-1}(y) is the result of solving y=f(x) for x, in just the same way as x = \log{y} is the result of solving y = e^{x} for x. Figure 7 illustrates the concept of the inverse of a mapping.

If f is a one-to-one mapping of X onto Y, it will sometimes be convenient to subordinate the conception of f as a mapping sending x’s over to y’s and to emphasize its role as a link between x’s and y’s. Each x has linked to it (or has corresponding to it) exactly one y = f(x), and, turning the situation around, each y has linked to it exactly one x = f^{-1}(y). When we focus our attention on this aspect of a mapping which is one-to-one and onto, we usually call it a one-to-one correspondence. Thus f is a one-to-one correspondence between X and Y, and f^{-1} is a one-to-one correspondence between Y and X.
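For a finite one-to-one onto mapping, the inverse can be tabulated explicitly. A tiny sketch (the concrete sets are my own choice) using Python dictionaries:

```python
f = {1: 'a', 2: 'b', 3: 'c'}  # a one-to-one mapping of X = {1,2,3} onto Y = {'a','b','c'}
X, Y = set(f), set(f.values())
f_inv = {y: x for x, y in f.items()}  # the inverse mapping from Y to X
# f_inv undoes f, and f undoes f_inv
assert all(f_inv[f[x]] == x for x in X)
assert all(f[f_inv[y]] == y for y in Y)
```

Note that the dictionary comprehension only yields a well-defined inverse because f is one-to-one; a repeated value would silently overwrite an entry.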

Now, consider an arbitrary mapping f: X \rightarrow Y. The mapping f, which sends each element of X over to an element of Y, induces the following two important set mappings. If A is a subset of X, then its image f(A) is the subset of Y defined by

f(A) = \{ f(x) : x \in \hspace{0.1in} A\}

and the first set mapping carries each A over to its image f(A). Similarly, if B is a subset of Y, then its inverse image f^{-1}(B) is the subset of X defined by

f^{-1}(B) = \{ x : f(x) \in \hspace{0.1in} B\}

and the second set mapping pulls each B back to its inverse image f^{-1}(B). It is often essential for us to know how these set mappings behave with respect to set inclusion and operations on sets. We develop most of their significant features in the following two paragraphs.

The main properties of the first set mapping are:

f(\phi) = \phi;

f(X) \subseteq Y;

A_{1} \subseteq A_{2} \Longrightarrow f(A_{1}) \subseteq f(A_{2});

f(\bigcup_{i}A_{i}) = \bigcup_{i}f(A_{i});

f(\bigcap_{i}A_{i}) \subseteq \bigcap_{i}f(A_{i})…. call these relations I.

The reader should convince himself of the truth of these statements. For instance, to prove that f(\bigcup_{i}A_{i}) = \bigcup_{i}f(A_{i}), we would have to prove first that f(\bigcup_{i}A_{i}) is a subset of \bigcup_{i}f(A_{i}), and second that \bigcup_{i}f(A_{i}) is a subset of f(\bigcup_{i}A_{i}). A proof of the first of these set inclusions might run as follows: an element in f(\bigcup_{i}A_{i}) is the image of some element in \bigcup_{i}A_{i}; therefore it is the image of an element in some A_{i}; therefore it is in some f(A_{i}); and so finally it is in \bigcup_{i}f(A_{i}). The irregularities and gaps which the reader will notice in the above statements are essential features of this set mapping. For example, the image of an intersection need not equal the intersection of the images, because two disjoint sets can easily have images which are not disjoint. Furthermore, without special assumptions (see Problem 6), nothing can be said about the relation between f(A)^{'} and f(A^{'}).

The second set mapping is much better behaved. Its properties are satisfyingly complete, and can be stated as follows:

f^{-1}(\phi) = \phi and f^{-1}(Y) =X;

B_{1} \subseteq B_{2} \Longrightarrow f^{-1}(B_{1}) \subseteq f^{-1}(B_{2})

f^{-1}(\bigcup_{i}B_{i}) = \bigcup_{i}f^{-1}(B_{i})….(2)

f^{-1}(\bigcap_{i}B_{i}) = \bigcap_{i}f^{-1}(B_{i})….(3)

f^{-1}(B^{'}) = f^{-1}(B)^{'}….(4)

Again, the reader should verify each of these statements for himself.
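These properties, and the gaps noted for the first set mapping, are easy to experiment with. A sketch (helper names are mine) using the squaring map on a small set of integers: the image of an intersection can be strictly smaller than the intersection of the images, while the inverse image respects intersections exactly.

```python
def image(f, A):
    # the first set mapping: A -> f(A)
    return {f(x) for x in A}

def inverse_image(f, X, B):
    # the second set mapping: B -> f^{-1}(B), relative to the domain X
    return {x for x in X if f(x) in B}

f = lambda x: x * x          # not one-to-one on the integers
A1, A2 = {-2, -1}, {1, 2}    # disjoint sets whose images coincide
assert image(f, A1 & A2) == set()
assert image(f, A1) & image(f, A2) == {1, 4}
# the inverse image preserves intersections exactly, as in relation (3)
X = {-2, -1, 0, 1, 2}
assert inverse_image(f, X, {1} & {4}) == inverse_image(f, X, {1}) & inverse_image(f, X, {4})
```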

We discuss one more concept in this section, that of the multiplication or composition of mappings. If y = f(x) = x^{2}+1 and z = g(y) = \sin{y}, then these two functions can be put together to form a single function defined by z = (gf)(x) = g(f(x)) = g(x^{2}+1) = \sin{(x^{2}+1)}. One of the most important tools of calculus (the chain rule) explains how to differentiate functions of this kind. This manner of multiplying functions together is of basic importance for us as well, and we formulate it in general as follows. Let f: X \rightarrow Y and g: Y \rightarrow Z be mappings. We define the product of these mappings, denoted by gf: X \rightarrow Z, by (gf)(x) = g(f(x)). In words, an element x in X is taken by f to the element f(x) in Y, and then g maps f(x) to g(f(x)) in Z. Figure 8 is a picture of this process. We observe that the two mappings involved here are not entirely arbitrary, for the set Y which contains the range of the first equals the domain of the second. More generally, the product of two mappings is meaningful whenever the range of the first is contained in the domain of the second. We have regarded f as the first mapping and g as the second, and in forming their product gf, their symbols have gotten turned around. This is a rather unpleasant phenomenon, for which we blame the occasional perversity of mathematical symbols. Perhaps it will help the reader to keep this straight in his mind if he will remember to read the product gf from right to left: first apply f, then g.
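The product of mappings is one line of code. A minimal sketch (the helper name compose is my own), using the same f and g discussed above:

```python
import math

def compose(g, f):
    # the product gf: read right to left, first apply f, then g
    return lambda x: g(f(x))

f = lambda x: x**2 + 1
gf = compose(math.sin, f)
assert gf(2) == math.sin(5)  # g(f(2)) = sin(2^2 + 1)
```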

Problems:

To be continued next blog.

Regards,

Nalin Pithwa