Every non-empty set S of non-negative integers contains a least element; that is, there is some integer a in S such that $a \leq b$ for all b belonging to S.
Because this principle plays a role in many proofs related to foundations of mathematics, let us use it to show that the set of positive integers has what is known as the Archimedean property.
If a and b are any positive integers, then there exists a positive integer n such that $na \geq b$.
Assume that the statement of the theorem is not true, so that for some a and b we have $na < b$ for every positive integer n. Then the set $S = \{ b - na : n \text{ a positive integer} \}$ consists entirely of positive integers. By the Well-Ordering Principle, S will possess a least element, say, $b - ma$. Notice that $b - (m+1)a$ also lies in S, because S contains all integers of this form. Further, we also have $b - (m+1)a = (b - ma) - a < b - ma$, contrary to the choice of $b - ma$ as the smallest integer in S. This contradiction arose out of our original assumption that the Archimedean property did not hold; hence, the proof. QED.
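As a small illustration of the Archimedean property (not part of the proof), the following Python sketch searches for the least positive integer n with n·a ≥ b. The function name `least_multiple_index` is made up for this example.

```python
def least_multiple_index(a: int, b: int) -> int:
    """Return the least positive integer n with n * a >= b.

    Both a and b are assumed to be positive integers.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    n = 1
    # The loop terminates because n = b already satisfies b * a >= b.
    while n * a < b:
        n += 1
    return n


if __name__ == "__main__":
    print(least_multiple_index(3, 10))  # 3 * 4 = 12 >= 10
    print(least_multiple_index(7, 5))   # 7 * 1 = 7 >= 5
```

The search is guaranteed to stop by the theorem itself, and n never exceeds b.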
First Principle of Finite Induction:
Let S be a set of positive integers with the following properties:
a) the integer 1 belongs to S.
b) Whenever the integer k is in S, the next integer $k+1$ is also in S.
Then, S is the set of all positive integers.
Second Principle of Finite Induction:
Let S be a set of positive integers with the following properties:
a) the integer 1 belongs to S.
b) If k is a positive integer such that $1, 2, \ldots, k$ belong to S, then $k+1$ must also be in S.
Then, S is the set of all positive integers.
So, in a lighter vein, we assume that the set of positive integers is given, just as Kronecker observed: “God created the natural numbers, all the rest is man-made.”
(The following is reproduced from the book “The Way I Remember It” by Walter Rudin. The purpose is just to share the insights of a formidable analyst with the student community.)
When I arrived at MIT in 1950, Banach algebras were one of the hot topics. Gelfand’s 1941 paper “Normierte Ringe” had apparently only reached the USA in the late forties, and circulated on hard-to-read smudged purple ditto copies. As one application of the general theory presented there, it contained a stunningly short proof of Wiener’s lemma: the Fourier series of the reciprocal of a nowhere vanishing function with absolutely convergent Fourier series also converges absolutely. Not only was the proof extremely short, it was one of those that are hard to forget. All one needs to remember is that the absolutely convergent Fourier series form a Banach algebra, and that every multiplicative linear functional on this algebra is evaluation at some point of the unit circle.
This may have led some to believe that Banach algebras would now solve all our problems. Of course, they could not, but they did provide the right framework for many questions in analysis (as does most of functional analysis) and conversely, abstract questions about Banach algebras often gave rise to interesting problems in “hard analysis”. (Hard analysis is used here as Hardy and Littlewood used it. For example, you do hard analysis when, in order to estimate some integral, you break it into three pieces and apply different inequalities to each.)
One type of Banach algebras that was soon studied in detail were the so-called function algebras, also known as uniform algebras.
To see what these are, let $C(X)$ be the set of all complex-valued continuous functions on a compact Hausdorff space X. A function algebra on X is a subset A of $C(X)$ such that
(i) If f and g are in A, so are $f + g$, $fg$, and $cf$ for every complex number c (this says that A is an algebra).
(ii) A contains the constant functions.
(iii) A separates points on X (that is, if $p \neq q$, both in X, then $f(p) \neq f(q)$ for some f in A), and
(iv) A is closed, relative to the sup-norm topology of $C(X)$, that is, the topology in which convergence means uniform convergence.
A is said to be self-adjoint if the complex conjugate of every f in A is also in A. The most familiar example of a non-self-adjoint function algebra is the disc algebra $A(U)$, which consists of all f in $C(\bar{U})$ that are holomorphic in U. (Here, and later, U is the open unit disc in C, the complex plane, and $\bar{U}$ is its closure.) I already had an encounter with $A(U)$, a propos maximum modulus algebras.
One type of question that was asked over and over again was: Suppose that a function algebra on X satisfies … and … and …; is it C(X)? (In fact, 20 years later a whole book, entitled “Characterizations of C(X) among its Subalgebras”, was published by R. B. Burckel.) The Stone-Weierstrass Theorem gives the classical answer: Yes, if A is self-adjoint.
There are problems even when X is a compact interval I on the real line. For instance, suppose A is a function algebra on I, and to every maximal ideal M of A corresponds a point p in I such that M is the set of all f in A having $f(p) = 0$. (In other words, the only maximal ideals of A are the obvious ones.) Is $A = C(I)$? This is still unknown, in 1995.
If $f_1, \ldots, f_n$ are in $C(I)$, and the n-tuple $(f_1, \ldots, f_n)$ separates points on I, let $A(f_1, \ldots, f_n)$ be the smallest closed subalgebra of $C(I)$ that contains $f_1, \ldots, f_n$ and 1.
When $f_1$ is 1-1 on I, it follows from an old theorem of Walsh (Math. Annalen 96, 1926, 437-450) that $A(f_1) = C(I)$.
Stone-Weierstrass implies that $A(f_1, \ldots, f_n) = C(I)$ if each $f_i$ is real-valued.
In the other direction, John Wermer showed in Annals of Math. 62, 1955, 267-270, that $A(f_1, f_2, f_3)$ can be a proper subset of $C(I)$!
Here is how he did this:
Let E be an arc in C, of positive two-dimensional measure, and let $A_E$ be the algebra of all continuous functions on the Riemann sphere S (the one-point compactification of C) which are holomorphic in the complement of E. He showed that $g(E) = g(S)$ for every g in $A_E$, that $A_E$ contains a triple $(g_1, g_2, g_3)$ that separates points on S, and that the restriction of $A_E$ to E is closed in $C(E)$. Pick a homeomorphism $\phi$ of I onto E and define $f_i = g_i \circ \phi$. Then $A(f_1, f_2, f_3) \neq C(I)$, for if h is in $A(f_1, f_2, f_3)$ then $h = g \circ \phi$ for some g in $A_E$, so that $h(I) = g(E) = g(S)$ is the closure of an open subset of C (except when h is constant).
In order to prove the same with two functions instead of three I replaced John’s arc E with a Cantor set K, also of positive two-dimensional measure. (I use the term “Cantor set” for any totally disconnected compact metric space with no isolated points; these are all homeomorphic to each other.) A small extra twist, applied to John’s argument, with $A_K$ in place of $A_E$, proved that $A(f_1, f_2)$ can also be smaller than $C(I)$.
I also used $A_K$ to show that $C(K)$ contains maximal closed point-separating subalgebras that are not maximal ideals, and that the same is true for $C(X)$ whenever X contains a Cantor set. These ideas were pushed further by Hoffman and Singer in Acta Math. 103, 1960, 217-241.
In the same paper, I showed that $A(f_1, \ldots, f_n) = C(I)$ when $n - 1$ of the n given functions are real-valued.
Since Wermer’s paper was being published in the Annals, and mine strengthened his theorem and contained other interesting (at least to me) results, I sent mine there too. It was rejected, almost by return mail, by an anonymous editor, for not being sufficiently interesting. I have had a few other papers rejected over the years, but for better reasons. This one was published in Proc. AMS 7, 1956, 825-830, and is one of six whose Russian translations were made into a book “Some Questions in Approximation Theory”; the others were three by Bishop and two by Wermer. Good company.
Later, Gabriel Stolzenberg (Acta Math. 115, 1966, 185-198) and Herbert Alexander (Amer. J. Math., 93, 1971, 65-74) went much more deeply into these problems. One of the highlights in Alexander’s paper is:
$A(f_1, \ldots, f_n) = C(I)$ if $f_1, \ldots, f_{n-1}$ are of bounded variation.
A propos the Annals (published by Princeton University) here is a little Princeton anecdote. During a week that I spent there, in the mid-eighties, the Institute threw a cocktail party. (What I enjoyed best at that affair was being attacked by Armand Borel for having said, in print, that sheaves had vanished into the background.) Next morning I overheard the following conversation in Fine Hall:
Prof. A: That was a nice party yesterday, wasn’t it?
Prof. B: Yes, and wasn’t it nice that they invited the whole department.
Prof. A: Well, only the full professors.
Prof. B: Of course.
The above-mentioned facts about Cantor sets led me to look at the opposite extreme, the so-called scattered spaces. A compact Hausdorff space Q is said to be scattered if Q contains no perfect set; every non-empty compact set F in Q thus contains a point that is not a limit point of F. The principal result proved in Proc. AMS 8, 1957, 39-42 is:
THEOREM: Every closed subalgebra of $C(Q)$ is self-adjoint.
In fact, the scattered spaces are the only ones for which this is true, but I did not state this in that paper.
In 1956, I found a very explicit description of all closed ideals in the disc algebra $A(U)$ (defined at the beginning of this chapter). The description involves inner functions. These are the bounded holomorphic functions in U whose radial limits have absolute value 1 at almost every point of the unit circle $T$. They play a very important role in the study of holomorphic functions in U (see, for instance, Garnett’s book, Bounded Analytic Functions) and their analogues will be mentioned again, on Riemann surfaces, in polydiscs, and in balls in $C^n$.
Recall that a point $\zeta$ on $T$ is called a singular point of a holomorphic function f in U if f has no analytic continuation to any neighbourhood of $\zeta$. The ideals in question are described in the following:
THEOREM: Let E be a compact subset of $T$, of Lebesgue measure 0, let u be an inner function all of whose singular points lie in E, and let $J(E, u)$ be the set of all f in $A(U)$ such that
(i) the quotient f/u is bounded in U, and
(ii) $f(\zeta) = 0$ at every $\zeta$ in E.
Then, $J(E, u)$ is a closed ideal of A(U), and every closed ideal of $A(U)$ is obtained in this way.
One of several corollaries is that every closed ideal of A(U) is principal, that is, is generated by a single function.
I presented this at the December 1956 AMS meeting in Rochester, and was immediately told by several people that Beurling had proved the same thing, in a course he had given at Harvard, but had not published it. I was also told that Beurling might be quite upset at this, and having Beurling upset at you was not a good thing. Having used his famous paper about the shift operator on a Hilbert space as my guide, I was not surprised that he too had proved this, but I saw no reason to withdraw my already submitted paper. It appeared in Canadian J. Math. 9, 1957, 426-434. The result is now known as the Beurling-Rudin theorem. I met him several times later, and he never made a fuss over this.
In the preceding year Lennart Carleson and I, neither of us knowing what the other was doing, proved what is now known as the Rudin-Carleson interpolation theorem. His paper is in Math. Z. 66, 1957, 447-451, mine in Proc. AMS 7, 1956, 808-811.
THEOREM. If E is a compact subset of $T$, of Lebesgue measure 0, then every f in C(E) extends to a function F in A(U).
(It is easy to see that this fails if E has positive measure. To say that F is an extension of f means simply that $F(\zeta) = f(\zeta)$ at every $\zeta$ in E.)
Our proofs have some ingredients in common, but they are different, and we each proved more than is stated above. Surprisingly, Carleson, the master of classical hard analysis, used a soft approach, namely duality in Banach spaces, and concluded that F could be so chosen that $\|F\|_U \leq 2\|f\|_E$. (The norms are sup-norms over the sets appearing as subscripts.) In the same paper he used his Banach space argument to prove another interpolation theorem, involving Fourier-Stieltjes transforms.
On the other hand, I did not have functional analysis in mind at all; I did not think of the norms or of Banach spaces. I proved, by a bare-hands construction combined with the Riemann mapping theorem, that if $\Omega$ is a closed Jordan domain containing $f(E)$ then F can be chosen so that $F(\bar{U})$ also lies in $\Omega$. If $\Omega$ is a disc, centered at 0, this gives $\|F\|_U = \|f\|_E$, so F is a norm-preserving extension.
What our proofs had in common is that we both used part of the construction that was used in the original proof of the F. and M. Riesz theorem (which says that if a measure $\mu$ on $T$ gives $\int f \, d\mu = 0$ for every f in $A(U)$ then $\mu$ is absolutely continuous with respect to Lebesgue measure). Carleson showed, in fact, that F. and M. Riesz can be derived quite easily from the interpolation theorem. I tried to prove the implication in the other direction. But that had to wait for Errett Bishop. In Proc. AMS 13, 1962, 140-143, he established this implication in a very general setting which had nothing to do with holomorphic functions or even with algebras, and which, combined with a refinement due to Glicksberg (Trans. AMS 105, 1962, 415-435), makes the interpolation theorem even more precise:
THEOREM: One can choose F in $A(U)$ so that $F(\zeta) = f(\zeta)$ at every $\zeta$ in E, and $|F(z)| < \|f\|_E$ at every z in $\bar{U} \setminus E$.
This is usually called peak-interpolation.
Several variable analogues of this and related results may be found in Chap. 6 of my Function Theory in Polydiscs and in Chap. 10 of my Function Theory in the Unit Ball of $C^n$.
The last item in this chapter concerns Riemann surfaces. Some definitions are needed.
A finite Riemann surface is a connected open proper subset R of some compact Riemann surface X, such that the boundary $\partial R$ of R in X is also the boundary of its closure $\bar{R}$ and is the union of finitely many disjoint simple closed analytic curves $\Gamma_1, \ldots, \Gamma_k$. Shrinking each $\Gamma_i$ to a point gives a compact orientable manifold whose genus g is defined to be the genus of R. The numbers g and k determine the topology of R, but not, of course, its conformal structure.
$A(R)$ denotes the algebra of all continuous functions on $\bar{R}$ that are holomorphic in R. If f is in $A(R)$ and $|f(p)| = 1$ at every point p in $\partial R$ then, just as in U, f is called inner. A set $S \subset A(R)$ is unramified if every point of $\bar{R}$ has a neighbourhood in which at least one member of S is one-to-one.
I became interested in these algebras when Lee Stout (Math. Z., 92, 1966, 366-379; also 95, 1967, 403-404) showed that every $A(R)$ contains an unramified triple of inner functions that separates points on $\bar{R}$. He deduced from the resulting embedding of R in $U^3$ that $A(R)$ is generated by these 3 functions. Whether every $A(R)$ is generated by some pair of its members is still unknown, but the main result of my paper in Trans. AMS 140, 1969, 423-434 shows that pairs of inner functions won’t always do:
THEOREM: If $A(R)$ contains a point-separating unramified pair f, g of inner functions, then there exist relatively prime integers s and t such that f is s-to-1 and g is t-to-1 on every $\Gamma_i$, and
(*) $(ks - 1)(kt - 1) = 2g + k - 1$.
For example, when $g = 2$ and $k = 4$, then (*) holds for no integers s and t. When $g = 23$ and $k = 4$, then $s = t = 2$ is the only pair that satisfies (*) but it is not relatively prime. Even when $R = U$ the theorem gives some information. In that case, $g = 0$, $k = 1$, so (*) becomes $(s - 1)(t - 1) = 0$, which means:
If a pair of finite Blaschke products separates points on $\bar{U}$ and their derivatives have no common zero in U, then at least one of them is one-to-one (that is, a Mobius transformation).
There are two cases in which (*) is not only necessary but also sufficient. This happens when $g = 0$ and when $g = k = 1$.
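The arithmetic behind condition (*) can be explored with a short Python sketch. It assumes that (*) reads (ks − 1)(kt − 1) = 2g + k − 1, a form that is consistent with the special cases discussed above. The function name `solutions` and the search bound `max_degree` are made up for this illustration.

```python
from math import gcd


def solutions(g: int, k: int, max_degree: int = 60):
    """List pairs (s, t), 1 <= s <= t <= max_degree, satisfying the assumed
    condition (k*s - 1) * (k*t - 1) == 2*g + k - 1.

    Only a bounded range is searched; max_degree is an arbitrary cutoff.
    """
    target = 2 * g + k - 1
    return [
        (s, t)
        for s in range(1, max_degree + 1)
        for t in range(s, max_degree + 1)
        if (k * s - 1) * (k * t - 1) == target
    ]


if __name__ == "__main__":
    for g, k in [(2, 4), (23, 4), (1, 1)]:
        pairs = solutions(g, k)
        coprime = [(s, t) for s, t in pairs if gcd(s, t) == 1]
        print(f"g={g}, k={k}: solutions={pairs}, relatively prime={coprime}")
```

Under this assumption, the search finds no pair for g = 2, k = 4, and only the non-coprime pair (2, 2) for g = 23, k = 4.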
But there are examples in which the topological condition (*) is satisfied even though the conformal structure of R prevents the existence of a separating unramified pair of inner functions.
This paper is quite different from anything else that I have ever done. As far as I know, no one has ever referred to it, but I had fun working on it.
More excerpts from Rudin’s autobiography will follow later.
As I mentioned earlier, my thesis (Trans. AMS 68, 1950, 278-363) deals with uniqueness questions for series of spherical harmonics, also known as Laplace series. In the more familiar setting of trigonometric series, the first theorem of the kind that I was looking for was proved by Georg Cantor in 1870, based on earlier work of Riemann (1854, published in 1867). Using the notations
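$A_n(x) = a_n \cos nx + b_n \sin nx$, where $a_n$ and $b_n$ are real numbers, Cantor’s theorem says:
If $\lim_{N \to \infty} \sum_{n=0}^{N} A_n(x) = 0$ at every real x, then $a_n = b_n = 0$ for every n.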
Therefore, two distinct trigonometric series cannot converge to the same sum. This is what is meant by uniqueness.
My aim was to prove this for spherical harmonics and (as had been done for trigonometric series) to whittle away at the hypothesis. Instead of assuming convergence at every point of the sphere, what sort of summability will do? Does one really need convergence (or summability) at every point? If not, what sort of sets can be omitted? Must anything else be assumed at these omitted points? What sort of side conditions, if any, are relevant?
I came up with reasonable answers to these questions, but basically the whole point seemed to be the justification of the interchange of some limit processes. This left me with an uneasy feeling that there ought to be more to Analysis than that. I wanted to do something with more “structure”. I could not have explained just what I meant by this, but I found it later when I became aware of the close relation between Fourier analysis and group theory, and also in an occasional encounter with number theory and with geometric aspects of several complex variables.
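Why was it all an exercise in interchange of limits? Because the “obvious” proof of Cantor’s theorem goes like this: for $k \geq 1$,
$\pi a_k = \int_{-\pi}^{\pi} A_k(x) \cos kx \, dx = \lim_{N \to \infty} \int_{-\pi}^{\pi} \Big( \sum_{n=0}^{N} A_n(x) \Big) \cos kx \, dx$, which in turn, equals
$\int_{-\pi}^{\pi} \Big( \lim_{N \to \infty} \sum_{n=0}^{N} A_n(x) \Big) \cos kx \, dx = 0$,
and similarly for $a_0$ and $b_k$. Note that $\lim \int = \int \lim$ was used.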
In Riemann’s above mentioned paper, he derives the conclusion of Cantor’s theorem under an additional hypothesis, namely, $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$. He associates to $\sum A_n(x)$ the twice integrated series
$\tfrac{1}{2} A_0 x^2 - \sum_{n=1}^{\infty} n^{-2} A_n(x)$
and then finds it necessary to prove, in some detail, that this series converges and that its sum F is continuous! (Weierstrass had not yet invented uniform convergence.) This is astonishingly different from most of his other publications, such as his paper on hypergeometric functions in which mind-boggling relations and transformations are merely stated, with only a few hints, or his painfully brief paper on the zeta-function.
In Crelle’s J. 73, 1870, 130-138, Cantor showed that Riemann’s additional hypothesis was redundant, by proving that
(*) $\lim_{n \to \infty} A_n(x) = 0$ for all x implies $a_n \to 0$ and $b_n \to 0$.
He included the statement: This cannot be proved, as is commonly believed, by term-by-term integration.
Apparently, it took a while before this was generally understood. Ten years later, in Math. Annalen 16, 1880, 113-114, he patiently explains the difference between pointwise convergence and uniform convergence, in order to refute a “simpler proof” published by Appell. But then, referring to his second (still quite complicated) proof, the one in Math. Annalen 4, 1871, 139-143, he sticks his neck out and writes: “In my opinion, no further simplification can be achieved, given the nature of the subject.”
That was a bit reckless. 25 years later, Lebesgue’s dominated convergence theorem became part of every analyst’s tool chest, and since then (*) can be proved in a few lines:
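Rewrite $A_n(x)$ in the form $A_n(x) = c_n \cos(nx + \alpha_n)$, where $c_n^2 = a_n^2 + b_n^2$ and $c_n \geq 0$. Put
$\gamma_n = \min\{1, c_n\}$, $f_n(x) = \gamma_n^2 \cos^2(nx + \alpha_n)$.
Then, $0 \leq f_n \leq 1$ and $f_n(x) \to 0$ at every x, so that the D. C. Th., combined with
$\int_{-\pi}^{\pi} f_n(x) \, dx = \pi \gamma_n^2$,
shows that $\gamma_n \to 0$. Therefore, $c_n = \gamma_n$ for all large n, and $c_n \to 0$. Done.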
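Two facts used in this short argument can be checked numerically. The Python sketch below assumes the standard amplitude-phase form a cos(nx) + b sin(nx) = c cos(nx + α) with c = √(a² + b²), and checks that the integral of cos²(nx + α) over [−π, π] equals π for n ≥ 1. The function names are made up for this illustration.

```python
import math


def amplitude_phase(a: float, b: float):
    """Return (c, alpha) with a*cos(t) + b*sin(t) == c*cos(t + alpha)."""
    c = math.hypot(a, b)
    alpha = math.atan2(-b, a)  # so that a = c*cos(alpha) and b = -c*sin(alpha)
    return c, alpha


def integral_cos_squared(n: int, alpha: float, samples: int = 4096) -> float:
    """Riemann sum of cos^2(n*x + alpha) over one period [-pi, pi]."""
    h = 2 * math.pi / samples
    return h * sum(
        math.cos(n * (-math.pi + j * h) + alpha) ** 2 for j in range(samples)
    )


if __name__ == "__main__":
    a, b, n = 0.3, -1.2, 3
    c, alpha = amplitude_phase(a, b)
    x = 0.7
    lhs = a * math.cos(n * x) + b * math.sin(n * x)
    rhs = c * math.cos(n * x + alpha)
    print(abs(lhs - rhs) < 1e-12)
    print(abs(integral_cos_squared(n, alpha) - math.pi) < 1e-9)
```

The equally spaced Riemann sum is used because it is very accurate for smooth periodic integrands over a full period.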
The point of all this is that my attitude was probably wrong. Interchanging limit processes occupied some of the best mathematicians for most of the 19th century. Thomas Hawkins’ book “Lebesgue’s Theory” gives an excellent description of the difficulties that they had to overcome. Perhaps, we should not be too surprised that even a hundred years later many students are baffled by uniform convergence, uniform continuity etc., and that some never get it at all.
In Trans. AMS 70, 1951, 387-403, I applied the techniques of my thesis to another problem of this type, with Hermite functions in place of spherical harmonics.
(Note: The above article is taken from Walter Rudin’s book “The Way I Remember It”. I hope it helps advanced graduate students in analysis.)
When we come to multiplication, it is most convenient to confine ourselves to positive numbers (among which we may include zero) in the first instance, and to go back for a moment to the sections of positive rational numbers only which we considered in articles 4-7. We may then follow practically the same road as in the case of addition, taking (c) to be (ab) and (C) to be (AB). The argument is the same, except when we are proving that all rational numbers with at most one exception must belong to (c) or (C). This depends, as in the case of addition, on showing that we can choose a, A, b, and B so that C-c is as small as we please. Here we use the identity
$AB - ab = (A - a)B + a(B - b)$.
Finally, we include negative numbers within the scope of our definition by agreeing that, if $\alpha$ and $\beta$ are positive, then
$(-\alpha)\beta = -\alpha\beta$, $\alpha(-\beta) = -\alpha\beta$, $(-\alpha)(-\beta) = \alpha\beta$.
In order to define division, we begin by defining the reciprocal $1/\alpha$ of a number $\alpha$ (other than zero). Confining ourselves in the first instance to positive numbers and sections of positive rational numbers, we define the reciprocal of a positive number $\alpha$ by means of the lower class $(1/A)$ and the upper class $(1/a)$. We then define the reciprocal of a negative number $-\alpha$ by the equation $1/(-\alpha) = -(1/\alpha)$. Finally, we define $\alpha/\beta$ by the equation
$\alpha/\beta = \alpha \times (1/\beta)$.
We are then in a position to apply to all real numbers, rational or irrational the whole of the ideas and methods of elementary algebra. Naturally, we do not propose to carry out this task in detail. It will be more profitable and more interesting to turn our attention to some special, but particularly important, classes of irrational numbers.
Algebraic operations with real numbers.
We now proceed to define the meaning of the elementary algebraic operations, such as addition, as applied to real numbers in general.
(i) Addition. In order to define the sum of two numbers $\alpha$ and $\beta$, we consider the following two classes: (i) the class (c) formed by all sums $c = a + b$, (ii) the class (C) formed by all sums $C = A + B$. Clearly, $c < C$ in all cases.
Again, there cannot be more than one rational number which does not belong either to (c) or to (C). For suppose there were two, say r and s, and let s be the greater. Then, both r and s must be greater than every c and less than every C; and so $C - c$ cannot be less than $s - r$. But,
$C - c = (A - a) + (B - b)$,
and we can choose a, b, A, B so that both $A - a$ and $B - b$ are as small as we like; and this plainly contradicts our hypothesis.
If every rational number belongs to (c) or to (C), the classes (c), (C) form a section of the rational numbers, that is to say, a number $\gamma$. If there is one which does not, we add it to (C). We have now a section or real number $\gamma$, which must clearly be rational, since it corresponds to the least member of (C). In any case we call $\gamma$ the sum of $\alpha$ and $\beta$, and write
$\gamma = \alpha + \beta$.
If both $\alpha$ and $\beta$ are rational, they are the least members of the upper classes (A) and (B). In this case it is clear that $\alpha + \beta$ is the least member of (C), so that our definition agrees with our previous ideas of addition.
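The key step in the definition of addition, namely that C − c = (A − a) + (B − b) can be made as small as we please, can be illustrated with exact rational arithmetic. The Python sketch below takes α = √2 and β = √3 as sample irrationals (an assumption made only for this example) and brackets each one between rationals by bisection. The function name `bracket_sqrt` is hypothetical.

```python
from fractions import Fraction


def bracket_sqrt(m: int, steps: int):
    """Return rationals (a, A) with a*a < m < A*A, narrowed by bisection.

    m is assumed to be an integer >= 2 that is not a perfect square.
    """
    lo, hi = Fraction(1), Fraction(m)  # 1 < sqrt(m) < m for m >= 2
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < m:
            lo = mid   # mid belongs to the lower class
        else:
            hi = mid   # mid belongs to the upper class
    return lo, hi


if __name__ == "__main__":
    a, A = bracket_sqrt(2, 20)   # a < sqrt(2) < A
    b, B = bracket_sqrt(3, 20)   # b < sqrt(3) < B
    c, C = a + b, A + B          # members of the classes (c) and (C)
    print(C - c == (A - a) + (B - b))
    print(float(c), float(C))
```

Each extra bisection step halves both A − a and B − b, so the gap C − c shrinks as far as desired.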
(ii) Subtraction. We define $\alpha - \beta$ by the equation $\alpha - \beta = \alpha + (-\beta)$.
The idea of subtraction accordingly presents no fresh difficulties.
1) If r and s are rational numbers, then $r + s$, $r - s$, $rs$, and $r/s$ are rational numbers, unless in the last case $s = 0$ (when $r/s$ is of course meaningless).
Part i): Given r and s are rational numbers. Let $r = a/b$, $s = c/d$, where a, b, c and d are integers, b and d are positive, a and b have no common factor, and c and d have no common factor.
Then, $r + s = \frac{ad + bc}{bd}$, which is clearly rational, as both the numerator and the denominator are again integers (closure under addition and multiplication) and $bd \neq 0$.
Part ii) Similar to part (i).
Part iii) By closure in multiplication.
Part iv) By definition of division in fractions, and closure in multiplication.
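The four parts can be mirrored with exact integer arithmetic. The Python sketch below builds each result from the integers a, b, c, d exactly as in the argument above. The function name `arithmetic` is made up for this illustration.

```python
from fractions import Fraction


def arithmetic(a: int, b: int, c: int, d: int):
    """For r = a/b and s = c/d (b, d nonzero), return r+s, r-s, r*s, r/s.

    Each result is formed as a ratio of two integers; the quotient is None
    when s == 0, since r/s is then meaningless.
    """
    if b == 0 or d == 0:
        raise ValueError("denominators must be nonzero")
    total = Fraction(a * d + b * c, b * d)
    difference = Fraction(a * d - b * c, b * d)
    product = Fraction(a * c, b * d)
    quotient = Fraction(a * d, b * c) if c != 0 else None
    return total, difference, product, quotient


if __name__ == "__main__":
    print(arithmetic(1, 2, 1, 3))
```

For r = 1/2 and s = 1/3 this gives 5/6, 1/6, 1/6 and 3/2.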
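2) If $\lambda$, m, n are positive rational numbers, and $m > n$, then prove that $\lambda(m^2 - n^2)$, $2\lambda mn$, $\lambda(m^2 + n^2)$ are positive rational numbers. Hence, show how to determine any number of right-angled triangles the lengths of all of whose sides are rational.
This follows from problem 1 where we proved that the addition, subtraction and multiplication of rational numbers is rational; positivity follows from $m > n > 0$ and $\lambda > 0$.
Also, Pythagoras’ theorem holds in the following manner: $[\lambda(m^2 - n^2)]^2 + (2\lambda mn)^2 = [\lambda(m^2 + n^2)]^2$.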
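A minimal Python sketch of this construction, using exact rational arithmetic, is given here. The function name `rational_triangle` is made up for this illustration.

```python
from fractions import Fraction


def rational_triangle(lam, m, n):
    """Return the sides (lam*(m^2 - n^2), 2*lam*m*n, lam*(m^2 + n^2)).

    lam, m, n are positive rationals with m > n.
    """
    lam, m, n = Fraction(lam), Fraction(m), Fraction(n)
    if not (lam > 0 and m > n > 0):
        raise ValueError("need lam > 0 and m > n > 0")
    return lam * (m * m - n * n), 2 * lam * m * n, lam * (m * m + n * n)


if __name__ == "__main__":
    for lam, m, n in [(1, 2, 1), (Fraction(1, 2), Fraction(3, 2), Fraction(1, 3))]:
        x, y, z = rational_triangle(lam, m, n)
        print(x, y, z, x * x + y * y == z * z)
```

Different choices of the three parameters give as many rational right-angled triangles as desired; the first choice above reproduces the 3, 4, 5 triangle.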
3) Any terminated decimal represents a rational number whose denominator contains no factors other than 2 or 5. Conversely, any such rational number can be expressed, and in one way only, as a terminated decimal.
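Proof Part 1:
A terminated decimal with k digits after the decimal point equals $N/10^k$ for some integer N. Since $10^k = 2^k 5^k$, the denominator of this fraction in lowest terms divides $2^k 5^k$, and so contains no prime factors other than 2 or 5.
Proof Part 2:
Conversely, let the rational number be $p/(2^a 5^b)$, and let $k = \max(a, b)$. Multiplying the numerator and the denominator by $2^{k-a} 5^{k-b}$ gives a fraction with denominator $10^k$, which is a terminated decimal. The representation is unique since the process of division produces a unique quotient.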
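The criterion can be tried out with a short Python sketch that strips the factors 2 and 5 from the denominator and, when nothing else remains, writes out the decimal. The function name `terminating_decimal` is made up for this illustration.

```python
from fractions import Fraction


def terminating_decimal(q: Fraction):
    """Return the decimal expansion of q as a string if it terminates,
    otherwise None."""
    den = q.denominator          # Fraction keeps q in lowest terms
    twos = fives = 0
    while den % 2 == 0:
        den //= 2
        twos += 1
    while den % 5 == 0:
        den //= 5
        fives += 1
    if den != 1:
        return None              # some prime other than 2 or 5 divides it
    k = max(twos, fives)         # number of digits after the point
    scaled = (q * 10 ** k).numerator
    sign = "-" if scaled < 0 else ""
    digits = str(abs(scaled)).rjust(k + 1, "0")
    if k == 0:
        return sign + digits
    return sign + digits[:-k] + "." + digits[-k:]


if __name__ == "__main__":
    for q in [Fraction(3, 8), Fraction(7, 20), Fraction(1, 25), Fraction(1, 3)]:
        print(q, terminating_decimal(q))
```

The number of decimal places is the larger of the two exponents of 2 and 5 in the reduced denominator.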
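4) The positive rational numbers may be arranged in the form of a simple series as follows: $\frac{1}{1}, \frac{2}{1}, \frac{1}{2}, \frac{3}{1}, \frac{2}{2}, \frac{1}{3}, \frac{4}{1}, \frac{3}{2}, \frac{2}{3}, \frac{1}{4}, \ldots$
Show that $p/q$ is the $[\frac{1}{2}(p + q - 1)(p + q - 2) + q]$th term of the series.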
Suggested idea. Try by mathematical induction.
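Before attempting the induction, the formula can be checked by brute force. The Python sketch below assumes the series lists the fractions p/q in groups of constant p + q, with q increasing inside each group (1/1, 2/1, 1/2, 3/1, 2/2, 1/3, ...). The function names are made up for this illustration.

```python
def position(p: int, q: int) -> int:
    """Claimed 1-based position of p/q in the series."""
    return (p + q - 1) * (p + q - 2) // 2 + q


def series(count: int):
    """First `count` terms of the series as (p, q) pairs."""
    terms = []
    total = 2                      # current value of p + q
    while len(terms) < count:
        for q in range(1, total):  # q runs from 1 to total - 1
            terms.append((total - q, q))
        total += 1
    return terms[:count]


if __name__ == "__main__":
    for index, (p, q) in enumerate(series(10), start=1):
        print(index, f"{p}/{q}", position(p, q))
```

The groups with p + q = 2, 3, ..., p + q − 1 contain 1 + 2 + ... + (p + q − 2) terms in all, which is the first summand of the formula; q then locates the term inside its own group.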
The leading lights at Courant were very much at the forefront of rapid progress, stimulated by World War II, in certain kinds of differential equations that serve as mathematical models for an immense variety of physical phenomena involving some sort of change. By the mid-fifties, as Fortune noted, mathematicians knew relatively simple routines for solving ordinary differential equations using computers. But there were no straightforward methods for solving most nonlinear partial differential equations that crop up when large or abrupt changes occur — such as equations that describe the aerodynamic shock waves produced when a jet accelerates past the speed of sound. In his 1958 obituary of von Neumann, who did important work in this field in the thirties, Stanislaw Ulam called such systems of equations “baffling analytically” saying that they “defy even qualitative insights by present methods.” As Nash was to write that same year, “The open problems in the area of non-linear partial differential equations are very relevant to applied mathematics and science as a whole, perhaps, more so than the open problems in any other area of mathematics, and this field seems poised for rapid development. It seems clear however that fresh methods must be employed.”
Nash, partly because of his contact with Norbert Wiener and perhaps his earlier interaction with Weinstein at Carnegie, was already interested in the problem of turbulence. Turbulence refers to the flow of gas or liquid over any uneven surface, like water rushing into a bay, heat or electrical charges travelling through metal, oil escaping from an underground pool, or clouds skimming over an air mass. It should be possible to model such motion mathematically. But, it turns out to be extremely difficult. As Nash wrote:
Little is known about the existence, uniqueness and smoothness of solutions of the general equations of flow for a viscous, compressible, and heat conducting fluid. These are a non-linear parabolic system of equations. An interest in these questions led us to undertake this work. It became clear that nothing could be done about the continuum description of general fluid flow without the ability to handle non-linear parabolic equations and this in turn required an a priori estimate of continuity.
It was Louis Nirenberg, a short, myopic, and sweet-natured young protege of Courant’s, who had handed Nash a major unsolved problem in the then fairly new field of nonlinear theory. Nirenberg, also in his twenties then, and already a formidable analyst, found Nash a bit strange. “He’d often seemed to have an internal smile, as if he was thinking of a private joke, as if he was laughing at a private joke that he had never told anyone about.” But he was extremely impressed with the technique Nash had invented for solving his embedding theorem and sensed that Nash might be the man to crack an extremely difficult outstanding problem that had been open since the late 1930s:
He (Nirenberg) had recalled:
I worked in partial differential equations. I also worked in geometry. The problem had to do with certain kinds of inequalities associated with elliptic partial differential equations. The problem had been around in the field for some time and a number of people had worked on it. Someone had obtained such estimates much earlier in the 1930s in two dimensions. But the problem was open for almost thirty years in higher dimensions.
Nash had begun working on the problem almost as soon as Nirenberg suggested it, although he knocked on doors until he had been satisfied that the problem was as important as Nirenberg had claimed. Peter Lax, who was one of those he had consulted, had commented some time back: “In physics, everybody knows the most important problems. They are well-defined. Not so in mathematics. People are more introspective. For Nash, though, it had to be important in the opinion of others.”
Nash had started visiting Nirenberg’s office to discuss his progress. But it was weeks before Nirenberg got any real sense that Nash was getting anywhere. “We would meet often. Nash would say, ‘I seem to need such and such an inequality. I think it’s true that…’” Very often, Nash’s speculations were far off the mark. “He was sort of groping. He gave that impression. I wasn’t very confident he was going to get through.”
Nirenberg had then sent Nash around to talk to Lars Hormander, a tall, steely Swede who was then already one of the top scholars in the field. Precise, careful, and immensely knowledgeable, Hormander knew Nash by reputation, but had reacted even more skeptically than Nirenberg. “Nash had learned from Nirenberg the importance of extending the Holder estimates known for second-order elliptic equations with two variables and irregular coefficients to higher dimensions,” Hormander had recalled in 1997. “He came to see me several times. ‘What did I think of such and such an inequality?’ At first, his conjectures were obviously false. They were easy to disprove by known facts on constant coefficient operators. He was rather inexperienced in these matters. Nash did things from scratch without using standard techniques. He was always trying to extract problems…(from conversations with others). He had not the patience to study them.”
Nash had continued to grope, but with more success. “After a couple more times,” said Hormander, “he would come up with things that were not so obviously wrong.”
By the spring, Nash was able to obtain basic existence, uniqueness, and continuity theorems, once again using novel methods of his own invention. He had a theory that difficult problems couldn’t be attacked frontally. He had approached the problem in an ingeniously roundabout manner, first transforming the nonlinear equations into linear equations and then attacking these by nonlinear means. “It was a stroke of genius,” said Peter Lax, who had followed the progress of Nash’s research closely. “I have never seen that done. I always kept it in my mind, thinking, maybe it will work in another circumstance.”
(Note: Peter Lax is an earlier Abel Laureate).
Nash’s new result had gotten far more immediate attention than his embedding theorem. It had convinced Nirenberg, too, that Nash was a genius. Hormander’s mentor at the University of Lund, Lars Garding, a world class specialist in partial differential equations, had immediately declared, “You have to be a genius to do that.”