
The Infinite

Working with the infinite is tricky business. Zeno’s paradoxes first alerted philosophers to this in 450 B.C.E. when he argued that a fast runner such as Achilles has an infinite number of places to reach during the pursuit of a slower runner. Since then, there has been a struggle to understand how to use the notion of infinity in a coherent manner. This article concerns the significant and controversial role that the concepts of infinity and the infinite play in the disciplines of philosophy, physical science, and mathematics.

Philosophers want to know whether there is more than one coherent concept of infinity; which entities and properties are infinitely large, infinitely small, infinitely divisible, and infinitely numerous; and what arguments can justify answers one way or the other.

Here are four suggested examples of these different ways to be infinite. The density of matter at the center of a black hole is infinitely large. An electron is infinitely small. An hour is infinitely divisible. The integers are infinitely numerous. These four claims are ordered from most to least controversial, although all four have been challenged in the philosophical literature.

This article also explores a variety of other questions about the infinite. Is the infinite something indefinite and incomplete, or is it complete and definite? What does Thomas Aquinas mean when he says God is infinitely powerful? Was Gauss, who was one of the greatest mathematicians of all time, correct when he made the controversial remark that scientific theories involve infinities merely as idealizations and merely in order to make for easy applications of those theories, when in fact all physically real entities are finite? How did the invention of set theory change the meaning of the term “infinite”? What did Cantor mean when he said some infinities are smaller than others? Quine said the first three sizes of Cantor’s infinities are the only ones we have reason to believe in. Mathematical Platonists disagree with Quine. Who is correct? We shall see that there are deep connections among all these questions.

Table of Contents

  1. What “Infinity” Means
    1. Actual, Potential, and Transcendental Infinity
    2. The Rise of the Technical Terms
  2. Infinity and the Mind
  3. Infinity in Metaphysics
  4. Infinity in Physical Science
    1. Infinitely Small and Infinitely Divisible
    2. Singularities
    3. Idealization and Approximation
    4. Infinity in Cosmology
  5. Infinity in Mathematics
    1. Infinite Sums
    2. Infinitesimals and Hyperreals
    3. Mathematical Existence
    4. Zermelo-Fraenkel Set Theory
    5. The Axiom of Choice and the Continuum Hypothesis
  6. Infinity in Deductive Logic
    1. Finite and Infinite Axiomatizability
    2. Infinitely Long Formulas
    3. Infinitely Long Proofs
    4. Infinitely Many Truth Values
    5. Infinite Models
    6. Infinity and Truth
  7. Conclusion
  8. References and Further Reading

1. What “Infinity” Means

The term “the infinite” refers to whatever it is that the word “infinity” correctly applies to. For example, the infinite integers exist just in case there is an infinity of integers. We also speak of infinite quantities, but what does it mean to say a quantity is infinite? In 1851, Bernard Bolzano argued in The Paradoxes of the Infinite that, if a quantity is to be infinite, then the measure of that quantity also must be infinite. Bolzano’s point is that we need a clear concept of infinite number in order to have a clear concept of infinite quantity. This idea of Bolzano’s has led to a new way of speaking about infinity, as we shall see.

The term “infinite” can be used for many purposes. The logician Alfred Tarski used it for dramatic purposes when he spoke about trying to contact his wife in Nazi-occupied Poland in the early 1940s. He complained, “We have been sending each other an infinite number of letters. They all disappear somewhere on the way. As far as I know, my wife has received only one letter.” (Feferman 2004, p. 137) Although the meaning of a term is intimately tied to its use, we can tell only a very little about the meaning of the term from Tarski’s use of it to exaggerate for dramatic effect.

Looking back over the last 2,500 years of use of the term “infinite,” three distinct senses stand out: actually infinite, potentially infinite, and transcendentally infinite. These will be discussed in more detail below, but briefly the concept of potential infinity treats infinity as an unbounded or non-terminating process developing over time. By contrast, the concept of actual infinity treats the infinite as timeless and complete. Transcendental infinity is the least precise of the three concepts and is more commonly used in discussions of metaphysics and theology to suggest transcendence of human understanding or human capability. To give some examples, the set of integers is actually infinite, and so is the number of locations (points of space) between London and Moscow. The maximum length of grammatical sentences in English is potentially infinite, and so is the total amount of memory in a Turing machine, an ideal computer. An omnipotent being’s power is transcendentally infinite.

For purposes of doing mathematics and science, the actual infinite has turned out to be the most useful of the three concepts. Using the idea of Bolzano’s mentioned above, Richard Dedekind gave the concept of the actual infinite a precise definition in 1888 when he redefined the term “infinity” for use in set theory, and Georg Cantor made the infinite, in this sense, an object of mathematical study. Before this turning point, the philosophical community would have said that Aristotle’s concept of potential infinity should be the concept used in mathematics and science.

a. Actual, Potential, and Transcendental Infinity

The Ancient Greeks generally conceived of the infinite as formless, characterless, indefinite, indeterminate, chaotic, and unintelligible. The term had negative connotations and was especially vague, having no clear criteria for distinguishing the finite from the infinite. In his treatment of Zeno’s paradoxes about infinite divisibility, Aristotle (384-322 B.C.E.) made a positive step toward clarification by distinguishing two different concepts of infinity, potential infinity and actual infinity. The latter is also called complete infinity and completed infinity. The actual infinite is not a process in time; it is an infinity that exists wholly at one time. By contrast, Aristotle spoke of the potentially infinite as a never-ending process over time. The word “potential” is being used in a technical sense. A potential swimmer can learn to become an actual swimmer, but a potential infinity cannot become an actual infinity. Aristotle argued that all the problems involving reasoning with infinity are really problems of improperly applying the incoherent concept of actual infinity instead of the coherent concept of potential infinity. (See Aristotle’s Physics, Book III, for his account of infinity.)

For its day, this was a successful way of treating Zeno’s Achilles paradox since, if Zeno had confined himself to using only potential infinity, he would not have been able to develop his paradoxical argument. Here is why. Zeno said that to go from the start to the finish line, the runner must reach the place that is halfway there; then, after arriving at this place, he still must reach the place that is half of the remaining distance; and after arriving there he again must reach the new place that is now halfway to the goal, and so on. These are too many places to reach because there is no end to these places: for any one of them there is another. Zeno made the mistake, according to Aristotle, of supposing that this infinite process needs completing when it really doesn’t; the finitely long path from start to finish exists undivided for the runner, and it is the mathematician who is demanding the completion of such a process. Without that concept of a completed infinite process there is no paradox.

Although today’s standard treatment of the Achilles paradox disagrees with Aristotle and says Zeno was correct to use the concept of a completed infinity and to imply the runner must go to an actual infinity of places in a finite time, Aristotle had so many other intellectual successes that his ideas about infinity dominated the Western world for the next two thousand years.
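The standard treatment relies on the fact that Zeno’s infinitely many halving distances have a finite sum. The following sketch is a modern illustration, not part of the historical debate; the unit path length and the function name are this sketch’s own choices. It computes partial sums of 1/2 + 1/4 + 1/8 + … and shows how they close in on 1:

```python
from fractions import Fraction

def zeno_partial_sum(n):
    """Sum of the first n halving distances: 1/2 + 1/4 + ... + 1/2^n."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n + 1))

# Each partial sum falls short of the whole path by exactly 1/2^n,
# so the gap shrinks toward zero, but no finite stage reaches 1.
for n in (1, 4, 10, 30):
    print(n, zeno_partial_sum(n), 1 - zeno_partial_sum(n))
```

Only the completed (actual) infinity of terms yields the exact total of 1, which is precisely the point at issue between Aristotle and the modern treatment.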

Even though Aristotle promoted the belief that “the idea of the actual infinite–of that whose infinitude presents itself all at once–was close to a contradiction in terms…,” (Moore 2001, 40) during those two thousand years others did not treat it as a contradiction in terms. Archimedes, Duns Scotus, William of Ockham, Gregory of Rimini, and Leibniz made use of it. Archimedes used it, but had doubts about its legitimacy. Leibniz used it but had doubts about whether it was needed.

Here is an example of how Gregory of Rimini argued in the fourteenth century for the coherence of the concept of actual infinity:

If God can endlessly add a cubic foot to a stone–which He can–then He can create an infinitely big stone. For He need only add one cubic foot at some time, another half an hour later, another a quarter of an hour later than that, and so on ad infinitum. He would then have before Him an infinite stone at the end of the hour. (Moore 2001, 53)

Leibniz envisioned the world as being an actual infinity of mind-like monads, and in (Leibniz 1702) he freely used the concept of being infinitesimally small in his development of the calculus in mathematics.

The term “infinity” that is used in contemporary mathematics and science is based on a technical development of this earlier, informal concept of actual infinity. This technical concept was not created until late in the 19th century.

b. The Rise of the Technical Terms

In the centuries after the decline of ancient Greece, the word “infinite” slowly changed its meaning in Medieval Europe. Theologians promoted the idea that God is infinite because He is limitless, and this at least caused the word “infinity” to lose its negative connotations. By the end of the Medieval Period, the word had come to mean endless, unlimited, and immeasurable–but not necessarily chaotic. The question of its intelligibility and conceivability by humans was disputed.

Actual infinity is very different. There are actual infinities in the technical, post-1880s sense, which are neither endless, unlimited, nor immeasurable. A line segment one meter long is a good example. It is not endless because it is finitely long, and it is not a process because it is timeless. It is not unlimited because it is limited by both zero and one. It is not immeasurable because its length measure is one meter. Nevertheless, the one meter line is infinite in the technical sense because it has an actual infinity of sub-segments, and it has an actual infinity of distinct points. So, there definitely has been a conceptual revolution.

This can be very shocking to those people who are first introduced to the technical term “actual infinity.” It seems not to be the kind of infinity they are thinking about. The crux of the problem is that these people really are using a different concept of infinity. The sense of infinity in ordinary discourse these days is either the Aristotelian one of potential infinity or the medieval one that requires infinity to be endless, immeasurable, and perhaps to have connotations of perfection, inconceivability, and paradox. This article uses the name transcendental infinity for the medieval concept although there is no generally accepted name for the concept. A transcendental infinity transcends human limits and detailed knowledge; it might be incapable of being described by a precise theory. It might also be a cluster of concepts rather than a single one.

Those people who are surprised when first introduced to the technical term “actual infinity” are probably thinking of either potential infinity or transcendental infinity, and that is why, in any discussion of infinity, some philosophers will say that an appeal to the technical term “actual infinity” is changing the subject. Another reason why there is opposition to actual infinities is that they have so many counter-intuitive properties. For example, consider a continuous line that has an actual infinity of points. A single point on this line has no next point! Also, a one-dimensional continuous curve can fill a two-dimensional area. Equally counterintuitive is the fact that some actually infinite numbers are smaller than other actually infinite numbers. Looked at more optimistically, though, most other philosophers will say the rise of this technical term is yet another example of how the discovery of a new concept has propelled civilization forward.

Resistance to the claim that there are actual infinities has had two other sources. One is the belief that actual infinities cannot be experienced. The second is the belief that use of the concept of actual infinity leads to paradoxes, such as Zeno’s. Because the standard solution to Zeno’s Paradoxes makes use of calculus, the birth of the new technical definition of actual infinity is intimately tied to the development of calculus and thus to properly defining the mathematician’s real line, the linear continuum. Briefly, the reason is that science needs calculus; calculus needs the continuum; the continuum needs a very careful definition; and the best definition requires there to be actual infinities (not merely potential infinities) in the micro-structure and the overall macro-structure of the continuum.

Defining the continuum involves defining real numbers because the linear continuum is the intended model of the theory of real numbers just as the plane is the intended model for the theory of ordinary two-dimensional geometry. It was eventually realized by mathematicians that giving a careful definition to the continuum and to real numbers requires formulating their definitions within set theory. As part of that formulation, mathematicians found a good way to define a rational number in the language of set theory; then they defined a real number to be a certain pair of actually infinite sets of rational numbers. The continuum’s eventual definition required it to be an actually infinite collection whose elements are themselves infinite sets. The details are too complex to be presented here, but the curious reader can check any textbook in classical real analysis. The intuitive picture is that any interval or segment of the continuum is a continuum, and any continuum is a very special infinite set of points that are packed so closely together that there are no gaps. A continuum is perfectly smooth. This smoothness is reflected in there being a great many real numbers between any two real numbers.

Calculus is the area of mathematics that is more applicable to science than any other area. It can be thought of as a technique for treating a continuous change as being composed of an infinite number of infinitesimal changes. When calculus is applied to physical properties capable of change such as spatial location, ocean salinity or an electrical circuit’s voltage, these properties are represented with continuous variables that have real numbers for their values. These values are specific real numbers, not ranges of real numbers and not just rational numbers. Achilles’ location along the path to his goal is such a property.
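The idea of composing a continuous change out of ever-smaller changes can be made concrete with a finite approximation that improves as the subdivisions shrink. A minimal sketch follows; the function x² and the interval [0, 1] are arbitrary illustrative choices, not drawn from the text:

```python
def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum of f over [a, b] using n equal subintervals."""
    width = (b - a) / n
    return sum(f(a + i * width) for i in range(n)) * width

# Finer subdivisions approach the exact area under x^2 on [0, 1], namely 1/3.
# Calculus identifies that exact value with the limit of this infinite process.
for n in (10, 100, 1000, 100000):
    print(n, riemann_sum(lambda x: x * x, 0.0, 1.0, n))
```

No finite n delivers the exact value; the limit over the actual infinity of ever-finer subdivisions does, which is why the careful definition of the continuum matters to calculus.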

It took many centuries to rigorously develop the calculus. A very significant step in this direction occurred in 1888, when Richard Dedekind redefined the term “infinity” and Georg Cantor used that definition to create the first set theory, a theory that eventually was developed to the point where all classical mathematical theories could be embedded within it. Embedding a theory requires first defining its terms in the language of set theory, then translating its axioms and theorems into sentences of set theory, and then showing that these theorems follow logically from the axioms of set theory. (The axioms of any theory, such as set theory, are the special sentences of the theory that can always be assumed during the process of deducing the other theorems of the theory.) See the example in the Zeno’s Paradoxes article of how Dedekind used set theory and his new idea of “cuts” to define the real numbers in terms of infinite sets of rational numbers. In this way additional rigor was given to the concepts of mathematics, and more mathematicians were encouraged to accept the notion of actually infinite sets.
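Dedekind’s “cut” idea can be sketched computationally: a real number is represented by its set of rational lower bounds, coded here as a membership test. This is a hedged illustration of the idea only; the choice of √2 and the function name belong to this sketch, not to Dedekind’s actual construction:

```python
from fractions import Fraction

def below_sqrt2(q):
    """Lower set of the cut defining the square root of 2: all rationals q
    with q < 0 or q*q < 2. No rational sits exactly on the boundary."""
    return q < 0 or q * q < 2

# Rationals arbitrarily close to the cut fall cleanly on one side or the other:
assert below_sqrt2(Fraction(141421, 100000))       # 1.41421 is below
assert not below_sqrt2(Fraction(141422, 100000))   # 1.41422 is above
```

Because no rational q satisfies q² = 2, the boundary between the two halves of the cut is itself a new, irrational number, and each half of the cut is an actually infinite set of rationals.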

The new technical treatment of infinity that originated with Dedekind in 1888, and was adopted by Cantor in his new set theory, provided a definition of “infinite set” rather than simply “infinite.” Dedekind says an infinite set is a set that is not finite. The notion of a finite set can be defined in various ways. We might define it numerically, as a set having n members for some non-negative integer n. Dedekind found an essentially equivalent definition of “finite set” (equivalent assuming the axiom of choice, which will be discussed later), but Dedekind’s definition does not require mentioning numbers:

A (Dedekind) finite set is a set for which there exists no one-to-one correspondence between it and one of its proper subsets.

By placing the finger-tips of your left hand on the corresponding finger-tips of your right hand, you establish a one-to-one correspondence between the set of fingers of each hand; in that way you establish that there are the same number of fingers on each of your hands, without your needing to count the fingers. More generally, there is a one-to-one correspondence between two sets when each member of one set can be paired off with a unique member of the other set, so that neither set has an unpaired member.

Here is a one-to-one correspondence between the natural numbers and the even, positive numbers:

1, 2, 3, 4, ...

↕   ↕   ↕   ↕

2, 4, 6, 8, ...

Informally expressed, any infinite set can be matched up to a part of itself; so the whole is equivalent to a part. This is a surprising definition because, before this definition was adopted, the idea that actually infinite wholes are equinumerous with some of their parts was taken as clear evidence that the concept of actual infinity is inherently paradoxical. For a systematic presentation of the many alternative ways to successfully define “infinite set” non-numerically, see (Tarski 1924).
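The one-to-one correspondence above can be written out explicitly. The sketch below (the function names are this illustration’s own) pairs each positive integer n with 2n and checks, on a finite initial segment, that the pairing is one-to-one with no element on either side left unpaired:

```python
def to_even(n):
    """Pair the positive integer n with the even number 2n."""
    return 2 * n

def from_even(m):
    """Invert the pairing: recover n from the even number 2n."""
    assert m % 2 == 0
    return m // 2

naturals = list(range(1, 1001))
evens = [to_even(n) for n in naturals]

assert len(set(evens)) == len(naturals)                   # no two n share a partner
assert all(from_even(m) in set(naturals) for m in evens)  # the inverse recovers each n
```

A computer can only ever check a finite segment, of course; the definition concerns the whole infinite pairing, which is given by the rule n ↔ 2n rather than by any completed list.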

Dedekind’s new definition of “infinite” defines an actually infinite set, not a potentially infinite set, because the definition appeals to no continuing operation over time. The concept of a potentially infinite set is then given a new technical definition by saying a potentially infinite set is a growing, finite subset of an actually infinite set. Cantor expressed the point this way:

In order for there to be a variable quantity in some mathematical study, the “domain” of its variability must strictly speaking be known beforehand through a definition. However, this domain cannot itself be something variable…. Thus this “domain” is a definite, actually infinite set of values. Thus each potential infinite…presupposes an actual infinite. (Cantor 1887)

The new idea is that the potentially infinite set presupposes an actually infinite one. If this is correct, then Aristotle’s two notions of the potential infinite and actual infinite have been redefined and clarified.

Two sets are the same if any member of one is a member of the other, and vice versa. Order of the members is irrelevant to the identity of the set, and to the size of the set. Two sets are the same size if there exists a one-to-one correspondence between them. This definition of same size was recommended by both Cantor and Frege. Cantor defined “finite” by saying a set is finite if it is in one-to-one correspondence with the set {1, 2, 3, …, n} for some positive integer n; and he said a set is infinite if it is not finite.

Cardinal numbers are measures of the sizes of sets. There are many definitions of what a cardinal number is, but what is essential for cardinal numbers is that two sets have the same cardinal just in case there is a one-to-one correspondence between them; and set A has a smaller cardinal number than a set B (and so set A has fewer members than B) provided there is a one-to-one correspondence between A and a subset of B, but B is not the same size as A. In this sense, the set of even integers does not have fewer members than the set of all integers, although intuitively you might think it does.

How big is infinity? This question does not make sense for either potential infinity or transcendental infinity, but it does for actual infinity. Finite cardinal numbers such as 0, 1, 2, and 3 are measures of the sizes of finite sets, and transfinite cardinal numbers are measures of the sizes of actually infinite sets. The transfinite cardinals are aleph-null, aleph-one, aleph-two, and so on, which we represent with the numerals ℵ0, ℵ1, ℵ2, …. The smallest infinite size is ℵ0, which is the size of the set of natural numbers; it is called a countable infinity, and the other alephs are measures of the uncountable infinities. However, these are somewhat misleading terms since no process of counting is involved. Nobody would have the time to count from 0 to any aleph.

The set of even integers, the set of natural numbers and the set of rational numbers all can be shown to have the same size, but surprisingly they all are smaller than the set of real numbers. Any set of size ℵ0 is said to be countably infinite (or denumerably infinite or enumerably infinite). The set of points in the continuum and in any interval of the continuum turns out to be larger than ℵ0, although how much larger is still an open problem, called the continuum problem. A popular but controversial suggestion is that a continuum is of size ℵ1, the next larger size.
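The claim that the rationals are countable can be exhibited concretely with the standard diagonal enumeration, sketched here for the positive rationals (the generator name is this sketch’s own):

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def positive_rationals():
    """Yield every positive rational exactly once, diagonal by diagonal:
    all fractions p/q with p + q = 2, then with p + q = 3, and so on."""
    s = 2
    while True:
        for p in range(1, s):
            q = s - p
            if gcd(p, q) == 1:  # skip duplicates such as 2/2 or 2/4
                yield Fraction(p, q)
        s += 1

first_ten = list(islice(positive_rationals(), 10))
print(first_ten)  # each rational receives a unique position: 1st, 2nd, 3rd, ...
```

The position assigned to each rational is exactly the one-to-one correspondence with the natural numbers that the size claim requires. Cantor’s diagonal argument shows that no such complete listing exists for the real numbers, which is why the continuum is larger than ℵ0.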

When creating set theory, mathematicians did not begin with the belief that there would be so many points between any two points in the continuum nor with the belief that for any infinite cardinal there is a larger cardinal. These were surprising consequences discovered by Cantor. To many philosophers, this surprise is evidence that what is going on is not invention but rather is discovery about a mind-independent reality.

The intellectual community has always been wary of actually infinite sets. Before the discovery of how to embed calculus within set theory (a process that is also called giving calculus a basis in set theory), it could have been more easily argued that science does not need actual infinities. The burden of proof has now shifted, and the default position is that actual infinities are indispensable in mathematics and science, and anyone who wants to do without them must show that removing them does not do too much damage and has additional benefits. There are no known successful attempts to reconstruct the theories of mathematical physics without basing them on mathematical objects such as numbers and sets, but for one attempt to do so using second-order logic, see (Field 1980).

Here is why some mathematicians believe the set-theoretic basis is so important:

Just as chemistry was unified and simplified when it was realized that every chemical compound is made of atoms, mathematics was dramatically unified when it was realized that every object of mathematics can be taken to be the same kind of thing. There are now other ways than set theory to unify mathematics, but before set theory there was no such unifying concept. Indeed, in the Renaissance, mathematicians hesitated to add x2 to x3, since the one was an area and the other a volume. Since the advent of set theory, one can correctly say that all mathematicians are exploring the same mental universe. (Rucker 1982, p. 64)

But the significance of this basis can be exaggerated. The existence of the basis does not imply that mathematics is set theory.

However, paradoxes soon were revealed within set theory, by Cantor himself and then others, so the quest for a more rigorous definition of the mathematical continuum continued. Cantor’s own paradox surfaced in 1895 when he asked whether the set of all cardinal numbers has a cardinal number. Cantor showed that, if it does, then it doesn’t. Surely the set of all sets would have the greatest cardinal number, but Cantor showed that for any cardinal number there is a greater cardinal number. [For more details about this and the other paradoxes, see (Suppes 1960).] The most famous paradox of set theory is Russell’s Paradox of 1901. He showed that the set of all sets that are not members of themselves is both a member of itself and not a member of itself. Russell wrote that the paradox “put an end to the logical honeymoon that I had been enjoying.”

These and other paradoxes were eventually resolved satisfactorily by finding revised axioms of set theory that permit the existence of enough well-behaved sets so that set theory is not crippled [that is, made incapable of providing a basis for mathematical theories], and yet do not permit the existence of too many sets, the ill-behaved sets such as Cantor’s set of all cardinals and Russell’s set of all sets that are not members of themselves. By the mid-20th century it had become clear that, despite the existence of competing set theories, Zermelo-Fraenkel set theory (ZF) was the best, or at least the least radical, way to revise set theory so as to avoid all the known paradoxes and problems while preserving enough of our intuitive ideas about sets to deserve being called a set theory. By this time most mathematicians agreed that the continuum had been given a proper basis in ZF. See (Kleene 1967, pp. 189-191) for comments on this agreement about ZF’s success, for a list of the ZF axioms, and for a detailed explanation of why each axiom deserves to be an axiom.

Because of this success, and because it was clear enough that the concept of infinity used in ZF does not lead to contradictions, and because it seemed so evident how to use the concept in other areas of mathematics and science where the term “infinity” was being used, the definition of the concept of “infinite set” within ZF was claimed by many philosophers to be the paradigm example of how to provide a precise and fruitful definition of a philosophically significant concept. Much less attention was then paid to critics who had complained that we can never use the word “infinity” coherently because infinity is ineffable or inherently paradoxical.

Nevertheless there was, and still is, serious philosophical opposition to actually infinite sets and to ZF’s treatment of the continuum, and this has spawned the programs of constructivism, intuitionism, finitism and ultrafinitism, all of whose advocates have philosophical objections to actual infinities. Even though there is much to be said in favor of replacing a murky concept with a clearer, technical concept, there is always the worry that the replacement is a change of subject that hasn’t really solved the problems it was designed for. This discussion of the role of infinity in mathematics and science continues in later sections of this article.

2. Infinity and the Mind

Can humans grasp the concept of the infinite? This seems to be a profound question. Ever since Zeno, intellectuals have realized that careless reasoning about infinity can lead to paradox and perhaps “defeat” the human mind. Some critics of infinity argue that paradox is essential to, or inherent in, the use of the concept of infinity, so the infinite is beyond the grasp of the human mind. However, this criticism applies more properly to some forms of transcendental infinity rather than to either actual infinity or potential infinity.

A second reason to believe humans cannot grasp infinity is that the concept must contain an infinite number of parts or sub-ideas. A counter to this reason is to defend the psychological claim that if a person succeeds in thinking about infinity, it does not follow that the person needs to have an actually infinite number of ideas in mind at one time.

A third reason to believe the concept of infinity is beyond human understanding is that to have the concept one must have some accurate mental picture of infinity. Thomas Hobbes, who believed that all thinking is based on imagination, might remark that nobody could picture an infinite number of grains of sand at once. However, most contemporary philosophers of psychology believe mental pictures are not essential to having any concept. Regarding the concept of dog, you might have a picture of a brown dog in your mind and I might have a picture of a black dog in mine, but I can still understand you perfectly well when you say dogs frequently chase cats.

The main issue here is whether we can coherently think about infinity to the extent of being said to have the concept. Here is a simple argument that we can: If we understand negation and have the concept of finite, then the concept of infinite is merely the concept of not-finite. A second argument says the apparent consistency of set theory indicates that infinity in the technical sense of actual infinity is well within our grasp. And since potential infinity is definable in terms of actual infinity, it, too, is within our grasp.

Assuming that infinity is within our grasp, what is it that we are grasping? Philosophers disagree on the answer. In 1883, Cantor said

A set is a Many which allows itself to be thought of as a One.

Notice the dependence on thought. Cantor eventually clarified what he meant and was clear that he did not want set existence to depend on mental capability. What he really believed is that a set is a collection of well-defined and distinct objects that exists independently of being thought of, but that could be thought of by a powerful enough mind.

3. Infinity in Metaphysics

There is a concept which corrupts and upsets all others. I refer not to Evil, whose limited realm is that of ethics; I refer to the infinite. —Jorge Luis Borges.

Shakespeare declared, “The will is infinite.” Is he correct or just exaggerating? Critics of Shakespeare, interpreted literally, might argue that the will is basically a product of different brain states. Because a person’s brain contains approximately 10^27 atoms, these atoms have only a finite number of configurations or states, and so, regardless of whether we interpret Shakespeare’s remark as implying that the will is unbounded (is potentially infinite) or that the will produces an infinite number of brain states (is actually infinite), the will is not infinite. But perhaps Shakespeare was speaking metaphorically and did not intend to be taken literally, or perhaps he meant to use some version of transcendental infinity that makes infinity be somehow beyond human comprehension.

Contemporary Continental philosophers often speak that way. Emmanuel Levinas says the infinite is another name for the Other, for the existence of other conscious beings besides ourselves whom we are ethically responsible for. We “face the infinite” in the sense of facing a practically incomprehensible and unlimited number of possibilities upon encountering another conscious being. (See Levinas 1961.) If we ask what sense of “infinite” is being used by Levinas, it may be yet another concept of infinity, or it may be some kind of transcendental infinity. Another interpretation is that he is exaggerating about the number of possibilities and should say instead that there are too many possibilities to be faced when we encounter another conscious being and that the possibilities are not readily predictable because other conscious beings make free choices, the causes of which often are not known even to the person making the choice.

Leibniz was one of the few persons in earlier centuries who believed in actually infinite sets, but he did not believe in infinite numbers. Cantor did. Referring to his own discovery of the transfinite cardinals ℵ0, ℵ1, ℵ2, .... and their properties, Cantor claimed his work was revealing God’s existence and that these mathematical objects were in the mind of God. He claimed God gave humans the concept of the infinite so that they could reflect on His perfection. Influential German neo-Thomists such as Constantin Gutberlet agreed with Cantor. Some Jesuit math instructors claim that by taking a calculus course and understanding infinity, students are getting closer to God. Their critics complain that these mystical ideas about infinity and God are too speculative.

When metaphysicians speak of infinity they use all three concepts: potential infinity, actual infinity, and transcendental infinity. But when they speak about God being infinite, they are usually interested in implying that God is beyond human understanding or that there is a lack of a limit on particular properties of God, such as God's goodness and knowledge and power.

The connection between infinity and God exists in nearly all of the world’s religions. It is prominent in Hindu, Muslim, Jewish, and Christian literature. For example, in chapter 11 of the Bhagavad Gita of Hindu scripture, Krishna says, “O Lord of the universe, I see You everywhere with infinite form....”

Plato did not envision God (the Demiurge) as infinite because he viewed God as perfect, and he believed anything perfect must be limited and thus not infinite, because the infinite was defined as an unlimited, unbounded, indefinite, unintelligible chaos.

But the meaning of the term “infinite” slowly began to change. Over six hundred years later, the Neo-Platonist philosopher Plotinus was one of the first important Greek philosophers to equate God with the infinite, although he did not do so explicitly. He said instead that any idea abstracted from our finite experience is not applicable to God. He probably believed that if God were finite in some aspect, then there could be something beyond God and therefore God wouldn’t be “the One.” Plotinus was influential in helping remove the negative connotations that had accompanied the concept of the infinite. One difficulty here, though, is that it is unclear whether metaphysicians have discovered that God is identical with the transcendentally infinite or whether they are simply defining “God” to be that way. A more severe criticism is that perhaps they are just defining “infinite” (in the transcendental sense) as whatever God is.

Augustine, who merged Platonic philosophy with the Christian religion, spoke of God “whose understanding is infinite” for “what are we mean wretches that dare presume to limit His knowledge?” Augustine wrote that the reason God can understand the infinite is that “...every infinity is, in a way we cannot express, made finite to God....” [City of God, Book XII, ch. 18] This is an interesting perspective. Medieval philosophers debated whether God could understand infinite concepts other than Himself, not because God had limited understanding, but because there was no such thing as infinity anywhere except in God.

The medieval philosopher Thomas Aquinas, too, said God has infinite knowledge. He definitely did not mean potentially infinite knowledge. The technical definition of actual infinity might be useful here. If God is infinitely knowledgeable, this can be understood perhaps as meaning that God knows the truth values of all declarative sentences and that the set of these sentences is actually infinite.

Aquinas argued in his Summa Theologiae that, although God created everything, nothing created by God can be actually infinite. His main reason was that anything created can be counted, yet if an infinity were created, then the count would be infinite, but no infinite numbers exist to do the counting (as Aristotle had also said). This argument was more persuasive in Aquinas’ day than it is now, because in the late 19th century Cantor created (or discovered) the infinite numbers that could do such counting.

René Descartes believed God was actually infinite, and he remarked that the concept of actual infinity is so awesome that no human could have created it or deduced it from other concepts, so any idea of infinity that humans have must have come from God directly. Thus God exists. Descartes is using the concept of infinity to produce a new ontological argument for God’s existence.

David Hume, and many other philosophers, raised the problem that if God has infinite power, then there need not be evil in the world, and if God has infinite goodness, then there should not be any evil in the world. This problem is often referred to as “The Problem of Evil,” and it has been a long-standing point of contention for theologians.

Spinoza and Hegel envisioned God, or the Absolute, pantheistically. If they are correct, then to call God infinite is to call the world itself infinite. Hegel denigrated Aristotle’s advocacy of potential infinity and claimed the world is actually infinite. Traditional Christian, Muslim and Jewish metaphysicians do not accept the pantheistic notion that God is at one with the world. Instead they say God transcends the world. Since God is outside space and time, the space and time that he created may or may not be infinite, depending on God’s choice, but surely everything else he created is finite, they say.

The multiverse theories of cosmology in the early 21st century allow there to be an uncountable infinity of universes within a background space whose volume is actually infinite. The universe created by our Big Bang is just one of these many universes. Christian theologians balk at the notion of God choosing to create this multiverse because the theory implies that, although there are so many universes radically different from ours, there also are an actually infinite number of copies of ours, which implies there are an infinite number of Jesuses who have been crucified on the cross. The removal of the uniqueness of Jesus is apparently a removal of his dignity. Augustine had this worry when considering infinite universes, and he responded that "Christ died once for sinners...."

There are many other entities and properties that some metaphysician or other has claimed are infinite: places, possibilities, propositions, properties, particulars, partial orderings, pi’s decimal expansion, predicates, proofs, Plato’s forms, principles, power sets, probabilities, positions, and possible worlds. That is just for the letter p. Some of these are considered to be abstract objects, objects outside of space and time, and others are considered to be concrete objects, objects within, or part of, space and time.

For helpful surveys of the history of infinity in theology and metaphysics, see (Owen 1967) and (Moore 2001).

4. Infinity in Physical Science

From a metaphysical perspective, the theories of mathematical physics seem to be ontologically committed to objects and their properties. If any of those objects or properties are infinite, then physics is committed to there being infinity within the physical world.

Here are four suggested examples where infinity occurs within physical science. (1) Standard cosmology based on Einstein’s general theory of relativity implies the density of the mass at the center of a simple black hole is infinitely large (even though the black hole’s total mass is finite). (2) The Standard Model of particle physics implies the size of an electron is infinitely small. (3) General relativity implies that every path in space is infinitely divisible. (4) Classical quantum theory implies the values of kinetic energy of an accelerating, free electron are infinitely numerous. These four kinds of infinities are implied by theory and argumentation; they are not something that could be measured directly.

Objecting to taking scientific theories at face value, the 18th century British empiricists George Berkeley and David Hume denied the physical reality of even potential infinities on the empiricist grounds that such infinities are not detectable by our sense organs. Most philosophers of the 21st century would say that Berkeley’s and Hume’s empirical standards are too rigid because they are based on the mistaken assumption that our knowledge of reality must be a complex built up from simple impressions gained from our sense organs.

But in the spirit of Berkeley and Hume’s empiricism, instrumentalists also challenge any claim that science tells us the truth about physical infinities. The instrumentalists say that all theories of science are merely effective “instruments” designed for explanatory and predictive success. A scientific theory’s claims are neither true nor false. By analogy, a shovel is an effective instrument for digging, but a shovel is neither true nor false. The instrumentalist would say our theories of mathematical physics imply only that reality looks “as if” there are physical infinities. Some realists on this issue respond that to declare it to be merely a useful mathematical fiction that there are physical infinities is just as misleading as to say it is a mere fiction that moving planets actually have inertia or petunias actually contain electrons. We have no other tool than theory-building for accessing the existing features of reality that are not directly perceptible. If our best theories—those that have been well tested and are empirically successful and make novel predictions—use theoretical terms that refer to infinities, then infinities must be accepted. See (Leplin 2000) for more details about anti-realist arguments, such as those of instrumentalism and constructive empiricism.

a. Infinitely Small and Infinitely Divisible

Consider the size of electrons and quarks, the two main components of atoms. All scientific experiments so far have been consistent with electrons and quarks having no internal structure (components), as our best scientific theories imply, so the "simple conclusion" is that electrons are infinitely small, or infinitesimal, and zero-dimensional. Is this “simple conclusion” too simple? Some physicists speculate that there are no physical particles this small and that, in each subsequent century, physicists will discover that all the particles of the previous century have a finite size due to some inner structure. However, most physicists withhold judgment on this point about the future of physics.

A second reason to question whether the “simple conclusion” is too simple is that electrons, quarks, and all other elementary particles behave in a quantum mechanical way. They have a wave nature as well as a particle nature, and they have these simultaneously. When probing an electron’s particle nature, it is found to have no limit to how small it can be, but when probing the electron’s wave nature, the electron is found to be spread out through all of space, although it is more likely to be found in some places than in others. Also, quantum theory is about groups of objects, not a single object. The theory does not imply a definite result for a single observation but only for averages over many observations, which is why quantum theory introduces an inescapable randomness or unpredictability into claims about single objects and single experimental results. The more accurate theory of quantum electrodynamics (QED), which incorporates special relativity and improves on classical quantum theory for the smallest regions, also implies electrons are infinitesimal particles when viewed as particles, while they are wavelike or spread out when viewed as waves. When considering the electron’s particle nature, QED’s prediction of zero volume has been experimentally verified down to the limits of measurement technology. The measurement process is limited by the fact that light or other electromagnetic radiation must be used to locate the electron, and this light cannot be used to determine the position of the electron more accurately than the distance between the wave crests of the light wave used to bombard the electron. So, all this is why the “simple conclusion” mentioned at the beginning of this paragraph may be too simple. For more discussion, see the chapter “The Uncertainty Principle” in (Hawking 2001) or (Greene 1999, pp. 121-2).

If a scientific theory implies space is a continuum, with the structure of a mathematical continuum, then if that theory is taken at face value, space is infinitely divisible and composed of infinitely small entities, the so-called points of space. But should it be taken at face value? The mathematician David Hilbert declared in 1925, “A homogeneous continuum which admits of the sort of divisibility needed to realize the infinitely small is nowhere to be found in reality. The infinite divisibility of a continuum is an operation which exists only in thought.” Many physicists agree with Hilbert, but many others argue that, although Hilbert is correct that ordinary entities such as strawberries and cream are not continuous, he is ultimately incorrect, for the following reasons.

First, the Standard Model of particles and forces is one of the best tested and most successful theories in all the history of physics. So are the theories of relativity and quantum mechanics. All these theories imply or assume that, using Cantor’s technical sense of actual infinity, there are infinitely many infinitesimal instants in any non-zero duration, and there are infinitely many point places along any spatial path. So, time is a continuum, and space is a continuum.

The second challenge to Hilbert’s position is that quantum theory, in agreement with relativity theory, implies that for any possible kinetic energy of a free electron there is half that energy, insofar as an electron can be said to have a value of energy independent of being measured to have it. Although the energy of an electron bound within an atom is quantized, the energy of an unbound or free electron is not. If it accelerates in its reference frame from zero to nearly the speed of light, its energy changes and takes on all intermediate real-numbered values from its rest energy to its total energy. But mass is just a form of energy, as Einstein showed in his famous equation E = mc^2, so in this sense mass is a continuum as well as energy.

How about non-classical quantum mechanics, the proposed theories of quantum gravity that are designed to remove the disagreements between quantum mechanics and relativity theory? Do these non-classical theories quantize all these continua we’ve been talking about? One such theory, the theory of loop quantum gravity, implies space consists of discrete units called loops. But string theory, which is the more popular of the theories of quantum gravity in the early 21st century, does not imply space is discontinuous. [See (Greene 2004) for more details.] Speaking about this question of continuity, the theoretical physicist Brian Greene says that, although string theory is developed against a background of continuous spacetime, his own insight is that

[T]he increasingly intense quantum jitters that arise on decreasing scales suggest that the notion of being able to divide distances or durations into ever smaller units likely comes to an end at around the Planck length (10^-33 centimeters) and Planck time (10^-43 seconds). ...There is something lurking in the microdepths−something that might be called the bare-bones substrate of spacetime−the entity to which the familiar notion of spacetime alludes. We expect that this ur-ingredient, this most elemental spacetime stuff, does not allow dissection into ever smaller pieces because of the violent fluctuations that would ultimately be encountered.... [If] familiar spacetime is but a large-scale manifestation of some more fundamental entity, what is that entity and what are its essential properties? As of today, no one knows. (Greene 2004, pp. 473, 474, 477)

Disagreeing, the theoretical physicist Roger Penrose speaks about both loop quantum gravity and string theory and says:

...in the early days of quantum mechanics, there was a great hope, not realized by future developments, that quantum theory was leading physics to a picture of the world in which there is actually discreteness at the tiniest levels. In the successful theories of our present day, as things have turned out, we take spacetime as a continuum even when quantum concepts are involved, and ideas that involve small-scale spacetime discreteness must be regarded as ‘unconventional.’ The continuum still features in an essential way even in those theories which attempt to apply the ideas of quantum mechanics to the very structure of space and time.... Thus it appears, for the time being at least, that we need to take the use of the infinite seriously, particularly in its role in the mathematical description of the physical continuum. (Penrose 2005, 363)

b. Singularities

There is a good reason why scientists fear the infinite more than mathematicians do. Scientists have to worry that some day we will have a dangerous encounter with a singularity, with something that is, say, infinitely hot or infinitely dense. For example, we might encounter a singularity by being sucked into a black hole. According to Schwarzschild’s solution to the equations of general relativity, a simple, non-rotating black hole is infinitely dense at its center. For a second example of where there may be singularities, there is good reason to believe that 13.8 billion years ago the entire universe was a singularity with infinite temperature, infinite density, infinitesimal volume, and infinite curvature of spacetime.

Some philosophers will ask: Is it not proper to appeal to our best physical theories in order to learn what is physically possible? Usually, but not in this case, say many scientists, including Albert Einstein. He believed that, if a theory implies that some physical properties might have or, worse yet, do have actually infinite values (the so-called singularities), then this is a sure sign of error in the theory. It’s an error primarily because the theory will be unable to predict the behavior of the infinite entity, and so the theory will fail. For example, even if there were a large, shrinking universe pre-existing the Big Bang, if the Big Bang were considered to be an actual singularity, then knowledge of the state of the universe before the Big Bang could not be used to predict events after the Big Bang, or vice versa. This failure to imply the character of later states of the universe is what Einstein’s collaborator Peter Bergmann meant when he said, “A theory that involves singularities...carries within itself the seeds of its own destruction.” The majority of physicists probably would agree with Einstein and Bergmann about this, but the critics of these scientists say this belief that we need to remove singularities everywhere is merely a hope that has been turned into a metaphysical assumption.

But doesn’t quantum theory also rule out singularities? Yes. Quantum theory allows only arbitrarily large, finite values of properties such as temperature and mass-energy density. So which theory, relativity theory or quantum theory, should we trust to tell us whether the center of a black hole is or isn’t a singularity? The best answer is, “Neither, because we should get our answer from a theory of quantum gravity.” A principal attraction of string theory, a leading proposal for a theory of quantum gravity to replace both relativity theory and quantum theory, is that it eliminates the many singularities that appear in previously accepted physical theories such as relativity theory. In string theory, the electrons and quarks are not point particles but are small, finite loops of fundamental string. That finiteness in the loop is what eliminates the singularities.

Unfortunately, string theory has its own problems with infinity. It implies an infinity of kinds of particles. If a particle is a string, then the energy of the particle should be the energy of its vibrating string. Strings have an infinite number of possible vibrational patterns, each corresponding to a particle that should exist if we take the theory literally. One response that string theorists make to this problem about too many particles is that perhaps the infinity of particles did exist at the time of the Big Bang but they have since all disintegrated into a shower of simpler particles and so do not exist today. Another response favored by string theorists is that perhaps there never was an infinity of particles nor a Big Bang singularity in the first place. Instead the Big Bang was a Big Bounce or quick expansion from a pre-existing, shrinking universe whose size stopped shrinking when it got below the critical Planck length of about 10^-35 meters.

c. Idealization and Approximation

Scientific theories use idealization and approximation; they are "lies that help us to see the truth," to use a phrase from the painter Pablo Picasso (who was speaking about art, not science). In our scientific theories, there are ideal gases, perfectly elliptical orbits, and economic consumers motivated only by profit. Everybody knows these are not intended to be real objects. Yet, it is clear that idealizations and approximations are actually needed in science in order to promote genuine explanation of many phenomena. We need to reduce the noise of the details in order to see what is important. In short, approximations and idealizations can be explanatory. But what about approximations and idealizations that involve the infinite?

Although the terms “idealization” and “approximation” are often used interchangeably, John Norton (Norton 2012) recommends paying more attention to their difference. When there is some aspect of the world, some target system, that we are trying to understand scientifically, approximations should be considered inexact descriptions of the target system, whereas idealizations should be considered new systems, or parts of new systems, that also approximate the target system but that contain reference to some novel object or property. For example, elliptical orbits are approximations to actual orbits of planets, but ideal gases are idealizations because they contain novel objects, such as point particles, that are part of a new system that is useful for approximating the target system of actual gases.

All very detailed physical theories are idealizations or approximations to reality that can fail if pushed too far, but some defenders of infinity ask whether all appeals to infinity can be known a priori to be idealizations or approximations. Our theory of the solar system justifies our belief that the Earth is orbited by a moon, not just an approximate moon. The speed of light in a vacuum really is constant, not just approximately constant. Why then should it be assumed, as it often is, that all appeals to infinity in scientific theory are approximations or idealizations? Must the infinity be an artifact of the model rather than a feature of actual physical reality?  Philosophers of science disagree on this issue. See (Mundy, 1990, p. 290).

There is an argument for believing some appeals to infinity definitely are neither approximations nor idealizations. The argument presupposes a realist rather than an antirealist understanding of science, and it begins with a description of the opponents’ position. Carl Friedrich Gauss (1777-1855) was one of the greatest mathematicians of all time. He said scientific theories involve infinities merely as approximations or idealizations and merely in order to make for easy applications of those theories, when in fact all real entities are finite. At the time, nearly everyone would have agreed with Gauss. Roger Penrose argues against Gauss’ position:

Nevertheless, as tried and tested physical theory stands today—as it has for the past 24 centuries—real numbers still form a fundamental ingredient of our understanding of the physical world. (Penrose 2004, 62)

Gauss’ position could be buttressed if there were useful alternatives to our physical theories that do not use infinities. There actually are alternative mathematical theories of analysis that do not use real numbers and do not use infinite sets and do not require the line to be dense. See (Ahmavaara 1965) for an example. Representing the majority position among scientists on this issue, Penrose says, “To my mind, a physical theory which depends fundamentally upon some absurdly enormous...number would be a far more complicated (and improbable) theory than one that is able to depend upon a simple notion of infinity” (Penrose 2005, 359). David Deutsch agrees. He says, “Versions of number theory that confined themselves to ‘small natural numbers’ would have to be so full of arbitrary qualifiers, workarounds and unanswered questions, that they would be very bad explanations until they were generalized to the case that makes sense without such ad-hoc restrictions: the infinite case.” (Deutsch 2011, pp. 118-9) And surely a successful explanation is the surest route to understanding reality.

In opposition to this position of Penrose and Deutsch, and in support of Gauss’ position, the physicist Erwin Schrödinger remarks, “The idea of a continuous range, so familiar to mathematicians in our days, is something quite exorbitant, an enormous extrapolation of what is accessible to us.” Emphasizing this point about being “accessible to us,” some metaphysicians attack the applicability of the mathematical continuum to physical reality on the grounds that a continuous human perception over time is not mathematically continuous. Wesley Salmon responds to this complaint from Schrödinger:

...The perceptual continuum and perceived becoming [that is, the evidence from our sense organs that the world changes from time to time] exhibit a structure radically different from that of the mathematical continuum. Experience does seem, as James and Whitehead emphasize, to have an atomistic character. If physical change could be understood only in terms of the structure of the perceptual continuum, then the mathematical continuum would be incapable of providing an adequate description of physical processes. In particular, if we set the epistemological requirement that physical continuity must be constructed from physical points which are explicitly definable in terms of observables, then it will be impossible to endow the physical continuum with the properties of the mathematical continuum. In our discussion..., we shall see, however, that no such rigid requirement needs to be imposed. (Salmon 1970, 20)

Salmon continues by making the point that calculus provides better explanations of physical change than explanations which accept the “rigid requirement” of understanding physical change in terms of the structure of the perceptual continuum, so he recommends that we apply Ockham’s Razor and eliminate that rigid requirement. But the issue is not settled.

d. Infinity in Cosmology

Let’s review some of the history regarding the volume of spacetime. Aristotle said the past is infinite because, for any past time, we can imagine an earlier one. It is difficult to make sense of his belief about the past since he means it is potentially infinite. After all, the past has an end, namely the present, so its infinity has been completed and therefore is not a potential infinity. This problem with Aristotle’s reasoning was first raised in the 13th century by Richard Rufus of Cornwall. It was not given the attention it deserved because of the assumption for so many centuries that Aristotle couldn’t have been wrong about time, especially since his position was consistent with Christian, Jewish, and Muslim theology, which implies the physical world became coherent or well-formed only a finite time ago. However, Aquinas argued against Aristotle’s view that the past is infinite; Aquinas’ grounds were that Holy Scripture implies God created the world a finite time ago, and that Aristotle was wrong to put so much trust in what we can imagine.

Unlike time, Aristotle claimed space is finite. He said the volume of physical space is finite because it is enclosed within a finite, spherical shell of visible, fixed stars with the Earth at its center. On this topic of space not being infinite, Aristotle’s influence was authoritative to most scholars for the next eighteen hundred years.

The debate about whether the volume of space is infinite was rekindled in Renaissance Europe. The English astronomer and defender of Copernicus, Thomas Digges (1546–1595) was the first scientist to reject the ancient idea of an outer spherical shell and to declare that physical space is actually infinite in volume and filled with stars. The physicist Isaac Newton (1642–1727) at first believed the universe's material is confined to only a finite region while it is surrounded by infinite empty space, but in 1691 he realized that if there were a finite number of stars in a finite region, then gravity would require all the stars to fall in together at some central point. To avoid this result, he later speculated that the universe contains an infinite number of stars in an infinite volume. The notion of infinite time, however, was not accepted by Newton because of conflict with Christian orthodoxy, as influenced by Aquinas. We now know that Newton’s speculation about the stability of an infinity of stars in an infinite universe is incorrect. There would still be clumping so long as the universe did not expand. (Hawking 2001, p. 9)

Immanuel Kant (1724–1804) declared that space and time are both potentially infinite in extent because this is imposed by our own minds. Space and time are not features of “things in themselves” but are an aspect of the very form of any possible human experience, he said. We can know a priori even more about space than about time, he believed; and he declared that the geometry of space must be Euclidean. Kant’s approach to space and time as something knowable a priori went out of fashion in the early 20th century. It was undermined in large part by the discovery of non-Euclidean geometries in the 19th century, then by Beltrami’s and Klein’s proofs that these geometries are as logically consistent as Euclidean geometry, and finally by Einstein’s successful application to physical space of non-Euclidean geometry within his general theory of relativity.

The volume of spacetime is finite at present if we can trust the classical Big Bang theory. [But do not think of this finite space as having a boundary beyond which a traveler falls over the edge into nothingness, or a boundary that cannot be penetrated.] Assuming space is all the places that have been created since the Big Bang, then the volume of space is definitely finite at present, though it is huge and growing ever larger over time. Assuming this expansion will never stop, it follows that the volume of spacetime is potentially infinite but not actually infinite. However, if, as some theorists speculate on the basis of inflationary cosmology, everything that is a product of our Big Bang is just one “bubble” in a sea of bubbles in the infinite spacetime background of the Multiverse, then both space and time are actually infinite. For more discussion of the issue of the infinite volume of spacetime, see (Greene 2011).

In the late nineteenth century, Georg Cantor argued that the mathematical concept of potential infinity presupposes the mathematical concept of actual infinity. This argument was accepted by most later mathematicians, but it does not imply that, if future time were to be potentially infinite, then future time also would be actually infinite.

5. Infinity in Mathematics

The previous sections of this article have introduced the concepts of actual infinity and potential infinity and explored the development of calculus and set theory, but this section will probe deeper into the role of infinity in mathematics. Mathematicians always have been aware of the special difficulty in dealing with the concept of infinity in a coherent manner. Intuitively, it seems reasonable that if we have two infinities of things, then we still have an infinity of them. So, we might represent this intuition mathematically by the equation 2 × ∞ = 1 × ∞. Dividing both sides by ∞ would then prove that 2 = 1, which is a good sign we were not using infinity in a coherent manner. In recommending how to use the concept of infinity coherently, Bertrand Russell said pejoratively:

The whole difficulty of the subject lies in the necessity of thinking in an unfamiliar way, and in realising that many properties which we have thought inherent in number are in fact peculiar to finite numbers. If this is remembered, the positive theory of infinity...will not be found so difficult as it is to those who cling obstinately to the prejudices instilled by the arithmetic which is learnt in childhood. (Salmon 1970, 58)

That positive theory of infinity that Russell is talking about is set theory, and the new arithmetic is the result of Cantor’s generalizing the notions of order and of size of sets into the infinite, that is, to the infinite ordinals and infinite cardinals. These numbers are also called transfinite ordinals and transfinite cardinals. The following sections will briefly explore set theory and the role of infinity within mathematics. The main idea, though, is that the basic theories of mathematical physics are properly expressed using the differential calculus with real-number variables, and these concepts are well-defined in terms of set theory which, in turn, requires using actual infinities or transfinite infinities of various kinds.

a. Infinite Sums

In the 17th century, when Newton and Leibniz invented calculus, they wondered what the value is of this infinite sum:

1/1 + 1/2 + 1/4 + 1/8 + ....

They believed the sum is 2. Knowing about the dangers of talking about infinity, most later mathematicians hoped to find a technique to avoid using the phrase “infinite sum.” Cauchy and Weierstrass eventually provided this technique two centuries later. They removed any mention of “infinite sum” by using the formal idea of a limit. Informally, the Cauchy-Weierstrass idea is that instead of overtly saying the infinite sum s1 + s2 + s3 + … is some number S, as Newton and Leibniz were saying, one should say that the sequence of partial sums S1, S2, S3, …, where Sn = s1 + s2 + … + sn, converges to S just in case the numerical difference between any pair of partial sums is as small as one desires, provided the two are sufficiently far out in the sequence. More formally it is expressed this way: the partial sums satisfy the Cauchy convergence criterion if, and only if, for every positive number ε there exists a number δ such that |Sn+h − Sn| < ε for all integers n > δ and all integers h > 0; S is then the number to which the partial sums converge. In this way, reference to an actual infinity has been eliminated.
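The convergence of this particular sum can be checked numerically: for any chosen ε, a computer can find a point in the sequence of partial sums beyond which every partial sum stays within ε of 2. A sketch:

```python
# Partial sums of 1/1 + 1/2 + 1/4 + 1/8 + ... : for any epsilon there is
# a point beyond which every partial sum stays within epsilon of 2.
def partial_sum(n):
    return sum(1 / 2**k for k in range(n))   # sum of the first n terms

epsilon = 1e-6
n = 1
while abs(2 - partial_sum(n)) >= epsilon:
    n += 1
print(n, partial_sum(n))   # the first n whose partial sum is within epsilon of 2
```

No infinite total is ever computed; the epsilon condition on finite partial sums does all the work, which is exactly the point of the Cauchy-Weierstrass technique.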

This epsilon-delta technique of talking about limits was due to Cauchy in 1821 and Weierstrass in the period from 1850 to 1871. The two drawbacks to this technique are that (1) it is unintuitive and more complicated than Newton and Leibniz’s intuitive approach that did mention infinite sums, and (2) it is not needed because infinite sums were eventually legitimized by being given a set-theoretic foundation.

b. Infinitesimals and Hyperreals

There has been considerable controversy throughout history about how to understand infinitesimal objects and infinitesimal changes in the properties of objects. Intuitively an infinitesimal object is as small as you please but not quite nothing. Infinitesimal objects and infinitesimal methods were first used by Archimedes in ancient Greece, but he did not mention them in any publication intended for the public because he did not consider his use of them to be rigorous. Infinitesimals became better known when Leibniz used them in his differential and integral calculus. The differential calculus can be considered to be a technique for treating continuous motion as being composed of an infinite number of infinitesimal steps. The calculus’ use of infinitesimals led to the so-called “golden age of nothing” in which infinitesimals were used freely in mathematics and science. During this period, Leibniz, Euler, and the Bernoullis applied the concept. Euler applied it cavalierly (although his intuition was so good that he rarely if ever made mistakes), but Leibniz and the Bernoullis were concerned with the general question of when we could, and when we could not, consider an infinitesimal to be zero. They were aware of apparent problems with these practices in large part because they had been exposed by Berkeley.

In 1734, George Berkeley attacked the concept of infinitesimal as ill-defined and incoherent because there were no definite rules for when the infinitesimal should, and should not, be considered to be zero. Berkeley, like Leibniz, was thinking of infinitesimals as objects with a constant value, as genuinely infinitesimally small magnitudes, whereas Newton thought of them as variables that could arbitrarily approach zero. Either way, there were coherence problems. The scientists and results-oriented mathematicians of the golden age of nothing had no good answer to the coherence problem. As standards of rigorous reasoning increased over the centuries, mathematicians became more worried about infinitesimals. They were delighted when Cauchy in 1821 and Weierstrass in the period from 1850 to 1875 developed a way to use calculus without infinitesimals; at that point any appeal to infinitesimals was considered illegitimate, and mathematicians soon stopped using them.

Here is how Cauchy and Weierstrass eliminated infinitesimals with their concept of limit. Suppose we have a function f,  and we are interested in the Cartesian graph of the curve y = f(x) at some point a along the x axis. What is the rate of change of  f at a? This is the slope of the tangent line at a, and it is called the derivative f' at a. This derivative was defined by Leibniz to be

f′(a) = [f(a + h) − f(a)] / h

where h is an infinitesimal. Because of suspicions about infinitesimals, Cauchy and Weierstrass suggested replacing Leibniz’s definition of the derivative with

f′(a) = lim (x → a) [f(x) − f(a)] / (x − a)

That is, f'(a) is the limit, as x approaches a, of the above ratio. The limit idea was rigorously defined using Cauchy’s well-known epsilon and delta method. Soon after the Cauchy-Weierstrass definition of the derivative was formulated, mathematicians stopped using infinitesimals.
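The limit definition lends itself to numerical illustration: for f(x) = x² at a = 3, the ratio approaches 6 as x approaches a. A sketch:

```python
# The difference quotient (f(x) - f(a)) / (x - a) approaches the
# derivative f'(a) as x approaches a; here f(x) = x**2 and a = 3.
def f(x):
    return x**2

a = 3.0
for h in (0.1, 0.01, 0.001, 0.0001):
    x = a + h
    print((f(x) - f(a)) / (x - a))   # tends toward 6.0, the slope at a
```

Each printed quotient is an ordinary finite ratio; only the pattern of values, not any infinitesimal quantity, exhibits the limit.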

The scientists did not follow the lead of the mathematicians. Despite the lack of a coherent theory of infinitesimals, scientists continued to reason with infinitesimals because infinitesimal methods were so much more intuitively appealing than the mathematicians’ epsilon-delta methods. Although students in calculus classes in the early 21st century are still taught the unintuitive epsilon-delta methods, Abraham Robinson (Robinson 1966) created a rigorous alternative to standard Weierstrassian analysis by using the methods of model theory to define infinitesimals.

Here is Robinson’s idea. Think of the rational numbers in their natural order as being gappy with real numbers filling the gaps between them. Then think of the real numbers as being gappy with hyperreals filling the gaps between them. There is a cloud or region of hyperreals surrounding each real number (that is, surrounding each real number described nonstandardly). To develop these ideas more rigorously, Robinson used this simple definition of an infinitesimal:

h is infinitesimal if and only if 0 < |h| < 1/n, for every positive integer n.

|h| is the absolute value of h.

Robinson did not actually define an infinitesimal as a number on the real line. The infinitesimals were defined on a new number line, the hyperreal line, that contains within it the structure of the standard real numbers from classical analysis. In this sense the hyperreal line is the extension of the reals to the hyperreals. The development of analysis via infinitesimals creates a nonstandard analysis with a hyperreal line and a set of hyperreal numbers that include real numbers. In this nonstandard analysis, 78+2h is a hyperreal that is infinitesimally close to the real number 78. Sums and products of infinitesimals are infinitesimal.
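A flavor of computing with an “h part” can be conveyed with dual numbers, a deliberately simplified stand-in in which the formal infinitesimal h satisfies h·h = 0. This is not Robinson’s construction; in the hyperreals h·h would be a smaller nonzero infinitesimal. A toy sketch:

```python
class Dual:
    """Represents a + b*h, where h is a formal infinitesimal with h*h = 0.
    A toy simplification of the hyperreals, where h*h would be nonzero."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)
    def __mul__(self, other):
        # (a1 + b1*h)(a2 + b2*h) = a1*a2 + (a1*b2 + b1*a2)*h, since h*h = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

h = Dual(0.0, 1.0)
x = Dual(78.0) + Dual(2.0) * h   # the hyperreal-style number 78 + 2h
square = x * x                   # (78 + 2h)^2 = 6084 + 312h
print(square.a, square.b)
```

The real part 6084 is the square of 78, and the h part 312 is the derivative of x² at 78, which hints at why infinitesimal methods compute derivatives so directly.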

Because of the rigor of the extension, all the arguments for and against Cantor’s infinities apply equally to the infinitesimals. Sentences about the standardly-described reals are true if and only if they are true in this extension to the hyperreals. Nonstandard analysis allows proofs of all the classical theorems of standard analysis, but it very often provides shorter, more direct, and more elegant proofs than those that were originally proved by using standard analysis with epsilons and deltas. Objections by practicing mathematicians to infinitesimals subsided after this was appreciated. With a good definition of “infinitesimal” they could then use it to explain related concepts such as in the sentence, “That curve approaches infinitesimally close to that line.” See (Wolf 2005, chapter 7) for more about infinitesimals and hyperreals.

c. Mathematical Existence

Mathematics is apparently about mathematical objects, so it is apparently about infinitely large objects, infinitely small objects, and infinitely many objects. Mathematicians who are doing mathematics and are not being careful about ontology too easily remark that there are infinite dimensional spaces, the continuum, continuous functions, an infinity of functions, and this or that infinite structure. Do these infinities really exist? The philosophical literature is filled with arguments pro and con and with fine points about senses of existence.

When axiomatizing geometry, Euclid said that between any two points one could choose to construct a line. Opposed to Euclid’s constructivist stance, many modern axiomatizers take a realist philosophical stance by declaring simply that there exists a line between any two points, so the line pre-exists any construction process. In mathematics, the constructivist will recognize the existence of a mathematical object only if there is at present an algorithm (that is, a step by step “mechanical” procedure operating on symbols that is finitely describable, that requires no ingenuity and that uses only finitely many steps) for constructing or finding such an object. Assertions require proofs. The constructivist believes that to justifiably assert the negation of a sentence S is to prove that the assumption of S leads to a contradiction. So, legitimate mathematical objects must be shown to be constructible in principle by some mental activity and cannot be assumed to pre-exist any such construction process nor to exist simply because their non-existence would be contradictory. A constructivist, unlike a realist, is a kind of conceptualist, one who believes that an unknowable mathematical object is impossible. Most constructivists complain that, although potential infinites can be constructed, actual infinities cannot be.

There are many different schools of constructivism. The first systematic one, and perhaps the most well known and most radical version, is due to L.E.J. Brouwer. He is not a finitist, but his intuitionist school demands that all legitimate mathematics be constructible from a basis of mental processes he called “intuitions.” These intuitions might be more accurately called “clear mental procedures.” If there were no minds capable of having these intuitions, then there would be no mathematical objects just as there would be no songs without ideas in the minds of composers. Numbers are human creations. The number pi is intuitionistically legitimate because we have an algorithm for computing all its decimal digits, but the following number g is not legitimate: g is the number whose nth digit is either 0 or 1, and it is 1 if and only if there are n consecutive 7s in the decimal expansion of pi. No person yet knows how to construct the decimal digits of g. Brouwer argued that the actually infinite set of natural numbers cannot be constructed (using intuitions) and so does not exist. The best we can do is to have a rule for adding more members to a set. So, his concept of an acceptable infinity is closer to that of potential infinity than actual infinity. Hermann Weyl emphasizes the merely potential character of these infinities:

Brouwer made it clear, as I think beyond any doubt, that there is no evidence supporting the belief in the existential character of the totality of all natural numbers…. The sequence of numbers which grows beyond any stage already reached by passing to the next number, is a manifold of possibilities open towards infinity; it remains forever in the status of creation, but is not a closed realm of things existing in themselves. (Weyl is quoted in (Kleene 1967, p. 195))
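The number g just described can be probed, though not constructed: searching a known prefix of pi’s expansion can confirm that a digit of g is 1, but no finite search can ever establish that a digit is 0. A sketch using the first fifty decimal digits of pi:

```python
# The first 50 decimal digits of pi after the decimal point.
PI_PREFIX = "14159265358979323846264338327950288419716939937510"

def g_digit(n):
    """Return 1 if n consecutive 7s occur in the known prefix, else None.
    A finite search can confirm a 1 but can never establish a 0."""
    return 1 if "7" * n in PI_PREFIX else None

print(g_digit(1))   # a single 7 does occur, so the 1st digit of g is 1
print(g_digit(2))   # None: undecided by this prefix, and perhaps by any search
```

This asymmetry, where positive instances are verifiable but negative ones are not, is exactly what makes g illegitimate by intuitionist standards.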

It is not legitimate for platonic realists, said Brouwer, to bring all the sets into existence at once by declaring they are whatever objects satisfy all the axioms of set theory. Brouwer believed realists accept too many sets because they are too willing to accept sets merely by playing coherently with the finite symbols for them when sets instead should be tied to our experience. For Brouwer this experience is our experience of time. He believed we should arrive at our concept of the infinite by noticing that our experience of a duration can be divided into parts and then these parts can be further divided, and so on. This infinity is a potential infinity, not an actual infinity. For the intuitionist, there is no determinate, mind-independent mathematical reality which provides the facts to make mathematical sentences true or false. This metaphysical position is reflected in the principles of logic that are acceptable to an intuitionist. For the intuitionist, the sentence “For all x, x has property F” is true only if we have already proved constructively that each x has property F. And it is false only if we have proved that some x does not have property F. Otherwise, it is neither true nor false. The intuitionist does not accept the principle of excluded middle: For any sentence S, either S or the negation of S. Outraged by this intuitionist position, David Hilbert famously responded by saying, “To take the law of the excluded middle away from the mathematician would be like denying the astronomer the telescope or the boxer the use of his fists.” (quoted from Kleene 1967, p. 197) For a presentation of intuitionism with philosophical emphasis, see (Posy 2005) and (Dummett 1977).

Finitists, even those who are not constructivists, also argue that the actually infinite set of natural numbers does not exist. They say there is a finite rule for generating each numeral from the previous one, but the rule does not produce an actual infinity of either numerals or numbers. The ultrafinitist considers the classical finitist to be too liberal because finite numbers such as 2^100 and 2^1000 can never be accessed by a human mind in a reasonable amount of time. Only the numerals or symbols for those numbers can be coherently manipulated. One challenge to ultrafinitists is that they should explain where the cutoff point is between numbers that can be accessed and numbers that cannot be. Ultrafinitists have risen to this challenge. The mathematician Harvey Friedman says:

I raised just this objection [about a cutoff] with the (extreme) ultrafinitist Yessenin-Volpin during a lecture of his. He asked me to be more specific. I then proceeded to start with 2^1 and asked him whether this is “real” or something to that effect. He virtually immediately said yes. Then I asked about 2^2, and he again said yes, but with a perceptible delay. Then 2^3, and yes, but with more delay. This continued for a couple of more times, till it was obvious how he was handling this objection. Sure, he was prepared to always answer yes, but he was going to take 2^100 times as long to answer yes to 2^100 than he would to answering 2^1. There is no way that I could get very far with this. (Elwes 2010, 317)

This battle among competing philosophies of mathematics will not be explored in depth in this article, but this section will offer a few more points about mathematical existence.

Hilbert argued that, “If the arbitrarily given axioms do not contradict one another, then they are true and the things defined by the axioms exist.” But (Chihara 2008, 141) points out that Hilbert seems to be confusing truth with truth in a model. If a set of axioms is consistent, and so is its corresponding axiomatic theory, then the theory defines a class of models, and each axiom is true in any such model, but it does not follow that the axioms are really true. To give a crude, nonmathematical example, consider this set of two axioms {All horses are blue, all cows are green.}. The formal theory using these axioms is consistent and has a model, but it does not follow that either axiom is really true.
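Chihara’s distinction between truth and truth in a model can be illustrated with a toy finite model in which both axioms come out true, even though neither is true of the actual world. A sketch:

```python
# A tiny model: a domain of individuals, each with a kind and a color.
domain = [
    {"kind": "horse", "color": "blue"},
    {"kind": "cow", "color": "green"},
    {"kind": "cow", "color": "green"},
]

# Evaluate each axiom in this model (not in the actual world).
axiom1 = all(x["color"] == "blue" for x in domain if x["kind"] == "horse")
axiom2 = all(x["color"] == "green" for x in domain if x["kind"] == "cow")
print(axiom1 and axiom2)   # both axioms hold in the model
```

Consistency guarantees some such model exists; it does not guarantee that the axioms describe reality, which is Chihara’s point against Hilbert’s criterion.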

Quine objected to Hilbert's criterion for existence as being too liberal. Quine’s argument for infinity in mathematics begins by noting that our fundamental scientific theories are our best tools for helping us understand reality and doing ontology. Mathematical theories which imply the existence of some actually infinite sets are indispensable to all these scientific theories, and their referring to these infinities cannot be paraphrased away. All this success is a good reason to believe in some actual infinite sets and to say the sentences of both the mathematical theories and the scientific theories are true or approximately true since their success would otherwise be a miracle. But, he continues, of course it is no miracle. See (Quine 1960 chapter 7).

Quine believed that infinite sets exist only if they are indispensable in successful applications of mathematics to science; but he believed science so far needs only the first three alephs: ℵ0 for the integers, ℵ1 for the set of point places in space, and ℵ2 for the number of possible lines in space (including lines that are not continuous). The rest of Cantor’s heaven of transfinite numbers is unreal, Quine said, and the mathematics of the extra transfinite numbers is merely “recreational mathematics.” But Quine showed intellectual flexibility by saying that if he were to be convinced more transfinite sets were needed in science, then he’d change his mind about which alephs are real. To briefly summarize Quine’s position, his indispensability argument treats mathematical entities on a par with all other theoretical entities in science and says mathematical statements can be (approximately) true. Quine points out that reference to mathematical entities is vital to science, and there is no way of separating out the evidence for the mathematics from the evidence for the science. This famous indispensability argument has been attacked in many ways. Critics charge, “Quite aside from the intrinsic logical defects of set theory as a deductive theory, this is disturbing because sets are so very different from physical objects as ordinarily conceived, and because the axioms of set theory are so very far removed from any kind of empirical support or empirical testability…. Not even set theory itself can tell us how the existence of a set (e.g. a power set) is empirically manifested.” (Mundy 1990, pp. 289-90). See (Parsons 1980) for more details about Quine’s and other philosophers’ arguments about existence of mathematical objects.

d. Zermelo-Fraenkel Set Theory

Cantor initially thought of a set as being a collection of objects that can be counted, but this notion eventually gave way to a set being a collection that has a clear membership condition. Over several decades, Cantor’s naive set theory evolved into ZF, Zermelo-Fraenkel set theory, and ZF was accepted by most mid-20th century mathematicians as the correct tool to use for deciding which mathematical objects exist. The acceptance was based on three reasons. (1) ZF is precise and rigorous. (2) ZF is useful for defining or representing other mathematical concepts and methods. Mathematics can be modeled in set theory; it can be given a basis in set theory. (3) No inconsistency has been uncovered despite heavy usage.

Notice that one of the three reasons is not that set theory provides a foundation to mathematics in the sense of justifying the doing of mathematics or in the sense of showing its sentences are certain or necessary. Instead, set theory provides a basis for theories only in the sense that it helps to organize them, to reveal their interrelationships, and to provide a means to precisely define their concepts. The first program for providing this basis began in the late 19th century. Peano had given an axiomatization of the natural numbers. It can be expressed in set theory using standard devices for treating natural numbers and relations and functions and so forth as being sets. (For example, zero is the empty set, and a relation is a set of ordered pairs.) Then came the arithmetization of analysis which involved using set theory to construct from the natural numbers all the negative numbers and the fractions and real numbers and complex numbers. Along with this, the principles of these numbers became sentences of set theory. In this way, the assumptions used in informal reasoning in arithmetic are explicitly stated in the formalism, and proofs in informal arithmetic can be rewritten as formal proofs so that no creativity is required for checking the correctness of the proofs. Once a mathematical theory is given a set theoretic basis in this manner, it follows that if we have any philosophical concerns about the higher level mathematical theory, those concerns will also be concerns about the lower level set theory in the basis.

In addition to Dedekind’s definition, there are other acceptable definitions of "infinite set" and "finite set" using set theory. One popular one is to define a finite set as a set onto which a one-to-one function maps the set of all natural numbers that are less than some natural number n. That finite set contains n elements. An infinite set is then defined as one that is not finite. Dedekind himself used another definition: he defined a finite set as any set in which there exists no one-to-one mapping of the set into a proper subset of itself, and an infinite set as one that is not finite. The philosopher C. S. Peirce suggested essentially the same approach as Dedekind at approximately the same time, but he received little notice from the professional community. For more discussion of the details, see (Wilder 1965, p. 66f, and Suppes 1960, p. 99n).
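Dedekind’s criterion can be verified by brute force for small finite sets, while the successor map n ↦ n + 1 shows how the natural numbers map one-to-one into a proper subset of themselves. A sketch:

```python
from itertools import combinations, product

def dedekind_finite(s):
    """True when no one-to-one map sends s into a proper subset of itself.
    (Brute force, feasible only for very small sets.)"""
    elements = sorted(s)
    n = len(elements)
    for r in range(n):                              # proper subsets are smaller
        for subset in combinations(elements, r):
            for f in product(subset, repeat=n):     # every map from s into subset
                if len(set(f)) == n:                # is the map one-to-one?
                    return False
    return True

print(dedekind_finite({0, 1, 2}))   # True: a 3-element set passes the test

# The naturals fail the test: n -> n + 1 is one-to-one into the proper
# subset {1, 2, 3, ...}, illustrated here on an initial segment.
successors = [n + 1 for n in range(10)]
print(len(set(successors)) == 10 and 0 not in successors)   # True
```

No finite set passes into a proper subset, which is precisely why being mappable into a proper subset of oneself serves as a definition of infinitude.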

Set theory implies quite a bit about infinity. First, infinity in ZF has some very unsurprising features. If a set A is infinite and is the same size as set B, then B also is infinite. If A is infinite and is a subset of B, then B also is infinite. Using the axiom of choice, it follows that a set is infinite just in case for every natural number n, there is some subset whose size is n.

ZF’s axiom of infinity declares that there is at least one infinite set, a so-called inductive set containing zero and the successor of each of its members (such as {0, 1, 2, 3, …}). The power set axiom (which says every set has a power set, namely a set of all its subsets) then generates many more infinite sets of larger cardinality, a surprising result that Cantor first discovered in 1874.
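For finite sets the power set’s growth can be computed directly: a set of n elements has 2^n subsets, already more than n, and Cantor’s theorem extends the pattern “the power set is strictly bigger” into the infinite. A sketch:

```python
from itertools import combinations

def power_set(s):
    """All subsets of a finite set s: there are 2**len(s) of them."""
    elements = list(s)
    return [set(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]

s = {0, 1, 2}
ps = power_set(s)
print(len(ps))             # 8 = 2**3 subsets, including {} and s itself
print(len(ps) > len(s))    # True: a set is always smaller than its power set
```

Iterating the power set operation on an infinite set is what generates the ascending hierarchy of infinite cardinalities that Cantor discovered.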

In ZF, there is no set with maximum cardinality, nor a set of all sets, nor an infinitely descending sequence of sets x0, x1, x2, ... in which x1 is in x0, and x2 is in x1, and so forth. There is, however, an infinitely ascending sequence of sets x0, x1, x2, ... in which x0 is in x1, and x1 is in x2, and so forth. In ZF, a set exists if it is implied by the axioms; there is no requirement that there be some property P such that the set is the extension of P. That is, there is no requirement that the set be defined as {x| P(x)} for some property P. One especially important feature of ZF is that, by the axiom of extensionality, there is at most one set of objects having any given property, but it cannot be assumed that for any property there is a set of all those objects that have that property. For example, it cannot be assumed that, for the property of being a set, there is a set of all objects having that property.

In ZF, all sets are pure. A set is pure if it is empty or its members are sets, and its members' members are sets, and so forth. In informal set theory, a set can contain cows and electrons and other non-sets.

In the early years of set theory, the terms "set" and "class" and “collection” were used interchangeably, but in von Neumann–Bernays–Gödel set theory (NBG or VBG) a set is defined to be a class that is an element of some other class. NBG is designed to have proper classes, classes that are not sets, even though they can have members which are sets. The intuitive idea is that a proper class is a collection that is too big to be a set. There can be a proper class of all sets, but neither a set of all sets nor a class of all classes. A nice feature of NBG is that a sentence in the language of ZFC is provable in NBG if and only if it is provable in ZFC.

Are philosophers justified in saying there is more to know about sets than is contained within ZF set theory? If V is the collection or class of all sets, do mathematicians have any access to V independently of the axioms? This is an open question that arose concerning the axiom of choice and the continuum hypothesis.

e. The Axiom of Choice and the Continuum Hypothesis

Consider whether to believe in the axiom of choice. The axiom of choice is the assertion that, given any collection of non-empty and non-overlapping sets, there exists a ‘choice set’ which is composed of one element chosen from each set in the collection. However, the axiom does not say how to do the choosing. For some sets there might not be a precise rule of choice. If the collection is infinite and its sets are not well-ordered in any way that has been specified, then there is in general no way to define the choice set. The axiom is implicitly used throughout the field of mathematics, and several important theorems cannot be proved without it. Mathematical Platonists tend to like the axiom, but those who want explicit definitions or constructions for sets do not like it. Nor do others who note that mathematics’ most unintuitive theorem, the Banach-Tarski Theorem, requires the axiom of choice. The dispute can get quite intense with advocates of the axiom of choice saying that their opponents are throwing out invaluable mathematics, while these opponents consider themselves to be removing tainted mathematics. See (Wagon 1985) for more on the Banach-Tarski Theorem; see (Wolf 2005, pp. 226-8) for more discussion of which theorems require the axiom.
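When the members of the collection carry usable structure, a choice set can be produced by an explicit rule and the axiom is not needed; the axiom matters precisely when no such rule exists. A sketch for finite sets of numbers, using “take the least element” as the rule:

```python
# Non-empty, pairwise disjoint sets of numbers: min() supplies an explicit
# rule of choice, so no appeal to the axiom of choice is required here.
collection = [{3, 1, 4}, {5, 9}, {2, 6}]
choice_set = {min(s) for s in collection}   # one element chosen from each set
print(choice_set)
```

For arbitrary collections of infinite sets with no specified well-ordering, no analogue of `min` is definable, and only the axiom guarantees a choice set exists.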

A set is always smaller than its power set. How much bigger is the power set? Cantor’s controversial continuum hypothesis says that the cardinality of the power set of a set of size ℵ0 is ℵ1, the next larger cardinal number, and not some higher cardinal. The generalized continuum hypothesis is more general; it says that, given an infinite set of any cardinality, the cardinality of its power set is the next larger cardinal and not some even higher cardinal. Cantor believed the continuum hypothesis is true, but he was frustrated that he could not prove it. The philosophical issue is whether we should alter the axioms to enable the hypotheses to be proved.

If ZF is formalized as a first-order theory of deductive logic, then both Cantor’s generalized continuum hypothesis and the axiom of choice are consistent with the other principles of set theory but cannot be proved or disproved from them, assuming that ZF is not inconsistent. In this sense, both the continuum hypothesis and the axiom of choice are independent of ZF. Gödel in 1940 and Cohen in 1964 contributed to the proof of this independence result.

So, how do we decide whether to believe the axiom of choice and continuum hypothesis, and how do we decide whether to add them to the principles of ZF or any other set theory? Most mathematicians do believe the axiom of choice is true, but there is more uncertainty about the continuum hypothesis. The independence does not rule out our someday finding a convincing argument that the hypothesis is true or a convincing argument that it is false, but the argument will need more premises than just the principles of ZF. At this point the philosophers of mathematics divide into two camps. The realists, who think there is a unique universe of sets to be discovered, believe that if ZF does not fix the truth values of the continuum hypothesis and the axiom of choice, then this is a defect within ZF and we need to explore our intuitions about infinity in order to uncover a missing axiom or two for ZF that will settle the truth values. These persons prefer to think that there is a single system of mathematics to which set theory is providing a foundation, but they would prefer not simply to add the continuum hypothesis itself as an axiom because the hope is to make the axioms "readily believable," yet it is not clear enough that the axiom itself is readily believable. The second camp of philosophers of mathematics disagree and say the concept of infinite set is so vague that we simply do not have any intuitions that will or should settle the truth values. According to this second camp, there are set theories with and without axioms that fix the truth values of the axiom of choice and the continuum hypothesis, and set theory should no more be a unique theory of sets than Euclidean geometry should be the unique theory of geometry.

Believing that ZFC’s infinities are merely the above-surface part of the great iceberg of infinite sets, many set theorists are actively exploring new axioms that imply the existence of sets that could not be proved to exist within ZFC. So far there is no agreement among researchers about the acceptability of any of the new axioms. See (Wolf 2005, pp. 226-8) and (Rucker 1982, pp. 252-3) for more discussion of the search for these new axioms.

6. Infinity in Deductive Logic

The infinite appears in many interesting ways in formal deductive logic, and this section presents an introduction to a few of those ways. Among all the various kinds of formal deductive logics, first-order logic (the usual predicate logic) stands out as especially important, in part because of the accuracy and detail with which it can mirror mathematical deductions. First-order logic also stands out because it is the strongest logic that has a proof for every one of its logically true sentences, and that is compact in the sense that if an infinite set of its sentences is inconsistent, then so is some finite subset.

But just what is first-order logic? To answer this and other questions, it is helpful to introduce some technical terminology. Here is a chart of what is ahead:

First-order language: a formal language with quantifiers over objects but not over sets of objects.
First-order theory: a set of sentences expressed in a first-order language.
First-order formal system: a first-order theory plus its method for building proofs.
First-order logic: a first-order language with its method for building proofs.

A first-order theory is a set of sentences expressed in a first-order language (which will be defined below). A first-order formal system is a first-order theory plus its deductive structure (method of building proofs). The term “first-order logic” is ambiguous. It can mean a first-order language with its deductive structure, or it can mean simply the academic subject or discipline that studies first-order languages and theories.

Classical first-order logic is distinguished by its satisfying certain classically-accepted assumptions: that it has only two truth values; in an interpretation or valuation [note: the terminology is not standardized], every sentence gets exactly one of the two truth values; no well-formed formula (wff) can contain an infinite number of symbols; a valid deduction cannot be made from true sentences to a false one; deductions cannot be infinitely long; the domain of an interpretation cannot be empty but can be of any finite or infinite cardinality; an individual constant (name) must name something in the domain; and so forth.

A formal language specifies the language’s vocabulary symbols and its syntax, primarily what counts as being a term or name and what are its well-formed formulas (wffs). A first-order language is a formal language whose symbols are the quantifiers (∃), connectives (↔), constants (a), variables (x), predicates or relations (R), and perhaps functions (f) and equality (=). It has a denumerable list of variables. (A set is denumerable or countably infinite if it has size ℵ0.) A first-order language has a finite or countably infinite number of predicate symbols and function symbols, though not zero of both. First-order languages differ from each other only in their predicate symbols or function symbols or constant symbols or in having or not having the equality symbol. See (Wolf 2005, p. 23) for more details. Every wff in a first-order language must contain only finitely many symbols. There are denumerably many terms, formulas, and sentences. Because there are uncountably many real numbers, a theory of real numbers in a first-order language does not have enough names for all the real numbers.
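The denumerability of the wffs can be made vivid by enumeration: the finite strings over a countable alphabet can be listed shorter-first, so each occupies some finite position in a single infinite list. A sketch over a two-letter alphabet:

```python
from itertools import count, product

def all_strings(alphabet):
    """Enumerate every finite string over the alphabet: length 0, then 1, ...
    Each string appears at some finite position, so the set is denumerable."""
    for length in count(0):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)

gen = all_strings("ab")
first = [next(gen) for _ in range(7)]
print(first)   # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The same shorter-first listing works for the vocabulary of a first-order language, which is why there can be only denumerably many terms and formulas, too few to name every real number.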

To carry out proofs or deductions in a first-order language, the language needs to be given a deductive structure. There are several different ways to do this (via axioms, natural deduction, sequent calculus), but the ways all are independent of which first-order language is being used, and they all require specifying rules such as modus ponens for how to deduce wffs from finitely many previous wffs in the deduction.

To give some semantics or meaning to its symbols, the first-order language needs a definition of valuation and of truth in a valuation and of validity of an argument. In a propositional logic, the valuation assigns to each sentence letter a single truth value; in predicate logic each term is given a denotation, and each predicate is given a set of objects in the domain that satisfy the predicate. The valuation rules then determine the truth values of all the wffs. The valuation’s domain is a set containing all the objects that the terms might denote and that the variables range over. The domain may be of any finite or transfinite size, but the variables can range only over objects in this domain, not over sets of those objects.
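
For the propositional case, the way valuation rules determine the truth values of all compound wffs from an assignment to the sentence letters can be sketched directly. The tuple encoding of wffs and the function name `value` below are illustrative choices for this sketch, not standard notation.

```python
# A minimal propositional valuation: the valuation assigns a truth value to
# each sentence letter, and the valuation rules then fix the value of every
# compound wff. Wffs are nested tuples, e.g. ('and', 'p', 'q').
def value(wff, valuation):
    if isinstance(wff, str):                     # a sentence letter
        return valuation[wff]
    op, *args = wff
    if op == 'not':
        return not value(args[0], valuation)
    if op == 'and':
        return value(args[0], valuation) and value(args[1], valuation)
    if op == 'or':
        return value(args[0], valuation) or value(args[1], valuation)
    if op == 'if':                               # material conditional
        return (not value(args[0], valuation)) or value(args[1], valuation)
    raise ValueError(f'unknown connective: {op}')

v = {'p': True, 'q': False}
print(value(('if', ('and', 'p', 'q'), 'q'), v))   # True
```

A valuation for a predicate logic works the same way in spirit, except that the base clause consults the denotations of terms and the satisfaction sets of predicates rather than a table of letters.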

Because a first-order language cannot successfully express sentences that generalize over sets (or properties or classes or relations) of the objects in the domain, it cannot, for example, adequately express Leibniz’s Law that, “If objects a and b are identical, then they have the same properties.” A second-order language can do this. A language is second-order if in addition to quantifiers on variables that range over objects in the domain it also has quantifiers (such as the universal quantifier ∀P) on a second kind of variable P that ranges over properties (or classes or relations) of these objects. Here is one way to express Leibniz’s Law in second-order logic:

(a = b) → ∀P(Pa ↔ Pb)

P is called a predicate variable or property variable. Every valid deduction in first-order logic is also valid in second-order logic. A language is third-order if it has quantifiers on variables that range over properties of properties of objects (or over sets of sets of objects), and so forth. A language is called higher-order if it is at least second-order.

The definition of first-order theory given earlier in this section was that it is any set of sentences in a first-order language. A more common definition adds that the theory is closed under deduction, so that every deductive consequence of sentences of the theory is also in the theory. Because a theory closed under deduction has countably infinitely many consequences, all ordinary first-order theories are countably infinite.

If the language isn’t explicitly mentioned for a first-order theory, then it is generally assumed that the language is the smallest first-order language that contains all the sentences of the theory. Valuations of the language in which all the sentences of the theory are true are said to be models of the theory.

If the theory is axiomatized, then in addition to the logical axioms there are proper axioms (also called non-logical axioms); these axioms are specific to the theory (and so usually do not hold in other first-order theories). For example, Peano’s axioms when expressed in a first-order language are proper axioms for the formal theory of arithmetic, but they are not logical axioms or logical truths. See (Wolf, 2005, pp. 32-3) for specific proper axioms of Peano Arithmetic and for proofs of some of its important theorems.

Besides the above problem about Leibniz’s Law, there is a related problem about infinity that occurs when Peano Arithmetic is expressed as a first-order theory. Gödel’s First Incompleteness Theorem proves that there are some bizarre truths which are independent of first-order Peano Arithmetic (PA), and so cannot be deduced within PA. None of these truths so far are known to lie in mainstream mathematics. But they might. And there is another reason to worry about the limitations of PA. Because the set of sentences of PA is only countable, whereas there are uncountably many sets of numbers in informal arithmetic, it might be that PA is inadequate for expressing and proving some important theorems about sets of numbers. See (Wolf 2005, pp. 33-4, 225).

It seems that all the important theorems of arithmetic and the rest of mathematics can be expressed and proved in another first-order theory, Zermelo-Fraenkel set theory with the axiom of choice (ZFC). Unlike first-order Peano Arithmetic, ZFC needs only a very simple first-order language that, surprisingly, has no undefined predicate, function, constant, or equality symbols other than a single two-place relation symbol intended to represent set membership. The domain is intended to be composed only of sets, but since mathematical objects can be defined to be sets, the domain contains these mathematical objects.

a. Finite and Infinite Axiomatizability

In the process of axiomatizing a theory, any sentence of the theory can be called an axiom. When axiomatizing a theory, there is no problem with having an infinite number of axioms so long as the set of axioms is decidable, that is, so long as there is a finitely long computation or mechanical procedure for deciding, for any sentence, whether it is an axiom.

Logicians are curious as to which formal theories can be finitely axiomatized in a given formal system and which can only be infinitely axiomatized. Group theory is finitely axiomatizable in classical first-order logic, but Peano Arithmetic and ZFC are not. Peano Arithmetic is not finitely axiomatizable because it requires an axiom scheme for induction. An axiom scheme is a countably infinite number of axioms of similar form, and an axiom scheme for induction would be an infinite number of axioms of the form (expressed here informally): “If property P of natural numbers holds for zero, and also holds for n+1 whenever it holds for natural number n, then P holds for all natural numbers.” There needs to be a separate axiom for every property P, but there is a countably infinite number of these properties expressible in a first-order language of elementary arithmetic.
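
The decidability requirement from the previous subsection can be illustrated here: even though the induction scheme yields infinitely many axioms, a finite mechanical test settles, for any given string, whether it is one of them. The sketch below uses an invented, simplified rendering of an induction-style scheme (the exact syntax is an illustrative assumption, not the official language of Peano Arithmetic); the point is only that membership in the infinite axiom set is checked by a finite computation.

```python
import re

# Toy axiom scheme: the infinitely many axioms are exactly the strings of
# the form  (P(0) & ∀n(P(n) → P(n+1))) → ∀n P(n)
# where P is any predicate name. The backreference \1 forces the same
# predicate name to appear in all four positions.
SCHEME = re.compile(
    r'^\((\w+)\(0\) & ∀n\(\1\(n\) → \1\(n\+1\)\)\) → ∀n \1\(n\)$'
)

def is_axiom(sentence: str) -> bool:
    """Decide, in finite time, whether the string is an instance of the scheme."""
    return SCHEME.match(sentence) is not None

print(is_axiom('(Even(0) & ∀n(Even(n) → Even(n+1))) → ∀n Even(n)'))  # True
print(is_axiom('∀n Even(n)'))                                        # False
```

So an infinite axiom set causes no trouble for proof-checking: to verify a proof, one only needs to be able to recognize axioms, not to list them all.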

Assuming ZF is consistent, ZFC is not finitely axiomatizable in first-order logic, as Richard Montague discovered. Nevertheless ZFC is a subset of von Neumann–Bernays–Gödel (NBG) set theory, and the latter is finitely axiomatizable, as Paul Bernays discovered. The first-order theory of Euclidean geometry is not finitely axiomatizable, and the second-order logic used in (Field 1980) to reconstruct mathematical physics without quantifying over numbers also is not finitely axiomatizable. See (Mendelson 1997) for more discussion of finite axiomatizability.

b. Infinitely Long Formulas

An infinitary logic is a logic that makes one of classical logic’s necessarily finite features be infinite. In the languages of classical first-order logic, every formula is required to be only finitely long, but an infinitary logic might relax this. The original, intuitive idea behind requiring finitely long sentences in classical logic was that logic should reflect the finitude of the human mind. But with increasing opposition to psychologism in logic, that is, to making logic somehow dependent on human psychology, researchers began to ignore the finitude restrictions. Löwenheim in about 1915 was perhaps the pioneer here. In 1957, Alfred Tarski and Dana Scott explored permitting the operations of conjunction and disjunction to link infinitely many formulas into an infinitely long formula. Tarski also suggested allowing formulas to have a sequence of quantifiers of any transfinite length. William Hanf proved in 1964 that, unlike classical logics, these infinitary logics fail to be compact. See (Barwise 1975) for more discussion of these developments.

c. Infinitely Long Proofs

Classical formal logic requires proofs to contain a finite number of steps. In the mid-20th century with the disappearance of psychologism in logic, researchers began to investigate logics with infinitely long proofs as an aid to simplifying consistency proofs. See (Barwise 1975).

d. Infinitely Many Truth Values

One reason for permitting an infinite number of truth values is to represent the idea that truth is a matter of degree. The intuitive idea is that, depending on the temperature, the sentence “This cup of coffee is warm” might be definitely true, less true, even less true, and so forth.

One of the simplest infinite-valued semantics uses a continuum of truth values. Its valuations assign to each basic sentence (a formal sentence that contains no connectives or quantifiers) a truth value that is a specific number in the closed interval of real numbers from 0 to 1. The truth value of the vague sentence “This water is warm” is understood to be definitely true if it has the truth value 1 and definitely false if it has the truth value 0. To sentences having main connectives, the valuation assigns to the negation ~P of any sentence P the truth value of one minus the truth value assigned to P. It assigns to the conjunction P & Q the minimum of the truth values of P and of Q. It assigns to the disjunction P v Q the maximum of the truth values of P and of Q, and so forth.
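
These valuation rules are simple enough to state directly in code. Here is a minimal sketch of the one-minus/min/max semantics just described; the example sentences and the particular values assigned to them are arbitrary illustrations.

```python
# Infinite-valued valuation rules: truth values are reals in [0, 1].
def v_not(p):     return 1.0 - p       # ~P  gets one minus the value of P
def v_and(p, q):  return min(p, q)     # P & Q gets the minimum
def v_or(p, q):   return max(p, q)     # P v Q gets the maximum

warm = 0.6    # "This water is warm" -- more true than false
hot  = 0.2    # "This water is hot"  -- mostly false

print(v_and(warm, v_not(hot)))   # "warm and not hot"
print(v_or(warm, hot))
```

Note that when all the basic values are restricted to 0 and 1, these rules collapse into the classical two-valued truth tables, so the infinite-valued semantics is a generalization rather than a rival of classical semantics.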

One advantage to using an infinite-valued semantics is that by permitting modus ponens to produce a conclusion that is slightly less true than either premise, we can create a solution to the paradox of the heap, the sorites paradox. One disadvantage is that there is no well-motivated choice for the specific real number that is the truth value of a vague statement. What is the truth value appropriate to “This water is warm” when the temperature is 100 degrees Fahrenheit and you are interested in cooking pasta in it? Is the truth value 0.635? This latter problem of assigning truth values to specific sentences without being arbitrary has led to the development of fuzzy logics in place of the simpler infinite-valued semantics we have been considering. Lotfi Zadeh suggested that instead of vague sentences having any of a continuum of precise truth values we should make the continuum of truth values themselves imprecise. His suggestion was to assign a sentence a truth value that is a fuzzy set of numerical values, a set for which membership is a matter of degree. For more details, see (Nolt 1997, pp. 420-7).
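
To see how a degree-theoretic modus ponens dissolves the sorites, consider a Łukasiewicz-style rule on which the conclusion is guaranteed a value of only v(P) + v(P→Q) − 1; this particular rule is one common choice in the literature, not the only one. If each conditional premise of the form “n grains do not make a heap, so n+1 grains do not make a heap” is 0.99 true rather than fully true, repeated application drains the truth away:

```python
# Łukasiewicz-style modus ponens: from P and P→Q, the conclusion Q is
# guaranteed a truth value of at least v(P) + v(P→Q) - 1 (clamped at 0).
def modus_ponens(v_p, v_conditional):
    return max(0.0, v_p + v_conditional - 1.0)

v = 1.0                    # "1 grain is not a heap" -- definitely true
for _ in range(200):       # apply a 0.99-true conditional 200 times
    v = modus_ponens(v, 0.99)

print(v)   # the truth value has drained away entirely
```

Each step loses at most 0.01 of truth, so after a couple of hundred steps the conclusion “200 grains are not a heap” is definitely false, even though every single inference looked almost perfectly reliable.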

e. Infinite Models

A countable language is a language with countably many symbols. The Löwenheim-Skolem Theorem says:

If a first-order theory in a countable language has an infinite model, then it has a countably infinite model.

This is a surprising result about infinity. Would you want your theory of real numbers to have a countable model? Strictly speaking, it is a puzzle rather than a paradox, because the model’s property of being countably infinite is ascribed to it from outside the object language, not from within it. Nor does the theorem imply that there really are no more real numbers than natural numbers; it shows only that a first-order theory of the real numbers cannot rule out countable interpretations.

The Löwenheim-Skolem Theorem can be extended to say that if a theory in a countable language has a model of some infinite size, then it also has models of any infinite size. This is a limitation on first-order theories; they do not permit having a categorical theory of an infinite structure. A formal theory is said to be categorical if any two models satisfying the theory are isomorphic. The two models are isomorphic if they have the same structure; and they can’t be isomorphic if they have different sizes. So, if you create a first-order theory intended to describe a single infinite structure of a certain size, the theory will end up having, for any infinite size, a model of that size. This frustrates the hopes of anyone who would like to have a first-order theory of arithmetic that has models only of size ℵ0, and to have a first-order theory of real numbers that has models only of size 2^ℵ0. See (Enderton 1972, pp. 142-3) for more discussion of this limitation.

Because of this limitation, many logicians have turned to second-order logics. There are second-order categorical theories for the natural numbers and for the real numbers. Unfortunately, there is no sound and complete deductive structure for any second-order logic having a decidable set of axioms; this is a major negative feature of second-order logics.

To illustrate one more surprise regarding infinity in formal logic, notice that the quantifiers are defined in terms of their domain, the domain of discourse. In a first-order set theory, the expression ∃xPx says there exists some set x in the infinite domain of all the sets such that x has property P. Unfortunately, in ZF there is no set of all sets to serve as this domain. So, it is oddly unclear what the expression ∃xPx means when we intend to use it to speak about sets.

f. Infinity and Truth

According to Alfred Tarski’s Undefinability Theorem, in an arbitrary first-order language a global truth predicate is not definable. A global truth predicate is a predicate which is satisfied by all and only the names (via, say, Gödel numbering) of all the true sentences of the formal language. According to Tarski, since no single language has a global truth predicate, the best approach to expressing truth formally within the language is to expand the language into an infinite hierarchy of languages, with each higher language (the metalanguage) containing a truth predicate that can apply to all and only the true sentences of languages lower in the hierarchy. This process is iterated into the transfinite to obtain Tarski's hierarchy of metalanguages. Some philosophers have suggested that this infinite hierarchy is implicit within natural languages such as English, but other philosophers, including Tarski himself, believe an informal language does not contain within it a formal language.

To handle the concept of truth formally, Saul Kripke rejects the infinite hierarchy of metalanguages in favor of an infinite hierarchy of interpretations (that is, valuations) of a single language, such as a first-order predicate calculus, with enough apparatus to discuss its own syntax. The language’s intended truth predicate T is the only basic (atomic) predicate that is ever partially interpreted at any stage of the hierarchy. At the first step in the hierarchy, all predicates but the single predicate T(x) are interpreted. T(x) is completely uninterpreted at this level. As we go up the hierarchy, the interpretations of the other basic predicates are unchanged, but T is satisfied by the names of sentences that were true at lower levels. For example, at the second level, T is satisfied by the name of the sentence ∀x(Fx v ~Fx). At each step in the hierarchy, more sentences get truth values, but any sentence that has a truth value at one level has that same truth value at all higher levels. T almost becomes a global truth predicate when the inductive interpretation-building reaches the first so-called fixed point level. At this countably infinite level, although T is a truth predicate for all those sentences having one of the two classical truth values, the predicate is not quite satisfied by the names of every true sentence because it is not satisfied by the names of some of the true sentences containing T. At this fixed point level, the Liar sentence (of the Liar Paradox) is still neither true nor false. For this reason, the Liar sentence is said to fall into a “truth gap” in Kripke’s theory of truth. See (Kripke, 1975).
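
Kripke’s construction can be sketched in miniature. The following toy model is an illustrative simplification, not Kripke’s formal apparatus: the three sentence names 'A', 'B', 'L' and the helper functions are invented for the sketch. It iterates the partial interpretation of T, level by level, until a fixed point is reached, and at that fixed point the Liar sentence is still neither true nor false.

```python
# Values are True, False, or None (undefined), in strong Kleene style.
# ext  = names the predicate T currently applies to
# anti = names the predicate T currently rejects
def T_of(name, ext, anti):
    if name in ext:  return True
    if name in anti: return False
    return None                       # T is undefined for this name

def neg(v):
    return None if v is None else (not v)

# A toy language with three sentences:
#   'A': a T-free truth (e.g. "0 = 0"), true outright
#   'B': T('A')  -- "sentence A is true"
#   'L': ~T('L') -- the Liar sentence
sentences = {
    'A': lambda ext, anti: True,
    'B': lambda ext, anti: T_of('A', ext, anti),
    'L': lambda ext, anti: neg(T_of('L', ext, anti)),
}

ext, anti = set(), set()              # level 0: T wholly uninterpreted
while True:
    new_ext  = {s for s, cond in sentences.items() if cond(ext, anti) is True}
    new_anti = {s for s, cond in sentences.items() if cond(ext, anti) is False}
    if (new_ext, new_anti) == (ext, anti):   # fixed point reached
        break
    ext, anti = new_ext, new_anti

print(sorted(ext), sorted(anti))   # 'A' and 'B' are true; 'L' stays ungrounded
```

The grounded sentences 'A' and 'B' acquire truth values after finitely many levels and keep them thereafter, while 'L' never enters either the extension or the anti-extension of T: it falls into the truth gap.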

(Yablo 1993) produced a semantic paradox somewhat like the Liar Paradox. Yablo claimed there is no way to coherently assign a truth value to any of the sentences in the countably infinite sequence of sentences of the form, “None of the subsequent sentences are true.” Ask yourself whether the first sentence in the sequence could be true. Notice that no sentence overtly refers to itself. There is controversy in the literature about whether the paradox actually contains a hidden appeal to self-reference, and there has been some investigation of the parallel paradox in which “true” is replaced by “provable.” See (Beall 2001).
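
One way to see that Yablo’s paradox is essentially infinitary is to check finite truncations of the sequence by brute force. In the sketch below (a finite stand-in for the infinite sequence), sentence i is true exactly when every later sentence in the list is untrue; a search over all assignments shows each finite initial segment is perfectly consistent, since the last sentence is vacuously true and all earlier ones false. The contradiction needs the whole infinite sequence.

```python
from itertools import product

def consistent(assignment):
    """Check that each sentence's truth value matches what it asserts:
    sentence i is true iff every later sentence is false."""
    n = len(assignment)
    return all(assignment[i] == all(not assignment[j] for j in range(i + 1, n))
               for i in range(n))

# Brute-force all truth-value assignments to a 6-sentence truncation.
models = [a for a in product([False, True], repeat=6) if consistent(a)]
print(models)   # exactly one model: all sentences false except the last
```

In the infinite case there is no last sentence to be vacuously true, and the familiar Yablo reasoning shows no assignment at all is consistent.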

7. Conclusion

There are many aspects of the infinite that this article does not cover. Here are some of them: renormalization in quantum field theory, supertasks and infinity machines, categorematic and syncategorematic uses of the word “infinity,” mereology, ordinal and cardinal arithmetic in ZF, the various non-ZF set theories, non-standard solutions to Zeno's Paradoxes, Cantor's arguments for the Absolute, Kant’s views on the infinite, quantifiers that assert the existence of uncountably many objects, and the detailed arguments for and against constructivism, intuitionism, and finitism. For more discussion of these latter three programs, see (Maddy 1992).

8. References and Further Reading

  • Ahmavaara, Y. (1965). “The Structure of Space and the Formalism of Relativistic Quantum Theory,” Journal of Mathematical Physics, 6, 87-93.
    • Uses finite arithmetic in mathematical physics, and argues that this is the correct arithmetic for science.
  • Barrow, John D. (2005). The Infinite Book: A Short Guide to the Boundless, Timeless and Endless. Pantheon Books, New York.
    • An informal and easy-to-understand survey of the infinite in philosophy, theology, science and mathematics. Says which Western philosopher throughout the centuries said what about infinity.
  • Barwise, Jon. (1975) “Infinitary Logics,” in Modern Logic: A Survey, E. Agazzi (ed.), Reidel, Dordrecht, pp. 93-112.
    • An introduction to infinitary logics that emphasizes historical development.
  • Beall, J.C. (2001). “Is Yablo’s Paradox Non-Circular?” Analysis 61, no. 3, pp. 176-87.
    • Discusses the controversy over whether the Yablo Paradox is or isn’t indirectly circular.
  • Cantor, Georg. (1887). "Über die verschiedenen Ansichten in Bezug auf die actualunendlichen Zahlen." Bihang till Kongl. Svenska Vetenskaps-Akademien Handlingar, Bd. 11 (1886-7), article 19. P. A. Norstedt & Söner: Stockholm.
    • A very early description of set theory and its relationship to old ideas about infinity.
  • Chihara, Charles. (1973). Ontology and the Vicious-Circle Principle. Ithaca: Cornell University Press.
    • Pages 63-65 give Chihara’s reasons for why the Gödel-Cohen independence results are evidence against mathematical Platonism.
  • Chihara, Charles. (2008). “The Existence of Mathematical Objects,” in Proof & Other Dilemmas: Mathematics and Philosophy, Bonnie Gold & Roger A. Simons, eds., The Mathematical Association of America.
    • In chapter 7, Chihara provides a fine survey of the ontological issues in mathematics.
  • Deutsch, David. (2011). The Beginning of Infinity: Explanations that Transform the World. Penguin Books, New York City.
    • Emphasizes the importance of successful explanation in understanding the world, and provides new ideas on the nature and evolution of our knowledge.
  • Descartes, René. (1641). Meditations on First Philosophy.
    • The third meditation says, “But these properties [of God] are so great and excellent, that the more attentively I consider them the less I feel persuaded that the idea I have of them owes its origin to myself alone. And thus it is absolutely necessary to conclude, from all that I have before said, that God exists….”
  • Dummett, Michael. (1977). Elements of Intuitionism. Oxford University Press, Oxford.
    • A philosophically rich presentation of intuitionism in logic and mathematics.
  • Elwes, Richard. (2010). Mathematics 1001: Absolutely Everything That Matters About Mathematics in 1001 Bite-Sized Explanations, Firefly Books, Richmond Hill, Ontario.
    • Contains the quoted debate between Harvey Friedman and a leading ultrafinitist.
  • Enderton, Herbert B. (1972). A Mathematical Introduction to Logic. Academic Press: New York.
    • An introduction to deductive logic that presupposes the mathematical sophistication of an advanced undergraduate mathematics major. The corollary proved on p. 142 says that if a theory in a countable language has a model of some infinite size, then it also has models of any infinite size.
  • Feferman, Anita Burdman, and Solomon. (2004) Alfred Tarski: Life and Logic, Cambridge University Press, New York.
    • A biography of Alfred Tarski, the 20th century Polish and American logician.
  • Field, Hartry. (1980). Science Without Numbers: A Defense of Nominalism. Princeton: Princeton University Press.
    • Field’s program is to oppose the Quine-Putnam Indispensability argument which apparently implies that mathematical physics requires the existence of mathematical objects such as numbers and sets. Field tries to reformulate scientific theories so, when they are formalized in second-order logic, their quantifiers do not range over abstract mathematical entities. Field’s theory uses quantifiers that range over spacetime points. However, because it uses a second order logic, the theory is also committed to quantifiers that range over sets of spacetime points, and sets are normally considered to be mathematical objects.
  • Gödel, Kurt. (1947/1983). “What is Cantor’s Continuum Problem?” American Mathematical Monthly 54, 515-525. Revised and reprinted in Philosophy of Mathematics: Selected Readings, Paul Benacerraf and Hilary Putnam (eds.), Prentice-Hall, Inc. Englewood Cliffs, 1964.
    • Gödel argues that the failure of ZF to provide a truth value for Cantor’s continuum hypothesis implies a failure of ZF to correctly describe the Platonic world of sets.
  • Greene, Brian. (2004). The Fabric of the Cosmos: Space, Time, and the Texture of Reality. Random House, Inc., New York.
    • Promotes the virtues of string theory.
  • Greene, Brian (1999). The Elegant Universe. Vintage Books, New York.
    • The quantum field theory called quantum electrodynamics (QED) is discussed on pp. 121-2.
  • Greene, Brian. (2011). The Hidden Reality: Parallel Universes and the Deep Laws of the Cosmos. Vintage Books, New York.
    • A popular survey of cosmology with an emphasis on string theory.
  • Hawking, Stephen. (2001). The Illustrated A Brief History of Time: Updated and Expanded Edition. Bantam Dell. New York.
    • Chapter 4 of Brief History contains an elementary and non-mathematical introduction to quantum mechanics and Heisenberg’s uncertainty principle.
  • Hilbert, David. (1925). “On the Infinite,” in Philosophy of Mathematics: Selected Readings, Paul Benacerraf and Hilary Putnam (eds.), Prentice-Hall, Inc. Englewood Cliffs, 1964. 134-151.
    • Hilbert promotes what is now called the Hilbert Program for solving the problem of the infinite by requiring a finite basis for all acceptable assertions about the infinite.
  • Kleene, Stephen Cole. (1967). Mathematical Logic. John Wiley & Sons: New York.
    • An advanced textbook in mathematical logic.
  • Kripke, Saul. (1975). "Outline of a Theory of Truth," Journal of Philosophy 72, pp. 690–716.
    • Describes how to create a truth predicate within a formal language that avoids assigning a truth value to the Liar Sentence.
  • Leibniz, Gottfried. (1702). "Letter to Varignon, with a note on the 'Justification of the Infinitesimal Calculus by that of Ordinary Algebra,'" pp. 542-6. In Leibniz Philosophical Papers and Letters, translated by Leroy E. Loemker (ed.). D. Reidel Publishing Company, Dordrecht, 1969.
    • Leibniz defends the actual infinite in calculus.
  • Levinas, Emmanuel. (1961). Totalité et Infini. The Hague: Martinus Nijhoff.
    • In Totality and Infinity, the Continental philosopher Levinas describes infinity in terms of the possibilities a person confronts upon encountering other conscious beings.
  • Maddy, Penelope. (1992). Realism in Mathematics. Oxford: Oxford University Press.
    • A discussion of the varieties of realism in mathematics and the defenses that have been, and could be, offered for them. The book is an extended argument for realism about mathematical objects. She offers a set theoretic monism in which all physical objects are sets.
  • Maor, E. (1991). To Infinity and Beyond: A Cultural History of the Infinite. Princeton: Princeton University Press.
    • A survey of many of the issues discussed in this encyclopedia article.
  • Mendelson, Elliott. (1997). An Introduction to Mathematical Logic, 4th ed. London: Chapman & Hall.
    • Pp. 225–86 discuss NBG set theory.
  • Mill, John Stuart. (1843). A System of Logic: Ratiocinative and Inductive. Reprinted in J. M. Robson, ed., Collected Works, volumes 7 and 8. Toronto: University of Toronto Press, 1973.
    • Mill argues for empiricism and against accepting the references of theoretical terms in scientific theories if the terms can be justified only by the explanatory success of those theories.
  • Moore, A. W. (2001). The Infinite. Second edition, Routledge, New York.
    • A popular survey of the infinite in metaphysics, mathematics, and science.
  • Mundy, Brent. (1990). “Mathematical Physics and Elementary Logic,” Proceedings of the Biennial Meeting of the Philosophy of Science Association. Vol. 1990, Volume 1. Contributed Papers (1990), pp. 289-301.
    • Discusses the relationships among set theory, logic and physics.
  • Nolt, John. Logics. (1997). Wadsworth Publishing Company, Belmont, California.
    • An undergraduate logic textbook containing in later chapters a brief introduction to non-standard logics such as those with infinite-valued semantics.
  • Norton, John. (2012). "Approximation and Idealization: Why the Difference Matters," Philosophy of Science, 79, pp. 207-232.
    • Recommends being careful about the distinction between approximation and idealization in science.
  • Owen, H. P. (1967). “Infinity in Theology and Metaphysics.” In Paul Edwards (Ed.) The Encyclopedia of Philosophy, volume 4, pp. 190-3.
    • This survey of the topic is still reliable.
  • Parsons, Charles. (1980). “Quine on the Philosophy of Mathematics.” In L. Hahn and P. Schilpp (Eds.) The Philosophy of W. V. Quine, pp. 396-403. La Salle IL: Open Court.
    • Argues against Quine’s position that whether a mathematical entity exists depends on the indispensability of the mathematical term denoting that entity in a true scientific theory.
  • Penrose, Roger. (2005). The Road to Reality: A Complete Guide to the Laws of the Universe. New York: Alfred A. Knopf.
    • A fascinating book about the relationship between mathematics and physics. Many of its chapters assume sophistication in advanced mathematics.
  • Posy, Carl. (2005). “Intuitionism and Philosophy.” In Stewart Shapiro. Ed. (2005). The Oxford Handbook of Philosophy of Mathematics and Logic. Oxford: Oxford University Press.
    • The history of the intuitionism of Brouwer, Heyting and Dummett. Pages 330-1 explain how Brouwer uses choice sequences to develop “even the infinity needed to produce a continuum” non-empirically.
  • Quine, W. V. (1960). Word and Object. Cambridge: MIT Press.
    • Chapter 7 introduces Quine’s viewpoint that set theoretic objects exist because they are needed in the basis of our best scientific theories.
  • Quine, W. V. (1986). The Philosophy of W. V. Quine. Editors: Lewis Edwin Hahn and Paul Arthur Schilpp, Open Court, LaSalle, Illinois.
    • Contains the quotation saying infinite sets exist only insofar as they are needed for scientific theory.
  • Robinson, Abraham. (1966). Non-Standard Analysis. Princeton Univ. Press, Princeton.
    • Robinson’s original theory of the infinitesimal and its use in real analysis to replace the Cauchy-Weierstrass methods that use epsilons and deltas.
  • Rucker, Rudy. (1982). Infinity and the Mind: The Science and Philosophy of the Infinite. Birkhäuser: Boston.
    • A survey of set theory with much speculation about its metaphysical implications.
  • Russell, Bertrand. (1914). Our Knowledge of the External World as a Field for Scientific Method in Philosophy. Open Court Publishing Co.: Chicago.
    • Russell champions the use of contemporary real analysis and physics in resolving Zeno’s paradoxes. Chapter 6 is “The Problem of Infinity Considered Historically,” and that chapter is reproduced in (Salmon, 1970).
  • Salmon, Wesley, ed. (1970). Zeno's Paradoxes. The Bobbs-Merrill Company, Inc., Indianapolis.
    • A collection of the important articles on Zeno's Paradoxes plus a helpful and easy-to-read preface providing an overview of the issues.
  • Smullyan, Raymond. (1967). “Continuum Problem,” in Paul Edwards (ed.), The Encyclopedia of Philosophy, Macmillan Publishing Co. & The Free Press: New York.
    • Discusses the variety of philosophical reactions to the discovery of the independence of the continuum hypotheses from ZF set theory.
  • Suppes, Patrick. (1960). Axiomatic Set Theory. D. Van Nostrand Company, Inc.: Princeton.
    • An undergraduate-level introduction to set theory.
  • Tarski, Alfred. (1924). “Sur les Ensembles Finis,” Fundamenta Mathematicae, Vol. 6, pp. 45-95.
    • Surveys and evaluates alternative definitions of finitude and infinitude proposed by Zermelo, Russell, Sierpinski, Kuratowski, Tarski, and others.
  • Wagon, Stan. (1985). The Banach-Tarski Paradox. Cambridge University Press: Cambridge.
    • The unintuitive Banach-Tarski Theorem says a solid sphere can be decomposed into a finite number of parts and then reassembled into two solid spheres of the same radius as the original sphere. Unfortunately you cannot double your sphere of solid gold this way.
  • Wilder, Raymond L. (1965) Introduction to the Foundations of Mathematics, 2nd ed., John Wiley & Sons, Inc.: New York.
    • An undergraduate-level introduction to the foundation of mathematics.
  • Wolf, Robert S. (2005). A Tour through Mathematical Logic. The Mathematical Association of America: Washington, D.C.
    • Chapters 2 and 6 describe set theory and its historical development. Both the history of the infinitesimal and the development of Robinson’s nonstandard model of analysis are described clearly on pages 280-316.
  • Yablo, Stephen. (1993). “Paradox without Self-Reference.” Analysis 53: 251-52.
    • Yablo presents a Liar-like paradox involving an infinite sequence of sentences that, the author claims, is “not in any way circular,” unlike with the traditional Liar Paradox.


Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University Sacramento
U. S. A.

Zeno’s Paradoxes

In the fifth century B.C.E., Zeno of Elea offered arguments that led to conclusions contradicting what we all know from our physical experience–that runners run, that arrows fly, and that there are many different things in the world. The arguments were paradoxes for the ancient Greek philosophers. Because most of the arguments turn crucially on the notion that space and time are infinitely divisible—for example, that for any distance there is such a thing as half that distance, and so on—Zeno was the first person in history to show that the concept of infinity is problematical.

In his Achilles Paradox, Achilles races to catch a slower runner–for example, a tortoise that is crawling away from him. The tortoise has a head start, so if Achilles hopes to overtake it, he must run at least to the place where the tortoise presently is, but by the time he arrives there, it will have crawled to a new place, so then Achilles must run to this new place, but the tortoise meanwhile will have crawled on, and so forth. Achilles will never catch the tortoise, says Zeno. Therefore, good reasoning shows that fast runners never can catch slow ones. So much the worse for the claim that motion really occurs, Zeno says in defense of his mentor Parmenides who had argued that motion is an illusion.

Although practically no scholars today would agree with Zeno’s conclusion, we cannot escape the paradox by jumping up from our seat and chasing down a tortoise, nor by saying Achilles should run to some other target place ahead of where the tortoise is at the moment. What is required is an analysis of Zeno's own argument that does not get us embroiled in new paradoxes nor impoverish our mathematics and science.

This article explains his ten known paradoxes and considers the treatments that have been offered. Zeno assumed distances and durations can be divided into an actual infinity (what we now call a transfinite infinity) of indivisible parts, and he assumed these are too many for the runner to complete. Aristotle's treatment said Zeno should have assumed there are only potential infinities, and that neither places nor times divide into indivisible parts. His treatment became the generally accepted solution until the late 19th century. The current standard treatment says Zeno was right to conclude that a runner's path contains an actual infinity of parts, but he was mistaken to assume this is too many. This treatment employs the apparatus of calculus which has proved its indispensability for the development of modern science. In the twentieth century it became clear to most researchers that disallowing actual infinities, as Aristotle wanted, hampers the growth of set theory and ultimately of mathematics and physics. This standard treatment took hundreds of years to perfect and was due to the flexibility of intellectuals who were willing to replace old theories and their concepts with more fruitful ones, despite the damage done to common sense and our naive intuitions. The article ends by exploring newer treatments of the paradoxes—and related paradoxes such as Thomson's Lamp Paradox—that have been developed since the 1950s.

Table of Contents

  1. Zeno of Elea
    1. His Life
    2. His Book
    3. His Goals
    4. His Method
  2. The Standard Solution to the Paradoxes
  3. The Ten Paradoxes
    1. Paradoxes of Motion
      1. The Achilles
      2. The Dichotomy (The Racetrack)
      3. The Arrow
      4. The Moving Rows (The Stadium)
    2. Paradoxes of Plurality
      1. Alike and Unlike
      2. Limited and Unlimited
      3. Large and Small
      4. Infinite Divisibility
    3. Other Paradoxes
      1. The Grain of Millet
      2. Against Place
  4. Aristotle’s Treatment of the Paradoxes
  5. Other Issues Involving the Paradoxes
    1. Consequences of Accepting the Standard Solution
    2. Criticisms of the Standard Solution
    3. Supertasks and Infinity Machines
    4. Constructivism
    5. Nonstandard Analysis
    6. Smooth Infinitesimal Analysis
  6. The Legacy and Current Significance of the Paradoxes
  7. References and Further Reading

1. Zeno of Elea

a. His Life

Zeno was born in about 490 B.C.E. in Elea, now Velia, in southern Italy; and he died in about 430 B.C.E. He was a friend and student of Parmenides, who was twenty-five years older and also from Elea. There is little additional, reliable information about Zeno’s life. Plato remarked (in Parmenides 127b) that Parmenides took Zeno to Athens with him where he encountered Socrates, who was about twenty years younger than Zeno, but today’s scholars consider this encounter to have been invented by Plato to improve the story line. Zeno is reported to have been arrested for taking weapons to rebels opposed to the tyrant who ruled Elea. When asked about his accomplices, Zeno said he wished to whisper something privately to the tyrant. But when the tyrant came near, Zeno bit him, and would not let go until he was stabbed. Diogenes Laërtius reported this apocryphal story seven hundred years after Zeno’s death.

b. His Book

According to Plato’s commentary in his Parmenides (127a to 128e), Zeno brought a treatise with him when he visited Athens. It was said to be a book of paradoxes defending the philosophy of Parmenides. Plato and Aristotle may have had access to the book, but Plato did not state any of the arguments, and Aristotle’s presentations of the arguments are very compressed. A thousand years after Zeno, the Greek philosophers Proclus and Simplicius commented on the book and its arguments. They had access to some of the book, perhaps to all of it, but it has not survived. Proclus is the first person to tell us that the book contained forty arguments. This number is confirmed by the sixth century commentator Elias, who is regarded as an independent source because he does not mention Proclus. Unfortunately, we know of no specific dates for when Zeno composed any of his paradoxes, and we know very little of how Zeno stated his own paradoxes. We do have a direct quotation via Simplicius of the Paradox of Denseness and a partial quotation via Simplicius of the Large and Small Paradox. In total we know of less than two hundred words that can be attributed to Zeno. Our knowledge of these two paradoxes and the other seven comes to us indirectly through paraphrases of them, and comments on them, primarily by Aristotle (384-322 B.C.E.), but also by Plato (427-347 B.C.E.), Proclus (410-485 C.E.), and Simplicius (490-560 C.E.). The names of the paradoxes were created by commentators, not by Zeno.

c. His Goals

In the early fifth century B.C.E., Parmenides emphasized the distinction between appearance and reality. Reality, he said, is a seamless unity that is unchanging and cannot be destroyed, so appearances of reality are deceptive. Our ordinary observation reports are false; they do not report what is real. This metaphysical theory is the opposite of Heraclitus’ theory, but evidently it was supported by Zeno. Although we do not know from Zeno himself whether he accepted his own paradoxical arguments or what point he was making with them, according to Plato the paradoxes were designed to provide detailed, supporting arguments for Parmenides by demonstrating that our common sense confidence in the reality of motion, change, and ontological plurality (that is, that there exist many things) involves absurdities. Plato’s classical interpretation of Zeno was accepted by Aristotle and by most other commentators throughout the intervening centuries.

Eudemus, a student of Aristotle, offered another interpretation. He suggested that Zeno was challenging both pluralism and Parmenides’ idea of monism, which would imply that Zeno was a nihilist. Paul Tannery in 1885 and Wallace Matson in 2001 offer a third interpretation of Zeno’s goals regarding the paradoxes of motion. Plato and Aristotle did not understand Zeno’s arguments nor his purpose, they say. Zeno was actually challenging the Pythagoreans and their particular brand of pluralism, not Greek common sense. Zeno was not trying to directly support Parmenides. Instead, he intended to show that Parmenides’ opponents are committed to denying the very motion, change, and plurality they believe in, and Zeno’s arguments were completely successful. This controversial issue about interpreting Zeno’s purposes will not be pursued further in this article, and Plato’s classical interpretation will be assumed.

d. His Method

Before Zeno, Greek thinkers favored presenting their philosophical views by writing poetry. Zeno began the grand shift away from poetry toward a prose that contained explicit premises and conclusions. And he employed the method of indirect proof in his paradoxes by temporarily assuming some thesis that he opposed and then attempting to deduce an absurd conclusion or a contradiction, thereby undermining the temporary assumption. This method of indirect proof or reductio ad absurdum probably originated with his teacher Parmenides [although this is disputed in the scholarly literature], but Zeno used it more systematically.

2. The Standard Solution to the Paradoxes

Any paradox can be treated by abandoning enough of its crucial assumptions. For Zeno's paradoxes it is very interesting to consider which assumptions to abandon, and why those. A paradox is an argument that reaches a contradiction by apparently legitimate steps from apparently reasonable assumptions, while the experts at the time cannot agree on the way out of the paradox, that is, agree on its resolution. It is this latter point about disagreement among the experts that distinguishes a paradox from a mere puzzle in the ordinary sense of that term. Zeno’s paradoxes are now generally considered to be puzzles because of the wide agreement among today’s experts that there is at least one acceptable resolution of the paradoxes.

This resolution is called the Standard Solution. It presupposes calculus, the rest of standard real analysis, and classical mechanics. It assumes that physical processes are sets of point-events. It implies that motions, durations, distances and line segments are all linear continua composed of points, then uses these ideas to challenge various assumptions made, and steps taken, by Zeno. To be very brief and anachronistic, Zeno's mistake (and Aristotle's mistake) was not to have used calculus. More specifically, in the case of the paradoxes of motion such as the Achilles and the Dichotomy, Zeno's mistake was not his assuming there is a completed infinity of places for the runner to go, which was what Aristotle said was Zeno's mistake; Zeno's and Aristotle's mistake was in assuming that this is too many places (for the runner to go to in a finite time).

A key background assumption of the Standard Solution is that this resolution is not simply employing some concepts that will undermine Zeno’s reasoning–Aristotle's reasoning does that, too, at least for most of the paradoxes–but that it is employing concepts which have been shown to be appropriate for the development of a coherent and fruitful system of mathematics and physical science. Aristotle's treatment of the paradoxes does not employ these fruitful concepts. The Standard Solution is much more complicated than Aristotle's treatment, and no single person can be credited with creating it.

The Standard Solution uses calculus. In calculus we need to speak of one event happening pi seconds after another, and of one event happening the square root of three seconds after another. In ordinary discourse outside of science we would never need this kind of precision. The need for this precision has led to requiring time to be a linear continuum, very much like a segment of the real number line.

Calculus was invented in the late 1600s by Newton and Leibniz. Their calculus is a technique for treating continuous motion as being composed of an infinite number of infinitesimal steps. After the acceptance of calculus, almost all mathematicians and physicists believed that continuous motion, including Achilles' motion, should be modeled by a function which takes real numbers representing time as its argument and which gives real numbers representing spatial position as its value. This position function should be continuous or gap-free. In addition, the position function should be differentiable or smooth in order to make sense of speed, the rate of change of position. By the early 20th century most mathematicians had come to believe that, to make rigorous sense of motion, mathematics needs a fully developed set theory that rigorously defines the key concepts of real number, continuity and differentiability. Doing this requires a well-defined concept of the continuum. Unfortunately Newton and Leibniz did not have a good definition of the continuum, and finding a good one required over two hundred years of work.
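The modeling idea just described can be sketched in a few lines of Python. The constant-speed position function and the sample instant below are our own illustrative choices, not anything from the historical sources: position is a function from times (real numbers) to places, and speed is recovered as the derivative, here approximated by a difference quotient.

```python
# A sketch of the Standard Solution's picture of motion: position is a
# function from time (a real number) to place (a real number).
# The constant speed of 5 units is an illustrative choice.
def position(t):
    return 5.0 * t

def approximate_speed(t, h=1e-6):
    """Estimate the derivative dx/dt with a symmetric difference quotient."""
    return (position(t + h) - position(t - h)) / (2 * h)

# The estimated speed at any instant is (very nearly) 5.
print(round(approximate_speed(2.0), 3))  # 5.0
```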

The continuum is a very special set; it is the standard model of the real numbers. Intuitively, a continuum is a continuous entity; it is a whole thing that has no gaps. Some examples of a continuum are the path of a runner’s center of mass, the time elapsed during this motion, ocean salinity, and the temperature along a metal rod. Distances and durations are normally considered to be real continua whereas treating the ocean salinity and the rod's temperature as continua is a very useful approximation for many calculations in physics even though we know that at the atomic level the approximation breaks down.

The distinction between “a” continuum and “the” continuum is that “the” continuum is the paradigm of “a” continuum. The continuum is the mathematical line, the line of geometry, which is standardly understood to have the same structure as the real numbers in their natural order. Real numbers and points on the continuum can be put into a one-to-one order-preserving correspondence. There are not enough rational numbers for this correspondence even though the rational numbers are dense, too (in the sense that between any two rational numbers there is another rational number).
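The density of the rationals is easy to exhibit concretely. Here is a minimal Python sketch using exact rational arithmetic; the particular endpoints 1/3 and 1/2 are arbitrary illustrative choices. The midpoint of any two distinct rationals is itself a rational strictly between them.

```python
from fractions import Fraction

def between(a, b):
    """Return a rational strictly between two distinct rationals: their midpoint."""
    return (a + b) / 2

a, b = Fraction(1, 3), Fraction(1, 2)
m = between(a, b)
print(m)          # 5/12
print(a < m < b)  # True
```

Since the construction can be repeated on (a, m) and on (m, b), there is no end to the rationals packed into any interval; yet, as noted above, there are still not enough of them to correspond one-to-one with the points of the continuum.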

For Zeno’s paradoxes, standard analysis assumes that length should be defined in terms of measure, and motion should be defined in terms of the derivative. These definitions are given in terms of the linear continuum. The most important features of any linear continuum are that (a) it is composed of points, (b) it is an actually infinite set, that is, a transfinite set, and not merely a potentially infinite set that gets bigger over time, (c) it is undivided yet infinitely divisible (that is, it is gap-free), (d) the points are so close together that no point can have a point immediately next to it, (e) between any two points there are other points, (f) the measure (such as length) of a continuum is not a matter of adding up the measures of its points nor adding up the number of its points, (g) any connected part of a continuum is also a continuum, and (h) there is an aleph-one number of points between any two points.

Physical space is not a linear continuum because it is three-dimensional and not linear; but it has one-dimensional subspaces such as paths of runners and orbits of planets; and these are linear continua if we use the path created by only one point on the runner and the orbit created by only one point on the planet. Regarding time, each (point) instant is assigned a real number as its time, and each instant is assigned a duration of zero. The time taken by Achilles to catch the tortoise is a temporal interval, a linear continuum of instants, according to the Standard Solution (but not according to Zeno or Aristotle). The Standard Solution says that the sequence of Achilles' goals (the goals of reaching the point where the tortoise is) should be abstracted from a pre-existing transfinite set, namely a linear continuum of point places along the tortoise's path. Aristotle's treatment does not do this. The next section of this article presents the details of how the concepts of the Standard Solution are used to resolve each of Zeno's Paradoxes.

Of the ten known paradoxes, The Achilles attracted the most attention over the centuries. Aristotle’s treatment of the paradox involved accusing Zeno of using the concept of an actual or completed infinity instead of the concept of a potential infinity, and accusing Zeno of failing to appreciate that a line cannot be composed of points. Aristotle’s treatment is described in detail below. It was generally accepted until the 19th century, but slowly lost ground to the Standard Solution. Some historians say Aristotle had no solution but only a verbal quibble. This article takes no side on this dispute and speaks of Aristotle’s “treatment.”

The development of calculus was the most important step in the Standard Solution of Zeno's paradoxes, so why did it take so long for the Standard Solution to be accepted after Newton and Leibniz developed their calculus? The period lasted about two hundred years. There are four reasons. (1) It took time for calculus and the rest of real analysis to prove its applicability and fruitfulness in physics. (2) It took time for the relative shallowness of Aristotle’s treatment to be recognized. (3) It took time for philosophers of science to appreciate that each theoretical concept used in a physical theory need not have its own correlate in our experience.  (4) It took time for certain problems in the foundations of mathematics to be resolved, such as finding a better definition of the continuum and avoiding the paradoxes of Cantor's naive set theory.

Point (2) is discussed in section 4 below.

Point (3) is about the time it took for philosophers of science to reject the demand, favored by Ernst Mach and many Logical Positivists, that meaningful terms in science must have “empirical meaning.” This was the demand that each physical concept be separately definable with observation terms. It was thought that, because our experience is finite, the term “actual infinite” or "completed infinity" could not have empirical meaning, but “potential infinity” could. Today, most philosophers would not restrict meaning to empirical meaning. However, for an interesting exception see Dummett (2000) which contains a theory in which time is composed of overlapping intervals rather than durationless instants, and in which the endpoints of those intervals are the initiation and termination of actual physical processes. This idea of treating time without instants develops a 1936 proposal of Russell and Whitehead. The central philosophical issue about Dummett's treatment of motion is how its adoption would affect other areas of mathematics and science.

Point (1) is about the time it took for classical mechanics to develop to the point where it was accepted as giving correct solutions to problems involving motion. Point (1) was challenged in the metaphysical literature on the grounds that the abstract account of continuity in real analysis does not truly describe either time, space or concrete physical reality. This challenge is discussed in later sections.

Point (4) arises because the standard of rigorous proof and rigorous definition of concepts has increased over the years. As a consequence, the difficulties in the foundations of real analysis, which began with George Berkeley’s criticism of inconsistencies in the use of infinitesimals in the calculus of Leibniz (and fluxions in the calculus of Newton), were not satisfactorily resolved until the early 20th century with the development of Zermelo-Fraenkel set theory. The key idea was to work out the necessary and sufficient conditions for being a continuum. To achieve the goal, the conditions for being a mathematical continuum had to be strictly arithmetical and not dependent on our intuitions about space, time and motion. The idea was to revise or “tweak” the definition until it would not create new paradoxes and would still give useful theorems. When this revision was completed, it could be declared that the set of real numbers is an actual infinity, not a potential infinity, and that not only is any interval of real numbers a linear continuum, but so are the spatial paths, the temporal durations, and the motions that are mentioned in Zeno’s paradoxes. In addition, it was important to clarify how to compute the sum of an infinite series (such as 1/2 + 1/4 + 1/8 + ...) and how to define motion in terms of the derivative. This new mathematical system required new or better-defined mathematical concepts of compact set, connected set, continuity, continuous function, convergence-to-a-limit of an infinite sequence (such as 1/2, 1/4, 1/8, ...), curvature at a point, cut, derivative, dimension, function, integral, limit, measure, reference frame, set, and size of a set. Similarly, rigor was added to the definitions of the physical concepts of place, instant, duration, distance, and instantaneous speed. The relevant revisions were made by Euler in the 18th century and by Bolzano, Cantor, Cauchy, Dedekind, Frege, Hilbert, Lebesgue, Peano, Russell, Weierstrass, and Whitehead, among others, during the 19th and early 20th centuries.

What about Leibniz's infinitesimals or Newton's fluxions? Let's stick with infinitesimals, since fluxions have the same problems and same resolution. In 1734, Berkeley had properly criticized the use of infinitesimals as being "ghosts of departed quantities" that are used inconsistently in calculus. Earlier Newton had defined instantaneous speed as the ratio of an infinitesimally small distance and an infinitesimally small duration, and he and Leibniz produced a system of calculating variable speeds that was very fruitful. But nobody in that century or the next could adequately explain what an infinitesimal was. Newton had called them “evanescent divisible quantities,” whatever that meant. Leibniz called them “vanishingly small,” but that was just as vague. The practical use of infinitesimals was unsystematic. For example, the infinitesimal dx is treated as being equal to zero when it is declared that x + dx = x, but is treated as not being zero when used in the denominator of the fraction [f(x + dx) - f(x)]/dx which is the derivative of the function f. In addition, consider the seemingly obvious Archimedean property of pairs of positive numbers: given any two positive numbers A and B, if you add enough copies of A, then you can produce a sum greater than B. This property fails if A is an infinitesimal. Finally, mathematicians gave up on answering Berkeley’s charges (and thus re-defined what we mean by standard analysis) because, in 1821, Cauchy showed how to achieve the same useful theorems of calculus by using the idea of a limit instead of an infinitesimal. Later in the 19th century, Weierstrass resolved some of the inconsistencies in Cauchy’s account and satisfactorily showed how to define continuity in terms of limits (his epsilon-delta method). As J. O. Wisdom points out (1953, p. 23), “At the same time it became clear that [Leibniz's and] Newton’s theory, with suitable amendments and additions, could be soundly based.” In an effort to provide this sound basis according to the latest, heightened standard of what counts as “sound,” Peano, Frege, Hilbert, and Russell attempted to properly axiomatize real analysis. This led in 1901 to Russell’s paradox and the fruitful controversy about how to provide a foundation to all of mathematics. That controversy still exists, but the majority view is that axiomatic Zermelo-Fraenkel set theory with the axiom of choice blocks all the paradoxes, legitimizes Cantor’s theory of transfinite sets, and provides the proper foundation for real analysis and other areas of mathematics. This standard real analysis lacks infinitesimals, thanks to Cauchy and Weierstrass. Standard real analysis is the mathematics that the Standard Solution applies to Zeno’s Paradoxes.
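Cauchy's limit idea can be illustrated concretely. Instead of treating dx as a mysterious infinitesimal, one evaluates the difference quotient [f(x + dx) - f(x)]/dx for ever smaller nonzero values of dx and asks what value it approaches. Here is a minimal Python sketch; the function f(x) = x² and the point x = 3 are our own illustrative choices.

```python
def f(x):
    return x * x  # f(x) = x^2, whose derivative at x is 2x

def difference_quotient(x, dx):
    # dx is an ordinary nonzero real number, never an "infinitesimal".
    return (f(x + dx) - f(x)) / dx

# As dx shrinks toward 0 (while never equaling 0), the quotient
# approaches 6, the derivative of f at x = 3.
for dx in [0.1, 0.01, 0.001, 0.0001]:
    print(difference_quotient(3.0, dx))
```

The limit concept thus sidesteps Berkeley's complaint: at no stage is dx treated as both zero and nonzero.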

The rational numbers are not continuous although they are infinitely numerous and infinitely dense. To come up with a foundation for calculus there had to be a good definition of the continuity of the real numbers. But this required having a good definition of irrational numbers. There wasn’t one before 1872. Dedekind’s definition in 1872 defines the mysterious irrationals in terms of the familiar rationals. The result was a clear and useful definition of real numbers. The usefulness of Dedekind's definition of real numbers, and the lack of any better definition, convinced many mathematicians to be more open to accepting actually-infinite sets.

We won't explore the definitions of continuity here, but what Dedekind discovered about the reals and their relationship to the rationals was how to define a real number to be a cut of the rational numbers, where a cut is a certain ordered pair of actually-infinite sets of rational numbers.

A Dedekind cut (A,B) is defined to be a partition or cutting of the set of all the rational numbers into a left part A and a right part B. A and B are non-empty subsets, such that all rational numbers in A are less than all rational numbers in B, and also A contains no greatest number. Every real number is a unique Dedekind cut. The cut can be made at a rational number or at an irrational number. Here are examples of each:

Dedekind's real number 1/2 is ({x : x < 1/2} , {x: x ≥ 1/2}).

Dedekind's positive real number √2 is ({x : x < 0 or x² < 2} , {x : x ≥ 0 and x² ≥ 2}).

Notice that the rational real number 1/2 is within its B set, but the irrational real number √2 is not within its B set because B contains only rational numbers. That property is what distinguishes rationals from irrationals, according to Dedekind.

For any cut (A,B), if B has a smallest number, then the real number for that cut corresponds to this smallest number, as in the definition of ½ above. Otherwise, the cut defines an irrational number which, loosely speaking, fills the gap between A and B, as in the definition of the square root of 2 above.

By defining reals in terms of rationals this way, Dedekind gave a foundation to the reals, and legitimized them by showing they are as acceptable as actually-infinite sets of rationals.
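The two example cuts above can be rendered as concrete membership tests. Here is a minimal Python sketch using exact rational arithmetic; the function names are ours, chosen for illustration.

```python
from fractions import Fraction

# The cut for 1/2:     A = {x : x < 1/2},            B = {x : x >= 1/2}.
# The cut for sqrt(2): A = {x : x < 0 or x^2 < 2},   B = {x : x >= 0 and x^2 >= 2}.
def in_B_half(x):
    return x >= Fraction(1, 2)

def in_B_sqrt2(x):
    return x >= 0 and x * x >= 2

print(in_B_half(Fraction(1, 2)))   # True: 1/2 itself lies in its B set
print(in_B_sqrt2(Fraction(7, 5)))  # False: (7/5)^2 = 49/25 < 2
print(in_B_sqrt2(Fraction(3, 2)))  # True: (3/2)^2 = 9/4 >= 2
# No rational x has x*x == 2, so the B set for sqrt(2) has no smallest member.
```

Rationals such as 7/5, 141/100, 1414/1000, … get ever closer to the cut from below without the B set ever acquiring a least element, which is exactly what makes this cut define an irrational number.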

But what exactly is an actually-infinite or transfinite set, and does this idea lead to contradictions? This question needs an answer if there is to be a good theory of continuity and of real numbers. In the 1870s, Cantor clarified what an actually-infinite set is and made a convincing case that the concept does not lead to inconsistencies. These accomplishments by Cantor are why he (along with Dedekind and Weierstrass) is said by Russell to have “solved Zeno’s Paradoxes.”

That solution recommends using very different concepts and theories than those used by Zeno. The argument that this is the correct solution was presented by many people, but it was especially influenced by the work of Bertrand Russell (1914, lecture 6) and the more detailed work of Adolf Grünbaum (1967). In brief, the argument for the Standard Solution is that we have solid grounds for believing our best scientific theories, but the theories of mathematics such as calculus and Zermelo-Fraenkel set theory are indispensable to these theories, so we have solid grounds for believing in them, too. The scientific theories require a resolution of Zeno’s paradoxes and the other paradoxes; and the Standard Solution to Zeno's Paradoxes that uses standard calculus and Zermelo-Fraenkel set theory is indispensable to this resolution or at least is the best resolution, or, if not, then we can be fairly sure there is no better solution, or, if not that either, then we can be confident that the solution is good enough (for our purposes). Aristotle's treatment, on the other hand, uses concepts that hamper the growth of mathematics and science. Therefore, we should accept the Standard Solution.

In the next section, this solution will be applied to each of Zeno’s ten paradoxes.

To be optimistic, the Standard Solution represents a counterexample to the claim that philosophical problems never get solved. To be less optimistic, the Standard Solution has its drawbacks and its alternatives, and these have generated new and interesting philosophical controversies beginning in the last half of the 20th century, as will be seen in later sections. The primary alternatives contain different treatments of calculus from that developed at the end of the 19th century. Whether this implies that Zeno’s paradoxes have multiple solutions or only one is still an open question.

Did Zeno make mistakes? And was he superficial or profound? These questions are a matter of dispute in the philosophical literature. The majority position is as follows. If we give his paradoxes a sympathetic reconstruction, he correctly demonstrated that some important, classical Greek concepts are logically inconsistent, and he did not make a mistake in doing this, except in the Moving Rows Paradox, the Paradox of Alike and Unlike and the Grain of Millet Paradox, his weakest paradoxes. Zeno did assume that the classical Greek concepts were the correct concepts to use in reasoning about his paradoxes, and now we prefer revised concepts, though it would be unfair to say he blundered for not foreseeing later developments in mathematics and physics.

3. The Ten Paradoxes

Zeno probably created forty paradoxes, of which only the following ten are known. Only the first four have standard names, and the first two have received the most attention. The ten are of uneven quality. Zeno and his ancient interpreters usually stated his paradoxes badly, so it has taken some clever reconstruction over the years to reveal their full force. Below, the paradoxes are reconstructed sympathetically, and then the Standard Solution is applied to them. These reconstructions use just one of several reasonable schemes for presenting the paradoxes, but the present article does not explore the historical research about the variety of interpretive schemes and their relative plausibility.

a. Paradoxes of Motion

i. The Achilles

Achilles, who is the fastest runner of antiquity, is racing to catch the tortoise that is slowly crawling away from him. Both are moving along a linear path at constant speeds. In order to catch the tortoise, Achilles will have to reach the place where the tortoise presently is. However, by the time Achilles gets there, the tortoise will have crawled to a new location. Achilles will then have to reach this new location. By the time Achilles reaches that location, the tortoise will have moved on to yet another location, and so on forever. Zeno claims Achilles will never catch the tortoise. He might have defended this conclusion in various ways—by saying it is because the sequence of goals or locations has no final member, or requires too much distance to travel, or requires too much travel time, or requires too many tasks. However, if we do believe that Achilles succeeds and that motion is possible, then we are victims of illusion, as Parmenides says we are.

The source for Zeno's views is Aristotle (Physics 239b14-16) and some passages from Simplicius in the sixth century C.E. There is no evidence that Zeno used a tortoise rather than a slow human. The tortoise is a commentator’s addition. Aristotle spoke simply of “the runner” who competes with Achilles.

It won’t do to react and say the solution to the paradox is that there are biological limitations on how small a step Achilles can take. Achilles’ feet aren’t obligated to stop and start again at each of the locations described above, so there is no limit to how close one of those locations can be to another. It is best to think of the change from one location to another as a movement rather than as incremental steps requiring halting and starting again. Zeno is assuming that space and time are infinitely divisible; they are not discrete or atomistic. If they were, the Paradox's argument would not work.

One common complaint with Zeno’s reasoning is that he is setting up a straw man because it is obvious that Achilles cannot catch the tortoise if he continually takes a bad aim toward the place where the tortoise is; he should aim farther ahead. The mistake in this complaint is that even if Achilles took some sort of better aim, it is still true that he is required to go to every one of those locations that are the goals of the so-called “bad aims,” so Zeno's argument needs a better treatment.

The treatment called the "Standard Solution" to the Achilles Paradox uses calculus and other parts of real analysis to describe the situation. It implies that Zeno is assuming in the Achilles situation that Achilles cannot achieve his goal because

(1) there is too far to run, or

(2) there is not enough time, or

(3) there are too many places to go, or

(4) there is no final step, or

(5) there are too many tasks.

The historical record does not tell us which of these was Zeno's real assumption, but they are all false assumptions, according to the Standard Solution. Let's consider (1). Presumably Zeno would defend the assumption by remarking that the sum of the distances along so many of the runs to where the tortoise is must be infinite, which is too much for even Achilles. However, the advocate of the Standard Solution will remark, "How does Zeno know what the sum of this infinite series is?" According to the Standard Solution the sum is not infinite. Here is a graph using the methods of the Standard Solution showing the activity of Achilles as he chases the tortoise and overtakes it.

graph of Achilles and the Tortoise

To describe this graph in more detail, we need to say that Achilles' path [the path of some dimensionless point of Achilles' body] is a linear continuum and so is composed of an actual infinity of points. (An actual infinity is also called a "completed infinity" or "transfinite infinity," and the word "actual" does not mean "real" as opposed to "imaginary.") Since Zeno doesn't make this assumption, that is another source of error in Zeno's reasoning. Achilles travels a distance d₁ in reaching the point x₁ where the tortoise starts, but by the time Achilles reaches x₁, the tortoise has moved on to a new point x₂. When Achilles reaches x₂, having gone an additional distance d₂, the tortoise has moved on to point x₃, requiring Achilles to cover an additional distance d₃, and so forth. This sequence of non-overlapping distances (or intervals or sub-paths) is an actual infinity, but happily the geometric series converges. The sum of its terms d₁ + d₂ + d₃ + … is a finite distance that Achilles can readily complete while moving at a constant speed.
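The convergence claim can be checked numerically. Suppose, purely for illustration (these numbers are not from Zeno's text), that Achilles runs 10 meters per second, the tortoise crawls 1 meter per second, and the head start is 10 meters. Then each successive gap is one tenth of the previous one, and the partial sums of the distances approach the same catch-up point that elementary mechanics gives directly:

```python
# Illustrative values, not from Zeno's text.
achilles_speed = 10.0  # meters per second
tortoise_speed = 1.0
head_start = 10.0      # meters

# Each leg of the chase covers the current gap, and the gap then
# shrinks by the ratio tortoise_speed / achilles_speed = 1/10.
ratio = tortoise_speed / achilles_speed
total, gap = 0.0, head_start
for _ in range(50):  # far more terms than needed for convergence
    total += gap
    gap *= ratio

closed_form = head_start / (1 - ratio)  # sum of the geometric series
direct = achilles_speed * head_start / (achilles_speed - tortoise_speed)
print(total, closed_form, direct)  # all three agree: about 11.11 meters
```

The infinite series of sub-paths thus sums to a finite distance, which is the Standard Solution's reply to assumption (1).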

Similar reasoning would apply if Zeno were to have made assumption (2) or (3). Regarding (4), the requirement that there be a final step or final sub-path is simply mistaken, according to the Standard Solution. More will be said about assumption (5) in Section 5c.

By the way, the Paradox does not require the tortoise to crawl at a constant speed but only to never stop crawling and for Achilles to travel faster on average than the tortoise. The assumption of constant speed is made simply for ease of understanding.

The Achilles Argument presumes that space and time are infinitely divisible. So, Zeno's conclusion may not simply have been that Achilles cannot catch the tortoise but instead that he cannot catch the tortoise if space and time are infinitely divisible. Perhaps, as some commentators have speculated, Zeno used the Achilles only to attack continuous space, and he intended his other paradoxes such as "The Moving Rows" to attack discrete space. The historical record is not clear. Notice that, although space and time are infinitely divisible for Zeno, he did not have the concepts to properly describe the limit of the infinite division. Neither Zeno nor any of the other ancient Greeks had the concept of a dimensionless point; they did not even have the concept of zero. However, today's versions of Zeno's Paradoxes can and do use those concepts.

ii. The Dichotomy (The Racetrack)

In his Progressive Dichotomy Paradox, Zeno argued that a runner will never reach the stationary goal line of a racetrack. The reason is that the runner must first reach half the distance to the goal, but when there he must still cross half the remaining distance to the goal, but having done that the runner must cover half of the new remainder, and so on. If the goal is one meter away, the runner must cover a distance of 1/2 meter, then 1/4 meter, then 1/8 meter, and so on ad infinitum. The runner cannot reach the final goal, says Zeno. Why not? There are few traces of Zeno's reasoning here, but for reconstructions that give the strongest reasoning, we may say that the runner will not reach the final goal because there is too far to run: the sum of the infinitely many distances is actually infinite. The Standard Solution argues instead that the sum of this infinite geometric series is one, not infinity.
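The Standard Solution's claim about the sum can be checked numerically. A minimal sketch: the partial sums of 1/2 + 1/4 + 1/8 + … climb toward 1 but never exceed it.

```python
# Partial sums of the Dichotomy series 1/2 + 1/4 + 1/8 + ...
# After n terms the partial sum is 1 - 1/2**n, always short of 1,
# and the full series converges to 1, not to infinity.

partial = 0.0
for n in range(1, 31):
    partial += 1 / 2**n  # add the next half-of-the-remaining-distance

print(partial)           # ≈ 0.9999999990686774, approaching 1
```

After thirty terms the partial sum already differs from 1 by less than one billionth, which is the sense in which the runner's infinitely many sub-paths add up to exactly one meter.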

The problem of the runner getting to the goal can be viewed from a different perspective. According to the Regressive version of the Dichotomy Paradox, the runner cannot even take a first step. Here is why. Any step may be divided conceptually into a first half and a second half. Before taking a full step, the runner must take a 1/2 step, but before that he must take a 1/4 step, but before that a 1/8 step, and so forth ad infinitum, so the runner will never get going. Like the Achilles Paradox, this paradox also concludes that any motion is impossible. The original source is Aristotle (Physics, 239b11-13).

The Dichotomy paradox, in either its Progressive version or its Regressive version, assumes for the sake of simplicity that the runner’s positions are point places. Actual runners take up some larger volume, but assuming point places is not a controversial assumption because Zeno could have reconstructed his paradox by speaking of the point places occupied by, say, the tip of the runner’s nose, and this assumption makes for a stronger paradox than assuming the runner's positions are larger regions.

In the Dichotomy Paradox, the runner reaches the points 1/2 and 3/4 and 7/8 and so forth on the way to his goal, but under the influence of Bolzano and Dedekind and Cantor, who developed the first theory of sets, the set of those points is no longer considered to be potentially infinite. It is an actually infinite set of points abstracted from a continuum of points, in the contemporary sense of “continuum” at the heart of calculus. And the ancient idea that the actually infinite series of path lengths or segments 1/2 + 1/4 + 1/8 + … is infinite had to be rejected in favor of the new theory that it converges to 1. This is key to solving the Dichotomy Paradox, according to the Standard Solution. It is basically the same treatment as that given to the Achilles. The Dichotomy Paradox has been called “The Stadium” by some commentators, but that name is also commonly used for the Paradox of the Moving Rows.

Aristotle, in Physics Z9, said of the Dichotomy that it is possible for a runner to come in contact with a potentially infinite number of things in a finite time provided the time intervals become shorter and shorter. Aristotle said Zeno assumed this is impossible, and that is one of his errors in the Dichotomy. However, Aristotle merely asserted this and could give no detailed theory that enables the computation of the finite amount of time. So, Aristotle could not really defend his diagnosis of Zeno's error. Today the calculus is used to provide the Standard Solution with that detailed theory.

There is another detail of the Dichotomy that needs resolution. How does Zeno complete the trip if there is no final step or last member of the infinite sequence of steps (intervals and goals)? Don't trips need last steps? The Standard Solution answers "no" and says the intuitive answer "yes" is one of our many intuitions that must be rejected when embracing the Standard Solution.

iii. The Arrow

Zeno’s Arrow Paradox takes a different approach to challenging the coherence of our common sense concepts of time and motion. As Aristotle explains, from Zeno’s “assumption that time is composed of moments,” a moving arrow must occupy a space equal to itself during any moment. That is, during any moment it is at the place where it is. But places do not move. So, if in each moment, the arrow is occupying a space equal to itself, then the arrow is not moving in that moment because it has no time in which to move; it is simply there at the place. The same holds for any other moment during the so-called “flight” of the arrow. So, the arrow is never moving. Similarly, nothing else moves. The source for Zeno’s argument is Aristotle (Physics, 239b5-32).

The Standard Solution to the Arrow Paradox uses the “at-at” theory of motion, which says motion is being at different places at different times and that being at rest involves being motionless at a particular point at a particular time. The difference between rest and motion has to do with what is happening at nearby moments and has nothing to do with what is happening during a moment. An object cannot be in motion in or during an instant, but it can be in motion at an instant in the sense of having a speed at that instant, provided the object occupies different positions at times before or after that instant so that the instant is part of a period in which the arrow is continuously in motion. If we don't pay attention to what happens at nearby instants, it is impossible to distinguish instantaneous motion from instantaneous rest, but distinguishing the two is the way out of the Arrow Paradox. Zeno would have balked at the idea of motion at an instant, and Aristotle explicitly denied it. The Arrow Paradox seems especially strong to someone who would say that motion is an intrinsic property of an instant, being some propensity or disposition to be elsewhere.

In standard calculus, the speed of an object at an instant (instantaneous velocity) is the time derivative of the object's position; this means the object's speed is the limit of its speeds during arbitrarily small intervals of time containing the instant. Equivalently, we say the object's speed is the limit of its average speed over an interval as the length of the interval tends to zero. The derivative of position x with respect to time t, namely dx/dt, is the arrow’s speed, and it has non-zero values at specific places at specific instants during the flight, contra Zeno and Aristotle. The speed during an instant or in an instant, which is what Zeno is calling for, would be 0/0 and so be undefined. Using these modern concepts, Zeno cannot successfully argue that at each moment the arrow is at rest or that the speed of the arrow is zero at every instant. Therefore, advocates of the Standard Solution conclude that Zeno’s Arrow Paradox has a false, but crucial, assumption and so is unsound.
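A small numerical sketch of this definition. The position function is an illustrative assumption, an arrow moving uniformly at 5 units per second; shrinking intervals around the instant t = 2 give average speeds that settle on the derivative's value, with no division of 0 by 0 anywhere.

```python
# Instantaneous speed as the limit of average speeds over shrinking
# intervals containing the instant. The position function below is an
# illustrative assumption: x(t) = 5t, an arrow moving at constant speed.

def position(t):
    return 5.0 * t

def average_speed(t, h):
    """Average speed over the interval [t, t + h]."""
    return (position(t + h) - position(t)) / h

# As the interval shrinks, the average speeds settle on one value: the
# derivative dx/dt at the instant t = 2. No step here divides 0 by 0,
# because h is always a non-zero interval length.
speeds = [average_speed(2.0, h) for h in (1.0, 0.1, 0.001, 1e-6)]
print(speeds)  # each ≈ 5.0: a non-zero speed AT the instant t = 2
```

This is the sense in which the arrow has a speed at an instant even though it cannot move in or during an instant: the speed at t = 2 is fixed by what happens at nearby instants, exactly as the at-at theory says.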

Independently of Zeno, the Arrow Paradox was discovered by the Chinese dialectician Kung-sun Lung (Gongsun Long, ca. 325–250 B.C.E.). A lingering philosophical question about the arrow paradox is whether there is a way to properly refute Zeno's argument that motion is impossible without using the apparatus of calculus.

iv. The Moving Rows (The Stadium)

It takes a body moving at a given speed a certain amount of time to traverse a body of a fixed length. Passing the body again at that speed will take the same amount of time, provided the body’s length stays fixed. Zeno challenged this common reasoning. According to Aristotle (Physics 239b33-240a18), Zeno considered bodies of equal length aligned along three parallel racetracks within a stadium. One track contains A bodies (three A bodies are shown below); another contains B bodies; and a third contains C bodies. Each body is the same distance from its neighbors along its track. The A bodies are stationary, but the Bs are moving to the right, and the Cs are moving with the same speed to the left. Here are two snapshots of the situation, before and after.

[Diagram: Zeno's Moving Rows, showing the A, B, and C bodies before and after]

Zeno points out that, in the time between the before-snapshot and the after-snapshot, the leftmost C passes two Bs but only one A, contradicting the common sense assumption that the C should take longer to pass two Bs than one A. The usual way out of this paradox is to remark that Zeno mistakenly supposes that a moving body passes both moving and stationary objects with equal speed.

Aristotle argues that how long it takes to pass a body depends on the speed of the body; for example, if the body is coming towards you, then you can pass it in less time than if it is stationary. Today’s analysts agree with Aristotle’s diagnosis, and historically this paradox of motion has seemed weaker than the previous three. This paradox is also called “The Stadium,” but occasionally so is the Dichotomy Paradox.

Some analysts, such as Tannery (1887), believe Zeno may have intended the paradox to assume that space and time are discrete (quantized, atomized) as opposed to continuous, and that he meant his argument to challenge the coherence of this assumption about discrete space and time. Well, the paradox could be interpreted this way. Assume the three objects are adjacent to each other in their tracks or spaces; that is, the middle object is only one atom of space away from its neighbors. Then, if the Cs were moving at a speed of, say, one atom of space in one atom of time, the leftmost C would pass two atoms of B-space in the time it passed one atom of A-space, which is a contradiction to our assumption that the Cs move at a rate of one atom of space in one atom of time. Or else we’d have to say that in that atom of time, the leftmost C somehow got beyond two Bs by passing only one of them, which is also absurd (according to Zeno). Interpreted this way, Zeno’s argument produces a challenge to the idea that space and time are discrete. However, most commentators believe Zeno himself did not interpret his paradox this way.

b. Paradoxes of Plurality

Zeno's paradoxes of motion are attacks on the commonly held belief that motion is real, but because motion is a kind of plurality, namely a process along a plurality of places in a plurality of times, they are also attacks on this kind of plurality. Zeno offered more direct attacks on all kinds of plurality. The first is his Paradox of Alike and Unlike.

i. Alike and Unlike

According to Plato in Parmenides 127-9, Zeno argued that the assumption of plurality–the assumption that there are many things–leads to a contradiction. He quotes Zeno as saying: "If things are many, . . . they must be both like and unlike. But that is impossible; unlike things cannot be like, nor like things unlike" (Hamilton and Cairns (1961), 922).

Zeno's point is this. Consider a plurality of things, such as some people and some mountains. These things have in common the property of being heavy. But if they all have this property in common, then they really are all the same kind of thing, and so are not a plurality. They are a one. By this reasoning, Zeno believes it has been shown that the plurality is one (or the many is not many), which is a contradiction. Therefore, by reductio ad absurdum, there is no plurality, as Parmenides has always claimed.

Plato immediately accuses Zeno of equivocating. A thing can be like some other thing in one respect while being unlike it in a different respect. Your having a property in common with some other thing does not make you identical with that other thing. Consider again our plurality of people and mountains. People and mountains are all alike in being heavy, but are unlike in intelligence. And they are unlike in being mountains; the mountains are mountains, but the people are not. As Plato says, when Zeno tries to conclude "that the same thing is many and one, we shall [instead] say that what he is proving is that something is many and one [in different respects], not that unity is many or that plurality is one...." [129d] So, there is no contradiction, and the paradox is solved by Plato. This paradox is generally considered to be one of Zeno's weakest paradoxes, and it is now rarely discussed. [See Rescher (2001), pp. 94-6 for some discussion.]

ii. Limited and Unlimited

This paradox is also called the Paradox of Denseness. Suppose there exist many things rather than, as Parmenides would say, just one thing. Then there will be a definite or fixed number of those many things, and so they will be “limited.” But if there are many things, say two things, then they must be distinct, and to keep them distinct there must be a third thing separating them. So, there are three things. But between these, …. In other words, things are dense and there is no definite or fixed number of them, so they will be “unlimited.” This is a contradiction, because the plurality would be both limited and unlimited. Therefore, there are no pluralities; there exists only one thing, not many things. This argument is reconstructed from Zeno’s own words, as quoted by Simplicius in his commentary on Book 1 of Aristotle’s Physics.

According to the Standard Solution to this paradox, the weakness of Zeno’s argument can be said to lie in the assumption that “to keep them distinct, there must be a third thing separating them.” Zeno would have been correct to say that between any two physical objects that are separated in space, there is a place between them, because space is dense, but he is mistaken to claim that there must be a third physical object there between them. Two objects can be distinct at a time simply by one having a property the other does not have.

iii. Large and Small

Suppose there exist many things rather than, as Parmenides says, just one thing. Then every part of any plurality is both so small as to have no size but also so large as to be infinite, says Zeno. His reasoning for why they have no size has been lost, but many commentators suggest that he’d reason as follows. If there is a plurality, then it must be composed of parts which are not themselves pluralities. Yet things that are not pluralities cannot have a size or else they’d be divisible into parts and thus be pluralities themselves.

Now, why are the parts of pluralities so large as to be infinite? Well, the parts cannot be so small as to have no size since adding such things together would never contribute anything to the whole so far as size is concerned. So, the parts have some non-zero size. If so, then each of these parts will have two spatially distinct sub-parts, one in front of the other. Each of these sub-parts also will have a size. The front part, being a thing, will have its own two spatially distinct sub-parts, one in front of the other; and these two sub-parts will have sizes. Ditto for the back part. And so on without end. A sum of all these sub-parts would be infinite. Therefore, each part of a plurality will be so large as to be infinite.
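The dilemma just reconstructed can be summarized as a pair of sums, together with the modern escape between the two horns. This is a sketch; the halves 1/2, 1/4, 1/8, … of a unit whole are an illustration, not Zeno's own numbers.

```latex
% Horn 1: if every one of the infinitely many parts has zero size,
% the whole has zero size.
\sum_{n=1}^{\infty} 0 = 0
% Horn 2: if every part has some one fixed size \varepsilon > 0,
% the whole is infinitely large.
\sum_{n=1}^{\infty} \varepsilon = \infty \qquad (\varepsilon > 0)
% The Standard Solution slips between the horns: the nested sub-parts
% may have different, ever-smaller non-zero sizes with a finite sum.
\sum_{n=1}^{\infty} \frac{1}{2^n} = 1
```

Zeno's argument tacitly assumes that the only options are the first two sums; the convergent series is the option his reasoning overlooks.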

This sympathetic reconstruction of the argument is based on Simplicius’ On Aristotle’s Physics, where Simplicius quotes Zeno’s own words for part of the paradox, although he does not say what he is quoting from.

There are many errors here in Zeno’s reasoning, according to the Standard Solution. He is mistaken at the beginning when he says, “If there is a plurality, then it must be composed of parts which are not themselves pluralities.” A university is an illustrative counterexample. A university is a plurality of students, but we need not rule out the possibility that a student is a plurality. What’s a whole and what’s a plurality depends on our purposes. When we consider a university to be a plurality of students, we consider the students to be wholes without parts. But for another purpose we might want to say that a student is a plurality of biological cells. Zeno is confused about this notion of relativity, and about part-whole reasoning; and as commentators began to appreciate this they lost interest in Zeno as a player in the great metaphysical debate between pluralism and monism.

A second error occurs in arguing that each part of a plurality must have a non-zero size. In 1901, Henri Lebesgue showed how to properly define the measure function so that a line segment has nonzero measure even though (the singleton set of) any point has a zero measure. The measure of the line segment [a, b] is b - a; the measure of a cube with side a is a³. Lebesgue’s theory is our current civilization’s theory of measure, and thus of length, volume, duration, mass, voltage, brightness, and other continuous magnitudes.
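The following are standard facts about Lebesgue measure, sketched here to show exactly where the inference from zero-size points to a zero-size segment breaks down:

```latex
% Any single point p has measure zero.
\mu(\{p\}) = 0
% Measure is only countably additive: for pairwise disjoint sets
% A_1, A_2, \dots
\mu\!\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu(A_n)
% A segment [a,b] is an uncountable union of its points, so countable
% additivity never licenses summing the zero measures of its points;
% instead the measure of the segment is simply
\mu([a,b]) = b - a
```

Because additivity holds only for countably many pieces, there is no valid step from "each point has zero size" to "the segment has zero size," and Zeno's assumption that sizeless parts cannot compose a sized whole fails.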

Thanks to Aristotle’s support, Zeno’s Paradoxes of Large and Small and of Infinite Divisibility (to be discussed below) were generally considered to have shown that a continuous magnitude cannot be composed of points. Interest was rekindled in this topic in the 18th century. The physical objects in Newton’s classical mechanics of 1726 were interpreted by R. J. Boscovich in 1763 as being collections of point masses. Each point mass is a movable point carrying a fixed mass. This idealization of continuous bodies as if they were compositions of point particles was very fruitful; it could be used to easily solve otherwise very difficult problems in physics. This success led scientists, mathematicians, and philosophers to recognize that the strength of Zeno’s Paradoxes of Large and Small and of Infinite Divisibility had been overestimated; they did not prevent a continuous magnitude from being composed of points.

iv. Infinite Divisibility

This is the most challenging of all the paradoxes of plurality. Consider the difficulties that arise if we assume that an object theoretically can be divided into a plurality of parts. According to Zeno, there is a reassembly problem. Imagine cutting the object into two non-overlapping parts, then similarly cutting these parts into parts, and so on until the process of repeated division is complete. Assuming the hypothetical division is “exhaustive” or does come to an end, then at the end we reach what Zeno calls “the elements.” Here there is a problem about reassembly. There are three possibilities. (1) The elements are nothing. In that case the original object will be a composite of nothing, and so the whole object will be a mere appearance, which is absurd. (2) The elements are something, but they have zero size. So, the original object is composed of elements of zero size. Adding an infinity of zeros yields a zero sum, so the original object had no size, which is absurd. (3) The elements are something, but they do not have zero size. If so, these can be further divided, and the process of division was not complete after all, which contradicts our assumption that the process was already complete. In summary, there were three possibilities, but all three possibilities lead to absurdity. So, objects are not divisible into a plurality of parts.

Simplicius says this argument is due to Zeno even though it is in Aristotle (On Generation and Corruption, 316a15-34, 316b34 and 325a8-12) and is not attributed there to Zeno, which is odd. Aristotle says the argument convinced the atomists to reject infinite divisibility. The argument has been called the Paradox of Parts and Wholes, but it has no traditional name.

The Standard Solution says we first should ask Zeno to be clearer about what he is dividing. Is it concrete or abstract? When dividing a concrete, material stick into its components, we reach ultimate constituents of matter such as quarks and electrons that cannot be further divided. These have a size of zero (according to quantum electrodynamics), but it is incorrect to conclude that the whole stick has no size if its constituents have zero size. [Due to the forces involved, point particles have finite “cross sections,” and configurations of those particles, such as atoms, do have finite size.] So, Zeno is wrong here. On the other hand, is Zeno dividing an abstract path or trajectory? Let's assume he is, since this produces a more challenging paradox. If so, then choice (2) above is the one to think about. It's the one that talks about addition of zeroes. Let's assume the object is one-dimensional, like a path. According to the Standard Solution, this "object" that gets divided should be considered to be a continuum with its elements arranged into the order type of the linear continuum, and we should use Lebesgue's notion of measure to find the size of the object. The size (length, measure) of a point-element is zero, but Zeno is mistaken in saying the total size (length, measure) of all the zero-size elements is zero. The size of the object is determined instead by the difference in coordinate numbers assigned to the end points of the object. An object extending along a straight line that has one of its end points at one meter from the origin and its other end point at three meters from the origin has a size of two meters and not zero meters. So, there is no reassembly problem, and a crucial step in Zeno's argument breaks down.

c. Other Paradoxes

i. The Grain of Millet

There are two common interpretations of this paradox. According to the first, which is the standard interpretation, when a bushel of millet (or wheat) grains falls out of its container and crashes to the floor, it makes a sound. Since the bushel is composed of individual grains, each individual grain also makes a sound, as should each thousandth part of the grain, and so on to its ultimate parts. But this result contradicts the fact that we actually hear no sound for portions like a thousandth part of a grain, and so we surely would hear no sound for an ultimate part of a grain. Yet, how can the bushel make a sound if none of its ultimate parts make a sound? The original source of this argument is Aristotle’s Physics (250a19-21). There seems to be an appeal to the iterative rule that if a millet grain or part of a grain makes a sound, then so should the next smaller part.

We do not have Zeno’s words on what conclusion we are supposed to draw from this. Perhaps he would conclude it is a mistake to suppose that whole bushels of millet have millet parts. This is an attack on plurality.

The Standard Solution to this interpretation of the paradox accuses Zeno of mistakenly assuming that there is no lower bound on the size of something that can make a sound. There is no problem, we now say, with parts having very different properties from the wholes that they constitute. The iterative rule is initially plausible but ultimately not trustworthy, and Zeno is committing both the fallacy of division and the fallacy of composition.

Some analysts interpret Zeno’s paradox a second way, as challenging our trust in our sense of hearing, as follows. When a bushel of millet grains crashes to the floor, it makes a sound. The bushel is composed of individual grains, so they, too, make an audible sound. But if you drop an individual millet grain or a small part of one or an even smaller part, then eventually your hearing detects no sound, even though there is one. Therefore, you cannot trust your sense of hearing.

This reasoning about our not detecting low amplitude sounds is similar to making the mistake of arguing that you cannot trust your thermometer because there are some ranges of temperature that it is not sensitive to. So, on this second interpretation, the paradox is also easy to solve. One reason given in the literature for believing that this second interpretation is not the one that Zeno had in mind is that Aristotle’s criticism given below applies to the first interpretation and not the second, and it is unlikely that Aristotle would have misinterpreted the paradox.

ii. Against Place

Given an object, we may assume that there is a single, correct answer to the question, “What is its place?” Because everything that exists has a place, and because place itself exists, so it also must have a place, and so on forever. That’s too many places, so there is a contradiction. The original source is Aristotle’s Physics (209a23-25 and 210b22-24).

The standard response to Zeno’s Paradox Against Place is to deny that places have places, and to point out that the notion of place should be relative to reference frame. But Zeno’s assumption that places have places was common in ancient Greece at the time, and Zeno is to be praised for showing that it is a faulty assumption.

4. Aristotle’s Treatment of the Paradoxes

Aristotle’s views about Zeno’s paradoxes can be found in Physics, book 4, chapter 2, and book 6, chapters 2 and 9. Regarding the Dichotomy Paradox, Aristotle is to be applauded for his insight that the runner has time to reach his goal because during the run ever shorter paths take correspondingly ever shorter times.

Aristotle had several criticisms of Zeno. Regarding the paradoxes of motion, he complained that Zeno should not suppose the runner's path is dependent on its parts; instead, the path is there first, and the parts are constructed by the analyst. His second complaint was that Zeno should not suppose that lines contain points. Aristotle's third and most influential critical idea involves a complaint about potential infinity. On this point, in remarking about the Achilles Paradox, Aristotle said, “Zeno’s argument makes a false assumption in asserting that it is impossible for a thing to pass over…infinite things in a finite time.” Aristotle believes it is impossible for a thing to pass over an actually infinite number of things in a finite time, but that it is possible for a thing to pass over a potentially infinite number of things in a finite time. Here is how Aristotle expressed the point:

For motion…, although what is continuous contains an infinite number of halves, they are not actual but potential halves. (Physics 263a25-27). …Therefore to the question whether it is possible to pass through an infinite number of units either of time or of distance we must reply that in a sense it is and in a sense it is not. If the units are actual, it is not possible: if they are potential, it is possible. (Physics 263b2-5).

Actual infinities are also called completed infinities. A potential infinity could never become an actual infinity. Aristotle believed the concept of actual infinity is perhaps not coherent, and so not real either in mathematics or in nature. He believes that actual infinities are not real because, if one were to exist, its infinity of parts would have to exist all at once, which he believed is impossible. Potential infinities exist over time, as processes that always can be continued at a later time. That's the only kind of infinity that could be real, thought Aristotle. A potential infinity is an unlimited iteration of some operation—unlimited in time. Aristotle claimed correctly that if Zeno were not to have used the concept of actual infinity, the paradoxes of motion such as the Achilles Paradox (and the Dichotomy Paradox) could not be created.

Here is why rejecting the concept of actual infinity is a way out of these paradoxes. Zeno said that to go from the start to the finish line, the runner Achilles must reach the place that is halfway-there, then after arriving at this place he still must reach the place that is half of that remaining distance, and after arriving there he must again reach the new place that is now halfway to the goal, and so on. These are too many places to reach. Zeno made the mistake, according to Aristotle, of supposing that this infinite process needs completing when it really does not; the finitely long path from start to finish exists undivided for the runner, and it is the mathematician who is demanding the completion of such a process. Without that concept of a completed infinity there is no paradox. Aristotle is correct about this being a treatment that avoids paradox. Today’s standard treatment of the Achilles paradox disagrees with Aristotle's way out of the paradox and says Zeno was correct to use the concept of a completed infinity and to imply the runner must go to an actual infinity of places in a finite time.

From what Aristotle says, one can read between the lines that he believes there is another reason to reject actual infinities: doing so is the only way out of these paradoxes of motion. Today we know better. There is another way out, namely the Standard Solution, which uses actual infinities, Cantor's transfinite sets.

Aristotle’s treatment by disallowing actual infinity while allowing potential infinity was clever, and it satisfied nearly all scholars for 1,500 years, being buttressed during that time by the Church's doctrine that only God is actually infinite. George Berkeley, Immanuel Kant, Carl Friedrich Gauss, and Henri Poincaré were influential defenders of potential infinity. Leibniz accepted actual infinitesimals, but other mathematicians and physicists in European universities during these centuries were careful to distinguish between actual and potential infinities and to avoid using actual infinities.

Given 1,500 years of opposition to actual infinities, the burden of proof was on anyone advocating them. Bernard Bolzano and Georg Cantor accepted this burden in the 19th century. The key idea is to see a potentially infinite set as a variable quantity that is dependent on being abstracted from a pre-existing actually infinite set. Bolzano argued that the natural numbers should be conceived of as a set, a determinate set, not one with a variable number of elements. Cantor argued that any potential infinity must be interpreted as varying over a predefined fixed set of possible values, a set that is actually infinite. He put it this way:

In order for there to be a variable quantity in some mathematical study, the “domain” of its variability must strictly speaking be known beforehand through a definition. However, this domain cannot itself be something variable…. Thus this “domain” is a definite, actually infinite set of values. Thus each potential infinite…presupposes an actual infinite. (Cantor 1887)

From this standpoint, Dedekind’s 1872 axiom of continuity and his definition of real numbers as certain infinite subsets of rational numbers suggested to Cantor and then to many other mathematicians that arbitrarily large sets of rational numbers are most naturally seen to be subsets of an actually infinite set of rational numbers. The same can be said for sets of real numbers. An actually infinite set is what we today call a "transfinite set." Cantor's idea is then to treat a potentially infinite set as being a sequence of definite subsets of a transfinite set. Aristotle had said mathematicians need only the concept of a finite straight line that may be produced as far as they wish, or divided as finely as they wish, but Cantor would say that this way of thinking presupposes a completed infinite continuum from which that finite line is abstracted at any particular time.

[When Cantor says the mathematical concept of potential infinity presupposes the mathematical concept of actual infinity, this does not imply that, if future time were to be potentially infinite, then future time also would be actually infinite.]

Dedekind’s primary contribution to our topic was to give the first rigorous definition of an infinite set—an actual infinity—showing that the notion is useful and not self-contradictory. Cantor provided the missing ingredient—that the mathematical line can fruitfully be treated as a dense linear ordering of uncountably many points—and he went on to develop set theory and to give the continuum a set-theoretic basis which convinced mathematicians that the concept was rigorously defined.

These ideas now form the basis of modern real analysis. The implication for the Achilles and Dichotomy paradoxes is this: once the rigorous definition of a linear continuum is in place, and once we have Cauchy’s rigorous theory of how to assess the value of an infinite series, we can point to the successful use of calculus in physical science, especially in the treatment of time and of motion through space, and say that the sequence of intervals or paths described by Zeno is most properly treated as a sequence of subsets of an actually infinite set [that is, Aristotle’s potential infinity of places that Achilles reaches is really a variable subset of an already existing, actually infinite set of point places]. We can then be confident that Aristotle’s treatment of the paradoxes is inferior to the Standard Solution’s.

Zeno said Achilles cannot achieve his goal in a finite time, but there is no record of the details of how he defended this conclusion. He might have said the reason is (i) that there is no last goal in the sequence of sub-goals, or perhaps (ii) that it would take too long to achieve all the sub-goals, or perhaps (iii) that covering all the sub-paths is too great a distance to run. Zeno might have offered all these defenses. In attacking justification (ii), Aristotle objects that, if Zeno were to confine his notion of infinity to a potential infinity and were to reject the idea of zero-length sub-paths, then Achilles achieves his goal in a finite time, so this is a way out of the paradox. However, an advocate of the Standard Solution says Achilles achieves his goal by covering an actual infinity of paths in a finite time, and this is the way out of the paradox. (Whether Achilles can properly be described as completing an actual infinity of tasks rather than goals is considered in Section 5c.) Aristotle’s treatment of the paradoxes is criticized mainly for being inconsistent with current standard real analysis, which is based upon Zermelo-Fraenkel set theory and its actually infinite sets. To summarize the errors of Zeno and Aristotle in the Achilles Paradox and in the Dichotomy Paradox: both made the mistake of thinking that if a runner has to cover an actually infinite number of sub-paths to reach his goal, then he will never reach it. Calculus shows how Achilles can do this and reach his goal in a finite time, and the fruitfulness of the tools of calculus implies that the Standard Solution is a better treatment than Aristotle’s.
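The arithmetic behind this solution can be made concrete. In the Dichotomy, the sub-paths have lengths 1/2, 1/4, 1/8, … of the whole course, and Cauchy’s theory assigns this series the finite sum 1. A minimal Python sketch (the function name `partial_sum` is our own, purely for illustration) shows the partial sums converging:

```python
def partial_sum(n):
    """Sum of the first n terms of Zeno's series: 1/2 + 1/4 + ... + (1/2)**n."""
    return sum((1 / 2) ** k for k in range(1, n + 1))

# The partial sums 0.5, 0.75, 0.875, ... approach 1 from below.
for n in (1, 2, 3, 10, 50):
    print(n, partial_sum(n))
```

Infinitely many positive terms, yet a bounded total: this is the mathematical fact unavailable to Zeno and Aristotle.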

Let’s turn to the other paradoxes. In proposing his treatment of the Paradox of the Large and Small and of the Paradox of Infinite Divisibility, Aristotle said that

…a line cannot be composed of points, the line being continuous and the point indivisible. (Physics, 231a 25)

In modern real analysis, a continuum is composed of points, but Aristotle, ever the advocate of common sense reasoning, claimed that a continuum cannot be composed of points. Aristotle believed a line can be composed only of smaller, indefinitely divisible lines and not of points without magnitude. Similarly a distance cannot be composed of point places and a duration cannot be composed of instants. This is one of Aristotle’s key errors, according to advocates of the Standard Solution, because by maintaining this common sense view he created an obstacle to the fruitful development of real analysis. In addition to complaining about points, Aristotelians object to the idea of an actual infinite number of them.

In his analysis of the Arrow Paradox, Aristotle said Zeno mistakenly assumes time is composed of indivisible moments, but “This is false, for time is not composed of indivisible moments any more than any other magnitude is composed of indivisibles.” (Physics, 239b8-9) Zeno needs those instantaneous moments; that way Zeno can say the arrow does not move during the moment. Aristotle recommends not allowing Zeno to appeal to instantaneous moments and instead restricting Zeno to saying that motion can be divided only into a potential infinity of intervals. That restriction implies the arrow’s path can be divided only into finitely many intervals at any time. So, at any time, there is a finite interval during which the arrow can exhibit motion by changing location. So the arrow flies, after all. That is, Aristotle declares Zeno’s argument is based on false assumptions without which there is no problem with the arrow’s motion. However, the Standard Solution agrees with Zeno that time can be composed of indivisible moments or instants, and it implies that Aristotle has misdiagnosed where the error lies in the Arrow Paradox. Advocates of the Standard Solution would add that allowing a duration to be composed of indivisible moments is exactly what is needed for a fruitful calculus, and Aristotle’s recommendation is an obstacle to the development of calculus.

Aristotle’s treatment of The Paradox of the Moving Rows is basically in agreement with the Standard Solution to that paradox–that Zeno did not appreciate the difference between speed and relative speed.

Regarding the Paradox of the Grain of Millet, Aristotle said that parts need not have all the properties of the whole, and so grains need not make sounds just because bushels of grains do. (Physics, 250a, 22) And if the parts make no sounds, we should not conclude that the whole can make no sound. It would have been helpful for Aristotle to have said more about what are today called the Fallacies of Division and Composition that Zeno is committing. However, Aristotle’s response to the Grain of Millet is brief but accurate by today’s standards.

In conclusion, are there two adequate but different solutions to Zeno’s paradoxes, Aristotle’s Solution and the Standard Solution? No. Aristotle’s treatment does not stand up to criticism in a manner that most scholars deem adequate. The Standard Solution uses contemporary concepts that have proved to be more valuable for solving and resolving so many other problems in mathematics and physics. Replacing Aristotle’s common sense concepts with the new concepts from real analysis and classical mechanics has been a key ingredient in the successful development of mathematics and science in recent centuries, and for this reason the vast majority of scientists, mathematicians, and philosophers reject Aristotle's treatment. Nevertheless, there is a significant minority in the philosophical community who do not agree, as we shall see in the sections that follow.

5. Other Issues Involving the Paradoxes

a. Consequences of Accepting the Standard Solution

There is a price to pay for accepting the Standard Solution to Zeno’s Paradoxes. The following–once presumably safe–intuitions or assumptions must be rejected:

  1. A continuum is too smooth to be divisible into point elements.
  2. Runners do not have time to go to an actual infinity of places in a finite time.
  3. The sum of an infinite series of positive terms is always infinite.
  4. For each instant there is a next instant and for each place along a line there is a next place.
  5. A finite distance along a line cannot contain an actually infinite number of points.
  6. The more points there are on a line, the longer the line is.
  7. It is absurd for there to be numbers that are bigger than every integer.
  8. A one-dimensional curve cannot fill a two-dimensional area, nor can an infinitely long curve enclose a finite area.
  9. A whole is always greater than any of its parts.

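Intuition (9) is the one modern set theory rejects most directly. Cantor counts two sets as the same size when a one-to-one pairing between them exists, and the map n ↦ 2n pairs the natural numbers with the even numbers, a proper part of them. The following Python sketch (necessarily only a finite window onto the infinite pattern) illustrates the pairing:

```python
# Cantor's criterion: two sets are equinumerous when a one-to-one pairing
# exists. The map n -> 2*n pairs every natural number with an even number,
# even though the evens are a proper part of the naturals.
window = range(10)  # a finite window onto the infinite pattern
pairing = {n: 2 * n for n in window}

print(pairing)
# Injective on the window: no two naturals share an even partner.
assert len(set(pairing.values())) == len(pairing)
```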
Item (8) was undermined when it was discovered that the continuum implies the existence of fractal curves. However, the loss of intuition (1) has caused the greatest stir because so many philosophers object to a continuum being constructed from points. The Austrian philosopher Franz Brentano believed with Aristotle that scientific theories should be literal descriptions of reality, as opposed to today’s more popular view that theories are idealizations or approximations of reality. Continuity is something given in perception, said Brentano, and not in a mathematical construction; therefore, mathematics misrepresents. In a 1905 letter to Husserl, he said, “I regard it as absurd to interpret a continuum as a set of points.”

But the Standard Solution needs to be thought of as a package to be evaluated in terms of all of its costs and benefits. From this perspective the Standard Solution’s point-set analysis of continua has withstood the criticism and demonstrated its value in mathematics and mathematical physics. As a consequence, advocates of the Standard Solution say we must live with rejecting the nine intuitions listed above, and accept the counterintuitive implications such as there being divisible continua, infinite sets of different sizes, and space-filling curves. They agree with the philosopher W. V. O. Quine, who demands that we be conservative when revising the system of claims that we believe and who recommends “minimum mutilation.” Advocates of the Standard Solution say no less mutilation will work satisfactorily.

b. Criticisms of the Standard Solution

Balking at having to reject so many of our intuitions, the 20th century philosophers Henri-Louis Bergson, Max Black, Franz Brentano, L. E. J. Brouwer, Solomon Feferman, William James, James Thomson, and Alfred North Whitehead argued in different ways that the standard mathematical account of continuity does not apply to physical processes, or is improper for describing those processes. Here are their main reasons: (1) the actual infinite cannot be encountered in experience and thus is unreal, (2) human intelligence is not capable of understanding motion, (3) the sequence of tasks that Achilles performs is finite, and the illusion that it is infinite is due to mathematicians who confuse their mathematical representations with what is represented, (4) motion is unitary even though its spatial trajectory is infinitely divisible, (5) treating time as being made of instants is to treat time as static rather than as the dynamic aspect of consciousness that it truly is, (6) actual infinities and the contemporary continuum are not indispensable to solving the paradoxes, and (7) the Standard Solution’s implicit assumption of the primacy of the coherence of the sciences is unjustified because coherence with a priori knowledge and common sense is primary.

See Salmon (1970, Introduction) and Feferman (1998) for a discussion of the controversy about the quality of Zeno’s arguments, and an introduction to its vast literature. This controversy is much less actively pursued in today’s mathematical literature, and hardly at all in today’s scientific literature. A minority of philosophers are actively involved in an attempt to retain one or more of the nine intuitions listed in section 5a above. An important philosophical issue is whether the paradoxes should be solved by the Standard Solution or instead by assuming that a line is not composed of points but of intervals, and whether use of infinitesimals is essential to a proper understanding of the paradoxes.

c. Supertasks and Infinity Machines

Zeno’s Paradox of Achilles was presented as implying that he will never catch the tortoise because the sequence of goals to be achieved has no final member. In that presentation, use of the terms “task” and “act” was intentionally avoided, but there are interesting questions that do use those terms. In reaching the tortoise, Achilles does not cover an infinite distance, but he does cover an infinite number of distances. In doing so, does he need to complete an infinite sequence of tasks or actions? In other words, assuming Achilles does complete the task of reaching the tortoise, does he thereby complete a supertask, a transfinite number of tasks in a finite time?

Bertrand Russell said “yes.” He argued that it is possible to perform a task in one-half minute, then perform another task in the next quarter-minute, and so on, for a full minute. At the end of the minute, an infinite number of tasks would have been performed. In fact, Achilles does this in catching the tortoise. In the mid-twentieth century, Hermann Weyl, Max Black, and others objected, and thus began an ongoing controversy about the number of tasks that can be completed in a finite time.

That controversy has sparked a related discussion about whether there could be a machine that can perform an infinite number of tasks in a finite time. A machine that can is called an infinity machine. In 1954, in an effort to undermine Russell’s argument, the philosopher James Thomson described a lamp that is intended to be a typical infinity machine. Let the machine switch the lamp on for a half-minute; then switch it off for a quarter-minute; then on for an eighth-minute; off for a sixteenth-minute; and so on. Would the lamp be lit or dark at the end of the minute? Thomson argued that it must be one or the other, but it cannot be either because every period in which it is off is followed by a period in which it is on, and vice versa, so there can be no such lamp, and the specific mistake in the reasoning was to suppose that it is logically possible to perform a supertask. The implication for Zeno’s paradoxes is that, although Thomson is not denying Achilles catches the tortoise, he is denying Russell’s description of Achilles’ task as being the completion of an infinite number of sub-tasks in a finite time.

Paul Benacerraf (1962) complains that Thomson’s reasoning is faulty because it fails to notice that the initial description of the lamp determines the state of the lamp at each period in the sequence of switching, but it determines nothing about the state of the lamp at the limit of the sequence. The lamp could be either on or off at the limit. The limit of the infinite converging sequence is not in the sequence. So, Thomson has not established the logical impossibility of completing this supertask.
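Benacerraf’s point can be restated computationally: the lamp’s description fixes the state after every finite number of switches and fixes the switching times, which converge to one minute, but it assigns nothing at the limit itself. A toy Python sketch (the function names are our own) makes the two sequences explicit:

```python
from fractions import Fraction

def switch_time(n):
    """Time of the n-th switch, in minutes: 1/2, 3/4, 7/8, ..., i.e. 1 - (1/2)**n."""
    return 1 - Fraction(1, 2) ** n

def state_after(n):
    """Lamp state after the n-th switch; it starts OFF and the first switch turns it ON."""
    return "on" if n % 2 == 1 else "off"

# The switching times converge to 1, but the states alternate forever:
# on, off, on, off, ...  Nothing here determines a state AT t = 1.
for n in (1, 2, 3, 4):
    print(switch_time(n), state_after(n))
```

The times converge; the states do not, which is why the description is silent about the lamp at t = 1.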

Could some other argument establish this impossibility? Benacerraf suggests that an answer depends on what we ordinarily mean by the term “completing a task.” If the meaning does not require that tasks have minimum times for their completion, then maybe Russell is right that some supertasks can be completed, he says; but if a minimum time is always required, then Russell is mistaken because an infinite time would be required. What is needed is a better account of the meaning of the term “task.” Grünbaum objects to Benacerraf’s reliance on ordinary meaning. “We need to heed the commitments of ordinary language,” says Grünbaum, “only to the extent of guarding against being victimized or stultified by them.”

The Thomson Lamp has generated a great literature in recent philosophy. Here are some of the issues. What is the proper definition of “task”? For example, does it require a minimum amount of time, and does it require a minimum amount of work, in the physicists’ technical sense of that term? Even if it is physically impossible to flip the switch in Thomson’s lamp, suppose physics were different and there were no limit on speed; what then? Is the lamp logically impossible? Is the lamp metaphysically impossible, even if it is logically possible? Was it proper of Thomson to suppose that the question of whether the lamp is lit or dark at the end of the minute must have a determinate answer? Does Thomson’s question have no answer, given the initial description of the situation, or does it have an answer which we are unable to compute? Should we conclude that it makes no sense to divide a finite task into an infinite number of ever shorter sub-tasks? Even if completing a countable infinity of tasks in a finite time is physically possible (such as when Achilles runs to the tortoise), is completing an uncountable infinity also possible? Interesting issues arise when we bring in Einstein’s theory of relativity and consider a bifurcated supertask. This is an infinite sequence of tasks in a finite interval of an external observer’s proper time, but not in the machine’s own proper time. See Earman and Norton (1996) for an introduction to the extensive literature on these topics. Unfortunately, there is no agreement in the philosophical community on most of the questions we’ve just entertained.

d. Constructivism

The spirit of Aristotle’s opposition to actual infinities persists today in the philosophy of mathematics called constructivism. Constructivism is not a precisely defined position, but it implies that acceptable mathematical objects and procedures have to be founded on constructions and not, say, on assuming the object does not exist, then deducing a contradiction from that assumption. Most constructivists believe acceptable constructions must be performable ideally by humans independently of practical limitations of time or money. So they would say potential infinities, recursive functions, mathematical induction, and Cantor’s diagonal argument are constructive, but the following are not: the axiom of choice, the law of excluded middle, the law of double negation, completed infinities, and the classical continuum of the Standard Solution. The implication is that Zeno’s Paradoxes were not solved correctly by using the methods of the Standard Solution. More conservative constructivists, the finitists, would go even further and reject potential infinities because of human beings’ finite computational resources, but this conservative sub-group of constructivists is very much out of favor.

L. E. J. Brouwer’s intuitionism was the leading constructivist theory of the early 20th century. In response to suspicions raised by the discovery of Russell’s Paradox and the introduction into set theory of the controversial non-constructive axiom of choice, Brouwer attempted to place mathematics on what he believed to be a firmer epistemological foundation by arguing that mathematical concepts are admissible only if they can be constructed from, and thus grounded in, an ideal mathematician’s vivid temporal intuitions, the a priori intuitions of time. Brouwer’s intuitionistic continuum has the Aristotelian property of unsplitability. The Standard Solution’s set-theoretic continuum allows, say, the closed interval of real numbers from zero to one to be split or cut into (that is, be the union of) the set of numbers in the interval that are less than one-half and the set of those that are greater than or equal to one-half; the corresponding closed interval of the intuitionistic continuum cannot be split this way into two disjoint sets. This unsplitability or inseparability agrees in spirit with Aristotle’s idea of the continuity of a real continuum, but disagrees in spirit with Aristotle by allowing the continuum to be composed of points. [Posy (2005) 346-7]

Although everyone agrees that any legitimate mathematical proof must use only a finite number of steps and be constructive in that sense, the majority of mathematicians in the first half of the twentieth century claimed that constructive mathematics could not produce an adequate theory of the continuum because essential theorems will no longer be theorems, and constructivist principles and procedures are too awkward to use successfully. In 1927, David Hilbert exemplified this attitude when he objected that Brouwer’s restrictions on allowable mathematics–such as rejecting proof by contradiction–were like taking the telescope away from the astronomer.

But thanks in large part to the later development of constructive mathematics by Errett Bishop and Douglas Bridges in the second half of the 20th century, most contemporary philosophers of mathematics believe the question of whether constructivism could be successful in the sense of producing an adequate theory of the continuum is still open [see Wolf (2005) p. 346, and McCarty (2005) p. 382], and to that extent so is the question of whether the Standard Solution to Zeno’s Paradoxes needs to be rejected or perhaps revised to embrace constructivism. Frank Arntzenius (2000), Michael Dummett (2000), and Solomon Feferman (1998) have done important philosophical work to promote the constructivist tradition. Nevertheless, the vast majority of today’s practicing mathematicians routinely use nonconstructive mathematics.

e. Nonstandard Analysis

Although Zeno and Aristotle had the concept of small, they did not have the concept of infinitesimally small, which is the informal concept that was used by Leibniz (and Newton) in the development of calculus. In the 19th century, infinitesimals were eliminated from the standard development of calculus due to the work of Cauchy and Weierstrass on defining a derivative in terms of limits using the epsilon-delta method. But in 1881, C. S. Peirce advocated restoring infinitesimals because of their intuitive appeal. Unfortunately, he was unable to work out the details, as were all mathematicians, until 1960 when Abraham Robinson produced his nonstandard analysis. At that point it was no longer reasonable to say that banishing infinitesimals from analysis was an intellectual advance. What Robinson did was to extend the standard real numbers to include infinitesimals, using this definition: h is infinitesimal if and only if its absolute value is less than 1/n, for every positive standard number n. Robinson went on to create a nonstandard model of analysis using hyperreal numbers. The class of hyperreal numbers contains counterparts of the reals, but in addition it contains any number that is the sum or difference of a standard real number and an infinitesimal, such as 3 + h and 3 – 4h². The reciprocal of an infinitesimal is an infinite hyperreal number. These hyperreals obey the usual rules of real numbers except for the Archimedean axiom. Infinitesimal distances between distinct points are allowed, unlike with standard real analysis. The derivative is defined in terms of the ratio of infinitesimals, in the style of Leibniz, rather than in terms of a limit as in standard real analysis in the style of Weierstrass.
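Robinson’s construction requires model-theoretic machinery, but the flavor of Leibniz-style differentiation, reading f′(x) off f(x + ε) for infinitesimal ε, can be imitated with dual numbers. The sketch below is our own toy illustration, not Robinson’s hyperreals: here ε is nilpotent (ε² = 0) rather than a genuine nonzero infinitesimal with a reciprocal.

```python
class Dual:
    """Dual number a + b*eps with eps**2 == 0 -- a toy stand-in for an
    infinitesimal, not Robinson's hyperreals (which need model theory)."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a1 + b1*eps)(a2 + b2*eps) = a1*a2 + (a1*b2 + b1*a2)*eps
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)
    __rmul__ = __mul__

def derivative(f, x):
    """Leibniz style: f(x + eps) = f(x) + f'(x)*eps; read off the eps part."""
    return f(Dual(x, 1.0)).b

# d/dx (x**2 + 3x) at x = 2 is 2*2 + 3 = 7
print(derivative(lambda x: x * x + 3 * x, 2.0))  # 7.0
```

The derivative appears as the coefficient of ε, with no limit taken, which is the intuition nonstandard analysis makes rigorous.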

Nonstandard analysis is called “nonstandard” because it was inspired by Thoralf Skolem’s demonstration in 1933 of the existence of models of first-order arithmetic that are not isomorphic to the standard model of arithmetic. What makes them nonstandard is, above all, that they contain infinitely large (hyper)integers. For nonstandard calculus one needs nonstandard models of real analysis rather than just of arithmetic. An important feature demonstrating the usefulness of nonstandard analysis is that it achieves essentially the same theorems as those in classical calculus. The treatment of Zeno’s paradoxes is interesting from this perspective. See McLaughlin (1994) for how Zeno’s paradoxes may be treated using infinitesimals. McLaughlin believes this approach to the paradoxes is the only successful one, but commentators generally do not agree with that conclusion, and consider it merely to be an alternative solution. See Dainton (2010) pp. 306-9 for some discussion of this.

f. Smooth Infinitesimal Analysis

Abraham Robinson in the 1960s resurrected the infinitesimal as an infinitesimal number, but F. W. Lawvere in the 1970s resurrected the infinitesimal as an infinitesimal magnitude. His work is called “smooth infinitesimal analysis” and is part of “synthetic differential geometry.” In smooth infinitesimal analysis, a curved line is composed of infinitesimal tangent vectors. One significant difference from a nonstandard analysis, such as Robinson’s above, is that all smooth curves are straight over infinitesimal distances, whereas Robinson’s can curve over infinitesimal distances. In smooth infinitesimal analysis, Zeno’s arrow does not have time to change its speed during an infinitesimal interval. Smooth infinitesimal analysis retains the intuition that a continuum should be smoother than the continuum of the Standard Solution. Unlike both standard analysis and nonstandard analysis whose real number systems are set-theoretical entities and are based on classical logic, the real number system of smooth infinitesimal analysis is not a set-theoretic entity but rather an object in a topos of category theory, and its logic is intuitionist. (Harrison, 1996, p. 283) Like Robinson’s nonstandard analysis, Lawvere’s smooth infinitesimal analysis may also be a promising approach to a foundation for real analysis and thus to solving Zeno’s paradoxes, but there is no consensus that Zeno’s Paradoxes need to be solved this way. For more discussion see note 11 in Dainton (2010) pp. 420-1.

6. The Legacy and Current Significance of the Paradoxes

What influence has Zeno had? He had none in the East, but in the West there has been continued influence and interest up to today.

Let’s begin with his influence on the ancient Greeks. Before Zeno, philosophers expressed their philosophy in poetry, and he was the first philosopher to use prose arguments. This new method of presentation was destined to shape almost all later philosophy, mathematics, and science. Zeno drew new attention to the idea that the way the world appears to us is not how it is in reality. Zeno probably also influenced the Greek atomists to accept atoms. Aristotle was influenced by Zeno to use the distinction between actual and potential infinity as a way out of the paradoxes, and careful attention to this distinction has influenced mathematicians ever since. The proofs in Euclid’s Elements, for example, used only potentially infinite procedures. Awareness of Zeno’s paradoxes made Greek and all later Western intellectuals more aware that mistakes can be made when thinking about infinity, continuity, and the structure of space and time, and it made them wary of any claim that a continuous magnitude could be made of discrete parts. “Zeno’s arguments, in some form, have afforded grounds for almost all theories of space and time and infinity which have been constructed from his time to our own,” said Bertrand Russell in the twentieth century.

There is controversy in the recent literature about whether Zeno developed any specific, new mathematical techniques. Some scholars claim Zeno influenced the mathematicians to use the indirect method of proof (reductio ad absurdum), but others disagree and say it may have been the other way around. Other scholars take the internalist position that the conscious use of the method of indirect argumentation arose in both mathematics and philosophy independently of each other. See Hintikka (1978) for a discussion of this controversy about origins. Everyone agrees the method was Greek and not Babylonian, as was the method of proving something by deducing it from explicitly stated assumptions. G. E. L. Owen (Owen 1958, p. 222) argued that Zeno influenced Aristotle’s concept of motion not existing at an instant, which implies there is no instant when a body begins to move, nor an instant when a body changes its speed. Consequently, says Owen, Aristotle’s conception is an obstacle to a Newton-style concept of acceleration, and this hindrance is “Zeno’s major influence on the mathematics of science.” Other commentators consider Owen’s remark to be slightly harsh regarding Zeno because, they ask, if Zeno had not been born, would Aristotle have been likely to develop any other concept of motion?

Zeno’s paradoxes have received some explicit attention from scholars throughout later centuries. Pierre Gassendi in the early 17th century mentioned Zeno’s paradoxes as the reason to claim that the world’s atoms must not be infinitely divisible. Pierre Bayle’s 1696 article on Zeno drew the skeptical conclusion that, for the reasons given by Zeno, the concept of space is contradictory. In the early 19th century, Hegel suggested that Zeno’s paradoxes supported his view that reality is inherently contradictory.

Zeno’s paradoxes caused mistrust of infinities, and this mistrust has influenced the contemporary movements of constructivism, finitism, and nonstandard analysis, all of which affect the treatment of Zeno’s paradoxes. Dialetheism, the acceptance of true contradictions via a paraconsistent formal logic, provides a newer, although unpopular, response to Zeno’s paradoxes, but dialetheism was not created specifically in response to worries about Zeno’s paradoxes. With the introduction in the 20th century of thought experiments about supertasks, interesting philosophical research has been directed towards understanding what it means to complete a task.

Zeno’s paradoxes are often pointed to as a case study in how a philosophical problem has been solved, even though the solution took over two thousand years to materialize.

So, Zeno’s paradoxes have had a wide variety of impacts upon subsequent research. Little research today is involved directly in how to solve the paradoxes themselves, especially in the fields of mathematics and science, although discussion continues in philosophy, primarily on whether a continuous magnitude should be composed of discrete magnitudes, such as whether a line should be composed of points. If there are alternative treatments of Zeno's paradoxes, then this raises the issue of whether there is a single solution to the paradoxes or several solutions or one best solution. The answer to whether the Standard Solution is the correct solution to Zeno’s paradoxes may also depend on whether the best physics of the future that reconciles the theories of quantum mechanics and general relativity will require us to assume spacetime is composed at its most basic level of points, or, instead, of regions or loops or something else.

From the perspective of the Standard Solution, the most significant lesson learned by researchers who have tried to solve Zeno’s paradoxes is that the way out requires revising many of our old theories and their concepts. We have to be willing to rank the virtues of preserving logical consistency and promoting scientific fruitfulness above the virtue of preserving our intuitions. Zeno played a significant role in causing this progressive trend.

7. References and Further Reading

  • Arntzenius, Frank. (2000) “Are there Really Instantaneous Velocities?”, The Monist 83, pp. 187-208.
    • Examines the possibility that a duration does not consist of points, that every part of time has a non-zero size, that real numbers cannot be used as coordinates of times, and that there are no instantaneous velocities at a point.
  • Barnes, J. (1982). The Presocratic Philosophers, Routledge & Kegan Paul: Boston.
    • A well respected survey of the philosophical contributions of the Pre-Socratics.
  • Barrow, John D. (2005). The Infinite Book: A Short Guide to the Boundless, Timeless and Endless, Pantheon Books, New York.
    • A popular book in science and mathematics introducing Zeno’s Paradoxes and other paradoxes regarding infinity.
  • Benacerraf, Paul (1962). “Tasks, Super-Tasks, and the Modern Eleatics,” The Journal of Philosophy, 59, pp. 765-784.
    • An original analysis of Thomson’s Lamp and supertasks.
  • Bergson, Henri (1946). Creative Mind, translated by M. L. Andison. Philosophical Library: New York.
    • Bergson demands the primacy of intuition in place of the objects of mathematical physics.
  • Black, Max (1950-1951). “Achilles and the Tortoise,” Analysis 11, pp. 91-101.
    • A challenge to the Standard Solution to Zeno’s paradoxes. Black argues that Achilles did not need to complete an infinite number of sub-tasks in order to catch the tortoise.
  • Cajori, Florian (1920). “The Purpose of Zeno’s Arguments on Motion,” Isis, vol. 3, no. 1, pp. 7-20.
    • An analysis of the debate regarding the point Zeno is making with his paradoxes of motion.
  • Cantor, Georg (1887). "Über die verschiedenen Ansichten in Bezug auf die actualunendlichen Zahlen." Bihang till Kongl. Svenska Vetenskaps-Akademien Handlingar, Bd. 11 (1886-7), article 19. P. A. Norstedt & Söner: Stockholm.
    • A very early description of set theory and its relationship to old ideas about infinity.
  • Chihara, Charles S. (1965). “On the Possibility of Completing an Infinite Process,” Philosophical Review 74, no. 1, pp. 74-87.
    • An analysis of what we mean by “task.”
  • Copleston, Frederick, S.J. (1962). “The Dialectic of Zeno,” chapter 7 of A History of Philosophy, Volume I, Greece and Rome, Part I, Image Books: Garden City.
    • Copleston says Zeno’s goal is to challenge the Pythagoreans who denied empty space and accepted pluralism.
  • Dainton, Barry. (2010). Time and Space, Second Edition, McGill-Queen's University Press: Ithaca.
    • Chapters 16 and 17 discuss Zeno's Paradoxes.
  • Dauben, J. (1990). Georg Cantor, Princeton University Press: Princeton.
    • Contains Kronecker’s threat to write an article showing that Cantor’s set theory has “no real significance.” Ludwig Wittgenstein was another vocal opponent of set theory.
  • De Boer, Jesse (1953). “A Critique of Continuity, Infinity, and Allied Concepts in the Natural Philosophy of Bergson and Russell,” in Return to Reason: Essays in Realistic Philosophy, John Wild, ed., Henry Regnery Company: Chicago, pp. 92-124.
    • A philosophical defense of Aristotle’s treatment of Zeno’s paradoxes.
  • Diels, Hermann and W. Kranz (1951). Die Fragmente der Vorsokratiker, sixth ed., Weidmannsche Buchhandlung: Berlin.
    • A standard edition of the pre-Socratic texts.
  • Dummett, Michael (2000). “Is Time a Continuum of Instants?,” Philosophy, 2000, Cambridge University Press: Cambridge, pp. 497-515.
    • Promoting a constructive foundation for mathematics, Dummett’s formalism implies there are no durationless instants, so times must have rational values rather than real values. Times have only the values that they can in principle be measured to have; and all measurements produce rational numbers within a margin of error.
  • Earman J. and J. D. Norton (1996). “Infinite Pains: The Trouble with Supertasks,” in Paul Benacerraf: the Philosopher and His Critics, A. Morton and S. Stich (eds.), Blackwell: Cambridge, MA, pp. 231-261.
    • A criticism of Thomson’s interpretation of his infinity machines and the supertasks involved, plus an introduction to the literature on the topic.
  • Feferman, Solomon (1998). In the Light of Logic, Oxford University Press, New York.
    • A discussion of the foundations of mathematics and an argument for semi-constructivism in the tradition of Kronecker and Weyl, that the mathematics used in physical science needs only the lowest level of infinity, the infinity that characterizes the whole numbers. Presupposes considerable knowledge of mathematical logic.
  • Freeman, Kathleen (1948). Ancilla to the Pre-Socratic Philosophers, Harvard University Press: Cambridge, MA. Reprinted in paperback in 1983.
    • One of the best sources in English of primary material on the Pre-Socratics.
  • Grünbaum, Adolf (1967). Modern Science and Zeno’s Paradoxes, Wesleyan University Press: Middletown, Connecticut.
    • A detailed defense of the Standard Solution to the paradoxes.
  • Grünbaum, Adolf (1970). “Modern Science and Zeno’s Paradoxes of Motion,” in (Salmon, 1970), pp. 200-250.
    • An analysis of arguments by Thomson, Chihara, Benacerraf and others regarding the Thomson Lamp and other infinity machines.
  • Hamilton, Edith and Huntington Cairns (1961). The Collected Dialogues of Plato Including the Letters, Princeton University Press: Princeton.
  • Harrison, Craig (1996). “The Three Arrows of Zeno: Cantorian and Non-Cantorian Concepts of the Continuum and of Motion,” Synthese, Volume 107, Number 2, pp. 271-292.
    • Considers smooth infinitesimal analysis as an alternative to the classical Cantorian real analysis of the Standard Solution.
  • Heath, T. L. (1921). A History of Greek Mathematics, Vol. I, Clarendon Press: Oxford. Reprinted 1981.
    • Promotes the minority viewpoint that Zeno had a direct influence on Greek mathematics, for example by eliminating the use of infinitesimals.
  • Hintikka, Jaakko, David Gruender and Evandro Agazzi. Theory Change, Ancient Axiomatics, and Galileo’s Methodology, D. Reidel Publishing Company, Dordrecht.
    • A collection of articles that discuss, among other issues, whether Zeno’s methods influenced the mathematicians of the time or whether the influence went in the other direction. See especially the articles by Karel Berka and Wilbur Knorr.
  • Kirk, G. S., J. E. Raven, and M. Schofield, eds. (1983). The Presocratic Philosophers: A Critical History with a Selection of Texts, Second Edition, Cambridge University Press: Cambridge.
    • A good source in English of primary material on the Pre-Socratics with detailed commentary on the controversies about how to interpret various passages.
  • Maddy, Penelope (1992) “Indispensability and Practice,” Journal of Philosophy 59, pp. 275-289.
    • Explores the implication of arguing that theories of mathematics are indispensable to good science, and that we are justified in believing in the mathematical entities used in those theories.
  • Matson, Wallace I (2001). “Zeno Moves!” pp. 87-108 in Essays in Ancient Greek Philosophy VI: Before Plato, ed. by Anthony Preus, State University of New York Press: Albany.
    • Matson supports Tannery’s non-classical interpretation that Zeno’s purpose was to show only that the opponents of Parmenides are committed to denying motion, and that Zeno himself never denied motion, nor did Parmenides.
  • McCarty, D.C. (2005). “Intuitionism in Mathematics,” in The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, Oxford, pp. 356-86.
    • Argues that a declaration of death of the program of founding mathematics on an intuitionistic basis is premature.
  • McLaughlin, William I. (1994). “Resolving Zeno’s Paradoxes,” Scientific American, vol. 271, no. 5, Nov., pp. 84-90.
    • How Zeno’s paradoxes may be explained using a contemporary theory of Leibniz’s infinitesimals.
  • Owen, G.E.L. (1958). “Zeno and the Mathematicians,” Proceedings of the Aristotelian Society, New Series, vol. LVIII, pp. 199-222.
    • Argues that Zeno and Aristotle negatively influenced the development of the Renaissance concept of acceleration that was used so fruitfully in calculus.
  • Posy, Carl. (2005). “Intuitionism and Philosophy,” in The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, Oxford, pp. 318-54.
    • Contains a discussion of how the unsplitability of Brouwer’s intuitionistic continuum makes precise Aristotle’s notion that “you can’t cut a continuous medium without some of it clinging to the knife,” on pages 345-7.
  • Proclus (1987). Proclus’ Commentary on Plato’s Parmenides, translated by Glenn R. Morrow and John M. Dillon, Princeton University Press: Princeton.
    • A detailed list of every comment made by Proclus about Zeno is available with discussion starting on p. xxxix of the Introduction by John M. Dillon. Dillon focuses on Proclus’ comments which are not clearly derivable from Plato’s Parmenides, and concludes that Proclus had access to other sources for Zeno’s comments, most probably Zeno’s original book or some derivative of it. William Moerbeke’s overly literal translation in 1285 from Greek to Latin of Proclus’ earlier, but now lost, translation of Plato’s Parmenides is the key to figuring out the original Greek. (see p. xliv)
  • Rescher, Nicholas (2001). Paradoxes: Their Roots, Range, and Resolution, Carus Publishing Company: Chicago.
    • Pages 94-102 apply the Standard Solution to all of Zeno's paradoxes. Rescher calls the Paradox of Alike and Unlike the "Paradox of Differentiation."
  • Russell, Bertrand (1914). Our Knowledge of the External World as a Field for Scientific Method in Philosophy, Open Court Publishing Co.: Chicago.
    • Russell champions the use of contemporary real analysis and physics in resolving Zeno’s paradoxes.
  • Salmon, Wesley C., ed. (1970). Zeno’s Paradoxes, The Bobbs-Merrill Company, Inc.: Indianapolis and New York. Reprinted in paperback in 2001.
    • A collection of the most influential articles about Zeno’s Paradoxes from 1911 to 1965. Salmon provides an excellent annotated bibliography of further readings.
  • Szabo, Arpad (1978). The Beginnings of Greek Mathematics, D. Reidel Publishing Co.: Dordrecht.
    • Contains the argument that Parmenides discovered the method of indirect proof by using it against Anaximenes’ cosmogony, although it was better developed in prose by Zeno. Also argues that Greek mathematicians did not originate the idea but learned of it from Parmenides and Zeno. (pp. 244-250). These arguments are challenged in Hintikka (1978).
  • Tannery, Paul (1885). “Le Concept Scientifique du continu: Zénon d’Élée et Georg Cantor,” pp. 385-410 of Revue Philosophique de la France et de l’Étranger, vol. 20, Les Presses Universitaires de France: Paris.
    • This mathematician gives the first argument that Zeno’s purpose was not to deny motion but rather to show only that the opponents of Parmenides are committed to denying motion.
  • Tannery, Paul (1887). Pour l’Histoire de la Science Hellène: de Thalès à Empédocle, Alcan: Paris. 2nd ed. 1930.
    • More development of the challenge to the classical interpretation of what Zeno’s purposes were in creating his paradoxes.
  • Thomson, James (1954-1955). “Tasks and Super-Tasks,” Analysis, XV, pp. 1-13.
    • A criticism of supertasks. The Thomson Lamp thought-experiment is used to challenge Russell’s characterization of Achilles as being able to complete an infinite number of tasks in a finite time.
  • Tiles, Mary (1989). The Philosophy of Set Theory: An Introduction to Cantor’s Paradise, Basil Blackwell: Oxford.
    • A philosophically oriented introduction to the foundations of real analysis and its impact on Zeno’s paradoxes.
  • Vlastos, Gregory (1967). “Zeno of Elea,” in The Encyclopedia of Philosophy, Paul Edwards (ed.), The Macmillan Company and The Free Press: New York.
    • A clear, detailed presentation of the paradoxes. Vlastos comments that Aristotle offers no treatment of Zeno’s paradoxes other than recommending that Zeno’s actual infinities be replaced with potential infinities, so we are entitled to assert that Aristotle probably believed denying actual infinities is the only route to a coherent treatment of infinity. Vlastos also comments that “there is nothing in our sources that states or implies that any development in Greek mathematics (as distinct from philosophical opinions about mathematics) was due to Zeno’s influence.”
  • White, M. J. (1992). The Continuous and the Discrete: Ancient Physical Theories from a Contemporary Perspective, Clarendon Press: Oxford.
    • A presentation of various attempts to defend finitism, neo-Aristotelian potential infinities, and the replacement of the infinite real number field with a finite field.
  • Wisdom, J. O. (1953). “Berkeley’s Criticism of the Infinitesimal,” The British Journal for the Philosophy of Science, Vol. 4, No. 13, pp. 22-25.
    • Wisdom clarifies the issue behind George Berkeley’s criticism (in 1734 in The Analyst) of the use of the infinitesimal (fluxion) by Newton and Leibniz. See also the references there to Wisdom’s other three articles on this topic in the journal Hermathena in 1939, 1941 and 1942.
  • Wolf, Robert S. (2005). A Tour Through Mathematical Logic, The Mathematical Association of America: Washington, DC.
    • Chapter 7 surveys nonstandard analysis, and Chapter 8 surveys constructive mathematics, including the contributions by Errett Bishop and Douglas Bridges.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

Constructive Mathematics

Constructive mathematics is positively characterized by the requirement that proof be algorithmic. Loosely speaking, this means that when a (mathematical) object is asserted to exist, an explicit example is given: a constructive existence proof demonstrates the existence of a mathematical object by outlining a method of finding ("constructing") such an object. The emphasis in constructive theory is placed on hands-on provability, instead of on an abstract notion of truth. The classical concept of validity is starkly contrasted with the constructive notion of proof: an implication (A⟹B) is not equivalent to a disjunction (¬A∨B), and neither is equivalent to a negated conjunction (¬(A∧¬B)). In practice, constructive mathematics may be viewed as mathematics done using intuitionistic logic.

With the advent of the computer, much more emphasis has been placed on algorithmic procedures for obtaining numerical results, and constructive mathematics has come into its own; for a constructive proof is exactly that: an algorithmic procedure for obtaining a conclusion from a set of hypotheses.

The historical and philosophical picture is complex; various forms of constructivism have developed over time. What is presented here is a brief introduction to several of the more widely accepted approaches; it is by no means comprehensive.

Table of Contents

  1. Motivation and History
    1. An Example
    2. Constructivism as Philosophy
    3. The BHK Interpretation
    4. Constructive Methods in Mathematics
    5. Early History
    6. The Axiom of Choice
  2. Intuitionism
    1. Brouwerian Counterexamples
    2. The Fan Theorem
  3. Constructive Recursive Mathematics
    1. Surprises
  4. Bishop's Constructive Mathematics
    1. Proof Readability and Preservation of Numerical Meaning
    2. The Axiom of Choice in BISH
    3. Foundations and Advances
  5. Martin-Löf Type Theory
    1. The Axiom of Choice is Provable
  6. Constructive Reverse Mathematics
  7. Summary
    1. Some Further Remarks
  8. References and Further Reading
    1. Further Reading
    2. References

1. Motivation and History

The origin of modern constructive mathematics lies in the foundational debate at the turn of the 20th Century. At that time, the German mathematician David Hilbert established some very deep and far-reaching existence theorems using non-constructive methods. Around the same time, the Dutch mathematician L.E.J. Brouwer became philosophically convinced that the universal validity of proof by contradiction as a means of establishing existence was unwarranted—despite his early work in establishing the non-constructive fixed point theorem which now bears his name.

It is often the case that a classical theorem becomes more enlightening when seen from the constructive viewpoint (we meet an example of such a case—the least upper bound principle—in Section 1d). It would, however, be unfair to say that constructive mathematics is revisionist in nature. Indeed, Brouwer proved his Fan Theorem (see Section 2b) intuitionistically in 1927 (Brouwer, 1927), but the first proof of König's Lemma (its classical equivalent) only appeared in 1936 (König, 1936). It is ironic that Brouwer's intuitionist position (see Section 2) has become known as anti-realist, since it demands that every object be explicitly constructible.

a. An example

Here is an example of a non-constructive proof that is commonly encountered in the literature:

Proposition.
There exist non-rational numbers a and b such that ab is rational.

Proof. Take b=√2; so b is irrational. Either √2^√2 is rational, or it is not. If it is, then set a=√2. On the other hand, if √2^√2 is irrational, then take a=√2^√2, which makes a^b = (√2^√2)^√2 = √2^2 = 2 and thus rational. In either case, the conclusion holds.

The proof is non-constructive because even though it shows that the non-existence of such numbers would be contradictory, it leaves us without the knowledge of which choice of a and b satisfies the theorem. There is a (simple) constructive proof of this theorem: set a=√2 and b=log₂9, so that a^b = √2^(log₂9) = 3 (in the full constructive proof, a little work first needs to be done to demonstrate that a and b are, in fact, irrational). A fully constructive proof that √2 is properly irrational (that is, positively bounded away from every rational number) may be found in Bishop (1973). This clarifies the choice of words in the proposition above. It is further possible to show that √2^√2 is in fact irrational, but this is not done by the proof presented here. (An early mention of the above illustrative example in the literature is in Dummett (1977, p. 10).)
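The constructive witnesses can at least be checked numerically. The sketch below uses plain Python floating-point arithmetic, so it is a sanity check of the identity a^b = 3 rather than a proof (and irrationality cannot be checked this way at all):

```python
import math

# Witnesses from the text: a = sqrt(2), b = log_2(9).
# Then a**b = 2**((log_2 9) / 2) = 9**(1/2) = 3, which is rational.
a = math.sqrt(2)
b = math.log2(9)

print(a ** b)  # numerically 3.0, up to floating-point rounding
```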

b. Constructivism as philosophy

Constructive mathematics is often mis-characterized as classical mathematics without the axiom of choice (see Section 1f); or classical mathematics without the Law of Excluded Middle. But seen from within the discipline, constructive mathematics is positively characterized by a strict provability requirement. The consequences of adopting this stance—and rigorously implementing it—are far-reaching, as will be seen.

There are two conditions which are fundamental to every constructivist philosophy:

  • The notion of `truth' is not taken as primitive; rather, a proposition is considered true only when a proof for the proposition is produced.
  • The notion of `existence' is taken to be constructibility: when an object is proved to exist, the proof also exhibits how to find it.

To assert that P, then, is to produce a proof that P. Likewise, to assert that ¬P is to prove that the assumption of P leads to a contradiction. Very quickly, one realizes that the Principle of Excluded Middle (PEM; Latin tertium non datur or principium tertii exclusi) leads to trouble:

PEM: For any statement P, either P or ¬P.

The assertion of PEM constructively amounts to claiming that, for any given statement P, either there is a proof of P, or there is a proof that P leads to a contradiction. Consider, for example, the following:

The Collatz conjecture. Define f : ℕ → ℕ by the rule:

f(n) = n/2 if n is even, and f(n) = 3n + 1 if n is odd.

Then for each natural number n, that is for each n in {1,2,3,...}, there exists a natural number k such that f^k(n) = 1, where f^k denotes k successive applications of f.

At the time of writing this article, the Collatz conjecture remains unsolved. To claim that there is a proof of it is erroneous; likewise to claim that there is a proof that it leads to a contradiction is also erroneous. In fact, in Brouwer's view (see Section 2), to assert PEM is to claim that any mathematical problem has a solution. Thus there are good philosophical grounds for rejecting PEM.
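The map itself is entirely algorithmic, and any particular instance of the conjecture can be verified by direct iteration; what remains open is whether the iteration reaches 1 for every n. A minimal sketch (the max_steps cutoff is a practical safeguard of ours, not part of the conjecture):

```python
def f(n):
    """One step of the Collatz map: n/2 if n is even, 3n + 1 if n is odd."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def steps_to_one(n, max_steps=10_000):
    """Return the least k with f^k(n) = 1, or None if not reached in max_steps.
    The conjecture asserts the cutoff is never actually needed."""
    k = 0
    while n != 1 and k < max_steps:
        n = f(n)
        k += 1
    return k if n == 1 else None

# 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
print(steps_to_one(6))  # 8
```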

c. The BHK interpretation

The following interpretation of logical connectives is now known as the Brouwer-Heyting-Kolmogorov (BHK) interpretation, and is widely accepted by constructivists.

Statement Interpretation
P We have a proof of P.
P∧Q We have both a proof of P and a proof of Q.
P∨Q We have either a proof of P or a proof of Q.
P⟹Q We have an algorithm that converts any proof of P into a proof of Q.
¬P We have a proof that P⟹⊥, where ⊥ is absurd (for example, 0=1).
∃x∈A P(x) We have an algorithm which computes an object x∈A and confirms that P(x).
∀x∈A P(x) We have an algorithm which, given any object x, together with the data that x∈A, shows that P(x) holds.

In particular, the interpretations for ¬, ⟹, and ∀ are worth emphasizing: each of these has the very strict requirement that a proof of such a statement will consist of a decision procedure—in other words, every constructive proof contains, in principle, an algorithm.

The BHK interpretation characterizes a logic called intuitionistic logic. Every form of constructive mathematics has intuitionistic logic at its core; different schools have different additional principles or axioms given by the particular approach to constructivism.
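Under the BHK reading, proofs are data: a proof of a conjunction is a pair, a proof of a disjunction carries a tag recording which disjunct holds, and a proof of an implication is a function on proofs. The following toy encoding is purely illustrative (the function names are ours, and strings stand in for genuine proof objects):

```python
# A hypothetical encoding of BHK proof objects:
def and_intro(proof_p, proof_q):
    return (proof_p, proof_q)            # a proof of P∧Q is a pair

def or_intro_left(proof_p):
    return ("left", proof_p)             # a proof of P∨Q records WHICH disjunct

def modus_ponens(proof_impl, proof_p):
    return proof_impl(proof_p)           # a proof of P⟹Q is a function on proofs

# Example: an algorithm converting any proof of P∧Q into a proof of Q∧P,
# that is, a proof of (P∧Q)⟹(Q∧P):
swap = lambda pq: (pq[1], pq[0])

print(modus_ponens(swap, and_intro("proof of P", "proof of Q")))
# ('proof of Q', 'proof of P')
```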

d. Constructive methods in mathematics

Upon adopting only constructive methods, we lose some powerful proof tools commonly used in classical mathematics. We have already seen that the Principle of Excluded Middle is highly suspect from the constructivist viewpoint, as (under the BHK interpretation) it claims the existence of a universal algorithm to determine the truth of any given statement. This is not to say that PEM is constructively false, however. Both Russian recursive mathematics (in which PEM is provably false) and classical mathematics (in which it is logically true) are in a sense models, or interpretations, of constructive mathematics. So in a way, PEM is independent of constructive mathematics. Note, however, that if one is given a statement, it may be possible to prove PEM concerning that particular statement—such statements are called decidable. The point is that there is no general constructive method for doing so for all statements.

If PEM is not universally valid, we also lose the universal applicability of any mode of argument which validates it, such as double negation elimination or proof by contradiction. Again, it must be emphasized that it is only the universal applicability which is challenged: proof by contradiction is constructively just fine for proving negative statements, and double negation elimination is just fine for decidable statements.

However, these limitations often turn out to be advantages. In many cases, constructive alternatives to non-constructive classical principles exist, leading to results which are often constructively stronger than their classical counterparts. For example, the classical least upper bound principle (LUB) is not constructively provable:

LUB: Any nonempty set of real numbers that is bounded from above has a least upper bound.

The constructive least upper bound principle, by contrast, is constructively provable (Bishop & Bridges, 1985, p. 37):

CLUB: Any order-located nonempty set of reals that is bounded from above has a least upper bound.

A set is order-located if given any real x, the distance from x to the set is computable.
It is quite common for a constructive alternative to be classically equivalent to the classical principle; and, indeed, classically every nonempty set of reals is order-located.
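Order-locatedness is a genuinely computational condition. For a finite, explicitly listed set it holds trivially, as the following sketch illustrates (the helper name dist is ours):

```python
def dist(x, finite_set):
    """Distance from x to a finite, explicitly listed set of reals.
    For a finite set this minimum is computable, so the set is
    order-located; its least upper bound is simply max(finite_set)."""
    return min(abs(x - s) for s in finite_set)

S = [0.0, 1.0, 4.0]
print(dist(2.5, S))  # 1.5
print(max(S))        # 4.0
```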

To see why LUB is not provable, we may consider a so-called Brouwerian counterexample (see Section 2a), such as the set

S = {x ∈ ℝ : (x = 2) ∨ (x = 3 ∧ P)}


where P is some as-yet unproven statement, such as Goldbach's conjecture that every even number greater than 2 is the sum of two prime numbers. (There may be some philosophical problems with this set; however these do not matter for the purpose of this example. Section 2a has a much "cleaner", though more technically involved, example.) If the set S had a computable least upper bound, then we would have a quick proof of the truth or falsity of Goldbach's conjecture. A Brouwerian counterexample is an example which shows that if a certain property holds, then it is possible to constructively prove a non-constructive principle (such as PEM); and thus the property itself must be essentially non-constructive.
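The role Goldbach's conjecture plays here can be made concrete: for any particular even number we can search for a decomposition into two primes, but no finite amount of such checking settles the statement P itself, and hence none tells us whether the least upper bound of S is 2 or 3. A sketch (helper names are ours):

```python
def is_prime(n):
    """Trial division -- adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_witness(n):
    """For even n > 2, search for primes p, q with p + q = n.
    Each success verifies one instance only; no finite run of such
    checks can decide the conjecture P itself."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None

print(goldbach_witness(28))  # (5, 23)
```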

e. Early history

In the late 19th century, the mathematical community embarked on a search for foundations: unquestionable solid ground on which mathematical theorems could be proved. An early exemplar of this search is Kronecker's 1887 paper "Über den Zahlbegriff" ("On the Concept of Number") (Kronecker, 1887), where he outlines the project of arithmetization (that is, founding on the fundamental notion of number) of analysis and algebra. It is perhaps in this work that we see the earliest instance of the constructive manifesto in mathematical practice. Kronecker was famously quoted by Weber (1893) as saying "Die ganzen Zahlen hat der liebe Gott gemacht, alles andere ist Menschenwerk." ("The integers were made by God; all else is the work of man.")

It has been said that "almost all mathematical proofs before 1888 were essentially constructive." (Beeson, 1985, p. 418). This is not to say that constructive mathematics, as currently conceived of, was common practice; but rather that the natural interpretation of existence of the time was existence in the constructive sense. As mathematical concepts have changed over time, however, old proofs have taken on new meaning. There are thus plenty of examples of results which are today regarded as essentially non-constructive, but seen within the context of the period during which they were written, the objects of study that the authors had in mind allowed proofs of a constructive nature. One good example of this is Cauchy's 1821 proof of the Intermediate Value Theorem (Cauchy, 1821; in Bradley & Sandifer, 2009, pp. 309–312); the classical interval-halving argument. The concept of "function" has, since then, expanded to include objects which allow for Brouwerian counterexamples (see Section 2a).
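Cauchy's interval-halving argument is, as a numerical procedure, perfectly algorithmic; the constructive difficulty with the modern, expanded notion of function is that one cannot in general decide the sign of the function's value at a midpoint. A sketch of the halving procedure itself, for a function whose signs we can compute (function names are ours):

```python
def halve(f, a, b, tol=1e-8):
    """Interval-halving for a root of f on [a, b], assuming f(a) < 0 < f(b).
    Numerically this is algorithmic; the constructive subtlety is that
    deciding the sign of f at a midpoint is not possible for arbitrary f."""
    while b - a > tol:
        m = (a + b) / 2
        if f(m) < 0:
            a = m
        else:
            b = m
    return (a + b) / 2

# Root of x^2 - 2 on [1, 2]: approximates sqrt(2).
print(halve(lambda x: x * x - 2, 1.0, 2.0))
```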

The first major systematic development of a constructive approach to mathematics is that of Brouwer and the intuitionists, introduced in the next major section. Some notable forerunners (who should perhaps not be thought of as intuitionists themselves) were Henri Poincaré, who argued that mathematics is in a way more immediate than logic, and Emile Borel, who maintained that the only objects that concern science are those that can be effectively defined (Borel, 1914). Poincaré argues that intuition is a necessary component of mathematical thought, rejects the idea of an actual infinite, and argues that mathematical induction is an unprovable fact (Poincaré, 1908).

As a result of techniques from ZF set theory, rigorous proofs have over time taken on non-constructive aspects. David Hilbert, contemporary of Brouwer and an opponent of Brouwer's philosophy, deserves particular mention, since he was an early pioneer of highly non-constructive techniques. The debate between Brouwer and Hilbert grew fierce and controversial, which perhaps added fuel to the development of both constructivism (à la Brouwer) and formalism (à la Hilbert) (see, for example, van Heijenoort, 1976). The spectacular 1890 proof by Hilbert of his famous basis theorem (Hilbert, 1890), showing the existence of a finite set of generators for the invariants of quantics, was greeted by the German mathematician Gordan with another now-(in)famous phrase: "Das ist nicht Mathematik. Das ist Theologie." ("That is not Mathematics. That is Theology.") This is perhaps the most dramatic example of a non-constructive proof, relying on PEM in an infinite extension.

The method Hilbert exhibited there has become widely accepted by the mathematics community as a pure existence proof (that is, proving that non-existence of such an object was contradictory without actually exhibiting the object); however it was not admissible as a constructive technique. Weyl, one of Hilbert's students (who Hilbert would eventually "lose"—perhaps temporarily—to intuitionism) commented on pure existence proofs, that they "inform the world that a treasure exists without disclosing its location" (Weyl, 1946). The Axiom of Choice, perhaps due in part to its regular use in many non-constructive proofs (and heavily implicated in many of Hilbert's most influential proofs), has been accused of being the source of non-constructivity in mathematics.

f. The Axiom of Choice

The Axiom of Choice (AC) has been controversial to varying degrees in mathematics ever since it was recognized by Zermelo (1904). Loosely, it states that given a collection of non-empty sets, we can choose exactly one item from each set.

AC: If to each x in A there corresponds a y in B such that P(x,y), then there is a function f such that P(x,f(x)) whenever x is in A.

Formally, given non-empty sets A and B, the Axiom may be stated as:

∀x∈A ∃y∈B P(x,y) ⟹ ∃f∈Bᴬ ∀x∈A P(x, f(x)).


Intuitively, this seems almost trivial, and when the collection contains only finitely many sets it is: one does not need the axiom of choice to prove that such a choice can be made in most formulations of set theory. However it becomes less clear when the collection is very large or somehow not easily surveyed, for in such cases a principled, functional choice cannot necessarily be made. Even in classical mathematics, the axiom of choice is occasionally viewed with some suspicion: it leads to results that are surprising and counterintuitive, such as the Banach-Tarski paradox, in which the axiom of choice allows one to take a solid sphere in 3-dimensional space, decompose it into a finite number of non-overlapping pieces, and reassemble the pieces (using only rotations and translations) into two solid spheres, each with volume identical to the original sphere. Due to its controversial nature, many mathematicians explicitly state when and where in their proofs AC has been used.
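The unproblematic finite case can be seen directly: for a finite, explicitly listed collection of inhabited sets, a choice function can simply be computed, with no axiom invoked. A minimal sketch (using min as the selector is an arbitrary but deterministic choice of ours):

```python
# Illustrative only: each set is finite and explicitly listed, so a
# choice function is computed outright; no axiom is invoked.
collection = {"A": {3, 1, 2}, "B": {5}, "C": {9, 7}}

choice = {name: min(s) for name, s in collection.items()}

print(choice)  # {'A': 1, 'B': 5, 'C': 7}
```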

Nonetheless, the BHK interpretation of the quantifiers seems to invite one to think about the existential quantifier ∃y∈B as defining the choice function, and so it would seem very natural to adopt AC from the constructive point of view (see, however, the discussion in Sections 4b and 5a). But consider the set S given by

S = {x ∈ ℤ : (x = 0 ∧ P) ∨ (x = 1 ∧ ¬P)}


for some syntactically correct statement P. This set is not empty, since if it were then both P and ¬P would have to be false; so we would have ¬P ∧ ¬¬P, a contradiction. Similarly, S cannot contain both 0 and 1. But suppose that we had an algorithmic procedure which (in a finite time) returns an element of S for us. If it returns 0, then we know that P must be true; whereas if it returns 1 then we know that P must be false—and thus we will have proved that P ∨ ¬P. We may repeat this for any statement, and so this amounts to a (constructive) proof of the universal applicability of PEM. And as we have seen, universal applicability of PEM is not constructively acceptable. Thus, while the set S is non-empty, we cannot necessarily exhibit any of its members either. A set which has constructible members is called inhabited, and the distinction between inhabited and non-empty sets is key in constructive set theories. Related is the issue of the size of S: since S is not empty, its size is not 0, but since S is not (necessarily) inhabited, its size is no other natural number either—S is an at-most-singleton set. It is a consequence of AC, in fact, that every non-empty set is inhabited.

It is no coincidence that the symbolic form of AC suggests that it is essentially a quantifier swap: AC states that if to each element of A an element of B can be assigned, then it can be done so in a systematic (algorithmic) way. It is thus a kind of uniformity principle.

The example set S above is known as a Brouwerian example (although most of Brouwer's examples of this sort were a little more specific—see below). It is to intuitionism, Brouwer's philosophy of mathematics, that we now turn.

2. Intuitionism

If there is a name synonymous with constructive mathematics, it is L.E.J. Brouwer (Luitzen Egbertus Jan Brouwer; Bertus to his friends). In his doctoral thesis, Over de Grondslagen der Wiskunde (On the Foundations of Mathematics; Brouwer, 1907), Brouwer began his program of laying the foundations of constructive mathematics in a systematic way. Brouwer's particular type of constructive mathematics is called "intuitionism" or "intuitionistic mathematics" (not to be confused with intuitionistic logic; recall Section 1c).

Shortly after the presentation of his thesis, Brouwer wrote a paper entitled "De onbetrouwbaarheid der logische principes" ("The Untrustworthiness of the Principles of Logic"; Brouwer, 1908), in which he challenged the absolute validity of classical logic: "De vraag naar de geldigheid van het principium tertii exclusi is dus aequivalent met de vraag naar de mogelijkheid van onoplosbare wiskundige problemen." ("The question of validity of the principle of excluded third is equivalent to the question of the possibility of unsolvable mathematical problems."; Brouwer, 1908, p. 156) In other words, PEM is valid only if there are no unsolvable mathematical problems.

In intuitionistic mathematics, mathematics is seen as a free creation of the human mind: a (mathematical) object exists just in case it can be (mentally) constructed. This immediately justifies the BHK interpretation, since any existence proof cannot be a proof-by-contradiction—the contradiction leaves us without a construction of the object.

According to Brouwer, the natural numbers and (perhaps surprisingly) the continuum are primitive notions given directly by intuition alone. This connects to the idea of what Brouwer called free choice sequences, a generalization of the notion of sequence. It is perhaps ironic that when Brouwer initially encountered the idea of a free choice sequence, which had been a mathematical curiosity at the time, he rejected such sequences as non-intuitionistic (Brouwer, 1912). However, soon afterward he accepted them and was the first to discover how important they were to practical constructive mathematics (Brouwer, 1914). This is one of two major aspects which distinguish intuitionistic mathematics from other kinds of constructive mathematics (the second being Brouwer's technique of bar induction, which we do not explain in great depth here; though see Section 2b).

A free choice sequence is given by a constructing agent (Brouwer's creative subject), who at any stage of the progression of a sequence is free to choose (subject at most to mild restrictions) the next member of the sequence. For example, if one is asked to produce a binary free choice sequence, one may produce "0" for each of the first hundred digits, and then (perhaps unexpectedly) freely choose "1" as the next digit. A real number, classically thought of as a (converging) Cauchy sequence, thus need not be given by a determinate rule; instead, it is only subject to a Cauchy restriction (that is, some rate of convergence).
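No program can capture the subject's genuine freedom, but the interface of a choice sequence can be sketched. In the following Python sketch (our own illustration, not Brouwer's formalism), a pseudo-random generator merely stands in for the agent's free choices:

```python
import random

def choice_sequence(seed=None):
    # A schematic 'creating subject' emitting a binary sequence one term
    # at a time.  The pseudo-random choice is only a stand-in for genuine
    # freedom: no rule determining later terms is available to a consumer.
    rng = random.Random(seed)
    while True:
        yield rng.choice((0, 1))

# Any finite observer only ever sees an initial segment of the sequence:
seq = choice_sequence(seed=42)
prefix = [next(seq) for _ in range(10)]
```

At every stage only a finite prefix of such a sequence is available, which is exactly the situation exploited by the continuity considerations that follow.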

The idea of a free choice sequence leads to some very strong commitments. Not immediately obvious is the principle of continuous choice. It states:

If P ⊆ ℕ^ℕ × ℕ and for each x in ℕ^ℕ there exists n in ℕ such that (x, n) is in P, then there is a choice function f : ℕ^ℕ → ℕ such that (x, f(x)) is in P for all x in ℕ^ℕ.

The notation B^A, where A and B are sets, denotes the collection of functions from A to B. So ℕ^ℕ denotes the collection of functions from natural numbers to natural numbers; the arguments x in question are thus actually functions x : ℕ → ℕ. One may conceive of these (as Brouwer did) as sequences of natural numbers; further technical details are omitted.

Intuitively, the principle arises from the following considerations. Let P ⊆ ℕ^ℕ × ℕ and suppose that we have a procedure that we may apply to any given sequence x = x1, x2, x3, … of natural numbers which computes an index n such that (x, n) ∈ P—that is, P is a set in which every sequence of natural numbers is (constructively) paired with some natural number. Under the intuitionist's view, at any given moment we may have only a finite number of terms of x at hand. The procedure which computes n such that (x, n) ∈ P must thus do so after being given only a finite initial fragment of x; say x1, x2, …, xm. Now if we are given another sequence y, and the first m terms of y are the same as those of x (that is, if y is "close" to x in some sense), then the procedure must return the same value of n for y as it would give for x. The procedure is thus a continuous function (the choice function) from ℕ^ℕ to ℕ.
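The argument can be made concrete: a functional that, by construction, consults only a fixed finite prefix of its argument cannot distinguish sequences agreeing on that prefix. A minimal Python sketch (the functional F is our own toy example):

```python
def F(x):
    # Consults only the first three terms of the sequence x, so any two
    # sequences agreeing on that prefix must receive the same index.
    return x(0) + x(1) + x(2)

x = lambda n: n                    # the sequence 0, 1, 2, 3, ...
y = lambda n: n if n < 3 else 0    # agrees with x on the first three terms
```

Here F(x) and F(y) necessarily coincide, even though x and y differ from the fourth term onward.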

The principle of continuous choice is a restricted form of the classical axiom of choice (see Section 1f). This principle (together with the BHK interpretation) gives rise to a number of very strong consequences—perhaps the most (in)famous being that every function f : ℝ → ℝ which maps the real line into itself is pointwise continuous.

If this seems outlandish, then remember that in intuitionistic mathematics the very ideas of "function" and "real number" are different from their classical counterparts. We now discuss a technique that originated with Brouwer which justifies, to some extent, these conceptions.

a. Brouwerian counterexamples

As we have seen, Brouwer rejected the principle of excluded middle. The intuitionist school further rejects appeal to so-called omniscience principles: logical principles which allow one to decide the truth of predicates over an infinite number of things. An illuminating example of such a principle is the limited principle of omniscience (LPO), which states that:

For each binary sequence a1a2a3…, either an = 0 for all natural numbers n or there exists a natural number n such that an = 1.

In symbols LPO states that, for any given binary sequence a1a2a3, the following statement is decidable:

∀n (an = 0) ∨ ∃n (an = 1).


For one familiar with computability, the above statement will already appear problematic. It says that there is an algorithm which, given an arbitrary binary sequence a1a2a3…, will in finite time either determine that all the terms an are zero, or output an n such that an = 1. Since computers have a finite memory capacity, the problem of loss of significance (or underflow) occurs: the computer carrying out the calculation may not carry enough digits of the sequence to determine whether a 1 occurs or not (though see Section 3).

However, (modern) computational issues aside, this was not the problem seen by the intuitionist. To Brouwer, the possible existence of choice sequences made this statement unacceptable. Since a constructing agent is free to insert a 1 while constructing a sequence at any stage of the construction, there is no a priori bound on how many terms one may need to search in order to establish the presence, or lack of, a 1.

Thus we see that the limited principle of omniscience does not apply to binary sequences in intuitionist mathematics. Let us now turn to the real numbers. The decimal expansion of π is a favourite example. There are well-known algorithms for computing π to arbitrary precision, limited only by the power of the device on which the algorithms run. At the time of writing this article, it is unknown whether a string of 100 successive 0s (that is, 00…0 with 100 digits) appears in the decimal expansion of π. It is known that π does not have a terminating decimal expansion (so it does not finish in all 0s) and that it has no systematic repetition (since it is not a rational number).
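Such digit computations are entirely feasible in practice. The sketch below (our own; a run of a hundred 0s is replaced by a feasible run of six) computes digits of π by Machin's formula with scaled integer arithmetic, then searches the computed prefix for a run of zeros:

```python
def arctan_inv(x, scale):
    # arctan(1/x) * scale via the alternating series 1/x - 1/(3x^3) + ...
    total, power, k = 0, scale // x, 0
    while power:
        term = power // (2 * k + 1)
        total += -term if k % 2 else term
        power //= x * x
        k += 1
    return total

def pi_digits(n):
    # First n decimal digits of pi after the point, via Machin's formula
    # pi = 16*arctan(1/5) - 4*arctan(1/239), with ten guard digits.
    scale = 10 ** (n + 10)
    pi = 16 * arctan_inv(5, scale) - 4 * arctan_inv(239, scale)
    return str(pi)[1:n + 1]        # drop the leading "3"

digits = pi_digits(2000)
position = digits.find("0" * 6)    # -1 if no run of six 0s in this prefix
```

Whatever the search reports holds only for the prefix examined; nothing follows about the full expansion, which is precisely the intuitionistic point.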

But suppose we define a real number x by the expansion

x = ∑_{n=1}^{∞} (−1/2)^n an


where an is defined as

an = 0 if a hundred consecutive 0s have not occurred by the nth digit of π's expansion, and an = 1 otherwise.


The number x is constructible: we have prescribed an algorithm for computing it to any desired precision. It also has a Cauchy condition—the (−1/2)^n part—which ensures it converges. However, we now have a problem as regards the sign of x: if a hundred consecutive 0s never occur in the decimal expansion of π, then x = 0. If a hundred consecutive 0s do occur in the decimal expansion of π, and the hundredth such 0 occurs in an odd place, then x < 0; whereas if it occurs in an even place then x > 0. So if we have an answer to the question whether such a string occurs in the decimal expansion of π, then we can determine whether x = 0 or not; however, since we do not have an answer, we cannot conclude that x > 0, or x = 0, or x < 0.

The previous paragraph is a Brouwerian counterexample to the statement "every real number satisfies the law of trichotomy". Trichotomy is exactly the decision (disjunction) that every real number is either positive, zero, or negative. Under Brouwer's view, we may construct real numbers about which we simply do not have enough information at hand—even in principle—to determine their sign.

b. The Fan Theorem

In addition to the use of intuitionistic logic and the principle of continuous choice, Brouwer's intuitionism involves one final central technique: bar induction. It is a technically challenging idea which we do not cover in depth here (see Beeson, 1985, p. 51 or Troelstra & van Dalen, 1988, p. 229 for detail; see also Bridges, Dent & McKubre-Jordens, 2012). Bar induction is the key principle which allowed Brouwer to prove his Fan Theorem; we outline the theorem for the binary fan and indicate its significance.

The (complete) binary fan is the collection of all (finite) sequences of 0s and 1s (including the empty sequence). Diagrammatically, we may draw a tree-like diagram (the "fan") with a binary split for each branch at each level; one branch corresponds to 0, another to 1. A path through the binary fan is a sequence (finite or infinite) of 0s and 1s. Compare then:

  • A bar of the binary fan is a set of "cut-offs", such that for each infinite path α through the fan there exists a natural number n such that the first n terms of α are "in the bar".
  • A bar of the binary fan is uniform if there exists a natural number N such that for each infinite path α through the fan, there is some k ≤ N such that the first k terms of α are in the bar.

Note the quantifier shift: we go from "for each path there exists n…" to "there exists N such that for each path …". Brouwer's fan theorem for the binary fan then states that

FAN: Every bar of the binary fan is uniform.
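For a decidable bar, the uniform bound can at least be searched for by growing the fan level by level. A small Python sketch (the predicate and names are our own):

```python
def uniform_bound(in_bar, max_depth=32):
    # Breadth-first growth of the binary fan, discarding a finite path as
    # soon as it enters the bar.  If the frontier empties at level n, every
    # infinite path is barred within its first n terms: the bar is uniform.
    frontier = [()]
    for n in range(max_depth + 1):
        frontier = [p for p in frontier if not in_bar(p)]
        if not frontier:
            return n
        frontier = [p + (b,) for p in frontier for b in (0, 1)]
    return None   # no uniform bound found up to max_depth

# Example bar: a path is cut off once it contains a 1 or reaches length 3.
bound = uniform_bound(lambda p: 1 in p or len(p) >= 3)
```

Note that the search can only ever confirm uniformity; if the bar were not uniform, the loop would simply exhaust max_depth, reflecting the fact that FAN is a genuine assertion rather than a triviality.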

The classical contrapositive of the fan theorem, well-known in graph theory, is König's Lemma: if for every n there exists a path of length n which does not hit the bar, then there exists an infinite path that is not in the bar.

As mentioned, the fact that Brouwer's proof of the fan theorem (1927) was published nine years before König's proof of his Lemma (1936) shows the inaccuracy of the criticism sometimes made that constructive mathematics is revisionist. Indeed, as is shown by the Gödel-Gentzen negative translation (Gödel, 1933), every classical theorem may be translated into a constructive theorem (some of which are not very informative); and so constructive mathematics can be interpreted as a generalization of classical mathematics. If one adds PEM to intuitionistic logic, then the full classical system is recovered. (Of course, the reverse view may also be argued for—that classical mathematics is a generalization of constructive methods.)

3. Constructive recursive mathematics

In the 1930s, amidst developments in what is now known as computer science, concepts of algorithmic computability were formalised. The well-known Turing machines, λ-calculus, combinatory logic and the like arose. These were proved to be equivalent, leading Church, Turing and Markov to postulate the thesis now known as the Church-Markov-Turing thesis (or the Church-Turing thesis, or simply Church's thesis):

CMT: A function is computable if and only if it is computable by a Turing machine.

Markov's work, based on what he called normal algorithms (now known as Markov algorithms; Markov, 1954), was essentially recursive function theory employing intuitionistic logic. In this type of constructive mathematics, the groundwork for which was done in the 1940s and 1950s, Church's thesis is accepted as true (as it is by classical mathematicians, by and large). Moreover, in Markov's school, Markov's Principle of unbounded search is accepted:

MP: For any binary sequence, if it is impossible that all the terms are zero, then there exists a term equal to 1.

This reflects the idea that if it is contradictory for all the terms to be zero, then (although we do not have an a priori bound on how far in the sequence we must seek) eventually any machine that is looking for a 1 will find one in finite (possibly large) time.
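The "algorithm" implicit in MP is exactly unbounded search, as in the following Python sketch (illustrative only):

```python
def markov_search(a):
    # Unbounded search for a 1.  MP guarantees termination only under the
    # hypothesis that it is impossible for all terms to be zero; without
    # that hypothesis, the loop may never return.
    n = 1
    while a(n) == 0:
        n += 1
    return n

# A sequence whose only 1 sits at position 7:
hit = markov_search(lambda n: 1 if n == 7 else 0)
```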

While MP may seem like a kind of omniscience principle, there is a distinct difference: the implication P → Q is, under the BHK interpretation, weaker than the disjunction ¬P ∨ Q. An algorithm which takes any (correct) proof of P and turns it into a (correct) proof of Q does not leave us any the wiser as to which of ¬P or Q is actually the case. In part because of this, the constructive status of this principle is not so clear. While the omniscience principles are not accepted by any school of constructivism, there is at least pragmatic (that is, computer-implementable) reason to admit MP: the algorithm which will compute the 1 in a binary sequence for which it is impossible that all terms are 0 is simply to carry on the process of writing the sequence down until you come across a 1. Of course, MP does not provide a guarantee that a 1 will be found in a sequence before, say, the extinction of the human race, or the heat death of the universe.

Recursive function theory with intuitionistic logic and Markov's principle is known as Russian recursive mathematics. If Markov's principle is omitted, this leaves constructive recursive mathematics more generally—however, there appear to be few current practitioners of this style of mathematics (Beeson, 1985, p. 48).

A central tenet of recursive function theory is the following axiom of Computable Partial Functions:

CPF: There is an enumeration of the set of all computable partial functions from N to N with countable domains.

Much may be deduced using this seemingly innocuous axiom. For example, PEM (and weakenings thereof, such as LPO) may be shown to be (perhaps surprisingly) simply false within the theory.

a. Surprises

Constructive recursive mathematics presents other surprises as well. Constructivists are already skeptical about PEM, LPO, and other wide-reaching principles. However, some amazing (and classically contradictory) results can be obtained, such as Specker's theorem (Specker, 1949):

Theorem.
There exists a strictly increasing sequence (xn)n≥1 of rational numbers in the closed interval [0,1] that is eventually bounded away from each real number in [0,1].

To be "eventually bounded away from a number x" means that, given the number x, one may calculate a positive integer N and a positive distance δ such that the distance from x to xk is at least δ whenever k ≥ N.

Of course such sequences cannot be uniformly bounded away from the whole of [0,1], otherwise they could not progress within [0,1] past this uniform bound (again, there is a quantifier swap, from the pointwise property ∀x ∃N, δ to the uniform property ∃N, δ ∀x; recall Section 2b). While it seems like a contradictory result, the theorem becomes clear when one considers the objects of study: the sequences (and functions) and real numbers in question are recursive sequences and real numbers; so Specker sequences are sequences which classically converge to non-recursive numbers.
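The construction behind Specker's theorem can be mimicked schematically: the partial sums of 2^−(i+1), taken over an effective enumeration of a recursively enumerable but non-recursive set, form such a sequence. The Python sketch below is only an illustration; the toy stage function stands in for a genuinely non-recursive enumeration, which no single program can actually supply:

```python
def specker_terms(enumerated_by_stage, n_terms):
    # Partial sums x_n = sum of 2^-(i+1) over all i enumerated by stage n.
    # With a non-recursive enumerable set, these sums classically converge
    # to a non-recursive real, yielding a Specker sequence.
    terms, seen = [], set()
    for n in range(n_terms):
        seen |= set(enumerated_by_stage(n))
        terms.append(sum(2.0 ** -(i + 1) for i in seen))
    return terms

# Toy stand-in for the enumeration: element k appears at stage 2k.
xs = specker_terms(lambda n: [n // 2] if n % 2 == 0 else [], 8)
```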

Another interesting result is the constructive existence, within this theory, of a recursive function from [0,1] to the reals which is everywhere pointwise continuous yet not uniformly continuous (as such a function would be both classically and intuitionistically, as a consequence of Brouwer's fan theorem).

4. Bishop's constructive mathematics

In 1967 Errett Bishop published the seminal monograph Foundations of Constructive Analysis (Bishop, 1967). Not only did this assuage the worries expressed by some leading mathematicians (such as Weyl) about the feasibility of constructive proofs in analysis, but it helped lay the foundation for a programme of mathematical research that has flourished since. It captures the numerical content of a large part of the classical research programme and has become the standard for constructive mathematics in the broader sense, as we will shortly see.

a. Proof readability and preservation of numerical meaning

Bishop refused to formally define the notion of algorithm in the BHK interpretation. It is in part this quality which lends Bishop-style constructive mathematics (BISH) the following property: every other model of constructive mathematics, philosophical background notwithstanding, can be seen as an interpretation of Bishop's constructive mathematics where some further assumptions have been added. For example, intuitionistic mathematics is BISH together with bar induction and continuous choice; and (Russian) constructive recursive mathematics is BISH together with the Computable Partial Functions axiom and Markov's Principle. Even classical mathematics—or mathematics with classical two-valued logic—can be seen as a model of Bishop-style constructive mathematics where the principle of excluded middle is added as axiom.

Another factor that contributes to this versatility is the fact that Bishop did not add extra philosophical commitments to the programme of his constructive mathematics, beyond the commitment to ensuring that every theorem has numerical meaning (Bishop, 1975). The idea of preserving numerical meaning is intuitive: the numerical content of any fact given in the conclusion of a theorem must somehow be algorithmically linked to (or preserved from) the numerical content of the hypotheses. Thus BISH may be used to study the same objects as the classical mathematician (or the recursive function theorist, or the intuitionist). As a result, a proof in Bishop-style mathematics can be read, understood, and accepted as correct by everyone (Beeson, 1985, p. 49) (barring perhaps the paraconsistentists, for whom the disjunctive syllogism, weakening, and contraction—accepted under BHK—portend trouble).

At heart, Bishop's constructive mathematics is simply mathematics done with intuitionistic logic, and may be regarded as "constructive mathematics for the working mathematician" (Troelstra & van Dalen, 1988, p. 28). If one reads a Bishop-style proof, the classical analyst will recognize it as mathematics, even if some of the moves made within the proof seem slightly strange to those unaccustomed to preserving numerical meaning. Further, BISH can be interpreted in theories of computable mathematics, such as Weihrauch's Type 2 effectivity theory (Weihrauch, 2000).

b. The Axiom of Choice in BISH

One (in)famous claim Bishop makes is that "A choice function exists in constructive mathematics, because a choice is implied by the very meaning of existence" (Bishop & Bridges, 1985, p. 12). The axiom is in fact provable in some constructive theories (see Section 5). However, Diaconescu has shown that AC implies PEM, so it would seem that constructivists ought to accept PEM after all (the proof in Diaconescu (1975) is actually remarkably simple). And as we have seen, for some constructivists it would actually be inconsistent to accept PEM. It thus seems that AC, while very important, presents a significant problem for constructivists. However, as we now discuss, such a conclusion is hasty and unjustified.

The key observation to make regards the interpretation of the quantifiers. AC comes out as provable only when we interpret the quantifiers under BHK: for then the hypothesis ∀x∈A ∃y∈B P(x, y) says that there is an algorithm which takes us from elements x of A, together with the data that x belongs to A, to an element y of B. Thus only if there is no extra work to be done beyond the construction of x in showing that x is a member of A (that is, if this can be done uniformly) will this be a genuine function from A to B. We will revisit this in Section 5. Thus the classical interpretation of AC is not valid under BHK, and PEM cannot be problematically derived.

Let us return to Bishop's statement at the start of this section, regarding the existence of a choice function being implied by the meaning of existence. The context in which his claim was made illuminates the situation. The passage (Bishop & Bridges, 1985, p. 12) goes on to read: "[Typical] applications of the axiom of choice in classical mathematics either are irrelevant or are combined with a sweeping use of the principle of omniscience." This also shows that blaming the axiom of choice for non-constructivity is actually a mistake—it is the appeal to PEM, LPO, or similar absolute principles applied to objects other than the finite which prevents proof from being thoroughly algorithmic. Some constructive mathematicians adopt various weakenings of AC which are (perhaps) less contentious; Brouwer, as we saw in Section 2, adopts Continuous Choice.

c. Foundations & Advances

Bishop's view is that the Gödel interpretation of logical symbols accurately describes numerical meaning (Bishop, 1970). This view does not appear to have retained much traction with practitioners of BISH (Beeson, 1985, p. 49). Leaving the question of foundations open is, as mentioned, partly responsible for the portability of Bishop-style proofs.

Some work has been done on founding BISH. A set-theoretic approach was proposed in Myhill (1975) and Friedman (1977); Feferman (1979) suggested an approach based on classes and operations. Yet another approach is Martin-Löf's theory of types, which we visit in Section 5.

The book Techniques of Constructive Analysis (Bridges & Vîta, 2006) develops a Bishop-style constructive mathematics that revisits and extends the work of Bishop (1967) in a more modern setting, showing how modern proof techniques from advanced topics such as Hilbert spaces and operator theory can be successfully covered constructively. It is interesting to observe that, as a rule of thumb, the more ingenious a proof looks, the less explicit numerical content it tends to contain; such proofs are usually more difficult to translate into constructive proofs. Of course there are exceptions, but the comparison between (say) Robinson's nonstandard analysis and Bishop's constructive methods appears to be a confusion between elegance and content propagated by some authors; see, for example, Stewart (1986) and Richman (1987).

5. Martin-Löf Type Theory

In 1968, Per Martin-Löf published his Notes on Constructive Mathematics (Martin-Löf, 1968). It encapsulates mathematics based on recursive function theory, with a background of the algorithms of Post (1936).

However, he soon revisited foundations in a different way, one which harks back to Russell's type theory, albeit using a more constructive and less logicist approach. The basic idea of Martin-Löf's theory of types (Martin-Löf, 1975) is that mathematical objects all come as types, and are always given in terms of a type (for example, one such type is that of functions from natural numbers to natural numbers), due to an intuitive understanding that we have of the notion of the given type.

The central distinction Martin-Löf makes is that between proof and derivation. Derivations convince us of the truth of a statement, whereas a proof contains the data necessary for computational (that is, mechanical) verification of a proposition. Thus what one finds in standard mathematical textbooks are derivations; a proof is a kind of realizability (cf. Kleene, 1945) and links mathematics to implementation (at least implicitly).

The theory is very reminiscent of a kind of cumulative hierarchy. Propositions can be represented as types (a proposition's type is the type of its proofs), and to each type one may associate a proposition (that the associated type is not empty). One then builds further types by construction on already existing types.
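The propositions-as-types reading can be illustrated in miniature: evidence for a conjunction is a pair, and evidence for an implication is a program transforming evidence. A Python sketch of this idea (real type theory tracks these types statically; Python does not):

```python
def and_comm(proof_pair):
    # A proof of A and B implies B and A: a program transforming evidence
    # for the antecedent (a pair) into evidence for the consequent.
    a, b = proof_pair
    return (b, a)

# Stand-in evidence objects for the two conjuncts:
evidence = ("proof of A", "proof of B")
swapped = and_comm(evidence)
```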

a. The Axiom of Choice is provable

Within Martin-Löf type theory, the axiom of choice, stated symbolically as

∀x∈A ∃y∈B P(x, y) → ∃f∈B^A ∀x∈A P(x, f(x))


is derivable. Recall that this would be inconsistent if AC were interpreted classically. However, the axiom is derivable because in the construction of types, the construction of an element of a set (type) is sufficient to prove its membership. In Bishop-style constructive mathematics, these sets are "completely presented", in that we need to know no more than an object's construction to determine its set membership. Under the BHK interpretation, the added requirement on the proof of ∀x∈A P(x), namely that the algorithm may depend also on the data that x belongs to A and not just on the construction of x itself, is thus automatically satisfied.
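Seen this way, the derivation of AC is just a projection. A Python sketch under the BHK reading (the names and the toy proof object are our own):

```python
def derive_choice(h):
    # Under BHK, a proof h of "for every x in A there is y in B with
    # P(x, y)" already computes, for each x, a witness together with
    # evidence for P(x, y).  The choice function is the first projection;
    # no appeal to AC as a separate axiom is needed.
    return lambda x: h(x)[0]

# Hypothetical proof object: for each n, the witness n + 1 plus evidence.
h = lambda n: (n + 1, "evidence that P(n, n + 1)")
f = derive_choice(h)
```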

It is worth noting that sets construed in this theory are constrained by this realizability criterion. For example, in Martin-Löf's theory, the power set axiom is not accepted in full generality—the power set of N, for example, is considered to be what would in classical theory be called a class, but not a set (Martin-Löf, 1975).

6. Constructive Reverse Mathematics

The program of Reverse Mathematics, instigated by Harvey Friedman (1975), aims to classify theorems according to their equivalence to various set-theoretic principles. The constructive equivalent was begun in a systematic manner by Ishihara (2006) and, separately and independently, by W. Veldman in 2005.

The field has since had numerous developments, the interest of which lies mainly in the metamathematical comparison of different branches of constructive mathematics and the logical strengths of various principles endorsed or rejected by the different schools. This also points to the requirements on computational data for various properties to hold, such as various notions of compactness (Diener, 2008).

Principles whose classification is of interest include:

  • The Uniform Continuity Theorem (UCT): Every pointwise continuous mapping of the closed interval [0,1] into ℝ is uniformly continuous.
  • Brouwer's fan theorem, and various weakenings thereof. (There is actually a hierarchy of fan theorems; see, for example, Diener, 2008.)
  • The Anti-Specker property (AS): A sequence (zn)n≥1 of real numbers in [0,1] that is eventually bounded away from each point of [0,1] is eventually bounded away from all of [0,1] (recall the discussion in Section 3a).
  • Omniscience principles.

It should be noted that arguably the first author dealing with reverse-mathematical ideas was Brouwer, although he certainly would not have seen it as such. The weak counterexamples (see Section 2a) he introduced were of the form: P implies some non-constructive principle. Though Brouwer may have been aware of the possibility of reversing the implication in many cases, to him non-constructive principles were meaningless and as such the full equivalence result would be of little interest.

7. Summary

The philosophical commitments of constructivist philosophies (Section 1b) are:

  • Truth is replaced by (algorithmic) proof as a primitive notion, and
  • Existence means constructibility.

This naturally leads to intuitionistic logic, characterized by the BHK interpretation (Section 1c). Omniscience principles, such as the Principle of Excluded Middle (PEM), and any mode of argument which validates such principles, are not in general valid under this interpretation, and the classical equivalence between the logical connectives does not hold.

Constructive proofs of classical theorems are often enlightening: the computational content of the hypotheses is explicitly seen to produce the conclusion. While intuitionistic logic places restrictions on inferences, the larger part of classical mathematics (or what is classically equivalent) can be recovered using only constructive methods (Section 4); there are often several constructively different versions of the same classical theorem.

The computational advantage of constructive proof is borne out in two ways. Constructive proofs:

  • embody (in principle) an algorithm (for computing objects, converting other algorithms, etc.), and
  • prove that the algorithm they embody is correct (that is, that it meets its design specification).

The programme of constructive reverse mathematics (Section 6) connects various principles and theorems (such as omniscience principles, versions of the fan theorem, etc.), shedding light on constructive or computational requirements of theorems. Furthermore, the use of Brouwerian counterexamples (Section 2a) often allows the mathematician to distinguish which aspects of classical proof are essentially nonconstructive.

Throughout the article, several schools of constructivism are outlined. Each is essentially mathematics with intuitionistic logic, philosophical differences notwithstanding. Different schools add different further axioms: for example, (Russian) constructive recursive mathematics (Section 3) is mathematics with intuitionistic logic plus the computable partial functions axiom (and Markov's principle of unbounded search). Classical mathematics can be interpreted as Bishop's constructive mathematics with PEM added.

a. Some Further Remarks

The contrast between classical and constructive mathematicians is clear: in order to obtain the numerical content of a proof, the classical mathematician must be careful at each step of the proof to avoid decisions that cannot be algorithmically made; whereas the constructive mathematician, in adopting intuitionistic logic, has automatically dealt with the computational content carefully enough that an algorithm may be extracted from their proof.

In an age where computers are ubiquitous, the constructivist programme needs even less (pragmatic) justification, perhaps, than the classical approach. This is borne out by the successful translation of constructive proofs into actual algorithms (see, for example, Schwichtenberg (2009) and Berger, Berghofer, Letouzey & Schwichtenberg (2006)). The link between programming and abstract mathematics is stronger than ever, and will only strengthen as new research emerges.

8. References and Further reading

a. Further Reading

An overview of constructivism in mathematics is given in both Troelstra & van Dalen (1988) and Beeson (1985). For further reading in intuitionism from a philosophical perspective, Dummett's Elements (1977) is the prime resource. The Bishop and Bridges book (1985) is the cornerstone textbook for the Bishop-style constructivist. For the finitist version, see Ye (2011), where it is shown that even a strict finitist interpretation allows large tracts of constructive mathematics to be realized; in particular we see application to finite things of Lebesgue integration, and extension of the constructive theory to semi-Riemannian geometry. For advances and techniques in Bishop-style mathematics, see Bridges & Vîta (2006). For an introduction to constructive recursive function theory, see Richman (1983); Bridges & Richman (1987) expounds the theory relative to other kinds of constructive mathematics. For an introduction to computable analysis, see, for example, Weihrauch (2000). For topology, Bridges & Vîta (2011) sets out a Bishop-style development of topology; Sambin (1987) and (2003) are good starting points for further reading in the recent development of point-free (or formal) topology.

b. References

  • Beeson, M. (1985). Foundations of constructive mathematics. Heidelberg: Springer-Verlag.
  • Berger, U., Berghofer, S., Letouzey, P. & Schwichtenberg, H. (2006). Program extraction from normalization proofs. Studia Logica, 82, 25–49.
  • Bishop, E. (1967). Foundations of constructive analysis. New York: Academic Press.
  • Bishop, E. (1970). Mathematics as a numerical language. In A. Kino, J. Myhill & R. E. Vesley (Eds.), Intuitionism and proof theory: proceedings of the summer conference at Buffalo, New York, 1968 (pp. 53–71). Amsterdam: North-Holland.
  • Bishop, E. (1973). Schizophrenia in contemporary mathematics. Amer. Math. Soc. Colloquium Lectures. Missoula: University of Montana. Reprinted in: Rosenblatt, M. (1985). Errett Bishop: reflections on him and his research. (M. Rosenblatt & E. Bishop, Eds.) Amer. Math. Soc. Memoirs 39.
  • Bishop, E. (1975). The crisis in contemporary mathematics. Historia Mathematica, 2(4), 507–517.
  • Bishop, E. & Bridges, D. S. (1985). Constructive analysis. Berlin: Springer-Verlag.
  • Borel, E. (1914). Leçons sur la théorie des fonctions (2nd ed.). In French. Paris: Gauthier-Villars.
  • Bradley, R. E. & Sandifer, C. E. (2009). Cauchy's cours d'analyse. New York: Springer.
  • Bridges, D. & Richman, F. (1987). Varieties of constructive mathematics. London Math. Soc. Lecture Notes 97. Cambridge: Cambridge University Press.
  • Bridges, D. S. & Vîta, L. S. (2006). Techniques of constructive analysis. Universitext. Heidelberg: Springer-Verlag.
  • Bridges, D. S. & Vîta, L. S. (2011). Apartness and uniformity: a constructive development. Heidelberg: Springer.
  • Bridges, D. S., Dent, J. & McKubre-Jordens, M. (2012). Two direct proofs that LLPO implies the detachable Fan Theorem. Preprint.
  • Brouwer, L. E. J. (1907). Over de grondslagen der wiskunde. (Doctoral dissertation, University of Amsterdam, Amsterdam). In Dutch.
  • Brouwer, L. E. J. (1908). De onbetrouwbaarheid der logische principes. Tijdschrift voor Wijsbegeerte, 2, 152–158. In Dutch.
  • Brouwer, L. E. J. (1912). Intuïtionisme en formalisme. Wiskundig Tijdschrift, 9, 180–211. In Dutch.
  • Brouwer, L. E. J. (1914). Review of Schoenflies's "Die Entwicklung der Mengenlehre und ihre Anwendungen, erste Hälfte". Jahresber. Dtsch. Math.-Ver. 23(2), 78–83.
  • Brouwer, L. E. J. (1927). On the domains of definition of functions. Reprinted in: van Heijenoort, J. (1976). From Frege to Gödel: a source book in mathematical logic, 1879–1931 (2nd printing with corrections). Cambridge, Massachusetts: Harvard University Press.
  • Cauchy, A.-L. (1821). Cours d'analyse. English translation: Bradley, R. E. & Sandifer, C. E. (2009). Cauchy's cours d'analyse. New York: Springer.
  • Diaconescu, R. (1975). Axiom of choice and complementation. Proc. Amer. Math. Soc. 51, 176–178.
  • Diener, H. (2008). Compactness under constructive scrutiny. (Doctoral dissertation, University of Canterbury, New Zealand).
  • Dummett, M. A. E. (1977). Elements of intuitionism. Oxford: Clarendon Press.
  • Feferman, S. (1979). Constructive theories of functions and classes. In Logic colloquium '78 (Proc. Mons Colloq.) (pp. 159–224). Stud. Logic Foundations Math. 97. Amsterdam: North-Holland.
  • Friedman, H. (1975). Some systems of second order arithmetic and their use. In Proceedings of the 17th international congress of mathematicians (Vancouver, British Columbia, 1974).
  • Friedman, H. (1977). Set theoretic foundations for constructive analysis. Ann. Math. 105(1), 1–28.
  • Gödel, K. (1933). Zur intuitionistischen Arithmetik und Zahlentheorie. Ergebnisse eines mathematischen Kolloquiums, 4, 34–38. In German.
  • Hilbert, D. (1890). Über die Theorie der algebraischen Formen. Math. Ann. 36, 473–534. In German.
  • Ishihara, H. (2006). Reverse mathematics in Bishop's constructive mathematics. Phil. Scientiae. Cahier Special 6, 43–59.
  • Kleene, S. C. (1945). On the interpretation of intuitionistic number theory. J. Symbolic Logic, 10(4), 109–124.
  • König, D. (1936). Theorie der endlichen und unendlichen Graphen: kombinatorische Topologie der Streckenkomplexe. In German. Leipzig: Akad. Verlag.
  • Kronecker, L. (1887). Über den Zahlbegriff. J. Reine Angew. Math. 101, 337–355. In German.
  • Markov, A. A. (1954). Theory of algorithms. In Trudy Mat. Istituta imeni V. A. Steklova (Vol. 42). In Russian. Moskva: Izdatel'stvo Akademii Nauk SSSR.
  • Martin-Löf, P. (1968). Notes on constructive analysis. Stockholm: Almqvist & Wiksell.
  • Martin-Löf, P. (1975). An intuitionistic theory of types: predicative part. In H. E. Rose & J. C. Shepherdson (Eds.), Logic colloquium 1973 (pp. 73–118). Amsterdam: North-Holland.
  • Myhill, J. (1975). Constructive set theory. J. Symbolic Logic, 40(3), 347–382.
  • Poincaré, H. (1908). Science et méthode. In French. Paris: Flammarion.
  • Post, E. (1936). Finite combinatory processes - formulation 1. J. Symbolic Logic, 1, 103–105.
  • Richman, F. (1983). Church's thesis without tears. J. Symbolic Logic, 48, 797–803.
  • Richman, F. (1987). The frog replies. The Mathematical Intelligencer, 9(3), 22–24. See also the subsequent discussion, pp. 24–26.
  • Rosenblatt, M. (1985). Errett Bishop: reflections on him and his research (M. Rosenblatt & E. Bishop, Eds.). Amer. Math. Soc. Memoirs 39.
  • Sambin, G. (1987). Intuitionistic formal spaces - a first communication. In D. Skordev (Ed.), Mathematical logic and its applications (pp. 187–204). New York: Plenum.
  • Sambin, G. (2003). Some points in formal topology. Theoretical Computer Science, 305, 347–408.
  • Schwichtenberg, H. (2009). Program extraction in constructive analysis. In S. Lindström, E. Palmgren & K. Segerberg (Eds.), Logicism, intuitionism, and formalism: what has become of them? (pp. 255–275). Heidelberg: Springer.
  • Specker, E. (1949). Nicht konstruktiv beweisbare Sätze der Analysis. J. Symbolic Logic, 14, 145–158. In German.
  • Stewart, I. (1986). Frog and mouse revisited: a review of Constructive Analysis by Errett Bishop and Douglas Bridges (Springer 1985) and An Introduction to Nonstandard Real Analysis by A. E. Hurd and P. A. Loeb. The Mathematical Intelligencer, 8(4), 78–82.
  • Troelstra, A. S. & van Dalen, D. (1988). Constructivism in mathematics: an introduction. Two volumes. Amsterdam: North Holland.
  • van Heijenoort, J. (1976). From Frege to Gödel: a source book in mathematical logic, 1879–1931 (2nd printing with corrections). Cambridge, Massachusetts: Harvard University Press.
  • Weber, H. (1893). Leopold Kronecker. Math. Ann., 43, 1–25.
  • Weihrauch, K. (2000). Computable analysis. EATCS Texts in Theoretical Computer Science. Heidelberg: Springer-Verlag.
  • Weyl, H. (1946). Mathematics and logic. Amer. Math. Monthly, 53, 2–13.
  • Ye, F. (2011). Strict finitism and the logic of mathematical applications. Synthese Library 355. Heidelberg: Springer.
  • Zermelo, E. (1904). Beweis, dass jede Menge wohlgeordnet werden kann. Math. Ann., 59, 514–516. In German.

Author Information

Maarten McKubre-Jordens
Email: maarten.jordens@canterbury.ac.nz
University of Canterbury
New Zealand

Ludwig Wittgenstein: Later Philosophy of Mathematics

Mathematics was a central and constant preoccupation for Ludwig Wittgenstein (1889–1951). He started in philosophy by reflecting on the nature of mathematics and logic; and, at the end of his life, his manuscripts on these topics amounted to thousands of pages, including notebooks and correspondence. In 1944, he said his primary contribution to philosophy was in the philosophy of mathematics. Yet his later views on mathematics have been less well received than his earlier conception, due to their anti-scientific, even anti-rationalist, spirit.

This article focuses on the relation between the later Wittgenstein’s philosophy of mathematics and other philosophies of mathematics, especially Platonism; however, other doctrines (formalism, conventionalism, constructivism, empiricism) will be discussed as well.

Wittgenstein does not sympathize with any traditional philosophy of mathematics, and in particular his hostility toward Platonism (the conception that mathematics is about a-causal objects and mind-independent truths) is quite evident. This is in line with what can be described as his more general philosophical project: to expose deep conceptual confusions in the academic doctrines, rather than to defend his own doctrine. In fact, it is not even clear that the threads of his thinking on mathematics, when pulled together, amount to what we would today call a coherent, unified -ism. However, one view that can be attributed to him is that mathematical identities such as ‘Three times three is nine’ are not really propositions, as their superficial form indicates, but are certain kinds of rules; and, thus understood, the question is whether they are arbitrary or not. The interpretive position preferred in this article is that they are not, since they are grounded in empirical regularities – hence the recurrence of the theme of the applicability of mathematics in Wittgenstein’s later reflections on this topic.

Some have characterized him as a finitist-constructivist, others as a conventionalist, while many strongly disagree about these labels. He notoriously held the view that philosophy should be eminently descriptive, strongly opposing any interference with how mathematics is actually done (though he also predicted that lucid philosophy would curb the growth of certain mathematical branches). In particular, he was worried about the philosophers’ tendency to provide a foundation for mathematics, mainly because he thought it does not need one.

Table of Contents

  1. The Reaction from Other Philosophers
  2. Which 'ism'?
  3. Mathematical Reality
  4. Philosophical Method, Finitism-Constructivism, Logicism and Formalism
  5. Regularities, Rules, Agreement, and Contingency
  6. Concluding Remarks
  7. References and Further Reading
    1. Wittgenstein's Writings
    2. Other Sources

1. The Reaction from Other Philosophers


Although Wittgenstein’s earlier views on mathematics and logic (mainly set out in his 1922 work Tractatus Logico-Philosophicus) have been very influential, his later conception (roughly, post-1929) has been “met with an ambivalent reaction, (...) drawing both the interest and the ire of working logicians” (Floyd [2005, 77]). A good deal of Wittgenstein’s later work on mathematics has been collected and edited by G. H. von Wright, R. Rhees and G. E. M. Anscombe under the title Remarks on the Foundations of Mathematics (RFM hereafter), first published in 1956. Another collection, Wittgenstein's Lectures on the Foundations of Mathematics, Cambridge 1939, was edited by Cora Diamond and published in 1976. Two harsh reviews of RFM, by G. Kreisel [1958] and A. R. Anderson [1958], set the tone for a sceptical attitude toward his later views on mathematics that persisted even into the late twentieth century. P. Maddy, for instance, places the following warning at the beginning of her lucid Wittgenstein section in Naturalism in Mathematics: “Some philosophers, especially those with a technical bent, tend to be unsympathetic to the style and content of the late Wittgenstein. I encourage such Wittgenstein-phobes to skip over [the section].” [1997, 161, fn. 1] Also illustrative of this way of thinking is the omission of the Wittgenstein material from the second edition [1983] of Benacerraf and Putnam’s landmark collection Philosophy of Mathematics: Selected Readings. (The first edition of 1964 included excerpts from RFM.)

This Wittgenstein-phobia, if it truly exists, is surely regrettable – not only because of the intrinsic value of Wittgenstein’s highly original insights, but also because understanding his views on mathematics might illuminate difficult themes in his widely-studied work Philosophical Investigations [1953] (PI). It is telling that an early draft of PI contained a good portion of what is now RFM, and it is no coincidence that some of the examples and scenarios used to illustrate the much discussed rule-following ‘paradoxes’ appearing in PI are cast in mathematical terms (Fogelin [1976], Kripke [1982]).

2. Which 'ism'?

This article focuses on the relation between later Wittgenstein’s philosophy of mathematics and other philosophies of mathematics, especially Platonism; however, other doctrines (formalism, conventionalism, constructivism, empiricism) will be discussed as well. Hopefully, even a glance at this relation will give the reader a minimally misleading sense of his later views on mathematics – although, naturally, there is much more to explore than can be covered here (for example, Wittgenstein’s points on proofs as forging ‘internal relations’ and their role in concept formation, his take on mathematical necessity, or on Gödel’s First Incompleteness Theorem, this last issue alone generating a series of interesting exchanges over the years; see Tymoczko [1984], Shanker [1988], Rodych [1999, 2006], Putnam and Floyd [2000], Floyd [2001], Steiner [2001], Bays [2004]). Note also that very little will be said on the debate over the partition of Wittgenstein’s thinking. When it comes to mathematics, the standard demarcation (two Wittgensteins – the ‘early’ one of the Tractatus, and the ‘later’ one of the PI and RFM) is questioned by Gerrard ([1991], [1996]), who distinguishes two lines of thought within the post-Tractarian period: a middle one, or “the calculus conception”, to be found in Philosophical Grammar (PG), and a truly later one, “the language-game conception.” Stern [1991], however, worries about dividing Wittgenstein’s thinking too finely, and is concerned with the more recent tendency to add even a fourth period, post-PI, devoted to philosophical psychology and illustrated in his Remarks on the Philosophy of Psychology.

In addition to the aphoristic and multi-voiced style, one of the most baffling aspects of later Wittgenstein’s views on mathematics is his position relative to the traditional philosophies of mathematics. While his hostility toward Platonism (especially in the version advocated by the Cambridge mathematician G. H. Hardy) is quite pronounced, not much can be confidently said about the doctrine he actually espouses. In fact, given his rejection of theories and theses in philosophy (PI §128), one might even suspect that the threads of his thought, when pulled together, do not amount to what we would today call a ‘position’, let alone a coherent, unified ‘ism.’ He rejects the logicism of his former teachers Frege and Russell, but this does not turn him into a formalist (of Hilbertian inspiration), or an intuitionist-constructivist-finitist (like Brouwer); moreover, Kantian sympathies are discernible, even empiricist ones. Most of Wittgenstein’s remarks, however, are directed against academic schools, and Fogelin [1987], for one, describes his position via a double negative: ‘anti-platonism without conventionalism.’ No new label will be proposed here; still, the best starting point for discussing Wittgenstein’s conception is its relation to conventionalism. How exactly his views relate to this doctrine is, once again, hard to pin down; what appears relatively clearer is that his conventionalism is not the ‘full-blooded’ conventionalism that Dummett [1959] attributed to him.

As a first approximation, for Wittgenstein arithmetical identities (such as ‘three times three is nine’) are not propositions, as their superficial grammar indicates – but rules. Importantly though, these rules are not arbitrary; in a sense (to be explicated later on), the rules in place are the only ones that could have been adopted or, as Steiner [2009, 12] put it, “the only rules available.” The typical conventionalist difficulty (that the rules might have an arbitrary character) is answered by adding that the rules are grounded in objectively verifiable empirical regularities (Fogelin [1987]; Steiner [1996], [2000], [2009]); or, as Wittgenstein says, the empirical regularities are “hardened” into rules (RFM VI-22). In discussing Wittgenstein’s relation to conventionalism, a central task in what follows will be to clarify why (and how, and what kind of) agreement within a community is a crucial presupposition of the very existence of mathematics.

3. Mathematical Reality

As stated above, it is an open question whether Wittgenstein actually held a ‘position’ in the philosophy of mathematics, in the sense of advancing a compact body of doctrine. Most of his remarks seem reactions to what he takes to be (philosophical, non-trivial) misunderstandings concerning the nature of mathematics. Platonism falls in this category. He most likely understood this doctrine as the conjunction of two tenets: a semantic thesis – mathematical propositions state truths – and an ontological claim, according to which the truth-makers of the mathematical propositions, that is, objects like numbers, sets, functions, and so forth, exist, in a fashion similar to Plato’s forms, and populate an a-causal, non-spatiotemporal domain. This is to say that they are abstract; additionally, they are also language- and mind-independent. A third, epistemological, thesis is often added: we humans can know about these objects and truths insofar as we possess a special cognitive faculty of ‘intuition’ (that is, despite the fact that we don’t interact with them causally). However, note that an adequate characterization of Platonism itself, or of the version Wittgenstein dismissed, is a substantial philosophical task in itself, and this article will not attempt to undertake it. Yet one issue, among others, is worth mentioning in passing. There are Platonist realists who claim that mathematical statements have truth-value (Shapiro [2000] calls this “realism in truth-value”), yet also hold that this does not entail that certain ‘objects’ need to be postulated as their truth-makers (the view that we do need to postulate these objects is called “realism in ontology”). Many famous mathematicians, philosophers and logicians have been attracted or intrigued by Platonism: Hardy [1929], [1967], Bernays [1935], Gödel [1947], Benacerraf [1973], Dummett [1978a] are loci classici. More recently, Burgess and Rosen [1997] and Balaguer [1998] offer book-length treatments of this conception; Linnebo [2009] and Cole [2010] are useful surveys and rich up-to-date bibliographical sources.

What is Wittgenstein’s take on Platonism? The standard view is that for him this conception is what C. Wright [1980, 5] calls a “dangerous error.” Wittgenstein surely believed that it is extremely misleading to understand mathematical identities as stating truths about some (mathematical) objects. This is so because mathematical formulae are not in the business of making statements to begin with: they are not propositions. The acceptance of this assumption – that they are, as their surface grammar indicates – is, for some commentators, one of those “decisive moment[s] in the conjuring trick (...)” he mentions in PI §308, “the very one that we thought quite innocent.” Thus, the decisive step on the road leading to Platonism is asking the question ‘What is mathematics about?’ The query seems innocent indeed: mathematics is a type of discourse, all discourses have a subject-matter, so mathematics must have one too. To see the problem with this question, Wittgenstein urges us to consider another question – ‘What is chess about?’ (This kind of move is characteristic of his style of philosophizing in the later period.) The initial question suggests a certain perspective on the issue, while the new question is meant to challenge this perspective, to reveal other possibilities – in particular, that mathematical equations may not be descriptions (statements, affirmations, declaratives, and so forth), let alone descriptions of some abstract, mind-independent, non-spatiotemporal entities and their relations. As has been noted (Wrigley [1977]), this idea bears clear similarities to one of the key-thoughts of the Tractatus, that not all words (in particular the logical vocabulary) stand for things. The grammatical form of number words (as nouns), and of mathematical formulae (as affirmations), must be treated with care. More concretely, ‘is’ in ‘two and two is four’ does not have the role of a description of a state of affairs (so to speak) holding within an abstract non-spatiotemporal realm.

It is generally accepted, at least among philosophers, that Platonism is the natural, or “default” metaphysical position for the working mathematician (Cole [2010]). Wittgenstein has an interesting answer to the question ‘Why are mathematicians attracted to this view?’ and this section shall close by sketching it.

Commentators have emphasized that his answer is grounded in the distinction between applied and pure (better: not-yet-applied) mathematics, and is thus entirely different from the standard reasons adduced by Platonists themselves. (For them, this doctrine is appealing because: (i) it squares well with our pre-theoretical picture of mathematics (as containing truth-valuable, objective, mind-independent, a priori statements), and (ii) it explains the agreement among mathematicians of all times.) As Maddy [1997, 167-8] points out, Wittgenstein thought it was very important to reflect on how professional mathematicians typically account for the importance of a certain piece of mathematical formalism – say, a new theorem. (He discusses this in terms of subjecting the “interest of calculations” to a test (RFM II-62).) One thing mathematicians could do is reveal the implications of that bit of mathematics for science and mathematics as a whole. In other words, they could point out the great and unquestionable effectiveness of that formalism in other areas of mathematics and in scientific applications. Yet, when this type of answer is not available – that is, when they have to justify their interest in an unapplied formalism – they usually say things that Wittgenstein calls ‘prose’ (Wittgenstein and the Vienna Circle [WVC], p. 149). Instead of (perhaps boringly) detailing the internal-mathematical complex network of conceptual relations in which the new theorem fits, its unifying power, its capacity to open up new lines of research, and so forth, many mathematicians indulge in the more thrilling project of presenting their activity as one of foray into a mysterious super-empirical realm. Hence the familiar Platonist distinction: the claim that they do not invent theories, but discover objective eternal truths about a-causal, non-spatiotemporal objects. Wittgenstein refers to this as the illusory revelation of “the mysteries of the mathematical world” (RFM II-40, 41).

These critical points do not amount to a positive view yet. Rather, they lead to more queries, two of which will be discussed in the next section. They are: (i) What is wrong with asking what mathematics is about? and (ii) Does the comparison with chess mean that Wittgenstein believes that mathematics is (merely) a game? Note that for Wittgenstein reflection on games occupies a central place in his philosophical methodology, and that chess is his favourite object of comparison (it appears in many contexts; to mention only the PI, see §31, §33, §136, and so forth).

4. Philosophical Method, Finitism-Constructivism, Logicism and Formalism

If one asks ‘what is mathematics about?’, a typical answer is – ‘certain entities, like the number 2, inhabiting a Platonic reality.’ The adoption of this metaphysics comes in handy, seemingly increasing our understanding in this matter: we might get the feeling that we now finally know what mathematicians have been talking about for thousands of years. However, this is deceptive; the acceptance of this proposal creates more and deeper problems than the ones we began with. Wittgenstein regards the talk about ‘reality’ in mathematics not as a step toward a better understanding of this human practice, but as a source of confusion, of philosophical problems, which he regards as conceptual ‘illnesses’ requiring ‘treatment.’ Here we encounter his view that philosophy ultimately has a therapeutic role, an idea appearing in PI first at §133, with mathematics mentioned explicitly in §254. The therapy consists in applying one of his philosophical methods – to remind those who are puzzled of where concepts are ‘at home’, and to reveal how philosophizing uproots them from their natural context:

Consider Professor Hardy’s article (“Mathematical Proof”) and his remark that “to mathematical propositions there corresponds – in some sense, however sophisticated – a reality.” ... [Y]ou forget where the expression ‘a reality corresponds to’ is really at home – What is “reality”? We think of “reality” as something we can point to. It is this, that. Professor Hardy is comparing mathematical propositions to the propositions of physics. This comparison is extremely misleading. (Lectures on the Foundations of Mathematics, pp. 239–40; LFM hereafter)

The diagnosis of PI §38 is that philosophical problems appear when ‘language goes on holiday’, or ‘idles.’ What does this mean? (The original reads "...wenn die Sprache feiert." Stern [2004, 97] argues that Rhees’ initial translation of ‘feiert’ by ‘idles’ conveys better the idea that the language does no work than Anscombe’s standard translation ‘goes on holiday.’) Steps toward the clarification of this point can be made by observing that Wittgenstein has in mind here the situation in which a concept otherwise fully intelligible, oftentimes even unremarkable, is detached from its original contexts and uses (language-games). Once dragged into a new ‘territory’, the concept idles (just like a cogwheel separated from the gear), as it cannot engage other related concepts anymore. Anyone trying to employ the concept in this new ‘environment’ experiences mental cramps and disquietudes, precisely because, as it happens when one ventures into uncharted ‘terrain’, the familiar ‘paths’ and ‘crossroads’ no longer exist: one does not know her “way about” (PI §123), one does not see things clearly anymore. Hence, according to Wittgenstein, the whole point of philosophizing is not to defend ‘philosophical positions’ but to undo this perplexity, and achieve a liberating ‘perspicuous representation’, or over-view (übersichtliche Darstellung; PI §122).

When discussing the LFM quote above, Maddy observes [1997, 167] that one such problem, of much fame in the current debates in the philosophy of mathematics, is the epistemological problem for Platonists, due to Benacerraf [1973]. Roughly put, the question is how we get to know anything about numbers, if they are a-causal, non-spatiotemporal entities, and if all our knowledge arises by causal interaction with the known objects? This seems a legitimate worry, but Wittgenstein would disagree with one of its assumptions. Not only does he reject the idea that a causal basis is essential for knowledge (‘knowledge’ most likely counts as a “family-resemblance” concept, lacking an essential feature; cf. PI §67), but he tackles the issue from an entirely different angle. He notes that the force of the problem (“the feeling of something queer”, the “thumb-catching of the intellect”) “comes from a misunderstanding” (RFM V-6). The misunderstanding originates in what can be called a ‘transfer’ move: the Platonist assumes that the talk of ‘reality’ (and ‘entities’, and so forth) can be extended, or transferred, from the natural ‘home’ of these notions (the vernacular and the natural sciences) to the mathematical discourse – and, moreover, that this transfer can take place without loss of meaning. The rejected assumption is not that we can talk about relations and objective truth in mathematics, but that we can talk about them in the same way we talk in empirical science. Once this transfer is made, ‘deep’ questions (like Benacerraf’s) follow immediately. What Wittgenstein urges, then, is to take this epistemological difficulty as exposing the meaning-annihilation effect of the illicit transfer move – so, in the end, Platonism is not even wrong, but (disguised) nonsense.

What is the more general outcome of adopting such a philosophical strategy? A ‘therapeutic’ one: the liberation of our minds from the grip of an ‘illegitimate question’ (‘what is mathematics about?’). Such a strategy can be traced back to Heinrich Hertz’s insight in Principles of Mechanics, one of Wittgenstein’s formative readings: “(...) the question [...] will not have been answered; but our minds, no longer vexed, will cease to ask illegitimate questions.” (Hertz [1899/2003, 8]; the question Hertz entertained was about ‘the ultimate nature of force’.)

Before taking up the second issue, it is worth noting that Wittgenstein’s view fails to fit other traditional ‘isms’ too. Wrigley [1977, 50] sketches some of the reasons why he cannot be described as a finitist-constructivist, thus disagreeing with how Dummett [1959] and Bernays [1959] portray him. While following the teachings of this philosophical doctrine would lead to massive changes in mathematics (for example, a big portion of real analysis must be wiped out as meaningless), Wittgenstein notoriously held the view that “philosophy may in no way interfere with the actual use of language; it can in the end only describe it. For it cannot give it a foundation either. It leaves everything as it is. It leaves mathematics as it is, and no mathematical discovery can advance it” (PI §124). Another marked difference between Wittgenstein and the intuitionists is discussed by Gerrard [1996, 196, fn 37]. Brouwer sees mathematics as divorced from the language in which it is done, and says: “FIRST ACT OF INTUITIONISM Completely separating mathematics from mathematical language…recognizing that intuitionistic mathematics is an essentially languageless activity of the mind…” (cited in D. van Dalen [1981, 4]). Yet, as we will see below, it is a constant of Wittgenstein’s view that mathematics cannot be separated from a language and a human practice.

Moreover, Wittgenstein also distances himself from the logicism of Frege and Russell. The previous point about the status of mathematical formulae (that is, not being propositions) explains why. Insofar as mathematical formulae aren’t propositional truths, the question whether they are, or can be reduced to, a particular kind of truths – namely, truths of logic – simply does not arise. But Wittgenstein’s dissatisfaction with logicism goes further than that; he is unhappy with the ‘epistemically reductive’ character of this doctrine, and emphasizes the “motley” of mathematical techniques of proof (RFM III-46A, III-48). Formulated in the logicist style, mathematical proofs might gain in precision but surely lose in perspicuity and conceptual elegance. Floyd [2005, 109] summarizes the point: “That is not to say that there are no uses for formalized proofs (for example, in running computer programs). It is to say that a central part of the challenge of presenting proofs in mathematics involves synoptic designs and models, the kind of manner of organizing concepts and phenomena that is evinced in elementary arguments by diagram. This Wittgenstein treated as undercutting the force of logicism as an epistemically reductive philosophy of mathematics: the fact that the derivation of even an elementary arithmetical equation (like 7 + 5 = 12) would be unclearly expressed and unwieldy in (Russell's and Alfred North Whitehead's) Principia Mathematica shows that arithmetic has not been reduced in its essence to the system of Principia (compare RFM III-25, 45, 46).”

Returning to the second issue (how mathematics is different from a game), recall that this concern arises naturally given Wittgenstein’s insistence that mathematical formulae aren’t genuine propositions, but some kind of rules. This takes us to the question as to how Wittgenstein’s conception differs from formalism. The answer must begin by noting that mathematics provides us with “rules for the identity of descriptions” (Fogelin [1987, 214]).

Consider the multiplication 3 × 3 = 9. Obviously enough, one typical role of this equality is to license the replacement of one string of symbols (‘3’, ‘×’, ‘3’) with another symbol (‘9’) in extensional contexts. Wittgenstein does not disagree with this account, but has a more substantial story to tell. These identifications are not (cannot be) mechanical, disembodied actions – rather, they are grounded in ancestral, natural human practices, such as sorting and arranging objects. A simple example, in essence Fogelin’s [1987], should illustrate this point. Suppose we describe to a child the 3 times 3 operation by the following arrangement:

(a)

■    ■    ■

■    ■    ■

■    ■    ■

Next, suppose we produce another arrangement

(b)

■ ■ ■       ■ ■ ■       ■ ■ ■

and tell her that this also describes the 3 × 3 operation. Thus, the identity ‘3 × 3 = 9’ is the rule that the two descriptions are correct and they say the same thing (that is, are ‘identical’, to use Fogelin’s terminology). So, we hope, the child understands that three batches of three objects amount to nine objects.

Now suppose that a few days later the child no longer remembers all the details of this story and conceives of the 3 × 3 multiplication as follows:

(c)

[three batches of three squares, labeled 1., 2. and 3., drawn so that two of the squares are shared between batches – only seven distinct squares appear]

She counts the squares, and reports the result: 7. We protest, and the child gets confused. Her puzzlement originates in her belief that she generated arrangement (c) by doing the same thing we did initially – she considered three batches of three objects, and then counted the objects!

This simple example signals a more serious problem. We recognize here one version of the well-known rule-following ‘paradoxes’ in PI highlighted by Kripke [1982] many years ago: “any course of action can be made out to accord with the rule”, and thus “no course of action could be determined by a rule.” (PI §201) It is impossible to draw a list containing all the directives needed to follow a rule, and no matter what one does, there is an interpretation of one’s actions that makes them accord with the rule. In our case, one just can’t draw a list containing everything that is, and is not, allowed in the manipulation of the little black squares when describing the 3 × 3 operation – thus clearing up all possible confusions. (This is easy to see: suppose one proposes as a regress-stopper adding a rule explicitly excluding superpositions. But this won’t work, since yet another Goodmanesque-Kripkean scenario can be invented in which an arrangement is obtained ‘showing’ that, say, 3 × 3 = 33 because, for instance, one misunderstands what superposition is. And so on. See Goodman [1955], Kripke [1982].)

In essence, here Wittgenstein would urge that it is just a brute fact of nature that we are indeed able to avoid this situation, and sort out our confusion, especially after the teachers intervene and signal the mistake we made. To emphasize, it is a brute fact that when people are asked to represent the multiplication 3 × 3, the (a)-type and (b)-type arrangements predominate, while those of the (c)-type are very rare, and even more so after training.

More generally, it is a given of the human forms of life that we are able to follow rules and orders; that there is a point when we stop ‘interpreting’ them – and just act. Thus, crucial here is our ability to act ‘blindly’ when following directions, as Wittgenstein puts it (PI §219) – that is, blind to “distractions” (Stern [2004, 154-155]), to the many available ways of straying from the rule. Were we not able to act this way, were confusions (such as the ones described above) overwhelmingly prevalent, then the arithmetical practice would not exist to begin with. (It is instructive to peruse Stern’s critique of Fogelin’s [1994, 219-20] misreading of PI §219. But see also Fogelin [2009] for more on these matters.)

So, as a result of training, the child becomes able to recognize that, as Fogelin notes [1987, 215], despite superficial similarities arrangement (c) is not the result of doing ‘the same thing’ we did initially in (a) and (b). While this might be distressing, there is no guarantee that the child will reach this stage of understanding. Moreover, the process is gradual: some children ‘get it’ faster than others. Immersion in the community of ‘mathematicians’, the continual checking and correcting, the social pressure (through low grades), and so forth, help children master the technique of distinguishing what is allowed from what is not – of differentiating what must be ‘turned a blind eye to’ from what really matters. The arithmetical training consists in inculcating in children a certain technique to deploy when presented with situations of the kind discussed here: they understand multiplication when they are able to establish (and recognize) the identity of various descriptions.

The squares example can be invoked to illustrate Wittgenstein’s position regarding formalism. The arithmetical identities are not reducible to mere manipulations of symbols, but come embedded into, and govern the relations of, (arrangement) practices. The arithmetical training consists not only in having the pupils learn the allowed strings of symbols (the multiplication table) by heart, but also, more importantly, in inculcating in them a certain reaction when presented with arrangements of the kind discussed above.

At this point, two aspects of the issue should be distinguished. The first is purely descriptive. In terms of what people actually do, the ‘normal’ situations (arrangements of (a)-type and (b)-type) prevail after training, while the ‘deviant’ ones (arrangements of the (c)-type) are the exception. This is an empirical regularity: with training, most people get the arrangements right most of the time.

The second aspect is normative: this empirical regularity becomes a candidate to be codified into a special form. By doing this, we grant it a new status, that of a norm, or rule – an arbiter of ‘right’ and ‘wrong’. We say that it is incorrect to take an arrangement like (c) to stand for the 3 × 3 operation. The empirical regularity, the uniform application of the technique of multiplication, is thus ‘hardened’ into a rule. When discussing multiplication in LFM X, p. 95, Wittgenstein says the following:

I say to [the trainee], “You know what you’ve done so far. Now do the same sort of thing for these two numbers.”—I assume he does what we usually do. This is an experiment—and one which we may later adopt as a calculation. What does that mean? Well, suppose that 90 per cent do it all one way. I say, “This is now going to be the right result.” The experiment was to show what the most natural way is—which way most of them go. Now everybody is taught to do it—and now there is a right and wrong. Before there was not.

To indicate the change of status, Wittgenstein uses several suggestive metaphors. The first is the road building process:

It is like finding the best place to build a road across the moors. We may first send people across, and see which is the most natural way for them to go, and then build the road that way. (LFM, p. 95)

Here, building the road is a decision, but not an arbitrary one: different crossing paths have been taken over time, but usually one is most often travelled, and thus preferred on a regular basis. It is this one that gradually emerges as the most suited for crossing, and the one which the lasting road will follow.

The second metaphor is legalistic: we regard an empirical regularity as a mathematical proposition when we “deposit [it] in the archives” (LFM, p. 104). Note that what is now in the archives has previously been in circulation, ‘out there’, had a genuine life. On the other hand, what is in the archives is protected, withdrawn from circulation – that is, not open to change and dispute. The relations between the archived items are frozen, solidified. (Note the normative role of archives as well: it is illegal to tamper with them.)

A related metaphor we already encountered above is that of the physical process of condensation: just as an amount of vapor enters a qualitatively different form of aggregation turning into a drop of water, so the empirical-behavioral regularities are turned into arithmetical rules. What happens is that these regularities are ‘condensed’ or, to use Wittgenstein’s own word, “hardened”:

It is as if we had hardened the empirical proposition into a rule. And now we have, not an hypothesis that gets tested by experience, but a paradigm with which experience is compared and judged. And so a new kind of judgment. (RFM VI-22)

The elevation to a new status (performed because of the robust, natural agreement) is indicated by archiving:

Because they all agree in what they do, we lay it down as a rule, and put it in the archives. Not until we do that have we got to mathematics. One of the main reasons for adopting this as a standard, is that it’s the natural way to do it, the natural way to go – for all these people. (LFM, p. 107)

5. Regularities, Rules, Agreement, and Contingency

Wittgenstein’s emphasis on conceiving mathematics as essentially presupposing human practices and human language should not be taken as an endorsement of a form of subjectivism of the sort right-is-what-I-(my-community)-take-to-be-right. Thus, interestingly, there is a sense in which Wittgenstein actually agrees with the line taken by Hardy, Frege and other Platonists insisting on the objectivity of mathematics. What Wittgenstein opposes is not objectivity per se, but the ‘philosophical’ explanation of it. The alternative account he proposes is that arithmetical identities emerge as a special codification of these contingent but extremely robust, objectively verifiable behavioral regularities. (Yet, recall that although the arithmetical propositions owe their origin and relevance to the existence of such regularities, they belong to a different order.) So, what Wittgenstein rejects is a certain “metaphysics of objectivity” (Gerrard [1996, 173]). In essence, he rejects the idea that mathematics is transcendent, that an extra layer of ‘mathematical reality’ is the ultimate judge of what is proved, here and now, by human means. (“We know as much as God does in mathematics” (LFM, p. 104).) Therefore, from this perspective, progress in explaining Wittgenstein’s conception consists in clarifying how he manages to account for the “peculiar inexorability of mathematics” (RFM I-4), while avoiding both Platonism and subjectivism.

A good starting point for this task is understanding Wittgenstein’s view of the relation between mathematical and empirical statements. Unlike some philosophers oftentimes compared with him (such as Quine), he takes mathematical propositions to be non-revisable in light of empirical investigation (“mathematics as such is always measure, not thing measured” RFM, III-75). To see the larger relevance of this point, consider the simple question ‘What if, by putting together two pebbles and another three pebbles, we sometimes get five, sometimes six, and sometimes even four pebbles?’ Importantly, this state of affairs would not be an indication that 2+3=5 is false – and thus needs revision. It would only show that pebbles are not good for demonstrating (and teaching) addition. The mathematical identity is untouchable (as it were), and the only option left to us is to suspect that the empirical context is abnormal. Thus, the next step is to ask whether we face the same anomalies when we use other objects, such as fruits, pencils, books, bottles, fingers, and so forth.

So, let us explore, for a moment, for the sake of the argument, the scenario in which a large variety of ordinary items is subject to this strange variation. Under this assumption, the conclusion to draw is a radical one: were this scenario real, the arithmetical practice of summing could not have even gotten off the ground. This would be the situation defining the supreme pointlessness of arithmetic, which would thus become a mere game of symbols. In such a situation, the truth or falsity of the identity 2+3 = 5 would no longer matter, as it is, in a sense, meaningless. As Wittgenstein puts it,

I want to say: it is essential to mathematics that its signs are also employed in mufti. It is the use outside mathematics, and so the meaning of the signs, that makes the sign-game into mathematics. (RFM IV-41)

The terms ‘sense’ and ‘nonsense’, rather than the terms ‘true’ and ‘false’, bring out the relation of mathematical to nonmathematical propositions. (Wittgenstein's Lectures, Cambridge 1932-35; hereafter AWL, p. 152)

However, the wild variation described above does not exist. It is a contingent, brute natural fact (again!) that we do not live in such a world, but in one in which regularities prevail. Moreover, it is precisely the existence of such regularities – together with, as we will see in a moment, regularities of human behaviour – that makes possible the arithmetical practice in the first place. Wittgenstein, however, realized this rather late, as Steiner [2009] documents. Remarks about the role of empirical regularities are entirely absent in PG and PR; so, according to Steiner, understanding the importance of the empirical regularities for mathematics amounts for Wittgenstein to a “silent revolution”, in addition to the well-known “overt revolution” (the repudiation of the Tractatus).

A closer look at the contingent regularity relevant in this context – behavioural agreement – is now in order. (At PI §206 and 207 Wittgenstein suggests that these regularities form the basis of language itself.) This type of agreement consists in all of us having, roughly, the same natural reactions when presented with the same ‘mathematically’ related situations (arranging, sorting, recognizing shapes, performing one-to-one correspondences, and so forth.) Its existence is supported by the already discussed facts: (i) we can be trained to have these reactions, and (ii) the world itself presents a certain stability, many regular features, including the regularity that people receiving similar training will react similarly in similar situations. (There surely is a neuro-physiological basis for this; cats, unlike dogs, cannot be trained to fetch.)

Yet, as stressed above, it is crucial to note that speaking in terms of behavioural agreement when it comes to understanding the mathematical enterprise should not lead one to believe that Wittgenstein is in the business of undermining the objectivity of mathematics. This is Dummett’s [1959] early influential line of interpretation, describing Wittgenstein as a “full-blooded conventionalist” (or even an “anarchist” – Dummett’s [1978b, 64] famous label when comparing Wittgenstein, “in his later phase”, with the “Bolsheviks” Brouwer and Weyl). According to him, Wittgenstein maintains that at any step in a calculation we could go any way we want – and the only reason that we go the way we usually go is an agreement between us, as the members of the community: in essence, a convention we all accept (and one which, since it might be changed, is entirely contingent). Dummett [1978b, 67] writes: “What makes a […mathematical] answer correct is that we are able to agree in acknowledging it as correct” (cited in Gerrard [1996, 197, fn. 43]).

On this reading, then, a mathematical identity is true by convention; that is, it is taken, accepted as true by all calculators because a convention binds them. However, textual evidence can be amassed against this reading. Wittgenstein does not regard agreement among the members of the community in their opinions on mathematical propositions as establishing their truth-value. Convincing passages illustrating this point can be found virtually everywhere in his later works, and Gerrard [1996] collects several of them.

In PI II, xi, p. 226, Wittgenstein says that “Certainly the propositions ‘Human beings believe that twice two is four’ and ‘Twice two is four’ do not mean the same.” In LFM, p. 240, when talking about ‘mathematical responsibility’, he points out that “(...) one proposition is responsible to another. Given certain principles and laws of deduction, you can say certain things and not others.” Some passages in LFM are most perspicuous: “Mathematical truth isn’t established by their all agreeing that it’s true” (p. 107); also, “it has often been put in the form of an assertion that the truths of logic are determined by a consensus of opinions. Is this what I am saying? No.” (p. 183).

So, it is simply not the case that the truth-value of a mathematical identity is established by convention. Yet behavioural agreement does play a fundamental role in Wittgenstein’s view. This is, however, not agreement in verbal, discursive behaviour, in the “opinions” of the members of the community. It is a different, deeper form of consensus – “of action” (LFM, p. 183; both italics in original), or agreement in “what [people] do” (LFM, p. 107). Tellingly, RFM, VI-30C reads: “The agreement of people in calculation is not an agreement in opinions or convictions.” In PI §241, this is spelled out as “not agreement in opinions but in form of life." Moreover, LFM contains passages like this:

There is no opinion at all; it is not a question of opinion. They are determined by a consensus of action: a consensus of doing the same thing, reacting in the same way. There is a consensus but it is not a consensus of opinion. We all act the same way, walk the same way, count the same way (p. 183-4)

The specific kind of behavioural agreement (in action) is a precondition of the existence of the mathematical practice. The agreement is constitutive of the practice; it must already be in place before we can speak of ‘mathematics.' The regularities of behaviour (we subsequently ‘harden’) must already hold. So, we do not ‘go on’ in calculations (or make up rules) as we wish: it is the existent regularities of behaviour (to be ‘hardened’) that bind us.

Thus, not much is left of the ‘full-blooded conventionalist’ idea. We don’t really have much freedom. Steiner [2009, 12] explains: the rules obtained from these regularities “are the only rules available. (The only degree of freedom is to avoid laying down these rules, not to adopt alternative rules. It is only in this sense that the mathematician is an inventor, not a discoverer.)” We can now see more clearly how this view contrasts with Dummett’s: mathematical identities are not true by convention, since they themselves are conventions, rules, elevated to this new status (‘archived’) from their initial condition of empirical regularities (Steiner [1996, 196]).

While the behavioural agreement constitutes the background for the arithmetical practice, Wittgenstein takes great care to keep it separated from the content of this practice (Gerrard [1996, 191]). As we saw, his view is that the latter (the relations between the already ‘archived’ items) is governed by necessity, not contingency; the background, however, is entirely contingent. As Gerrard observes, this distinction corresponds, roughly, to the one drawn in LFM, p. 241: “We must distinguish between a necessity in the system and a necessity of the whole system.” (See also RFM VI-49: “The agreement of humans that is a presupposition of logic is not an agreement in opinions on questions of logic.”) It is thus conceivable that the background might cease to exist; should it vanish, should people start disagreeing on a large scale on simple calculations or manipulations, then, as discussed, this would be the end of arithmetic – not a rejection of the truth of 2+3=5, but the end of ‘right’ (and ‘wrong’) itself, the moment when such an identity turns into a mere string of symbols whose truth would not matter more than, say, the truth of ‘chess bishops move diagonally.' (Note that this rule is not grounded in a behavioural empirical regularity, but it is merely formal, and arbitrary.)

The very fact of the existence of this background is not amenable to philosophical analysis. The question ‘Why do we all act the same way when confronted with certain (mathematical) situations?’ is, for Wittgenstein, a request for an explanation, and it can only be answered by advancing a theory of empirical science (neurophysiology, perhaps, or evolutionary psychology). This is a question that he, qua philosopher, does not take to be his concern. He sees himself as being in the business of only describing this background, with the avowed goal of drawing attention to its existence and (overlooked) function.

Moreover, it now hopefully becomes clearer how his rejection of ‘philosophical’ explanations, still baffling many readers, is connected to his central strategy of dissolving (rather than solving) philosophical puzzles. (Recall Wittgenstein’s warnings about confusing philosophical questions with empirical ones in The Blue and Brown Books, pp. 18 and 35, and the related diagnosis in RFM VI-31, “Our disease is one of wanting to explain.”) When a philosophical explanation is advanced, it usually takes the Platonist form: “we all agree about mathematical identities because we all ‘grasp’ (‘look at’, are ‘guided by’, and so forth) the same (a-causal, non-spatiotemporal) mathematical entities and their relations.” But this explanation is preposterous – literally, it gets things in the wrong order. As Steiner puts it [1996, 192; emphasis added], “(w)hat Wittgenstein rejects in mathematical Platonism is not so much its doctrine that ‘mathematical objects exist’, but the illusion it fosters that the existence of mathematical objects, and of a special faculty of intuiting them can explain the extraordinary agreement among mathematicians – of all cultures – about what propositions have been proved. To the contrary, Wittgenstein argues, the ‘great’ and ‘surprising’ agreement among mathematicians underlies mathematics; indeed, makes it possible. But the task of philosophy is, and can be, only to describe, not explain, the fundamental role of the regularity of human mathematical behaviour.”

Related to Platonism, ‘mentalism’ is another target of Wittgenstein, as Putnam [1996] notes. This is the idea that rules are followed (and calculations made) because there is something that ‘guides’ the mind in these activities. If we recall the black squares example, the guide must play the role of a regress-stopper, constituting the explanation as to how all possible interpretations and distractions are averted. The mind and this guide form an infallible mechanism delivering the result. This is a supermechanism, as Putnam calls it, borrowing Wittgenstein’s own way to characterize the proposal. (Wittgenstein talks about ‘super-concepts’ at PI §97b, ‘super-likeness’ at PI §389, and so forth.) Putnam [1996, 245-6] makes the point as follows: “Wittgenstein shows that while the words used in philosophers’ explanations of rule following may sometimes hit it off – for example, it is perfectly true that a rule determines the results of a calculation – the minute we suppose that some entity or process has been described that explains How We Are Able to Follow Rules, then we slip into thinking that we have discovered a ‘supermechanism’; moreover, if we try to take these super-mechanisms seriously we fall into absurdities. Wittgenstein sometimes indicates the general character of such ‘explanations’ by putting forward parodistic proposals, for example, that what happens when we follow a rule is that the mind is guided by invisible train tracks (§218), ‘lines along which [the rule] is to be followed through the whole of space’ (§219).”

6. Concluding Remarks

In 1944, Wittgenstein judged that his “chief contribution has been in the philosophy of mathematics” (Monk [1990, 466]). Yet, according to P.M.S. Hacker, Wittgenstein’s reflections on mathematics are the “least influential and least understood” part of his later philosophy [1996, 295]. The later Wittgenstein’s philosophy of mathematics is indeed difficult to characterize and classify, especially within the framework of academic schools. (For instance, while the tendency to call him today an ‘antirealist’ appears to be well supported by textual evidence, one should not ignore the attempts to identify ‘realist’ tendencies in his work. See Diamond [1991] and Conant [1997] for discussions of Wittgenstein’s location within the broader realist-antirealist landscape.) Thus there is ample room for further discussion of his views, and especially for clarifying how his philosophy of mathematics complements, or even augments, his relatively better understood philosophy of language and mind. The relation between his view of mathematics and that of professional mathematicians is yet another interesting and potentially controversial matter worthy of further study – for, if we are to believe him,

A mathematician is bound to be horrified by my mathematical comments, since he has always been trained to avoid indulging in thoughts and doubts of the kind I develop. He has learned to regard them as something contemptible and… he has acquired a revulsion from them as infantile. That is to say, I trot out all the problems that a child learning arithmetic, and so forth, finds difficult, the problems that education represses without solving. I say to those repressed doubts: you are quite correct, go on asking, demand clarification! (PG 381, 1932)

7. References and Further Reading

For comprehensive bibliographical sources, see Sluga and Stern [1996] and Floyd [2005], and, for material available online, see Rodych [2008] and Biletzki and Matar [2009].

a. Wittgenstein's Writings

  • Tractatus Logico-Philosophicus, [1922], London: Routledge and Kegan Paul, 1961; translated by D.F. Pears and B.F. McGuinness.
  • (PI) Philosophical Investigations, [1953]. [2001], 3rd ed., Oxford: Blackwell Publishing. The German text, with a revised English translation by G. E. M. Anscombe.
  • (RFM) Remarks on the Foundations of Mathematics, [1991], Cambridge, MA and London: MIT Press (paperback edition); [1956] 1st ed.; G.H. von Wright, R. Rhees and G.E.M. Anscombe (eds.); translated by G.E.M. Anscombe.
  • (PG) Philosophical Grammar, [1974], Oxford: Basil Blackwell; Rush Rhees, (ed.); translated by Anthony Kenny.
  • (PR) Philosophical Remarks, [1975], Oxford: Basil Blackwell; Rush Rhees, (ed.); translated by Raymond Hargreaves and Roger White.
  • (RPP) Remarks on the Philosophy of Psychology, [1980], Vol. I, Chicago: University of Chicago Press, G.E.M. Anscombe and G.H. von Wright, (eds.), translated by G.E.M. Anscombe.
  • (BB) The Blue and Brown Books, [1964], [1958] 1st ed. Oxford: Blackwell
  • (AWL) Ambrose, Alice, (ed.), [1979], Wittgenstein's Lectures, Cambridge 1932-35, Totowa, NJ: Littlefield and Adams.
  • (LFM) Diamond, Cora, (ed.), [1976], Wittgenstein's Lectures on the Foundations of Mathematics, Ithaca, N.Y.: Cornell University Press.
  • (WVC) Waismann, Friedrich, [1979], Wittgenstein and the Vienna Circle, Oxford: Basil Blackwell; edited by B.F. McGuinness; translated by Joachim Schulte and B.F. McGuinness.

b. Other Sources

  • Anderson, A.R., [1958], ‘Mathematics and the “Language Game”’, The Review of Metaphysics II, 446-458.
  • Balaguer, Mark [1998], Platonism and Anti-Platonism in Mathematics, Oxford: Oxford University Press.
  • Bays, Timothy, [2004], “On Floyd and Putnam on Wittgenstein on Gödel,” Journal of Philosophy, CI.4, April: 197-210.
  • Benacerraf, Paul, [1973], “Mathematical Truth”, Journal of Philosophy, 70(19): 661–679.
  • Benacerraf, Paul and Putnam, Hilary (eds.), [1983] 2nd ed., Philosophy of Mathematics: Selected Readings, Cambridge: Cambridge University Press. ([1964] 1st ed.)
  • Bernays, Paul, [1935], “On Platonism in Mathematics”, Reprinted in Benacerraf and Putnam [1983].
  • Bernays, Paul, [1959], “Comments on Ludwig Wittgenstein's Remarks on the Foundations of Mathematics,” Ratio, Vol. 2, Number 1, 1-22.
  • Biletzki, Anat and Matar, Anat, [2009], "Ludwig Wittgenstein", The Stanford Encyclopedia of Philosophy (Summer 2011 Edition), Edward N. Zalta (ed.).
  • Burgess, John P. and Rosen, Gideon, [1997], A Subject with No Object, Oxford: Oxford University Press.
  • Cole, Julian, [2010], “Mathematical Platonism”, The Internet Encyclopedia of Philosophy, ISSN 2161-0002, 22 Oct 2011
  • Conant, James, [1997], “On Wittgenstein's Philosophy of Mathematics,” Proceedings of the Aristotelian Society, New Series, Vol. 97, pp. 195-222.
  • Dalen, D. van, [1981], Brouwer’s Cambridge Lectures on Intuitionism, Cambridge: Cambridge University Press.
  • Diamond, Cora, [1991], The Realistic Spirit, Cambridge, MA: MIT Press.
  • Dummett, Michael, [1959], “Wittgenstein's Philosophy of Mathematics,” The Philosophical Review, Vol. LXVIII: 324-348.
  • Dummett, Michael, [1978a], “Platonism”, in Truth and Other Enigmas. Cambridge, MA: Harvard University Press pp. 202-15
  • Dummett, Michael, [1978b], “Reckonings: Wittgenstein on Mathematics,” Encounter, Vol. L, No. 3 (March 1978): 63-68.
  • Floyd, Juliet, [1991], “Wittgenstein on 2, 2, 2…: The Opening of Remarks on the Foundations of Mathematics,” Synthese 87: 143-180.
  • Floyd, Juliet, and Putnam, Hilary, [2000], “A Note on Wittgenstein's “Notorious Paragraph” about the Gödel Theorem,” The Journal of Philosophy, Volume XCVII, Number 11, November: 624-632.
  • Floyd, Juliet, [2001], “Prose versus Proof: Wittgenstein on Gödel, Tarski, and Truth,” Philosophia Mathematica 3, Vol. 9: 280-307.
  • Floyd, Juliet, [2005], “Wittgenstein on Philosophy of Logic and Mathematics,” in The Oxford Handbook of Philosophy of Logic and Mathematics, S. Shapiro (ed.), Oxford: Oxford University Press: 75-128.
  • Floyd, Juliet, and Putnam, Hilary, [2006], “Bays, Steiner, and Wittgenstein's ‘Notorious’ Paragraph about the Gödel Theorem,” The Journal of Philosophy, February, Vol. CIII, No. 2: 101-110.
  • Fogelin, Robert J., [1994], [1987; 1st ed. 1976], Wittgenstein, Second Edition, New York: Routledge & Kegan Paul.
  • Fogelin, Robert J., [2009], Taking Wittgenstein at His Word Princeton: Princeton University Press, 2009
  • Gerrard, Steve, [1991], “Wittgenstein's Philosophies of Mathematics,” Synthese 87: 125-142.
  • Gerrard, Steve, [1996], “A Philosophy of Mathematics Between Two Camps,” in The Cambridge Companion to Wittgenstein, Sluga and Stern, (eds.), Cambridge: Cambridge University Press: 171-197.
  • Gödel, Kurt, [1947], "What is Cantor's Continuum Problem?", Amer. Math. Monthly, 54: 515–525.
  • Goodman, Nelson, [1955], Fact, Fiction, & Forecast, Cambridge, MA: Harvard University Press.
  • Hacker, P. M. S., [1996], Wittgenstein’s Place in Twentieth-Century Analytic Philosophy, Blackwell.
  • Hardy, G. H., [1929], “Mathematical Proof,” Mind 38: 149.
  • Hardy, G. H., [1967], A Mathematician’s Apology, Cambridge: Cambridge University Press.
  • Hertz, Heinrich, [1899, 2003], The Principles of Mechanics, Robert S. Cohen (ed.), D.E. Jones and J.T. Walley (trans.), New York: Dover Phoenix Editions 1956.
  • Kreisel, Georg, [1958], “Wittgenstein's Remarks on the Foundations of Mathematics,” British Journal for the Philosophy of Science 9, No. 34, August 1958, 135-57.
  • Kripke, Saul A., [1982], Wittgenstein on Rules and Private Language, Cambridge, Mass.: Harvard University Press.
  • Linnebo, Øystein, [2009] "Platonism in the Philosophy of Mathematics", The Stanford Encyclopedia of Philosophy (Fall 2009 Edition), Edward N. Zalta (ed.).
  • Maddy, Penelope, [1997], Naturalism in Mathematics Oxford Univ. Press
  • Monk, Ray [1990] Ludwig Wittgenstein: The Duty of Genius Macmillan Free Press, London.
  • Morton, Adam, and Stich, Stephen, [1996], Benacerraf and His Critics, Oxford: Blackwell.
  • Putnam, Hilary, [1996], ‘On Wittgenstein's Philosophy of Mathematics’ Proceedings of the Aristotelian Society, Supplementary Volumes, Vol. 70 (1996), pp. 243-265
  • Rodych, Victor, [1999], “Wittgenstein's Inversion of Gödel's Theorem,” Erkenntnis, Vol. 51, Nos. 2/3: 173-206.
  • Rodych, Victor, [2006], “Who Is Wittgenstein's Worst Enemy?: Steiner on Wittgenstein on Gödel,” Logique et Analyse, Vol. 49, No. 193: 55-84.
  • Rodych, Victor, [2008], "Wittgenstein's Philosophy of Mathematics", The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.).
  • Shanker, Stuart, [1988], “Wittgenstein's Remarks on the Significance of Gödel's Theorem,” in Shanker, S. (ed.) Gödel's Theorem in Focus, Routledge; pp. 155-256.
  • Shapiro, Stewart, [2000], Thinking about Mathematics. The Philosophy of Mathematics. Oxford University Press.
  • Shapiro, Stewart (ed.), [2005], The Oxford Handbook of Philosophy of Logic and Mathematics, Oxford: Oxford University Press.
  • Sluga, Hans, and Stern, David G., (eds.), [1996], The Cambridge Companion to Wittgenstein, Cambridge: Cambridge University Press.
  • Steiner, Mark, [1996], “Wittgenstein: Mathematics, Regularities, Rules,” in Benacerraf and His Critics, Morton and Stich (eds.), Oxford: Blackwell: 190-212.
  • Steiner, Mark, [2001], “Wittgenstein as His Own Worst Enemy: The Case of Gödel's Theorem,” Philosophia Mathematica, Vol. 9, Issue 3: 257-279.
  • Steiner, Mark, [2009], “Empirical Regularities in Wittgenstein's Philosophy of Mathematics” Philosophia Mathematica 17 (1): 1-34
  • Stern, David, [1991], “The ‘Middle Wittgenstein’: From Logical Atomism to Practical Holism” Synthese 87: 203-26.
  • Stern, David, [2004], Wittgenstein’s Philosophical Investigations. An Introduction. Cambridge Univ. Press.
  • Tymoczko, T, [1984], “Gödel, Wittgenstein and the Nature of Mathematical Knowledge” PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, Vol. II: Symposia and Invited Papers, pp. 449-468
  • Wright, Crispin, [1980], Wittgenstein on the Foundations of Mathematics, London: Duckworth.
  • Wrigley, Michael, [1977], “Wittgenstein's Philosophy of Mathematics,” Philosophical Quarterly, Vol. 27, No. 106: 50-9.

Author Information

Sorin Bangu
Email: sorin.bangu@fof.uib.no
University of Bergen
Norway

Lambda Calculi

Lambda calculi (λ-calculi) are formal systems describing functions and function application. One of them, the untyped version, is often referred to as the λ-calculus. This exposition will adopt this convention. At its core, the λ-calculus is a formal language with certain reduction rules intended to capture the notion of function application [Church, 1932, p. 352]. Through the primitive notion of function application, the λ-calculus also serves as a prototypical programming language. The philosophical significance of the calculus comes from the expressive power of a seemingly simple formal system. The λ-calculus is in fact a family of formal systems, all stemming from the same common root. The variety and expressiveness of these calculi yield results in formal logic, recursive function theory, the foundations of mathematics, and programming language theory. After a discussion of the history of the λ-calculus, we will explore the expressiveness of the untyped λ-calculus. Then we will investigate the role of the untyped lambda calculus in providing a negative answer to Hilbert’s Entscheidungsproblem and in forming the Church-Turing thesis. After this, we will take a brief foray into typed λ-calculi and discuss the Curry-Howard isomorphism, which provides a correspondence between these calculi (which again are prototypical programming languages) and systems of natural deduction.

Table of Contents

  1. History
    1. Main Developments
    2. Notes
  2. The Untyped Lambda Calculus
    1. Syntax
    2. Substitution
    3. α-equivalence
    4. β-equivalence
    5. η-equivalence
    6. Notes
  3. Expressiveness
    1. Booleans
    2. Church Numerals
    3. Notes
  4. Philosophical Importance
    1. The Church-Turing Thesis
    2. An Answer to Hilbert’s Entscheidungsproblem
    3. Notes
  5. Types and Programming Languages
    1. Overview
    2. Notes
  6. References and Further Reading

1. History

a. Main Developments

Alonzo Church first introduced the λ-calculus as “A set of postulates for the foundation of logic” in two papers of that title published in 1932 and 1933. Church believed that “the entities of formal logic are abstractions, invented because of their use in describing and systematizing facts of experience or observation, and their properties, determined in rough outline by this intended use, depend for their exact character on the arbitrary choice of the inventor” [Church, 1932, p. 348]. The intended use of the formal system Church developed was, as mentioned in the introduction, function application. Intuitively, the expression (later called a term) λx.x2 corresponds to an anonymous definition of the mathematical function f(x) = x2. An “anonymous definition” of a function refers to a function defined without a name; in the current case, instead of defining a function “f”, an anonymous definition corresponds to the mathematician’s style of defining a function as a mapping, such as “x ↦ x2”. Do note that the operation of squaring is not yet explicitly defined in the λ-calculus. Later, it will be shown how it can be. For our present purposes, the use of squaring is pedagogical. By limiting the use of free variables and the law of excluded middle in his system in certain ways, Church hoped to escape paradoxes of transfinite set theory [Church, 1932, p. 346–8]. The original formal system of 1932–1933 turned out, however, not to be consistent. In it, Church defined many symbols besides function definition and application: a two-place predicate for extensional equality, an existential quantifier, negation, conjunction, and the unique solution of a function. In 1935, Church’s students Kleene and Rosser, using this full range of symbols and Gödel’s method of representing the syntax of a language numerically so that a system can express statements about itself, proved that any formula is provable in Church’s original system.
In 1935, Church isolated the portion of his formal system dealing solely with functions and proved the consistency of this system. One idiosyncratic feature of the system of 1935 was eliminated in Church’s 1936b paper, which introduced what is now known as the untyped λ-calculus. This 1936 paper also provides the first exposition of the Church-Turing thesis and a negative answer to Hilbert’s famous problem of determining whether a given formula in a formal system is provable. We will discuss these results later.

b. Notes

For a thorough history of the λ-calculus, including many modern developments, with a plethora of pointers to primary and secondary literature, see Cardone and Hindley [2006]. Church [1932] is very accessible and provides some insight into Church’s thinking about formal logic in general. Though the inconsistency proof of Kleene and Rosser [1935] is a formidable argument, the first page of their paper provides an overview of their proof strategy which should be comprehensible to anyone with some mathematical or logical background and a rough understanding of the arithmetization of syntax à la Gödel. According to Barendregt [1997], Church originally intended to use the notation ˆx.x2. His original typesetter could not typeset this, and so moved the hat in front of the x, which another typesetter then read as λx.x2. In an unpublished letter, Church writes that he placed the hat in front, not a typesetter. According to Cardone and Hindley [2006, p. 7], Church later contradicted his earlier account, telling enquirers that he was in need of a symbol and just happened to pick λ.

2. The Untyped Lambda Calculus

At its most basic level, the λ-calculus is a formal system with a concrete syntax and distinct reduction rules. In order to proceed properly, we must define the alphabet and syntax of our language and then the rules for forming and manipulating well-formed formulas in this language. In the process of this exposition, formal definitions will be given along with informal description of the intuitive meaning behind the expressions.

a. Syntax

The language of the λ-calculus consists of the symbols (, ), λ and a countably infinite set of variables x, y, z, … We use upper-case variables M, N, … to refer to formulas of the λ-calculus. These are not symbols in the calculus itself, but rather convenient metalinguistic abbreviations. The terms (formulas, expressions; these three will be used interchangeably) of the calculus are defined inductively as follows:
  • x is a term for any variable.
  • If M, N are terms, then (MN) is a term.
  • If x is a variable and M a term, then (λx.M) is a term.
The latter two rules of term formation do most of the work. The second bullet corresponds to function application. The last bullet is called “abstraction” and corresponds to function definition. Intuitively, the notation λx.M corresponds to the mathematical notation of an anonymous function x ↦ M. As you can see from this language definition, everything is a function. The functions in this language can “return” other functions and accept other functions as their input. For this reason, they are often referred to as higher-order functions or first-class functions in programming language theory. We provide a few examples of λ terms before examining evaluation rules and the expressive power of this simple language.
  • λx.x: this represents the identity function, which just returns its argument.
  • λf.(λx.(fx)): this function takes in two arguments and applies the first to the second.
  • λf.(λx.(f(xx)))(λx.(f(xx))): this term is called a “fixed-point combinator” and usually denoted by Y. We will discuss its importance later, but introduce it here to show how more complex terms can be built.
Though I referred to the second term above as accepting two arguments, this is actually not the case. Every function in the λ-calculus takes only one argument as its input (as can be seen by a quick examination of the language definition). We will see later, however, that this does not in any way restrict us and that we can often think of nested abstractions as accepting multiple arguments. Before we proceed, a few notational conventions should be mentioned:
  • We always drop the outermost parentheses. This was already adopted in the three examples above.
  • Application associates to the left: (MNP) is shorthand for ((MN)P).
  • Conversely, abstraction associates to the right: (λx.λy.M) is short for (λx.(λy.M))
  • We can drop λs from repeated abstractions: λx.λy.M can be written λxy.M and similarly for any length sequence of abstractions.
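Since λ-calculi are prototypical programming languages, these conventions can be made concrete in a modern language. Below is a minimal Python sketch using Python's lambda syntax, which descends directly from the λ-calculus (the names identity and apply_ are our own labels, not part of the calculus):

```python
# λx.x : the identity function.
identity = lambda x: x

# λf.(λx.(fx)) : takes f, then x, and applies f to x.
# Nested lambdas mirror nested abstractions.
apply_ = lambda f: lambda x: f(x)

# Application associates to the left: MNP means ((MN)P),
# which in Python is written M(N)(P).
print(identity(42))                  # 42
print(apply_(lambda n: n + 1)(41))   # 42
```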

b. Substitution

Because it will play a pivotal role in determining the equivalence between λ-terms, we must define a notion of variable substitution. This depends on the notion of free and bound variables, which we will define only informally. In a term λx.P, x is a bound variable, as are similarly bound variables in the subterm P; variables which are not bound are free. In other words, free variables are those that occur outside the scope of a λ abstraction on that variable. A term containing no free variables is called a combinator. We can then define the substitution of N for x in term M, denoted M[x := N], as follows:
  • x[x := N] = N
  • y[x := N] = y (provided that x is a variable different than y)
  • (PQ)[x := N] = (P[x := N])(Q[x := N])
  • (λx.P)[x := N] = λx.P
  • (λy.P)[x := N] = λy.P[x := N] (provided that x is different than y and y is not free in N or x is not free in P)
Intuitively, these substitution rules allow us to replace all the free occurrences of a variable (x) with any term (N). We cannot replace the bound occurrences of the variable (or allow for a variable to become bound by a substitution, as accounted for in the last rule) because that would change the “meaning” of the term. The scare quotes are included because the formal semantics of the λ-calculus falls beyond the scope of the present article.
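As a sketch of how these clauses might be implemented, here is a small Python rendering of capture-avoiding substitution over a hypothetical tuple encoding of terms (the tuples ('var', x), ('app', M, N), ('lam', x, M) are our own representation, and the fresh-variable strategy is one simple choice among many):

```python
import itertools

# Terms: ('var', x), ('app', M, N), ('lam', x, M).

def free_vars(t):
    """The set of free variables of a term."""
    tag = t[0]
    if tag == 'var':
        return {t[1]}
    if tag == 'app':
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}          # 'lam' binds t[1]

_fresh = (f"v{i}" for i in itertools.count())

def subst(m, x, n):
    """Compute M[x := N], renaming bound variables to avoid capture."""
    tag = m[0]
    if tag == 'var':
        return n if m[1] == x else m
    if tag == 'app':
        return ('app', subst(m[1], x, n), subst(m[2], x, n))
    y, body = m[1], m[2]                     # m is (λy.body)
    if y == x:                               # (λx.P)[x := N] = λx.P
        return m
    if y in free_vars(n) and x in free_vars(body):
        z = next(_fresh)                     # rename y first, so the free y
        body = subst(body, y, ('var', z))    # in N cannot be captured
        y = z
    return ('lam', y, subst(body, x, n))

# (λy.xy)[x := y]: naive substitution would capture y; here y is renamed.
t = subst(('lam', 'y', ('app', ('var', 'x'), ('var', 'y'))), 'x', ('var', 'y'))
```

Note that the renaming step implements exactly the proviso in the last clause above: substitution proceeds under λy directly only when no capture can occur.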

c. α-equivalence

The α-equivalence relation, denoted ≡α, captures the idea that two terms are equivalent up to a change of bound variables. For instance, though I referred to λx.x as “the” identity function earlier, we also want to think of λy.y, λz.z, et cetera as identity functions. We can define α-equivalence formally as follows:
  • x ≡α x
  • MN ≡α PQ if M ≡α P and N ≡α Q
  • λx.M ≡α λx.N (provided that M ≡α N)
  • λx.M ≡α λy.M[x := y] (provided that y is not free in M)
It will then follow as a theorem that ≡α is an equivalence relation. It should therefore be noted that what was defined earlier as a λ-term is in fact properly called a pre-term. Equipped with the notion of α-equivalence, we can then define a λ-term as the α-equivalence class of a pre-term. This allows us to say that λx.x and λy.y are in fact the same term since they are α-equivalent.

d. β-equivalence

β-reduction, denoted →β, is the principal rewriting rule in the λ-calculus. It captures the intuitive notion of function application. So, for instance, (λx.x)M →β M, which is what we would expect from an identity function. Formally, →β is the least relation on λ-terms satisfying:
  • If M →β N then λx.M →β λx.N
  • If M →β N then MZ →β NZ for any terms M, N, Z
  • If M →β N then ZM →β ZN for any terms M, N, Z
  • (λx.M)N →β M[x := N]
The first three conditions define what is referred to as a compatible relation. →β is then the least compatible relation on terms satisfying the fourth condition, which encodes the notion of function application as variable substitution. We will use the symbol ↠β to denote the iterated (zero or more times) application of →β and =β to denote the smallest equivalence relation containing →β. Two terms M and N are said to be convertible when M =αβ N (=αβ refers to the union of =α and =β). A term of the form (λx.M)N is called a β-redex (or just a redex) because it can be reduced to M[x := N]. A term M is said to be in β-normal form (henceforth referred to simply as normal form) iff there is no term N such that M →β N. That is to say that a term M is in normal form iff M contains no β-redexes. A term M is called normalizing (or weakly normalizing) iff there is an N in normal form such that M ↠β N. A term M is strongly normalizing if every β-reduction sequence starting with M is finite. A term which is not normalizing is called divergent. Regarding β-reduction as the execution of a program, this definition says that divergent programs never terminate, since normal-form terms correspond to terminated programs. Although we do not prove the result here, it is worth noting that the untyped λ-calculus is neither normalizing nor strongly normalizing. For instance, Y as defined earlier does not normalize. The interested reader may try an exercise: show that Y →β Y1 →β Y2 →β Y3 →β …, where Yi ≠ Yj for all i ≠ j. Iterated β-reduction allows us to express functions of multiple variables in a language that has only one-argument functions. Consider the earlier example of the term λf.λx.(fx). Letting F, X denote arbitrary terms, we can evaluate:
(λf.(λx.(fx)))FX →β ((λx.(fx))[f := F])X
= (λx.(Fx))X
→β (Fx)[x := X]
= FX
This result is exactly what we expected. The process of treating a function of multiple arguments as a sequence of single-argument functions is generally referred to as currying. Moreover, ↠β satisfies an important property known either as confluence or as the Church-Rosser property: for any term M, if M ↠β M1 and M ↠β M2, then there exists a term M3 such that M1 ↠β M3 and M2 ↠β M3. Although a proof of this statement requires machinery that lies beyond the scope of the present exposition, it is an important property, and one that is independent of normalization: although the untyped λ-calculus is neither normalizing nor strongly normalizing, it does satisfy the Church-Rosser property.
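The difference between normalizing and divergent terms can be felt directly in Python, which evaluates applications strictly. Below is our transcription of the standard divergent term Ω = (λx.xx)(λx.xx), which has no normal form; the caught RecursionError stands in for a reduction sequence that never terminates:

```python
# ω = λx.xx ; Ω = ωω has no normal form: each β-step yields Ω again.
omega = lambda x: x(x)

# (λx.x)M is a redex that reduces at once to M: evaluation terminates.
identity = lambda x: x
assert identity(omega) is omega

# Evaluating Ω recurses forever; CPython cuts it off with RecursionError.
diverged = False
try:
    omega(omega)
except RecursionError:
    diverged = True
print(diverged)  # True
```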

e. η-equivalence

Another common notion of reduction is →η, which is the least compatible relation (i.e. a relation satisfying the first three items in the definition of →β) such that
  • λx.Mx →η M (provided that x is not free in M)
As before, =η is the least equivalence relation containing →η. The notion of η-equivalence captures the idea of extensionality, which can be broadly stated as the fact that two objects (of whatever kind) are equal if they share all their external properties, even if they may differ on internal properties. (The notion of external and internal properties used here is meant to be intuitive only.) η-equivalence is a kind of extensionality statement for the untyped λ-calculus because it states that any function is equal to the function which takes an argument and applies the original function to this argument. In other words, two functions are equivalent when they yield the same output on the same input. This assumption is an extensionality one because it ignores any differences in how the two functions compute that output.

f. Notes

In its original formulation, Church allowed abstraction to occur only over free variables in the body of the function. While this makes intuitive sense (i.e. the body of the function should depend on its input), the requirement does not change any properties of the calculus and has generally been dropped. Barendregt [1984] is the most thorough work on the untyped λ-calculus. It is also very formal and potentially impenetrable to a new reader. Introductions with learning curves that are not quite as steep include Hindley and Seldin [2008] and Hankin [2004]. The first chapter of Sørensen and Urzyczyn [2006] contains a quick, concise tour through the material, with some intuitive motivation. Other less formal expositions include the entry on Wikipedia and assorted unpublished lecture notes to be found on the web. Not every aspect of the untyped λ-calculus was covered in this section. Moreover, notions such as η-equivalence and the Church-Rosser property were introduced only briefly. These are important philosophically. For more discussion and elaboration on the important technical results, the interested reader can learn more from the above cited sources. We also omit any discussion of the formal semantics of the untyped λ-calculus, originally developed in the 1960s by Dana Scott and Christopher Strachey. Again, Barendregt’s book contains a formal exposition of this material and extensive pointers to further reading on it.

3. Expressiveness

Though the language and reduction rules of the untyped λ-calculus may at first glance appear somewhat limiting, we will see in this section how expressive the calculus can be. Because everything is a function, our general strategy will be to define some particular terms to represent common objects like booleans and numbers and then show that certain other terms properly codify functions of booleans and numbers in the sense that they β-reduce as one would expect. Rather than continuing these general remarks, however, we will apply this methodology to the particular cases of booleans and natural numbers. Throughout this exposition, bold face will be used to name particular terms.

a. Booleans

We first introduce the basic definitions:
true = λxy.x
false = λxy.y
In other words, true is a function of two arguments which returns its first argument and false is one which returns the second argument. To see how these two basic definitions can be used to encode the behavior of the booleans, we first introduce the following definition:
ifthenelse = λxyz.xyz
The statement ifthenelse PQR should be read as “if P then Q else R”. It should also be noted that ifthenelse PQR ↠β PQR; oftentimes only this reduct is presented as standing for the linguistic expression quoted above. Note that ifthenelse actually captures the desired notion. When P = true, we have that
ifthenelse true QR ↠β true QR
= (λxy.x)QR
→β (λy.Q)R
→β Q
An identical argument will show that ifthenelse false QR ↠β R. The above considerations will also hold for any P such that P ↠β true or P ↠β false. We will now show how a particular boolean connective (and) can be encoded in the λ-calculus and then list some further examples, leaving it to the reader to verify their correctness. We define
and = λpq.pqp
To see this definition’s adequacy, we can manually check the four possible truth-values of the “propositional variables” p and q. Although any λ-term can serve as an argument for a boolean function, these functions encode the truth tables of the connectives in the sense that they behave as one would expect when acting on boolean expressions.
and true true = (λpq.pqp)(λxy.x)(λxy.x)
→β ((λq.pqp)[p := λxy.x])(λxy.x)
= (λq.((λxy.x)q(λxy.x)))(λxy.x)
→β ((λxy.x)q(λxy.x))[q := λxy.x]
= (λxy.x)(λxy.x)(λxy.x)
↠β λxy.x
= true
Similarly, we see that
and true false = (λpq.pqp)(λxy.x)(λxy.y)
→β ((λq.pqp)[p := λxy.x])(λxy.y)
= (λq.((λxy.x)q(λxy.x)))(λxy.y)
→β (λxy.x)(λxy.y)(λxy.x)
↠β λxy.y
= false
We leave it to the reader to verify that and false true ↠β false and that and false false ↠β false. Together, these results imply that our definition of and encodes the truth table for “and” just as we would expect it to do. The following list of connectives similarly encode their respective truth tables. Because we think it fruitful for the reader to work out the details to gain a fuller understanding, we simply list these without examples:
or = λpq.ppq
not = λpxy.pyx
xor = λpq.p(q false true)q
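These encodings can be transcribed directly into Python and executed. In the sketch below, the helper decode, which applies a Church boolean to the native values True and False, is our own convenience for display and is not part of the calculus:

```python
# Church booleans: two-argument selectors.
true  = lambda x: lambda y: x
false = lambda x: lambda y: y

ifthenelse = lambda p: lambda q: lambda r: p(q)(r)

and_ = lambda p: lambda q: p(q)(p)
or_  = lambda p: lambda q: p(p)(q)
not_ = lambda p: lambda x: lambda y: p(y)(x)

def decode(b):
    """Turn a Church boolean into a native Python bool (display only)."""
    return b(True)(False)

print(decode(and_(true)(false)))    # False
print(decode(or_(true)(false)))     # True
print(decode(not_(false)))          # True
print(ifthenelse(true)("Q")("R"))   # Q
```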

b. Church Numerals

At the very end of his 1933 paper, Church introduced an encoding for numbers in the untyped λ-calculus that allows arithmetic to be carried out in this purely functional setting. Because the ramifications were not fully explored until his 1936 paper, we hold off on discussion of the import of this encoding and focus here on an exposition of the ability of the λ-calculus to compute functions of natural numbers. The numerals are defined as follows:
1 = λab.ab
2 = λab.a(ab)
3 = λab.a(a(ab))
Successor, addition, multiplication, predecessor, and subtraction can also be defined as follows:
succ = λnab.a(nab)
plus = λmnab.ma(nab)
mult = λmna.n(ma)
pred = λnab.n(λgh.h(ga))(λu.b)(λu.u)
minus = λmn.(n pred)m
As in the case of the booleans, an example of using the successor and addition functions will help show how these definitions functionally capture arithmetic.
succ 1 = (λnab.a(nab))(λab.ab)
→β (λab.a(nab))[n := λab.ab]
= λab.a((λab.ab)ab)
Reducing the inner redex: (λab.ab)ab →β (λb.ab)b →β ab, so
λab.a((λab.ab)ab) ↠β λab.a(ab)
= 2
Similarly, an example of addition:
plus 1 2 = (λmnab.ma(nab))(λab.ab)(λab.a(ab))
→β ((λnab.ma(nab))[m := λab.ab])(λab.a(ab))
= (λnab.(λab.ab)a(nab))(λab.a(ab))
→β (λab.(λab.ab)a(nab))[n := λab.a(ab)]
= λab.(λab.ab)a((λab.a(ab))ab)
Reducing the inner redexes: (λab.ab)a →β λb.ab and (λab.a(ab))ab ↠β a(ab), so
λab.(λab.ab)a((λab.a(ab))ab) ↠β λab.a(a(ab))
= 3
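The numerals and arithmetic operators can likewise be run in Python. The helper decode, which counts how many times a numeral applies its first argument, is our own addition for inspecting results:

```python
# Church numerals: n = λab.a(…(ab)…), with a applied n times to b.
one   = lambda a: lambda b: a(b)
two   = lambda a: lambda b: a(a(b))
three = lambda a: lambda b: a(a(a(b)))

succ = lambda n: lambda a: lambda b: a(n(a)(b))
plus = lambda m: lambda n: lambda a: lambda b: m(a)(n(a)(b))
mult = lambda m: lambda n: lambda a: n(m(a))
pred = lambda n: lambda a: lambda b: \
    n(lambda g: lambda h: h(g(a)))(lambda u: b)(lambda u: u)

def decode(n):
    """Count how many times a numeral applies its first argument."""
    return n(lambda k: k + 1)(0)

print(decode(succ(one)))         # 2
print(decode(plus(one)(two)))    # 3
print(decode(mult(two)(three)))  # 6
print(decode(pred(three)))       # 2
```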
Since these examples capture the appropriate functions on natural numbers in terms of Church numerals, we can introduce the notion of λ-definability in the way that Church did in 1936. A function f of one natural number is said to be λ-definable if it is possible to find a term f such that if f(m) = r, then fm ↠β r, where m and r are the Church numerals of m and r. The arithmetical functions defined above clearly fit this description. The astute reader may have anticipated the result, proved in Kleene [1936], that the λ-definable functions coincide exactly with the partial recursive functions. This quite remarkable result completely encapsulates the arithmetical expressiveness of the untyped λ-calculus. While the import of the result will be discussed in section 4, I will briefly discuss how fixed-point combinators allow recursive functions to be encoded in the λ-calculus. Recall that a combinator is a term with no free variables. A combinator M is called a fixed-point combinator if for every term N, we have that MN =β N(MN). In particular, Y, as defined before, is a fixed-point combinator, since
YF ≡ (λf.(λx.f(xx))(λx.f(xx)))F
→β (λx.F(xx))(λx.F(xx))
→β (F(xx))[x := λx.F(xx)]
= F((λx.F(xx))(λx.F(xx)))
=β F(YF)
Although the technical development of the result lies outside the scope of the present exposition, one must note that fixed-point combinators allow the minimization operator to be defined in the λ-calculus. Therefore, all (partial) recursive functions on natural numbers—not just the primitive recursive ones (which are a proper subset of the partial recursive functions)—can be encoded in the λ-calculus.
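A taste of this in Python: the term Y itself loops forever under Python's strict (call-by-value) evaluation, but its η-expanded variant, usually called Z = λf.(λx.f(λv.xxv))(λx.f(λv.xxv)), delays the self-application and lets us tie the recursive knot. The factorial step function below is our own illustrative choice:

```python
# Z: a fixed-point combinator that works under strict evaluation.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# A non-recursive "step" whose fixed point is the factorial function:
# given the recursive call rec, compute the factorial of n.
fact_step = lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1)

factorial = Z(fact_step)
print(factorial(5))  # 120
```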

c. Notes

In the exposition of booleans, connectives such as and, or, et cetera, were given in a very general formulation. They can also be defined more concretely in terms of true and false in a manner that may be closer to capturing their truth tables. For instance, one may define and = λpq.p(q true false) false and verify that it also captures the truth table properly even though it does not directly β-reduce to the definition presented above. In the 1933 paper, Church defined plus, mult, and minus somewhat differently than presented here. This original exposition used some of the idiosyncrasies of the early version of the λ-calculus that were dropped by the time of the 1936 paper. The predecessor function was originally defined by Kleene, in a story which will be discussed in section 4. In addition to the original Kleene paper, more modern expositions of the result that the partial recursive functions are co-extensive with the λ-definable functions can be found in Sørensen and Urzyczyn [2006, pp. 20–22] and Barendregt [1984, pp. 135–139], among others.

4. Philosophical Importance

The expressiveness of the λ-calculus has allowed it to play a critical role in debates at the foundations of mathematics, logic and programming languages. At the end of the introduction to his 1936 paper, Church writes that “The purpose of the present paper is to propose a definition of effective calculability which is thought to correspond satisfactorily to the somewhat vague intuitive notion in terms of which problems of this class are often stated, and to show, by means of an example, that not every problem of this class is solvable.” The first subsection below will explore the definition of effective calculability in terms of λ-definability. We will then explore the second purpose of the 1936 paper, namely that of proving a negative answer to Hilbert’s Entscheidungsproblem. In the course of both of these expositions, connections will be made with the work of other influential logicians (Gödel and Turing) of the time. Finally, we will conclude with a brief foray into more modern developments in typed λ-calculi and the correspondence between them and logical proofs, a correspondence often referred to as “proofs-as-programs” for reasons that will become clear.

a. The Church-Turing Thesis

In his 1936 paper, in a section entitled “The notion of effective calculability”, Church writes that “We now define the notion, already discussed, of an effectively calculable function of positive integers by identifying it with the notion of a recursive function of positive integers (or a λ-definable function of positive integers)” [Church, 1936b, p. 356, emphasis in original]. The most striking feature of this definition, which will be discussed in more depth in what follows, is that it defines an intuitive notion (effective calculability) with a formal notion. In this sense, some regard the thesis as more of an unverifiable hypothesis than a definition. In what follows, we will consider, in addition to the development of alternative formulations of the thesis, what qualifies as evidence in support of this thesis. In the preceding parts of Church’s paper, he claims to have shown that for any λ-definable function, there exists an algorithm for calculating its values. The idea of an algorithm existing to calculate the values of a function (F) plays a pivotal role in Church’s thesis. Church argues that an effectively calculable function of one natural number must have an algorithm, consisting in a series of expressions (“steps” in the algorithm) that lead to the calculated result. Then, because each of these expressions can be represented as a natural number using Gödel’s methodology, the functions that act on them to calculate F will be recursive and therefore F will be as well. In short, a function which has an algorithm that calculates its values will be effectively calculable and any effectively calculable function of one natural number has such an algorithm. Furthermore, if such an algorithm exists, it will be λ-definable (via the formal result that all recursive functions are λ-definable). Church offers a second notion of effective calculability in terms of provability within an arbitrary logical system. 
Because he can show that this notion also coincides with recursiveness, he concludes that “no more general definition of effective calculability than that proposed above can be obtained by either of the two methods which naturally suggest themselves” [Church, 1936b, p. 358]. The history of this particular version of the thesis has no clear story. Kleene allegedly realized how to define the predecessor function (as given in the previous section: Kleene had earlier defined the predecessor function using an encoding of numerals different from those of Church) while at the dentist for the removal of two wisdom teeth. For a first-hand account, see Crossley [1975, pp. 4–5]. According to Barendregt [1997, p. 186]:
After Kleene showed the solution to his teacher, Church remarked something like: “But then all intuitively computable functions must be lambda definable. In fact, lambda definability must coincide with intuitive computability.” Many years later—it was at the occasion of Robin Gandy’s 70th birthday, I believe—I heard Kleene say: “I would like to be able to say that, at the moment of discovering how to lambda define the predecessor function, I got the idea of Church’s Thesis. But I did not, Church did.”
Alan Turing, in his groundbreaking paper also published in 1936, developed the notion of computable numbers in terms of his computing machines. While Turing believes that any definition of an intuitive notion will be “rather unsatisfactory mathematically,” he frames the question in terms of computation as opposed to effective calculability: “The real question at issue is ‘what are the possible processes which can be carried out in computing a number?’ ” [Turing, 1937a, p. 249] By analyzing the way that a computer—in fact, an idealized human computer—would use a piece of paper to carry out this process of computing a number, which he reduces to observing a finite number of squares on a one-dimensional paper and changing one of the squares or another within a small distance of the currently observed squares, Turing shows how this intuitive notion of computing can be captured by one of his machines. Thus, he claims to have captured the intuitive notion of computability with computability by Turing machines. In the appendix of this paper added later, Turing also offers a sketch of a proof that λ-definability and his computability capture the same class of functions. In a subsequent paper, he also suggests that “The identification of ‘effectively calculable’ functions with computable functions is possibly more convincing than an identification with the λ-definable or general recursive functions” [Turing, 1937b, p. 153]. While Turing may believe that his machines are closer to the intuitive notion of calculating a number, the fact that the three classes of functions are identical means they can be used interchangeably. This fact has often been considered to count as evidence that these formalisms do in fact capture an intuitive notion of calculability. 
Because Turing’s analysis of the notion of computability often makes reference to a human computer, with squares on paper referred to as “states of mind”, the Church-Turing thesis, along with Turing’s result that there is a universal machine which can compute any function that any Turing machine can, has often been interpreted, in conjunction with the computational theory of mind, as having profound results on the limits of human thought. For a detailed analysis of many misinterpretations of the thesis and what it does imply in the context of the nature of mind, see Copeland [2010].

b. An Answer to Hilbert’s Entscheidungsproblem

In a famous lecture delivered at the meeting of the International Congress of Mathematicians in 1900, David Hilbert proposed 23 problems in need of solution. The second of these was to give a proof of the consistency of analysis without reducing it to a consistency proof of another system. Though much of Hilbert’s thought and the particular phrasing of this problem changed over the ensuing 30 years, the challenge to prove the consistency of certain formal systems remained. At a 1928 conference, Hilbert posed three problems that continued his general program. The third of these problems was the Entscheidungsproblem in the sense that Church used the term: “By the Entscheidungsproblem of a system of symbolic logic is here understood the problem to find an effective method by which, given any expression Q in the notation of the system, it can be determined whether or not Q is provable in the system” [Church, 1936a, p. 41]. In his 1936b paper, Church used the λ-calculus to prove a negative solution to this problem for a wide and natural class of formal systems. The technical result proved in this paper is that it is undecidable, given two λ-terms A and B, whether A is convertible into B. Church argues that any system of logic that is strong enough to express a certain amount of arithmetic will be able to express the formula ψ(a, b), which encodes that a and b are the Gödel numbers of A and B such that A is convertible into B. If the system of logic satisfied the Entscheidungsproblem, then it would be decidable for every a, b whether ψ(a, b) is provable; but this would provide a way to decide whether A is convertible into B for any terms A and B, contradicting Church’s already proven result that this problem is not decidable. Given that we have just seen the equivalence between Turing machine computability and λ-definability, it should come as no surprise that Turing also provided a negative solution to the Entscheidungsproblem.
Turing proceeds by defining a formula of one variable which takes as input the complete description of a Turing machine and is provable if and only if the machine input ever halts. Therefore, if the Entscheidungsproblem can be solved, then it provides a method for proving whether or not a given machine halts. But Turing had shown earlier that this halting problem cannot be solved. These two results, in light of the Church-Turing Thesis, imply that there are functions of natural numbers that are not effectively calculable/computable. It remains open, however, whether there are computing machines or physical processes that elude the limitations of Turing computability (equivalently, λ-definability) and whether or not the human brain may embody one such process. For a thorough discussion of computation in physical systems and the resulting implications for philosophy of mind, see Piccinini [2007].

c. Notes

Kleene [1952] offers (in chapters 12 and 13) a detailed account of the development of the Church-Turing thesis from a figure who was central to the logical developments involved. Turing’s strategy of reducing the Entscheidungsproblem to the halting problem has been adopted by many researchers tackling other problems. Kleene [1952] also discusses Turing computability. The halting problem, as outlined above, however, takes an arbitrary machine and an arbitrary input. There are, of course, particular machines and particular inputs for which it can be decided whether the machine halts on that input. For instance, every finite-state automaton halts on every input.

5. Types and Programming Languages

a. Overview

In 1940, Church gave a “formulation of the simple theory of types which incorporates certain features of λ-conversion” [Church, 1940, p. 56]. Though the history of simple types extends beyond the scope of this article and Church’s original formulation extended beyond the syntax of the untyped λ-calculus (in a way somewhat similar to the original, inconsistent formulation), one can intuitively think of types as syntactic decorations that restrict the formation of terms so that functions may only be applied to appropriate arguments. In informal mathematics, a function f : ℕ → ℕ cannot be applied to a fraction. We now define a set of types and a new notion of term that incorporates this idea. We have a set of base types, say σ1, σ2. (In Church’s formulation, ι and o were the only base types. In many programming languages, base types include things such as nat, bool, etc. This set can be countably infinite for the purposes of proof, although no programming language could implement such a type system.) For any types α and β, α → β is also a type. (Church used the notation (βα), but the arrow is used in most modern expositions because it better captures the intuition of a function type.) In what follows, σ, τ and any other lower-case Greek letters will range over types. We will write either e : τ or eτ to indicate that term e is of type τ. The simply typed terms can then be defined as follows:
  • xβ is a term.
  • If xβ is a variable and Mα is a term, then (λxβ.Mα)β→α is a term.
  • If Mα→β and Nα are terms, then (MN)β is a term.
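The three formation rules above can be realized as a small type checker. The following Python sketch uses an encoding of our own choosing (base types as strings, arrow types as tagged tuples); it is an illustration of the rules, not standard notation.

```python
# A minimal type synthesizer for the simply typed λ-calculus.
# Encoding (ours, for illustration): a type is a base-type name (string)
# or an arrow ('->', a, b); a term is a variable (string), an
# abstraction ('lam', x, type, body), or an application ('app', f, arg).

def arrow(a, b):
    return ('->', a, b)

def type_of(term, env):
    """Return the type of `term` under the variable typing `env`,
    or raise TypeError if the term is ill-typed."""
    if isinstance(term, str):                     # rule 1: typed variable
        return env[term]
    tag = term[0]
    if tag == 'lam':                              # rule 2: abstraction
        _, x, ty, body = term
        body_ty = type_of(body, {**env, x: ty})
        return arrow(ty, body_ty)
    if tag == 'app':                              # rule 3: application
        _, f, arg = term
        f_ty = type_of(f, env)
        arg_ty = type_of(arg, env)
        if f_ty[0] == '->' and f_ty[1] == arg_ty:
            return f_ty[2]                        # (MN) gets type β
        raise TypeError('function applied to argument of the wrong type')
    raise TypeError('not a well-formed term')

# The identity function λx:σ.x has type σ → σ:
ident = ('lam', 'x', 'sigma', 'x')
print(type_of(ident, {}))  # ('->', 'sigma', 'sigma')

# Applying it to a variable of type σ yields a term of type σ:
print(type_of(('app', ident, 'v'), {'v': 'sigma'}))  # 'sigma'
```

Note that the application case is the only place a term can be rejected, mirroring the observation below that the third rule is what enforces “functions applied only to proper input.”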
While the first two rules can be seen as simply adding “syntactic decoration” to the original untyped terms, this third rule captures the idea of only applying functions to proper input. It can be read as saying: If M is a function from α to β and N is an α, then MN is a β. This formal system does not permit other application terms to be formed. One must note that this simply-typed λ-calculus is strongly normalizing, meaning that any well-typed term has a finite reduction sequence. It has been mentioned that λ-calculi are prototypical programming languages. We can then think of typed λ-calculi as being typed programming languages. Seen in this light, the strongly normalizing property states that one cannot write a non-terminating program. For instance, the term Y in the untyped λ-calculus has no simply-typed equivalent. Clearly, then, this is a somewhat restrictive language. Because of this property and the results in the preceding section, it follows that the simply-typed λ-calculus is not Turing complete, i.e. there are λ-definable functions which are not definable in the simply-typed version. Taking this analogy with programming languages literally, we can interpret a family of results collectively known as the Curry-Howard correspondence as providing a “proofs-as-programs” metaphor. (The correspondence is sometimes referred to as Curry-Howard-Lambek due to Lambek’s work in expressing both the λ-calculus and natural deduction in a category-theoretic setting.) In the case of the simply-typed λ-calculus, the correspondence is between the λ-calculus and the implicational fragment of intuitionistic natural deduction. One can readily see that simple types correspond directly to propositions in this fragment of propositional logic (modulo a translation of base types to propositional variables). The correspondence then states that a proposition is provable if and only if there is a λ-term with the type of the proposition. 
The term then corresponds to a proof of the proposition. In the simply-typed case, the correspondence can be fleshed out as:
λ-calculus        propositional logic
──────────        ───────────────────
type              proposition
term              proof
abstraction       arrow introduction
application       modus ponens
variables         assumptions

Table 1. Curry-Howard in the simply-typed λ-calculus.
For example, consider the intuitionistic tautology p → ¬¬p, where ¬p =df p → ⊥. While a full exposition of natural deduction for intuitionistic propositional logic lies beyond the present scope, observe that we have the following proof of p → ¬¬p = p → ((p → ⊥) → ⊥):

  p, p → ⊥ ⊢ p → ⊥        p, p → ⊥ ⊢ p
  ─────────────────────────────────────
             p, p → ⊥ ⊢ ⊥
  ─────────────────────────────────────
            p ⊢ (p → ⊥) → ⊥
  ─────────────────────────────────────
          ⊢ p → ((p → ⊥) → ⊥)

We can then follow this proof in order to construct a corresponding term in the simply-typed λ-calculus as follows:

  x : p, y : p → ⊥ ⊢ y : p → ⊥        x : p, y : p → ⊥ ⊢ x : p
  ─────────────────────────────────────────────────────────────
               x : p, y : p → ⊥ ⊢ yx : ⊥
  ─────────────────────────────────────────────────────────────
           x : p ⊢ (λy : p → ⊥. yx) : (p → ⊥) → ⊥
  ─────────────────────────────────────────────────────────────
        ⊢ (λx : p. λy : p → ⊥. yx) : p → ((p → ⊥) → ⊥)
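The derivation above can also be replayed mechanically. The Python sketch below uses a tiny type synthesizer with an encoding of our own choosing (⊥ as the base type 'false'); it confirms that the proof term λx:p. λy:p→⊥. yx receives exactly the type of the tautology p → ((p → ⊥) → ⊥).

```python
# Replaying the derivation: the term corresponding to the proof of
# p → ((p → ⊥) → ⊥) is assigned that very type. The tuple encoding of
# types and terms is illustrative, not standard notation.

def type_of(term, env):
    if isinstance(term, str):                     # typed variable
        return env[term]
    if term[0] == 'lam':                          # abstraction
        _, x, ty, body = term
        return ('->', ty, type_of(body, {**env, x: ty}))
    _, f, a = term                                # application
    fty = type_of(f, env)
    assert fty[0] == '->' and fty[1] == type_of(a, env)
    return fty[2]

bot = 'false'                                     # ⊥
not_p = ('->', 'p', bot)                          # ¬p  =  p → ⊥
term = ('lam', 'x', 'p',
        ('lam', 'y', not_p,
         ('app', 'y', 'x')))                      # λx:p. λy:p→⊥. yx

print(type_of(term, {}))
# ('->', 'p', ('->', ('->', 'p', 'false'), 'false'))
```

Reading the output back as a proposition gives p → ((p → ⊥) → ⊥), just as the correspondence predicts: the term is a proof of its type.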

As was mentioned, the Curry-Howard correspondence is not one correspondence, but rather a family. The main idea is that terms correspond to proofs of their type. Just as we can extend logics beyond the implicational fragment of intuitionistic propositional logic to include connectives, negation, quantifiers, and higher-order logic, so too can we extend the syntax and type systems of the λ-calculus. There are thus many variations of the proofs-as-programs correspondence. See the following notes section for many sources exploring the development of more complex correspondences. That programs represent proofs is more than an intellectual delight: it is the correspondence on which software verifiers and automated mathematical theorem provers (such as Automath, HOL, Isabelle, and Coq) are based. These tools both verify and help users generate proofs, and they have in fact been used to develop proofs of previously unproven conjectures.

b. Notes

For a thorough exposition of typed λ-calculi, starting with simple types and progressing through all faces of the “λ-cube”, see Barendregt [1992]. Sørensen and Urzyczyn [2006] explores the Curry-Howard correspondence between many different forms of logic and the corresponding type systems. Pierce [2002] focuses on the λ-calculus and its relation to programming languages, thoroughly developing calculi from the simple to the very complex, along with type checking and operational semantics for each.

6. References and Further Reading

  • Barendregt, Henk. 1984. The Lambda Calculus: Its Syntax and Semantics. Elsevier Science, Amsterdam.
  • Barendregt, Henk. 1992. Lambda Calculi with Types. Oxford University Press, Oxford.
  • Barendregt, Henk. 1997. “The Impact of the Lambda Calculus in Logic and Computer Science.” The Bulletin of Symbolic Logic 3(2):181–215.
  • Cardone, Felice and J. R. Hindley. 2006. “History of Lambda-Calculus and Combinatory Logic.” In Handbook of the History of Logic, vol. 5, eds. D. M. Gabbay and J. Woods. Elsevier.
  • Church, Alonzo. 1932. “A Set of Postulates for the Foundation of Logic (Part I).” Annals of Mathematics, 33(2):346–366.
  • Church, Alonzo. 1933. “A Set of Postulates for the Foundation of Logic (Part II).” Annals of Mathematics, 34(4):839–864.
  • Church, Alonzo. 1935. “A Proof of Freedom from Contradiction.” Proceedings of the National Academy of Sciences of the United States of America, 21(5):275–281.
  • Church, Alonzo. 1936a. “A Note on the Entscheidungsproblem.” Journal of Symbolic Logic, 1(1):40–41.
  • Church, Alonzo. 1936b. “An Unsolvable Problem of Elementary Number Theory.” American Journal of Mathematics, 58(2):345–363.
  • Church, Alonzo. 1940. “A Formulation of the Simple Theory of Types.” Journal of Symbolic Logic, 5(2):56–68.
  • Church, Alonzo. 1941. The Calculi of Lambda Conversion. Princeton University Press, Princeton, NJ.
  • Copeland, B. Jack. 2010. “The Church-Turing Thesis.” Stanford Encyclopedia of Philosophy.
  • Crossley, J. N. 1975. “Reminiscences of Logicians.” Algebra and Logic: Papers from the 1974 Summer Research Institute of the Australian Mathematical Society, Monash University, Australia, Lecture Notes in Mathematics, vol. 450, Springer, Berlin.
  • Hankin, Chris. 2004. An Introduction to Lambda Calculi for Computer Scientists. College Publications.
  • Hindley, J. Roger and Jonathan P. Seldin. 2008. Lambda-Calculus and Combinators: An Introduction. Cambridge University Press, Cambridge.
  • Kleene, S. C. 1936. “λ-definability and Recursiveness.” Duke Mathematical Journal, 2(2):340–353.
  • Kleene, S. C. 1952. Introduction to Metamathematics. D. Van Nostrand, New York.
  • Kleene, S. C. and J. B. Rosser. 1935. “The Inconsistency of Certain Formal Logics.” Annals of Mathematics, 36(3):630–636.
  • Piccinini, Gualtiero. 2007. “Computational Modelling vs. Computational Explanation: Is Everything a Turing Machine, and Does It Matter to the Philosophy of Mind?” Australasian Journal of Philosophy, 85(1):93–115.
  • Pierce, Benjamin C. 2002. Types and Programming Languages. MIT Press, Cambridge.
  • Sørensen, M. H. and P. Urzyczyn. 2006. Lectures on the Curry-Howard Isomorphism. Elsevier Science, Oxford.
  • Turing, Alan M. 1937a. “On Computable Numbers, with an Application to the Entscheidungsproblem.” Proceedings of the London Mathematical Society, 42(2):230–265.
  • Turing, Alan M. 1937b. “Computability and λ-Definability.” Journal of Symbolic Logic, 2(4): 153–163.

Author Information

Shane Steinert-Threlkeld
Email: shanest@stanford.edu
Stanford University
U. S. A.

The Indispensability Argument in the Philosophy of Mathematics

In his seminal 1973 paper, “Mathematical Truth,” Paul Benacerraf presented a problem facing all accounts of mathematical truth and knowledge.  Standard readings of mathematical claims entail the existence of mathematical objects.  But, our best epistemic theories seem to debar any knowledge of mathematical objects.  Thus, the philosopher of mathematics faces a dilemma: either abandon standard readings of mathematical claims or give up our best epistemic theories.  Neither option is attractive.

The indispensability argument in the philosophy of mathematics is an attempt to avoid Benacerraf’s dilemma by showing that our best epistemology is consistent with standard readings of mathematical claims.  Broadly speaking, it is an attempt to justify knowledge of an abstract mathematical ontology using only a strictly empiricist epistemology.

The indispensability argument in the philosophy of mathematics, in its most general form, consists of two premises.  The major premise states that we should believe that mathematical objects exist if we need them in our best scientific theory.  The minor premise claims that we do in fact require mathematical objects in our scientific theory.  The argument concludes that we should believe in the abstract objects of mathematics.

This article begins with a general overview of the problem of justifying our mathematical beliefs that motivates the indispensability argument.  The most prominent proponents of the indispensability argument have been W.V. Quine and Hilary Putnam.  The second section of the article discusses a reconstruction of Quine’s argument in detail.  Quine’s argument depends on his general procedure for determining a best theory and its ontic commitments, and on his confirmation holism.  The third and fourth sections of the article discuss versions of the indispensability argument which do not depend on Quine’s method: one from Putnam and one from Michael Resnik.

The relationship between constructing a best theory and producing scientific explanations has recently become a salient topic in discussions of the indispensability argument.  The fifth section of the article discusses a newer version of the indispensability argument, the explanatory indispensability argument.  The last four sections of the article are devoted to a general characterization of indispensability arguments over various versions, a brief discussion of the most prominent responses to the indispensability argument, a distinction between inter- and intra-theoretic indispensability arguments, and a short conclusion.

Table of Contents

  1. The Problem of Beliefs About Mathematical Objects
  2. Quine’s Indispensability Argument
    1. A Best Theory
    2. Believing Our Best Theory
    3. Quine’s Procedure for Determining Ontic Commitments
    4. Mathematization
  3. Putnam’s Success Argument
  4. Resnik’s Pragmatic Indispensability Argument
  5. The Explanatory Indispensability Argument
  6. Characteristics of Indispensability Arguments in the Philosophy of Mathematics
  7. Responses to the Indispensability Argument
  8. Inter-theoretic and Intra-theoretic Indispensability Arguments
  9. Conclusion
  10. References and Further Reading

1. The Problem of Beliefs About Mathematical Objects

Most of us have a lot of mathematical beliefs. For example, we might believe that the tangent to a circle intersects the radius of that circle at right angles, that the square root of two can not be expressed as the ratio of two integers, and that the set of all subsets of a given set has more elements than the given set. Most of us also believe that those claims refer to mathematical objects such as circles, integers, and sets. Regarding all these mathematical beliefs, the fundamental question motivating the indispensability argument is, “How can we justify our mathematical beliefs?”

Mathematical objects are in many ways unlike ordinary physical objects such as trees and cars. We learn about ordinary objects, at least in part, by using our senses. It is not obvious that we learn about mathematical objects this way. Indeed, it is difficult to see how we could use our senses to learn about mathematical objects. We do not see integers, or hold sets. Even geometric figures are not the kinds of things that we can sense. Consider any point in space; call it P. P is only a point, too small for us to see, or otherwise sense. Now imagine a precise fixed distance away from P, say an inch and a half. The collection of all points that are exactly an inch and a half away from P is a sphere. The points on the sphere are, like P, too small to sense. We have no sense experience of the geometric sphere. If we tried to approximate the sphere with a physical object, say by holding up a ball with a three-inch diameter, some points on the edge of the ball would be slightly further than an inch and a half away from P, and some would be slightly closer. The sphere is a mathematically precise object. The ball is rough around the edges. In order to mark the differences between ordinary objects and mathematical objects, we often call mathematical objects "abstract objects."

When we study geometry, the theorems we prove apply directly and exactly to mathematical objects, like our sphere, and only indirectly and approximately to physical objects, like our ball. Numbers, too, are insensible. While we might see or touch a bowl of precisely eighteen grapes, we see and taste the grapes, not the eighteen. We can see a numeral, "18," but that is the name for a number, just as the term "Russell" is my name and not me. We can sense the elements of some sets, but not the sets themselves. And some sets are sets of sets, abstract collections of abstract objects. Mathematical objects are not the kinds of things that we can see or touch, or smell, taste or hear. If we can not learn about mathematical objects by using our senses, a serious worry arises about how we can justify our mathematical beliefs.

The question of how we can justify our beliefs about mathematical objects has long puzzled philosophers. One obvious way to try to answer our question is to appeal to the fact that we prove the theorems of mathematics. But, appealing to mathematical proofs will not solve the problem. Mathematical proofs are ordinarily construed as derivations from fundamental axioms. These axioms, such as the Zermelo-Fraenkel axioms for set theory, the Peano axioms for number theory, or the more familiar Euclidean axioms for geometry, refer to the same kinds of mathematical objects. Our question remains about how we justify our beliefs in the axioms.

For simplicity, to consider our question of how we can justify our beliefs about mathematical objects, we will only consider sets. Set theory is generally considered the most fundamental mathematical theory. All statements of number theory, including those concerning real numbers, can be written in the language of set theory. Through the method of analysis, all geometric theorems can be written as algebraic statements, where geometric points are represented by real numbers. Claims from all other areas of mathematics can be written in the language of set theory, too.

Sets are abstract objects, lacking any spatio-temporal location. Their existence is not contingent on our existence. They lack causal efficacy. Our question, then, given that we lack sense experience of sets, is how we can justify our beliefs about sets and set theory.

There are a variety of distinct answers to our question. Some philosophers, called rationalists, claim that we have a special, non-sensory capacity for understanding mathematical truths, a rational insight arising from pure thought. But, the rationalist’s claims appear incompatible with an understanding of human beings as physical creatures whose capacities for learning are exhausted by our physical bodies. Other philosophers, called logicists, argue that mathematical truths are just complex logical truths. In the late nineteenth and early twentieth centuries, the logicists Gottlob Frege, Alfred North Whitehead, and Bertrand Russell attempted to reduce all of mathematics to obvious statements of logic, for example, that every object is identical to itself, or that if p then p. But, it turns out that we can not reduce mathematics to logic without adding substantial portions of set theory to our logic. A third group of philosophers, called nominalists or fictionalists, deny that there are any mathematical objects; if there are no mathematical objects, we need not justify our beliefs about them.

The indispensability argument in the philosophy of mathematics is an attempt to justify our mathematical beliefs about abstract objects, while avoiding any appeal to rational insight. Its most significant proponent was Willard van Orman Quine.

2. Quine’s Indispensability Argument

Though Quine alludes to an indispensability argument in many places, he never presented a detailed formulation. For a selection of such allusions, see Quine 1939, 1948, 1951, 1955, 1958, 1960, 1978, and 1986a. This article discusses the following version of the argument:

QI:
QI1. We should believe the theory which best accounts for our sense experience.
QI2. If we believe a theory, we must believe in its ontic commitments.
QI3. The ontic commitments of any theory are the objects over which that theory first-order quantifies.
QI4. The theory which best accounts for our sense experience first-order quantifies over mathematical objects.
QIC. We should believe that mathematical objects exist.

An ontic commitment to object o is a commitment to believing that o exists. First-order quantification is quantification in standard predicate logic. One presumption behind QI is that the theory which best accounts for our sense experience is our best scientific theory. Thus, Quine naturally defers much of the work of determining what exists to scientists. While it is obvious that scientists use mathematics in developing their theories, it is not obvious why the uses of mathematics in science should lead us to believe in the existence of abstract objects. For example, when we study the interactions of charged particles, we rely on Coulomb’s Law, which states that the magnitude of the electrostatic force F between two charged particles 1 and 2 is proportional to the product of the charges q1 and q2 on the particles and inversely proportional to the square of the distance r between them.

CL:    F = k ∣q1q2∣ / r²,   where the electrostatic constant k ≈ 9 × 10⁹ N·m²/C²

∣q1q2∣ is the absolute value of the product of the two charges. Notice that CL refers to a real number, k, and employs mathematical functions such as multiplication and absolute value. Still, we use Coulomb’s Law to study charged particles, not to study mathematical objects, which have no effect on those particles. The plausibility of Quine’s indispensability argument thus depends on both (i) Quine’s claim that the evidence for our scientific theories transfers to the mathematical elements of those theories, which is implicit in QI1, and (ii) his method for determining the ontic commitments of our theories at QI3 and QI4. The method underlying Quine’s argument involves gathering our physical laws and writing them in a canonical language of first-order logic. The commitments of this formal theory may be found by examining its quantifications.
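For concreteness, CL can be evaluated directly. The sketch below uses illustrative values of our own choosing (two charges of 1 μC and −2 μC, 3 cm apart); the point is simply that applying the law requires the real number k and the functions of multiplication and absolute value.

```python
# Evaluating Coulomb's Law, CL above, for two point charges.
K = 8.99e9  # electrostatic constant k, in N·m²/C²

def coulomb_force(q1, q2, r):
    """Magnitude of the electrostatic force between point charges
    q1 and q2 (in coulombs) separated by distance r (in meters)."""
    return K * abs(q1 * q2) / r ** 2

# Illustrative values (our own): 1 μC and -2 μC charges, 3 cm apart.
f = coulomb_force(1e-6, -2e-6, 0.03)
print(round(f, 2))  # 19.98 (newtons)
```

The computation uses mathematical objects and operations throughout, yet its subject matter is the force between the particles, which is the tension the indispensability argument addresses.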

The remainder of this section discusses each of the premises of QI in turn.

a. A Best Theory

The first premise of QI is that we should believe the theory which best accounts for our sense experience, that is, we should believe our best scientific theory.

Quine’s belief that we should defer all questions about what exists to natural science is really an expression of what he calls, and has come to be known as, naturalism. He describes naturalism as, “[A]bandonment of the goal of a first philosophy. It sees natural science as an inquiry into reality, fallible and corrigible but not answerable to any supra-scientific tribunal, and not in need of any justification beyond observation and the hypothetico-deductive method” (Quine 1981: 72).

Quine's naturalism was developed in large part as a response to logical positivism, which is also called logical empiricism or just positivism. Positivism requires that any justified belief be constructed out of, and be reducible to, claims about observable phenomena. We know about ordinary objects like trees because we have sense experience, or sense data, of trees directly. We know about very small or very distant objects, despite having no direct sense experience of them, by having sense data of their effects, say electron trails in a cloud chamber. For the positivists, any scientific claim must be reducible to sense data.

Instead of starting with sense data and reconstructing a world of trees and persons, Quine assumes that ordinary objects exist. Further, Quine starts with an understanding of natural science as our best account of the sense experience which gives us beliefs about ordinary objects. Traditionally, philosophers believed that it was the job of epistemology to justify our knowledge. In contrast, the central job of Quine’s naturalist is to describe how we construct our best theory, to trace the path from stimulus to science, rather than to justify knowledge of either ordinary objects or scientific theory.

Quine’s rejection of positivism included the insight, now known as confirmation holism, that individual sentences are only confirmed in the context of a broader theory. Confirmation holism arises from an uncontroversial observation that any sentence s can be assimilated without contradiction to any theory T, as long as we adjust truth values of any sentences of T that conflict with s. These adjustments may entail further adjustments, and the new theory may in the end look quite different than it did before we accommodated s. But we can, as a matter of logic, hold on to any sentence come what may. And, we are not forced to hold on to any statement, come what may; there are no unassailable truths.

For a simple example, suppose I have a friend named Abigail. I have a set of beliefs, which we can call a theory of my friendship with her. (A theory is just a collection of sentences.) Suppose that I overhear Abigail saying mean things about me. New evidence conflicts with my old theory. I have a choice whether to reject the evidence (for example, “I must have mis-heard”) or to accommodate the evidence by adjusting my theory, giving up the portions about Abigail being my friend, say, or about my friends not saying mean things about me. Similarly, when new astronomical evidence in the 15th and 16th centuries threatened the old geocentric model of the universe, people were faced with a choice of whether to accept the evidence, giving up beliefs about the Earth being at the center of the universe, or to reject the evidence. Instead of requiring that individual experiences are each independently assessed, confirmation holism entails that there are no justifications for particular claims independent of our entire best theory. We always have various options for restoring an inconsistent theory to consistency.

Confirmation holism entails that our mathematical theories and our scientific theories are linked, and that our justifications for believing in science and mathematics are not independent. When new evidence conflicts with our current scientific theory, we can choose to adjust either scientific principles or mathematical ones. Evidence for the scientific theory is also evidence for the mathematics used in that theory.

The question of how we justify our beliefs about mathematical objects arose mainly because we could not perceive them directly. By rejecting positivism’s requirement for reductions of scientific claims to sense data, Quine allows for beliefs in mathematical objects despite their abstractness. We do not need sensory experience of mathematical objects in order to justify our mathematical beliefs. We just need to show that mathematical objects are indispensable to our best theory.

QI1 may best be seen as a working hypothesis in the spirit of Ockham’s Razor. We look to our most reliable endeavor, natural science, to tell us what there is. We bring to science a preference that it account for our entrenched esteem for ordinary experience. And we posit no more than is necessary for our best scientific theory.

b. Believing Our Best Theory

The second premise of QI states that our belief in a theory naturally extends to the objects which that theory posits.

Against QI2, one might think that we could believe a theory while remaining agnostic or instrumentalist about whether its objects exist. Physics is full of fictional idealizations, like infinitely long wires, centers of mass, and uniform distributions of charge. Other sciences also posit objects that we do not really think exist, like populations in Hardy-Weinberg equilibrium (biology), perfectly rational consumers (economics), and average families (sociology). We posit such ideal objects to facilitate using a theory. We might believe our theory, while recognizing that the objects to which it refers are only ideal. If we hold this instrumentalist attitude toward average families and infinitely long wires, we might hold it toward circles, numbers and sets, too.

Quine argues that any discrepancy between our belief in a theory and our beliefs in its objects is illegitimate double-talk. One can not believe in only certain elements of a theory which one accepts. If we believe a theory which says that there are centers of mass, then we are committed to those centers of mass. If we believe a theory which says that there are electrons and quarks and other particles too small to see, then we are committed to such particles. If our best theory posits mathematical objects, then we must believe that they exist.

QI1 and QI2 together entail that we should believe in all of the objects that our best theory says exist. Any particular evidence applies to the whole theory, which produces its posits uniformly. Quine thus makes no distinction between justifications of observable and unobservable objects, or between physical and mathematical objects. All objects, trees and electrons and sets, are equally posits of our best theory, to be taken equally seriously. “To call a posit a posit is not to patronize it” (Quine 1960: 22).

There will be conflict between our currently best theory and the better theories that future science will produce. The best theories are, of course, not now available. Yet, what exists does not vary with our best theory. Thus, any current expression of our commitments is at best speculative. We must have some skepticism toward our currently best theory, if only due to an inductive awareness of the transience of such theories.

On the other hand, given Quine’s naturalism, we have no better theory from which to evaluate the posits of our currently best theory. There is no external, meta-scientific perspective. We know, casually and meta-theoretically, that our current theory will be superseded, and that we will give up some of our current beliefs; but we do not know how our theory will be improved, and we do not know which beliefs we will give up. The best we can do is believe the best theory we have, and believe in its posits, and have a bit of humility about these beliefs.

c. Quine’s Procedure for Determining Ontic Commitments

The first two premises of QI entail that we should believe in the posits of our best theory, but they do not specify how to determine precisely what the posits of a theory are. The third premise appeals to Quine’s general procedure for determining the ontic commitments of a theory. Anyone who wishes to know what to believe exists, in particular whether to believe that mathematical objects exist, needs a general method for determining ontic commitment. Rather than relying on brute observations, Quine provides a simple, broadly applicable procedure. First, we choose a best theory. Next, we regiment that theory in first-order logic with identity. Last, we examine the domain of quantification of the theory to see what objects the theory needs to come out as true.

Quine’s method for determining our commitments applies to any theory—theories which refer to trees, electrons and numbers, and theories which refer to ghosts, caloric and God.

We have already discussed how the first step in Quine’s procedure applies to QI. The second step of Quine’s procedure for determining the commitments of a theory refers to first-order logic as a canonical language. Quine credits first-order logic with extensionality, efficiency, elegance, convenience, simplicity, and beauty (Quine 1986: 79 and 87). Quine’s enthusiasm for first-order logic largely derives from various attractive technical virtues. In first-order logic, a variety of definitions of logical truth concur: in terms of logical structure, substitution of sentences or of terms, satisfaction by models, and proof. First-order logic is complete, in the sense that any valid formula is provable. Every consistent first-order theory has a model. First-order logic is compact, which means that any set of first-order axioms will be consistent if every finite subset of that set is consistent. It admits of both upward and downward Löwenheim-Skolem theorems, which mean that every theory which has an infinite model will have a model of every infinite cardinality (upward) and that every theory which has an infinite model of any cardinality will have a denumerable model (downward). (See Mendelson 1997: 377.)

Less technically, the existential quantifier in first-order logic is a natural equivalent of the English term "there is," and Quine proposes that all existence claims can and should be made by existential sentences of first-order logic. “The doctrine is that all traits of reality worthy of the name can be set down in an idiom of this austere form if in any idiom” (Quine 1960: 228).

We should take first-order logic as our canonical language only if:

A.   We need a single canonical language;

B.   It really is adequate; and

C.   There is no other adequate language.

Condition A arises almost without argument from QI1 and QI2. One of Quine’s most striking and important innovations was his linking of our questions about what exists with our concerns when constructing a canonical formal language. When we regiment our correct scientific theory correctly, Quine argues, we will know what we should believe exists. “The quest of a simplest, clearest overall pattern of canonical notation is not to be distinguished from a quest of ultimate categories, a limning of the most general traits of reality” (Quine 1960: 161).

Whether condition B holds depends on how we use our canonical language. First-order logic is uncontroversially useful for what Quine calls semantic ascent. When we ascend, we talk about words without presuming that they refer to anything; we can deny the existence of objects without seeming to commit to them. For example, on some theories of language sentences which contain terms that do not refer to real things are puzzling. Consider:

CP:     The current president of the United States does not have three children.

TF:     The tooth fairy does not exist.

If CP is to be analyzed as saying that there is a current president who lacks the property of having three children, then by parity of reasoning TF seems to say that there is a tooth fairy that lacks the attribute of existence. This analysis comes close to interpreting the reasonable sentence TF as a contradiction saying that there is something that is not.

In contrast, we can semantically ascend, claiming that the term "the tooth fairy" does not refer. TF is conveniently regimented in first-order logic, using "T" to stand for the property of being the tooth fairy. "∼(∃x)Tx" carries with it no implication that the tooth fairy exists. Similar methods can be applied to more serious existence questions, like whether there is dark energy or an even number greater than two which is not the sum of two primes. Thus, first-order logic provides a framework for settling disagreements over existence claims.

Against the supposed adequacy of first-order logic, there are expressions whose first-order logical regimentations are awkward at best. Leibniz’s law, that identical objects share all properties, seems to require a second-order formulation. Similarly, H resists first-order logical treatment.

H:     Some book by every author is referred to in some essay by every critic (Hintikka 1973: 345).

H may be adequately handled by branching quantifiers, which are not elements of first-order logic.

There are other limitations on first-order logic. Regimenting a truth predicate in first-order logic leads naturally to the paradox of the liar. And propositional attitudes, such as belief, create opaque contexts that prevent natural substitutions of identicals otherwise permitted by standard first-order inference rules. Still, defenders of first-order logic have proposed a variety of solutions to these difficulties, many of which may be due not to first-order logic itself but to deeper problems with language.

For condition C, Quine argues that no other language is adequate for canonical purposes. Ordinary language appears to be too sloppy, in large part due to its use of names to refer to objects. We use names to refer to some of the things that exist: "Muhammad Ali," "Jackie Chan," "The Eiffel Tower." But some names, such as "Spiderman," do not refer to anything real. Some things, such as most insects and pebbles, lack names. Some things, such as most people, have multiple names. We could clean up our language, constructing an artificial version in which everything has exactly one name. Still, in principle, there will not be enough names for all objects. As Cantor’s diagonal argument shows, there are more real numbers than there are available names for those numbers. If we want a language in which to express all and only our commitments, we have to look beyond languages which rely on names.
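The diagonal argument behind this last claim can be sketched in a few lines. Names are finite strings over a finite alphabet, so there are only countably many of them; but any enumeration of decimal expansions omits the "diagonal" real that differs from the n-th expansion at its n-th digit. The finite list of digit sequences below is an illustrative stand-in for such an enumeration, not part of the argument itself.

```python
# Cantor's diagonal construction, illustrated on a finite stub of an
# enumeration. The diagonal real differs from the n-th listed real at
# digit n, so it cannot appear anywhere in the list.

def diagonal(named_reals):
    """Return decimal digits of a real differing from each listed real."""
    # Change each diagonal digit: use 5 unless the digit is 5, then use 4.
    return [5 if digits[i] != 5 else 4 for i, digits in enumerate(named_reals)]

# Hypothetical stub of an enumeration of decimal expansions.
names = [
    [1, 4, 1, 5, 9],
    [7, 1, 8, 2, 8],
    [3, 3, 3, 3, 3],
    [0, 0, 0, 0, 0],
    [9, 9, 9, 9, 9],
]

d = diagonal(names)
# d differs from the n-th listed real at its n-th digit.
assert all(d[i] != names[i][i] for i in range(len(names)))
```

Since the construction works on every enumeration, no countable stock of names can cover all the reals.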

Given the deficiencies of languages with names and Quine’s argument that the existential quantifier is a natural formal equivalent of "there is," the only obvious remaining alternatives to first-order logic as a canonical language are higher-order logics. Higher-order logics have all of the expressive powers of first-order logic, and more. Most distinctly, where first-order logic allows variables only in the object position (i.e. following a predicate), second-order logic allows variables in the predicate positions as well, and introduces quantifiers to bind those predicates. Logics of third and higher order allow further predication and quantification. As they raise no significant philosophical worries beyond those concerning second-order logic, this discussion will focus solely on second-order logic.

To see how second-order logic works, consider the inference R.

R: R1. There is a red shirt.
R2. There is a red hat.
RC. So, there is something (redness) that some shirt and some hat share.

RC does not follow from R1 and R2 in first-order logic, but it does follow in second-order logic. If we let Sx mean that x is a shirt, Hx that x is a hat, and Rx that x is red, and let P be a predicate variable, then we have the following valid second-order inference RS:

RS: R1S. (∃x)(Sx & Rx)
R2S. (∃x)(Hx & Rx)
RCS. (∃P)(∃x)(∃y)(Sx & Hy & Px & Py)

Accommodating inferences such as R by extending one’s logic to second-order might seem useful. But, higher-order logics allow us to infer, as a matter of logic, that there is some thing, presumably the property of redness, that the shirt and the hat share. It is simple common sense that shirts and hats exist. It is a matter of significant philosophical controversy whether properties like redness exist. Thus, a logic which permits an inference like RS is controversial.

Quine’s objection to higher-order logics, and thus his defense of first-order logic, is that we are forced to admit controversial elements as interpretations of predicate variables. Even if we interpret predicate variables in the least controversial way, as sets of objects that have those properties, higher-order logics demand sets. Thus, Quine calls second-order logic, “Set theory in sheep’s clothing” (Quine 1986a: 66). Additionally, higher-order logics lack many of the technical virtues, such as completeness and compactness, of first-order logic. (For a defense of second-order logic, see Shapiro 1991.)
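Quine's complaint can be made vivid over a finite domain: to evaluate the second-order quantifier in RCS, the predicate variable P must range over subsets of the domain. The two-object model below is hypothetical, but the evaluation procedure is exactly a search through sets, which is the sense in which second-order quantification is "set theory in sheep's clothing".

```python
from itertools import chain, combinations

# A toy two-object model for the inference RS.
domain = ["shirt1", "hat1"]
S = {"shirt1"}            # shirts
H = {"hat1"}              # hats
R = {"shirt1", "hat1"}    # red things

def subsets(xs):
    """All subsets of xs -- the range of the predicate variable P."""
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# RCS: (∃P)(∃x)(∃y)(Sx & Hy & Px & Py) -- some property shared by a
# shirt and a hat. Evaluating (∃P) means quantifying over sets.
rcs = any(
    any(x in S and y in H and x in P and y in P for x in domain for y in domain)
    for P in map(set, subsets(domain))
)
assert rcs  # witnessed by taking P to be the set of red things, R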

Once we settle on first-order logic as a canonical language, we must specify a method for determining the commitments of a theory in that language. Reading existential claims seems straightforward. For example, R2 says that there is a thing which is a hat, and which is red. But, theories do not determine their own interpretations. Quine relies on standard Tarskian model-theoretic methods to interpret first-order theories. On a Tarskian interpretation, or semantics, we ascend to a metalanguage to construct a domain of quantification for a given theory. We consider whether sequences of objects in the domain, taken as values of the variables bound by the quantifiers, satisfy the theory’s statements, or theorems. The objects in the domain that make the theory come out true are the commitments of the theory. “To be is to be a value of a variable” (Quine 1939: 50) is how Quine summarizes the point. (For an accessible discussion of Tarskian semantics, see Tarski 1944).
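The Tarskian procedure can be sketched with a hypothetical three-object domain: evaluating an existential sentence means letting its variable range over the domain's values and asking whether some value satisfies the open formula. The objects that must serve as such values are, in Quine's slogan, what the theory is committed to.

```python
# A toy Tarskian interpretation: a finite domain and extensions for the
# predicates, against which we check satisfaction of formulas.

domain = {"shirt1", "hat1", "sock1"}          # hypothetical objects
extension = {
    "S": {"shirt1"},                          # x is a shirt
    "H": {"hat1"},                            # x is a hat
    "R": {"shirt1", "hat1"},                  # x is red
}

def satisfies_exists(open_formula):
    """Check (∃x)φ by letting x range over every value in the domain."""
    return any(open_formula(x) for x in domain)

# R2: (∃x)(Hx & Rx) -- "there is a red hat" -- is true in this model.
r2 = satisfies_exists(lambda x: x in extension["H"] and x in extension["R"])

# (∃x)(Sx & Hx) is false: no value of x is both a shirt and a hat.
shirt_hat = satisfies_exists(lambda x: x in extension["S"] and x in extension["H"])

assert r2 and not shirt_hat
```

"To be is to be a value of a variable": here "hat1" must be among the values of x for R2 to come out true, so a theory containing R2 is committed to it.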

Quine’s procedure for determining the commitments of a theory can prevent us from prejudging what exists. We construct scientific theory in the most effective and attractive way we can. We balance formal considerations, like the elegance of the mathematics involved, with an attempt to account for the broadest sensory evidence. The more comprehensive and elegant the theory, the more we are compelled to believe it, even if it tells us that the world is not the way we thought it was. If the theory yields a heliocentric model of the solar system, or the bending of rays of light, then we are committed to heliocentrism or bent light rays. Our commitments are the byproducts of this neutral process.

d. Mathematization

The final step of QI involves simply looking at the domain of the theory we have constructed. When we write our best theory in our first-order language, we discover that the theory includes physical laws which refer to functions, sets, and numbers. Consider again Coulomb’s Law: F = k ∣q1q2∣ / r². Here is a partial first-order regimentation which suffices to demonstrate the commitments of the law, using ‘Px’ for ‘x is a charged particle’.

CLR:     ∀x∀y{(Px & Py) → ∃f [f(q(x), q(y), d(x,y), k) = F]}       where F = k ∣q(x) q(y)∣ / d(x,y)²

In addition to the charged particles over which the universal quantifiers in front range, there is an existential quantification over a function, f. This function maps numbers [the Coulomb’s Law constant k, and measurements of charge q(x) and q(y), and distance d] to other numbers (measurements of force F between the particles).
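The role of f can be made concrete with a short sketch. Written as a program, the function CLR quantifies over is simply a mapping from numbers to a number; the particular charges and distance below are illustrative values, not part of the regimentation.

```python
# Coulomb's Law read as CLR reads it: a function f mapping numbers (the
# constant k and measurements of charge and distance) to another number
# (the measured force). Quantifying over f is quantifying over a
# mathematical object.

K = 8.9875e9  # Coulomb's constant k, in N·m²/C²

def f(q1, q2, d, k=K):
    """F = k |q1 q2| / d² -- the function CLR existentially quantifies over."""
    return k * abs(q1 * q2) / d ** 2

# Illustrative input: two 1-microcoulomb charges one metre apart.
force = f(1e-6, 1e-6, 1.0)
assert abs(force - 8.9875e-3) < 1e-9
```

The point of the regimentation is that even this single law, stated in canonical notation, quantifies over a function and the numbers in its domain and range.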

In order to ensure that there are enough sets to construct these numbers and functions, our ideal theory must include set-theoretic axioms, perhaps those of Zermelo-Fraenkel set theory, ZF.

The full theory of ZF is unnecessary for scientific purposes; there will be some sets which are never needed and some numbers which are never used to measure any real quantity. But, we take a full set theory in order to make our larger theory as elegant as possible. We can derive from the axioms of any adequate set theory a vast universe of sets. So, CLR contains or entails several existential mathematical claims. According to QI, we should believe that these mathematical objects exist.

Examples such as CLR abound. Real numbers are used for measurement throughout physics, and other sciences. Quantum mechanics makes essential use of Hilbert spaces and probability functions. The theory of relativity invokes the hyperbolic space of Lobachevskian geometry. Economics is full of analytic functions. Psychology uses a wide range of statistics.

Opponents of the indispensability argument have developed sophisticated strategies for re-interpreting apparently ineliminable uses of mathematics, especially in physics. Some reinterpretations use alethic modalities (necessity and possibility) to replace mathematical objects. Others replace numbers with space-time points or regions. It is quite easy, but technical, to rewrite first-order theories in order to avoid quantifying over mathematical objects. It is less easy to do so while maintaining Quine’s canonical language of first-order logic.

For example, Hartry Field’s reformulation of Newtonian gravitational theory (Field 1980; discussed below in section 7) replaces the real numbers which are ordinarily used to measure fundamental properties such as mass and momentum with relations among regions of space-time. Field replaces the "2" in the claim “The beryllium sphere has a mass of 2 kg” with a ratio of space-time regions, one twice as long as the other. In order to construct the proper ratios of space-time regions, and having no mathematical axioms at his disposal, Field’s project requires either second-order logic or axioms of mereology, both of which are controversial extensions of first-order logic. For an excellent survey of dispensabilist strategies, and further references, see Burgess and Rosen 1997; for more recent work, see Melia 1998 and Melia 2000.

Quine’s indispensability argument depends on controversial claims about believing in a single best theory, finding our commitments by using a canonical first-order logic, and the ineliminability of mathematics from scientific theories. Other versions of the argument attempt to avoid some of these controversial claims.

3. Putnam’s Success Argument

In his early work, Hilary Putnam accepted Quine’s version of the indispensability argument, but he eventually differed with Quine on a variety of questions. Most relevantly, Putnam abandoned Quine’s commitment to a single, regimented, best theory; and he urged that realism in mathematics can be justified by its indispensability for correspondence notions of truth (which require set-theoretic relations) and for formal logic, especially for metalogical notions like derivability and validity which are ordinarily treated set-theoretically.

The position Putnam calls realism in mathematics is ambiguous between two compatible views. Sentence realism is the claim that sentences of mathematics can be true or false. Object realism is the claim that mathematical objects exist. Most object realists are sentence realists, though some sentence realists, including some structuralists, deny object realism. Indispensability arguments may be taken to establish either sentence realism or object realism. Quine was an object realist. Michael Resnik presents an indispensability argument for sentence realism; see section 4. This article takes Putnam’s realism to be both object and sentence realism, but nothing said below depends on that claim.

Realism contrasts most obviously with fictionalism, on which there are no mathematical objects, and many mathematical sentences considered to be true by the realist are taken to be false. To understand the contrast between realism and fictionalism, consider the following two paradigm mathematical claims, the first existential and the second conditional.

E:   There is a prime number greater than any you have ever thought of.

C:   The consecutive angles of any parallelogram are supplementary.

The fictionalist claims that mathematical existence claims like E are false, since prime numbers are numbers and there are no numbers. The standard interpretation of C is a universally quantified conditional: if any two angles are consecutive angles of a parallelogram, then they are supplementary. If there are no mathematical objects, then the antecedent of every instance of C is false, and standard truth-table semantics for the material conditional entail that C is true. The fictionalist thus claims that conditional statements which refer to mathematical objects, such as C, are at best vacuously true.
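The truth-table point the fictionalist relies on can be checked directly: the material conditional comes out true on both rows where the antecedent is false.

```python
# Truth table for the material conditional p -> q, which underwrites the
# fictionalist's claim that C is vacuously true when nothing satisfies
# its antecedent.

def implies(p, q):
    """Material conditional: p -> q is false only when p is true and q is false."""
    return (not p) or q

table = {(p, q): implies(p, q) for p in (True, False) for q in (True, False)}

# Both false-antecedent rows are true: so if there are no parallelograms
# (no mathematical objects), every instance of C is vacuously true.
assert table[(False, True)] and table[(False, False)]
assert not table[(True, False)]
```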

Putnam’s non-Quinean indispensability argument, the success argument, is a defense of realism over fictionalism, and other anti-realist positions. The success argument emphasizes the success of science, rather than the construction and interpretation of a best theory.

PS: PS1. Mathematics succeeds as the language of science.
PS2. There must be a reason for the success of mathematics as the language of science.
PS3. No positions other than realism in mathematics provide a reason.
PSC. So, realism in mathematics must be correct.

Putnam’s success argument for mathematics is analogous to his success argument for scientific realism. The scientific success argument relies on the claim that any position other than realism makes the success of science miraculous. The mathematical success argument claims that the success of mathematics can only be explained by a realist attitude toward its theorems and objects. “I believe that the positive argument for realism [in science] has an analogue in the case of mathematical realism. Here too, I believe, realism is the only philosophy that doesn’t make the success of the science a miracle” (Putnam 1975a: 73).

One potential criticism of any indispensability argument is that by making the justification of our mathematical beliefs depend on our justification for believing in science, our mathematical beliefs become only as strong as our confidence in science. It is notoriously difficult to establish the truth of scientific theory. Some philosophers, such as Nancy Cartwright and Bas van Fraassen, have argued that science, or much of it, is false, in part due to idealizations. (See Cartwright 1983 and van Fraassen 1980.) The success of science may be explained by its usefulness, without presuming that scientific theories are true.

Still, even if science were only useful rather than true, PS claims that our mathematical beliefs may be justified by the uses of mathematics in science. The problems with scientific realism focus on the incompleteness and error of contemporary scientific theory. These problems need not infect our beliefs in the mathematics used in science. A tool may work fine, even on a broken machine. One could deny or remain agnostic towards the claims of science, and still attempt to justify our mathematical beliefs using Putnam’s indispensability argument.

The first two premises of PS are uncontroversial, so Putnam’s defense of PS focuses on its third premise. His argument for that premise is essentially a rejection of the argument that mathematics could be indispensable, yet not true. “It is silly to agree that a reason for believing that p warrants accepting p in all scientific circumstances, and then to add ‘but even so it is not good enough’” (Putnam 1971: 356).

For the Quinean holist, Putnam’s argument for PS3 has some force. Such a holist has no external perspective from which to evaluate the mathematics in scientific theory as merely useful. The holist cannot say, “Well, I commit to mathematical objects within scientific theory, but I don’t really mean that they exist.”

In contrast, the opponent of PS may abandon the claim that our most sincere commitments are found in the quantifications of our single best theory. Instead, such an opponent might claim that only objects which have causal relations to ordinary physical objects exist. Such a critic is free to deny that mathematical objects exist, despite their utility in science, and nothing in PS prevents such a move, in the way that QI1-QI3 do for Quine’s original argument.

More importantly, any account of the applicability of mathematics to the natural world other than the indispensabilist’s refutes PS3. For example, Mark Balaguer’s plenitudinous platonism claims that mathematics provides a theoretical apparatus which applies to all possible states of the world. (See Balaguer 1998.) It explains the applicability of mathematics to the natural world, non-miraculously, since any possible state of the natural world will be described by some mathematical theory.

Similarly, and more influentially, Hartry Field has argued that the reason that mathematics is successful as the language of science is because it is conservative over nominalist versions of scientific theories. (See Field 1980, especially the preliminary remarks and Chapter 1.) In other words, Field claims that mathematics is just a convenient shorthand for a theory which includes no mathematical axioms.

In response, one could amend PS3 to improve Putnam’s argument:

PS3*: Realism best explains the success of mathematics as the language of science.

A defense of the new argument, PS*, would entail showing that realism is a better explanation of the utility of mathematics than other options.

4. Resnik’s Pragmatic Indispensability Argument

Michael Resnik, like Putnam, presents both a holistic indispensability argument, such as Quine’s, and a non-holistic argument called the pragmatic indispensability argument. In the pragmatic argument, Resnik first links mathematical and scientific justification.

RP: RP1. In stating its laws and conducting its derivations, science assumes the existence of many mathematical objects and the truth of much mathematics.
RP2. These assumptions are indispensable to the pursuit of science; moreover, many of the important conclusions drawn from and within science could not be drawn without taking mathematical claims to be true.
RP3. So, we are justified in drawing conclusions from and within science only if we are justified in taking the mathematics used in science to be true.
RP4. We are justified in using science to explain and predict.
RP5. The only way we know of using science thus involves drawing conclusions from and within it.
RPC. So, by RP3, we are justified in taking mathematics to be true (Resnik 1997: 46-8).

RP, like PS, avoids the problems that may undermine our confidence in science. Even if our best scientific theories are false, their undeniable practical utility still justifies our using them. RP states that we need to presume the truth of mathematics even if science is merely useful. The key premises for RP, then, are the first two. If we can also take mathematics to be merely useful, then those premises are unjustified. The question for the proponent of RP, then, is how to determine whether science really presumes the existence of mathematical objects, and mathematical truth. How do we determine the commitments of scientific theory?

We could ask scientists about their beliefs, but they may work without considering the question of mathematical truth at all. Like PS, RP seems vulnerable to the critic who claims that the same laws and derivations in science can be stated while taking mathematics to be merely useful. (See Azzouni 2004, Leng 2005, and Melia 2000.) The defender of RP needs a procedure for determining the commitments of science that blocks such a response, if not a more general procedure for determining our commitments. RP may thus be best interpreted as a reference back to Quine’s holistic argument, which provided both.

5. The Explanatory Indispensability Argument

Alan Baker and Mark Colyvan have defended an explanatory indispensability argument (Mancosu 2008: section 3.2; see also Baker 2005 and Lyon and Colyvan 2008).

EI: EI1. There are genuinely mathematical explanations of empirical phenomena.
EI2. We ought to be committed to the theoretical posits in such explanations.
EIC. We ought to be committed to the entities postulated by the mathematics in question.

EI differs from Quine’s original indispensability argument QI, and other versions of the argument, by seeking the justification for our mathematical beliefs in scientific explanations which rely on mathematics, rather than in scientific theories. Mathematical explanation and scientific explanation are difficult and controversial topics, beyond the scope of this article. Still, two comments are appropriate.

First, it is unclear whether EI is intended as a greater demand on the indispensabilist than the standard indispensability argument. Does the platonist have to show that mathematical objects are indispensable in both our best theories and our best explanations, or just in one of them? Conversely, must the nominalist dispense with mathematics in both theories and explanations?

Second, EI, like Putnam’s success argument and Resnik’s pragmatic argument, leaves open the question of how one is supposed to determine the commitments of an explanation. EI2 refers to the theoretical posits postulated by explanations, but does not tell us how we are supposed to figure out what an explanation posits. If the commitments of a scientific explanation are found in the best scientific theory used in that explanation, then EI is no improvement on QI. If, on the other hand, EI is supposed to be a new and independent argument, its proponents must present a new and independent criterion for determining the commitments of explanations.

Given the development of the explanatory indispensability argument and the current interest in mathematical explanation, it is likely that more work will be done on these questions.

6. Characteristics of Indispensability Arguments in the Philosophy of Mathematics

Quine’s indispensability argument relies on specific claims about how we determine our commitments, and on what our canonical language must be. Putnam’s success argument, Resnik’s pragmatic argument, and the explanatory indispensability argument are more general, since they do not specify a particular method for determining the commitments of a theory. Putnam and Resnik maintain that we are committed to mathematics because of the ineliminable role that mathematical objects play in scientific theory. Proponents of the explanatory argument argue that our mathematical beliefs are justified by the role that mathematics plays in scientific explanations.

Indispensability arguments in the philosophy of mathematics can be quite general, and can rely on supposedly indispensable uses of mathematics in a wide variety of contexts. For instance, in later work, Putnam defends belief in mathematical objects for their indispensability in explaining our mathematical intuitions. (See Putnam 1994: 506.) Since he thinks that our mathematical intuitions derive exclusively from our sense experience, this later argument may still be classified as an indispensability argument.

Here are some characteristics of many indispensability arguments in the philosophy of mathematics, no matter how general:

Naturalism: The job of the philosopher, as of the scientist, is exclusively to understand our sensible experience of the physical world.
Theory Construction: In order to explain our sensible experience we construct a theory, or theories, of the physical world. We find our commitments exclusively in our best theory or theories.
Mathematization: Some mathematical objects are ineliminable from our best theory or theories.
Subordination of Practice: Mathematical practice depends for its legitimacy on natural scientific practice.

Although the indispensability argument is a late twentieth century development, earlier philosophers may have held versions of the argument. Mark Colyvan classifies arguments from both Frege and Gödel as indispensability arguments, on the strength of their commitments to Theory Construction and Mathematization. (See Colyvan 2001: 8-9.) Both Frege and Gödel, though, deny Naturalism and Subordination of Practice, so they are not indispensabilists according to the characterization in this section.

7. Responses to the Indispensability Argument

The most influential approach to denying the indispensability argument is to reject the claim that mathematics is essential to science. The main strategy for this response is to introduce scientific or mathematical theories which entail all of the consequences of standard theories, but which do not refer to mathematical objects. Such nominalizing strategies break into two groups.

In the first group are theories which show how to eliminate quantification over mathematical objects within scientific theories. Hartry Field has shown how we can reformulate some physical theories to quantify over space-time regions rather than over sets. (See Field 1980 and Field 1989.) According to Field, mathematics is useful because it is a convenient shorthand for more complicated statements about physical quantities. John Burgess has extended Field’s work. (See Burgess 1984, Burgess 1991a, and Burgess and Rosen 1997.) Mark Balaguer has presented steps toward nominalizing quantum mechanics. (See Balaguer 1998.)

The second group of nominalizing strategies attempts to reformulate mathematical theories to avoid commitments to mathematical objects. Charles Chihara (Chihara 1990), Geoffrey Hellman (Hellman 1989), and Hilary Putnam (Putnam 1967b and Putnam 1975a) have all explored modal reformulations of mathematical theories. Modal reformulations replace claims about mathematical objects with claims about possibility.

Another line of criticism of the indispensability argument is that the argument is insufficient to generate a satisfying mathematical ontology. For example, no scientific theory requires any more than ℵ1 sets; we don’t need anything nearly as powerful as the full ZFC hierarchy. But, standard set theory entails the existence of much larger cardinalities. Quine calls such un-applied mathematics “mathematical recreation” (Quine 1986b: 400).

The indispensabilist can justify extending mathematical ontology a little bit beyond those objects explicitly required for science, for simplicity and rounding out. But few indispensabilists have shown interest in justifying beliefs in, say, inaccessible cardinals. (Though, see Colyvan 2007 for such an attempt.) Thus, the indispensabilist has a restricted ontology. Similarly, the indispensability argument may be criticized for making mathematical epistemology a posteriori, rather than a priori, and for classifying mathematical truths as contingent, rather than necessary. Indispensabilists may welcome these departures from traditional interpretations of mathematics. (For example, see Colyvan 2001, Chapter 6.)

8. Inter-theoretic and Intra-theoretic Indispensability Arguments

Indispensability arguments need not be restricted to the philosophy of mathematics. Considered more generally, an indispensability argument is an inference to the best explanation which transfers evidence for one set of claims to another. If the transfer crosses disciplinary lines, we can call the argument an inter-theoretic indispensability argument. If evidence is transferred within a theory, we can call the argument an intra-theoretic indispensability argument. The indispensability argument in the philosophy of mathematics transfers evidence from natural science to mathematics. Thus, this argument is an inter-theoretic indispensability argument.

One might apply inter-theoretic indispensability arguments in other areas. For example, one could argue that we should believe in gravitational fields (physics) because they are ineliminable from our explanations of why zebras do not go flying off into space (biology). We might think that biological laws reduce, in some sense, to physical laws, or we might think that they are independent of physics, or supervenient on physics. Still, our beliefs in some basic claims of physics seem indispensable to other sciences.

As an example of an intra-theoretic indispensability argument, consider the justification for our believing in the existence of atoms. Atomic theory makes accurate predictions which extend to the observable world. It has led to a deeper understanding of the world, as well as further successful research. Despite our lacking direct perception of atoms, they play an indispensable role in atomic theory. According to atomic theory, atoms exist. Thus, according to an intra-theoretic indispensability argument, we should believe that atoms exist.

As an example of an intra-theoretic indispensability argument within mathematics, consider Church’s Thesis. Church’s Thesis claims that our intuitive notion of an algorithm is equivalent to the technical notion of a recursive function. Church’s Thesis is not provable, in the ordinary sense. But, it might be defended by using an intra-theoretic indispensability argument: Church’s Thesis is fruitful, and, arguably, indispensable to our understanding of mathematics.  For another example, Quine’s argument for QI2, that we must believe in the commitments of any theory we accept, might itself also be called an intra-theoretic indispensability argument.

9. Conclusion

There are at least three ways of arguing for empirical justification of mathematics. The first is to argue, as John Stuart Mill did, that mathematical beliefs are about ordinary, physical objects to which we have sensory access. The second is to argue that, although mathematical beliefs are about abstract mathematical objects, we have sensory access to such objects. (See Maddy 1990.) The currently most popular way to justify mathematics empirically is to argue:

A. Mathematical beliefs are about abstract objects;

B. We have experiences only with physical objects; and yet

C. Our experiences with physical objects justify our mathematical beliefs.

This is the indispensability argument in the philosophy of mathematics.

10. References and Further Reading

  • Azzouni, Jody.  2004. Deflating Existential Consequence: A Case for Nominalism. Oxford University Press.
  • Azzouni, Jody.  1998. “On ‘On What There Is’.”  Pacific Philosophical Quarterly 79: 1-18.
  • Azzouni, Jody.  1997b. “Applied Mathematics, Existential Commitment, and the Quine-Putnam Indispensability Thesis.”  Philosophia Mathematica (3) 5: 193-209.
  • Baker, Alan.  2005. “Are there Genuine Mathematical Explanations of Physical Phenomena?” Mind: 114: 223-238.
  • Baker, Alan.  2003. “The Indispensability Argument and Multiple Foundations for Mathematics.” The Philosophical Quarterly 53.210: 49-67.
  • Baker, Alan.  2001. “Mathematics, Indispensability and Scientific Practice.” Erkenntnis 55: 85 116.
  • Balaguer, Mark.  1998. Platonism and Anti-Platonism in Mathematics. New York: Oxford University Press.
  • Balaguer, Mark.  1996. “Toward a Nominalization of Quantum Mechanics.”  Mind 105: 209-226.
  • Bangu, Sorin Ioan.  2008. “Inference to the Best Explanation and Mathematical Realism.” Synthese 160: 13-20.
  • Benacerraf, Paul.  1973.  “Mathematical Truth.”   In Paul Benacerraf and Hilary Putnam, eds., Philosophy of Mathematics: Selected Readings, second edition, Cambridge: Cambridge University Press, 1983.
  • Burgess, John.  1991b. “Synthetic Physics and Nominalist Realism.”  In C. Wade Savage and Philip Ehrlich, eds., Philosophical and Foundational Issues in Measurement Theory, Hillsdale: Lawrence Erlbaum Associates, 1992.
  • Burgess, John.  1991a. “Synthetic Mechanics Revisited.” Journal of Philosophical Logic 20: 121-130.
  • Burgess, John.  1984. “Synthetic Mechanics.” Journal of Philosophical Logic 13: 379-395.
  • Burgess, John.  1983. “Why I am Not a Nominalist.”  Notre Dame Journal of Formal Logic 24.1: 93-105.
  • Burgess, John, and Gideon Rosen.  1997. A Subject with No Object. New York: Oxford University Press.
  • Cartwright, Nancy.  1983. How the Laws of Physics Lie. Oxford: Clarendon Press.
  • Chihara, Charles.  1990. Constructibility and Mathematical Existence. Oxford: Oxford University Press.
  • Colyvan, Mark.  2007.  “Mathematical Recreation versus Mathematical Knowledge.”  In Leng, Mary, Alexander Paseau and Michael Potter, eds, Mathematical Knowledge, Oxford University Press, 109-122.
  • Colyvan, Mark.  2002. “Mathematics and Aesthetic Considerations in Science.”  Mind 111: 69-78.
  • Colyvan, Mark.  2001. The Indispensability of Mathematics. Oxford University Press.
  • Field, Hartry.  1989. Realism, Mathematics, and Modality. Oxford: Basil Blackwell.
  • Field, Hartry.  1980. Science Without Numbers. Princeton: Princeton University Press.
  • Frege, Gottlob.  1953. The Foundations of Arithmetic. Evanston: Northwestern University Press.
  • Gödel, Kurt.  1963. “What is Cantor’s Continuum Problem?” In Paul Benacerraf and Hilary Putnam, eds., Philosophy of Mathematics: Selected Readings, second edition. Cambridge: Cambridge University Press, 1983.
  • Gödel, Kurt.  1961. “The Modern Development of the Foundations of Mathematics in the Light of Philosophy.”  In Solomon Feferman et al., eds., Kurt Gödel: Collected Works, Vol. III. New York: Oxford University Press, 1995.
  • Hellman, Geoffrey.  1989. Mathematics Without Numbers. New York: Oxford University Press.
  • Hintikka, Jaakko.  1973. “Quantifiers vs Quantification Theory.”  Dialectica 27.3: 329-358.
  • Leng, Mary.  2005.  “Mathematical Explanation.”  In Cellucci, Carlo and Donald Gillies eds, Mathematical Reasoning and Heuristics, King's College Publications, London, 167-189.
  • Lyon, Aidan and Mark Colyvan.  2007. “The Explanatory Power of Phase Spaces.” Philosophia Mathematica 16.2: 227-243.
  • Maddy, Penelope.  1992. “Indispensability and Practice.”  The Journal of Philosophy 89: 275-289.
  • Maddy, Penelope.  1990. Realism in Mathematics. Oxford: Clarendon Press.
  • Mancosu, Paolo.  2008. “Explanation in Mathematics.”  The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.).
  • Marcus, Russell.  2007. “Structuralism, Indispensability, and the Access Problem.” Facta Philosophica 9, 2007: 203-211.
  • Melia, Joseph.  2002. “Response to Colyvan.”  Mind 111: 75-79.
  • Melia, Joseph.  2000. “Weaseling Away the Indispensability Argument.”  Mind 109: 455-479.
  • Melia, Joseph.  1998. “Field’s Programme: Some Interference.”  Analysis 58.2: 63-71.
  • Mendelson, Elliott.  1997. Introduction to Mathematical Logic, 4th ed.  Chapman & Hall/CRC.
  • Mill, John Stuart.  1941. A System of Logic, Ratiocinative and Inductive: Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation. London: Longmans, Green.
  • Putnam, Hilary.  1994.  “Philosophy of Mathematics: Why Nothing Works.” In his Words and Life. Cambridge: Harvard University Press.
  • Putnam, Hilary.  1975b. Mathematics, Matter, and Method: Philosophical Papers, Vol. I. Cambridge: Cambridge University Press.
  • Putnam, Hilary.  1975a. “What is Mathematical Truth?”  In Putnam 1975b.
  • Putnam, Hilary.  1974. “Science as Approximation to Truth.”  In Putnam 1975b.
  • Putnam, Hilary.  1971. Philosophy of Logic. In Putnam 1975b.
  • Putnam, Hilary.  1967b. “Mathematics Without Foundations.”  In Putnam 1975b.
  • Putnam, Hilary.  1967a. “The Thesis that Mathematics is Logic.”  In Putnam 1975b.
  • Putnam, Hilary.  1956. “Mathematics and the Existence of Abstract Entities.”  Philosophical Studies 7: 81-88.
  • Quine, W.V. 1995. From Stimulus to Science. Cambridge: Harvard University Press.
  • Quine, W.V. 1986b. “Reply to Charles Parsons.” In Lewis Edwin Hahn and Paul Arthur Schilpp, eds., The Philosophy of W.V. Quine. La Salle: Open Court, 1986.
  • Quine, W.V. 1986a. Philosophy of Logic, 2nd edition.  Cambridge: Harvard University Press.
  • Quine, W.V. 1981. Theories and Things. Cambridge: Harvard University Press.
  • Quine, W.V.  1980. From a Logical Point of View. Cambridge: Harvard University Press.
  • Quine, W.V.  1978. “Success and the Limits of Mathematization.”  In Quine 1981.
  • Quine, W.V.  1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
  • Quine, W.V.  1960. Word and Object. Cambridge: The MIT Press.
  • Quine, W.V.  1958. “Speaking of Objects.”  In Quine 1969.
  • Quine, W.V.  1955. “Posits and Reality.”  In The Ways of Paradox. Cambridge: Harvard University Press, 1976.
  • Quine, W.V.  1951. “Two Dogmas of Empiricism.”  In Quine 1980.
  • Quine, W.V.  1948. “On What There Is.”  In Quine 1980.
  • Quine, W.V.  1939. “Designation and Existence.”  In Feigl and Sellars, Readings in Philosophical Analysis, Appleton-Century-Crofts, Inc., New York: 1940.
  • Resnik, Michael.  1997. Mathematics as a Science of Patterns. Oxford: Oxford University Press.
  • Resnik, Michael.  1995. “Scientific vs. Mathematical Realism: The Indispensability Argument.”  Philosophia Mathematica (3) 3: 166-174.
  • Resnik, Michael D.  1993. “A Naturalized Epistemology for a Platonist Mathematical Ontology.”  In Sal Restivo, et al., eds., Math Worlds: Philosophical and Social Studies of Mathematics and Mathematics Education, Albany: SUNY Press, 1993.
  • Shapiro, Stewart.  1991. Foundations without Foundationalism: A Case for Second-Order Logic. Oxford: Oxford University Press.
  • Sher, Gila. 1990. “Ways of Branching Quantifiers.” Linguistics and Philosophy 13: 393-422.
  • Sober, Elliott.  1993. “Mathematics and Indispensability.”  The Philosophical Review 102: 35-57.
  • Tarski, Alfred.  1944. “The Semantic Conception of Truth and the Foundations of Semantics.” Philosophy and Phenomenological Research 4.3: 341-376.
  • Van Fraassen, Bas C.  1980. The Scientific Image. Oxford: Clarendon Press.
  • Whitehead, Alfred North and Bertrand Russell.  1997. Principia Mathematica to *56. Cambridge University Press.

Author Information

Russell Marcus
Email: rmarcus1@hamilton.edu
Hamilton College
U. S. A.

Fictionalism in the Philosophy of Mathematics

The distinctive character of fictionalism about any discourse is (a) recognition of some valuable purpose to that discourse, and (b) the claim that that purpose can be served even if sentences uttered in the context of that discourse are not literally true. Regarding (b): if the discourse in question involves mathematics, pure or applied, the core of the mathematical fictionalist’s view is that the purpose of engaging in that discourse can be served even if the mathematical utterances one makes in the context of that discourse are not true (or, in the case of negative existentials such as ‘There are no square prime numbers’, are only trivially true).

Regarding (a): in developing mathematical fictionalism, fictionalists must add to this core view at the very least an account of the value of mathematical inquiry and an explanation of why this value can be expected to be served even if we do not assume the literal or face-value truth of mathematics.

The label ‘fictionalism’ suggests a comparison of mathematics with literary fiction, and although the fictionalist may wish to draw only the minimal comparison that both mathematics and fiction can be good without being true, fictionalists may also wish to develop this analogy in further dimensions, for example by drawing on discussions of the semantics of fiction, or on how fiction can represent. Before turning to these issues, though, this article considers what the literal truth of a sentence uttered in the context of mathematical inquiry would amount to, so as to understand the position that fictionalists wish to reject.

Table of Contents

  1. Face-value Semantics, Platonism, and its Competitors
  2. The Fictionalist’s Attitude: Acceptance without Belief
  3. Preface or Prefix Fictionalism?
  4. Mathematical Fictionalism and Empirical Theorizing
    1. Mathematical Fictionalism + Scientific Realism
    2. Mathematical Fictionalism + Nominalistic Scientific Realism
    3. Mathematical Fictionalism + Constructive Empiricism (Bueno)
  5. Hermeneutic or Revolutionary Fictionalism
  6. References and Further Reading

1. Face-value Semantics, Platonism, and its Competitors

In the context of quite ordinary mathematical theorizing we find ourselves uttering sentences whose literal or ‘face-value’ truth would seem to require the existence of mathematical objects such as numbers, functions, or sets.  Thus: ‘2 is an even number’ appears to be of subject-predicate form, with the singular term ‘2’ purporting to stand for an object which is said to have the property of being an even number.  ‘The empty set has no members’ uses a definite description, and at least since Russell presented his theory of definite descriptions it has standardly been assumed that the truth of such sentences requires, at a minimum, the existence and uniqueness of something satisfying the descriptive phrase ‘is an empty set’.  Most stark, though, is the use of the existential quantifier in the sentences used to express our mathematical theories.  Euclid proved a theorem whose content we would express by means of the sentence ‘There are infinitely many prime numbers’.  One would have to make some very fancy manoeuvres indeed to construe this sentence as requiring anything less than the existence of numbers – infinitely many of them. (For an argument against the ‘ontologically committing’ reading of ‘there is’, see Jody Azzouni (2004: 67), who holds that ‘there is’ in English functions as an ‘ontologically neutral anaphora’.  Azzouni’s position on ontological commitment is discussed helpfully in Joseph Melia’s (2005) online review of Azzouni’s book.  For the remainder of this article, though, we will assume, contrary to Azzouni’s position, that the literal truth of sentences of the form ‘there are Fs’ requires the existence of Fs.)

Similar points about the ‘face-value’ commitments of our ordinary utterances can be made if we move outside of the context of pure mathematics to sentences uttered in the context of ordinary day-to-day reasoning, or in the context of empirical science.  As Hilary Putnam (1971) famously pointed out, in stating the laws of our scientific theories we make use of sentences that, at face value, are dripping with commitments to mathematical objects.  Thus, Newton’s law of universal gravitation says that, between any two massive objects a and b, there is a force whose magnitude F is directly proportional to the product of the masses m_a and m_b of those objects and inversely proportional to the square of the distance d between them (F = Gm_am_b/d^2).  Unpacking this statement a little, we see that it requires that, corresponding to any massive object o, there is a real number m_o representing its mass as a multiple of some unit of mass; corresponding to any distance between two objects there is a real number d representing that distance as a multiple of some unit of distance; and corresponding to any force there is a real number F representing the magnitude of that force as a multiple of some unit of force.  So, conjoined with the familiar truth that there are massive objects, the literal truth of Newton’s law requires not only that there be forces acting on these objects, and distances separating them, but that there be real numbers corresponding appropriately to masses, forces, and distances, and related in such a way that F = Gm_am_b/d^2.  So Newton’s law, taken literally, requires the existence of real numbers and of correspondences of objects with real numbers (that is, functions).
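The commitments Putnam highlights can be set out explicitly. The following is a schematic rendering (a sketch, not Putnam’s or Field’s own notation):

```latex
% Newton's law of universal gravitation, read at face value:
% for any two massive objects a and b,
F \;=\; \frac{G\, m_a m_b}{d^2}
% where m_a and m_b are real numbers representing the masses, d a real
% number representing the distance, F a real number representing the
% magnitude of the force, and G the gravitational constant. Taken
% literally, the law quantifies over real numbers and over functions
% assigning real numbers (masses, distances, forces) to objects.
```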

These examples show that, on a literal or face-value reading, some of the sentences used to express our mathematical and scientific theories imply the existence of mathematical objects.  The theories that are expressed by means of such sentences are thus said to be ontologically committed to mathematical objects.  Furthermore, if we interpret these sentences at face value, and if we endorse those sentences when so interpreted (accepting them as expressing truths on that face value interpretation), then it seems that we too, by our acceptance of the truth of such sentences, are committed to an ontology that includes such things as numbers, functions, and sets.

Mathematical realists standardly endorse a face-value reading of those sentences used to express our mathematical and scientific theories, and accept that such sentences so interpreted express truths.  They therefore commit themselves to accepting the existence of mathematical objects.  In inquiring into the nature of the objects to which they are thereby committed, mathematical platonists typically go on to state that the objects to which they are committed are abstract, where this is understood negatively to mean, at a minimum, non-spatiotemporal, acausal, and mind-independent.  But many philosophers are wary of accepting the existence of objects of this sort, not least because (as Benacerraf  (1973) points out) their negative characterization renders it difficult, if not impossible, to account for our ability to have knowledge of such things.  And even without such specific epistemological worries, general ‘Ockhamist’ tendencies warn that we should be wary of accepting the existence of abstract mathematical objects unless the assumption that there are such things proves to be unavoidable.  For many philosophers then, fictionalists included, mathematical platonism presents itself as a last resort – a view to be adopted only if no viable alternative that does not require belief in the existence of abstract mathematical objects presents itself.

What, then, are the alternatives to platonism?  One might reject the face-value interpretation of mathematical sentences, holding that these sentences are true, but that their truth does not (despite surface appearances) require the existence of mathematical objects.  Defenders of such an alternative must provide a method for reinterpreting those sentences of mathematical and empirical discourse that appear to imply the existence of abstract mathematical objects so that, when so-interpreted, these implications disappear.  Alternatively, one might accept the face-value semantics, but reject the truth of the sentences used to express our mathematical theories.  Assuming that it is an advantage of standard platonism that it provides a standard semantics for the sentences used to express our mathematical theories, and that it preserves our intuition that many of these sentences assert truths, each of these options preserves one advantage at the expense of another.  A final alternative is to reject both the face-value interpretation of mathematical sentences and to reject the truth of mathematical sentences once reinterpreted.  This apparently more drastic response is behind at least one position in the philosophy of mathematics that reasonably calls itself fictionalist (Hoffman (2004), building on Kitcher (1984); Hoffman’s claim is that ordinary mathematical utterances are best interpreted as making claims about the collecting and segregating abilities of a (fictional) ideal mathematician.   These claims are not literally true, since the ideal mathematician does not really exist, but there is a standard for correctness for such claims, given by the ‘story’ provided of the ideal agent and his abilities).  However, the label ‘fictionalism’ in the philosophy of mathematics is generally used to pick out positions of the second kind, and that convention will be adhered to in what follows.  
That is, according to mathematical fictionalists, sentences of mathematical and mathematically-infused empirical discourse should be interpreted at face value as implying the existence of mathematical objects, but we should not accept that such sentences so-interpreted express truths.  As such, mathematical fictionalism is an error theory with respect to ordinary mathematical and empirical discourse.

2. The Fictionalist’s Attitude: Acceptance without Belief

At a minimum, then, mathematical fictionalists accept a face-value reading of sentences uttered in the context of mathematical and ordinary empirical theorizing, but when those sentences are, on that reading, committed to the existence of mathematical objects, mathematical fictionalists do not accept those sentences to be true.   Some fictionalists, e.g., Field (1989, 45), will go further than this and say that we ought to reject such sentences as false, taking it to be undue epistemic caution to maintain agnosticism rather than rejecting the existence of mathematical objects once it is recognized that we have no reason to believe that there are such things.  Whether one follows Field in rejecting the existence of mathematical objects will depend on one’s motivation for fictionalism.  A broadly naturalist motivation, according to which we should accept the existence of all and only those objects whose existence is confirmed according to our best scientific standards, would seem to counsel disbelief.  On the other hand, fictionalists who reach that position from a starting point of constructive empiricism will take the agnosticism about the unobservable endorsed by that position to apply also in the mathematical case.  Either way, though, fictionalists will agree that one ought not accept the truth of sentences that, on a literal reading, are committed to the existence of mathematical objects.

But simply refusing to accept the truth of sentences of a discourse does not amount to fictionalism with respect to that discourse: one may, for example, refuse to accept the truth-at-face-value of sentences uttered by homeopaths in the context of their discourse concerning homeopathic medicine, but doing so would be indicative of a healthy scepticism rather than a fictionalist approach to the claims of homeopathy.  What is distinctive about fictionalism is that fictionalists place some value on mathematical theorizing: they think that there is some valuable purpose to engaging in discourse apparently about numbers, functions, sets, and so on, and that that purpose is not lost if we do not think that the utterances of our discourse express truths.

Mathematical fictionalists, while refusing to accept the truth of mathematics, do not reject mathematical discourse.  They do not want mathematicians to stop doing mathematics, or empirical scientists to stop doing mathematically-infused empirical science. Rather, they advocate taking an attitude sometimes called acceptance to the utterances of ordinary mathematical discourse.  That is, they advocate making full use of those utterances in one’s theorizing without holding those utterances to be true (an attitude aptly described by Chris Daly (2008, 426) as exploitation). It is, therefore, rather misleading that the locus classicus of mathematical fictionalism is entitled Science without Numbers.  As we will see below, much of Field’s efforts in Science without Numbers are focussed on explaining why scientists can carry on exploiting the mathematically-infused theories they have always used, without committing themselves to the truth of the mathematics assumed by those theories.

The reasonableness of an attitude of acceptance or exploitation rather than belief will depend on the mathematical fictionalist’s analysis of the purpose of mathematical theorizing: what advantages are gained by speaking as if mathematical sentences are true, and are these advantages ones that we can reasonably expect to remain if the sentences of mathematical discourse are not in fact true?  But even prior to consideration of this question, it has been questioned (e.g., by Horwich (1991), and O’Leary-Hawthorne (1994)) whether adopting the proposed attitude of mere acceptance without belief is even possible, given that, plausibly, one’s belief states are indicated by one’s propensities to behave in a particular way, and fictionalists advocate behaving ‘as if’ they are believers, when engaging in a discourse they purport to accept. (These objections are aimed at Bas van Fraassen’s constructive empiricism (van Fraassen 1980), but apply equally well to fictionalists who do not believe our standard mathematical and scientific theories but similarly wish to continue to immerse themselves in ordinary mathematical and scientific activity, despite their reservations.)  Daly (2008) responds to this objection on the fictionalist’s behalf.

3. Preface or Prefix Fictionalism?

Mathematical fictionalists choose to speak ‘as if’ there are numbers, even though they do not believe that there are such things.  How should we understand their ‘disavowal’?  As David Lewis (2005: 315) points out, “There are prefixes or prefaces (explicit or implicit) that rob all that comes after of assertoric force.  They disown or cancel what follows, no matter what that may be,” and we might imagine mathematical fictionalists as implicitly employing a prefix or preface that stops them from asserting, when doing mathematics, what they readily deny as metaphysicians.  The difference between a prefix and a preface is that “When the assertoric force of what follows is cancelled by a prefix, straightaway some other assertion takes place… Not so for prefaces.”  (ibid. 315)  Thus, preceding the sentence ‘Holmes lived at 221B Baker Street’ with the prefix ‘According to the Sherlock Holmes stories…’ produces another sentence with assertoric force, whose force is not disowned.  On the other hand, in preceding that very same sentence with the preface ‘Let’s make believe the Holmes stories are true, though they aren’t’, one is not making a further assertion, but rather indicating that one is stepping back from the business of making assertions.

When mathematical fictionalists speak ‘as if’ there are numbers, can we, then, read them as implicitly employing a disowning prefix or preface?  When they utter the sentence ‘There are infinitely many prime numbers’, should we ‘hear’ them as really having begun under their breath with the prefix ‘According to standard mathematics…’, or as having prefaced their utterance with a mumbled: ‘Let’s make believe that the claims of standard mathematics are true, though they aren’t’?  Either reading is possible, but a prefix fictionalism would make the mathematical fictionalist’s insistence on a standard semantics for ordinary mathematical utterances rather less distinctive.  The claim that we have no reason to believe that ordinary mathematical utterances taken at face-value are true, but that the addition of an appropriate prefix transforms them into true claims, may be correct, but it does not sufficiently distinguish fictionalism so-construed from those reinterpretive anti-Platonist views that simply hold that ordinary mathematical sentences should be given a non-standard semantics according to which they assert truths.  For example, Geoffrey Hellman’s modal structuralism holds that a mathematical sentence S uttered in the context of a mathematical theory T should be read as essentially ‘saying’: ‘T is consistent and it follows from the axioms of T that S’ (or, in other words, ‘According to (the consistent axioms of) a standard mathematical theory, S’, which is the result of applying a plausible fictionalist prefix to S).  Whether or not one reads this prefix into the semantics of mathematical claims or advocates a standard semantics but places all the value of ‘speaking as if’ there are mathematical objects on the possibility of construing one’s utterances as so-prefixed is arguably a matter of taste.
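Hellman’s translation scheme is officially stated in second-order modal logic; a simplified rendering of its two components (following the usual presentation of Hellman 1989, not a quotation) is:

```latex
% Modal-structural reading of a mathematical sentence S of theory T:
\underbrace{\Diamond\,\exists X\, T(X)}_{\text{some $T$-structure is possible}}
\;\wedge\;
\underbrace{\Box\,\forall X\,\bigl(T(X)\rightarrow S(X)\bigr)}_{\text{necessarily, $S$ holds in any $T$-structure}}
```

The first conjunct corresponds to the consistency claim, the second to ‘it follows from the axioms of T that S’.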

To take seriously the mathematical fictionalist’s insistence on a standard semantics, then, it is perhaps better to view mathematical fictionalists as implicitly or explicitly preceding their mathematical utterances with a disavowing preface which excuses them from the business of making assertions when they utter sentences whose literal truth would require the existence of mathematical objects.  But this, of course, raises the question of what they think they are doing when they engage, as fictionalists, in mathematical theorizing (both in the context of pure mathematics and in the context of empirical science). Pure mathematics does not present a major difficulty here – fictionalists may, for example, take the purpose of speaking as if the assumptions of our mathematical theories were true to be to enable us easily to consider what follows from those assumptions.  Pure mathematical inquiry can then be considered as speculative inquiry into what would be true if our mathematical assumptions were true, without concern about the question of whether those assumptions are in fact true, and it is perfectly reasonable to carry out such inquiry as one would a conditional proof, taking mathematical axioms as undischarged assumptions.  But in the context of empirical science this answer is not enough.  In empirical scientific theorizing we require that at least some of our theoretical utterances (minimally, those that report or predict observations) be true, and part of the purpose of engaging in empirical scientific theorizing is to justify our unconditional assertion of the empirical consequences of our theories.  Despite the mathematical fictionalist’s disavowing preface, insulating them from the business of making assertions when they utter mathematical sentences in the context of their empirical theorizing, they are not, and would not want to be, entirely excused from the business of assertion.  
The most pressing problem for mathematical fictionalists is to explain why they are licensed to endorse the truth of some, and only some, utterances made in the context of ordinary, mathematically-infused, empirical theorizing.

4. Mathematical Fictionalism and Empirical Theorizing

As we have already noted, our ordinary empirical theorizing is mathematical through and through.  We use mathematics in stating the laws of our scientific theories, in describing and organising the data to which those theories are applied, and in drawing out the consequences of our theoretical assumptions.  If we believe that the mathematical assumptions utilized by those theories are true, and also believe any non-mathematical assumptions we make use of, then we have no difficulty justifying our belief in any empirical consequences we validly derive from those assumptions: if the premises of our arguments are true then the truth of the conclusions we derive will be guaranteed as a matter of logic.  On the other hand, though, if we do not believe the mathematical premises in our empirical arguments, what reason have we to believe their conclusions?

Different versions of mathematical fictionalism take different approaches to answering this question, depending on how realist they wish to be about our scientific theories.  In particular, mathematical fictionalism can be combined with scientific realism (Hartry Field); with ‘nominalistic scientific realism’, or entity realism (Mark Balaguer, Mary Leng); and with constructive empiricism (Otávio Bueno).  We will consider these combinations separately.

a. Mathematical Fictionalism + Scientific Realism

I will here reserve the label ‘scientific realism’ for the Putnam-Boyd formulation of the view.  In Putnam’s words (1975: 73), scientific realists in this sense hold “that terms typically refer… that the theories accepted in mature science are typically approximately true, [and] that the same terms can refer to the same thing even when they occur in different theories”.  It is the first two parts of Putnam’s tripartite characterization that are particularly problematic for mathematical fictionalists, the first suggesting a standard semantics and the second a commitment to the truth of scientific theories.  If our mature scientific theories include, as Putnam himself contends that they do, statements whose (approximate) truth would require the existence of mathematical objects, then the combination of scientific realism with mathematical fictionalism seems impossible.

If one accepts scientific realism so-formulated, then what prospects are there for mathematical fictionalism?  The only room for manoeuvre comes with the notion of a mature scientific theory.  Certainly, in formulating the claims of our ordinary scientific theories we make use of sentences whose literal truth would require the existence of mathematical objects.  But we also make use of sentences whose literal truth would require the existence of ideal objects such as point masses or continuous fluids, and we generally do not take our use of such sentences to commit us to the existence of such objects.  Quine’s view is that, in our best, most careful expressions of mature scientific theories, sentences making apparent commitments to such objects will disappear in favour of literally true alternatives that carry with them no such commitments.  That is, we are not committed to point masses or continuous fluids because these theoretical fictions can be dispensed with in our best formulation of these mature theories.  Hartry Field, who wishes to combine scientific realism with mathematical fictionalism, thinks that the same can be said for the mathematical objects to which our ordinary scientific theories appear to be committed: in our best expressions of those theories, sentences whose literal truth would require the existence of such objects can be dispensed with.

Hence, the so-called ‘indispensability argument’ for mathematical platonism, and Field’s scientific realist response to this argument:

P1 (Scientific Realism): We ought to believe that the sentences used to express our best (mature) scientific theories, when taken at face value, are true or approximately true.

P2 (Indispensability):  Sentences whose literal truth would require the existence of mathematical objects are indispensable to our best formulations of our best scientific theories.

Therefore:

C (Mathematical Platonism): We ought to believe in the existence of mathematical objects.

(For an alternative formulation of the argument, and defence, see Colyvan (2001).)  In his defence of mathematical fictionalism, Field rejects P2, arguing that we can dispense with commitments to mathematical objects in our best formulations of our scientific theories.  In Science without Numbers (1980) Field makes the case for the dispensability of mathematics in Newtonian science, sketching how to formulate the claims of Newtonian gravitational theory without quantifying over mathematical objects.

But Field is a fictionalist about mathematics, not a mere skeptic about mathematically-stated theories.  That is, Field thinks that there is some value to speaking ‘as if’ there are mathematical objects, even though he does not accept that there really are such things.  The claim that we can dispense with mathematics in formulating the laws of our best scientific theories is, therefore, only the beginning of the story for Field: he also wishes to explain why it is safe for us to use our ordinary mathematical formulations of our scientific theories in our day-to-day theorizing about the world.

Field’s answer to this question is that our ordinary (mathematically-stated) scientific theories are conservative extensions of the literally true non-mathematical theories that we come to once we dispense with mathematics in our theoretical formulations.  A mathematically-stated empirical theory P is a conservative extension of a nominalistically stated theory N just in case any nominalistically stated consequence A of P is also a consequence of the nominalistic theory N.  Or, put another way, suppose we have an ordinary (mathematically-expressed, and therefore platonistic) scientific theory P and a nominalistically acceptable reformulation of that theory, N.  The nominalistically acceptable reformulation will aim to preserve P’s picture of the nonmathematical realm while avoiding positing the existence of any mathematical objects.  If this reformulation is successful, then every nonmathematical fact about the nonmathematical realm implied by P will also be implied by N.  In fact, typically, P will be identical to N + S: the combination of the nominalistic theory N with a mathematical theory S, such as set theory with nonmathematical urelements, that allows one to combine mathematical and nonmathematical vocabulary, e.g., by allowing nonmathematical vocabulary to figure in its comprehension schema.  In the case of Newtonian gravitational theory, Field makes the case for having found the appropriate theory N by sketching a proof of a ‘representation theorem’ which links up nonmathematical laws of N with laws of P that, against the backdrop of N + S, are materially equivalent to the nonmathematical laws.
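Schematically, the conservativeness claim can be put as follows (a simplified statement; Field is careful about whether ‘consequence’ here means semantic or deductive consequence):

```latex
% Conservativeness of the mathematical theory S over the nominalistic
% theory N: for every nominalistically stated sentence A,
(N + S) \vDash A \quad\Longrightarrow\quad N \vDash A
% i.e., adding the mathematics (e.g., set theory with nonmathematical
% urelements) to N yields no new nominalistic consequences.
```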

Why, if we have a pair of such theories, N and P, does this give us license to believe the nonmathematical consequences of P, a theory whose truth we do not accept?  Simply because those consequences are already consequences of the preferred nominalistic theory N, which as scientific realists we take to be true or approximately true.  Our confidence in the truth of the nonmathematical consequences we draw from our mathematically stated scientific theories piggybacks on our confidence in the truth of the nonmathematical theories those theories conservatively extend.

But why, we may ask, should we bother with the mathematically-infused versions of our scientific theories if these theories simply extend our literally believed nominalistic theories by adding a body of falsehoods?  Field’s answer is that mathematics is an incredibly useful, practically (and sometimes even theoretically) indispensable tool that enables us to draw out the consequences of our nominalistic theories.  Nominalistically stated theories are unwieldy, and arguments from nominalistically stated premises to nominalistically stated conclusions, even when available, can be difficult to find and impractically long to write down.   With the help of mathematics, though, such problems can become tractable.  If we want to draw out the consequences of a body of nominalistically stated claims, we can use a ‘representation theorem’ to enable us to ascend to their platonistically stated counterparts, give a quick mathematical argument to some platonistically stated conclusions, then descend, again via the representation theorem, to nominalistic counterparts of those conclusions.  In short, following Carl G. Hempel’s image, mathematics has the function of a ‘theoretical juice extractor’ when applied to the nominalistic theory N:

Thus, in the establishment of empirical knowledge, mathematics (as well as logic) has, so to speak, the function of a theoretical juice extractor: the techniques of mathematical and logical theory can produce no more juice of factual information than is contained in the assumptions to which they are applied; but they may produce a great deal more juice of this kind than might have been anticipated upon a first intuitive inspection of those assumptions which form the raw material for the extractor. — C. G. Hempel (1945): 391

Extracting the consequences of our nominalistically stated theories ‘by hand’ is extremely time consuming (so much so as to make this procedure humanly impracticable without mathematics, as Ketland (2005) has pointed out).  Furthermore, if our nominalistic theories employ second-order logic in their formulation, some of these consequences can only be extracted with the help of mathematics (as noted by Field (1980: 115n. 30; 1985), Urquhart (1990: 151), and discussed in detail by Shapiro (1983)).  Nevertheless, even though mathematics is practically, and perhaps even theoretically, indispensable for extracting the juice of factual information from our nominalistically stated theories, its indispensability in this sense does not conflict with Field’s rejection of P2 of the indispensability argument as we have presented it, which requires only that we can state the assumptions of our best scientific theories in nonmathematical terms, not that we dispense with all uses of mathematics, for example in drawing out the consequences of those assumptions.

Field’s defense of fictionalism, though admirable, has its problems.  There are concerns both about Field’s dispensability claim and his conservativeness claim.  On the latter point, Shapiro objects that if our nominalistic scientific theories employ second-order logic, then the fact that mathematics is indispensable in drawing out some of the (semantic) consequences of those theories speaks against Field’s claim to have dispensed with mathematics.  As we have noted, indispensability in this sense does not affect Field’s ability to reject P2 of the original indispensability argument.  However, it does suggest a further indispensability argument that questions our license to use mathematics in drawing inferences if we do not believe that mathematics to be true.  Michael D. Resnik (1995: 169-70) expresses an argument of this form in his ‘Pragmatic Indispensability Argument’ as follows:

  1. In stating its laws and conducting its derivations science assumes the existence of many mathematical objects and the truth of much mathematics.
  2. These assumptions are indispensable to the pursuit of science; moreover, many of the important conclusions drawn from and within science could not be drawn without taking mathematical claims to be true.
  3. So we are justified in drawing conclusions from and within science only if we are justified in taking the mathematics used in science to be true.

Even if Field can dispense with mathematics in stating the laws of our scientific theories, the focus this argument places on the indispensability of mathematics in derivations, i.e., in drawing out the consequences of our scientific theories, presents a new challenge to Field’s program.

If we stick with second-order formulations of our nominalistic theories, then premise 2 of this argument is right in claiming that mathematics is indispensable to drawing out some of the consequences of these theories (in the sense that, for any consistent such theory and any sound derivation system for it, there will be semantic consequences of the theory that are not derivable within that derivation system).  But does the use of mathematics in uncovering the consequences of our theories require belief in the truth of the mathematics used?  Arguably, our reliance on, e.g., model theory in working out what follows from our nominalistic assumptions requires only that we believe in the consistency of our set theoretic models, not in the actual existence of those sets.  Perhaps, then, this form of the indispensability argument (based on the indispensability of mathematics in metalogic rather than in empirical science) can be answered without dispensing with mathematics in such cases.  (See, e.g., Field (1984); Leng (2007).)

Field (1985: 255), though, has expressed some concerns about the second-order version of Newtonian gravitational theory developed in Science without Numbers.  Aside from the worry about the need to rely on set theory to discover the consequences of our second-order theories, there are more general concerns about the nominalistic acceptability of second-order quantification (with Quine (1970: 66), most famously, complaining that second-order logic is simply ‘set theory in sheep’s clothing’).  While there are various defences of the nominalistic cogency of second-order logic available, Field’s own considered view is that second-order quantification is best avoided by nominalists.  This, however, rather complicates the account of applications given in Science without Numbers, since Field can no longer claim that our ordinary (mathematically stated) scientific theories conservatively extend their nominalistically stated counterparts.  Returning to the equation N + S = P above, what we can say is that for a first-order nominalistic theory N, and a mathematical theory S such as set theory with nonmathematical urelements, N + S will be a conservative extension of N.  But for our ordinary mathematically stated scientific theory P, we will not be able to find an N such that N + S = P.  In fact, P will in general have ‘greater’ nominalistic content than any proposed counterpart first-order theory N does, by virtue of ruling out some ‘non-standard’ models that N will allow (effectively, because P will imply the existence in spacetime of a standard model for the natural numbers, whereas N will always admit of non-standard models).  We thus do not have the neat representation theorems that allow us to move from claims of N to equivalent claims of P that the machinery of Field’s ‘theoretical juice extractor’ requires.
As Field (2005) concedes, at best we can have partial representation theorems that allow for some match between our mathematical and nonmathematical claims, without straightforward equivalence.

Setting aside the logical machinery required by Field’s account of applications, there are also concerns about the prospects for finding genuinely nominalistic alternatives to our current scientific theories.  Field’s sketched nominalization of Newtonian gravitational theory is meant to show the way for further nominalizations of contemporary theories.  But even if Field has succeeded in making the case for there being a genuinely nominalistic alternative to standard Newtonian science (something that has been questioned by those who are concerned about Field’s postulation of the existence of spacetime points with a structure isomorphic to the 4-dimensional real space R⁴), many have remained pessimistic about the prospect for extending Field’s technique to further theories, for various reasons.  As Alasdair Urquhart (1990) points out, Newtonian science is very convenient in that, since it assumes that spacetime has the structure of R⁴, it becomes easy to find claims about relations between spacetime points that correspond to mathematical claims expressed in terms of real numbers.  But contemporary science takes spacetime to have non-constant curvature, and with the lack of an isomorphism between spacetime and R⁴, the prospects for finding suitable representation theorems to match mathematical claims with claims expressed solely in terms of qualitative relations between spacetime points are less clear.  Furthermore, as David Malament (1982) notes, Newtonian science is likewise convenient in that its laws primarily concern spacetime points and their properties (the mass concentrated at a point, the distance between points, etc.).  But many of our best scientific theories (such as classical Hamiltonian mechanics or quantum mechanics) are standardly expressed as phase space theories, with their laws expressing relations between the possible states of a physical system.
An analogous approach to that of Science without Numbers would dispense with mathematical expressions of these relations in favour of nonmathematical expressions of the same – but this would still leave us with an ontology of possibilia, something that would presumably be at least as problematic as an ontology of abstract mathematical objects.  As Malament (1982: 533) points out, ‘Even a generous nominalist like Field cannot feel entitled to quantify over possible dynamical states.’  And finally, as Malament further notes, the case of quantum mechanics presents even more problems, since, as well as being a phase space theory, the Hilbert space formulation represents the quantum mechanical state of a physical system as an assignment of probabilities to measurement events.  What is given a mathematical measure is a proposition or eventuality (the proposition that ‘a measurement of observable A yields a value within the set Δ’ is assigned a probability).  But if propositions are the basic ‘objects’ whose properties are represented by the mathematical theory of Hilbert spaces, then applying an analogous approach to that of Science without Numbers would still throw up nominalistically unacceptable commitments.  For, Malament (1982: 534) asks, ‘What could be worse than propositions or eventualities’ for a nominalist such as Field?

These objections, while not conclusive, make clear just how much hard work remains for a full defense of Field’s dispensability claim.  It is not enough to rely on the sketch provided by Science without Numbers: mathematical fictionalists who wish to remain scientific realists in the Putnam-Boyd sense must show how they plan to dispense with mathematics in those cases where the analogy with Newtonian science breaks down (at a minimum, explaining how to deal with spacetime of non-constant curvature, phase space theories, and the probabilistic properties of quantum mechanics).  Mark Balaguer (1996) takes up the third of these challenges, arguing that we can dispense with quantum events just as long as we assume that there are physically real propensity properties of physical systems, an assumption that he claims to be ‘compatible with all interpretations of quantum mechanics except for hidden variables interpretations’ (1996: 217).  But there clearly remains much to be done to show that mathematical assumptions can be dispensed with in favour of nominalistically acceptable alternatives.

b. Mathematical Fictionalism + Nominalistic Scientific Realism

Despair about the prospects of completing Field’s project, as well as attention to the explanation of the applicability of mathematics that Field provides, has led some fictionalists (who, with a nod to Colyvan, I will label ‘Easy Road’ fictionalists) to wonder whether there isn’t an easier way to defend their position against the challenge of explaining why it is appropriate to trust the predictions of our mathematically-stated scientific theories if we do not believe those theories to be true.  Look again at Field’s explanation of the applicability of mathematics.  Field’s claim is, effectively, that our mathematically stated theories are predictively successful because they have a true ‘nominalistic’ core (as expressed, for theories for which we have dispensed with mathematics, by the claims of a nominalistic theory N).  What accounts for the trustworthiness of those theories is not that they are true (in their mathematical and nonmathematical parts), but that they are correct in the picture they paint of the nonmathematical realm.  Easy Road fictionalists then ask: doesn’t this explanation of the predictive success of a false theory undermine the case for scientific realism on the Putnam-Boyd formulation?

Realists typically claim that we have to believe that our scientific theories are true or approximately true, otherwise their predictive success would be miraculous (thus, according to Putnam (1975: 73), realism ‘is the only philosophy that doesn’t make the success of science a miracle’).  But Field’s explanation of the predictive success of ordinary (mathematically stated) Newtonian science shows how that success does not depend on the truth or approximate truth of that theory, only on its having a true ‘nominalistic content’ (as expressed in the nominalistic version of the theory).  Mightn’t we use this as evidence against taking the predictive success of even those theories for which we do not have a neat nominalistic alternative to be indicative of their truth?  Perhaps those, too, may be successful not because they are true in all their parts, but simply because they are correct in the picture they paint of the nonmathematical world?  The contribution mathematical assumptions might be making to our theoretical success would then not depend on the truth of those assumptions, but merely on their ability to enable us to represent systems of nonmathematical objects.  Mathematics provides us with an extremely rich language to describe such systems; maybe it is even indispensable to this purpose.  But if all that the mathematics is doing in our scientific theories is enabling us to form theoretically amenable descriptions of physical systems, then why take the indispensable presence of mathematical assumptions used for this purpose as indicative of their truth?

Thus, Mark Balaguer (1998: 131) suggests that fictionalists should not be scientific realists in the Putnam-Boyd sense, but should instead adopt ‘Nominalistic Scientific Realism’:

the view that the nominalistic content of empirical science—that is, what empirical science entails about the physical world—is true (or mostly true—there may be some mistakes scattered through it), while its platonistic content—that is, what it entails “about” an abstract mathematical realm—is fictional.

This view depends on holding that there is a nominalistic content to our empirical theories (even if this content cannot be expressed in nominalistic terms), and that it is reasonable to believe just this content (believing that, as we might say, our empirical theories are nominalistically adequate, not that they are true).    Similar claims (about the value of our mathematically stated scientific theories residing in their accurate nominalistic content rather than in their truth) can be found in the work of Joseph Melia (2000), Stephen Yablo (2005) and Mary Leng (2010), though of these only Leng explicitly endorses fictionalism. Difficulties arise in characterising exactly what is meant by the nominalistic content of an empirical theory (or the claim that such a theory is nominalistically adequate).  Yablo compares the nominalistic content of a mathematical utterance with the ‘metaphorical content’ of figurative speech, and as with metaphorical content, it is perhaps easier to make a case for our mathematically stated empirical theories having such content than to give a formal account of what that content is (we are assuming, of course, that mathematics may be indispensable to expressing the nominalistic content of our theories, so that we cannot in general expect to be able to identify the nominalistic content of a mathematically stated empirical theory with the literal content of some related theory).

As Leng (2010) argues, the case for the combination of mathematical fictionalism with nominalistic scientific realism depends crucially on showing that the fictionalist’s proposal, to continue to speak with the vulgar in doing science, remains reasonable on the assumption that there are no mathematical objects.  That is, fictionalists must explain why they can reasonably rely on our ordinary scientific theories in meeting standard scientific goals such as the goals of providing predictions and explanations, if they do not believe in the mathematical objects posited by those theories.  Balaguer attempts such an explanation by means of his principle of causal isolation, the claim that there are no causally efficacious mathematical objects.  According to Balaguer (1998: 133),

Empirical science knows, so to speak, that mathematical objects are causally inert.  That is, it does not assign any causal role to any mathematical entity.  Thus, it seems that empirical science predicts that the behaviour of the physical world is not dependent in any way on the existence of mathematical objects.  But this suggests that what empirical science says about the physical world—that is, its complete picture of the physical world—could be true even if there aren’t any mathematical objects.

But while causal inefficacy goes a long way to explaining why the existence of mathematical objects is not required for the empirical success of our scientific theories, and especially in drawing a line between unobservable mathematical and physical posits (such as electrons), by itself the principle of causal isolation does not show mathematical posits to be an idle wheel.  Not all predictions predict by identifying a cause of the phenomenon predicted, and, perhaps more crucially, not all explanations explain causally.  Balaguer suggests that, were there no mathematical objects at all, objects in the physical world could still be configured in just the ways our theories claim.  But if there were no mathematical objects, would we thereby lose any means of explaining why the world is configured the way it is?  If mathematical posits are essential to some of our explanations of the behaviour of physical systems, and if explanations have to be true in order to explain, then a fictionalist who does not believe in mathematical objects cannot reasonably endorse the kinds of explanations usually provided by empirical science.

Hence yet another indispensability argument has been developed to press ‘nominalistic scientific realists’ on this issue (see, particularly, Colyvan (2002); Baker (2005)).  Alan Baker (2009: 613) summarises the argument as follows:

The Enhanced Indispensability Argument

(1)   We ought rationally to believe in the existence of any entity that plays an indispensable explanatory role in our best scientific theories.

(2)   Mathematical objects play an indispensable explanatory role in science.

(3)   Hence, we ought rationally to believe in the existence of mathematical objects.

What the fictionalist should make of this argument depends on what is meant by mathematical objects playing an ‘indispensable explanatory role’.  Let us suppose that this means that sentences whose literal truth would require the existence of mathematical objects are present amongst the explanantia of some explanations that we take to be genuinely explanatory.  Two lines of response suggest themselves: first, we may challenge the explanatoriness of such explanations, arguing that such candidate explanations are merely acting as placeholders for more basic explanations that do not assume any mathematics (effectively rejecting (2); Melia (2000, 2002) may be interpreted as taking this line; Bangu (2008) does so more explicitly).  On the other hand, we may accept that some such explanations are genuinely explanatory, but argue that the explanatoriness of the mathematics in these cases does not depend on its truth (effectively rejecting (1); this line is taken by Leng (2005a), who argues that explaining the behaviour of a physical system by appeal to the mathematical features of a mathematical model of that system does not require belief in the existence of the mathematical model in question, only that the features that the model is imagined to have are appropriately tied to the actual features of the physical system in question).  Causal inefficacy does make a difference here, but in a slightly more nuanced way than Balaguer’s discussion suggests: it is hard to see how a causal explanation remains in any way explanatory for one who does not believe in the existence of the object posited as cause.  On the other hand, though, it is less clear that an explanation that appeals to the mathematically described structure of a physical system loses its explanatory efficacy if one is merely pretending or imagining that there are mathematical objects that relate appropriately to it.
As Nancy Cartwright (1983) suggests, causal explanations may be special in this regard – to the extent that nominalistic scientific realists agree with Cartwright on the limitations of inference to the best explanation, we may consider nominalistic scientific realism to be most naturally allied to ‘entity realism’, rather than realism about theories, in the philosophy of science.

c. Mathematical Fictionalism + Constructive Empiricism (Bueno)

What makes ‘nominalistic scientific realism’ a broadly ‘realist’ approach to our scientific theories is that, although its proponents do not believe such theories to be true or even approximately true (due to their mathematical commitments), in holding that our best scientific theories are broadly correct in their presentation of the nonmathematical world, they take it that we have reason to believe in the unobservable physical objects that those theories posit.  An alternative fictionalist option is the combination of mathematical fictionalism with constructive empiricism, according to which we should only believe that our best scientific theories are empirically adequate, correct in their picture of the observable world, remaining agnostic about the claims that those theories make about unobservables.  The combination of mathematical fictionalism with constructive empiricism has been defended by Otávio Bueno (2009).

While Bas van Fraassen is standardly viewed as presenting constructive empiricism as agnosticism about the unobservable physical world, it would seem straightforward that any reason for epistemic caution about theories positing unobservable physical entities should immediately transfer to a caution about theories positing unobservable mathematical entities.  As Gideon Rosen (1994: 164) puts the point, abstract entities

are unobservable if anything is.  Experience cannot tell us whether they exist or what they are like.  The theorist who believes what his theories say about the abstract must therefore treat something other than experience as a source of information about what there is.  The empiricist makes it his business to resist this. So it would seem that just as he suspends judgment on what his theories say about unobservable physical objects, he should suspend judgment on what they say about the abstract domain.

This would suggest that constructive empiricism already encompasses, or at least should encompass, mathematical fictionalism, simply extending the fictionalist’s attitude to mathematical posits further to cover the unobservable physical entities posited by our theories.

Despite their natural affinity, the combination of mathematical fictionalism with constructive empiricism is not as straightforward as it may at first seem.  This is because, in characterising the constructive empiricist’s attitude of acceptance, van Fraassen appears to commit the constructive empiricist scientist to beliefs about mathematical objects.  Van Fraassen (1980: 64) adopts the semantic view of theories, and holds that a

theory is empirically adequate if it has some model such that all appearances are isomorphic to empirical substructures of that model.

To accept a theory is, for van Fraassen, to believe it to be empirically adequate, and to believe it to be empirically adequate is to believe something about an abstract mathematical model.  Thus, as Rosen (1994: 165) points out, for the constructive empiricist so-characterized,

The very act of acceptance involves the theorist in a commitment to at least one abstract object.

Bueno (2009: 66) suggests two options for the fictionalist in responding to this challenge: either reformulate the notion of empirical adequacy so that it does not presuppose abstract entities, or adopt a ‘truly fictionalist strategy’ that takes mathematical entities seriously as fictional entities, which Bueno takes (following Thomasson 1999) to be abstract artifacts created by the act of theorizing.  This latter strategy effectively reintroduces commitment to mathematical objects, albeit as ‘non-existent things’ (Bueno 2009: 74), a move which is hard to reconcile with the fictionalist’s insistence on a uniform semantics (where ‘exists’ is held to mean exists).

5. Hermeneutic or Revolutionary Fictionalism

Each of these three versions of mathematical fictionalism can be viewed in one of two ways: as a hermeneutic account of the attitude mathematicians and scientists actually take to their theories, or as a potentially revolutionary proposal concerning the attitude one ought to take once one sees the fictionalist light.  Each version of fictionalism faces its own challenges (see, e.g., Burgess (2004) for objections to both versions, and, e.g., Leng (2005b) and Balaguer (2009) for responses).  The question of whether fictionalism ought to be a hermeneutic or a revolutionary project is an interesting one, and has led at least one theorist with fictionalist leanings to hold back from wholeheartedly endorsing fictionalism.  From a broadly Quinean, naturalistic starting point, Stephen Yablo (1998) has argued that ontological questions should be answered, if at all, by considering the content of our best confirmed scientific theories.  However, Yablo notes, the sentences we use to express those theories can have a dual content – a ‘metaphorical’ content (such as the ‘nominalistic’ content of a mathematically stated empirical claim), as well as a literal content.  Working out what our best confirmed theories say involves us, Yablo thinks, in the hermeneutic project of working out what theorists mean by their theoretical utterances.  But for the more controversial cases, Yablo (1998: 257) argues, there may be no fact of the matter about whether theorists mean to assert the literal content or some metaphorical alternative.  We will often, he points out, utter a sentence S in a ‘make-the-most-of-it spirit’:

I want to be understood as meaning what I literally say if my statement is literally true… and meaning whatever my statement projects onto… if my statement is literally false.  It is thus indeterminate from my point of view whether I am advancing S’s literal content or not.

If answering ontological questions requires the possibility of completing the hermeneutic project of interpreting what theorists themselves mean by their mathematical utterances, and if (as Yablo (1998: 259) worries) the controversial mathematical utterances are permanently ‘equipoised between literal and metaphorical’, then there may be no principled way of choosing between mathematical fictionalism and platonism.  Hence Yablo’s own reticence, in some incarnations, about endorsing mathematical fictionalism, even in the light of the acknowledged possibility of providing a fictionalist interpretation of our theoretical utterances.

6. References and Further Reading

  • Azzouni, J., (2004): Deflating Existential Consequence: A Case for Nominalism (Oxford: Oxford University Press)
  • Baker, A., (2005): ‘Are There Genuine Mathematical Explanations of Physical Phenomena?’, Mind 114: 223-38
  • Baker, A., (2009): ‘Mathematical Explanation in Science’, British Journal for the Philosophy of Science 60: 611-633
  • Balaguer, M. (1996): ‘Towards a Nominalization of Quantum Mechanics’, Mind 105: 209-26
  • Balaguer, M., (1998): Platonism and Anti-Platonism in Mathematics (Oxford: Oxford University Press)
  • Balaguer, M. (2009): ‘Fictionalism, Theft, and the Story of Mathematics’, Philosophia Mathematica 17: 131-162
  • Bangu, S., (2008): ‘Inference to the Best Explanation and Mathematical Realism’, Synthese 160: 13-20
  • Benacerraf, P., (1973): ‘Mathematical Truth’, The Journal of Philosophy 70: 661-679
  • Bueno, O., (2009): ‘Mathematical Fictionalism’, in Bueno, O. and Linnebo, Ø., eds., New Waves in Philosophy of Mathematics (Hampshire: Palgrave MacMillan): 59-79
  • Cartwright, N., (1983): How the Laws of Physics Lie (Oxford: Oxford University Press)
  • Colyvan, M., (2001): The Indispensability of Mathematics (Oxford: Oxford University Press)
  • Colyvan, M., (2002): ‘Mathematics and Aesthetic Considerations in Science’, Mind 111: 69-74
  • Daly, C., (2008): ‘Fictionalism and the Attitudes’, Philosophical Studies 139: 423-440
  • Field, H., (1980): Science without Numbers (Princeton, NJ: Princeton University Press)
  • Field, H., (1984): ‘Is Mathematical Knowledge Just Logical Knowledge?’, Philosophical Review 93: 509-52.  Reprinted with a postscript in Field (1989): 79-124.
  • Field, H. (1985): ‘On Conservativeness and Incompleteness’, Journal of Philosophy 82: 239-60. Reprinted with a postscript in Field (1989): 125-46
  • Field, H., (1989): Realism, Mathematics and Modality (Oxford: Blackwell)
  • Hellman, G. (1989): Mathematics without Numbers: Towards a Modal-Structural Interpretation (Oxford: Oxford University Press)
  • Hempel, C. G. (1945): ‘On the Nature of Mathematical Truth’, American Mathematical Monthly 52: 543-56.  Reprinted in P. Benacerraf and H. Putnam, eds. (1983): Philosophy of Mathematics: Selected Readings (Cambridge: Cambridge University Press): 377-93
  • Hoffman, S. (2004): ‘Kitcher, Ideal Agents, and Fictionalism’, Philosophia Mathematica 12: 3-17
  • Horwich, P. (1991): ‘On the Nature and Norms of Theoretical Commitment’, Philosophy of Science 58: 1-14
  • Kalderon, M. E. (ed.), (2005): Fictionalism in Metaphysics (Oxford: Oxford University Press)
  • Ketland, J. (2005): ‘Some more curious inferences’, Analysis 65: 18-24
  • Kitcher, P. (1984):  The Nature of Mathematical Knowledge (Oxford: Oxford University Press)
  • Leng, M. (2005a): ‘Mathematical Explanation’, in C. Cellucci and D. Gillies, eds., Mathematical Reasoning, Heuristics and the Development of Mathematics (London: King’s College Publications): 167-89
  • Leng, M. (2005b): ‘Revolutionary Fictionalism: A Call to Arms’, Philosophia Mathematica 13: 277-293
  • Leng, M., (2010): Mathematics and Reality (Oxford: Oxford University Press)
  • Lewis, D., (2005): ‘Quasi-Realism is Fictionalism’, in Kalderon (2005): 314-321
  • Melia, J. (2000): ‘Weaseling Away the Indispensability Argument’, Mind 109: 455-79
  • Melia, J. (2002): ‘Response to Colyvan’, Mind 111: 75-79
  • Melia, J. (2004): ‘Review of Jody Azzouni, Deflating Existential Consequence: A Case for Nominalism’,  Notre Dame Philosophical Reviews 2005/08/15
  • O’Leary-Hawthorne, J. (1994): ‘What does van Fraassen’s critique of scientific realism show?’, The Monist 77: 128-145
  • Putnam, H. (1971): Philosophy of Logic (New York: Harper and Row).  Reprinted in Putnam (1979): 323-57
  • Putnam, H. (1975): ‘What is Mathematical Truth?’, Historia Mathematica 2: 529-43.  Reprinted in Putnam (1979): 60-78
  • Putnam, H. (1979): Mathematics, Matter and Method: Philosophical Papers Vol. II (Cambridge: Cambridge University Press, 2nd ed.)
  • Quine, W. V. (1970): Philosophy of Logic (Cambridge: MA, Harvard University Press, 2nd ed., 1986)
  • Resnik, M. D. (1995): ‘Scientific vs. Mathematical Realism: The Indispensability Argument, Philosophia Mathematica 3: 166-74
  • Rosen, G. (1984): ‘What is Constructive Empiricism?’, Philosophical Studies 74: 143-78
  • Shapiro, S. (1983): ‘Conservativeness and Incompleteness’, Journal of Philosophy 80: 521-31
  • Thomas, R. (2000): ‘Mathematics and Fiction I: Identification’, Logique et Analyse 171-172: 301-340
  • Thomas, R. (2002): ‘Mathematics and Fiction II: Analogy’, Logique et Analyse 177-178: 185-228
  • Urquhart, A. (1990): ‘The Logic of Physical Theory’, in A. D. Irvine, ed., (1990): Physicalism in Mathematics (Dordrecht: Kluwer): 145-54
  • Van Fraassen, B. (1980): The Scientific Image (Oxford: Clarendon Press)
  • Yablo, S. (1998): ‘Does Ontology Rest on a Mistake?’, Aristotelian Society, Supplementary Volume 72: 228-61
  • Yablo, S. (2005): ‘The Myth of the Seven’, in Kalderon (2005): 88-115

Author Information

Mary Leng
Email: mcleng@liv.ac.uk
University of Liverpool
United Kingdom

Mathematical Structuralism

The theme of mathematical structuralism is that what matters to a mathematical theory is not the internal nature of its objects, such as its numbers, functions, sets, or points, but how those objects relate to each other. In a sense, the thesis is that mathematical objects (if there are such objects) simply have no intrinsic nature. The structuralist theme grew most notably from developments within mathematics toward the end of the nineteenth century and on through to the present, particularly, but not exclusively, in the program of providing a categorical foundation to mathematics.

Philosophically, there are a variety of competing ways to articulate the structuralist theme. These invoke various ontological and epistemic themes. This article begins with an overview of the underlying idea, and then proceeds to the major versions of the view found in the philosophical literature. On the metaphysical front, the most pressing question is whether there are or can be “incomplete” objects that have no intrinsic nature, or whether structuralism requires a rejection of the existence of mathematical objects altogether. Each of these options yields distinctive epistemic questions concerning how mathematics is known.

There are ontologically robust versions of structuralism, philosophical theories that postulate a vast ontology of structures and their places; on the other hand, there are versions of structuralism amenable to those who prefer desert landscapes, denying the existence of distinctively mathematical objects altogether; and also there are versions of structuralism in between those two—postulating an ontology for mathematics, but not a specific realm of structures. The article sketches the various strengths of each option and the challenges posed for them.

Table of Contents

  1. The Main Idea
  2. Taking on the Metaphysics: The Ante Rem Approach
  3. Getting by without Ontology: Structuralism without (Ante Rem) Structures
  4. References and Further Reading

1. The Main Idea

David Hilbert’s Grundlagen der Geometrie [1899] represents the culmination of a trend toward structuralism within mathematics.  That book gives what, with some hindsight, we might call implicit definitions of geometric notions, characterizing them in terms of the relations they bear to each other.  The early pages contain phrases such as “the axioms of this group define the idea expressed by the word ‘between’ . . .” and “the axioms of this group define the notion of congruence or motion.”  The idea is summed up as follows:

We think of . . . points, straight lines, and planes as having certain mutual relations, which we indicate by means of such words as “are situated,” “between,” “parallel,” “congruent,” “continuous,” etc.  The complete and exact description of these relations follows as a consequence of the axioms of geometry.

Hilbert also remarks that the axioms express “certain related fundamental facts of our intuition,” but in the book—in the mathematical development itself—all that remains of the intuitive content are the diagrams that accompany some of the theorems.

Mathematical structuralism is similar, in some ways, to functionalist views in, for example, philosophy of mind.  A functional definition is, in effect, a structural one, since it, too, focuses on relations that the defined items have to each other.  The difference is that mathematical structures are more abstract, and free-standing, in the sense that there are no restrictions on the kind of things that can exemplify them (see Shapiro [1997, Chapter 3, §6]).

There are several different, and mutually incompatible, philosophical ideas that can underlie and motivate mathematical structuralism.  Some philosophers postulate an ontology of structures, and claim that the subject matter of a given branch of mathematics is a particular structure, or a class of structures.  An advocate of a view like this would articulate what a structure is, and then say something about the metaphysical nature of structures, and how they and their properties can become known.  Other structuralist views deny the existence of structures, and develop the underlying theme in other ways.

Let us define a system to be a collection of objects together with certain relations on those objects.  For example, an extended family is a system of people under certain blood and marital relations—father, aunt, great niece, son-in-law, and so forth.  A work of music is a collection of notes under certain temporal and other musical relations.  To get closer to mathematics, define a natural number system to be a countably infinite collection of objects with a designated initial object and a one-to-one successor relation that satisfies the principle of mathematical induction and the other axioms of arithmetic.  Examples of natural number systems are the Arabic numerals in their natural order, an infinite sequence of distinct moments of time in temporal order, the strings over a finite (or countable) alphabet arranged in lexical order, and, perhaps, the natural numbers themselves.
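The informal definition of a natural number system can be stated precisely.  The following display, added here for illustration (the notation ⟨N, 0, s⟩ is not in the original text), gives a standard Dedekind-style characterization:

```latex
% A system <N, 0, s> is a natural number system just in case:
\begin{align*}
&\text{(i)}   && s \colon N \to N \text{ is one-to-one;} \\
&\text{(ii)}  && 0 \in N \text{ and } 0 \text{ is not in the range of } s; \\
&\text{(iii)} && \text{for every } X \subseteq N\colon \text{ if } 0 \in X
                 \text{ and } s[X] \subseteq X, \text{ then } X = N.
\end{align*}
```

Clause (iii) is the induction principle.  Dedekind's categoricity theorem shows that any two systems satisfying (i)–(iii) are isomorphic, which is what underwrites talk of the natural number structure, the form they all share.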

To bring Hilbert [1899] into the fold, define a Euclidean system to be three collections of objects, one to be called “points,” a second to be called “lines,” and a third to be called “planes,” along with certain relations between them, such that the axioms are true of those objects and relations, so construed.  Otto Blumenthal reports that in a discussion in a Berlin train station in 1891, Hilbert said that in a proper axiomatization of geometry, “one must always be able to say, instead of ‘points, straight lines, and planes’, ‘tables, chairs, and beer mugs’” (“Lebensgeschichte” in Hilbert [1935, 388-429]; the story is related on p. 403).  In a much-discussed correspondence with Gottlob Frege, Hilbert wrote (see Frege [1976], [1980]):

Every theory is only a scaffolding or schema of concepts together with their necessary relations to one another, and that the basic elements can be thought of in any way one likes.  If in speaking of my points, I think of some system of things, e.g., the system love, law, chimney-sweep . . . and then assume all my axioms as relations between these things, then my propositions, e.g., Pythagoras’ theorem, are also valid for these things . . . [A]ny theory can always be applied to infinitely many systems of basic elements.

A structure is the abstract form of a system, which ignores or abstracts away from any features of the objects that do not bear on the relations.  So, the natural number structure is the form common to all of the natural number systems.  And this structure is the subject matter of arithmetic.  The Euclidean-space-structure is the form common to all Euclidean systems.  The theme of structuralism is that, in general, the subject matter of a branch of mathematics is a given structure or a class of related structures—such as all algebraically closed fields.

A structure is thus a “one over many,” a sort of universal.  The difference between a structure and a more traditional universal, such as a property, is that a property applies to, or holds of, individual objects, while a structure applies to, or holds of, systems.  Structures are thus much like structural universals, whose existence remains subject to debate among metaphysicians (see, for example, Lewis [1986], Armstrong [1986], Pagès [2002]).  Indeed, one might think of a mathematical structure as a sort of free-standing structural universal, one in which the nature of the individual objects that fill the places of the structure is irrelevant (see Shapiro [2008, §4]).

Any of the usual array of philosophical views on universals can be adapted to structures.  One can be a Platonic ante rem realist, holding that each structure exists and has its properties independent of any systems that have that structure.  On this view, structures exist objectively, and are ontologically prior to any systems that have them (or at least ontologically independent of such systems).  Or one can be an Aristotelian in re realist, holding that structures exist, but insisting that they are ontologically posterior to the systems that instantiate them.  Destroy all the natural number systems and, alas, you have destroyed the natural number structure itself.  A third option is to deny that structures exist at all.  Talk of structures is just a convenient shorthand for talk of systems that have a certain similarity.

In a retrospective article, Paul Bernays [1967, 497] provides a way to articulate the latter sort of view:

A main feature of Hilbert’s axiomatization of geometry is that the axiomatic method is presented and practiced in the spirit of the abstract conception of mathematics that arose at the end of the nineteenth century and which has generally been adopted in modern mathematics.  It consists in abstracting from the intuitive meaning of the terms . . . and in understanding the assertions (theorems) of the axiomatized theory in a hypothetical sense, that is, as holding true for any interpretation . . . for which the axioms are satisfied.  Thus, an axiom system is regarded not as a system of statements about a subject matter but as a system of conditions for what might be called a relational structure . . . [On] this conception of axiomatics, . . . logical reasoning on the basis of the axioms is used not merely as a means of assisting intuition in the study of spatial figures; rather, logical dependencies are considered for their own sake, and it is insisted that in reasoning we should rely only on those properties of a figure that either are explicitly assumed or follow logically from the assumptions and axioms.

Advocates of these different ontological positions concerning structures take different approaches to other central philosophical concerns, such as epistemology, semantics, and methodology.  Each such view handles some issues relatively easily and faces deep, perhaps intractable, problems with others.  The ante rem realist, for example, has a straightforward account of reference and semantics:  the variables of a branch of mathematics, like arithmetic, analysis, and set theory, range over the places in an ante rem structure.  Each singular term denotes one such place.  So the language is understood at face value.  But the ante rem realist must account for how one obtains knowledge of structures, so construed, and she must account for how statements about ante rem structures play a role in scientific theories of the physical world.  As suggested by the above Bernays passage, the nominalistic, eliminative structuralist has it easier on epistemology.  Knowing a truth of, say, real analysis, is knowing what follows from the description of the theory.  But this sort of structuralist must account for the semantics of the reconstrued statements, how they are known, how they figure in science, and so forth.

2. Taking on the Metaphysics: The Ante Rem Approach

To repeat, the ante rem structuralist holds that, say, the natural number structure and the Euclidean space structure exist objectively, independent of the mathematician, her form of life, and so forth, and also independent of whether the structures are exemplified in the non-mathematical realm.  That is what makes them ante rem.  The semantics of the respective languages is straightforward:  The first-order variables range over the places in the respective structure, and a singular term such as ‘0’ denotes a particular place in the structure.

So, on this view, the statements of a mathematical theory are to be read at face value.  The grammatical structure of the mathematical language reflects the underlying logical form of the propositions.  For example, in the arithmetic equation, 3×8=24, the numerals ‘3’, ‘8’, and ‘24’, at least seem to be singular terms—proper names.  On the ante rem view, they are singular terms.  The role of a singular term is to denote an individual object.  On the ante rem view, these numerals denote places in the natural number structure.  And, of course, the equation expresses a truth about that structure.  In this respect, then, ante rem structuralism is a variation on traditional Platonism.

For this perspective to make sense, however, one has to think of a place in a structure as a bona fide object, the sort of thing that can be denoted by a singular term, and the sort of thing that can be in the range of first-order variables.  To pursue the foregoing analogy with universals, a place in a structure is akin to a role or an office, one that can be occupied by different people or things.  So, the idea here is to construe an office as an object in its own right, at least with respect to the structure.

There is, of course, an intuitive difference between an object and a place in a structure, between an office-holder and an office.  Indeed, the ante rem view depends on that very distinction, in order to characterize structures in the first place (in terms of systems).  Yet, we also think of the places in ante rem structures as objects in their own right.

This is made coherent by highlighting and enforcing a distinction in linguistic practice.  It is a matter of keeping track of what we are talking about at any given time.  There are two different orientations involved in discussing structures and their places.  First, a structure, including its places, can be discussed in the context of systems that exemplify the structure.  For example, one might say that the current vice president used to be a senator, or that the white king’s bishop in one game was the white queen’s bishop in another game.  For a more mathematical example, in the system of Arabic numerals, the symbol ‘2’ plays the two-role (if we think of the structure as starting with one), while in the system of roman numerals, the string ‘II’ plays that role.  Call this the places-are-offices perspective.  This office-orientation presupposes a background ontology that supplies objects that fill the places of the structures.  In the case of political systems, the background ontology is people (who have met certain criteria, such as being of a certain age and being duly elected); in the case of chess games, the background ontology is small, moveable objects—pieces with certain colors and shapes.  In the case of arithmetic, anything at all can be used as the background ontology: anything at all can play the two-role in a natural number system.  This is what is meant by saying that mathematical structures, like this one, are “free-standing”.

In contrast to the places-are-offices perspective, there are contexts in which the places of a given structure are treated as objects in their own right.  We say that the vice president presides over the senate, and that a bishop that is on a black square cannot move to a white square, without intending to speak about any particular vice president or chess piece.  Such statements are about the roles themselves.  Call this the places-are-objects perspective.  Here, the statements are about the respective structure as such, independent of any exemplifications it may have.  The ante rem structuralist proposes that we think of typical statements in pure mathematics as made in the places-are-objects mode.  This includes such simple equations as 3×8=24, and more sophisticated statements, for example, that there are infinitely many prime numbers.

To be sure, one can think of statements in the places-are-objects mode as simply generalizations over all systems that exemplify the structure.  This is consonant with the above passage from Bernays [1967], suggesting that we understand “the assertions (theorems) of [an] axiomatized theory in a hypothetical sense, that is, as holding true for any interpretation . . . for which the axioms are satisfied.” However, the ante rem structuralist takes the mathematical statements, in places-are-objects mode, as being about the structure itself.  Indeed, on that view, the structure exists, and so we can talk about its places directly.

So, for the ante rem structuralist, in the places-are-objects mode, singular terms denoting places are bona fide singular terms, and variables ranging over places are bona fide variables.  Places are bona fide objects.

The ante rem structuralist envisions a smooth interplay between places-are-offices statements and places-are-objects statements.  When treating a structure in the places-are-offices mode, the background ontology sometimes includes places from other structures.  We say, for example, that the finite von Neumann ordinals exemplify the natural number structure (under the ordinal successor relation, in which the successor of an ordinal α is α∪{α}).  In the places-are-objects mode, for set theory, the variables range over the places of the iterative hierarchy, such places construed as objects.  The von Neumann ordinals are some of those places-cum-objects.  We think of those objects as forming a system, under the ordinal successor relation.  That system exemplifies the natural number structure.  And in that system, the set-cum-object {∅,{∅}} is in the two-role (beginning with the empty set, as zero).  We sometimes write {∅,{∅}}=2.  From the ante rem perspective, a symbol denoting {∅,{∅}} is construed in the places-are-objects mode, vis-à-vis the iterative hierarchy, and “2” is in the places-are-offices mode, vis-à-vis the natural number structure.  So construed, “{∅,{∅}}=2” is not actually an identity, but more like a predication.  It says that a certain von Neumann ordinal plays a certain role in a given natural number system.  In the Zermelo system, {{∅}}=2.
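For illustration, the first few places of the two systems just mentioned can be displayed side by side (the table is added here, writing ∅ for the empty set):

```latex
\begin{array}{c|c|c}
\text{role} & \text{von Neumann ordinals} & \text{Zermelo numerals} \\ \hline
\text{zero}  & \varnothing & \varnothing \\
\text{one}   & \{\varnothing\} & \{\varnothing\} \\
\text{two}   & \{\varnothing,\{\varnothing\}\} & \{\{\varnothing\}\} \\
\text{three} & \{\varnothing,\{\varnothing\},\{\varnothing,\{\varnothing\}\}\} & \{\{\{\varnothing\}\}\}
\end{array}
```

Under its own successor relation (α ↦ α∪{α} for the von Neumann system, x ↦ {x} for the Zermelo system), each column exemplifies the natural number structure; the two systems differ only over which set occupies which place.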

Sometimes, the background ontology for the places-are-offices perspective consists of places of the very structure under discussion.  It is commonly noted, for example, that the even natural numbers exemplify the natural number structure.  That is, we consider a system whose objects are the even natural numbers (themselves construed from the places-are-objects mode), under the relation symbolized by “+2”.  That system exemplifies the natural number structure.  In that system, 6 is in the three-role.  Of course, it would be confusing to write 6=3, but if care is taken, it can be properly understood, remembering that it is not an identity.

Trivially, the natural number structure itself, construed from the places-are-objects mode, exemplifies the natural number structure.  In that case, 6 plays the six-role.  Some philosophers might think that this raises a problem analogous to Aristotle’s Third Man argument against Plato.  It depends on the relationship between an ante rem structure and the systems it exemplifies.  In short, the Third Man arguments are problematic if one holds that the reason why a given collection of objects, under certain relations, is a natural number system is that it exemplifies the natural number structure.  This invokes something like the Principle of Sufficient Reason.  If something is so, then there must be a reason why it is so, and the cited reason must be, in some metaphysical sense, prior to the something.  From that perspective, one cannot hold that the natural number structure itself exemplifies the natural number structure because it exemplifies the natural number structure.  That would be a circular reason.  The ante rem structuralist is free to reject this instance of the Principle of Sufficient Reason.  She simply points out that the reason the natural number structure exemplifies the natural number structure is that it has a distinguished initial position and satisfies the relevant principles (for an opposing construal, see Hand [1993]).

The ante rem structuralist should say something about the metaphysical nature of a structure, and how it is that mathematical objects are somehow constituted by a structure.  Consider the following slogan from Shapiro [1997, p. 9]:

Structures are prior to places in the same sense that any organization is prior to the offices that constitute it.  The natural number structure is prior to “6,” just as “baseball defense” is prior to “shortstop” or “U.S. Government” is prior to “Vice President.”

What is this notion of priority? For the non-mathematical examples such as baseball defenses and governments, one might characterize the priority in terms of possible existence.  To say that A is prior to B is to say that B could not exist without A.  No one can be a shortstop independent of a baseball defense; no one can be vice president independent of a government (or organization).  Unfortunately, this articulation of the priority does not make sense of the mathematical cases.  The ante rem structuralist follows most ontological realists in holding that the mathematical structures and their places exist of necessity.  It does not make sense to think of the natural number structure existing without its places, nor for the places to exist without the structure.

The dependence relation in the slogans for ante rem structuralism is that of constitution.  Each ante rem structure consists of some places and some relations.  A structure is constituted by its places and its relations, in the same way that any organization is constituted by its offices and the relations between them.  The constitution is not that of mereology.  It is not the case that a structure is just the sum of its places, since, in general, the places have to be related to each other via the relations of the structure.  An ante rem structure is a whole consisting of, or constituted by, its places and its relations.

3. Getting by without Ontology: Structuralism without (Ante Rem) Structures

Some philosophers find the existence of ante rem structures extravagant.  For such thinkers, there are other ways to preserve the structuralist insights.  One can take structures to exist, but only in the systems that exemplify them.  Metaphysically, the idea is to reverse the priority cited above: structures are posterior to the systems that exemplify them—although, again, it may prove difficult to articulate the relevant notion of priority.  This would be an Aristotelian, in re realism.  On a view like this, the only structures that exist are those that are exemplified.  I do not know of any philosophers of mathematics who articulate such a view in detail.  I mention it, in passing, in light of the connection between structures and traditional universals.

Another, perhaps ontologically cleaner, option is to reject the existence of structures, in any sense of “existence.”  On such a view, apparent talk of structures is only a façon de parler, a way of talking about systems that are structured in a certain way.  The view is sometimes dubbed eliminative structuralism.

The eliminativist can acknowledge the places-are-objects orientation when discussing structures or, to be precise, when discussing structured systems, but he cannot understand such statements literally (without adopting an error theory).  For the eliminativist, the surface grammar of places-are-objects statements does not reflect their underlying logical form, since, from that perspective, there are no structures and there are no places to which one can refer.

The ante rem structuralist and the eliminativist agree that statements in the places-are-objects mode imply generalizations concerning systems that exemplify the structure.  We say, for example, that the vice president presides over the senate, and this entails that all vice presidents preside over their respective senates.  The chess king can move one square in any direction, so long as the move does not result in check.  This entails that all kings are so mobile, and so immobile.  Of course, the generalizations themselves do not entail that there are any vice presidents or chess kings—nor do they entail that there are any structures.

The eliminative structuralist holds that places-are-objects statements are just ways of expressing the relevant generalizations, and he accuses the ante rem structuralist of making too much of their surface grammar, trying to draw deep metaphysical conclusions from that.  The same goes for typical statements in pure mathematics.  Those, too, should be regimented as generalizations over all systems that exemplify the given structure or structures.  For example, the statement “For every natural number n there is a prime p>n” is rendered:

In any natural number system S, for every object x in S, there is another object y in S such that y comes after x in S and y has no divisors in S other than itself and the unit object of S.

In general, any sentence Φ in the language of arithmetic gets regimented as something like:

(Φʹ)      In any natural number system S, Φ[S],

where Φ[S] is obtained from Φ by restricting the quantifiers to the objects in S, and interpreting the non-logical terminology in terms of the relations of S.

In a similar manner, the eliminative structuralist paraphrases or regiments—and deflates—what seem to be substantial metaphysical statements, the very statements made by his philosophical opponents.  For example, “the number 2 exists” becomes “in every natural number system, there is an object in the 2-place”.  Or “real numbers exist” becomes “every real number system has objects in its places.”  These are trivially true, analytic if you will—not the sort of statements that generate heated metaphysical arguments.

The sailing is not completely smooth for the eliminative structuralist, however.  As noted, this view takes the places-are-offices perspective to be primary—paraphrasing places-are-objects statements in those terms.  Places-are-offices statements, recall, presuppose a background of objects to fill the places in the systems.  For mathematics, the nature of these objects is not relevant.  For example, as noted, anything at all can play the two-role in a natural number system.  Nevertheless, for the regimented statements to get their expected truth-values, the background ontology must be quite extensive.

Suppose, for example, that the entire universe consists of no more than 10^100,000 objects.  Then there are no natural number systems (since each such system must have infinitely many objects).  So for any sentence Φ in the language of arithmetic, the regimented sentence Φʹ would be vacuously true.  So the eliminativist would be committed to the truth of (the regimented version of) 1+1=0.
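The threat of vacuity can be put schematically.  Writing NNS(S), a label introduced here, for “S is a natural number system,” the regimentation has the form of a universally quantified conditional:

```latex
\Phi' \;:\; \forall S\,\bigl(\mathrm{NNS}(S) \rightarrow \Phi[S]\bigr)
```

If nothing satisfies NNS, the antecedent is false of every S, and the conditional is true of every S, whatever Φ says; so the regimentations of 1+1=2 and of 1+1=0 both come out true.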

In other words, a straightforward, successful eliminative account of arithmetic requires a countably infinite background ontology.  And it gets worse for other branches of mathematics.  An eliminative account of real analysis demands an ontology whose size is that of the continuum; for functional analysis, we’d need the powerset of that many objects.  And on it goes.  The size of some of the structures studied in mathematics is staggering.

Even if the physical universe does exceed 10^100,000 objects, and, indeed, even if it is infinite, there is surely some limit to how many physical objects there are, while mathematics recognizes no such limit (by Cantor’s theorem, the powerset of any set is larger than the set itself).  Branches of mathematics that require more objects than the number of physical objects might end up being vacuously trivial, at least by the lights of the straightforward, eliminative structuralist.  This would be bad news for such theorists, as the goal is to make sense of mathematics as practiced.  In any case, no philosophy of mathematics should be hostage to empirical and contingent facts, including features of the size of the physical universe.

In the literature, there are two eliminativist reactions to this threat of vacuity.  First, the philosopher might argue, or assume, that there are enough abstract objects for every mathematical structure to be exemplified.  In other words, we postulate that, for each field of mathematics, there are enough abstract objects to keep the regimented statements from becoming vacuous.

Some mathematicians, and some philosophers, think of the set-theoretic hierarchy as the ontology for all of mathematics.  Mathematical objects—all mathematical objects—are sets in the iterative hierarchy.  Less controversially, it is often thought that the iterative hierarchy is rich enough to recapitulate every mathematical theory.  Penelope Maddy [2007, 354] writes:

Set theory hopes to provide a dependable and perspicuous mathematical theory that is ample enough to include (surrogates for) all the objects of classical mathematics and strong enough to imply all the classical theorems about them.  In this way, set theory aims to provide a court of final appeal for claims of existence and proof in classical mathematics . . . Thus set theory aims to provide a single arena in which the objects of classical mathematics are all included, where they can be compared side-by-side.

One might wonder why it is that a foundational theory only needs “surrogates” for each mathematical object, and not the real things.  For a structuralist, the answer is that in mathematics the individual nature of the objects is irrelevant.  What matters is their relations to each other (see Shapiro [2004]).

An eliminative structuralist might maintain that the theory of the background ontology for mathematics—set theory or some other—is not, after all, the theory of a particular structure.  The foundation is a mathematical theory with an intended ontology in the usual, non-structuralist sense.  In the case of set theory, the intended ontology is the sets.  Set theory is not (merely) about all set-theoretic systems—all systems that satisfy the axioms.  So, the foundational theory is an exception to the theme that mathematics is the science of structure.  But, the argument continues, every other branch of mathematics is to be understood in eliminative structuralist terms.  Arithmetic is the study of all natural number systems—within the iterative hierarchy.  Euclidean geometry is the study of all Euclidean systems, and so forth.  There are thus no structures—ante rem or otherwise—and, with the exception of sets, or whatever the background ontology may be, there are no mathematical objects either.  Øystein Linnebo [2008] articulates and defends a view like this.  Although there is not much discussion of the background ontology, Paul Benacerraf’s classic [1965] can be read in these terms as well.  Benacerraf famously argues that there are no numbers—talk of numbers is only a way to talk about all systems of a certain kind, but he seems to have no similar qualms about sets.

Of course, this ontological version of eliminative structuralism is anathema to a nominalist, who rejects the existence of abstracta altogether.  For the nominalist, sets and ante rem structures are pretty much on a par—neither are wanted.  The other prominent eliminative reaction to the threat of vacuity is to invoke modality.  In effect, one avoids (or attempts to avoid) a commitment to a vast ontology by inserting modal operators into the regimented generalizations.  To reiterate the above example, the modal eliminativist renders the statement “For every natural number n there is a prime p>n” as something like:

In any possible natural number system S, for every object x in S, there is another object y in S such that y comes after x in S and y has no divisors in S other than itself and the unit object of S.

In general, let Φ be any sentence in the language of arithmetic; Φ gets regimented as:

In any possible natural number system S, Φ[S],

or, perhaps,

Necessarily, for any natural number system S, Φ[S],

where, again, Φ[S] is obtained from Φ by restricting the quantifiers to the objects in S, and by interpreting the non-logical terminology in terms of the relations of S.
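For concreteness, the second rendering can be put into symbols.  Writing “□” for the necessity operator, and using “NNS(S)” merely as an expository abbreviation for “S is a natural number system” (the abbreviation is a gloss introduced here, not Hellman’s own notation), the regimentation is roughly:

```latex
\[
  \Box\, \forall S \,\bigl( \mathrm{NNS}(S) \rightarrow \Phi[S] \bigr)
\]
```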

The difference from the ontological eliminative program, of course, is that here the variables ranging over systems are inside the scope of a modal operator.  So the modal eliminativist does not require an extensive rich background ontology.  Rather, she needs a large ontology only to be possible.  Geoffrey Hellman [1989] develops a modal program in detail.

The central, open problem with this brand of eliminativist structuralism concerns the nature of the invoked modality.  Of course, it won’t do much good to render the modality in terms of possible worlds.  If we do that, and if we take possible worlds, and possibilia, to exist, then modal eliminative structuralism would collapse into the above, ontological version of eliminative structuralism.  Not much would be gained by adding the modal operators.  So the modalist typically takes the modality to be primitive—not defined in terms of anything more fundamental.  But, of course, this move does not relieve the modalist of having to say something about the nature of the indicated modality, and having to say something about how we know propositions about what is possible.

Invoking metaphysical possibility and necessity does not seem appropriate here.  Intuitively, if mathematical objects exist at all, then they exist of necessity.   And perhaps also intuitively, if mathematical objects do not exist, then their non-existence is necessary.  Physical and conceptual modalities are also problematic for present purposes.

Hellman mobilizes the logical modalities for his eliminative structuralism.  Our arithmetic sentence Φ becomes

In any logically possible natural number system S, Φ[S],

or, perhaps,

It is logically necessary that for any natural number system S, Φ[S].

In contemporary logic textbooks and classes, the logical modalities are understood in terms of sets.  To say that a sentence is logically possible is to say that there is a certain set that satisfies it.  Of course, this will not do here, for the same reason that the modalist cannot define the modality in terms of possible worlds.  It is especially problematic here.  It does no good to render mathematical ‘existence’ in terms of logical possibility if the latter is to be rendered in terms of existence in the set-theoretic hierarchy.  Again, the modalist takes the notion of logical possibility to be a primitive, explicated by the theory as a whole.  For more on this program, see Hellman [1989], [2001], [2005].

To briefly sum up and conclude, the parties to the debate over how best to articulate the structuralist insights agree that each of the major versions has its strengths and, of course, each has its peculiar difficulties.  Negotiating such tradeoffs is a stock feature of philosophy in general.  The literature has produced an increased understanding of mathematics, of the relevant philosophical issues, and of how the issues bear on each other, and the discussion shows no signs of abating.  For additional discussion see "The Applicability of Mathematics."

4. References and Further Reading

  • Armstrong, D. [1986], “In defence of structural universals,” Australasian Journal of Philosophy 64, 85-88.
    • As the title says.
  • Awodey, S. [1996], “Structure in mathematics and logic:  a categorical perspective,” Philosophia Mathematica (3) 4, 209-237.
    • Articulates a connection between structuralism and category theory.
  • Awodey, S. [2004], “An answer to Hellman’s question:  'Does category theory provide a framework for mathematical structuralism?',” Philosophia Mathematica (3) 12, 54-64.
    • Continuation of the above.
  • Awodey, S. [2006], Category theory, Oxford, Oxford University Press.
    • Readable presentation of category theory.
  • Benacerraf, P. [1965], “What numbers could not be,” Philosophical Review 74, 47-73; reprinted in Philosophy of mathematics, edited by P. Benacerraf and H. Putnam, Englewood Cliffs, New Jersey, Prentice-Hall, 1983, 272-294.
    • Classic motivation for the (eliminative) structuralist perspective.
  • Bernays, P. [1967], “Hilbert, David” in The encyclopedia of philosophy, Volume 3, edited by P. Edwards, New York, Macmillan publishing company and The Free Press, 496-504.
  • Chihara, C. [2004], A structural account of mathematics, Oxford, Oxford University Press.
    • Account of the application of mathematics in “structural” terms, but without adopting a structuralist philosophy.
  • Frege, G. [1976], Wissenschaftlicher Briefwechsel, edited by G. Gabriel, H. Hermes, F. Kambartel, and C. Thiel, Hamburg, Felix Meiner.
  • Frege, G. [1980], Philosophical and mathematical correspondence, Oxford, Basil Blackwell.
  • Hale, B. [1996], “Structuralism’s unpaid epistemological debts,” Philosophia Mathematica (3) 4, 124-143.
    • Criticism of modal eliminative structuralism.
  • Hand, M. [1993], “Mathematical structuralism and the third man,” Canadian Journal of Philosophy 23, 179-192.
    • Critique of ante rem structuralism, on Aristotelian grounds.
  • Hellman, G. [1989], Mathematics without numbers, Oxford, Oxford University Press.
    • Detailed articulation and defense of modal eliminative structuralism.
  • Hellman, G. [2001], “Three varieties of mathematical structuralism,” Philosophia Mathematica (3) 9, 184-211.
    • Comparison of the varieties of structuralism, favoring the modal eliminative version.
  • Hellman, G. [2005], “Structuralism,” Oxford handbook of philosophy of mathematics and logic, edited by Stewart Shapiro, Oxford, Oxford University Press, 536-562.
    • Comparison of the varieties of structuralism, again favoring the modal eliminative version.
  • Hilbert, D. [1899], Grundlagen der Geometrie, Leipzig, Teubner; Foundations of geometry, translated by E. Townsend, La Salle, Illinois, Open Court, 1959.
  • Hilbert, D. [1935], Gesammelte Abhandlungen, Dritter Band, Berlin, Julius Springer.
  • Lewis, D. [1986], “Against structural universals,” Australasian Journal of Philosophy 64, 25-46.
    • As the title says.
  • Linnebo, Ø. [2008], “Structuralism and the Notion of Dependence,” Philosophical Quarterly 58, 59-79.
    • An ontological eliminative structuralism, using set theory as the (non-structural) background foundation.
  • MacBride, F. [2005], “Structuralism reconsidered,” Oxford Handbook of philosophy of mathematics and logic, edited by Stewart Shapiro, Oxford, Oxford University Press, 563-589.
    • Philosophically based criticism of the varieties of structuralism.
  • Maddy, P. [2007], Second philosophy: a naturalistic method, Oxford, Oxford University Press.
  • McLarty, C. [1993], “Numbers can be just what they have to,” Nous 27, 487-498.
    • Connection between category theory and the philosophical aspects of structuralism.
  • Pagès, J. [2002], “Structural universals and formal relations,” Synthese 131, 215-221.
    • Articulation and defense of structural universals.
  • Resnik, M. [1981], “Mathematics as a science of patterns: Ontology and reference,” Nous 15, 529-550.
    • Philosophical articulation of structuralism, with focus on metaphysical issues.
  • Resnik, M. [1982], “Mathematics as a science of patterns: Epistemology,” Nous 16, 95-105.
    • Philosophical articulation of structuralism, with focus on epistemological issues.
  • Resnik, M. [1992], “A structuralist’s involvement with modality,” Mind 101, 107-122.
    • Review of Hellman [1989] focusing on issues concerning the invoked notion of modality.
  • Resnik, M. [1997], Mathematics as a science of patterns, Oxford, Oxford University Press.
    • Detailed articulation of a realist version of structuralism.
  • Shapiro, S. [1997], Philosophy of mathematics: structure and ontology, New York, Oxford University Press.
    • Elaborate articulation of structuralism, with focus on the various versions; defense of the ante rem approach.
  • Shapiro, S. [2004], “Foundations of mathematics:  metaphysics, epistemology, structure,” Philosophical Quarterly 54, 16-37.
    • The role of structuralist insights in foundational studies.
  • Shapiro, S. [2008], “Identity, indiscernibility, and ante rem structuralism:  the tale of i and -i,” Philosophia Mathematica (3) 16, 285-309.
    • Treatment of the identity relation, from an ante rem structuralist perspective, and the metaphysical nature of structures.

Author Information

Stewart Shapiro
Email: shapiro.4@osu.edu
The Ohio State University, U.S.A. and
University of St. Andrews, United Kingdom

Poincaré’s Philosophy of Mathematics

Jules Henri Poincaré was an important French mathematician, scientist, and philosopher of the late nineteenth and early twentieth centuries who was especially known for his conventionalist philosophy.  Most of his publishing was in analysis, topology, probability, mechanics, and mathematical physics.  His overall philosophy of mathematics is Kantian because he believes that intuition provides a foundation for all of mathematics, including geometry.

He advocated conventionalism for some principles of science, most notably for the choice of applied geometry (the geometry that is best paired with physics for an account of reality).  But the choice of a geometric system is not an arbitrary convention.  According to Poincaré, we choose the system based on considerations of simplicity and efficiency given the overall empirical and theoretical situation in which we find ourselves.  Along with the desiderata of theoretical simplicity and efficiency, empirical information must inform and guide our choices, including our geometric choices.  Thus, even with respect to applied geometry, where Poincaré is at his most conventional, empirical information is crucial to the choice we make.

Balancing the empirical element, there is also a strongly apriorist element in Poincaré’s philosophical views for he argued that intuition provides an a priori epistemological foundation for mathematics.  His views about intuition descend from Kant, whom Poincaré explicitly defends.  Kant held that space and time are the forms of experience, and provide the a priori, intuitive sources of mathematical content.  While defending the same basic vision, Poincaré adapts Kant's views by rejecting the foundation upon space and time.  Rather than time, Poincaré argues for the intuition of indefinite repetition, or iteration, as the main source of extra-logical content in number theory.  Rather than space, Poincaré argues that, in addition to iteration, we must presuppose an intuitive understanding of both the continuum and the concept of group in geometry and topology.

Table of Contents

  1. Introduction
  2. Geometry and the A Priori
  3. Poincaré’s Relationship to Kant
  4. Poincaré’s Arguments for Intuition: Continuity
  5. Poincaré’s Arguments for Intuition: Indefinite Repetition
    1. Argument One
    2. Argument Two
    3. Argument Three
    4. Argument Four
  6. Intuition and Other Topics in Poincaré’s Philosophy
    1. Predicativism
    2. Philosophy of Science
  7. References and Further Reading

1. Introduction

Jules Henri Poincaré (1854-1912) was an important French mathematician, scientist and thinker.  He was a prolific mathematician, publishing in a wide variety of areas, including analysis, topology, probability, mechanics and mathematical physics.  He also wrote popular and philosophical works on the foundations of mathematics and science, from which one can sketch a picture of his views.

As an eminent mathematician, Poincaré had philosophical views that were influential and taken seriously during his lifetime.  Today, however, his papers seem somewhat loose, informal, and at times polemical.  Indeed, many are based on speeches he gave to primarily non-philosophical audiences, and part of their aim was to entertain.  One must therefore be careful when reading Poincaré not to misinterpret him as being inconsistent, or as not taking philosophy seriously.  He was a mathematician, not a trained philosopher.  Yet he regarded philosophical and foundational questions as important to science, and one can still find many philosophical insights in his writings.

He was also a Kantian because he was committed to mathematical intuition as the foundation of mathematics.  Although he is known for his conventionalist philosophy, his views are really quite complicated and subtle.  He espoused conventionalism for some principles of science, most notably for the choice of applied geometry, but he was not a conventionalist about every aspect of science.  Even the choice of a geometric system is not a completely arbitrary convention.  It is not the kind of choice that could be based on the flip of a coin, for example.  Rather, we choose – according to Poincaré – based on considerations of simplicity and efficiency given the overall empirical and theoretical situation in which we find ourselves.  His point is that when articulating a theoretical framework for a given base of evidence there are almost always alternatives.  This has become known by the slogan: “Underdetermination of theory by data.”  So, there are almost always choices in how we construct our theory.  Along with the desiderata of theoretical simplicity and efficiency, empirical information must inform and guide our choices – including our geometric choices.  Thus, even with respect to applied geometry, where Poincaré is at his most conventional, empirical information is crucial to the choice we make.

Balancing the empirical element, there is also a strongly apriorist element in Poincaré’s philosophical views.  First, he viewed Euclidean geometry as so simple that we would always prefer to alter physics than to choose a non-Euclidean geometry.  This is despite the fact that he actually used non-Euclidean geometry in some of his work on celestial mechanics.  We can regard this belief in the inherent simplicity and appeal of Euclidean geometry as simply a case of a bad gamble:  he bet on the wrong horse because he bet too early (prior to general relativity).  However, there is a second, more deeply seated, apriorist element in geometry – one that links his philosophy of geometry with his more general philosophy of mathematics.  That is his belief that mathematical intuition provides an a priori epistemological foundation for mathematics, including geometry.

2. Geometry and the A Priori

All geometries are based on some common presuppositions in the axioms, postulates, and/or definitions.  Non-Euclidean geometries can be constructed by substituting alternative versions of Euclid’s parallel postulate; but they begin by keeping some axioms fixed.  Keeping some aspects of the axiomatic structure fixed is what makes the different systems all geometries.  Unifying the various geometric systems is the fact that they determine the possible constructions, or objects, in space.  What primarily differentiates Riemannian and Lobachevskian from Euclidean geometries, for example, are different existence claims regarding parallel lines (whether or not they exist, and if so how many).  In Euclidean geometry, given a line there is exactly one parallel to it on the plane through a given external point.  In Lobachevskian geometry there are infinitely many such parallels; and in Riemannian there are none.  The different axioms regarding parallels yield different internal angle sum theorems in each geometry:  Euclidean triangles have internal angles that sum to exactly 180 degrees; Lobachevskian triangles sum to less than 180 degrees; and Riemannian triangles sum to greater than 180 degrees.  (In the latter two cases, how much more or less than 180 degrees depends on the size of the triangle relative to the curvature of the space.)

If we consider the unifying features of these three approaches to geometry, that is, the features that the different metric systems share, a natural question concerns the epistemological and methodological status of this common basis.  One thought is that what grounds this common basis, which we might call “pure geometry in general”, is an intuitive understanding of space in general.  This is essentially what Poincaré proposed:  that there is an a priori intuitive basis for geometry in general, upon which the different metric geometries can be constructed in pure mathematics.  Once constructed, they can then be applied depending on empirical and theoretical need.  The a priori basis for geometry has two elements for Poincaré.  First, he postulated that we have an intuitive understanding of continuity, which – applied to the idea of space – provides an a priori foundation for all geometry, as well as for topology.  Second, he proposed that we also have an a priori understanding of group theory.  This additional group-theoretic element, applied to rigid body motion for example, leads to the set of geometries of constant curvature.

For Poincaré, therefore, even if physics can help us choose between different metric geometries, the set of possibilities from which it chooses is a priori delimited by the nature of our minds.  We are led to a delimited set of possible geometries by our intuition of continuity coupled with the a priori understanding of groups.  Together these constrain our natural assumptions about possible constructions and motions in space.

As in contemporary conceptions of mathematics, Poincaré made a fairly sharp distinction between pure and applied geometry.  Pure geometry is part of pure mathematics.   As such its foundation consists in a combination of logic and intuition.  In this way, he is a Kantian about all of pure mathematics, including the mathematical study of various geometric systems.  (There is also a hint of Hilbertian axiomatics here:  in pure geometry one studies various axiom systems.)  Conventionalism for Poincaré describes applied geometry only – to characterize the quasi-empirical choice of which metric geometry to pair with physics to best model the world.

Poincaré’s philosophy of pure mathematics is in fact dominated by the attempt to defend mathematical intuition.  This takes various forms throughout his career, but perhaps the most important example is his defense of some version of Kant’s theory of intuition in arithmetic, in opposition to the logicist program.  The logicists attempted to provide a mathematical demonstration that arithmetic has no need for intuition, Kantian or otherwise, by deriving the basic postulates of arithmetic from logical laws and logically expressed definitions alone.  Poincaré argued against this program, insisting that any formal system adequate to derive the basic postulates of arithmetic will by necessity presuppose some intuitive arithmetic.

In contrast with geometry, where there is a range of genuine alternatives to consider, he agreed with the logicists that there is only one genuine arithmetic.  So, the set of options is here much more strictly delimited – to one.  He disagreed with the logicists, however, who saw the uniqueness and epistemic depth of arithmetic as an indicator that it is nothing more than logic.  For Poincaré, arithmetic is uniquely forced on us by intuition rather than by logic alone.  Furthermore, for Poincaré, arithmetic lies at the bottom of the scientific pyramid: it is the most fundamental of the sciences and the one presupposed by all the rest.  Thus, arithmetic's foundation is important for the rest of the sciences.  In order to understand Poincaré’s philosophy of science and mathematics in general, therefore, one must come to grips with his philosophy of arithmetic.

3. Poincaré’s Relationship to Kant

We must begin with Kant, who is the historical source of Poincaré’s appeal to mathematical intuition.  For Kant, there are two a priori intuitions, space and time; and these provide the form of all experience.  All experiences, inner and outer, are temporal, or in time; and all outer experiences are also spatial.  A thought or desire might be an example of a non-spatial but temporal experience; and taking a walk would be in both space and time.  According to Kant, the mind comes equipped with these forms – for otherwise, he argues, we could not account for the coherence, structure, and universality of human experience.  In his vision, a priori intuition, or spatio-temporality, helps to mold brute sensations into the objects of experience.

These same a priori intuitions, the a priori form of all experience, also explain how mathematics is both a priori (non-empirical) and yet has non-trivial content.  In short, a priori intuition supplies the non-empirical content of mathematics.  Mathematics has a distinctive subject matter, but that subject matter is not provided by some external reality, Platonic or otherwise.  Rather, it is provided a priori – by the mind itself.  Intuitive space provides much of the a priori synthetic content for geometry (which is Euclidean for Kant); and intuitive time provides the a priori synthetic content for quantitative mathematics.  This makes mathematical knowledge both synthetic and a priori.  It is synthetic because it is not mere analysis of concepts, and has an intuitive subject matter.  It is a priori because its subject matter or content, spatio-temporality, is given a priori by the form of experience.

Poincaré adopts Kant’s basic vision of mathematics as synthetic a priori knowledge owing to the epistemological and methodological foundation provided by a priori intuition.  Yet, as we have seen already, he does not agree with many details of Kant’s philosophy of mathematics.  Unlike Kant, Poincaré considers Euclidean geometry to be a kind of choice; so Euclidean geometry is not uniquely, or a priori, imposed by intuition.  The closest thing to Kant’s intuitive space, for Poincaré, is not Euclidean space but rather the more minimal intuitive idea of continuity, which is one of the features presupposed in Euclidean space.  Rather than intuitive time, Poincaré emphasizes the intuitive understanding of indefinite iteration for number theory.  Though he views time as a “form pre-existent in our mind”, and one can hypothesize on the connection between this form and the intuition of indefinite iteration, Poincaré does not himself stress the connection.  Thus, both sources of mathematical information – the intuitive continuum and the intuition of indefinite iteration – are somewhat less robust, and less connected to experience, for Poincaré than for Kant.

4. Poincaré’s Arguments for Intuition: Continuity

First, we shall deal briefly with the intuitive continuum.  The clearest argument for an a priori intuition of spatial or mathematical continuity is quite Kantian, but it only appears late in Poincaré’s writings (Last Essays).  In earlier works his remarks about the continuum are less definite and less Kantian.  For example, in Science and Hypothesis, chapter II, he focused more on priority than apriority, arguing that the continuum is mathematically prior to analysis rather than that it is given by a priori intuition.  He thought analysis presupposes the mathematical continuum because one cannot generate the real number continuum by set theoretic constructions, “from below.”  To get genuine continuity, rather than a merely dense set, and to account for the origin, utility, and our overall understanding of the symbolic constructions, Poincaré felt we must appeal to a preconceived idea of a continuum, where “the line exists previous to the point,” (pp. 18, 21).  There is no clear suggestion here of the ideas of Kant or of the idea that a continuum is given by a priori intuition.  The mathematical continuum is rather presented as partly suggested by experience and geometry, and then refined by analysis.

A few years later, in The Value of Science, he moves closer to an apriorist view – though he does not yet use the term “intuition” in connection with the continuum.  In Chapter III he discusses the “primitively amorphous” continuum that forms a common basis for the different metric systems  (p. 37).  And in Chapter IV he asserts that the mathematical continuum is constructed from “materials and models” rather than nothing.  “These materials, like these models, preexist within [the mind],” (p. 72).  He goes on to say that it is experience that enables us to choose from the different possible models.  Thus, he has here taken a big step towards suggesting a Kantian intuition of continuity – in asserting that some materials must pre-exist within the mind in order to construct the mathematical continuum.

Later, however, Poincaré explicitly connects this idea of the pre-existence of the continuum with intuition:

“I shall conclude that there is in all of us an intuitive notion of the continuum of any number of dimensions whatever because we possess the capacity to construct a physical and mathematical continuum; and that this capacity exists in us before any experience because, without it, experience properly speaking would be impossible and would be reduced to brute sensations, unsuitable for any organization; and because this intuition is merely the awareness that we possess this faculty. And yet this faculty could be used in different ways; it could enable us to construct a space of four just as well as a space of three dimensions.  It is the exterior world, it is experience which induces us to make use of it in one sense rather than in the other.” (Last Essays, 44)

The intuitive continuum is an a priori basis for mathematical and empirical construction.  In arguing for this intuition, Poincaré appeals to its necessity for coherent, organized, experience, as well as its necessity for our capacity to construct mathematical theories of the continuum.  His approach here is now quite similar to some of Kant’s transcendental arguments.  For example, Kant argues that spatio-temporality must be brought to rather than derived from experience, for it is what makes experience coherent.  In other words, Kant argues that spatio-temporality cannot be derived, for it is required in order for us to derive anything from experience.  Poincaré’s appeal to intuition in order to explain both a mathematical capacity – the capacity to construct certain mathematical structures – and the fact that our experience is coherent, is thus very reminiscent of Kant.  It is a priori because it is necessarily prior to experience, providing its form or capacity for organization.

5. Poincaré’s Arguments for Intuition: Indefinite Repetition

In contrast, even Poincaré’s clearest arguments for an intuition of iteration seem quite non-Kantian, for they are less connected to coherent experience, and more focused on pure mathematical contexts.   Three types of arguments are sketched below.

a. Argument One

One approach involves a kind of Sherlock Holmes strategy.  Poincaré considers several alternatives to mathematics being synthetic a priori, or based on intuition, and eliminates them.  In the course of the argument he ends up with the view that inductive reasoning is especially characteristic of mathematics, and that it is why mathematics is synthetic a priori.  Induction will turn out to be the main conduit of intuition in mathematics, but first Poincaré focuses simply on its classification as synthetic a priori.  This particular argument has three parts.

He begins by considering the alternative that mathematics, being a priori, is purely deductive, and has no extra-logical content.  Against this, Poincaré leverages his famous giant tautology objection.  If math were just logic, it would be a giant tautology.  It’s not.  Thus, mathematics has some non-logical source of information or content.

The very possibility of mathematical science seems an insoluble contradiction.  If this science is only deductive in appearance, from whence is derived that perfect rigour which is challenged by none?  If, on the contrary, all the propositions which it enunciates may be derived in order by the rules of formal logic, how is it that mathematics is not reduced to a gigantic tautology?... Are we then to admit that the enunciations of all the theorems with which so many volumes are filled, are only indirect ways of saying that A is A? (Science and Hypothesis, pp. 1-2)

Though this reductio by ridicule is amusing, it presupposes some things about logic, which, after logicism, are neither obvious nor uncontroversial.  One presupposition is that if something is a tautology we could recognize it.  This had already been contested by the logicist Dedekind, who acknowledged that chains of inferences can be so long, unconscious, and even frightening, that we may not recognize them as purely logical, even if they are. (Dedekind, p. 33)  Another presupposition Poincaré makes here is that logic is a giant tautology, which had already been contested by the logicist Frege, who explicitly disputes the idea that logic is sterile, (Frege, section 17).  Finally, even if we grant Poincaré’s presuppositions about logic, that it is recognizably empty, the extra-logical content on which mathematics depends is undetermined by this argument.  Additional arguments are required to move us towards the conclusion that mathematics is synthetic a priori, dependent on intuition rather than experience or some other source for its content.

Thus, Poincaré continues in the second part of this argument by considering the possibility that the extra-logical content is simply provided by the non-logical axioms.  Formalism, or axiomatics, would be an example of this type of view.  In opposition to this, Poincaré argues that axiomatics is not faithful to mathematics.  According to the axiomatic viewpoint, logic can only extract what is given in the axioms (Science and Hypothesis, 2).  Poincaré feels that mathematics does more than squeeze out information that resides in axioms.  Mathematical growth can occur, he thought, within mathematics itself – without the addition of new axioms or other information.  He insists, in fact, that growth occurs by way of mathematical reasoning itself.

So, if mathematical reasoning can yield genuine growth without adding new axioms, and given his conception of logic as empty, then mathematical reasoning, not just mathematical content, must transcend logic alone.  How can mathematical reasoning transcend logic?  Well, mathematicians constantly use the tool of reasoning by recurrence, or inductive reasoning and definition, in order to make general definitions and conclusions.  A simple example of the principle of induction is:  if we can show that 0 has a property, P; and we can also show that for any number n, if n has P then n+1 has P; then we can conclude that all numbers have the property, P.  Poincaré regarded inductive reasoning as mathematical reasoning par excellence; and he felt that it transcends logic because it gives us a way to jump over infinite steps of reasoning.  Once we think about it a bit, we see it must be true: P(0) together with the instance P(0) → P(1) entails P(1); P(1) together with P(1) → P(2) entails P(2); and so on.  The conclusion of induction – that for all n, P(n) – does enable us to jump over these tedious modus ponens steps, and Poincaré viewed it as a major source of progress in mathematics (Science and Hypothesis, 10-11).
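The principle Poincaré is describing can be stated schematically in modern first-order notation (a contemporary gloss, not Poincaré’s own formulation):

```latex
\[
  \bigl( P(0) \wedge \forall n\, \bigl( P(n) \rightarrow P(n+1) \bigr) \bigr)
  \rightarrow \forall n\, P(n)
\]
```

Any particular instance P(k) can be reached by finitely many modus ponens steps, but the universal conclusion ∀n P(n) summarizes all infinitely many such steps at once – which is precisely the leap Poincaré thinks outruns pure logic.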

Finally, to finish off this argument, Poincaré examines the nature of induction and reasoning by recurrence.  He argues that since induction cannot be logically derived, and it was certainly not traditionally regarded as a logical principle, it is synthetic.  However, it is not a merely experimental truth, because – despite the fact that it transcends logic – it is “imposed on us with an irresistible weight of evidence” (Science and Hypothesis, 12-13).  Thus, he concludes, it is synthetic and a priori.  This status is also why it cannot be regarded as a mere convention: it is not a choice or a definition.  Rather, it is a rule that is imposed on us by the nature of our own minds (Science and Hypothesis, 48).  By way of this three-part argument, Poincaré feels he has exhausted the likely alternatives and is left with only one viable option: that induction is a synthetic a priori principle.

b. Argument Two

The second argument is by introspection.  This follows the last part of his argument above, and consists of an examination of the nature of the “irresistible weight of evidence” which forces induction on us.  The aim of this reflection is to establish the reason induction is synthetic a priori: namely, that it is based on a priori intuition.  Here we get some of the distinctive flavor of Poincaré’s conception of intuition in contrast with Kant’s.  For we see that for Poincaré, the intuition can be a kind of insight, somewhat evocative of Husserl, rather than a form of experience.  The intuition of iteration involves insight into a power of the mind itself.  So, it is the mind having a self-insight into its own power to conceive of the indefinite iteration of an act, once that act is seen to be possible:

Why then is this view [the judgement that induction is a true principle] imposed upon us with such an irresistible weight of evidence?  It is because it is only the affirmation of the power of the mind which knows it can conceive of the indefinite repetition of the same act, when the act is once possible.  The mind has a direct intuition of this power, and experiment can only be for it an opportunity of using it, and thereby of becoming conscious of it. (Science and Hypothesis, 13).

In this case intuition gives us insight into a power of our own minds, a power to conceive of indefinite repetition, which in turn enables us to understand why induction must be true.  Thus, intuition lies at the foundation of mathematics – whenever we explicitly conceive of indefinite iterations (as in induction) or do so implicitly (as in understanding domains generated by iterated processes such as the successor function).  Mathematical induction differs, however, from scientific induction, for it is certain while empirical induction never is.  Its certainty derives from the fact that it merely affirms a property of the mind itself rather than making an assertion about something outside the mind (a priori versus a posteriori) (Science and Hypothesis, 13).  In this second argument, Poincaré uses intuition to explain the synthetic a priori status of induction.  Thus, despite the somewhat non-Kantian flavor of this intuition – its connection to insight rather than to the form of experience – Poincaré’s use of it is analogous to Kant’s, who also appealed to intuition to explain the synthetic a priori status of mathematics.

c. Argument Three

A third argument is really a set of objections to logicism, which take the form of circularity arguments.  When combined they add up to a powerful objection against logical or set theoretic reconstructions of arithmetic.  Each argument follows the same basic format: any formal reconstruction of arithmetic that tries to avoid intuition will fail, for it will presuppose intuition somewhere in the reconstruction.

There are at least four such objections.  Taking them in order, the first two may not seem very impressive.

(i) First, Poincaré seems to treat logicism as a kind of formalism or conventionalism, as if the Peano Axioms were implicit definitions of the concept of number.  Against this he argues that showing these axioms consistent requires the use of induction, which is itself one of the implicit definitions.  So this would be a circular endeavor.

And it would be if that were what logicism was up to.  However, logicists aimed to derive the Peano Axioms – including induction – from explicit definitions of zero, number and successor; they did not use the Peano axioms as (implicit) definitions themselves.  So this first argument seems to misfire.

(ii) In the second circularity argument Poincaré objects that the symbolism of logicism merely hides the fact that its definitions of the numbers are circular.  For example, he complains that the logicist definition of zero uses symbolic notation that means, “Zero is the number of things satisfying a condition never satisfied.  But as 'never' means in no case I do not see that the progress is great…” (Ewald translation, 1905b, VII, 1029)  He makes similar remarks against the standard definition of one, which in a sense invokes the idea of two.

Now, anyone familiar with contemporary logic may regard Poincaré’s complaint as a mere psychological objection based on logical ignorance, but I think this is too easy a dismissal.  His view is that a basic understanding of number is necessary in order to understand the symbolic definitions of the numbers, and this is not obviously a purely psychological point.  It is a normative claim about understanding rather than an empirical claim about how we happen to think.  So this argument cannot be immediately dismissed as has been claimed (e.g., see Goldfarb 1988).

(iii)  The last two arguments are intertwined and are generally regarded as stronger.  Following on the second argument above, Poincaré’s third objection complains that the new logic is mathematics in (symbolic) disguise.  We can reconstruct this argument along the following lines.  Modern symbolic logic has an infinite combinatorial nature, which makes it very different from Aristotelian logic.  For example, the standard definition of well-formed formula is recursive, which as we noted above is a peculiarly mathematical tool according to Poincaré.  It is the recursive nature of logic that makes it infinite.  Since recursive definition was formerly a peculiarly mathematical tool, the worry is that the logicist has in some sense shifted the boundary between mathematics and logic.  If logic has “invaded the territory” of mathematics and “stolen” some of its tools, then of course it would have more power.  In thus shifting the boundary, Poincaré believes, logicists have presupposed an essentially arithmetic, intuitive tool.  That is, the logicist has not avoided intuition, for he presupposes intuition in the very tools he uses – in the new logic itself.
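The recursive character Poincaré points to can be made concrete.  Here is a minimal sketch (illustrative only, not from the text) of the standard inductive definition of a propositional formula, with formulas represented as nested tuples:

```python
# A formula is a variable name (str), a negation ("not", f),
# or a conditional ("->", f, g).  Whether something is a
# well-formed formula (wff) depends on whether its parts are
# wffs -- the recursive feature Poincaré calls mathematical.

def is_wff(f):
    if isinstance(f, str):                    # base clause: variables
        return True
    if isinstance(f, tuple):
        if len(f) == 2 and f[0] == "not":     # negation clause
            return is_wff(f[1])
        if len(f) == 3 and f[0] == "->":      # conditional clause
            return is_wff(f[1]) and is_wff(f[2])
    return False                              # closure clause

print(is_wff(("->", ("not", "p"), "q")))      # True
print(is_wff(("not", "p", "q")))              # False
```

The definition generates infinitely many formulas from finitely many clauses, which is exactly the infinite combinatorial structure at issue in the argument.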

(iv) Fourth, if the logicist is, even just potentially, adding substantive content to logic via these new powerful tools, he owes us a justification that the new principles are – at least – consistent.  For example, the logicist could treat the rules of inference as disguised definitions of the logical constants, and then show that their use can never lead to inconsistency.  But, Poincaré objects, there will be no such consistency proof without induction.  So, the logicist will still have to presuppose induction, which raises two problems.  First, the justification would be circular, since induction is one of the principles to be derived.  Second, logicism would be explicitly depending on intuition in justifying the new logical principles, which is just what it claimed to avoid.

This is not the place to assess Poincaré’s objections to logicism and the extent to which they can be dismissed as psychologistic.  (See Goldfarb 1988 for such arguments; and see the response, Folina 2006, for a rebuttal.)  Let us just say that when put together, these arguments suggest a genuine challenge to logicism along the following lines.  Modern symbolic logic has an infinite combinatorial structure, which can only be justified by mathematical means, including inductive tools.

d. Argument Four

This infinite combinatorial structure owes itself to the fact that the ordinary definition of well-formed formula in a standard system is recursive; and thus the inference rules themselves – which depend on what makes something a well-formed formula of a certain type – will also inherit this infinite combinatorial nature (Argument 3).  Any proper understanding of the rules of inference will thus presuppose some grasp of the recursive procedures that determine them (Argument 2).  Thus, logicist reconstructions of arithmetic, even if symbolic, cannot reduce arithmetic to an intuition-free content if recursive reasoning is intuitive.

6. Intuition and Other Topics in Poincaré’s Philosophy

To conclude, consider two other important topics:  Poincaré's advocacy of predicative definitions in mathematics; and the more general issue of his philosophy of natural science.  Each fits with his semi-Kantian defense of intuition in mathematics.

a. Predicativism

Poincaré was central in advancing our understanding of the nature of the vicious circle paradoxes of mathematics.  He was the first to articulate a general distinction between predicative and non-predicative definitions, and he helped to show the relevance of this distinction to the paradoxes in general.  Rather than treating the paradoxes on a case by case basis, he and Russell saw a common cause underlying all of them – that of self-reference.  Russell’s solution to the paradoxes – his ramified theory of types (developed in Principia Mathematica) – is indeed an attempt to formalize the idea of eliminating impredicative definitions.

The vicious circle paradoxes of mathematics showed that one can create a contradiction in mathematics by using a certain kind of self-referential definition along with some basic existence principles.  The most famous is Russell’s paradox, so called because Russell first published the discovery of an inconsistency in Frege’s logicist system.  In generating the numbers, Frege had used an axiom that entails that any property whatever determines a set – the set of objects that have that property.  Russell then considered the property of being non-self-membered.  Some sets are self-membered: the set of abstract objects is itself an abstract object, so it is self-membered.  Some are non-self-membered: the set of elephants is not an elephant, so it is non-self-membered.  However, if non-self-membered is a bona fide property, then it too should determine a set according to Frege’s axiom: the set of all sets that are non-self-membered.  This yields a contradiction, because given this property and the existence of the set by Frege’s axiom, the set in question is both self-membered and non-self-membered.
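The contradiction can be derived in a few lines (a standard reconstruction, not Frege’s own notation):

```latex
% Naive comprehension: every property determines a set.  For the
% property of non-self-membership, this gives a set R with
\[
\forall x\,\bigl(x \in R \;\leftrightarrow\; x \notin x\bigr).
\]
% Instantiating the universal quantifier with R itself yields
\[
R \in R \;\leftrightarrow\; R \notin R,
\]
% a biconditional between a sentence and its own negation,
% which is unsatisfiable.
```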

The property of being non-self-membered, however, is impredicative – for, to collect together all the sets that have this property, one must see whether the property applies to the set one is in the process of collecting.  In general, impredicative definitions appeal either implicitly or explicitly to a collection to which the object being defined belongs.  The problem with outlawing all impredicative definitions, however, is that many are unproblematic.  For example, “tallest person in the room” is strictly speaking impredicative but neither logically inconsistent nor even confusing.  “Least upper bound” was thought by many mathematicians to fall into this category: strictly impredicative but not viciously circular.  Indeed, the program to eliminate impredicativity from mathematics was doomed to fail.  Too many widely accepted definitions would have been eliminated; and mathematics would, as Weyl put it, have been almost unbearably awkward (Weyl, 54).

Poincaré’s attitude to impredicativity was interesting and complex.  He was central in characterizing the notion, and as a constructivist he was someone for whom the notion is important.  However, he did not advocate a formal reconstruction of mathematics by eliminating all impredicativity.  Instead, he first advocated simply avoiding impredicative definitions.  Second, and more importantly, he distinguished between different definition contexts.  One definition context is constructive.  When the object does not already exist by virtue of another definition or presupposition, the definition context is constructive – and then it must be predicative.  For otherwise we are attempting to build something out of materials that require it to already exist, which is certainly a viciously circular procedure.  The other definition context is non-constructive, such as when a definition merely identifies, or picks out, an already existing object.  In this case impredicativity is harmless, for it is more like the case of the “tallest man in the room”, which merely picks out an existing person and does not thereby construct him.  So, for Poincaré even the constructivist needs to worry about impredicativity only in certain situations: when the definition is playing the role of a construction.

In this way, despite the fact that Poincaré was a constructivist, he did not regard all mathematical definitions as constructions.  There are two types of nonconstructive definition contexts: when the object exists by way of a prior definition, and when the object exists by guarantee of intuition.  For him, least upper bound was indeed similar to the specification “tallest man in the room” – because he regarded sets of upper bounds as given a priori by the intuitive continuum, since all real numbers are thereby guaranteed to exist.  By relying on intuition to supplement his constructivism, he attempted to avoid the unbearable awkwardness and restrictions of a purely predicativist approach to mathematics.

b. Philosophy of Science

Poincaré’s philosophy of natural science covers much interesting terrain.  He was famous for distinguishing between types of hypotheses in science, but he also distinguishes between types of facts, emphasizing the importance of simple facts in science.  Simple facts are the most general and most useful facts, which also have the power to unify different areas of science.  These same facts are the most interesting facts to us, and they are the most beautiful as well.  Their beauty rests on their “harmonious order of the parts which a pure intelligence can grasp” (The Value of Science, 8).  Simplicity, beauty, and utility are one and the same for Poincaré.

A second important theme in Poincaré’s vision of scientific knowledge involves his appeal to Darwin’s theory of evolution.  He asks why we find beauty in the simple, general, harmonious facts.  One answer is Darwinian: natural selection will favor creatures that find interest and beauty in the facts that prove more useful to their survival.  The idea is that humans’ noticing of, and interest in, regularities no doubt helped them survive.  Indeed, Poincaré appeals to natural selection in just this context (The Value of Science, 5, 9).

Third, as noted above, Poincaré makes important distinctions between types of hypotheses in Science and Hypothesis.  Some hypotheses are mere conventions, or definitions in disguise; some are tentative hypotheses that are malleable as a theory is being articulated or built; and some are verifiable, which “when once confirmed by experiment become truths of great fertility” (Science and Hypothesis, xxii).  Though he is a conventionalist about some aspects of science, he opposes what he calls nominalism – too much emphasis on free choice in science.

Poincaré regards the utility of science as evidence that scientists do not create facts – they discover facts.  Yet, on the other hand, he does not espouse a sort of direct realism by which science merely reflects the objective world.  Science neither creates nor passively reflects truth.  Rather, it has a limited power to uncover certain kinds of truths – those that capture “not things themselves… but the relations between things; outside those relations there is no reality knowable” (Science and Hypothesis, xxiv).

Let us consider these three aspects of Poincaré’s philosophy of science side by side with his constructivist philosophy of mathematics.  For Poincaré the most harmonious, simple, and beautiful facts are those that are typically expressed mathematically.  He goes so far as to assert that the only objective reality that science can discover consists of relations between facts; and these relations are expressed mathematically (The Value of Science, 13).  Thus, mathematics does not merely provide a useful language for science; it provides the only possible language for knowing the only types of facts we can objectively know – the relational facts.

Poincaré’s emphasis on structural, or relational, facts, and his rejection of the idea that science discovers the essences of things themselves, has been characterized by some as structural realism (Worrall).  Structural realism currently takes various forms, but the basic aim is to stake a moderate, middle position between skepticism and naïve realism.  We cannot know things in themselves, or things directly.  So, against naïve realism, science does not directly reflect reality.  Yet the success of science is surely not a miracle, its progress not a mere illusion.  We can explain this success, without naïve or direct realism, with the hypothesis that the important, lasting truths that science discovers are structural, or relational, in character.  Poincaré indeed espouses views that fit well with structural realism.

If relations are the most objective facts we can know; and if this is a form of realism; then relations must be real.  A question arises, however, over whether or not Poincaré’s underlying Kantian views are in tension with the realism in structural realism.  That is, given Poincaré’s anti-realism about mathematics, emphasizing the mathematical nature of the structural facts we can know seems to move us even further away from realism.  So, a question is whether his view should really be called structural Kantianism rather than structural realism.  If structure is mathematical, and mathematics is not conceived realistically, then how can he be a realist about structure?

I think there is a way to preserve the realism in his structural realism by remembering two things: first, his appeal to the empirical basis and utility of science in opposition to the nominalist; second, his Darwinism.  Take utility first.  We express the lasting, useful scientific relations mathematically; but it does not follow that the relations so expressed have no reality to them over and above mathematical reality.  If the relations had no such reality, they would not be so useful.  Moreover, since the scientist relies on experimental facts, “his freedom is always limited by the properties of the raw material on which he works” (The Value of Science, 121).  The rules that the scientist lays down are not arbitrary, like the rules of a game; they are constrained by experiment (The Value of Science, 114).  They are also proven by their long-term usefulness; and some facts even survive theory change, at least in rough form (The Value of Science, 95).

For Poincaré, the true relations, the real relations, are shown by their endurance through theory change; and he believed science had uncovered a number of such truths.  This is consistent with the view that what endures through scientific change – the enduring mathematically expressed relations – reflects reality as it really is (Science and Hypothesis, chapter X).  This is the same structural realist idea that science can cut nature at its joints, where the increasing complexity of science, including the overthrow of old theories for new ones, can sometimes be construed as science making more refined cuts in roughly the same places as it progresses.  (Think of a 16th century map, which is superseded by newer, more precise, maps.  It is not that the earlier map represented nothing.)

We can bolster this picture with Poincaré’s Darwinism.  We evolved in the world as it is.  This is a kind of minimal realism for it entails that the world is a certain way independent of our social, scientific, constructions.  Evolutionary pressure gives us capacities that help us to survive.  So, there is an evolved fit between our cognitive structures and the structures of the world.  If there weren’t, we wouldn’t have survived; indeed Poincaré suggests that if the world did not contain real regularities then there might be no life at all:

The most interesting facts are those which may serve many times; these are the facts which have a chance of coming up again.  We have been so fortunate as to be born in a world where there are such…. In [a world without recurring facts] there would be no science; perhaps thought and even life would be impossible, since evolution could not there develop the preservational instincts. (The Value of Science, 5)

The existence of life, no less than of science, confirms the existence of real regularities in the world.  We are beings who notice, and even look for, regularities.  So we survive.  In addition, although we impose mathematics on our cognition of the world – on the way we cognize the regularities – what we impose is not arbitrary.  Rather, mathematics reflects aspects of our cognitive capacities that have helped us survive in the world as it is.  That is, our inclination to search for order and regularities is also what makes us mathematical.

Kantian constructivism about mathematics is thus not opposed to scientific realism, provided realism is not taken in a naïve way.  For Poincaré, the structural realist hypothesis is that the enduring relations, which we can know, are real, because we have evolved to cut nature at its real joints – or, as he once put it, its “nodal points” (Science and Method, 287).  Mathematics is a sort of by-product of evolution, on this picture.  In this way, Poincaré's Kantianism about pure mathematics is supported by a Darwinian conception of human evolution – a picture that enables his philosophy of mathematics to coexist with his diverse views about natural science.

7. References and Further Reading

  • Dedekind:  Essays on the Theory of Numbers, Berman transl, Dover, 1963.
  • Ewald:    From Kant to Hilbert, Oxford University Press, 1996. (Contains good translations of several papers by Poincaré that were formerly available in English only in abridged form.)
  • Folina:  Poincaré and the Philosophy of Mathematics, Macmillan, 1992.
  • Folina:  "Poincaré's Circularity Arguments for Mathematical Intuition," The Kantian Legacy in Nineteenth Century Science, Friedman and Nordmann eds, MIT Press, 2006.
  • Frege:  The Foundations of Arithmetic, J L Austin transl, Oxford, 1969.
  • Goldfarb:  "Poincaré against the logicists," History and Philosophy of Modern Mathematics, Aspray and Kitcher eds, University of Minnesota Press, 1988.
  • Greffe, Heinzmann and Lorenz:  Henri Poincaré, Science and Philosophy, Akademie Verlag and Albert Blanchard, 1994.  (Anthology containing a wide variety of papers.)
  • Kant:  Critique of Pure Reason, N K Smith transl, St Martin's Press, 1965.
  • Poincaré:  Science and Hypothesis, W J Greenstreet transl, Dover 1952 (reprint of 1905; includes introduction by Larmor and general prefatory essay by Poincaré).
  • Poincaré: The Value of Science, George Bruce Halsted transl, Dover, 1958 (includes prefatory essay by Poincaré on the choice of facts).
  • Poincaré: Science and Method, Francis Maitland transl, Thoemmes Press, 1996   (reprint of 1914 edition with preface by Russell).
  • Poincaré: Last Essays, John Bolduc transl, Dover, 1963.
  • Poincaré: "Mathematics and Logic" (I, 1905b), in From Kant to Hilbert, Ewald ed, Halsted and Ewald transl, Oxford University Press, 1996.
  • Russell with Alfred North Whitehead:  Principia Mathematica, 1910-1913. 3 vols. Cambridge, UK: Cambridge Univ. Press. Revised ed., 1925-1927.
  • Weyl:  Philosophy of Mathematics and Natural Science, Helmer transl, Atheneum, 1963.
  • Worrall:  "Structural realism: the best of both worlds?" in Dialectica 43, pp. 99-124, 1989.

Author Information

Janet Folina
Email: folina@macalester.edu
Macalester College
U. S. A.

Predicative and Impredicative Definitions

The distinction between predicative and impredicative definitions is today widely regarded as an important watershed in logic and the philosophy of mathematics. A definition is said to be impredicative if it generalizes over a totality to which the entity being defined belongs. Otherwise the definition is said to be predicative. In the examples below, (2) and (4) are impredicative.

  1. Let π be the ratio between the circumference and diameter of a circle.
  2. Let n be the least natural number such that n cannot be written as the sum of at most four cubes.
  3. A natural number n is prime if and only if  n > 1 and the only divisors of n are 1 and n itself.
  4. A person x is general-like if and only if, for every property P which all great generals have, x too has P.

Definition (1) is predicative since π is defined solely in terms of the circumference and diameter of some given circle.  Definition (2), on the other hand, is impredicative, as this definition generalizes over all natural numbers, including n itself. Definition (3) is predicative, as the property of being prime is defined without any generalization over properties. By contrast, definition (4) is impredicative, as the property of being general-like is defined by generalization over the totality of all properties.
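In second-order notation, definition (4) can be rendered as follows (writing G for general-like and Gen for being a great general; the symbolization is ours, not the article’s):

```latex
\[
G(x) \;\leftrightarrow\;
\forall P\,\bigl[\, \forall y\,\bigl(\mathit{Gen}(y) \rightarrow P(y)\bigr)
\;\rightarrow\; P(x) \,\bigr]
\]
% The quantifier over P ranges over *all* properties -- including
% G itself -- which is exactly what makes the definition impredicative.
```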

Impredicative definitions have long been controversial in logic and the philosophy of mathematics. Many prominent logicians and philosophers—most importantly Henri Poincaré, Bertrand Russell, and Hermann Weyl—have rejected such definitions as viciously circular. However, it turns out that the rejection of such definitions would require a major revision of classical mathematics. The most common contemporary view is probably that of Kurt Gödel, who argued that impredicative definitions are legitimate provided one holds a realist view of the entities in question.

Although few theorists any longer reject all impredicative definitions, it is widely recognized that such definitions require stronger theoretical assumptions than do predicative definitions.

Table of Contents

  1. Paradoxes and the Vicious Circle Principle
  2. Impredicativity in Classical Mathematics
  3. Defenses of Impredicative Definitions
  4. References and Further Readings

1. Paradoxes and the Vicious Circle Principle

The notion of predicativity has its origin in the early twentieth-century debate between Poincaré, Russell, and others about the nature and source of the logical paradoxes ([Poincaré 1906], [Russell 1908]).  So it will be useful to review some of the most important logical paradoxes.

Russell's paradox. Let the Russell class R be the class of all classes that are not members of themselves. If R is a member of itself, then it doesn't satisfy the membership criterion and hence isn't a member of itself. If, on the other hand, R isn't a member of itself, then it does satisfy the membership criterion and hence is a member of itself after all. Thus, R is a member of itself iff (if and only if) R is not a member of itself.
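By way of analogy only (this is not part of the article’s argument), the instability can be mimicked by modeling a class as a membership test and applying the Russell test to itself; the question of its own membership never bottoms out:

```python
# Analogy: treat a "class" as a membership test (a function on tests).
# The Russell "class" contains exactly the tests that do not hold
# of themselves.
def russell(test):
    return not test(test)

# Asking whether russell belongs to itself unwinds forever:
# russell(russell) = not russell(russell) = not not russell(russell) = ...
try:
    russell(russell)
except RecursionError:
    print("no stable truth value")   # the vicious circle, operationally
```

The analogy is loose (Python detects the regress dynamically rather than deriving a contradiction), but it shows how a definition that consults itself fails to determine a value.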

The Liar paradox. "This sentence is false." If this quoted sentence is true, then what it says is correct, which means that the sentence is false. If, on the other hand, the sentence is false, then what it says is correct, which means that the sentence is true. Thus, the sentence is true just in case it is false.

Berry's paradox. There are only finitely many strings over the English alphabet of fewer than 200 characters. But there are infinitely many natural numbers. Hence there must be a least integer not nameable in fewer than 200 characters. But the phrase "the least integer not nameable in fewer than 200 characters" names it in fewer than 200 characters!
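The counting step in Berry's paradox can be made explicit (the alphabet size of 27 symbols, 26 letters plus a space, and the 200-character bound are illustrative):

```python
# Count the strings of length < 200 over a 27-symbol alphabet:
# a finite geometric sum.
ALPHABET = 27
LIMIT = 200

num_strings = sum(ALPHABET ** n for n in range(LIMIT))

# Each string names at most one integer, so at most num_strings
# integers are nameable in fewer than 200 characters.  Since there
# are infinitely many naturals, some integers are not so nameable,
# and by well-ordering there is a least one.
print(num_strings == (ALPHABET ** LIMIT - 1) // (ALPHABET - 1))  # True
```

The closed form in the last line is just the usual geometric-series identity, confirming the count is finite.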

Both Poincaré and Russell argued that the paradoxes are caused by some form of vicious circularity. What goes wrong, they claimed, is that an entity is defined, or a proposition is formulated, in a way that is unacceptably circular. Sometimes this circularity is transparent, as in the Liar paradox. But in other paradoxes there is no explicit circularity. For instance, the definition of the Russell class makes no explicit reference to the class being defined. Nor does the definition in Berry's paradox make any explicit reference to itself.

However, Poincaré and Russell argued that paradoxes such as Russell's and Berry's are guilty of an implicit form of circularity. The problem with the Russell class is said to be that its definition generalizes over a totality to which the defined class would belong. This is because the Russell class is defined as the class whose members are all and only the non-self-membered objects. So one of the objects that needs to be considered for membership in the Russell class is this very class itself. Similarly, the definition in Berry's paradox generalizes over all definitions, including the very definition in question.

Poincaré's and Russell's diagnosis is very general. Whenever we generalize over a totality, we presuppose all the entities that make up this totality. So when we attempt to define an entity by generalizing over a totality to which this entity would belong, we are tacitly presupposing the entity we are trying to define. And this, they claim, involves a vicious circle. The solution to the paradoxes is therefore to ban such circles by laying down what Russell calls the Vicious Circle Principle. This principle has received a bewildering variety of formulations. Here are two famous examples (from [Russell 1908], p. 225):

Whatever involves all of a collection must not be one of the collection.

If, provided a certain collection has a total, it would have members only definable in terms of that total, then the said collection has no total.

In a justly famous analysis, Gödel distinguishes between the following three forms of the Vicious Circle Principle ([Gödel 1944]):

(VCP1) No entity can be defined in terms of a totality to which this entity belongs.

(VCP2) No entity can involve a totality to which this entity belongs.

(VCP3) No entity can presuppose a totality to which this entity belongs.

The clearest of these principles is probably (VCP1), for it is simply a ban on impredicative definitions: it requires that a definition not generalize over a totality to which the entity being defined would belong.

According to Gödel, the other two principles, (VCP2) and (VCP3), are more plausible than the first, if not necessarily convincing. The tenability of these two principles is a fascinating question but beyond the scope of this survey.

For two other introductions to the question of predicativity, see [G