Time is what we use a clock to measure. Despite 2,500 years of investigation into the nature of time, many issues about it are unresolved. Here is a list in no particular order of the most important issues that are discussed in this article:

• What time actually is;
• Whether time exists when nothing is changing;
• What kinds of time travel are possible;
• How time is related to mind;
• Why time has an arrow;
• Whether the future and past are as real as the present;
• How to correctly analyze the metaphor of time’s flow;
• Whether contingent sentences about the future have truth values now;
• Whether future time will be infinite;
• Whether there was time before our Big Bang;
• Whether tensed or tenseless concepts are semantically basic;
• What the proper formalism or logic is for capturing the special role that time plays in reasoning;
• What neural mechanisms account for our experience of time;
• Which aspects of time are conventional; and
• Whether there is a timeless substratum from which time emerges.

Consider this one issue upon which philosophers are deeply divided: What sort of ontological differences are there among the present, the past and the future? There are three competing theories. Presentists argue that necessarily only present objects and present experiences are real, and we conscious beings recognize this in the special vividness of our present experience compared to our memories of past experiences and our expectations of future experiences. So, the dinosaurs have slipped out of reality. However, according to the growing-past theory, the past and present are both real, but the future is not real because the future is indeterminate or merely potential. Dinosaurs are real, but our death is not. The third theory is that there are no objective ontological differences among present, past, and future because the differences are merely subjective. This third theory is called “eternalism.”

Table of Contents

  1. What Should a Philosophical Theory of Time Do?
  2. How Is Time Related to Mind?
  3. What Is Time?
    1. The Variety of Answers
    2. Time vs. “Time”
    3. Linear and Circular Time
    4. The Extent of Time
    5. Does Time Emerge from Something More Basic?
    6. Time and Conventionality
  4. What Does Science Require of Time?
  5. What Kinds of Time Travel are Possible?
  6. Does Time Require Change? (Relational vs. Substantival Theories)
  7. Does Time Flow?
    1. McTaggart's A-Series and B-Series
    2. Subjective Flow and Objective Flow
  8. What are the Differences among the Past, Present, and Future?
    1. Presentism, the Growing-Past, Eternalism, and the Block-Universe
    2. Is the Present, the Now, Objectively Real?
    3. Persist, Endure, Perdure, and Four-Dimensionalism
    4. Truth Values and Free Will
  9. Are There Essentially-Tensed Facts?
  10. What Gives Time Its Direction or Arrow?
    1. Time without an Arrow
    2. What Needs To Be Explained
    3. Explanations or Theories of the Arrow
    4. Multiple Arrows
    5. Reversing the Arrow
  11. What is Temporal Logic?
  12. Supplements
    1. Frequently Asked Questions
    2. What Science Requires of Time
    3. Special Relativity: Proper Times, Coordinate Systems, and Lorentz Transformations (by Andrew Holster)
  13. References and Further Reading

1. What Should a Philosophical Theory of Time Do?

Philosophers of time tend to divide into two broad camps on some of the key philosophical issues, although many philosophers do not fit into these pigeonholes. Members of the A-camp say that McTaggart's A-series is the fundamental way to view time; events are always changing; the now is objectively real and so is time's flow; ontologically we should accept either presentism or the growing-past theory; predictions are not true or false at the time they are uttered; tenses are semantically basic; and the ontologically fundamental entities are 3-dimensional objects. Members of the B-camp say that McTaggart's B-series is the fundamental way to view time; events are never changing; the now is not objectively real and neither is time's flow; ontologically we should accept eternalism and the block-universe theory; predictions are true or false at the time they are uttered; tenses are not semantically basic; and the fundamental entities are 4-dimensional events or processes. This article provides an introduction to this controversy between the camps.

However, there are many other issues about time whose solutions do not fit into one or the other of the above two camps. (i) Does time exist only for beings who have minds? (ii) Can time exist if no event is happening anywhere? (iii) What sorts of time travel are possible? (iv) Why does time have an arrow? (v) Is the concept of time inconsistent?

A full theory of time should address this constellation of philosophical issues about time. Narrower theories of time will focus on resolving one or more members of this constellation, but the long-range goal is to knit together these theories into a full, systematic, and detailed theory of time. Philosophers also ask whether to adopt  a realist or anti-realist interpretation of a theory of time, but this article does not explore this subtle metaphysical question.

2. How Is Time Related to Mind?

Physical time is public time, the time that clocks are designed to measure. Biological time, by contrast, is indicated by an organism's circadian rhythm or body clock, which is normally regulated by the pattern of sunlight and darkness. Psychological time is different from both physical time and biological time. Psychological time is private time. It is also called phenomenological time, and it is perhaps best understood as awareness of physical time. Psychological time passes relatively swiftly for us while we are enjoying an activity, but it slows dramatically if we are waiting anxiously for the  pot of water to boil on the stove. The slowness is probably due to focusing our attention on short intervals of physical time. Meanwhile, the clock by the stove is measuring physical time and is not affected by any person’s awareness or by any organism's biological time.

When a physicist defines speed to be the rate of change of position with respect to time, the term “time” refers to physical time, not psychological time or biological time. Physical time is more basic or fundamental than psychological time for helping us understand our shared experiences in the world, and so it is more useful for doing physical science, but psychological time is vitally important for understanding many mental experiences.

Psychological time is faster for older people than for children, as you notice when your grandmother says, "Oh, it's my birthday again." That is, an older person's psychological time is faster relative to physical time. Psychological time is slower or faster depending upon where we are in the spectrum of conscious experience: awake normally, involved in a daydream,  sleeping normally, drugged with anesthetics, or in a coma. Some philosophers claim that psychological time is completely transcended in the mental state called nirvana because psychological time slows to a complete stop. There is general agreement among philosophers that, when we are awake normally, we do not experience time as stopping and starting.

A major philosophical problem is to explain the origin and character of our temporal experiences. Philosophers continue to investigate, but so far do not agree on, how our experience of temporal phenomena produces our consciousness of our experiencing temporal phenomena. With the notable exception of Husserl, most philosophers say our ability to imagine other times is a necessary ingredient in our having any consciousness at all. Many philosophers also say people in a coma have a low level of consciousness, yet when a person awakes from a coma they can imagine other times but have no good sense about how long they've been in the coma.

We make use of our ability to imagine other times when we experience a difference between our present perceptions and our present memories of past perceptions.  Somehow the difference between the two gets interpreted by us as evidence that the world we are experiencing is changing through time, with some events succeeding other events. Locke said our train of ideas produces our idea that events succeed each other in time, but he offered no details on how this train does the producing.

Philosophers also want to know which aspects of time we have direct experience of, and which we have only indirect experience of. Is our direct experience only of the momentary present, as Aristotle, Thomas Reid, and Alexius Meinong believed, or instead do we have direct experience of what William James called a "specious present," a short stretch of physical time? Among those accepting the notion of a specious present, there is continuing controversy about whether the individual specious presents can overlap each other and about how the individual specious presents combine to form our stream of consciousness.

The brain takes an active role in building a mental scenario of what is taking place beyond the brain. For one example, the "time dilation effect" in psychology occurs when events involving an object coming toward you last longer in psychological time than an event with the same object being stationary. For another example, try tapping your nose with one hand and your knee with your other hand at the same time. Even though it takes longer for the signal from your knee to reach your brain than the signal from your nose to reach your brain, you will have the experience of the two tappings being simultaneous—thanks to the brain's manipulation of the data. Neuroscientists suggest that your brain waits about 80 milliseconds for all the relevant input to come in before you experience a “now.” Craig Callender surveyed the psycho-physics literature on human experience of the present, and concluded that, if the duration in physical time between two experienced events is less than about a quarter of a second (250 milliseconds), then humans will say both events happened simultaneously, and this duration is slightly different for different people but is stable within the experience of any single person. Also, "our impression of subjective present-ness...can be manipulated in a variety of ways" such as by what other sights or sounds are present at nearby times. See (Callender 2003-4, p. 124) and (Callender 2008).

Within the field of cognitive science, researchers want to know what are the neural mechanisms that account for our experience of time—for our awareness of change, for our sense of time’s flow, for our ability to place events into the proper time order (temporal succession), and for our ability to notice, and often accurately estimate, durations (persistence). The most surprising experimental result about our experience of time is Benjamin Libet’s claim in the 1970s that his experiments show that the brain events involved in initiating our free choice occur about a third of a second before we are aware of our choice. Before Libet’s work, it was universally agreed that a person is aware of deciding to act freely, then later the body initiates the action. Libet's work has been used to challenge this universal claim about decisions. However, Libet's own experiments have been difficult to repeat because he drilled through the skull and inserted electrodes to shock the underlying brain tissue. See (Damasio 2002) for more discussion of Libet's experiments.

Neuroscientists and psychologists have investigated whether they can speed up our minds relative to a duration of physical time. If so, we might become mentally more productive, get more high-quality decision making done per fixed amount of physical time, and learn more per minute. Several avenues have been explored: using cocaine, amphetamines and other drugs; undergoing extreme experiences such as jumping backwards off a tall bridge with bungee cords attached to one's ankles; and trying different forms of meditation. So far, none of these avenues has led to any gain in productivity.

Any organism’s sense of time is subjective, but is the time that is sensed also subjective, a mind-dependent phenomenon? Throughout history, philosophers of time have disagreed on the answer. Without minds in the world, nothing in the world would be surprising or beautiful or interesting. Can we add that nothing would be in time? The majority answer is "no." The ability of the concept of time to help us make sense of our phenomenological evidence involving change, persistence, and succession of events is a sign that time may be objectively real. Consider succession, that is, order of events in time. We all agree that our memories of events occur after the events occur. If judgments of time were subjective in the way judgments of being interesting vs. not-interesting are subjective, then it would be too miraculous that everyone can so easily agree on the ordering of events in time. For example, first Einstein was born, then he went to school, then he died. Everybody agrees that it happened in this order: birth, school, death. No other order. The agreement on time order for so many events, both psychological events and physical events, is part of the reason that most philosophers and scientists believe physical time is objective and not dependent on being consciously experienced.

Another large part of the reason to believe time is objective is that our universe has so many different processes that bear consistent time relations, or frequency of occurrence relations, to each other. For example, the frequency of rotation of the Earth around its axis is a constant multiple of the frequency of oscillation of a fixed-length pendulum, which in turn is a constant multiple of the half life of a specific radioactive uranium isotope, which in turn is a multiple of the frequency of a vibrating violin string; the relationship of these oscillators does not change as time goes by (at least not much and not for a long time, and when there is deviation we know how to predict it and compensate for it). The existence of these sorts of relationships makes our system of physical laws much simpler than it otherwise would be, and it makes us more confident that there is something objective we are referring to with the time-variable in those laws. The stability of these relationships over a long time makes it easy to create clocks. Time can be measured easily because we have access to long-term simple harmonic oscillators that have a regular period or “regular ticking.” This regularity shows up in completely different stable systems: rotations of the Earth, a swinging ball hanging from a string (a pendulum), a bouncing ball hanging from a coiled spring, revolutions of the Earth around the Sun, oscillating electric circuits, and vibrations of a quartz crystal. Many of these systems make good clocks. The existence of these possibilities for clocks strongly suggests that time is objective, and is not merely an aspect of consciousness.
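As a rough illustration of what it means for two periodic processes to bear a constant frequency relation to each other, the following Python sketch compares an idealized one-meter pendulum with the Earth's rotation. The function names, the small-angle pendulum formula, and the approximate values for standard gravity and the sidereal day are assumptions introduced here for illustration; they are not part of the argument above. The sketch shows only that the ratio is a pure number that does not depend on the unit in which the two periods happen to be measured.

```python
import math

# Assumed illustrative values (not taken from the article).
STANDARD_GRAVITY = 9.80665      # m/s^2
SIDEREAL_DAY_S = 86164.0905     # seconds for one rotation of the Earth (approx.)


def pendulum_period_s(length_m: float, g: float = STANDARD_GRAVITY) -> float:
    """Small-angle period of an ideal pendulum: T = 2*pi*sqrt(L/g)."""
    return 2.0 * math.pi * math.sqrt(length_m / g)


def frequency_ratio(period_a: float, period_b: float) -> float:
    """How many cycles of process A occur per cycle of process B.

    Both periods must be given in the same unit; the result is unitless.
    """
    return period_b / period_a


pendulum_s = pendulum_period_s(1.0)

# The same comparison made in seconds and in hours.
ratio_in_seconds = frequency_ratio(pendulum_s, SIDEREAL_DAY_S)
ratio_in_hours = frequency_ratio(pendulum_s / 3600.0, SIDEREAL_DAY_S / 3600.0)

print(f"pendulum swings per rotation of the Earth: {ratio_in_seconds:.0f}")
print("same ratio in either unit:", math.isclose(ratio_in_seconds, ratio_in_hours))
```

A real pendulum and the real Earth deviate slightly from these idealized values, which is the kind of predictable deviation the passage above says we know how to compensate for.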

The issue about objectivity vs. subjectivity is related to another issue: realism vs. idealism. Is time real or instead just a useful instrument or just a useful convention or perhaps an arbitrary convention? This issue will appear several times throughout this article, including in the later section on conventionality.

Aristotle raised this issue of the mind-dependence of time when he said, “Whether, if soul (mind) did not exist, time would exist or not, is a question that may fairly be asked; for if there cannot be someone to count there cannot be anything that can be counted…” (Physics, chapter 14). He does not answer his own question because, he says rather profoundly, it depends on whether time is the conscious numbering of movement or instead is just the capability of movements being numbered were consciousness to exist.

St. Augustine, adopting a subjective view of time, said time is nothing in reality but exists only in the mind’s apprehension of that reality. The 13th century philosophers Henry of Ghent and Giles of Rome said time exists in reality as a mind-independent continuum, but is distinguished into earlier and later parts only by the mind. In the 13th century, Duns Scotus clearly recognized both physical and psychological time.

At the end of the 18th century, Kant suggested a subtle relationship between time and mind: that our mind actually structures our perceptions so that we can know a priori that time is like a mathematical line. Time is, on this theory, a form of conscious experience, and our sense of time is a necessary condition of our having experiences such as sensations. In the 19th century, Ernst Mach claimed instead that our sense of time is a simple sensation, not an a priori form of sensation. This controversy took another turn when other philosophers argued that both Kant and Mach were incorrect because our sense of time is, instead, an intellectual construction (see Whitrow 1980, p. 64).

In the 20th century, the philosopher of science Bas van Fraassen described time, including physical time, by saying, “There would be no time were there no beings capable of reason” just as “there would be no food were there no organisms, and no teacups if there were no tea drinkers.”

The controversy in metaphysics between idealism and realism is that, for the idealist, nothing exists independently of the mind. If this controversy is settled in favor of idealism, then physical time, too, would have that subjective feature.

It has been suggested by some philosophers that Einstein’s theory of relativity, when confirmed, showed us that physical time depends on the observer, and thus that physical time is subjective, or dependent on the mind. This error is probably caused by Einstein’s use of the term “observer.” Einstein’s theory implies that the duration of an event depends on the observer’s frame of reference or coordinate system, but what Einstein means by “observer’s frame of reference” is merely a perspective or coordinate framework from which measurements could be made. The “observer” need not have a mind. So, Einstein is not making a point about mind-dependence.
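The claim that duration is frame-dependent without being mind-dependent can be made concrete with the standard time-dilation relation of special relativity: a duration of proper time Δτ corresponds to a duration γΔτ in a frame where the clock moves at speed v, with γ = 1/√(1 − v²/c²). The Python sketch below is only an illustration under that assumption. The function names are hypothetical, and nothing in the computation refers to anyone's mental states.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s (exact by definition of the meter)


def lorentz_factor(speed_m_per_s: float) -> float:
    """Return gamma = 1 / sqrt(1 - v^2/c^2) for a speed below that of light."""
    beta = speed_m_per_s / SPEED_OF_LIGHT
    if abs(beta) >= 1.0:
        raise ValueError("speed must be less than the speed of light")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def duration_in_frame(proper_duration_s: float, relative_speed_m_per_s: float) -> float:
    """Duration of an event in a frame moving at the given speed relative to
    the frame in which the event's clock is at rest."""
    return lorentz_factor(relative_speed_m_per_s) * proper_duration_s


# One second of proper time, described from three different frames.
for fraction_of_c in (0.0, 0.6, 0.9):
    seconds = duration_in_frame(1.0, fraction_of_c * SPEED_OF_LIGHT)
    print(f"v = {fraction_of_c:.1f}c  ->  {seconds:.3f} s")
```

The only inputs are a duration and a relative speed, so the "observer" in this calculation is just a choice of coordinate frame.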

To mention one last issue about the relationship between mind and time, if all organisms were to die, there would be events after those deaths. The stars would continue to shine, for example, but would any of these events be in the future? This is a controversial question because advocates of McTaggart’s A-theory will answer “yes,” whereas advocates of McTaggart’s B-theory will answer “no” and say “whose future?”

For more on the consciousness of time and related issues, see the article “Phenomenology and Time-Consciousness.” For more on whether the present, as opposed to time itself, is subjective, see the section called "Is the Present, the Now, Objectively Real?"

3. What Is Time?

Physical time seems to be objective, whereas psychological time is subjective. Many philosophers of science argue that physical time is more fundamental even though psychological time is discovered first by each of us during our childhood, and even though psychological time was discovered first as we human beings evolved from our animal ancestors. The remainder of this article focuses more on physical time than psychological time.

Time is what we use a clock or calendar to measure. We can say time is composed of all the instants or all the times, but that word "times" is ambiguous and also means measurements of time. Think of our placing a coordinate system on our spacetime (this cannot be done successfully in all spacetimes) as our giving names to spacetime points. The measurements we make of time are numbers variously called times, dates, clock readings, and temporal coordinates; and these numbers are relative to time zones and reference frames and conventional agreements about how to define the second, the conventional unit for measuring time. It is because of what time is that we can succeed in assigning time numbers in this manner. Another feature of time is that we can place all events in a single reference frame into a linear sequence one after the other according to their times of occurrence; for any two instants, they are either simultaneous or else one happens before the other but not vice versa. A third feature is that we can succeed in coherently specifying with real numbers how long an event lasts; this is the duration between the event's beginning instant and its ending instant. These are three key features of time, but they do not quite tell us what time itself is.

In discussion about time, the terminology is often ambiguous. We have just mentioned that care is often not taken in distinguishing time from the measure of time. Here are some additional comments about terminology: A moment is said to be a short time, a short event, and to have a short duration or short interval ("length" of time). Comparing a moment to an instant, a moment is brief, but an instant is even briefer. An instant is usually thought to have either a zero duration or else a duration so short as not to be detectable.

a. The Variety of Answers

We cannot trip over a moment of time nor enclose it in a box, so what exactly are moments? Are they created by humans analogous to how, according to some constructivist philosophers, mathematical objects are created by humans, and once created then they have well-determined properties some of which might be difficult for humans to discover? Or is time more like a Platonic idea? Or is time an emergent feature of changes in analogy to how a sound wave is an emergent feature of the molecules of a vibrating tuning fork, with no single molecule making a sound? When we know what time is, then we can answer all these questions.

One answer to our question, “What is time?” is that time is whatever the time variable t is denoting in the best-confirmed and most fundamental theories of current science. “Time” is given an implicit definition this way. Nearly all philosophers would agree that we do learn much about physical time by looking at the behavior of the time variable in these theories; but they complain that the full nature of physical time can be revealed only with a philosophical theory of time that addresses the many philosophical issues that scientists do not concern themselves with.

Physicists often say time is a sequence of moments in a linear order. Presumably a moment is a durationless instant. Michael Dummett’s constructive model of time implies instead that time is a composition of intervals rather than of durationless instants. The model is constructive in the sense that it implies there do not exist any times which are not detectable in principle by a physical process.

One answer to the question "What is time?" is that it is a general feature of the actual changes in the universe so that if all changes are reversed then time itself reverses. This answer is called "relationism" and "relationalism." A competing answer is that time is more like a substance in that it exists independently of relationships among changes or events. These two competing answers to our question are explored in a later section.

A popular post-Einstein answer to "What is time?" is that time is a single dimension of spacetime.

Because time is intimately related to change, the answer to our question is likely to depend on our answer to the question, "What is change?" The most popular type of answer here is that change is an alteration in the properties of some enduring thing, for example, the alteration from green to brown of an enduring leaf. A different type of answer is that change is basically a sequence of states, such as a sequence containing a state in which the leaf is green and a state in which the leaf is brown. This issue won't be pursued here, and the former answer will be presumed at several places later in the article.

Before the creation of Einstein's special theory of relativity, it might have been said that time must provide these four things: (1) For any event, it specifies when it occurs. (2) For any event, it specifies its duration—how long it lasts. (3) For any event, it fixes what other events are simultaneous with it. (4) For any pair of events that are not simultaneous, it specifies which happens first. With the creation of the special theory of relativity in 1905, it was realized that these questions can get different answers in different frames of reference.
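How items (3) and (4) can receive different answers in different frames can be sketched with the Lorentz transformation for the time coordinate, t′ = γ(t − vx/c²). The Python sketch below uses units in which c = 1 (time in seconds, distance in light-seconds). The two events, their coordinates, and the function names are hypothetical choices made purely for illustration.

```python
import math


def transformed_time(t: float, x: float, beta: float) -> float:
    """Time coordinate of an event in a frame moving with velocity beta
    (as a fraction of c) along the x-axis, in units where c = 1."""
    if abs(beta) >= 1.0:
        raise ValueError("beta must be strictly between -1 and 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (t - beta * x)


def temporal_order(t_a: float, t_b: float) -> str:
    """Describe the order of two events from their time coordinates."""
    if math.isclose(t_a, t_b, abs_tol=1e-12):
        return "simultaneous"
    return "A first" if t_a < t_b else "B first"


# Two hypothetical events, simultaneous in the original frame but
# one light-second apart in space: (t, x) coordinates.
event_a = (0.0, 0.0)
event_b = (0.0, 1.0)

for beta in (0.0, 0.6, -0.6):
    t_a = transformed_time(*event_a, beta)
    t_b = transformed_time(*event_b, beta)
    print(f"beta = {beta:+.1f}: {temporal_order(t_a, t_b)}")
```

The disagreement about order arises here because the two events are too far apart in space for light to travel from one to the other. For pairs of events that could be causally connected, every frame agrees on which happens first.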

Bothered by the contradictions they claimed to find in our concept of time, Zeno, Plato, Spinoza, Hegel, and McTaggart answer the question, “What is time?” by replying that it is nothing because it does not exist (LePoidevin and MacBeath 1993, p. 23). In a similar vein, the early 20th century English philosopher F. H. Bradley argued, “Time, like space, has most evidently proved not to be real, but a contradictory appearance….The problem of change defies solution.” In the mid-twentieth century, Gödel argued for the unreality of time because Einstein's equations allow for physically possible worlds in which events precede themselves.  In the twenty-first century some physicists such as Julian Barbour say that in order to reconcile general relativity with quantum mechanics either time does not exist or else it is not fundamental in nature; see (Callender 2010) for a discussion of this. However, most philosophers agree that time does exist. They just cannot agree on what it is.

Let’s briefly explore other answers that have been given throughout history to our question, “What is time?” Aristotle claimed that “time is the measure of change” (Physics, chapter 12). He never said space is a measure of anything. Aristotle emphasized “that time is not change [itself]” because a change “may be faster or slower, but not time…” (Physics, chapter 10). For example, a specific change such as the descent of a leaf can be faster or slower, but time itself cannot be faster or slower. In developing his views about time, Aristotle advocated what is now referred to as the relational theory when he said, “there is no time apart from change….” (Physics, chapter 11). In addition, Aristotle said time is not discrete or atomistic but “is continuous…. In respect of size there is no minimum; for every line is divided ad infinitum. Hence it is so with time” (Physics, chapter 11).

René Descartes had a very different answer to “What is time?” He argued that a material body has the property of spatial extension but no inherent capacity for temporal endurance, and that God by his continual action sustains (or re-creates) the body at each successive instant. Time is a kind of sustenance or re-creation ("Third Meditation" in Meditations on First Philosophy).

In the 17th century, the English physicist Isaac Barrow rejected Aristotle’s linkage between time and change. Barrow said time is something which exists independently of motion or change and which existed even before God created the matter in the universe. Barrow’s student, Isaac Newton, agreed with this substantival theory of time. Newton argued very specifically that time and space are an infinitely large container for all events, and that the container exists with or without the events. He added that space and time are not material substances, but are like substances in not being dependent on anything except God.

Gottfried Leibniz objected. He argued that time is not an entity existing independently of actual events. He insisted that Newton had underemphasized the fact that time necessarily involves an ordering of any pair of non-simultaneous events. This is why time “needs” events, so to speak. Leibniz added that this overall order is time. He accepted a relational theory of time and rejected a substantival theory.

In the 18th century, Immanuel Kant said time and space are forms that the mind projects upon the external things-in-themselves. He spoke of our mind structuring our perceptions so that space always has a Euclidean geometry, and time has the structure of the mathematical line. Kant’s idea that time is a form of apprehending phenomena is probably best taken as suggesting that we have no direct perception of time but only the ability to experience things and events in time. Some historians distinguish perceptual space from physical space and say that Kant was right about perceptual space. It is difficult, though, to get a clear concept of perceptual space. If physical space and perceptual space are the same thing, then Kant is claiming we know a priori that physical space is Euclidean. With the discovery of non-Euclidean geometries in the 1820s, and with increased doubt about the reliability of Kant’s method of transcendental proof, the view that truths about space and time are a priori truths began to lose favor.

The above discussion does not exhaust all the claims about what time is. And there is no sharp line separating a definition of time, a theory of time, and an explanation of time.

b. Time vs. “Time”

Whatever time is, it is not “time.” “Time” is the most common noun in all documents on the Internet's web pages; time is not. Nevertheless, it might help us understand time if we improved our understanding of the sense of the word “time.” Should the proper answer to the question “What is time?” produce a definition of the word as a means of capturing its sense? No. At least not if the definition must be some analysis that provides a simple paraphrase in all its occurrences. There are just too many varied occurrences of the word: time out, behind the times, in the nick of time, and so forth.

But how about narrowing the goal to a definition of the word “time” in its main sense, the sense that most interests philosophers and physicists? That is, explore the usage of the word “time” in its principal sense as a means of learning what time is. Well, this project would require some consideration of the grammar of the word “time.” Most philosophers today would agree with A. N. Prior who remarked that, “there are genuine metaphysical problems, but I think you have to talk about grammar at least a little bit in order to solve most of them.” However, do we learn enough about what time is when we learn about the grammatical intricacies of the word? John Austin made this point in “A Plea for Excuses,” when he said, if we are using the analytic method, the method of analysis of language, in order to sharpen our perception of the phenomena, then “it is plainly preferable to investigate a field where ordinary language is rich and subtle, as it is in the pressingly practical matter of Excuses, but certainly is not in the matter, say, of Time.” Ordinary-language philosophers have studied time talk, what Wittgenstein called the “language game” of discourse about time. Wittgenstein’s expectation is that by drawing attention to ordinary ways of speaking we will be able to dissolve rather than answer our philosophical questions. But most philosophers of time are unsatisfied with this approach; they want the questions answered, not dissolved, although they are happy to have help from the ordinary language philosopher in clearing up misconceptions that may be produced by the way we use the word in our ordinary, non-technical discourse.

c. Linear and Circular Time

Is time more like a straight line or instead more like a circle? If your personal time were circular, then eventually you would be reborn. With circular time, the future is also in the past, and every event occurs before itself. If your time is like this, then the question arises as to whether you would be born an infinite number of times or only once. The argument that you'd be born only once appeals to Leibniz’s Principle of the Identity of Indiscernibles: each supposedly repeating state of the world would occur just once because each state would not be discernible from the state that recurs. The way to support the idea of eternal recurrence or repeated occurrence seems to be to presuppose a linear ordering in some "hyper" time of all the cycles so that each cycle is discernible from its predecessor because it occurs at a different hyper time.

Throughout history (and long before Einstein made a distinction between proper time and coordinate time), a variety of answers were given to the question of whether time is like a line or, instead, closed like a circle. The concept of linear time first appeared in the writings of the Hebrews and the Zoroastrian Iranians. The Roman writer Seneca also advocated linear time. Plato and most other Greeks and Romans believed time to be motion and believed cosmic motion was cyclical, but this was not envisioned as requiring any detailed endless repetition such as the multiple rebirths of Socrates. However, the Pythagoreans and some Stoic philosophers such as Chrysippus did adopt this drastic position. Circular time was promoted in Ecclesiastes 1:9: "That which has been is what will be, That which is done is what will be done, And there is nothing new under the sun." The idea was picked up again by Nietzsche in 1882. Scholars do not agree on whether Nietzsche meant his idea of circular time to be taken literally or merely as a moral lesson about how you should live your life if you knew that you'd live it over and over.

Many Islamic and Christian theologians adopted the ancient idea that time is linear. Nevertheless, it was not until 1602 that the concept of linear time was more clearly formulated, by the English philosopher Francis Bacon. In 1687, Newton advocated linear time when he represented time mathematically by using a continuous straight line with points being analogous to instants of time. The concept of linear time was also promoted by Descartes, Spinoza, Hobbes, Barrow, Leibniz, Locke and Kant. Kant argued that the linearity of time is a matter of necessity. By the early 19th century in Europe, the idea of linear time had become dominant in both science and philosophy.

There are many other mathematically possible topologies for time. Time could be linear or closed (circular). Linear time might have a beginning or have no beginning; it might have an ending or no ending. There could be two disconnected time streams, in two parallel worlds; perhaps one would be linear and the other circular. There could be branching time, in which time is like the letter "Y", and there could be a fusion time in which two different time streams are separate for some durations but merge into one for others. Time might be two dimensional instead of one dimensional. For all these topologies, there could be discrete time or, instead, continuous time. That is, the micro-structure of time's instants might be analogous to a sequence of integers or, instead, analogous to a continuum of real numbers. For physicists, if time were discrete or quantized, their favorite lower limit on a possible duration is the Planck time of about 10⁻⁴³ seconds.

d. The Extent of Time

In ancient Greece, Plato and Aristotle agreed that the past is eternal. Aristotle claimed that time had no beginning because, for any time, we always can imagine an earlier time. Confidence in the reliability of appealing to our imagination to tell us how things are eventually waned. Although Aquinas agreed with Aristotle about the past being eternal, his contemporary St. Bonaventure did not. Martin Luther estimated the world to have begun in 4000 B.C.E.; Johannes Kepler estimated it to have begun in 4004 B.C.E.; and the Calvinist James Ussher calculated that the world began on October 23, 4004 B.C.E. Advances in the science of geology eventually refuted these small estimates for the age of the Earth, and advances in astronomy eventually refuted the idea that the Earth and the universe were created at about the same time.

Physicists generally agree that future time is infinite, but it is an open question whether past time is finite or infinite. Many physicists believe that past time is infinite, but many others believe instead that time began with the Big Bang about 13.8 billion years ago.

In the most well-accepted version of the Big Bang Theory in the field of astrophysics, about 13.8 billion years ago our universe had an almost infinitesimal size and an almost infinite temperature and gravitational field. The universe has been expanding and cooling ever since.

In the more popular version of the Big Bang theory, the Big Bang theory with inflation, the universe once was an extremely tiny bit of explosively inflating material. About 10⁻³⁶ seconds later, this inflationary material underwent an accelerating expansion that lasted for 10⁻³⁰ seconds during which the universe expanded by a factor of 10⁷⁸. Once this brief period of inflation ended, the volume of the universe was the size of an orange, and the energy causing the inflation was transformed into a dense gas of expanding hot radiation. This expansion has never stopped. But with expansion came cooling, and this allowed individual material particles to condense and eventually much later to clump into stars and galaxies. The mutual gravitational force of the universe’s matter and energy decelerated the expansion, but seven billion years after our Big Bang, the universe’s dark energy became especially influential and started to accelerate the expansion again, despite the mutual gravitational force, although not at the explosive rate of the initial inflation. This more recent inflation of the universe will continue forever at an exponentially accelerating rate, as the remaining matter-energy becomes more and more diluted.

The Big Bang Theory with or without inflation is challenged by other theories such as a cyclic theory in which every trillion years the expansion changes to contraction until the universe becomes infinitesimal, at which time there is a bounce or new Big Bang. The cycles of Bang and Crunch continue forever, and they might or might not have existed forever. For the details, see (Steinhardt 2012). A promising but as yet untested theory called "eternal inflation" implies that our particular Big Bang is one among many other Big Bangs that occurred within a background spacetime that is actually infinite in space and in past time and future time.

Consider this challenging argument from (Newton-Smith 1980, p. 111) that claims time cannot have had a finite past: “As we have reasons for supposing that macroscopic events have causal origins, we have reason to suppose that some prior state of the universe led to the product of [the Big Bang]. So the prospects for ever being warranted in positing a beginning of time are dim.” The usual response to Newton-Smith here is two-fold. First, our Big Bang is a microscopic event, not a macroscopic event, so it might not be relevant that macroscopic events have causal origins. Second, and more importantly, if a confirmed cosmological theory implies there is a first event, we can say this event is an exception to any metaphysical principle that every event has a prior cause.

e. Does Time Emerge from Something More Basic?

Is time a fundamental feature of nature, or does it emerge from more basic timeless features–in analogy to the way the smoothness of water flow emerges from the complicated behavior of the underlying molecules, none of which is properly called "smooth"? That is, is time ontologically basic (fundamental), or does it depend on something even more basic?

We might rephrase this question more technically by asking whether facts about time supervene on more basic facts. Facts about sound supervene on, or are a product of, facts about changes in the molecules of the air, so molecular change is more basic than sound. Minkowski argued in 1908 that we should believe spacetime is more basic than time, and this argument is generally well accepted. However, is this spacetime itself basic? Some physicists argue that spacetime is the product of some more basic micro-substrate at the level of the Planck length. There is no agreed-upon theory of what the substrate is, but a leading candidate is quantum information.

Other physicists say space is not basic, but time is. In 2004, after winning the Nobel Prize in physics, David Gross expressed this viewpoint:

Everyone in string theory is convinced…that spacetime is doomed. But we don’t know what it’s replaced by. We have an enormous amount of evidence that space is doomed. We even have examples, mathematically well-defined examples, where space is an emergent concept…. But in my opinion the tough problem that has not yet been faced up to at all is, “How do we imagine a dynamical theory of physics in which time is emergent?” …All the examples we have do not have an emergent time. They have emergent space but not time. It is very hard for me to imagine a formulation of physics without time as a primary concept because physics is typically thought of as predicting the future given the past. We have unitary time evolution. How could we have a theory of physics where we start with something in which time is never mentioned?

The discussion in this section about whether time is ontologically basic has no implications for whether the word “time” is semantically basic or whether the idea of time is basic to concept formation.

f. Time and Conventionality

It is an arbitrary convention that our civilization designs clocks to count up to higher numbers rather than down to lower numbers as time goes on. It is just a matter of convenience that we agree to the convention of re-setting our clock by one hour as we cross a time-zone. It is an arbitrary convention that there are twenty-four hours in a day instead of ten, that there are sixty seconds in a minute rather than twelve, that a second lasts as long as it does, and that the origin of our coordinate system for time is associated with the birth of Jesus on some calendars but with the migration of Mohammed from Mecca to Medina on other calendars.

According to relativity theory, if two events couldn't have had a causal effect on each other, then we analysts are free to choose a reference frame in which one of the events happens first, or instead the other event happens first, or instead the two events are simultaneous. But once a frame is chosen, this fixes the time order of any pair of events. This point is discussed further in the next section.

In 1905, the French physicist Henri Poincaré argued that time is not a feature of reality to be discovered, but rather is something we've invented for our convenience. Because, he said, possible empirical tests cannot determine very much about time, he recommended the convention of adopting the concept of time that makes for the simplest laws of physics. Opposing this conventionalist picture of time, other philosophers of science have recommended a less idealistic view in which time is an objective feature of reality. These philosophers are recommending an objectivist picture of time.

Can our standard clock be inaccurate? Yes, say the objectivists about the standard clock. No, say the conventionalists, who hold that the standard clock is accurate by convention; if it acts strangely, then all clocks must act strangely in order to stay in synchrony with the standard clock that tells everyone the correct time. A closely related question concerns a change of standard. When we change our standard clock from the Earth's rotation to an atomic clock, or from one kind of atomic clock to another kind, are we merely adopting constitutive conventions for our convenience, or are we in some objective sense making a more correct choice?

Consider how we use a clock to measure how long an event lasts, its duration. We always use the following method: Take the time of the instant at which the event ends, and subtract the time of the instant when the event starts. To find how long an event lasts that starts at 3:00 and ends at 5:00, we subtract and get the answer of two hours. Is the use of this method merely a convention, or in some objective sense is it the only way that a clock should be used? The method of subtracting the start time from the end time is called the "metric" of time. Is there an objective metric, or is time "metrically amorphous," to use a phrase from Adolf Grünbaum, because there are alternatively acceptable metrics, such as subtracting the square roots of those times, or perhaps using the square root of their difference and calling this the "duration"?
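The contrast between the standard metric and the two alternatives mentioned above can be made concrete with a small sketch. The function names are hypothetical labels chosen for this illustration, and clock readings are treated as plain numbers of hours.

```python
import math

def standard_duration(start, end):
    """The usual metric: end reading minus start reading."""
    return end - start

def sqrt_of_times_duration(start, end):
    """Alternative metric: subtract the square roots of the two readings."""
    return math.sqrt(end) - math.sqrt(start)

def sqrt_of_difference_duration(start, end):
    """Alternative metric: square root of the ordinary difference."""
    return math.sqrt(end - start)

# Two events that the standard metric counts as equally long (two hours each).
first_event = (3.0, 5.0)
second_event = (7.0, 9.0)

for metric in (standard_duration, sqrt_of_times_duration,
               sqrt_of_difference_duration):
    print(metric.__name__, metric(*first_event), metric(*second_event))

# Under the third metric, durations of back-to-back events do not add up:
# 3:00 to 5:00 plus 5:00 to 7:00 is not the same as 3:00 to 7:00.
split = sqrt_of_difference_duration(3.0, 5.0) + sqrt_of_difference_duration(5.0, 7.0)
whole = sqrt_of_difference_duration(3.0, 7.0)
print(split, whole)
```

Under the second metric the two events no longer have equal durations, and under the third, durations stop being additive. Whether such consequences make these metrics objectively wrong, or merely inconvenient, is exactly what the conventionalist and the objectivist disagree about.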

There is an ongoing dispute about the extent to which there is an element of conventionality in Einstein’s notion of two separated events happening at the same time. Einstein said that to define simultaneity in a single reference frame you must adopt a convention about how fast light travels going one way as opposed to coming back (or going any other direction). He recommended adopting the convention that light travels at the same speed in all directions (in a vacuum free of the influence of gravity). He claimed it must be a convention because there is no way to measure whether the speed is really the same in opposite directions: any measurement of the two speeds between two locations requires first having synchronized clocks at those two locations, yet the synchronization process will presuppose whether the speed is the same in both directions. The philosophers B. Ellis and P. Bowman in 1967 and D. Malament in 1977 gave different reasons why Einstein is mistaken. For an introduction to this dispute, see the Frequently Asked Questions. For more discussion, see (Callender and Hoefer 2002).
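Einstein's reason for calling this a convention can be illustrated numerically. The sketch below is not from the article; it assumes Reichenbach's standard parametrization, in which a number epsilon between 0 and 1 fixes the outbound speed at c/(2·epsilon) and the return speed at c/(2·(1 − epsilon)), with epsilon = 0.5 being Einstein's own choice. The function names are hypothetical.

```python
C = 299_792_458.0  # round-trip speed of light in a vacuum, in m/s

def one_way_speeds(epsilon):
    """Outbound and return light speeds for a synchronization parameter
    epsilon in (0, 1). This parametrization is an assumption of the sketch;
    epsilon = 0.5 corresponds to Einstein's convention."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return C / (2.0 * epsilon), C / (2.0 * (1.0 - epsilon))

def round_trip_time(distance, epsilon):
    """Time on a single clock for light to travel out and back."""
    outbound, inbound = one_way_speeds(epsilon)
    return distance / outbound + distance / inbound

# A single clock can only time the round trip, and that time is the same
# whatever value of epsilon is chosen.
for epsilon in (0.3, 0.5, 0.8):
    print(epsilon, one_way_speeds(epsilon), round_trip_time(1000.0, epsilon))
```

Because the one measurable quantity here, the round-trip time, is the same for every epsilon, a measurement using only one clock cannot single out Einstein's choice. The critics mentioned above argue that other considerations do single it out.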

4. What Does Science Require of Time?

Physics, including astronomy, is the only science that explicitly studies time, although all sciences use the concept. Yet different physical theories place different demands on this concept. So, let's discuss time from the perspective of current science.

Physical theories treat time as being another dimension, analogous to a spatial dimension, and they describe an event as being located at temporal coordinate t, where t is a real number. Each specific temporal coordinate is called a "time." An instantaneous event is a moment and is located at just one time, or one temporal coordinate, say t1. It is said to last for an "instant." If the event is also a so-called "point event," then it is located at a single spatial coordinate, say <x1, y1, z1>. Locations constitute space, and times constitute time.

The fundamental laws of science do not pick out a present moment or present time. This fact is often surprising to a student who takes a science class and notices all sorts of talk about the present. Scientists frequently do apply some law of science while assigning, say, t0 to be the name of the present moment, then calculate this or that. This insertion of the fact that t0 is the present is an initial condition of the situation to which the law is being applied, and is not part of the law itself. The laws themselves treat all moments equally.

Science does not require that its theories have symmetry under time-translation, but this is a goal that physicists do pursue for their basic (fundamental) theories. If a theory has symmetry under time-translation, then the laws of the theory do not change over time. The law of gravitation in the 21st century is the same law that held one thousand centuries ago.

Physics also requires almost all the basic laws of science to be time symmetric. This means that a law, if it is a basic law, must not distinguish between backward and forward time directions.

In physics we need to speak of one event happening pi seconds after another, and of one event happening the square root of three seconds after another. In ordinary discourse outside of science we would never need this kind of precision. The need for this precision has led to requiring time to be a linear continuum, very much like a segment of the real number line. So, one requirement that relativity, quantum mechanics and the Big Bang theory place on any duration is that it be a continuum. This implies that time is not quantized, even in quantum mechanics. In a world with time being a continuum, we cannot speak of some event being caused by the state of the world at the immediately preceding instant because there is no immediately preceding instant, just as there is no real number immediately preceding pi.

Einstein's theory of relativity has had the biggest impact on our understanding of time. But Einstein was not the first physicist to appreciate the relativity of motion. Galileo and Newton would have said speed is relative to reference frame. Einstein would agree but would add that durations and occurrence times are also relative. For example, any observer fixed to a moving railroad car in which you are seated will say your speed is zero, whereas an observer fixed to the train station will say you have a positive speed. But as Galileo and Newton understood relativity, both observers will agree about the time you had lunch on the train. Einstein would say they are making a mistake about your lunchtime; they should disagree about when you had lunch. For Newton, the speed of anything, including light, would be different in the two frames that move relative to each other, but Einstein said Maxwell’s equations require the speed of light to be invariant. This implies that the Galilean equations of motion are incorrect. Einstein figured out how to change the equations; the consequence is the Lorentz transformations in which two observers in relative motion will have to disagree also about the durations and occurrence times of events. What is happening here is that Einstein is requiring a mixing of space and time; Minkowski said it follows that there is a spacetime which divides into its space and time differently for different observers.
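The disagreement about lunchtime can be sketched with the standard one-dimensional Galilean and Lorentz transformations. The scenario numbers below (a train moving at 0.6 of the speed of light, lunch eaten ten seconds after the train passes the station) are assumptions chosen to make the effect visible, not values from the article.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def galilean(t, x, v):
    """Galilean transformation to a frame moving at velocity v:
    time is shared by all observers."""
    return t, x - v * t

def lorentz(t, x, v):
    """Lorentz transformation to a frame moving at velocity v:
    the new time coordinate mixes the old time and position."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C**2), gamma * (x - v * t)

# Station frame: the train passes the station at t = 0, x = 0 and moves at v.
v = 0.6 * C
t_lunch = 10.0          # seconds after passing, by station clocks
x_lunch = v * t_lunch   # where the train is at that moment

print(galilean(t_lunch, x_lunch, v))  # time coordinate is unchanged
print(lorentz(t_lunch, x_lunch, v))   # approximately (8.0, 0.0)
```

With the Galilean equations both observers assign lunch the same time. With the Lorentz equations the train frame assigns it a time of about eight seconds rather than ten, so the two observers disagree about when lunch occurred.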

One consequence of this is that relativity's spacetime is more fundamental than either space or time alone. Spacetime is commonly said to be four-dimensional, but because time is not space it is more accurate to think of spacetime as being (3 + 1)-dimensional. Time is a distinguished, linear subspace of four-dimensional spacetime.

Time is relative in the sense that the duration of an event depends on the reference frame used in measuring the duration. Specifying that an event lasted three minutes without giving even an implicit indication of the reference frame is like asking someone to stand over there and not giving any indication of where “there” is. One implication of this is that it becomes more difficult to defend McTaggart's A-theory which says that properties of events such as "happened twenty-three minutes ago" and "is happening now" are basic properties of events and are not properties relative to chosen reference frames.

Another profound idea from relativity theory is that accurate clocks do not tick the same for everyone everywhere. Each object has its own proper time, and so the correct time shown by a clock depends on its history (in particular, its history of speed and gravitational influence). Relative to clocks that are stationary in the reference frame, clocks in motion run slower, as do clocks in stronger gravitational fields. In general, two synchronized clocks do not stay synchronized if they move relative to each other or undergo different gravitational forces. Clocks in cars driving by your apartment building run slower than your apartment’s clock.

Suppose there are two twins. One stays on Earth while the other twin zooms away in a spaceship and returns ten years later according to the spaceship’s clock. That same arrival event could be twenty years later according to an Earth-based clock, provided the spaceship went fast enough. The Earth twin would now be ten years older than the spaceship twin. So, one could say that the Earth twin lived two seconds for every one second of the spaceship twin.
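How fast is "fast enough"? A rough sketch follows, under simplifying assumptions that are not in the article: the ship travels at constant speed, the turnaround is instantaneous, and gravity is ignored. The function names are hypothetical. The second function applies the same time-dilation formula, in its low-speed approximation, to the earlier example of a car passing an apartment building.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def speed_for_time_ratio(earth_years, ship_years):
    """Constant speed, as a fraction of c, at which the ship's clock logs
    ship_years while Earth clocks log earth_years."""
    gamma = earth_years / ship_years  # time-dilation factor
    return math.sqrt(1.0 - 1.0 / gamma**2)

def slow_clock_lag(speed_m_per_s, seconds):
    """Approximate lag, in seconds, of a slowly moving clock, using the
    low-speed approximation gamma - 1 ~ v**2 / (2 * c**2)."""
    return 0.5 * (speed_m_per_s / C) ** 2 * seconds

print(speed_for_time_ratio(20.0, 10.0))  # roughly 0.866 of the speed of light
print(slow_clock_lag(30.0, 3600.0))      # lag per hour for a car at 30 m/s
```

The twin's ship must move at nearly 87 percent of the speed of light, whereas the lag of a car's clock is far too small to notice without atomic clocks.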

According to relativity theory, the order of events in time is only a partial order because for any event e, there is an event f such that e need not occur before f, simultaneous with f, nor after f. These pairs of events are said to be in each other’s “absolute elsewhere,” which is another way of saying that neither could causally affect the other because even a light signal could not reach from one event to the other. Adding a coordinate system or reference frame to spacetime will force the events in all these pairs to have an order and so force the set of all events to be totally ordered in time, but what is interesting philosophically is that there is a leeway in the choice of the frame. For any two specific events e and f that could never causally affect each other, the analyst may choose a frame in which e occurs first, or choose another frame in which f occurs first, or instead choose another frame in which they are simultaneous. Any choice of frame will be correct. Such is the surprising nature of time according to relativity theory.
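The leeway in the choice of frame can be checked with the Lorentz transformation of a time difference. The sketch below works in one spatial dimension with units in which the speed of light is 1, and the particular separations are assumptions chosen for illustration.

```python
import math

C = 1.0  # units in which the speed of light is 1 (seconds and light-seconds)

def separation(dt, dx):
    """Classify a pair of events by the sign of the spacetime interval."""
    interval = (C * dt) ** 2 - dx**2
    if interval > 0:
        return "timelike"
    if interval < 0:
        return "spacelike"
    return "lightlike"

def time_difference_in_frame(dt, dx, v):
    """Time of f minus time of e, as judged in a frame moving at velocity v
    relative to the frame in which the differences dt and dx were measured."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (dt - v * dx / C**2)

# Event f is 1 second later than e but 4 light-seconds away: no signal connects them.
dt, dx = 1.0, 4.0
print(separation(dt, dx))
print(time_difference_in_frame(dt, dx, 0.0))   # positive: e occurs first
print(time_difference_in_frame(dt, dx, 0.25))  # zero: e and f are simultaneous
print(time_difference_in_frame(dt, dx, 0.5))   # negative: f occurs first

# For a causally connectable pair, every frame agrees on the order.
print(all(time_difference_in_frame(4.0, 1.0, v) > 0 for v in (-0.9, 0.0, 0.5, 0.99)))
```

Only pairs in each other's absolute elsewhere can have their order reversed by a change of frame. For a pair that a signal could connect, the sign of the time difference is the same in every frame, which is why the partial order itself is frame-independent.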

General relativity places other requirements on events that are not required in special relativity. Unlike in Newton's physics and the physics of special relativity, in general relativity the spacetime is not a passive container for events; it is dynamic in the sense that any change in the amount and distribution of matter-energy will change the curvature of spacetime itself. Gravity is a manifestation of the warping of spacetime. In special relativity, its Minkowski spacetime has no curvature. In general relativity a spacetime with no mass or energy might or might not have curvature, so the geometry of spacetime is not always determined by the behavior of matter and energy.

In 1650, Archbishop James Ussher declared that the beginning of time occurred on October 23, 4004 B.C.E. Today's science disagrees. According to one interpretation of the Big Bang theory of cosmology, the universe began 13.8 billion years ago as spacetime started to expand from an infinitesimal volume; and the expansion continues today, with the volume of space now doubling in size about every ten billion years. The amount of future time is a potential infinity (in Aristotle's sense of the term) as opposed to an actual infinity. For more discussion of all these compressed remarks, see What Science Requires of Time.

5. What Kinds of Time Travel are Possible?

Most scientists and philosophers of time agree that there is good evidence that human time travel has occurred. To explain, let’s first define the term. We mean physical time travel, not travel by wishing or dreaming or sitting still and letting time march on. In any case of physical time travel the traveler’s journey as judged by a correct clock attached to the traveler takes a different amount of time than the journey does as judged by a correct clock of someone who does not take the journey.

The physical possibility of human travel to the future is well accepted, but travel to the past is more controversial, and time travel that changes either the future or the past is generally considered to be impossible. Our understanding of time travel comes mostly from the implications of Einstein’s general theory of relativity. This theory has never failed any of its many experimental tests, so we trust its implications for human time travel.

Einstein’s general theory of relativity permits two kinds of future time travel: either by moving at high speed or by taking advantage of the presence of an intense gravitational field. Let's consider just the time travel due to high speed. Actually any motion produces time travel (relative to the clocks of those who do not travel), but if you move at extremely high speed, the time travel is more noticeable; you can travel into the future to the year 2300 on Earth (as measured by clocks fixed to the Earth) while your personal clock measures that merely, let’s say, ten years have elapsed. You can participate in that future, not just view it. You can meet your twin sister’s descendants. But you cannot get back to the twenty-first century on Earth by reversing your velocity. If you get back, it will be via some other way.

It's not that you suddenly jump into the Earth's future of the year 2300. Instead you have continually been traveling forward in both your personal time and the Earth’s external time, and you could have been continuously observed from Earth’s telescopes during your voyage.

How about travel to the past, the more interesting kind of time travel? This is not allowed by either Newton's physics or Einstein's special relativity, but is allowed by general relativity. In 1949, Kurt Gödel surprised Albert Einstein by discovering that in some unusual worlds that obey the equations of general relativity—but not in the actual world—you can continually travel forward in your personal time but eventually arrive into your own past.

Unfortunately, say many philosophers and scientists, even if you can travel to the past in the actual world you cannot do anything that has not already been done, or else there would be a contradiction. In fact, if you do go back, you would already have been back there. For this reason, if you go back in time and try to kill your childhood self, you will fail no matter how hard you try. You can kill yourself, but you won’t because you didn’t. While attempting to kill yourself, you will be in two different bodies at the same time.

Here are a variety of philosophical arguments against past-directed time travel.

  1. If past time travel were possible, then you could be in two different bodies at the same time, which is ridiculous.
  2. If you were presently to go back in time, then your present events would cause past events, which violates our concept of causality.
  3. Time travel is impossible because, if it were possible, we should have seen many time travelers by now, but nobody has encountered any time travelers.
  4. If past time travel were possible, criminals could avoid their future arrest by traveling back in time, but that is absurd, so time travel is, too.
  5. If there were time travel, then when time travelers go back and attempt to change history, they must always botch their attempts to change anything, and it will appear to anyone watching them at the time as if Nature is conspiring against them. Since observers have never witnessed this apparent conspiracy of Nature, there is no time travel.
  6. Travel to the past is impossible because it allows the gaining of information for free. Here is a possible scenario. Buy a copy of Darwin's book The Origin of Species, which was published in 1859. In the 21st century, enter a time machine with it, go back to 1855 and give the book to Darwin himself. He could have used your copy in order to write his manuscript which he sent off to the publisher. If so, who first came up with the knowledge about evolution? Neither you nor Darwin. Because this scenario contradicts what we know about where knowledge comes from, past-directed time travel isn't really possible.
  7. The philosopher John Earman describes a rocket ship that carries a time machine capable of firing a probe (perhaps a smaller rocket) into its recent past. The ship is programmed to fire the probe at a certain time unless a safety switch is on at that time. Suppose the safety switch is programmed to be turned on if and only if the “return” or “impending arrival” of the probe is detected by a sensing device on the ship. Does the probe get launched? It seems to be launched if and only if it is not launched. However, the argument of Earman’s Paradox depends on the assumptions that the rocket ship does work as intended—that people are able to build the computer program, the probe, the safety switch, and an effective sensing device. Earman himself says all these premises are acceptable and so the only weak point in the reasoning to the paradoxical conclusion is the assumption that travel to the past is physically possible. There is an alternative solution to Earman’s Paradox. Nature conspires to prevent the design of the rocket ship just as it conspires to prevent anyone from building a gun that shoots if and only if it does not shoot. We cannot say what part of the gun is the obstacle, and we cannot say what part of Earman’s rocket ship is the obstacle.
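The logical core of argument 7 can be checked mechanically. The sketch below encodes the two design rules of Earman's rocket ship as Boolean constraints and searches for a consistent history. It assumes, as the argument itself does, that all the devices work as intended, so that the probe's arrival is detected if and only if the probe is in fact launched. The variable and function names are hypothetical.

```python
from itertools import product

def consistent(probe_launched, switch_on):
    """True if this combination obeys both design rules of the ship."""
    # Rule 1: the safety switch is on if and only if the probe's arrival is
    # detected, and (by assumption) it is detected exactly when it is launched.
    switch_rule = (switch_on == probe_launched)
    # Rule 2: the ship fires the probe unless the safety switch is on.
    launch_rule = (probe_launched == (not switch_on))
    return switch_rule and launch_rule

# Try every combination of truth values for the two facts.
solutions = [
    (probe_launched, switch_on)
    for probe_launched, switch_on in product([True, False], repeat=2)
    if consistent(probe_launched, switch_on)
]
print(solutions)  # prints []
```

No assignment satisfies both rules, which is the formal sense in which the probe "is launched if and only if it is not launched." The check shows only that the premises are jointly inconsistent; it does not say which premise should be given up, and that is the point on which Earman's diagnosis and the alternative solution differ.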

These complaints about travel to the past are a mixture of arguments that past-directed time travel is not logically possible, that it is not physically possible, that it is not technologically possible with current technology, and that it is unlikely, given today's empirical evidence.

For more discussion of time travel, see the encyclopedia article “Time Travel.”

6. Does Time Require Change? (Relational vs. Substantival Theories)

By "time requires change," we mean that for time to exist something must change its properties over time. We don't mean change its properties over space, as in changing color from top to bottom. There are two main philosophical theories about whether time requires change: relational theories and substantival theories.

In a relational theory of time, time is defined in terms of relationships among objects, in particular their changes. Substantival theories are theories that imply time is substance-like in that it exists independently of changes; it exists independently of all the spacetime relations exhibited by physical processes. This theory allows "empty time" in which nothing changes. On the other hand, relational theories do not allow this. They imply that at every time something is happening—such as an electron moving through space or a tree leaf changing its color. In short, no change implies no time. Some substantival theories describe spacetime as being like a container for events. The container exists with or without events in it. Relational theories imply there is no container without contents. But the substance that substantivalists have in mind is more like a medium pervading all of spacetime and less like an external container. The vast majority of relationists present their relational theories in terms of actually instantiated relations and not merely possible relations.

Everyone agrees time cannot be measured without there being changes, because we measure time by observing changes in some property or other, but the present issue is whether time exists without changes. On this issue, we need to be clear about what sense of change and what sense of property we are intending. For the relational theory, the term "property" is intended to exclude what Nelson Goodman called grue-like properties. Let us define an object to be grue if it is green before the beginning of the year 1888 but is blue thereafter. Then the world’s chlorophyll undergoes a change from grue to non-grue in 1888. We’d naturally react to this by saying that change in chlorophyll's grue property is not a “real change” in the world’s chlorophyll.

Does Queen Anne’s death change when I forget about it? Yes, but the debate here is whether the event’s intrinsic properties can change, not merely its non-intrinsic properties such as its relationships to us. This special intrinsic change is called by many names: secondary change and second-order change and McTaggartian change and McTaggart change. Second-order change is the kind of change that A-theorists say occurs when Queen Anne's death recedes ever farther into the past. The objection from the B-theorists here is that this is not a "real, objective, intrinsic change" in her death. First-order change is ordinary change, the kind that occurs when a leaf changes from green to brown, or a person changes from sitting to standing.

Einstein's general theory of relativity does imply it is possible for spacetime to exist while empty of events. This empty time is permissible according to the substantival theory but not allowed by the relational theory. Yet Einstein considered himself to be a relationist.

Substantival theories are sometimes called "absolute theories." Unfortunately the term "absolute theory" is used in two other ways. A second sense of "to be absolute" is to be immutable, or changeless. A third sense is to be independent of observer or reference frame. Although Einstein’s theory implies there is no absolute time in the sense of being independent of reference frame, it is an open question whether relativity theory undermines absolute time in the sense of substantival time; Einstein believed it did, but many philosophers of science do not.

The first advocate of a relational theory of time was Aristotle. He said, “neither does time exist without change.” (Physics, book IV, chapter 11, page 218b) However, the battle lines were most clearly drawn in the early 18th century when Leibniz argued for the relational position against Newton, who had adopted a substantival theory of time. Leibniz’s principal argument against Newton is a reductio ad absurdum. Suppose Newton’s space and time were to exist. But one could then imagine a universe just like ours except with everything shifted five kilometers east and five minutes earlier. However, there would be no reason why this shifted universe does not exist and ours does. Now we have arrived at a contradiction because, if there is no reason for there to be our universe rather than the shifted universe, then we have violated Leibniz’s Principle of Sufficient Reason: that there is an understandable reason for everything being the way it is. So, by reductio ad absurdum, Newton’s substantival space and time do not exist. In short, the trouble with Newton’s theory is that it leads to too many unnecessary possibilities.

Newton offered this two-part response: (1) Leibniz is correct to accept the Principle of Sufficient Reason regarding the rational intelligibility of the universe, but the reasons do not have to be knowable by humans; God might have had His own sufficient reason for creating the universe at a given place and time even though mere mortals cannot comprehend His reasons. (2) The bucket thought-experiment shows that acceleration relative to absolute space is detectable; thus absolute space is real, and if absolute space is real, so is absolute time.

Here's how to detect absolute space. Suppose we tie a bucket’s handle to a rope hanging down from a tree branch. Partially fill the bucket with water, and let it come to equilibrium. Notice that there is no relative motion between the bucket and the water, and in this case the water surface is flat. Now spin the bucket, and keep doing this until the angular velocity of the water and the bucket are the same. In this second case there is again no relative motion between the bucket and the water, but now the water surface is concave. So spinning makes a difference, but how can a relational theory explain the difference in the shape of the surface? It cannot, says Newton. When the bucket and water are spinning, what are they spinning relative to? Because we can disregard the rest of the environment including the tree and rope, says Newton, the only explanation of the difference in surface shape between the non-spinning case and the spinning case is that when it is not spinning there is no motion relative to space, but when it is spinning there is motion relative to a third thing, space itself, and space itself is acting upon the water surface to make it concave. Alternatively expressed, the key idea is that the presence of centrifugal force is a sign of rotation relative to absolute space. Leibniz had no rebuttal. So, for over two centuries after this argument was created, Newton’s absolute theory of space and time was generally accepted by European scientists and philosophers.

One hundred years later, Kant entered the arena on the side of Newton. In a space containing only a single glove, said Kant, Leibniz could not account for its being a right-handed glove versus a left-handed glove because all the internal relationships would be the same in either case. However, we all know that there is a real difference between a right and a left glove, so this difference can only be due to the glove’s relationship to space itself. But if there is a “space itself,” then the absolute or substantival theory is better than the relational theory.

Newton’s theory of time was dominant in the 18th and 19th centuries, even though during those centuries Huygens, Berkeley, and Mach had entered the arena on the side of Leibniz. Mach argued that it must be the remaining matter in the universe, such as the "fixed" stars, which causes the water surface in the bucket to be concave, and that without these stars or other matter, a spinning bucket would have a flat surface. In the 20th century, Hans Reichenbach and the early Einstein declared the special theory of relativity to be a victory for the relational theory, in large part because a Newtonian absolute space would be undetectable. Special relativity, they also said, ruled out a space-filling ether, the leading candidate for substantival space, so the substantival theory was incorrect. And the response to Newton’s bucket argument is to note Newton’s error in not considering the environment. Einstein agreed with Mach that, if you hold the bucket still but spin the background stars  in the environment, then the water will creep up the side of the bucket and form a concave surface—so the bucket thought experiment does not require absolute space.

Although it was initially believed by Einstein and Reichenbach that relativity theory supported Mach regarding the bucket experiment and the absence of absolute space, this belief is controversial. Many philosophers argue that Reichenbach and the early Einstein have been overstating the amount of metaphysics that can be extracted from the physics.  There is substantival in the sense of independent of reference frame and substantival in the sense of independent of events. Isn't only the first sense ruled out when we reject a space-filling ether? The critics admit that general relativity does show that the curvature of spacetime is affected by the distribution of matter, so today it is no longer plausible for a substantivalist to assert that the “container” is independent of the behavior of the matter it contains. But, so they argue, general relativity does not rule out a more sophisticated substantival theory in which spacetime exists even if it is empty and in which two empty universes could differ in the curvature of their spacetime. For this reason, by the end of the 20th century, substantival theories had gained some ground.

In 1969, Sydney Shoemaker presented an argument attempting to establish the understandability of time existing without change, as Newton’s absolutism requires. Divide all space into three disjoint regions, called region 3, region 4, and region 5. In region 3, change ceases every third year for one year. People in regions 4 and 5 can verify this and then convince the people in region 3 of it after they come back to life at the end of their frozen year. Similarly, change ceases in region 4 every fourth year for a year; and change ceases in region 5 every fifth year. Every sixty years, that is, every 3 × 4 × 5 years, all three regions freeze simultaneously for a year. In year sixty-one, everyone comes back to life, time having marched on for a year with no change. Note that even if Shoemaker’s scenario successfully shows that the notion of empty time is understandable, it does not show that empty time actually exists. If we accept that empty time occasionally exists, then someone who claims the tick of the clock lasts one second could be challenged by a skeptic who says perhaps empty time periods occur randomly and this supposed one-second duration contains three changeless intervals each lasting one billion years, so the duration is really three billion years and one second rather than one second. However, we usually prefer the simpler of two competing hypotheses.
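
The arithmetic of the freeze schedule can be checked with a minimal sketch. The numbering convention (a region with period n is frozen during years n, 2n, 3n, and so on) and all names in the code are assumptions made for illustration; Shoemaker's text does not fix a calendar.

```python
from functools import reduce
from math import gcd

# Freeze periods, in years, for regions 3, 4, and 5 in Shoemaker's scenario.
FREEZE_PERIODS = {"region 3": 3, "region 4": 4, "region 5": 5}


def is_frozen(period: int, year: int) -> bool:
    """Assumed convention: a region is frozen during every period-th year."""
    return year % period == 0


def first_global_freeze(periods) -> int:
    """Least common multiple of the periods: the first year all regions freeze."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


global_freeze_year = first_global_freeze(FREEZE_PERIODS.values())

# Brute-force check: every year up to 120 in which all three regions are frozen.
global_freezes = [
    year
    for year in range(1, 121)
    if all(is_frozen(p, year) for p in FREEZE_PERIODS.values())
]
print(global_freeze_year, global_freezes)
```

The least common multiple equals the product 3 × 4 × 5 here only because the three periods share no common factors; with periods such as 2, 4, and 6 the global freeze would recur more often than the product suggests.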

Empty time isn't directly detectable by those who are frozen, but it may be indirectly detectable, perhaps in the manner described by Shoemaker or by signs in advance of the freeze:

Suppose that immediately prior to the beginning of a local freeze there is a period of "sluggishness" during which the inhabitants of the region find that it takes more than the usual amount of effort for them to move the limbs of their bodies, and we can suppose that the length of this period of sluggishness is found to be correlated with the length of the freeze. (Shoemaker 1969, p. 374)

Is the ending of the freeze causeless, or does something cause the freeze to end? Perhaps the empty time itself causes the freeze to end. Yet if a period of empty time, a period of "mere" passage of time, is somehow able to cause something, then, argues Ruth Barcan Marcus, it is not clear that empty time can be dismissed as not being genuine change. (Shoemaker 1969, p. 380)

7. Does Time Flow?

Time seems to flow or pass in the sense that future events become present events and then become past events, just like a runner who passes us by and then recedes farther and farther from us. In 1938, the philosopher George Santayana offered this description of the flow of time: “The essence of nowness runs like fire along the fuse of time.” The converse image of time's flowing past us is our advancing through time. Time definitely seems to flow, but there is philosophical disagreement about whether it really does flow, or pass. Is the flow objectively real? The dispute is related to the dispute about whether McTaggart's A-series or B-series is more fundamental.

a. McTaggart's A-Series and B-Series

In 1908, the philosopher J. M. E. McTaggart proposed two ways of linearly ordering all events in time, an A way and a B way, each placing the events into a series according to the times at which they occur. Consider two past events a and b, in which b is the more recent of the two. In McTaggart's B-series, event a happens before event b in the series because the time of occurrence of event a is less than the time of occurrence of event b. But when ordering the same events into McTaggart's A-series, event a happens before event b for a different reason: event a is more in the past than event b. Both series produce exactly the same ordering of events. Pictured as a line, the ordering runs from left to right as a, then b, then c, where c is another event that happens after a and b.


There are many other events that are located within the series at event a's location, namely all events simultaneous with event a. If we were to consider an instant of time to be a set of simultaneous events, then instants of time are also linearly ordered into an A-series and a B-series. McTaggart himself believed the A-series is paradoxical [for reasons that will not be explored in this article], but McTaggart also believed the A-properties such as being past are essential to our current concept of time, so for this reason he believed our current concept of time is incoherent.

Let's suppose that event c occurs in our present after events a and b. The information that c occurs in the present is not contained within either the A-series or the B-series. However, the information that c is in the present is used to create the A-series; it is what tells us to place c to the right of b. That information is not used to create the B-series.
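
The two ways of building the same ordering can be sketched as follows. The years assigned to the events, the function names, and the choice to model pastness as a number are all assumptions made for illustration; an A-theorist need not accept that pastness reduces to a numerical distance from the present.

```python
# Hypothetical times of occurrence, in years; c is treated as the present event.
event_times = {"b": 1950, "a": 1900, "c": 2000}


def b_series(times):
    """Order events by the earlier-than relation, using only times of occurrence."""
    return sorted(times, key=lambda e: times[e])


def a_series(times, now):
    """Order events from most past to least past, using distance from the present."""
    pastness = {e: now - t for e, t in times.items()}
    return sorted(pastness, key=lambda e: pastness[e], reverse=True)


now = event_times["c"]
b_order = b_series(event_times)
a_order = a_series(event_times, now)
print(b_order, a_order)
```

Only a_series takes the argument now. That mirrors the point above that information about the present is used to create the A-series but not the B-series, even though the resulting orderings agree.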

Metaphysicians dispute whether the A-theory or instead the B-theory is the correct theory of reality. The A-theory comprises two theses, each of which is contrary to the B-theory: (1) Time is constituted by an A-series in which any event's being in the past (or in the present or in the future) is an intrinsic, objective, monadic property of the event itself and not merely a subjective relation between the event and us who exist. (2) The second thesis of the A-theory is that events change. In 1908, McTaggart described the special way that events change:

Take any event—the death of Queen Anne, for example—and consider what change can take place in its characteristics. That it is a death, that it is the death of Anne Stuart, that it has such causes, that it has such effects—every characteristic of this sort never changes.... But in one respect it does change. It began by being a future event. It became every moment an event in the nearer future. At last it was present. Then it became past, and will always remain so, though every moment it becomes further and further past.

This special change is called secondary change and second-order change and also McTaggartian change.

The B-theory disagrees with both thesis (1) and thesis (2) of the A-theory. According to the B-theory, the B-series and not the A-series is fundamental; fundamental temporal properties are relational; McTaggartian change is not an objective change and so is not metaphysically basic or ultimately real. The B-theory implies that an event's property of occurring in the past (or occurring twenty-three minutes ago, or now, or in a future century) is merely a subjective relation between the event and us because, when analyzed, it will be seen to make reference to our own perspective on the world. Here is how it is subjective, according to the B-theory. Queen Anne's death has the property of occurring in the past because it occurs in our past as opposed to, say, Aristotle's past; and it occurs in our past rather than our present or our future because it occurs at a time that is less than the time of occurrence of some event that we (rather than Aristotle) would say is occurring.  The B-theory is committed to there being no objective distinction among past, present and future. Both the A-theory and B-theory agree, however, that it would be a mistake to say of some event that it happens on a certain date but then later it fails to happen on that date.

The B-theorists complain that thesis (1) of the A-theory implies that an event’s being in the present is an intrinsic property of that event, so it implies that there is an absolute, global present for all of us. The B-theorist points out that according to Einstein’s Special Theory of Relativity there is no global present. An event can be in the present for you and not in the present for me. An event can be present in a reference frame in which you are a fixed observer, but if you are moving relative to me, then that same event will not be present in a reference frame in which I am a fixed observer. So, being present is not a property of an event, as the A-theory implies. According to relativity theory, what is a property of an event is being present in a chosen reference frame, and this implies that being present is relative to us who are making the choice of reference frame.
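
The frame-dependence of being present can be illustrated with the standard Lorentz transformation of the time coordinate, t' = γ(t − vx/c²). In this minimal sketch the two events, the relative speed, and the function name are hypothetical choices made only for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def time_in_moving_frame(t: float, x: float, v: float) -> float:
    """Lorentz-transform the time coordinate of an event at (t, x).

    v is the velocity of the second frame along the x-axis, in m/s.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C**2)


# Two events that are simultaneous (t = 0) in my frame but far apart in space.
here = (0.0, 0.0)
distant = (0.0, 3.0e8)  # hypothetical event about one light-second away

v = 0.5 * C  # you move at half the speed of light relative to me
t_here = time_in_moving_frame(*here, v)
t_distant = time_in_moving_frame(*distant, v)
print(t_here, t_distant)
```

In my frame both events have the same time coordinate, so both are present together. In your frame the distant event receives a negative time coordinate while the nearby one stays at zero, so the two events are no longer simultaneous.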

When discussing the A-theory and the B-theory, metaphysicians often speak of

    • A-series and B-series, of
    • A-theory and B-theory, of
    • A-facts and B-facts, of
    • A-terms and B-terms, of
    • A-properties and B-properties, of
    • A-predicates and B-predicates, of
    • A-statements and B-statements, and of the
    • A-camp and B-camp.

Here are some examples. Typical B-theory terms are relational; they are relations between events: "earlier than," "happens twenty-three minutes after," and "simultaneous with." Typical A-theory terms are monadic; they are one-place qualities of events: "the near future," "twenty-three minutes ago," and "present." The B-theory terms represent distinctively B-properties; the A-theory terms represent distinctively A-properties. The B-fact that event a occurs before event b will always be a fact, but the A-fact that event a occurred about an hour ago soon won’t be a fact. Similarly, the A-statement that event a occurred about an hour ago will, if true, soon become false. However, B-facts are not transitory, and B-statements have fixed truth values. For the B-theorist, the statement "Event a occurs an hour before b" will, if true, never become false. The A-theory usually says A-facts are the truthmakers of true A-statements and so A-facts are ontologically fundamental; the B-theorist appeals instead to B-facts, insofar as one accepts facts into one’s ontology, which is metaphysically controversial. According to the B-theory, when the A-theorist correctly says "It began snowing twenty-three minutes ago," what really makes it true isn't the A-fact that the event of the snow's beginning has twenty-three minutes of pastness; what makes it true is that the event of uttering the sentence occurs twenty-three minutes after the event of it beginning to snow. Notice that "occurs ... after" is a B-term. Those persons in the A-camp and B-camp recognize that in ordinary speech we are not careful to use one of the two kinds of terminology, but each camp believes that it can best explain the terminology of the other camp in its own terms.
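
The contrast between transitory A-statements and fixed B-statements can be sketched in a few lines. The clock times and function names below are hypothetical, and treating a statement as a function of an evaluation time is a modeling assumption rather than either camp's official semantics.

```python
# Hypothetical clock times, in minutes after midnight.
SNOW_BEGINS = 600


def a_statement(now: int) -> bool:
    """'It began snowing twenty-three minutes ago', evaluated at time now."""
    return now - SNOW_BEGINS == 23


def b_statement(utterance_time: int) -> bool:
    """'This utterance occurs twenty-three minutes after the snow begins.'

    The utterance time is fixed once the utterance is made, so the result
    does not depend on when the statement is later evaluated.
    """
    return utterance_time - SNOW_BEGINS == 23


utterance_time = 623
a_values = [a_statement(now) for now in (623, 624, 700)]
b_values = [b_statement(utterance_time) for _ in (623, 624, 700)]
print(a_values, b_values)
```

The A-statement is true at the moment of utterance and false a minute later, while the B-statement keeps the same value at every evaluation time.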

b. Subjective Flow and Objective Flow

There are two primary theories about time’s flow: (A) the flow is objectively real; (B) the flow is a myth or else is merely subjective. Often theory A is called the dynamic theory or the A-theory while theory B is called the static theory or B-theory.

The static theory implies that the flow is an illusion, the product of a faulty metaphor. The defense of the theory goes something like this. Time exists, things change, but time does not change by flowing. The present does not move. We all experience this flow, but only in the sense that we all frequently misinterpret our experience. There is some objective feature of our brains that causes us to believe we are experiencing a flow of time, such as the fact that we have different perceptions at different times and the fact that anticipations of experiences always happen before memories of those experiences; but the flow itself is not objective. This kind of theory of time's flow is often characterized as a myth-of-passage theory. The myth-of-passage theory is more likely to be adopted by those who believe in McTaggart’s B-theory. One point offered in favor of the myth-of-passage theory is to ask about the rate at which time flows. It would be a rate of one second per second. But that is silly. One second divided by one second is the number one. That’s not a coherent rate. There are other arguments, but these won't be explored here.
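
The arithmetic behind the rate objection can be written out as a short worked comparison. The symbols v and r, and the sample distance and duration, are labels introduced here only for illustration.

```latex
\[
v = \frac{10\ \text{m}}{2\ \text{s}} = 5\ \frac{\text{m}}{\text{s}}
\qquad\text{versus}\qquad
r = \frac{1\ \text{s}}{1\ \text{s}} = 1 .
\]
```

An ordinary rate such as a speed compares two different quantities and keeps its units. The supposed rate of time's flow compares time with itself, so the units cancel and only the pure number one remains.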

Physicists sometimes speak of time flowing in another sense of the term "flow." This is the sense in which change is continuous rather than discrete. That is not the sense of “flow” that philosophers normally use when debating the objectivity of time's flow.

There is another uncontroversial sense of flow—when physicists say that time flows differently for the two twins in Einstein's twin paradox. All the physicists mean here is that time is different in different reference frames that are moving relative to each other; they need not be promoting the dynamic theory over the static theory.

Physicists sometimes carelessly speak of time flowing in yet another sense—when what they mean is that time has an arrow, a direction from the past to the future. But again this is not the sense of “flow” that philosophers use when speaking of the dynamic theory of time's flow.

There is no doubt that time seems to pass, so a B-theorist might say the flow is subjectively real but not objectively real. There surely is some objective feature of our brains, say the critics of the dynamic theories, that causes us to mistakenly believe we are experiencing a flow of time, such as the objective fact that we have different perceptions at different times and that anticipations of experiences always happen before memories of those experiences, but the flow itself is not objectively real.

According to the dynamic theories, the flow of time is objective, a feature of our mind-independent reality. A dynamic theory is closer to common sense, and has historically been the more popular theory among philosophers. It is more likely to be adopted by those who believe that McTaggart's A-series is a fundamental feature of time but his B-series is not.

One dynamic theory implies that the flow is a matter of events changing from being future, to being present, to being past, and they also change in their degree of pastness and degree of presentness. This kind of change is often called McTaggart's second-order change to distinguish it from more ordinary, first-order change as when a leaf changes from a green state to a brown state. For the B-theorist the only proper kind of change is when different states of affairs obtain at different times.

A second dynamic theory implies that the flow is a matter of events changing from being indeterminate in the future to being determinate in the present and past. Time’s flow is really events becoming determinate, so these dynamic theorists speak of time’s flow as “temporal becoming.”

Opponents of these two dynamic theories complain that when events are said to change, the change is not a real change in the event’s essential, intrinsic properties, but only in the event’s relationship to the observer. For example, saying the death of Queen Anne is an event that changes from present to past is no more of an objectively real change in her death than saying her death changed from being approved of to being disapproved of. This extrinsic change in approval does not count as an objectively real change in her death, and neither does the so-called second-order change from present to past or from indeterminate to determinate. Attacking the notion of time’s flow in this manner, Adolf Grünbaum said: “Events simply are or occur…but they do not ‘advance’ into a pre-existing frame called ‘time.’ … An event does not move and neither do any of its relations.”

A third dynamic theory says time's flow is the coming into existence of facts, the actualization of new states of affairs; but, unlike the first two dynamic theories, there is no commitment to events changing. This is the theory of flow that is usually accepted by advocates of presentism.

A fourth dynamic theory suggests the flow is (or is reflected in) the change over time of truth values of declarative sentences. For example, suppose the sentence, “It is now raining,” was true during the rain yesterday but has changed to false on today’s sunny day. That's an indication that time flowed from yesterday to today, and these sorts of truth value changes are at the root of the flow. In response, critics suggest that the temporal indexical sentence, “It is now raining,” has no truth value because the reference of the word “now” is unspecified. If it cannot have a truth value, it cannot change its truth value. However, the sentence is related to a sentence that does have a truth value, the sentence with the temporal indexical replaced by the date that refers to a specific time and with the other indexicals replaced by names of whatever they refer to. Supposing it is now midnight here on April 1, 2007, and the speaker is in Sacramento, California, then the indexical sentence, “It is now raining,” is intimately related to the more complete or context-explicit sentence, “It is raining at midnight on April 1, 2007 in Sacramento, California.” Only these latter, non-indexical, non-context-dependent, complete sentences have truth values, and these truth values do not change with time so they do not underlie any flow of time. Fully-described events do not change their properties and so time does not flow because complete or "eternal" sentences do not change their truth values.
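
The critics' replacement of indexicals by explicit times and places can be sketched as a simple rewrite. The function name and the dictionary representation of a context of utterance are assumptions made for illustration, and the string handling is deliberately naive: it covers only the one example sentence and is not a general treatment of indexicals.

```python
def make_context_explicit(sentence: str, context: dict) -> str:
    """Drop the indexical 'now' and append an explicit time and place."""
    explicit = sentence.replace("now ", "")
    explicit = explicit.rstrip(".")
    return f"{explicit} at {context['time']} in {context['place']}."


# The context of utterance described in the paragraph above.
context = {
    "time": "midnight on April 1, 2007",
    "place": "Sacramento, California",
}

eternal_sentence = make_context_explicit("It is now raining.", context)
print(eternal_sentence)
```

The same indexical sentence paired with a different context yields a different eternal sentence, which is how the critics explain the appearance of a changing truth value without granting that any single complete sentence changes its truth value.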

Among dynamic theorists, Hans Reichenbach has argued that the flow of time is produced by the collapse of the quantum mechanical wave function. Another dynamic theory is promoted by advocates of the B-theory who add to the block-universe a flowing present which "spotlights" the block at a particular slice at any time. This is often called the moving spotlight view.

John Norton (Norton 2010) argues that time's flow is objective but so far is beyond the reach of our understanding. Tim Maudlin argues that the objective flow of time is fundamental and unanalyzable. He is happy to say “time does indeed pass at the rate of one hour per hour.” (Maudlin 2007, p. 112)

Regardless of how we analyze the metaphor of time’s flow, it flows in the direction of the future, the direction of the arrow of time, and we need to analyze this metaphor of time's arrow.

8. What are the Differences among the Past, Present, and Future?

a. Presentism, the Growing-Past, Eternalism and the Block-Universe

Have dinosaurs slipped out of existence? More generally, we are asking whether the past is part of reality. How about the future? Philosophers are divided on the question of the reality of the past, present, and future. (1): According to presentism, if something is real, then it is real now; all and only things that exist now are real. The presentist maintains that the past and the future are not real, so if a statement about the past is true, this must be because some present facts make it true. Heraclitus, Duns Scotus, A. N. Prior, and Ned Markosian are presentists. Presentists belong in the A-camp because presentism implies that being present is an intrinsic property of an event; it's a property that the event has independent of our being alive now.

(2): Advocates of a growing-past agree with the presentists that the present is special ontologically, but they argue that, in addition to the present, the past is also real and is growing bigger all the time. C. D. Broad, Richard Jeffrey, and Michael Tooley have defended this view. They claim the past and present are real, but the future is not real. William James famously remarked that the future is so unreal that even God cannot anticipate it. It is not clear whether Aristotle accepted the growing-past theory or accepted a form of presentism; see (Putnam 1967, p. 244) for commentary.

(3): Proponents of eternalism oppose presentism and the growing-past theory. Bertrand Russell, J. J. C. Smart, W. V. O. Quine, Adolf Grünbaum, and Paul Horwich object to assigning special ontological status to the past, the present, or the future. Advocates of eternalism do not deny the reality of the events that we classify as being in our past, present or future, but they say there is no objective ontological difference among the past, the present, and the future, just as there is no objective ontological difference among here, there, and far. Yes, we thank goodness that the threat to our safety is there rather than here, and that it is past rather than present, but these differences are subjective, being dependent on our point of view. The classification of events into past, or present, or future is a subjective classification, not an objective one.

Presentism is one of the theories in the A-camp because it presumes that being present is an objective property that events have.

Eternalism, on the other hand, is closely associated with the block-universe theory as is four-dimensionalism. Four-dimensionalism implies that the ontologically basic (that is, fundamental) objects in the universe are four-dimensional rather than three-dimensional. Here, time is treated as being somewhat like a fourth dimension of space, though strictly speaking time is not a dimension of space. On the block theory, time is like a very special extra dimension of space, as in a Minkowski diagram, and for this reason the block theory is said to promote the spatialization of time. If time has an infinite future or infinite past, or if space has an infinite extent, then the block is infinitely large along those dimensions.

The block-universe theory implies that reality is a single block of spacetime with its time slices (planes of simultaneous events) ordered by the happens-before relation. Four-dimensionalism adds that every object that lasts longer than an instant is in fact a four-dimensional object with an infinite number of time-slices or temporal parts. Adults are composed of their infancy time-slices, plus their childhood time-slices, plus their teenage time-slices, and so forth.

The block itself has no distinguished past, present, and future, but any chosen reference frame has its own past, present, and future. The future, by the way, is the actual future, not all possible futures. William James coined the term “block-universe.” The growing-past theory is also called the growing-block theory.

All three ontologies about the past, present, and future agree that we only ever experience the present. One of the major issues for presentism is how to ground true propositions about the past. What makes it true that U.S. President Abraham Lincoln was assassinated? Some presentists will say what makes it true are only features of the present way things are. The eternalist disagrees. When someone says truly that Abraham Lincoln was assassinated, the eternalist believes this is to say something true of an existing Abraham Lincoln who is also a non-present thing.

A second issue for the presentist is to account for causation, for the fact that April showers caused May flowers. When causes occur, their effects are not yet present. A survey of defenses of presentism can be found in (Markosian 2003), but opponents of presentism need to be careful not to beg the question.

The presentist and the advocate of the growing-past will usually unite in opposition to eternalism on three grounds: (i) The present is so much more vivid to a conscious being than are memories of past experiences and expectations of future experiences. (No one can stand outside time and compare the vividness of present experience with the vividness of future experience and past experience.) (ii) Eternalism misses the special “open” and changeable character of the future. In the block-universe, which is the ontological theory promoted by most eternalists, there is only one future, so this implies the future exists already, but we know this determinism and its denial of free will is incorrect. (iii) A present event "moves" in the sense that a moment later it is no longer present, having lost its property of presentness.

The counter from the defenders of eternalism and the block-universe is that, regarding (i), the now is significant but not objectively real. Regarding (ii) and the open future, the block theory allows determinism and fatalism but does not require either one. Eventually there will be one future, regardless of whether that future is now open or closed, and that is what constitutes the future portion of the block. Finally, don't we all fear impending doom? But according to presentism and the growing-block theory, why should we have this fear if the doom is known not to exist? The best philosophy of time will not make our different attitudes toward future and past danger be so mysterious.

The advocates of the block-universe attack both presentism and the growing-past theory by claiming that only the block-universe can make sense of the special theory of relativity’s implication that, if persons A and B are separated but in relative motion, an event in person A’s present can be in person B’s future, yet this implies that advocates of presentism and the growing-past theories must suppose that this event is both real and unreal because it is real for A but not real for B. Surely that conclusion is unacceptable, claim the eternalists. Two key assumptions of the block theory here are, first, that relativity does provide an accurate account of the spatiotemporal relations among events, and, second, that if there is some frame of reference in which two events are simultaneous, then if one of the events is real, so is the other.

Opponents of the block-universe counter that block theory does not provide an accurate account of the way things are because the block theory considers the present to be subjective, and not part of objective reality, yet the present is known to be part of objective reality. If science doesn't use the concept of the present in its basic laws, then this is one of science's faults. For a review of the argument from relativity against presentism, and for criticisms of the block theory, see (Putnam 1967) and (Saunders 2002).

b. Is the Present, the Now, Objectively Real?

A calendar does not tell us which day is the present day. The calendar leaves out the "now." All philosophers agree that we would be missing some important information if we did not know what time it is now, but these philosophers disagree over just what sort of information this is. Proponents of the objectivity of the present are committed to claiming the universe would have a present even if there were no conscious beings. This claim is controversial. For example, in 1915, Bertrand Russell objected to giving the present any special ontological standing:

In a world in which there was no experience, there would be no past, present, or future, but there might well be earlier and later. (Russell 1915, p. 212)

The debate about whether the present is objectively real is intimately related to the metaphysical dispute between McTaggart's A-theory and B-theory. The B-theory implies that the present is either non-existent or else mind-dependent, whereas the A-theory does not. The principal argument for believing in the objectivity of the now is that the now is so vivid to everyone; the present stands out specially among all times. If science doesn't explain this vividness, then there is a defect within science. A second argument points out that there is so much agreement among people around us about what is happening now and what is not. So, isn't that a sign that the concept of the now is objective, not subjective, and existent rather than non-existent? A third argument for objectivity of the now is that when we examine ordinary language we find evidence that a belief in the now is ingrained in our language. Notice all the present-tensed terminology in the English language. It is unlikely that it would be so ingrained if it were not correct to believe it.

One criticism of the first argument, the argument from vividness, is that the now is vivid but so is the "here," yet we don't conclude from this that the here is somehow objective geographically. Why then assume that the vividness of the now points to it being objective temporally? A second criticism is that we cannot step outside our present experience and compare its vividness with a direct experience of future or past times. What is actually being compared when we speak of "vividness" is our present experience with our memories and expectations.

A third criticism of the first argument regarding vividness points out that there are empirical studies by cognitive psychologists and neuroscientists showing that our judgment about what is vividly happening now is plastic and can be affected by our expectations and by what other experiences we are having at the time. For example, we see and hear a woman speaking to us from across the room; then we construct an artificial now in which hearing her speak and seeing her speak happen at the same time, whereas the acoustic engineer tells us we are mistaken because the sound traveled much more slowly than the light.

According to McTaggart's A-camp, there is a global now shared by all of us. The B-camp disagrees and says this belief is a product of our falsely supposing that everything we see is happening now; we are not factoring in the finite speed of light. Proponents of the subjectivity of the present frequently claim that a proper analysis of time talk should treat the phrases "the present" and "now" as indexical terms which refer to the time at which the phrases are uttered or written by the speaker, so their relativity to us speakers shows the essential subjectivity of the present. The main positive argument for subjectivity, and against the A-camp, appeals to the relativity of simultaneity, a feature of Einstein's Special Theory of Relativity of 1905. The argument points out that in this theory there is a block of spacetime in which past events are separated from future events by a plane or "time slice" of simultaneous, presently-occurring instantaneous events, but this time slice is different in different reference frames. For example, take a reference frame in which you and I are not moving relative to each other; then we will easily agree on what is happening now (that is, on the "now" slice of spacetime) because our clocks tick at the same rate. Not so for someone moving relative to us. If that other person is far enough away from us (so far that any causal influence of Beethoven's death couldn't have reached that person) and is moving fast enough away from us, then that person might truly say that Beethoven's death is occurring now! Yet if that person were moving rapidly towards us, they might truly say that our future death is happening now. Because the present is frame relative, the A-camp proponent of an objective now must select a frame and thus one of these different planes of simultaneous events as being "what's really happening now," but surely any such choice is just arbitrary, or so Einstein would say. Therefore, if we aren't going to reject Einstein's interpretation of his theory of special relativity, then we should reject the objectivity of the now. Instead we should think of every event as having its own past and future, with its present being all events that are simultaneous with it. For further discussion of this issue see (Butterfield 1984).
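
The frame-dependence of the "now" slice can be made concrete with a small calculation. The following is a minimal sketch, not part of the original argument: it assumes flat spacetime, motion along the line joining us and the distant observer, and units of years and light-years so that c = 1. The function names `lorentz_time` and `our_time_on_their_now` are hypothetical. From the Lorentz transformation t' = gamma * (t - beta * x), the event on our worldline that a distant observer counts as simultaneous with their own present lies at t = -beta * D in our frame, where D is the observer's distance and beta is their velocity as a fraction of c (positive when receding).

```python
import math


def lorentz_time(t: float, x: float, beta: float) -> float:
    """Time coordinate t' of the event (t, x) in a frame moving at velocity beta.

    Units are years and light-years, so c = 1 (an assumption of this sketch).
    """
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must be strictly between -1 and 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (t - beta * x)


def our_time_on_their_now(distance_ly: float, beta: float) -> float:
    """Our-frame time of the event at our location (x = 0) that lies on the
    distant observer's plane of simultaneity through (t = 0, x = distance_ly).

    beta > 0 means the observer recedes from us; beta < 0 means approach.
    A negative result lies in our past, a positive result in our future.
    """
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must be strictly between -1 and 1")
    # Events simultaneous in the moving frame share the value of t - beta * x.
    # Setting x = 0 on our worldline gives t = -beta * distance_ly.
    return -beta * distance_ly


if __name__ == "__main__":
    for beta in (0.5, -0.5):
        shift = our_time_on_their_now(400.0, beta)
        print(f"beta = {beta:+.1f}: their 'now' meets our worldline at {shift:+.1f} years")
```

On these assumptions, an observer 400 light-years away who recedes at half the speed of light has a "now" slice that meets our worldline 200 years in our past, and one who approaches at the same speed has a slice that meets it 200 years in our future. Because the magnitude of beta is less than 1, the shift is always smaller than the light-travel time D, which is why no causal influence could have connected the two events.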

There are interesting issues about the now even in theology. Norman Kretzmann has argued that if God is omniscient, then He knows what time it is, and so must always be changing. Therefore, there is an incompatibility between God's being omniscient and God's being immutable.

c. Persist, Endure, Perdure, and Four-Dimensionalism

Some objects last longer than others. They persist longer. But there is philosophical disagreement about how to understand persistence. Objects considered four-dimensionally are said to persist by perduring rather than enduring. Think of events and processes as being four-dimensional. The more familiar three-dimensional objects such as chairs and people are usually considered to exist wholly at a single time and are said to persist by enduring through time. Advocates of four-dimensionalism endorse perduring objects rather than enduring objects as the metaphysically basic entities. All events, processes and other physical objects are four-dimensional sub-blocks of the block-universe. The perduring object persists by being the sum or “fusion” of a series of its temporal parts (also called its temporal stages and temporal slices and time slices). For example, a middle-aged man can be considered to be a four-dimensional perduring object consisting of his childhood, his middle age and his future old age. These are three of his infinitely many temporal parts.

One argument against four-dimensionalism is that it allows an object to have too many temporal parts. Four-dimensionalism implies that, during every second in which an object exists, there are at least as many temporal parts of the object as there are sub-intervals of the mathematical line in the interval from zero to one. According to (Thomson 1983), this is too many parts for any object to have. Thomson also says that as the present moves along, present temporal parts move into the past and go out of existence while some future temporal parts "pop" into existence, and she complains that this popping in and out of existence is implausible. The four-dimensionalist can respond to these complaints by remarking that the present temporal parts do not go out of existence when they are no longer in the present; instead, they simply do not presently exist. Similarly dinosaurs have not popped out of existence; they simply do not exist presently.

According to David Lewis in On the Plurality of Worlds, the primary argument for perdurantism is that it has an easy time of solving what he calls the problem of temporary intrinsics, of which the Heraclitus paradox is one example. The Heraclitus Paradox is the problem, first introduced by Heraclitus, of explaining our not being able to step into the same river twice because the water is different the second time. The mereological essentialist agrees with Heraclitus, but our common sense says Heraclitus is mistaken. The advocate of endurance has trouble showing that Heraclitus is mistaken for the following reason:  We do not step into two different rivers, do we? Yet the river has two different intrinsic properties, namely being two different collections of water; but, by Leibniz’s Law of the Indiscernibility of Identicals, identical objects cannot have different properties. A 4-dimensionalist who advocates perdurance says the proper metaphysical analysis of the Heraclitus paradox is that we can step into the same river twice by stepping into two different temporal parts of the same 4-d river. Similarly, we cannot see a football game at a moment; we can see only a momentary temporal part of the 4-d game. For more discussion of this topic in metaphysics, see (Carroll and Markosian 2010, pp. 173-7).

Eternalism differs from 4-dimensionalism. Eternalism says the present, past, and future are equally real, whereas 4-dimensionalism says the basic objects are 4-dimensional. Most 4-dimensionalists also accept eternalism and McTaggart's B-theory.

One of A. N. Prior’s criticisms of the B-theory involves the reasonableness of our saying of some painful, past event, “Thank goodness that is over.” Prior says the B-theorist cannot explain this reasonableness because no B-theorist should thank goodness that the end of their pain happens before their present utterance of "Thank goodness that is over," since that B-fact or B-relationship is timeless or tenseless; it has always held and always will. The only way then to make sense of our saying “Thank goodness that is over” is to assume we are thankful for the A-fact that the pain event has pastness. But if so, then the A-theory is correct and the B-theory is incorrect.

One B-theorist response is discussed in a later section, but another response is simply to disagree with Prior that it is improper for a B-theorist to thank goodness that the end of their pain happens before their present utterance, even though this is an eternal B-fact. Still another response from the B-theorist comes from the 4-dimensionalist who says that as 4-dimensional beings it is proper for us to care more about our later time-slices than our earlier time-slices. If so, then it is reasonable to thank goodness that the time slice at the end of the pain occurs before the time slice that is saying, "Thank goodness that is over." Admittedly this is caring about an eternal B-fact. So Prior’s premise [that the only way to make sense of our saying “Thank goodness that is over” is to assume we are thankful for the A-fact that the pain event has pastness] is a faulty premise, and Prior’s argument for the A-theory is unsound.

Four-dimensionalism has implications for the philosophical problem of personal identity. According to four-dimensionalism, you as a teenager and you as a child are not the same person but rather are two different parts of one 4-dimensional person.

d. Truth Values and Free Will

The philosophical dispute about presentism, the growing-past theory, and the block theory or eternalism has taken a linguistic turn by focusing upon a question about language: “Are predictions true or false at the time they are uttered?” Those who believe in the block-universe (and thus in the determinate reality of the future) will answer “Yes” while a “No” will be given by presentists and advocates of the growing-past. The issue is whether contingent sentences uttered now about future events are true or false now rather than true or false only in the future at the time the predicted event is supposed to occur.

Suppose someone says, “Tomorrow the admiral will start a sea battle.” And suppose that tomorrow the admiral orders a sneak attack on the enemy ships which starts a sea battle. Advocates of the block-universe argue that, if so, then the above quoted sentence was true at the time it was uttered. Truth is eternal or fixed, they say, and “is true” is a tenseless predicate, not one that merely says “is true now.” These philosophers point favorably to the ancient Greek philosopher Chrysippus who was convinced that a contingent sentence about the future is true or false. If so, the sentence cannot have any other value such as “indeterminate” or "neither true nor false now." Many other philosophers, usually in McTaggart's A-camp, agree with Aristotle's suggestion that the sentence is not true until it can be known to be true, namely at the time at which the sea battle occurs. The sentence was not true before the battle occurred. In other words, predictions have no (classical) truth values at the time they are uttered. Predictions fall into the “truth value gap.” This position that contingent sentences about the future have no classical truth values is called the Aristotelian position because many researchers throughout history have taken Aristotle to be holding the position in chapter 9 of On Interpretation, although today it is not so clear that Aristotle himself held the position.

The principal motive for adopting the Aristotelian position arises from the belief that if sentences about future human actions are now true, then humans are determined to perform those actions, and so humans have no free will. To defend free will, we must deny truth values to predictions.

This Aristotelian argument against predictions being true or false has been discussed as much as any in the history of philosophy, and it faces a series of challenges. First, if there really is no free will, or if free will is compatible with determinism, then the motivation to deny truth values to predictions is undermined.

Second, according to the compatibilist, your choices affect the world, and if it is true that you will perform an action in the future, it does not follow that now you will not perform it freely, nor that you are not free to do otherwise if your intentions are different, but only that you will not do otherwise. For more on this point about modal logic, see Foreknowledge and Free Will.

A third challenge, from Quine and others, claims the Aristotelian position wreaks havoc with the logical system we use to reason and argue with predictions. For example, here is a deductively valid argument:

There will be a sea battle tomorrow.

If there will be a sea battle tomorrow, then we should wake up the admiral.

So, we should wake up the admiral.

Unless the premises in this argument have truth values, that is, unless they are true or false, we cannot properly assess the argument using the usual standard of deductive validity, because this standard is about the relationships among the truth values of the component sentences: a valid argument is one in which it is impossible for the premises to be true and the conclusion to be false. Unfortunately, the Aristotelian position says that some of these component sentences are neither true nor false, so, according to this challenge, the Aristotelian position is implausible.
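
The classical standard of validity just described can be checked mechanically for a small argument. The sketch below is an illustration only and assumes classical two-valued logic, which is exactly the assumption the Aristotelian position rejects; the helper names `implies` and `is_valid` are hypothetical. It enumerates every assignment of True or False to the two component sentences and looks for a case in which the premises are all true and the conclusion false.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


def is_valid(premises, conclusion, n_vars: int) -> bool:
    """Return True if no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counterexample found
    return True


# battle: "There will be a sea battle tomorrow."
# wake:   "We should wake up the admiral."
admiral_premises = [
    lambda battle, wake: battle,
    lambda battle, wake: implies(battle, wake),
]
admiral_conclusion = lambda battle, wake: wake

# A contrasting argument that swaps the first premise and the conclusion.
converse_premises = [
    lambda battle, wake: wake,
    lambda battle, wake: implies(battle, wake),
]
converse_conclusion = lambda battle, wake: battle

if __name__ == "__main__":
    print(is_valid(admiral_premises, admiral_conclusion, 2))
    print(is_valid(converse_premises, converse_conclusion, 2))
```

The first call reports that the admiral argument is valid, and the second that the superficially similar argument from "we should wake up the admiral" to "there will be a sea battle tomorrow" is not. The procedure presupposes that every component sentence is assigned either True or False, which is the point the third challenge presses against truth-value gaps.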

In reaction to this third challenge, proponents of the Aristotelian argument say that if Quine would embrace tensed propositions and expand his classical logic to a tense logic, he could avoid those difficulties in assessing the validity of arguments that involve sentences having future tense.

Quine has claimed that the analysts of our talk involving time should in principle be able to eliminate the temporal indexical words such as "now" and "tomorrow" because their removal is needed for fixed truth and falsity of our sentences [fixed in the sense of being eternal sentences whose truth values are not relative to the situation because the indexicals and indicator words have been replaced by times, places and names, and whose verbs are treated as tenseless], and having fixed truth values is crucial for the logical system used to clarify science. “To formulate logical laws in such a way as not to depend thus upon the assumption of fixed truth and falsity would be decidedly awkward and complicated, and wholly unrewarding,” says Quine.

Philosophers are still divided on the issues of whether only the present is real, what sort of deductive logic to use for reasoning about time, and whether future contingent sentences have truth values.

9. Are There Essentially-Tensed Facts?

Using a tensed verb is a grammatical way of locating an event in time. All the world’s cultures have a conception of time, but in only half the world’s languages is the ordering of events expressed in the form of grammatical tenses. For example, the Chinese, Burmese and Malay languages do not have any tenses. The English language expresses conceptions of time with tensed verbs but also in other ways, such as with the adverbial time phrases “now” and “twenty-three days ago,” and with the adjective phrases "brand-new" and "ancient," and with the prepositions "until" and "since." Philosophers have asked what we are basically committed to when we use tense to locate an event in the past, in the present, or in the future.

There are two principal answers or theories. One is that tense distinctions represent objective features of reality that are not captured by eternalism and the block-universe approach. This theory is said to "take tense seriously" and is called the tensed theory of time, or the A-theory. This theory claims that when we learn the truth values of certain tensed sentences we obtain knowledge that tenseless sentences do not provide, for example, that such and such a time is the present time. Perhaps the tenseless theory can be more useful than the tensed theory for explaining human behavior. Tenses are the same as positions in McTaggart's A-series, so the tensed theory is commonly associated with the A-camp that was discussed earlier in this article.

A second, contrary answer to the question of the significance of tenses is that tenses are merely subjective features of the perspective from which the speaking subject views the universe.  Using a tensed verb is a grammatical way, not of locating an event in the A-series, but rather of locating the event in time relative to the time that the verb is uttered or written. Actually this philosophical disagreement is not just about tenses in the grammatical sense. It is primarily about the significance of the distinctions of past, present, and future which those tenses are used to mark. The main metaphysical disagreement is about whether times and events have non-relational properties of pastness, presentness, and futurity. Does an event have or not have the property of, say, pastness independent of the event's relation to us and our temporal location?

On the tenseless theory of time, or the B-theory, whether the death of U. S. Lieutenant Colonel George Armstrong Custer occurred here depends on the speaker’s relation to the death event (Is the speaker standing at the battle site in Montana?); similarly, whether the death occurs now is equally subjective (Is it now 1876 for the speaker?). The proponent of the tenseless view does not deny the importance or coherence of talk about the past, but will say it should be analyzed in terms of talk about the speaker's relation to events. My assertion that the event of Custer's death occurred in the past might be analyzed by the B-theorist as asserting that Custer's death event happens before the event of my writing this sentence. This latter assertion does not explicitly use the past tense. According to the classical B-theorist, the use of tense is an extraneous and eliminable feature of language, as is all use of the terminology of the A-series.

This controversy is often presented as a dispute about whether tensed facts exist, with advocates of the tenseless theory objecting to tensed facts and advocates of the tensed theory promoting them as essential. The primary function of tensed facts is to make tensed sentences true. For the purposes of explaining this dispute, let us uncritically accept the Correspondence Theory of Truth and apply it to the following sentence:

Custer died in Montana.

If we apply the Correspondence Theory directly to this sentence, then the tensed theory or A-theory implies

The sentence “Custer died in Montana” is true because it corresponds to the tensed fact that Custer died in Montana.

The old tenseless theory or B-theory, created by Bertrand Russell (1915), would give a different analysis without tensed facts. It would say that the Correspondence Theory should be applied only to the result of first analyzing away tensed sentences into equivalent sentences that do not use tenses. Proponents of this classical tenseless theory prefer to analyze our sentence “Custer died in Montana” as having the same meaning as the following “eternal” sentence:

There is a time t such that Custer dies in Montana at time t, and time t is before the time of the writing of the sentence “Custer died in Montana” by B. Dowden in the article “Time” in the Internet Encyclopedia of Philosophy.

In this analysis, the verb dies is logically tenseless (although grammatically it is in the present tense just like the "is" in "7 plus 5 is 12"). Applying the Correspondence Theory to this new sentence then yields:

The sentence “Custer died in Montana” is true because it corresponds to the tenseless fact that there is a time t such that Custer dies in Montana at time t, and time t is before the time of the writing of the sentence “Custer died in Montana” by B. Dowden in the article “Time” in the Internet Encyclopedia of Philosophy.

This Russell-like analysis is less straightforward than the analysis offered by the tensed theory, but it does not use tensed facts.

This B-theory analysis is challenged by proponents of the tensed A-theory on the grounds that it can succeed only for utterances or readings or inscriptions, but a sentence can be true even if never read or inscribed. There are other challenges. Roderick Chisholm and A. N. Prior claim that the word “is” in the sentence “It is now midnight” is essentially present tensed because there is no adequate translation using only tenseless verbs. Trying to analyze it as, say, “There is a time t such that t = midnight” is to miss the essential reference to the present in the original sentence because the original sentence is not always true, but the sentence “There is a time t such that t = midnight” is always true. So, the tenseless analysis fails. There is no escape from this criticism by adding “and t is now” because this last indexical still needs analysis, and we are starting a vicious regress.

(Prior 1959) supported the tensed A-theory by arguing that after experiencing a painful event,

one says, e.g., “Thank goodness that’s over,” and [this]…says something which it is impossible that any use of a tenseless copula with a date should convey. It certainly doesn’t mean the same as, e.g., “Thank goodness the date of the conclusion of that thing is Friday, June 15, 1954,” even if it be said then. (Nor, for that matter, does it mean “Thank goodness the conclusion of that thing is contemporaneous with this utterance.” Why should anyone thank goodness for that?).

D. H. Mellor and J. J. C. Smart agree that tensed talk is important for understanding how we think and speak (the temporal indexicals are essential, as are other indexicals), but they claim it is not important for describing temporal, extra-linguistic reality. They advocate a newer tenseless B-theory by saying the truth conditions of any tensed declarative sentence can be explained without tensed facts even if Chisholm and Prior are correct that some tensed sentences in English cannot be translated into tenseless ones. [The truth conditions of a sentence are the conditions which must be satisfied in the world in order for the sentence to be true. The sentence "Snow is white" is true on the condition that snow is white. More particularly, it is true if whatever is referred to by the term 'snow' satisfies the predicate 'is white'. The condition under which the conditional sentence "If it's snowing, then it's cold" is true is that it is not both true that it is snowing and false that it is cold. Other analyses are offered for the truth conditions of sentences that are more complex grammatically.]

According to the newer B-theory of Mellor and Smart, if I am speaking to you and say, "It is now midnight," then this sentence admittedly cannot be translated into tenseless terminology without loss of meaning, but the truth conditions can be explained with tenseless terminology. The truth conditions of "It is now midnight" are that my utterance occurs at the same time as your hearing the utterance, which in turn is the same time as when our standard clock declares the time to be midnight in our reference frame. In brief, it's true just in case it is uttered at midnight. Notice that no tensed facts are appealed to in the explanation of those truth conditions. Similarly, an advocate of the new tenseless theory could say it is not the pastness of the painful event that explains why I say, “Thank goodness that’s over.” I say it because I believe that the time of the occurrence of that utterance is greater than the time of the occurrence of the painful event, and because I am glad about this. Of course I'd be even gladder if there were no pain at any time. I may not be consciously thinking about the time of the utterance when I make it; nevertheless that time is what helps explain what I am glad about. Notice that appeal to tensed terminology was removed in that explanation.

In addition, it is claimed by Mellor and other new B-theorists that tenseless sentences can be used to explain the logical relations between tensed sentences: that one tensed sentence implies another, is inconsistent with yet another, and so forth. Understanding a declarative sentence's truth conditions and its truth implications and how it behaves in a network of inferences is what we understand whenever we know the meaning of the sentence. According to this new theory of tenseless time, once it is established that tensed sentences can be explained without utilizing tensed facts, then Ockham’s Razor is applied. If we can do without essentially-tensed facts, then we should say essentially-tensed facts do not exist. To summarize, tensed facts were presumed to be needed to account for the truth of tensed talk; but the new B-theory analysis shows that ordinary tenseless facts are adequate. The theory concludes that we should not take seriously metaphysical tenses with their tensed facts because they are not needed for describing the objective features of the extra-linguistic world. Proponents of the tensed theory of time do not agree with this conclusion. So, the philosophical debate continues over whether tensed concepts have semantical priority over untensed concepts, and whether tensed facts have ontological priority over untensed facts.

10. What Gives Time Its Direction or Arrow?

Time's arrow is revealed in the way macroscopic or multi-particle processes tend to go over time, and that way is the direction toward disarray, the direction toward equilibrium, the direction toward higher entropy. For example, egg processes always go from unbroken eggs to omelets, never in the direction from omelets to unbroken eggs. The process of mixing coffee always goes from black coffee and cream toward brown coffee. You can’t unmix brown coffee. We can ring a bell but never un-ring it.

The arrow of a physical process is the way it normally goes, the way it normally unfolds through time. If a process goes only one way, we call it an irreversible process; otherwise it is reversible. (Strictly speaking, a reversible process is one that is reversed by an infinitesimal change of its surrounding conditions, but we can overlook this fine point because of the general level of the present discussion.) The amalgamation of the universe’s irreversible processes produces the cosmic arrow of time, the master arrow. This arrow of time is the same for all of us. Usually this arrow is what is meant when one speaks of time’s arrow. So, time's arrow indicates directed processes in time, and the arrow may or may not have anything to do with the flow of time.

Because so many of the physical processes that we commonly observe do have an arrow, you might think that an inspection of the basic micro-physical laws would readily reveal time’s arrow. It will not. With some exceptions, such as the collapse of the quantum mechanical wave function and the decay of a B meson, all the basic laws of fundamental processes are time symmetric. A process that is time symmetric can go forward or backward in time; the laws allow both. Maxwell’s equations of electromagnetism, for example, can be used to predict that television signals can exist, but these equations do not tell us whether those signals arrive before or arrive after they are transmitted. In other words, the basic laws of science, its fundamental laws, do not by themselves imply an arrow of time. Something else must tell us why television signals are emitted from, but not absorbed into, TV antennas and why omelets don't turn into whole, unbroken eggs. The existence of the arrow of time is not derivable from the basic laws of science but is due to entropy, to the fact that entropy goes from low to high and not the other way.  But, as we will see in a moment, it is not clear why entropy behaves this way. So, how to explain the arrow is still an open question in science and philosophy.

a. Time without an Arrow

Time could exist in a universe that had no arrow, provided there was change in the universe. However, that change needs to be random change in which processes happen one way sometimes and the reverse way at other times. The second law of thermodynamics would fail in such a universe.

b. What Needs to be Explained

There are many goals for a fully developed theory of time’s arrow. It should tell us (1) why time has an arrow; (2) why the basic laws of science do not reveal the arrow; (3) how the arrow is connected with entropy; (4) why the arrow is apparent in macro processes but not micro processes; (5) why the entropy of a closed system increases in the future rather than decreases even though the decrease is physically possible given current basic laws; (6) what it would be like for our arrow of time to reverse direction; (7) what the characteristics are of a physical theory that would pick out a preferred direction in time; and (8) what the relationships are among the various more specific arrows of time, that is, the various kinds of temporally asymmetric processes such as a B meson decay [the B-meson arrow], the collapse of the wave function [the quantum mechanical arrow], entropy increases [the thermodynamic arrow], causes preceding their effects [the causal arrow], light radiating away from hot objects rather than converging into them [the electromagnetic arrow], and our knowing the past more easily than the future [the knowledge arrow].

c. Explanations or Theories of the Arrow

There are three principal explanations of the arrow: (i) it is a product of one-way entropy flow which in turn is due to the initial conditions of the universe, (ii) it is a product of one-way entropy flow which in turn is due to some as yet unknown asymmetrical laws of nature, (iii) it is a product of causation which itself is asymmetrical.

Leibniz first proposed (iii), the so-called causal theory of time's order. Hans Reichenbach developed the idea in detail in 1928. He suggested that event A happens before event B if A could have caused B but B could not have caused A. The usefulness of this causal theory depends on a clarification of the notorious notions of causality and possibility without producing a circular explanation that presupposes an understanding of time order.

21st century physicists generally favor explanation (i). They say the most likely explanation of the emergence of an arrow of time in a world with time-blind basic laws is that the arrow is a product of the direction of entropy change. A leading suggestion is that this directedness of entropy change is due to increasing quantum entanglement plus the low-entropy state of the universe at the time of our Big Bang. Unfortunately there is no known explanation of why the entropy was so low at the time of our Big Bang. Some say the initially low entropy is just a brute fact with no more fundamental explanation. Others say it is due to as yet undiscovered basic laws that are time-asymmetric. And still others say it must be the product of the way the universe was before our Big Bang.

Before saying more about quantum entanglement, let's describe entropy. There are many useful definitions of entropy. On one definition, it is a measure inversely related to the energy available for work in a physical system. According to another definition, the entropy of a physical system that is isolated from external influences is a measure [specifically, the logarithm] of how many microstates are macroscopically indistinguishable. Less formally, entropy is a measure of how disordered or "messy" or "run down" a closed system is. More entropy implies more disorganization. Changes toward disorganization are so much more frequent than changes toward more organization because there are so many more ways for a closed system to be disorganized than for it to be organized. For example, there are so many more ways for the air molecules in an otherwise empty room to be scattered about evenly throughout the room, giving it a uniform air density, than there are ways for there to be a concentration of air within a sphere near the floor while the rest of the room is a vacuum.

According to the 2nd Law of Thermodynamics, which is not one of our basic or fundamental laws of science, entropy in an isolated system or region never decreases in the future and almost always increases toward a state of equilibrium. Sadi Carnot discovered a version of the second law in 1824, and Rudolf Clausius later invented the concept of entropy and expressed the law in terms of heat. Ludwig Boltzmann then generalized this work, expressed the law in terms of a more sophisticated concept of entropy involving atoms and their arrangements, and also tried to explain the law statistically as being due to the fact that there are so many more ways for a system of atoms to have arrangements with high entropy than arrangements with low entropy. This is why entropy naturally tends to go from low to high.
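
The counting argument can be made concrete with a small sketch. It simplifies the room example by assuming just two regions (a left half and a right half) and a hypothetical total of N = 100 molecules; the names multiplicity and entropy are illustrative, and entropy is expressed in units of Boltzmann's constant.

```python
from math import comb, log

N = 100  # hypothetical number of molecules; a real room has vastly more


def multiplicity(n_left):
    """Number of microstates with exactly n_left of the N molecules in the left half."""
    return comb(N, n_left)


def entropy(n_left):
    """Logarithm of the multiplicity (entropy in units of Boltzmann's constant)."""
    return log(multiplicity(n_left))


all_in_one_half = multiplicity(N)    # exactly one arrangement
evenly_spread = multiplicity(N // 2)  # roughly 10**29 arrangements

print(all_in_one_half, evenly_spread)
print(entropy(N), entropy(N // 4), entropy(N // 2))
```

Even with only 100 molecules, the evenly spread macrostate is realized by about 10^29 times as many microstates as the fully concentrated one, which is the sense in which disorganized states are overwhelmingly more numerous.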

For example, if you float ice cubes in hot coffee, why do you end up with lukewarm coffee if you don’t interfere with this coffee-ice-cube system? And why doesn’t lukewarm coffee ever spontaneously turn into hot coffee with ice cubes? The answer from Boltzmann is that the number of macroscopically indistinguishable arrangements of the atoms in the system that appear to us as lukewarm coffee is so very much greater than the number of macroscopically indistinguishable arrangements of the atoms in the system that appear to us as ice cubes floating in the hot coffee. It is all about probabilities of arrangements of the atoms.

“What’s really going on [with the arrow of time pointing in the direction of equilibrium] is things are becoming more correlated with each other,” M.I.T. professor Seth Lloyd said. He was the first person to suggest that the arrow of time in any process is an arrow of increasing correlations as the particles in that process become more entangled with neighboring particles.

Said more simply and without mentioning entanglement, the change in entropy of a system that is not yet in equilibrium is a one-way street toward greater disorganization and less useful forms of energy. For example, when a car burns gasoline, the entropy increase is evident in the fact that the new heat energy distributed throughout the byproducts of the gasoline combustion is much less useful than was the potential chemical energy in the pre-combustion gasoline. The entropy of our universe, conceived of as the largest isolated system, has been increasing for the last 13.8 billion years and will continue to do so for a very long time. At the time of the Big Bang, our universe was in a highly organized, low-entropy, non-equilibrium state, and it has been running down and getting more disorganized ever since. This running down is the cosmic arrow of time.

According to the 2nd Law of Thermodynamics, if an isolated system is not in equilibrium and has a great many particles, then it is overwhelmingly likely that the system's entropy will increase in the future. This 2nd law is universal but not fundamental because it apparently can be explained in terms of the behavior of the atoms making up the system. Ludwig Boltzmann was the first person to claim to have deduced the macroscopic 2nd law from reversible microscopic laws of Newtonian physics. Yet it seems too odd, said Joseph Loschmidt, that a one-way macroscopic process can be deduced from two-way microscopic processes. In 1876, Loschmidt argued that if you look at our present state (the black dot in the diagram below), then you ought to deduce from the basic laws (assuming you have no knowledge that the universe actually had lower entropy in the past) that it evolved not from a state of low entropy in the past, but from a state of higher entropy in the past, which of course is not at all what we know our past to be like. The difficulty is displayed in the diagram below.

[Figure: graph of entropy vs. time]

Yet we know our universe is an isolated system by definition, and we have good observational evidence that it surely did not have high entropy in the past—at least not in the past that is between now and the Big Bang—so the actual low value of entropy in the past is puzzling. Sean Carroll (2010) offers a simple illustration of the puzzle. If you found a half-melted ice cube in an isolated glass of water (analogous to the black dot in the diagram), and all you otherwise knew about the universe is that it obeys our current, basic time-reversible laws and you knew nothing about its low entropy past, then you'd infer, not surprisingly, that the ice cube would melt into a liquid in the future (solid green line). But, more surprisingly, you also would infer that your glass evolved from a state of  liquid water (dashed red line). You would not infer that the present half-melted state evolved from a state where the glass had a solid ice cube in it (dashed green line). To infer the solid cube you would need to appeal to your empirical experience of how processes are working around you, but you'd not infer the solid cube if all you had to work with were the basic time-reversible laws. To solve this so-called Loschmidt Paradox for the cosmos as a whole, and to predict the dashed green line rather than the dashed red line, physicists have suggested it is necessary to adopt the Past Hypothesis—that the universe at the time of the Big Bang was in a state of very low entropy. Using this Past Hypothesis, the most probable history of the universe over the last 13.8 billion years is one in which entropy increases.
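
The retrodiction half of this puzzle can be illustrated with a toy model that is not part of the discussion above: the Ehrenfest urn model, in which N balls sit in two urns and at each step one ball chosen at random switches urns. The sketch assumes the model is in statistical equilibrium, with nothing like the Past Hypothesis imposed, and asks what to expect just after and just before an unusually lopsided (low-entropy) present state. The names N, forward, backward, and present are illustrative choices.

```python
from math import comb

N = 20  # hypothetical number of balls shared between a left and a right urn


def forward(i, j):
    """P(next state is j | current state is i), where a state is the left-urn count."""
    if j == i - 1:
        return i / N          # a left-urn ball was picked and moved right
    if j == i + 1:
        return (N - i) / N    # a right-urn ball was picked and moved left
    return 0.0


def stationary(i):
    """Equilibrium probability of finding i balls in the left urn (binomial)."""
    return comb(N, i) / 2**N


def backward(i, j):
    """P(previous state was j | current state is i), by Bayes' rule at equilibrium."""
    return stationary(j) * forward(j, i) / stationary(i)


present = 4  # lopsided present state, analogous to the half-melted ice cube
expected_next = sum(j * forward(present, j) for j in range(N + 1))
expected_prev = sum(j * backward(present, j) for j in range(N + 1))

print(expected_next, expected_prev)
```

Under these assumptions the expected previous state and the expected next state coincide, and both lie closer to the even split than the present state does. That mirrors the claim that time-symmetric statistics alone would retrodict a higher-entropy past, which is why an extra assumption about the early universe is needed.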

Can the Past Hypothesis be justified from other principles? Some physicists (for example, Richard Feynman) and philosophers (for example, Craig Callender) say the initial low entropy may simply be a brute fact—that is, there is no causal explanation for the initial low entropy. Objecting to inexplicable initial facts as being unacceptably ad hoc, the physicists Walther Ritz and Roger Penrose say we need to keep looking for basic, time-asymmetrical laws that will account for the initial low entropy and thus for time’s arrow. A third perspective on the Past Hypothesis is that perhaps a future theory of quantum gravity will provide a justification of the Hypothesis. A fourth perspective appeals to God's having designed the Big Bang to start with low entropy. A fifth perspective appeals to the anthropic principle and the many-worlds interpretation of quantum mechanics in order to argue that since there exist so many universes with different initial entropies, there had to be one universe like our particular universe with its initial low entropy—and that is the only reason why our universe had low entropy initially.

d. Multiple Arrows

The past and future are different in many ways that reflect the arrow of time. Consider the difference between time’s arrow and time’s arrows. The direction of entropy change is the thermodynamic arrow. Here are some suggestions for additional arrows:

  1. We remember last week, not next week.
  2. There is evidence of the past but not of the future.
  3. Our present actions affect the future and not the past.
  4. It is easier to know the past than to know the future.
  5. Radio waves spread out from the antenna, but never converge into it.
  6. The universe expands in volume rather than shrinks.
  7. Causes precede their effects.
  8. We see black holes but never white holes.
  9. B meson decay, neutral kaon decay, and Higgs boson decay are each different in a time-reversed world.
  10. Quantum mechanical measurement collapses the wave function.
  11. Possibilities decrease as time goes on.

Most physicists suspect all these arrows are linked so that we cannot have some arrows reversing while others do not. For example, the collapse of the wave function is generally considered to be due to an increase in the entropy of the universe. It is well accepted that entropy increase can account for the fact that we remember the past but not the future, that effects follow causes rather than precede them, and that animals grow old and never young. However, whether all the arrows are linked is still an open question.

e. Reversing the Arrow

Could the cosmic arrow of time have gone the other way? Most physicists suspect that the answer is yes, and they say it could have gone the other way if the initial conditions of the universe at our Big Bang had been different. Crudely put, if all the particles’ trajectories and charges were reversed, then the arrow of time would reverse. Here is a scenario of how it might happen. As our universe evolves closer to a point of equilibrium and very high entropy, time would lose its unidirectionality. Eventually, though, the universe could evolve away from equilibrium, and perhaps it would evolve so that the directional processes we are presently familiar with would go in reverse. For example, we would get eggs from omelets very easily, but it would be too difficult to get omelets from eggs. Fires would absorb light instead of emitting it. This new era would be an era of reversed time, and there would be a vaguely defined period of non-directional time separating the two eras.
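
The crude idea that reversing every particle's motion reverses the direction of a process can be sketched with a deliberately simple model: non-interacting particles hopping around a ring of cells with integer positions and velocities. Everything here (the ring size RING, the helper names, the use of occupied cells as a stand-in for disorder) is an assumption made for illustration, and charges are ignored.

```python
RING = 101  # hypothetical number of cells on the ring


def step(state):
    """Advance every (position, velocity) pair by one time step."""
    return [((x + v) % RING, v) for x, v in state]


def reverse(state):
    """Flip the sign of every velocity, leaving positions unchanged."""
    return [(x, -v) for x, v in state]


def occupied_cells(state):
    """Crude disorder measure: how many distinct cells contain a particle."""
    return len({x for x, _ in state})


# Ordered start: ten particles bunched in one cell, with different velocities.
initial = [(0, v) for v in range(1, 11)]

state = initial
for _ in range(7):
    state = step(state)
spread_out = occupied_cells(state)   # the bunch has dispersed

state = reverse(state)               # reverse all the motions
for _ in range(7):
    state = step(state)
regathered = occupied_cells(state)   # the particles have re-bunched

print(occupied_cells(initial), spread_out, regathered)
```

After the velocities are flipped, the same dynamical rule carries the dispersed particles back to their bunched starting cell, so the reversed run looks like the original run played backward.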

If the cosmic arrow of time were to reverse this way, perhaps our past would be re-created and lived in reverse order. This re-occurrence of the past is different from the re-living of past events via time travel. With time travel the past is re-visited in the original order, not in reverse order.

Philosophers have asked interesting questions about the reversal of time’s arrow. What does it really mean to say time reverses? Does it require entropy to decrease on average in closed systems? If time were to reverse only in some far off corner of the universe, but not in our region of the universe, would dead people there become undead, and would the people there walk backwards up steps while remembering the future? First off, would it even be possible for them to be conscious? Assuming consciousness is caused by brain processes, could there be consciousness if their nerve pulses reversed, or would this reversal destroy consciousness? Supposing the answer is that they would be conscious, would people in that far off corner appear to us to be pre-cognitive if we could communicate with them? Would the feeling of being conscious be different for time-reversed people? [Here is one suggestion. There is one direction of time they would remember and call “the past,” and it would be when the entropy is lower. That is just as it is for us who do not experience time-reversal.] Consider communication between us and the inhabitants of that far off time-reversed region of the universe. If we sent a signal to the time-reversed region, could our message cross the border, or would it dissolve there, or would it bounce back? If residents of the time-reversed region successfully sent a recorded film across the border to us, should we play it in the ordinary way or in reverse?

11. What is Temporal Logic?

Temporal logic is the representation of reasoning about time by using the methods of symbolic logic in order to formalize which statements (or propositions or sentences) about time imply which others. For example, in McTaggart's B-series, the most important relation is the happens-before relation on events. Logicians have asked what sort of principles this relation must obey in order to properly account for our reasoning about time.

Here is one suggestion. Consider this informally valid reasoning:

Adam's arrival at the train station happened before Bryan's. Therefore, Bryan's arrival at the station did not happen before Adam's.

Let us translate this into classical predicate logic using a domain of instantaneous events, namely point events, where the individual constant 'a' denotes Adam's arrival at the train station, and 'b' denotes Bryan's arrival at the train station. Let the two-argument relation B(x,y) be interpreted as "x happens before y." The direct translation produces:

B(a,b)
∴ ~B(b,a)

Unfortunately, this formal reasoning is invalid. To make the formal argument valid, we could make explicit the implicit premise that the B relation is asymmetric. That is, we need to add the implicit premise:

∀x∀y[B(x,y) → ~B(y,x)]

So, we might want to add this principle as an axiom into our temporal logic.

In other informally valid reasoning, we discover a need to make even more assumptions about the happens-before relation. Suppose Adam arrived at the train station before Bryan, and suppose Bryan arrived before Charles. Is it valid reasoning to infer that Adam arrived before Charles? Yes, but if we translate directly into classical predicate logic, with 'c' denoting Charles's arrival, we get this invalid argument:

B(a,b)
B(b,c)
∴ B(a,c)

To make this argument valid we need the implicit premise that says the happens-before relation is transitive, that is:

∀x∀y∀z [(B(x,y) & B(y,z)) → B(x,z)]

What other constraints should be placed on the B relation (when it is to be interpreted as the happens-before relation)? Logicians have offered many suggestions: that B is irreflexive, that in any reference frame any two events are related somehow by the B relation (there are no disconnected pairs of events), that B is dense in the sense that there is a third point event between any two point events that are not simultaneous, and so forth.
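
Several of these candidate constraints can be checked mechanically on a small, finite model. The sketch below assumes a hypothetical domain of three point events (the arrivals a, b, and c) and represents B as a set of ordered pairs; the function names are illustrative. Density is left out because no finite relation with at least one related pair can be both dense and irreflexive.

```python
from itertools import product

events = ["a", "b", "c"]                 # hypothetical point events
B = {("a", "b"), ("b", "c"), ("a", "c")}  # (x, y) in B means "x happens before y"


def asymmetric(rel):
    """For all x, y: B(x,y) implies not B(y,x)."""
    return all((y, x) not in rel for (x, y) in rel)


def transitive(rel):
    """For all x, y, z: B(x,y) and B(y,z) imply B(x,z)."""
    return all((x, z) in rel
               for (x, y) in rel
               for (y2, z) in rel
               if y == y2)


def irreflexive(rel, domain):
    """No event happens before itself."""
    return all((x, x) not in rel for x in domain)


def connected(rel, domain):
    """Any two distinct events are related one way or the other."""
    return all(x == y or (x, y) in rel or (y, x) in rel
               for x, y in product(domain, repeat=2))


print(asymmetric(B), transitive(B), irreflexive(B, events), connected(B, events))

# Dropping the pair ("a", "c") yields a relation that is no longer transitive.
B_gap = B - {("a", "c")}
print(transitive(B_gap))
```

A check of this kind only tests one particular model; it does not settle which constraints the happens-before relation ought to satisfy in general.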

The more classical approach to temporal logic, however, does not add premises to arguments in classical predicate logic as we have just been doing. The classical approach is via tense logic, a formalism that adds tense operators on propositions of propositional logic. The pioneer in the late 1950s was A. N. Prior. He created a new symbolic logic to describe our reasoning involving time phrases such as “now,” “happens before,” “twenty-three minutes afterwards,” “at all times,” and “sometimes.” He hoped that a precise, formal treatment of these concepts could lead to resolution of some of the controversial philosophical issues about time.

Prior begins with an important assumption: that a proposition such as “Custer dies in Montana” can be true at one time and false at another time. That assumption is challenged by some philosophers, such as W.V. Quine, who prefer to avoid use of this sort of proposition and who recommend that temporal logics use only sentences that are timelessly true or timelessly false, and that have no indexicals whose reference can shift from one context to another.

Prior's main original idea was to appreciate that time concepts are similar in structure to modal concepts such as “it is possible that” and “it is necessary that.” He adapted modal propositional logic for his tense logic. Michael Dummett and E. J. Lemmon also made major, early contributions to tense logic. One standard system of tense logic is a variant of the S4.3 system of modal logic. In this formal tense logic, the modal operator that is interpreted to mean “it is possible that” is re-interpreted to mean “at some past time it was the case that” or, equivalently, “it once was the case that,” or "it once was that." Let the capital letter 'P' represent this operator. P will operate on present-tensed propositions, such as p. If p represents the proposition “Custer dies in Montana,” then Pp says Custer died in Montana. If Prior can make do with the variable p ranging only over present-tensed propositions, then he may have found a way to eliminate any ontological commitment to non-present entities such as dinosaurs while preserving the possibility of true past tense propositions such as "There were dinosaurs."

Prior added to the axioms of classical propositional logic the axiom P(p v q) ↔ (Pp v Pq). The axiom says that for any two propositions p and q, at some past time it was the case that p or q if and only if either at some past time it was the case that p or at some past time (perhaps a different past time) it was the case that q.

If p is the proposition “Custer dies in Montana” and q is “Sitting Bull dies in Montana,” then

P(p v q) ↔ (Pp v Pq)

says that Custer or Sitting Bull died in Montana if and only if either Custer died in Montana or Sitting Bull died in Montana.

The S4.3 system’s key axiom is the equivalence, for all propositions p and q,

Pp & Pq ↔ [P(p & q) v P(p & Pq) v P(q & Pp)].

This axiom when interpreted in tense logic captures part of our ordinary conception of time as a linear succession of states of the world.

Another axiom of tense logic might state that if proposition q is true, then it will always be true that q has been true at some time. A different candidate axiom relates the operator P to its dual. If H is the operator “It has always been the case that,” then a new axiom might be

Pp ↔ ~H~p.

This axiom of tense logic is analogous to the modal logic axiom that p is possible if and only if it is not the case that it is necessary that not-p.

A tense logic may need additional axioms in order to express “q has been true for the past two weeks.” Prior and others have suggested a wide variety of additional axioms for tense logic, but logicians still disagree about which axioms to accept.

It is controversial whether to add axioms that express the topology of time, for example that it comes to an end or doesn't come to an end; the reason is that this is an empirical matter, not a matter for logic to settle.

Regarding a semantics for tense logic, Prior had the idea that the truth of a tensed proposition should be expressed in terms of truth-at-a-time. For example, a modal proposition Pp (it was once the case that p) is true at a time t if and only if p is true at a time earlier than t. This suggestion has led to an extensive development of the formal semantics for tense logic.
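
The truth-at-a-time idea lends itself to a small model checker. The sketch below assumes a hypothetical finite, linear timeline of four instants and represents a proposition as a function from times to truth values; the helper names (P, H, Not, Or, And, valid) are illustrative. It then tests, by brute force over every assignment of truth values to p and q, the three axioms quoted earlier in this section.

```python
from itertools import product

TIMES = range(4)  # hypothetical timeline of instants 0 < 1 < 2 < 3


def P(prop):
    """'It once was the case that prop': true at t iff prop holds at some earlier time."""
    return lambda t: any(prop(s) for s in TIMES if s < t)


def H(prop):
    """'It has always been the case that prop': true at t iff prop holds at all earlier times."""
    return lambda t: all(prop(s) for s in TIMES if s < t)


def Not(a):
    return lambda t: not a(t)


def Or(a, b):
    return lambda t: a(t) or b(t)


def And(a, b):
    return lambda t: a(t) and b(t)


def valid(lhs, rhs):
    """True iff lhs and rhs agree at every time under every valuation of p and q."""
    valuations = product([False, True], repeat=len(TIMES))
    for p_vals, q_vals in product(list(valuations), repeat=2):
        p = lambda t, v=p_vals: v[t]
        q = lambda t, v=q_vals: v[t]
        if any(lhs(p, q)(t) != rhs(p, q)(t) for t in TIMES):
            return False
    return True


distribution = valid(lambda p, q: P(Or(p, q)),
                     lambda p, q: Or(P(p), P(q)))
duality = valid(lambda p, q: P(p),
                lambda p, q: Not(H(Not(p))))
linearity = valid(lambda p, q: And(P(p), P(q)),
                  lambda p, q: Or(Or(P(And(p, q)), P(And(p, P(q)))),
                                  P(And(q, P(p)))))

print(distribution, duality, linearity)
```

All three equivalences hold on this timeline. The linearity check depends on the instants being totally ordered; on a branching structure of times it would be expected to fail.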

The concept of being in the past is usually treated by metaphysicians as a predicate that assigns properties to events, but, in the tense logic just presented, the concept is treated as an operator P upon propositions, and this difference in treatment is objectionable to some metaphysicians.

The other major approach to temporal logic does not use a tense logic. Instead, it formalizes temporal reasoning within a first-order logic without modal-like tense operators. One method for developing ideas about temporal logic is the method of temporal arguments, which adds an additional temporal argument to any predicate involving time in order to indicate how its satisfaction depends on time. A predicate such as “is less than seven” does not involve time, but the predicate “is resting” does, even though both use the word "is". If “x is resting” is represented classically as P(x), where P is a one-argument predicate, then it could be represented in temporal logic instead as the two-argument predicate P(x,t), and this would be interpreted as saying x has property P at time t. P has been changed to a two-argument predicate by adding a “temporal argument.” The time variable 't' is treated as a new sort of variable requiring new axioms. Suggested new axioms allow time to be a dense linear ordering of instants or to be continuous or to have some other structure.

Occasionally the method of temporal arguments uses a special constant symbol, say 'n', to denote now, the present time. This helps with the translation of common temporal sentences. For example, let Q(t) be interpreted as “Socrates is sitting down at t.” The sentence or proposition that Socrates has always been sitting down may be translated into first-order temporal logic as

(∀t)[(t < n) → Q(t)].
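
As a rough illustration of how such a formula is evaluated, the sketch below restricts the quantifier to a hypothetical finite sample of times and fixes an arbitrary value for the constant n. The particular interpretation of Q and the helper name has_always are assumptions made only for the example.

```python
TIMES = range(0, 10)  # hypothetical finite sample of times
n = 5                 # assumed value of the constant denoting "now"


def Q(t):
    """Toy interpretation of 'Socrates is sitting down at t': true up to time 7."""
    return t <= 7


def has_always(pred, now):
    """Evaluate (for all t)[(t < now) -> pred(t)], writing A -> B as (not A) or B."""
    return all((not t < now) or pred(t) for t in TIMES)


always_sitting_at_n = has_always(Q, n)  # every sampled time before 5 satisfies Q
always_sitting_at_9 = has_always(Q, 9)  # fails because Q is false at time 8

print(always_sitting_at_n, always_sitting_at_9)
```

Times at or after the chosen "now" satisfy the conditional vacuously, which is why only the earlier times matter to the result.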

Some temporal logics allow sentences to lack both classical truth-values. The first person to give a clear presentation of the implications of treating declarative sentences as being neither true nor false was the Polish logician Jan Lukasiewicz in 1920. To carry out Aristotle’s suggestion that future contingent sentences do not yet have truth values, he developed a three-valued symbolic logic, with each grammatical declarative sentence having one of the truth-values True, False, or Indeterminate [T, F, or I]. Contingent sentences about the future, such as, "There will be a sea battle tomorrow," are assigned an I value in order to indicate the indeterminacy of the future. Truth tables for the connectives of propositional logic are redefined to maintain logical consistency and to maximally preserve our intuitions about truth and falsehood. See (Haack 1974) for more details about this application of three-valued logic.
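
One common way of presenting Lukasiewicz's redefined truth tables encodes T, I, and F as the numbers 1, 1/2, and 0. The sketch below follows that numeric presentation; the encoding and the function names are conveniences assumed for the example rather than notation used in this article.

```python
from fractions import Fraction

T, I, F = Fraction(1), Fraction(1, 2), Fraction(0)


def neg(x):
    """Negation: swaps T and F and leaves I unchanged."""
    return 1 - x


def conj(x, y):
    """Conjunction takes the smaller of the two values."""
    return min(x, y)


def disj(x, y):
    """Disjunction takes the larger of the two values."""
    return max(x, y)


def impl(x, y):
    """Lukasiewicz conditional: fully true whenever y is at least as true as x."""
    return min(Fraction(1), 1 - x + y)


sea_battle = I  # "There will be a sea battle tomorrow"

excluded_middle = disj(sea_battle, neg(sea_battle))  # I, not T
self_implication = impl(sea_battle, sea_battle)      # T

print(excluded_middle == I, self_implication == T)
```

On these tables "There will or will not be a sea battle tomorrow" comes out Indeterminate rather than True, while "if p then p" remains True. The connectives agree with the classical tables whenever both inputs are T or F.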

Different temporal logics have been created depending on whether one wants to model circular time, discrete time, time obeying general relativity, the time of ordinary discourse, and so forth. For an introduction to tense logic and other temporal logics, see (Øhrstrøm and Hasle 1995).

12. Supplements

a. Frequently Asked Questions

The following questions are addressed in the Time Supplement article:

  1. What are Instants and Durations?
  2. What is an Event?
  3. What is a Reference Frame?
  4. What is an Inertial Frame?
  5. What is Spacetime?
  6. What is a Minkowski Diagram?
  7. What are the Metric and the Interval?
  8. Does the Theory of Relativity Imply Time is Part of Space?
  9. Is Time the Fourth Dimension?
  10. Is There More Than One Kind of Physical Time?
  11. How is Time Relative to the Observer?
  12. What is the Relativity of Simultaneity?
  13. What is the Conventionality of Simultaneity?
  14. What is the Difference Between the Past and the Absolute Past?
  15. What is Time Dilation?
  16. How does Gravity Affect Time?
  17. What Happens to Time Near a Black Hole?
  18. What is the Solution to the Twin Paradox (Clock Paradox)?
  19. What is the Solution to Zeno’s Paradoxes?
  20. How do Time Coordinates Get Assigned to Points of Spacetime?
  21. How do Dates Get Assigned to Actual Events?
  22. What is Essential to Being a Clock?
  23. What does It Mean for a Clock To Be Accurate?
  24. What is Our Standard Clock?
  25. Why are Some Standard Clocks Better Than Others?

b. What Science Requires of Time

c. Special Relativity: Proper times, Coordinate systems, and Lorentz Transformations

13. References and Further Reading

  • Butterfield, Jeremy. “Seeing the Present” Mind, 93, (1984), pp. 161-76.
    • Defends the B-camp position on the subjectivity of the present and its not being a global present.
  • Callender, Craig, and Ralph Edney. Introducing Time, Totem Books, USA, 2001.
    • A cartoon-style book covering most of the topics in this encyclopedia article in a more elementary way. Each page is two-thirds graphics and one-third text.
  • Callender, Craig and Carl Hoefer. “Philosophy of Space-Time Physics” in The Blackwell Guide to the Philosophy of Science, ed. by Peter Machamer and Michael Silberstein, Blackwell Publishers, 2002, pp. 173-98.
    • Discusses whether it is a fact or a convention that in a reference frame the speed of light going one direction is the same as the speed coming back.
  • Callender, Craig. "The Subjectivity of the Present," Chronos, V, 2003-4, pp. 108-126.
    • Surveys the psychological and neuroscience literature and suggests that the evidence tends to support the claim that our experience of the "now" is the experience of a subjective property rather than merely of an objective property, and it offers an interesting explanation of why so many people believe in the objectivity of the present.
  • Callender, Craig. "The Common Now," Philosophical Issues 18, pp. 339-361 (2008).
    • Develops the ideas presented in (Callender 2003-4).
  • Callender, Craig. "Is Time an Illusion?", Scientific American, June, 2010, pp. 58-65.
    • Explains how the belief that time is fundamental may be an illusion because time emerges from a universe that is basically static.
  • Carroll, John W. and Ned Markosian. An Introduction to Metaphysics. Cambridge University Press, 2010.
    • This introductory, undergraduate metaphysics textbook contains an excellent chapter introducing the metaphysical issues involving time, beginning with the McTaggart controversy.
  • Carroll, Sean. From Eternity to Here: The Quest for the Ultimate Theory of Time, Dutton/Penguin Group, New York, 2010.
    • Part Three "Entropy and Time's Arrow" provides a very clear explanation of the details of the problems involved with time's arrow. For an interesting answer to the question of whether any interaction is possible between our part of the universe and a part in which the arrow of time goes in reverse, see endnote 137 for p. 164.
  • Carroll, Sean. "Ten Things Everyone Should Know About Time," Discover Magazine, Cosmic Variance, online 2011.
    • Contains the quotation about how the mind reconstructs its story of what is happening "now."
  • Damasio, Antonio R. “Remembering When,” Scientific American: Special Edition: A Matter of Time, vol. 287, no. 3, 2002; reprinted in Katzenstein, 2006, pp.34-41.
    • A look at the brain structures involved in how our mind organizes our experiences into the proper temporal order. Includes a discussion of Benjamin Libet’s discovery in the 1970s that the brain events involved in initiating a free choice occur about a third of a second before we are aware of our making the choice.
  • Dainton, Barry. Time and Space, Second Edition, McGill-Queens University Press: Ithaca, 2010.
    • A survey of all the topics in this article, but at a deeper level.
  • Davies, Paul. About Time: Einstein’s Unfinished Revolution, Simon & Schuster, 1995.
    • An easy to read survey of the impact of the theory of relativity on our understanding of time.
  • Davies, Paul. How to Build a Time Machine, Viking Penguin, 2002.
    • A popular exposition of the details behind the possibilities of time travel.
  • Deutsch, David and Michael Lockwood, “The Quantum Physics of Time Travel,” Scientific American, pp. 68-74. March 1994.
    • An investigation of the puzzle of getting information for free by traveling in time.
  • Dowden, Bradley. The Metaphysics of Time: A Dialogue, Rowman & Littlefield Publishers, Inc. 2009.
    • An undergraduate textbook in dialogue form that covers most of the topics discussed in this encyclopedia article.
  • Dummett, Michael. “Is Time a Continuum of Instants?,” Philosophy, 2000, Cambridge University Press, pp. 497-515.
    • A constructivist model of time that challenges the idea that time is composed of durationless instants.
  • Earman, John. “Implications of Causal Propagation Outside the Null-Cone," Australasian Journal of Philosophy, 50, 1972, pp. 222-37.
    • Describes his rocket paradox that challenges time travel to the past.
  • Grünbaum, Adolf. “Relativity and the Atomicity of Becoming,” Review of Metaphysics, 1950-51, pp. 143-186.
    • An attack on the notion of time’s flow, and a defense of the treatment of time and space as being continua and of physical processes as being aggregates of point-events. Difficult reading.
  • Haack, Susan. Deviant Logic, Cambridge University Press, 1974.
    • Chapter 4 contains a clear account of Aristotle’s argument (in section 9c of the present article) for truth value gaps, and its development in Lukasiewicz’s three-valued logic.
  • Hawking, Stephen. “Chronology Protection Conjecture,” Physical Review D, 46, p. 603, 1992.
    • Reasons for the impossibility of time travel.
  • Hawking, Stephen. A Brief History of Time, Updated and Expanded Tenth Anniversary Edition, Bantam Books, 1996.
    • A leading theoretical physicist provides introductory chapters on space and time, black holes, the origin and fate of the universe, the arrow of time, and time travel. Hawking suggests that perhaps our universe originally had four space dimensions and no time dimension, and time came into existence when one of the space dimensions evolved into a time dimension. He calls this space dimension “imaginary time.”
  • Horwich, Paul. Asymmetries in Time, The MIT Press, 1987.
    • A monograph that relates the central problems of time to other problems in metaphysics, philosophy of science, philosophy of language and philosophy of action.
  • Katzenstein, Larry, ed. Scientific American Special Edition: A Matter of Time, vol. 16, no. 1, 2006.
    • A collection of Scientific American articles about time.
  • Krauss, Lawrence M. and Glenn D. Starkman, “The Fate of Life in the Universe,” Scientific American Special Edition: The Once and Future Cosmos, Dec. 2002, pp. 50-57.
    • Discusses the future of intelligent life and how it might adapt to and survive the expansion of the universe.
  • Kretzmann, Norman, “Omniscience and Immutability,” The Journal of Philosophy, July 1966, pp. 409-421.
    • If God knows what time it is, does this demonstrate that God is not immutable?
  • Lasky, Ronald C. “Time and the Twin Paradox,” in Katzenstein, 2006, pp. 21-23.
    • A short, but careful and authoritative analysis of the twin paradox, with helpful graphs showing how each twin would view his clock and the other twin’s clock during the trip. Because of the spaceship’s changing velocity by turning around, the twin on the spaceship has a shorter world-line than the Earth-based twin and takes less time than the Earth-based twin.
  • Le Poidevin, Robin and Murray MacBeath, The Philosophy of Time, Oxford University Press, 1993.
    • A collection of twelve influential articles on the passage of time, subjective facts, the reality of the future, the unreality of time, time without change, causal theories of time, time travel, causation, empty time, topology, possible worlds, tense and modality, direction and possibility, and thought experiments about time. Difficult reading for undergraduates.
  • Le Poidevin, Robin, Travels in Four Dimensions: The Enigmas of Space and Time, Oxford University Press, 2003.
    • A philosophical introduction to conceptual questions involving space and time. Suitable for use as an undergraduate textbook without presupposing any other course in philosophy. There is a de-emphasis on teaching the scientific theories, and an emphasis on elementary introductions to the relationship of time to change, the implications that different structures for time have for our understanding of causation, difficulties with Zeno’s Paradoxes, whether time passes, the nature of the present, and why time has an arrow. The treatment of time travel says, rather oddly, that time machines “disappear” and that when a “time machine leaves for 2101, it simply does not exist in the intervening times,” as measured from an external reference frame.
  • Lockwood, Michael, The Labyrinth of Time: Introducing the Universe, Oxford University Press, 2005.
    • A philosopher of physics presents the implications of contemporary physics for our understanding of time. Chapter 15, “Schrödinger’s Time-Traveller,” presents the Oxford physicist David Deutsch’s quantum analysis of time travel.
  • Markosian, Ned, “A Defense of Presentism,” in Zimmerman, Dean (ed.), Oxford Studies in Metaphysics, Vol. 1, Oxford University Press, 2003.
  • Maudlin, Tim. The Metaphysics Within Physics, Oxford University Press, 2007.
    • Chapter 4, “On the Passing of Time,” defends the dynamic theory of time’s flow, and argues that the passage of time is objective.
  • McTaggart, J. M. E. The Nature of Existence, Cambridge University Press, 1927.
    • Chapter 33 restates more clearly the arguments that McTaggart presented in 1908 for his A series and B series and how they should be understood to show that time is unreal. Difficult reading. The argument that a single event is in the past, is present, and will be future yet it is inconsistent for an event to have more than one of these properties is called "McTaggart's Paradox." The chapter is renamed "The Unreality of Time," and is reprinted on pp. 23-59 of (LePoidevin and MacBeath 1993).
  • Mellor, D. H. Real Time II, International Library of Philosophy, 1998.
    • This monograph presents a subjective theory of tenses. Mellor argues that the truth conditions of any tensed sentence can be explained without tensed facts.
  • Mozersky, M. Joshua. "The B-Theory in the Twentieth Century," in A Companion to the Philosophy of Time. Ed. by Heather Dyke and Adrian Bardon, John Wiley & Sons, Inc., 2013, pp. 167-182.
    • A detailed evaluation and defense of the B-Theory.
  • Nadis, Steve. "Starting Point," Discover, September 2013, pp. 36-41.
    • Non-technical discussion of the argument by cosmologist Alexander Vilenkin that the past of the multiverse must be finite but its future must be infinite.
  • Newton-Smith, W. H. The Structure of Time, Routledge & Kegan Paul, 1980.
    • A survey of the philosophical issues involving time. It emphasizes the logical and mathematical structure of time.
  • Norton, John. "Time Really Passes," Humana.Mente: Journal of Philosophical Studies, 13 April 2010.
    • Argues that "We don't find passage in our present theories and we would like to preserve the vanity that our physical theories of time have captured all the important facts of time. So we protect our vanity by the stratagem of dismissing passage as an illusion."
  • Øhrstrøm, P. and P. F. V. Hasle. Temporal Logic: from Ancient Ideas to Artificial Intelligence. Kluwer Academic Publishers, 1995.
    • An elementary introduction to the logic of temporal reasoning.
  • Perry, John. "The Problem of the Essential Indexical," Noûs, 13(1), (1979), pp. 3-21.
    • Argues that indexicals are essential to what we want to say in natural language; they cannot be eliminated in favor of B-theory discourse.
  • Pinker, Steven. The Stuff of Thought: Language as a Window into Human Nature, Penguin Group, 2007.
    • Chapter 4 discusses how the conceptions of space and time are expressed in language in a way very different from that described by either Kant or Newton. Page 189 says that in only half the world’s languages is the ordering of events expressed in the form of grammatical tenses. Chinese has no tenses.
  • Pöppel, Ernst. Mindworks: Time and Conscious Experience. San Diego: Harcourt Brace Jovanovich. 1988.
    • A neuroscientist explores our experience of time.
  • Prior, A. N. “Thank Goodness That’s Over,” Philosophy, 34 (1959), p. 17.
    • Argues that a tenseless or B-theory of time fails to account for our relief that painful past events are in the past rather than in the present.
  • Prior, A. N. Past, Present and Future, Oxford University Press, 1967.
    • A pioneering work in temporal logic, the symbolic logic of time, which permits propositions to be true at one time and false at another.
  • Prior, A. N. “Critical Notices: Richard Gale, The Language of Time,” Mind, 78, no. 311, 1969, pp. 453-460.
    • Contains his attack on the attempt to define time in terms of causation.
  • Prior, A. N. “The Notion of the Present,” Studium Generale, volume 23, 1970, pp. 245-8.
    • A brief defense of presentism, the view that the past and the future are not real.
  • Putnam, Hilary. "Time and Physical Geometry," The Journal of Philosophy, 64 (1967), pp. 240-246.
    • Comments on whether Aristotle is a presentist and why Aristotle was wrong if Relativity is right.
  • Russell, Bertrand. "On the Experience of Time," Monist, 25 (1915), pp. 212-233.
    • The classical tenseless theory.
  • Saunders, Simon. "How Relativity Contradicts Presentism," in Time, Reality & Experience edited by Craig Callender, Cambridge University Press, 2002, pp. 277-292.
    • Reviews the arguments for and against the claim that, since the present in the theory of relativity is relative to reference frame, presentism must be incorrect.
  • Savitt, Steven F. (ed.). Time’s Arrows Today: Recent Physical and Philosophical Work on the Direction of Time. Cambridge University Press, 1995.
    • A survey of research in this area, presupposing sophisticated knowledge of mathematics and physics.
  • Sciama, Dennis. “Time ‘Paradoxes’ in Relativity,” in The Nature of Time edited by Raymond Flood and Michael Lockwood, Basil Blackwell, 1986, pp. 6-21.
    • A good account of the twin paradox.
  • Shoemaker, Sydney. “Time without Change,” Journal of Philosophy, 66 (1969), pp. 363-381.
    • A thought experiment designed to show us circumstances in which the existence of changeless intervals in the universe could be detected.
  • Sider, Ted. “The Stage View and Temporary Intrinsics,” The Philosophical Review, 106 (2) (2000), pp. 197-231.
    • Examines the problem of temporary intrinsics and the pros and cons of four-dimensionalism.
  • Sklar, Lawrence. Space, Time, and Spacetime, University of California Press, 1976.
    • Chapter III, Section E discusses general relativity and the problem of substantival spacetime, where Sklar argues that Einstein’s theory does not support Mach’s views against Newton’s interpretations of his bucket experiment; that is, Mach’s argument against substantivalism fails.
  • Sorabji, Richard. Matter, Space, & Motion: Theories in Antiquity and Their Sequel. Cornell University Press, 1988.
    • Chapter 10 discusses ancient and contemporary accounts of circular time.
  • Steinhardt, Paul J. "The Inflation Debate: Is the theory at the heart of modern cosmology deeply flawed?" Scientific American, April, 2011, pp. 36-43.
    • Argues that the Big Bang Theory with inflation is incorrect and that we need a cyclic cosmology with an eternal series of Big Bangs and big crunches but with no inflation.
  • Thomson, Judith Jarvis. "Parthood and Identity across Time," Journal of Philosophy 80, 1983, 201-20.
    • Argues against four-dimensionalism and its idea of objects having infinitely many temporal parts.
  • Thorne, Kip S. Black Holes and Time Warps: Einstein’s Outrageous Legacy, W. W. Norton & Co., 1994.
    • Chapter 14 is a popular account of how to use a wormhole to create a time machine.
  • Van Fraassen, Bas C. An Introduction to the Philosophy of Time and Space, Columbia University Press, 1985.
    • An advanced undergraduate textbook by an important philosopher of science.
  • Veneziano, Gabriele. “The Myth of the Beginning of Time,” Scientific American, May 2004, pp. 54-65, reprinted in Katzenstein, 2006, pp. 72-81.
    • An account of string theory’s impact on our understanding of time’s origin. Veneziano hypothesizes that our Big Bang was not the origin of time but simply the outcome of a preexisting state.
  • Whitrow, G. J. The Natural Philosophy of Time, Second Edition, Clarendon Press, 1980.
    • A broad survey of the topic of time and its role in physics, biology, and psychology. Pitched at a higher level than the Davies books.

Author Information

Bradley Dowden
California State University, Sacramento
U. S. A.

Scientific Change

How do scientific theories, concepts and methods change over time? Answers to this question have historical parts and philosophical parts. There can be descriptive accounts of the recorded differences over time of particular theories, concepts, and methods: what might be called the shape of scientific change. Many stories of scientific change attempt to give more than statements of what, where and when change took place. Why this change then, and toward what end? By what processes did it take place? What is the nature of scientific change?

This article gives a brief overview of the most influential views on the shape and nature of change in science. Important thematic questions are: How gradual or rapid is scientific change? Is science really revolutionary? How radical is the change? Are periods in science incommensurable, or is there continuity between the first and latest scientific ideas? Is science getting closer to some final form, or merely moving away from a contingent, non-determining past? What role do the factors of community, society, gender, or technology play in facilitating or mitigating scientific change? The most important modern development in the topic is that none of these questions has the same answer for all sciences. When we speak of scientific change, it should be recognized that anything substantial can be said only at a fairly contextualized level of description of the practices of scientists at rather specific times and places.

Nonetheless, scientific change is connected with many other key issues in philosophy of science and broader epistemology, such as realism, rationality and relativism. The present article does not attempt to address them all. Higher-order debates regarding the methods of historiography or the epistemology of science, or the disciplinary differences between History and Philosophy, while important and interesting, represent an iteration of reflection on top of scientific change itself, and so go beyond the article’s scope.

Table of Contents

  1. If Science Changes, What is Science?
  2. History of Science and Scientific Change
  3. Philosophical Views on Change and Progress in Science
    1. Kuhn, Paradigms and Revolutions
      1. Key Concepts in Kuhn’s Account of Scientific Change
      2. Incommensurability as the Result of Radical Scientific Change
    2. Lakatos and Progressing and Degenerating Research Programs
    3. Laudan and Research Traditions
  4. The Social Processes of Change
    1. Fleck
    2. Hull’s Evolutionary Account of Scientific Change
  5. Cognitive Views on Scientific Change
    1. Cognitive History of Science
    2. Scientific Change and Science Education
  6. Further Reading and References
    1. Primary Sources
    2. Secondary Sources
      1. Concepts, Cognition and Change
      2. Feminist, Situated and Social Approaches
      3. The Scientific Revolution

1. If Science Changes, What is Science?

We begin with some organizing remarks. It is interesting to note at the outset the reflexive nature of the topic of scientific change. A main concern of science is understanding physical change, whether it be motions, growth, cause and effect, the creation of the universe or the evolution of species. Scientific views of change have influenced philosophical views of change and of identity, particularly among philosophers impressed by science's success at predicting and controlling change. These philosophical views are then reflected back, through the history and philosophy of science, as images of how science itself changes, of how its theories are created, evolve and die. Models of change from science—evolutionary, mechanical, revolutionary—often serve as models of change in science.

This makes it difficult to disentangle the actual history of science from our philosophical expectations about it. And the historiography and the philosophy of science do not always live together comfortably. Historians balk at the evaluative, forward-looking, and often necessitarian, claims of standard philosophical reconstructions of scientific events. Philosophers, for their part, have argued that details of the history of science matter little to a proper theory of scientific change, and that a distinction can and should be made between how scientific ideas are discovered and how they are justified. Beneath the ranging, messy, and contingent happenings which led to our current scientific outlook, there lies a progressive, systematically evolving activity waiting to be rationally reconstructed.

Clearly, to tell any story of ‘science changing’ means looking beneath the surface of those changes in order to find something that remains constant, the thing which remains science. Conversely, what one takes to be the demarcating criteria of science will largely dictate how one talks about its changes. What part of human history is to be identified with science? Where does science start and where does it end? The breadth of science has a dimension across concurrent events as well as across the past and future. That is, it has both synchronic (at a time) and diachronic (over time) dimensions. Science will consist of a range of contemporary events which need to be demarcated. But likewise, science has a temporal breadth: a beginning, or possibly several beginnings, and possibly several ends.

The synchronic dimension of science is one way views of scientific change can be distinguished. On the one hand, there are logical or rationalistic views according to which scientific activity can be reduced to a collection of objective, rational decisions of a number of individual scientists. On this view, the most significant changes in science can each be described through the logically-reconstructable actions and words of one historical figure, or at most a very few. According to many of the more recent views, on the other hand, an adequate picture of science cannot be formed with anything less than the full context of social and political structures: the personal, institutional, and cultural relations scientists are a part of. We look at some of these broader sociological views in the section on the social processes of change.

Historians and philosophers of science have also wanted to “broaden” science diachronically, to historicize its content, such that the justifications of science, or even its meanings, cannot be divorced from their past. We will begin with the most influential figure for history and philosophy of science in North America in the last half-century: Thomas Kuhn. Kuhn's work in the middle of the last century was primarily a reaction to the then prevalent, rationalistic and a-historical view described in the previous paragraph. Along with Kuhn, we describe the closely related views of Imre Lakatos and Larry Laudan. For an introduction to the most influential philosophical accounts of the diachronic development of science, see Losee 2004.

When Kuhn and the others advanced their new views on the development of science into Anglo-Saxon philosophy of science, history and sociology were already an important part of the landscape of Continental history and philosophy of science. A discussion of these views can be found as part of the sociology of science section as well. The article concludes with more recent naturalized approaches to scientific change, which turn to cognitive science for accounts of scientific understanding and how that understanding is formed and changed, as well as suggestions for further reading.

Science itself, at least in a form recognizable to us, is a twentieth century phenomenon. Although a matter of debate, the canonical view of the history of scientific change is that its seminal event is the one tellingly labeled the Scientific Revolution. It is usually dated to the 16th and 17th centuries. The first historiographies of science—as much construction of the revolution as they were documentation—were not far behind, coming in the eighteenth and nineteenth centuries. Professionalization of the history of science, characterized by reflections on the telling of the history of science, followed later. We begin our story there.

2. History of Science and Scientific Change

As history of science professionalized, becoming a separate academic discipline in the twentieth century, scientific change was seen early on as an important theme within the discipline. Admittedly, the idea of radical change was not a key notion for early practitioners of the field such as George Sarton (1884-1956), the father of history of science in the United States, but with the work of historians of science such as Alexandre Koyré (1892-1964), Herbert Butterfield (1900-1979) and A. Rupert Hall (1920-2009), radical conceptual transformations came to play a much more important role.

One of the early outcomes of this interest in change was the volume Scientific Change (Crombie, 1963) in which historians of science covering the span of science from the physical to the biological sciences, and the span of history from antiquity to modern science, all investigated the conditions for scientific change by examining cases from a multitude of periods, societies, and scientific disciplines. The introduction to Crombie's volume presented a large number of questions regarding scientific change that remained key issues in both history and philosophy of science for several decades:

What were the essential changes in scientific thought and how were they brought about? What was the part played in the initiation of change by mutations in fundamental ideas leading to new questions being asked, new problems being seen, new criteria of satisfactory explanation replacing the old? What was the part played by new technical inventions in mathematics and experimental apparatus; by developments in pure mathematics; by the refinements of measurement; by the transference of ideas, methods and information from one field of study to another? What significance can be given to the description and use of scientific methods and concepts in advance of scientific achievement? How have methods and concepts of explanation differed in different sciences? How has language changed in changing scientific contexts? What parts have chance and personal idiosyncrasy played in discovery? How have scientific changes been located in the context of general ideas and intellectual motives, and to what extent have extra-scientific beliefs given theories their power to convince? … How have scientific and technical changes been located in the social context of motives and opportunities? What value has been put on scientific activity by society at large, by the needs of industry, commerce, war, medicine and the arts, by governmental and private investment, by religion, by different states and social systems? To what external social, economic and political pressures have science, technology and medicine been exposed? Are money and opportunity all that is needed to create scientific and technical progress in modern society? (Crombie, 1963, p. 10)

Of particular interest among historians of science have been the changes associated with scientific revolutions and especially the period often referred to as the Scientific Revolution, seen as the sum of achievements in science from Copernicus to Newton (Cohen 1985; Hall 1954; Koyré 1965). The word ‘revolution’ had started being applied in the eighteenth century to the developments in astronomy and physics as well as the change in chemical theory which emerged with the work of Lavoisier in the 1770s, or the change in biology which was initiated by Darwin’s work in the mid-nineteenth century. These were fundamental changes that overturned not only the reigning theories but also carried with them significant consequences outside their respective scientific disciplines. In most of the early work in history of science, scientific change in the form of scientific revolutions was something which happened only rarely. This view was changed by the historian and philosopher of science Thomas S. Kuhn whose 1962 monograph The Structure of Scientific Revolutions (1970) came to influence philosophy of science for decades. Kuhn wanted in his monograph to argue for a change in the philosophical conceptions of science and its development, but based on historical case studies. The notion of revolutions that he used in Structure included not only fundamental changes of theory that had a significant influence on the overall world view of both scientists and non-scientists, but also changes of theory whose consequences remained solely within the scientific discipline in which the change had taken place. This considerably widened the notion of scientific revolutions compared to earlier historians and initiated discussions among both historians and philosophers on the balance between continuity and change in the development of science.

3. Philosophical Views on Change and Progress in Science

In the British and North American schools of philosophy of science, scientific change did not become a major topic until the 1960s, when historically inclined philosophers of science, including Thomas S. Kuhn (1922-1996), Paul K. Feyerabend (1924-1994), N. Russell Hanson (1924-1967), Michael Polanyi (1891-1971), Stephen Toulmin (1922-2009) and Mary Hesse (*1924) started questioning the assumptions of logical positivism, arguing that philosophy of science should be concerned with the historical structure of science rather than with an ahistorical logical structure which they found to be a chimera. The occupation with history led naturally to a focus on how science develops, including whether science progresses incrementally or through changes which represent some kind of discontinuity.

Similar questions had also been discussed among Continental scholars. The development of the theory of relativity and of quantum mechanics in the beginning of the twentieth century suggested that empirical science could overturn deeply held intuitions and introduce counter-intuitive new concepts and ideas; and several European philosophers, among them the German neo-Kantian philosopher Ernst Cassirer (1874-1945), directed their work towards rejecting Kant’s absolute categories in favor of categories that may change over time. In France, the historian and philosopher of science Gaston Bachelard (1884-1962) also noted that what Kant had taken to be absolute preconditions for knowledge had turned out wrong in the light of modern physics. On Bachelard’s view, what had seemed to be absolute preconditions for knowledge were instead merely contingent conditions. These conditions were still required for scientific reasoning and therefore, Bachelard concluded, a full account of scientific reasoning could only be derived from reflections upon its historical conditions and development. Based on the analysis of the historical development of science, Bachelard advanced a model of scientific change according to which the conceptions of nature are from time to time replaced by radical new conceptions – what Bachelard called epistemological breaks.

Bachelard’s view was later developed and modified by the historian and philosopher of science, and student of Bachelard, Georges Canguilhem (1904-1995) and by the philosopher and social historian, and student of Canguilhem, Michel Foucault (1926-1984). Beyond the teacher-student connections, there are other commonalities which unify this tradition. In North America and England, among those who wanted to make philosophy more like science, or to import into philosophical practice lessons from the success of science, the exemplar was almost always physics. The most striking and profound advances in science seemed to be, after all, in physics, namely the quantum and relativity revolutions. But on the Continent, model sciences were just as often linguistics or sociology, biology or anthropology, and not limited to those. Canguilhem's interest in changing notions of the normal versus the pathological, for example, coming from an interest in medicine, typified the more human-centered theorising of the tradition. What we as humans know, how we know it, and how we successfully achieve our aims, are the guiding questions, not how to escape our human condition or situatedness.

Foucault described his project as an archaeology of the history of human thought and its conditions. He compared his project to Kant’s critique of reason, but with the difference that Foucault’s interest was in a historical a priori; that is, with what seem to be for a given period the necessary conditions governing reason, and how these constraints have a contingent historical origin. Hence, in his analysis of the development of the human sciences from the Renaissance to the present, Foucault described various so-called epistemes that determined the conditions for all knowledge of their time, and he argued that the transition from one episteme to the next happens as a break that entails radical changes in the conception of knowledge. Michael Friedman's work on the relativized and dynamic a priori can be seen as a continuation of this thread (Friedman 2001). For a detailed account of the work of Bachelard, Canguilhem and Foucault, see Gutting (1989).

With the advent of Kuhn’s Structure, “non-Continental” philosophy of science also started focusing in its own way on the historical development of science, often apparently unaware of the earlier tradition. In the decades that followed, alternative models were developed to describe how theories supersede their predecessors, and whether progress in science is gradual and incremental or whether it is discontinuous. Among the key contributions to this discussion, besides Kuhn’s famous paradigm-shift model, were Imre Lakatos’ (1922-1974) model of progressing and degenerating research programs and Larry Laudan’s (*1941) model of successive research traditions.

a. Kuhn, Paradigms and Revolutions

One of the key contributions that provoked interest in scientific change among philosophers of science was Thomas S. Kuhn’s seminal monograph The Structure of Scientific Revolutions from 1962. The aim of this monograph was to question the view that science is cumulative and progressive, and Kuhn opened with: “History, if viewed as a repository for more than anecdote or chronology, could produce a decisive transformation in the image of science by which we are now possessed” (p. 1). History was expected to do more than just chronicle the successive increments of, or impediments to, our progress towards the present. Instead, historians and philosophers should focus on the historical integrity of science at a particular time in its development, and should analyze science as it developed. Instead of describing a cumulative, teleological development toward the present, history of science should see science as developing from a given point in history. Kuhn expected a new image of science would emerge from this diachronic historiography. In the rest of Structure he used historical examples to question the view of science as a cumulative development in which scientists gradually add new pieces to the ever-growing aggregate of scientific knowledge, and instead he described how science develops through successive periods of tradition-preserving normal science and tradition-shattering revolutions. For introductions to Kuhn’s philosophy of science, see for example Andersen 2001, Bird 2000, and Hoyningen-Huene 1993.

i. Key Concepts in Kuhn’s Account of Scientific Change

On Kuhn’s model, science proceeds in key phases. The predominant phase is normal science which, while progressing successfully in its aims, inherently generates what Kuhn calls anomalies. In brief, anomalies lead to crisis and extraordinary science, followed by revolution, and finally a new phase of normal science.

Normal science is characterized by a consensus which exists throughout the scientific community as to (a) the concepts used in communication among scientists, (b) the problems which can meaningfully be formulated as relevant research problems, and (c) a set of exemplary problem solutions that serve as models in solving new problems. Kuhn first introduced the notion 'paradigm' to denote these shared communal aspects, and also the tools used by that community for solving its research problems. Because so much was apparently captured by the term ‘paradigm’, Kuhn was criticized for using the term in ambiguous ways (see especially Masterman 1970). He later offered the alternative notion 'disciplinary matrix', covering (a) symbolic generalizations, or laws in their most fundamental forms, (b) beliefs about which objects and phenomena exist in the world, (c) values by which the quality of research can be evaluated, and (d) exemplary problems and problem situations. In normal science, scientists draw on the tools provided by the disciplinary matrix, and they expect the solutions of new problems to be in consonance with the descriptions and solutions of the problems that they have previously examined. But sometimes these expectations are violated. Problems may turn out not to be solvable in an acceptable way, and they then represent anomalies for the reigning theories.

Not all anomalies are equally severe. Some discrepancy can always be found between theoretical predictions and experimental findings, and this does not necessarily challenge the foundations of normal science. Hence, some anomalies can be neglected, at least for some time. Others may find a solution within the reigning theoretical framework. Only a small number will be so severe and so persistent that they suggest the tools provided by the accepted theories must be given up, or at least be seriously modified. Science has then entered the crisis phase of Kuhn's model. Even in crisis, revolution may not be immediately forthcoming. Scientists may “agree” that no solution is likely to be found in the present state of their field and simply set the problems aside for future scientists to solve with more developed tools, while they return to normal science in its present form. More often, though, when the crisis has become severe enough to call the foundations into question, and the anomalies can be solved by a new theory, that theory gradually gains acceptance until eventually a new consensus is established among members of the scientific community regarding the new theory. Only in this case has a scientific revolution occurred.

Importantly, though, even severe anomalies are not simply falsifying instances. Severe anomalies cause scientists to question the accepted theories, but the anomalies do not lead the scientists to abandon the paradigm without an alternative to replace it. This raises a crucial question regarding scientific change on Kuhn's model: where do new theories come from? Kuhn said little about this creative aspect of scientific change, a topic that later became central to cognitively inclined philosophers of science working on scientific change (see the section on Cognitive Views below). Kuhn described merely how severe anomalies would become the fixation point for further research, while attempts to solve them might gradually diverge more and more from the solution hitherto accepted as exemplary, until, in the course of this development, embryonic forms of alternative theories were born.

ii. Incommensurability as the Result of Radical Scientific Change

For Kuhn the relation between normal science traditions separated by a scientific revolution cannot be described as incorporation of one into the other, or as incremental growth. To describe the relation, Kuhn adopted the term ‘incommensurability’ from mathematics, claiming that the new normal-scientific tradition which emerges from a scientific revolution is not only incompatible but often actually incommensurable with that which has gone before.

Kuhn's notion of incommensurability covered three different aspects of the relation between the pre- and post-revolutionary normal science traditions: (1) a change in the set of scientific problems and the way in which they are attacked, (2) conceptual changes, and (3) a change, in some sense, in the world of the scientists’ research. This last, “world-changing” aspect is the most fundamental aspect of incommensurability. However, it is a matter of great debate exactly how strongly we should take Kuhn's meaning, for instance when he stated that “though the world does not change with a change of paradigm, the scientist afterwards works in a different world” (p. 121). To make sense of these claims it is necessary to distinguish between two different senses of the term ‘world’: the world as the independent object which scientists investigate and the world as the perceived world in which scientists practice their trade.

In Structure, Kuhn argued for incommensurability in perceptual terms. Drawing on results from psychological experiments showing that subjects’ perceptions of various objects were dependent on their training and experience, Kuhn suspected that something like a paradigm was prerequisite to perception itself and that, therefore, different normal science traditions would cause scientists to perceive differently. But when it comes to visual gestalt-switch images, one has recourse to the actual lines drawn on the paper. Contrary to this possibility of employing an ‘external standard’, Kuhn claimed that scientists can have no recourse above or beyond what they see with their eyes and instruments. For Kuhn, the change in perception cannot be reduced to a change in the interpretation of stable data, simply because stable data do not exist. Kuhn thus strongly attacked the idea of a neutral observation-language, an attack similarly launched by other scholars during the late 1950s and early 1960s, most notably Hanson (Hanson 1958).

These aspects of incommensurability have important consequences for the communication between proponents of competing normal science traditions and for the choice between such traditions. Recognizing different problems and adopting different standards and concepts, scientists may talk past each other when debating the relative merits of their respective paradigms. But if they do not agree on the list of problems that must be solved or on what constitutes an acceptable solution, there can be no point-by-point comparison of competing theories. Instead, Kuhn claimed that the role of paradigms in theory choice was necessarily circular in the sense that the proponents of each would use their own paradigm to argue in that paradigm’s defense. Paradigm choice is a conversion that cannot be forced by logic and neutral experience.

This view has led many critics of Kuhn to the misunderstanding that he saw paradigm choice as devoid of rational elements. However, Kuhn did emphasize that although paradigm choice cannot be justified by proof, this does not mean that arguments are not relevant or that scientists are not rationally persuaded to change their minds. On the contrary, Kuhn argued that “Individual scientists embrace a new paradigm for all sorts of reasons and usually for several at once” (Kuhn 1996, p. 152). According to Kuhn, such arguments are, first of all, about whether the new paradigm can solve the problems that have led the old paradigm to a crisis, whether it displays a quantitative precision strikingly better than its older competitor, and whether in the new paradigm or with the new theory there are predictions of phenomena that had been entirely unsuspected while the old one prevailed. Aesthetic arguments, based on simplicity for example, may enter as well.

Another common misunderstanding of Kuhn’s notion of incommensurability is that it should be taken to imply a total discontinuity between the normal science traditions separated by a scientific revolution. Kuhn emphasized, rather, that a new paradigm often incorporates much of the vocabulary and apparatus, both conceptual and manipulative, of its predecessor. Paradigm shifts may be “non-cumulative developmental episodes …,” but the former paradigm can be replaced “... in whole or in part …” (Ibid. p. 2). In this way, parts of the achievements of a normal science tradition will turn out to be permanent, even across a revolution. “[P]ostrevolutionary science invariably includes many of the same manipulations, performed with the same instruments and described in the same terms ...” (Ibid. pp. 129-130). Incommensurability is a relation that holds only between minor parts of the object domains of two competing theories.

b. Lakatos and Progressing and Degenerating Research Programs

Lakatos agreed with Kuhn’s insistence on the tenacity of some scientific theories and the rejection of naïve falsification, but he was opposed to Kuhn’s account of the process of change, which he saw as “a matter for mob psychology” (Lakatos, 1970, p. 178). Lakatos therefore sought to improve upon Kuhn’s account by providing a more satisfactory methodology of scientific change, along with a meta-methodological justification of the rationality of that method, both of which were seen to be either lacking or significantly undeveloped in Kuhn’s early writings. On Lakatos’ account, a scientific research program consists of a central core that is taken to be inviolable by scientists working within the research program, and a collection of auxiliary hypotheses that are continuously developing as the core is applied. In this way, the methodological rules of a research program divide into two different kinds: a negative heuristic that tells the scientists which paths of research to avoid, and a positive heuristic that tells the scientists which paths to pursue. On this view, all tests are necessarily directed at the auxiliary hypotheses which come to form a protective belt around the hard core of the research program.

Lakatos aims to reconstruct changes in science as occurring within research programs. A research program is constituted by the series of theories that result from adjustments to the protective belt and that all share a hard core. As adjustments are made in response to problems, new problems arise, and over a series of theories there will be a collective problem-shift. Any series of theories is theoretically progressive, or constitutes a theoretically progressive problem-shift, if and only if there is at least one theory in the series which has some excess empirical content over its predecessor. If this excess empirical content is also corroborated, the series of theories is empirically progressive. A problem-shift is progressive, then, if it is both theoretically and empirically progressive; otherwise it is degenerate. A research program is successful if it leads to progressive problem-shifts and unsuccessful if it leads to degenerating problem-shifts. The further aim of Lakatos’ account, in other words, is to discover, through reconstruction in terms of research programs, where progress is made in scientific change.

The rationally reconstructive aspect of Lakatos’ account is the target of criticism. The notion of empirical content, for instance, carries a heavy burden in the account. To assess the progressiveness of a program, one would seem to need a measure of the empirical content of theories in order to judge when there is excess content. Without some such measure, Lakatos' methodology is dangerously close to being vacuous or ad hoc.

We can instead take the increase in empirical content to be a meta-methodological principle, one which dictates an aim for scientists (that is, to increase empirical knowledge), while cashing this out at the methodological level by identifying progress in research programs with making novel predictions. The importance of novel predictions, in other words, can be justified by their leading to an increase in the empirical content of the theories of a research program. A problem-shift which results in novel predictions can be taken to entail an increase in empirical content. It remains a worry, however, whether such an inference is warranted, since it seems to simply assume novelty and cumulativity go together unproblematically. That they might not was precisely Kuhn's point.

A second objection is that Lakatos' reconstruction of scientific change through appeal to a unified method runs counter to the prevailing attitude among philosophers of science from the second half of the twentieth century on, according to which there is no unified method for all of science. At best, anything they all have in common methodologically will be so general as to be unhelpful or uninteresting.

At any rate, Lakatos does offer us a positive heuristic for the description and even explanation of scientific change. For him, change in science is a difficult and delicate thing, requiring balance and persistence. “Purely negative, destructive criticism, like ‘refutation’ or demonstration of an inconsistency does not eliminate a program. Criticism of a program is a long and often frustrating process and one must treat budding programs leniently. One may, of course, whop up on [criticize] the degeneration of a research program, but it is only constructive criticism which, with the help of rival research programs, can achieve real successes; and dramatic spectacular results become visible only with hindsight and rational reconstruction” (Lakatos, 1970, p. 179).

c. Laudan and Research Traditions

In his Progress and Its Problems: Towards a Theory of Scientific Growth (1977), Laudan defined a research tradition as a set of general assumptions about the entities and processes in a given domain and about the appropriate methods to be used for investigating the problems and constructing the theories in that domain. Such research traditions should be seen as historical entities created and articulated within a particular intellectual environment, and as historical entities they would “wax and wane” (p. 95). On Laudan’s view, it is important to consider scientific change both as changes that may appear within a research tradition and as changes of the research tradition itself.

The key engine driving scientific change for Laudan is problem solving. Changes within a research tradition may be minor modifications of subordinate, specific theories, such as modifications of boundary conditions, revisions of constants, refinements of terminology, or expansion of a theory’s classificatory network to encompass new discoveries. Such changes solve empirical problems, essentially those problems Kuhn conceives of as anomalies. But, contrary to Kuhn's normal science and to Lakatos' research programs, Laudan held that changes within a research tradition might also involve changes to its most basic core elements. Severe anomalies which are not solvable merely by modification of specific theories within the tradition may be seen as symptoms of a deeper conceptual problem. In such cases scientists may instead explore what sorts of (minimal) adjustments could be made in the deep-level methodology or ontology of that research tradition (p. 98). When Laudan looked at the history of science, he saw Aristotelians who had abandoned the Aristotelian doctrine that motion in a void is impossible, and Newtonians who had abandoned the Newtonian demand that all matter has inertial mass, and he saw no reason to claim that they were no longer working within those research traditions.

Solutions to conceptual problems may even result in a theory with less empirical support and still count as progress since it is overall problem solving effectiveness (not all problems are empirical ones) which is the measure of success of a research tradition (Laudan 1996). Most importantly for Laudan, if there are what can be called revolutions in science, they reflect different kinds of problems, not a different sort of activity. David Pearce calls this Laudan's methodological monism (see Pearce 1984). For Kuhn and Lakatos, identification of a research tradition (or program or paradigm) could be made at the level of specific invariant, non-rejectable elements. For Laudan, there is no such class of sacrosanct elements within a research tradition—everything is open to change over time. For example, while absolute time and space were seen as part of the unrejectable core of Newtonian physics in the eighteenth century, they were no longer seen as such a century later. This leaves a dilemma for Laudan’s view. If research traditions undergo deep-level transformations of their problem solving apparatus this would seem to constitute a significant change to the problem solving activity that may warrant considering the change the basis of a new research tradition. On the other hand, if the activity of problem solving is strong enough to provide the identity conditions of a tradition across changes, consistency might force us to identify all problem solving activity as part of one research tradition, blurring distinctions between science and non-science. Distinguishing between a change within a research tradition and the replacement of a research tradition with another seems both arbitrary and open-ended. One way of solving this problem is by turning from just internal characteristics of science to external factors of social and historical context.

4. The Social Processes of Change

Science is not just a body of facts or sets of sentences. However one characterizes its content, that content must be embodied in institutions and practices comprised of scientists themselves. An important question with respect to scientific change, then, is how “science” is constructed out of scientists, and which unit of analysis (the individual scientist or the community) is the proper one for understanding the dynamic of scientific change. Popper's falsificationism was very much a matter of personal responsibility and reflection. Kuhn, on the other hand, saw scientific change as a change of community and generations. While Structure may have been largely responsible for making North American philosophers aware of the importance of historical and social context in shaping scientific change, Kuhn was certainly not the first to theorize about it. Kuhn himself recognized his views in the earlier work of Ludwik Fleck (see for example Brorson and Andersen 2001, Babich 2003 and Mössner 2011 for comparisons between the views of Kuhn and Fleck).

a. Fleck

As early as the mid-1930s, Ludwik Fleck (1896-1961) gave an account of how thoughts and ideas change through their circulation within the social strata of a thought-collective (Denkkollektiv) and how this thought-traffic contributes to the process of verification. Drawing on a case study from medicine on the development of a diagnostic test for syphilis, Fleck argued in his 1935 monograph Genesis and Development of a Scientific Fact that a thought-collective is a functional unit in which people who interact intellectually are tied together through a particular ‘thought-style’ that forces narrow constraints upon the thinking of the individual. The thought-style is dogmatically transmitted from one generation to the next, by initiation, training, education or other devices whose aim is introduction into the collective. Most people participate in numerous thought-collectives; individuals therefore possess several overlapping thought-styles and may become carriers of influence between the various thought-collectives in which they participate. This traffic of thoughts outside the collective is linked to the most outstanding alterations in thought-content. The ensuing modification and assimilation according to the foreign thought-style is a significant source of divergent thinking. According to Fleck, any circulation of thoughts therefore also causes transformation of the circulated thought.

In Kuhn’s Structure, the distinction between the individual scientist and the community as the agent of change was not quite clear, and Kuhn later regretted having used the notion of a gestalt switch to characterize changes in a community because “communities do not have experiences, much less gestalt switches.” Consequently, he realized that “to speak, as I repeatedly have, of a community’s undergoing a gestalt switch is to compress an extended process of change into an instant, leaving no room for the microprocesses by which the change is achieved” (Kuhn 1989, p. 50). Fleck, on the other hand, rather than helping himself to an unexamined notion of communal change, made the process by which the individual interacts with the collective central to his account of scientific development and the joint construction of scientific thought. What the accounts have in common is a view that the social plays a role in scientific change through the social shaping of science content. It is not a relation between scientist and physical world which is constitutive of scientific knowledge, but a relation between the scientists and the discipline to which they belong. That relation can be restrictive of change in science. It can also provide the dynamics for change.

b. Hull’s Evolutionary Account of Scientific Change

Several philosophers of science have held the view that the dynamics of scientific change can be seen as an evolutionary process in which some kind of selection plays a central role. One of the most detailed evolutionary accounts of scientific change has been provided by David Hull (1935-2010). On Hull's account of scientific change, the development of science is a function of the interplay between cooperation and competition for credit among scientists. Hence, selection in the form of citations plays a central role in this account.

The basic structure of Hull’s account is that, for the content elements of science (problems and their solutions, accumulated data, but also beliefs about the goals of science, proper ways to realize these goals, and so forth) to survive in science, they must be transmitted more or less intact through history. That is, they must be seen as replicators that pass on their structure in successive replication. Hence, conceptual replication is a matter of information being transmitted largely intact by different vehicles. These vehicles of transmission may be media such as books or journals, but also scientists themselves. Whereas books and journals are passive vehicles, scientists are active in testing and changing the transmitted ideas. They are therefore not only vehicles of transmission but also interactors, interacting with their environment in a way that causes replication to be differential and hence enables scientific change.

Hull did not elaborate much on the inner structure of differential replication, apart from arguing that the underdetermination of theory by observation made it possible. Instead, the focus of his account is on the selection mechanism that can cause some lineages of scientific ideas to cease and others to continue. First, scientists tend to behave in ways that increase their conceptual fitness. Scientists want their work to be accepted, which requires that they gain support from other scientists. One kind of support is to show that their work rests on preceding research. But that is at the same time a decrease in originality. There is a trade-off between credit and support. Scientists whose support is worth having are likely to be cited more frequently.

Second, this social process is highly structured. Scientists tend to organize into tightly knit research groups in order to develop and disseminate a particular set of views. Few scientists have all the skills and knowledge necessary to solve the problems that they confront; they therefore tend to form research groups of varying degrees of cohesiveness. Cooperating scientists may often share ideas that are identical in descent, and transmission of their contributions can be viewed as similar to kin selection. In the wider scientific community, scientists may form a deme in the sense that they use the ideas of each other much more frequently than the ideas of scientists outside the community.

Initially, criticism and evaluation come from within a research group. Scientists expose their work to severe tests prior to publication, but some things are taken so much for granted that it never occurs to them to question them. After publication, criticism shifts to scientists outside the group, especially opponents who are likely to have different (though equally unnoticed) presuppositions. The self-correction of science depends on other scientists having different perspectives and different career interests: scientists’ career interests are not damaged by refuting the views of their opponents.

5. Cognitive Views on Scientific Change

Scientific change received new interest during the 1980s and 1990s with the emergence of cognitive science, a field that draws on cognitive psychology, cognitive anthropology, linguistics, philosophy, artificial intelligence and neuroscience. Historians and philosophers of science adapted results from this interdisciplinary work to develop new approaches to their field. Among the approaches are Paul Churchland’s (b. 1942) neurocomputational perspective (Churchland, 1989; Churchland, 1992), Ronald Giere’s (b. 1938) work on cognitive models of science (Giere, 1988), Nancy Nersessian’s (b. 1947) cognitive history of science (Nersessian, 1984; Nersessian, 1992; Nersessian, 1995a; 1995b), and Paul Thagard’s (b. 1950) computational philosophy of science (Thagard, 1988; Thagard, 1992). Rather than explaining scientific change in terms of a priori principles, these new approaches aim at being naturalized: they draw on cognitive science for insights into how humans generally construct and develop conceptual systems, and they use these insights in analyses of scientific change as conceptual change. (For an overview of research in conceptual change, see Vosniadou, 2008.)

a. Cognitive History of Science

Much of the early work on conceptual change emphasized the discontinuous character of major changes by using metaphors like ‘gestalt switch’, indicating that such major changes happen all at once. This idea had originally been introduced by Kuhn, but in his later writings he admitted that his use of the gestalt switch metaphor had its origin in his experience as a historian working backwards in time and that, consequently, it was not necessarily suitable for describing the experience of the scientists taking part in scientific development. Instead of dramatic gestalt shifts, it is equally plausible that for the historical actors there exist micro-processes in their conceptual development. The development of science may happen stepwise with minor changes and yet still sum up over time to something that appears revolutionary to the historian looking backward and comparing the original conceptual structures to the end product of subsequent changes. Kuhn realized this, but also saw that his own work did not offer any details on how such micro-processes would work, though it did leave room for their exploration (Kuhn 1989).

Exploration of conceptual microstructures has been one of the main issues within the cognitive history and philosophy of science. Historical case studies of conceptual change have been carried out by many scholars, including Nersessian, Thagard, and the Andersen-Barker-Chen group (see for example Nersessian, 1984; Thagard, 1992; Andersen, Barker, and Chen, 2006).

Some of the early work in cognitive history and philosophy of science focused on mapping conceptual structures at different stages during scientific change (see for example Thagard, 1990; Thagard and Nowak, 1990; Nersessian and Resnick, 1989) and developing typologies of conceptual change in terms of their degree of severity (Thagard, 1992). These approaches are useful for comparing different stages of scientific change and for discussing such issues as incommensurability. However, they do not provide much detail on the creative process through which changes come about.

Other lines of research have focused on the reasoning processes that are used in creating new concepts during scientific change. One of the early contributions to this line of work came from Shapere, who argued that, as concepts evolve, chains of reasoning connect the successive versions of a concept. These chains of reasoning therefore also establish continuity in scientific change, and this continuity can only be fully understood by analysis of the reasons that motivated each step in the chain of changes (Shapere 1987a; 1987b). Over the last two decades, this approach has been extended and substantiated by Nersessian (2008a; 2008b), whose work has focused on the nature of the practices employed by scientists in creating, communicating and replacing scientific representations within a given scientific domain. She argues that conceptual change is a problem-solving process. Model-based reasoning processes, especially, are used to facilitate and constrain abstraction and information from multiple sources during this process.

b. Scientific Change and Science Education

Aiming at insights into general mechanisms of conceptual development, some of the cognitive approaches have been directed toward investigating not only the development of science, but also how sciences are learned. During the 1980s and early 1990s, several scholars argued that conceptual divides of the same kind as described by Kuhn’s incommensurability thesis might exist in science education between teacher and student. Science teaching should, therefore, address these misconceptions in an attempt to facilitate conceptual change in students. Part of this research incorporated the (controversial) thesis that the development of ideas in students mirrors the development of ideas in the history of science, that is, that cognitive ontogeny recapitulates scientific phylogeny. For the field of mechanics in particular, research was done to show that children’s naïve beliefs parallel early scientific beliefs, such as impetus theories (Champagne, Klopfer, and Anderson, 1980; Clement, 1983; McCloskey, 1983). However, most research went beyond the search for analogies between students’ naïve views and historically held beliefs. Instead, researchers carried out material investigations of the cognitive processes employed by scientists in constructing scientific concepts and theories more generally, through the available historical records, focusing on the kinds of reasoning strategies communicated in those records (see Nersessian, 1992; Nersessian, 1995a). Thus, this work still assumed that the cognitive activities of scientists in their construction of new scientific concepts were relevant to learning, but it marked a return to a view of the relevance of the history of science as a repository of case studies demonstrating how scientific concepts are constructed and changed. In assuming a conceptual continuity between scientific understanding “then and now,” the cognitive approach had moved away from the Kuhnian emphasis on incommensurability and gestalt shift conceptual change.

6. Further Reading and References

It is impossible to disentangle entirely the history and philosophy of scientific change from a great number of other issues and disciplines. We have not addressed here, for instance, the epistemology of science or the role of experiments (or of thought experiments) in science. Whether science, or knowledge in general, is approaching truth, tracking truth, or approximating to truth is a question taken up in epistemology. For more on those issues one should consult the relevant references. Whether science progresses (and not just changes) is a question which supports its own literature as well. Many iterations of interpretations, criticism and replies to challenges of incommensurability, non-cumulativity, and irrationality of science have been given. Beliefs in scientific progress founded on a naïve realism, according to which science is getting ever closer to a literally true picture of the world, have been criticized soundly. A simple version of the criticism is the pessimistic meta-induction: every scientific image of reality in the past has been proven wrong, therefore all future scientific images will be wrong (see Putnam 1978; Laudan 1984). In response to challenges to realism, much attention has been paid to structural realism, an attempt to describe some underlying mathematical structure which is preserved even across major theory changes. Past theories were not entirely wrong, on this view, and not entirely discarded, because they had some of the structure correct, albeit wrongly interpreted or embedded in a mistaken ontology or broader world view which has since been abandoned.

On the question of unity of science, on whether the methods of science are universal or plural, and whether they are rational, see the references given for Cartwright (2007), Feyerabend (1974), Mitchell (2000; 2003), and Kellert, et al. (2006). For feminist criticisms and alternatives to traditional philosophy and history of science the interested reader should consult Longino (1990; 2002), Gary, et al. (1996), Keller, et al. (1996), and Ruetsche (2004). Clough (2004) puts forward a program combining feminism and naturalism. Among twenty-first century approaches to the historicity of science there are Friedman's dynamic a priori approach (Friedman 2001), the evolving subject-object relation of McGuire and Tuchanska (2000), and the complementary science of Hasok Chang (2004).

Finally, on the topic of the Scientific Revolution, there are the standard Cohen (1985), Hall (1954) and Koyré (1965); but for subsequent discussion of the appropriateness of revolution as a metaphor in the historiography of science we recommend the collection Rethinking the Scientific Revolution, edited by Osler (2000).

a. Primary Sources

  • Crombie, A. C. (1963). Scientific Change: Historical studies in the intellectual, social and technical conditions for scientific discovery and technical invention, from antiquity to the present. London: Heinemann.
  • Feyerabend, P. (1974). Against Method. London: New Left Books.
  • Feyerabend, P. (1987). Farewell to Reason. London: Verso.
  • Fleck, L. (1979). Genesis and Development of a Scientific Fact (edited by T.J. Trenn and R.K. Merton, foreword by Thomas Kuhn). Chicago: University of Chicago Press.
  • Hull, D.L. (1988). Science as a Process: An Evolutionary Account of the Social and Conceptual Development of Science. Chicago: University of Chicago Press.
  • Kuhn, T. S. (1970). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Kuhn, T. S. (1989). Speaker’s Reply. In S. Allén (Ed.), Possible Worlds in Humanities, Arts and Sciences. Berlin: de Gruyter. 49-51.
  • Lakatos, I. (1970). Falsification and the Methodology of Scientific Research Programs. In I. Lakatos and A. Musgrave, eds., Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press. 91-196.
  • Laudan, L. (1977). Progress and Its Problems: Towards a Theory of Scientific Growth. Berkeley: University of California Press.
  • Laudan, L. (1996). Beyond Positivism and Relativism: Theory, Method, and Evidence. Boulder: Westview Press.
  • Toulmin, S. (1972). Human Understanding: The Collective Use and Evolution of Concepts. Princeton: Princeton University Press.

b. Secondary Sources

  • Andersen, H. (2001). On Kuhn. Belmont, CA: Wadsworth.
  • Babich, B. E. (2003). From Fleck’s Denkstil to Kuhn’s paradigm: conceptual schemes and incommensurability. International Studies in the Philosophy of Science 17: 75-92.
  • Bird, A. (2000). Thomas Kuhn. Chesham: Acumen.
  • Brorson, S. and H. Andersen (2001). Stabilizing and changing phenomenal worlds: Ludwik Fleck and Thomas Kuhn on scientific literature. Journal for General Philosophy of Science 32: 109-129.
  • Cartwright, Nancy (2007). Hunting Causes and Using Them. Cambridge: Cambridge University Press.
  • Chang, H. (2004). Inventing Temperature: Measurement and Scientific Progress. Oxford: Oxford University Press.
  • Clough, S. (2004). Having It All: Naturalized Normativity in Feminist Science Studies. Hypatia 19(1): 102-118.
  • Feyerabend, P. K. (1981). Explanation, reduction and empiricism. In Realism, Rationalism and Scientific Method: Philosophical Papers. Volume 1. Cambridge: Cambridge University Press. 44-96.
  • Friedman, M. (2001). Dynamics of Reason. Stanford: CSLI Publications.
  • Gutting, G. (1989). Michel Foucault's Archaeology of Scientific Reason. Cambridge: Cambridge University Press.
  • Gutting, G. (2005). Continental Philosophy of Science. Oxford: Blackwell.
  • Hall, A.R. (1954). The Scientific Revolution 1500-1800. Boston: Beacon Press.
  • Hoyningen-Huene, P. (1993). Reconstructing Scientific Revolutions, Chicago: University of Chicago Press.
  • Losee, J. (2004). Theories of Scientific Progress. London: Routledge.
  • McGuire, J. E. and Tuchanska, B. (2000). Science Unfettered. Athens: Ohio University Press.
  • Mössner, N. (2011). Thought styles and paradigms – a comparative study of Ludwik Fleck and Thomas S. Kuhn, Studies in History and Philosophy of Science 42: 362-371.

i. Concepts, Cognition and Change

  • Andersen, H., Barker, P., and Chen, X. (2006). The Cognitive Structure of Scientific Revolutions. Cambridge: Cambridge University Press.
  • Champagne, A. B., Klopfer, L. E., and Anderson, J. (1980). Factors Influencing Learning of Classical Mechanics. American Journal of Physics, 48, 1074-1079.
  • Churchland, P. M. (1989). A Neurocomputational Perspective. The Nature of Mind and the Structure of Science. Cambridge, MA: MIT Press.
  • Churchland, P. M. (1992). A deeper unity: Some Feyerabendian themes in neurocomputational form. In R. N. Giere, ed., Cognitive models of science. Minnesota studies in the philosophy of science. Minneapolis: University of Minnesota Press. 341-363.
  • Clement, J. (1983). A Conceptual Model Discussed by Galileo and Used Intuitively by Physics Students. In D. Gentner and A. L. Stevens, eds. Mental Models. Hillsdale: Lawrence Erlbaum Associates. 325-340.
  • Giere, R. N. (1988). Explaining Science: A Cognitive Approach. Chicago: University of Chicago Press.
  • Hanson, N. R. (1958). Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cambridge: Cambridge University Press.
  • McCloskey, M. (1983). Naive Theories of Motion. In D. Gentner and A. L. Stevens, eds. Mental Models. Hillsdale: Lawrence Erlbaum Associates. 75-98.
  • Nersessian, N. J. (1984). Faraday to Einstein: Constructing Meaning in Scientific Theories. Dordrecht: Martinus Nijhoff.
  • Nersessian, N. J. (1992). Constructing and Instructing: The Role of "Abstraction Techniques" in Creating and Learning Physics. In R.A. Duschl and R. J. Hamilton, eds. Philosophy of Science, Cognition, Psychology and Educational Theory and Practice. Albany: SUNY Press. 48-53.
  • Nersessian, N. J. (1992). How Do Scientists Think? Capturing the Dynamics of Conceptual Change in Science. In R. N. Giere, ed. Cognitive Models of Science. Minneapolis: University of Minnesota Press. 3-44.
  • Nersessian, N. J. (1995a). Should Physicists Preach What They Practice? Constructive Modeling in Doing and Learning Physics. Science and Education, 4. 203-226.
  • Nersessian, N. J. (1995b). Opening the Black Box: Cognitive Science and History of Science. Osiris, 10. 194-211.
  • Nersessian, N. J. (2008a). Creating Scientific Concepts. Cambridge MA: MIT Press.
  • Nersessian, N. J. (2008b). Mental Modelling in Conceptual Change. In S.Vosniadou, ed. International Handbook of Research on Conceptual Change. New York: Routledge. 391-416.
  • Nersessian, N., ed. (1987). The Process of Science. Netherlands: Kluwer Academic Publisher.
  • Nersessian, N. J. and Resnick, L. B. (1989). Comparing Historical and Intuitive Explanations of Motion: Does "Naive Physics" Have a Structure. Proceedings of the Cognitive Science Society, 11. 412-420.
  • Shapere, D. (1987a). “Method in the Philosophy of Science and Epistemology: How to Inquire about Inquiry and Knowledge.” In Nersessian, N., ed. The Process of Science. Netherlands: Kluwer Academic Publisher.
  • Shapere, D. (1987b). “External and Internal Factors in the Development of Science.” Science and Technology Studies, 1. 1–9.
  • Thagard, P. (1990). The Conceptual Structure of the Chemical Revolution. Philosophy of Science 57, 183-209.
  • Thagard, P. (1992). Conceptual Revolutions. Princeton: Princeton University Press.
  • Thagard, P. and Nowak, G. (1990). The Conceptual Structure of the Geological Revolution. In J. Shrager and P. Langley, eds. Computational Models of Scientific Discovery and Theory Formation. San Mateo: Morgan Kaufmann. 27-72.
  • Thagard, P. (1988). Computational Philosophy of Science. Cambridge: MIT Press.
  • Vosniadou, S., ed. (2008). International Handbook of Research on Conceptual Change. New York: Routledge.

ii. Feminist, Situated and Social Approaches

  • Garry, Ann and Marilyn Pearsall, eds. (1996). Women, Knowledge and Reality: Explorations in Feminist Epistemology. New York: Routledge.
  • Goldman, Alvin. (1999). Knowledge in a Social World. New York: Oxford University Press.
  • Hacking, Ian. (1999). The Social Construction of What? Cambridge: Harvard University Press.
  • Keller, Evelyn Fox and Helen Longino, eds. (1996). Feminism and Science. Oxford: Oxford University Press.
  • Kellert, Stephen H., Helen E. Longino, and C. Kenneth Waters, eds. (2006). Scientific Pluralism. Minnesota Studies in the Philosophy of Science, Volume 19. Minneapolis: University of Minnesota Press.
  • Longino, H. E. (2002). The Fate of Knowledge. Princeton: Princeton University Press.
  • Longino, H. E. (1990). Science as Social Knowledge: Values and Objectivity in Scientific Inquiry. Princeton, NJ: Princeton University Press.
  • McMullin, Ernan, ed. (1992). Social Dimensions of Scientific Knowledge. South Bend: Notre Dame University Press.
  • Ruetsche, Laura. (2004). Virtue and Contingent History: Possibilities for Feminist Epistemology. Hypatia 19(1): 73–101.
  • Solomon, Miriam. (2001). Social Empiricism. Cambridge: Massachusetts Institute of Technology Press.

iii. The Scientific Revolution

  • Cohen, I. B., (1985). Revolution in Science, Cambridge: Harvard University Press.
  • Koyré, A. (1965). Newtonian Studies. Chicago: The University of Chicago Press.
  • Osler, Margaret (2000). Rethinking the Scientific Revolution. Cambridge: Cambridge University Press.


Author Information

Hanne Andersen
University of Aarhus


Brian Hepburn
University of Aarhus

What Science Requires of Time

Table of Contents

  1. Relativity and Quantum Mechanics
  2. The Big Bang
  3. Infinite Time
  4. Continuity of Time

Relativity and Quantum Mechanics

Science currently requires all the basic laws of science to be time symmetric, to not distinguish between change toward the future and change toward the past. [The second law of thermodynamics is not a basic law.] Also, the basic laws cannot change from one day to another. The basic laws are the laws at the foundation of our two most fundamental physical theories, general relativity and quantum mechanics. The Big Bang theory is the leading theory of cosmology, and it, too, has consequences for our understanding of time, as we shall see.

According to relativity and quantum mechanics, spacetime is, loosely speaking, a collection of points called “spacetime locations” where the universe’s physical events occur. Spacetime is four-dimensional and a continuum, and time is a distinguished, one-dimensional sub-space of this continuum. Therefore, it is less misleading to speak of 4-dimensional spacetime as (3 + 1)-dimensional spacetime.

Any interval of time (that is, any duration) is a linear continuum of instants. So, science requires every duration to have a point-like structure that is the same structure as an interval of real numbers. This implies that between any two instants there is an uncountable infinity of other instants, as many as there are real numbers, and there are no gaps in the sequence of instants. Notice that time is not quantized even in quantum mechanics.

That first response to the question “What does science require of time?” is too simple. There are complications. There is an important difference between the universe’s cosmic time and any object's proper time; and there is an important difference between proper time and a reference frame’s coordinate time. Unlike the spacetime of special relativity, most spacetimes cannot be covered by a single coordinate system. Also, special relativity considers spacetime to be a passive arena for events, but general relativity requires spacetime to be dynamic in the sense that changes in matter-energy can change the curvature of spacetime itself. Most physicists believe that general relativity and quantum mechanics are logically inconsistent with each other and need to be replaced by a theory of quantum gravity. A successful theory of quantum gravity is likely to have radical implications for our understanding of time; two prominent suggestions of what those implications might be are that time and space will be seen to be discrete rather than continuous, and that time and space will be seen to emerge from more basic entities. But today "the best game in town" says time is not discrete and does not emerge from a more basic timeless entity.

Aristotle, Newton, and everyone else before Einstein, believed there is a frame-independent notion of duration. For example, if the time interval (duration) between two lightning flashes is 100 seconds on someone’s accurate clock, then it also is 100 seconds on your own accurate clock, even if you are flying at an incredible speed nearby or far away. Einstein rejected this piece of common sense in his 1905 special theory of relativity when he declared that the duration of a non-instantaneous event is relative to (that is, depends on) the observer’s reference frame. As Einstein expressed it, “Every reference-body has its own particular time; unless we are told the reference-body to which the statement of time refers, there is no meaning in a statement of the time of an event.” Two reference frames, or reference-bodies, that are moving relative to each other will divide spacetime differently into its time part and its space part, so they will disagree about the duration of an event that is not instantaneous. In short, your accurate clock need not agree with my accurate clock, and any two initially synchronized clocks will not stay synchronized if they are in motion relative to each other or undergo different gravitational forces.

In 1908, the mathematician Hermann Minkowski had an original idea in metaphysics regarding space and time. He was the first person to realize that spacetime is more fundamental than either time or space alone. As he put it, “Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.” The metaphysical assumption behind Minkowski’s remark is that what is “independently real” is what does not vary from one reference frame to another. What does not vary is their union, what we now call “spacetime.” It seems to follow that the division of events into the past ones, the present ones, and the future ones is also not “independently real.” One philosophical implication that Minkowski and Einstein accepted is that it’s an error to say, “Only my present is real.”
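Minkowski's point that the union of space and time is what stays fixed from frame to frame can be checked numerically. The following is a minimal sketch, assuming one space dimension and units in which the speed of light is 1; the function names and the sample event separation are hypothetical choices made only for illustration. It boosts a pair of events into a second frame and compares the time separation, the space separation, and the squared spacetime interval.

```python
import math

def boost(dt, dx, v):
    """Lorentz-boost a separation (dt, dx) into a frame moving at velocity v.

    Units are chosen so that c = 1, so v must satisfy -1 < v < 1.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (dt - v * dx), gamma * (dx - v * dt)

def interval_squared(dt, dx):
    """Squared spacetime interval between two events (time part minus space part)."""
    return dt * dt - dx * dx

# Hypothetical pair of events: 5 time units and 3 space units apart.
dt, dx = 5.0, 3.0
dt_moving, dx_moving = boost(dt, dx, 0.6)

# The two frames disagree about dt and dx separately,
# but they agree about the squared interval.
print("original frame:", dt, dx, interval_squared(dt, dx))
print("boosted frame: ", dt_moving, dx_moving, interval_squared(dt_moving, dx_moving))
```

The two frames assign different durations and different distances to the same pair of events, yet the combination computed by interval_squared agrees up to rounding. That combination is the frame-independent quantity on which Minkowski's remark rests.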

A coordinate system or reference frame is a way of representing space and time using numbers to represent spacetime points. Science confidently assigns numbers to times because, in any reference frame, the happens-before order-relation on events is faithfully reflected in the less-than order-relation on the time numbers (dates) that we assign to events. In the fundamental theories such as relativity and quantum mechanics, the values of the time variable t in any reference frame are real numbers, not merely rational numbers. Each number designates an instant of time, and time is a linear continuum of these instants ordered by the happens-before relation, similar to the mathematician’s line segment that is ordered by the less-than relation. Therefore, if these fundamental theories are correct, then physical time is one-dimensional rather than two-dimensional, and continuous rather than discrete. These features do not require time to be linear, however, because a segment of a circle is also a linear continuum, but there is no evidence for circular time, that is, for causal loops. Causal loops are worldlines that are closed curves in spacetime.

In mathematical physics, the ordering of instants by the happens-before relation, that is, by temporal precedence, is complete in the sense that there are no gaps in the sequence of instants. Unlike physical objects, physical time is believed to be infinitely divisible, that is, divisible in the sense of the actually infinite, not merely in Aristotle's sense of potentially infinite. Regarding the number of instants in any (non-zero) duration, time’s being a linear continuum implies the ordered instants are so densely packed that between any two there is a third, so that no instant has a next instant. In fact, time’s being a linear continuum implies that there is a nondenumerable infinity of instants between any two instants, as many instants as there are real numbers. There is little doubt that the actual temporal structure of events can be embedded in the real numbers, but how about the converse? That is, to what extent is it known that the real numbers can be adequately embedded into the structure of the instants? The problem here is that, although time is not quantized in quantum theory, for times shorter than about 10^-43 second (the so-called Planck time), science has no experimental grounds for the claim that between any two events there is a third. Instead, the justification for saying the reals can be embedded into an interval of instants is that the assumption of continuity is convenient and useful, that there are no known inconsistencies due to making this assumption, and that there are no better theories available.

Relativity theory challenges a great many of our intuitive beliefs about time. For events occurring at the same place, relativity theory implies the order is absolute (independent of the frame of reference) and so agrees with common sense, but for distant events occurring close enough in time to be in each other’s absolute elsewhere, event A can occur before event B in one reference frame, but after B in another frame, and simultaneously with B in yet another frame. For example, suppose you are sitting exactly in the middle of a moving train when lightning strikes simultaneously in the front and back of the train. You will know they were simultaneous if the light from the two strikes reaches you at the same time. But from the reference frame of a person standing still on the ground outside the train, the lightning strike at the back of the train happened first. From a frame fixed to a fast plane flying overhead in the same direction as the train and toward the front of the train, the lightning strike at the front of the train really happened first. It was Einstein's original idea that all three judgments are correct. The event at the front of the train really did happen first, and it really did happen second, and it really did happen at the same time as the event at the back. It's all a matter of which reference frame is used to make the judgment. Philosophical realists infer from this that events in your absolute elsewhere are as real as any other events even though the only part of the universe that you can directly observe is your own past light cone, your backward cone.
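The train example can be reproduced with the Lorentz transformation of special relativity. The following is a minimal sketch, assuming one space dimension along the track; the train length and the speeds of the ground and the plane relative to the train are hypothetical values chosen only for illustration. The two strikes are defined as simultaneous in the train's frame, and the code asks in which order they occur in the other two frames.

```python
import math

C = 299_792_458.0  # speed of light in metres per second

def time_in_moving_frame(t, x, v):
    """Time coordinate of the event (t, x) in a frame moving at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C ** 2)

# Strikes as described in the train's own frame: same time, opposite ends.
HALF_LENGTH = 100.0                 # metres; hypothetical half-length of the train
BACK = (0.0, -HALF_LENGTH)          # (time, position) of the strike at the back
FRONT = (0.0, +HALF_LENGTH)         # (time, position) of the strike at the front

def order_of_strikes(v):
    """Say which strike comes first in a frame moving at velocity v relative to the train."""
    t_back = time_in_moving_frame(BACK[0], BACK[1], v)
    t_front = time_in_moving_frame(FRONT[0], FRONT[1], v)
    if t_back < t_front:
        return "back first"
    if t_front < t_back:
        return "front first"
    return "simultaneous"

# Relative to the train, the ground recedes toward the back (negative velocity),
# while the faster plane pulls ahead toward the front (positive velocity).
print("train frame: ", order_of_strikes(0.0))
print("ground frame:", order_of_strikes(-0.5 * C))
print("plane frame: ", order_of_strikes(+0.3 * C))
```

Only the sign of the relative velocity decides the order, which is why the ground observer and the plane observer reach opposite verdicts about the same two events.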

Science impacts our understanding of time in other fundamental ways. Special relativity theory implies there is time dilation between one frame and another. For example, the faster a clock moves, the slower it runs, relative to stationary clocks. But this does not work just for clocks. If a human being moves fast, the human being also ages more slowly than someone who is stationary. Time dilation effects occur for tiny protons, too, but protons do not readily show the effects of their aging the way human bodies and clocks do.

Time dilation shows itself when a speeding twin returns to find that his (or her) Earth-bound twin has aged more rapidly. This surprising dilation result has caused some philosophers to question the consistency of relativity theory by arguing that, if motion is relative, then we could call the speeding twin “stationary” and it would follow that this twin is now the one who ages more rapidly. This argument is called the twin paradox. Experts now are agreed that the mistake is within the argument for the paradox, not within relativity theory. The twins feel different accelerations, so their two situations are not sufficiently similar to carry out the argument. The argument fails to notice the radically different relationships that each twin has to the rest of the universe as a whole. This is why one twin’s proper time is so different from the other’s.
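The size of the effect in the twin scenario can be estimated with the standard time dilation factor of special relativity. The following is a minimal sketch, assuming an idealized trip at constant speed with instantaneous turnaround, so the acceleration phases are ignored; the trip length and speed are hypothetical numbers.

```python
import math

def elapsed_proper_time(earth_years, speed_fraction_of_c):
    """Years recorded by a clock moving at constant speed while earth_years pass on Earth."""
    dilation_factor = math.sqrt(1.0 - speed_fraction_of_c ** 2)
    return earth_years * dilation_factor

EARTH_YEARS = 10.0   # hypothetical round-trip duration measured on Earth
SPEED = 0.8          # hypothetical cruising speed as a fraction of the speed of light

earth_twin_ageing = EARTH_YEARS
travelling_twin_ageing = elapsed_proper_time(EARTH_YEARS, SPEED)

print("Earth-bound twin ages (years):", earth_twin_ageing)
print("Travelling twin ages (years): ", travelling_twin_ageing)
```

The asymmetry in the result reflects the asymmetry in the twins' worldlines: only the travelling twin changes inertial frames, so the same formula cannot be applied with the roles swapped.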

[An object's proper time along its worldline, that is, along its path in 4-d spacetime, is the time elapsed by a clock having the same worldline. Coordinate time is the time measured by a clock at rest in the (inertial) frame. A clock isn't really measuring the time in a reference frame other than one fixed to the clock. In other words, a clock primarily measures the elapsed proper time between events that occur along its own worldline. Technically, a clock is a device that measures the spacetime interval along its own worldline. If the clock is at rest in an inertial frame, then it measures the "coordinate time." If the spacetime has no inertial frame then it can't have a normal coordinate time.]

There are two kinds of time dilation. Special relativity’s time dilation involves speed; general relativity’s also involves gravitational fields (and accelerations). Two ideally synchronized clocks need not stay in synchrony if they undergo different gravitational forces. This gravitational time dilation would be especially apparent if one of the two clocks were to approach a black hole. As a clock falls toward a black hole, time slows on approach to the event horizon, and it completely stops at the horizon (not just at the center of the hole)—relative to time on a clock that remains safely back on Earth.
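Gravitational time dilation near a black hole can be illustrated with the Schwarzschild solution of general relativity. The following is a minimal sketch, assuming a non-rotating black hole and a clock that hovers at a fixed distance outside the horizon, which is a simpler case than the falling clock described above; the rounded constants and the function names are choices made for illustration. It reports how fast the hovering clock ticks compared with a clock very far away.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 299_792_458.0      # speed of light, m/s
SOLAR_MASS = 1.989e30  # kg (rounded)

def schwarzschild_radius(mass_kg):
    """Radius of the event horizon of a non-rotating black hole of the given mass."""
    return 2.0 * G * mass_kg / C ** 2

def hovering_clock_rate(r_over_rs):
    """Tick rate of a clock held at r, relative to a far-away clock.

    r_over_rs is the clock's radial coordinate divided by the Schwarzschild
    radius; the formula applies only on or outside the horizon.
    """
    if r_over_rs < 1.0:
        raise ValueError("a clock cannot hover inside the horizon")
    return math.sqrt(1.0 - 1.0 / r_over_rs)

print("horizon radius for one solar mass (m):", schwarzschild_radius(SOLAR_MASS))
for ratio in (100.0, 10.0, 2.0, 1.01, 1.0):
    print("r / r_s =", ratio, " relative tick rate =", hovering_clock_rate(ratio))
```

The relative tick rate falls toward zero as the clock's position approaches the horizon, which is the sense in which time "stops" there relative to a distant clock.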

If, as many physicists suspect, the microstructure of spacetime (near the Planck length which is much smaller than the diameter of a proton) is a quantum foam of changing curvature of spacetime with black holes forming and dissolving, then time loses its meaning at this small scale. The philosophical implication is that time exists only when we are speaking of regions large compared to the Planck length.

General relativity theory may have even more profound implications for time. In 1948, the logician Kurt Gödel discovered radical solutions to Einstein’s equations, solutions in which there are closed timelike curves due to the rotation of the universe’s matter, so that as one progresses forward in time along one of these curves one arrives back at one’s starting point. Gödel drew the conclusion that if matter is distributed so that there is Gödelian spacetime (that is, with a preponderance of galaxies rotating in one direction rather than another), then the universe has no linear time. There is no evidence that our universe has this rotation.

We’ve said little about quantum mechanics, but time reversibility is implied by quantum mechanics and not relativity theory. The process of falling into a black hole does not have an inverse process in relativity theory, but every quantum process has an inverse process, so the two major theories are inconsistent on this issue.

The Big Bang

The Big Bang is a violent explosion of spacetime that began billions of years ago. It is not an explosion within preexisting space; the explosion creates new space. The Big Bang theory in some form or other is accepted by the vast majority of astronomers, but it is not as firmly accepted as is the theory of relativity. Here is a quick story of its origin. In 1922, the Russian physicist Alexander Friedmann predicted from general relativity that the universe should be expanding. In 1929, the American astronomer Edwin Hubble made careful observations of galaxies and confirmed that they are undergoing a universal expansion, on average.

The Big Bang theory is a theory of how our universe evolved, how it expanded and cooled from this beginning. This beginning process is called the “Big Bang” and the expansion and cooling is continuing today. Atoms are not expanding; our solar system is not expanding; even the cluster of galaxies to which the Milky Way belongs is not expanding. But almost every galaxy cluster is moving away from the others. It is as if the clusters are exploding away from each other, and in the future they will be very much farther away from each other. But the explosion is not occurring within space; the explosion is an explosion of space. Now, consider the past instead of the future. At any earlier moment the universe was more compact. Projecting to earlier and earlier times, and assuming that gravitation is the main force at work, the astronomers now conclude that 13.7 billion years ago (which happens to be about three times the age of our planet) the universe was in a state of nearly zero size and infinite density. Because all substances cool when they expand, physicists believe the universe itself must have been cooling down over the last 13.7 billion years, and so it began expanding when it was extremely hot. At present the average temperature of space in all very large regions has cooled to 2.7 Celsius degrees above absolute zero. Space is presently expanding at a rate of 71 kilometers per second per megaparsec, a rate that is increasing. A galaxy that is now 100 million light years away from the Milky Way will, in another 13.7 billion years, be more than 200 million light years away.
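The expansion rate and the age quoted above are related by a simple unit conversion. The following is a minimal sketch, assuming the rounded conversion factors shown in the code; it turns the rate of 71 kilometers per second per megaparsec into its reciprocal, often called the Hubble time, and also computes the recession speed that this rate implies for a hypothetical galaxy 100 megaparsecs away. The Hubble time is only a characteristic timescale, not a derivation of the age of the universe, because the expansion rate has not been constant.

```python
KM_PER_MEGAPARSEC = 3.0857e19    # kilometres in one megaparsec (rounded)
SECONDS_PER_GIGAYEAR = 3.15576e16  # seconds in one billion Julian years

HUBBLE_RATE = 71.0  # km per second per megaparsec, the value quoted in the text

def hubble_time_gigayears(rate_km_s_mpc):
    """Reciprocal of the expansion rate, expressed in billions of years."""
    seconds = KM_PER_MEGAPARSEC / rate_km_s_mpc
    return seconds / SECONDS_PER_GIGAYEAR

def recession_speed_km_s(distance_mpc, rate_km_s_mpc):
    """Hubble's law: recession speed grows in proportion to distance."""
    return rate_km_s_mpc * distance_mpc

print("Hubble time (billions of years):", hubble_time_gigayears(HUBBLE_RATE))
print("Recession speed at 100 Mpc (km/s):", recession_speed_km_s(100.0, HUBBLE_RATE))
```

The reciprocal comes out close to the 13.7 billion years mentioned in the text, which is why the two figures are usually quoted together.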

As far as we knew back in the 20th century, the entire universe was created in the Big Bang, and time itself came into existence “at that time.” So, the day of the Big Bang was a day without a yesterday. With the appearance of the new theories of quantum gravity in the 21st century, the question of what happened before the Big Bang has been resurrected as legitimate.

In the literature in both physics and philosophy, descriptions of the Big Bang often assume that a first event is also a first instant of time and that spacetime did not exist outside the Big Bang. This intimate linking of a first event with a first time is a philosophical move, not something demanded by the science. It is not even clear that it is correct to call the Big Bang an event. The Big Bang “event” is a singularity without space coordinates, but events normally must have space coordinates. One response to this problem is to alter the definition of “event” to allow the Big Bang to be an event. Another response, from James Hartle and Stephen Hawking, is to consider the past cosmic time-interval to be open rather than closed at t = 0. Looking back to the Big Bang is then like following the positive real numbers back to ever smaller positive numbers without ever reaching a smallest positive one. If Hartle and Hawking are correct that time is actually like this, then the universe had no beginning event.

Classical Big Bang theory is based on the assumption that the universal expansion of clusters of galaxies can be projected all the way back. Yet physicists agree that the projection must become untrustworthy in the Planck era, that is, for all times less than 10^-43 second after the beginning of the Big Bang. Current science cannot speak with confidence about the nature of time within the Planck era. If a theory of quantum gravity does get confirmed, it should provide information about this Planck era, and it may even allow physicists to answer the questions, “What caused the Big Bang?” and “Did anything happen before then?”

The scientifically radical, but theologically popular, answer, “God caused the Big Bang, but He, himself, does not exist in time” is a cryptic answer because it is not based on a well-justified and detailed theory of who God is, how He caused the Big Bang, and how He can exist but not be in time. It is also difficult to understand St. Augustine’s remark that “time itself was made by God.” On the other hand, for a person of faith, belief in their God is usually stronger than belief in any scientific hypothesis, or in any desire for a scientific justification of their remark about God, or in the importance of satisfying any philosopher’s demand for clarification.

Some physicists are advocating revision of the classical Big Bang theory in order to allow for the “cosmic landscape” or “multiverse,” in which there are multiple big bangs. See (Veneziano, 2006). But there is no external time in which these universes exist, which means that it is not sensible to speak of one universe occurring before or after any other within the multiverse. Also, in some of these universes there is no time dimension at all. However, this new theory is not generally accepted by theoretical cosmologists. Another cosmological theory is that the Big Bang represents a bounce from an earlier compression of the universe; there may be a sequence of bangs and crunches, and presently we are in a bang phase, that is, an expanding phase.

Infinite Time

There are three ways to interpret the question of whether physical time is infinite: (a) Is time infinitely divisible? (b) Will there be an infinite amount of time in the future? (c) Was there an infinite amount of time in the past?

(a) Is time infinitely divisible? Yes, because general relativity and quantum mechanics require time to be a continuum. But the answer is no if these theories are eventually replaced by a relativistic quantum mechanics that quantizes time. “Although there have been suggestions that spacetime may have a discrete structure,” Stephen Hawking said in 1996, “I see no reason to abandon the continuum theories that have been so successful.”

(b) Will there be an infinite amount of time in the future? Probably. According to the classical theory of the Big Bang, the answer depends on whether events will keep occurring. The best estimate from the cosmologists these days is that the expansion of the universe is accelerating and will continue forever. There always will be the events of galaxy clusters getting farther apart, even though gravity will continue to compact much of the matter into black holes, and so the future is potentially infinite.

(c) Was there an infinite amount of time in the past? Aristotle argued “yes.” But by invoking the radical notion that God is “outside of time,” St. Augustine disagreed and said, “Time itself being part of God’s creation, there was simply no before!” (that is, no time before God created everything else but Himself). So, for theological reasons, Augustine declared time had a finite past. After advances in astronomy in the late 19th and early 20th centuries, the question of the age of the universe became a scientific question. With the acceptance of the classical Big Bang theory, the amount of past time was judged to be less than 14 billion years because this is when the Big Bang began. The assumption is that time does not exist independently of the spacetime relations exhibited by physical events. Recently, however, the classical Big Bang theory has been challenged. There could be an infinite amount of time in the past according to some proposed, but as yet untested, theories of quantum gravity based on the assumptions that general relativity theory fails to hold for infinitesimal volumes. These theories imply that the beginning of the Big Bang was actually an inflationary expansion from a pre-existing physical state. There was never a singularity. In that case our Big Bang could be just one bang among other bangs in a multiverse or landscape. If so, then is the past of this multiverse finite or infinite? Cosmologists do not agree on that issue. For a discussion of the controversies, see (Veneziano, 2006) and (Nadis, 2013).

There have been interesting speculations on how conscious life could continue forever, despite the fact that the available energy for life will decrease as the universe expands, and despite the fact that any life swept up into a black hole will reach the center of the hole in a finite time at which point death will be certain. For an introduction to these speculations, see (Krauss and Starkman, 2002).

Continuity of Time

In the classical theories of relativity and quantum mechanics, time is not quantized, but is a continuum. However, if certain, as yet untested, theories attempting to unify relativity and quantum mechanics are correct, then there is a shortest duration for any possible event (about 10^-43 second), and time is digital rather than analog.
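The duration mentioned here is the Planck time, which is conventionally defined from three fundamental constants as the square root of ħG/c^5. The following is a minimal sketch of that calculation, assuming rounded values of the constants; the variable names are chosen for illustration.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, joule seconds
G = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0       # speed of light, m/s

def planck_time_seconds():
    """Planck time: the square root of (hbar * G / c^5)."""
    return math.sqrt(HBAR * G / C ** 5)

def planck_length_metres():
    """Planck length: the distance light travels in one Planck time."""
    return C * planck_time_seconds()

print("Planck time (s):  ", planck_time_seconds())
print("Planck length (m):", planck_length_metres())
```

The computed value is somewhat below the round figure of 10^-43 second used in the text, which is best read as an order-of-magnitude estimate.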

Author Information

Bradley Dowden
California State University, Sacramento
U. S. A.


The Philosophy of Anthropology

The Philosophy of Anthropology refers to the central philosophical perspectives which underpin, or have underpinned, the dominant schools in anthropological thinking. It is distinct from Philosophical Anthropology which attempts to define and understand what it means to be human.

This article provides an overview of the most salient anthropological schools, the philosophies which underpin them and the philosophical debates surrounding these schools within anthropology. It specifically operates within these limits because the broader discussions surrounding the Philosophy of Science and the Philosophy of Social Science have been dealt with at length elsewhere in this encyclopedia. Moreover, the specific philosophical perspectives have also been discussed in great depth in other contributions, so they will be elucidated only to the extent that this is useful to comprehending their relationship with anthropology. In examining the Philosophy of Anthropology, it is necessary to draw some borders, even if cautious ones, between anthropology and other disciplines. Accordingly, in drawing upon anthropological discussions, we will define, as anthropologists, scholars who identify as such and who publish in anthropological journals and the like. In addition, early anthropologists will be selected by virtue of their interest in peasant culture and non-Western, non-capitalist and stateless forms of human organization.

The article specifically aims to summarize the philosophies underpinning anthropology, focusing on the way in which anthropology has drawn upon them. The philosophies themselves have been dealt with in depth elsewhere in this encyclopedia. It has been suggested by philosophers of social science that anthropology tends to reflect, at any one time, the dominant intellectual philosophy because, unlike in the physical sciences, it is influenced by qualitative methods and so can more easily become influenced by ideology (for example Kuznar 1997 or Andreski 1974). This article begins by examining what is commonly termed ‘physical anthropology.’ This is the science-oriented form of anthropology which came to prominence in the nineteenth century. As part of this section, the article also examines early positivist social anthropology, the historical relationship between anthropology and eugenics, and the philosophy underpinning this.

The next section examines naturalistic anthropology. ‘Naturalism,’ in this usage, is drawn from the biological ‘naturalists’ who collected specimens in nature and described them in depth, in contrast to ‘experimentalists.’ Anthropological ‘naturalists’ thus conduct fieldwork with groups of people rather than engage in more experimental methods. The naturalism section looks at the philosophy underpinning the development of ethnography-focused anthropology, including cultural determinism, cultural relativism, fieldwork ethics and the many criticisms which this kind of anthropology has provoked. Differences in its development in Western and Eastern Europe also are analyzed. As part of this, the article discusses the most influential schools within naturalistic anthropology and their philosophical foundations.

The article then examines Post-Modern or ‘Contemporary’ anthropology. This school grew out of the ‘Crisis of Representation’ in anthropology beginning in the 1970s. The article looks at how the Post-Modern critique has been applied to anthropology, and it examines the philosophical assumptions behind developments such as auto-ethnography. Finally, it examines the view that there is a growing philosophical split within the discipline.

Table of Contents

  1. Positivist Anthropology
    1. Physical Anthropology
    2. Race and Eugenics in Nineteenth Century Anthropology
    3. Early Evolutionary Social Anthropology
  2. Naturalist Anthropology
    1. The Eastern European School
    2. The Ethnographic School
    3. Ethics and Participant Observation Fieldwork
  3. Anthropology since World War I
    1. Cultural Determinism and Cultural Relativism
    2. Functionalism and Structuralism
    3. Post-Modern or Contemporary Anthropology
  4. Philosophical Dividing Lines
    1. Contemporary Evolutionary Anthropology
    2. Anthropology: A Philosophical Split?
  5. References and Further Reading

1. Positivist Anthropology

a. Physical Anthropology

Anthropology itself began to develop as a separate discipline in the mid-nineteenth century, as Charles Darwin’s (1809-1882) Theory of Evolution by Natural Selection (Darwin 1859) became widely accepted among scientists. Early anthropologists attempted to apply evolutionary theory within the human species, focusing on physical differences between different human sub-species or racial groups (see Eriksen 2001) and the perceived intellectual differences that followed.

The philosophical assumptions of these anthropologists were, to a great extent, the same assumptions which have been argued to underpin science itself. This is positivism, rooted in Empiricism, which argued that knowledge could only be reached through the empirical method and that statements were meaningful only if they could be empirically justified, though it should be noted that Darwin should not necessarily be termed a positivist. Science needed to be solely empirical, systematic and exploratory, logical, and theoretical (and thus focused on answering questions). It needed to attempt to make predictions which are open to testing and falsification and it needed to be epistemologically optimistic (assuming that the world can be understood). Equally, positivism argues that truth-statements are value-neutral, something disputed by the postmodern school. Philosophers of Science, such as Karl Popper (1902-1994) (for example Popper 1963), have also stressed that science must be self-critical, prepared to abandon long-held models as new information arises, and thus characterized by falsification rather than verification, though this point was also suggested earlier by Herbert Spencer (1820-1903) (for example Spencer 1873). Nevertheless, the philosophy of early physical anthropologists included a belief in empiricism, the fundamentals of logic and epistemological optimism. This philosophy has been criticized by anthropologists such as Risjord (2007), who has argued that it is not self-aware – because values, he claims, are always involved in science – and that non-neutral scholarship can be useful in science because it forces scientists to better contemplate their ideas.

b. Race and Eugenics in Nineteenth Century Anthropology

During the mid-nineteenth and early twentieth centuries, anthropologists began to systematically examine the issue of racial differences, something which became even more researched after the acceptance of evolutionary theory (see Darwin 1871). That said, it should be noted that Darwin himself did not specifically advocate eugenics or theories of progress. However, even prior to Darwin’s presentation of evolution (Darwin 1859), scholars were already attempting to understand ‘races’ and the evolution of societies from ‘primitive’ to complex (for example Tylor 1865).

Early anthropologists such as Englishman John Beddoe (1826-1911) (Beddoe 1862) or Frenchman Arthur de Gobineau (1816-1882) (Gobineau 1915) developed and systematized racial taxonomies which divided, for example, between ‘black,’ ‘yellow’ and ‘white.’ For these anthropologists, societies were reflections of their racial inheritance, a viewpoint termed biological determinism. The concept of ‘race’ has been criticized, within anthropology, variously, as being simplistic and as not being a predictive (and thus not a scientific) category (for example Montagu 1945), and there was already some criticism of the scope of its predictive validity in the mid-nineteenth century (for example Pike 1869). The concept has also been criticized on ethical grounds, because racial analysis is seen to promote racial violence and discrimination and uphold a certain hierarchy, and some have suggested its rejection because of its association with such regimes as National Socialism or Apartheid, meaning that it is not a neutral category (for example Wilson 2002, 229).

Those anthropologists who continue to employ the category have argued that ‘race’ is predictive in terms of life history, only involves the same inherent problems as any cautiously essentialist taxonomy and that moral arguments are irrelevant to the scientific usefulness of a category of apprehension (for example Pearson 1991) but, to a great extent, current anthropologists reject racial categorization. The American Anthropological Association’s (1998) ‘Statement on Race’ began by asserting that: ‘"Race" thus evolved as a worldview, a body of prejudgments that distorts our ideas about human differences and group behavior. Racial beliefs constitute myths about the diversity in the human species and about the abilities and behavior of people homogenized into "racial" categories.’ In addition, a 1985 survey by the American Anthropological Association found that only a third of cultural anthropologists (but 59 percent of physical anthropologists) regarded ‘race’ as a meaningful category (Lynn 2006, 15). Accordingly, there is general agreement amongst anthropologists that the idea, promoted by anthropologists such as Beddoe, that there is a racial hierarchy, with the white race as superior to others, involves importing the old ‘Great Chain of Being’ (see Lovejoy 1936) into scientific analysis and should be rejected as unscientific, as should ‘race’ itself. In terms of philosophy, some aspects of nineteenth century racial anthropology might be seen to reflect the theories of progress that developed in the nineteenth century, such as those of G. W. F. Hegel (1770-1831) (see below). In addition, though we will argue that Herderian nationalism is more influential in Eastern Europe, we should not regard it as having no influence at all in British anthropology. 
Native peasant culture, the staple of the Eastern European, Romantic nationalism-influenced school (as we will see), was studied in nineteenth century Britain, especially in Scotland and Wales, though it was specifically classified as ‘folklore’ and as outside anthropology (see Rogan 2012). However, as we will discuss, the influence is stronger in Eastern Europe.

The interest in race in anthropology developed alongside a broader interest in heredity and eugenics. Influenced by positivism, scholars such as Herbert Spencer (1873) applied evolutionary theory as a means of understanding differences between different societies. Spencer was also seemingly influenced, on some level, by theories of progress of the kind advocated by Hegel and even found in Christian theology. For him, evolution logically led to eugenics. Spencer argued that evolution involved a progression through stages of ever increasing complexity – from lower forms to higher forms – to an end-point at which humanity was highly advanced and was in a state of equilibrium with nature. For this perfected humanity to be reached, humans needed to engage in self-improvement through selective breeding.

American anthropologist Madison Grant (1865-1937) (Grant 1916), for example, reflected a significant anthropological view in 1916 when he argued that humans, and therefore human societies, were essentially reflections of their biological inheritance and that environmental differences had almost no impact on societal differences. Grant, as with other influential anthropologists of the time, advocated a program of eugenics in order to improve the human stock. According to this program, efforts would be made to encourage breeding among the supposedly superior races and social classes and to discourage it amongst the inferior races and classes (see also Galton 1909). This form of anthropology has been criticized for having a motivation other than the pursuit of truth, which has been argued to be the only appropriate motivation for any scientist. It has also been criticized for basing its arguments on a disputed system of categories – race – and for uncritically holding certain assumptions about what is good for humanity (for example Kuznar 1997, 101-109). It should be emphasized that though eugenics was widely accepted among anthropologists in the nineteenth century, there were also those who criticized it and its assumptions (for example Boas 1907; see Stocking 1991 for a detailed discussion). Proponents have countered that a scientist’s motivations are irrelevant as long as his or her research is scientific, that race should not be a controversial category from a philosophical perspective and that it is for the good of science itself that the more scientifically-minded are encouraged to breed (for example Cattell 1972). As noted, some scholars stress the utility of ideologically-based scholarship.

A further criticism of eugenics is that it fails to recognize the supposed inherent worth of all individual humans (for example Pichot 2009). Advocates of eugenics, such as Grant (1916), dismiss this as a ‘sentimental’ dogma which fails to accept that humans are animals, as acceptance of evolutionary theory, it is argued, obliges people to accept, and which would lead to the decline of civilization and science itself. We will note possible problems with this perspective in our discussion of ethics. Also, it might be useful to mention that the form of anthropology that is sympathetic to eugenics is today centered around an academic journal called The Mankind Quarterly, which critics regard as ‘racist’ (for example Tucker 2002, 2) and even academically biased (for example Ehrenfels 1962). Although ostensibly an anthropology journal, it also publishes psychological research. A prominent example of such an anthropologist is Roger Pearson (b. 1927), the journal’s current editor. But such a perspective is highly marginal in current anthropology.

c. Early Evolutionary Social Anthropology

Also from the middle of the nineteenth century, there developed a school in Western European and North American anthropology which focused less on race and eugenics and more on answering questions relating to human institutions, and how they evolved, such as ‘How did religion develop?’ or ‘How did marriage develop?’ This school was known as ‘cultural evolutionism.’ Members of this school, such as Sir James Frazer (1854-1941) (Frazer 1922), were influenced by the positivist view that science was the best model for answering questions about social life. They also shared with other evolutionists an acceptance of a modal human nature which reflected evolution to a specific environment. However, some, such as E. B. Tylor (1832-1917) (Tylor 1871), argued that human nature was the same everywhere, moving away from the focus on human intellectual differences according to race. The early evolutionists believed that as surviving ‘primitive’ social organizations, within European Empires for example, were examples of the ‘primitive Man,’ the nature of humanity, and the origins of its institutions, could be best understood through analysis of these various social groups and their relationship with more ‘civilized’ societies (see Gellner 1995, Ch. 2).

As with the biological naturalists, scholars such as Frazer and Tylor collected specimens on these groups – in the form of missionary descriptions of ‘tribal life’ or descriptions of ‘tribal life’ by Westernized tribal members – and compared them to accounts of more advanced cultures in order to answer discrete questions. Using this method of accruing sources, now termed ‘armchair anthropology’ by its critics, the early evolutionists attempted to answer discrete questions about the origins and evolution of societal institutions. As early sociologist Emile Durkheim (1858-1917) (Durkheim 1965) summarized it, such scholars aimed to discover ‘social facts.’ For example, Frazer concluded, based on sources, that societies evolved from being dominated by a belief in Magic, to a belief in Spirits and then a belief in gods and ultimately one God. For Tylor, religion began with ‘animism’ and evolved into more complex forms, but tribal animism was the essence of religion and it had developed in order to aid human survival.

This school of anthropology has been criticized because of its perceived inclination towards reductionism (such as defining ‘religion’ purely as ‘survival’), its speculative nature and its failure to appreciate the problems inherent in relying on sources, such as ‘gate keepers’ who will present their group in the light in which they want it to be seen. Defenders have countered that without attempting to understand the evolution of societies, social anthropology has no scientific aim and can turn into a political project or simply a description of perceived oddities (for example Hallpike 1986, 13). Moreover, the kind of stage theories advocated by Tylor have been criticized for conflating evolution with historicist theories of progress, by arguing that societies always pass through certain phases of belief and that Western civilization is the pinnacle of development, a belief known as unilinealism. This latter point has been criticized as ethnocentric (for example Eriksen 2001) and reflects some of the thinking of Herbert Spencer, who was influential in early British anthropology.

2. Naturalist Anthropology

a. The Eastern European School

Whereas Western European and North American anthropology was oriented towards studying the peoples within the Empires run by the Western powers and was influenced by Darwinian science, Eastern European anthropology developed among nascent Eastern European nations. This form of anthropology was strongly influenced by Herderian nationalism and ultimately by Hegelian political philosophy and the Romantic Movement of eighteenth century philosopher Jean-Jacques Rousseau (1712-1778). Eastern European anthropologists believed, following the Romantic Movement, that industrial or bourgeois society was corrupt and sterile. The truly noble life was found in the simplicity and naturalness of communities close to nature. The most natural form of community was a nation of people, bonded together by shared history, blood and customs, and the most authentic form of such a nation’s lifestyle was to be found amongst its peasants. Accordingly, Eastern European anthropology elevated peasant life as the most natural form of life, a form of life that should, on some level, be striven towards in developing the new ‘nation’ (see Gellner 1995).

Eastern European anthropologists, many of them motivated by Romantic nationalism, focused on studying their own nations’ peasant culture and folklore in order to preserve it and because the nation was regarded as unique, so that studying its most authentic manifestation was seen as a good in itself. As such, Eastern European anthropologists engaged in fieldwork amongst the peasants, observing and documenting their lives. At the time of writing, this kind of anthropology – or ‘ethnology’ – remains, to a degree, more popular in Eastern than in Western Europe (see, for example, Ciubrinskas 2007 or Sarkany ND).

Siikala (2006) observes that Finnish anthropology is now moving towards the Western model of fieldwork abroad but as recently as the 1970s was still predominantly the study of folklore and peasant culture. Baranski (2009) notes that Polish anthropologists who wish to study international topics still tend to go to the international centers while those who remain in Poland tend to focus on Polish folk culture, though the situation is slowly changing. Lithuanian anthropologist Vytis Ciubrinskas (2007) notes that throughout Eastern Europe, there is very little separate ‘anthropology,’ with the focus being ‘national ethnology’ and ‘folklore studies,’ almost always published in the vernacular. But, again, he observes that the kind of anthropology popular in Western Europe is making inroads into Eastern Europe. In Russia, national ethnology and peasant culture also tend to be predominant (for example Baiburin 2005). Indeed, even beyond Eastern Europe, it was noted in the year 2000 that ‘the emphasis of Indian social anthropologists remains largely on Indian tribes and peasants. But the irony is that barring the detailed tribal monographs prepared by the British colonial officers and others (. . .) before Independence, we do not have any recent good ethnographies of a comparable type’ (Srivastava 2000). By contrast, Japanese social anthropology has traditionally been in the Western model, studying cultures more ‘primitive’ than its own (such as Chinese communities), at least in the nineteenth century. Only later did it start to focus more on Japanese folk culture and it is now moving back towards a Western model (see Sedgwick 2006, 67).

The Eastern school has been criticized for uncritically placing a set of dogmas – specifically nationalism – above the pursuit of truth, accepting a form of historicism with regard to the unfolding of the nation’s history and drawing a sharp, essentialist line around the nationalist period of history (for example Popper 1957). Its anthropological method has been criticized because, it is suggested, Eastern European anthropologists suffer from home blindness. By virtue of having been raised in the culture which they are studying, they cannot see it objectively and penetrate to its ontological presuppositions (for example Kapferer 2001).

b. The Ethnographic School

The Ethnographic school, which has since come to characterize social and cultural anthropology, was developed by Polish anthropologist Bronislaw Malinowski (1884-1942) (for example Malinowski 1922). Malinowski was originally trained in Poland, and his anthropological philosophy brought together key aspects of the Eastern and Western schools. He argued that, as with the Western European school, anthropologists should study foreign societies. This avoided home blindness and allowed them to better perceive these societies objectively. However, as with the Eastern European School, he argued that anthropologists should observe these societies in person, something termed ‘participant observation’ or ‘ethnography.’ This method, he argued, solved many of the problems inherent in armchair anthropology.

It is this method which anthropologists generally summarize as ‘naturalism’ in contrast to the ‘positivism,’ usually followed alongside a quantitative method, of evolutionary anthropologists. Naturalist anthropologists argue that their method is ‘scientific’ in the sense that it is based on empirical observation but they argue that some kinds of information cannot be obtained in laboratory conditions or through questionnaires, both of which lend themselves to quantitative, strictly scientific analysis. Human culturally-influenced actions differ from the subjects of physical science because they involve meaning within a system and meaning can only be discerned after long-term immersion in the culture in question. Naturalists therefore argue that a useful way to find out information about and understand a people – such as a tribe – is to live with them, observe their lives, gain their trust and eventually live, and even think, as they do. This latter aim, specifically highlighted by Malinowski, has been termed the empathetic perspective and is considered, by many naturalist anthropologists, to be a crucial sign of research that is anthropological. In addition to these ideas, the naturalist perspective draws upon aspects of the Romantic Movement in that it stresses, and elevates, the importance of ‘gaining empathy’ and respecting the group it is studying. Some naturalists argue that there are ‘ways of knowing’ other than science (for example Rees 2010) and that respect for the group can be more important than gaining new knowledge. They also argue that human societies are so complex that they cannot simply be reduced to biological explanations.

In many ways, the successor to Malinowski as the most influential cultural anthropologist was the American Clifford Geertz (1926-2006). Where Malinowski emphasized ‘participant observation’ – and thus, to a greater degree, an outsider perspective – it was Geertz who argued that the successful anthropologist reaches a point where he sees things from the perspective of the native. The anthropologist should bring alive the native point of view, which Roth (1989) notes ‘privileges’ the native, thus challenging a hierarchical relationship between the observed and the observer. Geertz thus strongly rejected a distinction of which Malinowski was merely critical: the distinction between a ‘primitive’ and a ‘civilized’ culture. In many respects, this distinction was also criticized by the Structuralists – whose central figure, Claude Levi-Strauss (1908-2009), was of an earlier generation than Geertz – as they argued that all human minds involved similar binary structures (see below).

However, there was a degree to which both Malinowski and Geertz did not divorce ‘culture’ from ‘biology.’ Malinowski (1922) argued that anthropological interpretations should ultimately be reducible to human instincts while Geertz (1973, 46-48) argued that culture can be reduced to biology and that culture also influences biology, though he felt that the main aim of the ethnographer was to interpret. Accordingly, it is not for the anthropologist to comment on the culture in terms of its success or the validity of its beliefs. The anthropologist’s purpose is merely to record and interpret.

The majority of those who practice this form of anthropology are interpretivists. They argue that the aim of anthropology is to understand the norms, values, symbols and processes of a society and, in particular, their ‘meaning’ – how they fit together. This lends itself to the more subjective methods of participant observation. Applying a positivist methodology to studying social groups is regarded as dangerous because scientific understanding is argued to lead to better controlling the world and, in this case, controlling people. Interpretivist anthropology has been criticized, variously, as being indebted to imperialism (see below) and as too subjective and unscientific, because, unless there is a common set of analytical standards (such as an acceptance of the scientific method, at least to some extent), there is no reason to accept one subjective interpretation over another. This criticism has, in particular, been leveled against naturalists who accept cultural relativism (see below).

Also, many naturalist anthropologists emphasize the separateness of ‘culture’ from ‘biology,’ arguing that culture cannot simply be traced back to biology but rather is, to a great extent, independent of it; a separate category. For example, Risjord (2000) argues that anthropology ‘will never reach the social reality at which it aims’ precisely because ‘culture’ cannot simply be reduced to a series of scientific explanations. But it has been argued that if the findings of naturalist anthropology are not ultimately consilient with science then they are not useful to people outside of naturalist anthropology and that naturalist anthropology draws too stark a line between apes and humans when it claims that human societies are too complex to be reduced to biology or that culture is not closely reflective of biology (Wilson 1998, Ch. 1). In this regard, Bidney (1953, 65) argues that, ‘Theories of culture must explain the origins of culture and its intrinsic relations to the psychobiological nature of man’ as to fail to do so simply leaves the origin of culture as a ‘mystery or an accident of time.’

c. Ethics and Participant Observation Fieldwork

From the 1970s, the various leading anthropological associations began to develop codes of ethics. This was, at least in part, inspired by the perceived collaboration of anthropologists with the US-led counterinsurgency groups in South American states. For example, in the 1960s, Project Camelot commissioned anthropologists to look into the causes of insurgency and revolution in South American States, with a view to confronting these perceived problems. It was also inspired by the way that increasing numbers of anthropologists were employed outside of universities, in the private sector (see Sluka 2007).

The leading anthropological bodies – such as the Royal Anthropological Institute – hold to a system of research ethics which anthropologists, conducting fieldwork, are expected, though not obliged, to adhere to. For example, the most recent American Anthropological Association Code of Ethics (1998) emphasizes that certain ethical obligations can supersede the goal of seeking new knowledge. Anthropologists, for example, may not publish research which may harm the ‘safety,’ ‘privacy’ or ‘dignity’ of those whom they study; they must explain their fieldwork to their subjects and emphasize that attempts at anonymity may sometimes fail; they should find ways of reciprocating to those whom they study; and they should preserve opportunities for future fieldworkers.

Though the American Anthropological Association does not make its philosophy explicit, much of that philosophy appears to be underpinned by the golden rule: one should treat others as one would wish to be treated oneself. In this regard, one would not wish to be exploited, misled or have one’s safety or privacy compromised. For some scientists, the problem with such a philosophy is that, from their perspective, humans should be an objective object of study like any other. The assertion that the ‘dignity’ of the individual should be preserved may be seen to reflect a humanist belief in the inherent worth of each human being. Humanism has been accused of being sentimental and of failing to appreciate the substantial differences between human beings intellectually, with some anthropologists even questioning the usefulness of the broad category ‘human’ (for example Grant 1916). It has also been accused of failing to appreciate that, from a scientific perspective, humans are a highly evolved form of ape and scholars who study them should attempt to think, as Wilson (1975, 575) argues, as if they are alien zoologists. Equally, it has been asked why primary ethical responsibility should be to those studied. Why should it not be to the public or the funding body (see Sluka 2007)? In this regard, it might be suggested that the code reflects the lauding of members of (often non-Western) cultures which might ultimately be traced back to the Romantic Movement. Their rights are more important than those of the funders, the public or of other anthropologists.

Equally, the code has been criticized in terms of power dynamics, with critics arguing that the anthropologist is usually in a dominant position over those being studied which renders questionable the whole idea of ‘informed consent’ (Bourgois 2007). Indeed, it has been argued that the most recent American Anthropological Association Code of Ethics (1998) is a movement to the right, in political terms, because it accepts, explicitly, that responsibility should also be to the public and to funding bodies and is less censorious than previous codes with regard to covert research (Pels 1999). This seems to be a movement towards a situation where a commitment to the group being studied is less important than the pursuit of truth, though the commitment to the subject of study is still clear.

Likewise, the most recent set of ethical guidelines from the Association of Anthropologists of the UK and the Commonwealth implicitly accepts that there is a difference of opinion among anthropologists regarding whom they are obliged to. It asserts, ‘Most anthropologists would maintain that their paramount obligation is to their research participants . . .’ This document specifically warns against giving subjects ‘self-knowledge which they did not seek or want.’ This may be seen to reflect a belief in a form of cultural relativism. Permitting people to preserve their way of thinking is more important than their knowing what a scientist would regard as the truth. Their way of thinking – a part of their culture – should be respected, because it is theirs, even if it is inaccurate. This could conceivably prevent anthropologists from publishing dissections of particular cultures if they might be read by members of that culture (see Dutton 2009, Ch. 2). Thus, philosophically, the debate in fieldwork ethics ranges from a form of consequentialism to, in the form of humanism, a deontological form of ethics. However, it should be emphasized that the standard fieldwork ethics noted are very widely accepted amongst anthropologists, particularly with regard to informed consent. Thus, the idea of experimenting on unwilling or unknowing humans is strongly rejected, which might be interpreted to imply some belief in human separateness.

3. Anthropology since World War I

a. Cultural Determinism and Cultural Relativism

As already discussed, Western European anthropology, around the time of World War I, was influenced by eugenics and biological determinism. But as early as the 1880s, this was beginning to be questioned by German-American anthropologist Franz Boas (1858-1942) (for example Boas 1907), based at Columbia University in New York. He was critical of biological determinism and argued for the importance of environmental influence on individual personality and thus modal national personality in a way of thinking called ‘historical particularism.’

Boas emphasized the importance of environment and history in shaping different cultures, arguing that all humans were biologically relatively similar and rejecting distinctions of ‘primitive’ and ‘civilized.’ Boas also presented critiques of the work of early evolutionists, such as Tylor, demonstrating that not all societies passed through the phases he suggested or did not do so in the order he suggested. Boas used these findings to stress the importance of understanding societies individually in terms of their history and culture (for example Freeman 1983).

Boas sent his student Margaret Mead (1901-1978) to American Samoa to study the people there with the aim of proving that they were a ‘negative instance’ in terms of violence and teenage angst. If this could be proven, it would undermine biological determinism and demonstrate that people were in fact culturally determined and that biology had very little influence on personality, something argued by John Locke (1632-1704) and his concept of the tabula rasa. This would in turn mean that Western people’s supposed teenage angst could be changed through changing the culture. After six months in American Samoa, Mead returned to the USA and published, in 1928, her influential book Coming of Age in Samoa: A Psychological Study of Primitive Youth for Western Civilization (Mead 1928). It portrayed Samoa as a society of sexual liberty in which there were none of the problems of puberty that were associated with Western civilization. Accordingly, Mead argued that she had found a negative instance and that humans were overwhelmingly culturally determined. At around the same time Ruth Benedict (1887-1948), also a student of Boas’s, published her research in which she argued that individuals simply reflected the ‘culture’ in which they were raised (Benedict 1934).

The cultural determinism advocated by Boas, Benedict and especially Mead became very popular and developed into a school which has been termed ‘Multiculturalism’ (Gottfried 2004). This school can be compared to Romantic nationalism in the sense that it regards all cultures as unique developments which should be preserved and thus advocates a form of ‘cultural relativism’ in which cultures cannot be judged by the standards of other cultures and can only be comprehended in their own terms. However, it should be noted that ‘cultural relativism’ is sometimes used to refer to the way in which the parts of a whole form a kind of separate organism, though this is usually referred to as ‘Functionalism.’ In addition, Harris (see Headland, Pike, and Harris 1990) distinguishes between ‘emic’ (insider) and ‘etic’ (outsider) understanding of a social group, arguing that both perspectives seem to make sense from the different viewpoints. This might also be understood as cultural relativism and perhaps raises the question of whether the two worlds can so easily be separated. Cultural relativism also argues, as with Romantic Nationalism, that so-called developed cultures can learn a great deal from that which they might regard as ‘primitive’ cultures. Moreover, humans are regarded as, in essence, products of culture and as extremely similar in terms of biology.

Cultural Relativism led to so-called ‘cultural anthropologists’ focusing on the symbols within a culture rather than comparing the different structures and functions of different social groups, as occurred in ‘social anthropology’ (see below). As comparison was frowned upon, as each culture was regarded as unique, anthropology in the tradition of Mead tended to focus on descriptions of a group’s way of life. Thick description is a trait of ethnography more broadly but it is especially salient amongst anthropologists who believe that cultures can only be understood in their own terms. Such a philosophy has been criticized for turning anthropology into little more than academic-sounding travel writing because it renders it highly personal and lacking in comparative analysis (see Sandall 2001, Ch. 1).

Cultural relativism has also been criticized as philosophically impractical and, ultimately, epistemologically pessimistic (Scruton 2000), because it means that nothing can be compared to anything else or even assessed through the medium of a foreign language’s categories. In implicitly defending cultural relativism, anthropologists have cautioned against assuming that some cultures are more ‘rational’ than others. Hollis (1967), for example, argues that anthropology demonstrates that superficially irrational actions may become ‘rational’ once the ethnographer understands the ‘culture.’ Risjord (2000) makes a similar point. This implies that the cultures are separate worlds, ‘rational’ in themselves. Others have suggested that entering the field assuming that the Western, ‘rational’ way of thinking is correct can lead to biased fieldwork interpretation (for example Rees 2010).

Critics have argued that certain forms of behaviour can be regarded as undesirable in all cultures, yet are only prevalent in some. It has also been argued that Multiculturalism is a form of Neo-Marxism on the grounds that it assumes imperialism and Western civilization to be inherently problematic but also because it lauds the materially unsuccessful. Whereas Marxism extols the values and lifestyle of the worker, and critiques that of the wealthy, Multiculturalism promotes “materially unsuccessful” cultures and critiques more materially successful, Western cultures (for example Ellis 2004 or Gottfried 2004).

Cultural determinism has been criticized both from within and from outside anthropology. From within anthropology, New Zealand anthropologist Derek Freeman (1916-2001), having been heavily influenced by Margaret Mead, conducted his own fieldwork in Samoa around twenty years after she did and then in subsequent fieldwork visits. As he stayed there far longer than Mead, Freeman was accepted to a greater extent and given an honorary chiefly title. This allowed him considerable access to Samoan life. Eventually, in 1983 (after Mead’s death) he published his refutation: Margaret Mead and Samoa: The Making and Unmaking of an Anthropological Myth (Freeman 1983). In it, he argued that Mead was completely mistaken. Samoa was sexually puritanical and violent, and its teenagers experienced just as much angst as they did everywhere else. In addition, he highlighted serious faults with her fieldwork: her sample was very small, she chose to live at the American naval base rather than with a Samoan family, she did not speak Samoan well, and she focused mainly on teenage girls. Freeman even tracked down one of these girls who, as an elderly lady, admitted she and her friends had deliberately lied to Mead about their sex lives for their own amusement (Freeman 1999). It should be emphasized that Freeman’s critique of Mead related to her failure to conduct participant observation fieldwork properly (in line with Malinowski’s recommendations). In that Freeman rejects distinctions of primitive and advanced, and stresses the importance of culture in understanding human differences, his critique is also in the tradition of Boas. However, it should be noted that Freeman’s (1983) critique of Mead has also been criticized as being unnecessarily cutting, prosecuting a case against Mead to the point of bias against her and ignoring points which Mead got right (Schankman 2009, 17).

There remains an ongoing debate about the extent to which culture reflects biology or is on a biological leash. However, a growing body of research in genetics is indicating that human personality is heavily influenced by genetic factors (for example Alarcon, Foulks, and Vakkur 1998 or Wilson 1998), though some research also indicates that environment, especially during fetal development, can alter the expression of genes (see Nettle 2007). This has become part of the critique of cultural determinism from evolutionary anthropologists.

b. Functionalism and Structuralism

Between the 1930s and 1970s, various forms of functionalism were influential in British social anthropology. These schools accepted, to varying degrees, the cultural determinist belief that ‘culture’ was a separate sphere from biology and operated according to its own rules but they also argued that social institutions could be compared in order to better discern the rules of such institutions. They attempted to discern and describe how cultures operated and how the different parts of a culture functioned within the whole. Perceiving societies as organisms has been traced back to Herbert Spencer. Indeed, there is a degree to which Durkheim (1965) attempted to understand, for example, the function of religion in society. But functionalism seemingly reflected aspects of positivism: the search for, in this case, social facts (cross-culturally true), based on empirical evidence.

E. E. Evans-Pritchard (1902-1973) was a leading British functionalist from the 1930s onwards. Rejecting grand theories of religion, he argued that a tribe’s religion could only make sense in terms of function within society and therefore a detailed understanding of the tribe’s history and context was necessary. British functionalism, in this respect, was influenced by the linguistic theories of Swiss thinker Ferdinand de Saussure (1857-1913), who suggested that signs only made sense within a system of signs. Evans-Pritchard also engaged in lengthy fieldwork. This school developed into ‘structural functionalism.’ A. R. Radcliffe-Brown (1881-1955) is often argued to be a structural functionalist, though he denied this. Radcliffe-Brown rejected Malinowski’s functionalism, which argued that social practices were grounded in human instincts. Instead, he was influenced by the process philosophy of Alfred North Whitehead (1861-1947). Radcliffe-Brown claimed that the units of anthropology were processes of human life and interaction. They are in constant flux and so anthropology must explain social stability. He argued that practices, in order to survive, must adapt to other practices, something called ‘co-adaptation’ (Radcliffe-Brown 1957). It might be argued that this leaves us asking where any of the practices came from in the first place.

However, a leading member of the structural functionalist school was Scottish anthropologist Victor Turner (1920-1983). Structural functionalists attempted to understand society as a structure with inter-related parts. In attempting to understand Rites of Passage, Turner argued that everyday structured society could be contrasted with the Rite of Passage (Turner 1969). This was a liminal (transitional) phase which involved communitas (a relative breakdown of structure). Another prominent anthropologist in this field was Mary Douglas (1921-2007). She examined the contrast between the ‘sacred’ and ‘profane’ in terms of categories of ‘purity’ and ‘impurity’ (Douglas 1966). She also suggested a model – the Grid/Group Model – through which the structures of different cultures could be categorized (Douglas 1970). Philosophically, this school accepted many of the assumptions of naturalism but it held to aspects of positivism in that it aimed to answer discrete questions, using the ethnographic method. It has been criticized, as we will see below, by postmodern anthropologists and also for its failure to attempt consilience with science.

Turner, Douglas and other anthropologists in this school followed Malinowski by using categories drawn from the study of ‘tribal’ cultures – such as Rites of Passage, Shaman and Totem – to better comprehend advanced societies such as that of Britain. For example, Turner was highly influential in pursuing the Anthropology of Religion in which he used tribal categories as a means of comprehending aspects of the Catholic Church, such as modern-day pilgrimage (Turner and Turner 1978). This research also involved using the participant observation method. Critics, such as Romanian anthropologist Mircea Eliade (1907-1986) (for example Eliade 2004), have insisted that categories such as ‘shaman’ only make sense within their specific cultural context. Other critics have argued that such scholarship attempts to reduce all societies to the level of the local community and so fails to take into account considerable differences in societal complexity (for example Sandall 2001, Ch. 1). Nevertheless, there is a growing movement within anthropology towards examining various aspects of human life through the so-called tribal prism and, more broadly, through the cultural one. Mary Douglas, for example, has looked at business life anthropologically while others have focused on politics, medicine or education. This has been termed ‘traditional empiricism’ by critics in contemporary anthropology (for example Davies 2010).

In France, in particular, the most prominent school, during this period, was known as Structuralism. Unlike British Functionalism, structuralism was influenced by Hegelian idealism. Most associated with Claude Levi-Strauss, structuralism argued that all cultures follow the Hegelian dialectic. The human mind has a universal structure and a kind of a priori category system of opposites, a point which Hollis argues can be used as a starting point for any comparative cultural analysis. Cultures can be broken up into components – such as ‘Mythology’ or ‘Ritual’ – which evolve according to the dialectical process, leading to cultural differences. As such, the deep structures, or grammar, of each culture can be traced back to a shared starting point (and in a sense, the shared human mind) just as one can with a language. But each culture has a grammar and this allows them to be compared and permits insights to be made about them (see, for example, Levi-Strauss 1978). It might be suggested that the same criticisms that have been leveled against the Hegelian dialectic might be leveled against structuralism, such as it being based around a dogma. It has also been argued that category systems vary considerably between cultures (see Diamond 1974). Even supporters of Levi-Strauss have conceded that his works are opaque and verbose (for example Leach 1974).

c. Post-Modern or Contemporary Anthropology

The ‘postmodern’ thinking of scholars such as Jacques Derrida (1930-2004) and Michel Foucault (1926-1984) began to become influential in anthropology in the 1970s, and its impact has been termed anthropology’s ‘Crisis of Representation.’ During this crisis, which many anthropologists regard as ongoing, every aspect of ‘traditional empirical anthropology’ came to be questioned.

Hymes (1974) criticized anthropologists for imposing ‘Western categories’ – such as Western measurement – on those they study, arguing that this is a form of domination and is immoral, and insisting that truth statements are always subjective and carry cultural values. Talal Asad (1971) criticized field-work based anthropology for ultimately being indebted to colonialism and suggested that anthropology has essentially been a project to enforce colonialism. Geertzian anthropology was criticized because it involved representing a culture, something which inherently involved imposing Western categories upon it through producing texts. Marcus argued that anthropology was ultimately composed of ‘texts’ – ethnographies – which can be deconstructed to reveal power dynamics, normally the dominant-culture anthropologist making sense of the oppressed object of study through means of his or her subjective cultural categories and presenting it to his or her culture (for example Marcus and Cushman 1982). By extension, as all texts – including scientific texts – could be deconstructed, these critics argued that such texts can make no objective assertions. Roth (1989) specifically criticizes seeing anthropology as ‘texts,’ arguing that it does not undermine the empirical validity of the observations involved or help to find the power structures.

Various anthropologists, such as Roy Wagner (b. 1938) (Wagner 1981), argued that anthropologists were simply products of Western culture and they could only ever hope to understand another culture through their own. There was no objective truth beyond culture, simply different cultures with some, scientific ones, happening to be dominant for various historical reasons. Thus, this school strongly advocated cultural relativism. Critics have countered that, after Malinowski, anthropologists, with their participant observation breaking down the color bar, were in fact an irritation to colonial authorities (for example Kuper 1973) and have criticized cultural relativism, as discussed.

This situation led to what has been called the ‘reflexive turn’ in cultural anthropology. As Western anthropologists were products of their culture, just as those whom they studied were, and as anthropologists were themselves fallible, there developed an increasing movement towards ‘auto-ethnography’ in which the anthropologist analyzed their own emotions and feelings towards their fieldwork. The essential argument for anthropologists engaging in detailed analysis of their own emotions is anthropologist Charlotte Davies’ (1999, 6) argument that the ‘purpose of research is to mediate between different constructions of reality, and doing research means increasing understanding of these varying constructs, among which is included the anthropologist’s own constructions’ (see Curran 2010, 109). But implicit in Davies’ argument is that there is no such thing as objective reality and objective truth; there are simply different constructions of reality, as Wagner (1981) also argues. It has also been argued that auto-ethnography is ‘emancipatory’ because it turns anthropology into a dialogue rather than a traditional hierarchical analysis (Heaton-Shreshta 2010, 49). Auto-ethnography has been criticized as self-indulgent and based on problematic assumptions such as cultural relativism and the belief that morality is the most important dimension to scholarship (for example Gellner 1992). In addition, the same criticisms that have been leveled against postmodernism more broadly have been leveled against postmodern anthropology, including criticism of a sometimes verbose and emotive style and the belief that it is epistemologically pessimistic and therefore leads to a Void (for example Scruton 2000). However, cautious defenders insist on the importance of being at least ‘psychologically aware’ (for example Emmet 1976) before conducting fieldwork, a point also argued by Popper (1963) with regard to conducting any scientific research. Berger (2010) also argues that auto-ethnography can be useful to the extent that it elucidates how a ‘social fact’ was uncovered by the anthropologist.

One of the significant results of the ‘Crisis of Representation’ has been a cooling towards the concept of ‘culture’ (and indeed ‘culture shock’) which was previously central to ‘cultural anthropology’ (see Oberg 1960 or Dutton 2012). ‘Culture’ has been criticized as old-fashioned, boring, problematic because it possesses a history (Rees 2010), associated with racism because it has come to replace ‘race’ in far right politics (Wilson 2002, 229), problematic because it imposes (imperialistically) a Western category on other cultures, vague and difficult to perfectly define (Rees 2010), helping to maintain a hierarchy of cultures (Abu-Lughod 1991) and increasingly called into question by globalization and the breakdown of discrete cultures (for example Eriksen 2002 or Rees 2010). Defenders of culture have countered that many of these criticisms can be leveled against any category of apprehension and that the term is not synonymous with ‘nation’ so can be employed even if nations become less relevant (for example Fox and King 2002). Equally, ‘culture shock,’ formerly used to describe a rite of passage amongst anthropologists engaging in fieldwork, has been criticized because of its association with culture and also as old-fashioned (Crapanzano 2010).

In addition, a number of further movements have been provoked by the postmodern movement in anthropology. One of these is ‘Sensory Ethnography’ (for example Pink 2009). It has been argued that traditionally anthropology privileges the Western emphasis on sight and the word and that ethnographies, in order to avoid this kind of cultural imposition, need to look at other senses such as smell, taste and touch. Another movement, specifically in the Anthropology of Religion, has argued that anthropologists should not go into the field as agnostics but should accept the possibility that the religious perspective of the group which they are studying may actually be correct and even work on the assumption that it is and engage in analysis accordingly (a point discussed in Engelke 2002).

During the same period, schools within anthropology developed based around a number of other fashionable philosophical ideologies. Feminist anthropology, like postmodern anthropology, began to come to prominence in the early 1970s. Philosophers such as Sandra Harding (1991) argued that anthropology had been dominated by men and this had led to anthropological interpretations being androcentric and a failure to appreciate the importance of women in social organizations. It has also led to androcentric metaphors in anthropological writing and focusing on research questions that mainly concern men. Strathern (1988) uses what she calls a Marxist-Feminist approach. She employs the categories of Melanesia in order to understand Melanesian gender relations to produce an ‘endogenous’ analysis of the situation. In doing so, she argues that actions in Melanesia are gender-neutral and the asymmetry between males and females is ‘action-specific.’ Thus, Melanesian women are not in any permanent state of social inferiority to men. In other words, if there is a sexual hierarchy it is de facto rather than de jure.

Critics have countered that prominent feminist interpretations have simply turned out to be empirically inaccurate. For example, feminist anthropologists, such as Weiner (1992) as well as Frances Dahlberg (1981), argued that foraging societies prized females and were peaceful and sexually egalitarian. It has been countered that this is a projection of feminist ideals which does not match the facts (Kuznar 1997, Ch. 3). It has also been argued that it does not follow that just because anthropology is male-dominated it is thus biased (Kuznar 1997, Ch. 3). However, feminist anthropologist Alison Wylie (see Risjord 1997) has argued that ‘politically motivated critiques,’ including feminist ones, can improve science. Feminist critique, she argues, demonstrates the influence of ‘androcentric values’ on theory, which forces scientists to hone their theories.

Another school, composed of some anthropologists from less developed countries or their descendants, has proffered a similar critique, adapting the feminist view that anthropology is androcentric by arguing that it is Euro-centric. It has been argued that anthropology is dominated by Europeans, and specifically Western Europeans and those of Western European descent, and therefore reflects European thinking and bias. For example, anthropologists from developing countries, such as Greenlandic Karla Jessen-Williamson, have argued that anthropology would benefit from the more holistic, intuitive thinking of non-Western cultures and that this should be integrated into anthropology (for example Jessen-Williamson 2006). American anthropologist Lee Baker (1991) describes himself as ‘Afro-Centric’ and argues that anthropology must be critiqued due to being based on a ‘Western’ and ‘positivistic’ tradition which is thus biased in favour of Europe. Afrocentric anthropology aims to shift this to an African (or African American) perspective. He argues that metaphors in anthropology, for example, are Euro-centric and justify the suppression of Africans. Thus, Afrocentric anthropologists wish to construct an ‘epistemology’ the foundations of which are African. The criticisms leveled against cultural relativism have been leveled with regard to such perspectives (see Levin 2005).

4. Philosophical Dividing Lines

a. Contemporary Evolutionary Anthropology

The positivist, empirical philosophy already discussed broadly underpins current evolutionary anthropology and there is an extent to which it, therefore, crosses over with biology. This is in line with the Consilience model, advocated by Harvard biologist Edward Wilson (b. 1929) (Wilson 1998), who has argued that the social sciences must attempt to be scientific, in order to share in the success of science, and, therefore, must be reducible to the science which underpins them. Contemporary evolutionary anthropologists, therefore, follow the scientific method, and often a quantitative methodology, to answer discrete questions and attempt to orient anthropological research within biology and the latest discoveries in this field. Some scholars, such as Derek Freeman (1983), have defended a more qualitative methodology but have, nevertheless, argued that their findings need to be ultimately underpinned by scientific research.

For example, anthropologist Pascal Boyer (2001) has attempted to understand the origins of ‘religion’ by drawing upon the latest research in genetics and in particular research into the functioning of the human mind. He has examined this alongside evidence from participant observation in an attempt to ‘explain’ religion. This subsection of evolutionary anthropology has been termed ‘Neuro-anthropology’ and attempts to better understand ‘culture’ through the latest discoveries in brain science. There are many other schools which apply different aspects of evolutionary theory – such as behavioral ecology, evolutionary genetics, paleontology and evolutionary psychology – to understanding cultural differences and different aspects of culture or subsections of culture such as ‘religion.’ Some scholars, such as Richard Dawkins (b. 1941) (Dawkins 1976), have attempted to render the study of culture more systematic by introducing the concept of cultural units – memes – and attempting to chart how and why certain memes are more successful than others, in light of research into the nature of the human brain.

Critics, in naturalist anthropology, have suggested that evolutionary anthropologists are insufficiently critical and go into the field thinking they already know the answers (for example Davies 2010). They have also argued that evolutionary anthropologists fail to appreciate that there are ways of knowing other than science. Some critics have also argued that evolutionary anthropology, with its acceptance of personality differences based on genetics, may lead to the maintenance of class and race hierarchies and to racism and discrimination (see Segerstråle 2000).

b. Anthropology: A Philosophical Split?

It has been argued both by scholars and journalists that anthropology, more so than other social scientific disciplines, is rent by a fundamental philosophical divide, though some anthropologists have disputed this and suggested that qualitative research can help to answer scientific research questions as long as naturalistic anthropologists accept the significance of biology.

The divide is trenchantly summarized by Lawson and McCauley (1993) who distinguish between ‘interpretivists’ and ‘scientists,’ or, as noted above, ‘positivists’ and ‘naturalists.’ For the scientists, the views of the ‘cultural anthropologists’ (as they call themselves) are too speculative, especially because pure ethnographic research is subjective, and are meaningless where they cannot be reduced to science. For the interpretivists, the ‘evolutionary anthropologists’ are too ‘reductionistic’ and ‘mechanistic,’ they do not appreciate the benefits of a subjective approach (such as garnering information that could not otherwise be garnered), and they ignore questions of ‘meaning,’ as they suffer from ‘physics envy.’

Some anthropologists, such as Risjord (2000, 8), have criticized this divide, arguing that the two perspectives can be united and that only through ‘explanatory coherence’ (combining objective analysis of a group with the face-value beliefs of the group members) can a fully coherent explanation be reached. Otherwise, anthropology will ‘never reach the social reality at which it aims.’ But this seems to raise the question of what it means to ‘reach the social reality.’

In practical terms, the split has already been happening, as discussed in Segal and Yanagisako (2005, Ch. 1). They note that some American anthropological departments demand that their lecturers be committed to holistic ‘four field anthropology’ (archaeology, cultural, biological and linguistic) precisely because of this ongoing split and in particular the divergence between biological and cultural anthropology. They observe that already by the end of the 1980s most biological anthropologists had left the American Anthropological Association. Though they argue that ‘holism’ was less necessary in Europe, because it was US anthropology that, in focusing on Native Americans, ‘bundled’ the four, Fearn (2008) notes that there is a growing divide in British anthropology departments as well along the same dividing lines of positivism and naturalism.

Evolutionary anthropologists and, in particular, postmodern anthropologists do seem to follow philosophies with essentially different presuppositions. In November 2010, this divide became particularly contentious when the American Anthropological Association voted to remove the word ‘science’ from its Mission Statement (Berrett 2010).

5. References and Further Reading

  • Abu-Lughod, Lila. 1991. “Writing Against Culture.” In Richard Fox (ed.), Recapturing Anthropology: Working in the Present (pp. 466-479). Santa Fe: School of American Research Press.
  • Alarcon, Renato, Foulks, Edward and Vakkur, Mark. 1998. Personality Disorders and Culture: Clinical and Conceptual Interactions. New York: John Wiley and Sons.
  • American Anthropological Association. 1998. “American Anthropological Association Statement on “Race.”” 17 May.
  • Andreski, Stanislav. 1974. Social Sciences as Sorcery. London: Penguin.
  • Asad, Talal. 1971. “Introduction.” In Talal Asad (ed.), Anthropology and the Colonial Encounter. Atlantic Highlands: Humanities Press.
  • Baiburin, Albert. 2005. “The Current State of Ethnography and Anthropology in Russia.” For Anthropology and Culture 2, 448-489.
  • Baker, Lee. 1991. “Afro-Centric Racism.” University of Pennsylvania: African Studies Center.
  • Barenski, Janusz. 2008. “The New Polish Anthropology.” Studia Ethnologica Croatica 20, 211-222.
  • Beddoe, John. 1862. The Races of Britain: A Contribution to the Anthropology of Western Europe. London.
  • Benedict, Ruth. 1934. Patterns of Culture. New York: Mifflin.
  • Berger, Peter. 2010. “Assessing the Relevance and Effects of “Key Emotional Episodes” for the Fieldwork Process.” In Dimitrina Spencer and James Davies (eds.), Anthropological Fieldwork: A Relational Process (pp. 119-143). Newcastle: Cambridge Scholars Publishing.
  • Berrett, Daniel. 2010. “Anthropology Without Science.” Inside Higher Ed, 30 November.
  • Bidney, David. 1953. Theoretical Anthropology. New York: Columbia University Press.
  • Boas, Franz. 1907. The Mind of Primitive Man. New York: MacMillan.
  • Bourgois, Philippe. 2007. “Confronting the Ethics of Ethnography: Lessons from Fieldwork in Central America.” In Antonius Robben and Jeffrey Slukka (eds.), Ethnographic Fieldwork: An Anthropological Reader (pp. 288-297). Oxford: Blackwell.
  • Boyer, Pascal. 2001. Religion Explained: The Human Instincts That Fashion Gods, Spirits and Ancestors. London: William Heinemann.
  • Cattell, Raymond. 1972. Beyondism: A New Morality from Science. New York: Pergamon.
  • Ciubrinskas, Vytis. 2007. “Interview: “Anthropology is Badly Needed in Eastern Europe.””
  • Curran, John. 2010. “Emotional Interaction and the Acting Ethnographer: An Ethical Dilemma?” In Dimitrina Spencer and James Davies (eds.), Anthropological Fieldwork: A Relational Process (pp. 100-118). Newcastle: Cambridge Scholars Publishing.
  • Crapanzano, Vincent. 2010. ““At the Heart of the Discipline”: Critical Reflections on Fieldwork.” In James Davies and Dimitrina Spencer (eds.), Emotions in the Field: The Psychology and Anthropology of Fieldwork Experience (pp. 55-78). Stanford: Stanford University Press.
  • Dahlberg, Frances. 1981. “Introduction.” In Frances Dahlberg (ed.), Woman the Gatherer (pp. 1-33). New Haven: Yale University Press.
  • Darwin, Charles. 1871. The Descent of Man. London: John Murray.
  • Darwin, Charles. 1859. The Origin of Species. London: John Murray.
  • Davies, James. 2010. “Conclusion: Subjectivity in the Field: A History of Neglect.” In Dimitrina Spencer and James Davies (eds.), Anthropological Fieldwork: A Relational Process (pp. 229-243). Newcastle: Cambridge Scholars Publishing.
  • Davies, Charlotte. 1999. Reflexive Ethnography: A Guide to Researching Selves and Others. London: Routledge.
  • Dawkins, Richard. 1976. The Selfish Gene. Oxford: Oxford University Press.
  • Diamond, Stanley. 1974. In Search of the Primitive. New Brunswick: Transaction Books.
  • Douglas, Mary. 1970. Natural Symbols: Explorations in Cosmology. London: Routledge.
  • Douglas, Mary. 1966. Purity and Danger: An Analysis of the Concepts of Pollution and Taboo. London: Routledge.
  • Durkheim, Emile. 1995. The Elementary Forms of Religious Life. New York: Free Press.
  • Dutton, Edward. 2012. Culture Shock and Multiculturalism. Newcastle: Cambridge Scholars Publishing.
  • Ehrenfels, Umar Rolf, Madan, Triloki Nath, and Comas, Juan. 1962. “Mankind Quarterly Under Heavy Criticism: 3 Comments on Editorial Practices.” Current Anthropology 3, 154-158.
  • Eliade, Mircea. 2004. Shamanism: Archaic Techniques of Ecstasy. Princeton: Princeton University Press.
  • Ellis, Frank. 2004. Political Correctness and the Theoretical Struggle: From Lenin and Mao to Marcuse and Foucault. Auckland: Maxim Institute.
  • Emmet, Dorothy. 1976. “Motivation in Sociology and Social Anthropology.” Journal for the Theory of Social Behaviour 6, 85-104.
  • Engelke, Matthew. 2002. “The Problem of Belief: Evans-Pritchard and Victor Turner on the Inner Life.” Anthropology Today 18, 3-8.
  • Eriksen, Thomas Hylland. 2003. “Introduction.” In Thomas Hylland Eriksen (ed.), Globalisation: Studies in Anthropology (pp. 1-17). London: Pluto Press.
  • Eriksen, Thomas Hylland. 2001. A History of Anthropology. London: Pluto Press.
  • Fearn, Hannah. 2008. “The Great Divide.” Times Higher Education, 28 November.
  • Fox, Richard, and King, Barbara. 2002. “Introduction: Beyond Culture Worry.” In Richard Fox and Barbara King (eds.), Anthropology Beyond Culture (pp. 1-19). Oxford: Berg.
  • Frazer, James. 1922. The Golden Bough: A Study in Magic and Religion. London: MacMillan.
  • Freeman, Derek. 1999. The Fateful Hoaxing of Margaret Mead: A Historical Analysis of Her Samoan Research. London: Basic Books.
  • Freeman, Derek. 1983. Margaret Mead and Samoa: The Making and Unmaking of an Anthropological Myth. Cambridge: Harvard University Press.
  • Galton, Francis. 1909. Essays in Eugenics. London: Eugenics Education Society.
  • Geertz, Clifford. 1973. The Interpretation of Cultures. New York: Basic Books.
  • Geertz, Clifford. 1999. “‘From the Native’s Point of View’: On the Nature of Anthropological Understanding.” In Russell T. McCutcheon (ed.), The Insider/Outsider Problem in the Study of Religion: A Reader (pp. 50-63). New York: Cassell.
  • Gellner, Ernest. 1995. Anthropology and Politics: Revolutions in the Sacred Grove. Oxford: Blackwell.
  • Gellner, Ernest. 1992. Post-Modernism, Reason and Religion. London: Routledge.
  • Gobineau, Arthur de. 1915. The Inequality of Races. New York: G. P. Putnam and Sons.
  • Gorton, William. 2010. “The Philosophy of Social Science.” Internet Encyclopedia of Philosophy.
  • Gottfried, Paul. 2004. Multiculturalism and the Politics of Guilt: Towards a Secular Theocracy. Columbia: University of Missouri Press.
  • Grant, Madison. 1916. The Passing of the Great Race: Or the Racial Basis of European History. New York: Charles Scribner’s Sons.
  • Hallpike, Christopher Robert. 1986. The Principles of Social Evolution. Oxford: Clarendon Press.
  • Harding, Sandra. 1991. Whose Science? Whose Knowledge? Thinking from Women’s Lives. Ithaca: Cornell University Press.
  • Headland, Thomas, Pike, Kenneth, and Harris, Marvin. 1990. Emics and Etics: The Insider/Outsider Debate. New York: Sage Publications.
  • Heaton-Shrestha, Celayne. 2010. “Emotional Apprenticeships: Reflections on the Role of Academic Practice in the Construction of ‘the Field.’” In Dimitrina Spencer and James Davies (eds.), Anthropological Fieldwork: A Relational Process (pp. 48-74). Newcastle: Cambridge Scholars Publishing.
  • Hollis, Martin. 1967. “The Limits of Irrationality.” European Journal of Sociology 8, 265-271.
  • Hymes, Dell. 1974. “The Use of Anthropology: Critical, Political, Personal.” In Dell Hymes (ed.), Reinventing Anthropology (pp. 3-82). New York: Vintage Books.
  • Jessen Williamson, Karla. 2006. Inuit Post-Colonial Gender Relations in Greenland. Aberdeen University: PhD Thesis.
  • Kapferer, Bruce. 2001. “Star Wars: About Anthropology, Culture and Globalization.” Suomen Antropologi: Journal of the Finnish Anthropological Society 26, 2-29.
  • Kuper, Adam. 1973. Anthropologists and Anthropology: The British School 1922-1972. New York: Pica Press.
  • Kuznar, Lawrence. 1997. Reclaiming a Scientific Anthropology. Walnut Creek: AltaMira Press.
  • Lawson, Thomas, and McCauley, Robert. 1993. Rethinking Religion: Connecting Cognition and Culture. Cambridge: Cambridge University Press.
  • Leach, Edmund. 1974. Claude Levi-Strauss. New York: Viking Press.
  • Levi-Strauss, Claude. 1978. Myth and Meaning. London: Routledge.
  • Levin, Michael. 2005. Why Race Matters. Oakton: New Century Foundation.
  • Lovejoy, Arthur. 1936. The Great Chain of Being: A Study of the History of an Idea. Cambridge: Harvard University Press.
  • Lynn, Richard. 2006. Race Differences in Intelligence: An Evolutionary Analysis. Augusta: Washington Summit Publishers.
  • Malinowski, Bronislaw. 1922. Argonauts of the Western Pacific. London: Routledge.
  • Marcus, George, and Cushman, Dick. 1974. “Ethnographies as Texts.” Annual Review of Anthropology 11, 25-69.
  • Mead, Margaret. 1928. Coming of Age in Samoa: A Psychological Study of Primitive Youth for Western Civilization. London: Penguin.
  • Montagu, Ashley. 1945. Man’s Most Dangerous Myth: The Fallacy of Race. New York: Columbia University Press.
  • Nettle, Daniel. 2007. Personality: What Makes Us the Way We Are. Oxford: Oxford University Press.
  • Oberg, Kalervo. 1960. “Culture Shock: Adjustment to New Cultural Environments.” Practical Anthropology 7, 177-182.
  • Pearson, Roger. 1991. Race, Intelligence and Bias in Academe. Washington DC: Scott-Townsend Publishers.
  • Pels, Peter. 1999. “Professions of Duplexity: A Prehistory of Ethical Codes.” Current Anthropology 40, 101-136.
  • Pichot, Andre. 2009. The Pure Society: From Darwin to Hitler. London: Verso.
  • Pike, Luke. 1869. “On the Alleged Influence of Race Upon Religion.” Journal of the Anthropological Society of London 7, CXXXV-CLIII.
  • Pink, Sarah. 2009. Doing Sensory Ethnography. London: Sage Publications.
  • Popper, Karl. 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
  • Popper, Karl. 1957. The Poverty of Historicism. London: Routledge.
  • Radcliffe-Brown, Alfred Reginald. 1957. A Natural Science of Society. Chicago: University of Chicago Press.
  • Rees, Tobias. 2010. “On the Challenge – and the Beauty – of (Contemporary) Anthropological Inquiry: A Response to Edward Dutton.” Journal of the Royal Anthropological Institute 16, 895-900.
  • Risjord, Mark. 2007. “Scientific Change as Political Action: Franz Boas and the Anthropology of Race.” Philosophy of the Social Sciences 37, 24-45.
  • Risjord, Mark. 2000. Woodcutters and Witchcraft: Rationality and the Interpretation of Change in the Social Sciences. Albany: State University of New York Press.
  • Rogan, Bjarne. 2012. “The Institutionalization of Folklore.” In Regina Bendix and Galit Hasan-Rokem (eds.), A Companion to Folklore (pp. 598-630). Oxford: Blackwell.
  • Roth, Paul. 1989. “Anthropology Without Tears.” Current Anthropology 30, 555-569.
  • Sandall, Roger. 2001. The Culture Cult: On Designer Tribalism and Other Essays. Oxford: Westview Press.
  • Sarkany, Mihaly. ND. “Cultural and Social Anthropology in Central and Eastern Europe.” Leibniz Institute for the Social Sciences.
  • Shankman, Paul. 2009. The Trashing of Margaret Mead: Anatomy of an Anthropological Controversy. Madison: University of Wisconsin Press.
  • Scruton, Roger. 2000. Modern Culture. London: Continuum.
  • Sedgwick, Mitchell. 2006. “The Discipline of Context: On Ethnography Amongst the Japanese.” In Joy Hendry and Heung Wah Wong (eds.), Dismantling the East-West Dichotomy: Essays in Honour of Jan van Bremen (pp. 64-68). New York: Routledge.
  • Segal, Daniel, and Yanagisako, Sylvia. 2005. “Introduction.” In Daniel Segal and Sylvia Yanagisako (eds.), Unwrapping the Sacred Bundle: Reflections on the Disciplining of Anthropology (pp. 1-23). Durham: Duke University Press.
  • Segerstråle, Ullica. 2000. Defenders of the Truth: The Sociobiology Debate. Oxford: Oxford University Press.
  • Siikala, Jukka. 2006. “The Ethnography of Finland.” Annual Review of Anthropology 35, 153-170.
  • Sluka, Jeffrey. 2007. “Fieldwork Ethics: Introduction.” In Antonius Robben and Jeffrey Sluka (eds.), Ethnographic Fieldwork: An Anthropological Reader (pp. 271-276). Oxford: Blackwell.
  • Spencer, Herbert. 1873. The Study of Sociology. New York: D. Appleton and Co.
  • Srivastava, Vinay Kumar. 2000. “Teaching Anthropology.” Seminar 495.
  • Strathern, Marilyn. 1988. The Gender of the Gift: Problems with Women and Problems with Society in Melanesia. Berkeley: University of California Press.
  • Stocking, George. 1991. Victorian Anthropology. New York: Free Press.
  • Tucker, William. 2002. The Funding of Scientific Racism: Wickliffe Draper and the Pioneer Fund. Urbana: University of Illinois Press.
  • Turner, Victor. 1969. The Ritual Process: Structure and Anti-Structure. New York: Aldine Publishers.
  • Turner, Victor, and Turner, Edith. 1978. Image and Pilgrimage in Christian Culture: Anthropological Perspectives. New York: Columbia University Press.
  • Tylor, Edward Burnett. 1871. Primitive Culture: Researches into the Development of Mythology, Religion, Art and Custom. London: John Murray.
  • Tylor, Edward Burnett. 1865. Researches into the Early History of Mankind. London: John Murray.
  • Wagner, Roy. 1981. The Invention of Culture. Chicago: University of Chicago Press.
  • Weiner, Annette. 1992. Inalienable Possessions. Los Angeles: University of California Press.
  • Wilson, Edward Osborne. 1998. Consilience: The Unity of Knowledge. New York: Alfred A. Knopf.
  • Wilson, Edward Osborne. 1975. Sociobiology: The New Synthesis. Cambridge: Harvard University Press.
  • Wilson, Richard. 2002. “The Politics of Culture in Post-apartheid South Africa.” In Richard Fox and Barbara King (eds.), Anthropology Beyond Culture (pp. 209-234). Oxford: Berg.


Author Information

Edward Dutton
University of Oulu

Time Supplement

This supplement answers a series of questions designed to reveal more about what science requires of physical time, and to provide background information about other topics discussed in the Time article.

Table of Contents

  1. What Are Instants and Durations?
  2. What Is an Event?
  3. What Is a Reference Frame?
  4. What Is an Inertial Frame?
  5. What Is Spacetime?
  6. What Is a Minkowski Spacetime Diagram?
  7. What Are the Metric and the Interval?
  8. Does the Theory of Relativity Imply Time Is Partly Space?
  9. Is Time the Fourth Dimension?
  10. Is There More Than One Kind of Physical Time?
  11. How Is Time Relative to the Observer?
  12. What Is the Relativity of Simultaneity?
  13. What Is the Conventionality of Simultaneity?
  14. What Is the Difference between the Past and the Absolute Past?
  15. What Is Time Dilation?
  16. How Does Gravity Affect Time?
  17. What Happens to Time Near a Black Hole?
  18. What Is the Solution to the Twin Paradox?
  19. What Is the Solution to Zeno's Paradoxes?
  20. How Do Time Coordinates Get Assigned to Points of Spacetime?
  21. How Do Dates Get Assigned to Actual Events?
  22. What Is Essential to Being a Clock?
  23. What Does It Mean for a Clock to Be Accurate?
  24. What Is Our Standard Clock?
  25. Why Are Some Standard Clocks Better Than Others?

1. What Are Instants and Durations?

A duration is an amount of time. The duration of Earth's existence is about four and a half billion years; the duration of a flash of lightning is 0.0002 seconds. The second is the standard unit for the measurement of duration [in the S.I. system (the International System of Units, that is, Le Système International d'Unités)]. In informal conversation, an instant is a very short duration. In physics, however, an instant is instantaneous; it is not a very short duration but rather a point in time of zero duration. It is assumed in physics that a finite duration of a real event is always a linear continuum of the instants that compose the duration, but it is an interesting philosophical question to ask how physicists know this.

2. What Is an Event?

In ordinary discourse, an event is a happening lasting a finite duration during which some object changes its properties. For example, this morning’s event of buttering the toast is the toast’s changing from unbuttered to buttered. In ordinary discourse, unlike in physics, events are not basic, but rather are defined in terms of something more basic—objects and their properties. In physics it is the other way round. Events are basic, and objects are defined in terms of them.

The philosopher Jaegwon Kim suggested that an event is an object’s having a property at a time. So, two events are the same if they are both events of the same object having the same property at the same time. This suggestion makes it difficult to make sense of the remark, “The bombing of Pearl Harbor in World War II could have started an hour earlier.” On Kim’s analysis, the bombing could not have started earlier because, if it did, it would be a different event. A possible-worlds analysis of events might be the way to solve this problem, but the solution will not be explored here.

Physicists adopt the idealization that a basic event is a so-called point event: a property (value of a variable) at an instant of time and at a point in space. For example, there is the event of the gravitational field having the value g at place <x,y,z> at time t. In ordinary discourse an event must involve a change in some property; the physicist’s event does not have this requirement. For the physicist, all other events are said to be composed of point events. The bombing of Pearl Harbor is a large set of point events.

A mathematical space is a collection of points, and the points might represent anything, for example, dollars. But the points of a real space, that is, a physical space, are locations. For example, the place called “New York City” at one time is composed of the actual point locations which occur within the city’s boundary at that time.

The physicists’ notion of point event is metaphysically unacceptable to many philosophers, in part because it deviates so much from the way “event” is used in ordinary language. In 1936, in order to avoid point events, Bertrand Russell and A. N. Whitehead developed a theory of time based on the assumption that all events in spacetime have a finite, non-zero duration. However, they had to assume that any finite part of an event is an event, and this assumption is no closer to common sense than the physicist’s assumption that all events are composed of point events. The encyclopedia article on Zeno’s Paradoxes mentions that Michael Dummett and Frank Arntzenius have continued in the 21st century to develop Russell’s and Whitehead’s idea that any event must have a non-zero duration.

McTaggart argued early in the twentieth century that events change. For example, the event of Queen Anne’s death is changing because it is receding ever farther into the past as time goes on. Many other philosophers believe it is improper to consider an event to be something that can change. Whether events change in this manner is still an open question in philosophy.

For the physicist, it would be a mistake to say an event is an object’s having a property at a time and place. One needs to say an event is an object's having a property at a time and place in a specific reference frame. The bombing of Pearl Harbor lasts longer in some reference frames than others. The point is developed in the next section of this Supplement.

For a more detailed discussion of what an event is, see the article on Events.

3. What Is a Reference Frame?

A reference frame for a space is a standard point of view or a perspective for making observations, measurements and judgments about points in the space and phenomena that take place there. Usually a reference frame is specified by choosing a coordinate system.

Choosing a good reference frame can make a situation much easier to describe. If you are trying to describe the motion of a car down a straight highway, you would not want to choose a reference frame that is fixed to a spinning carousel. Instead, choose a reference frame fixed to the highway or else fixed to the car.

A reference frame is often specified by selecting a solid object that doesn’t change its size and by saying that the reference frame is fixed to the object. We might select a reference frame fixed to the Rock of Gibraltar. Another object is said to be at rest in the reference frame if it remains at a constant distance in a fixed direction from the Rock of Gibraltar. For example, your house is at rest in a reference frame fixed to the Rock of Gibraltar [not counting your house's vibrating when a truck drives by, nor the house's speed due to plate tectonics]. When we say the Sun rose this morning, we are implicitly choosing a reference frame fixed to the Earth’s surface. The Sun is not at rest in this reference frame, but the Earth is.

The reference frame or coordinate system must specify locations, and this is normally done by assigning numbers to points of space. In a flat (that is, Euclidean) three-dimensional space, the analyst who wants to assign a Cartesian (that is, flat or rectangular) coordinate system to the space will need to specify four distinct points on the reference body, or four objects mutually at rest somewhere in the frame. In a Cartesian coordinate system, one of the four points is the origin, and the other three can be used to define three independent, perpendicular axes, the familiar x, y and z directions. Two point objects are at the same place if they have the same x-value, the same y-value and the same z-value. To keep track of events rather than simply 3-d objects, you the analyst will need a time axis, a “t” axis, and so you will expand your three-dimensional mathematical space to a four-dimensional mathematical space. Two point events are identical if they occur at the same place and also at the same time. In this way, the analyst is placing a four-dimensional coordinate system on the space and time. The coordinates could have been letters instead of numbers, but real numbers are the best choice because we want to use them for measurement, not just for naming places and events.

For the physicist, in a reference frame, two basic events are simultaneous if light beams emitted from each of them meet at the point halfway between the locations of the two events in that frame. The assumption here is that the light beams hit no obstacles along the way. Like simultaneity, the concept of earlier-than is frame relative. A moment, that is, a time, can be characterized as the set of all basic events which are simultaneous with one another (in a given reference frame). Moment x is considered to be earlier than moment y if all events constituting x are earlier than all events constituting y. Given an event, there is no single time or moment at which it occurs; it can occur at one moment in one frame and at a different moment in another frame. We are now far from the intuitive idea of moment.

Physicists define a useful frame-independent notion of an event x being in the absolute past, as opposed to merely being in the past, of event y by saying this occurs if and only if (iff), in all frames of reference, x is earlier than y. What follows is that x is in the absolute past of y iff a light beam from x could have reached y. This is often expressed by saying x is in the absolute past of y iff x could have caused y but not vice versa.
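The light-beam criterion for the absolute past can be illustrated with a short computation. What follows is only a minimal sketch in Python, assuming flat spacetime and coordinates taken from a single inertial frame; the names Event and in_absolute_past are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
import math

C = 299_792_458.0  # speed of light in a vacuum, metres per second


@dataclass(frozen=True)
class Event:
    """A point event labelled by coordinates in one inertial frame (flat spacetime assumed)."""
    t: float  # seconds
    x: float  # metres
    y: float  # metres
    z: float  # metres


def in_absolute_past(a: Event, b: Event) -> bool:
    """True if a light beam (or anything slower) leaving event a could reach event b."""
    dt = b.t - a.t
    distance = math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
    return dt > 0 and distance <= C * dt


here_now = Event(t=0.0, x=0.0, y=0.0, z=0.0)
later_nearby = Event(t=1.0, x=1000.0, y=0.0, z=0.0)   # one second later, one kilometre away
later_far = Event(t=1.0, x=2 * C, y=0.0, z=0.0)       # one second later, two light-seconds away

print(in_absolute_past(here_now, later_nearby))  # True
print(in_absolute_past(here_now, later_far))     # False
```

In the second case, here_now is earlier than later_far in this frame, but it is not in the absolute past of later_far, because no light beam could connect the two events.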

This definition of “moment” presupposes relationalism. Also, it uses actual events rather than possible events, and it presupposes there are no empty moments, moments at which no event takes place. For any point of spacetime, perhaps it can be assumed that some event or other is always occurring there, such as its having a value for the gravitational field, or its having the property of not being part of a unicorn at that location and time.

The fact that physical spacetime has curvature implies that no single rigid (or Cartesian) coordinate system is capable of covering the entire spacetime. To cover all of spacetime in that case, we must make do with covering different regions of spacetime with different coordinate patches that are “knitted together” where one patch meets another. No single Cartesian coordinate system can cover the surface of a sphere without creating a singularity, but the sphere can be covered by patching together coordinate systems. Nevertheless, if we can live with non-rigid curvilinear coordinates, then any curved spacetime can be covered with a global four-dimensional coordinate system in which every point is uniquely identified with a set of four numbers in a continuous way. That is, we use a curved coordinate system on curved spacetime.

A dimension is a direction in a space, and a coordinate is a number that serves as a location along a dimension. That we use four numbers per point usually indicates the space is four-dimensional. In creating reference frames for spaces, the usual assumption is that we should supply n independent numbers to specify a place in an n-dimensional space, where n is an integer. This is usual but not required; instead we could exploit the idea that there are space-filling curves which permit a single continuous curve to completely fill, and thus coordinatize, a region of dimension higher than one, such as a plane or a 3-dimensional space. For this reason (namely, that each point in n-dimensional space doesn’t always need n numbers to uniquely name the point), the contemporary definition of “dimension” is rather exotic.
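The idea that a single number can coordinatize a two-dimensional region can be made concrete with a finite approximation to a space-filling curve. The Python sketch below uses the well-known iterative construction of the Hilbert curve on a square grid; it is an illustration only, the function names are hypothetical, and a true space-filling curve is the limit of such approximations as the grid is refined.

```python
from typing import Tuple


def _rotate(s: int, x: int, y: int, rx: int, ry: int) -> Tuple[int, int]:
    """Rotate or flip a quadrant so that consecutive pieces of the curve join end to end."""
    if ry == 0:
        if rx == 1:
            x = s - 1 - x
            y = s - 1 - y
        x, y = y, x
    return x, y


def hilbert_index_to_xy(side: int, d: int) -> Tuple[int, int]:
    """Map one number d in [0, side*side) to a cell (x, y) of a side-by-side grid.

    `side` is assumed to be a power of two.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rotate(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


side = 4
cells = [hilbert_index_to_xy(side, d) for d in range(side * side)]
print(cells)
```

Each of the sixteen cells receives exactly one index, and cells with consecutive indices are neighbors, so one coordinate suffices to name every cell of the two-dimensional grid.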

Inertial frames are very special reference frames; see below.

4. What Is an Inertial Frame?

Special relativity is intended to apply only to inertial frames. Einstein's theory of special relativity is his 1905 theory of bodies that move in space and time. It is called "special" because it postulates the Lorentz-invariance of all physical law statements that hold in special reference frames, called inertial frames. If we do not speak too precisely, we can say an inertial reference frame is a frame of reference in which Newton’s laws of motion are satisfied. That means that if you place a rock somewhere and don’t put any unbalanced external force on it, then the rock stays there forever; and if you give that rock a speed of 3 miles per hour, then from then on it will travel at 3 miles per hour until some force acts on it such as its hitting another rock. Our reality isn’t so simple; inertial reference frames do not exist and Newton's laws of motion are not true. However, for small volumes (rather than the whole universe) and short times (rather than eternity) there can be frames that are approximately inertial.

Suppose you've pre-selected your frame. How do you tell whether it is an inertial frame? The answer is that you check its laws of motion; you check that objects accelerate only when acted on by external forces. If no forces are present, then a moving object moves in a straight line. It doesn't curve; it coasts. And it travels equal distances in equal amounts of time.
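The test just described (a force-free object travels in a straight line and covers equal distances in equal times) can be sketched as a check on recorded positions. The Python below is a minimal illustration, assuming the positions of one force-free test object are sampled at equally spaced clock readings in the frame being tested; the function name looks_inertial is hypothetical, and passing the check for one object is evidence for, not proof of, the frame being inertial.

```python
import math
from typing import Sequence, Tuple

Position = Tuple[float, float, float]


def looks_inertial(positions: Sequence[Position], tolerance: float = 1e-9) -> bool:
    """Check whether positions sampled at equal time steps show constant velocity."""
    # Displacement vectors between successive samples.
    steps = [
        tuple(b - a for a, b in zip(p, q))
        for p, q in zip(positions, positions[1:])
    ]
    # Constant velocity means every displacement equals the first one:
    # same direction (straight line) and same length (equal distances in equal times).
    return all(math.dist(step, steps[0]) <= tolerance for step in steps)


# A coasting rock, as seen from a frame fixed to the highway.
coasting = [(3.0 * k, 0.0, 0.0) for k in range(5)]

# A force-free object whose recorded path curves, as it would in a spinning frame.
curving = [(math.cos(k), math.sin(k), 0.0) for k in range(5)]

print(looks_inertial(coasting))  # True
print(looks_inertial(curving))   # False
```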

Any frame of reference moving at constant velocity relative to an inertial frame is also an inertial frame. A reference frame spinning relative to an inertial frame is never an inertial frame.

According to special relativity, the speed of light in a vacuum is the same when observed from any inertial frame of reference. Unlike the speed of a spaceship, the speed of light in a vacuum isn't affected by which inertial reference frame is used for the measurement. If you have two relatively stationary, synchronized clocks in an inertial frame, then they will read the same time, but if one moves relative to the other, then they will get out of synchrony. This loss of synchrony due to relative motion is called "time dilation."

The presence of gravitation normally destroys any possibility of finding a perfect inertial frame. Nevertheless, any spacetime obeying the general theory of relativity and thus accounting for gravitation will be locally Minkowskian in the sense that any infinitesimal region of spacetime has an inertial frame obeying the principles of special relativity.

5. What Is Spacetime?

Spacetime is where events are located, or, depending on your theory of spacetime, it can be said to be all possible events. Metaphysicians might say it is the mereological sum of those events. The dimensions of real spacetime include the time dimension of happens-after and (at least) the three ordinary space dimensions of, say, up-down, left-right, and forward-backward. That is, spacetime is usually represented with a four-dimensional mathematical space, one of whose dimensions represents time and three of whose dimensions represent space.

Spacetime is the intended model of the general theory of relativity. This requires it to be a differentiable space in which physical objects obey the equations of motion of the theory. Minkowski space (that is, Minkowski spacetime) is the model of special relativity. General relativity theory requires that spacetime be locally a Minkowski spacetime.

Hermann Minkowski, in 1908, was the first person to say that spacetime is fundamental and that space and time are just aspects of spacetime. Minkowski meant it is fundamental in the sense that the spacetime interval between any two events is intrinsic to spacetime and does not vary with the reference frame, unlike a distance or a duration between the two events.
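Minkowski's point about the invariance of the interval can be checked numerically. The following Python sketch assumes one space dimension, units in which the speed of light is 1, and the sign convention in which the squared interval is (c·Δt)² minus Δx²; the function names boost and interval_squared are hypothetical.

```python
import math


def boost(t: float, x: float, v: float, c: float = 1.0):
    """Lorentz-transform the coordinates (t, x) into a frame moving at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2), gamma * (x - v * t)


def interval_squared(dt: float, dx: float, c: float = 1.0) -> float:
    """Squared spacetime interval, using the convention (c*dt)^2 - dx^2."""
    return (c * dt) ** 2 - dx**2


# Two events described in the first frame.
event_a = (0.0, 0.0)
event_b = (5.0, 3.0)

# The same two events described in a frame moving at 0.6 of the speed of light.
v = 0.6
a2 = boost(*event_a, v)
b2 = boost(*event_b, v)

dt, dx = event_b[0] - event_a[0], event_b[1] - event_a[1]
dt2, dx2 = b2[0] - a2[0], b2[1] - a2[1]

print("frame 1:", dt, dx, interval_squared(dt, dx))
print("frame 2:", dt2, dx2, interval_squared(dt2, dx2))
```

The duration and the distance between the two events differ from one frame to the other, but the squared interval comes out the same (up to rounding) in both.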

Spacetime is believed to be a continuum in which we can define points and straight lines. However, these points and lines do not satisfy the principles of Euclidean geometry when gravity is present. Einstein showed that the presence of gravity affects geometry by warping space and time. Einstein's principal equation in his general theory of relativity implies that the curvature of spacetime is directly proportional to the density of mass in the spacetime. That is, Einstein says the structure of spacetime changes as matter moves because the gravitational field from matter actually curves spacetime. Black holes are a sign of radical curvature. The Earth's curving of spacetime is very slight but still significant enough that it must be accounted for in the clocks of the Global Positioning System (GPS) satellites along with the other time dilation effect that is caused by speed. The GPS satellites are launched with their clocks adjusted so that when they reach orbit they mark time the same as Earth-based clocks do.
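The size of the two GPS effects can be estimated with a rough calculation. The Python sketch below is an approximation only: it assumes a circular orbit, the weak-field formulas for clock rates, a ground clock at the mean radius of a non-rotating, spherical Earth, and a rounded orbital radius. The variable names are chosen for illustration.

```python
C = 299_792_458.0          # speed of light in a vacuum, m/s
GM_EARTH = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6          # mean radius of the Earth, m
R_ORBIT = 2.6571e7         # approximate GPS orbital radius, m (about 20,200 km altitude)
SECONDS_PER_DAY = 86_400.0

# Gravitational effect: the satellite clock is higher in the Earth's
# gravitational field, so it runs faster than a clock on the ground.
gravitational_rate = GM_EARTH / C**2 * (1.0 / R_EARTH - 1.0 / R_ORBIT)

# Speed effect: for a circular orbit v^2 = GM/r, and the moving clock
# runs slower by approximately v^2 / (2 c^2).
orbital_speed_squared = GM_EARTH / R_ORBIT
velocity_rate = -orbital_speed_squared / (2.0 * C**2)

# Convert fractional rates to microseconds gained or lost per day.
gravitational_us_per_day = gravitational_rate * SECONDS_PER_DAY * 1e6
velocity_us_per_day = velocity_rate * SECONDS_PER_DAY * 1e6
net_us_per_day = gravitational_us_per_day + velocity_us_per_day

print(f"gravity: {gravitational_us_per_day:+.1f} microseconds per day")
print(f"speed:   {velocity_us_per_day:+.1f} microseconds per day")
print(f"net:     {net_us_per_day:+.1f} microseconds per day")
```

Under these assumptions the gravitational effect is the larger of the two, so an uncorrected satellite clock would run ahead of a ground clock by a few tens of microseconds per day.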

There have been serious attempts over the last few decades to construct theories of physics in which spacetime is a product of more basic entities. The primary aim of these new theories is to unify relativity with quantum theory. So far these theories have not stood up to any empirical observations or experiments that could show them to be superior to the presently accepted theories. So, for the present, the concept of spacetime remains fundamental.

The metaphysical question of whether spacetime is a substantial object or a relationship among events, or neither, is considered in the discussion of the relational theory of time.

6. What Is a Minkowski Spacetime Diagram?

A spacetime diagram is a graphical representation of the point-events in spacetime. A Minkowski spacetime diagram is a representation of a spacetime obeying the laws of special relativity. In a Minkowski spacetime diagram, normally a rectangular coordinate system is used, the time axis is shown vertically, and one or two of the spatial axes are suppressed (that is, not included). Here is an example with only one space dimension:

This Minkowski diagram shows a point-sized Einstein standing still midway between the two places at which there is a flash of light. The directed arrows represent the path of light rays from the flash. In a Minkowski diagram, a physical (point) object is not represented as occupying a point but as occupying a line containing all the spacetime points at which it exists. That line, which usually is not straight, is called the worldline of the object. In the above diagram, Einstein's worldline is a vertical straight line because no net external force is acting on him. The history or path of an object’s inertial motion (its coasting) is a series of events that are represented by a straight line. If it is not straight, the object is not coasting (with zero external force acting on it). If an object's worldline intersects or meets another object's worldline, then the two objects collide at the point of intersection. The units along the vertical time axis are customarily chosen to be the product of time and the speed of light so that worldlines of light rays make a forty-five degree angle with each axis. So, if a centimeter in the up or time direction is one second, then a centimeter to the right or space direction is one light-second, a very long distance.

The set of all possible photon histories or light-speed worldlines going through an event defines the two light cones of that event: the past light cone and the future light cone. The future cone is called a "cone" because, if we were to add another space dimension to our diagram, so it has two space dimensions and one time dimension, light emitted from the flash spreads out in the two dimensions of space in a circle of growing diameter, producing a cone shape. The future light cone of the flash event is all the space-time events reached by the light emitted from the flash. Events inside the cone are events that in principle could have been affected by the event; these events are said to be causally-connectible to the event, and the relation between any such event and the event is said to be time-like.

Inertial motion produces a straight worldline, and accelerated motion produces a curved worldline. If at some time Einstein were to jump on a train moving by at constant speed, then his worldline would, from that time onward, tilt away from the vertical and form some angle less than 45 degrees with the time axis. In order to force a 45 degree angle to be the path of a light ray, the units on the time axis are not seconds but seconds times the speed of light. Any line tilted more than 45 degrees from the vertical is the worldline of an object moving faster than the speed of light in a vacuum. Events on the same horizontal line of the Minkowski diagram are simultaneous in that reference frame. Special relativity does not allow a worldline to be circular, or a closed curve, since the traveler would have to approach infinite speed at the top of the circle and at the bottom. A moving observer is added to the above diagram to produce the diagram below in section 12 in the discussion about the relativity of simultaneity.
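The relation between an object's speed and the tilt of its worldline can be computed directly. The Python sketch below assumes the diagram convention described above, in which the vertical axis is time multiplied by the speed of light, so that a speed expressed as a fraction of the speed of light is the tangent of the angle between the worldline and the time axis. The function name is hypothetical.

```python
import math


def tilt_from_time_axis_degrees(speed_as_fraction_of_c: float) -> float:
    """Angle between a constant-speed worldline and the vertical time axis."""
    return math.degrees(math.atan(speed_as_fraction_of_c))


for label, speed in [("standing still", 0.0),
                     ("half the speed of light", 0.5),
                     ("light ray", 1.0),
                     ("twice the speed of light (not permitted)", 2.0)]:
    print(f"{label}: {tilt_from_time_axis_degrees(speed):.2f} degrees")
```

Speeds below that of light give angles below 45 degrees, a light ray gives 45 degrees, and any angle above 45 degrees would correspond to faster-than-light motion.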

Does an observer move along their worldline? Is the worldline static and unchanging? According to J.J.C. Smart, "Within the Minkowski representation we must not talk of our four-dimensional entities changing or not changing." ("Spatialising Time," Mind, 64: 239-241.)

Not all spacetimes can be given Minkowski diagrams, but any spacetime satisfying Einstein's Special Theory of Relativity can. Minkowski diagrams are diagrams of a Minkowski space, which is a spacetime satisfying the Special Theory of Relativity and having zero vacuum energy. Einstein's Special Theory falsely presupposes that physical processes, such as gravitational processes, have no effect on the structure of spacetime. When attention needs to be given to the real effect of these processes on the structure of spacetime, that is, when general relativity needs to be used, then Minkowski diagrams become inappropriate for spacetime. General relativity assumes that the geometry of spacetime is locally Minkowskian but not globally. That is, spacetime is locally flat in the sense that in any very small region one always finds spacetime to be 4-D Minkowskian (but not 4-D Euclidean). Special relativity holds in any infinitesimally small region of a spacetime that satisfies general relativity, and so any such region can be fitted with an inertial reference frame. When we say spacetime is "really curved" and not flat, we mean it really deviates from 4-D Minkowskian geometry.

To repeat a point made earlier, when we speak of a point in these diagrams being a spacetime event, that is a non-standard use of the word "event." A point event in a Minkowski diagram is merely a location in spacetime where an event might or might not happen. The point exists even if no object is actually there.

7. What Are the Metric and the Interval?

A space is simply a collection of points. A metrification of the space assigns locations to the points by assigning them numbers or sets of numbers. It will assign the origin of a coordinate system on a 3-D space the location <0,0,0>. How far is it between any two points? The metric is the answer to this question. A metric on a space, whether it's a physical space or a mathematical space, provides a definition of distance (or length) by giving a function from each pair of points to a real number, called the distance between the points. In Euclidean space, the distance between two points is the length of the straight line connecting them. The metric of a space determines its geometry, and this metric and geometry are intrinsic in the sense that they do not change as we change the reference frame. Philosophers are interested in the issue of whether the choice of a metric for a space is natural (or objective) or whether it is always a matter of convention (or subjective).

How about the metric for time? The introduction of the metric for time allows the scientist to define the time interval between any two events, from which it follows that all pairs of events can be classified by the relation "earlier than" or "later than" or "simultaneous." In this way it defines the future and the past of any given event. The customary metric for any two points in a one-dimensional Euclidean space, such as time, is the absolute value of the numerical difference between the coordinates of the two points (that is, the length of the line segment connecting them). For example, the duration between an event with the coordinate 5:00 and an event with the coordinate 7:00 is exactly two hours (assuming the events occur on the same day and we do not have an a.m. vs. p.m. ambiguity or ambiguity due to change of time zone). If we select a standard clock and the standard way of calculating durations between clock readings, then that clock implicitly defines the metric of time because, by definition, it yields the correct answer for the duration between any two point events. Here we assume the period between any two successive clock ticks is congruent (the same) while the clock is stationary in the coordinate system where the clock readings are taken. When we define the unit of time (the second) to be so many successive ticks of the standard clock, what we are doing is implicitly specifying the metric, provided we implicitly agree that the clock readings are correct and agree to adopt the customary procedure for how to read the duration between two point events.

For example, to speak simplistically, if you want to know how much time has passed between the birth of Mohammed and the death of Abraham Lincoln, then you find the dates of the two events and subtract the first from the second; this procedure is equivalent to noting the tick on the standard clock that is simultaneous with the birth of Mohammed and then counting how many ticks occurred until the tick that is simultaneous with the death of Abraham Lincoln. It is customary to subtract the dates, but would it be incorrect instead to subtract the square roots of the dates, or to subtract the dates and then take the square root of the result? Philosophers disagree about whether it would be incorrect or merely inconvenient.

Points of space are located by being assigned a coordinate. For doing quantitative science rather than merely qualitative science we want the coordinate to be a number and not, say, a letter of the alphabet. A coordinate for a point in two-dimensional space requires two numbers; a coordinate for a point in n-dimensional space requires n numbers, where n is a positive integer. You might consider why you'd prefer a real number rather than a rational number even though no measuring tool could detect the difference between the two choices.

In a 2-dimensional (or 2-D) space, the metric for the distance between the point (x,y) with Cartesian coordinates x and y and the point (x',y') with coordinates x' and y' is defined to be the square root of (x' - x)^2 + (y' - y)^2 when the space is flat, that is, Euclidean. If the space is not flat, then a more sophisticated definition of the metric is required. Note the application of the Pythagorean Theorem.

We have intuitions about locations and distances that we expect will hold. For example, we believe that in a one-dimensional space representing time, if event p happens before event q, and q happens before r, then the location numbers for those events, namely, l(p), l(q) and l(r), must satisfy this inequality: l(p) < l(q) < l(r). If not, then we shouldn't be labeling points that way.

Our intuitive idea of what a distance is tells us that, no matter how strange the space is, we want its metric d to have the following distance-like properties. Let d(p,q) stand for the distance between any two points p and q in the space. d is a function with two arguments. For any points p, q and r, the following five conditions must be satisfied:

  1. d(p,p) = 0
  2. d(p,q) is greater than or equal to 0
  3. If d(p,q) = 0, then p = q
  4. d(p,q) = d(q,p)
  5. d(p,q) + d(q,r) is greater than or equal to d(p,r)

Notice that there is no mention of the path the distance is taken across; all the attention is on the point pairs themselves. Does your idea of distance imply that those conditions on d should be true? If you were to check, then you'd find that the usual 2-D metric defined above, namely the square root of (x' - x)^2 + (y' - y)^2, does satisfy these five conditions. In 3-D Euclidean space, the metric that is defined to be the square root of (x' - x)^2 + (y' - y)^2 + (z' - z)^2 works very well. So does the 1-D metric for the duration that we get for two instantaneous events by subtracting their clock readings; the duration between two instants p and q is the absolute value of the difference in their dates (that is, their clock readings or locations in time). In real physical space, the Euclidean metric works very well, at least for regions that are neither too large (apartments and farms are fine, but solar systems are not) nor too small (such as infinitesimally close to a proton). We might want a scale factor, say a, on the metric so that d^2 = a[(x' - x)^2 + (y' - y)^2 + (z' - z)^2]. If space expands uniformly, then a is not a constant but a function of time, a(t), and a(t) was zero at the Big Bang.
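The check suggested above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the function names (euclid, squared, satisfies_metric_conditions) are hypothetical, and testing a finite sample of points can refute a candidate metric but cannot prove that it satisfies the conditions everywhere.

```python
import itertools
import math
import random

def euclid(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

def squared(p, q):
    """A contrasting candidate: the squared Euclidean distance (no square root)."""
    return sum((b - a) ** 2 for a, b in zip(p, q))

def satisfies_metric_conditions(d, points, tol=1e-9):
    """Test the five distance conditions on every triple drawn from `points`."""
    for p, q, r in itertools.product(points, repeat=3):
        if abs(d(p, p)) > tol:                  # 1. d(p,p) = 0
            return False
        if d(p, q) < 0:                         # 2. d(p,q) >= 0
            return False
        if d(p, q) == 0 and p != q:             # 3. d(p,q) = 0 only if p = q
            return False
        if abs(d(p, q) - d(q, p)) > tol:        # 4. d(p,q) = d(q,p)
            return False
        if d(p, q) + d(q, r) < d(p, r) - tol:   # 5. d(p,q) + d(q,r) >= d(p,r)
            return False
    return True

random.seed(0)
sample = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(12)]
collinear = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

print(satisfies_metric_conditions(euclid, sample))
print(satisfies_metric_conditions(squared, collinear))
```

The squared distance is included only as a contrast: for the three collinear points it violates condition 5, since 1 + 1 is less than 4, which shows that not every plausible-looking formula counts as a metric.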

To have a metric for a spacetime, we desire a definition of the distance between any two infinitesimally neighboring points in that spacetime. Less generally, consider an appropriate metric for the 4-D mathematical space that is used to represent the spacetime obeying the laws of special relativity theory, namely Minkowski spacetime. What's an appropriate metric for this space? Well, if we were just interested in the space portion of this spacetime, then the above 3-D Euclidean metric is fine. But we've asked a delicate question because the fourth dimension of Minkowski's mathematical space is really a time dimension and not a space dimension. Using Cartesian coordinates, the spacetime has the following Lorentzian metric (or Minkowski metric) for any pair of point events at (x',y',z',t') and (x,y,z,t):

Δs^2 = - (x' - x)^2 - (y' - y)^2 - (z' - z)^2 + c^2(t' - t)^2

Δs is called the interval of Minkowski spacetime. Notice the plus and minus signs on the four terms. The interval corresponds to the difference in clock measurements between a pair of instantaneous events that happen at the same point place in the reference frame but are separated enough in time so that one event could have had a causal effect on the other. For a pair of events that occur at the same time in the frame but are separated in space, then the interval is what a meter stick would measure between the events. That is, Δs is then our spatial metric d. Most pairs of events, though, do not occur at the same place in the frame nor at the same time. One happy feature of this Lorentzian metric is that the value of the interval is unaffected by changing to a new reference frame or coordinate system provided the new one is not accelerating relative to the first. That is, changing to a new, unaccelerated reference frame on the spacetime will change the values of all the coordinates of the points of the spacetime, but some relations between all pairs of points won't be affected, namely the intervals between pairs of points. Thus there is something "absolute" about the metric; it is independent of unaccelerated reference frames. Take any two observers who use different reference frames that are not accelerating relative to each other. Now consider some single event with a finite duration. The two observers won't agree on how long that event lasts, nor where it occurs, but they will always agree on the interval between the beginning and end of the event. That's why the interval is said to be absolute.

The interval of spacetime between two point events is complicated because its square can be negative. If Δs^2 is negative, the two points have a space-like separation, meaning these events have a greater separation in space than they do in time. If Δs^2 is positive, then the two have a time-like separation, meaning enough time has passed that one event could have had a causal effect on the other. If Δs^2 is zero, the two events might be identical, or they might have occurred millions of miles apart. In ordinary space, if the space interval between two events is zero, then the two events happened at the same time and place, but in spacetime, if the spacetime interval between two events is zero, this means only that there could be a light ray connecting them. It is because the spacetime interval between two events can be zero even when the events are far apart in distance that the term "interval" is very unlike what we normally mean by the term "distance." All the events that have a zero spacetime interval from some event e constitute e's two light cones. This set of events is given that name because it has the shape of cones when represented in a Minkowski diagram for 2-D space, one cone for events in e's future and one cone for events in e's past. If event 2 is outside the light cones of event 1, then event 2 is said to occur in the "absolute elsewhere" of event 1.
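The classification by sign, and the frame-independence of the interval, can be illustrated with a short Python sketch. It rests on assumptions not made in the text: units are chosen so that c = 1, the function names are hypothetical, and the change of frame is the standard Lorentz boost along the x-axis. The sign convention is the one in the formula displayed above, with the time term positive.

```python
import math

C = 1.0  # assumption: units in which c = 1 (for example, seconds and light-seconds)

def interval_squared(e1, e2):
    """Squared interval between events given as (x, y, z, t) tuples,
    computed as -dx^2 - dy^2 - dz^2 + c^2 dt^2."""
    dx, dy, dz, dt = (b - a for a, b in zip(e1, e2))
    return -dx**2 - dy**2 - dz**2 + (C * dt) ** 2

def classify(e1, e2):
    """Label a pair of events by the sign of the squared interval."""
    s2 = interval_squared(e1, e2)
    if s2 > 0:
        return "timelike"
    if s2 < 0:
        return "spacelike"
    return "lightlike"  # zero interval: a light ray could connect the events

def boost_x(event, v):
    """Coordinates of an event in a frame moving at constant speed v along x."""
    x, y, z, t = event
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma * (x - v * t), y, z, gamma * (t - v * x / C**2))

origin = (0.0, 0.0, 0.0, 0.0)
print(classify(origin, (1.0, 0.0, 0.0, 2.0)))  # more separation in time than in space
print(classify(origin, (2.0, 0.0, 0.0, 1.0)))  # more separation in space than in time
print(classify(origin, (3.0, 4.0, 0.0, 5.0)))  # on the light cone of the origin

far = (3.0, 1.0, 2.0, 5.0)
s2_before = interval_squared(origin, far)
s2_after = interval_squared(boost_x(origin, 0.6), boost_x(far, 0.6))
print(s2_before, s2_after)  # equal up to floating-point rounding
```

The coordinates of the second event change under the boost, but the squared interval does not, which is the sense in which the interval is said to be absolute.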

Another equally legitimate choice of a definition for a metric in Minkowskian 4-D spacetime is:

Δs^2 = (x' - x)^2 + (y' - y)^2 + (z' - z)^2 - c^2(t' - t)^2

and now when Δs^2 is positive we have a spacelike displacement instead of, as in the previous metric, a timelike displacement. Because the distances given by a true metric are never negative, whereas Δs^2 can be negative under either sign convention, neither definition yields a true metric, nor even a pseudometric; but it is customary for physicists to refer to each loosely as a "metric" because Δs retains enough other features of distance.

What if we turn now from special relativity to general relativity? Adding space and time dependence (particularly the values of mass-energy and momentum at points) to each term of the Lorentzian metric, the metric for special relativity, produces the metric for general relativity. That metric requires more complex tensor equations.

8. Does the Theory of Relativity Imply Time Is Partly Space?

In 1908, Minkowski remarked that "Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality." Many people took this to mean that time is partly space, and vice versa. C. D. Broad countered that the discovery of spacetime did not break down the distinction between time and space but only their independence or isolation. He argued that their lack of independence does not imply a lack of reality.

Nevertheless, there is a deep sense in which time and space are "mixed up" or linked. This is evident from the Lorentz transformations of special relativity that connect the time t in one inertial frame with the time t' in another frame that is moving in the x direction at a constant speed v. In this Lorentz equation, t' is dependent upon the space coordinate x and the speed. In this way, time is not independent of either space or speed. It follows that the time between two events could be zero in one frame but not zero in another. Each frame has its own way of splitting up spacetime into its space part and its time part.
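The Lorentz equation referred to here is not displayed in the text. In its standard textbook form it reads t' = γ(t - vx/c^2), where γ = 1/sqrt(1 - v^2/c^2). A minimal Python sketch, assuming that standard form and using a hypothetical function name, shows how two events with zero time between them in one frame get a nonzero time difference in another:

```python
import math

C = 299_792_458.0  # speed of light in meters per second

def lorentz_time(t, x, v):
    """Time coordinate t' of the event (t, x) in a frame moving at speed v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C**2)

# Two events that are simultaneous (t = 0) in the first frame
# but one light-second apart along the x-axis.
v = 0.6 * C
dt_prime = lorentz_time(0.0, C * 1.0, v) - lorentz_time(0.0, 0.0, v)
print(dt_prime)  # roughly -0.75 seconds in the moving frame
```

Because x appears in the formula for t', the time difference in the second frame depends on the spatial separation in the first frame, which is the "mixing" the paragraph describes.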

The reason why time is not partly space is that, within a single frame, time is always distinct from space. Time is a distinguished dimension of spacetime, not an arbitrary dimension. What being distinguished amounts to is that when you set up a rectangular coordinate system on spacetime with an origin at, say, the event of Mohammed's birth, you may point the x-axis east or north or up, but you may not point it forward in time—you may do that only with the t-axis, the time axis.

9. Is Time the Fourth Dimension?

Yes and no; it depends on what you are talking about. Time is the fourth dimension of 4-d spacetime, but time is not the fourth dimension of space, the space of places.

Mathematicians have a broader notion of the term "space" than the average person; and in their sense a space need not consist of places, that is, geographical locations. Not paying attention to the two meanings of the term "space" is the source of all the confusion about whether time is the fourth dimension. The mathematical space used by mathematical physicists to represent physical spacetime is four dimensional and in that space, the space of places is a 3-d sub-space and time is another 1-d sub-space. Minkowski was the first person to construct such a mathematical space, although in 1895 H. G. Wells treated time as a fourth dimension in his novel The Time Machine. Spacetime is represented mathematically by Minkowski as a space of events, not as a space of ordinary geographical places.

In any coordinate system on spacetime, it takes at least four independent numbers to determine a spacetime location. In any coordinate system on the space of places, it takes at least three. That's why spacetime is four dimensional but the space of places is three dimensional. Actually this 19th century definition of dimensionality, which is due to Bernhard Riemann, is not quite adequate because mathematicians have subsequently discovered how to assign each point on the plane to a point on the line without any two points on the plane being assigned to the same point on the line. The idea comes from Georg Cantor. Because of this one-to-one correspondence, the points on a plane could be specified with just one number. If so, then the line and plane must have the same dimensions according to the Riemann definition. To avoid this problem and to keep the plane being a 2-d object, the notion of dimensionality of a space has been given a new, but rather complex, definition.

10. Is There More Than One Kind of Physical Time?

Every reference frame has its own physical time, but the question is intended in another sense. At present, physicists measure time electromagnetically. They define a standard atomic clock using periodic electromagnetic processes in atoms, then use electromagnetic signals (light) to synchronize clocks that are far from the standard clock. In doing this, are physicists measuring "electromagnetic time" but not other kinds of physical time?

In the 1930s, the physicists Arthur Milne and Paul Dirac worried about this question. Independently, they suggested there may be very many time scales. For example, there could be the time of atomic processes and perhaps also a time of gravitation and large-scale physical processes. Clocks for the two processes might drift out of synchrony after being initially synchronized, yet there would be no reasonable explanation for why they don't stay in synchrony. Ditto for clocks based on the pendulum, on superconducting resonators, on the spread of electromagnetic radiation through space, and on other physical principles. Just imagine the difficulty for physicists if they had to work with electromagnetic time, gravitational time, nuclear time, neutrino time, and so forth. Current physics, however, has found no reason to assume there is more than one kind of time for physical processes.

In 1967, physicists did replace the astronomical standard with the atomic standard because the deviation between known atomic and gravitational periodic processes could be explained better by assuming that the atomic processes were the more regular of the two. But this is not a cause for worry about two times drifting apart. Physicists still have no reason to believe that a gravitational periodic process that is just as regular initially as the atomic process, and that is not affected by friction or impacts or other forces, would ever drift out of synchrony with the atomic process, yet this is the possibility that worried Milne and Dirac.

11. How is Time Relative to the Observer?

Physical time is not relative to any observer's state of mind. Wishing that time would pass does not affect the rate at which the observed clock ticks. On the other hand, physical time is relative to the observer's reference system, both in trivial ways and in a deep way discovered by Albert Einstein.

In a trivial way, time is relative to the chosen coordinate system on the reference frame, though not to the reference frame itself. For example, it depends on the units chosen, as when the duration of some event is 34 seconds if seconds are defined to be a certain number of ticks of the standard clock, but is 24 seconds if seconds are defined to be a different number of ticks of that standard clock. Similarly, the difference between the Christian calendar and the Jewish calendar for the date of some event is due to a different unit and origin. Also trivially, time depends on the coordinate system when a change is made from Eastern Standard Time to Pacific Standard Time. These dependencies are taken into account by scientists but usually not mentioned. For example, if a pendulum's approximately one-second swing is measured in a physics laboratory during the autumn night when the society changes from Daylight Saving Time back to Standard Time, the scientists do not note that one unusual swing of the pendulum that evening took a negative fifty-nine minutes and fifty-nine seconds instead of the usual one second.

Isn't time relative to the observer's coordinate system in the sense that in some reference frames there could be fifty-nine seconds in a minute? No, due to scientific convention, it is absolutely certain that there are sixty seconds in any minute in any reference frame. How long an event lasts is relative to the reference frame used to measure the time elapsed, but in any reference frame there are exactly sixty seconds in a minute because this is true by definition. Similarly, you do not need to worry that in some reference frame there might be two gallons in a quart.

In a deeper sense, time is relative, not just to the coordinate system, but to the reference frame itself. That is Einstein's principal original idea about time. Einstein's special theory of relativity requires that physical laws not change when we change from one inertial reference frame to another. In technical terms, Einstein is requiring that the statements of physical laws must be Lorentz-invariant. The equations of light and electricity and magnetism (Maxwell electrodynamics) are Lorentz-invariant, but those of Newton's mechanics are not, and Einstein eventually figured out that what needs changing in the laws of mechanics is that temporal durations and spatial intervals between two events must be allowed to be relative to which reference frame is being used. There is no frame-independent duration for an event extended in time. To put the point another way, Einstein's idea is that without reference to the frame, there is no fixed time interval between two events, no 'actual' duration between them. This idea was philosophically shocking as well as scientifically revolutionary.

Einstein illustrated his idea using two observers, one on a moving train in the middle of the train, and a second observer standing on the embankment next to the train tracks. If the observer sitting in the middle of the rapidly moving train receives signals simultaneously from lightning flashes at the front and back of the train, then in his reference frame the two lightning strikes were simultaneous. But the strikes were not simultaneous in a frame fixed to an observer on the ground. This outside observer will say that the flash from the back had farther to travel because the observer on the train was moving away from the flash. If one flash had farther to travel, then it must have left before the other one, assuming that both flashes moved at the same speed. Therefore, the lightning struck the back of the train before the lightning struck the front of the train in the reference frame fixed to the tracks.

Let's assume that a number of observers are moving with various constant speeds in various directions. For each observer, consider the inertial frame of reference in which that observer is at rest. Which of these observers will agree on their time measurements? Only observers with zero relative speed will agree. Observers with different relative speeds will not, even if they agree on how to define the second and agree on some event occurring at time zero (the origin of the time axis). If two observers are moving relative to each other, but each makes judgments from a reference frame fixed to themselves, then the times they assign to an event will disagree more the faster their relative speed. All observers will be observing the same objective reality, the same event in the same spacetime, but their different frames of reference will require disagreement about how spacetime divides up into its space part and its time part.

This relativity of time to reference frame implies that there is no such thing as The Past in the sense of a past independent of reference frame. This is because a past event in one reference frame might not be past in another reference frame. However, this frame relativity usually isn't very important except when high speeds or strong gravitational fields are involved.

In some reference frame, was Adolf Hitler born before George Washington? No, because the two events are causally connectible. That is, one event could in principle have affected the other since light would have had time to travel from one to the other. We can select a reference frame to reverse the usual Earth-based order of two events only if they are not causally connectible, that is, only if one event is in the absolute elsewhere of the other. Despite the relativity of time to a reference frame, any two observers in any two reference frames should agree about which of two causally connectible events happened first.

12. What Is the Relativity of Simultaneity?

Because the universe obeys relativistic physics, events at different places that occur simultaneously with respect to one reference frame will generally not occur simultaneously in another reference frame that is moving with respect to the first frame. This is called the relativity of simultaneity.

In order to explain this point that the spatial 'plane' or 'time slice' of simultaneous events is different in different reference frames, notice that we calculate the time when something occurred far away by taking the time when a light signal from the event arrives at our location and subtracting the time it took for the light to travel all that way. We see a flash of light at time t arriving from a distant place P. When did the flash occur back at P? Let's call the time of that earlier P-event tp. Here is how to compute tp. Suppose we know the distance from us to P is x. Then the flash occurred at t minus the travel time for the light. That travel time is x/c. So,

tp = t - x/c.

For example, if we see an explosion on the sun at t, then we know to say it really occurred eight minutes before, because x/c is approximately eight minutes, if x is the distance from Earth to the sun.
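The eight-minute figure can be reproduced from tp = t - x/c. The sketch below is only an illustration: the function name is hypothetical, and the Earth-sun distance is taken to be one astronomical unit, a value that is not given in the text.

```python
C_KM_PER_S = 299_792.458        # speed of light in kilometers per second
SUN_DISTANCE_KM = 149_597_870.7  # assumption: one astronomical unit

def emission_time(t_arrival_s, distance_km):
    """Time tp at which the light left its source: tp = t - x/c (in seconds)."""
    return t_arrival_s - distance_km / C_KM_PER_S

t = 0.0  # the moment we see the explosion, taken as time zero
tp = emission_time(t, SUN_DISTANCE_KM)
delay_minutes = (t - tp) / 60.0
print(round(delay_minutes, 2))  # a little over eight minutes
```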

Calculations like this work fine for events in one reference frame, but they don't always work when we change reference frames. The diagram below illustrates the problem. There are two light flashes that occur simultaneously, with Einstein at rest midway between them.


The Minkowski diagram represents Einstein sitting still in the reference frame (marked by the coordinate system with the thick black axes) while Lorentz is not sitting still but is traveling rapidly away from him and toward the source of flash 2. Because Lorentz's timeline is a straight line we can tell that he is moving at a constant speed. The two flashes of light arrive at Einstein's location simultaneously, creating spacetime event B. However, Lorentz sees flash 2 before flash 1. That is, the event A of Lorentz seeing flash 2 occurs before event C of Lorentz seeing flash 1. So, Einstein will readily say the flashes are simultaneous, but Lorentz will have to do some computing to figure out that the flashes are simultaneous in the frame because they won't "look" simultaneous. However, if we'd chosen a different reference frame from the one above, one in which Lorentz is not moving but Einstein is, then Lorentz would be correct to say flash 2 occurs before flash 1 in that new frame. So, whether the flashes are or are not simultaneous depends on which reference frame is used in making the judgment. It's all relative.


13. What Is the Conventionality of Simultaneity?

This relativity of simultaneity is philosophically less controversial than the conventionality of simultaneity. To appreciate the difference, consider what is involved in making a determination regarding simultaneity. Given two events that happen essentially at the same place, physicists assume they can tell by direct observation whether the events happened simultaneously. If we don't see one of them happening first, then we say they happened simultaneously, and we assign them the same time coordinate. The determination of simultaneity is more difficult if the two happen at separate places, especially if they are very far apart. One way to measure (operationally define) simultaneity at a distance is to say that two events are simultaneous in a reference frame if unobstructed light signals from the two events would reach us simultaneously when we are midway between the two places where they occur, as judged in that frame. This is the operational definition of simultaneity used by Einstein in his theory of relativity. Instead of using the midway method, we could take the distant clock and send a signal home to our master clock, one already synchronized with our standard clock; the master clock immediately sends a signal back to the distant clock with the information about what time it was when the signal arrived. We at the distant clock notice that the total travel time is t and that the master clock's signal says its time is, say, noon, so we immediately set our clock to be noon plus half of t.

The "midway" method described above of operationally defining simultaneity in one reference frame for two distant signals causally connected to us has a significant presumption: that the light beams travel at the same speed regardless of direction. Einstein, Reichenbach and Grünbaum have called this a reasonable "convention" because any attempt to experimentally confirm it presupposes that we already know how to determine simultaneity at a distance. This is the conventionality, rather than relativity, of simultaneity. To pursue the point, suppose the two original events are in each other's absolute elsewhere; they couldn't have affected each other. Einstein noticed that there is no physical basis for judging the simultaneity or lack of simultaneity between these two events, and for that reason said we rely on a convention when we define distant simultaneity as we do. Hilary Putnam, Michael Friedman, and Graham Nerlich object to calling it a convention, on the grounds that to make any other assumption about light's speed would unnecessarily complicate our description of nature, and we often make choices about how nature is on the basis of simplification of our description. They would say there is less conventionality in the choice than Einstein supposed.

The "midway" method isn't the only way to define simultaneity. Consider a second method, the "mirror reflection" method. Select an Earth-based frame of reference, and send a flash of light from Earth to Mars where it hits a mirror and is reflected back to its source. The flash occurred at 12:00, let's say, and its reflection arrived back on Earth 20 minutes later. The light traveled the same empty, undisturbed path coming and going. At what time did the light flash hit the mirror? The answer involves the so-called conventionality of simultaneity. All physicists agree one should say the reflection event occurred at 12:10. The controversial philosophical question is whether this is really a convention. Einstein pointed out that there would be no inconsistency in our saying that it hit the mirror at 12:17, provided we live with the awkward consequence that light was relatively slow getting to the mirror, but then traveled back to Earth at a faster speed. If we picked the impact time to be 12:05, we'd have to live with the fact that light traveled slower coming back.

Let's explore the reflection method that is used to synchronize a distant, stationary clock so that it reads the same time as our clock. Let's draw a Minkowski diagram of the situation and consider just one spatial dimension in which we are at location A with the standard clock for the reference frame. The distant clock we want to synchronize is at location B. See the following diagram.

[Figure: Minkowski diagram for the conventionality of simultaneity, showing the standard clock at A and the distant clock at B]

The fact that the timeline of the B-clock is parallel to the time axis shows that the clock there is stationary. We will send light signals in order to synchronize the two clocks. Send a light signal from A at time t1 to B, where it is reflected back to us, arriving at time t3. Then the reading tr on the distant clock at the time of the reflection event should be t2, where

t2 = (1/2)(t3 + t1).

If tr = t2, then the two clocks are synchronized.

Einstein noticed that the use of "(1/2)" in the equation t2 = (1/2)(t3 + t1) rather than the use of some other fraction implicitly assumes that the light speed to and from B is the same. He said this assumption is a convention, the so-called conventionality of simultaneity, and isn't something we could check to see whether it is correct. To see the role of the fraction, write the equation in the equivalent form t2 = t1 + (1/2)(t3 - t1), which places the reflection event halfway through the round trip. If the fraction were (1/3), so that t2 = t1 + (1/3)(t3 - t1), then the light would travel to B faster than c and return more slowly. If the fraction were (2/3), then the light would travel to B relatively slowly and return faster than c. Either way, the average travel speed to and from would be c. Only with the fraction (1/2) are the travel speeds the same going and coming back.
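The effect of choosing a different fraction can be made concrete with a small Python sketch. It uses the general rule t2 = t1 + ε(t3 - t1), which reduces to t2 = (1/2)(t3 + t1) when ε = 1/2. The symbol ε and the function names are notation introduced here for illustration, and the numbers assume a round trip of 20 minutes to a mirror 10 light-minutes away, as in the Earth-Mars example, with speeds expressed as multiples of c.

```python
def reflection_time(t1, t3, eps=0.5):
    """Time assigned to the reflection event: t2 = t1 + eps * (t3 - t1)."""
    return t1 + eps * (t3 - t1)

def one_way_speeds(distance, t1, t3, eps=0.5):
    """Outbound and return light speeds implied by a choice of eps."""
    t2 = reflection_time(t1, t3, eps)
    return distance / (t2 - t1), distance / (t3 - t2)

distance = 10.0      # light-minutes, so a speed of 1.0 means c
t1, t3 = 0.0, 20.0   # minutes: departure and return of the light signal

for eps in (1 / 2, 1 / 3, 2 / 3):
    out_speed, back_speed = one_way_speeds(distance, t1, t3, eps)
    round_trip_speed = 2 * distance / (t3 - t1)  # does not depend on eps
    print(eps, out_speed, back_speed, round_trip_speed)
```

Only ε = 1/2 gives equal speeds in both directions. With 1/3 the outbound speed is 1.5c and the return speed is 0.75c, and with 2/3 the two are swapped, yet the round-trip average is c in every case, which is why the round trip alone cannot settle the choice.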

Notice how we would check whether the two light speeds really are the same. We would send a light signal from A to B, and see if the travel time was the same as when we sent it from B to A. But to trust these times we would already need to have synchronized the clocks at A and B. But that synchronization process will use the equation t2 = (1/2)(t3 + t1), with the (1/2) again, so we are arguing in a circle here.

Not all philosophers of science agree with Einstein that the choice of (1/2) is a convention, nor with those philosophers who say the messiness of any other choice shows that the choice must be correct. Everyone agrees that any choice other than (1/2) would make for messy physics, but some philosophers suggest that there is a way to check on the light speeds without presuming the equation t2 = (1/2)(t3 + t1) or presuming that the speeds are the same. Synchronize two clocks at A. Then transport one of the clocks to B at an infinitesimal speed. Going this slowly, the clock will arrive at B without having its proper time deviate from that of the A-clock. That is, the two clocks will be synchronized even though they are distant from each other. Now the two clocks can be used to find the time when a light signal left A and the time when it arrived at B. The time difference can be used to compute the light speed. This speed can be compared with the speed computed for a signal that left B and then arrived at A. The experiment has never been performed, but those who recommend it are sure that the speeds to and from will turn out to be identical, so they are sure that the (1/2) in the equation t2 = (1/2)(t3 + t1) is correct and not a convention. For more discussion of this controversial issue of conventionality in relativity, see pp. 179-184 of The Blackwell Guide to the Philosophy of Science, edited by Peter Machamer and Michael Silberstein, Blackwell Publishers, Inc., 2002.


14. What Is the Difference between the Past and the Absolute Past?


The events in your absolute past are those that could have directly or indirectly affected you, the observer, now. These absolutely past events are the events in or on the backward light cone of your present event, your here-and-now. The backward light cone of event Q is the imaginary cone-shaped surface of spacetime points formed by the paths of all light rays reaching Q from the past. An event's being in another event's absolute past is a feature of spacetime itself because the event is in the point's past in all possible reference frames. The feature is frame-independent. For any event in your absolute past, every observer in the universe (who isn't making an error) will agree the event happened in your past. Not so for events that are in your past but not in your absolute past. Past events not in your absolute past will be in what Eddington called your "absolute elsewhere" and these past events will be in your present as judged by some other reference frames. The absolute elsewhere is the region of spacetime containing events that are not causally connectible to your here-and-now. Your absolute elsewhere is the region of spacetime that is neither in nor on either your forward or backward light cones. No event here and now can affect any event in your absolute elsewhere, and no event in your absolute elsewhere can affect you here and now. A spacetime point's absolute future is all the future events outside the point's absolute elsewhere.

A single point's absolute elsewhere, absolute future, and absolute past partition all of spacetime beyond the point into three disjoint regions. If point A is in point B's absolute elsewhere, the two events are said to be "spacelike related." If the two are in each other's forward or backward light cones they are said to be "timelike related" or "causally connectible."

The past light cone looks like a triangle when the diagram has just one dimension for space. However, the past light cone is not a triangle but has a pear-shape because all very ancient light lines must have originated from the infinitesimal volume at the big bang.

15. What Is Time Dilation?

According to special relativity, two properly functioning clocks next to each other will stay synchronized. Even if they were to be far away from each other, they'd stay synchronized if they didn't move relative to each other. But if one clock moves away from the other, the moving clock will tick more slowly than the stationary clock, as measured in the inertial reference frame of the stationary clock. This slowing due to motion is called "time dilation." If you move at 99% of the speed of light, then your time slows by a factor of about 7 relative to stationary clocks. In addition, as measured in the stationary frame, you are about 7 times thinner along your direction of motion than when you are stationary, and you are about 7 times heavier. If you move at 99.9%, then you slow by a factor of about 22.
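
The factors of 7 and 22 come from the standard Lorentz factor of special relativity, 1 / sqrt(1 - v²/c²). A minimal sketch follows; the function name is chosen here for illustration.

```python
import math

def lorentz_factor(beta):
    """Time-dilation factor for a speed given as a fraction of light speed."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

for beta in (0.99, 0.999):
    # Prints roughly 7.09 and 22.37, matching the rounded factors in the text.
    print(beta, round(lorentz_factor(beta), 2))
```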

Time dilation is about two synchronized clocks getting out of synchrony due either to their relative motion or due to their being in different gravitational fields. Time dilation due to difference in constant speeds is described by Einstein's special theory of relativity. The general theory of relativity describes a second kind of time dilation, one due to different accelerations and different gravitational influences. Suppose your twin's spaceship travels to and from a star one light year away. It takes light from your Earth-based flashlight two years to go there and back. But if the spaceship is fast, your twin can make the trip in less than two years, according to his own clock. Does he travel the distance in less time than it takes light to travel that distance? No, according to your clock he takes more than two years, and so is slower than light.
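
The round trip to the star can be worked through numerically. The sketch below assumes a hypothetical cruising speed of 90% of light speed and ignores the time spent accelerating and turning around; the function name is illustrative.

```python
import math

def round_trip_years(distance_ly, beta):
    """Return (Earth-clock years, ship-clock years) for a there-and-back trip
    at constant speed beta (a fraction of light speed)."""
    earth_years = 2 * distance_ly / beta
    ship_years = earth_years * math.sqrt(1 - beta ** 2)
    return earth_years, ship_years

earth, ship = round_trip_years(1.0, 0.9)
print(earth, ship)  # more than two years on Earth, less than one on the ship
```

By the Earth clock the ship takes longer than the two years light needs, so the ship never outruns light even though the traveler's own clock records less than two years.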

We sometimes speak of time dilation by saying time itself is "slower," but time isn't going slower in any absolute sense, only relative to some other frame of reference. Does time have a rate? Well, time in a reference frame has no rate in that frame, but time in a reference frame can have a rate as measured in a different frame, such as in a frame moving relative to the first frame.

Time dilation is not an illusion of perception; and it is not a matter of the second having different definitions in different reference frames.

Newton's physics describes duration as an absolute property, implying it is not relative to the reference frame. However, in Newton's physics the speed of light is relative to the frame. Einstein's special theory of relativity reverses both of these aspects of time. For inertial frames, it implies the speed of light is not relative to the frame, but duration is relative to the frame. In general relativity, however, the speed of light can vary within one reference frame if matter and energy are present.

Time dilation due to motion is relative in the sense that if your spaceship moves past mine so fast that I measure your clock to be running at half speed, then you will measure my clock to be running at half speed also, provided both of us are in inertial frames. If one of us is affected by a gravitational field or undergoes acceleration, then that person isn't in an inertial frame and the results are different.

Both types of time dilation play a significant role in time-sensitive satellite navigation systems such as the Global Positioning System. The atomic clocks on the satellites must be programmed to compensate for the relativistic dilation effects of both gravity and motion.
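
The rough size of the two effects can be estimated to first order. The sketch below is not from the article: it assumes standard approximate values for the Earth's gravitational parameter, the Earth's mean radius, and a GPS orbital radius of about 26,560 km, and it ignores the Earth's rotation and the eccentricity of the orbits.

```python
GM_EARTH = 3.986004418e14   # Earth's gravitational parameter, m^3/s^2 (assumed value)
C = 299_792_458.0           # speed of light, m/s
R_EARTH = 6.371e6           # mean Earth radius, m (assumed value)
R_ORBIT = 2.656e7           # approximate GPS orbital radius, m (assumed value)
SECONDS_PER_DAY = 86_400

def daily_offsets_microseconds():
    """Return (motion, gravity, net) clock offsets of a satellite clock
    relative to a ground clock, in microseconds per day."""
    v_squared = GM_EARTH / R_ORBIT                      # circular-orbit speed squared
    motion = -v_squared / (2 * C ** 2)                  # moving clock runs slow
    gravity = GM_EARTH / C ** 2 * (1 / R_EARTH - 1 / R_ORBIT)  # higher clock runs fast
    to_us_per_day = SECONDS_PER_DAY * 1e6
    return motion * to_us_per_day, gravity * to_us_per_day, (motion + gravity) * to_us_per_day

motion, gravity, net = daily_offsets_microseconds()
print(motion, gravity, net)  # roughly -7, +46, and +38 microseconds per day
```

Under these assumptions the gravitational effect is the larger of the two, so an uncorrected satellite clock would gain on a ground clock.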

For more on general relativistic dilation, see the discussion of gravity and black holes.

16. How Does Gravity Affect Time?

Einstein's general theory of relativity (1915) is a generalization of his special theory of relativity (1905). It is not restricted to inertial frames, and it encompasses a broader range of phenomena, namely gravity and accelerated motions. According to general relativity, gravitational differences affect time by dilating it. Observers in a less intense gravitational potential find that clocks in a more intense gravitational potential run slow relative to their own clocks. People live longer in basements than in attics, all other things being equal. Light from basement flashlights will be shifted toward the red end of the visible spectrum compared to light from flashlights in attics. This effect is known as the gravitational red shift. Even the speed of light is slower in the presence of higher gravity.
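
To get a feel for the size of the basement-versus-attic effect, here is a rough sketch using the weak-field approximation, in which the fractional difference in clock rates is about g·h/c². The 10 meter height and the 80 year lifetime are hypothetical values chosen for illustration.

```python
G_SURFACE = 9.81            # gravitational acceleration at Earth's surface, m/s^2
C = 299_792_458.0           # speed of light, m/s

def fractional_rate_difference(height_m):
    """Approximate fractional amount by which the lower clock runs slow."""
    return G_SURFACE * height_m / C ** 2

def lag_seconds(height_m, duration_s):
    """How far the lower clock falls behind the higher one over duration_s."""
    return fractional_rate_difference(height_m) * duration_s

eighty_years = 80 * 365.25 * 86_400
print(lag_seconds(10.0, eighty_years))  # a few millionths of a second
```

The effect is real but tiny: over a lifetime, the basement dweller gains only a few microseconds.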

Informally one speaks of gravity bending light rays around massive objects, but more accurately it is the space that bends, and as a consequence the light is bent, too. The light simply follows the shortest path through spacetime, and when space curves the shortest paths are no longer Euclidean straight lines.

17. What Happens to Time Near a Black Hole?

A black hole is a body of matter with a very high gravitational field that constitutes a severe warp in the spacetime continuum, so much so that objects near the hole get pulled inside, and once inside the horizon surrounding the hole they cannot escape (normally). Even light cannot escape. The center within the hole is a nasty place called a "singularity" where the mass density is infinite, according to the general theory of relativity.

In principle, any material object can be turned into a black hole if it is sufficiently compressed. The Earth would become a black hole if it were somehow compressed to a radius of about one centimeter. Just as in other galaxies, there is a massive black hole at the center of our galaxy, the Milky Way. It is in the direction of the constellation Sagittarius. Astrophysicists believe black holes are most commonly formed by the inward collapse of stars whose nuclear fuel has been exhausted. The center of a black hole (the singularity) is infinitely dense according to relativity theory; the singularity is only very, very dense according to theories of quantum gravity, but none of these theories has as yet been confirmed.

The radius of the black hole's event horizon is directly proportional to its mass; if the mass doubles, so does the radius of the horizon. The mass of the black hole in our galaxy is about four million times our sun’s mass.
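
The proportionality between horizon radius and mass is expressed by the Schwarzschild radius, 2GM/c², which holds for a non-rotating, uncharged black hole. The sketch below uses approximate values for the gravitational constant and the Earth's mass; the function name is illustrative.

```python
G = 6.674e-11               # gravitational constant, m^3 kg^-1 s^-2 (approximate)
C = 299_792_458.0           # speed of light, m/s
M_EARTH = 5.972e24          # mass of the Earth, kg (approximate)

def schwarzschild_radius(mass_kg):
    """Horizon radius in meters of a non-rotating black hole of the given mass."""
    return 2 * G * mass_kg / C ** 2

print(schwarzschild_radius(M_EARTH))      # a little under one centimeter
print(schwarzschild_radius(2 * M_EARTH))  # twice as large
```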

If you observed an astronaut falling toward the event horizon, their light would become dimmer and redder, and their clock would tick progressively slower compared to your clock. You’d never see them actually reach the horizon no matter how long you waited, although in terms of their own personal time or proper time, they’d be quickly swept through the horizon and into the singularity where their volume would become infinitesimal.

Suppose you do get near the event horizon but are able to escape. What happens to your time? It will be dilated in the sense that, if you were to return home to Earth, you'd discover that you were younger than your Earth-bound twin. Your initially synchronized clocks would show that yours had fallen behind. It is in this sense that you would have experienced a time warp, a warp in the time component of spacetime.

Time inside a black hole is even stranger. In a certain sense, time becomes space, and vice versa. In a Minkowski diagram using polar coordinates, ordinary time is an axial dimension; but, just inside the event horizon of a black hole, time starts tilting until it becomes a radial dimension.

18. What Is the Solution to the Twin Paradox?

This paradox is also called the clock paradox and the twins paradox. It is an argument about time dilation that uses the special theory of relativity to produce a contradiction.  Consider two twins at rest on Earth with their clocks synchronized. One twin climbs into a spaceship and flies far away at a high, constant speed, then reverses course and flies back at the same speed. When they reunite, will the twins still be the same age? An application of the equations of special relativity theory implies that the twin on the spaceship will return and be younger than the Earth-based twin. Here is the argument for the twin paradox. It’s all relative, isn’t it? That is, either twin could regard the other as the traveler. Let's consider the spaceship to be stationary. Wouldn’t relativity theory then imply that the Earth-based twin could race off (while attached to the Earth) and return to be the younger of the two twins? If so, we have a contradiction because, when the twins reunite, each will be younger than the other.

Herbert Dingle famously argued in the 1960s that the paradox reveals an inconsistency in special relativity. Almost all philosophers and scientists now agree that it is not a true paradox, in the sense of revealing a logical inconsistency within relativity theory, but is merely a complex puzzle that can be adequately solved within relativity theory, although there is dispute about whether the solution can occur in special relativity or only in general relativity. Those who say the resolution of the twin paradox requires only special relativity are a small minority. Einstein said the solution to the paradox requires general relativity. Max Born said, "the clock paradox is due to a false application of the special theory of relativity, namely, to a case in which the methods of the general theory should be applied." In 1921, Wolfgang Pauli said, “Of course, a complete explanation of the problem can only be given within the framework of the general theory of relativity.”

There have been a variety of suggestions in the relativity textbooks on how to solve the paradox. Here is one, diagrammed below.

[Diagram: twin paradox]

This suggestion for solving the paradox is to apply general relativity and then note that there must be a difference in the proper time taken by the twins because their behavior is different, as shown in their two world lines. The length of the line representing their path in spacetime in the above diagram is not a measure of their proper time. Instead, the spacing of the dots represents a tick of a clock and thus represents the proper time. The diagram shows how sitting still on Earth is a way of maximizing the proper time during the trip, and it shows how flying near light speed in a spaceship away from Earth and then back again is a way of minimizing the proper time, even though if you paid attention only to the shape of the world lines and not to the dot spacing within them you might think just the reverse. Surprisingly, a straight world line between two events in a diagram like this has the longest proper time between two events, not the shortest. So, the reasoning in the paradox makes the mistake of supposing that the situation of the two twins is the same as far as elapsed proper time is concerned.
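
The claim that the straight world line has the longest proper time can be checked with a simple calculation in flat spacetime. The sketch below is an idealization: it uses units of years and light years (so that c = 1), treats each world line as a chain of straight segments, and uses hypothetical trip numbers (a ten-year absence at 80% of light speed).

```python
import math

def proper_time(worldline):
    """Sum the proper time along a piecewise-straight world line.

    worldline is a list of (t, x) events in years and light years (c = 1).
    Each segment contributes sqrt(dt^2 - dx^2).
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(worldline, worldline[1:]):
        dt, dx = t1 - t0, x1 - x0
        total += math.sqrt(dt ** 2 - dx ** 2)
    return total

stay_home = [(0.0, 0.0), (10.0, 0.0)]             # straight world line
traveler = [(0.0, 0.0), (5.0, 4.0), (10.0, 0.0)]  # out and back at 0.8c

print(proper_time(stay_home))  # 10.0 years
print(proper_time(traveler))   # 6.0 years
```

The bent world line looks longer on paper, yet the minus sign in the formula makes its proper time shorter.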

A second way to solve the twin paradox is to note that each twin can consider the other twin to be the one who moves, but their experiences will still be different because their situations are not symmetric. Regardless of which twin is considered to be stationary, only one twin feels the acceleration at the turnaround point, so it should not be surprising that the two situations have different implications about time. And when the gravitational fields are taken into consideration, the equations of general relativity do imply that the younger twin is the one who feels the acceleration. However, the force felt by the spaceship twin is not what "forces" that twin to be younger. Nothing is forcing the twin to be younger any more than something is forcing the speed of light to remain constant.

A third suggestion for how to solve the paradox is to say that only the Earthbound twin can move at a constant velocity in a single inertial frame. If the spaceship twin is to be considered in an inertial frame and moving at a constant velocity, as required by special relativity, then there must be a different frame for the Earthbound twin's return trip than the frame for the outgoing trip. But changing frames in the middle of the presentation is an improper equivocation and shows that the argument of the paradox breaks down. In short, both twins' motions cannot always be inertial.

These three solutions, which are really variants of the same solution, tend to leave many people unsatisfied, probably because they think of the following situation. If we remove the stars and planets and other material from the universe and simply have two twins, isn't it clear that it would be inappropriate to say "there is an observable difference" due to one twin feeling an acceleration while the other does not? Won't both twins feel the same forces, and wouldn't relativity theory be incorrect if it implied that one twin returned to be younger than the other? (The correct answer to these questions is "yes.") Therefore, why does attaching the Earth to one of the twins force that twin to be the older one upon reunion? The answer to this last question requires appealing to general relativity. Notice that it is not just the Earth that is attached to the one twin. It is the Earth in tandem with all the planets and stars. When the spaceship-twin is considered to be at rest, then the planets and stars also rush away and back. Because of all this movement of mass, the turnaround isn't felt by the Earthbound twin who moves in tandem with those stars, but is felt very clearly by the spaceship twin. So, regardless of which twin is considered to be at rest, it is only the spaceship twin who feels any acceleration. Explaining this failure of the Earthbound twin to feel the force at the turnaround when the spaceship twin is at rest shows that a solution to the paradox ultimately requires a theory of the origin of inertia. But the point remains that the asymmetry in the experience of the two twins accounts for the aging difference and for the error in the argument of the twin paradox.

If you are the twin in the spaceship, then by flying fast and returning to Earth you do gradually advance into your twin's future, but your twin does not go to your past.

19. What Is the Solution to Zeno's Paradoxes?

See the article "Zeno's Paradoxes" in this encyclopedia.

20. How Do Time Coordinates Get Assigned to Points of Spacetime?

To justify the assignment of time numbers (called dates or clock readings) to instants, we cannot literally paste a number to an instant. What we do instead is show that the structure of the set of instantaneous events is the same as the structure of our time numbers. The structure of our time numbers is the structure of real numbers along the mathematical line. Showing that this is so is called "solving the representation problem" for our theory of time measurement. We won't go into detail on how to solve this problem, but the main idea is that to measure any space, including a one-dimensional space of time, we need a metrification for the space. The metrification assigns location coordinates to all points and assigns distances between all pairs of points. The method of assigning these distances is called the “metric” for the space. A metrification for time assigns dates and durations to the points we call instants of time. Normally we use a clock to do this. Each point instant gets assigned a unique real number as its date (its clock reading), and the duration between any two of those point instants is normally found by subtracting their clock readings from each other. The duration is the absolute value of the numerical difference of their dates, that is |t(B) - t(A)| where t(B) is the date of B and t(A) is the date of A. One goal in the assignment of dates is to ensure that, if event A happens before event B, then t(A) < t(B). (Unfortunately, we cannot trust the subtraction of one clock reading from another if one of the clocks is far away from our standard clock and if we are not sure how to reliably synchronize the distant clock with our standard clock; but we will explore this problem in a later section.)

Let's consider the question of metrification in more detail, starting with the assignment of locations to points. Any space is a collection of points. In a space that is supposed to be time, these points are the instants and the space for time is presumably linear (since presumably time is one-dimensional). Before discussing time coordinates specifically, let's consider what is meant by assigning coordinates to a mathematical space, one that might represent either physical space, or physical time, or spacetime, or something else. In a one-dimensional space, such as a curving line, we assign unique coordinate numbers to points along the line, and we make sure that no point fails to have a coordinate. For a 2-dimensional space, we assign pairs of numbers to points. For a 3-d space, we assign triples of numbers. Why numbers and not letters? If we assign letters instead of numbers, we cannot use the tools of mathematics to describe the space. But even if we do assign numbers we cannot assign any coordinate numbers we please. There are restrictions. If the space has a certain geometry, then we have to assign numbers that reflect this geometry. If event A occurs before event B, then the date of event A, namely t(A), must be less than t(B). If event B occurs after event A but before event C, then we should assign dates so that t(A) < t(B) < t(C). Here is the fundamental method of analytic geometry:

Consider a space as a class of fundamental entities: points. The class of points has "structure" imposed upon it, constituting it a geometry—say the full structure of space as described by Euclidean geometry. [By assigning coordinates] we associate another class of entities with the class of points, for example a class of ordered n-tuples of real numbers [for a n-dimensional space], and by means of this "mapping" associate structural features of the space described by the geometry with structural features generated by the relations that may hold among the new class of entities—say functional relations among the reals. We can then study the geometry by studying, instead, the structure of the new associated system [of coordinates]. (Sklar, 1976, p. 28)

The goal in assigning coordinates to a space is to create a reference system for the space. A reference system is a reference frame plus either a coordinate system or an atlas of coordinate systems placed by the analyst upon the space to uniquely name the points. These names or coordinates are frame dependent in that a point can get new coordinates when the reference frame is changed. For 4-d spacetime that obeys special relativity and its Lorentzian geometry, a coordinate system is a grid of smooth timelike and spacelike curves on the spacetime that assigns to each point three space coordinate numbers and one time coordinate number. No two distinct points can have the same set of four coordinate numbers. Inertial frames can have global coordinate systems, but in general we have to make do with atlases. If we are working with general relativity where spacetime can curve and we cannot assume inertial frames, then the best we can do is to assign a coordinate system to a small region of spacetime where the laws of special relativity hold to a good approximation. General relativity requires special relativity to hold locally, and thus for spacetime to be flat locally. That means that locally the 4-d spacetime is correctly described by the flat 4-d Minkowski geometry of special relativity. Consider two coordinate systems on adjacent regions. For the adjacent regions we make sure that the 'edges' of the two coordinate systems match up in the sense that each point near the intersection of the two coordinate systems gets a unique set of four coordinates and that nearby points get nearby coordinate numbers. The result is an "atlas" on spacetime.

For small regions of spacetime, we create a coordinate system by choosing a style of grid, say rectangular coordinates, fixing a point as being the origin, selecting one timelike and three spacelike lines to be the axes, and defining a unit of distance for each dimension. We cannot use letters for coordinates. The alphabet's structure is too simple. Integers won't do either; but real numbers are adequate to the task. The definition of "coordinate system" requires us to assign our real numbers in such a way that numerical betweenness among the coordinate numbers reflects the betweenness relation among points. For example, if we assign numbers 17, pi, and 101.3 to instants, then every interval of time that contains the pi instant and the 101.3 instant had better contain the 17 instant. When this feature holds, the coordinate assignment is said to be monotonic.
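
The betweenness requirement can be stated as a simple check on coordinate numbers. The following sketch is illustrative only; the function names are hypothetical, and the list passed to `is_monotonic` is assumed to give the dates of instants in their order of occurrence.

```python
import math

def is_between(a, b, c):
    """True if coordinate b lies between coordinates a and c (inclusive)."""
    return min(a, c) <= b <= max(a, c)

def is_monotonic(dates_in_order_of_occurrence):
    """True if later instants always receive strictly larger dates."""
    dates = dates_in_order_of_occurrence
    return all(earlier < later for earlier, later in zip(dates, dates[1:]))

# The example from the text: the 17 instant falls between the pi instant
# and the 101.3 instant.
print(is_between(math.pi, 17, 101.3))     # True
print(is_monotonic([math.pi, 17, 101.3])) # True
print(is_monotonic([17, math.pi, 101.3])) # False
```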

The choice of the unit presupposes we have defined what "distance" means. The metric for a space specifies what is meant by distance in that space. The natural metric between any two points in a one-dimensional space, such as the time sub-space of our spacetime, is the numerical difference between the coordinates of the two points. Using this metric for time, the duration between an event with the coordinate 11 and the event with coordinate 7 is 4. The metric for spacetime defines the spacetime interval between two spacetime locations, and it is more complicated than the metric for time alone. The spacetime interval between any two events is invariant or unchanged by a change to any other reference frame, although the time interval can vary with change of frame. More accurately, in the general theory, the infinitesimal spacetime interval between two neighboring points is invariant. The units of the spacetime interval are seconds squared.
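
The invariance of the spacetime interval can be illustrated in special relativity with one space dimension. The sketch below assumes units in which c = 1 (seconds and light-seconds), the sign convention t² - x², and a hypothetical pair of events; the function names are chosen for illustration.

```python
import math

def boost(t, x, beta):
    """Lorentz-transform the separation (t, x) into a frame moving at speed beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (t - beta * x), gamma * (x - beta * t)

def interval_squared(t, x):
    """Spacetime interval in seconds squared, with c = 1."""
    return t ** 2 - x ** 2

t, x = 5.0, 3.0                  # separation of two events in the first frame
t_new, x_new = boost(t, x, 0.6)  # the same separation in a moving frame

print(t, t_new)                                                # time intervals differ
print(interval_squared(t, x), interval_squared(t_new, x_new))  # intervals agree
```

The time between the two events is about 5 seconds in one frame and about 4 seconds in the other, yet both frames compute the same interval of about 16 seconds squared.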

In this discussion, there is no need to worry about the distinction between change in metric and change in coordinates. For a space that is topologically equivalent to the real line and for metrics that are consistent with that topology, each coordinate system determines a metric and each metric determines a coordinate system. More precisely, once you decide on a positive direction in the one-dimensional space and a zero-point for the coordinates, then the possible coordinate systems and the possible metrics are in one-to-one correspondence.

There are still other restrictions on the assignments of coordinate numbers. The restriction that we called the "conventionality of simultaneity" fixes what time-slices of spacetime can be counted as collections of simultaneous events. An even more complicated restriction is that coordinate assignments satisfy the demands of general relativity. The metric of spacetime in general relativity is not global but varies from place to place due to the presence of matter and gravitation. Spacetime cannot be given its coordinate numbers without our knowing the distribution of matter and energy.

The features that a space has without its points being assigned any coordinates whatsoever are its topological features. These are its dimensionality, whether it goes on forever or has a boundary, how many points there are, and so forth.

21. How Do Dates Get Assigned to Actual Events?

Ideally for any reference frame we would like to partition the set of all actual events into simultaneity equivalence classes by some reliable method. All events in the same class are said to happen at the same time in the frame, and every event is in some class or other. Consider what event near the supergiant star Betelgeuse is happening at the same time as now. That is a difficult question to answer, so let's begin our discussion with some easier questions.

What is happening at time zero in our coordinate system? There is no way to select one point of spacetime and call it the origin of the coordinate system except by reference to actual events. In practice, we make the origin be the location of a special event. One popular choice is the birth of Jesus; another is Mohammed's emigration from Mecca to Medina.

Our purpose in choosing a coordinate system or atlas is to express relationships among actual and possible events. The time relationships we are interested in are time-order relationships (Did this event occur between those two?) and magnitude-duration relationships (How long after A did B occur?) and date-time relationships (When did event A itself occur?). The date of a (point) event is the time coordinate number of the spacetime location where the event occurs. We expect all these assignments of dates to events to satisfy the requirement that event A happens before event B iff t(A) < t(B), where t(A) is the time coordinate of A, namely its date. The assignments of dates to events also must satisfy the demands of our physical theories, and in this case we face serious problems involving inconsistency as when a geologist gives one date for the birth of Earth and an astronomer gives a different date. By the way, in English the word "date" is ambiguous because we use it to stand for a specific time and also for the name of that specific time. In this article, we use the term both ways, hoping that the context indicates which way the word is intended.

It is a big step from assigning numbers to points of spacetime to assigning them to real events. Here are some of the questions that need answers. How do we determine whether a nearby event and a distant event occurred simultaneously? Assuming we want the second to be the standard unit for measuring the time interval between two events, how do we operationally define the second so we can measure whether one event occurred exactly one second later than another event? A related question is: How do we know whether the clock we have is accurate? Less fundamentally, attention must also be paid to how dates depend on shifting from Standard Time to Daylight Saving Time, on crossing the International Date Line, on switching from the Julian to the Gregorian Calendar, and on comparing regular years with leap years.

Let's design a coordinate system for time. Suppose we have already assigned a date of zero to the event that we choose to be at the origin of our coordinate system. To assign dates to other events, we first must define a standard clock and declare that the time intervals between any two consecutive ticks of that clock are the same. The second, our conventional unit of time measurement, will be defined to be so many ticks of the standard clock. We then synchronize other clocks with the standard clock so the clocks show equal readings at the same time. The time or date at which a point event occurs is the number reading on the clock at rest there. If there is no clock there, the assignment process is more complicated.

We want to use clocks to assign a time even to very distant events, not just to events in the immediate vicinity of the clock. To do this correctly requires some appreciation of Einstein's theory of relativity. A major difficulty is that two nearby synchronized clocks, namely clocks that have been calibrated and set to show the same time when they are next to each other, will not in general stay synchronized if one is transported somewhere else. If they undergo the same motions and gravitational influences, they will stay synchronized; otherwise, they won't. There is no privileged transportation process that we can appeal to. For more on how to assign dates to distant events, see the discussion of the relativity and conventionality of simultaneity.

As a practical matter, dates are assigned to events in a wide variety of ways. The date of the birth of the Sun is assigned very differently from dates assigned to two successive crests of a light wave in a laboratory laser. For example, there are lasers whose successive crests of visible light waves pass by a given location in the laboratory about every 10^-15 seconds. This short time isn't measured with a stopwatch. It is computed from measurements of the light's wavelength. We rely on electromagnetic theory for the equation connecting the periodic time of the wave to its wavelength and speed. Dates for other kinds of events, such as the birth of the Sun, also are often computed rather than directly measured with a clock.
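
The equation in question is simply period = wavelength / speed. Here is a minimal sketch, assuming a hypothetical visible-light wavelength of 500 nanometers and propagation at the vacuum speed of light.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def wave_period(wavelength_m, speed=C):
    """Time between successive crests passing a fixed location, in seconds."""
    return wavelength_m / speed

print(wave_period(500e-9))  # on the order of 10^-15 seconds
```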

22. What Is Essential to Being a Clock?

Every clock, in the principal sense of the word “clock,” has two essential functions: to tick and to count. In order to tick it must generate a sequence of events that are nearly all of the same duration. To tick is to do the same thing over and over again. We need predictable, regular, cyclic behavior in order to measure time with a clock. In a pendulum clock, the cyclic behavior is the swings of the pendulum. In a digital clock, the cycles are oscillations in an electronic circuit. In a sundial, they are regular movements of a shadow. The rotating earth is a clock that ticks once a day. The revolving earth is a clock that ticks once a year.

The second essential function of any clock is to display a count of those periodic events. This count is a measure of the duration of the event that the clock is used for. The count is normally converted into seconds or some other standard unit of time. This counting can be especially difficult if the ticks are occurring a trillion times a second. A calendar is not a clock, but rather a record of the count of a clock's days and months. It is an arbitrary convention that we design clocks to count up to higher numbers rather than down to lower numbers as time goes on. It is also a convention that we re-set our clock by one hour as we move across a time-zone on the earth's surface, or that we add leap days and leap seconds to our calendars.

The term “clock” is ambiguous, and there is another sense of the term in which all that is required of a clock is that it can be used to measure the duration of an event. If we have a process whose behavior is recognized to last a certain duration, then we sometimes use that process to measure the duration of another event that lasts the same duration and call this “using a clock.” For example, we have a candle that we agree takes an hour to burn down; we notice that the candle was lit at the beginning of dinner, then had burned down completely just as the dessert course was served, so we say we used a candle “clock” to measure the time from the beginning of the meal until dessert was served. Or we agree on how long the process of nuclear decay of a given amount of uranium into a given amount of lead takes, and then we measure the percentage of lead to uranium in volcanic rocks and say the volcano exploded a certain time ago, using our uranium-decay “clock” under the assumption that when the volcano exploded it contained no lead at all. Or we agree on the speed of light, and then say that some process has lasted just as long as light has taken to travel a certain distance. We say that we have measured the duration of that process with a “light clock” when we compute the duration from the distance information.
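
The uranium-decay "clock" can be sketched with the standard law of exponential decay. The sketch makes the same assumption as the text (no lead present at the start) plus two more: that each decayed uranium atom ends up as one lead atom, and that no atoms have entered or left the rock. The half-life used is the accepted approximate value for uranium-238, and the function name is illustrative.

```python
import math

U238_HALF_LIFE_YEARS = 4.468e9  # approximate half-life of uranium-238

def decay_age_years(lead_atoms, uranium_atoms, half_life=U238_HALF_LIFE_YEARS):
    """Elapsed time implied by the present ratio of lead to uranium atoms."""
    ratio = lead_atoms / uranium_atoms
    return half_life / math.log(2) * math.log(1 + ratio)

# Equal numbers of lead and uranium atoms mean exactly one half-life has passed.
print(decay_age_years(1.0, 1.0))
# A hypothetical rock with one lead atom for every ten uranium atoms.
print(decay_age_years(1.0, 10.0))
```

As with the candle, the duration is computed from an agreed-upon rate rather than read off a counter of ticks.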

The goal in designing a clock is that it be accurate.

23. What Does It Mean for a Clock to be Accurate?

An accurate clock is a clock that is in synchrony with the standard clock. When the time measurements of the clock agree with the measurements made using the standard clock, we say the clock is accurate or properly calibrated or synchronized with the standard clock or simply correct. A perfectly accurate clock shows that it is time t just when the standard clock shows that it is time t, for all t. Accuracy is different from precision. If four clocks read exactly thirteen minutes slow compared to the standard clock, then the four are very precise, but they all are inaccurate by thirteen minutes.
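
The difference between accuracy and precision can be put in numerical terms. In the sketch below, accuracy is measured by the average offset from the standard clock and precision by the spread of the offsets; these particular measures and the function name are choices made here for illustration.

```python
from statistics import mean, pstdev

def accuracy_and_precision(readings, standard):
    """Return (average offset from the standard, spread of the offsets)."""
    offsets = [reading - standard for reading in readings]
    return mean(offsets), pstdev(offsets)

# Four clocks, each reading 47 minutes past the hour when the standard
# clock reads 60 minutes past the hour.
bias, spread = accuracy_and_precision([47.0, 47.0, 47.0, 47.0], 60.0)
print(bias, spread)  # -13.0 0.0
```

A spread of zero shows the four clocks are perfectly precise, while the offset of thirteen minutes shows they are all inaccurate.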

One issue is whether the standard clock itself is accurate. Realists will say that the standard clock is our best guess as to what time it really is, and we can make incorrect choices for our standard clock. Anti-realists will say that the standard clock cannot, by definition, be inaccurate, so any choice of a standard clock, even the choice of the president's heartbeat as our standard clock, will yield a standard clock that is accurate.

A clock isn't really measuring the time in a reference frame other than one fixed to the clock. It does not measure time "out there." In other words, a clock measures the elapsed proper time between events that occur along its own worldline. If the clock is in an inertial frame and not moving relative to the standard clock, then it measures the "coordinate time," the time we agree to use in the coordinate system. If the spacetime has no inertial frame, then that spacetime can't have an ordinary coordinate time.

Because clocks are intended to be used to measure events external to themselves, another goal in clock building is to ensure there is no difficulty in telling which clock tick is simultaneous with which events to be measured that are occurring away from the clock. For some situations and clocks, the sound made by the ticking helps us make this determination. We hear the tick just as we see the event occur that we desire to measure. [Note that we are ignoring the difference between the speed of sound and the speed of light.] But we might instead want to determine when the Sun comes up in the morning at some particular place where we and our clock are located.  Actually we are not interested in the Sun itself but in when the sunlight reaches our clock. In this situation, the time measurement is made by our seeing the first sunlight just when we see the digital clock face show a specific time of day. More accuracy in this kind of measurement process requires less reliance on human judgment.

In our discussion so far, we have assumed that the clock is very small, that it can count any part of a second and that it can count high enough to be a calendar. These aren't always good assumptions. Apart from those practical problems, there is the theoretical problem of there being a physical limit to the shortest duration measurable by a given clock because no clock can measure events whose duration is shorter than the time it takes light to travel between the components of that clock, the components in the part that generates the sequence of regular ticks. This theoretical limit places a lower limit on the error margin of the measurement.

Every physical motion and every clock is subject to disturbances. So, for our clock to be accurate, that is, in synchrony with the standard clock, we want it to be adjustable in case it drifts out of synchrony a bit. It helps to keep it isolated from environmental influences such as heat, dust, unusual electromagnetic fields, physical blows (such as dropping the clock), and immersion in the ocean. And it helps to be able to predict how much a specific influence affects the drift out of synchrony so that there can be an adjustment for this influence.

24. What Is Our Standard Clock?

We want to select as our standard clock a clock that we can be reasonably confident will tick regularly in the sense that all periods between adjacent ticks are congruent (the same duration). The international time standard used by most nations is called Coordinated Universal Time, or U.T.C. time, for the initials of the French name. It is not based on a single standard clock but rather on a large group of them. Here is how.

Atomic Time or A.T. time is what is produced by a cesium-based atomic fountain clock that counts in seconds, where those seconds are the S.I. seconds or Système International seconds (in the International System of Units, that is, Le Système International d'Unités). The S.I. second is defined to be the time it takes for a standard cesium atomic clock to emit exactly 9,192,631,770 cycles of radiation produced as the atoms in the clock’s cloud of cesium 133 make a transition between two hyperfine levels of their ground state.

Actually, for the more precise timekeeping, the T.A.I. time scale is used rather than the A.T. scale. The T.A.I. scale does not use a single standard cesium clock but rather a calculated average of the readings of about 200 of the cesium atomic clocks that are distributed around the world in about fifty selected laboratories. One of those laboratories is the National Institute of Standards and Technology in Boulder, Colorado, U.S.A. This calculated average time is called T.A.I. time, the abbreviation of the French phrase for International Atomic Time. The International Bureau of Weights and Measures near Paris performs the averaging about once a month. If your laboratory had sent in your guess for what times "some" events occurred in the previous month according to your own clock, then in the following month, the Bureau would send you a report of how inaccurate your guess was, so you could make adjustments to your clock.

Coordinated Universal Time or U.T.C. time is T.A.I. time plus or minus some integral number of leap seconds. U.T.C. is, by agreement, the time at the Prime Meridian, the longitude that runs through Greenwich, England. The official government time is different in different countries. In the U.S.A., for example, the government time is U.T.C. time minus the hourly offsets for the appropriate time zones of the U.S.A., including whether daylight saving time is observed. U.T.C. time is informally called Zulu Time, and it is the time used by the Internet and the aviation industry throughout the world.
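The relationship among T.A.I. time, U.T.C. time, and a government's zone time can be sketched in a few lines of Python. The sketch assumes the accumulated offset between T.A.I. and U.T.C. is 37 seconds, the value in force since the start of 2017 at the time of writing; that number changes whenever a new leap second is announced. It also uses an eight-hour offset for U.S. Pacific Standard Time as the example zone. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: T.A.I. minus U.T.C. is 37 s (in force since 1 January 2017).
TAI_MINUS_UTC_SECONDS = 37


def tai_to_utc(tai_reading):
    """Convert a naive datetime read off a T.A.I. clock to U.T.C."""
    utc = tai_reading - timedelta(seconds=TAI_MINUS_UTC_SECONDS)
    return utc.replace(tzinfo=timezone.utc)


def utc_to_zone(utc_reading, offset_hours):
    """Apply a whole-hour zone offset, e.g. -8 for Pacific Standard Time."""
    return utc_reading.astimezone(timezone(timedelta(hours=offset_hours)))


tai = datetime(2020, 1, 1, 0, 0, 37)
utc = tai_to_utc(tai)
pacific = utc_to_zone(utc, -8)
print(utc.isoformat())
print(pacific.isoformat())
```

All three readings label the same instant; only the conventions for numbering it differ.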

A.T. time, T.A.I. time, and U.T.C. time are not kinds of physical time but rather kinds of measurements of physical time. So, this is another reason why the word "time" is ambiguous; sometimes it means unmeasured time, and sometimes it means the measure of that time. Speakers rarely take care to say explicitly how they are using the term, so readers need to stay alert, even in the present Supplement and in the main Time article.

By a convention in 1964 [by ratification by the General Conference of Weights and Measures for the International System of Units, which replaced what was called the old "metric system"], the standard clock is the clock that the ratifying nations agree to use for defining the so-called "standard second" or S.I. second. This second, which has been used by the U.S.A. since 1999, is defined to be the duration of 9,192,631,770 periods (cycles, oscillations, vibrations) of a certain kind of microwave radiation emitted in the standard cesium clock. More specifically, the second is defined to be the duration of 9,192,631,770 periods of the microwave radiation required to produce the maximum fluorescence of a small cloud of cesium 133 atoms (that is, their radiating a specific color of light) as the atoms make a transition between two specific hyperfine energy levels of the ground state of the atoms. This is the internationally agreed upon unit for atomic time [the T.A.I. system]. The old astronomical system [Universal Time 1 or UT1] defined a second to be 1/86,400 of an Earth day.

For this "atomic time," or time measured atomically, the atoms of cesium with a uniform energy are sent through a chamber that is being irradiated with microwaves. The frequency of the microwaves is tuned until maximum fluorescence is achieved. That is, it is adjusted until the maximum number of cesium atoms flip from one energy to the other, showing that the microwave radiation frequency is precisely tuned to be 9,192,631,770 vibrations per second. Because this frequency for maximum fluorescence is so stable from one experiment to the next, the vibration number is accurate to this many significant digits.

The meter depends on the second, so time measurement is more basic than space measurement. It does not follow, though, that time is more basic than space. The best way to measure length is to do it via measuring the number of periods of light, since light propagation is very stable or regular, and a light wave's frequency can also be made very stable, and because distance can't be measured as accurately as time. In 1983, the meter was defined in terms of the (pre-defined) second as being the distance light travels in a vacuum in an inertial frame in exactly 1/299,792,458 seconds, or approximately 0.000000003335640952 seconds. That number is picked by convention so that the new meter will be very nearly the same distance as the old meter. The old meter was defined to be the distance between two specific marks on a platinum bar that was kept in the Paris Observatory. Time can be measured not only more accurately than distance but also more accurately than voltage, temperature, mass, or anything else.

One subtle implication of these standard definitions of the second and the meter is that they fix the speed of light in a vacuum in all inertial frames. The speed is exactly 299,792,458 meters per second, that is, one meter per roughly 0.000000003335640952 seconds, or approximately 186,282 miles per second or about three million football fields per second. There can no longer be any direct measurement to see if that is how fast light really moves; it is simply defined to be moving that fast. Any measurement that produced a different value for the speed of light would be presumed initially to have an error. The error would be in, say, its measurements of lengths and durations, or in its assumptions about being in an inertial frame, or in its adjustments for the influence of gravitation and acceleration, or in its assumption that the light was moving in a vacuum. This initial presumption of where the error lies comes from a deep reliance by scientists on Einstein's theory of relativity. However, if it were eventually decided by the community of scientists that the theory of relativity is incorrect and that the speed of light shouldn't have been fixed as it was, then the scientists would call for a new world convention to re-define the second.
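The arithmetic behind these figures can be checked directly. The short Python sketch below starts from the defined value of 299,792,458 meters per second and derives the light-travel time for one meter and the speed in miles per second. The length of a football field is an assumption (100 yards, or 91.44 meters), introduced here only to reproduce the rough comparison.

```python
from fractions import Fraction

SPEED_OF_LIGHT_M_PER_S = 299_792_458   # exact, by definition of the meter
METERS_PER_MILE = 1609.344             # exact international mile
FOOTBALL_FIELD_M = 91.44               # assumption: a 100-yard field

# Time light needs to travel one meter, kept as an exact fraction.
light_time_per_meter_s = Fraction(1, SPEED_OF_LIGHT_M_PER_S)

miles_per_second = SPEED_OF_LIGHT_M_PER_S / METERS_PER_MILE
fields_per_second = SPEED_OF_LIGHT_M_PER_S / FOOTBALL_FIELD_M

print(float(light_time_per_meter_s))
print(round(miles_per_second))
print(round(fields_per_second / 1e6, 1), "million fields per second")
```

Keeping the travel time as a fraction makes plain that 1/299,792,458 is the exact value and any decimal expansion of it is a rounding.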

Leap years (with their leap days) are needed as adjustments to the standard clock in order to account for the fact that the number of the Earth’s rotations per Earth revolution does not stay constant from year to year. Without that adjustment, our midnights will drift into the daylight. Leap seconds are needed for another reason. They are needed because the Earth does not rotate regularly and some days last longer than others. Unfortunately, the irregularity is not practically predictable, so when the irregularity occurs a leap second is added or subtracted every six months as needed to keep the time difference between atomic clocks and the Earth’s period of rotation to below 0.9 seconds.

25. Why are Some Standard Clocks Better Than Others?

Other clocks ideally are calibrated by being synchronized to "the" standard clock, but some choices of standard clock are better than others. The philosophical question is whether the better choice is objectively better because it gives us an objectively more accurate clock, or whether the choice is a matter merely of convenience and makes our concept of time a more useful tool for doing physics. The issue is one of realism vs. instrumentalism. Let's consider the various goals we want to achieve in choosing one standard clock rather than another.

One goal is to choose a clock that doesn't drift very much. That is, we want a clock that has a very regular period—so the durations between ticks are congruent. Throughout history, scientists have detected that their currently-chosen standard clock seemed to be drifting. In about 1700, scientists discovered that the time from one day to the next, as determined by sunrises, varied throughout the year. Therefore, they decided to define durations in terms of the mean day throughout the year. Before the 1950s, the standard clock was defined astronomically in terms of the mean rotation of the Earth upon its axis [solar time]. For a short period in the 1950s and 1960s, it was defined in terms of the revolution of the Earth about the Sun [ephemeris time]. The second was defined to be 1/86,400 of the mean solar day, the average throughout the year of the rotational period of the Earth with respect to the Sun.

Now we've found a better standard clock, a certain kind of atomic clock [which displays "atomic time"] that was discussed in the previous section of this Supplement. All atomic clocks measure time in terms of the natural resonant frequencies of certain atoms or molecules. (The dates of adoption of these standard clocks were omitted in this paragraph because different international organizations adopted different standards in different years.) The U.S.A.'s National Institute of Standards and Technology's F-1 atomic fountain clock, which is used for reporting time in the U.S.A. (after adjustment so it reports the average from the other laboratories in the T.A.I. network), is so accurate that it drifts by less than one second every 300 million years. We know there is this drift because it is implied by the laws of physics, not because we have a better clock that measures this drift. With engineering improvements, the "300 million" number may improve.
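A drift of one second in 300 million years can be restated as a fractional error, which is how clock stability is usually compared. The Python sketch below does the conversion, assuming a year of 365.25 days; the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY   # assumption: Julian year


def fractional_drift(seconds_of_drift, years):
    """Drift expressed as a fraction of the elapsed time."""
    return seconds_of_drift / (years * SECONDS_PER_YEAR)


f1_drift = fractional_drift(1, 300e6)
drift_per_day_s = f1_drift * SECONDS_PER_DAY

print(f"{f1_drift:.2e} fractional drift")
print(f"{drift_per_day_s:.2e} seconds of drift per day")
```

On these assumptions the drift is roughly one part in ten quadrillion, which amounts to a few trillionths of a second per day.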

In 2014 several physicists in the journal Nature Physics suggested someday replacing our current standard clock with a network of atomic clocks that are connected via quantum entanglement. They claim that this new clock would not lose a second in 13.8 billion years, which is approximately the age of the universe.

To achieve the goal of restricting drift, we isolate the clock from outside effects. That is, a practical goal in selecting a standard clock is to find a clock that can be well insulated from environmental impact such as comets impacting the Earth, earthquakes, stray electric fields or the presence of dust. If not insulation, then we pursue the goal of compensation. If there is some theoretically predictable effect of the influence upon the standard clock, then the clock can be regularly adjusted to compensate for this effect.

Consider the insulation problem if we were to use as our standard clock the mean yearly motion of the Earth around the Sun. Can we compensate for all the relevant disturbing effects on the motion of the Earth around the Sun? Not easily. The problem is that the Earth's rate of spin varies in a practically unpredictable manner. Meanwhile, we believe that the relevant factors affecting the spin (such as shifts in winds, comet bombardment, earthquakes, the ocean's tides and currents, convection in Earth's molten core) are affecting the rotational speed and period of revolution of the Earth, but not affecting the behavior of the atomic clock. We don't want to be required to say that an earthquake on Earth or the melting of Greenland ice caused a change in the frequency of cesium emissions throughout the galaxies.

We add leap days and seconds in order to keep our atomic-based calendar in synchrony with the rotations and revolutions of the Earth. We want to keep atomic-noons occurring on astronomical-noons and ultimately to prevent Northern hemisphere winters from occurring in some future July, so we systematically add leap years and leap seconds and leap microseconds in the counting process. These changes do not affect the duration of a second, but they do affect the duration of a year because, with leap years, not all years last the same number of days. In this way, we compensate for the Earth-Sun clocks falling out of synchrony with our standard clock.

Another desirable feature of a standard clock is that reproductions of it stay in synchrony with each other when environmental conditions are the same. Otherwise we may be limited to relying on a specifically-located standard clock that can't be trusted elsewhere and that can be stolen. Cesium clocks in a suburb of Istanbul work just like cesium clocks in an airplane over New York City.

Because of the interplay of space with time in relativity theory, the choice of a standard clock depends not only on the simplicity of having a clock with regular ticks but also on the regularity of distances such as having all atoms in a molecular lattice be the same distance apart.

The principal goal in selecting a standard clock is to reduce mystery in physics by finding a periodic process that, if adopted as our standard, makes the resulting system of physical laws simpler and more useful. Choosing an atomic clock as standard is much better for this purpose than choosing the periodic dripping of water from our goat skin bag or even the periodic revolution of the Earth about the Sun. If scientists were to have retained the Earth-Sun clock as the standard clock and were to say that by definition the Earth does not slow down in any rotation or in any revolution, then when a comet collides with Earth, tempting the scientists to say the Earth's period of rotation and revolution changed, the scientists would be forced instead to alter, among many other things, their atomic theory and say the frequency of light emitted from cesium atoms mysteriously increases all over the universe when comets collide with Earth. By switching to the cesium atomic standard, these alterations become unnecessary and the mystery vanishes. Now scientists can explain that the non-uniform wobbling of the Earth's daily rotations and yearly revolutions is due to comet collisions, or to the effect of varying tides on the Earth, convection beneath the Earth's crust, our planet's encounters with dust, and the gravitational pull of the moon, Sun, and other planets. Without the change in standard clock, physicists would be faced with mysterious relationships among these factors; those factors could not be allowed to affect the period of rotation and revolution of the Earth if the periods had to be the same by definition.

To achieve the goal of choosing a standard clock that maximally reduces mystery, we want the clock's readings to be consistent with the accepted laws of motion, in the following sense. Newton's first law of motion says that a body in motion should continue to cover the same distance during the same time interval unless acted upon by an external force. If we used our standard clock to run a series of tests of the time intervals as a body coasted along a carefully measured path, and we found that the law was violated and we couldn't account for this mysterious violation by finding external forces to blame and we were sure that there was no problem otherwise with Newton's law or with the measurement of the length of the path, then the problem would be with the clock. Leonhard Euler [1707-1783] was the first person to suggest this consistency requirement on our choice of a standard clock. A similar argument holds today but with using the laws of motion from Einstein's theory of relativity.
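The consistency requirement can be put as a simple test. Suppose we record the position of a force-free body at each successive tick of a candidate clock. If the clock is a good one by this criterion, the distances covered between adjacent ticks should all be equal. The Python sketch below assumes one-dimensional positions and a tolerance chosen by the experimenter; the function name is hypothetical.

```python
def clock_agrees_with_first_law(positions_at_ticks, tolerance=1e-9):
    """True if a coasting body covers equal distances between adjacent ticks."""
    if len(positions_at_ticks) < 2:
        raise ValueError("need positions at two or more ticks")
    steps = [later - earlier
             for earlier, later in zip(positions_at_ticks,
                                       positions_at_ticks[1:])]
    return max(steps) - min(steps) <= tolerance


regular_clock = [0.0, 2.0, 4.0, 6.0]   # equal distance per tick
drifting_clock = [0.0, 2.0, 5.0, 6.0]  # unequal distance per tick

print(clock_agrees_with_first_law(regular_clock))
print(clock_agrees_with_first_law(drifting_clock))
```

A failed test does not by itself locate the fault. As the paragraph above says, we blame the clock only after ruling out external forces and errors in measuring the path.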

What it means for the standard clock to be accurate depends on your philosophy of time. If you are a conventionalist, then once you select the standard clock it cannot fail to be accurate in the sense of being correct. On the other hand, if you are an objectivist, you will say the standard clock can be inaccurate. There are different sorts of objectivists. Suppose we ask the question, "Can the time shown on a properly functioning standard clock be inaccurate?" The answer is "no" if the target is to be in synchrony with the current standard clock, as the conventionalists believe, but "yes" if there is another target. Objectivists can propose at least three distinct targets: (1) absolute time in Newton's sense, (2) the best possible clock, and (3) the best known clock. We do not have a way of knowing whether our current standard clock is close to target 1 or target 2. But if the best known clock has not yet been chosen to be the standard clock, then the current standard clock can be inaccurate in sense 3.

When we want to know how long a basketball game lasts, why do we subtract the start time from the end time? The answer is that we accept a metric for duration in which we subtract two time numbers to determine the duration between the two. Why don't we choose another metric and, let's say, subtract the square root of the start time from the square root of the end time? This question is implicitly asking whether our choice of metric can be incorrect or merely inconvenient.

Let's say more about this. When we choose a standard clock, we are choosing a metric. By agreeing to read the clock so that a duration from 3:00 to 5:00 is 5-3 hours or 2 hours, we are making a choice about how to compare any two durations in order to decide whether they are equal, that is, congruent. We suppose the duration from 3:00 to 5:00 as shown by yesterday's reading of the standard clock was the same as the duration from 3:00 to 5:00 on the readings from two days ago, and will be the same for today's readings and tomorrow's readings. Philosophers of time continue to dispute the extent to which the choice of metric is conventional rather than objective in the sense of being forced on us by nature. The objectivist says the choice is forced and that the success of the standard atomic clock over the standard solar clock shows that we were more accurate in our choice of the standard clock. The objectivist believes that whether two intervals of time are really equivalent is an intrinsic feature of nature, so choosing the standard clock is not any more conventional than our choosing to say the Earth is round rather than flat. Taking the conventionalist side on this issue, Adolf Grünbaum argues that time is "metrically amorphous." It has no intrinsic metric. Instead, we choose the metric we do in order only to achieve the goals of reducing mystery in science, but satisfying those goals is no sign of being correct.
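The contrast between the usual subtraction metric and the square-root alternative mentioned above can be made concrete. The Python sketch below applies both to the same clock readings, taken here to be non-negative numbers of hours. The function names are hypothetical, and the readings are invented for illustration.

```python
import math


def standard_duration(start, end):
    """The usual metric: subtract the start reading from the end reading."""
    return end - start


def sqrt_duration(start, end):
    """An alternative metric: subtract the square roots of the readings."""
    return math.sqrt(end) - math.sqrt(start)


# Two intervals that are congruent under the usual metric...
print(standard_duration(0, 4), standard_duration(4, 8))
# ...are not congruent under the square-root metric.
print(sqrt_duration(0, 4), sqrt_duration(4, 8))
# And two intervals congruent under the square-root metric...
print(sqrt_duration(0, 4), sqrt_duration(4, 16))
# ...are not congruent under the usual one.
print(standard_duration(0, 4), standard_duration(4, 16))
```

Both metrics agree on which reading comes earlier. They disagree only about which intervals count as equal, and that disagreement is exactly what the dispute over conventionality concerns.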

The conventionalist as opposed to the objectivist would say that if we were to require by convention that the instant at which Jesus was born and the instant at which Abraham Lincoln was assassinated are to be only 24 seconds apart, whereas the duration between Lincoln's assassination and his burial is to be 24 billion seconds, then we could not be mistaken. It is up to us as a civilization to say what is correct when we first create our conventions about measuring duration. We can consistently assign any numerical time coordinates we wish, subject only to the condition that the assignment properly reflect the betweenness relations of the events that occur at those instants. That is, if event J (birth of Jesus) occurs before event L (Lincoln's assassination) and this in turn occurs before event B (burial of Lincoln), then the time assigned to J must be numerically less than the time assigned to L, and both must be less than the time assigned to B so that t(J) < t(L) < t(B). A simple requirement. Yes, but the implication is that this relationship among J, L, and B must hold for all events simultaneous with J, and for all events simultaneous with L, and so forth. Another obvious implication is that the devices which served as good clocks according to one choice of metric will not be good clocks according to a new choice of metric.
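The single constraint the conventionalist accepts, that time coordinates preserve the order of events, is easy to state as a check. The Python sketch below tests whether an assignment satisfies t(J) < t(L) < t(B). The coordinate values follow the 24-second and 24-billion-second convention described above, with the birth of Jesus arbitrarily placed at zero; the function name is hypothetical.

```python
def respects_temporal_order(time_of, events_in_order):
    """True if earlier events always receive strictly smaller coordinates."""
    times = [time_of[event] for event in events_in_order]
    return all(earlier < later for earlier, later in zip(times, times[1:]))


order = ["J", "L", "B"]   # birth of Jesus, Lincoln's assassination, burial

# The odd convention: 24 seconds, then 24 billion seconds.
odd_convention = {"J": 0, "L": 24, "B": 24 + 24_000_000_000}
# An assignment that reverses two events is ruled out.
scrambled = {"J": 0, "L": 50, "B": 10}

print(respects_temporal_order(odd_convention, order))
print(respects_temporal_order(scrambled, order))
```

The odd convention passes the check, which is the conventionalist's point: order alone does not rule it out, so something further is needed to explain why we reject it.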

It is other features of nature that lead us to reject the above convention about 24 seconds and 24 billion seconds. What features? There are many periodic processes in nature that have a special relationship to each other; their periods are very nearly constant multiples of each other; and this constant stays the same over a long time. For example, the period of the rotation of the Earth is a fairly constant multiple of the period of the revolution of the Earth around the Sun, and both these periods are a constant multiple of the periods of a swinging pendulum and of vibrations of quartz crystals. The class of these periodic processes is very large, so the world will be easier to describe if we choose our standard clock from one of these periodic processes. A good convention for what is regular will make it easier for scientists to find simple laws of nature and to explain what causes other events to be irregular. It is the search for regularity and simplicity and removal of mystery that leads us to adopt the conventions we do for numerical time coordinate assignments and thus leads us to choose the standard clock we do choose. Objectivists disagree and say this search for regularity and simplicity and removal of mystery is all fine, but it is directing us toward the intrinsic metric, not simply the useful metric.



Author Information

Bradley Dowden
California State University Sacramento
U. S. A.

Simplicity in the Philosophy of Science

The view that simplicity is a virtue in scientific theories and that, other things being equal, simpler theories should be preferred to more complex ones has been widely advocated in the history of science and philosophy, and it remains widely held by modern scientists and philosophers of science. It often goes by the name of “Ockham’s Razor.” The claim is that simplicity ought to be one of the key criteria for evaluating and choosing between rival theories, alongside criteria such as consistency with the data and coherence with accepted background theories. Simplicity, in this sense, is often understood ontologically, in terms of how simple a theory represents nature as being. For example, a theory might be said to be simpler than another if it posits the existence of fewer entities, causes, or processes in nature in order to account for the empirical data. However, simplicity can also be understood in terms of various features of how theories go about explaining nature. For example, a theory might be said to be simpler than another if it contains fewer adjustable parameters, if it invokes fewer extraneous assumptions, or if it provides a more unified explanation of the data.

Preferences for simpler theories are widely thought to have played a central role in many important episodes in the history of science. Simplicity considerations are also regarded as integral to many of the standard methods that scientists use for inferring hypotheses from empirical data, the most common illustration of this being the practice of curve-fitting. Indeed, some philosophers have argued that a systematic bias towards simpler theories and hypotheses is a fundamental component of inductive reasoning quite generally.

However, though the legitimacy of choosing between rival scientific theories on grounds of simplicity is frequently taken for granted, or viewed as self-evident, this practice raises a number of very difficult philosophical problems. A common concern is that notions of simplicity appear vague, and judgments about the relative simplicity of particular theories appear irredeemably subjective. Thus, one problem is to explain more precisely what it is for theories to be simpler than others and how, if at all, the relative simplicity of theories can be objectively measured. In addition, even if we can get clearer about what simplicity is and how it is to be measured, there remains the problem of explaining what justification, if any, can be provided for choosing between rival scientific theories on grounds of simplicity. For instance, do we have any reason for thinking that simpler theories are more likely to be true?

This article provides an overview of the debate over simplicity in the philosophy of science. Section 1 illustrates the putative role of simplicity considerations in scientific methodology, outlining some common views of scientists on this issue, different formulations of Ockham’s Razor, and some commonly cited examples of simplicity at work in the history and current practice of science. Section 2 highlights the wider significance of the philosophical issues surrounding simplicity for central controversies in the philosophy of science and epistemology. Section 3 outlines the challenges facing the project of trying to precisely define and measure theoretical simplicity, and it surveys the leading measures of simplicity and complexity currently on the market. Finally, Section 4 surveys the wide variety of attempts that have been made to justify the practice of choosing between rival theories on grounds of simplicity.

Table of Contents

  1. The Role of Simplicity in Science
    1. Ockham’s Razor
    2. Examples of Simplicity Preferences at Work in the History of Science
      1. Newton’s Argument for Universal Gravitation
      2. Other Examples
    3. Simplicity and Inductive Inference
    4. Simplicity in Statistics and Data Analysis
  2. Wider Philosophical Significance of Issues Surrounding Simplicity
  3. Defining and Measuring Simplicity
    1. Syntactic Measures
    2. Goodman’s Measure
    3. Simplicity as Testability
    4. Sober’s Measure
    5. Thagard’s Measure
    6. Information-Theoretic Measures
    7. Is Simplicity a Unified Concept?
  4. Justifying Preferences for Simpler Theories
    1. Simplicity as an Indicator of Truth
      1. Nature is Simple
      2. Meta-Inductive Proposals
      3. Bayesian Proposals
      4. Simplicity as a Fundamental A Priori Principle
    2. Alternative Justifications
      1. Falsifiability
      2. Simplicity as an Explanatory Virtue
      3. Predictive Accuracy
      4. Truth-Finding Efficiency
    3. Deflationary Approaches
  5. Conclusion
  6. References and Further Reading

1. The Role of Simplicity in Science

There are many ways in which simplicity might be regarded as a desirable feature of scientific theories. Simpler theories are frequently said to be more “beautiful” or more “elegant” than their rivals; they might also be easier to understand and to work with. However, according to many scientists and philosophers, simplicity is not something that is merely to be hoped for in theories; nor is it something that we should only strive for after we have already selected a theory that we believe to be on the right track (for example, by trying to find a simpler formulation of an accepted theory). Rather, the claim is that simplicity should actually be one of the key criteria that we use to evaluate which of a set of rival theories is, in fact, the best theory, given the available evidence: other things being equal, the simplest theory consistent with the data is the best one.

This view has a long and illustrious history. Though it is now most commonly associated with the 14th century philosopher, William of Ockham (also spelt “Occam”), whose name is attached to the famous methodological maxim known as “Ockham’s razor”, which is often interpreted as enjoining us to prefer the simplest theory consistent with the available evidence, it can be traced at least as far back as Aristotle. In his Posterior Analytics, Aristotle argued that nothing in nature was done in vain and nothing was superfluous, so our theories of nature should be as simple as possible. Several centuries later, at the beginning of the modern scientific revolution, Galileo espoused a similar view, holding that, “[n]ature does not multiply things unnecessarily; that she makes use of the easiest and simplest means for producing her effects” (Galilei, 1962, p396). Similarly, at the beginning of the third book of the Principia, Isaac Newton included the following principle among his “rules for the study of natural philosophy”:

  • No more causes of natural things should be admitted than are both true and sufficient to explain their phenomena.
    As the philosophers say: Nature does nothing in vain, and more causes are in vain when fewer will suffice. For Nature is simple and does not indulge in the luxury of superfluous causes. (Newton, 1999, p794 [emphasis in original]).

In the 20th century, Albert Einstein asserted that “our experience hitherto justifies us in believing that nature is the realisation of the simplest conceivable mathematical ideas” (Einstein, 1954, p274). More recently, the eminent physicist Steven Weinberg has claimed that he and his fellow physicists “demand simplicity and rigidity in our principles before we are willing to take them seriously” (Weinberg, 1993, p148-9), while the Nobel prize winning economist John Harsanyi has stated that “[o]ther things being equal, a simpler theory will be preferable to a less simple theory” (quoted in McAlleer, 2001, p296).

It should be noted, however, that not all scientists agree that simplicity should be regarded as a legitimate criterion for theory choice. The eminent biologist Francis Crick once complained, “[w]hile Occam’s razor is a useful tool in physics, it can be a very dangerous implement in biology. It is thus very rash to use simplicity and elegance as a guide in biological research” (Crick, 1988, p138). Similarly, here is a group of earth scientists writing in Science:

  • Many scientists accept and apply [Ockham’s Razor] in their work, even though it is an entirely metaphysical assumption. There is scant empirical evidence that the world is actually simple or that simple accounts are more likely than complex ones to be true. Our commitment to simplicity is largely an inheritance of 17th-century theology. (Oreskes et al, 1994, endnote 25)

Hence, while very many scientists assert that rival theories should be evaluated on grounds of simplicity, others are much more skeptical about this idea. Much of this skepticism stems from the suspicion that the cogency of a simplicity criterion depends on assuming that nature is simple (hardly surprising given the way that many scientists have defended such a criterion) and that we have no good reason to make such an assumption. Crick, for instance, seemed to think that such an assumption could make no sense in biology, given the patent complexity of the biological world. In contrast, some advocates of simplicity have argued that a preference for simple theories need not necessarily assume a simple world—for instance, even if nature is demonstrably complex in an ontological sense, we should still prefer comparatively simple explanations for nature’s complexity. Oreskes and others also emphasize that the simplicity principles of scientists such as Galileo and Newton were explicitly rooted in a particular kind of natural theology, which held that a simple and elegant universe was a necessary consequence of God’s benevolence. Today, there is much less enthusiasm for grounding scientific methods in theology (the putative connection between God’s benevolence and the simplicity of creation is theologically controversial in any case). Another common source of skepticism is the apparent vagueness of the notion of simplicity and the suspicion that scientists’ judgments about the relative simplicity of theories lack a principled and objective basis.

Even so, there is no doubting the popularity of the idea that simplicity should be used as a criterion for theory choice and evaluation. It seems to be explicitly ingrained into many scientific methods—for instance, standard statistical methods of data analysis (Section 1d). It has also spread far beyond philosophy and the natural sciences. A recent issue of the FBI Law Enforcement Bulletin, for instance, contained the advice that “[u]nfortunately, many people perceive criminal acts as more complex than they really are… the least complicated explanation of an event is usually the correct one” (Rothwell, 2006, p24).

a. Ockham’s Razor

Many scientists and philosophers endorse a methodological principle known as “Ockham’s Razor”. This principle has been formulated in a variety of different ways. In the early 21st century, it is typically just equated with the general maxim that simpler theories are “better” than more complex ones, other things being equal. Historically, however, it has been more common to formulate Ockham’s Razor as a more specific type of simplicity principle, often referred to as “the principle of parsimony”. Whether William of Ockham himself would have endorsed any of the wide variety of methodological maxims that have been attributed to him is a matter of some controversy (see Thorburn, 1918; entry on William of Ockham), since Ockham never explicitly referred to a methodological principle that he called his “razor”. However, a standard formulation of the principle of parsimony, and one that seems to be reasonably close to the sort of principle that Ockham himself probably would have endorsed, is the maxim “entities are not to be multiplied beyond necessity”. So stated, the principle is ontological, since it is concerned with parsimony with respect to the entities that theories posit the existence of in attempting to account for the empirical data. “Entity”, in this context, is typically understood broadly, referring not just to objects (for example, atoms and particles), but also to other kinds of natural phenomena that a theory may include in its ontology, such as causes, processes, properties, and so forth. Other, more general formulations of Ockham’s Razor are not exclusively ontological, and may also make reference to various structural features of how theories go about explaining nature, such as the unity of their explanations. The remainder of this section will focus on the more traditional ontological interpretation.

It is important to recognize that the principle “entities are not to be multiplied beyond necessity” can be read in at least two different ways. One way of reading it is as what we can call an anti-superfluity principle (Barnes, 2000). This principle calls for the elimination of ontological posits from theories that are explanatorily redundant. Suppose, for instance, that there are two theories, T1 and T2, which both seek to explain the same set of empirical data, D. Suppose also that T1 and T2 are identical in terms of the entities that are posited, except for the fact that T2 entails an additional posit, b, that is not part of T1. So let us say that T1 posits a, while T2 posits a + b. Intuitively, T2 is a more complex theory than T1 because it posits more things. Now let us assume that both theories provide an equally complete explanation of D, in the sense that there are no features of D that either theory fails to account for. In this situation, the anti-superfluity principle would instruct us to prefer the simpler theory, T1, to the more complex theory, T2. The reason is that T2 contains an explanatorily redundant posit, b, which does no explanatory work in the theory with respect to D. We know this because T1, which posits a alone, provides just as adequate an account of D as T2 does. Hence, we can infer that positing a alone is sufficient to acquire all the explanatory ability offered by T2, with respect to D; adding b does nothing to improve the ability of T2 to account for the data.

This sort of anti-superfluity principle underlies one important interpretation of “entities are not to be multiplied beyond necessity”: as a principle that invites us to get rid of superfluous components of theories. Here, an ontological posit is superfluous with respect to a given theory, T, in so far as it does nothing to improve T’s ability to account for the phenomena to be explained. This is how John Stuart Mill understood Ockham’s razor (Mill, 1867, p526). Mill also pointed to a plausible justification for the anti-superfluity principle: explanatorily redundant posits—those that have no effect on the ability of the theory to explain the data—are also posits that do not obtain evidential support from the data. This is because it is plausible that theoretical entities are evidentially supported by empirical data only to the extent that they can help us to account for why the data take the form that they do. If a theoretical entity fails to contribute to this end, then the data fails to confirm the existence of this entity. If we have no other independent reason to postulate the existence of this entity, then we have no justification for including this entity in our theoretical ontology.

Another justification that has been offered for the anti-superfluity principle is a probabilistic one. Note that T2 is a logically stronger theory than T1: T2 says that a and b exist, while T1 says only that a exists. It is a consequence of the axioms of probability that a logically stronger theory is never more probable than a logically weaker theory that it entails. Thus, so long as the probability of a existing and the probability of b existing are independent of each other, the probability of a existing is greater than zero, and the probability of b existing is less than 1, we can assert that Pr (a exists) > Pr (a exists & b exists), where Pr (a exists & b exists) = Pr (a exists) * Pr (b exists). According to this reasoning, we should therefore regard the claims of T1 as more a priori probable than the claims of T2, and this is a reason to prefer it. However, one objection to this probabilistic justification for the anti-superfluity principle is that it doesn’t fully explain why we dislike theories that posit explanatorily redundant entities: it can’t really be because they are logically stronger theories; rather, it is because they postulate entities that are unsupported by evidence.
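The inequality appealed to in this argument can be spelled out in a few lines. The following is only a sketch under the independence assumption stated above; a and b abbreviate the claims “a exists” and “b exists”.

```latex
\begin{align*}
\Pr(a \wedge b) &= \Pr(a)\,\Pr(b) && \text{(independence)} \\
\Pr(a) - \Pr(a \wedge b) &= \Pr(a)\,\bigl(1 - \Pr(b)\bigr) \\
&> 0 && \text{whenever } \Pr(a) > 0 \text{ and } \Pr(b) < 1 .
\end{align*}
```

Without the independence assumption, the axioms still guarantee the weaker result that Pr (a exists & b exists) is no greater than Pr (a exists), since the conjunction entails its first conjunct.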

When the principle of parsimony is read as an anti-superfluity principle, it seems relatively uncontroversial. However, it is important to recognize that the vast majority of instances where the principle of parsimony is applied (or has been seen as applying) in science cannot be given an interpretation merely in terms of the anti-superfluity principle. This is because the phrase “entities are not to be multiplied beyond necessity” is normally read as what we can call an anti-quantity principle: theories that posit fewer things are (other things being equal) to be preferred to theories that posit more things, whether or not the relevant posits play any genuine explanatory role in the theories concerned (Barnes, 2000). This is a much stronger claim than the claim that we should razor off explanatorily redundant entities. The evidential justification for the anti-superfluity principle just described cannot be used to motivate the anti-quantity principle, since the reasoning behind this justification allows that we can posit as many things as we like, so long as all of the individual posits do some explanatory work within the theory. It merely tells us to get rid of theoretical ontology that, from the perspective of a given theory, is explanatorily redundant. It does not tell us that theories that posit fewer things when accounting for the data are better than theories that posit more things—that is, that sparser ontologies are better than richer ones.

Another important point about the anti-superfluity principle is that it does not give us a reason to assert the non-existence of the superfluous posit. Absence of evidence is not (by itself) evidence for absence. Hence, this version of Ockham’s razor is sometimes also referred to as an “agnostic” razor rather than an “atheistic” razor, since it only motivates us to be agnostic about the razored-off ontology (Sober, 1981). It seems that in most cases where Ockham’s razor is appealed to in science it is intended to support atheistic conclusions: the entities concerned are not merely cut out of our theoretical ontology, their existence is also denied. Hence, if we are to explain why such a preference is justified, we will need to look for a different justification. With respect to the probabilistic justification for the anti-superfluity principle described above, it is important to note that it is not an axiom of probability that Pr (a exists & b doesn’t exist) > Pr (a exists & b exists).
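The last point can be checked numerically. The sketch below assumes, purely for illustration, that the existence of a and the existence of b are probabilistically independent; the function name and the probability values are hypothetical. Under that assumption, the “atheistic” hypothesis (a exists and b does not) is more probable than the richer hypothesis (a and b both exist) only when the probability of b existing is below one half.

```python
def joint_probabilities(p_a, p_b):
    """Return (Pr(a & not-b), Pr(a & b)), assuming a and b are independent."""
    return p_a * (1 - p_b), p_a * p_b


# Illustrative values only: Pr(a) is held fixed while Pr(b) varies.
for p_b in (0.2, 0.5, 0.8):
    without_b, with_b = joint_probabilities(0.9, p_b)
    print(f"Pr(b)={p_b}: Pr(a & not-b)={without_b:.2f}, "
          f"Pr(a & b)={with_b:.2f}, atheistic favoured: {without_b > with_b}")
```

So the probabilistic argument supports preferring “a exists” to “a and b exist”, but it does not by itself support denying that b exists.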

b. Examples of Simplicity Preferences at Work in the History of Science

It is widely believed that there have been numerous episodes in the history of science where particular scientific theories were defended by particular scientists and/or came to be preferred by the wider scientific community less for directly empirical reasons (for example, some telling experimental finding) than as a result of their relative simplicity compared to rival theories. Hence, the history of science is taken to demonstrate the importance of simplicity considerations in how scientists defend, evaluate, and choose between theories. One striking example is Isaac Newton’s argument for universal gravitation.

i. Newton’s Argument for Universal Gravitation

At the beginning of the third book of the Principia, subtitled “The system of the world”, Isaac Newton described four “rules for the study of natural philosophy”:

  • Rule 1 No more causes of natural things should be admitted than are both true and sufficient to explain their phenomena.
  • As the philosophers say: Nature does nothing in vain, and more causes are in vain when fewer will suffice. For Nature is simple and does not indulge in the luxury of superfluous causes.
  • Rule 2 Therefore, the causes assigned to natural effects of the same kind must be, so far as possible, the same.
  • Rule 3 Those qualities of bodies that cannot be intended and remitted [i.e., qualities that cannot be increased and diminished] and that belong to all bodies on which experiments can be made should be taken as qualities of all bodies universally.
  • For the qualities of bodies can be known only through experiments; and therefore qualities that square with experiments universally are to be regarded as universal qualities… Certainly ideal fancies ought not to be fabricated recklessly against the evidence of experiments, nor should we depart from the analogy of nature, since nature is always simple and ever consonant with itself…
  • Rule 4 In experimental philosophy, propositions gathered from phenomena by induction should be considered either exactly or very nearly true notwithstanding any contrary hypotheses, until yet other phenomena make such propositions either more exact or liable to exceptions.
  • This rule should be followed so that arguments based on induction may not be nullified by hypotheses. (Newton, 1999, p794-796).

Here we see Newton explicitly placing simplicity at the heart of his conception of the scientific method. Rule 1, a version of Ockham’s Razor, which, despite the use of the word “superfluous”, has typically been read as an anti-quantity principle rather than an anti-superfluity principle (see Section 1a), is taken to follow directly from the assumption that nature is simple, which is in turn taken to give rise to rules 2 and 3, both principles of inductive generalization (infer similar causes for similar effects, and assume to be universal in all bodies those properties found in all observed bodies). These rules play a crucial role in what follows, the centrepiece being the argument for universal gravitation.

After laying out these rules of method, Newton described several “phenomena”—what are in fact empirical generalizations, derived from astronomical observations, about the motions of the planets and their satellites, including the moon. From these phenomena and the rules of method, he then “deduced” several general theoretical propositions. Propositions 1, 2, and 3 state that the satellites of Jupiter, the primary planets, and the moon are attracted towards the centers of Jupiter, the sun, and the earth respectively by forces that keep them in their orbits (stopping them from following a linear path in the direction of their motion at any one time). These forces are also claimed to vary inversely with the square of the distance of the orbiting body (for example, Mars) from the center of the body about which it orbits (for example, the sun). These propositions are taken to follow from the phenomena, including the fact that the respective orbits can be shown to (approximately) obey Kepler’s law of areas and the harmonic law, and the laws of motion developed in book 1 of the Principia. Newton then asserted proposition 4: “The moon gravitates toward the earth and by the force of gravity is always drawn back from rectilinear motion and kept in its orbit” (p802). In other words, it is the force of gravity that keeps the moon in its orbit around the earth. Newton explicitly invoked rules 1 and 2 in the argument for this proposition (what has become known as the “moon-test”). First, astronomical observations told us how fast the moon accelerates towards the earth. Newton was then able to calculate what the acceleration of the moon would be at the earth’s surface, if it were to fall down to the earth. This turned out to be equal to the acceleration of bodies observed to fall in experiments conducted on earth. 
Since it is the force of gravity that causes bodies on earth to fall (Newton assumed his readers’ familiarity with “gravity” in this sense), and since both gravity and the force acting on the moon “are directed towards the center of the earth and are similar to each other and equal”, Newton asserted that “they will (by rules 1 and 2) have the same cause” (p805). Therefore, the force that acts on falling bodies on earth and the force that keeps the moon in its orbit are one and the same: gravity. Given this, the force of gravity acting on terrestrial bodies could now be claimed to obey an inverse-square law. Through similar deployment of rules 1, 2, and 4, Newton was led to the claim that it is also gravity that keeps the planets in their orbits around the sun and the satellites of Jupiter and Saturn in their orbits, since these forces are also directed toward the centers of the sun, Jupiter, and Saturn, and display similar properties to the force of gravity on earth, such as the fact that they obey an inverse-square law. Therefore, the force of gravity was held to act on all planets universally. Through several more steps, Newton was eventually able to get to the principle of universal gravitation: that gravity is a mutually attractive force that acts on any two bodies whatsoever and is described by an inverse-square law, which says that each body attracts the other with a force of equal magnitude that is proportional to the product of the masses of the two bodies and inversely proportional to the squared distance between them. From there, Newton was able to determine the masses and densities of the sun, Jupiter, Saturn, and the earth, and offer a new explanation for the tides of the seas, thus showing the remarkable explanatory power of this new physics.
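The arithmetic behind the moon-test can be sketched as follows. This is a modern reconstruction, not Newton’s own calculation: it assumes a circular lunar orbit and uses approximate present-day values for the earth’s radius, the earth-moon distance, and the length of the sidereal month. The constant and function names are illustrative.

```python
import math

# Approximate modern values (assumptions; not Newton's own figures).
EARTH_RADIUS_M = 6.371e6               # mean radius of the earth
MOON_DISTANCE_M = 3.844e8              # mean earth-moon distance
SIDEREAL_MONTH_S = 27.321661 * 86400   # orbital period of the moon


def centripetal_acceleration(radius_m, period_s):
    """Acceleration toward the center for uniform circular motion."""
    return 4 * math.pi ** 2 * radius_m / period_s ** 2


def moon_test():
    """Return the moon's orbital acceleration and its inverse-square
    extrapolation to the distance of the earth's surface."""
    a_moon = centripetal_acceleration(MOON_DISTANCE_M, SIDEREAL_MONTH_S)
    scale = (MOON_DISTANCE_M / EARTH_RADIUS_M) ** 2
    return a_moon, a_moon * scale


a_moon, a_surface = moon_test()
print(f"acceleration of the moon toward the earth: {a_moon:.5f} m/s^2")
print(f"extrapolated to the earth's surface:       {a_surface:.2f} m/s^2")
```

With these inputs the extrapolated value comes out within a couple of percent of the roughly 9.8 m/s² measured for falling bodies on earth. That agreement is what licenses the appeal to rules 1 and 2: two effects of the same kind and magnitude are assigned a single cause.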

Newton’s argument has been the subject of much debate amongst historians and philosophers of science (for further discussion of the various controversies surrounding its structure and the accuracy of its premises, see Glymour, 1980; Cohen, 1999; Harper, 2002). However, one thing that seems to be clear is that his conclusions are by no means forced on us through simple deductions from the phenomena, even when combined with the mathematical theorems and general theory of motion outlined in book 1 of the Principia. No experiment or mathematical derivation from the phenomena demonstrated that it must be gravity that is the common cause of the falling of bodies on earth and of the orbits of the moon, the planets, and their satellites, much less that gravity is a mutually attractive force acting on all bodies whatsoever. Rather, Newton’s argument appears to boil down to the claim that if gravity did have the properties accorded to it by the principle of universal gravitation, it could provide a common causal explanation for all the phenomena, and his rules of method tell us to infer common causes wherever we can. Hence, the rules, which are in turn grounded in a preference for simplicity, play a crucial role in taking us from the phenomena to universal gravitation (for further discussion of the apparent link between simplicity and common cause reasoning, see Sober, 1988). Newton’s argument for universal gravitation can thus be seen as an argument to the putatively simplest explanation for the empirical observations.

ii. Other Examples

Numerous other putative examples of simplicity considerations at work in the history of science have been cited in the literature:

  • One of the most commonly cited concerns Copernicus’ arguments for the heliocentric theory of planetary motion. Copernicus placed particular emphasis on the comparative “simplicity” and “harmony” of the account that his theory gave of the motions of the planets compared with the rival geocentric theory derived from the work of Ptolemy. This argument appears to have carried significant weight for Copernicus’ successors, including Rheticus, Galileo, and Kepler, who all emphasized simplicity as a major motivation for heliocentrism. Philosophers have suggested various reconstructions of the Copernican argument (see for example, Glymour, 1980; Rosencrantz, 1983; Forster and Sober, 1994; Myrvold, 2003; Martens, 2009). However, historians of science have questioned the extent to which simplicity could have played a genuine rather than purely rhetorical role in this episode. For example, it has been argued that there is no clear sense in which the Copernican system was in fact simpler than Ptolemy’s, and that geocentric systems such as the Tychonic system could be constructed that were at least as simple as the Copernican one (for discussion, see Kuhn, 1957; Palter, 1970; Cohen, 1985; Gingerich, 1993; Martens, 2009).
  • It has been widely claimed that simplicity played a key role in the development of Einstein’s theories of special and general relativity, and in the early acceptance of Einstein’s theories by the scientific community (see for example, Hesse, 1974; Holton, 1974; Schaffner, 1974; Sober, 1981; Pais, 1982; Norton, 2000).
  • Thagard (1988) argues that simplicity considerations played an important role in Lavoisier’s case against the existence of phlogiston and in favour of the oxygen theory of combustion.
  • Carlson (1966) describes several episodes in the history of genetics in which simplicity considerations seemed to have held sway.
  • Nolan (1997) argues that a preference for ontological parsimony played an important role in the discovery of the neutrino and in the proposal of Avogadro’s hypothesis.
  • Baker (2007) argues that ontological parsimony was a key issue in discussions over rival dispersalist and extensionist bio-geographical theories in the late 19th and early 20th century.

Though it is commonplace for scientists and philosophers to claim that simplicity considerations have played a significant role in the history of science, it is important to note that some skeptics have argued that the actual historical importance of simplicity considerations has been over-sold (for example, Bunge, 1961; Lakatos and Zahar, 1978). Such skeptics dispute the claim that we can only explain the basis for these and other episodes of theory change by according a role to simplicity, claiming other considerations actually carried more weight. In addition, it has been argued that, in many cases, what appear on the surface to have been appeals to the relative simplicity of theories were in fact covert appeals to some other theoretical virtue (for example, Boyd, 1990; Sober, 1994; Norton, 2003; Fitzpatrick, 2009). Hence, for any putative example of simplicity at work in the history of science, it is important to consider whether the relevant arguments are not best reconstructed in other terms (such a “deflationary” view of simplicity will be discussed further in Section 4c).

c. Simplicity and Inductive Inference

Many philosophers have come to see simplicity considerations figuring not only in how scientists go about evaluating and choosing between developed scientific theories, but also in the mechanics of making much more basic inductive inferences from empirical data. The standard illustration of this in the modern literature is the practice of curve-fitting. Suppose that we have a series of observations of the values of a variable, y, given values of another variable, x. This gives us a series of data points, as represented in Figure 1.

Figure 1

Given this data, what underlying relationship should we posit between x and y so that we can predict future pairs of x-y values? Standard practice is not to select a bumpy curve that neatly passes through all the data points, but rather to select a smooth curve—preferably a straight line, such as H1—that passes close to the data. But why do we do this? Part of an answer comes from the fact that if the data is to some degree contaminated with measurement error (for example, through mistakes in data collection) or “noise” produced by the effects of uncontrolled factors, then any curve that fits the data perfectly will most likely be false. However, this does not explain our preference for a curve like H1 over an infinite number of other curves—H2, for instance—that also pass close to the data. It is here that simplicity has been seen as playing a vital, though often implicit role in how we go about inferring hypotheses from empirical data: H1 posits a “simpler” relationship between x and y than H2—hence, it is for reasons of simplicity that we tend to infer hypotheses like H1.
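The contrast between a smooth curve and a bumpy one can be made concrete with a small sketch. The data points below are invented for illustration (they are not the points in Figure 1) and lie close to a straight line. The code fits a least-squares line, in the spirit of H1, and also builds the unique polynomial that passes exactly through every point. Function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def lagrange(xs, ys, x):
    """Value at x of the unique polynomial through every data point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def rss(predict, xs, ys):
    """Residual sum of squares of a prediction function on the data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))


# Invented data: roughly y = 1 + 2x plus small "measurement errors".
xs = [0, 1, 2, 3, 4, 5]
ys = [1.2, 2.7, 5.3, 6.8, 9.4, 10.9]

intercept, slope = fit_line(xs, ys)
line = lambda x: intercept + slope * x
bumpy = lambda x: lagrange(xs, ys, x)

print("misfit on the sample: line", rss(line, xs, ys), "bumpy", rss(bumpy, xs, ys))
print("prediction at x = 7:  line", line(7), "bumpy", bumpy(7))
```

The bumpy curve fits the sample perfectly while the line does not, yet it is the line whose prediction at a new value of x stays near the trend in the data. Perfect fit to the sample is therefore not what guides the standard choice.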

The practice of curve-fitting has been taken to show that, whether we are aware of it or not, human beings have a fundamental cognitive bias towards simple hypotheses. Whether we are deciding between rival scientific theories, or performing more basic generalizations from our experience, we ubiquitously tend to infer the simplest hypothesis consistent with our observations. Moreover, this bias is held to be necessary in order for us to be able to select a unique hypothesis from the potentially limitless number of hypotheses consistent with any finite amount of experience.

The view that simplicity may often play an implicit role in empirical reasoning can arguably be traced back to David Hume’s description of enumerative induction in the context of his formulation of the famous problem of induction. Hume suggested that a tacit assumption of the uniformity of nature is ingrained into our psychology. Thus, we are naturally drawn to the conclusion that all ravens have black feathers from the fact that all previously observed ravens have black feathers because we tacitly assume that the world is broadly uniform in its properties. This has been seen as a kind of simplicity assumption: it is simpler to assume more of the same.

A fundamental link between simplicity and inductive reasoning has been retained in many more recent descriptive accounts of inductive inference. For instance, Hans Reichenbach (1949) described induction as an application of what he called the “Straight Rule”, modelling all inductive inference on curve-fitting. In addition, proponents of the model of “Inference to Best Explanation”, who hold that many inductive inferences are best understood as inferences to the hypothesis that would, if true, provide the best explanation for our observations, normally claim that simplicity is one of the criteria that we use to determine which hypothesis constitutes the “best” explanation.

In recent years, the putative role of simplicity in our inferential psychology has been attracting increasing attention from cognitive scientists. For instance, Lombrozo (2007) describes experiments that she claims show that participants use the relative simplicity of rival explanations (for instance, whether a particular medical diagnosis for a set of symptoms involves assuming the presence of one or multiple independent conditions) as a guide to assessing their probability, such that a disproportionate amount of contrary probabilistic evidence is required for participants to choose a more complex explanation over a simpler one. Simplicity considerations have also been seen as central to learning processes in many different cognitive domains, including language acquisition and category learning (for example, Chater, 1999; Lu and others, 2006).

d. Simplicity in Statistics and Data Analysis

Philosophers have long used the example of curve-fitting to illustrate the (often implicit) role played by considerations of simplicity in inductive reasoning from empirical data. However, partly due to the advent of low-cost computing power and the fact that scientists in many disciplines find themselves having to deal with ever larger and more intricate bodies of data, recent decades have seen a remarkable revolution in the methods available to scientists for analyzing and interpreting empirical data (Gauch, 2006). Importantly, there are now numerous formalized procedures for data analysis that can be implemented in computer software, and which are widely used in disciplines from engineering to crop science to sociology, that contain an explicit role for some notion of simplicity. The literature on such methods abounds with talk of “Ockham’s Razor”, “Occam factors”, “Ockham’s hill” (MacKay, 1992; Gauch, 2006), “Occam’s window” (Raftery and others, 1997), and so forth. This literature not only provides important illustrations of the role that simplicity plays in scientific practice, but may also offer insights for philosophers seeking to understand the basis for this role.

As an illustration, consider standard procedures for model selection, such as the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Minimum Message Length (MML) and Minimum Description Length (MDL) procedures, and numerous others (for discussion, see Forster and Sober, 1994; Forster, 2001; Gauch, 2003; Dowe and others, 2007). Model selection is a matter of selecting the kind of relationship that is to be posited between a set of variables, given a sample of data, in an effort to generate hypotheses about the true underlying relationship holding in the population of inference and/or to make predictions about future data. This question arises in the simple curve-fitting example discussed above: for instance, whether the true underlying relationship between x and y is linear, parabolic, cubic, and so on. It also arises in lots of other contexts, such as the problem of inferring the causal relationship that exists between an empirical effect and a set of variables. “Models” in this sense are families of functions, such as the family of linear functions, LIN: y = a + bx, or the family of parabolic functions, PAR: y = a + bx + cx². The simplicity of a model is normally explicated in terms of the number of adjustable parameters it contains (MML and MDL measure the simplicity of models in terms of the extent to which they provide compact descriptions of the data, but produce similar results to the counting of adjustable parameters). On this measure, the model LIN is simpler than PAR, since LIN contains two adjustable parameters, whereas PAR has three. A consequence of this is that a more complex model will always be able to fit a given sample of data at least as well as a simpler model nested within it (“fitting” a model to the data involves using the data to determine what the values of the parameters in the model should be, given that data; that is, identifying the best-fitting member of the family). 
For instance, returning to the curve-fitting scenario represented in Figure 1, the best-fitting curve in PAR is guaranteed to fit this data set at least as well as the best-fitting member of the simpler model, LIN, and this is true no matter what the data are, since linear functions are special cases of parabolas, where c = 0, so any curve that is a member of LIN is also a member of PAR.
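The guarantee that PAR fits at least as well as LIN can be checked directly. The sketch below fits both families by least squares to a small invented data set (not the data of Figure 1), using a hand-written solver for the normal equations so that no external library is needed. All function names are illustrative.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_poly(xs, ys, degree):
    """Least-squares coefficients [a, b, c, ...] of a + b*x + c*x**2 + ..."""
    size = degree + 1
    normal = [[sum(x ** (i + j) for x in xs) for j in range(size)]
              for i in range(size)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve(normal, rhs)


def rss(coeffs, xs, ys):
    """Residual sum of squares of the fitted polynomial on the data."""
    return sum((sum(c * x ** k for k, c in enumerate(coeffs)) - y) ** 2
               for x, y in zip(xs, ys))


# Invented data lying close to a straight line.
xs = [0, 1, 2, 3, 4, 5]
ys = [1.2, 2.7, 5.3, 6.8, 9.4, 10.9]

lin = fit_poly(xs, ys, 1)   # best-fitting member of LIN (two parameters)
par = fit_poly(xs, ys, 2)   # best-fitting member of PAR (three parameters)
print("RSS of best LIN curve:", rss(lin, xs, ys))
print("RSS of best PAR curve:", rss(par, xs, ys))
```

Whatever data are supplied, the second figure should never exceed the first (up to rounding error), because setting c = 0 recovers every member of LIN inside PAR.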

Model selection procedures produce a ranking of all the models under consideration in light of the data, thus allowing scientists to choose between them. Though they do it in different ways, AIC, BIC, MML, and MDL all implement procedures for model selection that impose a penalty on the complexity of a model, so that a more complex model will have to fit the data sample at hand significantly better than a simpler one for it to be rated higher than the simpler model. Often, this penalty is greater the smaller the sample of data. Interestingly, and contrary to the assumptions of some philosophers, this seems to suggest that simplicity considerations do not only come into play as a tiebreaker between theories that fit the data equally well: according to the model selection literature, simplicity sometimes trumps better fit to the data. Hence, simplicity need not only come into play when all other things are equal.
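To show how such a penalty works in practice, here is a minimal sketch of AIC and BIC in the form commonly used for least-squares fits with normally distributed errors, where the scores are stated up to an additive constant that is the same for every model. The small-sample corrected variant AICc is included as one example of a penalty that grows as the sample shrinks. The residual sums of squares in the usage example are invented, and conventions differ on whether the noise variance counts toward the parameter total k, so the numbers are illustrative only.

```python
import math


def aic(rss, n, k):
    """AIC for a Gaussian least-squares fit, up to an additive constant.

    rss: residual sum of squares, n: sample size, k: adjustable parameters.
    """
    return n * math.log(rss / n) + 2 * k


def aicc(rss, n, k):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(rss, n, k) + 2 * k * (k + 1) / (n - k - 1)


def bic(rss, n, k):
    """BIC for a Gaussian least-squares fit, up to an additive constant."""
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical fits to the same six data points: PAR fits slightly better
# than LIN but uses one more adjustable parameter.
n = 6
candidates = {"LIN": (0.4148, 2), "PAR": (0.40, 3)}
for name, (rss_value, k) in candidates.items():
    print(name,
          "AIC", round(aic(rss_value, n, k), 3),
          "AICc", round(aicc(rss_value, n, k), 3),
          "BIC", round(bic(rss_value, n, k), 3))
```

Lower scores are better. In this invented case the slightly better fit of PAR does not make up for its extra parameter, so LIN is ranked higher despite fitting the sample less closely.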

Both statisticians and philosophers of statistics have vigorously debated the underlying justification for these sorts of model selection procedures (see, for example, the papers in Zellner and others, 2001). However, one motivation for taking into account the simplicity of models derives from a piece of practical wisdom: when there is error or “noise” in the data sample, a relatively simple model that fits the sample less well will often be more accurate when it comes to predicting extra-sample (for example, future) data than a more complex model that fits the sample more closely. The logic here is that since more complex models are more flexible in their ability to fit the data (since they have more adjustable parameters), they also have a greater propensity to be misled by errors and noise, in which case they may recover less of the true underlying “signal” in the sample. Thus, constraining model complexity may facilitate greater predictive accuracy. This idea is captured in what Gauch (2003, 2006) (following MacKay, 1992) calls “Ockham’s hill”. To the left of the peak of the hill, increasing the complexity of a model improves its accuracy with respect to extra-sample data because this recovers more of the signal in the sample. However, after the peak, increasing complexity actually diminishes predictive accuracy because this leads to over-fitting to spurious noise in the sample. There is therefore an optimal trade-off (at the peak of Ockham’s hill) between simplicity and fit to the sample data when it comes to facilitating accurate prediction of extra-sample data. Indeed, this trade-off is essentially the core idea behind AIC, the development of which initiated the now enormous literature on model selection, and the philosophers Malcolm Forster and Elliott Sober have sought to use such reasoning to make sense of the role of simplicity in many areas of science (see Section 4biii).
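The shape of “Ockham’s hill” can be explored with a small simulation. Everything in the sketch below is an assumption made for illustration: the “true” signal (a sine curve), the noise level, the sample sizes, and the random seed. Polynomials of increasing degree are fitted to a small noisy sample, and each is then scored both on that sample and on a large fresh sample standing in for extra-sample data.

```python
import math
import random


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_poly(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    size = degree + 1
    normal = [[sum(x ** (i + j) for x in xs) for j in range(size)]
              for i in range(size)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve(normal, rhs)


def mse(coeffs, xs, ys):
    """Mean squared error of the fitted polynomial on the given data."""
    return sum((sum(c * x ** k for k, c in enumerate(coeffs)) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


def sample(rng, n, noise=0.3):
    """Draw n noisy observations of an assumed signal y = sin(3x)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [math.sin(3 * x) + rng.gauss(0, noise) for x in xs]
    return xs, ys


rng = random.Random(0)
train_x, train_y = sample(rng, 15)      # the sample at hand
fresh_x, fresh_y = sample(rng, 1000)    # stands in for extra-sample data

results = []
for degree in range(7):
    coeffs = fit_poly(train_x, train_y, degree)
    results.append((degree,
                    mse(coeffs, train_x, train_y),
                    mse(coeffs, fresh_x, fresh_y)))

for degree, in_sample, out_of_sample in results:
    print(f"degree {degree}: in-sample {in_sample:.4f}, "
          f"extra-sample {out_of_sample:.4f}")
```

In-sample error can only fall as the degree rises, since each family contains the previous one. The extra-sample column is the one that traces the hill: in runs of this kind it typically improves at first and then worsens once the extra parameters begin fitting noise, though where the peak falls depends on the assumed signal, noise level, and sample size.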

One important implication of this apparent link between model simplicity and predictive accuracy is that interpreting sample data using relatively simple models may improve the efficiency of experiments by allowing scientists to do more with less data—for example, scientists may be able to run a costly experiment fewer times before they can be in a position to make relatively accurate predictions about the future. Gauch (2003, 2006) describes several real world cases from crop science and elsewhere where this gain in accuracy and efficiency from the use of relatively simple models has been documented.

2. Wider Philosophical Significance of Issues Surrounding Simplicity

The putative role of simplicity, both in the evaluation of rival scientific theories and in the mechanics of how we go about inferring hypotheses from empirical data, clearly raises a number of difficult philosophical issues. These include, but are by no means limited to: (1) the question of what precisely it means to say that one theory or hypothesis is simpler than another and how the relative simplicity of theories is to be measured; (2) the question of what rational justification (if any) can be provided for choosing between rival theories on grounds of simplicity; and (3) the closely related question of what weight simplicity considerations ought to carry in theory choice relative to other theoretical virtues, particularly if these sometimes have to be traded off against each other. (For general surveys of the philosophical literature on these issues, see Hesse, 1967; Sober, 2001a, 2001b.) Before we delve more deeply into how philosophers have sought to answer these questions, it is worth noting the close connections between philosophical issues surrounding simplicity and many of the most important controversies in the philosophy of science and epistemology.

First, the problem of simplicity has close connections with long-standing issues surrounding the nature and justification of inductive inference. Some philosophers have actually offered up the idea that simpler theories are preferable to less simple ones as a purported solution to the problem of induction: it is the relative simplicity of the hypotheses that we tend to infer from empirical observations that supposedly provides the justification for these inferences. On this view, it is simplicity that provides the warrant for our inductive practices. This approach is not as popular as it once was, since it is taken merely to replace the problem of induction with the equally substantive problem of justifying preferences for simpler theories. A more common view in the recent literature is that the problem of induction and the problem of justifying preferences for simpler theories are closely connected, or may even amount to the same problem. Hence, a solution to the latter problem will provide substantial help towards solving the former.

More generally, the ability to make sense of the putative role of simplicity in scientific reasoning has been seen by many to be a central desideratum for any adequate philosophical theory of the scientific method. For example, Thomas Kuhn’s (1962) influential discussion of the importance of scientists’ aesthetic preferences (including but not limited to judgments of simplicity) in scientific revolutions was a central part of his case for adopting a richer conception of the scientific method and of theory change in science than he found in the dominant logical empiricist views of the time. More recently, critics of the Bayesian approach to scientific reasoning and theory confirmation, which holds that sound inductive reasoning is reasoning according to the formal principles of probability, have claimed that simplicity is an important feature of scientific reasoning that escapes a Bayesian analysis. For instance, Forster and Sober (1994) argue that Bayesian approaches to curve-fitting and model selection (such as the Bayesian Information Criterion) cannot themselves be given a Bayesian rationale, nor can any other approach that builds in a bias towards simpler models. The ability of the Bayesian approach to make sense of simplicity in model selection and other aspects of scientific practice has thus been seen as central to evaluating its promise (see for example, Glymour, 1980; Forster and Sober, 1994; Forster, 1995; Kelly and Glymour, 2004; Howson and Urbach, 2006; Dowe and others, 2007).

Discussions over the legitimacy of simplicity as a criterion for theory choice have also been closely bound up with debates over scientific realism. Scientific realists assert that scientific theories aim to offer a literally true description of the world and that we have good reason to believe that the claims of our current best scientific theories are at least approximately true, including those claims that purport to be about “unobservable” natural phenomena that are beyond our direct perceptual access. Some anti-realists object that it is possible to formulate incompatible alternatives to our current best theories that are just as consistent with any current data that we have, perhaps even any future data that we could ever collect. They claim that we can therefore never be justified in asserting that the claims of our current best theories, especially those concerning unobservables, are true, or approximately true. A standard realist response is to emphasize the role of the so-called “theoretical virtues” in theory choice, among which simplicity is normally listed. The claim is thus that we rule out these alternative theories because they are unnecessarily complex. Importantly, for this defense to work, realists have to defend the idea that not only are we justified in choosing between rival theories on grounds of simplicity, but also that simplicity can be used as a guide to the truth. Naturally, anti-realists, particularly those of an empiricist persuasion (for example, van Fraassen, 1989), have expressed deep skepticism about the alleged truth-conduciveness of a simplicity criterion.

3. Defining and Measuring Simplicity

The first major philosophical problem that seems to arise from the notion that simplicity plays a role in theory choice and evaluation concerns specifying in more detail what it means to say that one theory is simpler than another and how the relative simplicity of theories is to be precisely and objectively measured. Numerous attempts have been made to formulate definitions and measures of theoretical simplicity, all of which face very significant challenges. Philosophers have not been the only ones to contribute to this endeavour. For instance, over the last few decades, a number of formal measures of simplicity and complexity have been developed in mathematical information theory. This section provides an overview of some of the main simplicity measures that have been proposed and the problems that they face. The proposals described here have also normally been tied to particular proposals about what justifies preferences for simpler theories. However, discussion of these justifications will be left until Section 4.

To begin with, it is worth considering why providing a precise definition and measure of theoretical simplicity ought to be regarded as a substantial philosophical problem. After all, it often seems that when one is confronted with a set of rival theories designed to explain a particular empirical phenomenon, it is just obvious which is the simplest. One does not always need a precise definition or measure of a particular property to be able to tell whether or not something exhibits it to a greater degree than something else. Hence, it could be suggested that if there is a philosophical problem here, it is only of very minor interest and certainly of little relevance to scientific practice. There are, however, some reasons to regard this as a substantial philosophical problem, which also has some practical relevance.

First, it is not always easy to tell whether one theory really ought to be regarded as simpler than another, and it is not uncommon for practicing scientists to disagree about the relative simplicity of rival theories. A well-known historical example is the disagreement between Galileo and Kepler concerning the relative simplicity of Copernicus’ theory of planetary motion, according to which the planets move only in perfect circular orbits with epicycles, and Kepler’s theory, according to which the planets move in elliptical orbits (see Holton, 1974; McAllister, 1996). Galileo held to the idea that perfect circular motion is simpler than elliptical motion. In contrast, Kepler emphasized that an elliptical model of planetary motion required many fewer orbits than a circular model and enabled a reduction of all the planetary motions to three fundamental laws of planetary motion. The problem here is that scientists seem to evaluate the simplicity of theories along a number of different dimensions that may conflict with each other. Hence, we have to deal with the fact that a theory may be regarded as simpler than a rival in one respect and more complex in another. To illustrate this further, consider the following list of commonly cited ways in which theories may be held to be simpler than others:

  • Quantitative ontological parsimony (or economy): postulating a smaller number of independent entities, processes, causes, or events.
  • Qualitative ontological parsimony (or economy): postulating a smaller number of independent kinds or classes of entities, processes, causes, or events.
  • Common cause explanation: accounting for phenomena in terms of common rather than separate causal processes.
  • Symmetry: postulating that equalities hold between interacting systems and that the laws describing the phenomena look the same from different perspectives.
  • Uniformity (or homogeneity): postulating a smaller number of changes in a given phenomenon and holding that the relations between phenomena are invariant.
  • Unification: explaining a wider and more diverse range of phenomena that might otherwise be thought to require separate explanations in a single theory (theoretical reduction is generally held to be a species of unification).
  • Lower level processes: when the kinds of processes that can be posited to explain a phenomenon come in a hierarchy, positing processes that come lower rather than higher in this hierarchy.
  • Familiarity (or conservativeness): explaining new phenomena with minimal new theoretical machinery, reusing existing patterns of explanation.
  • Paucity of auxiliary assumptions: invoking fewer extraneous assumptions about the world.
  • Paucity of adjustable parameters: containing fewer independent parameters that the theory leaves to be determined by the data.

As can be seen from this list, there is considerable diversity here. We can see that theoretical simplicity is frequently thought of in ontological terms (for example, quantitative and qualitative parsimony), but also sometimes as a structural feature of theories (for example, unification, paucity of adjustable parameters). Some of these intuitive types of simplicity may often cluster together in theories: for instance, qualitative parsimony would seem to often go together with invoking common cause explanations, which would in turn often seem to go together with explanatory unification. However, there is also considerable scope for them to point in different directions in particular cases. For example, a theory that is qualitatively parsimonious as a result of positing fewer different kinds of entities might be quantitatively unparsimonious as a result of positing more of a particular kind of entity, while the demand to explain in terms of lower-level processes rather than higher-level processes may conflict with the demand to explain in terms of common causes behind similar phenomena, and so on. There are also different possible ways of evaluating the simplicity of a theory with regard to any one of these intuitive types of simplicity. A theory may, for instance, come out as more quantitatively parsimonious than another if one focuses on the number of independent entities that it posits, but less parsimonious if one focuses on the number of independent causes it invokes. Consequently, it seems that if a simplicity criterion is actually to be applicable in practice, we need some way of resolving the disagreements that may arise between scientists about the relative simplicity of rival theories, and this requires a more precise measure of simplicity.

Second, as has already been mentioned, a considerable amount of the skepticism expressed both by philosophers and by scientists about the practice of choosing one theory over another on grounds of relative simplicity has stemmed from the suspicion that our simplicity judgments lack a principled basis (for example, Ackerman, 1961; Bunge, 1961; Priest, 1976). Disagreements between scientists, along with the multiplicity of intuitive types of simplicity and the scope for conflict between them, have been important contributors to this suspicion, leading to the view that for any two theories, T1 and T2, there is some way of evaluating their simplicity such that T1 comes out as simpler than T2, and vice versa. It seems, then, that an adequate defense of the legitimacy of a simplicity criterion needs to show that there are in fact principled ways of determining when one theory is indeed simpler than another. Moreover, in so far as there is also a justificatory issue to be dealt with, we also need to be clear about exactly what it is that we need to justify a preference for.

a. Syntactic Measures

One proposal is that the simplicity of theories can be precisely and objectively measured in terms of how briefly they can be expressed. For example, a natural way of measuring the simplicity of an equation is just to count the number of terms, or parameters that it contains. Similarly, we could measure the simplicity of a theory in terms of the size of the vocabulary—for example, the number of extra-logical terms—required to write down its claims. Such measures of simplicity are often referred to as syntactic measures, since they involve counting the linguistic elements required to state, or to describe the theory.

A major problem facing any such syntactic measure of simplicity is the problem of language variance. A measure of simplicity is language variant if it delivers different results depending on the language that is used to represent the theories being compared. Suppose, for example, that we measure the simplicity of an equation by counting the number of non-logical terms that it contains. This will produce the result that r = a will come out as simpler than x² + y² = a². However, this second equation is simply a transformation of the first into Cartesian co-ordinates, where r² = x² + y², and is hence logically equivalent. The intuitive proposal for measuring simplicity in curve-fitting contexts, according to which hypotheses are said to be simpler if they contain fewer parameters, is also language variant in this sense. How many parameters a hypothesis contains depends on the co-ordinate scales that one uses. For any two non-identical functions, F and G, there is some way of transforming the co-ordinate scales such that we can turn F into a linear curve and G into a non-linear curve, and vice versa.
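A familiar special case of this rescaling point can be checked numerically. In the sketch below, the two functions F(x) = exp(x) and G(x) = 1 + x and the logarithmic rescaling of the vertical axis are choices made for illustration and are not drawn from the literature discussed here. On the original scale G is a straight line and F is not; on the logarithmic scale the verdict is reversed.

```python
import math

def is_linear(ys, tol=1e-9):
    """For equally spaced x values, a straight line has zero second differences."""
    return all(abs(ys[i + 2] - 2 * ys[i + 1] + ys[i]) < tol for i in range(len(ys) - 2))

xs = [0.0, 1.0, 2.0, 3.0, 4.0]      # equally spaced sample points
F = [math.exp(x) for x in xs]       # illustrative function F
G = [1.0 + x for x in xs]           # illustrative function G

# Original co-ordinate scale: G is linear, F is not.
print(is_linear(F), is_linear(G))

# Logarithmic scale on the vertical axis: F becomes linear, G does not.
log_F = [math.log(y) for y in F]
log_G = [math.log(y) for y in G]
print(is_linear(log_F), is_linear(log_G))
```

Counting the parameters needed to describe each curve as a polynomial therefore gives opposite simplicity orderings on the two scales, which is the sense in which the parameter-counting proposal is language variant.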

Nelson Goodman’s (1983) famous “new riddle of induction” allows us to formulate another example of the problem of language variance. Suppose all previously observed emeralds have been green. Now consider the following hypotheses about the color properties of the entire population of emeralds:

  • H1: all emeralds are green
  • H2: all emeralds first observed prior to time t are green and all emeralds first observed after time t are blue (where t is some future time)

Intuitively, H1 seems to be a simpler hypothesis than H2. To begin with, it can be stated with a smaller vocabulary. H1 also seems to postulate uniformity in the properties of emeralds, while H2 posits non-uniformity. For instance, H2 seems to assume that there is some link between the time at which an emerald is first observed and its properties. Thus it can be viewed as including an additional time parameter. But now consider Goodman’s invented predicates, “grue” and “bleen”. These have been defined in a variety of different ways, but let us define them here as follows: an object is grue if it is first observed before time t and the object is green, or first observed after t and the object is blue; an object is bleen if it is first observed before time t and the object is blue, or first observed after time t and the object is green. With these predicates, we can define a further property, “grolor”. Grue and bleen are grolors just as green and blue are colors. Now, because of the way that grolors are defined, color predicates like “green” and “blue” can also be defined in terms of grolor predicates: an object is green if first observed before time t and the object is grue, or first observed after time t and the object is bleen; an object is blue if first observed before time t and the object is bleen, or first observed after t and the object is grue. This means that statements that are expressed in terms of green and blue can also be expressed in terms of grue and bleen. So, we can rewrite H1 and H2 as follows:

  • H1: all emeralds first observed prior to time t are grue and all emeralds first observed after time t are bleen (where t is some future time)
  • H2: all emeralds are grue

Recall that earlier we judged H1 to be simpler than H2. However, if we are to retain that simplicity judgment, we cannot say that H1 is simpler than H2 because it can be stated with a smaller vocabulary; nor can we say that H1 posits greater uniformity, and is hence simpler, because it does not contain a time parameter. This is because simplicity judgments based on such syntactic features can be reversed merely by switching the language used to represent the hypotheses from a color language to a grolor language.
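The symmetry between the two vocabularies can be made explicit in a few lines of Python. This is a minimal sketch of the definitions given above: the numeric value chosen for the time t, the treatment of an observation made exactly at t as "after", and the function names are all assumptions introduced for illustration.

```python
T = 2100  # hypothetical value for the future time t

def to_grolor(color, first_observed):
    """Translate a color predicate into the grolor vocabulary."""
    before = first_observed < T
    if color == "green":
        return "grue" if before else "bleen"
    if color == "blue":
        return "bleen" if before else "grue"
    raise ValueError(f"unknown color: {color}")

def to_color(grolor, first_observed):
    """Translate a grolor predicate back into the color vocabulary."""
    before = first_observed < T
    if grolor == "grue":
        return "green" if before else "blue"
    if grolor == "bleen":
        return "blue" if before else "green"
    raise ValueError(f"unknown grolor: {grolor}")

def h1(first_observed):
    """H1 in color terms: every emerald is green."""
    return "green"

def h2(first_observed):
    """H2 in color terms: green if first observed before t, blue afterwards."""
    return "green" if first_observed < T else "blue"

times = [2000, 2200]  # one observation time on each side of t

for name, h in (("H1", h1), ("H2", h2)):
    colors = {h(t) for t in times}
    grolors = {to_grolor(h(t), t) for t in times}
    print(name, "colors:", sorted(colors), "grolors:", sorted(grolors))
```

Each hypothesis needs one predicate in one vocabulary and two in the other, so counting predicates, or checking for a dependence on the time of observation, yields opposite verdicts depending on which vocabulary is taken as primitive.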

Examples such as these have been taken to show two things. First, no syntactic measure of simplicity can suffice to produce a principled simplicity ordering, since all such measures will produce different results depending on the language of representation that is used. It is not enough just to stipulate that we should evaluate simplicity in one language rather than another, since that would not explain why simplicity should be measured in that way. In particular, we want to know that our chosen language is accurately tracking the objective language-independent simplicity of the theories being compared. Hence, if a syntactic measure of simplicity is to be used, say for practical purposes, it must be underwritten by a more fundamental theory of simplicity. Second, a plausible measure of simplicity cannot be entirely neutral with respect to all of the different claims about the world that the theory makes or can be interpreted as making. Because of the respective definitions of colors and grolors, any hypothesis that posits uniformity in color properties must posit non-uniformity in grolor properties. As Goodman emphasized, one can find uniformity anywhere if no restriction is placed on what kinds of properties should be taken into account. Similarly, it will not do to say that theories are simpler because they posit the existence of fewer entities, causes and processes, since, using Goodman-like manipulations, it is trivial to show that a theory can be regarded as positing any number of different entities, causes and processes. Hence, some principled restriction needs to be placed on which aspects of the content of a theory are to be taken into account and which are to be disregarded when measuring their relative simplicity.

b. Goodman’s Measure

According to Nelson Goodman, an important component of the problem of measuring the simplicity of scientific theories is the problem of measuring the degree of systematization that a theory imposes on the world, since, for Goodman, to seek simplicity is to seek a system. In a series of papers in the 1940s and 50s, Goodman (1943, 1955, 1958, 1959) attempted to explicate a precise measure of theoretical systematization in terms of the logical properties of the set of concepts, or extra-logical terms, that make up the statements of the theory.

According to Goodman, scientific theories can be regarded as sets of statements. These statements contain various extra-logical terms, including property terms, relation terms, and so on. These terms can all be assigned predicate symbols. Hence, all the statements of a theory can be expressed in a first order language, using standard symbolic notation. For instance, “… is acid” may become “A(x)”, “… is smaller than ____” may become “S(x, y)”, and so on. Goodman then claims that we can measure the simplicity of the system of predicates employed by the theory in terms of their logical properties, such as their arity, reflexivity, transitivity, symmetry, and so on. The details are highly technical but, very roughly, Goodman’s proposal is that a system of predicates that can be used to express more is more complex than a system of predicates that can be used to express less. For instance, one of the axioms of Goodman’s proposal is that if every set of predicates of a relevant kind, K, is always replaceable by a set of predicates of another kind, L, then K is not more complex than L.

Part of Goodman’s project was to avoid the problem of language variance. Goodman’s measure is a linguistic measure, since it concerns measuring the simplicity of a theory’s predicate basis in a first order language. However, it is not a purely syntactic measure, since it does not involve merely counting linguistic elements, such as the number of extra-logical predicates. Rather, it can be regarded as an attempt to measure the richness of a conceptual scheme: conceptual schemes that can be used to say more are more complex than conceptual schemes that can be used to say less. Hence, a theory can be regarded as simpler if it requires a less expressive system of concepts.

Goodman developed his axiomatic measure of simplicity in considerable detail. However, Goodman himself only ever regarded it as a measure of one particular type of simplicity, since it only concerns the logical properties of the predicates employed by the theory. It does not, for example, take account of the number of entities that a theory postulates. Moreover, Goodman never showed how the measure could be applied to real scientific theories. It has been objected that even if Goodman’s measure could be applied, it would not discriminate between many theories that intuitively differ in simplicity—indeed, in the kind of simplicity as systematization that Goodman wants to measure. For instance, it is plausible that the system of concepts used to express the Copernican theory of planetary motion is just as expressively rich as the system of concepts used to express the Ptolemaic theory, yet the former is widely regarded as considerably simpler than the latter, partly in virtue of it providing an intuitively more systematic account of the data (for discussion of the details of Goodman’s proposal and the objections it faces, see Kemeny, 1955; Suppes, 1956; Kyburg, 1961; Hesse, 1967).

c. Simplicity as Testability

It has often been argued that simpler theories say more about the world and hence are easier to test than more complex ones. C. S. Peirce (1931), for example, claimed that the simplest theories are those whose empirical consequences are most readily deduced and compared with observation, so that they can be eliminated more easily if they are wrong. Complex theories, on the other hand, tend to be less precise and allow for more wriggle room in accommodating the data. This apparent connection between simplicity and testability has led some philosophers to attempt to formulate measures of simplicity in terms of the relative testability of theories.

Karl Popper (1959) famously proposed one such testability measure of simplicity. Popper associated simplicity with empirical content: simpler theories say more about the world than more complex theories and, in so doing, place more restriction on the ways that the world can be. According to Popper, the empirical content of theories, and hence their simplicity, can be measured in terms of their falsifiability. The falsifiability of a theory concerns the ease with which the theory can be proven false, if the theory is indeed false. Popper argued that this could be measured in terms of the amount of data that one would need to falsify the theory. For example, on Popper’s measure, the hypothesis that x and y are linearly related, according to an equation of the form, y = a + bx, comes out as having greater empirical content and hence greater simplicity than the hypothesis that they are related according to a parabola of the form, y = a + bx + cx². This is because one only needs three data points to falsify the linear hypothesis, but one needs at least four data points to falsify the parabolic hypothesis. Thus Popper argued that empirical content, falsifiability, and hence simplicity, could be seen as equivalent to the paucity of adjustable parameters. John Kemeny (1955) proposed a similar testability measure, according to which theories are more complex if they can come out as true in more ways in an n-member universe, where n is the number of individuals that the universe contains.
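The counting behind Popper's example can be verified directly. The sketch below assumes error-free data points with distinct x values and uses Newton divided differences, a standard interpolation tool chosen here for illustration rather than anything found in Popper's text: a set of points lies on some polynomial of degree d or less exactly when every divided difference of order greater than d is zero. The function name and the sample points are hypothetical.

```python
from fractions import Fraction

def fits_polynomial(points, degree):
    """True if some polynomial of at most the given degree passes exactly through all points.

    Assumes the x values are distinct. Exact rational arithmetic avoids rounding issues.
    """
    xs = [Fraction(x) for x, _ in points]
    col = [Fraction(y) for _, y in points]
    order = 0
    while len(col) > 1:
        order += 1
        # Divided differences of the next order, built from the previous column.
        col = [(col[i + 1] - col[i]) / (xs[i + order] - xs[i]) for i in range(len(col) - 1)]
        if order > degree and any(c != 0 for c in col):
            return False
    return True

points = [(0, 1), (1, 3), (2, 4), (3, 10)]  # invented observations

# Two points can never falsify the linear hypothesis y = a + bx ...
print(fits_polynomial(points[:2], 1))
# ... but three points can.
print(fits_polynomial(points[:3], 1))
# Three points can never falsify the parabolic hypothesis y = a + bx + cx² ...
print(fits_polynomial(points[:3], 2))
# ... but four points can.
print(fits_polynomial(points, 2))
```

In general a hypothesis with d + 1 adjustable coefficients can accommodate any d + 1 such points, so at least d + 2 are needed before falsification becomes possible, which is why parameter count and falsifiability move together on Popper's measure.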

Popper’s equation of simplicity with falsifiability suffers from some serious objections. First, it cannot be applied to comparisons between theories that make equally precise claims, such as a comparison between a specific parabolic hypothesis and a specific linear hypothesis, both of which specify precise values for their parameters and can be falsified by only one data point. It also cannot be applied when we compare theories that make probabilistic claims about the world, since probabilistic statements are not strictly falsifiable. This is particularly troublesome when it comes to accounting for the role of simplicity in the practice of curve-fitting, since one normally has to deal with the possibility of error in the data. As a result, an error distribution is normally added to the hypotheses under consideration, so that they are understood as conferring certain probabilities on the data, rather than as having deductive observational consequences. In addition, most philosophers of science now tend to think that falsifiability is not really an intrinsic property of theories themselves, but rather a feature of how scientists are disposed to behave towards their theories. Even deterministic theories normally do not entail particular observational consequences unless they are conjoined with particular auxiliary assumptions, usually leaving the scientist the option of saving the theory from refutation by tinkering with their auxiliary assumptions, a point famously emphasized by Pierre Duhem (1954). This makes it extremely difficult to maintain that simpler theories are intrinsically more falsifiable than less simple ones. Goodman (1961, pp. 150-151) also argued that equating simplicity with falsifiability leads to counter-intuitive consequences. The hypothesis, “All maple trees are deciduous”, is intuitively simpler than the hypothesis, “All maple trees whatsoever, and all sassafras trees in Eagleville, are deciduous”, yet, according to Goodman, the latter hypothesis is clearly the easier to falsify of the two. Kemeny’s measure inherits many of the same objections.

Both Popper and Kemeny essentially tried to link the simplicity of a theory with the degree to which it can accommodate potential future data: simpler theories are less accommodating than more complex ones. One interesting recent attempt to make sense of this notion of accommodation is due to Harman and Kulkarni (2007). Harman and Kulkarni analyze accommodation in terms of a concept drawn from statistical learning theory known as the Vapnik-Chervonenkis (VC) dimension. The VC dimension of a hypothesis can be roughly understood as a measure of the “richness” of the class of hypotheses from which it is drawn, where a class is richer if it is harder to find data that is inconsistent with some member of the class. Thus, a hypothesis drawn from a class that can fit any possible set of data will have infinite VC dimension. Though VC dimension shares some important similarities with Popper’s measure, there are important differences. Unlike Popper’s measure, it implies that accommodation is not always equivalent to the number of adjustable parameters. If we count adjustable parameters, sine curves of the form y = a sin bx come out as relatively unaccommodating; however, such curves have an infinite VC dimension. While Harman and Kulkarni do not propose that VC dimension be taken as a general measure of simplicity (in fact, they regard it as an alternative to simplicity in some scientific contexts), ideas along these lines might perhaps hold some future promise for testability/accommodation measures of simplicity. Similar notions of accommodation in terms of “dimension” have been used to explicate the notion of the simplicity of a statistical model in the face of the fact that the number of adjustable parameters a model contains is language variant (for discussion, see Forster, 1999; Sober, 2007).
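The claim about sine curves can be illustrated with a well-known construction from statistical learning theory, used here as an assumption of the sketch rather than as Harman and Kulkarni's own presentation. Treat the sign of sin(bx) as a yes/no classifier (the amplitude a plays no role in the sign when it is positive) and place n points at x = 10⁻¹, 10⁻², ..., 10⁻ⁿ. For every possible labelling of those points, a single frequency b reproduces it. The function names are hypothetical.

```python
import math
from itertools import product

def sine_classifier(b):
    """Classify x as +1 or -1 according to the sign of sin(b * x)."""
    return lambda x: 1 if math.sin(b * x) > 0 else -1

def frequency_for(labels):
    """Frequency b that reproduces the given +1/-1 labels on the points 10**-i, i = 1..n."""
    return math.pi * (1 + sum((1 - y) // 2 * 10 ** i for i, y in enumerate(labels, start=1)))

def shatters(n):
    """Check that every labelling of the n points is matched by some sine classifier."""
    points = [10.0 ** -i for i in range(1, n + 1)]
    for labels in product((1, -1), repeat=n):
        classify = sine_classifier(frequency_for(labels))
        if [classify(x) for x in points] != list(labels):
            return False
    return True

print(shatters(4))  # checks all 16 labellings of four points
```

Because the same recipe works for any n (up to the limits of floating-point precision in a numerical check like this one), no finite set of such points can rule out the whole class, even though each member has only one or two adjustable parameters.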

d. Sober’s Measure

In his early work on simplicity, Elliott Sober (1975) proposed that the simplicity of theories be measured in terms of their question-relative informativeness. According to Sober, a theory is more informative if it requires less supplementary information from us in order for us to be able to use it to determine the answer to the particular questions that we are interested in. For instance, the hypothesis, y = 4x, is more informative and hence simpler than y = 2z + 2x with respect to the question, “what is the value of y?” This is because in order to find out the value of y one only needs to determine a value for x on the first hypothesis, whereas on the second hypothesis one also needs to determine a value for z. Similarly, Sober’s proposal can be used to capture the intuition that theories that say that a given class of things are uniform in their properties are simpler than theories that say that the class is non-uniform, because they are more informative relative to particular questions about the properties of the class. For instance, the hypothesis that “all ravens are black” is more informative and hence simpler than “70% of ravens are black” with respect to the question, “what will be the colour of the next observed raven?” This is because on the former hypothesis one needs no additional information in order to answer this question, whereas one will have to supplement the latter hypothesis with considerable extra information in order to generate a determinate answer.

By relativizing the notion of the content-fullness of theories to the question that one is interested in, Sober’s measure avoids the problem that Popper and Kemeny’s proposals faced of the most arbitrarily specific theories, or theories made up of strings of irrelevant conjunctions of claims, turning out to be the simplest. Moreover, according to Sober’s proposal, the content of the theory must be relevant to answering the question for it to count towards the theory’s simplicity. This gives rise to the most distinctive element of Sober’s proposal: different simplicity orderings of theories will be produced depending on the question one asks. For instance, if we want to know what the relationship is between values of z and given values of y and x, then y = 2z + 2x will be more informative, and hence simpler, than y = 4x. Thus, a theory can be simple relative to some questions and complex relative to others.
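The question-relativity of the ordering can be mimicked with a deliberately crude toy model. The sketch below is not Sober's formal definition: it represents a hypothesis only by the set of variables it relates, and it treats "supplementary information" as the set of variable values one must still supply to answer a question. Both simplifications, and the function name, are assumptions made for illustration.

```python
def extra_inputs_needed(hypothesis_vars, question_var, given=frozenset()):
    """Variables whose values must still be supplied to answer the question.

    Returns None when the hypothesis does not mention the question variable
    at all, so it cannot answer the question whatever else is supplied.
    """
    if question_var not in hypothesis_vars:
        return None
    return set(hypothesis_vars) - {question_var} - set(given)

h_a = {"y", "x"}        # stands in for y = 4x
h_b = {"y", "z", "x"}   # stands in for y = 2z + 2x

# Question: what is the value of y?
print(extra_inputs_needed(h_a, "y"))   # only x is needed
print(extra_inputs_needed(h_b, "y"))   # both x and z are needed

# Question: what is the value of z, given values of x and y?
print(extra_inputs_needed(h_a, "z", given={"x", "y"}))  # cannot answer
print(extra_inputs_needed(h_b, "z", given={"x", "y"}))  # nothing further needed
```

On the first question the first hypothesis demands less, and on the second question the ordering flips, which matches the pattern described above. The toy model says nothing about probabilistic hypotheses such as the raven example.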

Critics have argued that Sober’s measure produces a number of counter-intuitive results. Firstly, the measure cannot explain why people tend to judge an equation such as y = 3x + 4x² - 50 as more complex than an equation like y = 2x, relative to the question, “what is the value of y?” In both cases, one only needs a value of x to work out a value for y. Similarly, Sober’s measure fails to deal with Goodman’s above-cited counter-example to the idea that simplicity equates to testability, since it produces the counter-intuitive outcome that there is no difference in simplicity between “all maple trees whatsoever, and all sassafras trees in Eagleville, are deciduous” and “all maple trees are deciduous” relative to questions about whether maple trees are deciduous. The interest-relativity of Sober’s measure has also generated criticism from those who prefer to see simplicity as a property that varies only with what a given theory is being compared with, not with the question that one happens to be asking.

e. Thagard’s Measure

Paul Thagard (1988) proposed that simplicity ought to be understood as a ratio of the number of facts explained by a theory to the number of auxiliary assumptions that the theory requires. Thagard defines an auxiliary assumption as a statement, not part of the original theory, which is assumed in order for the theory to be able to explain one or more of the facts to be explained. Simplicity is then measured as follows:

  • Simplicity of T = (Facts explained by T – Auxiliary assumptions of T) / Facts explained by T

A value of 0 is given to a maximally complex theory that requires as many auxiliary assumptions as facts that it explains and 1 to a maximally simple theory that requires no auxiliary assumptions at all to explain. Thus, the higher the ratio of facts explained to auxiliary assumptions, the simpler the theory. The essence of Thagard’s proposal is that we want to explain as much as we can, while making the fewest assumptions about the way the world is. By balancing the paucity of auxiliary assumptions against explanatory power it prevents the unfortunate consequence of the simplest theories turning out to be those that are most anaemic.
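
As a minimal sketch, the ratio can be computed directly. The function name and the example counts are hypothetical, and the sketch simply assumes that facts and auxiliary assumptions can be counted as whole numbers.

```python
def thagard_simplicity(facts_explained: int, auxiliary_assumptions: int) -> float:
    """(facts explained - auxiliary assumptions) / facts explained."""
    if facts_explained <= 0:
        raise ValueError("the theory must explain at least one fact")
    return (facts_explained - auxiliary_assumptions) / facts_explained


# Hypothetical counts for three theories that each explain ten facts.
no_assumptions = thagard_simplicity(10, 0)     # maximally simple
some_assumptions = thagard_simplicity(10, 4)   # intermediate
as_many_as_facts = thagard_simplicity(10, 10)  # maximally complex

print(no_assumptions, some_assumptions, as_many_as_facts)
```

Because the count of facts explained appears in the denominator, a theory that explains very little cannot score well merely by assuming very little.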

A significant difficulty facing Thagard’s proposal lies in determining what the auxiliary assumptions of theories actually are and how to count them. It could be argued that the problem of counting auxiliary assumptions threatens to become as difficult as the original problem of measuring simplicity. What a theory must assume about the world for it to explain the evidence is frequently extremely unclear and even harder to quantify. In addition, some auxiliary assumptions are bigger and more onerous than others and it is not clear that they should be given equal weighting, as they are in Thagard’s measure. Another objection is that Thagard’s proposal struggles to make sense of things like ontological parsimony (the idea that theories are simpler because they posit fewer things), since it is not clear that parsimony per se would make any particular difference to the number of auxiliary assumptions required. In response, Thagard has argued that ontological parsimony is actually less important to practicing scientists than has often been thought.

f. Information-Theoretic Measures

Over the last few decades, a number of formal measures of simplicity and complexity have been developed in mathematical information theory. Though many of these measures have been designed for addressing specific practical problems, the central ideas behind them have been claimed to have significance for addressing the philosophical problem of measuring the simplicity of scientific theories.

One of the prominent information-theoretic measures of simplicity in the current literature is Kolmogorov complexity, which is a formal measure of quantitative information content (see Li and Vitányi, 1997). The Kolmogorov complexity K(x) of an object x is the length in bits of the shortest binary program that can output a completely faithful description of x in some universal programming language, such as LISP or Pascal. This measure was originally formulated to measure randomness in data strings (such as sequences of numbers), and is based on the insight that non-random data strings can be “compressed” by finding the patterns that exist in them. If there are patterns in a data string, it is possible to provide a completely accurate description of it that is shorter than the string itself, in terms of the number of “bits” of information used in the description, by using the pattern as a mnemonic that eliminates redundant information that need not be encoded in the description. For instance, if the data string is an ordered sequence of 1s and 0s, where every 1 is followed by a 0, and every 0 by a 1, then it can be given a very short description that specifies the pattern, the value of the first data point and the number of data points. Any further information is redundant. Completely random data sets, however, contain no patterns, no redundancy, and hence are not compressible.
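
The contrast between patterned and patternless strings can be illustrated with an ordinary compressor. This is only a rough sketch under stated assumptions: the length of zlib output is a computable stand-in that at best gives an upper bound related to K(x), not K(x) itself, and the string length and random seed are arbitrary choices.

```python
import random
import zlib


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed text (a stand-in for K,
    which this function does not and cannot compute exactly)."""
    return len(zlib.compress(text.encode("ascii"), 9))


n = 10_000
# Every 1 is followed by a 0, and every 0 by a 1.
patterned = "10" * (n // 2)
# A string of the same length with no pattern that zlib can exploit.
rng = random.Random(0)  # fixed seed so that the run is repeatable
patternless = "".join(rng.choice("01") for _ in range(n))

print(compressed_size(patterned), compressed_size(patternless))
```

Strictly speaking, the second string is only pseudo-random: it is produced by a short program and a seed, so its true Kolmogorov complexity is low even though zlib cannot find the pattern. This illustrates why practical compressors only approximate the measure.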

It has been argued that Kolmogorov complexity can be applied as a general measure of the simplicity of scientific theories. Theories can be thought of as specifying the patterns that exist in the data sets they are meant to explain. As a result, we can also think of theories as compressing the data. Accordingly, the more a theory T compresses the data, the lower the value of K for the data using T, and the greater is its simplicity. An important feature of Kolmogorov complexity is that simplicity is measured in a universal programming language, and universal programming languages are equivalent up to an additive constant. This means that the difference between the shortest code length for x in one universal programming language and the shortest code length for x in another is bounded by a constant c that depends only on the two languages, not on x. Hence, for any object, switching from one language to the other changes its shortest code length by at most this fixed amount. This, in turn, means that Kolmogorov complexity measurement is language invariant in the sense that, up to the constant, the values of K(x) for different objects can be compared no matter what universal programming language K(x) is measured in. And, by definition, anything that can be expressed in some language can be expressed in a universal programming language. Due to this, along with its generality and mathematical precision, some enthusiasts have claimed that Kolmogorov complexity solves the problem of defining and measuring simplicity.

A number of objections have been raised against this application of Kolmogorov complexity. First, finding K(x) is a non-computable problem: no algorithm exists to compute it. This is claimed to be a serious practical limitation of the measure. Another objection is that Kolmogorov complexity produces some counter-intuitive results. For instance, theories that make probabilistic rather than deterministic predictions about the data must have maximum Kolmogorov complexity. For example, a theory that says that a sequence of coin flips conforms to the probabilistic law, Pr(Heads) = ½, cannot be said to compress the data, since one cannot use this law to reconstruct the exact sequence of heads and tails, even though it offers an intuitively simple explanation of what we observe.

Other information-theoretic measures of simplicity, such as the Minimum Message Length (MML) and Minimum Description Length (MDL) measures, avoid some of the practical problems facing Kolmogorov complexity. Though there are important differences in the details of these measures (see Wallace and Dowe, 1999), they all adopt the same basic idea that the simplicity of an empirical hypothesis can be measured in terms of the extent to which it provides a compact encoding of the data.

A general objection to all such measures of simplicity is that scientific theories generally aim to do more than specify patterns in the data. They also aim to explain why these patterns are there and it is in relation to how theories go about explaining the patterns in our observations that theories have often been thought to be simple or complex. Hence, it can be argued that mere data compression cannot, by itself, suffice as an explication of simplicity in relation to scientific theories. A further objection to the data compression approach is that theories can be viewed as compressing data sets in a very large number of different ways, many of which we do not consider appropriate contributions to simplicity. The problem raised by Goodman’s new riddle of induction can be seen as the problem of deciding which regularities to measure: for example, color regularities or grolor regularities? Formal information-theoretical measures do not discriminate between different kinds of pattern finding. Hence, any such measure can only be applied once we specify the sorts of patterns and regularities that should be taken into account.

g. Is Simplicity a Unified Concept?

There is a general consensus in the philosophical literature that the project of articulating a precise general measure of theoretical simplicity faces very significant challenges. Of course, this has not stopped practicing scientists from utilizing notions of simplicity in their work, and particular concepts of simplicity—such as the simplicity of a statistical model, understood in terms of paucity of adjustable parameters or model dimension—are firmly entrenched in several areas of science. Given this, one potential way of responding to the difficulties that philosophers and others have encountered in this area—particularly in light of the apparent multiplicity and scope for conflict between intuitive explications of simplicity—is to raise the question of whether theoretical simplicity is in fact a unified concept at all. Perhaps there is no single notion of simplicity that is (or should be) employed by scientists, but rather a cluster of different, sometimes related, but also sometimes conflicting notions of simplicity that scientists find useful to varying degrees in particular contexts. This might be evidenced by the observation that scientists’ simplicity judgments often involve making trade-offs between different notions of simplicity. Kepler’s preference for an astronomical theory that abandoned perfectly circular motions for the planets, but which could offer a unified explanation of the astronomical observations in terms of three basic laws, over a theory that retained perfect circular motion, but could not offer a similarly unified explanation, seems to be a clear example of this.

As a result of thoughts in this sort of direction, some philosophers have argued that there is actually no single theoretical value here at all, but rather a cluster of them (for example, Bunge, 1961). It is also worth considering the possibility that which of the cluster is accorded greater weight than the others, and how each of them is understood in practice, may vary greatly across different disciplines and fields of inquiry. Thus, what really matters when it comes to evaluating the comparative “simplicity” of theories might be quite different for biologists than for physicists, for instance, and perhaps what matters to a particle physicist is different to what matters to an astrophysicist. If there is in fact no unified concept of simplicity at work in science, that might also indicate that there is no unitary justification for choosing between rival theories on grounds of simplicity. One important suggestion that this possibility has led to is that the role of simplicity in science cannot be understood from a global perspective, but can only be understood locally. How simplicity ought to be measured and why it matters may have a peculiarly domain-specific explanation.

4. Justifying Preferences for Simpler Theories

Due to the apparent centrality of simplicity considerations to scientific methods and the link between it and numerous other important philosophical issues, the problem of justifying preferences for simpler theories is regarded as a major problem in the philosophy of science. It is also regarded as one of the most intractable. Though an extremely wide variety of justifications have been proposed—as with the debate over how to correctly define and measure simplicity, some important recent contributions have their origins in scientific literature in statistics, information theory, and other cognate fields—all of them have met with significant objections. There is currently no agreement amongst philosophers on what is the most promising path to take. There is also skepticism in some circles about whether an adequate justification is even possible.

Broadly speaking, justificatory proposals can be categorized into three types: 1) accounts that seek to show that simplicity is an indicator of truth (that is, that simpler theories are, in general, more likely to be true, or are somehow better confirmed by the empirical data than their more complex rivals); 2) accounts that do not regard simplicity as a direct indicator of truth, but which seek to highlight some alternative methodological justification for preferring simpler theories; 3) deflationary approaches, which actually reject the idea that there is a general justification for preferring simpler theories per se, but which seek to analyze particular appeals to simplicity in science in terms of other, less problematic, theoretical virtues.

a. Simplicity as an Indicator of Truth

i. Nature is Simple

Historically, the dominant view about why we should prefer simpler theories to more complex ones has been based on a general metaphysical thesis of the simplicity of nature. Since nature itself is simple, the relative simplicity of theories can thus be regarded as direct evidence for their truth. Such a view was explicitly endorsed by many of the great scientists of the past, including Aristotle, Copernicus, Galileo, Kepler, Newton, Maxwell, and Einstein. Naturally, however, the question arises as to what justifies the thesis that nature is simple. Broadly speaking, there have been two different sorts of argument given for this thesis: i) that a benevolent God must have created a simple and elegant universe; ii) that the past record of success of relatively simple theories entitles us to infer that nature is simple. The theological justification was most common amongst scientists and philosophers during the early modern period. Einstein, on the other hand, invoked a meta-inductive justification, claiming that the history of physics justifies us in believing that nature is the realization of the simplest conceivable mathematical ideas.

Despite the historical popularity and influence of this view, more recent philosophers and scientists have been extremely resistant to the idea that we are justified in believing that nature is simple. For a start, it seems difficult to formulate the thesis that nature is simple so that it is not either obviously false, or too vague to be of any use. There would seem to be many counter-examples to the claim that we live in a simple universe. Consider, for instance, the picture of the atomic nucleus that physicists were working with in the early part of the twentieth century: it was assumed that matter was made only of protons and electrons; there were no such things as neutrons or neutrinos and no weak or strong nuclear forces to be explained, only electromagnetism. Subsequent discoveries have arguably led to a much more complex picture of nature and much more complex theories have had to be developed to account for this. In response, it could be claimed that though nature seems to be complex in some superficial respects, there is in fact a deep underlying simplicity in the fundamental structure of nature. It might also be claimed that the respects in which nature appears to be complex are necessary consequences of its underlying simplicity. But this just serves to highlight the vagueness of the claim that nature is simple—what exactly does this thesis amount to, and what kind of evidence could we have for it?

However the thesis is formulated, it would seem to be an extremely difficult one to adequately defend, whether this be on theological or meta-inductive grounds. An attempt to give a theological justification for the claim that nature is simple suffers from an inherent unattractiveness to modern philosophers and scientists who do not want to ground the legitimacy of scientific methods in theology. In any case, many theologians reject the supposed link between God’s benevolence and the simplicity of creation. With respect to a meta-inductive justification, even if it were the case that the history of science demonstrates the better than average success of simpler theories, we may still raise significant worries about the extent to which this could give sufficient credence to the claim that nature is simple. First, it assumes that empirical success can be taken to be a reliable indicator of truth (or at least approximate truth), and hence of what nature is really like. Though this is a standard assumption for many scientific realists—the claim being that success would be “miraculous” if the theory concerned was radically false—it is a highly contentious one, since many anti-realists hold that the history of science shows that all theories, even eminently successful theories, typically turn out to be radically false. Even if one does accept a link between success and truth, our successes to date may still not provide a representative sample of nature: maybe we have only looked at the problems that are most amenable to simple solutions and the real underlying complexity of nature has escaped our notice. We can also question the degree to which we can extrapolate any putative connection between simplicity and truth in one area of nature to nature as a whole. Moreover, in so far as simplicity considerations are held to be fundamental to inductive inference quite generally, such an attempted justification risks a charge of circularity.

ii. Meta-Inductive Proposals

There is another way of appealing to past success in order to try to justify a link between simplicity and truth. Instead of trying to justify a completely general claim about the simplicity of nature, this proposal merely suggests that we can infer a correlation between success and very particular simplicity characteristics in particular fields of inquiry—for instance, a particular kind of symmetry in certain areas of theoretical physics. If success can be regarded as an indicator of at least approximate truth, we can then infer that theories that are simpler in the relevant sense are more likely to be true in fields where the correlation with success holds.

Recent examples of this sort of proposal include McAllister (1996) and Kuipers (2002). In an effort to account for the truth-conduciveness of aesthetic considerations in science, including simplicity, Theo Kuipers (2002) claims that scientists tend to become attracted to theories that share particular aesthetic features in common with successful theories that they have been previously exposed to. In other words, we can explain the particular aesthetic preferences that scientists have in terms that are similar to a well-documented psychological effect known as the “mere-exposure effect”, which occurs when individuals take a liking to something after repeated exposure to it. If, in a given field of inquiry, theories that have been especially successful exhibit a particular type of simplicity (however this is understood), and thus such theories have been repeatedly presented to scientists working in the field during their training, the mere-exposure effect will then lead these scientists to be attracted to other theories that also exhibit that same type of simplicity. This process can then be used to support an aesthetic induction to a correlation between simplicity in the relevant sense and success. One can then make a case that this type of simplicity can legitimately be taken as an indicator of at least approximate truth.

Even though this sort of meta-inductive proposal does not attempt to show that nature in general is simple, many of the same objections can be raised against it as are raised against the attempt to justify that metaphysical thesis by appeal to the past success of simple theories. Once again, there is the problem of justifying the claim that empirical success is a reliable guide to (approximate) truth. Kuipers’ own arguments for this claim rest on a somewhat idiosyncratic account of truth approximation. In addition, in order to legitimately infer that there is a genuine correlation between simplicity and success, one cannot just look at successful theories; one must look at unsuccessful theories too. Even if all the successful theories in a domain have the relevant simplicity characteristic, it might still be the case that the majority of theories with the characteristic have been (or would have been) highly unsuccessful. Indeed, if one can potentially modify a successful theory in an infinite number of ways while keeping the relevant simplicity characteristic, one might actually be able to guarantee that the majority of possible theories with the characteristic would be unsuccessful theories, thus breaking the correlation between simplicity and success. This could be taken as suggesting that in order to carry any weight, arguments from success also need to offer an explanation for why simplicity contributes to success. Moreover, though the mere-exposure effect is well documented, Kuipers provides no direct empirical evidence that scientists actually acquire their aesthetic preferences via the kind of process that he proposes.

iii. Bayesian Proposals

According to standard varieties of Bayesianism, we should evaluate scientific theories according to their probability conditional upon the evidence (posterior probability). This probability, Pr(T | E), is a function of three quantities:

  • Pr(T | E) = Pr(E | T) Pr(T) / Pr(E)

Pr(E | T), is the probability that the theory, T, confers on the evidence, E, which is referred to as the likelihood of T. Pr(T) is the prior probability of T, and Pr(E) is the probability of E. T is then held to have higher posterior probability than a rival theory, T*, if and only if:

  • Pr(E | T) Pr(T) > Pr(E | T*) Pr(T*)

A standard Bayesian proposal for understanding the role of simplicity in theory choice is that simplicity is one of the key determinants of Pr(T): other things being equal, simpler theories and hypotheses are held to have higher prior probability of being true than more complex ones. Thus, if two rival theories confer equal or near equal probability on the data, but differ in relative simplicity, other things being equal, the simpler theory will tend to have a higher posterior probability. This idea, which Harold Jeffreys called “the simplicity postulate”, has been elaborated in a number of different ways by philosophers, statisticians, and information theorists, utilizing various measures of simplicity (for example, Carnap, 1950; Jeffreys, 1957, 1961; Solomonoff, 1964; Li and Vitányi, 1997).
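
The effect of the simplicity postulate on theory comparison can be sketched numerically. The function name and all of the probabilities below are hypothetical, chosen only to show how, when the likelihoods are equal, the comparison between T and T* is settled entirely by the priors.

```python
def posterior_ratio(likelihood_t, prior_t, likelihood_rival, prior_rival):
    """Pr(T | E) / Pr(T* | E). Pr(E) appears in both posteriors and
    cancels, so it is not needed for the comparison."""
    return (likelihood_t * prior_t) / (likelihood_rival * prior_rival)


# Hypothetical values: both theories confer probability 0.8 on the
# evidence, but the simpler theory T is assigned the higher prior.
ratio = posterior_ratio(
    likelihood_t=0.8, prior_t=0.3,
    likelihood_rival=0.8, prior_rival=0.1,
)

print(ratio > 1)  # True exactly when T has the higher posterior
```

A ratio above 1 corresponds to the inequality Pr(E | T) Pr(T) > Pr(E | T*) Pr(T*) stated above.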

In response to this proposal, Karl Popper (1959) argued that, in some cases, assigning a simpler theory a higher prior probability actually violates the axioms of probability. For instance, Jeffreys proposed that simplicity be measured by counting adjustable parameters. On this measure, the claim that the planets move in circular orbits is simpler than the claim that the planets move in elliptical orbits, since the equation for an ellipse contains an additional adjustable parameter. However, circles can also be viewed as special cases of ellipses, where the additional parameter is set to zero. Hence, the claim that planets move in circular orbits can also be seen as a special case of the claim that the planets move in elliptical orbits. If that is right, then the former claim cannot be more probable than the latter claim because the truth of the former entails the truth of the latter and probability respects entailment. In reply to Popper, it has been argued that this prior probabilistic bias towards simpler theories should only be seen as applying to comparisons between inconsistent theories where no relation of entailment holds between them: for instance, between the claim that the planets move in circular orbits and the claim that they move in elliptical but non-circular orbits.

The main objection to the Bayesian proposal that simplicity is a determinant of prior probability is that the theory of probability seems to offer no resources for explaining why simpler theories should be accorded higher prior probability. Rudolf Carnap (1950) thought that prior probabilities could be assigned a priori to any hypothesis stated in a formal language, on the basis of a logical analysis of the structure of the language and assumptions about the equi-probability of all possible states of affairs. However, Carnap’s approach has generally been recognized to be unworkable. If higher prior probabilities cannot be assigned to simpler theories on the basis of purely logical or mathematical considerations, then it seems that Bayesians must look outside of the Bayesian framework itself to justify the simplicity postulate.

Some Bayesians have taken an alternative route, claiming that a direct mathematical connection can be established between the simplicity of theories and their likelihood, that is, the value of Pr(E | T) (see Rosencrantz, 1983; Myrvold, 2003; White, 2005). This proposal depends on the assumption that simpler theories have fewer adjustable parameters, and hence are consistent with a narrower range of potential data. Suppose that we collect a set of empirical data, E, that can be explained by two theories that differ with respect to this kind of simplicity: a simple theory, S, and a complex theory, C. S has no adjustable parameters and only ever entails E, while C has an adjustable parameter, θ, which can take any of n values. When θ is set to some specific value, i, C entails E, but on other values of θ, C entails different and incompatible observations. It is then argued that S confers a higher probability on E. This is because C allows that lots of other possible observations could have been made instead of E (on different possible settings for θ). Hence, the truth of C would make our recording those particular observations less probable than would the truth of S. Here, the likelihood of C is calculated as the average of the likelihoods of each of the n versions of C, each defined by a unique setting of θ. Thus, as the complexity of a theory increases (measured in terms of the number of adjustable parameters it contains), the number of versions of the theory that will give a low probability to E will increase and the overall value of Pr(E | T) will go down.
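
The averaging step in this argument can be made concrete with a small sketch. The value n = 10 and the variable names are hypothetical, and the sketch assumes, as the argument does, that every setting of θ is weighted equally and that exactly one setting entails E.

```python
def averaged_likelihood(version_likelihoods):
    """Likelihood of a theory computed as the unweighted average of the
    likelihoods of its versions (one version per parameter setting)."""
    return sum(version_likelihoods) / len(version_likelihoods)


n = 10  # hypothetical number of possible settings of theta

# S has no adjustable parameters and entails E.
likelihood_S = 1.0

# Exactly one version of C entails E; the remaining n - 1 versions
# entail incompatible observations and so give E probability 0.
version_likelihoods_C = [1.0] + [0.0] * (n - 1)
likelihood_C = averaged_likelihood(version_likelihoods_C)

print(likelihood_S, likelihood_C)
```

Under these assumptions the averaged likelihood of C is 1/n, so it shrinks as the number of parameter settings grows, while the likelihood of S stays fixed.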

An objection to this proposal (Kelly, 2004, 2010) is that for us to be able to show that S has a higher posterior probability than C as a result of its having a higher likelihood, it must be assumed that the prior probability of C is not significantly greater than the prior probability of S. This is a substantive assumption to make because of the way that simplicity is defined in this argument. We can view C as coming in a variety of different versions, each of which is picked out by a different value given to θ. If we then assume that S and C have roughly equal prior probability we must, by implication, assume that each version of C has a very low prior probability compared to S, since the prior probability of each version of C would be Pr(C) / n (assuming that the theory does not say that any particular parameter setting is more probable than any of the others). This would effectively build in a very strong prior bias in favour of S over each version of C. Given that each version of C could be considered independently—that is, the complex theory could be given a simpler, more restricted formulation—this would require an additional supporting argument. The objection is thus that the proposal simply begs the question by resting on a prior probabilistic bias towards simpler theories. Another objection is that the proposal suffers from the limitation that it can only be applied to comparisons between theories where the simpler theory can be derived from the more complex one by fixing certain of its parameters. At best, this represents a small fraction of cases in which simplicity has been thought to play a role.
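
The force of this objection can be seen in a short numerical sketch. All values are hypothetical: S and C are given equal priors, C is assumed to have n = 10 equally probable versions, and the version of C with θ = i entails E just as S does.

```python
n = 10         # hypothetical number of versions of C
prior_S = 0.5  # S and C are assumed to have equal prior probability
prior_C = 0.5

# With no setting of theta favoured, each version of C gets Pr(C) / n.
prior_each_version = prior_C / n

# Both S and the version of C with theta = i entail E.
likelihood_S = 1.0
likelihood_C_i = 1.0

# Unnormalized posterior weights: likelihood times prior.
weight_S = likelihood_S * prior_S
weight_C_i = likelihood_C_i * prior_each_version

print(weight_S, weight_C_i)
```

Since the two likelihoods are equal, the advantage of S over this version of C comes entirely from the prior, which is the bias that the objection says stands in need of independent support.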

iv. Simplicity as a Fundamental A Priori Principle

In the light of the perceived failure of philosophers to justify the claim that simpler theories are more likely to be true, Richard Swinburne (2001) has argued that this claim has to be regarded as a fundamental a priori principle. Swinburne argues that it is just obvious that the criteria for theory evaluation that scientists use reliably lead them to make correct judgments about which theories are more likely to be true. Since, Swinburne argues, one of these is that simpler theories are, other things being equal, more likely to be true, we just have to accept that simplicity is indeed an indicator of probable truth. However, Swinburne doesn’t think that this connection between simplicity and truth can be established empirically, nor does he think that it can be shown to follow from some more obvious a priori principle. Hence, we have no choice but to regard it as a fundamental a priori principle: a principle that cannot be justified by anything more fundamental.

In response to Swinburne, it can be argued that this is hardly going to convince those scientists and philosophers for whom it is not at all obvious that simpler theories are more likely to be true.

b. Alternative Justifications

i. Falsifiability

Famously, Karl Popper (1959) rejected the idea that theories are ever confirmed by evidence and that we are ever entitled to regard a theory as true, or probably true. Hence, Popper did not think simplicity could be legitimately regarded as an indicator of truth. Rather, he argued that simpler theories are to be valued because they are more falsifiable. Indeed, Popper thought that the simplicity of theories could be measured in terms of their falsifiability, since intuitively simpler theories have greater empirical content, placing more restriction on the ways the world can be, thus leading to a reduced ability to accommodate any future data that we might discover. According to Popper, scientific progress consists not in the attainment of true theories, but in the elimination of false ones. Thus, the reason we should prefer more falsifiable theories is that such theories will be more quickly eliminated if they are in fact false. Hence, the practice of first considering the simplest theory consistent with the data provides a faster route to scientific progress. Importantly, for Popper, this meant that we should prefer simpler theories because they have a lower probability of being true, since, for any set of data, it is more likely that some complex theory (in Popper’s sense) will be able to accommodate it than a simpler theory.

Popper’s equation of simplicity with falsifiability suffers from some well-known objections and counter-examples, and these pose significant problems for his justificatory proposal (Section 3c). Another significant problem is that taking degree of falsifiability as a criterion for theory choice seems to lead to absurd consequences, since it encourages us to prefer absurdly specific scientific theories to those that have more general content. For instance, the hypothesis, “all emeralds are green until 11pm today when they will turn blue” should be judged as preferable to “all emeralds are green” because it is easier to falsify. It thus seems deeply implausible to say that selecting and testing such hypotheses first provides the fastest route to scientific progress.

ii. Simplicity as an Explanatory Virtue

A number of philosophers have sought to elucidate the rationale for preferring simpler theories to more complex ones in explanatory terms (for example, Friedman, 1974; Sober, 1975; Walsh, 1979; Thagard, 1988; Kitcher, 1989; Baker, 2003). These proposals have typically been made on the back of accounts of scientific explanation that explicate notions of explanatoriness and explanatory power in terms of unification, which is taken to be intimately bound up with notions of simplicity. According to unification accounts of explanation, a theory is explanatory if it shows how different phenomena are related to each other under certain systematizing theoretical principles, and a theory is held to have greater explanatory power than its rivals if it systematizes more phenomena. For Michael Friedman (1974), for instance, explanatory power is a function of the number of independent phenomena that we need to accept as ultimate: the smaller the number of independent phenomena that are regarded as ultimate by the theory, the more explanatory is the theory. Similarly, for Philip Kitcher (1989), explanatory power is increased the smaller the number of patterns of argument, or “problem-solving schemas”, that are needed to deliver the facts about the world that we accept. Thus, on such accounts, explanatory power is seen as a structural relationship between the sparseness of an explanation (the fewness of hypotheses or argument patterns) and the plenitude of facts that are explained. There have been various attempts to explicate notions of simplicity in terms of these sorts of features. A standard type of argument that is then used is that we want our theories not only to be true, but also explanatory. If truth were our only goal, there would be no reason to prefer a genuine scientific theory to a collection of random factual statements that all happen to be true. Hence, explanation is an ultimate, rather than a purely instrumental goal of scientific inquiry. Thus, we can justify our preferences for simpler theories once we recognize that there is a fundamental link between simplicity and explanatoriness and that explanation is a key goal of scientific inquiry, alongside truth.

There are some well-known objections to unification theories of explanation, though most of them concern the claim that unification is all there is to explanation—a claim on which the current proposal does not depend. However, even if we accept a unification theory of explanation and accept that explanation is an ultimate goal of scientific inquiry, it can be objected that the choice between a simple theory and a more complex rival is not normally a choice between a theory that is genuinely explanatory, in this sense, and a mere factual report. The complex theory can normally be seen as unifying different phenomena under systematizing principles, at least to some degree. Hence, the justificatory question here is not about why we should prefer theories that explain the data to theories that do not, but why we should prefer theories that have greater explanatory power in the senses just described to theories that are comparatively less explanatory. It is certainly a coherent possibility that the truth may turn out to be relatively disunified and unsystematic. Given this, it seems appropriate to ask why we are justified in choosing theories because they are more unifying. Just saying that explanation is an ultimate goal of scientific inquiry does not seem to be enough.

iii. Predictive Accuracy

In the last few decades, the treatment of simplicity as an explicit part of statistical methodology has become increasingly sophisticated. A consequence of this is that some philosophers of science have started looking to the statistics literature for illumination on how to think about the philosophical problems surrounding simplicity. According to Malcolm Forster and Elliott Sober (Forster and Sober, 1994; Forster, 2001; Sober, 2007), the work of the statistician, Hirotugu Akaike (1973), provides a precise theoretical framework for understanding the justification for the role of simplicity in curve-fitting and model selection.

Standard approaches to curve-fitting effect a trade-off between fit to a sample of data and the simplicity of the kind of mathematical relationship that is posited to hold between the variables, that is, the simplicity of the postulated model for the underlying relationship, typically measured in terms of the number of adjustable parameters it contains. This often means, for instance, that a linear hypothesis that fits a sample of data less well may be chosen over a parabolic hypothesis that fits the data better. According to Forster and Sober, Akaike developed an explanation for why it is rational to favor simpler models, under specific circumstances. The proposal builds on the practical wisdom that when there is a certain amount of error or noise in the data sample, more complex models have a greater propensity to “over-fit” to this noise in the sample and thus lead to less accurate predictions of extra-sample (for instance, future) data, particularly when dealing with small sample sizes. (Gauch [2003, 2006] calls this “Ockham’s hill”: to the left of the peak of the hill, increasing the complexity of a model improves its accuracy with respect to extra-sample data; after the peak, increasing complexity actually diminishes predictive accuracy. There is therefore an optimal trade-off at the peak of Ockham’s hill between simplicity and fit to the data sample when it comes to facilitating accurate prediction.) According to Forster and Sober, what Akaike did was prove a theorem, which shows that, given standard statistical assumptions, we can estimate the degree to which constraining model complexity when fitting a curve to a sample of data will lead to more accurate predictions of extra-sample data. Following Forster and Sober’s presentation (1994, pp. 9-10), Akaike’s theorem can be stated as follows:

  • Estimated[A(M)] = (1/N)[log-likelihood(L(M)) – k],

where A(M) is the predictive accuracy of the model, M, with respect to extra-sample data, N is the number of data points in the sample, log-likelihood is a measure of goodness of fit to the sample (the higher the log-likelihood score the closer the fit to the data), L(M) is the best fitting member of M, and k is the number of adjustable parameters that M contains. Akaike’s theorem is claimed to specify an unbiased estimator of predictive accuracy, which means that the distribution of estimates of A is centered around the true value of A (for proofs and further details on the assumptions behind Akaike’s theorem, see Sakamoto and others, 1986). This gives rise to a model selection procedure, Akaike’s Information Criterion (AIC), which says that we should choose the model that has the highest estimated predictive accuracy, given the data at hand. In practice, AIC implies that when the best-fitting parabola fits the data sample better than the best-fitting straight line, but not so much better that this outweighs its greater complexity (k), the straight line should be used for making predictions. Importantly, the penalty imposed on complexity has less influence on model selection the larger the sample of data, meaning that simplicity matters more for predictive accuracy when dealing with smaller samples.
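The comparison between a straight line and a parabola can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions, not Akaike's or Forster and Sober's own procedure: the five data points are invented, the noise is assumed to be Gaussian with a known standard deviation (so that the least-squares curve is also the maximum-likelihood curve and k is just the number of polynomial coefficients), and the function names are hypothetical.

```python
import math

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit, solved via the normal equations."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j), b[i] = sum y * x^i.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution; coeffs[i] multiplies x^i.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs

def log_likelihood(xs, ys, coeffs, sigma):
    """Gaussian log-likelihood of the data, assuming a known noise sd sigma."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = sum(c * x ** i for i, c in enumerate(coeffs))
        total += (-0.5 * math.log(2 * math.pi * sigma ** 2)
                  - (y - predicted) ** 2 / (2 * sigma ** 2))
    return total

def estimated_accuracy(xs, ys, degree, sigma):
    """Return (log-likelihood of L(M), (1/N)[log-likelihood(L(M)) - k])."""
    best_fit = fit_polynomial(xs, ys, degree)   # L(M): best-fitting member of M
    k = degree + 1                              # adjustable parameters in M
    ll = log_likelihood(xs, ys, best_fit, sigma)
    return ll, (ll - k) / len(xs)

# Invented sample: y = 1 + 2x plus small errors (an assumption for illustration).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-2.8, -1.1, 0.7, 3.1, 5.1]
SIGMA = 0.25  # assumed known noise standard deviation

for degree, name in [(1, "straight line"), (2, "parabola")]:
    ll, accuracy = estimated_accuracy(xs, ys, degree, SIGMA)
    print(f"{name}: log-likelihood = {ll:.3f}, estimated accuracy = {accuracy:.3f}")
```

On this invented sample the parabola's log-likelihood exceeds the straight line's by roughly 0.82, which is less than the penalty of 1 for its extra parameter, so the straight line receives the higher estimated predictive accuracy even though the parabola fits the sample better.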

Forster and Sober argue that Akaike’s theorem explains why simplicity has a quantifiable positive effect on predictive accuracy by combating the risk of over-fitting to noisy data. Hence, if one is interested in generating accurate predictions—for instance, of future data—one has a clear rationale for preferring simpler models. Forster and Sober are explicit that this proposal is only meant to apply to scientific contexts that can be understood from within a model selection framework, where predictive accuracy is the central goal of inquiry and there is a certain amount of error or noise in the data. Hence, they do not view Akaike’s work as offering a complete solution to the problem of justifying preferences for simpler theories. However, they have argued that a very significant number of scientific inference problems can be understood from an Akaikian perspective.

Several objections have been raised against Forster and Sober’s philosophical use of Akaike’s work. One objection is that the measure of simplicity employed by AIC is not language invariant, since the number of adjustable parameters a model contains depends on how the model is described. However, Forster and Sober argue that though, for practical purposes, the quantity, k, is normally spelt out in terms of the number of adjustable parameters, it is in fact more accurately explicated in terms of the notion of the dimension of a family of functions, which is language invariant. Another objection is that AIC is not statistically consistent. Forster and Sober reply that this charge rests on a confusion over what AIC is meant to estimate: for example, erroneously assuming that AIC is meant to be an estimator of the true value of k (the size of the simplest model that contains the true hypothesis), rather than an estimator of the predictive accuracy of a particular model at hand. Another worry is that over-fitting considerations imply that an idealized false model will often make more accurate predictions than a more realistic model, so the justification is merely instrumentalist and cannot warrant the use of simplicity as a criterion for hypothesis acceptance where hypotheses are construed realistically, rather than just as predictive tools. For their part, Forster and Sober are quite happy to accept this instrumentalist construal of the role of simplicity in curve-fitting and model selection: in this context, simplicity is not a guide to the truth, but to predictive accuracy. Finally, there are a variety of objections concerning the nature and validity of the assumptions behind Akaike’s theorem and whether AIC is applicable to some important classes of model selection problems (for discussion, see Kieseppä, 1997; Forster, 1999, 2001; Howson and Urbach, 2006; Dowe and others, 2007; Sober, 2007; Kelly, 2010).

iv. Truth-Finding Efficiency

An important recent proposal about how to justify preferences for simpler theories has come from work in the interdisciplinary field known as formal learning theory (Schulte, 1999; Kelly, 2004, 2007, 2010). It has been proposed that even if we do not know whether the world is simple or complex, inferential rules that are biased towards simple hypotheses can be shown to converge to the truth more efficiently than alternative inferential rules. According to this proposal, an inferential rule is said to converge to the truth efficiently, if, relative to other possible convergent inferential rules, it minimizes the maximum number of U-turns or “retractions” of opinion that might be required of the inquirer while using the rule to guide her decisions on what to believe given the data. Such procedures are said to converge to the truth more directly and in a more stable fashion, since they require fewer changes of mind along the way. The proposal is that even if we do not know whether the truth is simple or complex, scientific inference procedures that are biased towards simplicity can be shown a priori to be optimally efficient in this sense, converging to the truth in the most direct and stable way possible.

To illustrate the basic logic behind this proposal, consider the following example from Oliver Schulte (1999). Suppose that we are investigating the existence of a hypothetical particle, Ω. If Ω does exist, we will be able to detect it with an appropriate measurement device. However, as yet, it has not been detected. What attitude should we take towards the existence of Ω? Let us say that Ockham’s Razor suggests that we deny that Ω exists until it is detected (if ever). Alternatively, we could assert that Ω does exist until a finite number of attempts to detect Ω have proved to be unsuccessful, say ten thousand, in which case, we assert that Ω does not exist; or, we could withhold judgment until Ω is either detected, or there have been ten thousand unsuccessful attempts to detect it. Since we are assuming that existent particles do not go undetected forever, abiding by any of these three inferential rules will enable us to converge to the truth in the limit, whether Ω exists or not. However, Schulte argues that Ockham’s Razor provides the most efficient route to the truth. This is because following Ockham’s Razor incurs a maximum of only one retraction of opinion: retracting an assertion of non-existence to an assertion of existence, if Ω is detected. In contrast, the alternative inferential rules both incur a maximum of two retractions, since Ω could go undetected ten thousand times, but then be detected on the ten thousand and first attempt. Hence, truth-finding efficiency requires that one adopt Ockham’s Razor and presume that Ω does not exist until it is detected.
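The worst-case counts in this example can be checked mechanically. The following is a minimal sketch rather than Schulte's own formalism. It makes two simplifying assumptions: the cut-off of ten thousand failed attempts is replaced by a small hypothetical limit of three, and, so as to match the counts given above, any change in the announced attitude (including abandoning suspended judgment) is counted as a change of mind. All function names are illustrative.

```python
def ockham(failures, detected, limit):
    """Deny that the particle exists until it is detected."""
    return "exists" if detected else "absent"

def presume_exists(failures, detected, limit):
    """Assert existence until `limit` failed attempts, then deny it."""
    if detected:
        return "exists"
    return "exists" if failures < limit else "absent"

def withhold(failures, detected, limit):
    """Suspend judgment until detection or `limit` failed attempts."""
    if detected:
        return "exists"
    return "?" if failures < limit else "absent"

def changes_of_mind(rule, detection_time, horizon, limit):
    """Count attitude changes over `horizon` attempts.

    detection_time is the attempt on which the particle is first detected,
    or None if it is never detected within the horizon.
    """
    previous = rule(0, False, limit)  # attitude before any attempt is made
    changes = 0
    failures = 0
    detected = False
    for attempt in range(1, horizon + 1):
        if detection_time is not None and attempt >= detection_time:
            detected = True
        else:
            failures += 1
        current = rule(failures, detected, limit)
        if current != previous:
            changes += 1
            previous = current
    return changes

def worst_case(rule, horizon, limit):
    """Maximum number of changes over every possible detection time."""
    scenarios = [None] + list(range(1, horizon + 1))
    return max(changes_of_mind(rule, t, horizon, limit) for t in scenarios)

LIMIT = 3     # hypothetical stand-in for the ten thousand attempts in the text
HORIZON = 5   # long enough to include a detection just after the limit

for rule in (ockham, presume_exists, withhold):
    print(rule.__name__, worst_case(rule, HORIZON, LIMIT))
```

Under these assumptions the Ockham rule changes its attitude at most once, whereas each alternative changes at most twice, the worst case being a detection on the attempt immediately after the limit is reached.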

Kevin Kelly has further developed this U-turn argument in considerable detail. Kelly argues that, with suitable refinements, it can be extended to an extremely wide variety of real world scientific inference problems. Importantly, Kelly has argued that, on this proposal, simplicity should not be seen as purely a pragmatic consideration in theory choice. While simplicity cannot be regarded as a direct indicator of truth, we do nonetheless have a reason to think that the practice of favoring simpler theories is a truth-conducive strategy, since it promotes speedy and stable attainment of true beliefs. Hence, simplicity should be regarded as a genuinely epistemic consideration in theory choice.

One worry about the truth-finding efficiency proposal concerns the general applicability of these results to scientific contexts in which simplicity may play a role. The U-turn argument for Ockham’s razor described above seems to depend on the evidential asymmetry between establishing that Ω exists and establishing that Ω does not exist: a detection of Ω is sufficient to establish the existence of Ω, whereas repeated failures of detection are not sufficient to establish non-existence. The argument may work where detection procedures are relatively clear-cut—for instance where there are relatively unambiguous instrument readings that count as “detections”—but what about entities that are very difficult to detect directly and where mistakes can easily be made about existence as well as non-existence? Similarly, a current stumbling block is that the U-turn argument cannot be used as a justification for the employment of simplicity biases in statistical inference, where the hypotheses under consideration do not have deductive observational consequences. Kelly is, however, optimistic about extending the U-turn argument to statistical inference. Another objection concerns the nature of the justification that is being provided here. What the U-turn argument seems to show is that the strategy of favoring the simplest theory consistent with the data may help one to find the truth with fewer reversals along the way. It does not establish that simpler theories themselves should be regarded as in any way “better” than their more complex rivals. Hence, there are doubts about the extent to which this proposal can actually make sense of standard examples of simplicity preferences at work in the history and current practice of science, where the guiding assumption seems to be that simpler theories are not to be preferred merely for strategic reasons, but because they are better theories.

c. Deflationary Approaches

Various philosophers have sought to defend broadly deflationary accounts of simplicity. Such accounts depart from all of the justificatory accounts discussed so far by rejecting the idea that simplicity should in fact be regarded as a theoretical virtue and criterion for theory choice in its own right. Rather, according to deflationary accounts, when simplicity appears to be a driving factor in theory evaluation, something else is doing the real work.

Richard Boyd (1990), for instance, has argued that scientists’ simplicity judgments are typically best understood as just covert judgments of theoretical plausibility. When a scientist claims that one theory is “simpler” than another, this is often just another way of saying that the theory provides a more plausible account of the data. For Boyd, such covert judgments of theoretical plausibility are driven by the scientist’s background theories. Hence, it is the relevant background theories that do the real work in motivating the preference for the “simpler” theory, not the simplicity of the theory per se. John Norton (2003) has advocated a similar view in the context of his “material theory” of induction, according to which inductive inferences are licensed not by universal inductive rules or inference schemas, but rather by local factual assumptions about the domain of inquiry. Norton argues that the apparent use of simplicity in induction merely reflects material assumptions about the nature of the domain being investigated. For instance, when we try to fit curves to data we choose the variables and functions that we believe to be appropriate to the physical reality we are trying to get at. Hence, it is because of the facts that we believe to prevail in this domain that we prefer a “simple” linear function to a quadratic one, if such a curve fits the data sufficiently well. In a different domain, where we believe that different facts prevail, our decisions about which hypotheses are “simple” or “complex” are likely to be very different.

Elliott Sober (1988, 1994) has defended this sort of deflationary analysis of various appeals to simplicity and parsimony in evolutionary biology. For example, Sober argues that the common claim that group selection hypotheses are “less parsimonious” and hence to be taken less seriously as explanations for biological adaptations than individual selection hypotheses, rests on substantive assumptions about the comparative rarity of the conditions required for group selection to occur. Hence, the appeal to Ockham’s Razor in this context is just a covert appeal to local background knowledge. Other attempts to offer deflationary analyses of particular appeals to simplicity in science include Plutynski (2005), who focuses on the Fisher-Wright debate in evolutionary biology, and Fitzpatrick (2009), who focuses on appeals to simplicity in debates over the cognitive capacities of non-human primates.

If such deflationary analyses of the putative role of simplicity in particular scientific contexts turn out to be plausible, then problems concerning how to measure simplicity and how to offer a general justification for preferring simpler theories can be avoided, since simplicity per se can be shown to do no substantive work in the relevant inferences. However, many philosophers are skeptical that such deflationary analyses are possible for many of the contexts where simplicity considerations have been thought to play an important role. Kelly (2010), for example, has argued that simplicity typically comes into play when our background knowledge underdetermines theory choice. Sober himself seems to advocate a mixed view: some appeals to simplicity in science are best understood in deflationary terms, others are better understood in terms of Akaikian model selection theory.

5. Conclusion

The putative role of considerations of simplicity in the history and current practice of science gives rise to a number of philosophical problems, including the problem of precisely defining and measuring theoretical simplicity, and the problem of justifying preferences for simpler theories. As this survey of the literature on simplicity in the philosophy of science demonstrates, these problems have turned out to be surprisingly resistant to resolution, and there remains a live debate amongst philosophers of science about how to deal with them. On the other hand, there is no disputing the fact that practicing scientists continue to find it useful to appeal to various notions of simplicity in their work. Thus, in many ways, the debate over simplicity resembles other long-running debates in the philosophy of science, such as that over the justification for induction (which, it turns out, is closely related to the problem of justifying preferences for simpler theories). There is arguably more skepticism within the scientific community about the legitimacy of choosing between rival theories on grounds of simplicity than there is about the legitimacy of inductive inference, the latter being a complete non-issue for practicing scientists. Nonetheless, as is the case with induction, very many scientists continue to employ practices and methods that utilize notions of simplicity to great scientific effect. In doing so, they assume that appropriate solutions to the philosophical problems that these practices give rise to do in fact exist, even though philosophers have so far failed to articulate them. However, as this survey has also shown, statisticians, information and learning theorists, and other scientists have been making increasingly important contributions to the debate over the philosophical underpinning for these practices.

6. References and Further Reading

  • Ackermann, R. 1961. Inductive simplicity. Philosophy of Science, 28, 162-171.
    • Argues against the claim that simplicity considerations play a significant role in inductive inference. Critiques measures of simplicity proposed by Jeffreys, Kemeny, and Popper.
  • Akaike, H. 1973. Information theory and the extension of the maximum likelihood principle. In B. Petrov and F. Csaki (eds.), Second International Symposium on Information Theory. Budapest: Akademiai Kiado.
    • Laid the foundations for model selection theory. Proves a theorem suggesting that the simplicity of a model is relevant to estimating its future predictive accuracy. Highly technical.
  • Baker, A. 2003. Quantitative parsimony and explanatory power. British Journal for the Philosophy of Science, 54, 245-259.
    • Builds on Nolan (1997), argues that quantitative parsimony is linked with explanatory power.
  • Baker, A. 2007. Occam’s Razor in science: a case study from biogeography. Biology and Philosophy, 22, 193-215.
    • Argues for a “naturalistic” justification of Ockham’s Razor and that preferences for ontological parsimony played a significant role in the late 19th century debate in biogeography between dispersalist and extensionist theories.
  • Barnes, E.C. 2000. Ockham’s razor and the anti-superfluity principle. Erkenntnis, 53, 353-374.
    • Draws a useful distinction between two different interpretations of Ockham’s Razor: the anti-superfluity principle and the anti-quantity principle. Explicates an evidential justification for the anti-superfluity principle.
  • Boyd, R. 1990. Observations, explanatory power, and simplicity: towards a non-Humean account. In R. Boyd, P. Gasper and J.D. Trout (eds.), The Philosophy of Science. Cambridge, MA: MIT Press.
    • Argues that appeals to simplicity in theory evaluation are typically best understood as covert judgments of theoretical plausibility.
  • Bunge, M. 1961. The weight of simplicity in the construction and assaying of scientific theories. Philosophy of Science, 28, 162-171.
    • Takes a skeptical view about the importance and justifiability of a simplicity criterion in theory evaluation.
  • Carlson, E. 1966. The Gene: A Critical History. Philadelphia: Saunders.
    • Argues that simplicity considerations played a significant role in several important debates in the history of genetics.
  • Carnap, R. 1950. Logical Foundations of Probability. Chicago: University of Chicago Press.
  • Chater, N. 1999. The search for simplicity: a fundamental cognitive principle. The Quarterly Journal of Experimental Psychology, 52A, 273-302.
    • Argues that simplicity plays a fundamental role in human reasoning, with simplicity to be defined in terms of Kolmogorov complexity.
  • Cohen, I.B. 1985. Revolutions in Science. Cambridge, MA: Harvard University Press.
  • Cohen, I.B. 1999. A guide to Newton’s Principia. In I. Newton, The Principia: Mathematical Principles of Natural Philosophy; A New Translation by I. Bernard Cohen and Anne Whitman. Berkeley: University of California Press.
  • Crick, F. 1988. What Mad Pursuit: a Personal View of Scientific Discovery. New York: Basic Books.
    • Argues that the application of Ockham’s Razor to biology is inadvisable.
  • Dowe, D., Gardner, S., and Oppy, G. 2007. Bayes not bust! Why simplicity is no problem for Bayesians. British Journal for the Philosophy of Science, 58, 709-754.
    • Contra Forster and Sober (1994), argues that Bayesians can make sense of the role of simplicity in curve-fitting.
  • Duhem, P. 1954. The Aim and Structure of Physical Theory. Princeton: Princeton University Press.
  • Einstein, A. 1954. Ideas and Opinions. New York: Crown.
    • Einstein’s views about the role of simplicity in physics.
  • Fitzpatrick, S. 2009. The primate mindreading controversy: a case study in simplicity and methodology in animal psychology. In R. Lurz (ed.), The Philosophy of Animal Minds. New York: Cambridge University Press.
    • Advocates a deflationary analysis of appeals to simplicity in debates over the cognitive capacities of non-human primates.
  • Forster, M. 1995. Bayes and bust: simplicity as a problem for a probabilist’s approach to confirmation. British Journal for the Philosophy of Science, 46, 399-424.
    • Argues that the Bayesian approach to scientific reasoning is inadequate because it cannot make sense of the role of simplicity in theory evaluation.
  • Forster, M. 1999. Model selection in science: the problem of language variance. British Journal for the Philosophy of Science, 50, 83-102.
    • Responds to criticisms of Forster and Sober (1994). Argues that AIC relies on a language invariant measure of simplicity.
  • Forster, M. 2001. The new science of simplicity. In A. Zellner, H. Keuzenkamp and M. McAleer (eds.), Simplicity, Inference and Modelling. Cambridge: Cambridge University Press.
    • Accessible introduction to model selection theory. Describes how different procedures, including AIC, BIC, and MDL, trade off simplicity and fit to the data.
  • Forster, M. and Sober, E. 1994. How to tell when simpler, more unified, or less ad hoc theories will provide more accurate predictions. British Journal for the Philosophy of Science, 45, 1-35.
    • Explication of AIC statistics and its relevance to the philosophical problem of justifying preferences for simpler theories. Argues against Bayesian approaches to simplicity. Technical in places.
  • Foster, M. and Martin, M. 1966. Probability, Confirmation, and Simplicity: Readings in the Philosophy of Inductive Logic. New York: The Odyssey Press.
    • Anthology of papers discussing the role of simplicity in induction. Contains important papers by Ackermann, Barker, Bunge, Goodman, Kemeny, and Quine.
  • Friedman, M. 1974. Explanation and scientific understanding. Journal of Philosophy, LXXI, 1-19.
    • Defends a unification account of explanation, connects simplicity with explanatoriness.
  • Galilei, G. 1962. Dialogues concerning the Two Chief World Systems. Berkeley: University of California Press.
    • Classic defense of Copernicanism with significant emphasis placed on the greater simplicity and harmony of the Copernican system. Asserts that nature does nothing in vain.
  • Gauch, H. 2003. Scientific Method in Practice. Cambridge: Cambridge University Press.
    • Wide-ranging discussion of the scientific method written by a scientist for scientists. Contains a chapter on the importance of parsimony in science.
  • Gauch, H. 2006. Winning the accuracy game. American Scientist, 94, March-April 2006, 134-141.
    • Useful informal presentation of the concept of Ockham’s hill and its importance to scientific research in a number of fields.
  • Gingerich, O. 1993. The Eye of Heaven: Ptolemy, Copernicus, Kepler. New York: American Institute of Physics.
  • Glymour, C. 1980. Theory and Evidence. Princeton: Princeton University Press.
    • An important critique of Bayesian attempts to make sense of the role of simplicity in science. Defends a “boot-strapping” analysis of the simplicity arguments for Copernicanism and Newton’s argument for universal gravitation.
  • Goodman, N. 1943. On the simplicity of ideas. Journal of Symbolic Logic, 8, 107-1.
  • Goodman, N. 1955. Axiomatic measurement of simplicity. Journal of Philosophy, 52, 709-722.
  • Goodman, N. 1958. The test of simplicity. Science, 128, October 31st 1958, 1064-1069.
    • Reasonably accessible introduction to Goodman’s attempts to formulate a measure of logical simplicity.
  • Goodman, N. 1959. Recent developments in the theory of simplicity. Philosophy and Phenomenological Research, 19, 429-446.
    • Response to criticisms of Goodman (1955).
  • Goodman, N. 1961. Safety, strength, simplicity. Philosophy of Science, 28, 150-151.
    • Argues that simplicity cannot be equated with testability, empirical content, or paucity of assumption.
  • Goodman, N. 1983. Fact, Fiction and Forecast (4th edition). Cambridge, MA: Harvard University Press.
  • Harman, G. 1999. Simplicity as a pragmatic criterion for deciding what hypotheses to take seriously. In G. Harman, Reasoning, Meaning and Mind. Oxford: Oxford University Press.
    • Defends the claim that simplicity is a fundamental component of inductive inference and that this role has a pragmatic justification.
  • Harman, G. and Kulkarni, S. 2007. Reliable Reasoning: Induction and Statistical Learning Theory. Cambridge, MA: MIT Press.
    • Accessible introduction to statistical learning theory and VC dimension.
  • Harper, W. 2002. Newton’s argument for universal gravitation. In I.B. Cohen and G.E. Smith (eds.), The Cambridge Companion to Newton. Cambridge: Cambridge University Press.
  • Hesse, M. 1967. Simplicity. In P. Edwards (ed.), The Encyclopaedia of Philosophy, vol. 7. New York: Macmillan.
    • Focuses on attempts by Jeffreys, Popper, Kemeny, and Goodman to formulate measures of simplicity.
  • Hesse, M. 1974. The Structure of Scientific Inference. London: Macmillan.
    • Defends the view that simplicity is a determinant of prior probability. Useful discussion of the role of simplicity in Einstein’s work.
  • Holton, G. 1974. Thematic Origins of Modern Science: Kepler to Einstein. Cambridge, MA: Harvard University Press.
    • Discusses the role of aesthetic considerations, including simplicity, in the history of science.
  • Hoffmann, R., Minkin, V., and Carpenter, B. 1997. Ockham’s Razor and chemistry. Hyle, 3, 3-28.
    • Discussion by three chemists of the benefits and pitfalls of applying Ockham’s Razor in chemical research.
  • Howson, C. and Urbach, P. 2006. Scientific Reasoning: The Bayesian Approach (Third Edition). Chicago: Open Court.
    • Contains a useful survey of Bayesian attempts to make sense of the role of simplicity in theory evaluation. Technical in places.
  • Jeffreys, H. 1957. Scientific Inference (2nd edition). Cambridge: Cambridge University Press.
    • Defends the “simplicity postulate” that simpler theories have higher prior probability.
  • Jeffreys, H. 1961. Theory of Probability. Oxford: Clarendon Press.
    • Outline and defense of the Bayesian approach to scientific inference. Discusses the role of simplicity in the determination of priors and likelihoods.
  • Kelly, K. 2004. Justification as truth-finding efficiency: how Ockham’s Razor works. Minds and Machines, 14, 485-505.
    • Argues that Ockham’s Razor is justified by considerations of truth-finding efficiency. Critiques Bayesian, Akaikian, and other traditional attempts to justify simplicity preferences. Technical in places.
  • Kelly, K. 2007. How simplicity helps you find the truth without pointing at it. In M. Friend, N. Goethe, and V. Harizanov (eds.), Induction, Algorithmic Learning Theory, and Philosophy. Dordrecht: Springer.
    • Refinement and development of the argument found in Kelly (2004) and Schulte (1999). Technical.
  • Kelly, K. 2010. Simplicity, truth and probability. In P. Bandyopadhyay and M. Forster (eds.), Handbook of the Philosophy of Statistics. Dordrecht: Elsevier.
    • Expands and develops the argument found in Kelly (2007). Detailed critique of Bayesian accounts of simplicity. Technical.
  • Kelly, K. and Glymour, C. 2004. Why probability does not capture the logic of scientific justification. In C. Hitchcock (ed.), Contemporary Debates in the Philosophy of Science. Oxford: Blackwell.
    • Argues that Bayesians can’t make sense of Ockham’s Razor.
  • Kemeny, J. 1955. Two measures of complexity. Journal of Philosophy, 52, 722-733.
    • Develops some of Goodman’s ideas about how to measure the logical simplicity of predicates and systems of predicates. Proposes a measure of simplicity similar to Popper’s (1959) falsifiability measure.
  • Kieseppä, I. A. 1997. Akaike Information Criterion, curve-fitting, and the philosophical problem of simplicity. British Journal for the Philosophy of Science, 48, 21-48.
    • Critique of Forster and Sober (1994). Argues that Akaike’s theorem has little relevance to traditional philosophical problems surrounding simplicity. Highly technical.
  • Kitcher, P. 1989. Explanatory unification and the causal structure of the world. In P. Kitcher and W. Salmon, Minnesota Studies in the Philosophy of Science, vol 13: Scientific Explanation, Minneapolis: University of Minnesota Press.
    • Defends a unification theory of explanation. Argues that simplicity contributes to explanatory power.
  • Kuhn, T. 1957. The Copernican Revolution. Cambridge, MA: Harvard University Press.
    • Influential discussion of the role of simplicity in the arguments for Copernicanism.
  • Kuhn, T. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
  • Kuipers, T. 2002. Beauty: a road to truth. Synthese, 131, 291-328.
    • Attempts to show how aesthetic considerations might be indicative of truth.
  • Kyburg, H. 1961. A modest proposal concerning simplicity. Philosophical Review, 70, 390-395.
    • Important critique of Goodman (1955). Argues that simplicity be identified with the number of quantifiers in a theory.
  • Lakatos, I. and Zahar, E. 1978. Why did Copernicus’s research programme supersede Ptolemy’s? In J. Worrall and G. Currie (eds.), The Methodology of Scientific Research Programmes: Philosophical Papers of Imre Lakatos, Volume 1. Cambridge: Cambridge University Press.
    • Argues that simplicity did not really play a significant role in the Copernican Revolution.
  • Lewis, D. 1973. Counterfactuals. Oxford: Basil Blackwell.
    • Argues that quantitative parsimony is less important than qualitative parsimony in scientific and philosophical theorizing.
  • Li, M. and Vitányi, P. 1997. An Introduction to Kolmogorov Complexity and its Applications (2nd edition). New York: Springer.
    • Detailed elaboration of Kolmogorov complexity as a measure of simplicity. Highly technical.
  • Lipton, P. 2004. Inference to the Best Explanation (2nd edition). Oxford: Basil Blackwell.
    • Account of inference to the best explanation as inference to the “loveliest” explanation. Defends the claim that simplicity contributes to explanatory loveliness.
  • Lombrozo, T. 2007. Simplicity and probability in causal explanation. Cognitive Psychology, 55, 232–257.
    • Argues that simplicity is used as a guide to assessing the probability of causal explanations.
  • Lu, H., Yuille, A., Liljeholm, M., Cheng, P. W., and Holyoak, K. J. 2006. Modeling causal learning using Bayesian generic priors on generative and preventive powers. In R. Sun and N. Miyake (eds.), Proceedings of the 28th annual conference of the cognitive science society, 519–524. Mahwah, NJ: Erlbaum.
    • Argues that simplicity plays a significant role in causal learning.
  • MacKay, D. 1992. Bayesian interpolation. Neural Computation, 4, 415-447.
    • First presentation of the concept of Ockham’s Hill.
  • Martens, R. 2009. Harmony and simplicity: aesthetic virtues and the rise of testability. Studies in History and Philosophy of Science, 40, 258-266.
    • Discussion of the Copernican simplicity arguments and recent attempts to reconstruct the justification for them.
  • McAleer, M. 2001. Simplicity: views of some Nobel laureates in economic science. In A. Zellner, H. Keuzenkamp and M. McAleer (eds.), Simplicity, Inference and Modelling. Cambridge: Cambridge University Press.
    • Interesting survey of the views of famous economists on the place of simplicity considerations in their work.
  • McAllister, J. W. 1996. Beauty and Revolution in Science. Ithaca: Cornell University Press.
    • Proposes that scientists’ simplicity preferences are the product of an aesthetic induction.
  • Mill, J.S. 1867. An Examination of Sir William Hamilton’s Philosophy. London: Walter Scott.
  • Myrvold, W. 2003. A Bayesian account of the virtue of unification. Philosophy of Science, 70, 399-423.
  • Newton, I. 1999. The Principia: Mathematical Principles of Natural Philosophy; A New Translation by I. Bernard Cohen and Anne Whitman. Berkeley: University of California Press.
    • Contains Newton’s “rules for the study of natural philosophy”, which includes a version of Ockham’s Razor, defended in terms of the simplicity of nature. These rules play an explicit role in Newton’s argument for universal gravitation.
  • Nolan, D. 1997. Quantitative Parsimony. British Journal for the Philosophy of Science, 48, 329-343.
    • Contra Lewis (1973), argues that quantitative parsimony has been important in the history of science.
  • Norton, J. 2000. ‘Nature is the realization of the simplest conceivable mathematical ideas’: Einstein and the canon of mathematical simplicity. Studies in the History and Philosophy of Modern Physics, 31, 135-170.
    • Discusses the evolution of Einstein’s thinking about the role of mathematical simplicity in physical theorizing.
  • Norton, J. 2003. A material theory of induction. Philosophy of Science, 70, 647-670.
    • Defends a “material” theory of induction. Argues that appeals to simplicity in induction reflect factual assumptions about the domain of inquiry.
  • Oreskes, N., Shrader-Frechette, K., Belitz, K. 1994. Verification, validation, and confirmation of numerical models in the earth sciences. Science, 263, 641-646.
  • Palter, R. 1970. An approach to the history of early astronomy. Studies in History and Philosophy of Science, 1, 93-133.
  • Pais, A. 1982. Subtle Is the Lord: The science and life of Albert Einstein. Oxford: Oxford University Press.
  • Peirce, C.S. 1931. Collected Papers of Charles Sanders Peirce, vol 6. C. Hartshorne, P. Weiss, and A. Burks (eds.). Cambridge, MA: Harvard University Press.
  • Plutynski, A. 2005. Parsimony and the Fisher-Wright debate. Biology and Philosophy, 20, 697-713.
    • Advocates a deflationary analysis of appeals to parsimony in debates between Wrightian and neo-Fisherian models of natural selection.
  • Popper, K. 1959. The Logic of Scientific Discovery. London: Hutchinson.
    • Argues that simplicity = empirical content = falsifiability.
  • Priest, G. 1976. Gruesome simplicity. Philosophy of Science, 43, 432-437.
    • Shows that standard measures of simplicity in curve-fitting are language variant.
  • Raftery, A., Madigan, D., and Hoeting, J. 1997. Bayesian model averaging for linear regression models. Journal of the American Statistical Association, 92, 179-191.
  • Reichenbach, H. 1949. On the justification of induction. In H. Feigl and W. Sellars (eds.), Readings in Philosophical Analysis. New York: Appleton-Century-Crofts.
  • Rosenkrantz, R. 1983. Why Glymour is a Bayesian. In J. Earman (ed.), Testing Scientific Theories. Minneapolis: University of Minnesota Press.
    • Responds to Glymour (1980). Argues that simpler theories have higher likelihoods, using Copernican vs. Ptolemaic astronomy as an example.
  • Rothwell, G. 2006. Notes for the occasional major case manager. FBI Law Enforcement Bulletin, 75, 20-24.
    • Emphasizes the importance of Ockham’s Razor in criminal investigation.
  • Sakamoto, Y., Ishiguro, M., and Kitagawa, G. 1986. Akaike Information Criterion Statistics. New York: Springer.
  • Schaffner, K. 1974. Einstein versus Lorentz: research programmes and the logic of comparative theory evaluation. British Journal for the Philosophy of Science, 25, 45-78.
    • Argues that simplicity played a significant role in the development and early acceptance of special relativity.
  • Schulte, O. 1999. Means-end epistemology. British Journal for the Philosophy of Science, 50, 1-31.
    • First statement of the claim that Ockham’s Razor can be justified in terms of truth-finding efficiency.
  • Simon, H. 1962. The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467-482.
    • Important discussion by a Nobel laureate of features common to complex systems in nature.
  • Sober, E. 1975. Simplicity. Oxford: Oxford University Press.
    • Argues that simplicity can be defined in terms of question-relative informativeness. Technical in places.
  • Sober, E. 1981. The principle of parsimony. British Journal for the Philosophy of Science, 32, 145-156.
    • Distinguishes between “agnostic” and “atheistic” versions of Ockham’s Razor. Argues that the atheistic razor has an inductive justification.
  • Sober, E. 1988. Reconstructing the Past: Parsimony, Evolution and Inference. Cambridge, MA: MIT Press.
    • Defends a deflationary account of simplicity in the context of the use of parsimony methods in evolutionary biology.
  • Sober, E. 1994. Let’s razor Ockham’s Razor. In E. Sober, From a Biological Point of View, Cambridge: Cambridge University Press.
    • Argues that the use of Ockham’s Razor is grounded in local background assumptions.
  • Sober, E. 2001a. What is the problem of simplicity? In H. Keuzenkamp, M. McAleer, and A. Zellner (eds.), Simplicity, Inference and Modelling. Cambridge: Cambridge University Press.
  • Sober, E. 2001b. Simplicity. In W.H. Newton-Smith (ed.), A Companion to the Philosophy of Science, Oxford: Blackwell.
  • Sober, E. 2007. Evidence and Evolution. New York: Cambridge University Press.
  • Solomonoff, R.J. 1964. A formal theory of inductive inference, part 1 and part 2. Information and Control, 7, 1-22, 224-254.
  • Suppes, P. 1956. Nelson Goodman on the concept of logical simplicity. Philosophy of Science, 23, 153-159.
  • Swinburne, R. 2001. Epistemic Justification. Oxford: Oxford University Press.
    • Argues that the principle that simpler theories are more probably true is a fundamental a priori principle.
  • Thagard, P. 1988. Computational Philosophy of Science. Cambridge, MA: MIT Press.
    • Simplicity is a determinant of the goodness of an explanation and can be measured in terms of the paucity of auxiliary assumptions relative to the number of facts explained.
  • Thorburn, W. 1918. The myth of Occam’s Razor. Mind, 27, 345-353.
    • Argues that William of Ockham would not have advocated many of the principles that have been attributed to him.
  • van Fraassen, B. 1989. Laws and Symmetry. Oxford: Oxford University Press.
  • Wallace, C. S. and Dowe, D. L. 1999. Minimum Message Length and Kolmogorov Complexity. Computer Journal, 42(4), 270–83.
  • Walsh, D. 1979. Occam’s Razor: A Principle of Intellectual Elegance. American Philosophical Quarterly, 16, 241-244.
  • Weinberg, S. 1993. Dreams of a Final Theory. New York: Vintage.
    • Argues that physicists demand simplicity in physical principles before they can be taken seriously.
  • White, R. 2005. Why favour simplicity? Analysis, 65, 205-210.
    • Attempts to justify preferences for simpler theories in virtue of such theories having higher likelihoods.
  • Zellner, A., Keuzenkamp, H., and McAleer, M. 2001. Simplicity, Inference and Modelling. Cambridge: Cambridge University Press.
    • Collection of papers by statisticians, philosophers, and economists on the role of simplicity in scientific inference and modelling.

Author Information

Simon Fitzpatrick
John Carroll University
U. S. A.

Zeno’s Paradoxes

In the fifth century B.C.E., Zeno of Elea offered arguments that led to conclusions contradicting what we all know from our physical experience: that runners run, that arrows fly, and that there are many different things in the world. The arguments were paradoxes for the ancient Greek philosophers. Because most of the arguments turn crucially on the notion that space and time are infinitely divisible (for example, that for any distance there is such a thing as half that distance, and so on), Zeno was the first person in history to show that the concept of infinity is problematical.

In his Achilles Paradox, Achilles races to catch a slower runner–for example, a tortoise that is crawling away from him. The tortoise has a head start, so if Achilles hopes to overtake it, he must run at least to the place where the tortoise presently is, but by the time he arrives there, it will have crawled to a new place, so then Achilles must run to this new place, but the tortoise meanwhile will have crawled on, and so forth. Achilles will never catch the tortoise, says Zeno. Therefore, good reasoning shows that fast runners never can catch slow ones. So much the worse for the claim that motion really occurs, Zeno says in defense of his mentor Parmenides who had argued that motion is an illusion.

Although practically no scholars today would agree with Zeno’s conclusion, we cannot escape the paradox by jumping up from our seat and chasing down a tortoise, nor by saying Achilles should run to some other target place ahead of where the tortoise is at the moment. What is required is an analysis of Zeno's own argument that neither embroils us in new paradoxes nor impoverishes our mathematics and science.

This article explains his ten known paradoxes and considers the treatments that have been offered. Zeno assumed distances and durations can be divided into an actual infinity (what we now call a transfinite infinity) of indivisible parts, and he assumed these are too many for the runner to complete. Aristotle's treatment said Zeno should have assumed there are only potential infinities, and that neither places nor times divide into indivisible parts. His treatment became the generally accepted solution until the late 19th century. The current standard treatment says Zeno was right to conclude that a runner's path contains an actual infinity of parts, but he was mistaken to assume this is too many. This treatment employs the apparatus of calculus, which has proved its indispensability for the development of modern science. In the twentieth century it became clear to most researchers that disallowing actual infinities, as Aristotle wanted, hampers the growth of set theory and ultimately of mathematics and physics. This standard treatment took hundreds of years to perfect and was due to the flexibility of intellectuals who were willing to replace old theories and their concepts with more fruitful ones, despite the damage done to common sense and our naive intuitions. The article ends by exploring newer treatments of the paradoxes, and of related paradoxes such as Thomson's Lamp Paradox, that have been developed since the 1950s.

Table of Contents

  1. Zeno of Elea
    1. His Life
    2. His Book
    3. His Goals
    4. His Method
  2. The Standard Solution to the Paradoxes
  3. The Ten Paradoxes
    1. Paradoxes of Motion
      1. The Achilles
      2. The Dichotomy (The Racetrack)
      3. The Arrow
      4. The Moving Rows (The Stadium)
    2. Paradoxes of Plurality
      1. Alike and Unlike
      2. Limited and Unlimited
      3. Large and Small
      4. Infinite Divisibility
    3. Other Paradoxes
      1. The Grain of Millet
      2. Against Place
  4. Aristotle’s Treatment of the Paradoxes
  5. Other Issues Involving the Paradoxes
    1. Consequences of Accepting the Standard Solution
    2. Criticisms of the Standard Solution
    3. Supertasks and Infinity Machines
    4. Constructivism
    5. Nonstandard Analysis
    6. Smooth Infinitesimal Analysis
  6. The Legacy and Current Significance of the Paradoxes
  7. References and Further Reading

1. Zeno of Elea

a. His Life

Zeno was born in about 490 B.C.E. in Elea, now Velia, in southern Italy; and he died in about 430 B.C.E. He was a friend and student of Parmenides, who was twenty-five years older and also from Elea. There is little additional, reliable information about Zeno’s life. Plato remarked (in Parmenides 127b) that Parmenides took Zeno to Athens with him where he encountered Socrates, who was about twenty years younger than Zeno, but today’s scholars consider this encounter to have been invented by Plato to improve the story line. Zeno is reported to have been arrested for taking weapons to rebels opposed to the tyrant who ruled Elea. When asked about his accomplices, Zeno said he wished to whisper something privately to the tyrant. But when the tyrant came near, Zeno bit him, and would not let go until he was stabbed. Diogenes Laërtius reported this apocryphal story seven hundred years after Zeno’s death.

b. His Book

According to Plato’s commentary in his Parmenides (127a to 128e), Zeno brought a treatise with him when he visited Athens. It was said to be a book of paradoxes defending the philosophy of Parmenides. Plato and Aristotle may have had access to the book, but Plato did not state any of the arguments, and Aristotle’s presentations of the arguments are very compressed. A thousand years after Zeno, the Greek philosophers Proclus and Simplicius commented on the book and its arguments. They had access to some of the book, perhaps to all of it, but it has not survived. Proclus is the first person to tell us that the book contained forty arguments. This number is confirmed by the sixth century commentator Elias, who is regarded as an independent source because he does not mention Proclus. Unfortunately, we know of no specific dates for when Zeno composed any of his paradoxes, and we know very little of how Zeno stated his own paradoxes. We do have a direct quotation via Simplicius of the Paradox of Denseness and a partial quotation via Simplicius of the Large and Small Paradox. In total we know of fewer than two hundred words that can be attributed to Zeno. Our knowledge of these two paradoxes and the other eight comes to us indirectly through paraphrases of them, and comments on them, primarily by Aristotle (384-322 B.C.E.), but also by Plato (427-347 B.C.E.), Proclus (410-485 C.E.), and Simplicius (490-560 C.E.). The names of the paradoxes were created by commentators, not by Zeno.

c. His Goals

In the early fifth century B.C.E., Parmenides emphasized the distinction between appearance and reality. Reality, he said, is a seamless unity that is unchanging and cannot be destroyed, so appearances of reality are deceptive. Our ordinary observation reports are false; they do not report what is real. This metaphysical theory is the opposite of Heraclitus’ theory, but evidently it was supported by Zeno. Although we do not know from Zeno himself whether he accepted his own paradoxical arguments or what point he was making with them, according to Plato the paradoxes were designed to provide detailed, supporting arguments for Parmenides by demonstrating that our common sense confidence in the reality of motion, change, and ontological plurality (that is, that there exist many things) involves absurdities. Plato’s classical interpretation of Zeno was accepted by Aristotle and by most other commentators throughout the intervening centuries.

Eudemus, a student of Aristotle, offered another interpretation. He suggested that Zeno was challenging both pluralism and Parmenides’ idea of monism, which would imply that Zeno was a nihilist. Paul Tannery in 1885 and Wallace Matson in 2001 offer a third interpretation of Zeno’s goals regarding the paradoxes of motion. Plato and Aristotle did not understand Zeno’s arguments nor his purpose, they say. Zeno was actually challenging the Pythagoreans and their particular brand of pluralism, not Greek common sense. Zeno was not trying to directly support Parmenides. Instead, he intended to show that Parmenides’ opponents are committed to denying the very motion, change, and plurality they believe in, and Zeno’s arguments were completely successful. This controversial issue about interpreting Zeno’s purposes will not be pursued further in this article, and Plato’s classical interpretation will be assumed.

d. His Method

Before Zeno, Greek thinkers favored presenting their philosophical views by writing poetry. Zeno began the grand shift away from poetry toward a prose that contained explicit premises and conclusions. And he employed the method of indirect proof in his paradoxes by temporarily assuming some thesis that he opposed and then attempting to deduce an absurd conclusion or a contradiction, thereby undermining the temporary assumption. This method of indirect proof or reductio ad absurdum probably originated with his teacher Parmenides (although this is disputed in the scholarly literature), but Zeno used it more systematically.

2. The Standard Solution to the Paradoxes

Any paradox can be treated by abandoning enough of its crucial assumptions. For Zeno's paradoxes it is especially interesting to consider which assumptions to abandon, and why those. A paradox is an argument that reaches a contradiction by apparently legitimate steps from apparently reasonable assumptions, while the experts at the time cannot agree on the way out of the paradox, that is, agree on its resolution. It is this latter point about disagreement among the experts that distinguishes a paradox from a mere puzzle in the ordinary sense of that term. Zeno’s paradoxes are now generally considered to be puzzles because of the wide agreement among today’s experts that there is at least one acceptable resolution of the paradoxes.

This resolution is called the Standard Solution. It presupposes calculus, the rest of standard real analysis, and classical mechanics. It assumes that physical processes are sets of point-events. It implies that motions, durations, distances and line segments are all linear continua composed of points, then uses these ideas to challenge various assumptions made, and steps taken, by Zeno. To be very brief and anachronistic, Zeno's mistake (and Aristotle's mistake) was not to have used calculus. More specifically, in the case of the paradoxes of motion such as the Achilles and the Dichotomy, Zeno's mistake was not his assuming there is a completed infinity of places for the runner to go, which was what Aristotle said was Zeno's mistake; Zeno's and Aristotle's mistake was in assuming that this is too many places (for the runner to go to in a finite time).

A key background assumption of the Standard Solution is that this resolution is not simply employing some concepts that will undermine Zeno’s reasoning–Aristotle's reasoning does that, too, at least for most of the paradoxes–but that it is employing concepts which have been shown to be appropriate for the development of a coherent and fruitful system of mathematics and physical science. Aristotle's treatment of the paradoxes does not employ these fruitful concepts. The Standard Solution is much more complicated than Aristotle's treatment, and no single person can be credited with creating it.

The Standard Solution uses calculus. In calculus we need to speak of one event happening pi seconds after another, and of one event happening the square root of three seconds after another. In ordinary discourse outside of science we would never need this kind of precision. The need for this precision has led to requiring time to be a linear continuum, very much like a segment of the real number line.

Calculus was invented in the late 1600s by Newton and Leibniz. Their calculus is a technique for treating continuous motion as being composed of an infinite number of infinitesimal steps. After the acceptance of calculus, almost all mathematicians and physicists believed that continuous motion, including Achilles' motion, should be modeled by a function which takes real numbers representing time as its argument and which gives real numbers representing spatial position as its value. This position function should be continuous or gap-free. In addition, the position function should be differentiable or smooth in order to make sense of speed, the rate of change of position. By the early 20th century most mathematicians had come to believe that, to make rigorous sense of motion, mathematics needs a fully developed set theory that rigorously defines the key concepts of real number, continuity and differentiability. Doing this requires a well-defined concept of the continuum. Unfortunately Newton and Leibniz did not have a good definition of the continuum, and finding a good one required over two hundred years of work.
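
The idea of modeling motion by a position function can be made concrete with a minimal sketch. The numbers below (a 100-meter head start, Achilles at 10 meters per second, the tortoise at 1 meter per second) and all function names are illustrative assumptions, not values drawn from Zeno or his commentators. The sketch solves for the instant at which the two position functions agree, and it also lists the cumulative time at the end of each of Zeno's stages, which stays below that instant while approaching it.

```python
from fractions import Fraction


def achilles_position(t, speed=Fraction(10)):
    """Position of Achilles at time t (illustrative constant speed)."""
    return speed * t


def tortoise_position(t, head_start=Fraction(100), speed=Fraction(1)):
    """Position of the tortoise at time t (illustrative head start and speed)."""
    return head_start + speed * t


def catch_up_time(head_start, v_achilles, v_tortoise):
    """Solve v_achilles * t = head_start + v_tortoise * t for t."""
    if v_achilles <= v_tortoise:
        raise ValueError("Achilles must be faster than the tortoise")
    return Fraction(head_start) / (Fraction(v_achilles) - Fraction(v_tortoise))


def zeno_stage_times(head_start, v_achilles, v_tortoise, stages):
    """Cumulative elapsed time at the end of each of Zeno's stages."""
    times = []
    elapsed = Fraction(0)
    gap = Fraction(head_start)
    for _ in range(stages):
        stage = gap / v_achilles      # time to reach where the tortoise was
        elapsed += stage
        gap = v_tortoise * stage      # the tortoise's new lead after that stage
        times.append(elapsed)
    return times


if __name__ == "__main__":
    t_star = catch_up_time(100, 10, 1)
    print("catch-up time:", t_star)
    for n, t in enumerate(zeno_stage_times(100, 10, 1, 6), start=1):
        print(f"after stage {n}: t = {t}, still short by {t_star - t}")
```

Exact fractions are used so that the shrinking shortfall is not blurred by floating-point rounding. Under these assumed speeds, every stage ends before the catch-up instant, yet the infinitely many stages together take only a finite time.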

The continuum is a very special set; it is the standard model of the real numbers. Intuitively, a continuum is a continuous entity; it is a whole thing that has no gaps. Some examples of a continuum are the path of a runner’s center of mass, the time elapsed during this motion, ocean salinity, and the temperature along a metal rod. Distances and durations are normally considered to be real continua whereas treating the ocean salinity and the rod's temperature as continua is a very useful approximation for many calculations in physics even though we know that at the atomic level the approximation breaks down.

The distinction between “a” continuum and “the” continuum is that “the” continuum is the paradigm of “a” continuum. The continuum is the mathematical line, the line of geometry, which is standardly understood to have the same structure as the real numbers in their natural order. Real numbers and points on the continuum can be put into a one-to-one order-preserving correspondence. There are not enough rational numbers for this correspondence even though the rational numbers are dense, too (in the sense that between any two rational numbers there is another rational number).

For Zeno’s paradoxes, standard analysis assumes that length should be defined in terms of measure, and motion should be defined in terms of the derivative. These definitions are given in terms of the linear continuum. The most important features of any linear continuum are that (a) it is composed of points, (b) it is an actually infinite set, that is, a transfinite set, and not merely a potentially infinite set that gets bigger over time, (c) it is undivided yet infinitely divisible (that is, it is gap-free), (d) the points are so close together that no point can have a point immediately next to it, (e) between any two points there are other points, (f) the measure (such as length) of a continuum is not a matter of adding up the measures of its points nor adding up the number of its points, (g) any connected part of a continuum is also a continuum, and (h) between any two distinct points there are uncountably many points, as many as there are real numbers (this number is aleph-one if the continuum hypothesis is true).

Physical space is not a linear continuum because it is three-dimensional and not linear; but it has one-dimensional subspaces such as paths of runners and orbits of planets; and these are linear continua if we use the path created by only one point on the runner and the orbit created by only one point on the planet. Regarding time, each (point) instant is assigned a real number as its time, and each instant is assigned a duration of zero. The time taken by Achilles to catch the tortoise is a temporal interval, a linear continuum of instants, according to the Standard Solution (but not according to Zeno or Aristotle). The Standard Solution says that the sequence of Achilles' goals (the goals of reaching the point where the tortoise is) should be abstracted from a pre-existing transfinite set, namely a linear continuum of point places along the tortoise's path. Aristotle's treatment does not do this. The next section of this article presents the details of how the concepts of the Standard Solution are used to resolve each of Zeno's Paradoxes.

Of the ten known paradoxes, The Achilles attracted the most attention over the centuries. Aristotle’s treatment of the paradox involved accusing Zeno of using the concept of an actual or completed infinity instead of the concept of a potential infinity, and accusing Zeno of failing to appreciate that a line cannot be composed of points. Aristotle’s treatment is described in detail below. It was generally accepted until the 19th century, but slowly lost ground to the Standard Solution. Some historians say he had no solution but only a verbal quibble. This article takes no side on this dispute and speaks of Aristotle’s “treatment.”

The development of calculus was the most important step in the Standard Solution of Zeno's paradoxes, so why did it take so long for the Standard Solution to be accepted after Newton and Leibniz developed their calculus? The period lasted about two hundred years. There are four reasons. (1) It took time for calculus and the rest of real analysis to prove their applicability and fruitfulness in physics. (2) It took time for the relative shallowness of Aristotle’s treatment to be recognized. (3) It took time for philosophers of science to appreciate that each theoretical concept used in a physical theory need not have its own correlate in our experience. (4) It took time for certain problems in the foundations of mathematics to be resolved, such as finding a better definition of the continuum and avoiding the paradoxes of Cantor's naive set theory.

Point (2) is discussed in section 4 below.

Point (3) is about the time it took for philosophers of science to reject the demand, favored by Ernst Mach and many Logical Positivists, that meaningful terms in science must have “empirical meaning.” This was the demand that each physical concept be separately definable with observation terms. It was thought that, because our experience is finite, the term “actual infinite” or "completed infinity" could not have empirical meaning, but “potential infinity” could. Today, most philosophers would not restrict meaning to empirical meaning. However, for an interesting exception see Dummett (2000) which contains a theory in which time is composed of overlapping intervals rather than durationless instants, and in which the endpoints of those intervals are the initiation and termination of actual physical processes. This idea of treating time without instants develops a 1936 proposal of Russell and Whitehead. The central philosophical issue about Dummett's treatment of motion is how its adoption would affect other areas of mathematics and science.

Point (1) is about the time it took for classical mechanics to develop to the point where it was accepted as giving correct solutions to problems involving motion. Point (1) was challenged in the metaphysical literature on the grounds that the abstract account of continuity in real analysis does not truly describe either time, space or concrete physical reality. This challenge is discussed in later sections.

Point (4) arises because the standard of rigorous proof and rigorous definition of concepts has increased over the years. As a consequence, the difficulties in the foundations of real analysis, which began with George Berkeley’s criticism of inconsistencies in the use of infinitesimals in the calculus of Leibniz (and fluxions in the calculus of Newton), were not satisfactorily resolved until the early 20th century with the development of Zermelo-Fraenkel set theory. The key idea was to work out the necessary and sufficient conditions for being a continuum. To achieve the goal, the conditions for being a mathematical continuum had to be strictly arithmetical and not dependent on our intuitions about space, time and motion. The idea was to revise or “tweak” the definition until it would not create new paradoxes and would still give useful theorems. When this revision was completed, it could be declared that the set of real numbers is an actual infinity, not a potential infinity, and that not only is any interval of real numbers a linear continuum, but so are the spatial paths, the temporal durations, and the motions that are mentioned in Zeno’s paradoxes. In addition, it was important to clarify how to compute the sum of an infinite series (such as 1/2 + 1/4 + 1/8 + ...) and how to define motion in terms of the derivative. This new mathematical system required new or better-defined mathematical concepts of compact set, connected set, continuity, continuous function, convergence-to-a-limit of an infinite sequence (such as 1/2, 1/4, 1/8, ...), curvature at a point, cut, derivative, dimension, function, integral, limit, measure, reference frame, set, and size of a set. Similarly, rigor was added to the definitions of the physical concepts of place, instant, duration, distance, and instantaneous speed. 
The relevant revisions were made by Euler in the 18th century and by Bolzano, Cantor, Cauchy, Dedekind, Frege, Hilbert, Lebesgue, Peano, Russell, Weierstrass, and Whitehead, among others, during the 19th and early 20th centuries.
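
As a brief worked illustration of what it means to compute the sum of the infinite series 1/2 + 1/4 + 1/8 + ..., here is the usual limit-of-partial-sums account. The symbols s_n, ε, and N are notation introduced here for the illustration and are not taken from the article.

```latex
% Partial sums of the series 1/2 + 1/4 + 1/8 + ...
s_n \;=\; \sum_{k=1}^{n} \frac{1}{2^{k}} \;=\; 1 - \frac{1}{2^{n}}

% The sum of the series is defined as the limit of the partial sums:
\sum_{k=1}^{\infty} \frac{1}{2^{k}} \;=\; \lim_{n \to \infty} s_n \;=\; 1

% Justification: for any \varepsilon > 0 choose N with 2^{-N} < \varepsilon; then
\left| 1 - s_n \right| \;=\; \frac{1}{2^{n}} \;<\; \varepsilon
\quad \text{for all } n \ge N .
```

On this account no one is asked to perform infinitely many additions. The sum is defined to be the number that the finite partial sums approach as closely as desired.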

What about Leibniz's infinitesimals or Newton's fluxions? Let's stick with infinitesimals, since fluxions have the same problems and same resolution. In 1734, Berkeley had properly criticized the use of infinitesimals as being "ghosts of departed quantities" that are used inconsistently in calculus. Earlier Newton had defined instantaneous speed as the ratio of an infinitesimally small distance and an infinitesimally small duration, and he and Leibniz produced a system of calculating variable speeds that was very fruitful. But nobody in that century or the next could adequately explain what an infinitesimal was. Newton had called them “evanescent divisible quantities,” whatever that meant. Leibniz called them “vanishingly small,” but that was just as vague. The practical use of infinitesimals was unsystematic. For example, the infinitesimal dx is treated as being equal to zero when it is declared that x + dx = x, but is treated as not being zero when used in the denominator of the fraction [f(x + dx) - f(x)]/dx which is the derivative of the function f. In addition, consider the seemingly obvious Archimedean property of pairs of positive numbers: given any two positive numbers A and B, if you add enough copies of A, then you can produce a sum greater than B. This property fails if A is an infinitesimal. Finally, mathematicians gave up on answering Berkeley’s charges (and thus re-defined what we mean by standard analysis) because, in 1821, Cauchy showed how to achieve the same useful theorems of calculus by using the idea of a limit instead of an infinitesimal. Later in the 19th century, Weierstrass resolved some of the inconsistencies in Cauchy’s account and satisfactorily showed how to define continuity in terms of limits (his epsilon-delta method). As J. O. Wisdom points out (1953, p. 23), “At the same time it became clear that [Leibniz's and] Newton’s theory, with suitable amendments and additions, could be soundly based.” In an effort to provide this sound basis according to the latest, heightened standard of what counts as “sound,” Peano, Frege, Hilbert, and Russell attempted to properly axiomatize real analysis. This led in 1901 to Russell’s paradox and the fruitful controversy about how to provide a foundation to all of mathematics. That controversy still exists, but the majority view is that axiomatic Zermelo-Fraenkel set theory with the axiom of choice blocks all the paradoxes, legitimizes Cantor’s theory of transfinite sets, and provides the proper foundation for real analysis and other areas of mathematics. This standard real analysis lacks infinitesimals, thanks to Cauchy and Weierstrass. Standard real analysis is the mathematics that the Standard Solution applies to Zeno’s Paradoxes.
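
The contrast between the infinitesimal treatment and the limit treatment can be made concrete. The following is a sketch in modern textbook notation; the symbols h, g, L, ε, and δ, and the sample function f(x) = x², are illustrative choices and are not drawn from Cauchy's or Weierstrass's own texts.

```latex
% The derivative defined as a limit rather than as a ratio of infinitesimals:
f'(x) \;=\; \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}

% Worked example with f(x) = x^2. For every h \neq 0:
\frac{(x+h)^2 - x^2}{h} \;=\; \frac{2xh + h^2}{h} \;=\; 2x + h,
\qquad\text{so}\qquad
f'(x) \;=\; \lim_{h \to 0} (2x + h) \;=\; 2x.

% The epsilon-delta meaning of the limit statement:
\lim_{h \to 0} g(h) = L
\;\iff\;
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall h \;
\bigl( 0 < |h| < \delta \;\Rightarrow\; |g(h) - L| < \varepsilon \bigr).
```

In this formulation h is never set equal to zero and is never an infinitesimal. It ranges over ordinary nonzero real numbers, so the quantity in the denominator is not treated as zero in one step and as nonzero in another.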

The rational numbers are not continuous although they are infinitely numerous and infinitely dense. To come up with a foundation for calculus there had to be a good definition of the continuity of the real numbers. But this required having a good definition of irrational numbers. There wasn’t one before 1872. Dedekind’s definition in 1872 defines the mysterious irrationals in terms of the familiar rationals. The result was a clear and useful definition of real numbers. The usefulness of Dedekind's definition of real numbers, and the lack of any better definition, convinced many mathematicians to be more open to accepting actually-infinite sets.

We won't explore the definitions of continuity here, but what Dedekind discovered about the reals and their relationship to the rationals was how to define a real number to be a cut of the rational numbers, where a cut is a certain ordered pair of actually-infinite sets of rational numbers.

A Dedekind cut (A,B) is defined to be a partition or cutting of the set of all the rational numbers into a left part A and a right part B. A and B are non-empty subsets, such that all rational numbers in A are less than all rational numbers in B, and also A contains no greatest number. Every real number is a unique Dedekind cut. The cut can be made at a rational number or at an irrational number. Here are examples of each:

Dedekind's real number 1/2 is ({x : x < 1/2} , {x: x ≥ 1/2}).

Dedekind's positive real number √2 is ({x : x < 0 or x² < 2} , {x : x > 0 and x² ≥ 2}).

Notice that the rational real number 1/2 is within its B set, but the irrational real number √2 is not within its B set because B contains only rational numbers. That property is what distinguishes rationals from irrationals, according to Dedekind.

For any cut (A,B), if B has a smallest number, then the real number for that cut corresponds to this smallest number, as in the definition of ½ above. Otherwise, the cut defines an irrational number which, loosely speaking, fills the gap between A and B, as in the definition of the square root of 2 above.
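
To make the two examples concrete, here is a minimal Python sketch, assuming rational numbers are represented by the standard library's Fraction type. The function names are hypothetical and chosen only for illustration. The last function shows why the upper set of the √2 cut has no smallest member: from any rational in that set it produces a strictly smaller rational that is still in the set.

```python
from fractions import Fraction


def in_lower_half(q: Fraction) -> bool:
    """Membership test for the lower set A of the cut that defines 1/2."""
    return q < Fraction(1, 2)


def in_lower_sqrt2(q: Fraction) -> bool:
    """Membership test for the lower set A of the cut that defines sqrt(2)."""
    return q < 0 or q * q < 2


def smaller_upper_member(q: Fraction) -> Fraction:
    """Given a rational q in the upper set B of the sqrt(2) cut, return a
    strictly smaller rational that is also in B.

    Uses the average of q and 2/q (an illustrative choice, not Dedekind's
    own construction). If q > 0 and q*q > 2, then 2/q < q, so the average
    is below q, and its square exceeds 2 by ((q - 2/q) / 2) ** 2.
    """
    if in_lower_sqrt2(q):
        raise ValueError("q must belong to the upper set B")
    return (q + 2 / q) / 2


if __name__ == "__main__":
    print(in_lower_half(Fraction(1, 2)))         # 1/2 itself lies in B, not A
    print(smaller_upper_member(Fraction(3, 2)))  # a smaller member of B
```

The cut for 1/2 has a least member in its upper set, namely 1/2 itself, whereas repeated calls to smaller_upper_member never reach a least member of the upper set for √2.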

By defining reals in terms of rationals this way, Dedekind gave a foundation to the reals, and legitimized them by showing they are as acceptable as actually-infinite sets of rationals.

But what exactly is an actually-infinite or transfinite set, and does this idea lead to contradictions? This question needs an answer if there is to be a good theory of continuity and of real numbers. In the 1870s, Cantor clarified what an actually-infinite set is and made a convincing case that the concept does not lead to inconsistencies. These accomplishments by Cantor are why he (along with Dedekind and Weierstrass) is said by Russell to have “solved Zeno’s Paradoxes.”

That solution recommends using very different concepts and theories than those used by Zeno. The argument that this is the correct solution was presented by many people, but it was especially influenced by the work of Bertrand Russell (1914, lecture 6) and the more detailed work of Adolf Grünbaum (1967). In brief, the argument for the Standard Solution is that we have solid grounds for believing our best scientific theories, but the theories of mathematics such as calculus and Zermelo-Fraenkel set theory are indispensable to these theories, so we have solid grounds for believing in them, too. The scientific theories require a resolution of Zeno’s paradoxes and the other paradoxes; and the Standard Solution to Zeno's Paradoxes that uses standard calculus and Zermelo-Fraenkel set theory is indispensable to this resolution or at least is the best resolution, or, if not, then we can be fairly sure there is no better solution, or, if not that either, then we can be confident that the solution is good enough (for our purposes). Aristotle's treatment, on the other hand, uses concepts that hamper the growth of mathematics and science. Therefore, we should accept the Standard Solution.

In the next section, this solution will be applied to each of Zeno’s ten paradoxes.

To be optimistic, the Standard Solution represents a counterexample to the claim that philosophical problems never get solved. To be less optimistic, the Standard Solution has its drawbacks and its alternatives, and these have generated new and interesting philosophical controversies beginning in the last half of the 20th century, as will be seen in later sections. The primary alternatives contain different treatments of calculus from that developed at the end of the 19th century. Whether this implies that Zeno’s paradoxes have multiple solutions or only one is still an open question.

Did Zeno make mistakes? And was he superficial or profound? These questions are a matter of dispute in the philosophical literature. The majority position is as follows. If we give his paradoxes a sympathetic reconstruction, he correctly demonstrated that some important, classical Greek concepts are logically inconsistent, and he did not make a mistake in doing this, except in the Moving Rows Paradox, the Paradox of Alike and Unlike and the Grain of Millet Paradox, his weakest paradoxes. Zeno did assume that the classical Greek concepts were the correct concepts to use in reasoning about his paradoxes, and now we prefer revised concepts, though it would be unfair to say he blundered for not foreseeing later developments in mathematics and physics.

3. The Ten Paradoxes

Zeno probably created forty paradoxes, of which only the following ten are known. Only the first four have standard names, and the first two have received the most attention. The ten are of uneven quality. Zeno and his ancient interpreters usually stated his paradoxes badly, so it has taken some clever reconstruction over the years to reveal their full force. Below, the paradoxes are reconstructed sympathetically, and then the Standard Solution is applied to them. These reconstructions use just one of several reasonable schemes for presenting the paradoxes, but the present article does not explore the historical research about the variety of interpretive schemes and their relative plausibility.

a. Paradoxes of Motion

i. The Achilles

Achilles, who is the fastest runner of antiquity, is racing to catch the tortoise that is slowly crawling away from him. Both are moving along a linear path at constant speeds. In order to catch the tortoise, Achilles will have to reach the place where the tortoise presently is. However, by the time Achilles gets there, the tortoise will have crawled to a new location. Achilles will then have to reach this new location. By the time Achilles reaches that location, the tortoise will have moved on to yet another location, and so on forever. Zeno claims Achilles will never catch the tortoise. He might have defended this conclusion in various ways—by saying it is because the sequence of goals or locations has no final member, or requires too much distance to travel, or requires too much travel time, or requires too many tasks. However, if we do believe that Achilles succeeds and that motion is possible, then we are victims of illusion, as Parmenides says we are.

The source for Zeno's views is Aristotle (Physics 239b14-16) and some passages from Simplicius in the fifth century C.E. There is no evidence that Zeno used a tortoise rather than a slow human. The tortoise is a commentator’s addition. Aristotle spoke simply of “the runner” who competes with Achilles.

It won’t do to react and say the solution to the paradox is that there are biological limitations on how small a step Achilles can take. Achilles’ feet aren’t obligated to stop and start again at each of the locations described above, so there is no limit to how close one of those locations can be to another. It is best to think of the change from one location to another as a movement rather than as incremental steps requiring halting and starting again. Zeno is assuming that space and time are infinitely divisible; they are not discrete or atomistic. If they were, the Paradox's argument would not work.

One common complaint with Zeno’s reasoning is that he is setting up a straw man because it is obvious that Achilles cannot catch the tortoise if he continually takes a bad aim toward the place where the tortoise is; he should aim farther ahead. The mistake in this complaint is that even if Achilles took some sort of better aim, it is still true that he is required to go to every one of those locations that are the goals of the so-called “bad aims,” so Zeno's argument needs a better treatment.

The treatment called the "Standard Solution" to the Achilles Paradox uses calculus and other parts of real analysis to describe the situation. It implies that Zeno is assuming in the Achilles situation that Achilles cannot achieve his goal because

(1) there is too far to run, or

(2) there is not enough time, or

(3) there are too many places to go, or

(4) there is no final step, or

(5) there are too many tasks.

The historical record does not tell us which of these was Zeno's real assumption, but they are all false assumptions, according to the Standard Solution. Let's consider (1). Presumably Zeno would defend the assumption by remarking that the sum of the distances along so many of the runs to where the tortoise is must be infinite, which is too much for even Achilles. However, the advocate of the Standard Solution will remark, "How does Zeno know what the sum of this infinite series is?" According to the Standard Solution the sum is not infinite. Here is a graph using the methods of the Standard Solution showing the activity of Achilles as he chases the tortoise and overtakes it.

[Figure: graph of Achilles and the Tortoise]

To describe this graph in more detail, we need to say that Achilles' path [the path of some dimensionless point of Achilles' body] is a linear continuum and so is composed of an actual infinity of points. (An actual infinity is also called a "completed infinity" or "transfinite infinity," and the word "actual" does not mean "real" as opposed to "imaginary.") Since Zeno doesn't make this assumption, that is another source of error in Zeno's reasoning. Achilles travels a distance d1 in reaching the point x1 where the tortoise starts, but by the time Achilles reaches x1, the tortoise has moved on to a new point x2. When Achilles reaches x2, having gone an additional distance d2, the tortoise has moved on to point x3, requiring Achilles to cover an additional distance d3, and so forth. This sequence of non-overlapping distances (or intervals or sub-paths) is an actual infinity, but happily the geometric series converges. The sum of its terms d1 + d2 + d3 +… is a finite distance that Achilles can readily complete while moving at a constant speed.
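
The convergence claim can be checked with a short calculation. The following Python sketch assumes hypothetical numbers that do not appear in the historical sources: a head start of 100 meters, Achilles running at 10 meters per second, and the tortoise crawling at 1 meter per second. Speeds and head start are taken to be integers so that exact rational arithmetic can be used.

```python
from fractions import Fraction


def achilles_stages(head_start: int, v_achilles: int, v_tortoise: int, n: int):
    """Return the first n distances d1, d2, ..., dn that Achilles must run.

    While Achilles covers a stage of length d, the tortoise advances
    d * (v_tortoise / v_achilles), which becomes the next stage.
    """
    ratio = Fraction(v_tortoise, v_achilles)
    d = Fraction(head_start)
    stages = []
    for _ in range(n):
        stages.append(d)
        d *= ratio
    return stages


def catch_up_distance(head_start: int, v_achilles: int, v_tortoise: int) -> Fraction:
    """Sum of the whole geometric series d1 + d2 + d3 + ... in closed form."""
    ratio = Fraction(v_tortoise, v_achilles)
    if ratio >= 1:
        raise ValueError("Achilles must be faster than the tortoise")
    return Fraction(head_start) / (1 - ratio)


if __name__ == "__main__":
    total = catch_up_distance(100, 10, 1)
    partial = sum(achilles_stages(100, 10, 1, 20))
    print(total)            # exact distance at which Achilles draws level
    print(total - partial)  # what remains after 20 stages
    print(total / 10)       # elapsed time in seconds at speed 10
```

Under these assumed numbers every partial sum falls short of the closed-form total, and the shortfall shrinks by a factor of ten with each stage, which is the sense in which the infinite series has a finite sum.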

Similar reasoning would apply if Zeno were to have made assumption (2) or (3). Regarding (4), the requirement that there be a final step or final sub-path is simply mistaken, according to the Standard Solution. More will be said about assumption (5) in Section 5c.

By the way, the Paradox does not require the tortoise to crawl at a constant speed but only to never stop crawling and for Achilles to travel faster on average than the tortoise. The assumption of constant speed is made simply for ease of understanding.

The Achilles Argument presumes that space and time are infinitely divisible. So, Zeno's conclusion may not simply have been that Achilles cannot catch the tortoise but instead that he cannot catch the tortoise if space and time are infinitely divisible. Perhaps, as some commentators have speculated, Zeno used the Achilles only to attack continuous space, and he intended his other paradoxes such as "The Moving Rows" to attack discrete space. The historical record is not clear. Notice that, although space and time are infinitely divisible for Zeno, he did not have the concepts to properly describe the limit of the infinite division. Neither Zeno nor any of the other ancient Greeks had the concept of a dimensionless point; they did not even have the concept of zero. However, today's versions of Zeno's Paradoxes can and do use those concepts.

ii. The Dichotomy (The Racetrack)

In his Progressive Dichotomy Paradox, Zeno argued that a runner will never reach the stationary goal line of a racetrack. The reason is that the runner must first reach half the distance to the goal, but when there he must still cross half the remaining distance to the goal, but having done that the runner must cover half of the new remainder, and so on. If the goal is one meter away, the runner must cover a distance of 1/2 meter, then 1/4 meter, then 1/8 meter, and so on ad infinitum. The runner cannot reach the final goal, says Zeno. Why not? There are few traces of Zeno's reasoning here, but for reconstructions that give the strongest reasoning, we may say that the runner will not reach the final goal because there is too far to run; the sum is actually infinite. The Standard Solution argues instead that the sum of this infinite geometric series is one, not infinity.

The problem of the runner getting to the goal can be viewed from a different perspective. According to the Regressive version of the Dichotomy Paradox, the runner cannot even take a first step. Here is why. Any step may be divided conceptually into a first half and a second half. Before taking a full step, the runner must take a 1/2 step, but before that he must take a 1/4 step, but before that a 1/8 step, and so forth ad infinitum, so the runner will never get going. Like the Achilles Paradox, this paradox also concludes that any motion is impossible. The original source is Aristotle (Physics, 239b11-13).

The Dichotomy paradox, in either its Progressive version or its Regressive version, assumes for the sake of simplicity that the runner’s positions are point places. Actual runners take up some larger volume, but assuming point places is not a controversial assumption because Zeno could have reconstructed his paradox by speaking of the point places occupied by, say, the tip of the runner’s nose, and this assumption makes for a stronger paradox than assuming the runner's positions are larger.

In the Dichotomy Paradox, the runner reaches the points 1/2 and 3/4 and 7/8 and so forth on the way to his goal, but under the influence of Bolzano and Dedekind and Cantor, who developed the first theory of sets, the set of those points is no longer considered to be potentially infinite. It is an actually infinite set of points abstracted from a continuum of points–in the contemporary sense of “continuum” at the heart of calculus. And the ancient idea that the actually infinite series of path lengths or segments 1/2 + 1/4 + 1/8 + … is infinite had to be rejected in favor of the new theory that it converges to 1. This is key to solving the Dichotomy Paradox, according to the Standard Solution. It is basically the same treatment as that given to the Achilles. The Dichotomy Paradox has been called “The Stadium” by some commentators, but that name is also commonly used for the Paradox of the Moving Rows.
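
The claim that 1/2 + 1/4 + 1/8 + … converges to 1 can be illustrated with exact arithmetic. The following Python sketch uses hypothetical function names, and it only illustrates the pattern of the partial sums rather than proving convergence: after n stages the runner has covered 1 - 1/2^n of the track, so the remaining distance halves at each stage and can be made smaller than any positive length.

```python
from fractions import Fraction


def dichotomy_partial_sum(n: int) -> Fraction:
    """Distance covered after the first n stages: 1/2 + 1/4 + ... + 1/2**n."""
    return sum((Fraction(1, 2**k) for k in range(1, n + 1)), Fraction(0))


def remaining_distance(n: int) -> Fraction:
    """Distance still separating the runner from the goal after n stages."""
    return 1 - dichotomy_partial_sum(n)


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 50):
        print(n, dichotomy_partial_sum(n), remaining_distance(n))
```

No partial sum equals 1, which matches the point that the sequence of stages has no final member. The sum of the series is defined as the limit of the partial sums, and that limit is 1.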

Aristotle, in Physics Z9, said of the Dichotomy that it is possible for a runner to come in contact with a potentially infinite number of things in a finite time provided the time intervals become shorter and shorter. Aristotle said Zeno assumed this is impossible, and that is one of his errors in the Dichotomy. However, Aristotle merely asserted this and could give no detailed theory that enables the computation of the finite amount of time. So, Aristotle could not really defend his diagnosis of Zeno's error. Today the calculus is used to provide the Standard Solution with that detailed theory.

There is another detail of the Dichotomy that needs resolution. How does the runner complete the trip if there is no final step or last member of the infinite sequence of steps (intervals and goals)? Don't trips need last steps? The Standard Solution answers "no" and says the intuitive answer "yes" is one of our many intuitions that must be rejected when embracing the Standard Solution.

iii. The Arrow

Zeno’s Arrow Paradox takes a different approach to challenging the coherence of our common sense concepts of time and motion. As Aristotle explains, from Zeno’s “assumption that time is composed of moments,” a moving arrow must occupy a space equal to itself during any moment. That is, during any moment it is at the place where it is. But places do not move. So, if in each moment, the arrow is occupying a space equal to itself, then the arrow is not moving in that moment because it has no time in which to move; it is simply there at the place. The same holds for any other moment during the so-called “flight” of the arrow. So, the arrow is never moving. Similarly, nothing else moves. The source for Zeno’s argument is Aristotle (Physics, 239b5-32).

The Standard Solution to the Arrow Paradox uses the “at-at” theory of motion, which says motion is being at different places at different times and that being at rest involves being motionless at a particular point at a particular time. The difference between rest and motion has to do with what is happening at nearby moments and has nothing to do with what is happening during a moment. An object cannot be in motion in or during an instant, but it can be in motion at an instant in the sense of having a speed at that instant, provided the object occupies different positions at times before or after that instant so that the instant is part of a period in which the arrow is continuously in motion. If we don't pay attention to what happens at nearby instants, it is impossible to distinguish instantaneous motion from instantaneous rest, but distinguishing the two is the way out of the Arrow Paradox. Zeno would have balked at the idea of motion at an instant, and Aristotle explicitly denied it. The Arrow Paradox seems especially strong to someone who would say that motion is an intrinsic property of an instant, being some propensity or disposition to be elsewhere.

In standard calculus, speed of an object at an instant (instantaneous velocity) is the time derivative of the object's position; this means the object's speed is the limit of its speeds during arbitrarily small intervals of time containing the instant. Equivalently, we say the object's speed is the limit of its speed over an interval as the length of the interval tends to zero. The derivative of position x with respect to time t, namely dx/dt, is the arrow’s speed, and it has non-zero values at specific places at specific instants during the flight, contra Zeno and Aristotle. The speed during an instant or in an instant, which is what Zeno is calling for, would be 0/0 and so be undefined. Using these modern concepts, Zeno cannot successfully argue that at each moment the arrow is at rest or that the speed of the arrow is zero at every instant. Therefore, advocates of the Standard Solution conclude that Zeno’s Arrow Paradox has a false, but crucial, assumption and so is unsound.
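
The idea of speed at an instant as a limit of average speeds can be illustrated numerically. The following Python sketch assumes a made-up position function for the arrow, x(t) = 50t - 5t² in meters with t in seconds, which is not taken from any source and serves only as an example. It computes average speeds over ever shorter intervals beginning at the instant t = 1.

```python
from fractions import Fraction


def position(t):
    """Hypothetical position of the arrow in meters at time t seconds."""
    return 50 * t - 5 * t * t


def average_speed(t, h):
    """Average speed over the interval from t to t + h.

    The interval length h must be nonzero: over a single instant the
    ratio would be 0/0, which is undefined.
    """
    if h == 0:
        raise ValueError("average speed over a zero-length interval is undefined")
    return (position(t + h) - position(t)) / h


if __name__ == "__main__":
    t = Fraction(1)
    for k in range(1, 7):
        h = Fraction(1, 10**k)
        print(h, average_speed(t, h))
```

For this position function the average speed over an interval of length h starting at t = 1 works out to 40 - 5h, so the values approach 40 meters per second as h shrinks. That limit, not any ratio taken within the instant itself, is the arrow's speed at the instant.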

Independently of Zeno, the Arrow Paradox was discovered by the Chinese dialectician Kung-sun Lung (Gongsun Long, ca. 325–250 B.C.E.). A lingering philosophical question about the arrow paradox is whether there is a way to properly refute Zeno's argument that motion is impossible without using the apparatus of calculus.

iv. The Moving Rows (The Stadium)

It takes a body moving at a given speed a certain amount of time to traverse a body of a fixed length. Passing the body again at that speed will take the same amount of time, provided the body’s length stays fixed. Zeno challenged this common reasoning. According to Aristotle (Physics 239b33-240a18), Zeno considered bodies of equal length aligned along three parallel racetracks within a stadium. One track contains A bodies (three A bodies are shown below); another contains B bodies; and a third contains C bodies. Each body is the same distance from its neighbors along its track. The A bodies are stationary, but the Bs are moving to the right, and the Cs are moving with the same speed to the left. Here are two snapshots of the situation, before and after.

[Figure: diagram of Zeno's Moving Rows, showing the A, B, and C bodies before and after]

Zeno points out that, in the time between the before-snapshot and the after-snapshot, the leftmost C passes two Bs but only one A, contradicting the common sense assumption that the C should take longer to pass two Bs than one A. The usual way out of this paradox is to remark that Zeno mistakenly supposes that a moving body passes both moving and stationary objects with equal speed.

Aristotle argues that how long it takes to pass a body depends on the speed of the body; for example, if the body is coming towards you, then you can pass it in less time than if it is stationary. Today’s analysts agree with Aristotle’s diagnosis, and historically this paradox of motion has seemed weaker than the previous three. This paradox is also called “The Stadium,” but occasionally so is the Dichotomy Paradox.
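
Aristotle's diagnosis amounts to a point about relative speed, which a small calculation can show. The following Python sketch assumes continuous space and time and uses hypothetical units: each body is one unit long, the Bs move right at one unit per unit time, and the Cs move left at the same speed. The function name is illustrative only.

```python
from fractions import Fraction


def bodies_passed(own_velocity, other_velocity, body_length, elapsed):
    """Number of body lengths of the other row that slide past in `elapsed` time.

    Velocities are signed: positive means rightward, negative leftward.
    What matters is the relative speed between the two rows.
    """
    relative_speed = abs(Fraction(own_velocity) - Fraction(other_velocity))
    return relative_speed * Fraction(elapsed) / Fraction(body_length)


if __name__ == "__main__":
    c_velocity, a_velocity, b_velocity = -1, 0, 1
    print(bodies_passed(c_velocity, a_velocity, 1, 1))  # As passed by a C
    print(bodies_passed(c_velocity, b_velocity, 1, 1))  # Bs passed by a C
```

Because the C approaches the Bs at twice the speed at which it approaches the As, it passes twice as many Bs as As in the same time, and no contradiction arises on this continuous reading.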

Some analysts, such as Tannery (1887), believe Zeno may have had in mind that the paradox was supposed to have assumed that space and time are discrete (quantized, atomized) as opposed to continuous, and Zeno intended his argument to challenge the coherence of this assumption about discrete space and time. Well, the paradox could be interpreted this way. Assume the three objects are adjacent to each other in their tracks or spaces; that is, the middle object is only one atom of space away from its neighbors. Then, if the Cs were moving at a speed of, say, one atom of space in one atom of time, the leftmost C would pass two atoms of B-space in the time it passed one atom of A-space, which is a contradiction to our assumption that the Cs move at a rate of one atom of space in one atom of time. Or else we’d have to say that in that atom of time, the leftmost C somehow got beyond two Bs by passing only one of them, which is also absurd (according to Zeno). Interpreted this way, Zeno’s argument produces a challenge to the idea that space and time are discrete. However, most commentators believe Zeno himself did not interpret his paradox this way.

b. Paradoxes of Plurality

Zeno's paradoxes of motion are attacks on the commonly held belief that motion is real, but because motion is a kind of plurality, namely a process along a plurality of places in a plurality of times, they are also attacks on this kind of plurality. Zeno offered more direct attacks on all kinds of plurality. The first is his Paradox of Alike and Unlike.

i. Alike and Unlike

According to Plato in Parmenides 127-9, Zeno argued that the assumption of plurality–the assumption that there are many things–leads to a contradiction. He quotes Zeno as saying: "If things are many, . . . they must be both like and unlike. But that is impossible; unlike things cannot be like, nor like things unlike" (Hamilton and Cairns (1961), 922).

Zeno's point is this. Consider a plurality of things, such as some people and some mountains. These things have in common the property of being heavy. But if they all have this property in common, then they really are all the same kind of thing, and so are not a plurality. They are a one. By this reasoning, Zeno believes it has been shown that the plurality is one (or the many is not many), which is a contradiction. Therefore, by reductio ad absurdum, there is no plurality, as Parmenides has always claimed.

Plato immediately accuses Zeno of equivocating. A thing can be alike some other thing in one respect while being not alike it in a different respect. Your having a property in common with some other thing does not make you identical with that other thing. Consider again our plurality of people and mountains. People and mountains are all alike in being heavy, but are unlike in intelligence. And they are unlike in being mountains; the mountains are mountains, but the people are not. As Plato says, when Zeno tries to conclude "that the same thing is many and one, we shall [instead] say that what he is proving is that something is many and one [in different respects], not that unity is many or that plurality is one...." [129d] So, there is no contradiction, and the paradox is solved by Plato. This paradox is generally considered to be one of Zeno's weakest paradoxes, and it is now rarely discussed. [See Rescher (2001), pp. 94-6 for some discussion.]

ii. Limited and Unlimited

This paradox is also called the Paradox of Denseness. Suppose there exist many things rather than, as Parmenides would say, just one thing. Then there will be a definite or fixed number of those many things, and so they will be “limited.” But if there are many things, say two things, then they must be distinct, and to keep them distinct there must be a third thing separating them. So, there are three things. But between these, …. In other words, things are dense and there is no definite or fixed number of them, so they will be “unlimited.” This is a contradiction, because the plurality would be both limited and unlimited. Therefore, there are no pluralities; there exists only one thing, not many things. This argument is reconstructed from Zeno’s own words, as quoted by Simplicius in his commentary on Book 1 of Aristotle’s Physics.

According to the Standard Solution to this paradox, the weakness of Zeno’s argument can be said to lie in the assumption that “to keep them distinct, there must be a third thing separating them.” Zeno would have been correct to say that between any two physical objects that are separated in space, there is a place between them, because space is dense, but he is mistaken to claim that there must be a third physical object there between them. Two objects can be distinct at a time simply by one having a property the other does not have.

iii. Large and Small

Suppose there exist many things rather than, as Parmenides says, just one thing. Then every part of any plurality is both so small as to have no size but also so large as to be infinite, says Zeno. His reasoning for why they have no size has been lost, but many commentators suggest that he’d reason as follows. If there is a plurality, then it must be composed of parts which are not themselves pluralities. Yet things that are not pluralities cannot have a size or else they’d be divisible into parts and thus be pluralities themselves.

Now, why are the parts of pluralities so large as to be infinite? Well, the parts cannot be so small as to have no size since adding such things together would never contribute anything to the whole so far as size is concerned. So, the parts have some non-zero size. If so, then each of these parts will have two spatially distinct sub-parts, one in front of the other. Each of these sub-parts also will have a size. The front part, being a thing, will have its own two spatially distinct sub-parts, one in front of the other; and these two sub-parts will have sizes. Ditto for the back part. And so on without end. A sum of all these sub-parts would be infinite. Therefore, each part of a plurality will be so large as to be infinite.

This sympathetic reconstruction of the argument is based on Simplicius’ On Aristotle’s Physics, where Simplicius quotes Zeno’s own words for part of the paradox, although he does not say what he is quoting from.

There are many errors here in Zeno’s reasoning, according to the Standard Solution. He is mistaken at the beginning when he says, “If there is a plurality, then it must be composed of parts which are not themselves pluralities.” A university is an illustrative counterexample. A university is a plurality of students, but we need not rule out the possibility that a student is a plurality. What’s a whole and what’s a plurality depends on our purposes. When we consider a university to be a plurality of students, we consider the students to be wholes without parts. But for another purpose we might want to say that a student is a plurality of biological cells. Zeno is confused about this notion of relativity, and about part-whole reasoning; and as commentators began to appreciate this they lost interest in Zeno as a player in the great metaphysical debate between pluralism and monism.

A second error occurs in arguing that each part of a plurality must have a non-zero size. In 1901, Henri Lebesgue showed how to properly define the measure function so that a line segment has nonzero measure even though (the singleton set of) any point has a zero measure. The measure of the line segment [a, b] is b - a; the measure of a cube with side a is a³. Lebesgue’s theory is our current civilization’s theory of measure, and thus of length, volume, duration, mass, voltage, brightness, and other continuous magnitudes.

Thanks to Aristotle’s support, Zeno’s Paradoxes of Large and Small and of Infinite Divisibility (to be discussed below) were generally considered to have shown that a continuous magnitude cannot be composed of points. Interest was rekindled in this topic in the 18th century. The physical objects in Newton’s classical mechanics of 1726 were interpreted by R. J. Boscovich in 1763 as being collections of point masses. Each point mass is a movable point carrying a fixed mass. This idealization of continuous bodies as if they were compositions of point particles was very fruitful; it could be used to easily solve otherwise very difficult problems in physics. This success led scientists, mathematicians, and philosophers to recognize that the strength of Zeno’s Paradoxes of Large and Small and of Infinite Divisibility had been overestimated; they did not prevent a continuous magnitude from being composed of points.

iv. Infinite Divisibility

This is the most challenging of all the paradoxes of plurality. Consider the difficulties that arise if we assume that an object theoretically can be divided into a plurality of parts. According to Zeno, there is a reassembly problem. Imagine cutting the object into two non-overlapping parts, then similarly cutting these parts into parts, and so on until the process of repeated division is complete. Assuming the hypothetical division is “exhaustive” or does come to an end, then at the end we reach what Zeno calls “the elements.” Here there is a problem about reassembly. There are three possibilities. (1) The elements are nothing. In that case the original object will be a composite of nothing, and so the whole object will be a mere appearance, which is absurd. (2) The elements are something, but they have zero size. So, the original object is composed of elements of zero size. Adding an infinity of zeros yields a zero sum, so the original object had no size, which is absurd. (3) The elements are something, but they do not have zero size. If so, these can be further divided, and the process of division was not complete after all, which contradicts our assumption that the process was already complete. In summary, there were three possibilities, but all three possibilities lead to absurdity. So, objects are not divisible into a plurality of parts.

Simplicius says this argument is due to Zeno even though it is in Aristotle (On Generation and Corruption, 316a15-34, 316b34 and 325a8-12) and is not attributed there to Zeno, which is odd. Aristotle says the argument convinced the atomists to reject infinite divisibility. The argument has been called the Paradox of Parts and Wholes, but it has no traditional name.

The Standard Solution says we first should ask Zeno to be clearer about what he is dividing. Is it concrete or abstract? When dividing a concrete, material stick into its components, we reach ultimate constituents of matter such as quarks and electrons that cannot be further divided. These have a size, a zero size (according to quantum electrodynamics), but it is incorrect to conclude that the whole stick has no size if its constituents have zero size. [Due to the forces involved, point particles have finite “cross sections,” and configurations of those particles, such as atoms, do have finite size.] So, Zeno is wrong here. On the other hand, is Zeno dividing an abstract path or trajectory? Let's assume he is, since this produces a more challenging paradox. If so, then choice (2) above is the one to think about. It's the one that talks about addition of zeroes. Let's assume the object is one-dimensional, like a path. According to the Standard Solution, this "object" that gets divided should be considered to be a continuum with its elements arranged into the order type of the linear continuum, and we should use Lebesgue's notion of measure to find the size of the object. The size (length, measure) of a point-element is zero, but Zeno is mistaken in saying the total size (length, measure) of all the zero-size elements is zero. The size of the object is determined instead by the difference in coordinate numbers assigned to the end points of the object. An object extending along a straight line that has one of its end points at one meter from the origin and its other end point at three meters from the origin has a size of two meters and not zero meters. So, there is no reassembly problem, and a crucial step in Zeno's argument breaks down.
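The contrast between adding up the sizes of points and measuring the whole interval can be written out as a short worked example. This is only a sketch in standard notation; the symbol \mu for Lebesgue measure and the sets A_n are assumptions of this illustration and do not appear in the discussion above.

```latex
% Each point has measure zero, yet the interval from 1 to 3 has measure 2.
\mu(\{x\}) = 0 \quad \text{for every point } x,
\qquad
\mu([1,3]) = 3 - 1 = 2 .

% Lebesgue measure is additive only over countable families of
% pairwise disjoint measurable sets A_1, A_2, \dots :
\mu\Bigl(\,\bigcup_{n=1}^{\infty} A_n\Bigr) \;=\; \sum_{n=1}^{\infty} \mu(A_n) .
```

The interval [1,3] is a union of uncountably many one-point sets, so the additivity rule does not license Zeno's step of summing the zeros to obtain the length of the whole.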

c. Other Paradoxes

i. The Grain of Millet

There are two common interpretations of this paradox. According to the first, which is the standard interpretation, when a bushel of millet (or wheat) grains falls out of its container and crashes to the floor, it makes a sound. Since the bushel is composed of individual grains, each individual grain also makes a sound, as should each thousandth part of the grain, and so on to its ultimate parts. But this result contradicts the fact that we actually hear no sound for portions like a thousandth part of a grain, and so we surely would hear no sound for an ultimate part of a grain. Yet, how can the bushel make a sound if none of its ultimate parts make a sound? The original source of this argument is Aristotle’s Physics (250a.19-21). There seems to be an appeal to the iterative rule that if a millet or millet part makes a sound, then so should a next smaller part.

We do not have Zeno’s words on what conclusion we are supposed to draw from this. Perhaps he would conclude it is a mistake to suppose that whole bushels of millet have millet parts. This is an attack on plurality.

The Standard Solution to this interpretation of the paradox accuses Zeno of mistakenly assuming that there is no lower bound on the size of something that can make a sound. There is no problem, we now say, with parts having very different properties from the wholes that they constitute. The iterative rule is initially plausible but ultimately not trustworthy, and Zeno is committing both the fallacy of division and the fallacy of composition.

Some analysts interpret Zeno’s paradox a second way, as challenging our trust in our sense of hearing, as follows. When a bushel of millet grains crashes to the floor, it makes a sound. The bushel is composed of individual grains, so they, too, make an audible sound. But if you drop an individual millet grain or a small part of one or an even smaller part, then eventually your hearing detects no sound, even though there is one. Therefore, you cannot trust your sense of hearing.

This reasoning about our not detecting low amplitude sounds is similar to making the mistake of arguing that you cannot trust your thermometer because there are some ranges of temperature that it is not sensitive to. So, on this second interpretation, the paradox is also easy to solve. One reason given in the literature for believing that this second interpretation is not the one that Zeno had in mind is that Aristotle’s criticism given below applies to the first interpretation and not the second, and it is unlikely that Aristotle would have misinterpreted the paradox.

ii. Against Place

Given an object, we may assume that there is a single, correct answer to the question, “What is its place?” Because everything that exists has a place, and because place itself exists, place also must have a place, and so on forever. That’s too many places, so there is a contradiction. The original source is Aristotle’s Physics (209a23-25 and 210b22-24).

The standard response to Zeno’s Paradox Against Place is to deny that places have places, and to point out that the notion of place should be relative to reference frame. But Zeno’s assumption that places have places was common in ancient Greece at the time, and Zeno is to be praised for showing that it is a faulty assumption.

4. Aristotle’s Treatment of the Paradoxes

Aristotle’s views about Zeno’s paradoxes can be found in Physics, book 4, chapter 2, and book 6, chapters 2 and 9. Regarding the Dichotomy Paradox, Aristotle is to be applauded for his insight that Achilles has time to reach his goal because during the run ever shorter paths take correspondingly ever shorter times.

Aristotle had several criticisms of Zeno. Regarding the paradoxes of motion, he complained that Zeno should not suppose the runner's path is dependent on its parts; instead, the path is there first, and the parts are constructed by the analyst. His second complaint was that Zeno should not suppose that lines contain points. Aristotle's third and most influential, critical idea involves a complaint about potential infinity. On this point, in remarking about the Achilles Paradox, Aristotle said, “Zeno’s argument makes a false assumption in asserting that it is impossible for a thing to pass over…infinite things in a finite time.” Aristotle believes it is impossible for a thing to pass over an actually infinite number of things in a finite time, but that it is possible for a thing to pass over a potentially infinite number of things in a finite time. Here is how Aristotle expressed the point:

For motion…, although what is continuous contains an infinite number of halves, they are not actual but potential halves. (Physics 263a25-27). …Therefore to the question whether it is possible to pass through an infinite number of units either of time or of distance we must reply that in a sense it is and in a sense it is not. If the units are actual, it is not possible: if they are potential, it is possible. (Physics 263b2-5).

Actual infinities are also called completed infinities. A potential infinity could never become an actual infinity. Aristotle believed the concept of actual infinity is perhaps not coherent, and so not real either in mathematics or in nature. He believes that actual infinities are not real because, if one were to exist, its infinity of parts would have to exist all at once, which he believed is impossible. Potential infinities exist over time, as processes that always can be continued at a later time. That's the only kind of infinity that could be real, thought Aristotle. A potential infinity is an unlimited iteration of some operation—unlimited in time. Aristotle claimed correctly that if Zeno were not to have used the concept of actual infinity, the paradoxes of motion such as the Achilles Paradox (and the Dichotomy Paradox) could not be created.

Here is why doing so is a way out of these paradoxes. Zeno said that to go from the start to the finish line, the runner Achilles must reach the place that is halfway-there, then after arriving at this place he still must reach the place that is half of that remaining distance, and after arriving there he must again reach the new place that is now halfway to the goal, and so on. These are too many places to reach. Zeno made the mistake, according to Aristotle, of supposing that this infinite process needs completing when it really does not; the finitely long path from start to finish exists undivided for the runner, and it is the mathematician who is demanding the completion of such a process. Without that concept of a completed infinity there is no paradox. Aristotle is correct about this being a treatment that avoids paradox. Today’s standard treatment of the Achilles paradox disagrees with Aristotle's way out of the paradox and says Zeno was correct to use the concept of a completed infinity and to imply the runner must go to an actual infinity of places in a finite time.

From what Aristotle says, one can infer between the lines that he believes there is another reason to reject actual infinities: doing so is the only way out of these paradoxes of motion. Today we know better. There is another way out: the Standard Solution, which uses actual infinities, namely Cantor's transfinite sets.

Aristotle’s treatment by disallowing actual infinity while allowing potential infinity was clever, and it satisfied nearly all scholars for 1,500 years, being buttressed during that time by the Church's doctrine that only God is actually infinite. George Berkeley, Immanuel Kant, Carl Friedrich Gauss, and Henri Poincaré were influential defenders of potential infinity. Leibniz accepted actual infinitesimals, but other mathematicians and physicists in European universities during these centuries were careful to distinguish between actual and potential infinities and to avoid using actual infinities.

Given 1,500 years of opposition to actual infinities, the burden of proof was on anyone advocating them. Bernard Bolzano and Georg Cantor accepted this burden in the 19th century. The key idea is to see a potentially infinite set as a variable quantity that is dependent on being abstracted from a pre-existing actually infinite set. Bolzano argued that the natural numbers should be conceived of as a set, a determinate set, not one with a variable number of elements. Cantor argued that any potential infinity must be interpreted as varying over a predefined fixed set of possible values, a set that is actually infinite. He put it this way:

In order for there to be a variable quantity in some mathematical study, the “domain” of its variability must strictly speaking be known beforehand through a definition. However, this domain cannot itself be something variable…. Thus this “domain” is a definite, actually infinite set of values. Thus each potential infinite…presupposes an actual infinite. (Cantor 1887)

From this standpoint, Dedekind’s 1872 axiom of continuity and his definition of real numbers as certain infinite subsets of rational numbers suggested to Cantor and then to many other mathematicians that arbitrarily large sets of rational numbers are most naturally seen to be subsets of an actually infinite set of rational numbers. The same can be said for sets of real numbers. An actually infinite set is what we today call a "transfinite set." Cantor's idea is then to treat a potentially infinite set as being a sequence of definite subsets of a transfinite set. Aristotle had said mathematicians need only the concept of a finite straight line that may be produced as far as they wish, or divided as finely as they wish, but Cantor would say that this way of thinking presupposes a completed infinite continuum from which that finite line is abstracted at any particular time.

[When Cantor says the mathematical concept of potential infinity presupposes the mathematical concept of actual infinity, this does not imply that, if future time were to be potentially infinite, then future time also would be actually infinite.]

Dedekind's primary contribution to our topic was to give the first rigorous definition of infinite set—an actual infinity—showing that the notion is useful and not self-contradictory. Cantor provided the missing ingredient—that the mathematical line can fruitfully be treated as a dense linear ordering of uncountably many points, and he went on to develop set theory and to give the continuum a set-theoretic basis which convinced mathematicians that the concept was rigorously defined.

These ideas now form the basis of modern real analysis. The implication for the Achilles and Dichotomy paradoxes is that, once the rigorous definition of a linear continuum is in place, and once we have Cauchy’s rigorous theory of how to assess the value of an infinite series, then we can point to the successful use of calculus in physical science, especially in the treatment of time and of motion through space, and say that the sequence of intervals or paths described by Zeno is most properly treated as a sequence of subsets of an actually infinite set [that is, Aristotle's potential infinity of places that Achilles reaches are really a variable subset of an already existing actually infinite set of point places], and we can be confident that Aristotle’s treatment of the paradoxes is inferior to the Standard Solution’s.

Zeno said Achilles cannot achieve his goal in a finite time, but there is no record of the details of how he defended this conclusion. He might have said the reason is (i) that there is no last goal in the sequence of sub-goals, or, perhaps (ii) that it would take too long to achieve all the sub-goals, or perhaps (iii) that covering all the sub-paths is too great a distance to run. Zeno might have offered all these defenses. In attacking justification (ii), Aristotle objects that, if Zeno were to confine his notion of infinity to a potential infinity and were to reject the idea of zero-length sub-paths, then Achilles achieves his goal in a finite time, so this is a way out of the paradox. However, an advocate of the Standard Solution says Achilles achieves his goal by covering an actual infinity of paths in a finite time, and this is the way out of the paradox. (The discussion of whether Achilles can properly be described as completing an actual infinity of tasks rather than goals will be considered in Section 5c.) Aristotle's treatment of the paradoxes is basically criticized for being inconsistent with current standard real analysis that is based upon Zermelo-Fraenkel set theory and its actually infinite sets. To summarize the errors of Zeno and Aristotle in the Achilles Paradox and in the Dichotomy Paradox, they both made the mistake of thinking that if a runner has to cover an actually infinite number of sub-paths to reach his goal, then he will never reach it; calculus shows how Achilles can do this and reach his goal in a finite time, and the fruitfulness of the tools of calculus implies that the Standard Solution is a better treatment than Aristotle's.
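The claim that infinitely many sub-paths can add up to a finite distance can be illustrated with the halving sequence of the Dichotomy. The following is a minimal sketch, assuming a course of length 1 whose sub-paths have lengths 1/2, 1/4, 1/8, and so on; the function name partial_sums and the unit length are assumptions of this illustration, not part of the historical argument.

```python
from fractions import Fraction

def partial_sums(n_terms):
    """Return the first n_terms partial sums of 1/2 + 1/4 + 1/8 + ..."""
    total = Fraction(0)
    sums = []
    for k in range(1, n_terms + 1):
        total += Fraction(1, 2 ** k)  # length of the k-th sub-path
        sums.append(total)
    return sums

sums = partial_sums(20)
# Distance still to be covered after each sub-path: exactly 1/2**k.
remaining = [1 - s for s in sums]

for k in (1, 2, 3, 10, 20):
    print(k, sums[k - 1], float(remaining[k - 1]))
```

Every partial sum is less than 1, and the remaining distance after k sub-paths is exactly 1/2^k, which shrinks below any positive bound. In Cauchy's sense the infinite series therefore has the sum 1. If the runner's speed is constant, the times for the sub-paths form the same kind of series, so the total time is finite as well.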

Let’s turn to the other paradoxes. In proposing his treatment of the Paradox of the Large and Small and of the Paradox of Infinite Divisibility, Aristotle said that

…a line cannot be composed of points, the line being continuous and the point indivisible. (Physics, 231a 25)

In modern real analysis, a continuum is composed of points, but Aristotle, ever the advocate of common sense reasoning, claimed that a continuum cannot be composed of points. Aristotle believed a line can be composed only of smaller, indefinitely divisible lines and not of points without magnitude. Similarly a distance cannot be composed of point places and a duration cannot be composed of instants. This is one of Aristotle’s key errors, according to advocates of the Standard Solution, because by maintaining this common sense view he created an obstacle to the fruitful development of real analysis. In addition to complaining about points, Aristotelians object to the idea of an actual infinite number of them.

In his analysis of the Arrow Paradox, Aristotle said Zeno mistakenly assumes time is composed of indivisible moments, but “This is false, for time is not composed of indivisible moments any more than any other magnitude is composed of indivisibles.” (Physics, 239b8-9) Zeno needs those instantaneous moments; that way Zeno can say the arrow does not move during the moment. Aristotle recommends not allowing Zeno to appeal to instantaneous moments and restricting Zeno to saying motion can be divided only into a potential infinity of intervals. That restriction implies the arrow’s path can be divided only into finitely many intervals at any time. So, at any time, there is a finite interval during which the arrow can exhibit motion by changing location. So the arrow flies, after all. That is, Aristotle declares Zeno’s argument is based on false assumptions without which there is no problem with the arrow’s motion. However, the Standard Solution agrees with Zeno that time can be composed of indivisible moments or instants, and it implies that Aristotle has mis-diagnosed where the error lies in the Arrow Paradox. Advocates of the Standard Solution would add that allowing a duration to be composed of indivisible moments is what is needed for having a fruitful calculus, and Aristotle's recommendation is an obstacle to the development of calculus.

Aristotle’s treatment of The Paradox of the Moving Rows is basically in agreement with the Standard Solution to that paradox–that Zeno did not appreciate the difference between speed and relative speed.

Regarding the Paradox of the Grain of Millet, Aristotle said that parts need not have all the properties of the whole, and so grains need not make sounds just because bushels of grains do. (Physics, 250a, 22) And if the parts make no sounds, we should not conclude that the whole can make no sound. It would have been helpful for Aristotle to have said more about what are today called the Fallacies of Division and Composition that Zeno is committing. However, Aristotle’s response to the Grain of Millet is brief but accurate by today’s standards.

In conclusion, are there two adequate but different solutions to Zeno’s paradoxes, Aristotle’s Solution and the Standard Solution? No. Aristotle’s treatment does not stand up to criticism in a manner that most scholars deem adequate. The Standard Solution uses contemporary concepts that have proved to be more valuable for solving and resolving so many other problems in mathematics and physics. Replacing Aristotle’s common sense concepts with the new concepts from real analysis and classical mechanics has been a key ingredient in the successful development of mathematics and science in recent centuries, and for this reason the vast majority of scientists, mathematicians, and philosophers reject Aristotle's treatment. Nevertheless, there is a significant minority in the philosophical community who do not agree, as we shall see in the sections that follow.

5. Other Issues Involving the Paradoxes

a. Consequences of Accepting the Standard Solution

There is a price to pay for accepting the Standard Solution to Zeno’s Paradoxes. The following–once presumably safe–intuitions or assumptions must be rejected:

  1. A continuum is too smooth to be divisible into point elements.
  2. Runners do not have time to go to an actual infinity of places in a finite time.
  3. The sum of an infinite series of positive terms is always infinite.
  4. For each instant there is a next instant and for each place along a line there is a next place.
  5. A finite distance along a line cannot contain an actually infinite number of points.
  6. The more points there are on a line, the longer the line is.
  7. It is absurd for there to be numbers that are bigger than every integer.
  8. A one-dimensional curve cannot fill a two-dimensional area, nor can an infinitely long curve enclose a finite area.
  9. A whole is always greater than any of its parts.

Item (8) was undermined when it was discovered that the continuum implies the existence of fractal curves. However, the loss of intuition (1) has caused the greatest stir because so many philosophers object to a continuum being constructed from points. The Austrian philosopher Franz Brentano believed with Aristotle that scientific theories should be literal descriptions of reality, as opposed to today’s more popular view that theories are idealizations or approximations of reality. Continuity is something given in perception, said Brentano, and not in a mathematical construction; therefore, mathematics misrepresents. In a 1905 letter to Husserl, he said, “I regard it as absurd to interpret a continuum as a set of points.”

But the Standard Solution needs to be thought of as a package to be evaluated in terms of all of its costs and benefits. From this perspective the Standard Solution’s point-set analysis of continua has withstood the criticism and demonstrated its value in mathematics and mathematical physics. As a consequence, advocates of the Standard Solution say we must live with rejecting the nine intuitions listed above, and accept the counterintuitive implications such as there being divisible continua, infinite sets of different sizes, and space-filling curves. They agree with the philosopher W. V. O. Quine who demands that we be conservative when revising the system of claims that we believe and who recommends “minimum mutilation.” Advocates of the Standard Solution say no less mutilation will work satisfactorily.

b. Criticisms of the Standard Solution

Balking at having to reject so many of our intuitions, the 20th century philosophers Henri-Louis Bergson, Max Black, Franz Brentano, L. E. J. Brouwer, Solomon Feferman, William James, James Thomson, and Alfred North Whitehead argued in different ways that the standard mathematical account of continuity does not apply to physical processes, or is improper for describing those processes. Here are their main reasons: (1) the actual infinite cannot be encountered in experience and thus is unreal, (2) human intelligence is not capable of understanding motion, (3) the sequence of tasks that Achilles performs is finite and the illusion that it is infinite is due to mathematicians who confuse their mathematical representations with what is represented, (4) motion is unitary even though its spatial trajectory is infinitely divisible, (5) treating time as being made of instants is to treat time as static rather than as the dynamic aspect of consciousness that it truly is, (6) actual infinities and the contemporary continuum are not indispensable to solving the paradoxes, and (7) the Standard Solution’s implicit assumption of the primacy of the coherence of the sciences is unjustified because coherence with a priori knowledge and common sense is primary.

See Salmon (1970, Introduction) and Feferman (1998) for a discussion of the controversy about the quality of Zeno’s arguments, and an introduction to its vast literature. This controversy is much less actively pursued in today’s mathematical literature, and hardly at all in today’s scientific literature. A minority of philosophers are actively involved in an attempt to retain one or more of the nine intuitions listed in section 5a above. An important philosophical issue is whether the paradoxes should be solved by the Standard Solution or instead by assuming that a line is not composed of points but of intervals, and whether use of infinitesimals is essential to a proper understanding of the paradoxes.

c. Supertasks and Infinity Machines

Zeno’s Paradox of Achilles was presented as implying that he will never catch the tortoise because the sequence of goals to be achieved has no final member. In that presentation, use of the terms “task” and “act” was intentionally avoided, but there are interesting questions that do use those terms. In reaching the tortoise, Achilles does not cover an infinite distance, but he does cover an infinite number of distances. In doing so, does he need to complete an infinite sequence of tasks or actions? In other words, assuming Achilles does complete the task of reaching the tortoise, does he thereby complete a supertask, a transfinite number of tasks in a finite time?

Bertrand Russell said “yes.” He argued that it is possible to perform a task in one-half minute, then perform another task in the next quarter-minute, and so on, for a full minute. At the end of the minute, an infinite number of tasks would have been performed. In fact, Achilles does this in catching the tortoise. In the mid-twentieth century, Hermann Weyl, Max Black, and others objected, and thus began an ongoing controversy about the number of tasks that can be completed in a finite time.

That controversy has sparked a related discussion about whether there could be a machine that can perform an infinite number of tasks in a finite time. A machine that can is called an infinity machine. In 1954, in an effort to undermine Russell’s argument, the philosopher James Thomson described a lamp that is intended to be a typical infinity machine. Let the machine switch the lamp on for a half-minute; then switch it off for a quarter-minute; then on for an eighth-minute; off for a sixteenth-minute; and so on. Would the lamp be lit or dark at the end of the minute? Thomson argued that it must be one or the other, but it cannot be either because every period in which it is off is followed by a period in which it is on, and vice versa. So, there can be no such lamp, and the specific mistake in the reasoning was to suppose that it is logically possible to perform a supertask. The implication for Zeno’s paradoxes is that, although Thomson is not denying Achilles catches the tortoise, he is denying Russell’s description of Achilles’ task as being the completion of an infinite number of sub-tasks in a finite time.

Paul Benacerraf (1962) complains that Thomson’s reasoning is faulty because it fails to notice that the initial description of the lamp determines the state of the lamp at each period in the sequence of switching, but it determines nothing about the state of the lamp at the limit of the sequence. The lamp could be either on or off at the limit. The limit of the infinite converging sequence is not in the sequence. So, Thomson has not established the logical impossibility of completing this supertask.
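Benacerraf's point can be made concrete with a small simulation of the switching schedule. This is a hedged sketch and not a model of any physical device: the function name lamp_history and the use of True for "on" are assumptions of the illustration. It tabulates the lamp's state after finitely many switches, which is all that Thomson's description specifies.

```python
from fractions import Fraction

def lamp_history(n_switches):
    """Return (time, state) pairs: the start, then each of the first n switches.

    The lamp is on (True) from t = 0 and is toggled at
    t = 1/2, 3/4, 7/8, ... minutes.
    """
    t = Fraction(0)
    state = True  # on during the first half-minute
    history = [(t, state)]
    for k in range(1, n_switches + 1):
        t += Fraction(1, 2 ** k)  # the k-th period lasts 1/2**k minutes
        state = not state         # toggle the lamp
        history.append((t, state))
    return history

history = lamp_history(10)
for t, state in history[:5]:
    print(t, "on" if state else "off")
```

Every switching time in the list is strictly earlier than one minute, and the states simply alternate, so the sequence of states has no limit. The schedule therefore fixes the lamp's state at every time before the one-minute mark but, taken by itself, says nothing about its state at exactly one minute.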

Could some other argument establish this impossibility? Benacerraf suggests that an answer depends on what we ordinarily mean by the term “completing a task.” If the meaning does not require that tasks have minimum times for their completion, then maybe Russell is right that some supertasks can be completed, he says; but if a minimum time is always required, then Russell is mistaken because an infinite time would be required. What is needed is a better account of the meaning of the term “task.” Grünbaum objects to Benacerraf’s reliance on ordinary meaning. “We need to heed the commitments of ordinary language,” says Grünbaum, “only to the extent of guarding against being victimized or stultified by them.”

The Thomson Lamp has generated a great literature in recent philosophy. Here are some of the issues. What is the proper definition of “task”? For example, does it require a minimum amount of time, and does it require a minimum amount of work, in the physicists’ technical sense of that term? Even if it is physically impossible to flip the switch in Thomson’s lamp, suppose physics were different and there were no limit on speed; what then? Is the lamp logically impossible? Is the lamp metaphysically impossible, even if it is logically possible? Was it proper of Thomson to suppose that the question of whether the lamp is lit or dark at the end of the minute must have a determinate answer? Does Thomson’s question have no answer, given the initial description of the situation, or does it have an answer which we are unable to compute? Should we conclude that it makes no sense to divide a finite task into an infinite number of ever shorter sub-tasks? Even if completing a countable infinity of tasks in a finite time is physically possible (such as when Achilles runs to the tortoise), is completing an uncountable infinity also possible? Interesting issues arise when we bring in Einstein’s theory of relativity and consider a bifurcated supertask. This is an infinite sequence of tasks in a finite interval of an external observer’s proper time, but not in the machine’s own proper time. See Earman and Norton (1996) for an introduction to the extensive literature on these topics. Unfortunately, there is no agreement in the philosophical community on most of the questions we’ve just entertained.

d. Constructivism

The spirit of Aristotle’s opposition to actual infinities persists today in the philosophy of mathematics called constructivism. Constructivism is not a precisely defined position, but it implies that acceptable mathematical objects and procedures have to be founded on constructions and not, say, on assuming the object does not exist, then deducing a contradiction from that assumption. Most constructivists believe acceptable constructions must be performable ideally by humans independently of practical limitations of time or money. So they would say potential infinities, recursive functions, mathematical induction, and Cantor’s diagonal argument are constructive, but the following are not: the axiom of choice, the law of excluded middle, the law of double negation, completed infinities, and the classical continuum of the Standard Solution. The implication is that Zeno’s Paradoxes were not solved correctly by using the methods of the Standard Solution. More conservative constructivists, the finitists, would go even further and reject potential infinities because of the human being's finite computational resources, but this conservative sub-group of constructivists is very much out of favor.

L. E. J. Brouwer’s intuitionism was the leading constructivist theory of the early 20th century. In response to suspicions raised by the discovery of Russell’s Paradox and the introduction into set theory of the controversial non-constructive axiom of choice, Brouwer attempted to place mathematics on what he believed to be a firmer epistemological foundation by arguing that mathematical concepts are admissible only if they can be constructed from, and thus grounded in, an ideal mathematician’s vivid temporal intuitions, the a priori intuitions of time. Brouwer’s intuitionistic continuum has the Aristotelian property of unsplitability. What this means is that, unlike the Standard Solution’s set-theoretic composition of the continuum which allows, say, the closed interval of real numbers from zero to one to be split or cut into (that is, be the union of sets of) those numbers in the interval that are less than one-half and those numbers in the interval that are greater than or equal to one-half, the corresponding closed interval of the intuitionistic continuum cannot be split this way into two disjoint sets. This unsplitability or inseparability agrees in spirit with Aristotle’s idea of the continuity of a real continuum, but disagrees in spirit with Aristotle by allowing the continuum to be composed of points. [Posy (2005) 346-7]

Although everyone agrees that any legitimate mathematical proof must use only a finite number of steps and be constructive in that sense, the majority of mathematicians in the first half of the twentieth century claimed that constructive mathematics could not produce an adequate theory of the continuum because essential theorems will no longer be theorems, and constructivist principles and procedures are too awkward to use successfully. In 1927, David Hilbert exemplified this attitude when he objected that Brouwer’s restrictions on allowable mathematics–such as rejecting proof by contradiction–were like taking the telescope away from the astronomer.

But thanks in large part to the later development of constructive mathematics by Errett Bishop and Douglas Bridges in the second half of the 20th century, most contemporary philosophers of mathematics believe the question of whether constructivism could be successful in the sense of producing an adequate theory of the continuum is still open [see Wolf (2005) p. 346, and McCarty (2005) p. 382], and to that extent so is the question of whether the Standard Solution to Zeno’s Paradoxes needs to be rejected or perhaps revised to embrace constructivism. Frank Arntzenius (2000), Michael Dummett (2000), and Solomon Feferman (1998) have done important philosophical work to promote the constructivist tradition. Nevertheless, the vast majority of today’s practicing mathematicians routinely use nonconstructive mathematics.

e. Nonstandard Analysis

Although Zeno and Aristotle had the concept of small, they did not have the concept of infinitesimally small, which is the informal concept that was used by Leibniz (and Newton) in the development of calculus. In the 19th century, infinitesimals were eliminated from the standard development of calculus due to the work of Cauchy and Weierstrass on defining a derivative in terms of limits using the epsilon-delta method. But in 1881, C. S. Peirce advocated restoring infinitesimals because of their intuitive appeal. Unfortunately, he was unable to work out the details, as were all other mathematicians, until 1960 when Abraham Robinson produced his nonstandard analysis. At that point it was no longer reasonable to say that banishing infinitesimals from analysis was an intellectual advance. What Robinson did was to extend the standard real numbers to include infinitesimals, using this definition: h is infinitesimal if and only if its absolute value is less than 1/n, for every positive standard number n. Robinson went on to create a nonstandard model of analysis using hyperreal numbers. The class of hyperreal numbers contains counterparts of the reals, but in addition it contains any number that is the sum, or difference, of both a standard real number and an infinitesimal number, such as 3 + h and 3 – 4h². The reciprocal of a nonzero infinitesimal is an infinite hyperreal number. These hyperreals obey the usual rules of real numbers except for the Archimedean axiom. Infinitesimal distances between distinct points are allowed, unlike with standard real analysis. The derivative is defined in terms of the ratio of infinitesimals, in the style of Leibniz, rather than in terms of a limit as in standard real analysis in the style of Weierstrass.
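
As a worked illustration of the Leibniz-style definition, consider the function f(x) = x² at a standard real number x, with h a nonzero infinitesimal. The sketch below assumes the "standard part" operation st, which rounds a finite hyperreal to the unique standard real infinitely close to it; this operation belongs to Robinson's framework but is not named in the text above, so the notation here is an assumption made for the example.

```latex
\[
f'(x) \;=\; \operatorname{st}\!\left(\frac{f(x+h)-f(x)}{h}\right)
      \;=\; \operatorname{st}\!\left(\frac{(x+h)^{2}-x^{2}}{h}\right)
      \;=\; \operatorname{st}\,(2x+h)
      \;=\; 2x .
\]
```

The ratio 2x + h differs from 2x only by an infinitesimal, so taking the standard part gives the same answer that the limit definition gives in standard analysis.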

Nonstandard analysis is called “nonstandard” because it was inspired by Thoralf Skolem’s demonstration in 1933 of the existence of models of first-order arithmetic that are not isomorphic to the standard model of arithmetic. What makes them nonstandard is especially that they contain infinitely large (hyper)integers. For nonstandard calculus one needs nonstandard models of real analysis rather than just of arithmetic. An important feature demonstrating the usefulness of nonstandard analysis is that it achieves essentially the same theorems as those in classical calculus. The treatment of Zeno’s paradoxes is interesting from this perspective. See McLaughlin (1994) for how Zeno’s paradoxes may be treated using infinitesimals. McLaughlin believes this approach to the paradoxes is the only successful one, but commentators generally do not agree with that conclusion, and consider it merely to be an alternative solution. See Dainton (2010) pp. 306-9 for some discussion of this.

f. Smooth Infinitesimal Analysis

Abraham Robinson in the 1960s resurrected the infinitesimal as an infinitesimal number, but F. W. Lawvere in the 1970s resurrected the infinitesimal as an infinitesimal magnitude. His work is called “smooth infinitesimal analysis” and is part of “synthetic differential geometry.” In smooth infinitesimal analysis, a curved line is composed of infinitesimal tangent vectors. One significant difference from a nonstandard analysis, such as Robinson’s above, is that all smooth curves are straight over infinitesimal distances, whereas Robinson’s can curve over infinitesimal distances. In smooth infinitesimal analysis, Zeno’s arrow does not have time to change its speed during an infinitesimal interval. Smooth infinitesimal analysis retains the intuition that a continuum should be smoother than the continuum of the Standard Solution. Unlike both standard analysis and nonstandard analysis whose real number systems are set-theoretical entities and are based on classical logic, the real number system of smooth infinitesimal analysis is not a set-theoretic entity but rather an object in a topos of category theory, and its logic is intuitionist. (Harrison, 1996, p. 283) Like Robinson’s nonstandard analysis, Lawvere’s smooth infinitesimal analysis may also be a promising approach to a foundation for real analysis and thus to solving Zeno’s paradoxes, but there is no consensus that Zeno’s Paradoxes need to be solved this way. For more discussion see note 11 in Dainton (2010) pp. 420-1.
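
The claim that a smooth curve is straight over an infinitesimal distance can be loosely illustrated with "dual numbers," that is, numbers of the form a + b·ε in which ε·ε is stipulated to be zero. The following is only a minimal sketch under stated assumptions: dual numbers are a classical algebraic analogy and not Lawvere's system itself, which is built in a topos with intuitionistic logic. The names Dual, slope, and parabola are hypothetical helpers introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number a + b*eps where eps*eps == 0 (illustrative analogy only)."""
    a: float  # ordinary part
    b: float  # coefficient of the nilsquare infinitesimal eps

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, because eps*eps = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__


def _lift(x):
    """Treat an ordinary number as a Dual with no infinitesimal part."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def slope(f, x):
    """Read off the slope of f at x from f(x + eps) = f(x) + slope*eps."""
    return f(Dual(x, 1.0)).b


def parabola(t):
    """The curve y = t*t, used as a sample smooth curve."""
    return t * t
```

In this sketch, evaluating parabola at 3 + ε yields 9 + 6ε with no ε² term left over, so over the infinitesimal step the curve behaves exactly like the straight line with slope 6.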

6. The Legacy and Current Significance of the Paradoxes

What influence has Zeno had? He had none in the East, but in the West there has been continued influence and interest up to today.

Let’s begin with his influence on the ancient Greeks. Before Zeno, philosophers expressed their philosophy in poetry, and he was the first philosopher to use prose arguments. This new method of presentation was destined to shape almost all later philosophy, mathematics, and science. Zeno drew new attention to the idea that the way the world appears to us is not how it is in reality. Zeno probably also influenced the Greek atomists to accept atoms. Aristotle was influenced by Zeno to use the distinction between actual and potential infinity as a way out of the paradoxes, and careful attention to this distinction has influenced mathematicians ever since. The proofs in Euclid’s Elements, for example, used only potentially infinite procedures. Awareness of Zeno’s paradoxes made Greek and all later Western intellectuals more aware that mistakes can be made when thinking about infinity, continuity, and the structure of space and time, and it made them wary of any claim that a continuous magnitude could be made of discrete parts. “Zeno’s arguments, in some form, have afforded grounds for almost all theories of space and time and infinity which have been constructed from his time to our own,” said Bertrand Russell in the twentieth century.

There is controversy in the recent literature about whether Zeno developed any specific, new mathematical techniques. Some scholars claim Zeno influenced the mathematicians to use the indirect method of proof (reductio ad absurdum), but others disagree and say it may have been the other way around. Other scholars take the internalist position that the conscious use of the method of indirect argumentation arose in both mathematics and philosophy independently of each other. See Hintikka (1978) for a discussion of this controversy about origins. Everyone agrees the method was Greek and not Babylonian, as was the method of proving something by deducing it from explicitly stated assumptions. G. E. L. Owen (Owen 1958, p. 222) argued that Zeno influenced Aristotle’s concept of motion not existing at an instant, which implies there is no instant when a body begins to move, nor an instant when a body changes its speed. Consequently, says Owen, Aristotle’s conception is an obstacle to a Newton-style concept of acceleration, and this hindrance is “Zeno’s major influence on the mathematics of science.” Other commentators consider Owen’s remark to be slightly harsh regarding Zeno because, they ask, if Zeno had not been born, would Aristotle have been likely to develop any other concept of motion?

Zeno’s paradoxes have received some explicit attention from scholars throughout later centuries. Pierre Gassendi in the early 17th century mentioned Zeno’s paradoxes as the reason to claim that the world’s atoms must not be infinitely divisible. Pierre Bayle’s 1696 article on Zeno drew the skeptical conclusion that, for the reasons given by Zeno, the concept of space is contradictory. In the early 19th century, Hegel suggested that Zeno’s paradoxes supported his view that reality is inherently contradictory.

Zeno’s paradoxes caused mistrust in infinities, and this mistrust has influenced the contemporary movements of constructivism, finitism, and nonstandard analysis, all of which affect the treatment of Zeno’s paradoxes. Dialetheism, the acceptance of true contradictions via a paraconsistent formal logic, provides a newer, although unpopular, response to Zeno’s paradoxes, but dialetheism was not created specifically in response to worries about Zeno’s paradoxes. With the introduction in the 20th century of thought experiments about supertasks, interesting philosophical research has been directed towards understanding what it means to complete a task.

Zeno’s paradoxes are often pointed to as a case study in how a philosophical problem has been solved, even though the solution took over two thousand years to materialize.

So, Zeno’s paradoxes have had a wide variety of impacts upon subsequent research. Little research today is involved directly in how to solve the paradoxes themselves, especially in the fields of mathematics and science, although discussion continues in philosophy, primarily on whether a continuous magnitude should be composed of discrete magnitudes, such as whether a line should be composed of points. If there are alternative treatments of Zeno's paradoxes, then this raises the issue of whether there is a single solution to the paradoxes or several solutions or one best solution. The answer to whether the Standard Solution is the correct solution to Zeno’s paradoxes may also depend on whether the best physics of the future that reconciles the theories of quantum mechanics and general relativity will require us to assume spacetime is composed at its most basic level of points, or, instead, of regions or loops or something else.

From the perspective of the Standard Solution, the most significant lesson learned by researchers who have tried to solve Zeno’s paradoxes is that the way out requires revising many of our old theories and their concepts. We have to be willing to rank the virtues of preserving logical consistency and promoting scientific fruitfulness above the virtue of preserving our intuitions. Zeno played a significant role in causing this progressive trend.

7. References and Further Reading

  • Arntzenius, Frank. (2000) “Are there Really Instantaneous Velocities?”, The Monist 83, pp. 187-208.
    • Examines the possibility that a duration does not consist of points, that every part of time has a non-zero size, that real numbers cannot be used as coordinates of times, and that there are no instantaneous velocities at a point.
  • Barnes, J. (1982). The Presocratic Philosophers, Routledge & Kegan Paul: Boston.
    • A well respected survey of the philosophical contributions of the Pre-Socratics.
  • Barrow, John D. (2005). The Infinite Book: A Short Guide to the Boundless, Timeless and Endless, Pantheon Books, New York.
    • A popular book in science and mathematics introducing Zeno’s Paradoxes and other paradoxes regarding infinity.
  • Benacerraf, Paul (1962). “Tasks, Super-Tasks, and the Modern Eleatics,” The Journal of Philosophy, 59, pp. 765-784.
    • An original analysis of Thomson’s Lamp and supertasks.
  • Bergson, Henri (1946). Creative Mind, translated by M. L. Andison. Philosophical Library: New York.
    • Bergson demands the primacy of intuition in place of the objects of mathematical physics.
  • Black, Max (1950-1951). “Achilles and the Tortoise,” Analysis 11, pp. 91-101.
    • A challenge to the Standard Solution to Zeno’s paradoxes. Black agrees that Achilles did not need to complete an infinite number of sub-tasks in order to catch the tortoise.
  • Cajori, Florian (1920). “The Purpose of Zeno’s Arguments on Motion,” Isis, vol. 3, no. 1, pp. 7-20.
    • An analysis of the debate regarding the point Zeno is making with his paradoxes of motion.
  • Cantor, Georg (1887). “Über die verschiedenen Ansichten in Bezug auf die actualunendlichen Zahlen.” Bihang till Kongl. Svenska Vetenskaps-Akademien Handlingar, Bd. 11 (1886-7), article 19. P. A. Norstedt & Söner: Stockholm.
    • A very early description of set theory and its relationship to old ideas about infinity.
  • Chihara, Charles S. (1965). “On the Possibility of Completing an Infinite Process,” Philosophical Review 74, no. 1, pp. 74-87.
    • An analysis of what we mean by “task.”
  • Copleston, Frederick, S.J. (1962). “The Dialectic of Zeno,” chapter 7 of A History of Philosophy, Volume I, Greece and Rome, Part I, Image Books: Garden City.
    • Copleston says Zeno’s goal is to challenge the Pythagoreans who denied empty space and accepted pluralism.
  • Dainton, Barry. (2010). Time and Space, Second Edition, McGill-Queens University Press: Ithaca.
    • Chapters 16 and 17 discuss Zeno's Paradoxes.
  • Dauben, J. (1990). Georg Cantor, Princeton University Press: Princeton.
    • Contains Kronecker’s threat to write an article showing that Cantor’s set theory has “no real significance.” Ludwig Wittgenstein was another vocal opponent of set theory.
  • De Boer, Jesse (1953). “A Critique of Continuity, Infinity, and Allied Concepts in the Natural Philosophy of Bergson and Russell,” in Return to Reason: Essays in Realistic Philosophy, John Wild, ed., Henry Regnery Company: Chicago, pp. 92-124.
    • A philosophical defense of Aristotle’s treatment of Zeno’s paradoxes.
  • Diels, Hermann and W. Kranz (1951). Die Fragmente der Vorsokratiker, sixth ed., Weidmannsche Buchhandlung: Berlin.
    • A standard edition of the pre-Socratic texts.
  • Dummett, Michael (2000). “Is Time a Continuum of Instants?,” Philosophy, 2000, Cambridge University Press: Cambridge, pp. 497-515.
    • Promoting a constructive foundation for mathematics, Dummett’s formalism implies there are no instantaneous instants, so times must have rational values rather than real values. Times have only the values that they can in principle be measured to have; and all measurements produce rational numbers within a margin of error.
  • Earman J. and J. D. Norton (1996). “Infinite Pains: The Trouble with Supertasks,” in Paul Benacerraf: the Philosopher and His Critics, A. Morton and S. Stich (eds.), Blackwell: Cambridge, MA, pp. 231-261.
    • A criticism of Thomson’s interpretation of his infinity machines and the supertasks involved, plus an introduction to the literature on the topic.
  • Feferman, Solomon (1998). In the Light of Logic, Oxford University Press, New York.
    • A discussion of the foundations of mathematics and an argument for semi-constructivism in the tradition of Kronecker and Weyl, that the mathematics used in physical science needs only the lowest level of infinity, the infinity that characterizes the whole numbers. Presupposes considerable knowledge of mathematical logic.
  • Freeman, Kathleen (1948). Ancilla to the Pre-Socratic Philosophers, Harvard University Press: Cambridge, MA. Reprinted in paperback in 1983.
    • One of the best sources in English of primary material on the Pre-Socratics.
  • Grünbaum, Adolf (1967). Modern Science and Zeno’s Paradoxes, Wesleyan University Press: Middletown, Connecticut.
    • A detailed defense of the Standard Solution to the paradoxes.
  • Grünbaum, Adolf (1970). “Modern Science and Zeno’s Paradoxes of Motion,” in (Salmon, 1970), pp. 200-250.
    • An analysis of arguments by Thomson, Chihara, Benacerraf and others regarding the Thomson Lamp and other infinity machines.
  • Hamilton, Edith and Huntington Cairns (1961). The Collected Dialogues of Plato Including the Letters, Princeton University Press: Princeton.
  • Harrison, Craig (1996). “The Three Arrows of Zeno: Cantorian and Non-Cantorian Concepts of the Continuum and of Motion,” Synthese, Volume 107, Number 2, pp. 271-292.
    • Considers smooth infinitesimal analysis as an alternative to the classical Cantorian real analysis of the Standard Solution.
  • Heath, T. L. (1921). A History of Greek Mathematics, Vol. I, Clarendon Press: Oxford. Reprinted 1981.
    • Promotes the minority viewpoint that Zeno had a direct influence on Greek mathematics, for example by eliminating the use of infinitesimals.
  • Hintikka, Jaakko, David Gruender and Evandro Agazzi. Theory Change, Ancient Axiomatics, and Galileo’s Methodology, D. Reidel Publishing Company, Dordrecht.
    • A collection of articles that discuss, among other issues, whether Zeno’s methods influenced the mathematicians of the time or whether the influence went in the other direction. See especially the articles by Karel Berka and Wilbur Knorr.
  • Kirk, G. S., J. E. Raven, and M. Schofield, eds. (1983). The Presocratic Philosophers: A Critical History with a Selection of Texts, Second Edition, Cambridge University Press: Cambridge.
    • A good source in English of primary material on the Pre-Socratics with detailed commentary on the controversies about how to interpret various passages.
  • Maddy, Penelope (1992). “Indispensability and Practice,” Journal of Philosophy 89, pp. 275-289.
    • Explores the implication of arguing that theories of mathematics are indispensable to good science, and that we are justified in believing in the mathematical entities used in those theories.
  • Matson, Wallace I (2001). “Zeno Moves!” pp. 87-108 in Essays in Ancient Greek Philosophy VI: Before Plato, ed. by Anthony Preus, State University of New York Press: Albany.
    • Matson supports Tannery’s non-classical interpretation that Zeno’s purpose was to show only that the opponents of Parmenides are committed to denying motion, and that Zeno himself never denied motion, nor did Parmenides.
  • McCarty, D.C. (2005). “Intuitionism in Mathematics,” in The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, Oxford, pp. 356-86.
    • Argues that a declaration of death of the program of founding mathematics on an intuitionistic basis is premature.
  • McLaughlin, William I. (1994). “Resolving Zeno’s Paradoxes,” Scientific American, vol. 271, no. 5, Nov., pp. 84-90.
    • How Zeno’s paradoxes may be explained using a contemporary theory of Leibniz’s infinitesimals.
  • Owen, G.E.L. (1958). “Zeno and the Mathematicians,” Proceedings of the Aristotelian Society, New Series, vol. LVIII, pp. 199-222.
    • Argues that Zeno and Aristotle negatively influenced the development of the Renaissance concept of acceleration that was used so fruitfully in calculus.
  • Posy, Carl. (2005). “Intuitionism and Philosophy,” in The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, Oxford, pp. 318-54.
    • Contains a discussion of how the unsplitability of Brouwer’s intuitionistic continuum makes precise Aristotle’s notion that “you can’t cut a continuous medium without some of it clinging to the knife,” on pages 345-7.
  • Proclus (1987). Proclus’ Commentary on Plato’s Parmenides, translated by Glenn R. Morrow and John M. Dillon, Princeton University Press: Princeton.
    • A detailed list of every comment made by Proclus about Zeno is available with discussion starting on p. xxxix of the Introduction by John M. Dillon. Dillon focuses on Proclus’ comments which are not clearly derivable from Plato’s Parmenides, and concludes that Proclus had access to other sources for Zeno’s comments, most probably Zeno’s original book or some derivative of it. William Moerbeke’s overly literal translation in 1285 from Greek to Latin of Proclus’ earlier, but now lost, translation of Plato’s Parmenides is the key to figuring out the original Greek. (see p. xliv)
  • Rescher, Nicholas (2001). Paradoxes: Their Roots, Range, and Resolution, Carus Publishing Company: Chicago.
    • Pages 94-102 apply the Standard Solution to all of Zeno's paradoxes. Rescher calls the Paradox of Alike and Unlike the "Paradox of Differentiation."
  • Russell, Bertrand (1914). Our Knowledge of the External World as a Field for Scientific Method in Philosophy, Open Court Publishing Co.: Chicago.
    • Russell champions the use of contemporary real analysis and physics in resolving Zeno’s paradoxes.
  • Salmon, Wesley C., ed. (1970). Zeno’s Paradoxes, The Bobbs-Merrill Company, Inc.: Indianapolis and New York. Reprinted in paperback in 2001.
    • A collection of the most influential articles about Zeno’s Paradoxes from 1911 to 1965. Salmon provides an excellent annotated bibliography of further readings.
  • Szabo, Arpad (1978). The Beginnings of Greek Mathematics, D. Reidel Publishing Co.: Dordrecht.
    • Contains the argument that Parmenides discovered the method of indirect proof by using it against Anaximenes’ cosmogony, although it was better developed in prose by Zeno. Also argues that Greek mathematicians did not originate the idea but learned of it from Parmenides and Zeno (pp. 244-250). These arguments are challenged in Hintikka (1978).
  • Tannery, Paul (1885). “Le Concept Scientifique du continu: Zenon d’Elee et Georg Cantor,” pp. 385-410 of Revue Philosophique de la France et de l’Etranger, vol. 20, Les Presses Universitaires de France: Paris.
    • This mathematician gives the first argument that Zeno’s purpose was not to deny motion but rather to show only that the opponents of Parmenides are committed to denying motion.
  • Tannery, Paul (1887). Pour l’Histoire de la Science Hellène: de Thalès à Empédocle, Alcan: Paris. 2nd ed. 1930.
    • More development of the challenge to the classical interpretation of what Zeno’s purposes were in creating his paradoxes.
  • Thomson, James (1954-1955). “Tasks and Super-Tasks,” Analysis, XV, pp. 1-13.
    • A criticism of supertasks. The Thomson Lamp thought-experiment is used to challenge Russell’s characterization of Achilles as being able to complete an infinite number of tasks in a finite time.
  • Tiles, Mary (1989). The Philosophy of Set Theory: An Introduction to Cantor’s Paradise, Basil Blackwell: Oxford.
    • A philosophically oriented introduction to the foundations of real analysis and its impact on Zeno’s paradoxes.
  • Vlastos, Gregory (1967). “Zeno of Elea,” in The Encyclopedia of Philosophy, Paul Edwards (ed.), The Macmillan Company and The Free Press: New York.
    • A clear, detailed presentation of the paradoxes. Vlastos comments that Aristotle does not consider any other treatment of Zeno’s paradoxes than by recommending replacing Zeno’s actual infinities with potential infinities, so we are entitled to assert that Aristotle probably believed denying actual infinities is the only route to a coherent treatment of infinity. Vlastos also comments that “there is nothing in our sources that states or implies that any development in Greek mathematics (as distinct from philosophical opinions about mathematics) was due to Zeno’s influence.”
  • White, M. J. (1992). The Continuous and the Discrete: Ancient Physical Theories from a Contemporary Perspective, Clarendon Press: Oxford.
    • A presentation of various attempts to defend finitism, neo-Aristotelian potential infinities, and the replacement of the infinite real number field with a finite field.
  • Wisdom, J. O. (1953). “Berkeley’s Criticism of the Infinitesimal,” The British Journal for the Philosophy of Science, Vol. 4, No. 13, pp. 22-25.
    • Wisdom clarifies the issue behind George Berkeley’s criticism (in 1734 in The Analyst) of the use of the infinitesimal (fluxion) by Newton and Leibniz. See also the references there to Wisdom’s other three articles on this topic in the journal Hermathena in 1939, 1941 and 1942.
  • Wolf, Robert S. (2005). A Tour Through Mathematical Logic, The Mathematical Association of America: Washington, DC.
    • Chapter 7 surveys nonstandard analysis, and Chapter 8 surveys constructive mathematics, including the contributions by Errett Bishop and Douglas Bridges.

Author Information

Bradley Dowden
California State University, Sacramento
U. S. A.

Phenomenology and Natural Science

Phenomenology provides an excellent framework for a comprehensive understanding of the natural sciences. It treats inquiry first and foremost as a process of looking and discovering rather than assuming and deducing. In looking and discovering, an object always appears to a someone, either an individual or community; and the ways an object appears and the state of the individual or community to which it appears are correlated.

To use the simplest of examples involving ordinary perception, when I see a cup, I see it only through a single profile. Yet to perceive it as real rather than a hallucination or prop is to apprehend it as having other profiles that will show themselves as I walk around it, pick it up, and so forth. No act of perception – not even a God’s – can grasp all of a thing’s profiles at once. The real is always more than what we can perceive.

Phenomenology of science treats discovery as an instrumentally mediated form of perception. When researchers detect the existence of a new particle or asteroid, they assume these will appear in other ways in other circumstances – and this can be confirmed or disconfirmed only by looking, in some suitably broad sense. It is obvious to scientists that electrons appear differently when addressed by different instrumentation (for example, wave-particle duality), and therefore that any conceptual grasp of the phenomenon involves instrumental mediation and anticipation. Not only is there no “view from nowhere” on such phenomena, but there is also no position from which we can zoom in on every available profile. There is no one privileged perception, and the instrumentally mediated “positions” from which we perceive constantly change.

Phenomenology looks at science from various “focal lengths.” Close up, it looks at laboratory life; at attitudes, practices, and objects in the laboratory. It also pulls back the focus and looks at forms of mediation – how things like instruments, theories, laboratories, and various other practices mediate scientific perception. It can pull the focus back still further and look at how scientific research itself is contextualized, in an environment full of ethical and political motivations and power relations. Phenomenology has also made specific contributions to understanding relativity, quantum mechanics, and evolution.

Table of Contents

  1. Introduction
  2. Historical Overview
  3. Science and Perception
  4. General Implications
    1. The Priority of Meaning over Technique
    2. The Priority of the Practical over the Theoretical
    3. The Priority of Situation over Abstract Formalization
  5. Layers of Experience
    1. First Phase: Laboratory Life
    2. Second Phase: Forms of Mediation
    3. Third Phase: Contextualization of Research
  6. Phenomenology and Specific Sciences
    1. Relativity
    2. Quantum Mechanics
    3. Evolution
  7. Conclusion
  8. References and Further Reading

1. Introduction

Phenomenology provides an excellent starting point, perhaps the only adequate starting point, for a comprehensive understanding of the natural sciences: their existence, practices, methods, products, and cultural niches. The reason is that, for a phenomenologist, inquiry is first and foremost a question of looking and discovering rather than assuming and deducing. In looking and discovering, an object is always given to a someone – be it an individual or community – and the object and its manners of givenness are correlated. In the special terminology of phenomenology, this is the doctrine of intentionality (for example, see Cairns 1999). This doctrine has nothing to do with the distinction between “inner” and “outer” experiences, but is a simple fact of perception. To use the time-honored phenomenological example, even when I see an ordinary object such as a cup, I apprehend it only through a single appearance or profile. Yet for me to perceive it as a real object – rather than a hallucination or prop – I apprehend it as having other profiles that will show themselves as I walk around it, pick it up, and so forth, each profile flowing into the next in an orderly, systematic way. I do more than expect or deduce these profiles; the act of perceiving a cup contains anticipations of other acts in which the same object will be experienced in other ways. That’s what gives my experience of the world its depth and density. Perhaps I will discover that my original perception was misled, and my anticipations were mere assumptions; still, I discover this only through looking and discovering – through sampling other profiles. In science, too, when researchers propose the existence of a new particle or asteroid, such a proposal involves anticipations of that entity appearing in other ways in other circumstances, anticipations that can be confirmed or disconfirmed only by looking, in some suitably broad sense (Crease 1993). 
In ordinary perception, each appearance and profile (noema) is correlated with a particular position of the one who apprehends it (noesis); a change in either one (the cup turning, the person moving) affects the profile apprehended. This is called the noetic-noematic correlation. In science, the positioning of the observer is technologically mediated; what a particle or cell looks like depends in part on the state of instrumentation that mediates the observation.

Another core doctrine of phenomenology is the lifeworld (Crease 2011). Human beings, that is, engage the world in different ways. For instance, they seek wealth, fame, pleasure, companionship, happiness, or “the good”. They do this as children, adolescents, parents, merchants, athletes, teachers, and administrators. All these ways of being are modifications of a matrix of practical attachments that human beings have to the world that precedes any cognitive understanding. The lifeworld is the technical term phenomenologists have for this matrix. The lifeworld is the soil out of which grow various ways of being, including science. Understanding photosynthesis or quantum field theory, for instance, is only one – and very rare – way that human beings interact with plants or matter, and not the default setting. Humans have to be trained to see the world that way; they have to pay a special kind of attention and pursue a special kind of inquiry. Thus the subject-inquirer (again, whether individual or community) is always bound up with what is being inquired into by practical engagements that precede the inquiry, engagements that can be altered by and in the wake of the inquiry. It is terribly tempting for metaphysicians to “kick away the ladder of lived experience” from scientific ontology as a means to gain some sort of privileged access to the world that bypasses lifeworld experience, but this condemns science to being “empty fictions” (Vallor 2009).

The aim of phenomenology is to unearth invariants in noetic-noematic correlations, to make forms or structures of experience appear that are hidden in ordinary, unreflective life, or the natural attitude. Again, the parallel with scientific methodology is uncanny; scientific inquiry aims to find hidden forms or structures of the world by varying, repeating, or otherwise changing interventions into nature to see what remains the same throughout. Phenomenologists seek invariant structures at several different phases or levels – including that of the investigator, the laboratory, and the lifeworld – and can examine not only each phase or level, but the relation of each to the others. Over the last hundred years, this has generated a vast and diverse body of literature (Ginev 2006; Kockelmans and Kisiel 1970; Chasan 1992; Hardy and Embree 1992; McGuire and Tuchanska 2001; Gutting 2005).

2. Historical Overview

Phenomenology started out, in Husserl’s hands, well-positioned to develop an account of science. After all, Husserl was at the University of Göttingen during the years when David Hilbert, Felix Klein, and Emmy Noether were developing and extending the notion of invariance and group theory. Husserl not only had a deep appreciation for mathematics and natural science, but his approach was allied in many key respects with theirs, for he extended the notion of invariance to perception by viewing the experience of an object as of something that remains the same in the flux of changing sensory conditions produced by changing physical conditions. This may seem far-removed from the domain of mathematics but it is not. Klein's Erlanger program viewed mathematical objects as not representable geometrically all at once but rather in definite and particular ways, depending on the planes on which they were projected; the mathematical object remained the same throughout different projections. In an analogous way, Husserl’s phenomenological program viewed a sensuously apprehended object as not given to an experiencing subject all at once but rather via a series of adumbrations or profiles, one at a time, that depend on the respective positioning of subject and object. The “same” object – even light of a certain wavelength – can look very different to human observers in different conditions. What is different about Husserl’s program, and may make it seem removed from the mathematical context, is that these profiles are not mathematical projections but lifeworld experiences. What remained to be added to the phenomenological approach to create a fuller framework for a natural philosophy of science was a notion of perceptual fulfillment under laboratory conditions, and of the theoretical planning and instrumental mediation leading to the observing of a scientific object. 
The “same” structure – for example, a cell – will look very different using microscopes of different magnification and quality, and phenomenology easily provides an account for this (Crease 2009).

Despite this promising beginning, many phenomenologists after Husserl turned away from the sciences, sometimes even displaying a certain paternalistic and superior attitude towards them as impoverished forms of revealing. This is unwarranted. Husserl’s objection to rationalistic science in the Crisis of the European Sciences was after all not to science but to the Galilean assumption that the ontology of nature could be provided by mathematics alone, bypassing the lifeworld (Gurwitsch 1966, Heelan 1987). And Heidegger’s objection, in Being and Time, most charitably considered, was not to theoretical knowledge, but to the forgetting of the fact that it is a founded mode in the lifeworld, to be interpreted not merely as an aid to disclosure but as a special and specialized mode of access to the real itself. Others who followed, including Gadamer and Merleau-Ponty, for various reasons did not pursue the significance of phenomenology for natural science.

Science also lagged behind other areas of phenomenological inquiry for historical reasons. The dramatic success of Einstein’s theory of general relativity, in 1919, brought “a watershed for subsequent philosophy of science” that proved to be detrimental to the prospects of phenomenology for science (Ryckman 2005). Kant’s puzzling and ambiguous doctrine of the schematism – according to which intuitions, which are a product of sensibility, and categories, which are a product of understanding, are synthesized by rules or schemata to produce experience – had nurtured two very different approaches to the philosophy of science. One, taken by logical empiricists, rejected the schematism and treated sensibility and the understanding as independent, and the line between the intuitive and the conceptual as that between experienced physical objects and abstract mathematical frameworks. The empiricists saw these two as linked by rules of coordination that applied the latter to the former. Such coordination – the subjective contribution of mind to knowledge – produced objective knowledge. The other, more phenomenological route was to pursue the insight that experience is possible only thanks to the simultaneous co-working of intuitions and concepts. While some forms and categories are subject to replacement, producing a “relativized a priori” (my conception of things like electrons, cells, and simultaneity may change), such forms and categories make experience possible. Objective knowledge arises not by an arbitrary application of concepts to intuitions – it is not just a decision of consciousness – but is a function of the fulfillment of physical conditions of possible conscious experience; scientists look at photographic plates or information collected by detectors in laboriously prepared conditions that assure them that such information is meaningful and not noise.
Husserl’s phenomenological approach to transcendental structures, though, must be contrasted with Kant’s, for while Kant’s transcendental concepts are deduced, Husserl’s are reflectively observed and described. However, following the stunning announcement of the success of general relativity in 1919, which seemed to destroy transcendental assumptions about at least the Euclidean form of space and about absolute time, logical empiricists were quick to claim it vindicated their approach and refuted not only Kant but all transcendental philosophy. “Through Einstein … the Kantian position is untenable,” Schlick declared, “and empiricist philosophy has gained one of its most brilliant triumphs.” But the alleged vanquishing of transcendental philosophy and triumph of logical empiricism’s claims to understand science was due to “rhetoric and successful propaganda” rather than argument (Ryckman 2005). For as other transcendental philosophers such as Ernst Cassirer, and philosophically sophisticated scientists such as Hermann Weyl, realized, in making claims about the forms of possible phenomena general relativity called for what amounted to a revision, rather than a refutation, of Kant’s doctrine; how we may experience spatiality in ordinary life remains unaffected by Einstein’s theory. But the careers of both Cassirer and Weyl took them away from such questions, and nobody else took their place.

3. Science and Perception

One way of exhibiting the deep link between phenomenology and science is to note that phenomenology is concerned with the difference between local effects and global structures in perception. To use the time-honored example of perceiving a cup through a profile again: Grasping it under that particular adumbration or profile is a local effect, though what I intend is a global structure – the phenomenon – with multiple horizons of profiles. Phenomenology aims to exhibit how the phenomenon is constituted in describing these horizons of profiles. But this of course is closely related to the aim of science, which seeks to describe how phenomena (for example, electrons) appear differently in different contexts – and even, in the case of general relativity, incorporates a notion of invariance into the very notion of objectivity itself (Ryckman 2005). An objective state of affairs, that is, is one that has the same description regardless of whether the frame of reference from which it is observed is accelerating or not.

In science, however, perceiving (observing) is mediated by theory and instruments. Thanks to theories, the lawlike behavior of scientific phenomena (for example, how electrons behave in different conditions) is represented or “programmed” and then correlated with instrumental techniques and practices so that a phenomenon appears. The theory (for example, electromagnetism) thus structures both the performance process thanks to which the phenomenon appears, and the phenomenon itself. Read noetically, with respect to production, the theory is something to be performed; read noematically, with respect to the product, it describes the object appearing in performance. A theory does not correspond to a scientific phenomenon; rather, the phenomenon fulfills or does not fulfill the expectations of its observation raised by the theory. Is this an electron beam or not? To decide that, its behavior has to be evaluated. Theory provides a language that the experimenter can use for describing or recognizing or identifying the profiles. For the theorist, the semantics of the language is mathematical; for the experimenter, the semantics is descriptive and the objects described are not mathematical objects but phenomena – bodily presences in the world. Thus the dual semantics of science (Heelan 1988); a scientific word (such as ‘electron’) can refer both to an abstract term in a theory and to a physical phenomenon in a laboratory. The difference is akin to that between a ‘C’ in a musical score and a ‘C’ heard in a concert hall. Conflating these two usages has confused many a philosopher of science. But our perception of the physical phenomenon in the laboratory has been mediated by the instruments used to produce and measure it (Ihde 1990).

By adding theoretical and experimental mediation to Husserl’s account of what is “constitutive” of perceptual horizons (Husserl 2001, from where the following quotations are taken except where noted), one generates a framework for a phenomenological account of science. To grasp a scientific object, like a perceptual object, as a presence in the world, as “objective,” means, strangely enough, to grasp it as never totally given, but as having an unbounded number of profiles that are not simultaneously grasped. Such an object is embedded in a system of “referential implications” available to us to explore over time. And it is rarely grasped with Cartesian clarity the first time around, but “calls out to us” and “pushes us” towards appearances not simultaneously given. A new property, for example parity violation, is detected in one area of particle physics – but if it shows up here it should also show up there even more intensely and dramatically. Entities, that is, show themselves as having further sides to be explored, and as amenable to better and better instrumentation. Phenomena even, as it were, call attention to their special features – strangeness in elementary particles, DNA in cells, gamma ray bursters amongst astronomical bodies – and recommend these features to us for further exploration. “There is a constant process of anticipation, of preunderstanding.” With sufficient apprehension of sampled profiles, “The unfamiliar object is … transformed … into a familiar object.” This involves development both of an inner horizon of profiles already apprehended, already sampled, and an external horizon of not-yet-apprehended profiles. But the object is never fully grasped in its complete presence; horizons remain, and the most one can hope for is for a thing to be given optimally in terms of the interests for which it is approached. And because theory and instruments are always changing, the same object will always be grasped with new profiles.
Thus, Husserl’s phenomenological account readily handles the often vexing question in traditional philosophy of science of how “the same” experiment can be “repeated.”  It equally readily handles the even more troublesome puzzle in traditional approaches of how successive theories or practices can refer to the same object. For just as the same object can be apprehended “horizontally” in different instrumental contexts at the same time, it can also be apprehended “vertically” by successively more developed instrumentation. Husserl, for instance, refers to the “open horizon of conceivable improvement to be further pursued” (Husserl Crisis #9a). Newer, more advanced instruments will pick out the same entity (for example, an electron), yield new values for measurements of the same quantities, and open up new domains in which new phenomena will appear amid the ones that now appear on the threshold. Today’s discovery is tomorrow’s background.

The basic account of perception given above has been further elaborated in the context of group theory by Ernst Cassirer in a remarkable article (Cassirer 1944). Cassirer extends the attempts of Helmholtz, Poincaré and others to apply the mathematical concept of group to perception in a way that makes it suitable to the philosophy of science. Group theory may seem far from the perceptual world, Cassirer says. But the perceptual world, like the mathematical world, is structured; it possesses perceptual constancy in a way that cannot be reduced to “a mere mosaic, an aggregate of scattered sensations” but involves a certain kind of invariance. Perception is integrated into a total experience in which keeping track of “dissimilarity rather than similarity” is a hallmark of the same object. The cup is going to look different as the light changes and as I move about it. “As the particular changes its position in the context, it changes its ‘aspect.’” Thus, Cassirer writes, “the ‘possibility of the object’ depends upon the formation of certain invariants in the flux of sense-impressions, no matter whether these be invariants of perception or of geometrical thought, or of physical theory. The positing of something endowed with objective existence and nature depends on the formation of constants of the kinds mentioned …. The truth is that the search for constancy, the tendency toward certain invariants, constitutes a characteristic feature and immanent function of perception. This function is as much a condition of perception of objective existence as it is a condition of objective knowledge.” The constitutive factor of objective knowledge, Cassirer concludes, “manifests itself in the possibility of forming invariants.” Again, one needs to flesh out such an approach with an account of fulfillment as mediated both theoretically and practically.

4. General Implications

The above, it will be seen, has three general implications for philosophy of science:

a. The Priority of Meaning over Technique

In contrast to positivist-inspired and much mainstream philosophy of science, a phenomenological approach does not view science as pieced together at the outset from praxes, techniques, and methods. Praxes, techniques, and methods – as well as data and results – come into being by interpretation. The generation of meaning does not move from part to whole, but via a back-and-forth (hermeneutical) process in which phenomena are projected upon an already-existing framework of meaning, the assumptions of which are at least partially brought into question, and by this action further reviewed and refined within the ongoing process of interpretation. This process is amply illustrated by episode after episode in the history of science. Relativity theory evolved as a response to problems and developments experienced by scientists working within Newtonian theory.

b. The Priority of the Practical over the Theoretical

The framework of meaning mentioned above, in terms of which phenomena are interpreted, does not consist merely of tools, texts, and ideas, but involves a culturally and historically determined engagement with the world which is prior to the subject-object separation. On the one hand, this means that the meanings generated by science are not ahistorical forms or natural kinds that have a transcendent origin. On the other hand, it means that these meanings are also not arbitrary or mere artifacts of discourse; science has a “historical space” in which meanings are realized or not realized. Results are right or wrong; theories are adjudicated as true or false. Later, as the historical space changes, the “same” theory (or more fully developed versions thereof) may be confirmed by different results inconsistent with previous confirmations of the earlier version. What a “cell” is may look very different depending on the techniques and instruments used to apprehend it, but what is happening is not a wholesale replacement of one picture or theory by another, but expanding and evolving knowledge (Crease 2009).

c. The Priority of Situation over Abstract Formalization

Truth always involves a disclosure of something to someone in a particular cultural and historical context. Even scientific knowledge can never completely transcend these culturally and historically determined involvements, leaving them behind as if scientific knowledge consisted in abstractions viewed from nowhere in particular. The particularity of the phenomena disclosed by science is often disguised by the fact that they can show themselves in many different cultural and historical contexts if the laboratory conditions are right, giving rise to the illusion of disembodied knowledge.

5. Layers of Experience

These three implications suggest a way of ordering the kinds of contributions that a phenomenology can make to the philosophy of science. For there are several different phases – focal lengths, one might say – at which to set one’s phenomenology, and it is important to distinguish between them. The focal length can be trained within the laboratory on laboratory life, and investigate the attitudes, practices, and objects encountered in the laboratory. These, however, are nested in the laboratory environment and in the structure of scientific knowledge, which is their exterior expression. Another phase concerns the forms of mediation, both theoretical and instrumental, and how these contextualize the phase just mentioned of attitudes, practices, and objects, and how these are related to their exterior. This phase is nested in turn in another kind of environment, the lifeworld itself, with its ethical and political motivations and power relations. The contribution of phenomenology to the philosophy of science is first of all to describe these phases and how they are nested in each other, and then to describe and characterize each. A philosophical account of science cannot begin, nor is it complete, without a description of these phases.

a. First Phase: Laboratory Life

One phase has to do with specific attitudes, practices, or objects encountered by a researcher doing research in the laboratory environment – with the phenomenology of laboratory perception. Inquiry is one issue here. Conventional textbooks often treat the history of science as a sequence of beliefs about the state of the world, as if it were like a series of snapshots. This creates problems having to do with accounting for how these beliefs change, how they connect up, and what such change implies about the continuity of science. It also rings artificial from the standpoint of laboratory practice. A phenomenological approach, by contrast, considers the path of science as rather like an evolving perception, as a continual process that cannot be neatly dissected into what is in question and what is not, what you believe and what you do not. The affects of research are another issue. The moment of experience involves more than knowledge, global or local, more than iterations and reiterations. Affects like wonder, astonishment, surprise, incredulity, fascination, and puzzlement are important to inquiry, in mobilizing the transformation of the discourse and our basic way of being related to a field of inquiry. They indicate to us the presence of something more than what is formulated, yet also not arbitrary. When something unexpected happens, it is not a matter of drawing a conceptual blank. When something unexpected and puzzling happens in the lab, it involves a discomfort from running into something that you think you should understand and you do not. Taking that discomfort with you is essential to what transformations ensue. Other key issues of the phenomenology of laboratory experience include trust, communication, data, measurement, and experiment (Crease 1993). Experiment is an especially important topic. For there is nothing automatic about experimentation; experiments are first and foremost material events in the world.
Events do not produce numbers – they do not measure themselves – but do so only when an action is planned, prepared, and witnessed. An experiment, therefore, has the character of a performance, and like all performances is a historically and culturally situated hermeneutical process. Scientific objects that appear in laboratory performances may have to be brought into focus, somewhat like the ship that Merleau-Ponty describes that has run aground on the shore, whose pieces are at first mixed confusingly with the background, filling us with a vague tension and unease, until our sight is abruptly recast and we see a ship, accompanied by release of the tension and unease (Crease 1998). In the laboratory, however, what is at first latent in the background and then recognized as an entity belongs to an actively structured process. We are staging what we are trying to recognize, and the way we are staging it may interfere with our recognition and the experiment may have to be restaged to bring the object into better focus.

b. Second Phase: Forms of Mediation

Second-order features have to do with understanding the contextualization of the laboratory itself. For the laboratory is a special kind of environment. The laboratory is like a garden, walled off to a large extent from the wider and wilder surrounding environment outside. Special things are grown in it that may not appear in the outside world, yet are related to it and help us understand it. To some extent, the laboratory can be examined as the product or embodiment of discursive formations imposing power and unconditioned knowledge claims (Rouse 1987). But only to a limited extent. For the laboratory is not like an institution in which all practices are supposed to work in the same way without changing. It thus cannot be understood by studying discursive formations of power and knowledge exclusively; it is unlike a prison or military camp. A laboratory is a place designed to make it possible to stage performances that show themselves at times as disruptive of discourse, to explore such performances and make sure there really is a disruption, and then to foster creation of a new discourse.

c. Third Phase: Contextualization of Research

A third phase has to do with the contextualization of research itself, with approaches to the whole of the world, and with understanding why human beings have come to privilege certain kinds of inquiry over others. The lifeworld – a kind of horizon or atmosphere in which we think, pre-loaded with powerful metaphors and images and deeply embedded habits of thought – has its own character and changes over time. This character affects everyone in it, scientists as well as philosophers who think about science. The conditions of the lifeworld can, for instance, seduce us into thinking that only the measurable is the real. This is the kind of layer addressed by Husserl’s Crisis (Husserl 1970), Heidegger’s “The Question Concerning Technology” (Heidegger 1977), and so forth. The distinction between the second and third phases thus parallels the distinction in sociology of science between micro-sociology and macro-sociology.

6. Phenomenology and Specific Sciences

Phenomenology has also been shown to contribute to understanding certain features or developments in contemporary theories which seem of particular significance for science itself, including relativity, quantum mechanics, and evolution.

a. Relativity

Ryckman (2005) highlights the role of phenomenology in understanding the structure and implications of general relativity and of certain other developments in contemporary physics. The key has to do with the role of general covariance, or the requirement that objects must be specified without any reference to a dynamical background space-time setting. Fields, that is, are not properties of space-time points or regions, they are those points and regions. The result of the requirement of general covariance is thus to remove the physical objectivity of space and time as independent of the mass and energy distribution that shapes the geometry of physical space and time. This, Ryckman writes, is arguably its “most philosophically significant aspect,” for it specifies “what is a possible object of fundamental physical theory.” The point was digested by transcendental philosophers who could understand relativity. One was Cassirer, who saw that covariance could not be treated as a principle of coordination between intuitions and formalisms, and thus was not part of the “subjective” contribution to science, as Schlick and his follower Hans Reichenbach were doing. Rather, it amounted to a restriction on what was allowed as a possible object of field theory to begin with. The requirement of general covariance meant that relativity was about a universe in which objects did not flit about on a space-time stage, but were that stage. Ryckman’s book also demonstrates the role of phenomenology in Weyl’s classic treatment of relativity, and in his formulation of the gauge principle governing the identity of units of measurement. Phenomenology thus played an important role in the articulation of general relativity, and certain concepts central to modern physics.

b. Quantum Mechanics

Phenomenology may also contribute to explaining the famous disparity between the clarity and correctness of the theory and the obscurity and inaccuracy of the language used to speak about its meaning. In Quantum Mechanics and Objectivity (Heelan 1965) and other writings (Heelan 1975), Heelan applies phenomenological tools to this issue. His approach is partly Heideggerian and partly Husserlian. What is Heideggerian is the insistence on the moment prior to object-constitution, the self-aware context or horizon or world or open space in which something appears. The actual appearing (or phenomenon) to the self is a second moment. This Heelan analyses in a Husserlian way by studying the intentionality structure of object constitution and insisting on the duality therein of its (embodied subjective) noetic and (embodied objective) noematic poles. “The noetic aspect is an open field of connected scientific questions addressed by a self-aware situated researcher to empirical experience; the noematic aspect is the response obtained from the situated scientific experiment by the experiencing researcher. The totality of actual and possible answers constitutes a horizon of actual and possible objects of human knowledge and this we call a World.” (Heelan 1965, x; also 3-4). The world then becomes the source of meaning of the word “real,” which is defined as what can appear as an object in the world. The ever-changing and always historical laboratory environment with all its ever-to-be-updated instrumentation and technologies belongs to the noetic pole; it is what makes the objects of science real by bringing them into the world in the act of measurement. Measurement involves “an interaction with a measuring instrument capable of yielding macroscopic sensible data, and a theory capable of explaining what it is that is measured and why the sensible data are observable symbols of it” (Heelan 1965, 30-1).
The difference between quantum and classical physics does not lie in the intervention of the observer’s subjectivity but in the nature of the quantum object: “[W]hile in classical physics this is an idealised normative (and hence abstract) object, in quantum physics the object is an individual instance of its idealised norm” (Heelan 1965, xii). For while in classical physics deviations of variables from their ideal norms are treated independently in a statistically based theory of errors, the variations (statistical distribution) of quantum measurements are systematically linked in one formalism. The apparent puzzle raised by the “reduction of the wave packet” is thus explained via an account of measurement. In the “orthodox” interpretation, the wave function is taken to be the “true” reality, and the act of measurement is seen as changing the incoming wave packet into one of its component eigenfunctions by an anonymous random choice. The sensible outcome of this change is the eigenvalue of the outgoing wave function, which is read from the measuring instrument. (An eigenfunction, very simply, is a function which has the property that, when an operation is performed on it, the result is that function multiplied by a constant, which is called the eigenvalue.) The agent of this transformation is the human spirit or mind as a doer of mathematics. Heelan also sees this process as depending on the conscious choice and participation of the scientist-subject, but through a much different process. The formulae relate, not to the ideal object in an absolute sense, apart from all human history, culture, and language, but to the physical situation in which the real object is placed, yielding a particular instance of an ensemble or system that admits of numerous potential experimental realizations.
The reduction of the wave packet then “is nothing more than the expression of the scientist’s choice and implementation of a measuring process; the post-measurement outcome is different from the means used to prepare the pure state” prior to the implementation of the measurement (Heelan 1965, 184). The wave function describes a situation which is imperfectly described as a fact of the real world; it describes a field of possibilities. That does not mean there is more to be discovered (“hidden variables”) which will make it a part of the real world, nor that only human participation is able to bring it into the real world, but that what becomes a fact of the real world does so by being fleshed out by an instrumental environment into one or another of its complementary presentations. Heelan’s work therefore shows the value of Continental approaches to the philosophy of science, and exposes the shortcomings of approaches to the philosophy of science which relegate such themes to “somewhere between mysticism and crossword puzzles” (Heelan 1965, x).
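
The parenthetical definition of an eigenfunction given above, and the “orthodox” reduction rule it serves, can be written compactly. The following is only a notational sketch: the symbols \hat{A}, \psi_n, a_n, and c_n are assumptions introduced here for illustration and are not taken from Heelan, and a normalized state with a discrete, non-degenerate spectrum is assumed.

```latex
% Eigenvalue equation: the operation \hat{A} applied to the eigenfunction \psi_n
% returns the same function multiplied by a constant a_n (the eigenvalue).
\hat{A}\,\psi_n = a_n\,\psi_n

% Elementary example: differentiation acting on an exponential function.
\frac{d}{dx}\,e^{kx} = k\,e^{kx}

% A wave packet written as a superposition of eigenfunctions; on the textbook
% reading, measurement yields the eigenvalue a_n with probability |c_n|^2 and
% leaves the system in the corresponding eigenfunction \psi_n.
\psi = \sum_n c_n\,\psi_n, \qquad P(a_n) = |c_n|^2, \qquad \psi \;\longrightarrow\; \psi_n
```

On the account described above, Heelan does not dispute this formalism; what he reinterprets is the meaning of the final arrow, reading it as the expression of a chosen measuring process rather than as a change worked on a “true” reality by an anonymous random choice.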

c. Evolution

One of the most significant discoveries of 20th century phenomenology was of what is variously called embodiment, lived body, flesh, or animate form, the experiences of which are those of a unified, self-aware being, and which cannot be understood apart from reflection on concrete human experience. The body is not a bridge that connects subject and world, but rather a primordial and unsurpassable unity productive of there being persons and worlds at all. Husserl was aware even of the significance of evolution and movement. His use of the expression “animate organism” betrays a recognition that he was discussing “something not exclusive to humans, that is, something broader and more fundamental than human animate organism” (Sheets-Johnstone 1999, 132); thus, a need to discuss matters across the evolutionary spectrum. Failing to examine our evolutionary heritage, in fact, means misconceiving the wellsprings of our humanity (Sheets-Johnstone 1999). Biologists who developed phenomenological treatments of animal behavior include von Uexküll, to whom Heidegger refers in the section on animals in Fundamental Concepts of Metaphysics, and Adolf Portmann, both of whom discussed the animal’s umwelt. And Sheets-Johnstone has emphasized that phenomenology needs to examine not only the ontogenetic dimension – infant behavior – but also the phylogenetic one. If we treat human animate form as unique we shirk our phenomenological duties and end up with incomplete and distorted accounts containing implicit and unexamined notions. “[G]enuine understandings of consciousness demand close and serious study of evolution as a history of animate form” (Sheets-Johnstone 1999, 42).

7. Conclusion

Developing a phenomenological account of science is important for the philosophy of science insofar as it has the potential to move us beyond a dead-end in which that discipline has entrapped itself. The dead-end involves having to choose between, on the one hand, assuming that a fixed, stable order pre-exists human beings that is uncovered or approximated in scientific activity, and, on the other hand, assuming that the order is imposed from the outside. Each approach is threatened, though in different ways, by the prospect of having to incorporate a role for history and culture. Phenomenology is not as threatened, for its core doctrine of intentionality implies that parts are only understood against the background of wholes and objects against the background of their horizons, and that while we discover objects as invariants within horizons, we also discover ourselves as those worldly embodied presences to whom the objects appear. It thus provides an adequate philosophical foundation for reintroducing history and culture into the philosophy of the natural sciences.

8. References and Further Reading

  • Babich, Babette, 2010. “Early Continental Philosophy of Science,” in A. Schrift, ed., The History of Continental Philosophy, V. 3, Durham: Acumen, 263-286.
  • Cairns, Dorion, 1999. “The Theory of Intentionality in Husserl,” ed. by L. Embree, F. Kersten, and R. Zaner, Journal of the British Society for Phenomenology 32: 116-124.
  • Cassirer, E. 1944. “The Concept of Group and the Theory of Perception,” tr. A. Gurwitsch, Philosophy and Phenomenological Research, pp. 1-35.
  • Chasan, S. 1992. “Bibliography of Phenomenological Philosophy of Natural Science,” in Hardy & Embree, 1992, pp. 265-290.
  • Crease, R. 1993. The Play of Nature. Bloomington, IN: Indiana University Press.
  • Crease, R. ed. 1997. Hermeneutics and the Natural Sciences. Dordrecht: Kluwer.
  • Crease, R. 1998. “What is an Artifact?”  Philosophy Today, SPEP Supplement, 160-168.
  • Crease, R. 2009. “Covariant Realism.”  Human Affairs 19, 223-232.
  • Crease, R. 2011. “Philosophy Rules.”  Physics World, August 2011.
  • Ginev, D. 2006. The Context of Constitution: Beyond the Edge of Epistemological Justification. Boston: Boston Studies in the Philosophy of Science.
  • Gurwitsch, A. 1966. “The Last Work of Husserl,” Studies in Phenomenology and Psychology, Evanston: Northwestern University Press, 1966, pp. 397-447.
  • Gutting, G. (ed.). 2005  Continental Philosophy of Science. Oxford: Blackwell.
  • Hardy, L., and Embree, L., 1992. Phenomenology of Natural Science. Dordrecht: Kluwer.
  • Heelan, P. 1965. Quantum Mechanics and Objectivity. The Hague: Nijhoff.
  • Heelan, P. 1967. Horizon, Objectivity and Reality in the Physical Sciences. International Philosophical Quarterly 7, 375-412.
  • Heelan, P. 1969. The Role of Subjectivity in Natural Science, Proc. Amer. Cath. Philos. Assoc. Washington, D.C.
  • Heelan, P. 1970a. Quantum Logic and Classical Logic: Their Respective Roles, Synthese 21: 2 - 33.
  • Heelan, P. 1970b. Complementarity, Context-Dependence and Quantum Logic. Foundations of Physics 1, 95-110.
  • Heelan, P. 1972. Toward a new analysis of the pictorial space of Vincent van Gogh. Art Bull. 54, 478-492.
  • Heelan, P. 1975. Heisenberg and Radical Theoretical Change. Zeitschrift für allgemeine Wissenchaftstheorie 6, 113-136, and following page 136.
  • Heelan, P. 1983. Space-Perception and the Philosophy of Science. Berkeley: University of California Press.
  • Heelan, P. 1987. “Husserl’s Later Philosophy of Science,” Philosophy of Science 54: 368-90.
  • Heelan, P. 1988. “Experiment and Theory: Constitution and Reality,” Journal of Philosophy 85, 515-24.
  • Heelan, P. 1991. Hermeneutical Philosophy and the History of Science. In Nature and Scientific Method: William A. Wallace Festschrift, ed. Daniel O. Dahlstrom. Washington, D.C.: Catholic University of America Press, 23-36.
  • Heelan, P. 1995 An Anti-epistemological or Ontological Interpretation of the Quantum Theory and Theories Like it, in Continental and Postmodern Perspectives in the Philosophy of Science, ed. by B. Babich, D. Bergoffen, and S. Glynn. Aldershot/Brookfield, VT: Avebury Press, 55-68.
  • Heelan, P. 1997. Why a hermeneutical philosophy of the natural sciences? In Crease 1997, 13-40.
  • Heidegger, M. 1977. The Question Concerning Technology, tr. W. Lovitt. New York: Garland.
  • Husserl, E. 1970. The Crisis of European Sciences and Transcendental Phenomenology, tr. D. Carr. Evanston: Northwestern University Press.
  • Husserl, E. 2001. Analyses Concerning Passive and Active Synthesis: Lectures on Transcendental Logic, tr. A. Steinbock. Boston: Springer.
  • Ihde, D. 1990. Technology and the Lifeworld. Bloomington: Indiana University Press.
  • Kockelmans, J. and Kisiel, T., 1970. Phenomenology and the Natural Sciences. Evanston: Northwestern University Press.
  • Mcguire, J. and Tuschanska, B. 2001. Science Unfettered: A Philosophical Study in Sociopolitical Ontology. Columbus: Ohio University Press.
  • Michl, Matthias, Towards a Critical Gadamerian Philosophy of Science, MA thesis, University of Auckland, 2005.
  • Rouse, J. 1987. Knowledge and Power: Toward a Political Philosophy of Science. Ithaca: Cornell University Press.
  • Ryckman, T. 2005. The Reign of Relativity: Philosophy in Physics 1915-1925. New York: Oxford.
  • Seebohm, T. 2004. Hermeneutics: Method and Methodology. Springer.
  • Sheets-Johnstone, M. 1999. The Primacy of Movement. Baltimore: Johns Benjamins.
  • Ströker, Elisabeth 1997. The Husserlian Foundations of Science. Boston: Kluwer.
  • Vallor, Shannon, 2009. “The fantasy of third-person science: Phenomenology, ontology, and evidence.”  Phenom. Cogn. Sci. 8: 1-15.


Author Information

Robert P. Crease
Stony Brook University
U. S. A.

Philosophy of Technology

Like many domain-specific subfields of philosophy, such as philosophy of physics or philosophy of biology, philosophy of technology is a comparatively young field of investigation. It is generally thought to have emerged as a recognizable philosophical specialization in the second half of the 19th century, its origins often being located with the publication of Ernst Kapp’s book, Grundlinien einer Philosophie der Technik (Kapp, 1877). Philosophy of technology continues to be a field in the making and as such is characterized by the coexistence of a number of different approaches to (or, perhaps, styles of) doing philosophy. This highlights a problem for anyone aiming to give a concise overview of the field because “philosophy of technology” does not name a clearly delimited academic domain of investigation that is characterized by a general agreement among investigators on what are the central topics, questions and aims, and who are the principal authors and positions. Instead, “philosophy of technology” denotes a considerable variety of philosophical endeavors that all in some way reflect on technology.

There is, then, an ongoing discussion among philosophers, scholars in science and technology studies, as well as engineers about what philosophy of technology is, what it is not, and what it could and should be. These questions will form the background against which the present article presents the field. Section 1 begins by sketching a brief history of philosophical reflection on technology from Greek Antiquity to the rise of contemporary philosophy of technology in the mid-19th to mid-20th century. This is followed by a discussion of the present state of affairs in the field (Section 2). In Section 3, the main approaches to philosophy of technology and the principal kinds of questions which philosophers of technology address are mapped out. Section 4 concludes by presenting two examples of current central discussions in the field.

Table of Contents

  1. A Brief History of Thinking about Technology
    1. Greek Antiquity: Plato and Aristotle
    2. From the Middle Ages to the Nineteenth Century: Francis Bacon
    3. The Twentieth Century: Martin Heidegger
  2. Philosophy of Technology: The State of the Field in the Early Twenty-First Century
  3. How Philosophy of Technology Can Be Done: The Principal Kinds of Questions That Philosophers of Technology Ask
  4. Two Exemplary Discussions
    1. What Is (the Nature of) Technology?
    2. Questions Regarding Biotechnology
  5. References and Further Reading

1. A Brief History of Thinking about Technology

The origin of philosophy of technology can be placed in the second half of the 19th century. But this does not mean that philosophers before the mid-19th century did not address questions that would today be thought of as belonging in the domain of philosophy of technology. This section will give the history of thinking about technology – focusing on a few key figures, namely Plato, Aristotle, Francis Bacon and Martin Heidegger.

a. Greek Antiquity: Plato and Aristotle

Philosophers in Greek antiquity already addressed questions related to the making of things. The terms “technique” and “technology” have their roots in the ancient Greek notion of “techne” (art, or craft-knowledge), that is, the body of knowledge associated with a particular practice of making (cf. Parry, 2008). Originally the term referred to a carpenter’s craft-knowledge about how to make objects from wood (Fischer, 2004: 11; Zoglauer, 2002: 11), but later it was extended to include all sorts of craftsmanship, such as the ship’s captain’s techne of piloting a ship, the musician’s techne of playing a particular kind of instrument, the farmer’s techne of working the land, the statesman’s techne of governing a state or polis, or the physician’s techne of healing patients (Nye, 2006: 7; Parry, 2008).

In classical Greek philosophy, reflection on the art of making involved both reflection on human action and metaphysical speculation about what the world was like. In the Timaeus, for example, Plato unfolded a cosmology in which the natural world was understood as having been made by a divine Demiurge, a creator who made the various things in the world by giving form to formless matter in accordance with the eternal Ideas. In this picture, the Demiurge’s work is similar to that of a craftsman who makes artifacts in accordance with design plans. (Indeed, the Greek word “Demiourgos” originally meant “public worker” in the sense of a skilled craftsman.) Conversely, according to Plato (Laws, Book X) what craftsmen do when making artifacts is to imitate nature’s craftsmanship – a view that was widely endorsed in ancient Greek philosophy and continued to play an important role in later stages of thinking about technology. On Plato’s view, then, natural objects and man-made objects come into being in similar ways, both being made by an agent according to pre-determined plans.

In Aristotle’s works this connection between human action and the state of affairs in the world is also found. For Aristotle, however, this connection did not consist in a metaphysical similarity in the ways in which natural and man-made objects come into being. Instead of drawing a metaphysical similarity between the two domains of objects, Aristotle pointed to a fundamental metaphysical difference between them while at the same time making epistemological connections between on the one hand different modes of knowing and on the other hand different domains of the world about which knowledge can be achieved. In the Physics (Book II, Chapter 1), Aristotle made a fundamental distinction between the domains of physis (the domain of natural things) and poiesis (the domain of non-natural things). The fundamental distinction between the two domains consisted in the kinds of principles of existence that were underlying the entities that existed in the two domains. The natural realm for Aristotle consisted of things that have the principles by which they come into being, remain in existence and “move” (in the senses of movement in space, of performing actions and of change) within themselves. A plant, for instance, comes into being and remains in existence by means of growth, metabolism and photosynthesis, processes that operate by themselves without the interference of an external agent. The realm of poiesis, in contrast, encompasses things of which the principles of existence and movement are external to them and can be attributed to an external agent – a wooden bed, for example, exists as a consequence of a carpenter’s action of making it and an owner’s action of maintaining it.

Here it needs to be kept in mind that on Aristotle’s worldview every entity by its nature was inclined to strive toward its proper place in the world. For example, unsupported material objects move downward, because that is the natural location for material objects. The movement of a falling stone could thus be interpreted as a consequence of the stone’s internal principles of existence, rather than as a result of the operation of a gravitational force external to the stone. On Aristotle’s worldview, contrary to our present-day worldview, it thus made perfect sense to think of all natural objects as being subject to their own internal principles of existence and in this respect being fundamentally distinct from artifacts that are subject to externally operating principles of existence (to be found in the agents that make and maintain them).

In the Nicomachean Ethics (Book VI, Chapters 3-7), Aristotle distinguished between five modes of knowing, or of achieving truth, that human beings are capable of. He began with two distinctions that apply to the human soul. First, the human soul possesses a rational part and a part that does not operate rationally. The non-rational part is shared with other animals (it encompasses the appetites, instincts, etc.), whereas the rational part is what makes us human – it is what makes man the animal rationale. The rational part of the soul in turn can be subdivided further into a scientific part and a deductive or ratiocinative part. The scientific part can achieve knowledge of those entities of which the principles of existence could not have been different from what they are; these are the entities in the natural domain of which the principles of existence are internal to them and thus could not have been different. The deductive or ratiocinative part can achieve knowledge of those entities of which the principles of existence could have been different; the external principles of existence of artifacts and other things in the non-natural domain could have been different in that, for example, the silversmith who made a particular silver bowl could have had a different purpose in mind than the purpose for which the bowl was actually made. The five modes of knowledge that humans are capable of – often denoted as virtues of thought – are faculties of the rational part of the soul and in part map onto the scientific part / deductive part dichotomy. They are what we today would call science or scientific knowledge (episteme), art or craft knowledge (techne), prudence or practical knowledge (phronesis), intellect or intuitive apprehension (nous) and wisdom (sophia). While episteme applies to the natural domain, techne and phronesis apply to the non-natural domain, phronesis applying to actions in general life and techne to the crafts. Nous and sophia, however, do not map onto these two domains: while nous yields knowledge of unproven (and not provable) first principles and hence forms the foundation of all knowledge, sophia is a state of perfection that can be reached with respect to knowledge in general, including techne.

Both Plato and Aristotle thus distinguished between techne and episteme as pertaining to different domains of the world, but also drew connections between the two. The reconstruction of the actual views of Plato and Aristotle, however, remains a matter of interpretation (see Parry, 2008). For example, while many authors interpret Aristotle as endorsing the widespread view of technology as consisting in the imitation of nature (for example, Zoglauer, 2002: 12), Schummer (2001) recently argued that for Aristotle this was not a characterization of technology or an explication of the nature of technology, but merely a description of how technological activities often (but not necessarily) take place. And indeed, it seems that in Aristotle’s account of the making of things the idea of man imitating nature is much less central than it is for Plato, when he draws a metaphysical similarity between the Demiurge’s work and the work of craftsmen.

b. From the Middle Ages to the Nineteenth Century: Francis Bacon

In the Middle Ages, the ancient dichotomy between the natural and artificial realms and the conception of craftsmanship as the imitation of nature continued to play a central role in understanding the world. On the one hand, the conception of craftsmanship as the imitation of nature became thought of as applying not only to what we would now call “technology” (that is, the mechanical arts), but also to art. Both were thought of as the same sort of endeavor. On the other hand, however, some authors began to consider craftsmanship as being more than merely the imitation of nature’s works, holding that in their craftsmanship humans were also capable of improving upon nature’s designs. This conception of technology led to an elevated appreciation of technical craftsmanship which, as the mere imitation of nature, used to be thought of as inferior to the higher arts in the Scholastic canon that was taught at medieval colleges. The philosopher and theologian Hugh of St. Victor (1096-1141), for example, in his Didascalicon compared the seven mechanical arts (weaving, instrument and armament making, nautical art and commerce, hunting, agriculture, healing, dramatic art) with the seven liberal arts (the trivium of grammar, rhetoric, and dialectic logic, and the quadrivium of astronomy, geometry, arithmetic, and music) and incorporated the mechanical arts together with the liberal arts into the corpus of knowledge that was to be taught (Whitney, 1990: 82ff.; Zoglauer, 2002: 13-16).

While the Middle Ages thus can be characterized by an elevated appreciation of the mechanical arts, with the transition into the Renaissance thinking about technology gained new momentum due to the many technical advances that were being made. A key figure at the end of the Renaissance is Francis Bacon (1561-1626), who was both an influential natural philosopher and an important English statesman (among other things, Bacon held the offices of Lord Keeper of the Great Seal and later Lord Chancellor). In his Novum Organum (1620), Bacon proposed a new, experiment-based method for the investigation of nature and emphasized the intrinsic connectedness of the investigation of nature and the construction of technical “works”. In his New Atlantis (written in 1623 and published posthumously in 1627), he presented a vision of a society in which natural philosophy and technology occupied a central position. In this context it should be noted that before the advent of science in its modern form the investigation of nature was conceived of as a philosophical project, that is, natural philosophy. Accordingly, Bacon did not distinguish between science and technology, as we do today, but saw technology as an integral part of natural philosophy and treated the carrying out of experiments and the construction of technological “works” on an equal footing. On his view, technical “works” were of the utmost practical importance for the improvement of the living conditions of people, but even more so as indications of the truth or falsity of our theories about the fundamental principles and causes in nature (see Novum Organum, Book I, aphorism 124).

New Atlantis is the fictional report of a traveler who arrives at an as yet unknown island state called Bensalem and informs the reader about the structure of its society. Rather than constituting a utopian vision of an ideal society, Bensalem’s society was modeled on the English society of Bacon’s own times that had become increasingly industrialized and in which the need for technical innovations, new instruments and devices to help with the production of goods and the improvement of human life was clearly felt (compare Kogan-Bernstein, 1959). The utopian vision in New Atlantis only pertained to the organization of the practice of natural philosophy. Accordingly, Bacon spent much of New Atlantis describing the most important institution in the society of Bensalem, Salomon’s House, an institution devoted entirely to inquiry and technological innovation.

Bacon provided a long list of the various areas of knowledge, techniques, instruments and devices that Salomon’s House possesses, as well as descriptions of the way in which the House is organized and the different functions that its members fulfill. In his account of Salomon’s House Bacon’s unbridled optimism about technology can be seen: Salomon’s House appears to be in the possession of every possible (and impossible) technology that one could think of, including several that were only realized much later (such as flying machines and submarines) and some that are impossible to realize. (Salomon’s House even possesses several working perpetuum mobile machines, that is, machines that once they have been started up will remain in motion forever and are able to do work without consuming energy. Contemporary thermodynamics shows that such machines are impossible.) Repeatedly it is stated that Salomon’s House works for the benefit of Bensalem’s people and society: the members of the House, for example, regularly travel through the country to inform the people about new inventions, to warn them about upcoming catastrophic events, such as earthquakes and droughts, the occurrence of which Salomon’s House has been able to forecast, and to advise them about how they could prepare themselves for these events.

While Bacon is often associated with the slogan “knowledge is power”, contrary to how the slogan is often understood today (where “power” is often taken to mean political power or power within society) what is meant is that knowledge of natural causes gives us power over nature that can be used for the benefit of mankind. This can be seen, for instance, from the way Bacon described the reasons of the Bensalemians for founding Salomon’s House: “The end of our foundation is the knowledge of causes, and secret motions of things; and the enlarging of the bounds of human empire to the effecting of all things possible.” Here, inquiry into “the knowledge of causes, and secret motions of things” and technological innovation by producing what is possible (“enlarging of the bounds of human empire to the effecting of all things possible”) are explicitly mentioned as the two principal goals of the most important institution in society. (It should also be noted that Bacon himself never formulated the slogan “knowledge is power”. Rather, in the section “Plan of the Work” in the Instauratio Magna he speaks of the twin aims of knowledge – Bacon’s term is “scientia” – and power – “potentia” – as coinciding in the devising of new works because one can only have power over nature when one knows and follows nature’s causes. The connection between knowledge and power here is the same as in the description of the purpose of Salomon’s House.)

The improvement of life by means of natural philosophy and technology is a theme which pervades much of Bacon’s work, including the New Atlantis and his unfinished opus magnum, the Instauratio Magna. Bacon saw the Instauratio Magna, the “Great Renewal of the Sciences”, as the culmination of his life’s work on natural philosophy. It was to encompass six parts, presenting an overview and critical assessment of the knowledge about nature available at the time, a presentation of Bacon’s new method for investigating nature, a mapping of the blank spots in the corpus of available knowledge and numerous examples of how natural philosophy would progress when using Bacon’s new method. It was clear to Bacon that his work could only be the beginning of a new natural philosophy, to be pursued by later generations of natural philosophers, and that he would himself not be able to finish the project he started in the Instauratio. In fact, even the writing of the Instauratio proved a much too ambitious project for one man: Bacon only finished the second part, the Novum Organum, in which he presented his new method for the investigation of nature.

With respect to this new method, Bacon argued against the medieval tradition of building on the Aristotelian/Scholastic canon and other written sources as the sources of knowledge, proposing a view of knowledge gained from systematic empirical discovery instead. For Bacon, craftsmanship and technology played a threefold role in this context. First, knowledge was to be gained by means of observation and experimentation, so inquiry in natural philosophy heavily relied on the construction of instruments, devices and other works of craftsmanship to make empirical investigations possible. Second, as discussed above, natural philosophy should not be limited to the study of nature for knowledge’s sake but should also always inquire how newly gained knowledge could be used in practice to extend man’s power over nature to the benefit of society and its inhabitants (Kogan-Bernstein, 1959; Fischer, 1996: 284-287). And third, technological “works” served as the empirical foundations of knowledge about nature in that a successful “work” could count as an indication of the truth of the involved theories about the fundamental principles and causes in nature (see above).

While in many places in his writings Bacon suggests that the “pure” investigation of nature and the construction of new “works” are of equal importance, he did prioritize technology. From the description that Bacon gives of how Salomon’s House is organized, for example, it is clear that the members of Salomon’s House also practice “pure” investigation of nature without much regard for its practical use. The “pure” investigation of nature seems to have its own place within the House and to be able to operate autonomously. Still, as a whole, the institution of Salomon’s House is decidedly practice-oriented, such that the relative freedom of inquiry in the end manifests itself within the confines of an environment in which practical applicability is what counts. Bacon draws the same picture in the Instauratio Magna, where he explicitly acknowledges the value of “pure” investigation while at the same time emphasizing that the true aims of natural philosophy (“scientiae veros fines” – see towards the end of the Preface of the Instauratio Magna) concern its benefits and usefulness for human life.

c. The Twentieth Century: Martin Heidegger

Notwithstanding the fact that philosophers have been reflecting on technology-related matters ever since the beginning of Western philosophy, those pre-19th century philosophers who looked at aspects of technology did not do so with the aim of understanding technology as such. Rather, they examined technology in the context of more general philosophical projects aimed at clarifying traditional philosophical issues other than technology (Fischer, 1996: 309). It is probably safe to say that before the mid to late 19th century no philosopher considered himself as being a specialized philosopher of technology, or even as a general philosopher with an explicit concern for understanding the phenomenon of technology as such, and that no full-fledged philosophies of technology had yet been elaborated.

No doubt one reason for this is that before the mid to late 19th century technology had not yet become the tremendously powerful and ubiquitously manifest phenomenon that it would later become. The same holds with respect to science, for that matter: it is only after the investigation of nature stopped being thought of as a branch of philosophy – natural philosophy – and the contemporary notion of science emerged that philosophy of science as a field of investigation could emerge. (Note that the term “scientist”, as the name for a particular profession, was coined in the first half of the 19th century by the polymath and philosopher William Whewell – see Snyder, 2009.) Thus, by the end of the 19th century natural science in its present form had emerged from natural philosophy and technology had manifested itself as a phenomenon distinct from science. Accordingly, “until the twentieth century the phenomenon of technology remained a background phenomenon” (Ihde, 1991: 26) and the philosophy of technology “is primarily a twentieth-century development” (Ihde, 2009: 55).

While one reason for the emergence of the philosophy of technology in the 20th century is the rapid development of technology at the time, according to the German philosopher Martin Heidegger an important additional reason should be pointed out. According to Heidegger, not only did technology in the 20th century develop more rapidly than in previous times and as a consequence become a more visible factor in everyday life, but the nature of technology itself also underwent a profound change at the same time. The argument is found in a famous lecture that Heidegger gave in 1953, titled The Question of Technology (Heidegger, 1962), in which he inquired into the nature of technology. Note that although Heidegger actually talked about “Technik” (and his inquiry was into “das Wesen der Technik”; Heidegger, 1962: 5), the question he addressed is about technology. In German, “Technologie” (technology) is often used to denote modern “high-tech” technologies (such as biotechnology, nanotechnology, etc.), while “Technik” is used to denote both the older mechanical crafts and the modern established fields of engineering. (“Elektrotechnik”, for example, is electrical engineering.) As will be discussed in Section 2, philosophy of technology as an academic field arose in Germany in the form of philosophical reflection on “Technik”, not “Technologie”. While the difference between the two terms remains important in contemporary German philosophy of technology (see Section 4.a below), both “Technologie” and “Technik” are commonly translated as “technology” and what in German is called “Technikphilosophie” in English goes by the name of “philosophy of technology”.

On Heidegger’s view, one aspect of the nature of both older and contemporary technology is that technology is instrumental: technological objects (tools, windmills, machines, etc.) are means by which we can achieve particular ends. However, Heidegger argued, it is often overlooked that technology is more than just the devising of instruments for particular practical purposes. Technology, he argued, is also a way of knowing, a way of uncovering the hidden natures of things. In his often idiosyncratic terminology, he wrote that “Technology is a way of uncovering” (“Technik ist eine Weise des Entbergens”; Heidegger, 1962: 13), where “Entbergen” means “to uncover” in the sense of uncovering a hidden truth. (For example, Heidegger (1962: 11-12) connects his term “Entbergen” with the Greek term “aletheia”, the Latin “veritas” and the German “Wahrheit”.) Heidegger thus adopted a view of the nature of technology close to the position of Aristotle, who conceived of techne as one of five modes of knowing, as well as to the view of Francis Bacon, who considered technical works as indications of the truth or falsity of our theories about the fundamental principles and causes in nature.

The difference between older and contemporary technology, Heidegger went on to argue, consists in how this uncovering of truth takes place. According to Heidegger, older technology consisted in “Hervorbringen” (Heidegger, 1962: 11). Heidegger here plays with the dual meaning of the term: the German “Hervorbringen” means both “to make” (the making or production of things, material objects, sound effects, etc.) and “to bring to the fore”. Thus the German term can be used to characterize both the “making” aspect of technology and its aspect of being a way of knowing. While contemporary technology retains the “making” aspect of older technology, Heidegger argued that as a way of knowing it no longer can be understood as Hervorbringen (Heidegger, 1962: 14). In contrast to older technology, contemporary technology as a way of knowing consists in the challenging (“Herausfordern” in German) of both nature (by man) and man (by technology). The difference is that while older technologies had to submit to the standards set by nature (e.g., the work that an old windmill can do depends on how strongly the wind blows), contemporary technologies can themselves set the standards (for example, in modern river dams a steady supply of energy can be guaranteed by actively regulating the water flow). Contemporary technology can thus be used to challenge nature: “Heidegger understands technology as a particular manner of approaching reality, a dominating and controlling one in which reality can only appear as raw material to be manipulated” (Verbeek, 2005: 10). In addition, on Heidegger’s view contemporary technology challenges man to challenge nature in the sense that we are constantly being challenged to realize some of the hitherto unrealized potential offered by nature – that is, to devise new technologies that force nature in novel ways and in so doing uncover new truths about nature.

Thus, in the 20th century, according to Heidegger, technology as a way of knowing assumed a new nature. Older technology can be thought of as imitating nature, where the process of imitation is inseparably connected to the uncovering of the hidden nature of the natural entities that are being imitated. Contemporary technology, in contrast, places nature in the position of a supplier of resources and in this way places man in an epistemic position with respect to nature that differs from the epistemic relation of imitating nature. When we imitate nature, we examine entities and phenomena that already exist. But products of contemporary technology, such as the Hoover Dam or a nuclear power plant, are not like already existing natural objects. On Heidegger’s view, they force nature to deliver energy (or another kind of resource) whenever we ask for it and therefore cannot be understood as objects made by man in a mode of imitating nature – nature, after all, cannot produce things that force herself to deliver resources in ways that man-made things can force her to do this. This means that there is a fundamental divide between older and contemporary technology, making the rise of philosophy of technology in the late 19th century and in the 20th century an event that occurred in parallel to a profound change in the nature of technology itself.

2. Philosophy of Technology: The State of the Field in the Early Twenty-First Century

In accordance with the preceding historical sketch, the history of philosophy of technology – as the history of philosophical thinking about issues concerned with the making of things, the use of techne, the challenging of nature and so forth – can be (very) roughly divided into three major periods.

The first period runs from Greek antiquity through the Middle Ages. In this period techne was conceived of as one among several kinds of human knowledge, namely the craft-knowledge that features in the domain of man-made objects and phenomena. Accordingly, philosophical attention to technology was part of the philosophical examination of human knowledge more generally. The second period runs roughly from the Renaissance through the Industrial Revolution and is characterized by an elevated appreciation for technology as an increasingly manifest but not yet all-pervasive phenomenon. Here we see a general interest in technology not only as a domain of knowledge but also as a domain of construction, that is, of the making of artifacts with a view to the improvement of human life (for instance, in Francis Bacon’s vision of natural philosophy). However, there is no particular philosophical interest yet in technology per se other than the issues that earlier philosophers had also considered. The third period is the contemporary period (from the mid-19th century to the present) in which technology had become such a ubiquitous and important factor in human lives and societies that it began to manifest itself as a subject sui generis of philosophical reflection. Of course, this is only a very rough periodization and different ways of periodizing the history of philosophy of technology can be found in the literature – e.g., Wartofsky (1979), Feenberg (2003: 2-3) or Franssen and others (2009: Sec. 1). Moreover, this periodization applies only to Western philosophy. To be sure, there is much to be said about technology and thinking about technology in technologically advanced ancient civilizations in China, Persia, Egypt, etc., but this cannot be done within the confines of the present article. Still, the periodization proposed above is a useful first-order subdivision of the history of thinking about technology as it highlights important changes in how technology was and is understood.

The first monograph on philosophy of technology appeared in Germany in the second half of the 19th century in the form of Ernst Kapp’s book, Grundlinien einer Philosophie der Technik (“Foundations of a Philosophy of Engineering”) (Kapp, 1877). This book is commonly seen as the origin of the field (Rapp, 1981: 4; Ferré, 1988: 10; Fischer, 1996: 309; Zoglauer, 2002: 9; De Vries, 2005: 68; Ropohl, 2009: 13), because the term “philosophy of technology” (or rather, “philosophy of technics”) was first introduced there. Kapp used it to denote the philosophical inquiry into the effects of the use of technology on human society. (Mitcham (1994: 20), however, mentions the Scottish chemical engineer Andrew Ure as a precursor to Kapp in this context. Apparently in 1835 Ure coined the phrase “philosophy of manufactures” in a treatise on philosophical issues concerning technology.) For several decades after the publication of Kapp’s work not much philosophical work focusing on technology appeared in print and the field did not really get going until well into the 20th century. Again, the main publications appeared in Germany (for example, Dessauer, 1927; Jaspers, 1931; Diesel, 1939).

It should be noted that if philosophy of technology as an academic field indeed started here, the field’s origins lie outside professionalized philosophy. Jaspers was a philosopher, but neither Kapp nor most of the other early authors on the topic were professional philosophers. Kapp, for example, had earned a doctorate in classical philology and spent much of his life as a schoolteacher of geography and history and as an independent writer and untenured university lecturer (a German “Privatdozent”). Dessauer was an engineer (and an advocate of an unconditionally optimistic view of technology), Ure a chemical engineer and Diesel (son of the inventor of the Diesel engine, Rudolf Diesel) an independent writer.

In his book, Kapp argued that technological artifacts should be thought of as man-made imitations and improvements of human organs (see Brey, 2000; De Vries, 2005). The underlying idea is that human beings have limited capacities: we have limited visual powers, limited muscular strength, limited resources for storing information, etc. These limitations have led human beings to attempt to improve their natural capacities by means of artifacts such as cranes, lenses, etc. On Kapp’s view, such improvements should not so much be thought of as extensions or supplements of natural human organs, but rather as their replacements (Brey, 2000: 62). Because technological artifacts are supposed to serve as replacements of natural organs, they must on Kapp’s view be devised as imitations of these organs – after all, they are intended to perform the same function – or at least as being modeled on natural organs: “since the organ whose utility and power is to be increased is the standard, the appropriate form of a tool can only be derived from that organ” (Kapp, quoted and translated by Brey, 2000: 62). This way of understanding technology, which echoes the view of technology as the imitation of nature by men that was already found with Plato and Aristotle, was dominant throughout the Middle Ages and continued to be endorsed later.

The period after World War II saw a sharp increase in the amount of published reflections on technology that, for obvious reasons given the role of technology in both World Wars, often expressed a deeply critical and pessimistic view of the influence of technology on human societies, human values and the human life-world in general. Because of this increase in the amount of reflection on technology after World War II, some authors locate the emergence of the field in that period rather than in the late 19th century (for example Ihde, 1993: 14-15, 32-33; Dusek, 2006: 1-2; Kroes and others, 2008: 1). Ihde (1993: 32) points to an additional reason to locate the beginning of the field in the period following World War II: historians of technology rate World War II as the technologically most innovative period in human history until then, as during that war many new technologies were introduced that continued to drive technological innovation as well as the associated reflection on such innovation for several decades to follow. Thus, from this perspective it was during World War II and the following period that technology reached the level of prominence it still had in the early 21st century and, accordingly, became a focal topic for philosophy. It became “a force too important to overlook”, as Ihde (1993: 32) writes.

A still different picture is obtained if one takes the existence of specialized professional societies, dedicated academic journals, topic-specific textbooks as well as a specific name identifying the field as typical signs that a particular field of investigation has become established as a branch of academia. (Note that in his influential The Structure of Scientific Revolutions, historian and philosopher of science Thomas Kuhn mentions these as signs of the establishment of a new paradigm, albeit not a new field or discipline – see Kuhn, 1970: 19.) By these indications, the process of establishing philosophy of technology as an academic field began only in the late 1970s and early 1980s – as Ihde (1993: 45) writes, “from the 1970s on, philosophy of technology began to take its place alongside the other “philosophies of …”” – and continued into the early 21st century.

As Mitcham (1994: 33) remarks, the term “philosophy of technology” was not widely used outside Germany until the 1980s (where the German term is “Technikphilosophie” or “Philosophie der Technik” rather than “philosophy of technology”). In 1976, the Society for the Philosophy of Technology was founded as the first professional society in the field. In the 1980s introductory textbooks on philosophy of technology began to appear. One of the very first (Ferré, 1988) appeared in the famous Prentice Hall Foundations of Philosophy series that included several hallmark introductory texts in philosophy (such as Carl Hempel’s Philosophy of Natural Science, David Hull’s Philosophy of Biological Science, William Frankena’s Ethics and Wesley Salmon’s Logic). In recent years numerous introductory texts have become available, including Ihde (1993), Mitcham (1994), Pitt (2000), Bucciarelli (2003), Fischer (2004), De Vries (2005), Dusek (2006), Irrgang (2008) and Nordmann (2008). Anthologies of classic texts in the field and encyclopedias of philosophy of technology have only very recently begun to appear (e.g., Scharff & Dusek, 2003; Kaplan, 2004; Meijers, 2009; Olsen, Pedersen & Hendricks, 2009; Olsen, Selinger, & Riis, 2009). However, there were few academic journals in the early 21st century dedicated specifically to philosophy of technology and covering the entire range of themes in the field.

“Philosophy of technology” denotes a considerable variety of philosophical endeavors. There is an ongoing discussion among philosophers of technology and scholars in related fields (e.g., science and technology studies, and engineering) on how philosophy of technology should be conceived of. One would expect to find a clear answer to this question in the available introductory texts, along with general agreement on the central themes and questions of the field, as well as on who its most important authors are and which are the fundamental positions, theories, theses and approaches. In the case of philosophy of technology, however, comparing recent textbooks reveals a striking lack of consensus about what kind of endeavor philosophy of technology is. According to some authors, the sole commonality of the various endeavors called “philosophy of technology” is that they all in some way or other reflect on technology (cf. Rapp, 1981: 19-22; 1989: ix; Ihde, 1993: 97-98; Nordmann, 2008: 10).

For example, Nordmann characterized philosophy of technology as follows: “Not only is it a field of work without a tradition, it is foremost a field without its own guiding questions. In the end, philosophy of technology is the whole of philosophy done over again from the start – only this time with consideration for technology” (2008: 10; Reydon’s translation). Nordmann (2008: 14) added that the job of philosophy of technology is not to deal philosophically with a particular subject domain called “technology” (or “Technik” in German). Rather, its job is to deal with all the traditional questions of philosophy, relating them to technology. Such a characterization of the field, however, seems impracticably broad because it causes the name “philosophy of technology” to lose much of its meaning. On Nordmann’s broad characterization it seems meaningless to talk of “philosophy of technology”, as there is no clearly recognizable subfield of philosophy for the name to refer to. All of philosophy would be philosophy of technology, as long as some attention is paid to technology.

A similar, albeit apparently somewhat stricter, characterization of the field was given by Ferré (1988: ix, 9), who suggested that philosophy of technology is “simply philosophy dealing with a special area of interest”, namely technology. According to Ferré, the various “philosophies of” (of science, of biology, of physics, of language, of technology, etc.) should be conceived of as philosophy in the broad sense, with all its traditional questions and methods, but now “turned with a special interest toward discovering how those fundamental questions and methods relate to a particular segment of human concern” (Ferré, 1988: 9). The question arises what this “particular segment of human concern” called “technology” is. But first, the kinds of questions philosophers of technology ask with respect to technology must be explicated.

3. How Philosophy of Technology Can Be Done: The Principal Kinds of Questions That Philosophers of Technology Ask

Philosopher of technology Don Ihde defines philosophy of technology as philosophy that examines the phenomenon of technology per se, rather than merely considering technology in the context of reflections aimed at philosophical issues other than technology. (Note the opposition to Nordmann’s view, mentioned above.) That is, philosophy of technology “must make technology a foreground phenomenon and be able to reflectively analyze it in such a way as to illuminate features of the phenomenon of technology itself” (Ihde, 1993: 38; original emphasis).

However, there are a number of different ways in which one can approach the project of illuminating characteristic features of the phenomenon of technology. While different authors have presented different views of what philosophy of technology is about, there is no generally agreed upon taxonomy of the various approaches to (or traditions in, or styles of doing) philosophy of technology. In this section, a number of approaches that have been distinguished in the recent literature are discussed with the aim of providing an overview of the various kinds of questions that philosophers ask with respect to technology.

In an early review of the state of the field, philosopher of science Marx W. Wartofsky distinguished four main approaches to philosophy of technology (Wartofsky, 1979: 177-178). First, there is the holistic approach that sees technology as one of the phenomena generally found in human societies (on a par with phenomena such as art, war, politics, etc.) and attempts to characterize the nature of this phenomenon. The philosophical question in focus here is: What is technology? Second, Wartofsky distinguished the particularistic approach that addresses specific philosophical questions that arise with respect to particular episodes in the history of technology. Relevant questions are: Why did a particular technology gain or lose prominence in a particular period? Why did the general attitude towards technology change at a particular time? And so forth. Third is the developmental approach that aims at explaining the general process of technological change and as such has a historical focus too. And fourth, there is the social-critical approach that conceives of technology as a social/cultural phenomenon, that is a product of social conventions, ideologies, etc. In this approach, technology is seen as a product of human actions that should be critically assessed (rather than characterized, as in the holistic approach). Besides critical reflection on technology, a central question here is how technology has come to be what it is today and which social factors have been important in shaping it. The four approaches as distinguished by Wartofsky clearly are not mutually exclusive: while different approaches address similar and related questions, the difference between them is a matter of emphasis.

A similar taxonomy of approaches is found with Friedrich Rapp, an early proponent of analytic philosophy of technology (see also below). For Rapp, the principal dichotomy is between holistic and particularistic approaches, that is, approaches that conceive of technology as a single phenomenon the nature of which philosophers should clarify vs. approaches that see “technology” as an umbrella term for a number of distinct historical and social phenomena that are related to one another in complex ways and accordingly should each be examined in relation to the other relevant phenomena (Rapp, 1989: xi-xii). Rapp’s own philosophy of technology stands in the latter line of work. Within this dichotomy, Rapp (1981: 4-19) distinguished four main approaches, each reflecting on a different aspect of technology: on the practice of invention and engineering, on technology as a cultural phenomenon, on the social impact of technology, and on the impact of technology on the physical/biological system of planet Earth. While it is not entirely clear how Rapp conceives of the relation between these four approaches and his holistic/particularistic dichotomy, it seems that holism and particularism can generally be understood as modes of doing philosophy that can be realized within each of the four approaches.

Gernot Böhme (2008: 23-32) also distinguished between four main paradigms of contemporary philosophy of technology: the ontological paradigm, the anthropological paradigm, the historical-philosophical paradigm and the epistemological paradigm. The ontological paradigm, according to Böhme, inquires into the nature of artifacts and other technical entities. It basically consists in a philosophy of technology that parallels philosophy of nature, but focuses on the Aristotelian domain of poiesis instead of the domain of physis (see Section 1.a. above). The anthropological paradigm asks one of the traditional questions of philosophy – What is man? – and approaches this question by way of an examination of technology as a product of human action. The historical-philosophical paradigm examines the various manifestations of technology throughout human history and aims to clarify what characterizes the nature of technology in different periods. In this respect, it is closely related to the anthropological paradigm and individual philosophers can work in both paradigms simultaneously. Böhme (2008: 26), for example, lists Ernst Kapp as a representative of both the anthropological and historical-philosophical paradigms. Finally, the epistemological paradigm inquires into technology as a form of knowledge in the sense in which Aristotle did (See Sec. 1.a. above). Böhme (2008: 23) observed that despite the factual existence of philosophy of technology as an academic field, as yet there is no paradigm that dominates the field.

Carl Mitcham (1994) made a fundamental distinction between two principal subdomains of philosophy of technology, which he called “engineering philosophy of technology” and “humanities philosophy of technology”. Engineering philosophy of technology is the philosophical project aimed at understanding the phenomenon of technology as instantiated in the practices of engineers and others working in technological professions. It analyzes “technology from within, and [is] oriented toward an understanding of the technological way of being-in-the-world” (Mitcham, 1994: 39). As representatives of engineering philosophy of technology Mitcham lists, among others, Ernst Kapp and Friedrich Dessauer. Humanities philosophy of technology, on the other hand, consists of more general philosophical projects in which technology per se is not the principal subject of concern. Rather, technology is taken as a case study that might lead to new insights into a variety of philosophical questions by examining how technology affects human life.

The above discussion shows how different philosophers have quite different views of how the field of philosophy of technology is structured and what kinds of questions are in focus in the field. Still, on the basis of the preceding discussion a taxonomy can be constructed of three principal ways of conceiving of philosophy of technology:

(1) philosophy of technology as the systematic clarification of the nature of technology as an element and product of human culture (Wartofsky’s holistic and developmental approaches; Rapp’s cultural approach; Böhme’s ontological, anthropological and historical paradigms; and Mitcham’s engineering approach);

(2) philosophy of technology as the systematic reflection on the consequences of technology for human life (Wartofsky’s particularistic and social-critical approaches; Rapp’s social impact and physical impact approaches; and Mitcham’s humanities approach);

(3) philosophy of technology as the systematic investigation of the practices of engineering, invention, designing and making of things (Wartofsky’s particularistic approach; Rapp’s invention approach; Böhme’s epistemological paradigm; and Mitcham’s engineering approach).

All three approaches are represented in present-day thinking about technology and are illustrated below.

(1) The systematic clarification of the nature of technology. Perhaps most philosophy of technology has been done – and continues to be done – in the form of reflection on the nature of technology as a cultural phenomenon. As clarifying the nature of things is a traditional philosophical endeavor, many prominent representatives of this approach are philosophers who do not consider themselves philosophers of technology in the first place. Rather, they are general philosophers who look at technology as one among the many products of human culture, such as the German philosophers Karl Jaspers (e.g., his book Die geistige Situation der Zeit; Jaspers, 1931), Oswald Spengler (Der Mensch und die Technik; Spengler, 1931), Ernst Cassirer (e.g., his Symbol, Technik, Sprache; Cassirer, 1985), Martin Heidegger (Heidegger, 1962; discussed above), Jürgen Habermas (for example with his Technik und Wissenschaft als “Ideologie”; Habermas, 1968) and Bernhard Irrgang (2008). The Spanish philosopher José Ortega y Gasset is also often counted among the prominent representatives of this line of work.

(2) Systematic reflection on the consequences of technology for human life. Related to the conception of technology as a human cultural product is the approach to philosophy of technology that reflects on and criticizes the social and environmental impact of technology. As an examination of how technology affects society, this approach lies at the intersection of philosophy and sociology, rather than lying squarely within philosophy itself. Prominent representatives thus include the German philosopher/sociologists of the Frankfurt School (Herbert Marcuse, Theodor W. Adorno and Max Horkheimer), Jürgen Habermas, the French sociologist Jacques Ellul (1954), or the American political theorist Langdon Winner (1977).

A central question in contemporary versions of this approach is whether technology controls us or we are able to control technology (Feenberg, 2003: 6; Dusek, 2006: 84-111; Nye, 2006: Chapter 2). Langdon Winner, for example, thought of technology as an autonomously developing phenomenon fundamentally out of human control. As Dusek (2006: 84) points out, this issue is in fact a constellation of two separate questions: Are the societies that we live in, and we ourselves in our everyday lives, determined by technology? And are we able to control or guide the development of technology and the application of technological inventions, or does technology have a life of its own? As it might be that while our lives are not determined by technology we still are not able to control the development and application of technology, these are separate, albeit intimately related issues. The challenge for philosophy of technology, then, is to assess the effects of technology on our societies and our lives, to explore possibilities for us to exert influence on the current applications and future development of technology, and to devise concepts and institutions that might enable democratic control over the role of technology in our lives and societies.

(3) The systematic investigation of the practices of engineering, invention, designing and making of things. The third principal approach to philosophy of technology examines concrete technological practices, such as invention, design and engineering. Early representatives of this approach include Ernst Kapp (1877), Friedrich Dessauer (1927; 1956) and Eugen Diesel (1939). The practical orientation of this approach, as well as its comparative distance from traditional issues in philosophy, is reflected in the fact that none of these three early philosophers of technology were professional philosophers (see Section 2).

A guiding idea in this approach to philosophy of technology is that the design process constitutes the core of technology (Franssen and others, 2009: Sec. 2.3), such that studying the design process is crucial to any project that attempts to understand technology. Thus, philosophers working in this approach often examine design practices, both in the strict context of engineering and in wider contexts such as architecture and industrial design (for example, Vermaas and others, 2008). In focus are epistemological and methodological questions, such as: What kinds of knowledge do engineers have? (for example, Vincenti, 1990; Pitt, 2000; Bucciarelli, 2003; Auyang, 2009; Houkes, 2009). Is there a kind of knowledge that is specific to engineering? What is the nature of the engineering process and the design process? (for example, Vermaas and others, 2008). What is design? (for example, Houkes, 2008). Is there a specific design/engineering methodology? How do reasoning and decision processes in engineering function? How do engineers deal with uncertainty, failure and error margins? (for example, Bucciarelli, 2003: Chapter 3). Is there any such thing as a technological explanation? If so, what is the structure of technological explanations? (for example, Pitt, 2000: Chapter 4; Pitt, 2009). What is the relation between science and technology and in what way are design processes similar to and different from investigative processes in natural science? (for example, Bunge, 1966).

This approach to philosophy of technology is closely related to philosophy of science, where much attention is also given to methodology and epistemology. This can be seen from the fact that central questions from philosophy of science parallel some of the aforementioned questions: What is scientific knowledge? Is there a specific scientific method, or perhaps a clearly delimited set of such methods? How does scientific reasoning work? What is the structure of scientific explanations? Etc. However, there still seems to be comparatively little attention to such questions among philosophers of technology. Philosopher of technology Joseph Pitt, for example, observed that notwithstanding the parallel with respect to questions that can be asked about technology “there is a startling lack of symmetry with respect to the kinds of questions that have been asked about science and the kinds of questions that have been asked about technology” (2000: 26; emphasis added). According to Pitt, philosophers of technology have largely ignored epistemological and methodological questions about technology and have instead focused overly on issues related to technology and society. But, Pitt pointed out, social criticism “can come only after we have a deeper understanding of the epistemological dimension of technology” (Pitt, 2000: 27) and “policy decisions require prior assessment of the knowledge claims, which require good theories of what knowledge is and how to assess it” (ibid.). Thus, philosophers of technology should orient themselves anew with respect to the questions they ask.

But there are more parallels between the philosophies of technology and science. An important endeavor in philosophy of science that is also seen as central in philosophy of technology is conceptual analysis. In the case of philosophy of technology, this involves both concepts related to technology and engineering in general (concepts such as “technology”, “technics”, “technique”, “machine”, “mechanism”, “artifact”, “artifact kind”, “information”, “system”, “efficiency”, “risk”, etc.; see also Wartofsky, 1979: 179) and concepts that are specific to the various engineering disciplines. In addition, in both philosophy of science and philosophy of technology a renewed interest in metaphysical issues can currently be seen. For example, while philosophers of science inquire into the nature of the natural kinds that the sciences study, philosophers of technology are developing a parallel interest in the metaphysics of artifacts and kinds of artifacts (e.g., Houkes & Vermaas, 2004; Margolis & Laurence, 2007; Franssen, 2008). And lastly, philosophers of technology and philosophers of particular special sciences are increasingly beginning to cooperate on questions that are of crucial interest to both fields; a recent example is Krohs & Kroes (2009) on the notion of function in biology and technology.

A difference between the states of affairs in philosophy of science and in philosophy of technology, however, lies in the relative dominance of continental and analytic approaches. While there is some continental philosophy of science (e.g., Gutting, 2005), it constitutes a small minority in the field in comparison to analytic philosophy of science. In contrast, continental-style philosophy of technology is a domain of considerable size, while analytic-style philosophy of technology is small in comparison. Analytic philosophy of technology has existed since the 1960s and only began the process of becoming the dominant form of philosophy of technology in the early 21st century (Franssen and others, 2009: Sec. 1.3.). Kroes and others (2008: 2) even speak of a “recent analytic turn in the philosophy of technology”. Overviews of analytic philosophy of technology can be found in Mitcham (1994: Part 2), Franssen (2009) and Franssen and others (2009: Sec. 2).

4. Two Exemplary Discussions

After having mapped out three principal ways in which one can conceive of philosophy of technology, two discussions from contemporary philosophy of technology will be presented to illustrate what philosophers of technology do. The first example will demonstrate philosophy of technology as the systematic clarification of the nature of technology. The second example shows philosophy of technology as systematic reflection on the consequences of technology for human life, and is concerned with biotechnology. (Illustrations of philosophy of technology as the systematic investigation of the practices of engineering, invention, designing and making of things will not be presented. Examples of this approach to philosophy of technology can be found in Vermaas and others (2008) or Franssen and others (2009).)

a. What Is (the Nature of) Technology?

The question, What is technology? or What is the nature of technology?, is both a central question that philosophers of technology aim to answer and a question the answer to which determines the subject matter of philosophy of technology. One can think of philosophy of technology as the philosophical examination of technology, in the same way as the philosophy of science is the philosophical examination of science and the philosophy of biology the philosophical study of a particular subdomain of science. However, in this respect the philosophy of technology is in a situation similar to the one in which the philosophy of science finds itself.

Central questions in the philosophy of science have long been what science is, what characterizes science and what distinguishes science from non-science (the demarcation problem). These questions have recently somewhat moved out of focus, however, due to the lack of acceptable answers. Philosophers of science have not been able to satisfactorily explicate the nature of science (for a recent suggestion, see Hoyningen-Huene, 2008) or to specify any clear-cut criterion by which science could be demarcated from non-science or pseudo-science. As philosopher of science Paul Hoyningen-Huene (2008: 168) wrote: “fact is that at the beginning of the 21st century there is no consensus among philosophers or historians or scientists about the nature of science.”

The nature of technology, however, is even less clear than the nature of science. As philosopher of science Marx Wartofsky put it, ““Technology” is unfortunately too vague a term to define a domain; or else, so broad in its scope that what it does define includes too much. For example, one may talk about technology as including all artifacts, that is, all things made by human beings. Since we “make” language, literature, art, social organizations, beliefs, laws and theories as well as tools and machines, and their products, such an approach covers too much” (Wartofsky, 1979: 176). More clarity on this issue can be achieved by looking at the history of the term (for example, Nye, 2006: Chapter 1; Misa, 2009; Mitcham & Schatzberg, 2009) as well as at recent suggestions to define it.

Jacob Bigelow, an early author on technology, conceived of it as a specific domain of knowledge: technology was “an account [...] of the principles, processes, and nomenclatures of the more conspicuous arts” (Bigelow, 1829, quoted in Misa, 2009: 9; Mitcham & Schatzberg, 2009: 37). In a similar manner, Günter Ropohl (1990: 112; 2009: 31) defined “technology” as the “science of technics” (“Wissenschaft von der Technik”, where “Technik” denotes the domain of crafts and other areas of manufacturing, making, etc.). The important aspect of Bigelow’s and Ropohl’s definitions is that “technology” does not denote a domain of human activity (such as making or designing) or a domain of objects (technological innovations, such as solar panels), but a domain of knowledge. In this respect, their usage of the term is continuous with the meaning of the Greek “techne” (Section 1.a).

A review of a number of definitions of “technology” (Li-Hua, 2009) shows that there is not much overlap between the various definitions that can be found in the literature. Many definitions conceive of technology in Bigelow’s and Ropohl’s sense as a particular body of knowledge (thus making the philosophy of technology a branch of epistemology), but do not agree on what kind of knowledge it is supposed to be. On some definitions it is seen as firm-specific knowledge about design and production processes, while others conceive of it as knowledge about natural phenomena and laws of nature that can be used to satisfy human needs and solve human problems (a view which closely resembles Francis Bacon’s).

Philosopher of science Mario Bunge presented a view of the nature of technology along the latter lines (Bunge, 1966). According to Bunge, technology should be understood as constituting a particular subdomain of the sciences, namely “applied science”, as he called it. Note that Bunge’s thesis is not that technology is applied science in the sense of the application of scientific theories, models, etc. for practical purposes. Although a view of technology as being “just the totality of means for applying science” (Scharff, 2009: 160) remains present among the general public, most engineers and philosophers of technology agree that technology cannot be conceived of as the application of science in this sense. Bunge’s view is that technology is the subdomain of science characterized by a particular aim, namely application. According to Bunge, natural science and applied science stand side by side as two distinct modes of doing science: while natural science is scientific investigation aimed at the production of reliable knowledge about the world, technology is scientific investigation aimed at application. Both are full-blown domains of science, in which investigations are carried out and knowledge is produced (knowledge about the world and how it can be applied to concrete problems, respectively). The difference between the two domains lies in the nature of the knowledge that is produced and the aims that are in focus. Bunge’s statement that “technology is applied science” should thus be read as “technology is science for the purpose of application” and not as “technology is the application of science.”

Other definitions reflect still different conceptions of technology. In the definition accepted by the United Nations Conference on Trade and Development (UNCTAD), technology not only includes specific knowledge, but also machinery, production systems and skilled human labor force. Li-Hua (2009) follows the UNCTAD definition by proposing a four-element definition of “technology” as encompassing technique (that is, a specific technique for making a particular product), specific knowledge (required for making that product; he calls this technology in the strict sense), the organization of production and the end product itself. Friedrich Rapp, in contrast, defined “technology” even more broadly as a domain of human activity: “in simplest terms, technology is the reshaping of the physical world for human purposes” (Rapp, 1989: xxiii).

Thus, attempts to define “technology” in such a way that this definition would express the nature of technology, or only some of the principal characteristics of technology, have not led to any generally accepted view of what technology is. In this context, historian of science and technology Thomas J. Misa observed that historians of technology have so far resisted defining “technology” in the same way as “no scholarly historian of art would feel the least temptation to define ‘art’, as if that complex expression of human creativity could be pinned down by a few well-chosen words” (Misa, 2009: 8). The suggestion clearly is that technology is far too complex and too diverse a domain to define or to be able to talk about the nature of technology. Nordmann (2008: 14) went even further by arguing that not only can the term “technology” not be defined, but also it should not be defined. According to Nordmann, we should accept that technology is too diverse a domain to be caught in a compact definition. Accordingly, instead of conceiving of “technology” as the name of a particular fixed collection of phenomena that can be investigated, Nordmann held that “technology” is best understood as what Grunwald & Julliard (2005) called a “reflective concept”. According to the latter authors, “technology” (or rather, “Technik” – see Section 1.c) should simply be taken to mean whatever we mean when we use the term. While this clearly cannot be an adequate definition of the term, it still can serve as a basis for reflections on technology in that it gives us at least some sense of what it is that we are reflecting on. Using “technology” in this extremely loose manner allows us to connect reflections on very different issues and phenomena as being about – in the broadest sense – the same thing. In this way, “technology” can serve as the core concept of the field of philosophy of technology.

Philosophy of technology faces the challenge of clarifying the nature of a particular domain of phenomena without being able to determine the boundaries of that domain. Perhaps the best way out of this situation is to approach the question on a case-by-case basis, where the various cases are connected by the fact that they all involve technology in the broadest possible sense of the term. Rather than asking what technology is, and how the nature of technology is to be characterized, it might be better to examine the natures of particular instances of technology and in so doing achieve more clarity about a number of local phenomena. In the end, the results from various case studies might to some extent converge – or they might not.

b. Questions Regarding Biotechnology

The question of how to define “technology” is not merely an academic issue. Consider the case of biotechnology, the technological domain that features most prominently in systematic reflections on the consequences of technology for human life. When thinking about what the application of biotechnologies might mean for our lives, it is important to define what we mean by “biotechnology” such that the subject matter under consideration is delimited in a way that is useful for the discussion.

On one definition, given in 1984 by the United States Office of Technology Assessment, biotechnology comprises “[a]ny technique using organisms and their components to make products, modify plants and animals to carry desired traits, or develop micro-organisms for specific uses” (Office of Technology Assessment, 1984; Van den Beld, 2009: 1302). On such a conception of biotechnology, however, traditional farming, breeding and production of foodstuffs, as well as modern large-scale agriculture and industrialized food production would all count as biotechnology. The domain of biotechnology would thus encompass an extremely heterogeneous collection of practices and techniques of which many would not be particularly interesting subjects for philosophical or ethical reflection (although all of them affect human life: consider, for example, the enormous effect that the development of traditional farming had with respect to the rise of human societies). Accordingly, many definitions are much narrower and focus on “new” or “modern” biotechnologies, that is, technologies that involve the manipulation of genetic material. These are, after all, the technologies that are widely perceived by the general public as ethically problematic and thus as constituting the proper subject matter of philosophical reflection on biotechnology. Thus, the authors of a 2007 report on the possible consequences, opportunities and challenges of biotechnology for Europe make a distinction between traditional and modern biotechnology, writing about modern biotechnology that it “can be defined as the use of cellular, molecular and genetic processes in production of goods and services. Its beginnings date back to the early 1970s when recombinant DNA technology was first developed” (quoted in Van den Beld, 2009: 1302).

Such narrow definitions, however, tend to cover too little. As Van den Beld (2009: 1306) pointed out in this context, “There are no definitions that are simply correct or incorrect, only definitions that are more or less pragmatically adequate in view of the aims one pursues.” When it comes to systematic reflection on how the use of technologies affects human life, the question thus is whether there is any particular area of technology that can be meaningfully singled out as constituting “biotechnology”. However, the spectrum of technological applications in the biological domain is simply too diverse.

In overviews of the technologies that are commonly discussed under the name of “biotechnology” a common distinction is between “white biotechnology” (biotechnology in industrial contexts), “green biotechnology” (biotechnology involving plants) and “red biotechnology” (biotechnology involving humans and non-human animals, in particular in medical and biomedical contexts). White biotechnology involves, among other things, the use of enzymes in detergents or the production of cheeses; the use of micro-organisms for the production of medicinal substances; the production of biofuels and bioplastics and so forth. Green biotechnology typically involves genetic technology and is also often called “green genetic technology”. It mainly encompasses the genetic modification of cultivated crops. Philosophical/ethical issues discussed under this label include the risk of outcrossing between genetically modified types of plants and the wild types; the use of genetically modified crops in the production of foodstuffs, either directly or indirectly as food for animals intended for human consumption (for example, soy beans, corn, potatoes and tomatoes); the labeling of foodstuffs produced on the basis of genetically modified organisms; issues related to the patenting of genetically modified crops and so forth.

Not surprisingly, red biotechnology is the most hotly discussed area of biotechnology as red biotechnologies directly involve human beings and non-human animals, both of which are categories that feature prominently throughout ethical discussions. Red biotechnology involves such things as the transplantation of human organs and tissues, and xenotransplantation (the transplantation of non-human animal organs and tissues to humans); the use of cloning techniques for reproductive and therapeutic purposes; the use of embryos for stem cell research; artificial reproduction, in vitro fertilization, the genetic testing of embryos and pre-implantation diagnostics and so forth. In addition, an increasingly discussed area of red biotechnology is constituted by human enhancement technologies. These encompass such diverse technologies as the use of psycho-pharmaceutical substances for the improvement of one’s mental capacities, the genetic modification of human embryos to prevent possible genetic diseases and so forth.

Other areas of biotechnology can include synthetic biology, which involves the creation of synthetic genetic systems, synthetic metabolic systems and attempts at creating living synthetic life forms from scratch. Synthetic biology does not fit into the distinction between white, green and red biotechnology and receives attention from philosophers not only because projects in synthetic biology may raise ethical questions (for example, Douglas & Savulescu, 2010) but also because of questions from epistemology and philosophy of science (for example, O’Malley, 2009; Van den Beld, 2009: 1314-1316).

Corresponding to this diversity of technologies covered by the label of “biotechnology”, philosophical reflection on biotechnology as such and on its possible consequences for human life will not be a very fruitful enterprise as there will not be much to say about biotechnology in general. Instead, philosophical reflection on biotechnology will need to be conducted locally rather than globally, taking the form of close examination of particular technologies in particular contexts. Philosophers concerned with biotechnology reflect on such specific issues as the genetic modification of plants for agricultural purposes, or the use of psycho-pharmaceutical substances for the improvement of the mental capacities of healthy subjects – not biotechnology as such. In the same way as “technology” can be thought of as a “reflective concept” (Grunwald & Julliard, 2005) that brings together a variety of phenomena under a common denominator for the purposes of enabling philosophical work, so “biotechnology” too can be understood as a “reflective concept” that is useful to locate particular considerations within the wide domain of philosophical reflection.

This is, however, not to say that on more general levels nothing can be said about biotechnology. Bioethicist Bernard Rollin, for example, considered genetic engineering in general and addressed the question whether genetic engineering could be considered intrinsically wrong – that is, wrong in any and all contexts and hence independently of the particular context of application that is under consideration (Rollin, 2006: 129-154). According to Rollin, the alleged intrinsic wrongness of genetic engineering constituted one of three aspects of wrongness that members of the general public often associate with genetic engineering. These three aspects, which Rollin illustrated as three aspects of the Frankenstein myth (see Rollin, 2006: 135), are: the intrinsic wrongness of a particular practice, its possible dangerous consequences and the possibilities of causing harm to sentient beings. While the latter two aspects of wrongness might be avoided by means of appropriate measures, the intrinsic wrongness of a particular practice (in cases where it obtains) is unavoidable. Thus, if it could be argued that genetic engineering is intrinsically wrong – that is, something that we just ought not to do, irrespective of whatever positive or negative consequences are to be expected – this would constitute a strong argument against large domains of white, green and red biotechnology. On the basis of an assessment of the motivations that people have to judge genetic engineering as being intrinsically wrong, Rollin, however, concluded that such an argument could not be made because in the various cases in which people concluded that genetic engineering was intrinsically wrong the premises of the argument were not well-founded.

But in this case, too, the need for local rather than global analyses can be seen. Assessing the tenability of the value judgment that genetic engineering is intrinsically wrong requires examining concrete arguments and motivations on a local level. This, I want to suggest by way of conclusion, is a general characteristic of the philosophy of technology: the relevant philosophical analyses will have to take place on the more local levels, examining particular technologies in particular contexts, rather than on more global levels, at which large domains of technology such as biotechnology or even the technological domain as a whole are in focus. Philosophy of technology, then, is a matter of piecemeal engineering, in much the same way as William Wimsatt has suggested that philosophy of science should be done (Wimsatt, 2007).

5. References and Further Reading

  • Auyang, S.Y. (2009): “Knowledge in science and engineering”, Synthese 168: 319-331.
  • Brey, P. (2000): “Theories of technology as extension of human faculties”, in: Mitcham, C. (Ed.): Metaphysics, Epistemology, and Technology (Research in Philosophy and Technology, Vol. 19), Amsterdam: JAI, pp. 59-78.
  • Böhme, G. (2008): Invasive Technologie: Technikphilosophie und Technikkritik, Kusterdingen: Die Graue Edition.
  • Bucciarelli, L.L. (1994): Designing Engineers, Cambridge (MA): MIT Press.
  • Bucciarelli, L.L. (2003): Engineering Philosophy, Delft: Delft University Press.
  • Bunge, M. (1966): “Technology as applied science”, Technology and Culture 7: 329-347.
  • Cassirer, E. (1985): Symbol, Technik, Sprache: Aufsätze aus den Jahren 1927-1933 (edited by E.W. Orth & J. M. Krois), Hamburg: Meiner.
  • De Vries, M.J. (2005): Teaching About Technology: An Introduction to the Philosophy of Technology for Non-Philosophers, Dordrecht: Springer.
  • Dessauer, F. (1927): Philosophie der Technik: Das Problem der Realisierung, Bonn: Friedrich Cohen.
  • Dessauer, F. (1956): Der Streit um die Technik, Frankfurt am Main: Verlag Josef Knecht.
  • Diesel, E. (1939): Das Phänomen der Technik: Zeugnisse, Deutung und Wirklichkeit, Leipzig: Reclam & Berlin: VDI-Verlag.
  • Douglas, T. & Savulescu, J. (2010): “Synthetic biology and the ethics of knowledge”, Journal of Medical Ethics 36: 687-693.
  • Dusek, V. (2006): Philosophy of Technology: An Introduction, Malden (MA): Blackwell.
  • Ellul, J. (1954): La Technique ou l’Enjeu du Siècle, Paris: Armand Colin.
  • Feenberg, A. (2003): “What is philosophy of technology?”, lecture at the University of Tokyo (Komaba campus), June 2003.
  • Ferré, F. (1988): Philosophy of Technology, Englewood Cliffs (NJ): Prentice Hall; unchanged reprint (1995): Philosophy of Technology, Athens (GA) & London, University of Georgia Press.
  • Fischer, P. (1996): “Zur Genealogie der Technikphilosophie”, in: Fischer, P. (Ed.): Technikphilosophie, Leipzig: Reclam, pp. 255-335.
  • Fischer, P. (2004): Philosophie der Technik, München: Wilhelm Fink (UTB).
  • Franssen, M.P.M. (2008): “Design, use, and the physical and intentional aspects of technical artifacts”, in: Vermaas, P.E., Kroes, P., Light, A. & Moore, S.A. (Eds): Philosophy and Design: From Engineering to Architecture, Dordrecht: Springer, pp. 21-35.
  • Franssen, M.P.M. (2009): “Analytic philosophy of technology”, in: J.K.B. Olsen, S.A. Pedersen & V.F. Hendricks (Eds): A Companion to the Philosophy of Technology, Chichester: Wiley-Blackwell, pp. 184-188.
  • Franssen, M.P.M., Lokhorst, G.-J. & Van de Poel, I. (2009): “Philosophy of technology”, in: Zalta, E. (Ed.): Stanford Encyclopedia of Philosophy (Fall 2009 Edition),
  • Grunwald, A. & Julliard, Y. (2005): “Technik als Reflexionsbegriff: Zur semantischen Struktur des Redens über Technik”, Philosophia Naturalis 42: 127-157.
  • Gutting, G. (Ed.) (2005): Continental Philosophy of Science, Malden (MA): Blackwell.
  • Habermas, J. (1968): Technik und Wissenschaft als “Ideologie”, Frankfurt am Main: Suhrkamp.
  • Heidegger, M. (1962): Die Technik und die Kehre, Pfullingen: Neske.
  • Houkes, W. (2008): “Designing is the construction of use plans”, in: Vermaas, P.E., Kroes, P., Light, A. & Moore, S.A. (Eds): Philosophy and Design: From Engineering to Architecture, Dordrecht: Springer, pp. 37-49.
  • Houkes, W. (2009): “The nature of technological knowledge”, in: Meijers, A.W.M. (Ed.): Philosophy of Technology and Engineering Sciences (Handbook of the Philosophy of Science, Volume 9), Amsterdam: North Holland, pp. 310-350.
  • Houkes, W. & Vermaas, P.E. (2004): “Actions versus functions: A plea for an alternative metaphysics of artefacts”, The Monist 87: 52-71.
  • Hoyningen-Huene, P. (2008): “Systematicity: The nature of science”, Philosophia 36: 167-180.
  • Ihde, D. (1993): Philosophy of Technology: An Introduction, New York: Paragon House.
  • Ihde, D. (2009): “Technology and science”, in: Olsen, J.K.B., Pedersen, S.A. & Hendricks, V.F. (Eds): A Companion to the Philosophy of Technology, Chichester: Wiley-Blackwell, pp. 51-60.
  • Irrgang, B. (2008): Philosophie der Technik, Darmstadt: Wissenschaftliche Buchgesellschaft.
  • Jaspers, K. (1931): Die geistige Situation der Zeit, Berlin & Leipzig: Walter de Gruyter & Co.
  • Kaplan, D.M. (Ed.) (2004): Readings in the Philosophy of Technology, Lanham (Md.): Rowman & Littlefield.
  • Kapp, E. (1877): Grundlinien einer Philosophie der Technik: Zur Entstehungsgeschichte der Cultur aus neuen Gesichtspunkten, Braunschweig: G. Westermann.
  • Kogan-Bernstein, F.A. (1959): “Einleitung”, in: Kogan-Bernstein, F.A. (Ed): Francis Bacon: Neu-Atlantis, Berlin: Akademie-Verlag, pp. 1-46.
  • Kroes, P.E., Light, A., Moore, S.A. & Vermaas, P.E. (2008): “Design in engineering and architecture: Towards an integrated philosophical understanding”, in: Vermaas, P.E., Kroes, P., Light, A. & Moore, S.A. (Eds): Philosophy and Design: From Engineering to Architecture, Dordrecht: Springer, pp. 1-17.
  • Krohs, U. & Kroes, P. (Eds) (2009): Functions in Biological and Artificial Worlds: Comparative Philosophical Perspectives, Cambridge (MA): MIT Press.
  • Kuhn, T.S. (1970): The Structure of Scientific Revolutions (Second Edition, Enlarged), Chicago: University of Chicago Press.
  • Li-Hua, R. (2009): “Definitions of technology”, in: J.K.B. Olsen, S.A. Pedersen & V.F. Hendricks (Eds): A Companion to the Philosophy of Technology, Chichester: Wiley-Blackwell, pp. 18-22.
  • Margolis, E. & Laurence, S. (Eds) (2007): Creations of the Mind: Theories of Artifacts and Their Representation, Oxford: Oxford University Press.
  • Meijers, A.W.M. (Ed.) (2009): Philosophy of Technology and Engineering Sciences (Handbook of the Philosophy of Science, Volume 9), Amsterdam: North Holland.
  • Misa, T.J. (2009): “History of technology”, in: J.K.B. Olsen, S.A. Pedersen & V.F. Hendricks (Eds): A Companion to the Philosophy of Technology, Chichester: Wiley-Blackwell, pp. 7-17.
  • Mitcham, C. (1994): Thinking Through Technology: The Path Between Engineering and Philosophy, Chicago & London: University of Chicago Press.
  • Mitcham, C. & Schatzberg, E. (2009): “Defining technology and the engineering sciences”, in: Meijers, A.W.M. (Ed.): Philosophy of Technology and Engineering Sciences (Handbook of the Philosophy of Science, Volume 9), Amsterdam: North Holland, pp. 27-63.
  • Nordmann, A. (2008): Technikphilosophie: Zur Einführung, Hamburg: Junius.
  • Nye, D.E. (2006): Technology Matters: Questions to Live With, Cambridge (MA): MIT Press.
  • O’Malley, M.A. (2009): “Making knowledge in synthetic biology: Design meets kludge”, Biological Theory 4: 378-389.
  • Parry, R. (2008): “Episteme and techne”, in: Zalta, E. (Ed.): Stanford Encyclopedia of Philosophy (Fall 2008 Edition),
  • Pitt, J.C. (2000): Thinking About Technology: Foundations of the Philosophy of Technology, New York & London: Seven Bridges Press.
  • Pitt, J.C. (2009): “Technological explanation”, in: Meijers, A.W.M. (Ed.): Philosophy of Technology and Engineering Sciences (Handbook of the Philosophy of Science, Volume 9), Amsterdam: North Holland, pp. 861-879.
  • Olsen, J.K.B., Pedersen, S.A. & Hendricks, V.F. (Eds) (2009): A Companion to the Philosophy of Technology, Chichester: Wiley-Blackwell.
  • Olsen, J.K.B., Selinger, E. & Riis, S. (Eds) (2009): New Waves in Philosophy of Technology, Houndmills: Palgrave Macmillan.
  • Office of Technology Assessment (1984): Commercial Biotechnology: An International Analysis, Washington (DC): U.S. Government Printing Office.
  • Rapp, F. (1981): Analytical Philosophy of Technology (Boston Studies in the Philosophy of Science, Vol. 63), Dordrecht: D. Reidel.
  • Rapp, F. (1989): “Introduction: General perspectives on the complexity of philosophy of technology”, in: Durbin, P.T. (Ed.): Philosophy of Technology: Practical, Historical and Other Dimensions, Dordrecht: Kluwer, pp. ix-xxiv.
  • Rollin, B.E. (2006): Science and Ethics, Cambridge: Cambridge University Press.
  • Ropohl, G. (1990): “Technisches Problemlösen und soziales Umfeld”, in: Rapp, F. (Ed.): Technik und Philosophie, Düsseldorf: VDI, pp. 111-167.
  • Ropohl, G. (2009): Allgemeine Technologie: Eine Systemtheorie der Technik (3., überarbeitete Auflage), Karlsruhe: Universitätsverlag Karlsruhe.
  • Scharff, R.C. (2009): “Technology as ‘applied science’”, in: J.K.B. Olsen, S.A. Pedersen & V.F. Hendricks (Eds): A Companion to the Philosophy of Technology, Chichester: Wiley-Blackwell, pp. 160-164.
  • Scharff, R.C. & Dusek, V. (Eds.) (2003): Philosophy of Technology: The Technological Condition – An Anthology, Malden (MA): Blackwell.
  • Schummer, J. (2001): “Aristotle on technology and nature”, Philosophia Naturalis 38: 105-120.
  • Snyder, L.J. (2009): “William Whewell”, in: Zalta, E. (Ed.): Stanford Encyclopedia of Philosophy (Winter 2009 Edition),
  • Spengler, O. (1931): Der Mensch und die Technik: Beitrag zu einer Philosophie des Lebens, München: C.H. Beck.
  • Van den Beld, H. (2009): “Philosophy of biotechnology”, in: Meijers, A.W.M. (Ed.): Philosophy of Technology and Engineering Sciences (Handbook of the Philosophy of Science, Volume 9), Amsterdam: North Holland, pp. 1302-1340.
  • Verbeek, P.-P. (2005): What Things Do: Philosophical Reflections on Technology, Agency, and Design, University Park (PA): Pennsylvania State University Press.
  • Vermaas, P.E., Kroes, P., Light, A. & Moore, S.A. (Eds) (2008): Philosophy and Design: From Engineering to Architecture, Dordrecht: Springer.
  • Vincenti, W.G. (1990): What Engineers Know and How They Know It: Analytical Studies from Aeronautical History, Baltimore (MD): Johns Hopkins University Press.
  • Wartofsky, M.W. (1979): “Philosophy of technology”, in: Asquith, P.D. & Kyburg, H.E. (eds): Current Research in Philosophy of Science, East Lansing (MI): Philosophy of Science Association, pp. 171-184.
  • Whitney, E. (1990): Paradise Restored: The Mechanical Arts From Antiquity Through the Thirteenth Century (Transactions of the American Philosophical Society, Vol. 80), Philadelphia: The American Philosophical Society.
  • Wimsatt, W.C. (2007): Re-engineering Philosophy for Limited Beings: Piecewise Approximations to Reality, Cambridge (MA): Harvard University Press.
  • Winner, L. (1977): Autonomous Technology: Technics-out-of-control as a Theme in Political Thought, Cambridge (MA): MIT Press.
  • Zoglauer, T. (2002): “Einleitung”, in: Zoglauer, T. (Ed.): Technikphilosophie, Freiburg & München: Karl Alber.


Author Information

Thomas A.C. Reydon
Leibniz University of Hannover

Philosophy of Medicine

While philosophy and medicine, beginning with the ancient Greeks, enjoyed a long history of mutually beneficial interactions, the professionalization of “philosophy of medicine” is a nineteenth-century event. One of the first academic books on the philosophy of medicine in modern terms was Elisha Bartlett’s Essay on the Philosophy of Medical Science, published in 1844. In the mid to late twentieth century, philosophers and physicians contentiously debated whether philosophy of medicine was a separate discipline distinct from the disciplines of either philosophy or medicine. The twenty-first century consensus, however, is that it is a distinct discipline with its own set of problems and questions. Professional journals, book series, individual monographs, as well as professional societies and meetings are all devoted to discussing and answering that set of problems and questions. This article examines, by way of a traditional approach to philosophical investigation, all aspects of the philosophy of medicine and the attempts of philosophers to address its unique set of problems and questions. To that end, the article begins with metaphysical problems and questions facing modern medicine such as reductionism vs. holism, realism vs. antirealism, causation in terms of disease etiology, and notions of disease and health. The article then proceeds to epistemological problems and questions, especially rationalism vs. empiricism, sound medical thinking and judgments, robust medical explanations, and valid diagnostic and therapeutic knowledge. Next, it will address the vast array of ethical problems and questions, particularly with respect to principlism and the patient-physician relationship. The article concludes with a discussion of what constitutes the nature of medical knowledge and practice, in light of recent trends towards both evidence-based and patient-centered medicine.
Finally, even though a vibrant literature on nonwestern traditions is available, this article is concerned only with the western tradition of philosophy of medicine (Kaptchuk, 2000; Lad, 2002; Pole, 2006; Unschuld, 2010).

Table of Contents

  1. Metaphysics
    1. Reductionism vs. Holism
    2. Realism vs. Antirealism
    3. Causation
    4. Disease and Health
  2. Epistemology
    1. Rationalism vs. Empiricism
    2. Medical Thinking
    3. Explanation
    4. Diagnostic and Therapeutic Knowledge
  3. Ethics
    1. Principlism
    2. Patient-Physician Relationship
  4. What is Medicine?
  5. References and Further Reading

1. Metaphysics

Traditionally, metaphysics pertains to the analysis of objects or events and the forces or factors causing or impinging upon them.  One branch of metaphysics, denoted ontology, investigates problems and questions concerning the nature and existence of objects or events and their associated forces or factors.  For philosophy of medicine in the twenty-first century, the two chief objects are the patient’s disease and health, along with the forces or factors responsible for causing them.  “What is/causes health?” or “What is/causes disease?” are contentious questions for philosophers of medicine.  Another branch of metaphysics involves the examination of presuppositions that inform a given ontology.  For philosophy of medicine, the most controversial debate centers around the presuppositions of reductionism and holism.  Questions like “Can a disease be sufficiently reduced to its elemental components?” or “Is the patient more than simply the sum of physical parts?” drive discussion among philosophers of medicine.  In addition, the debate between realism and antirealism has important traction within the field.  This debate centers on questions like, “Are disease-causing entities real?” or “Are these entities socially constructed?”   This section first explores the reductionism-holism and realism-antirealism debates, along with the notion of causation, before turning attention to the notions of disease and health.

a. Reductionism vs. Holism

The reductionism-holism debate enjoys a lively history, especially from the middle to the latter part of the twentieth century. Reductionism, broadly construed, is the diminution of complex objects or events to their component parts. In other words, the properties of the whole are simply the addition or summation of the properties of the individual parts. Such reductionism is often called metaphysical or ontological reductionism to distinguish it from methodological or epistemological reductionism. Methodological reductionism refers to the investigation of complex objects and events and their associated forces or factors by using technology that isolates and analyzes individual components only. Epistemological reductionism involves the explanation of complex objects and events and their associated forces or factors in terms of their individual components only. For the life sciences vis-à-vis reductionism, an organism is made of component parts like bio-macromolecules and cells, whose properties are sufficient for investigating and explaining the organism, if not life itself. Life scientists often sort these parts into a descending hierarchy. Beginning with the organism, they proceed downward through organ systems, individual organs, tissues, cells, and macromolecules until reaching the atomic and subatomic levels. Albert Szent-Gyorgyi once remarked that, as he descended this hierarchy in his quest to understand living organisms, “life slipped between his fingers.” Holism, however, is the position that the properties of the whole are not reducible to properties of its individual components. Jan Smuts (1926) introduced the term in the early part of the twentieth century, especially with respect to biological evolution, to account for living processes without the need for immaterial components.

The relevance of the reductionism-holism debate pertains to both medical knowledge and practice. Reductionism influences not only how a biomedical scientist investigates and explains disease, but also how a clinician diagnoses and treats it. For example, if a biomedical researcher believes that the underlying cause of a mental disease is dysfunction in brain processes or mechanisms, especially at the molecular level, then that disease is often investigated exclusively at that level. In turn, a clinician classifies mental diseases in terms of brain processes or mechanisms at the molecular level, such as depletion in levels of the neurotransmitter serotonin in depression. Subsequently, the disease is treated pharmacologically by prescribing drugs to elevate the low levels of the neurotransmitter in the depressed brain to levels considered normal within the non-depressed brain. Although the assumption of reductionism produces a detailed understanding of diseases in molecular or mechanistic terms, many clinicians and patients are dissatisfied with the assumption. Both clinicians and patients feel that the assumption excludes important information concerning the nature of the disease, as it influences both the patient’s illness and life experience. Such information is vital for treating patients with chronic cases, rather than simply treating their disease. In other words, patients often feel as if physicians reduce them to their disease or diseased body part rather than attending to the overall illness experience. The assumption of holism best suits the approach to medical knowledge and practice that includes the patient’s illness experience. Rather than striving exclusively for restoration of the patient to a pre-diseased state, the clinician assists the patient in redefining what the illness means for their life. The outcome is not necessarily a physical cure so much as a healing, a restoration of wholeness from the fragmentation in the patient’s life caused by the illness.

b. Realism vs. Antirealism

Realism is the philosophical notion that observable objects and events are actual objects and events, independent of the person observing them. In other words, since it exists outside the minds of those observing it, reality does not depend on conceptual structures or linguistic formulations. Proponents of realism also hold that even unobservable objects and events, like subatomic particles, exist. Historically, realists believe that universals (abstractions of objects and events) are separate from the mind cognizing them. For example, terms like bacteria and cell denote real objects in the natural world, which exist apart from the human minds trying to examine and understand them. Furthermore, scientific investigations into natural objects like bacteria and cells yield true accounts of these objects. Anti-realism, on the other hand, is the philosophical notion that observable objects and events are not actual objects and events as observed by a person; rather, they depend upon the person observing them. In other words, these objects and events are mind-dependent, not mind-independent. Anti-realists deny the existence of objects and events apart from the mind cognizing them. Human minds construct these objects and events based on social or cultural values. Historically, anti-realists subscribe to nominalism, in which universals do not exist and predicates of particular objects do. Finally, they question the truth of scientific accounts of the world since no mind-independent framework can be correct absolutely. Anti-realists hold that such truth is framework dependent, so when one changes the framework, truth itself changes.

The debate between realists and anti-realists has important implications for philosophers of medicine, as well as for the practice of clinical medicine. For example, a contentious issue is whether disease entities or conditions for the expression of a disease are real or not. Realists argue that such entities or conditions are real and exist independent of medical researchers investigating them, while anti-realists deny their reality and existence. Take the example of depression. According to realists, the neurotransmitter serotonin is a real entity that exists in a real brain, apart from clinical investigations or investigators. A low level of that transmitter is a real condition for the disease’s expression. For anti-realists, however, serotonin is a laboratory or clinical construct based on experimental or clinical conditions. Changes in that construct lead to changes in understanding the disease. The debate is not simply academic but has traction for the clinic. Clinical realists believe that restoring serotonin levels cures depression. Clinical anti-realists are less confident that restoring levels of the neurotransmitter will effect a cure. Rather, they believe that neither the diagnosis nor the treatment of depression reduces to simplistic measurements of serotonin levels. Importantly, the anti-realists do not harbor the hope that additional information is likely to remedy the clinical problems associated with serotonin replacement therapy. The problems are ontological to their core. The neurotransmitter is a mental construct entirely dependent on efforts to investigate and translate laboratory investigations into clinical practice.

c. Causation

Causation has a long philosophical history, beginning with the ancient Greek philosophers. Aristotle in particular provided a robust account of causation in terms of material cause, what something is made of; formal cause, how something is made; efficient cause, forces responsible for making something; and final cause, the purpose for or end to which something is made. In the modern period, Francis Bacon pruned the four Aristotelian causes to material and efficient causation. With the rise of British empiricism, especially with David Hume’s philosophical analysis of causation, causation came under critical scrutiny. For Hume, in particular, causation is simply the constant conjunction of objects and events, with no ontological significance in terms of hooking up the cause with the effect. According to Hume, custom or habit, not reason, leads us to assume something real exists between the cause and its effect. Although Hume’s skepticism of causal forces prevailed in his personal study, it did not prevail in the laboratory. During the nineteenth century, with the maturation of the scientific revolution, only one cause survived to account for natural entities and phenomena: the material cause, which is not strictly Aristotle’s original notion of material causation. The modern notion involves mechanisms and processes and thereby eliminates efficient causation. The material cause became the engine driving mechanistic ontology. During the twentieth century, after the Einsteinian and quantum revolutions, mechanistic ontology gave way to physical ontology that included forces such as gravity as causal entities. A century later, efficient causation is the purview of philosophers, who argue endlessly about it, while scientists take physical causation as unproblematic in constructing models of natural phenomena based on cause and effect.

For philosophers of medicine, causation is an important notion for analyzing both disease etiology and therapeutic efficacy (Carter, 2003). At the molecular level, causation operates physico-chemically to investigate and explain disease mechanisms. In the example of depression, serotonin is a neurotransmitter that binds specific receptors within certain brain locations, which in turn causes a cascade of molecular events in maintaining mental health. This underlying causal or physical mechanism is critical not only for understanding the disease, but also for treating it. Medical causation also operates at other levels. For infectious diseases, the Henle-Koch postulates are important in determining the causal relationship between an infectious microorganism and an infected host (Evans, 1993). To secure that relationship the microorganism must be associated with every occurrence of the disease, be isolated from the infected host, be grown under in vitro conditions, and be shown to elicit the disease upon infection of a healthy host. Finally, medical causation operates at the epidemiological or population level. Here, Austin Bradford Hill’s nine criteria are critical for establishing a causal relationship between environmental factors and disease incidence (Hill, 1965). For example, the case for a causal relationship between cigarette smoking and lung cancer rests on the strength of the association between the two, as well as on the consistency of that association and its plausibility in terms of biological mechanisms. These examples establish the importance of causal mechanisms involved in disease etiology and treatment, especially for diseases with an organic basis; however, some diseases, particularly mental disorders, do not reduce to such readily apparent causality. Instead, they represent rich areas of investigation for philosophers of medicine.

d. Disease and Health

“What is disease?” is a contentious question among philosophers of medicine. These philosophers distinguish among four different notions of disease. The first is an ontological notion. According to its proponents, disease is a palpable object or entity whose existence is distinct from that of the diseased patient. For example, disease may be the condition brought on by the infection of a microorganism, such as a virus. Critics, who champion a physiological notion of disease, argue that advocates of the ontological notion confuse the disease condition, which is an abstract notion, with a concrete entity like a virus. In other words, proponents of the first notion often conflate the disease’s condition and its cause. Supporters of this second notion argue that disease represents a deviation from normal physiological functioning. The best-known defender of this notion is Christopher Boorse (1987), who defines disease in value-free terms as deviation from a statistical norm with respect to “species design.” Critics who object to this notion, however, cite the ambiguity of the term “norm” in terms of a reference class. Instead of a statistical norm, evolutionary biologists propose a notion of disease as a maladaptive mechanism, which factors in the organism’s biological history. Critics of this third notion claim that disease manifests itself, especially clinically, in terms of the individual patient and not a population. A population may be important to epidemiologists but not to clinicians, who must treat individual patients whose manifestations of a disease and responses to therapy for that disease may differ from each other significantly. The final notion of disease addresses this criticism. The genetic notion claims that disease is the mutation in or absence of a gene. Its champions assert that each patient’s genomic constitution is unique. By knowing the genomic constitution, clinicians are able both to diagnose the patient’s disease and to tailor a specific therapeutic protocol. Critics of the genetic notion claim that disease, especially its experience, cannot be reduced to nucleotide sequences. Instead, it requires a larger notion including social and cultural factors.

“What is health?” is an equally contentious question among philosophers of medicine. The most common notion of health is simply absence of disease. Health, according to proponents of this notion, represents a default state as opposed to pathology. In other words, if an organism is not sick then it must be healthy. Unfortunately, this notion does not distinguish between various grades of health or preconditions towards illness. For example, as cells responsible for serotonin stop producing the neurotransmitter a person is prone to depression. Such a person is not as healthful as a person who is making sufficient amounts of serotonin. An adequate understanding of health should account for such preconditions. Moreover, health as absence of disease often depends upon personal and social values about what counts as health. Again, ambiguity enters into defining health given these values. For one person, health might be very different from that of another. The second notion of health does permit distinction between grades of health, in terms of quantifying it, and does not depend upon personal or social values. Proponents of this notion, such as Boorse, define health in terms of normal functioning, where the normal reflects a statistical norm with respect to species design. For example, a person with low levels of serotonin who is not clinically symptomatic in terms of depression is not as healthful as a person with statistically normal neurotransmitter levels. Critics of the second notion object that it fails to incorporate the social dimension of health; they jettison the notion altogether, opting instead for the notion of wellbeing. Wellbeing is a normative notion that combines both a person’s values, especially in terms of his or her life goals, and objective physiological states. Because normative notions contain a person’s value system, they are often difficult to define and defend since values vary from person to person and culture to culture. Proponents of this notion include Lennart Nordenfelt (1995), Carol Ryff and Burton Singer (1998), and Carolyn Whitbeck (1981).

2. Epistemology

Epistemology is the branch of philosophy concerned with the analysis of knowledge, in terms of both its origins and justification.  The overarching question is, “What is knowing or knowledge?”  Subsidiary questions include, “How do we know that we know?”; “Are we certain or confident in our knowing or knowledge?”; “What is it we know when we claim we know?” Philosophers generally distinguish three kinds or theories of knowledge.  The first pertains to knowledge by acquaintance, in which a knowing or an epistemic agent is familiar with an object or event.  It is descriptive in nature, that is, a knowing-about knowledge.  For example, a surgeon is well acquainted with the body’s anatomy before performing an operation.  The second is competence knowledge, which is the species of knowledge useful for performing a task skillfully.  It is performative or procedural in nature, that is, a knowing-how knowledge.  Again, by way of example, the surgeon must know how to perform a specific surgical procedure before executing it.  The third, which interests philosophers most, is propositional knowledge.  It pertains to certain truths or facts.  As such, philosophers traditionally call this species of knowledge, “justified true belief.”  Rather than descriptive or performative in nature, it is explanatory, or a knowing-that knowledge.  Again, by way of illustration, the surgeon must know certain facts or truths about the body’s anatomy, such as the physiological function of the heart, before performing open-heart surgery.  This section begins with the debate between rationalists and empiricists over the origins of knowledge, before turning to medical thinking and explanation and then concluding with the nature of diagnostic and therapeutic knowledge.

a. Rationalism vs. Empiricism

The rationalism-empiricism debate has a long history, beginning with the ancient Greeks, and focuses on the origins of knowledge and its justification. “Is that origin rational or empirical in nature?” “Is knowledge deduced or inferred from first principles or premises?” “Or, is it the result of careful observation and experience?” These are just a few of the questions fueling the debate, along with similar questions concerning epistemic justification. Rationalists, such as Socrates, Plato, Descartes, and Kant, appeal to reason as being both the origin and the justification of knowledge. As such, knowledge is intuitive in nature, and in contrast to the senses or perception, it is exclusively the product of the mind. Given the corruptibility of the senses, argue the rationalists, no one can guarantee or warrant knowledge, except through the mind’s capacity to reason. In other words, rationalism provides a firm foundation not only for the origin of knowledge but also for warranting its truth. Empiricists, such as Aristotle, Avicenna, Bacon, Locke, Hume, and Mill, avoid the fears of rationalists and exalt observation and experience with respect to the origin and justification of knowledge. According to empiricists, the mind is a blank slate (Locke’s tabula rasa) upon which observations and experiences inscribe knowledge. Here, empiricists champion the role of experimentation in the origin and justification of knowledge.

Within medicine specifically, the rationalism-empiricism debate originates with ancient Greek and Roman medicine. The Dogmatic school of medicine, founded by Hippocrates’ son and son-in-law in the fourth century BCE, claimed that reason is sufficient for understanding the underlying causes of diseases and thereby for treating them. Dogmatics relied on theory, especially the humoral theory of health and disease, to practice medicine. The Empiric school of medicine, on the other hand, asserted that only observation and experience, not theory, provide a sufficient foundation for medical knowledge and practice. Theory is an outcome of medical observation and experience, not their foundation. Empirics relied on palpable, not underlying, causes to explain health and disease and to practice medicine. Philinus of Cos and his successor Serapion of Alexandria, both third-century BCE Greek physicians, are credited with founding the Empiric school, which included the influential Alexandrian school. A third school of medicine, the Methodic school, arose in response to the debate between the Dogmatics and Empirics. In contrast to Dogmatics, and in agreement with Empirics, Methodics argued that underlying causes are superfluous to the practice of medicine. Rather, the patient’s immediate symptoms, along with common sense, are sufficient and provide the necessary information to treat the patient. Thus, in contrast to Empirics, Methodics argued that experience is unnecessary to treat disease and that the disease’s symptoms provide all the knowledge needed to practice medicine.

The Dogmatism-Empiricism debate, with Methodism representing a minority position, raged on and was still lively in the seventeenth and eighteenth centuries. For example, Giorgio Baglivi (1723), a seventeenth-century Italian physician of Armenian descent, decried the polarization of physicians along dogmatic and empiric boundaries and recommended resolving the debate by combining the two. Contemporary philosophical commentators on medicine recognize the importance of both epistemic positions, and several commentators propose a synthesis of them. For example, Jan van Gijn (2005) advocates an “empirical cycle” in which experiments drive hypothetical thinking, which in turn results in additional experimentation. Although no clear resolution of the rationalism-empiricism debate in medicine appears on the immediate horizon, the debate does emphasize the importance of and the need for additional philosophical analysis of epistemic issues surrounding medical knowledge.

b. Medical Thinking

“How doctors think” is the title of two twenty-first century books on medical thinking. The first is by a medical humanities scholar, Kathryn Montgomery (2006). Montgomery addresses vital questions about how physicians go about making clinical decisions when often faced with tangible uncertainty. She argues for medical thinking based not on science but on Aristotelian phronesis, or practical or intuitive reasoning. The second book is by a practicing clinician, Jerome Groopman (2007). Groopman also addresses questions about medical thinking, and he too pleads for clinical reasoning based on practical or intuitive foundations. Both books call for introducing the art of medical thinking to offset the overdependence on the science of medical thinking. In general, medical thinking reflects the cognitive faculties of clinicians to make rational decisions about what ails patients and how best to go about treating them both safely and effectively. That thinking, during the twentieth century, mimicked the technical thinking of natural scientists, and for good reason. As Paul Meehl (1954) convincingly demonstrated, statistical reasoning in the clinical setting outperforms intuitive clinical thinking. Although Montgomery and Groopman attempt to swing the pendulum back to the art of medical thinking, the risk of medical errors often associated with such thinking demands clearer analysis of the science of medical thinking. That analysis centers traditionally on both logical and algorithmic methods of clinical judgment and decision-making, to which the twenty-first century has turned.

Georg Stahl’s De logico medica, published in 1702, is one of the first modern treatises on medical logic.  However, not until the nineteenth century did logic of medicine become an important area of sustained analysis or have an impact on medical knowledge and practice.  For example, Friedrich Oesterlen’s Medical logic, published in English translation in 1855, promoted medical logic not only as a tool for assessing the formal relationship between propositional statements and thereby avoiding clinical error, but also for analyzing the relationship among medical facts and evidence in the generation of medical knowledge.  Oesterlen’s logic of medicine was indebted to the Paris school of clinical medicine, especially Pierre Louis’ numerical method (Morabia, 1996).  Contemporary logic of medicine continues this tradition, especially in terms of statistical analysis of experimental and clinical data.  For example, Edmond Murphy’s The Logic of Medicine (1997) represents an analysis of logical and statistical methods used to evaluate both experimental and clinical evidence.  Specifically, Murphy identifies several “rules of evidence” critical for interpreting such evidence as medical knowledge.  A particularly vigorous debate concerns the role of frequentist vs. Bayesian statistics in determining the statistical significance of data from clinical trials.  The logic of medicine, then, represents an important and a fruitful discipline in which medical scientists and clinical practitioners can detect and avoid errors in the generation and substantiation of medical knowledge and in its application or translation to the clinic.

Philosophers of medicine actively debate the best courses of action for making clinical decisions. Clinical judgment is an informal process in which a clinician assesses a patient’s clinical signs and symptoms to come to an accurate judgment about what is ailing the patient. To make such a judgment requires an insight into the intelligibility of the clinical evidence. The issue for philosophers of medicine is what role intuition should play in clinical judgment when facing the ideals of objective scientific reasoning and judgment. Meehl’s work on clinical judgment, as noted earlier, cast suspicion on the effectiveness of intuition in clinical judgment; and yet, some philosophers of medicine champion the tacit dimension in such decision-making. The debate often reduces to whether clinical judgment is an art or a science; however, some, like Alvan Feinstein (1994), argue for a reconciliatory position between them. Once a physician comes to a judgment, the physician must then make a decision as to how to proceed clinically. Although clinical decision making, with its algorithmic-like decision trees, is a formal procedure compared to clinical judgment, philosophers of medicine actively argue about the structure of these trees and the procedures for generating and manipulating them. The main issue is how best to define the utility grounding the trees.
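
To make the role of utility in such trees concrete, the following is a minimal sketch of how a single decision node might be evaluated by expected utility. It is an illustration only and not a method drawn from the literature cited here: the option names, the probabilities, and the 0-to-1 utility scale are all hypothetical assumptions. How those utilities ought to be defined is precisely the disputed question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """One branch leaving a chance node: its probability and its utility."""
    probability: float
    utility: float  # hypothetical 0-to-1 scale; how to set it is the open question


def expected_utility(outcomes):
    """Probability-weighted average utility of one clinical option."""
    total = sum(o.probability for o in outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(o.probability * o.utility for o in outcomes)


def best_option(options):
    """Return the name of the option with the highest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name]))


# Illustrative numbers only; they are not drawn from any study.
options = {
    "treat now": [Outcome(0.7, 0.9), Outcome(0.3, 0.4)],
    "watchful waiting": [Outcome(0.5, 1.0), Outcome(0.5, 0.3)],
}

for name, outcomes in options.items():
    print(name, round(expected_utility(outcomes), 3))
print("preferred:", best_option(options))
```

With these assumed numbers the first option is preferred (0.75 against 0.65), but a modest change in the utilities assigned to the outcomes reverses the ranking, which is why the definition of utility carries so much weight.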

c. Explanation

Epistemologists are generally interested in the nature of propositions, especially the explanatory power of those justified true beliefs. To know something truly is to understand and explain the hidden causes behind it. Explanations operate at a variety of levels. For example, neuroscientific explanations account for human behavior in terms of neurological activity, while astrological explanations account for such behavior with respect to astronomical activity. Philosophers, especially philosophers of science, distinguish several kinds of explanations, including the covering law explanation, causal explanation, and inference to the best explanation. In twenty-first century medicine, explanations are important for understanding disease mechanisms and, in understanding those mechanisms, for developing therapeutic modalities to treat the patient’s disease. This line of reasoning runs deep in medical history, beginning, as we have seen, with the Dogmatics. Twenty-first century philosophers of medicine utilize the explanatory schemes developed by philosophers of science to account for medical phenomena. The following paragraph briefly examines each of these explanatory schemes and its relevance for medical explanations.

Carl Hempel and Paul Oppenheim introduced covering law explanation in the late 1940s. According to Hempel and Oppenheim (1948), explanations function as arguments with the conclusion or explanandum (that which is explained) deduced or induced from premises or explanans (that which does the explaining). At least one of the explanans must be a scientific law, which can be a mechanistic or statistical law. Although covering law explanations are useful for those medical phenomena that reduce to mechanistic or statistical laws, such as explaining cardiac output in terms of heart rate and stroke volume, not all such phenomena lend themselves to such reductive explanations. The next explanatory scheme, causal explanation, attempts to rectify that problem. Causal explanation relies on the temporal or spatial regularity of phenomena and events and utilizes antecedent causes to explain phenomena and events. The explanations can be simple in nature, with only a few antecedent causes arranged linearly, or very complex, with multiple antecedent causes operating in a matrix of interrelated and integrated interactions. For example, causal explanations of cancer involve at least six distinct sets of genetic factors controlling cellular phenomena such as cell growth and death, immunological response, and angiogenesis. Finally, Gilbert Harman articulated the contemporary form of inference to the best explanation, or IBE, in the 1960s. Harman (1965) proposed that, based on the totality of evidence, one must choose the explanation that best accounts for that evidence and reject its competitors. The criteria for “bestness” range from the explanation’s simplicity to its generality or consilience in accounting for analogous phenomena. Peter Lipton (2004) offers Ignaz Semmelweis’ explanation of increased mortality of women giving birth in one ward compared to another as an example of IBE. Donald Gillies (2005) provides an analysis of it in terms of a Kuhnian paradigm.
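
The cardiac output case can be laid out as a small covering law argument: a general law and particular conditions serve as the explanans, and the observed output follows as the explanandum. The sketch below uses the standard physiological relation that cardiac output equals heart rate multiplied by stroke volume. The function name and the resting values of 70 beats per minute and 70 mL per beat are illustrative assumptions, not figures from the sources cited above.

```python
def cardiac_output_l_per_min(heart_rate_bpm, stroke_volume_ml):
    """General law (explanans): cardiac output = heart rate x stroke volume."""
    return heart_rate_bpm * stroke_volume_ml / 1000.0  # mL/min -> L/min


# Particular conditions (also explanans); illustrative resting values.
heart_rate_bpm = 70
stroke_volume_ml = 70

# Explanandum: the output deduced from the law plus the conditions.
explanandum = cardiac_output_l_per_min(heart_rate_bpm, stroke_volume_ml)
print(f"cardiac output = {explanandum:.1f} L/min")  # cardiac output = 4.9 L/min
```

The deduction goes through only because the phenomenon reduces to a law of this kind; where no such law is available, the scheme has nothing to compute, which is the limitation that causal explanation is meant to address.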

d. Diagnostic and Therapeutic Knowledge

Diagnostic knowledge pertains to the clinical judgments and decisions made about what ails a patient. Epistemologically, the issues concerned with such knowledge are its accuracy and certainty. Central to both these concerns are clinical symptoms and signs. Clinical symptoms are subjective manifestations of the disease that the patient articulates during the medical interview, while clinical signs are objective manifestations that the physician discovers during the physical examination. What is important for the clinician is how best to quantify those signs and symptoms, and then to classify them in a robust nosology or disease taxonomy. The clinical strategy is to collect the empirical data through the physical examination and laboratory tests, to deliberate on that data, and then to draw a conclusion as to what the data mean in terms of the patient’s disease condition. The strategy is fraught with questions for philosophers of medicine, from “What constitutes symptoms and signs, and how do they differ?” to “How best to measure and quantify the signs and to classify the diseases?” Philosophers of medicine debate the answers to these questions, but the discussion among philosophers of science over the strategy by which natural scientists investigate the natural world guides much of the debate. Thus, a clinician generates hypotheses about a patient’s disease condition, which he or she then assesses by conducting further medical tests. The result of this process is a differential diagnosis, which represents a set of hypothetical explanations for the patient’s disease condition. The clinician then narrows this set to one diagnostic hypothesis that best explains most, and hopefully all, of the relevant clinical evidence. The epistemic mechanism that accounts for this process and the factors involved in it are unclear. Philosophers of medicine especially dispute the role of tacit factors in the process. Finally, the heuristics of the process are an active area of philosophical investigation in terms of identifying rules for interpreting clinical evidence and observations.
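
One candidate formal model of how a differential diagnosis is narrowed is Bayesian updating, in which each diagnostic hypothesis is reweighted by how well it predicts a new test result. The sketch below is offered only as an illustration of that one model, not as an account of how clinicians actually reason, which remains in dispute. The three hypotheses, the prior weights, and the likelihoods of a positive test are hypothetical numbers chosen for the example.

```python
def update(priors, likelihoods):
    """Apply Bayes' theorem: posterior is proportional to prior x likelihood."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the finding is impossible under every hypothesis")
    return {h: weight / total for h, weight in unnormalized.items()}


# Differential diagnosis before the test (hypothetical prior weights).
priors = {"influenza": 0.5, "strep throat": 0.3, "mononucleosis": 0.2}

# Assumed probability of a positive test under each hypothesis.
likelihood_positive_test = {
    "influenza": 0.05,
    "strep throat": 0.90,
    "mononucleosis": 0.05,
}

posterior = update(priors, likelihood_positive_test)
leading = max(posterior, key=posterior.get)

for hypothesis, probability in posterior.items():
    print(f"{hypothesis}: {probability:.3f}")
print("leading hypothesis:", leading)
```

On this model the hypothesis that was second in the prior ranking becomes the leading one after a single finding. The model says nothing about where the priors come from, which is one place that tacit factors might enter.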

Therapeutic knowledge refers to the procedures and modalities used to treat patients. Epistemologically, the issues concerned with such knowledge are its efficacy and safety. Efficacy refers to how well the pharmacological drug or surgical procedure treats or cures the disease, while safety refers to possible patient harm caused by side effects. The questions animating discussion among philosophers of medicine range from “What is a cure?” to “How to establish or justify the efficacy of a drug or procedure?” The latter question occupies a considerable amount of the philosophy of medicine literature, especially the nature and role of clinical trials. Although basic medical research into the etiology of disease mechanisms is important, the translation of that research and the philosophical problems that arise from it are foremost on the agenda for philosophers of medicine. The origin of clinical trials dates at least to the eighteenth century, but not until the twentieth century was consensus reached over the structure of these trials. Today, four phases define a clinical trial. During the first phase, clinical investigators establish the maximum tolerance of healthy volunteers to a drug. The next phase involves a small patient population to determine the drug’s efficacy and safety. In the third phase, which is the final phase required to obtain FDA approval, clinical investigators utilize a large and relatively diverse patient population to establish the drug’s efficacy and safety. A fourth phase is possible in which clinical investigators chart the course of the drug’s use and effectiveness in a diverse patient population over a longer period. The following are topics of active discussion among philosophers of medicine: the nature of clinical trials with respect to features like randomization, in which test subjects are randomly assigned to either experimental or control groups; blinding of patients and physicians to that assignment to remove assessment bias; concurrent control, in which the control group does not receive the experimental treatment that the test group receives; and the role of the placebo effect, or the benefit patients anticipate from receiving treatment. However, the most pressing problem is the type of statistics utilized for analyzing clinical trial evidence. Some philosophers of medicine champion frequentist statistics, while others champion Bayesian statistics.
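
The difference between the two statistical approaches can be seen by analyzing the same trial data both ways. The sketch below is a minimal illustration under stated assumptions: the trial counts (30 of 50 patients improving on treatment, 20 of 50 on control) are invented, the frequentist analysis is a pooled two-proportion z-test, and the Bayesian analysis assumes uniform Beta(1, 1) priors and estimates the posterior probability by simulation. Neither choice is presented as the standard analysis of any actual trial.

```python
import math
import random


def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """Frequentist: two-sided p-value from a pooled two-proportion z-test."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    normal_cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - normal_cdf)


def prob_treatment_better(success_a, n_a, success_b, n_b, draws=20000, seed=0):
    """Bayesian: P(rate_a > rate_b | data) under uniform Beta(1, 1) priors.

    Each posterior is Beta(1 + successes, 1 + failures); the probability is
    estimated by Monte Carlo sampling from the two posteriors.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + success_a, 1 + n_a - success_a)
        rate_b = rng.betavariate(1 + success_b, 1 + n_b - success_b)
        wins += rate_a > rate_b
    return wins / draws


# Hypothetical trial: 30/50 improve on treatment, 20/50 on control.
p_value = two_proportion_p_value(30, 50, 20, 50)
posterior_prob = prob_treatment_better(30, 50, 20, 50)

print(f"frequentist two-sided p-value: {p_value:.4f}")
print(f"Bayesian P(treatment better | data): {posterior_prob:.3f}")
```

The two numbers answer different questions. The p-value (about 0.046 for these counts) is the probability of data at least this extreme if the treatment made no difference, whereas the Bayesian figure is the probability that the treatment is better given the data and the assumed prior. Much of the philosophical dispute concerns which of these questions a trial ought to answer.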

3. Ethics

Ethics is the branch of philosophy concerned with the right or moral conduct or behavior of a community and its members. Traditionally, philosophers divide ethics into descriptive, normative, and applied ethics. Descriptive ethics involves detailing ethical conduct without evaluating it in terms of moral codes of conduct, whereas normative ethics pertains to how a community and its members should act under given situations, generally in terms of an ethical code. This code is often a product of certain values held in common within a community. For example, ethical codes against murder reflect the values community members hold about taking human life without just cause. Besides values, ethicists base normative ethics on a particular theoretical perspective. Within western culture, three such perspectives predominate. The first and historically oldest ethical theory, although it experienced a renaissance in the late twentieth century, is virtue ethics. Virtue ethics claims that ethical conduct is the product of a moral agent who possesses certain virtues, such as the traditional cardinal virtues of prudence, courage, temperance, and justice. The second ethical theory is deontology, which bases moral conduct on adherence to ethical precepts and rules reflecting moral duties and obligations. The third ethical theory is consequentialism, which founds moral conduct on the outcome or consequence of an action. The chief example of this theory is utilitarianism, or the maximization of an action’s utility, which claims that an action is moral if it realizes the greatest amount of happiness for the greatest number of community members. Finally, applied ethics is the practical use of ethics within a profession such as business or medicine. Medical or biomedical ethics reflects applied ethics and is a major feature within the landscape of twenty-first century medicine. Historically, ethical issues have been a conspicuous component of medicine, beginning with Hippocrates. Throughout medical history several important treatises on medical ethics have been published. Probably the best-known example is Thomas Percival’s Medical Ethics, published in 1803, which influenced the development of the American Medical Association’s ethical code. Today, medical ethics is founded not on any particular ethical theory but on four ethical principles.

a. Principlism

The predominant system of contemporary medical or biomedical ethics has its origins in events that began in 1932. In that year, the Public Health Service, in conjunction with the Tuskegee Institute in Macon County, Alabama, undertook a clinical study to document the course of untreated syphilis in test subjects. The subjects were African American men. Over the next forty years, healthcare professionals observed the course of the disease, even after the introduction of antibiotics. Not until 1972 did the study end, and only after public outcry over newspaper articles, especially an article in the New York Times, reporting the study’s atrocities. What made the study so atrocious was that the healthcare professionals misinformed the subjects about treatment or failed to treat the subjects with antibiotics. To ensure that such flagrant abuse of test subjects did not happen again, the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research met from February 13-16, 1976. At the Smithsonian Institution’s Belmont Conference Center in Maryland, the commission drafted guidelines for the treatment of research subjects. The outcome was a report entitled Ethical Principles and Guidelines for the Protection of Human Subjects of Research, known simply as the Belmont Report, published in 1979. The report lists and discusses several ethical principles necessary for protecting human test subjects and patients from unethical treatment at the hands of healthcare researchers and providers. The first is respect for persons, in that researchers must respect the test subject’s autonomy to make informed decisions based on accurate and truthful information concerning the study’s procedures and risks. The next principle is beneficence, or maximizing the benefit-to-risk ratio for the test subject.
The final ethical principle is justice, which ensures that costs and benefits are equitably distributed among the general population and that no one segment of it bears an unreasonable share of the burden.

One of the framers of the Belmont Report was a young philosopher named Tom Beauchamp. While working on the report, Beauchamp, in collaboration with a colleague, James Childress, was also writing a book on the role of ethical principles in guiding medical practice. Rather than ground biomedical ethics on any particular ethical theory, such as deontology or utilitarianism, Beauchamp and Childress looked to ethical principles for guiding and evaluating moral decisions and judgments in healthcare. The fruit of their collaboration was Principles of Biomedical Ethics, first published in the same year as the Belmont Report, 1979. In the book, Beauchamp and Childress take the report’s ethical principles approach, which was meant to regulate the activities of biomedical researchers, and apply it to assist physicians in deliberating over the ethical issues associated with the practice of clinical medicine. However, besides the three guiding principles of the report, they added a fourth: nonmaleficence. Moreover, the first principle became patient autonomy, rather than respect for persons as denoted in the report. For the autonomy principle, Beauchamp and Childress stress the patient’s liberty to make critical decisions concerning treatment options, which have a direct impact on the patient’s own values and life plans. The authors’ second principle, nonmaleficence, instructs the healthcare provider to avoid doing harm to the patient, while the next principle, beneficence, emphasizes removing harm and doing good to the patient. Beauchamp and Childress articulate the final principle, justice, in terms reminiscent of the Belmont Report with respect to equitable distribution of risks and benefits, as well as healthcare resources, among both the general and patient populations. The bioethical community quickly dubbed the Beauchamp and Childress approach to biomedical ethics principlism.

Principlism’s impact on the bioethical discipline is unparalleled. Beauchamp and Childress’ book is now in its fifth edition and is a standard textbook for teaching biomedical ethics at medical schools and in graduate programs in medical ethics. One of the chief advocates of principlism is Raanan Gillon (1986). Gillon expanded the scope of the principles to aid in their application to difficult bioethical issues, especially where the principles might conflict with one another. However, principlism is not without its critics. A fundamental complaint is the lack of theoretical support for the four principles, especially when the principles collide with one another in their application to a bioethical problem. In practice, critics charge, ethicists and clinicians generally apply the principles in an algorithmic manner that can justify practically any ethical position on a biomedical problem. What critics want is a unified theoretical basis for grounding the principles, in order to avoid or adjudicate conflicts among them. Moreover, Beauchamp and Childress’ emphasis on patient autonomy has had serious ramifications for the physician’s role in medical care, with that role at times marginalized relative to the patient’s role. Alfred Tauber (2005), for instance, charges that such autonomy itself is “sick” and often results in patients abandoned to their own resources with detrimental outcomes for them. In response to their critics, Beauchamp and Childress argue that no single ethical theory is available to unite the four principles to avoid or adjudicate conflicts among them. However, they did introduce, in the fourth edition of Principles, a notion of common morality (a set of shared moral standards) to provide theoretical support for the principles. Unfortunately, their notion of common morality lacks the necessary theoretical robustness to unify the principles effectively.
Although principlism still serves a useful function in biomedical ethics, particularly in the clinic, early twenty-first century trends towards healthcare ethics and global bioethics have made its future unclear.

b. Patient-Physician Relationship

According to many philosophers of medicine, medicine is more than simply a natural or social science; it is a moral enterprise. What makes medicine moral is the patient-physician or therapeutic relationship. Although some philosophers of medicine criticize efforts to model the relationship, given the sheer number of contemporary models proposed to account for it, modeling the relationship has important ramifications for understanding and framing the moral demands of medicine and healthcare. The traditional medical model within the industrialized West, especially in mid-twentieth-century America, was paternalism or “doctor knows best.” The paternalistic model is doctor-centered in terms of power distribution, with the patient representing a passive agent who simply follows the doctor’s orders. The patient is not to question those orders, except to clarify them. The source of this power distribution is the doctor’s extensive medical education and training relative to the patient’s lack of medical knowledge. In this model, the doctor represents a parent, generally a father figure, and the patient a child, especially a sick child. The motivation of this model is to relieve a patient burdened with suffering from a disease, to benefit the patient through the doctor’s medical knowledge, and to effect a cure while returning the patient to health. In other words, the model’s guiding principle is beneficence. Besides the paternalistic model, other doctor-centered models include the priestly and mechanic models. However, the paternalistic model, as well as the other doctor-centered models, ran into severe criticism because of abuses associated with the models and with the rise of patient advocacy groups seeking to correct the abuses.

In the latter part of the twentieth century, with the rise of patient autonomy as a guiding principle for medical practice, alternative patient-physician models challenged traditional medical paternalism. Instead of being doctor-centered, one set of models is patient-centered, with patients as the locus of power. The predominant patient-centered model is the business model, where the physician is a healthcare provider and the patient a consumer of healthcare goods and services. The business model is an exchange relationship and relies heavily on a free market system. Thus, the patient possesses the power to pick and choose among physicians until a suitable healthcare provider is found. The legal model is another patient-centered model, in which the patient is a client and the guiding forces are patient autonomy and justice. Patient and physician enter into a contract for healthcare services. Another set of models in which patients have significant power in the therapeutic relationship are the mutual models. In these models, neither patients nor physicians have the upper hand in terms of power; they share it. The predominant mutual model is the partnership model, in which patient and physician are associates in the therapeutic relationship. The guiding force of this model is informed consent, in which the physician apprises the patient of the available therapeutic options and the patient then chooses which is best. Both the patient and physician share decision making over the best means for effecting a cure. Finally, other examples of mutual models include the covenant model, which stresses promise instead of contract; the friendship model, which involves a familial-like relationship; and the neighbor model, which maintains a “safe” distance and yet familiarity between patient and physician.

4. What is Medicine?

The nature of medicine is certainly an important question facing twenty-first-century philosophers of medicine. One reason for its importance is that the question addresses the vital topic of how physicians should practice medicine. Since the turn of the twenty-first century, clinicians and other medical pundits have begun to accept evidence-based medicine, or EBM, as the best way to practice medicine. Proponents of EBM claim that physicians should engage in medical practices based on the best scientific and clinical evidence available, especially from randomized controlled clinical trials, systematic observations, and meta-analyses of that evidence, rather than on pathophysiology or an individual physician’s clinical experience. Proponents also claim that EBM represents a paradigmatic shift away from traditional medicine. Traditional practitioners doubt the radical claims of EBM proponents. One specific objection is that applying evidence from population-based clinical trials to the individual patient within the clinic is not as easy to accomplish as EBM proponents suppose. In response, some clinicians propose patient-centered medicine (PCM). Patient-centered advocates include the patient’s personal information in order to apply the best available scientific and clinical evidence in treatment. The focus then shifts from the patient’s disease to the patient’s illness experience. The key to the practice of patient-centered medicine is a physician’s effective communication with the patient. While some commentators present EBM and PCM as competitors, others propose a combination or integration of the two medicines. The debate between advocates of EBM and PCM is reminiscent of an earlier debate between the science and art of medicine and betrays a deep anxiety over the nature of medicine. Certainly, philosophers of medicine can play a strategic role in the debate and assist towards its satisfactory resolution.

Besides questions about its nature, twenty-first-century medicine also faces a number of crises, including economic, malpractice, healthcare insurance, healthcare policy, professionalism, public or global health, quality-of-care, primary or general care, and critical care, to name a few (Daschle, 2008; Relman, 2007). Philosophers of medicine can certainly contribute to the resolution of these crises by carefully and insightfully analyzing the issues associated with them. For example, considerable attention has been paid in the literature to the crisis over the nature of medical professionalism (Project of the ABIM Foundation, et al., 2002; Tallis, 2006). The question that fuels this crisis is what type of physician best meets the patient’s healthcare needs and satisfies medicine’s social contract. The answer to this question involves the physician’s professional demeanor or character. However, little consensus on how best to define professionalism is evident in the literature. Philosophers of medicine can aid by furnishing guidance towards a consensus on the nature of medical professionalism.

Philosophy of medicine is a vibrant field of exploration into the world of medicine in particular, and of healthcare in general. Along the traditional lines of metaphysics, epistemology, and ethics, a host of questions and problems face philosophers of medicine and cry out for attention and resolution. In addition, many competing forces are vying for the soul of medicine today. Philosophy of medicine is an important resource for reflecting on those forces in order to forge a medicine that meets both the physical and existential needs of patients and society.

5. References and Further Reading

  • Achinstein, P. 1983. The nature of explanation. Oxford: Oxford University Press.
  • Andersen, H. 2001. The history of reductionism versus holism approaches to scientific research. Endeavour 25:153-156.
  • Aristotle. 1966. Metaphysics. H.G. Apostle, trans. Bloomington: Indiana University Press.
  • Baglivi, G. 1723. Practice of physick, 2nd edition. London: Midwinter.
  • Bartlett, E. 1844. Essay on the philosophy of medical science. Philadelphia: Lea & Blanchard.
  • Beauchamp, T., and Childress, J.F. 2001. Principles of biomedical ethics, 5th edition. Oxford: Oxford University Press.
  • Black, D.A.K. 1968. The logic of medicine. Edinburgh: Oliver & Boyd.
  • Bock, G.R., and Goode, J.A., eds. 1998. The limits of reductionism in biology. London: John Wiley.
  • Boorse, C. 1975. On the distinction between disease and illness. Philosophy and Public Affairs 5:49-68.
  • Boorse, C. 1987. Concepts of health. In Health care ethics: an introduction, D. VanDeVeer and T. Regan, eds.  Philadelphia: Temple University Press, pp. 359-393.
  • Boorse, C. 1997. A rebuttal on health. In What is disease?, J.M. Humber and R.F.  Almeder, eds. Totowa, N.J.: Humana Press, pp. 1-134.
  • Brody, H. 1992. The healer’s power. New Haven, CT: Yale University Press.
  • Caplan, A.L. 1986. Exemplary reasoning? A comment on theory structure in biomedicine. Journal of Medicine and Philosophy 11:93-105.
  • Caplan, A.L. 1992. Does the philosophy of medicine exist? Theoretical Medicine 13:67-77.
  • Carter, K.C. 2003. The rise of causal concepts of disease: case histories. Burlington, VT: Ashgate.
  • Cassell, E.J. 2004. The nature of suffering and the goals of medicine, 2nd edition. New York: Oxford University Press.
  • Clouser, K.D., and Gert, B. 1990. A critique of principlism. Journal of Medicine and Philosophy 15:219-236.
  • Collingwood, R.G. 1940. An essay on metaphysics. Oxford: Clarendon Press.
  • Coulter, A. 1999. Paternalism or partnership? British Medical Journal 319:719-720.
  • Culver, C.M., and Gert, B. 1982. Philosophy in medicine: conceptual and ethical issues in medicine and psychiatry. New York: Oxford University Press.
  • Daschle, T. 2008. Critical: what we can do about the health-care crisis. New York: Thomas Dunne Books.
  • Davis, R.B. 1995. The principlism debate: a critical overview. Journal of Medicine and Philosophy 20:85-105.
  • Davis-Floyd, R., and St. John, G. 1998. From doctor to healer: the transformative journey. New Brunswick, NJ: Rutgers University Press.
  • Dummett, M.A.E. 1991. The logical basis of metaphysics. Cambridge: Harvard University Press.
  • Elsasser, W.M. 1998. Reflections on a theory of organisms: holism in biology. Baltimore: Johns Hopkins University Press.
  • Emanuel, E.J., and Emanuel, L.L. 1992. Four models of the physician-patient relationship. Journal of the American Medical Association 267:2221-2226.
  • Engel, G.L. 1977. The need for a new medical model: a challenge for biomedicine. Science 196:129-136.
  • Engelhardt, Jr., H.T. 1996. The foundations of bioethics, 2nd edition. New York: Oxford University Press.
  • Engelhardt, Jr., H.T., ed., 2000. Philosophy of medicine: framing the field. Dordrecht: Kluwer.
  • Engelhardt, Jr., H.T., and Erde, E.L. 1980. Philosophy of medicine. In A guide to culture of science, technology, and medicine, P.T. Durbin, ed. New York: Free Press, pp. 364-461.
  • Engelhardt, Jr., H.T., and Wildes, K.W. 2004. Philosophy of medicine. In Encyclopedia of bioethics, 3rd edition, S.G. Post, ed. New York: Macmillan, pp. 1738-1742.
  • Evans, A.S. 1993. Causation and disease: a chronological journey. New York: Plenum.
  • Evans, M., Louhiala, P. and Puustinen, P., eds. 2004. Philosophy for medicine: applications in a clinical context. Oxon, UK: Radcliffe Medical Press.
  • Evidence-Based Medicine Working Group. 1992. Evidence-based medicine: a new approach to teaching the practice of medicine. Journal of the American Medical Association 268:2420-2425.
  • Feinstein, A.R. 1967. Clinical judgment. Huntington, NY: Krieger.
  • Feinstein, A.R. 1994. Clinical judgment revisited: the distraction of quantitative models. Annals of Internal Medicine 120:799-805.
  • Fulford, K.W.M. 1989. Moral theory and medical practice. Cambridge: Cambridge University Press.
  • Gardiner, P. 2003. A virtue ethics approach to moral dilemmas in medicine. Journal of Medical Ethics 29:297-302.
  • Gert, B., Culver, C.M., and Clouser, K.D. 1997. Bioethics: a return to fundamentals. Oxford: Oxford University Press.
  • Gillies, D.A. 2005. Hempelian and Kuhnian approaches in the philosophy of medicine: the Semmelweis case. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Science 36:159-181.
  • Gillon, R. 1986. Philosophical medical ethics. New York: John Wiley and Sons.
  • Goldman, G.M. 1990. The tacit dimension of clinical judgment. Yale Journal of Biology and Medicine 63:47-61.
  • Golub, E.S. 1997. The limits of medicine: how science shapes our hope for the cure. Chicago: University of Chicago Press.
  • Goodyear-Smith, F., and Buetow, S. 2001. Power issues in the doctor-patient relationship. Health Care Analysis 9:449-462.
  • Groopman, J. 2007. How doctors think. New York: Houghton Mifflin.
  • Halpern, J. 2001. From detached concern to empathy: humanizing medical practice. New York: Oxford University Press.
  • Hampton, J.R. 2002. Evidence-based medicine, opinion-based medicine, and real-world medicine. Perspectives in Biology and Medicine 45:549-568.
  • Harman, G.H. 1965. The inference to the best explanation. Philosophical Review 74:88-95.
  • Haug, M.R., and Lavin, B. 1983. Consumerism in medicine: challenging physician authority. Beverly Hills, CA: Sage Publications.
  • Häyry, H. 1991. The limits of medical paternalism. London: Routledge.
  • Hempel, C.G. 1965. Aspects of scientific explanation and other essays in the philosophy of science. New York: Free Press.
  • Hempel, C.G., and Oppenheim, P. 1948. Studies in the logic of explanation. Philosophy of Science 15:135-175.
  • Hill, A.B. 1965. The environment and disease: association or causation? Proceedings of the Royal Society of Medicine 58:295-300.
  • Howick, J.H. 2011. The philosophy of evidence-based medicine. Hoboken, NJ: Wiley-Blackwell.
  • Illari, P.M., Russo, F., and Williamson, J., eds. 2011. Causality in the sciences. New York: Oxford University Press.
  • Illingworth, P.M.L. 1988. The friendship model of physician/patient relationship and patient autonomy. Bioethics 2:22-36.
  • James, D.N. 1989. The friendship model: a reply to Illingworth. Bioethics 3:142-146.
  • Johansson, I., and Lynøe, N. 2008. Medicine and philosophy: a twenty-first century introduction. Frankfurt: Ontos Verlag.
  • Jonsen, A.R. 2000. A short history of medical ethics. New York: Oxford University Press.
  • Kaptchuk, T.J. 2000. The web that has no weaver: understanding Chinese medicine. Chicago, IL: Contemporary Books.
  • Kadane, J.B. 2005. Bayesian methods for health-related decision making. Statistics in Medicine 24:563-567.
  • Katz, J. 2002. The silent world of doctor and patient. Baltimore: Johns Hopkins University Press.
  • King, L.S. 1978. The philosophy of medicine. Cambridge: Harvard University Press.
  • Kleinman, A. 1988. The illness narratives: suffering, healing and the human condition. New York: Basic Books.
  • Knight, J.A. 1982. The minister as healer, the healer as minister. Journal of Religion and Health 21:100-114.
  • Konner, M. 1993. Medicine at the crossroads: the crisis in health care. New York: Pantheon Books.
  • Kovács, J. 1998. The concept of health and disease. Medicine, Health Care and Philosophy 1:31-39.
  • Kulkarni, A.V. 2005. The challenges of evidence-based medicine: a philosophical perspective. Medicine, Health Care and Philosophy 8:255-260.
  • Lad, V. D. 2002. Textbook of Ayurveda: fundamental principles of Ayurveda, volume 1. Albuquerque, NM: Ayurvedic Press.
  • Larson, J.S. 1991. The measurement of health: concepts and indicators. New York: Greenwood Press.
  • Le Fanu, J. 2002. The rise and fall of modern medicine. New York: Carroll & Graf.
  • Levi, B.H. 1996. Four approaches to doing ethics. Journal of Medicine and Philosophy 21:7-39.
  • Liberati, A., and Vineis, P. 2004. Introduction to the symposium: what evidence based medicine is and what it is not. Journal of Medical Ethics 30:120-121.
  • Lipton, P. 2004. Inference to the best explanation, 2nd edition. New York: Routledge.
  • Little, M. 1995. Humane medicine. Cambridge: Cambridge University Press.
  • Loewy, E.H. 2002. Bioethics: past, present, and an open future. Cambridge Quarterly of Healthcare Ethics 11:388-397.
  • Looijen, R.C. 2000. Holism and reductionism in biology and ecology: the mutual dependence of higher and lower level research programmes. Dordrecht: Kluwer.
  • Maier, B., and Shibles, W.A. 2010. The philosophy and practice of medicine and bioethics: a naturalistic-humanistic approach. New York: Springer.
  • Marcum, J.A. 2005. Metaphysical presuppositions and scientific practices: reductionism and organicism in cancer research. International Studies in the Philosophy of Science 19:31-45.
  • Marcum, J.A. 2008. An introductory philosophy of medicine: humanizing modern medicine. New York: Springer.
  • Marcum, J.A. 2009. The conceptual foundations of systems biology: an introduction. Hauppauge, NY: Nova Scientific Publishers.
  • Marcum, J.A., and Verschuuren, G.M.N. 1986. Hemostatic regulation and Whitehead’s philosophy of organism. Acta Biotheoretica 35:123-133.
  • Matthews, J.N.S. 2000. An introduction to randomized controlled clinical trials. London: Arnold.
  • May, W.F. 2000. The physician’s covenant: images of the healer in medical ethics, 2nd edition. Louisville: Westminster John Knox Press.
  • Meehl, P.E. 1954. Clinical versus statistical prediction: a theoretical analysis and a review of the literature. Minneapolis: University of Minnesota Press.
  • Montgomery, K. 2006. How doctors think: clinical judgment and the practice of medicine. New York: Oxford University Press.
  • Morabia, A. 1996. P.C.A. Louis and the birth of clinical epidemiology. Journal of Clinical Epidemiology 49:1327-1333.
  • Murphy, E.A. 1997. The logic of medicine, 2nd edition. Baltimore: The Johns Hopkins University Press.
  • Nesse, R.M. 2001. On the difficulty of defining disease: a Darwinian perspective. Medicine, Health Care and Philosophy 4:37-46.
  • Nordenfelt, L. 1995. On the nature of health: an action-theory approach, 2nd edition. Dordrecht: Kluwer.
  • Overby, P. 2005. The moral education of doctors. New Atlantis 10:17-26.
  • Papakostas, Y.G., and Daras, M.D. 2001. Placebos, placebo effects, and the response to the healing situation: the evolution of a concept. Epilepsia 42:1614-1625.
  • Parker, M. 2002. Whither our art? Clinical wisdom and evidence-based medicine. Medicine, Health Care and Philosophy 5:273-280.
  • Pellegrino, E.D., and Thomasma, D.C. 1981. A philosophical basis of medical practice: toward a philosophy and ethic of the healing professions. New York: Oxford University Press.
  • Pellegrino, E.D., and Thomasma, D.C. 1988. For the patient’s good: the restoration of beneficence in health care. New York: Oxford University Press.
  • Pellegrino, E.D., and Thomasma, D.C. 1993. The virtues in medical practice. New York: Oxford University Press.
  • Pole, S. 2006. Ayurvedic medicine: the principles of traditional practice. Philadelphia, PA: Elsevier.
  • Post, S.G. 1994. Beyond adversity: physician and patient as friends? Journal of Medical Humanities 15:23-29.
  • Project of the ABIM Foundation, ACP-ASIM Foundation, and European Federation of Internal Medicine. 2002. Medical professionalism in the new millennium: a physician charter. Annals of Internal Medicine 136:243-246.
  • Quante, M., and Vieth, A. 2002. Defending principlism well understood. Journal of Medicine and Philosophy 27:621-649.
  • Reeder, L.G. 1972. The patient-client as a consumer: some observations on the changing professional-client relationship. Journal of Health and Social Behavior 13:406-412.
  • Reiser, S.J. 1978. Medicine and the reign of technology. Cambridge: Cambridge University Press.
  • Relman, A.S. 2007. A second opinion: rescuing America’s healthcare. New York: Perseus Books.
  • Reznek, L. 1987. The nature of disease. London: Routledge & Kegan Paul.
  • Rizzi, D.A., and Pedersen, S.A. 1992. Causality in medicine: towards a theory and terminology. Theoretical Medicine 13:233-254.
  • Roter, D. 2000. The enduring and evolving nature of the patient-physician relationship. Patient Education and Counseling 39:5-15.
  • Rothman, K.J. 1976. Causes. American Journal of Epidemiology 104:587-592.
  • Ryff, C.D., and Singer, B. 1998. Human health: new directions for the next millennium. Psychological Inquiry 9:69-85.
  • Sackett, D.L., Richardson, W.S., Rosenberg, W., and Haynes, R.B. 1998. Evidence-based medicine: how to practice and teach EBM. London: Churchill Livingstone.
  • Salmon, W. 1984. Scientific explanation and the causal structure of the world. Princeton: Princeton University Press.
  • Samaniego, F.J. 2010. A comparison of the Bayesian and frequentist approaches to estimation. New York: Springer.
  • Schaffner, K.F. 1993. Discovery and explanation in biology and medicine. Chicago: University of Chicago Press.
  • Schaffner, K.F., and Engelhardt, Jr., H.T. 1998. Medicine, philosophy of. In Routledge Encyclopedia of Philosophy, E. Craig, ed. London: Routledge, pp. 264-269.
  • Schwartz, W.B., Gorry, G.A., Kassirer, J.P., and Essig, A. 1973. Decision analysis and clinical judgment. American Journal of Medicine 55:459-472.
  • Seifert, J. 2004. The philosophical diseases of medicine and their cures: philosophy and ethics of medicine, vol. 1: foundations. New York: Springer.
  • Senn, S. 2007. Statistical issues in drug development, 2nd edition. Hoboken, NJ: John Wiley & Sons.
  • Simon, J.R. 2010. Advertisement for the ontology of medicine. Theoretical Medicine and Bioethics 31:333-346.
  • Smart, J.J.C. 1963. Philosophy and scientific realism. London: Routledge & Kegan Paul.
  • Smuts, J. 1926. Holism and evolution. New York: Macmillan.
  • Solomon, M.J., and McLeod, R.S. 1998. Surgery and the randomized controlled trial: past, present and future. Medical Journal of Australia 169:380-383.
  • Spodick, D.H. 1982. The controlled clinical trial: medicine’s most powerful tool. The Humanist 42:12-21, 48.
  • Stempsey, W.E. 2000. Disease and diagnosis: value-dependent realism. Dordrecht: Kluwer.
  • Stempsey, W.E. 2004. The philosophy of medicine: development of a discipline. Medicine, Health Care and Philosophy 7:243-251.
  • Stempsey, W.E. 2008. Philosophy of medicine is what philosophers of medicine do. Perspectives in Biology and Medicine 51:379-391.
  • Stewart, M., Brown, J.B., Weston, W.W., McWhinney, I.R., McWilliam, C.L., and Freeman, T.R. 2003. Patient-centered medicine: transforming the clinical method, 2nd edition. Oxon, UK: Radcliffe Medical Press.
  • Straus, S.E., and McAlister, F.A. 2000. Evidence-based medicine: a commentary on common criticisms. Canadian Medical Association Journal 163:837-840.
  • Svenaeus, F. 2000. The hermeneutics of medicine and the phenomenology of health: steps towards a philosophy of medical practice. Dordrecht: Kluwer.
  • Tallis, R.C. 2006. Doctors in society: medical professionalism in a changing world. Clinical Medicine 6:7-12.
  • Tauber, A.I. 1999. Confessions of a medicine man: an essay in popular philosophy. Cambridge: MIT Press.
  • Tauber, A.I. 2005. Patient autonomy and the ethics of responsibility. Cambridge: MIT Press.
  • Thagard, P. 1999. How scientists explain disease. Princeton: Princeton University Press.
  • Tonelli, M.R. 1998. The philosophical limits of evidence-based medicine. Academic Medicine 73:1234-1240.
  • Tong, R. 2007. New perspectives in health care ethics: an interdisciplinary and crosscultural approach. Upper Saddle River, NJ: Pearson Prentice Hall.
  • Toombs, S.K. 1993. The meaning of illness: a phenomenological account of the different perspectives of physician and patient. Dordrecht: Kluwer.
  • Toombs, S. K., ed. 2001. Handbook of phenomenology and medicine. Dordrecht: Kluwer.
  • Unschuld, P.U. 2010. Medicine in China: a history of ideas, 2nd edition. Berkeley, CA: University of California Press.
  • van der Steen, W.J., and Thung, P.J. 1988. Faces of medicine: a philosophical study. Dordrecht: Kluwer.
  • van Gijn, J. 2005. From randomized trials to rational practice. Cardiovascular Diseases 19:69-76.
  • Veatch, R.M. 1981. A theory of medical ethics. New York: Basic Books.
  • Veatch, R.M. 1991. The patient-physician relations: the patient as partner, part 2. Bloomington, IN: Indiana University Press.
  • Velanovich, V. 1994. Does philosophy of medicine exist? A commentary on Caplan. Theoretical Medicine 15:88-91.
  • Weatherall, D. 1996. Science and the quiet art: the role of medical research in health care. New York: Norton.
  • Westen, D., and Weinberger, J. 2005. In praise of clinical judgment: Meehl’s forgotten legacy. Journal of Clinical Psychology 61:1257-1276.
  • Whitbeck, C. 1981. A theory of health. In Concepts of health and disease: interdisciplinary perspectives, A.L. Caplan, H.T. Engelhardt, Jr., and J.J. McCartney, eds. London: Addison-Wesley, pp. 611-626.
  • Wildes, K.W. 2001. The crisis of medicine: philosophy and the social construction of medicine. Kennedy Institute of Ethics Journal 11:71-86.
  • Woodward, J. 2003. Making things happen: a theory of causal explanation. Oxford: Oxford University Press.
  • Worrall, J. 2002. What evidence in evidence-based medicine? Philosophy of Science 69:S316-S330.
  • Worrall, J. 2007. Why there’s no cause to randomize. British Journal for the Philosophy of Science 58:451-488.
  • Wulff, H.R., Pedersen, S.A., and Rosenberg, R. 1990. Philosophy of medicine: an introduction, 2nd edition. Oxford: Blackwell.
  • Zaner, R.M. 1981. The context of self: a phenomenological inquiry using medicine as a clue. Athens, OH: Ohio University Press.


Author Information

James A. Marcum
Baylor University
U. S. A.

Causal Role Theories of Functional Explanation

Functional explanations are a type of explanation offered in the natural and social sciences. In giving these explanations, researchers appeal to the functions that a structure or system has. For instance, a biologist might say, “The kidney has the function of eliminating waste products from the bloodstream.” Or a sociologist might say, “The purpose of monogamy is to preserve the family structure.” Each of these is concerned with a function that a structure or system is believed to possess. Philosophical interest in this issue concerns understanding what exactly these statements amount to, and whether they are explanatory. Of particular concern is whether such statements commit us to problematic views about the existence of teleology, or purposes, in nature, and whether this is legitimate in the sciences.

This article considers the debate over functional explanations in the philosophical literature from the 1950’s to the early 21st century. It begins by considering the background to philosophical interest in this subject. Then it looks at two prominent early approaches to functional explanation: Ernest Nagel’s deductive-nomological approach from the 1950’s, and Robert Cummins’ causal account from the 1970’s, as well as objections to both. Throughout there is consideration of illustrative examples of functional explanations from different sciences. Although there are other accounts of functional explanation in the literature, such as the evolutionary and design-oriented accounts, they will be mentioned only in relation to the other two. The broad history of the concept of teleology and the details of the other accounts will not be developed here.

Table of Contents

  1. Background
  2. Nagel’s Early Account
  3. Difficulties
  4. Cummins’ View
  5. An Example
  6. Objections and Replies
  7. Other Developments
  8. References and Further Reading
    1. References
    2. Suggested Reading

1. Background

Prior to the 1950’s, philosophers and scientists were concerned with appeals to teleological concepts in the sciences. These concepts have a history going back to Aristotle, who claimed that understanding such things as physical objects or an animal’s physiological structures involves knowing what they are for. In his view, a complete explanation of these structures requires understanding the purposes towards which they are directed. For instance, he thought that one does not understand what a kidney is merely by knowing the material it is made from; one also has to understand that it has the purpose of filtering blood (what it’s for). In Aristotle’s view purposes were conceived as unusual properties like “ends” or “final causes”. But after the scientific revolution these properties were hard to understand in a manner consistent with the sciences, and were considered to be obscure. Later, many took Darwin’s theory of evolution to cast further doubt on this way of thinking, since it promised to replace talk of Aristotelian purposes with reference to natural selection and the environment. The theory of evolution was taken by many scientists to show that appeals to purposive concepts in biology and the social sciences were merely misguided appeals to an outmoded form of explanation. Nevertheless, scientists often continued through the 1900’s to use purposive language in explaining the phenomena that interested them. If appeals to purposive concepts were misguided, then one would have expected these appeals to wither away. So a problem remained in the scientific community over what this situation implied: whether this continued talk of purposes was illegitimate in the sciences, or whether there was an acceptable way of understanding such language.

2. Nagel’s Early Account

Modern discussions of functional explanation usually begin with the views developed by Ernest Nagel and others around the 1950’s in the context of scientific explanation (Nagel 1961; Hempel 1959). In his book The Structure of Science, one of Nagel’s concerns was describing the various forms of explanation that occur in the sciences. When we consider the biological and social sciences, he observes, we often find researchers describing the structures or systems that concern them in terms of the functions they have. For instance, a biologist might say, “The heart has the function of pumping blood through the circulatory system.” Or a sociologist might say, “The purpose of the religious ritual is to increase cohesion in society.” Put generally, Nagel says these statements can be understood as stating that “the function of system X is to do Y.” He claims that there are several issues that such statements raise for further examination. We want to understand the detailed structure of such statements and how they work. We also want to understand how these statements are related to other kinds of explanations offered in the natural and social sciences.

Nagel claims that functional statements can be understood in the following manner. These statements are explanation sketches that, when fully spelled out, reduce to another form of explanation common in the physical sciences called the Deductive-Nomological (D-N) model. Nagel's account focuses on the attributions of functions to the components of a structure (or system). Consider the statement, “The heart has the function of pumping blood through the circulatory system.” Properly understood, this statement serves to explain why hearts are present in vertebrates. To see this, we have to note that hearts occur only together with a certain physical organization of the vertebrate body and in a certain external environment that the organism lives in. So, what the statement really says is that, in vertebrate bodies with an organization of blood and blood vessels and in a certain external environment, circulation occurs only if an organism has a heart. If we focus on this latter statement, Nagel says that the information in it can be expanded into a D-N explanation. When this is done, we then have the following explanation: Every vertebrate body with the appropriate organization and in a certain environment engages in circulation. If the vertebrate body does not have a heart, then it does not engage in circulation. Hence, the vertebrate body must have a heart.

This means functional statements in general have the following form for Nagel: The function of A in a system S with organization C is to enable S in environment E to engage in process P. And this can be expanded into an explicit explanation in this way: Every system S with organization C and in environment E engages in process P. If S with organization C and in environment E does not have A, then S does not engage in P. Hence, S with organization C must have A (1961, 403).
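
The schema above can be set out in predicate-logic notation to make its logical form explicit. This is a reconstruction for illustration only. The predicate letters C, E, P, and A are assumptions introduced here to abbreviate “has organization C,” “is in environment E,” “engages in process P,” and “has component A”; Nagel does not use this notation.

```latex
\begin{align*}
\text{(1)}\quad & \forall s\,\bigl[(C(s) \land E(s)) \rightarrow P(s)\bigr] \\
\text{(2)}\quad & \forall s\,\bigl[(C(s) \land E(s) \land \lnot A(s)) \rightarrow \lnot P(s)\bigr] \\
\therefore\quad & \forall s\,\bigl[(C(s) \land E(s)) \rightarrow A(s)\bigr]
\end{align*}
```

On this reading the inference is deductively valid. Take any system s with organization C in environment E. By (1) it engages in P. If it lacked A, then by (2) it would not engage in P, which is a contradiction, so it has A. Any doubt about the explanation therefore has to fall on the premises rather than on the inference.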

Understood this way, functional statements can be seen as explaining why components like hearts are present in certain organisms in which circulation occurs. For what such statements say is that, for organisms, having a heart is a necessary condition for pumping blood in the circulatory system. Nagel says this means the statement “The heart has the function of pumping blood through the circulatory system” says the same thing as the statement “Organisms in which circulation occurs pump blood only if they have a heart.” Because of the equivalence, he writes that it appears that “when a function is ascribed to a constituent element in an organism, the content of the teleological [functional] statement is fully conveyed by another statement that is not explicitly teleological and that simply asserts a necessary . . . condition for the occurrence of a certain trait or activity of the organism” (1961, 405).

There are two features of Nagel’s account worth noting at this point. First, he holds that when we attribute a function to a component, this is always relative to a goal state of the larger system. Researchers are not interested in merely any effects of the components a system has. Rather, they are interested in certain effects that are important to the maintenance of the organism. Biologists, for example, are interested in understanding the heart’s pumping blood because this effect contributes to the activity of circulation, which is important for the organism’s survival. It is important to note that for Nagel talk of goals in this context is not intended to make reference to conscious entities of some kind. It only concerns the characteristic activities that researchers believe to be present in the systems that concern them (for example, survival and reproduction).

Second, we saw that functional statements that mention the functions of components can be translated into equivalent statements that lack these notions. Nagel claims this shows there’s nothing scientifically suspect about these statements. We shouldn’t think that such statements commit us to problematic entities like “purposes” in nature that somehow attach to components, or to “ends” or “final causes” towards which components are believed to strive (as in Aristotle’s view of teleology). Instead, what the account makes clear is that, in attributing a function to a component, researchers are merely concerned with explaining why the component is present in the larger system. In light of this, Nagel suggests that functional statements are unproblematic despite their earlier associations in history with unacceptable teleological notions. Understood properly he thinks these statements should be seen as belonging with other legitimate forms of explanation in the sciences (on this point compare Ayala 1970).

3. Difficulties

There are two difficulties commonly noted with the account presented. The first concerns the reference to goals in identifying the effects of the components that are the targets of explanation. As we have seen for Nagel, the function of a component is identified in relation to the effects that contribute towards the important activities of an organism, like survival and reproduction. But this is problematic for some (Cummins 1975) since it suggests there are components that researchers might want to say have functions, but with this account do not. For instance, let us suppose that the wings of a particular species of bird stopped contributing to the capacity for survival and reproduction (maybe this is due to a specific type of airborne predator). In this situation, it seems researchers would not say that the wings did not have any function to perform; they would still want to understand how the wings contributed to the birds’ capacity for flight. The present account would classify these structures as lacking functions when some would say they have functions.

The second problem concerns the sense in which functional statements explain the presence of the components involved. In Nagel’s view, we explain the presence of a component by inferring that it is necessary for the performance of a capacity of a system. In the example considered, the heart is said to be a necessary condition for the occurrence of blood pumping in vertebrates. But some (Wright 1973; Cummins 1975) have denied that having a heart is a necessary condition for pumping blood. For surely there are functionally equivalent structures like artificial hearts that are capable of pumping blood through the circulatory system, or circulation might even be achieved in other ways. If this is the case, however, then functional statements cannot be interpreted as explaining why certain structures must be present in an organism (because they needn’t be). Nagel knew of this concern and said his interest was with actual living systems and what they include (as opposed to logical possibilities). But it is not clear this resolves the worry since we find components like artificial hearts in actual living systems (Buller 1999). Aside from this issue, there are believed to be general problems with the framework of Deductive-Nomological explanation that lies behind Nagel’s account of functional explanation, which have raised concerns with his approach. It is no longer accepted by everyone, for example, that scientific explanations have to be seen as involving deductions from universal, lawlike statements as required by the Deductive-Nomological model of explanation. As a result, people became less inclined to think that an account of functional explanation had to be incorporated into the particular explanatory framework Nagel employed. This was important in making people receptive to the alternative accounts that were being developed.

4. Cummins’ View

After Nagel the most prominent view developed along these lines was by Robert Cummins (1975, 1983). Cummins agrees that the previous account suffers from the problems that were described. The point of functional statements is not to explain why certain components are present in a structure in relation to some goal state the structure has. Instead, Cummins suggests that functional statements are merely used to explain the contributions made by components of a structure to a capacity of a containing system. The performance of a capacity of a system is explained in terms of the capacities of the components it contains, and how they are organized. Consider researchers’ interest in explaining what it is for a system S to have property P. Cummins writes that “the natural strategy for answering such a question is to construct an analysis of S that explains S’s possession of P by appeal to the properties of S’s components and their mode of organization” (1983, 15). For example, let us suppose that researchers are interested in understanding how circulation occurs in vertebrates. To explain this capacity, they search for the structure in the body that contributes to this capacity by moving the blood around. They observe that blood is moved through the arteries by some sort of pumping motion. When they learn that the heart serves as the pumping mechanism in the body, they identify this with its function (they report “the heart has the function of pumping blood”). The capacity for circulation is thus explained in terms of the capacities of the components of the system that enable it to perform the task.

The explanation proceeds in stages. Researchers begin with a specification of the larger function of the system they want to explain. The explanation then consists in showing how the function depends on the capacities of the components of the system, and their organization. Two stages are involved in this process. The first stage is to analyze the function in question in terms of the capacities involved in bringing about the function, which Cummins calls the analytical strategy. He says “The Analytical Strategy proceeds by analyzing a disposition [function] into a number of other relatively less problematic dispositions such that [the] organized manifestation of these analyzing dispositions amounts to a manifestation of the analyzed disposition” (1977, 272). The second stage is to show that there is a physical structure present that realizes the various capacities. This is needed to show that the function is, in fact, realized by an actual structure or mechanism. As Cummins explains, “Ultimately . . . a complete property theory for a dispositional property must exhibit the details of the target property’s instantiation in the system (or system type) that has it. Analysis of the disposition . . . is only the first step; instantiation is the second” (1983, 31). While the two stages are independent of one another, it is important to note that both stages are needed for the explanation to be complete. This is worth keeping in mind because one sometimes finds discussions in the literature that focus merely on one stage of the explanation and not the other.

Cummins notes that the explanation can be iterated by applying it to the functions of the components cited in the earlier explanation. This process can be repeated until researchers are satisfied with the level reached, or until they reach a level of physical components where no further explanation can be given. In practice, where this line is drawn is relative to the particular interests the researcher has.

Understood this way, functional statements are not used, as Nagel says, in explaining why certain components are present in a system, but to explain how a component contributes to the capacity of the system that contains it. This concern with the causal organization of components is why Cummins’ account is referred to as the “causal role” account of functional explanation. Furthermore, there is no requirement with the account that the larger function being explained be related to the organism’s survival and reproduction (or similar activity). This means the goal state requirement of the previous account has been dropped. In this respect, Cummins’ account can be seen as broadening the number of capacities of systems that are the appropriate subjects of explanation. The broader applicability of the account is seen by many to be a benefit of the view, but we will see later that for some it is thought to raise problems.

5. An Example

With Cummins’ account, the same form of explanation is said to be applicable to a range of different structures or systems in the sciences. The account applies to the functions of structures like the heart in physiology, but it can also be applied to other kinds of systems, including chemical systems, psychological systems, social systems, and others. We can illustrate this with an example from psychology. Consider the explanation of color vision in the human system (Dawson 1998, 163). The function of color vision consists in the capacity (F) to perceive information about the colors of objects in the environment. The trichromatic theory of color vision provides an explanation of how this works. It holds that the function is performed in virtue of the capacities (C1a, C2a, . . . Cna) of the components of the visual system (S1a, S2a, . . . Sna), and the way they are organized. In particular, the function depends on the capacities of the parts of the eye to produce differential responses to wavelengths of light. Researchers have learned that when light falls on the retina there are three kinds of cone cells there containing photopigments (red, green, and blue) that respond differently to the different wavelengths of light present. These responses are combined in the retina through cellular connections and produce a distinctive response signal in the nervous system. The signal is sent to the visual cortex in the brain, and leads to the color perceptions we have. In turn, the subcapacities of the individual pigments in the cones (say C1a) have themselves been explained in terms of the capacities (C1b, C2b, . . . Cnb) of the molecular components (S1b, S2b, . . . Snb) of the photopigments and how they respond to light (for example, the capacities of vitamin A and various proteins to change when exposed to light). In this way, the function of color vision has been decomposed into the various capacities of the anatomical components within the eye and nervous system that underlie the function.
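
The iterated analysis in this example can be pictured as a tree in which each capacity is paired with the component that has it and with the capacities into which it is analyzed. The Python sketch below is only an illustrative assumption, not a formalism found in Cummins or Dawson. The class name `Capacity`, the helper functions `outline` and `levels`, and the wording of the labels are all hypothetical and chosen here for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Capacity:
    """A capacity, the component that has it, and the capacities that analyze it."""
    name: str
    bearer: str
    analyzed_into: List["Capacity"] = field(default_factory=list)


def outline(capacity: Capacity, depth: int = 0) -> List[str]:
    """Return one indented line per capacity, walking the analysis top-down."""
    lines = ["  " * depth + f"{capacity.bearer}: {capacity.name}"]
    for sub in capacity.analyzed_into:
        lines.extend(outline(sub, depth + 1))
    return lines


def levels(capacity: Capacity) -> int:
    """Count how many times the analysis has been iterated below this capacity."""
    if not capacity.analyzed_into:
        return 0
    return 1 + max(levels(sub) for sub in capacity.analyzed_into)


# Hypothetical encoding of the color vision example from the text.
color_vision = Capacity(
    name="perceive the colors of objects",            # F
    bearer="visual system",
    analyzed_into=[
        Capacity(
            name="respond differentially to wavelengths of light",  # C1a
            bearer="cone cells",                                     # S1a
            analyzed_into=[
                Capacity(
                    name="change when exposed to light",            # C1b
                    bearer="photopigment molecules",                # S1b
                ),
            ],
        ),
        Capacity(
            name="combine cone responses into a signal",
            bearer="retinal connections",
        ),
        Capacity(
            name="produce color perceptions from the signal",
            bearer="visual cortex",
        ),
    ],
)

print("\n".join(outline(color_vision)))
```

Here the analysis has been iterated twice, once for the visual system as a whole and once more for the cone cells. Where to stop is left to whoever builds the tree, which mirrors the point that the level at which analysis ends is relative to the researcher's interests.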

In this respect functional explanation is intended to be a general strategy of explanation that can be applied in different sciences. What is needed is for researchers to identify a capacity of the system they want to explain, and then describe how this occurs as a result of the organized behavior of the components which make up the system. In pursuing this strategy, researchers can be seen as undertaking a kind of mechanical analysis in attempting to explain the behaviors of the system that interest them. It is, in part, the broad applicability of the account to different sciences that has made philosophers interested in understanding its details and implications. One will find further illustrations of the account by looking at discussions of functionalism in the philosophy of mind. This will reveal how the account has been used in theories in other areas.

6. Objections and Replies

There are several objections that have been made to the causal role account. Here we will consider four of the common ones. The first is the concern that too many kinds of components can be ascribed functions that on their face don’t seem to have functions. Consider a bit of dirt that has become lodged in a pipe and which operates as a one-way valve (Griffiths 1993). This material can be seen as contributing to the capacity of the pipe to control the flow of liquid, and so can be part of a functional explanation in line with the account. It seems odd to attribute a function to the dirt in this situation, but it becomes a possibility once we have dropped the notion of a goal state from the account. Once this occurs, it seems any effect of a structure that contributes to a capacity can be used in an explanation.

A second objection concerns the various kinds of things that we know objects like hearts are capable of doing. Millikan (1989a) claims that objects like hearts not only move blood through the circulatory system, but also make a thumping noise that doctors can listen to. Making a noise is an effect of the structure that can be explained in terms of the account presented before. But while biologists take the function of the heart to be the circulation of blood, they do not say that making thumping noises is. So the account seems too liberal since it fails to distinguish between genuine functions and mere side effects of the systems.

In reply to these sorts of concerns, Cummins (1975) argues that there is no objective way of making the distinction between genuine functions and other effects. The effects of a component may be relevant to the explanation of different overall capacities. The limits on what capacities should be explained depend on the particular explanatory interests researchers have. Relative to the capacity for blood circulation in the body, the heart can be said to function as a pumping mechanism; but relative to the capacity for making sounds, the heart can be said to function as a noise maker. There is no saying which of these counts as a genuine function in an absolute sense since researchers’ interests are what matter. This response raises issues about the nature of scientific explanations and whether these should be seen as objective, or whether it is sufficient merely to appeal to the interests researchers have. The answer to this depends on what one sees as the appropriate characteristics of a scientific explanation. In addition to making this point, Cummins also added a general restriction on the kinds of explanations that should be given. He said that the appropriateness of a functional explanation is related to the “interestingness” of the explanation being offered. An explanation counts as interesting when the component capacities appealed to in the explanation are less complex and different in kind from the larger capacity being explained. We can illustrate this with the piece of dirt which is lodged in the pipe. For example, the capacity of the dirt to obstruct the flow of liquid is neither less complex nor really different from the overall capacity of the pipe being explained, and so a functional explanation has no interest in this case. This response is intended to place limits on the functional explanations that are appropriate to make in these sorts of cases. But it has been perceived as being somewhat vague in describing the systems to which it applies.

This sort of concern has also been considered by Davies (2001), who argues that there are specific constraints on the effects that are appropriate to use in an explanation. He suggests we can supplement the account offered by recognizing that functions are appropriately attributed, not just to any components, but to those in a hierarchically organized system. A system is hierarchically organized just when the function is performed in virtue of the lower-level organization of the system in question. In this view, the effects of a component are not functional unless they are due to the specific hierarchical organization of the structure. For example, aside from pumping blood the heartbeat has the effect of vibrating the sternum in the chest. But this effect does not contribute anything to a larger capacity of the circulatory system, or to other related capacities of the organism. Therefore, there is no reason to accept this as a genuine function of the component in question.

A third problem concerns the character of some of the components to which researchers ascribe functions. Neander (1991; see Millikan 1989b) claims that researchers in sciences like biology commonly refer to components that may be diseased or malformed as having functions. Due to congenital disease, for example, a heart may lack the parts necessary to pump blood around the circulatory system in an organism and thus not work. This presents a problem because here the component will be unable to perform its causal role, and lack the function as a result. Despite this fact, Neander claims that, in this situation, researchers still classify the components functionally as being hearts. She claims this shows we need another notion of function independent of the causal role notion. The evolutionary account she prefers is presently the main rival to the causal role account. It holds that functional explanations are a type of evolutionary explanation. Roughly, to say that a component has a function F is to say that F is an effect of the component that was selected for by natural selection in the past. So the heart has the function of pumping blood because hearts were selected for pumping blood in our ancestors, and this led to the present existence of hearts. With this view the idea is that natural selection confers on components functional roles they are supposed to perform despite their inability to perform their causal role.

Different replies have been offered to this objection. Cummins (1975) accepts that malfunctioning components do not have their causal role functions to perform. If a component is unable to perform its causal role, then this implies the loss of the relevant capacity. A similar sort of view has also been supported by Davies (2001, 176). He maintains that we should not classify components as being malfunctional in these cases. He says that components are wrongly classified this way on the basis of our prior experience of physically similar structures, which leads us to expect the structures will function as the other structures do. But, properly speaking, the structures should be classified as nonfunctional “because natural traits cannot malfunction.” This point is made in relation to a larger complaint he makes that the notion of function appealed to in evolutionary theories does not fit well with the ontologies of the natural sciences.

Another reply has been offered by Amundson and Lauder (1994). They argue it is false that malfunctioning components can only be classified in terms of the evolutionary functions they perform. They claim that researchers in physiology commonly classify structures in terms of their homologous relationships to other structures, and independently of their functions. Homologies are defined as traits in different species that are similar due to common ancestry (a standard example is the forelimbs of humans and bats that are structurally similar). These traits can be identified in terms of such features as the physical similarity present, correspondence of parts, and other features. These criteria enable researchers to classify structures on the basis of their anatomical features alone. Since researchers can classify a malformed heart as a heart by appeal to its structural features, it is claimed that there is no problem for the account with structures that malfunction. Whether this response is correct depends on how classification works in the practice of researchers in physiology. This has been a subject of early 21st century controversy among philosophers working in this area.

The final problem concerns the role of functional statements in relation to what they explain. As was noted before, on the evolutionary account a functional statement is said to explain the presence of a component of a system. For instance, the statement “the heart functions to pump blood” serves to explain why hearts exist in certain organisms. The idea is that in the past, hearts that pumped blood contributed to the survival of an organism, which explains the present existence of hearts. So, the functional statement points in the direction of an evolutionary account for the rise of the traits which are subject to explanation. In contrast, the causal role account does not provide an explanation in this sense. Functional statements do not serve to explain the presence of the component, but explain the contribution of the component to a capacity of a particular system. This is not a matter of why the component exists in the system, but the task the component performs. The different accounts raise issues about the aims theories of functional explanation are believed to have and what needs explaining. The different perspectives people have taken on this topic are related to how the explanations are used by researchers in different areas of science, and the particular roles the explanations have in these areas.

7. Other Developments

So far, we have seen that the causal role account is applicable to different fields of science. A further issue is exactly which fields it applies to, and how it relates to the other accounts mentioned. It will help here to describe the relation between the causal role account and the other accounts as they have been discussed in the biological sciences.

According to Neander (1991), there is a single notion of function used across different areas of biological research. The basic notion is the evolutionary notion we described before that is explained in terms of natural selection. The idea is that the heart functions to pump blood because pumping blood was the effect that hearts were selected for in the past, and which led to the present existence of hearts. Neander claims this is the basic notion at work when researchers appeal to the functions of a component and not the causal notion. In another vein, Kitcher (1993) argues that the concept of design can be seen to underlie all functional explanations. Roughly, to say that a component has a function F is to say that F is what the component was designed to do (the account allows there are different sources of design). He believes the notion applies in evolutionary contexts that involve a past selection process, as well as in physiological investigations that concern the current causal contributions of a component to a system’s capacity (in the latter case, Kitcher claims that a selection process is involved in an indirect way). In both accounts, appeals to functions can be unified under some general concept that applies across different areas of research.

Not everyone agrees, though, that the notion of function can be treated in this way. Godfrey-Smith (1993) argues that it is wrong to think there is a unified concept of function at work in different areas. He suggests there are distinct notions of function that are appropriate to different fields. The causal role notion is appropriate in physiological investigations where researchers are concerned with understanding how the capacities of a system depend on the capacities of its components. These investigations can be undertaken independently from historical considerations about selection. Alternatively, the evolutionary notion is appropriate in areas like evolution and behavioral ecology where researchers are interested in explaining why organisms have the structures and behaviors they have. In this context, the focus is on past selection pressures in the environment, and a historical approach is appropriate. So, in Godfrey-Smith’s view it is a mistake to think that we should be attempting to unify the various uses of functional language under a single account; we should rather accept pluralism. The different notions should merely be seen as reflecting the different kinds of information researchers are concerned with in different areas of investigation. This point can be related to the previous issue concerning the interest-relative character of the explanations researchers offer.

Not only are there concerns about the areas of research where the causal role notion is applicable, but there have been questions raised about similar notions in the vicinity. The causal notion is often thought to be the basic notion at work in physiology (this is controversial for some who hold the evolutionary or other account). Wouters (1995) argues, though, that there is another notion that has been neglected by philosophers in this area. He says that when researchers talk of functional explanations they are often referring to what he calls “viability explanations.” These have a different focus than explaining how a function depends on the capacities of the components of a system. In these cases, researchers are interested in explaining the traits that individuals need to survive and reproduce in their environments. For example, given the distance between the central organs and the outer periphery of the human body, the function of circulating oxygen cannot be performed by simple diffusion. This is why a circulatory system is needed for the performance of the function. This explanation contributes to our understanding by showing why the circulatory system has to be present for maintaining the viability of the organism. Wouters claims that this form of explanation is distinct from the previous accounts we described, and involves its own explanatory structure. Moreover, this explanation is said to share features with the earlier notion of functional explanation described by those like Nagel. In this respect, it is suggested that philosophers may have overlooked an important notion of functional explanation worthy of further examination.

There is a lot more that could be said about the subject of causal role functional explanation, and the debate about functions. We have not covered everything that might be considered on this issue. One thing that is clear from our discussion, though, is that a proper understanding of functional explanations cannot be achieved independently from considering how they are used in the sciences. Whether, and to what extent, the causal role notion is applicable in a particular area needs to be determined by examining the science in question. In this respect, those interested in furthering our understanding of the notion will have to familiarize themselves with the details of the examples being considered. An understanding of how functional explanations are used is an important part of helping us improve our understanding of this concept.

8. References and Further Reading

a. References

  • Amundson, R., & Lauder, G. (1994). “Function without purpose: the uses of causal role function in evolutionary biology.” Biology and Philosophy 9: 443-469.
  • Ariew, A., Cummins, R., & Perlman, M. (eds.) (2002). Functions: New Essays in the Philosophy of Psychology and Biology. Oxford: Oxford University Press.
  • Ayala, F. (1970). “Teleological explanations in evolutionary biology.” Philosophy of Science 37: 1-15.
  • Block, N. (1980). “Introduction: what is functionalism?” In Block, N. (ed.) Readings in Philosophy of Psychology, vol. 1. Cambridge, MA: Harvard University Press.
  • Cummins, R. (1977). “Programs in the explanation of behavior.” Philosophy of Science 44: 269-287.
  • Cummins, R. (1983). The Nature of Psychological Explanation. Cambridge, MA: MIT Press.
  • Davies, P. (2001). Norms of Nature: Naturalism and the Nature of Functions. Cambridge, MA: MIT Press.
  • Dawson, M. (1998). Understanding Cognitive Science. Malden, MA: Blackwell.
  • Enç, B. & Adams, F. (1992). “Functions and goal directedness.” Philosophy of Science 59: 635-654.
  • Godfrey-Smith, P. (1993). “Functions: consensus without unity.” Pacific Philosophical Quarterly 74: 196-208.
  • Griffiths, P. E. (1993). “Functional analysis and proper functions.” British Journal for the Philosophy of Science 44: 409-422.
  • Hempel, C. G. (1959). “The logic of functional analysis.” In Gross, L. (ed.) Symposium on Sociological Theory. Evanston, IL: Harper and Row Publishers.
  • Hempel, C. G. & Oppenheim, P. (1948). “Studies in the logic of explanation.” Philosophy of Science 15: 135-175.
  • Kitcher, P. (1993). “Function and design.” Midwest Studies in Philosophy 18: 379-397.
  • McLaughlin, P. (2001). What Functions Explain: Functional Explanation and Self-Reproducing Systems. Cambridge: Cambridge University Press.
  • Millikan, R. G. (1989a). “An ambiguity in the notion of function.” Biology and Philosophy 4: 172-176.
  • Millikan, R. G. (1989b). “In defense of proper functions.” Philosophy of Science 56: 288-302.
  • Nagel, E. (1961). The Structure of Science. Indianapolis, IN: Hackett.
  • Nagel, E. (1977). “Teleology revisited: goal-directed processes in biology.” Journal of Philosophy 74: 261-301.
  • Neander, K. (1991). “The teleological notion of ‘function’.” Australasian Journal of Philosophy 69 (4): 454-468.
  • Neander, K. (2002). “Types of traits: the importance of functional homologies.” In Ariew, A., Cummins, R., & Perlman, M. (eds.) (2002).
  • Polger, T. (2004). Natural Minds. Cambridge, MA: MIT Press.
  • Wouters, A. (1995). “Viability explanation.” Biology and Philosophy 10: 435-457.
  • Wright, L. (1973). “Functions.” Philosophical Review 82: 139-168.

b. Suggested Reading

  • Allen, C., Bekoff, M., & Lauder, G. (eds.) (1998). Nature’s Purposes: Analyses of Function and Design in Biology. Cambridge, MA: MIT Press.
  • Buller, D. (ed.) (1999). Function, Selection, and Design. Albany, NY: State University of New York Press.
  • Cummins, R. (1975). “Functional Analysis.” Journal of Philosophy 72: 741-765. Reprinted with minor alterations in Allen, Bekoff, & Lauder (1998) and Buller (1999).
  • Wouters, A. (2005). “The function debate in philosophy.” Acta Biotheoretica 53: 123-151.


Author Information

Mark B. Couch
Seton Hall University
U. S. A.

Relational Models Theory

Relational Models Theory is a theory in cognitive anthropology positing a biologically innate set of elementary mental models and a generative computational system operating upon those models.  The computational system produces compound models, using the elementary models as a kind of lexicon.  The resulting set of models is used in understanding, motivating, and evaluating social relationships and social structures.  The elementary models are intuitively quite simple and commonsensical.  They are as follows: Communal Sharing (having something in common), Authority Ranking (arrangement into a hierarchy), Equality Matching (striving to maintain egalitarian relationships), and Market Pricing (use of ratios).  Even though Relational Models Theory is classified as anthropology, it bears on several philosophical questions.

It contributes to value theory by describing a mental faculty which plays a crucial role in generating a plurality of values.  It thus shows how a single human nature can result in conflicting systems of value.  The theory also contributes to philosophy of cognition.  The complex models evidently result from a computational operation, thus supporting the view that a part of the mind functions computationally.  The theory contributes to metaphysics.  Formal properties posited by the theory are perhaps best understood abstractly, raising the possibility that these mental models correspond to abstract objects.  If so, then Relational Models Theory reveals a Platonist ontology.

Table of Contents

  1. The Theory
    1. The Elementary Models
    2. Resemblance to Classic Measurement Scales
    3. Self-Organization and Natural Selection
    4. Compound Models
    5. Mods and Preos
  2. Philosophical Implications
    1. Moral Psychology
    2. Computational Conceptions of Cognition
    3. Platonism
  3. References
    1. Specifically Addressing Relational Models Theory
    2. Related Issues

1. The Theory

a. The Elementary Models

The anthropologist Alan Page Fiske pioneered Relational Models Theory (RMT).  RMT was originally conceived as a synthesis of certain constructs concerning norms formulated by Max Weber, Jean Piaget, and Paul Ricoeur.  Fiske then explored the theory among the Moose people of Burkina Faso in Africa.  He soon realized that its application was far more general, giving special insight into human nature.  According to RMT, humans are naturally social: they use the relational models to structure and understand social interactions, and they regard the application of these models as intrinsically valuable.  All relational models, no matter how complex, are, according to RMT, analyzable into four elementary models: Communal Sharing, Authority Ranking, Equality Matching, and Market Pricing.

Any relationship informed by Communal Sharing presupposes a bounded group, the members of which are not differentiated from each other.  Distinguishing individual identities are socially irrelevant.  Because of this shared identity, generosity within a Communal Sharing group is not usually conceived of as altruism, even though there is typically much behavior which would otherwise seem like extreme altruism.  Members of a Communal Sharing relationship typically feel that they have something in common, such as blood, deep attraction, national identity, a history of suffering, or the joy of food.  Examples include nationalism, racism, intense romantic love, indiscriminately killing any member of an enemy group in retaliation for the death of someone in one’s own group, and sharing a meal.

An Authority Ranking relationship is a hierarchy in which individuals or groups are placed in relatively higher or lower positions.  Those ranked higher have prestige and privilege not enjoyed by those who are lower.  Further, the higher typically have some control over the actions of those who are lower.  However, the higher also have duties of protection and pastoral care for those beneath them.  Metaphors of spatial relation, temporal relation, and magnitude are typically used to distinguish people of different rank.  For example, a King may have a larger audience room than a Prince, or may arrive after a Prince at a royal banquet.  Further examples include military rankings, the authority of parents over their children especially in more traditional societies, caste systems, and God’s authority over humankind.  Brute coercive manipulation is not considered to be Authority Ranking; it is more properly categorized as the Null Relation in which people treat each other in non-social ways.

In Equality Matching, one attempts to achieve and sustain an even balance and one-to-one correspondence between individuals or groups.  When there is not a perfect balance, people try to keep track of the degree of imbalance in order to calculate how much correction is needed.  “Equality matching is like using a pan balance: People know how to assemble actions on one side to equal any given weight on the other side” (Fiske 1992, 691).  If you and I are out of balance, we know what would restore equality.  Examples include the principle of one-person/one-vote, rotating credit associations, equal starting points in a race, taking turns offering dinner invitations, and giving an equal number of minutes to each candidate to deliver an on-air speech.

Market Pricing is the application of ratios to social interaction.  This can involve maximization or minimization as in trying to maximize profit or minimize loss.  But it can also involve arriving at an intuitively fair proportion, as in a judge deciding on a punishment proportional to a crime.  In Market Pricing, all socially relevant properties of a relationship are reduced to a single measure of value, such as money or pleasure.  Most utilitarian principles involve maximization.  An exception would be Negative Utilitarianism whose principle is the minimization of suffering.  But all utilitarian principles are applications of Market Pricing, since the maximum and the minimum are both proportions.  Other examples include rents, taxes, cost-benefit analyses including military estimates of kill ratios and proportions of fighter planes potentially lost, tithing, and prostitution.

RMT has been extensively corroborated by controlled studies based on research using a great variety of methods investigating diverse phenomena, including cross-cultural studies (Haslam 2004b).  The research shows that the elementary models play an important role in cognition including perception of other persons.

b. Resemblance to Classic Measurement Scales

It may be jarring to learn that intense romantic love and racism are both categorized as Communal Sharing or that tithing and prostitution are both instances of Market Pricing.  These examples illustrate that a relational model is, at its core, a meaningless formal structure.  Implementation in interpersonal relations and attendant emotional associations enter in on a different level of mental processing.  Each model can be individuated in purely formal terms, each elementary model strongly resembling one of the classic scale types familiar from measurement theory.  (Strictly speaking, it is each mod which can be individuated in purely formal terms.  This finer point is discussed below in the section on mods and preos.)

Communal Sharing resembles a nominal (categorical) scale.  A nominal scale simply classifies things into categories.  A questionnaire may be designed to categorize people as theist, atheist, agnostic, and other.  Such a questionnaire is measuring religious belief by using a nominal scale.  The groups into which Communal Sharing sorts people are similar.  One either belongs to a pertinent group or one does not, there being no degrees or shades of gray.  Another illustration of nominal scaling is the pass/fail system of grading.  Authority Ranking resembles an ordinal scale in which items are ranked.  The ranking of students according to their performance is one example.  The ordered classification of shirts in a store as small, medium, large, and extra large is another.  Equality Matching resembles an interval scale.  On an interval scale, any unit measures the same magnitude at any point on the scale.  For example, on the Celsius scale the difference between 1 degree and 2 degrees is the same as the difference between 5 degrees and 6 degrees.  Equality Matching resembles an interval scale insofar as one can measure the degree of inequality in a social relationship using equal intervals so as to judge how to correct the imbalance.  It is by use of such a scale that people in an Equality Matching interaction can specify how much one person owes another.  However, an interval scale cannot be used to express a ratio because it has no absolute zero point.  For example, the zero point on the Celsius scale is not absolute, so one cannot say that 20 degrees is twice as warm as 10 degrees.  On the Kelvin scale, by contrast, the zero point is absolute, so one can express ratios.  Given that Market Pricing is the application of ratios to social interactions, it resembles a ratio scale such as the Kelvin scale.  One cannot, for example, meaningfully speak of the maximization of utility without presupposing some sort of ratio scale for measuring utility.  Maximization would correspond to 100 percent.
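
The contrast between interval and ratio scales can be checked with a little arithmetic.  The following is a minimal Python sketch, not part of Relational Models Theory itself; the function and variable names are chosen here purely for illustration.  It suggests that the apparent ratio of two Celsius readings changes when the same temperatures are re-expressed in Fahrenheit, whereas the ratio of the corresponding Kelvin readings survives conversion to Rankine, another scale with an absolute zero.

```python
import math

def celsius_to_fahrenheit(c):
    # Changes both the unit and the zero point; each is arbitrary on an interval scale.
    return c * 9 / 5 + 32

def kelvin_to_rankine(k):
    # Changes the unit only; the zero point is absolute on a ratio scale.
    return k * 9 / 5

cool_c, warm_c = 10.0, 20.0

# On an interval scale the apparent ratio depends on which scale is chosen.
ratio_celsius = warm_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# The same two temperatures on scales with an absolute zero.
cool_k, warm_k = cool_c + 273.15, warm_c + 273.15
ratio_kelvin = warm_k / cool_k
ratio_rankine = kelvin_to_rankine(warm_k) / kelvin_to_rankine(cool_k)

print(ratio_celsius, ratio_fahrenheit)            # the two ratios disagree
print(math.isclose(ratio_kelvin, ratio_rankine))  # the two ratios agree
```

Because the Celsius and Fahrenheit ratios disagree, "twice as warm" has no scale-independent meaning for them, which is the sense in which an interval scale cannot express a ratio.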

c. Self-Organization and Natural Selection

The four measurement scales correspond to different levels of semantic richness and precision.  The nominal scale conveys little information, being very coarse grained.  For example, pass/fail grading conveys less information than ranking students.  Giving letter grades is even more precise and semantically rich, conveying how much one student out-performs another.  This is the use of an interval scale.  The most informative and semantically rich is a percentage grade which illustrates the ratio by which one student out-performs another, hence a ratio scale.  For example, if graded accurately a student scoring 90 percent has done twice as well as a student scoring 45 percent.  Counterexamples may be apparent: two students could be ranked differently while receiving the same letter grade by using a deliberately coarse-grained letter grading system so as to minimize low grades.  To take an extreme case, a very generous instructor might award an A to every student (after all, no student was completely lost in class) while at the same time mentally ranking the students in terms of their performance.  Split grades are sometimes used to smooth out the traditional coarse-grained letter grading system.  But, if both scales are as sensitive as possible and based on the same data, the interval scale will convey more information than the ordinal scale.  The ordinal ranking will be derivable from the interval grading, but not vice versa.  This is more obvious in the case of temperature measurement, in which grade inflation is not an issue.  Simply ranking objects in terms of warmer/colder conveys less information than does Celsius measurement.
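
The claim that an ordinal ranking is derivable from interval grading, but not vice versa, can be illustrated with a short Python sketch.  The student names and scores are invented for the example, and numeric scores stand in here for grades on an interval scale.

```python
def ranking(scores):
    """Order student names from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Two hypothetical sets of scores with very different gaps between students.
close_race = {"Ana": 81, "Ben": 80, "Cy": 79}
wide_gaps = {"Ana": 95, "Ben": 70, "Cy": 40}

# The ranking can be computed from either set of scores...
print(ranking(close_race))
print(ranking(wide_gaps))

# ...but both sets collapse to the same ranking, so the scores
# cannot be recovered from the ranking alone.
print(ranking(close_race) == ranking(wide_gaps))
```

Going from scores to ranks discards the size of the gaps, which is the extra information the interval scale carries.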

One scale is more informative than another because it is less symmetrical; greater asymmetry means that more information is conveyed.  On a measurement scale, a permutation which distorts or changes information is an asymmetry.  Analogously, a permutation in a social-relational arrangement which distorts or changes social relations is an asymmetry.  In either case, a permutation which does not carry with it such a distortion or change is symmetric.  The nominal scale type is the most symmetrical scale type, just as Communal Sharing is the most symmetrical elementary model.  In either case, the only asymmetrical permutation is one which moves an item out of a category, for example, expelling someone from the social group.  Any permutation within the category or group makes no difference; no difference to the information conveyed, no difference to the social relation.  In the case of pass/fail grading, the student’s performance could be markedly different from what it actually was.  So long as the student does well enough to pass (or poorly enough to fail), this would not have changed the grade.  Thanks to this high degree of symmetry, the nominal scale conveys relatively little information.

The ordinal scale is less symmetrical.  Any permutation that changes rankings is asymmetrical, since it distorts or changes something significant.  But items arranged could change in many respects relative to each other while their ordering remains unaffected, so a high level of symmetry remains.  Students could vary in their performance, but so long as their relative ranking remains the same, this would make no difference to grades based on an ordinal scale.

An interval scale is even less symmetrical and hence more informative, as seen in the fact that a system of letter grades conveys more information than does a mere ranking of students.  An interval scale conveys the relative degrees of difference between items.  If one student improves from doing C level work to B level work, this would register on an interval scale but would remain invisible on an ordinal scale if the change did not affect student ranking.  Analogously, in Equality Matching, if one person, and one person only, were to receive an extra five minutes to deliver their campaign speech, this would be socially significant.  By contrast, in Authority Ranking, the addition of an extra five minutes to the time taken by a Prince to deliver a speech would make no socially significant difference provided that the relative ranking remains undisturbed (for example, the King still being allotted more time than the Prince, and the Duke less than the Prince).

In Market Pricing, as in any ratio scale, the asymmetry is even greater.  Adding five years to the punishment of every convict could badly skew what should be proportionate punishments.  But giving an extra five minutes to each candidate would preserve balance in Equality Matching.

The symmetries of all the scale types have an interesting formal property.  They form a descending symmetry subgroup chain.  In other words, the symmetries of a ratio scale form a subset of the symmetries of a relevant interval scale, the symmetries of that scale form a subset of the symmetries of a relevant ordinal scale, and the symmetries of that scale form a subset of the symmetries of a relevant nominal scale.  More specifically, the scale types form a containment hierarchy.  Analogously, the symmetries of Market Pricing form a subset of the symmetries of Equality Matching which form a subset of the symmetries of Authority Ranking which form a subset of the symmetries of Communal Sharing.  Descending subgroup chains are common in nature, including inorganic nature.  The symmetries of solid matter form a subset of the symmetries of liquid matter which form a subset of the symmetries of gaseous matter which form a subset of the symmetries of plasma.
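
The nesting of symmetries described above can be made concrete with a small Python sketch.  It is an illustration under stated assumptions rather than a proof: the sample values, the four candidate transformations, and all names are hypothetical, and the check is run only on a finite sample.  For each scale type, the code asks which transformations leave the information carried by that scale undistorted.

```python
import math
from itertools import combinations

# Hypothetical sample measurements (all distinct and non-zero).
SAMPLE = [1.0, 2.0, 4.0, 7.0]
PAIRS = list(combinations(range(len(SAMPLE)), 2))

# Candidate transformations, from most to least disruptive.
RELABEL = {1.0: 7.0, 2.0: 1.0, 4.0: 4.0, 7.0: 2.0}
TRANSFORMS = {
    "relabel": lambda x: RELABEL[x],  # arbitrary one-to-one renaming
    "cube": lambda x: x ** 3,         # order-preserving but not linear
    "affine": lambda x: 2 * x + 3,    # new unit and new zero point
    "rescale": lambda x: 2 * x,       # new unit only
}

def keeps_categories(xs, ys):
    # Nominal information: which items count as the same.
    return all((xs[i] == xs[j]) == (ys[i] == ys[j]) for i, j in PAIRS)

def keeps_order(xs, ys):
    # Ordinal information: which item ranks below which.
    return all((xs[i] < xs[j]) == (ys[i] < ys[j]) for i, j in PAIRS)

def keeps_difference_ratios(xs, ys):
    # Interval information: differences relative to a reference difference.
    unit_x, unit_y = xs[1] - xs[0], ys[1] - ys[0]
    return all(math.isclose((xs[i] - xs[j]) / unit_x, (ys[i] - ys[j]) / unit_y)
               for i, j in PAIRS)

def keeps_ratios(xs, ys):
    # Ratio information: ratios between the values themselves.
    return all(math.isclose(xs[i] / xs[j], ys[i] / ys[j]) for i, j in PAIRS)

INVARIANTS = {
    "nominal": keeps_categories,
    "ordinal": keeps_order,
    "interval": keeps_difference_ratios,
    "ratio": keeps_ratios,
}

def symmetries(scale):
    """Names of the candidate transformations that leave `scale` undistorted."""
    check = INVARIANTS[scale]
    return {name for name, f in TRANSFORMS.items()
            if check(SAMPLE, [f(x) for x in SAMPLE])}

for scale in INVARIANTS:
    print(scale, sorted(symmetries(scale)))
```

On this sample, each scale's set of symmetries is contained in the set for the scale listed before it, mirroring the chain from nominal down to ratio.  The analogy with the relational models would run the same way, with Communal Sharing tolerating the most rearrangements and Market Pricing the fewest.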

This raises interesting questions about the origins of these patterns in the mind: could they result from spontaneous symmetry breakings in brain activity rather than being genetically encoded?  Darwinian adaptations are genetically encoded, whereas spontaneous symmetry breaking is ubiquitous in nature rather than being limited to genetically constrained structures.  The appeal to spontaneous symmetry breaking suggests a non-Darwinian approach to understanding how the elementary models could be “innate” (in the sense of being neither learned nor arrived at through reason).  That is, are the elementary relational models results of self-organization rather than learning or natural selection?  If they are programmed into the genome, why would this programming imitate a pattern in nature which usually occurs without genetic encoding?  The spiral shape of a galaxy, for example, is due to spontaneous symmetry breaking, as is the transition from liquid to solid.  But these transitions are not encoded in genes, of course.  Being part of the natural world, why should the elementary models be understood any differently?

d. Compound Models

While all relational models are analyzable into four fundamental models, the number of models as such is potentially infinite.  This is because social-relational cognition is productive; any instance of a model can serve as a constituent in an even more complex instance of a model.  Consider Authority Ranking and Market Pricing; an instance of one can be embedded in or subordinated to an instance of the other.  When a judge decides on a punishment that is proportionate to the crime, the judge is using a ratio scale and hence Market Pricing.  But the judge is only authorized to do this because of her authority, hence Authority Ranking.  We have here a case of Market Pricing embedded in a superordinate (as opposed to subordinate) structure of Authority Ranking resulting in a compound model.  Now consider ordering food from a waiter.  The superordinate relationship is now Market Pricing, since one is paying for the waiter’s service.  But the service itself is Authority Ranking with the customer as the superior party.  In this case, an instance of Authority Ranking is subordinate to an instance of Market Pricing.  This is also a compound model with the same constituents but differently arranged.  The democratic election of a leader is Authority Ranking subordinated to Equality Matching.  An elementary school teacher’s supervising children to make sure they take turns is Equality Matching subordinated to Authority Ranking.

A model can also be embedded in a model of the same type.  In some complex egalitarian social arrangements, one instance of Equality Matching can be embedded in another.  Anton Pannekoek’s proposed Council Communism is one such example.  The buying and selling of options is the buying and selling of the right to buy and sell, hence recursively embedded Market Pricing.  Moose society is largely structured by a complex model involving multiple levels of Communal Sharing.  A family among the Moose is largely structured by Communal Sharing, as is the village which embeds it, as is the larger community that embeds the village, and so on.  In principle, there is no upper limit on the number of embeddings in a compound model.  Hence, the number of potential relational models is infinite.
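
The productivity of compound models can be pictured as a recursive data structure.  The following Python sketch is only one possible encoding and is not drawn from the RMT literature: the class name, the two-letter codes, and the simplifying assumption that a model embeds at most one subordinate model are all introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional

ELEMENTARY = {
    "CS": "Communal Sharing",
    "AR": "Authority Ranking",
    "EM": "Equality Matching",
    "MP": "Market Pricing",
}

@dataclass(frozen=True)
class Model:
    kind: str                           # key into ELEMENTARY
    embedded: Optional["Model"] = None  # subordinate model, if any

    def depth(self) -> int:
        """Number of levels of embedding, counting this one."""
        return 1 if self.embedded is None else 1 + self.embedded.depth()

    def describe(self) -> str:
        name = ELEMENTARY[self.kind]
        if self.embedded is None:
            return name
        return f"{self.embedded.describe()} embedded in {name}"

# Examples from the text, encoded under the assumptions above.
judge = Model("AR", Model("MP"))               # proportionate sentencing by a judge
waiter = Model("MP", Model("AR"))              # paying for a waiter's service
options = Model("MP", Model("MP"))             # buying the right to buy and sell
moose = Model("CS", Model("CS", Model("CS")))  # family, village, larger community

for example in (judge, waiter, options, moose):
    print(example.depth(), example.describe())
```

The judge and waiter examples use the same two constituents in opposite arrangements, and nothing in the structure caps the depth of nesting, which is the sense in which the number of potential models is unbounded.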

e. Mods and Preos

A model, whether elementary or compound, is devoid of meaning when considered in isolation.  As purely abstract structures, models are sometimes known as “mods,” which is an abbreviation of “cognitively modular but modifiable modes of interacting” (Fiske 2004, 3).  (This may be a misnomer, since, as purely formal structures devoid of semantic content, mods are not modes of social interaction any more than syntax is a communication system.)  In order to externalize models, that is, in order to use them to interpret or motivate or structure interactions, one needs “preos,” these being “socially transmitted prototypes, precedents, and principles that complete the mods, specifying how, when and with respect to whom the mods apply” (2004, 4).  Strictly speaking, a relational model is the union of a mod with a preo.  A mod has the formal properties of symmetry, asymmetry, and in some cases embeddedness.  But a mod requires a preo in order to have the properties intuitively identifiable as meaningful, such as social application, emotional resonance, and motivating force.

The notion of a preo updates and includes the notion of an implementation rule, from an earlier stage of relational-models t