Knowledge-First Theories of Justification

Knowledge-first theories of justification are theories of justification that give knowledge priority when it comes to explaining when and why someone has justification for an attitude or an action. The emphasis of this article is on knowledge-first theories of justification for belief. As it turns out, there are a number of ways of giving knowledge priority when theorizing about justification, and what follows is a survey of more than a dozen existing options that have emerged since the publication in 2000 of Timothy Williamson’s Knowledge and Its Limits.

The present article first traces several of the general theoretical motivations that have been offered for putting knowledge first in the theory of justification. It then provides an examination of existing knowledge-first theories of justification and their objections. There are doubtless more ways to give knowledge priority in the theory of justified belief than are covered here, but the survey is instructive because it highlights potential shortcomings that would-be knowledge-first theorists may wish to avoid.

The history of the Gettier problem in epistemology is a long history of failed attempts to give a reductive account of knowledge in terms of justification and other conditions. In light of this, many have since judged the project of providing a reductive analysis of knowledge to be a degenerating research program. In putting knowledge first in the theory of justification, epistemologists are exploring whether we can more successfully reverse the order of explanation in epistemology by giving an account of justified belief in terms of knowledge. This article concludes with a reflection about the extent to which the short history of the many controversial attempts to secure an unproblematic knowledge-first account of justified belief has begun to resemble the older Gettier dialectic.

Table of Contents

  1. Motivating Knowledge-First Approaches
  2. The Token-Identity Theory
  3. Modal Theories
  4. Reasons-First, Knowledge-First Theories
  5. Perspectival Theories
  6. Infallibilist Knowledge-First Virtue Epistemology
  7. Proficiency-Theoretic Knowledge-First Virtue Epistemology
  8. Functionalist & Ability-Theoretic Knowledge-First Virtue Epistemology
  9. Know-How Theories and the No-Defeat Condition
  10. Excused Belief vs. Justified Belief
  11. A Methodological Reflection on Gettier
  12. References and Further Reading

1. Motivating Knowledge-First Approaches

Knowledge-first theories of justified belief give knowledge priority when it comes to explaining when and why someone has a justified belief. As it turns out, there are a number of ways of giving knowledge priority when theorizing about justified belief, and what follows is a survey of several existing options.

Before examining specific knowledge-first theories of justification it is worth considering what might motivate such an approach to begin with. One kind of motivation involves the need for an extensionally adequate theory of justified belief. After all, there is some set of possible cases where agents have justified beliefs, and a knowledge-first theory of justified belief should pick out that set and offer us a knowledge-centric explanation for why that set has exactly the members that it has. Traditional epistemologists should note that progress has been made in this direction, and this provides at least some reason to think that some knowledge-centric account of justification is correct. But there is more to be observed when it comes to motivating knowledge-first accounts of justified belief.

Consider, first, conceptual relations between knowledge and justification. Sutton (2005; 2007) has argued that grasping the concept of epistemic justification depends on our prior understanding of knowledge:

We only understand what it is to be justified in the appropriate sense because we understand what it is to know, and can extend the notion of justification to non-knowledge only because they are would-be knowers. We grasp the circumstances—ordinary rather than extraordinary—in which the justified would know. Justification in the relevant sense is perhaps a disjunctive concept—it is knowledge or would-be knowledge (Sutton 2005: 361).

If our concept of epistemic justification depends on our concept of knowledge, then that surely provides at least some reason to think that knowledge might be more basic a kind than justified belief. At the very least it provides us with reason to explore that possibility.

Second, consider some plausible claims about the normativity of belief. As Williamson (2014: 5) reasons: “If justification is the fundamental epistemic norm of belief, and a belief ought to constitute knowledge, then justification should be understood in terms of knowledge too.” Here Williamson is connecting norms for good instances of a kind and norms for bringing about instances of that kind. So if one is justified in holding a belief only if it is a good belief, and a good belief is one that constitutes knowledge, then it seems to follow that a justified belief has to be understood in terms of knowledge (Kelp et al. 2016; Simion 2019).

A third reason for putting knowledge first in the theory of justification stems from Williamson’s (2000) defense of the unanalyzability of knowledge together with the E=K thesis, which says that the evidence you possess is just what you know. Assuming we should understand justification in terms of having sufficient evidence, it seems to follow that we should understand justification in terms of knowledge. (For critical discussion of E=K see Silins (2005), Pritchard and Greenough (2009), Neta (2017), and Fratantonio (2019).)

A fourth reason stems from the way in which asymmetries of knowledge can explain certain asymmetries of justification. While much of the knowledge-first literature on lottery beliefs has focused on assertion (see the article Knowledge Norms), the points are easily extended to justified belief. One cannot have justification to believe that (L) one has a losing lottery ticket just on the basis of one’s statistical evidence. But one can have justification to believe (L) on the basis of a newspaper report. What can explain this asymmetry? Knowledge. For one cannot know (L) on the basis of merely statistical evidence, but one can know (L) on the basis of a newspaper report. Accordingly, knowledge can play a role in explaining the justificatory asymmetry involving (L) (Hawthorne 2004; Smithies 2012). A similar asymmetry and knowledge-first explanation can be drawn from the literature on pragmatic encroachment (Smithies 2012; DeRose 1996). See also Dutant and Littlejohn (2020) for further justificatory asymmetries that certain knowledge-first approaches to justified belief can explain.

Fifth, putting knowledge in the explanatory forefront can explain (broadly) Moorean absurdities. Consider, for instance, the absurdity involved in believing p while also believing that one does not know p. Some explanation for the irrationality of this combination of beliefs should fall out of a theory of justification that tells us when and why a belief is (or is not) justified. Theories of justification that explain justification in terms of knowledge have an easy time explaining this (Williamson 2000; 2009; 2014).

Lastly, putting knowledge in the explanatory forefront of justification can provide an explanation of the tight connection between justification and knowledge. For it is widely believed that knowing p or being in a position to know p entails that one has justification for believing p. The traditional explanation of this entailment relation involves the idea that knowledge is to be analyzed in terms of, and hence entails, justification. But another way of explaining this entailment is by saying that knowledge or being in a position to know is constitutively required for justification (Sylvan 2018).

2. The Token-Identity Theory

Perhaps the first knowledge-first theory of justified belief is the token-identity theory, according to which token instances of justified belief just are token instances of knowledge, yielding the following biconditional (Williamson 2009, 2014; Sutton 2005, 2007; Littlejohn 2017: 41-42):

(J=K) S’s belief that p is justified iff S knows that p.

The term ‘iff’ abbreviates “if and only if.” This is a theory of a justified state of believing (doxastic justification), not a theory of having justification to believe, whether or not one does in fact believe (propositional justification). But it is not hard to see how a (J=K) theorist might accommodate propositional justification (Silva 2018: 2926):

(PJ=PK) S has justification to believe p iff S is in a position to know p.

What does it take to be in a position to know p? One type of characterization takes being in a position to know as being in a position where all the non-doxastic demands on knowing are met (Smithies 2012; Neta 2017; Rosenkranz 2018; Lord 2018). The doxastic demands involve believing p in the right kind of way, that is, the kind of way required for knowing. The non-doxastic demands involve the truth of p and one’s standing in a suitably non-accidental relation to p such that, typically, were one to believe p in the right kind of way, one would know that p. (For further characterizations of being in a position to know see Williamson 2000: 95; Rosenkranz 2007: 70-71.)

One issue raised by characterizing being in a position to know in counterfactual terms concerns what we might call doxastic masks: features of one’s situation that are triggered by one’s act of coming to believe p at a time t+1 that would preclude one from knowing p despite all the non-doxastic requirements of knowledge being met at an earlier time t. For example, you might have all the evidence it could take for anyone to know p, but suppose Lewis’ (1997) sorcerer does not want you to know p. So, in all or most nearby worlds, when the sorcerer sees you beginning to form the belief in p, he dishes out some kind of defeater that prevents you from knowing p. So, on standard possible worlds analyses of counterfactuals, it is false that you have some way of coming to believe p such that were you to use it, you would know p (compare Whitcomb 2014). Alternatively, one might seek to characterize being in a position to know in terms of having the disposition to know, which is compatible with the existence of doxastic masks. Another alternative is to give up on the idea that being in a position to know is best understood in terms of worlds and situations nearby or close to one’s actual situation, thereby making the target characterization of being in a position to know a more idealized notion, one that is discussed below (compare Smithies 2012: 268, 2019: sect. 10.4; Rosenkranz 2018; Chalmers 2012).

There are various problems with (J=K) and, by extension, (PJ=PK). First, (J=K) is incompatible with the fallibility of justification, that is, the possibility of having justified false beliefs. But any theory of justification that rules out justified false beliefs is widely seen to be implausible (Bird 2007; Comesaña and Kantin 2010; Madison 2010; Whitcomb 2014; Ichikawa 2014).

Second, (J=K) is incompatible with the possibility of having a justified true belief in the absence of knowledge. Gettier cases are typically cases of justified true belief that do not constitute knowledge. But (J=K) implies that there are no such cases because it implies that there can be no cases of justification without knowledge. This bucks against a history of strong intuitions to the contrary (Bird 2007; Comesaña and Kantin 2010; Madison 2010; Whitcomb 2014; Ichikawa 2014).

Third, (J=K) is incompatible with the new evil demon hypothesis. Consider someone who, unwittingly, has had their brain removed, placed in a vat, and is now being stimulated in such a way that the person’s life seems to go on as normal. According to the new evil demon hypothesis: if in normal circumstances S holds a justified belief that p, then S’s recently envatted brain-duplicate also holds a justified belief that p. It is beyond the scope of this article to defend the new evil demon hypothesis. But as Neta and Pritchard (2007) point out, it is a widely shared intuition in 21st century epistemology. This generates problems for (J=K). For a recently envatted brain who merely seems to be looking at a hand cannot know that it is looking at a hand (or that a hand is in the room), so according to (J=K) it cannot hold a justified belief about this either (Bird 2007; Ichikawa 2014). For further discussion see the article on The New Evil Demon Hypothesis. See also Meylan (2017).

3. Modal Theories

To avoid the problems with (J=K), some have sought to connect justified belief and knowledge in a less direct way, invoking some modal relation or other.

Here is Alexander Bird’s (2007) knowledge-first account of justified judgment, which can be transformed into a theory of justified belief (a belief being, arguably, the end state of a justified act of judging):

(JuJu) If in world w1 S has mental states M and then forms a judgment [or belief], that judgment [or belief] is justified iff there is some world w2 where, with the same mental states M, S forms a corresponding judgment and that judgment [or belief] yields knowledge.

(JuJu) counts as a knowledge-first theory because it explains one’s justified judgment/belief in terms of the knowledge of one’s mental state duplicates. It does a good deal better than (J=K) when it comes to accounting for intuitive characteristics of justified belief: namely, its fallibility, its compatibility with Gettier cases, and its compatibility with the new evil demon hypothesis.

Despite this, various problems have been pointed out concerning (JuJu). First, it seems that we can obtain justified false beliefs from justified false beliefs. For example, suppose S knew that:

(a) Hesperus is Venus.

But, due to some misleading evidence, S had the justified false belief that:

(b) Hesperus is not Phosphorus.

Putting these two together S could infer that:

(c) Phosphorus is not Venus.

As Ichikawa (2014: 191-192) argues, S could justifiably believe (c) on this inferential basis. But, according to (JuJu), S can justifiably believe (c) on the basis of an inference from (a) and (b) only if it is possible for a mental state duplicate of S’s to know (c) on this basis. But content externalism precludes such a possibility. For content externalism implies that any mental state duplicate of S’s who believes (c) on the basis of (a) and (b) is a thinker for whom the terms ‘Phosphorus’ and ‘Venus’ refer to the very same celestial body, thus making knowledge of (c) on the basis of (a) and (b) impossible. Because of this, (JuJu) implies that you cannot have justification to believe (c) on this inferential basis, contrary to what seems to be the case. This is not just a problem for (JuJu), but also a problem for (J=K).

Second, (JuJu) fails to survive the Williamsonian counterexamples to internalism. Williamson’s counterexamples, as McGlynn (2014: 44ff) observes, were not intended to undermine (JuJu) but they do so anyway. Here is one example:

Suppose that it looks and sounds to you as though you see and hear a barking dog; you believe that a dog is barking on the basis of the argument ‘That dog is barking; therefore, a dog is barking’. Unfortunately, you are the victim of an illusion, your demonstrative fails to refer, your premise sentence thereby fails to express a proposition, and your lack of a corresponding singular belief is a feature of your mental state, according to the content externalist. If you rationally believe that a dog is barking, then by [JuJu] someone could be in exactly the same mental state as you actually are and know that a dog is barking. But that person, too, would lack a singular belief to serve as the premise of the inference, and would therefore not know that a dog is barking. (Williamson 2000: 57-58).

McGlynn (2014: 44) draws attention to the fact that a “natural verdict is that one’s belief that a dog is barking is rational or justified” despite the fact that one cannot know this while having the same mental states. For any (non-factive) mental state duplicate will be one for whom the sentence ‘That dog is barking’ cannot be true, and hence cannot be known either. So we have another counterexample to (JuJu). Again, this is not just a problem for (JuJu), but also (J=K).

Since (JuJu)’s problems stem from its insistence on sameness of mental states, a natural response is to abandon that emphasis and focus on what a thinker and, say, her duplicate on Twin Earth can have in common. This is just what Ichikawa (2014: 189) attempts to do:

(JPK) S has a justified belief iff S has a possible counterpart, alike to S in all relevant intrinsic respects, whose corresponding belief is knowledge.

The target intrinsic respects are limited to the non-intentional properties that S and her Twin Earth duplicate can share. But they are not intended to include all such properties. Ichikawa wants to maintain that if, say, S unwittingly lost her body in an envattment procedure, she could still have a justified belief that she has a body even though the only counterparts of hers who could know this are ones who have a body. So, the target intrinsic respects are to be further restricted to what S and her envatted counterpart could share. In the end, this seems to amount to sameness of brain states or something close to that. This aspect of (JPK) goes a long way towards making it internalist-friendly and also helps (JPK) avoid the difficulties facing (JuJu) and (J=K). (See Ichikawa (2017) for his most recent work on knowledge-first approaches to justification.)

Nevertheless, (JPK) has problems of its own. Both problems stem from the attempt to reconcile (JPK) with the idea that justified belief is a type of creditable belief. Here is how Ichikawa (2014: 187) describes the first problem: Zagzebski (1996: 300-303) and many others have argued that it is plausible that S’s holding a justified belief entails that S is creditworthy (that is, praiseworthy) for believing as she does. Moreover, S is creditworthy because S holds a justified belief. That is, it is S’s particular act of believing that explains why S deserves credit. But (JPK) seems forced to explain S’s creditworthiness in terms of facts about S’s counterparts, since it is one’s counterparts that explain one’s doxastic justification. And this seems odd: why should facts about a merely possible, distinct individual make me creditworthy for believing as I actually do (Silva 2017)? A promising response involves noting that having a justified belief immediately grounds being creditworthy for believing, just as our intuition has it, while facts about one’s counterparts’ knowledge immediately ground having a justified belief. But immediate grounding is not transitive, so facts about knowledge do not immediately ground being creditworthy for believing. So the odd consequence does not follow. A consequence that does follow is that facts about knowledge mediately ground being creditworthy for believing, since there is a chain of immediate grounds connecting them. But here it is open to the knowledge-firster to say that our intuition really concerns only immediate grounding.

Ichikawa is clear that (JPK) is a theory of justified belief (doxastic justification) and that this is the notion of justification that is connected to a belief’s being creditworthy. But doxastic justification has a basing requirement, and this makes doxastic justification partly a historical matter. And epistemic credit and blame seem to depend on historical factors too (Greco 2014). Thus, Ichikawa’s defense of (JPK) is susceptible to cases like the following:

Bad Past: At t S comes to believe that there is a ceiling overhead. S believes this because she just took a pill which she knew would induce random changes in her intrinsic states. In advance of taking the pill, S knew it would very likely cause her to have many false perceptual beliefs. But as it happens, the pill induced a total re-organization of her intrinsic states such that at t S has a counterpart who knows a ceiling is overhead.

(JPK) implies that S has a justified belief in Bad Past because she happens to have a knowledgeable counterpart. And because she has a justified belief, she is also creditworthy. But this seems wrong. Rather, S seems positively blameworthy for believing as she does. See Silva (2017) for further discussion of (JuJu) and (JPK) and see Greco (2014) for further discussion of historical defeaters for doxastic justification.

An alternative solution to these problems would be to revise (JPK) so that it is only a theory about propositional justification:

(PJPK) S has justification to believe p iff S has a possible counterpart, alike to S in all relevant intrinsic respects, whose corresponding belief is knowledge.

One could then, arguably, concoct a knowledge-first theory of doxastic justification by adding some kind of historical condition that rules out cases like Bad Past.

It should be noted that (PJPK) has a strange result. For if your internal counterpart knows p, then your internal counterpart believes p. But if your internal counterpart believes p, then you also believe p—provided you and your counterpart are not in very different environments (for example, Earth vs. Twin Earth) that shift the content of the belief (compare Whitcomb 2014). So if (PJPK) is true, you have propositional justification to believe p only if you actually believe p. But it is usually assumed that it is possible to have propositional justification to believe p even if you do not believe p. To accommodate this, (PJPK) may need revision.

4. Reasons-First, Knowledge-First Theories

Sylvan (2018) and Lord (2018) each take a reasons-first approach to justification, on which justified belief just is belief that is held for sufficient reason:

(J=SR) S’s belief that p is justified iff (i) S possess sufficient reason to believe p, and (ii) S believes that p for the right reasons.

While (J=SR) is not itself a knowledge-first view of justification, it becomes one when combined with a knowledge-first account of condition (i). Lord (2018: ch. 3) and Sylvan (2018: 212) both do this, taking reasons to be facts and arguing that one possesses a fact just in case one is in a position to know it:

(Pos=PK) S possesses the fact that p as a reason to respond in some way w iff S is in a position to know that p.

Others have argued for some kind of knowledge-first restriction on (Pos=PK). For example, Neta (2017) has argued that our evidence is the set of propositions we are in a position to know non-inferentially. Provided one’s evidence just is the set of reasons one has for belief, this view will fall into the reasons-first, knowledge-first camp. For objections to (Pos=PK) see Kiesewetter (2017: 200-201, 208-209) and Silva (2023).

Surprisingly, the category of reasons-first, knowledge-first views cross-cuts some of the other categories. For example, (J=K) theorists have tended to fall into this camp. Williamson (2009) and Littlejohn (2018) take one’s evidence to consist of the propositions that one knows. Again, provided one’s evidence just is the set of reasons one has for belief, this leads to a view on which one possesses p iff one knows p. This is a more restrictive knowledge-first view of possession, but together with (J=SR) and (J=K) it constitutes a kind of reasons-first, knowledge-first theory of justification. Since justified belief that p and knowledge that p never separate on this view, it can seem hardly worth mentioning it as a reasons-first view. But there is more in need of epistemic justification than belief (though that will not be discussed here). There are other doxastic attitudes (for example, suspension, credence, acceptance, faith) as well as actions and feelings that are in need of epistemic justification. On knowledge-first, reasons-first views these states can only be justified by one’s knowledge.

As mentioned above, (J=K) is subject to a range of objections. What follows focuses on Lord and Sylvan’s incarnation of the knowledge-first program, which consists of (J=SR) and (Pos=PK). These two principles give us a knowledge-first theory of justification that avoids some of the main problems facing (J=K).

First, (J=SR) and (Pos=PK) are consistent with the existence of justified false beliefs. This is because one’s reasons (the facts one is in a position to know) can provide one with sufficient, yet non-conclusive, reason to believe further propositions that may be false. The fact that a drunk has always lied about being sober can be a sufficient yet non-conclusive inductive reason to believe that he will lie about being sober in the future. Since it is non-conclusive, having justification for this belief is consistent with it turning out to be false. So this view can allow for justified yet false inferential beliefs. The possibility of justified false perceptual beliefs is discussed below in connection with the new evil demon hypothesis.

Second, (J=SR) and (Pos=PK) are consistent with the existence of unknown, justified true beliefs. Because Smith can have justified false beliefs in the way described above, he can have a justified false belief that Jones will get the job based on the fact that the employer said so and the fact that this is a highly reliable indicator of who will get the job. Smith may also know that Jones has ten coins in his pocket based on perception. So, through an appropriate inferential process, Smith can come by a justified true inferential belief that the person who will get the job has ten coins in his pocket. This is a Gettier case, that is, an instance of a justified true belief without knowledge.

There are a few caveats. The first is that the reasons-first, knowledge-first theory of justification has this implication only under the assumption that the justificatory support one derives from facts one is in a position to know is transitive, or can at least sometimes carry over inferences from premises that one is not in a position to know. For, here, Smith’s false belief that Jones will get the job is justified by the reasons Smith is in a position to know, and we are assuming this justified false belief—which Smith is not in a position to know—can nevertheless facilitate Smith’s ability to acquire inferential justification for believing that the person who will get the job has ten coins in his pocket. For worries about the non-transitivity of the justification relation see Silins (2007) and Roche and Shogenji (2014).

The second caveat is that while Lord and Sylvan’s view is consistent with some intuitions about Gettier cases, it is not consistent with all of them. After all, their view seems to be that we possess different reasons or evidence in the Gettier cases than we do in the good cases. This will seem counterintuitive to those who think that we have the same evidence in both cases.

Third, (J=SR) and (Pos=PK) are consistent with some intuitions about the new evil demon hypothesis. In the standard telling, the recently envatted brain has a non-veridical perceptual experience of p and believes p on the basis of that non-veridical experience. While the non-veridical experience does not give one access to the fact that p (if it is a fact), there is an inferential process that can give the envatted brain a justified belief according to (J=SR) and (Pos=PK). This is because mature thinkers who are recently envatted can know (or be in a position to know) that in the past their visual experiences have been a reliable guide to reality, and can sometimes know that they are now having an experience of p. Together, these are facts that can give one sufficient reason to believe p even if one is an unwittingly recently envatted brain.

Of course, the weakness here is that the envatted brain’s perceptual belief that p is not based on her inferential source of propositional justification to believe p. Rather, the envatted brain holds her belief in response to her perceptual experience. So, she is not doxastically justified, that is, her belief itself fails to be justified. So, there is some bullet to bite unless, perhaps, one can argue that knowledge of the fact that one is having an experience of p can itself be a reason to believe p even when one is an unwittingly envatted brain.

There are further problems that the reasons-first, knowledge-first view faces. They are along the lines of the problems for Bird’s (JuJu). For if reasons are facts, then one cannot obtain justified false beliefs from justified false-premise beliefs unless, as noted above, one’s justified false-premise beliefs are themselves inferentially justified and justificatory support carries over (see the discussion of (JuJu) above).  Similarly, it is unclear whether one can gain justified beliefs from contentless beliefs. For contentless “premise” beliefs do not stand in inferential relations to their “conclusions,” and such relations seem essential to the ability of justificatory support to transmit across inferences.

For a further concern about this view, see Littlejohn’s (2019) “Being More Realistic About Reasons,” where he argues that the conjunction of (J=SR) and (Pos=PK) generates explanatory lacunas regarding how reasons should constrain our credences.

5. Perspectival Theories

Perspectival knowledge-first theories of justification put “knowledge first” by letting one’s point of view on whether one has knowledge determine whether one has justification. Smithies (2012), for example, argues that:

(PJ=PJK) S has justification to believe that p iff S has justification to believe that she is in a position to know that p.

Smithies (2012: 268) treats being in a position to know as a matter of being in a position where all the non-psychological conditions for knowing are met. Smithies is clear that this is only a theory of propositional justification (having justification to believe), not doxastic justification (having a justified belief). For as a theory of doxastic justification it would be too demanding: it would require an infinite hierarchy of beliefs, and it would require that one have epistemic concepts (e.g. KNOWS, JUSTIFIED, POSITION TO KNOW) if one is to have any justified beliefs at all. This would over-intellectualize justification, excluding agents incapable of epistemic reflection (for example, young children, people with handicaps, smart non-humans). Worse, if knowledge requires justification then this would also rob such beings of knowledge.

It is important to note that (PJ=PJK) is neutral on which side of the biconditional gets explanatory priority. To be a genuinely knowledge-first view, it must be the condition on the right-hand side that explains why the condition on the left-hand side obtains. Smithies himself rejects this order of explanation, and with good reason, for there are objections to (PJ=PJK) that emerge only if we give the right-hand side explanatory priority. But there is also a general objection to this view that is independent of which side gets priority. This section starts with the general objection and then turns to the others.

A central worry to have about (PJ=PJK), irrespective of which side gets explanatory priority, is the extent to which Smithies’ purely non-psychological conception of propositional justification is a theoretically valuable conception of justification as opposed to a theoretically valuable conception of evidential support. For our evidence can support propositions in virtue of entailment and probabilistic relations, where these propositions can be so complex as to be well beyond our psychological abilities to grasp. For example, even before I had the concept of a Gettier case, my evidence supported the claim that I exist or I am in a Gettier case, simply because the proposition that I exist was already part of my evidence and entailed that disjunction. But since I did not have the concept GETTIER CASE, I could not have formed that belief.

So one general question is whether the motivations offered in support of (PJ=PJK) wrongly identify the following two epistemic notions:

Evidential Support: Having evidence, E, such that E entails or probabilistically supports p.

Justification: Having evidence, E, such that E gives one justification to believe p.

Certain evidentialists will welcome the idea of binding these notions together, holding that strong evidential support is all there is to epistemic justification (Smithies 2019). Yet many have objected to the kind of evidentialism implicit in making evidential support necessary and sufficient for justification. The necessity direction has been challenged by lottery problems, pragmatic encroachment, and the existence of justified beliefs not derived from evidence (so-called “basic” or “immediate” or “foundational” justified beliefs). The sufficiency direction, while rarely challenged, is also objectionable (Conee 1987, 1994; Silva 2018).

For example, some mental states are such that we are not in a position to know that we are in them even upon reflection (Williamson 2000). Suppose you knew that you just took a pill ensuring that you are in a mental state M if and only if you do not believe (A) that you are in M. A rational response to this knowledge would be to suspend belief in (A): for if you believe (A) then it is false, and if you disbelieve (A) then it is true. So suspension seems like the only rational response available to you. In at least some such cases where you consciously suspend belief in (A), you will also know that you have suspended belief in (A). This is at least a metaphysical possibility, and certainly a logical one. Now, since you know the biconditional and since you know you have suspended belief in (A), your evidence entails that you are in M. But it is logically impossible for you to justifiably believe or know (A) on your evidence, and you can know this a priori. For believing (A) on your evidence entails that (A) is false. So connecting justification to evidential support in this way is inconsistent with the following plausible idea: S has justification to believe P on E only if it is logically possible for S to justifiably believe P on E.
For further discussion of these and related reasons to separate justification from evidential support see Silva (2018) and Silva and Tal (2020). For further objections to Smithies see Smith (2012). For further defense of Smithies’ theory see Smithies (2019: sect 9.4).
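The structure of the sufficiency counterexample can be laid out more explicitly. The following sketch uses our own notation, not Smithies’ or Williamson’s: B for belief, K for knowledge, E for your evidence, and (A) for the proposition that you are in mental state M:

```latex
\begin{align*}
&(1)\quad K\big(M \leftrightarrow \neg B(A)\big) && \text{you know the pill's biconditional, where } (A) = M\\
&(2)\quad \neg B(A) \wedge \neg B(\neg A)       && \text{you rationally suspend belief in } (A)\\
&(3)\quad K\big(\neg B(A)\big)                  && \text{you know that you have suspended}\\
&(4)\quad E \vDash (A)                          && \text{from (1) and (3): your evidence entails } M \text{, i.e., } (A)\\
&(5)\quad B(A) \rightarrow \neg M               && \text{from the biconditional}\\
&(6)\quad B(A) \rightarrow \neg (A)             && \text{since } (A) = M
\end{align*}
```

By (4) your evidence entails (A), yet by (6) believing (A) guarantees its own falsity. So (A) is strongly supported by your evidence even though justifiably believing (A) on that evidence is logically impossible.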

Further, as Smith (2012) points out, (PJ=PJK) implies that having justification to believe p requires having justification to believe an infinite hierarchy of meta-justificatory claims:

One thing that we can immediately observe is that [PJ=PJK]… is recursive, in that it can be reapplied to the results of previous applications. If one has justification to believe that p (Jp) then, by [PJ=PJK], one must have justification to believe that one is in a position to know that p (JKp). But if one has justification to believe that one is in a position to know that p (JKp) then, by [PJ=PJK], one must have justification to believe that one is in a position to know that one is in a position to know that p (JKKp) and so on… In general, we have it that Jp ⊃ JKⁿp for any positive integer n.

If one adds to this the priority claim that having justification to believe that one is in a position to know p is the source of one’s justification to believe p, one must either accept a skeptical result, due to grounding worries about the infinite hierarchy of meta-justificatory claims, or accept a knowledge-first form of infinitism. But even setting aside the standard general worries about infinitism, knowledge-first infinitism will be especially difficult to sustain due to luminosity failures for KK. For example, in Williamson’s (2000: 229) unmarked clock case, one is argued to know a proposition p while also knowing that it is very improbable that one knows p. Intuitively, this is a case where one knows p, and so justifiably believes p, even though one lacks justification to believe that one knows p. (For a discussion of the limits of the unmarked clock case see Horowitz 2014.)

The final issue with (PJ=PJK) is whether having justification to believe that one is in a position to know is the source of one’s propositional justification to believe p (which would make this a knowledge-first view) or whether it is a non-explanatory necessary and sufficient condition on having justification to believe p (Smithies’ view). To illustrate the difference, suppose there is an infallible record of people’s heights. It is certainly true that Paul is 5’11’’ at t if and only if the infallible record says that Paul is 5’11’’ at t. But the right-hand side of that biconditional is plausibly non-explanatory: the fact that there is an infallible record does not make or otherwise explain Paul’s height. Now, if the advocate of (PJ=PJK) holds that having justification to believe that one is in a position to know is the source of one’s justification, then having a doxastically justified belief will, according to tradition, require one to base one’s belief that p on that source of justification. But ordinarily we do not base our beliefs on further facts about knowing or being in a position to know. So if we are not to risk an unacceptable skepticism about doxastically justified belief (and hence knowledge), it seems we must either give up the tradition or treat the right-hand side of (PJ=PJK) as specifying a mere non-explanatory necessary and sufficient condition. If the latter, however, it can seem puzzling why there should be such a modally robust connection between justification and one’s perspective on whether one knows.

A view much like (PJ=PJK) that avoids all but this final problem is Dutant and Littlejohn’s (2020) thesis:

(Probable Knowledge) It is rational for S to believe p iff the probability that S is in a position to know p is sufficiently high.

Even after specifying the relevant notion of ‘in a position to know’ and the relevant notion of ‘probability’ (objective, subjective, or epistemic, together with some specification of what counts as an agent’s evidence), and provided we can and should distinguish between propositionally and doxastically rational belief, it seems that (Probable Knowledge) is either not going to be a genuinely knowledge-first view or else one that does not allow for enough doxastically rational beliefs, due to the basing worry described above in connection with Bad Past.

Reynolds (2013) offers a related view of doxastic justification on which justified belief is the appearance of knowledge: “I believe with justification that I am currently working on this paper if and only if there has been an appearance to me of my knowing that I am currently working on this paper.” Generalizing this we get:

(J=AK) S’s belief that p is justified if and only if S is appeared to as though S knows that p.

On his view appearances are neither doxastic states nor conceptually demanding. As he explains the target notion:

Consider the following example: Walking in a park I notice an unfamiliar bird, and decide I would like to find out what it is. Fortunately, it doesn’t immediately fly away, so I observe it for two or three minutes. A few hours later, having returned home, I look up a web site, find a few photos, follow up by watching a video, and conclude confidently that I saw a Steller’s Jay. I think it is perfectly correct to say that the bird I saw had the appearance of a Steller’s Jay, even though I didn’t know that that’s what it was at the time. If it hadn’t had the appearance of a Steller’s Jay, I wouldn’t have been able to remember that appearance later and match it to the photos and video of Steller’s Jays. I didn’t have the concept of a Steller’s Jay, yet I had an appearance of a Steller’s Jay. (Reynolds 2013: 369)

(J=AK) has advantages over (PJ=PJK). It does not lead to an infinite hierarchy of meta-justificatory claims, and it is not hard to see how many of our occurrent beliefs might be based on such appearances, thereby avoiding some of the skeptical challenges that threatened (PJ=PJK). But there are problems.

One concern with (J=AK) is its self-reflective character. To have a justified belief you have to be (or have been) in a state in which it appears to you as though you have knowledge. This requires introspective abilities, which arguably some knowing creatures might lack. As Dretske (2009) put it: a dog can know where its bowl is, and a cat can know where the mouse ran. The correctness of these and other knowledge ascriptions does not seem to turn on whether or not dogs and cats have the capacity to access their own mental lives in such a way that they can appear to themselves to have knowledge.

Moreover, (J=AK) implies that every justified belief is a belief with such an appearance. But many of the justified beliefs we form, and much of the knowledge we acquire, are merely dispositional, that is, they involve dispositional beliefs that are never or only very briefly made occurrent. Do we, as a matter of psychological fact, also have the appearance of knowledge with regard to all such states? There is non-trivial empirical reason for doubt. In the psychology of memory, it has been observed that our memory systems are not purely preservative; they are also constructive. For example, our sub-personal memory systems often lead us to forget very specific beliefs while forming new beliefs that are more general in character. Sometimes this leads to new knowledge and new justified beliefs (Grundmann and Bernecker 2019). But if the new belief is the product of sub-personal operations and the more general belief is itself unretrieved, then it is unclear how that more general unretrieved justified belief could appear to oneself as a case of knowing.

A final concern with (J=AK) is its ability to handle undercutting defeat and the plausible idea that beliefs can cognitively penetrate appearances (see the article on cognitive penetration). For suppose you have strong undefeated evidence that you are in fake-barn country, but you brazenly believe, without justification, that you are looking at the one real barn in all the country. Perhaps this is because you pathologically believe in your own good fortune. But pathology is not necessary to make the point, as it is often assumed that we can have unjustified beliefs that we believe to be justified. If either is your situation, your belief that you are looking at a real barn can appear to you to be knowledge, given your normal visual experience and the fact that you (unjustifiably) believe your defeater to have been defeated. According to (J=AK) your belief is then justified. But that is the wrong result. Unjustified beliefs that enable the appearance of knowledge should not have the ability to neutralize defeaters.

Here is a final perspectival, knowledge-first theory of justification. It is mentioned by Smithies (2012) and explored by Rosenkranz (2018):

(J=¬K¬K): S has justification to believe p iff S is not in a position to know that S is not in a position to know that p.

Like Smithies, Rosenkranz relies on a conception of justification and being in a position to know that is psychologically undemanding. But unlike Smithies, Rosenkranz explicitly regards his view as being about justification for idealized agents and leaves open what relevance this notion has for ordinary, non-idealized agents like us.

There are at least two concerns with this view of justification. First, suppose we were to treat (J=¬K¬K) as a theory of justification for ordinary non-ideal agents and imposed (as many wish to) substantive psychological limits on what one has justification to believe. With such limits in place, (J=¬K¬K) would face not an over-intellectualization problem but an under-intellectualization problem. For agents who lack the concept KNOWLEDGE or the complicated concept POSITION TO KNOW could never be in a position to know that they are not in a position to know. So such agents would have justification to believe anything whatsoever.

But even once psychological limits are stripped away, and with them the under-intellectualization problem, another problem remains. Smithies (2012: 270) points out that, on this view, to lack justification one must be in a position to know that one is not in a position to know. Since being in a position to know is factive, this limits defeating information to factive defeating information. But it seems that misleading (non-factive) information can also defeat knowledge and justification. For example, suppose you are told that you are in fake-barn country. In fact you are not, so you are not in a position to know that you are in fake-barn country. Still, the misleading testimony gives you justification to believe that you are in fake-barn country. Intuitively, this misleading testimony will defeat your justification to believe that there is a barn ahead; it ensures that you should not believe that. But you are not in a position to know that you are not in a position to know that there is a barn ahead (recall that the testimony you receive is misleading). So (J=¬K¬K) says you have justification when intuitively you do not.
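Writing Kp for “one is in a position to know p” and Jp for “one has justification to believe p” (our shorthand, not Smithies’ or Rosenkranz’s), the fake-barn objection has this shape:

```latex
\begin{enumerate}
  \item You receive (false) testimony that you are in fake-barn country.
  \item Intuitively, this testimony defeats your justification to believe
        $p$ (``there is a barn ahead''): $\neg Jp$.
  \item Because the testimony is misleading, you are \emph{not} in a position
        to know that you are not in a position to know $p$: $\neg K\neg Kp$.
  \item By (J=$\neg$K$\neg$K), $\neg K\neg Kp$ suffices for $Jp$.
  \item So the view yields $Jp$, contradicting (2).
\end{enumerate}
```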

In response, it seems open to advocates of (J=¬K¬K) to argue that while one might not be in a position to know the content of the misleading testimony (because it is false), the misleading testimony itself can defeat. In this case, for example, it is arguable that the misleading testimony that one is in circumstances that make one’s knowing that p improbable itself defeats one’s being in a position to know p, and so prevents one’s good visual contact with an actual nearby barn in normal conditions from putting one in position to know that a barn is nearby. However, recent arguments for the existence of “unreasonable knowledge”—that is, knowledge that p while knowing that it is improbable that one knows p—will challenge the integrity of this response in defense of (J=¬K¬K). For more on unreasonable knowledge see Lasonen-Aarnio (2010, 2014) and Benton and Baker-Hytch (2015).

6. Infallibilist Knowledge-First Virtue Epistemology

We are not simply retainers of propositional knowledge. We are also able to acquire it. You are, for example, able to figure out whether your bathroom faucet is currently leaking, you are able to figure out whether your favorite sports team won more games this season than last season, you are able to figure out the sum of 294 and 3342, and so on. In normal circumstances when you exercise this ability you gain propositional knowledge. If you are able to figure out whether the faucet is leaking and you use that ability, the typical result is knowledge that the faucet is leaking (if it is leaking) or knowledge that the faucet is not leaking (if it is not leaking). The core idea behind knowledge-first virtue epistemology (KFVE) is that justified belief is belief that is somehow connected to exercises of an ability to know. Predictably, (KFVE)-theorists have had different things to say about how justified belief is connected to such abilities.

Some have argued that success is a general feature of exercises of abilities (Millar 2016). That is, one exercises an ability only if one does what the ability is an ability to do. It is widely thought that belief formation is part of exercising an ability to know, because knowing is constituted by believing. From this it follows, in the special case of exercises of abilities to know, that:

(Exercise Infallibilism) S’s belief is the product of an exercise of an ability to know only if S’s belief constitutes knowledge.

For example, Millar (2019) argues for a special instance of this in arguing that we cannot exercise an ability to know by perception without thereby acquiring perceptual knowledge.

If (Exercise Infallibilism) is true, and if justified beliefs just are beliefs that are products of abilities to know, then (J=K) follows. And so we have a virtue-theoretic account of justified belief that faces all the same problems we saw above facing (J=K). Of note is the inability of such a view to accommodate the following desiderata:

Desideratum 1. Justification is non-factive, that is, one can have justified false beliefs.

Desideratum 2. One can have justified true beliefs that do not constitute knowledge, as in standard Gettier cases.

Desideratum 3. One can have justified perceptual beliefs even if one is in an environment where perceptual knowledge is impossible due to systematically misleading features of one’s perceptual environment. This can happen on a more global scale (as in the new evil demon case), and it can happen on a more local scale (as in beech-elm cases discussed below).

7. Proficiency-Theoretic Knowledge-First Virtue Epistemology

The central point of departure between Millar’s virtue theory and the remaining virtue theories is that the latter reject (Exercise Infallibilism). It is this rejection that makes the resulting theories resilient to the objections facing (J=K). On Miracchi’s (2015) preferred instance of (KFVE), exercises of abilities to know explain our justified beliefs, but it is not mere abilities to know that have the potential to yield justified beliefs. Rather, it is only proficient abilities to know (“competences”) that yield justified beliefs, and not all abilities to know are proficient. One has a proficient ability to know just in case an exercise of one’s ability to know ensures a sufficiently high objective probability of knowing. That is, the conditional objective probability that S knows p, given that S exercised a relevant ability to know, is sufficiently high. This is a kind of in situ reliability demand on justification.
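On one natural reading of the proficiency condition (the threshold $t$ is our placeholder, not Miracchi’s notation), the demand is:

```latex
\Pr\big(\text{S knows } p \,\big|\, \text{S exercises an ability to know } p\big) \ge t,
\qquad t \text{ sufficiently high}
```

where the probability is objective and fixed by the agent’s actual, in situ circumstances.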

We can summarize her view of justified belief, roughly, as follows:

(KFVE-Proficiency) S has a justified belief iff S’s belief is competent, where S’s belief is competent iff S’s belief is produced by an exercise of a proficient ability to know.

Central to her view is the idea that exercises of proficient abilities are fallible: an agent can exercise an ability to know without succeeding in knowing. So (Exercise Infallibilism) is given up. This enables (KFVE-Proficiency) to accommodate justified false beliefs (Desideratum 1) as well as justified true beliefs that do not constitute knowledge (Desideratum 2). So (KFVE-Proficiency) avoids two of the main challenges to (J=K) and to Millar’s infallibilist version of (KFVE).

However, by limiting justified beliefs to beliefs produced by proficient abilities, Miracchi’s view is, like (J=K) and Millar’s infallibilist view, unable to accommodate Desideratum 3, that is, the possibility of justified beliefs formed in certain deceptive environments. The first case of this is just the familiar new evil demon case. The recently envatted brain, as Kelp (2016; 2017; 2018) argues, retains the ability to know by perception that, say, they have hands by responding to visual appearances in normal circumstances. But because they are no longer in normal circumstances, they no longer possess a proficient ability to know. In other words, the recently envatted brain’s change of environment robs them of the proficiency needed to form justified beliefs.

Miracchi (2020) rejects, or is at least deeply suspicious of, the metaphysical possibility of the new evil demon hypothesis. But we need not rely on fantastical envatted-brain scenarios to make this style of objection to (KFVE-Proficiency). Suppose you grew up in an environment with lots of beech trees and developed the ability to visually identify them, and thus the ability to know that a beech tree is nearby by sight. Since exercises of abilities are fallible, you could exercise this beech-identification ability if you were to unwittingly end up in another environment where there are only elms (which, according to Putnam, look indistinguishable from beeches to the untrained). But this is not an environment where your ability to identify beeches amounts to a proficiency: conditional on your exercising your ability to identify and come to know that beeches are nearby, it is objectively highly likely that you will fail to know. So the intuition that you can have justified perceptual beliefs about beeches being nearby in such a case appears inconsistent with (KFVE-Proficiency). While there may be some doubt about the metaphysical possibility of the new evil demon hypothesis, this is a perfectly possible scenario. See Kelp (2018: 92) for a similar objection to Miracchi.

One last concern with (KFVE-Proficiency) regards its ability to accommodate defeat. This is discussed in the section below.

8. Functionalist & Ability-Theoretic Knowledge-First Virtue Epistemology

Kelp (2016; 2017; 2018) and Simion (2019) offer versions of (KFVE) that do not tie justification so closely to in situ reliability and thereby avoid not only the problems concerning justified false beliefs and Gettier cases, but also the problems arising from the new evil demon hypothesis and very local cases of deception (like the beech-elm case above). So Desiderata 1–3 are easily managed. This section first explains their distinctive views and then mentions some concerns they share.

On Kelp’s (2016; 2017; 2019) view, justified belief is competent belief, and competent beliefs are generated by exercises of an agent’s ability to know. Importantly, such exercises do not require proficiency in Miracchi’s sense. Kelp’s view, roughly, amounts to this:

(KFVE-Ability) S has a justified belief iff S’s belief is competent, where S’s belief is competent iff S’s belief is produced by an exercise of an ability to know.

On Simion’s (2019) view, in contrast, justified beliefs are beliefs that are generated by properly functioning cognitive processes that are aimed at yielding knowledge. Presumably, if an agent has properly functioning cognitive processes that are aimed at yielding knowledge, then such an agent has an ability to know as well. So it is not too much of a taxonomic stretch to place Simion’s theory among the virtue theories. Like the exercise of abilities, cognitive processes can properly function without proficiency:

(KFVE-Functionalism) S’s belief is justified iff S’s belief is produced by a properly functioning cognitive process that has the etiological function of generating knowledge.

These statements of Kelp and Simion’s views are relatively coarse-grained and both Kelp and Simion defend more refined theses.

Kelp and Simion’s views are not unrelated to each other. The ability to know is an ability one has in virtue of having certain belief-producing cognitive processes, and Kelp’s (2018) preferred account of how the ability to know is acquired is the same general kind of account that Simion (2019) relies on in arguing that the cognitive processes constituting one’s ability to know are processes whose function is knowledge production. Nevertheless, the views are distinct in that (KFVE-Ability) grounds justification in agents’ abilities, while (KFVE-Functionalism) grounds it in cognitive processes. See Kelp (2019) for a discussion of the importance of this difference.

Central to their views is the idea that exercises of abilities to know are fallible, and given the fallibility of exercises of the ability to know, (KFVE-Ability) and (KFVE-Functionalism) allow for justified false beliefs and justified true beliefs that do not constitute knowledge. So, Desiderata 1 and 2 are easily accommodated.

Desideratum 3 is likewise easily accommodated. In Kelp’s (2018) telling, the recently envatted brain retains and exercises an ability to know when believing she has a hand upon having the visual experience as of a hand. According to Simion (2019), just as an envatted heart pumping orange juice counts as a properly functioning heart, a recently envatted brain counts as properly functioning when it comes to believe it has a hand upon having the visual experience as of a hand. And if justified beliefs can be had in cases of such systematic perceptual deception, then they can also be had in cases of localized perceptual deception, as in the beech-elm scenario above.

So (KFVE-Ability) and (KFVE-Functionalism) can accommodate Desiderata 1–3. What about the desiderata that emerged in the objections to (JuJu), (JPK), and reasons-first, knowledge-first views? That is:

Desideratum 4. Justified beliefs can be based on inferences from justified false beliefs.

Desideratum 5. Justified beliefs can be based on inferential acts involving contentless beliefs.

Desideratum 6. Justified belief is a kind of creditable belief.

Desideratum 7. Justified belief has a historical dimension that is incompatible with situations like Bad Past.

If (KFVE-Ability) or (KFVE-Functionalism) implies that a recently envatted brain is able to have justified beliefs from an exercise of an ability to know, or as a product of cognitive competences that aim at knowledge, then it is easy to see how Desiderata 4 and 5 are satisfied by (KFVE-Ability) and (KFVE-Functionalism). For these seem like more local cases of deception. As for Desiderata 6 and 7, the virtue-theoretic machinery is key: both can be explained by the demand that justified beliefs issue from an ability or a properly functioning cognitive process. But that is exactly what was lacking in the cases discussed above that motivated 6 and 7. See Silva (2017) for an extended discussion of how certain versions of (KFVE) can satisfy these desiderata.

There are some worries about these versions of (KFVE). Consider Schroeder’s (2015) discussion of defeater pairing. Any objective condition, d, which defeats knowledge that p is such that: if one justifiedly believes that d obtains, then this justified belief will defeat one’s justification to believe p. For example, suppose you formed the belief that a wall is red from an ability to know this by perception, and that you are in normal circumstances where the wall is in fact red. You will have a justified belief according to each of the fallibilist versions of (KFVE) above. But suppose you were given misleading yet apparently reliable undercutting information that the wall is illuminated by red lights and so might not actually be red. This is not true, but were it true it would defeat your knowledge; were it true you would be in a Gettier situation. Now the defeater-pairing insight says that the fact that you justifiedly believe the wall is illuminated by red lights defeats your justification to believe the wall is red. But since you arrived at your belief that the wall is red through an exercise of your proficiency or ability or properly functioning cognitive process, you have a justified belief according to (KFVE-Proficiency), (KFVE-Ability), and (KFVE-Functionalism). That is inconsistent with the intuition that the justification for your belief is defeated.

So this objection gives rise to a further potential demand on an adequate theory of justified belief:

Desideratum 8. Justified belief is susceptible to defeat by justified defeating information.

A possible response to this objection is to maintain that exercises of abilities, or the use of reliable processes, always depend on the absence of credible defeating information. If so, the versions of (KFVE) above may be able to accommodate Desideratum 8.

Another response is to resist Desideratum 8 and the supposed phenomenon of defeater pairing. For more on this, see the discussions of “unreasonable justified beliefs” in Lasonen-Aarnio (2010, 2014) and Benton and Baker-Hytch (2015). For qualified opposition see Horowitz (2014).

The second concern about (KFVE-Ability) and (KFVE-Functionalism) is a question about the extent to which abilities and cognitive processes are “in the head.” Consider the amputee gymnast. She lost her leg and so no longer has the ability to do a backflip. Her ability to do backflips was thus located, in part, in her ability to successfully interact with the physical world in certain ways, in this case, in her ability to control her body’s physical movements. This does not conflate proficiency with mere ability, for even with both legs the gymnast might lack a proficiency because she is in an inhospitable environment for performing backflips (high winds, buckling floors, and so forth). We might wonder, then, whether the envatted brain’s ability to know by perception is lost with the loss of her body and its perceptual apparatus, just as the gymnast’s ability to do backflips is lost with the loss of her leg. If so, then it is a mistake to think (KFVE-Ability) and (KFVE-Functionalism) are compatible with the new evil demon hypothesis, and hence with Desideratum 3. This threatens to make these views more revisionary than they initially appeared to be.

9. Know-How Theories and the No-Defeat Condition

Silva (2017) argues that justification is grounded in our practical knowledge (knowledge-how) concerning the acquisition of propositional knowledge (knowledge-that). The motivation for this incarnation of (KFVE) starts with the simple observation that we know how to acquire propositional knowledge. You, for example, know how to figure out whether your bathroom faucet is currently leaking, you know how to figure out whether your favorite sports team won more games this season than last season, you know how to figure out the sum of 294 and 3342, and so on. In normal circumstances when you exercise such know-how you typically gain propositional knowledge. If you know how to figure out whether the faucet is leaking and you use that know-how, the typical result is knowledge that the faucet is leaking (if it is leaking) or knowledge that the faucet is not leaking (if it is not leaking). One way of thinking about the grounds of justification is that it is crucially connected to this kind of know-how: justified belief is, roughly, belief produced by one’s knowledge how to acquire propositional knowledge.

Here is a characterization of Silva’s (2017) view:

(KFVE-KnowHow) S has a justified belief iff (i) S’s belief is produced by an exercise of S’s knowledge of how to gain propositional knowledge, and (ii) S is not justified in thinking she is not in a position to acquire propositional knowledge in her current circumstances.

One advantage of (KFVE-KnowHow) is that it is formulated in terms of know-how and so avoids worries about abilities not being “in the head.” For example, while the amputee gymnast discussed above lacks the ability to perform backflips, she still knows how to do them. Similarly, in thinking about the recently envatted brain, she still knows how to acquire propositional knowledge by perception even if she lacks the ability to do so because she has lost the necessary perceptual apparatus. So Desideratum 3 is, arguably, easier to accommodate with (KFVE-KnowHow).

Similarly, since exercises of know-how are fallible in situ (Hawley 2003), (KFVE-KnowHow) has no trouble explaining how exercises of one’s knowledge of how to know could lead one to a false belief (Desideratum 1) or to true beliefs that do not constitute knowledge (Desideratum 2). For similar reasons (KFVE-KnowHow) is able to satisfy Desiderata 4–7. See Silva (2017) for further discussion.

Lastly, condition (ii) is a kind of “no defeater” condition that makes (KFVE-KnowHow) compatible with Schroeder’s defeater-pairing thesis and standard intuitions about undercutting defeat, so it accommodates Desideratum 8 as well. (KFVE-KnowHow) thus appears capable of satisfying all the desiderata that emerged above. Accordingly, to the extent that one finds some subset of Desiderata 1–8 objectionable, one will have reason to object to (KFVE-KnowHow). For one way of developing this point see the next section.

10. Excused Belief vs. Justified Belief

The objections to knowledge-first views of justification above assumed, among other things, that justification has the following properties:

Desideratum 1. Justification is non-factive, that is, one can have justified false beliefs.

Desideratum 2. One can have justified true beliefs that do not constitute knowledge, as in standard Gettier cases.

Desideratum 3. One can have justified perceptual beliefs even if one is in an environment where perceptual knowledge is impossible due to systematically misleading features of one’s perceptual environment. This can happen on a more global scale (as in the new evil demon case), and it can happen on a more local scale (as in beech-elm cases discussed above).

Desideratum 4. Justified beliefs can be based on inferences from justified false beliefs.

Desideratum 5. Justified beliefs can be based on inferential activities involving contentless beliefs.

Desideratum 6. Justified belief is a kind of creditable belief.

Desideratum 7. Justified belief has a historical dimension that is incompatible with situations like Bad Past.

Desideratum 8. Justified belief is susceptible to defeat by justified defeating information.

Knowledge-first virtue epistemology has the easiest time accommodating these assumed properties of justification, with (KFVE-KnowHow) being able to accommodate all of them.

In defense of alternative knowledge-first views some might argue that Desiderata 1–8 (or some subset thereof) are not genuine properties of justification, but rather properties of a kindred notion, like excuse. Littlejohn (2012: ch. 6; 2020) and Williamson (2014: 5; 2020) have argued that the failure to properly distinguish justification from excuses undermines many of the arguments that object to there being a tight connection between knowledge and justification. An excuse renders you blameless in violating some norm, and it is easy to see how some might argue that 1–8 (or some subset thereof) indicate situations in which an agent is excusable, and so blameless, although her belief is not justified. For the locus classicus on the concept of excuse see Austin’s “A Plea for Excuses.” For critical discussion of the excuse maneuver in defense of knowledge-first theories (of assertion and justification) see Lackey (2007), Gerken (2011), Kvanvig (2011), Schechter (2017), Madison (2018), and Brown (2018).

Arguably, the most accommodating knowledge-first virtue theory, (KFVE-KnowHow), threatens to make the concept of an excuse nearly inapplicable in epistemology. For the situations indicated in 1-8 are so inclusive that it can be hard to see what work is left for excuses. If one thought there should be deep parallels between epistemology and moral theory, which leaves substantive work for excuses, then one might worry that any theory that can accommodate all of Desiderata 1-8 will in some way be guilty of conflating justification with excuse.

11. A Methodological Reflection on Gettier

The history of the Gettier problem is a long history of failed attempts to give a reductive account of knowledge in terms of justification and other conditions. In light of this, many have since judged the project of providing a reductive analysis of knowledge to be a degenerating research program. In putting knowledge first in the theory of justification, epistemologists are exploring whether we can more successfully reverse the order of explanation in epistemology by giving an account of justified belief in terms of knowledge. Attempts to put knowledge first in the theory of justification began in the early twenty-first century, and their short history is reminiscent of the history of attempts to solve the Gettier problem: knowledge-first theories are proposed, counterexamples are given, new knowledge-first theories (or error theories) are developed, new counterexamples are given, and so on (Whitcomb 2014: sect. 6).

Perhaps this repeat of Gettierology merits a new approach. One such approach, advocated by Gerken (2018), is an ‘equilibristic epistemology’ according to which there is not a single epistemic phenomenon or concept that comes first in the project of the analysis of knowledge or justification. Rather, there are various basic epistemic phenomena that are not reductively analyzable. At most they may be co-elucidated in a non-reductive manner. Alternatively, perhaps we should return to the tradition from which knowledge-first epistemology sprang. That is, perhaps we should return to the prior project of providing a reductive analysis of knowledge in terms of other conditions. One manifestation of a return to the traditional approach draws a distinction between knowledge and awareness, where the diagnosis of the failure of post-Gettier analyses of knowledge is, in part, taken to be a failure to appreciate the differences between knowledge and awareness (Silva 2023: ch. 8-9).

12. References and Further Reading

  • Benton, M. and M. Baker-Hytch.  2015. ‘Defeatism Defeated.’  Philosophical Perspectives 29: 40-66.
  • Bird, Alexander. 2007. ‘Justified Judging.’ Philosophy and Phenomenological Research, 74: 81-110.
  • Brown, J. 2018. Fallibilism. Oxford: Oxford University Press.
  • Chalmers, D. 2012. Constructing the World. Oxford: Oxford University Press.
  • Comesana, J. and Kantin, H. 2010. ‘Is Evidence Knowledge?’ Philosophy and Phenomenological Research, 89: 447-455.
  • Conee, E. 1987. ‘Evident, but Rationally Unacceptable.’ Australasian Journal of Philosophy 65: 316-26.
  • Conee, E. 1994. ‘Against an Epistemic Dilemma.’ Australasian Journal of Philosophy 72: 475-81.
  • Dretske, F. 2009. Perception, Knowledge, Belief. Cambridge: Cambridge University Press.
  • Dutant, J. and C. Littlejohn. 2020. ‘Defeaters as indicators of ignorance.’ In J. Brown and M. Simion (ed.), Reasons, Justification, and Defeat. Oxford University Press.
  • Fratantonio, G. 2019. ‘Armchair Access and Imagination.’ Dialectica 72(4): 525-547.
  • Gerken, M. 2011. ‘Warrant and Action.’ Synthese, 178(3): 529-47.
  • Gerken, M. 2018. ‘Against Knowledge-First Epistemology.’ In J. A. Carter, E. C. Gordon, and B. Jarvis (eds.), Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press, pp. 46-71.
  • Greco, J. 2014. ‘Justification is not Internal.’ In M. Steup, J. Turri, and E. Sosa (eds.) Contemporary Debates in Epistemology. Oxford: Wiley Blackwell: 325-336.
  • Grundmann, T. and S. Bernecker. 2019. ‘Knowledge from Forgetting.’ Philosophy and Phenomenological Research XCVIII: 525-539.
  • Hawley, K. 2003. ‘Success and Knowledge-How.’ American Philosophical Quarterly, 40: 19-31.
  • Hawthorne, J. 2004. Knowledge and Lotteries. Oxford: Oxford University Press.
  • Horowitz, S. 2014. ‘Epistemic Akrasia.’ Nous 48(4): 718-744.
  • Ichikawa, J.J. 2014. ‘Justification is Potential Knowledge.’ Canadian Journal of Philosophy, 44: 184-206.
  • Ichikawa, J.J. 2017. ‘Basic Knowledge First.’ Episteme 14(3): 343-361.
  • Ichikawa, J. and Steup, M. 2012. ‘The Analysis of Knowledge.’ Stanford Encyclopedia of Philosophy.
  • Ichikawa, J. and C.S.I. Jenkins. 2018. In Joseph Adam Carter, Emma C. Gordon & Benjamin Jarvis (eds.), Knowledge First: Approaches in Epistemology and Mind. Oxford University Press.
  • Kelp, C., M. Simion, H. Ghijsen. 2016. ‘Norms of Belief.’ Philosophical Issues 16: 374-92.
  • Kelp, C. 2016. ‘Justified Belief: Knowledge First-Style.’ Philosophy and Phenomenological Research 93: 79-100.
  • Kelp, C. 2017. ‘Knowledge First Virtue Epistemology.’ In Carter, A., Gordon, E. and Jarvis, B. (eds.) Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press.
  • Kelp, C. 2019b. ‘How to Be a Reliabilist.’ Philosophy and Phenomenological Research 98: 346-74.
  • Kelp, C. 2018. Good Thinking: A Knowledge-First Virtue Epistemology. New York: Routledge.
  • Kiesewetter, B. 2017. The Normativity of Rationality. Oxford: Oxford University Press.
  • Kvanvig, J. L. 2011. ‘Norms of Assertion.’ In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
  • Lackey, J. 2007. ‘Norms of Assertion.’ Nous 41: 594-626.
  • Lasonen-Aarnio, M. 2010. ‘Unreasonable knowledge.’ Philosophical Perspectives 24: 1-21.
  • Lasonen-Aarnio, M. 2014. ‘Higher-order evidence and the limits of defeat.’ Philosophy and Phenomenological Research 88: 314–345.
  • Lewis, D. 1997. ‘Finkish Dispositions.’ The Philosophical Quarterly 47: 143-58.
  • Littlejohn, C. 2017. ‘How and Why Knowledge is First.’ In A. Carter, E. Gordon & B. Jarvis (eds.), Knowledge First. Oxford: Oxford University Press.
  • Littlejohn, C. 2012. Justification and the Truth-Connection. Cambridge: Cambridge University Press.
  • Littlejohn, C. 2019. ‘Being More Realistic About Reasons: On Rationality and Reasons Perspectivism.’ Philosophy and Phenomenological Research 99/3: 605-627.
  • Littlejohn, C. 2020. ‘Plea for Epistemic Excuses.’ In F. Dorsch and J. Dutant (eds.), The New Evil Demon Problem. Oxford: Oxford University Press.
  • Madison, B. 2010. ‘Is Justification Knowledge?’ Journal of Philosophical Research 35:173-191.
  • Madison, B. 2018. ‘On Justifications and Excuses.’ Synthese 195 (10):4551-4562.
  • McGlynn, A. 2014. Knowledge First? Palgrave MacMillan.
  • Meylan, A. 2017. ‘In support of the knowledge-first conception of the normativity of justification.’ In Carter, A., Gordon, E. and Jarvis, B. (eds.) Knowledge First: Approaches in Epistemology and Mind. Oxford: Oxford University Press.
  • Millar, A. 2016. ‘Abilities, Competences, and Fallibility.’ In M. Á. Fernández (ed.), Performance Epistemology. Oxford: Oxford University Press.
  • Millar, A. 2019. Knowing by Perceiving. Oxford: Oxford University Press.
  • Miracchi, L. 2015. ‘Competence to Know.’ Philosophical Studies, 172: 29-56.
  • Miracchi, L. 2020. ‘Competent Perspectives and the New Evil Demon Problem.’ In J. Dutant and F. Dorsch, (eds.), The New Evil Demon. Oxford: Oxford University Press.
  • Neta, R. and D. Pritchard. 2007. ‘McDowell and the New Evil Genius.’ Philosophy and Phenomenological Research, 74: 381-396.
  • Neta, R. 2017. ‘Why Must Evidence Be True?’ in The Factive Turn in Epistemology, edited by Velislava Mitova. Cambridge: Cambridge University Press.
  • Pritchard, D. and Greenough, P. (eds.) 2009. Williamson on Knowledge. Oxford: Oxford University Press.
  • Reynolds, S. 2013. ‘Justification as the Appearance of Knowledge.’ Philosophical Studies, 163: 367-383.
  • Rosenkranz, S. 2007. ‘Agnosticism as a Third Stance.’ Mind 116: 55-104.
  • Rosenkranz, S. 2018. ‘The Structure of Justification.’ Mind 127: 309-338.
  • Roche, W. and T. Shogenji. 2014. ‘Confirmation, transitivity, and Moore: The Screening-off Approach.’ Philosophical Studies 168: 797-817.
  • Schechter, J. 2017. ‘No Need for Excuses.’ In J. Adam Carter, Emma Gordon & Benjamin Jarvis (eds.), Knowledge-First: Approaches in Epistemology and Mind. Oxford University Press. pp. 132-159.
  • Silins, N. 2005. ‘Deception and Evidence.’ Philosophical Perspectives 19: 375-404.
  • Silins, N. 2007. ‘Basic justification and the Moorean response to the skeptic.’ In T. Gendler & J. Hawthorne (Eds.), Oxford Studies in Epistemology (Vol. 2, pp. 108–140). Oxford: Oxford University Press.
  • Silva, P. 2017. ‘Knowing How to Put Knowledge First in the Theory of Justification.’ Episteme 14 (4): 393-412.
  • Silva, P. 2018. ‘Explaining Enkratic Asymmetries: Knowledge-First Style.’ Philosophical Studies 175 (11): 2907-2930.
  • Silva P. & Tal, E. 2021. ‘Knowledge-First Evidentialism and the Dilemmas of Self-Impact.’ In Kevin McCain, Scott Stapleford & Matthias Steup (eds.), Epistemic Dilemmas. London: Routledge.
  • Silva, P. 2023. Awareness and the Substructure of Knowledge. Oxford: Oxford University Press.
  • Simion, M. 2019. ‘Knowledge‐first functionalism.’ Philosophical Issues 29 (1): 254-267.
  • Smith, M. 2012. ‘Some Thoughts on the JK-Rule.’  Nous 46(4): 791-802.
  • Smithies, D. 2012. ‘The Normative Role of Knowledge.’ Nous 46(2): 265-288.
  • Smithies, D.  2019. The Epistemic Role of Consciousness. Oxford: Oxford University Press.
  • Sutton, J. 2005. ‘Stick to What You Know.’ Nous 39(3): 359-396.
  • Sutton, J. 2007. Beyond Justification. Cambridge: MIT Press.
  • Sylvan, K. 2018. ‘Knowledge as a Non-Normative Relation.’ Philosophy and Phenomenological Research 97 (1): 190-222.
  • Whitcomb, D. 2014. ‘Can There Be a Knowledge-First Ethics of Belief?’ In Jonathan Matheson & Rico Vitz (eds.), The Ethics of Belief: Individual and Social. Oxford: Oxford University Press.
  • Williamson, T. 2000. Knowledge and its Limits. Oxford: Oxford University Press.
  • Williamson, T. 2009. ‘Replies to Critics.’ In Duncan Pritchard & Patrick Greenough (eds.), Williamson on Knowledge. Oxford: Oxford University Press. pp. 279-384.
  • Williamson, T. 2014. ‘Knowledge First.’ In M. Steup, J. Turri, and E. Sosa (eds.), Contemporary Debates in Epistemology (Second Edition). Oxford: Wiley-Blackwell.
  • Williamson, T. 2020. ‘Justifications, Excuses, and Sceptical Scenarios.’ In J. Dutant and F. Dorsch (eds.), The New Evil Demon. Oxford: Oxford University Press.
  • Zagzebski, L. 1996. Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundations of Knowledge. Cambridge: Cambridge University Press.

 

Author Information

Paul Silva Jr.
Email: psilvajr@gmail.com
University of Cologne
Germany

What Else Science Requires of Time

This article is one of the three supplements of the main Time article. The two others are “Frequently Asked Questions about Time” and “Special Relativity: Proper Times, Coordinate Systems, and Lorentz Transformations (by Andrew Holster).”

Table of Contents

  1. What are Theories of Physics?
    1. The Core Theory
  2. Relativity Theory
  3. Quantum Theory
    1. The Standard Model
  4. Big Bang
    1. Cosmic Inflation
    2. Eternal Inflation and Many Worlds
  5. Infinite Time

1. What are Theories of Physics?

The answer to this question is philosophically controversial, and there is a vast literature on the topic. Here are some brief remarks.

The confirmed theories of physics are our civilization’s most valuable tools for explaining, predicting, and understanding the natural phenomena that physicists study. One of the best features of a good theory in physics is that it allows us to calculate the results of many observations from few assumptions. We humans are lucky that we happen to live in a universe that is so explainable, predictable and understandable, and that is governed by so few laws.

The term theory in this article is used in a technical sense, not in the sense of an explanation as in the remark, “My theory is that the mouse stole the cheese,” nor in the sense of a prediction as in the remark, “My theory is that the mouse will steal the cheese.” The general theory of relativity is an example of our intended sense of the term “theory.” In physics it is usually not helpful to try to explain a phenomenon by appealing to its purpose.

Because theories in science are designed to produce interesting explanations, not to encompass all the specific facts, there is no scientific theory that specifies your age or when you woke up last Tuesday. Some theories are expressed fairly precisely, and some are expressed less precisely. The fairly precise ones that have simplifying assumptions are often called models of nature or models of the world. In physics, the fundamental laws in those models are expressed in the language of mathematics as mathematical equations.

Most researchers would say the model should tell us how the system being modeled would behave if certain conditions were to be changed in a specified way, for example, if the density were doubled or those three moons orbiting the planet were not present. Knowing how the system would behave under different conditions helps us understand the causal structure of the system being modeled.

A theory of physics is, among other things, a set of laws and a set of ways to link its statements to the real, physical world. Do its laws actually govern us? In Medieval Christian theology, the laws of nature were considered to be God’s commands, but today saying nature ‘obeys’ scientific laws or we are ‘governed’ by laws is considered by scientists to be a harmless metaphor. Scientific laws are called laws because they constrain what can happen; they imply this can happen and that cannot. It was Pierre Laplace who first declared that fundamental scientific laws are hard and fast rules with no exceptions.

The philosopher David Lewis claimed that a scientific law is whatever provides a lot of information in a compact and simple expression. This is a justification for saying a law must be a general claim.  The claim that Mars is farther from the Sun than is the Earth is true, but it does not qualify as being a law because it is not general enough. The Second Law of Thermodynamics is general enough.

In our fundamental theories of physics, the standard philosophical presupposition is that a state of a physical system describes what there is at some time, and a law of the theory—an “evolution law” or “dynamical law”—describes how the system evolves from a state at one time into a state at another time. All evolution laws in our fundamental theories are differential equations. Nearly all the fundamental laws are time-reversible, which means that the evolution can be into either an earlier time or a later time. The most important proposed exception to time-reversibility is the treatment in quantum theory of the measurement process. It is discussed below. The second law of thermodynamics says entropy tends to increase, so it is not time-reversible, but it is not a fundamental law.
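For example, the time-dependent Schrödinger equation of quantum theory is an evolution law in this sense (a standard textbook formula, given here only for illustration):

```latex
i\hbar\,\frac{\partial}{\partial t}\Psi(t) = \hat{H}\,\Psi(t)
```

Here Ψ(t) is the state of the system at time t and Ĥ is the Hamiltonian (energy) operator. Because it is a differential equation, the state at one time determines the state at both earlier and later times, illustrating the time-reversibility just described.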

All laws were once assumed to be local in the sense that they need to mention only the here and now and not the there and then. Also, presumably these laws are the same at all times. We have no a priori reason to think physical theories must be time-reversible, local, and time-translation invariant, but these assumptions have been very fruitful throughout much of the history of physics.

Due to the influence of Isaac Newton, subsequent physicists have assumed that the laws of physics are time-translation invariant. This invariance over time implies that the laws of physics we have now are the same laws that held in the past and will hold in the future. It does not imply that if you bought an ice cream cone yesterday, you will buy one tomorrow. Also, the principle that the laws of physical science do not change from one time to another and thus are time translation invariant is not itself time translation invariant, so it is considered to be a meta-law rather than a law.

The laws and principles of physics are not accepted absolutely, like dogmas. Any currently-accepted law  or principle might need to be revised in the future to account for some unusual observations or experiments. However, some laws are believed more strongly than others, and so are more likely to be changed than others if future observations indicate a change is needed.

The laws of our fundamental theories contain many constants such as the fine-structure constant, the value for the speed of light in a vacuum, Planck’s constant, and the value of the rest mass of an electron and proton. For some of these constants, such as the mass of a proton, the Standard Model indicates that we should be able to compute the value exactly, but practical considerations of solving the equations to obtain a value even to two decimal places have been insurmountable, so we make do with a good measurement. That is, we measure the constant as precisely as possible, and then select a best, specific value for the constant to place into the theories containing the constant. A virtue of a theory is to not have too many constants. If there were too many, then the theory could never be disproved by data because the constants could be changed to account for any data, and so the theory would explain nothing and would be pseudoscience. Regarding the divide between science and pseudoscience, the leading answer is that:

what is really essential in order for a theory to be scientific is that some future information, such as observations or measurements, could plausibly cause a reasonable person to become either more or less confident of its validity. This is similar to Popper’s criteria of falsifiability, while being less restrictive and more flexible (Dan Hooper).

a. The Core Theory

Some physical theories are fundamental, and some are not. Fundamental theories are foundational in the sense that their laws cannot be derived from the laws of other physical theories even in principle. For example, the second law of thermodynamics is not fundamental, nor are the laws of plate tectonics in geophysics despite their being critically important to their sciences. The following two theories are fundamental: (i) the general theory of relativity, and (ii) quantum theory. Their amalgamation is what Nobel Prize winner Frank Wilczek called the Core Theory, the theory of almost everything physical. The hedge “almost” is there because it is not a complete theory of gravity, nor of dark matter or dark energy, for example. If it were, it would be called a “theory of everything.” (For the experts: More technically, this amalgamated theory is the effective quantum field theory that includes both the weak field limit of Einstein’s General Theory of Relativity and the Standard Model of Particle Physics, and no assumption is made about the existence of space and time below the Planck length and Planck time.) Almost all scientists believe this Core Theory holds not just in our solar system, but all across the universe, and it held yesterday and will hold tomorrow. Wilczek claimed:

[T]he Core has such a proven record of success over an enormous range of applications that I can’t imagine people will ever want to junk it. I’ll go further: I think the Core provides a complete foundation for biology, chemistry, and stellar astrophysics that will never require modification. (Well, “never” is a long time. Let’s say for a few billion years.)

This implies that one could think of chemistry as applied quantum theory. The Core Theory does not include the Big Bang Theory, and it does not use the terms time’s arrow or now. The concept of time in the Core Theory is primitive or “brute.” It is not definable, but rather it is used to define and explain other concepts.

It is believed by most physicists that the Core Theory can be used in principle to adequately explain the behavior of a potato, a galaxy, and a brain. The hedge phrase “in principle” is important. One cannot replace it with “in practice” or “practically.” Practically there are many limitations on the use of the Core Theory. Here are some of the limitations. There is a margin of error in any measurement, so a user of the Core Theory does not have access to all the needed data for a prediction such as the position of every particle in a system; and, even if this were available, the complexity of the needed calculations would be prohibitive. There is a limit of predictability in a chaotic system due to the butterfly effect that magnifies small errors in an initial measurement into very large errors later in the time evolution of the system. There is quantum uncertainty that Heisenberg expressed with his Uncertainty Principle (see below for more on this). In addition, the Core Theory does not explicitly contain the concepts of a potato, galaxy, and brain. They are emergent concepts that are needed in good explanations at a higher scale, the macroscopic scale. Commenting on these various practical limitations for the study of galaxies, the cosmologist Andrew Pontzen said, “Ultimately, galaxies are less like machines and more like animals—loosely understandable, rewarding to study, but only partially predictable.”
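The butterfly effect mentioned above can be demonstrated with a toy computation. The following sketch (an illustrative example, not part of the Core Theory itself) iterates the logistic map, a simple chaotic system, from two initial conditions that differ by only one part in ten billion, and shows that the tiny difference in "measurement" soon grows to a macroscopic one.

```python
# Butterfly effect illustrated with the logistic map x -> 4x(1 - x),
# a textbook chaotic system. Two initial conditions differing by 1e-10
# diverge until their difference is of the order of the system itself.

def logistic(x):
    """One step of the logistic map with r = 4 (the chaotic regime)."""
    return 4.0 * x * (1.0 - x)

x, y = 0.3, 0.3 + 1e-10   # two nearly identical initial "measurements"
max_gap = 0.0
for step in range(50):
    x, y = logistic(x), logistic(y)
    max_gap = max(max_gap, abs(x - y))

# The initial gap roughly doubles each step, so within a few dozen
# steps it stops being negligible.
print(max_gap > 0.01)
```

This is why, even with deterministic evolution laws, a small margin of error in the initial measurement destroys long-term predictability in practice.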

Regarding the effect of quantum theory on ontology, the world’s potatoes, galaxies and brains have been considered by a number of twentieth-century philosophers to be just different mereological sums of particles, but the majority viewpoint among philosophers of physics in the twenty-first century is that potatoes, galaxies and brains are, instead, fairly stable patterns over time of interacting quantized fields.

For a great many investigations, it is helpful to treat objects as being composed of particles rather than fields. A proton or even a planet might be usefully treated as a particle for certain purposes. Electrons, quarks, and neutrinos are fundamental particles, and they are considered to be structureless, having no inside. Superstring theory disagrees and treats all these particles as being composed of very tiny one-dimensional objects called “strings” that move in a higher-dimensional space, but due to lack of experimental support, string theory is considered to be as yet unconfirmed. String theory in some form or other is the leading candidate for a theory of quantum gravity that resolves the contradictions between quantum theory and relativity theory.

The Core has been tested in many extreme circumstances and with great sensitivity, so physicists have high confidence in it. There is no doubt that for the purposes of doing physics the Core theory provides a demonstrably superior representation of reality to that provided by its alternatives. But all physicists know the Core is not strictly true and complete, and they know that some features will need revision—revision in the sense of being modified or extended. Physicists are motivated to discover how to revise it because such a discovery can lead to great praise from the rest of the physics community. Wilczek says the Core will never need modification for understanding (in principle) the special sciences of biology, chemistry, stellar astrophysics, computer science and engineering, but he would agree that the Core needs revision in order to adequately explain why 95 percent of the universe consists of dark energy and dark matter, why the universe has more matter than antimatter, why neutrinos change their identity over time, and why the energy of empty space is as small as it is. One metaphysical presupposition here is that the new theory will be logically consistent and will have eliminated the present inconsistencies between relativity theory and quantum theory.

The Core Theory presupposes that time exists, that it emerges from spacetime, and that spacetime is fundamental and not emergent. Within the Core Theory, relativity theory allows space to curve, ripple, and expand; and this curving, rippling, and expanding can vary over time. Quantum Theory does not allow any of this, although a future revision of Quantum Theory within the Core Theory is expected to allow this.

The Core Theory also presupposes reductionism in the sense that large-scale laws are nearly all based on the small-scale laws, for example, that the laws of geology are based on the fundamental laws of physics. The only exception seems to be with quantum coherence in which the behavior of a group of particles is not fully describable by complete knowledge of the behavior of the individual particles.

The Core Theory also presupposes an idea Laplace had in 1800 that is now called the Laplacian Paradigm—that laws should have the form of describing how a state of a system at one time turns into a different state at another time. These are the evolution laws or dynamical laws. David Deutsch, Chiara Marletto, and their collaborators (Deutsch 2013) have challenged that paradigm and proposed Constructor Theory, which requires time to emerge from a non-temporal substrate, so that time is not a fundamental feature of nature. Also, it turns the tables on classical reductionism by claiming that the small-scale, microscopic laws of nature are all emergent properties of the larger-scale laws, not vice versa.

2. Relativity Theory

Time is fundamental in relativity theory, and the theory has had a great impact upon our understanding of the nature of time. When the term relativity theory is used, it usually means the general theory of relativity of 1915, but sometimes it means the special theory of relativity of 1905. The special theory is the theory of space and time when you do not pay attention to gravity, and the general theory is when you do. Both the special and general theories have been well tested, and they are almost universally accepted. Today’s physicists understand them better than Einstein did.

Although the Einstein field equations in his general theory:

are exceedingly difficult to manipulate, they are conceptually fairly simple. At their heart, they relate two things: the distribution of energy in space, and the geometry of space and time. From either one of these two things, you can—at least in principle—work out what the other has to be. So, from the way that mass and other energy is distributed in space, one can use Einstein’s equations to determine the geometry of that space, and from that geometry, we can calculate how objects will move through it (Dan Hooper).
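In symbols, the relationship Hooper describes is captured by the Einstein field equations (a standard textbook form, given here for illustration):

```latex
G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}
```

The left side (the Einstein tensor together with the cosmological-constant term) encodes the geometry of spacetime, while the stress-energy tensor on the right encodes the distribution of mass and other energy; fixing either side constrains the other, just as the passage says.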

The theory of relativity implies the fundamental laws of nature are the same for a physical system regardless of what time it is.

The relationship between the special and general theories is slightly complicated. Both theories are about motion of objects and both approach agreement with Newton’s theory the slower the speed of objects, the weaker the gravitational forces, and the lower the energy of those objects. Special relativity implies the laws of physics are the same for all inertial observers, that is, observers who are moving at a constant velocity relative to each other will find that all phenomena obey the same laws. Observers are frames of reference, or persons of negligible mass and volume making measurements from a stationary position in a frame of reference. General relativity implies the laws are the same even for observers accelerating relative to each other, such as changing their velocity due to the influence of gravitation. And acceleration is absolute, not relative to a frame. General relativity holds in all reference frames, but special relativity holds only for inertial reference frames, namely non-accelerating frames.

Special relativity allows objects to have mass but not gravity. It always requires a flat geometry—that is, a Euclidean geometry for space and a Minkowskian geometry for spacetime. General relativity does not have those restrictions. General relativity is a specific theory of gravity, assuming the theory is supplemented by a specification of the distribution of matter-energy at some time. Newton’s main laws of F = ma and F = GmM/r² hold only in special situations. Special relativity is not a specific theory but rather a general framework for theories, and it is not a specific version of general relativity. Nor is general relativity a generalization of special relativity. The main difference between the two is that, in general relativity, spacetime does not simply exist passively as a background arena for events. Instead, spacetime is dynamical in the sense that changes in the distribution of matter and energy are changes in the curvature of spacetime (though not necessarily vice versa).

The theory of relativity is generally considered to be a theory based on causality:

One can take general relativity, and if you ask what in that sophisticated mathematics is it really asserting about the nature of space and time, what it is asserting about space and time is that the most fundamental relationships are relationships of causality. This is the modern way of understanding Einstein’s theory of general relativity….If you write down a list of all the causal relations between all the events in the universe, you describe the geometry of spacetime almost completely. There is still a little bit of information that you have to put in, which is counting, which is how many events take place…. Causality is the fundamental aspect of time. (Lee Smolin).

In the Core theories, the word time is a theoretical term, and the dimension of time is treated somewhat like a single dimension of space. Space is a set of all possible point-locations. Time is a set of all possible point-times. Spacetime is a set of all possible point-events. Spacetime is presumed to be four-dimensional and also a continuum of points, with time being a distinguished, one-dimensional sub-space of spacetime. Because the time dimension is so different from a space dimension, physicists very often speak of (3+1)-dimensional spacetime rather than 4-dimensional spacetime. Both relativity theory and quantum theory assume that three-dimensional space is isotropic (rotation symmetric) and homogeneous (translation symmetric) and that there is translation symmetry in time. Regarding all these symmetries, the physical laws need to obey them, but specific physical systems within space-time need not; your body could become very different if you walk across the road instead of along the road.

(For the experts: Technically, any spacetime, no matter how many dimensions it has, is required to be a differentiable manifold with a metric tensor field defined on it that tells what geometry it has at each point. General relativistic spacetimes are manifolds built from charts involving open subsets of R4. General relativity does not consider a time to be a set of simultaneous events that do or could occur at that time; that is a Leibnizian conception. Instead, general relativity specifies time in terms of the light cone structures at each place. The theory requires spacetime to have at least four dimensions, not exactly four dimensions.)

Relativity theory implies time is a continuum of instantaneous times that is free of gaps just like a mathematical line. This continuity of time was first emphasized by the philosopher John Locke in the late seventeenth century, but it is meant here in a more detailed, technical sense that was developed only toward the end of the 19th century for calculus.

Continuous vs. Discrete

According to both relativity theory and quantum theory, time is not discrete or quantized or atomistic. Instead, the structure of point-times is a linear continuum with the same structure as the mathematical line or as the real numbers in their natural order. For any point of time, there is no next time because the times are packed together so tightly. Time’s being a continuum implies that there is a non-denumerably infinite number of point-times between any two non-simultaneous point-times. Some philosophers of science have objected that this number is too large, and we should use Aristotle’s notion of potential infinity and not the late 19th century notion of a completed infinity. Nevertheless, accepting the notion of an actual nondenumerable infinity is the key idea used to solve Zeno’s Paradoxes and to remove inconsistencies in calculus.
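The density claim—that between any two distinct point-times there is always a third, so no instant has an immediate successor—can be illustrated with exact rational arithmetic. A minimal sketch, modeling instants as rational numbers (which are dense, though continuity additionally requires the nondenumerably many irrational points):

```python
from fractions import Fraction

def between(t1, t2):
    """Return an instant strictly between two distinct instants,
    modeling instants as exact rational numbers."""
    assert t1 != t2
    return (t1 + t2) / 2

# However close two instants are, a third always lies between them,
# so the halving below never runs out of instants.
t1, t2 = Fraction(0), Fraction(1)
for _ in range(50):
    mid = between(t1, t2)
    assert t1 < mid < t2   # density: no instant has a "next" instant
    t2 = mid

print(t2)  # after 50 halvings the gap is 2**-50, still nonzero
```

The rationals already suffice to show density; the nondenumerable infinity of instants mentioned above is what distinguishes the full continuum of the real line from this merely dense ordering.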

The fundamental laws of physics assume the universe is a collection of point events that form a four-dimensional continuum, and the laws tell us what happens after something else happens or because it happens. These laws describe change but do not themselves change. At least that is what laws are in the first quarter of the twenty-first century, but one cannot know a priori that this is always how laws must be. Even though the continuum assumption is not absolutely necessary to describe what we observe, so far it has proved to be too difficult to revise our theories in order to remove the assumption and retain consistency with all our experimental data. Calculus has proven its worth.

No experiment is so fine-grained that it could show times to be infinitesimally close together, although there are possible experiments that could show the assumption to be false if the graininess of time were to be large enough to be detectable.

Not only is there some uncertainty or worry about the correctness of relativity in the tiniest realms, there is also uncertainty about whether it works differently on cosmological scales than it does at the scale of atoms, houses, and solar systems, but so far no rival theories have been confirmed.

In the twenty-first century, one of the most important goals in physics is to discover/invent a theory of quantum gravity that unites the best parts of quantum theory with the theory of relativity. Einstein claimed in 1916 that his general theory of relativity needed to be replaced by a theory of quantum gravity. Subsequent physicists generally agree with him, but that theory has not been found so far. A great many physicists of the 21st century believe a successful theory of quantum gravity will require quantizing time so that there are atoms of time. But this is just an opinion, not a fact.

If there is such a thing as an atom of time and thus such a thing as an actual next instant and a previous instant, then time cannot be like the real number line, because no real number has a next number. It is speculated that if time were discrete, a good estimate for the duration of an atom of time is 10⁻⁴⁴ seconds, the so-called Planck time. No physicist can yet suggest a practical experiment that is sensitive to this tiny scale of phenomena. For more discussion, see (Tegmark 2017).
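The Planck time quoted above is not an arbitrary number; it can be recovered from the fundamental constants via t_P = √(ħG/c⁵). A quick check, using standard CODATA values for the constants:

```python
import math

hbar = 1.054571817e-34  # reduced Planck constant, J*s
G = 6.67430e-11         # Newtonian gravitational constant, m^3 kg^-1 s^-2
c = 2.99792458e8        # speed of light in a vacuum, m/s

# Planck time: the only combination of hbar, G, and c with units of time.
planck_time = math.sqrt(hbar * G / c**5)
print(f"{planck_time:.2e} s")  # on the order of 10^-44 seconds
```

This dimensional combination is why the Planck time is the natural candidate scale at which quantum-gravitational effects, and any possible atomicity of time, would appear.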

The special and general theories of relativity imply that to place a reference frame upon spacetime is to make a choice about which part of spacetime is the space part and which is the time part. No choice is objectively correct, although some choices are very much more convenient for some purposes. This relativity of time, namely the dependency of time upon a choice of reference frame, is one of the most significant philosophical implications of both the special and general theories of relativity.

Since the discovery of relativity theory, scientists have come to believe that any objective description of the world can be made only with statements that are invariant under changes in the reference frame. Saying, “It is 8:00” does not have a truth value unless a specific reference frame is implied, such as one fixed to Earth with time being the time that is measured by our civilization’s standard clock. This relativity of time to reference frames is behind the remark that Einstein’s theories of relativity imply time itself is not objectively real but spacetime is.

Regarding relativity to frame, Newton would say that if you are seated in a vehicle moving along a road, then your speed relative to the vehicle is zero, but your speed relative to the road is not zero. Einstein would agree. However, he would surprise Newton by saying the length of your vehicle is slightly different in the two reference frames, the one in which the vehicle is stationary and the one in which the road is stationary. Equally surprising to Newton, the duration of the event of your drinking a cup of coffee while in the vehicle is slightly different in those two reference frames. These relativistic effects are called space contraction and time dilation, respectively. So, both length and duration are frame dependent and, for that reason, say physicists, they are not objectively real characteristics of objects. Speeds also are relative to reference frame, with one exception. The speed of light in a vacuum has the same value c in all frames that are allowed by relativity theory. Space contraction and time dilation change in tandem so that the speed of light in a vacuum is always the same number.
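The size of these two effects at everyday speeds can be estimated from the Lorentz factor γ = 1/√(1 − v²/c²): the coffee-drinking duration measured in the road frame is longer by the factor γ, and the vehicle’s length in the road frame is shorter by 1/γ. A sketch with an assumed highway speed (the speed, duration, and length here are illustrative numbers):

```python
import math

c = 2.99792458e8   # speed of light in a vacuum, m/s
v = 30.0           # assumed vehicle speed, m/s (roughly highway speed)

# Lorentz factor: barely above 1 at everyday speeds.
gamma = 1 / math.sqrt(1 - (v / c) ** 2)

coffee_break = 60.0   # duration of drinking the coffee, vehicle frame, s
length = 4.0          # vehicle length in its own rest frame, m

print(gamma - 1)                            # of order 5e-15
print(gamma * coffee_break - coffee_break)  # extra duration in road frame, s
print(length - length / gamma)              # length shortfall in road frame, m
```

Both discrepancies are real but some fourteen orders of magnitude below everyday perception, which is why Newton never noticed them and why his theory remains an excellent approximation at low speeds.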

Relativity theory allows great latitude in selecting the classes of simultaneous events, as shown in this diagram. Because there is no single objectively-correct frame to use for specifying which events are present and which are past—but only more or less convenient ones—one philosophical implication of the relativity of time is that it seems to be easier to defend McTaggart’s B theory of time and more difficult to defend McTaggart’s A-theory that implies the temporal properties of events such as “is happening now” or “happened in the past” are intrinsic to the events and are objective, frame-free properties of those events. In brief, the relativity to frame makes it difficult to defend absolute time.

Relativity theory challenges other ingredients of the manifest image of time. For two point-events A and B occurring at the same place but at different times, relativity theory implies their temporal order is absolute in the sense of being independent of the frame of reference. This agrees with common sense and thus the manifest image of time, but if A and B are distant from each other and occur close enough in time to be within each other’s absolute elsewhere, then relativity theory implies event A can occur before event B in one reference frame, but after B in another frame, and simultaneously with B in yet another frame. No person before Einstein ever imagined time has such a strange feature.

The special and general theories of relativity provide accurate descriptions of the world when their assumptions are satisfied. Both have been carefully tested. The special theory does not mention gravity, and it assumes there is no curvature to spacetime, but the general theory requires curvature in the presence of mass and energy, and it requires the curvature to change as their distribution changes. The presence of gravity in the general theory has enabled the theory to be used to explain phenomena that cannot be explained with either special relativity or Newton’s theory of gravity or Maxwell’s theory of electromagnetism.

Because of the relationship between spacetime and gravity, the equations of general relativity are much more complicated than are those of special relativity. But general relativity assumes the equations of special relativity hold at least in all infinitesimal regions of spacetime.

To give one example of the complexity just mentioned, the special theory clearly implies there is no time travel to events in one’s own past. Experts do not agree on whether the general theory has this same implication because the equations involving the phenomena are too complex for them to solve directly. Approximate solutions have to be used, yet still there is disagreement about this kind of time travel.

Because of the complexity of Einstein’s equations, all kinds of tricks of simplification and approximation are needed in order to use the laws of the theory on a computer for all but the simplest situations.

Regarding curvature of time and of space, the presence of mass at a point implies intrinsic spacetime curvature at that point, but not all spacetime curvature implies the presence of mass. Empty spacetime can still have curvature, according to relativity theory. This point has been interpreted by many philosophers as a good reason to reject Leibniz’s classical relationism. The point was first mentioned by Arthur Eddington.

Two accurate, synchronized clocks do not stay synchronized if they undergo different gravitational forces. This is a second kind of time dilation, in addition to dilation due to speed. So, a correct clock’s time depends on the clock’s history of both speed and gravitational influence. Gravitational time dilation would be especially apparent if a clock were to approach a black hole. The rate of ticking of a clock approaching the black hole slows radically upon approach to the horizon of the hole as judged by the rate of a clock that remains safely back on Earth. This slowing is sometimes misleadingly described as time slowing down. After a clock falls through the event horizon, it can no longer report its values to Earth, and when it reaches the center of the hole not only does it stop ticking, but it also reaches the end of time, the end of its proper time.
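The two regimes—a slight lag at a planet’s surface versus radical slowing near a horizon—can be compared numerically. In the Schwarzschild approximation, a static clock at distance r from a mass M ticks slower than a distant clock by the factor √(1 − 2GM/(rc²)), which goes to zero at the event horizon r = 2GM/c². A sketch (the 10-solar-mass black hole is an assumed example):

```python
import math

G = 6.67430e-11   # Newtonian gravitational constant
c = 2.99792458e8  # speed of light in a vacuum

def tick_rate(M, r):
    """Schwarzschild tick rate of a static clock at radius r,
    relative to a clock far from the mass M (1.0 = no slowing)."""
    rs = 2 * G * M / c**2          # Schwarzschild radius of the mass
    assert r > rs, "no static clock at or inside the horizon"
    return math.sqrt(1 - rs / r)

M_earth = 5.972e24
print(1 - tick_rate(M_earth, 6.371e6))   # ~7e-10: tiny lag at Earth's surface

M_hole = 10 * 1.989e30                   # an assumed 10-solar-mass black hole
rs = 2 * G * M_hole / c**2
print(tick_rate(M_hole, 1.001 * rs))     # ~0.03: extreme slowing near horizon
```

The Earth-surface figure, though tiny, is large enough that GPS satellite clocks must be corrected for it; the near-horizon figure shows the clock’s ticking, as judged from far away, approaching a standstill.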

The general theory of relativity theory has additional implications for time. In 1948-9, the logician Kurt Gödel discovered radical solutions to Einstein’s equations, solutions in which there are what are called “closed time-like curves” in graphical representations of spacetime. The unusual curvature is due to the rotation of all the matter throughout Gödel’s universe. As one progresses forward in time along one of these curves, one arrives back at one’s starting point—thus, time travel! Fortunately, there is no empirical evidence that our own universe has this rotation. Some physicists are not convinced from Gödel’s work that time travel is possible in our own universe even if it obeys Einstein’s general theory of relativity. Here is Einstein’s reaction to Gödel’s work on time travel:

Kurt Gödel’s essay constitutes, in my opinion, an important contribution to the general theory of relativity, especially to the analysis of the concept of time. The problem involved here disturbed me already at the time of the building of the general theory of relativity, without my having succeeded in clarifying it.

Let’s explore the microstructure of time in more detail. In mathematical physics used in both relativity theory and quantum theory, the ordering of instants by the happens-before relation of temporal precedence is complete in the sense that there are no gaps in the sequence of instants. Any interval of time is a continuum, so the points of time form a linear continuum. Unlike physical objects, physical time and physical space are believed to be infinitely divisible—that is, divisible in the sense of the actually infinite, not merely in Aristotle’s sense of potentially infinite. Regarding the density of instants, the ordered instants are so densely packed that between any two there is a third so that no instant has a very next instant. Regarding continuity, time’s being a linear continuum implies that there is a nondenumerable infinity of instants between any two non-simultaneous instants. The rational number line does not have so many points between any pair of different points; it is not continuous the way the real number line is, but rather contains many gaps. The real numbers such as pi and the square root of two fill the gaps.

The actual temporal structure of events can be embedded in the real numbers, at least locally, but how about the converse? That is, to what extent is it known that the real numbers can be adequately embedded into the structure of the instants, at least locally? This question is asking for the justification of saying time is not discrete, that is, not atomistic. The problem here is that the shortest duration ever measured is about 250 zeptoseconds. A zeptosecond is 10⁻²¹ second. For times shorter than about 10⁻⁴³ second, which is the physicists’ favored candidate for the duration of an atom of time, science has no experimental grounds for the claim that between any two events there is a third. Instead, the justification of saying the reals can be embedded into the structure of the instants is that (i) the assumption of continuity is very useful because it allows the mathematical methods of calculus to be used in the physics of time; (ii) there are no known inconsistencies due to making this assumption; and (iii) there are no better theories available. The qualification earlier in this paragraph about “at least locally” is there in case there is time travel to the past. A circle is continuous, and one-dimensional, but it is like the real numbers only locally.

One can imagine two empirical tests that would reveal time’s discreteness if it were discrete—(1) being unable to measure a duration shorter than some experimental minimum despite repeated tries, yet expecting that a smaller duration should be detectable with current equipment if there really is a smaller duration, and (2) detecting a small breakdown of Lorentz invariance. But if any experimental result that purportedly shows discreteness is going to resist being treated as a mere anomaly, perhaps due to error in the measurement apparatus, then it should be backed up with a confirmed theory that implies the value for the duration of the atom of time. This situation is an instance of the kernel of truth in the physics joke that no observation is to be trusted until it is backed up by theory.

It is commonly remarked that, according to relativity theory, nothing can go faster than c, the speed of light, not even the influence of gravity. The remark needs some clarification, else it is incorrect. Here are three ways to go faster than the speed c. (1) First, the medium needs to be specified. c is the speed of light in a vacuum. The speed of light in certain crystals can be much less than c, say 40 miles per hour, and if so, then a horse outside the crystal could outrun the light beam. (2) Second, the limit c applies only to objects within space relative to other objects within space, and it requires that no object pass another object locally at faster than c. However, the general theory of relativity places no restrictions on how fast space itself can expand. So, two galaxies can fly apart from each other at faster than the speed c of light if the intervening space expands sufficiently rapidly. (3) Imagine standing still outside on the flat ground and aiming your laser pointer forward and parallel to the ground. Now change the angle in order to aim the pointer down at your feet. During that process of changing the angle, the point of intersection of the pointer and the tangent plane of the ground will move toward your feet faster than the speed c. This does not violate relativity theory because the point of intersection is merely a geometrical object, a point, not a physical object, so its speed is not restricted by relativity theory.
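Case (2) above—recession faster than c due to the expansion of space itself—can be estimated with Hubble’s law, v = H₀d. The Hubble constant and the distance used below are illustrative round numbers, not precise measurements:

```python
c = 2.998e5   # speed of light, km/s
H0 = 70.0     # assumed Hubble constant, km/s per megaparsec (Mpc)

def recession_speed(d_mpc):
    """Hubble-law recession speed, in km/s, of a galaxy d_mpc megaparsecs away."""
    return H0 * d_mpc

# Beyond the Hubble distance c/H0, recession due to expansion exceeds c.
hubble_distance = c / H0
print(hubble_distance)            # roughly 4300 Mpc with these numbers
print(recession_speed(6000) > c)  # a galaxy 6000 Mpc away recedes faster than c
```

No object is passing any nearby object faster than c here; the galaxies are carried apart by the expansion of the intervening space, which relativity theory does not restrict.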

For more about special relativity, see Special Relativity: Proper Times, Coordinate Systems, and Lorentz Transformations.

3. Quantum Theory

Time is a continuum in quantum theory, just as it is in the theory of relativity and Newton’s mechanics, but change over time is treated in quantum theory very differently than in classical theories.

Quantum theory is a combination of quantum mechanics and the special theory of relativity, but not the general theory of relativity. It also includes the Standard Model of particle physics, which is a theory of all the known forces of nature except for the gravitational force and of all known particles except the graviton, the particle of gravity. Quantum theory has its name because it implies that some qualities or properties, such as energy and charge, are quantized in the sense that they do not change continuously but only in multiples of a minimum discrete step. The minimum changes are called quantum steps.

Quantum theory is our most successful theory in all of science, more so even than relativity theory, and it is well tested and very well understood mathematically despite its not being well understood intuitively or informally or philosophically. The variety of phenomena it can be used to successfully explain is remarkable. For four examples, it explains (i) why you can see through a glass window but not a potato, (ii) why the Sun has lived so long without burning out, (iii) why atoms are stable so that the negatively-charged electrons do not crash into the positively-charged nucleus, (iv) why the periodic table of elements has the structure and most of the values it has. Without quantum theory, all these facts must be taken to be brute facts of nature.

Surprisingly, physicists still do not agree on the exact formulation of quantum theory. Its many so-called “interpretations” are really competing versions of the theory. That is why there is no agreement on what the axioms of quantum theory are. Also, there is a disagreement among philosophers of physics regarding whether the competing interpretations are (1) empirically equivalent and underdetermined by (all possible) experimental evidence and so must be decided upon by such features as their degree of mathematical elegance and simplicity, or (2) are not empirically equivalent theories but, instead, are theories that may in the future be confirmed or refuted by experimental evidence.

All current interpretations of quantum theory appear to prohibit time-like loops that allow a particle to travel along a path of spacetime that curves into its own past, although this is allowed by the general theory of relativity. To be more cautious, Gödel and Einstein believed the general theory of relativity allowed this, but some 21st century experts on relativity are not yet convinced that Gödel and Einstein interpreted the theory correctly.

Indeterminism

Determinism implies predictability in principle, and it implies the universe is not random.

Classical physicists envisioned the world to be deterministic in the sense that, given a precise and complete specification of the way things are at some time, called the “initial state,” then any later state, the so-called “final state,” is fixed, at least in principle, even if practically there are no available instruments that would provide the information about the initial state, and even if practically the required computations are too difficult to perform.

According to quantum theory, a state of an isolated system is described very differently than in all earlier theories of physics. It is described by the Schrödinger wave function. Schrödinger’s wave equation for that function describes how the state changes from one time to another. In this equation, time is fundamental, but space is not. However, the wave function at a time and place specifies the probability of detecting, say, an electron at that time and place. So, probability is at the heart of quantum theory. Because of the probability, if you were to set up your system the way it was the first time, then the outcome the second time might be different. Therefore, the key principle of causal determinism, “same cause, same effect,” fails.

Einstein reacted to this quantum indeterminism by proposing that there would be a future discovery of as yet unknown variables or properties that, when taken into account by a revised Schrödinger equation, would make quantum theory deterministic. David Bohm agreed with Einstein and went some way in this direction by building a revision of quantum theory, but his interpretation has not succeeded in moving the needle of scientific opinion.

Physicists normally wish to assume that our universe’s total information is conserved over time—all the universe’s quantum information was present at the Big Bang, and it persists today. This principle of the conservation of information fails according to the classical interpretation of quantum theory, the Copenhagen Interpretation.

The Copenhagen Interpretation

The classical interpretation of quantum theory was the product of Niels Bohr and his colleagues in the 1920s. It is called the Copenhagen Interpretation because Bohr taught at the University of Copenhagen. According to its advocates, time reversibility, determinism, the conservation of information, locality, the principle that causes affect the future and not the past, and the reality of the world independently of its being observed all fail.

In the famous two-slit experiment, an electron shot toward an otherwise impenetrable plate might pass through it by entering through the plate’s left slit or a parallel right slit. The slits are very narrow and closely aligned. Unlike macroscopic objects such as bullets entering through a narrow slit in a steel wall, which are at only one location at a time, a single electron is understood in the Copenhagen interpretation as going through both slits at the same time, then interfering with itself on the other side, and then striking the optical screen behind the plate at only a single location, thereby helping to cast a unique pattern of dots on the screen. This pattern is very similar to the pattern obtained by diffraction of classical waves. The favored explanation of the two-slit experiment is to assume so-called “wave-particle duality,” namely that a single particle has wavelike properties, and a wave (a wave train, not just a wave crest) has particle-like properties. Also, before it is detected, the electron is in a cloud of possibilities: a superposition of the state of having gone through only the left slit and the state of having gone through only the right slit.

The optical screen that displays the dots is similar to a computer monitor that displays a pixel-dot when and where an electron collides with it. See the diagram below of electrons passing through slits in a screen (such as a piece of steel containing two narrow, parallel slits) and then hitting an optical screen that is behind the two slits. In the diagram below, the interference pattern that is produced is displayed on the right (the front view). This interference pattern occurs even if the electrons are shot infrequently at the optical screen, such as only once per second. Surely, if electrons were like bullets, a bullet hitting the screen could not be affected by what the previous bullet did a second earlier. Because the collective electron behavior over time looks so much like optical wave diffraction, this behavior is considered to be definitive evidence of electrons behaving as waves.

But the interference does not occur if the electrons are actively observed during the experiment by, say, a light being shined on each slit to see which slit each electron went through. When observed going through the slits, the electron behavior changes, and they act like tiny bullets with no diffraction and no other wave behavior. Here is a diagram of that situation:

Comparison of the two diagrams has led a great many researchers to conclude that, when an electron is not observed at the moment of passing through the slits or before colliding with the screen, it passes through both slits (and is in two places at once). When observed at the slits, it passes through only one slit.

According to the Copenhagen Interpretation of the two-slit experiment, observing the electron going through the slits collapses the wave function so it describes a single outcome, while deleting the other possibilities. To restate this, before the measurement, the electron is in a two-places-at-once-state of going through the left slit and of going through the right slit, and the measurement interaction collapses this superposition state into a single state of the electron’s going through the slit where it is detected.
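The difference that observation makes can be put in terms of amplitudes. Unobserved, the probability of a dot at a point on the screen comes from adding the two slit amplitudes and then squaring, which produces an interference cross term; observed, the two slit probabilities are added directly and the cross term disappears. A minimal sketch with equal-magnitude complex amplitudes and an assumed phase difference between the two paths (illustrative numbers, not a model of any specific apparatus):

```python
import cmath, math

def unobserved(phase):
    """Probability at a screen point: add amplitudes, THEN square (interference)."""
    a_left = 1 / math.sqrt(2)                       # amplitude via left slit
    a_right = cmath.exp(1j * phase) / math.sqrt(2)  # amplitude via right slit
    return abs(a_left + a_right) ** 2

def observed(phase):
    """Which-slit detection: square each amplitude, THEN add (no cross term)."""
    a_left = 1 / math.sqrt(2)
    a_right = cmath.exp(1j * phase) / math.sqrt(2)
    return abs(a_left) ** 2 + abs(a_right) ** 2

print(unobserved(0))        # constructive fringe: about 2, twice the classical value
print(unobserved(math.pi))  # dark fringe: about 0, forbidden for classical bullets
print(observed(0), observed(math.pi))  # about 1 either way: no fringes at all
```

The bright and dark fringes of the first function are the interference pattern; the flat result of the second is the bullet-like pattern seen when the slits are watched.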

To explain the two-slit experiment, Bohr proposed an anti-realist interpretation of the world by saying there is no determinate, unfuzzy way the world is when it is not being observed. There is only a cloud of possible values for each property of the system that might be measured. Eugene Wigner, a Nobel Prize winning physicist, promoted the claim that there exists a determinate, unfuzzy reality only when a conscious being is observing it. This prompted Einstein to ask a supporter of Bohr whether he really believed that the moon exists only when it is being looked at.

The two-slit experiment has caused philosophers of physics to disagree about what quantum theory implies an object is, what it means for an object to have a location, how an object maintains its identity over time, and whether consciousness of the measurer is required in order to make reality become determinate and not “fuzzy” or “blurry.” Also, in regard to the classical principle that causes affect the future and never the past, Princeton physicist John Wheeler famously remarked in his 1983 book Quantum Theory and Measurement: “Equipment operating in the here and now has an undeniable part in bringing about that which appears to have happened.” Opponents of the Copenhagen Interpretation have remarked that these interpretations of quantum theory are too weird to be true.

Measurement

According to the Copenhagen Interpretation, during the measurement process the wave function describing the fuzzy, superposition-state “collapses” instantaneously or nearly instantaneously from the superposition of states to a single state with a definite value for whatever is measured. Using a detector to measure which slit the electron went through in the two-slit experiment is the paradigm example of the collapse of the wave function.

Attempting to confirm this claim about the speed of the collapse via an experiment faces the obstacle that no measurement can detect such a short interval of time:

Yet what we do already know from experiments is that the apparent speed at which the collapse process sweeps through space, cleaning the fuzz away, is faster than light. This cuts against the grain of relativity in which light sets an absolute limit for speed (Andrew Pontzen).

During the collapse, one of the possible values for the measurement becomes the actual specific value, and the other possibilities are deleted. And quantum information is lost. According to the Copenhagen Interpretation, during any measurement, from full knowledge of the new state, the prior state cannot be deduced. Different initial states may transition into the same final state. So, time reversibility fails. There can be no un-collapsing.

When a measurement occurs, it is almost correct to explain this as follows: At the beginning of the measurement, the system “could be in any one of various possibilities, we’re not sure which.” But not quite. Strictly speaking, before the measurement is made the system is in a superposition of multiple states, one for each possible outcome of the measurement, with each outcome having a fixed probability of occurring as determined by quantum theory; and the measurement itself is a procedure that removes the superposition and randomly realizes just one of those states. Informally, this is sometimes summarized in the remark that measurement turns the situation from fuzzy to definite.
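This fuzzy-to-definite step can be sketched operationally: a superposition assigns each possible outcome a complex amplitude, the Born rule turns amplitudes into probabilities, and a measurement randomly realizes one outcome with its Born probability. The amplitudes below are illustrative, not drawn from any particular experiment:

```python
import random

# An assumed superposition over two outcomes: a complex amplitude for each.
superposition = {"left slit": complex(0.6, 0.0),
                 "right slit": complex(0.0, 0.8)}

def born_probabilities(state):
    """Born rule: the probability of each outcome is |amplitude|^2."""
    return {outcome: abs(a) ** 2 for outcome, a in state.items()}

def measure(state):
    """'Collapse': randomly realize one definite outcome, Born-weighted."""
    probs = born_probabilities(state)
    outcomes, weights = zip(*probs.items())
    return random.choices(outcomes, weights=weights)[0]

probs = born_probabilities(superposition)
print(probs)                   # left slit: 0.36, right slit: 0.64
print(measure(superposition))  # one definite outcome; reruns may differ
```

Note the last line: repeating the measurement on an identically prepared superposition can yield a different outcome, which is exactly the failure of “same cause, same effect” discussed in the Indeterminism section above.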

For an instant, a measurement on an electron can say it is there at this specific place, but immediately afterward it becomes fuzzy again, and once again there is no single truth about where an electron is precisely, but only a single truth about the probabilities for finding the electron in various places if certain kinds of measurements were to be made.

Many opponents of the Copenhagen Interpretation have reacted in this way:

In the wake of the Solvay Conference (in 1927), popular opinion within the physics community swung Bohr’s way, and the Copenhagen approach to quantum mechanics settled in as entrenched dogma. It’s proven to be an amazingly successful tool at making predictions for experiments and designing new technologies. But as a fundamental theory of the world, it falls woefully short (Sean Carroll).

George Ellis, co-author with Stephen Hawking of the definitive book The Large-Scale Structure of Space-Time, identifies what he believes is a key difficulty with our understanding of quantum measurement in interpretations that imply the wave function collapses during measurement: “Usually, it is assumed that the measurement apparatus does not obey the rules of quantum theory, but this [assumption] contradicts the presupposition that all matter is at its foundation quantum mechanical in nature.”

Those who want to avoid having to bring consciousness of the measurer into quantum physics and who want to restore time-reversibility and determinism and conservation of quantum information typically recommend adopting a different interpretation of quantum mechanics that changes how measurement is understood. Einstein had a proposal, the Hidden Variable Interpretation. He hoped that by adding new laws specifying the behavior of so-called “hidden variables” affecting the system, then determinism, time-reversibility, and information conservation would be restored, and there would be no need to speak of a discontinuous collapse of the wave function during measurement. The “spookiness” would be gone. Also, quantum probabilities would be epistemological; they would be caused by our lack of knowledge of the hidden variables. Einstein’s proposal never gathered much support.

The Many-Worlds Interpretation and Branching Time

The Many-Worlds Interpretation is a popular replacement for the Copenhagen Interpretation. It introduces many worlds or multiple universes. Our own is just one of many, perhaps infinitely many. Anything that can happen according to quantum mechanics in our universe does happen in some universe or other.

This proposal removes the radical distinction between the measurer and what is measured and replaces it with a continuously evolving wave function for the combined system of measurement process plus measurer for the entire universe. Our being stuck in a single world, though, implies that during measurements it will appear as if the wave function for the system under study collapses, but the wave function for the totality of the multiverse does not collapse. The laws of the Many-Worlds Interpretation are time-reversal symmetric and deterministic, and there is no need for the anti-realist stance. Also, quantum information is never lost in the sum of all worlds. It is an open question whether the multiverse theory should require the same fundamental scientific laws in all universes.

The Many-Worlds Interpretation is frequently called the Everettian interpretation after its founder Hugh Everett III. It implies that, during any measurement having some integer number n of possible outcomes, the universe splits instantaneously into n copies of itself, each with a different outcome. If a measurement can produce any value from 0 to 10, and we find that “8” is the value we see for the outcome of our own measuring apparatus, then the counterparts of us who live in the other universes and who have the same memories as we have see outcomes other than “8”. Clearly, the weirdness of the Copenhagen interpretation has been traded for a new kind of weirdness.
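The branch-counting arithmetic implicit in this description can be made explicit. The following sketch is only an illustration of the combinatorics (the function name is invented here, and it assumes the outcomes are the integers 0 through 10), not part of the interpretation’s formalism:

```python
def branch_count(outcomes_per_measurement, num_measurements):
    """Branches after repeated n-outcome measurements, assuming each
    measurement splits every existing branch into n copies."""
    return outcomes_per_measurement ** num_measurements

# A measurement with the integer values 0 through 10 has 11 outcomes,
# so three such measurements in a row leave 11**3 = 1331 branches.
print(branch_count(11, 3))  # 1331
```

Even a modest run of measurements multiplies the branches at this exponential rate, which is one source of the “new kind of weirdness.”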

In the Many-Worlds interpretation, there is no access from one world to another. They exist “in parallel” and not within the same physical space, so any two are neither far from nor close to each other. Information is conserved, but not within any single universe. If we had access to all information about all the many worlds (the multiverse’s wave function) and had unlimited computational capacity, then we could see that the multiverse of many worlds evolves deterministically and time-reversibly and see that the wave function for the multiverse never collapses discontinuously. Unfortunately, nobody can know the exact wave function for the entire multiplicity of universes. In a single universe, the ideally best available information can be used to predict only the probability of a measurement outcome, a probability that must be less than 1. So, in this sense, probability remains at the heart of our own world.

The notion that it takes consciousness to have a measurement has been rejected in favor of the idea that, when a system is measured, all that is required is that the system be in a superposition and then interact with and become entangled with its environment. This interaction process is called “decoherence,” an exotic kind of breaking apart. The state of a system of one free electron can be ‘measured’ by its hitting an air molecule. Not every interaction leads to decoherence, though, and it takes careful work to arrange the kind of interaction that preserves coherence. Preserving coherence is the most difficult goal to achieve in improving a quantum computer, and cooling is one of the main techniques used to reduce the interactions that cause decoherence. These interactions are called “noise” in a quantum computer. According to the Many-Worlds Interpretation, the moon is there when it is not being looked at because the moon is always interacting with some particle or other and thereby decohering and, in that sense, getting measured. Decoherence is also why the moon’s quantum properties are not visible to us at our macroscale. Nevertheless, the moon is a quantum object (an object obeying the rules of quantum theory), like all other objects.

Although not all cosmologists who accept the Everettian or Many-Worlds Interpretation of quantum mechanics agree with each other, Sean Carroll’s particular position is that new universes are created whenever there is decoherence.

The multiverse of the Many-Worlds Interpretation is a different multiverse from the multiverse of chaotic inflation that is described below in the section about extending the Big Bang Theory. The universes of chaotic inflation exist within a single background physical space, unlike the universes of the Many-Worlds Interpretation. Not every expert here agrees, but many suggest that in both kinds of multiverse, time is better envisioned, not as linear, but rather as increasingly branching into the times of the new universes. Time itself branches and is not linear, and there can be no un-branching or branch removal. If Leibniz were alive, he might say that, despite all the many branches coming into existence, we live in the best of all possible branches. The reason for saying “not every expert here agrees” is that, even though everyone agrees on what the wave function is doing and that it gets new parts when there is an interaction, not everyone wants to say a new part describes a new world.

What the Copenhagen Interpretation calls quantum fuzziness or a superposition of states, Everett calls a superposition of many alternate universes. One advantage of accepting all these admittedly weird alternate universes is that in one clear sense the multiverse is deterministic and has information conservation. Although any single universe fails to be deterministic and information-preserving, the evolution of the global state of the multiverse is deterministic and information-preserving, and the multiverse evolves according to the Schrödinger equation. At least this is so on an ontological approach to the wave function; on an epistemic approach, quantum theories are not directly about reality but rather are merely tools for making measurements. This is an instrumentalist proposal.

Experts do not agree on whether the quantum wave function is a representation of reality, or only of our possible knowledge of reality. And there is no consensus on whether we currently possess the fundamental laws of quantum theory, as Everett believed, or instead only an incomplete version of the fundamental laws, as Einstein believed.

Heisenberg’s Uncertainty Principle 

In quantum mechanics, various Heisenberg Uncertainty Principles restrict the simultaneous values of pairs of quantities, for example, a particle’s position and momentum. The uncertainties in the two values cannot both be zero at the same time. Another Heisenberg uncertainty principle restricts time and energy. It implies that the uncertainties in the simultaneous measurements of time and energy in energy emission (or absorption) must obey the inequality ΔE Δt ≥ h/4π. Here h is Planck’s constant, ΔE is the (standard deviation of the) uncertainty in the value of the energy during a time interval, and Δt is the uncertainty in the time. The values of E and t cannot be known more precisely than this. A system cannot have so precise a value of E that ΔE is zero, because the inequality would be violated. According to ontological approaches to quantum mechanics, there are no such precise values to be known.
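To get a feel for the numbers, the inequality fixes the smallest time uncertainty compatible with a given energy uncertainty. A minimal sketch in Python, using the standard value of Planck’s constant and one electron-volt as an illustrative energy scale (the function name is invented for this example):

```python
import math

h = 6.62607015e-34  # Planck's constant, in joule-seconds

def min_delta_t(delta_E):
    """Smallest time uncertainty allowed by dE * dt >= h / (4*pi),
    for an energy uncertainty delta_E given in joules."""
    return h / (4 * math.pi * delta_E)

# For an energy uncertainty of one electron-volt (1.602e-19 J):
dt = min_delta_t(1.602176634e-19)
print(dt)  # roughly 3.3e-16 seconds
```

Tighter knowledge of the energy forces a larger spread in the time, and vice versa, exactly as the inequality requires.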

These uncertainties are detected over a collection of measurements because any single measurement has (in principle and not counting practical measurement error) a precise value and is not “fuzzy” or uncertain. Repeated measurements necessarily produce a spread in values that reveals the fuzzy, wavelike characteristics of the phenomenon being measured, and these measurements collectively obey the Heisenberg inequality. Heisenberg himself thought of his uncertainty principle as being about how the measurer necessarily disturbs the measurement and not about how nature itself does not have definite values.

One other significant implication of these remarks about the uncertainty principle for time and energy is that there can be violations in the classical law of the conservation of energy. The classical law says the total energy of a closed and isolated system is always conserved and can only change its form but not disappear or increase. A falling rock has kinetic energy of motion during its fall to the ground, but when it collides with the ground, the kinetic energy changes its form by heating the ground, heating the rock, and creating the sound energy of the collision. No energy is lost in the process. This classical law can be violated in two ways: (1) if the universe (or the isolated system being studied) expands in volume, and (2) by an amount ΔE for a time Δt, as described by Heisenberg’s Uncertainty Principle. The classical law is often violated for very short time intervals and is less likely to be violated as the time interval increases. Some philosophers of physics have described this violation as something coming from nothing and something disappearing into nothing. The quantum “nothing” or quantum vacuum, however, is not really what classical philosophers call “nothing.” Quantum theory (rather than quantum mechanics) does contain a more sophisticated law of conservation of energy that has no violations and that accounts for the deviations from the classical law.

Quantum Foam

Quantum theory allows so-called “virtual particles” to be created out of the quantum vacuum without violating the more sophisticated law of conservation of energy. Despite their name, these particles are real, but they are unusual, because they borrow energy from the vacuum and pay it back very quickly. What happens is that, when a pair of energetic virtual particles—say, an electron and anti-electron—are created from energy in the vacuum, the two exist for only a very short time before being annihilated or reabsorbed, thereby giving back their borrowed energy. The greater the energy of the virtual pair, the shorter the time interval that the two exist before being reabsorbed, as described by Heisenberg’s Uncertainty Principle. In short, the more energy that is borrowed, the quicker it is paid back.

The physicist John Wheeler first suggested that the ultramicroscopic structure of spacetime for periods on the order of the Planck time (about 5.4 x 10⁻⁴⁴ seconds) in regions about the size of the Planck length (about 1.6 x 10⁻³⁵ meters) probably is a quantum foam of rapidly changing curvature of spacetime, with black holes and virtual particle-pairs and perhaps wormholes rapidly forming and dissolving.

The Planck time is the time it takes light to travel a Planck length. The terms Planck length and Planck time were inventions of Max Planck in the early twentieth century during his quest to find basic units of length and time that could be expressed in terms only of universal constants. He defined the Planck unit of time algebraically as √(ħG/c⁵). Here √ is the square root symbol; ħ is Planck’s constant in quantum theory divided by 2π; G is the gravitational constant in Newtonian mechanics; c is the speed of light in a vacuum in relativity theory. Three different theories of physics are tied together in this one expression. The Planck time is a theoretically interesting unit of time, but not a practical one. No known experimental procedure can detect events that are this brief.
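Plugging the standard values of the three constants into Planck’s formula reproduces the figures quoted above. A minimal sketch:

```python
import math

hbar = 1.054571817e-34  # reduced Planck constant, J*s
G = 6.67430e-11         # Newtonian gravitational constant, m^3 kg^-1 s^-2
c = 2.99792458e8        # speed of light in a vacuum, m/s

planck_time = math.sqrt(hbar * G / c**5)  # about 5.4e-44 seconds
planck_length = c * planck_time           # about 1.6e-35 meters
print(planck_time, planck_length)
```

The one-line product of constants from three different theories yields both Planck units, with the Planck length falling out as light speed times the Planck time.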

Quantum field theory is an amalgam of the theory of quantum mechanics and the special theory of relativity. There are no isolated particles in a vacuum according to quantum field theory because every ordinary elementary particle is surrounded by a cloud of virtual particles. Many precise experiments can be explained only by assuming there is this cloud.

So far, this article has spoken of virtual particles as if they are ordinary, but short-lived, particles. This is not quite correct. Virtual particles are not exactly particles like the other particles of the quantum fields. Both are excitations of these fields, and both have gravitational effects and thus effects on time, but virtual particles are not equivalent to ordinary quantum particles, although the longer-lived ones are more like ordinary particle excitations than the shorter-lived ones.

Virtual particles are just a way to calculate the behavior of quantum fields, by pretending that ordinary particles are changing into weird particles with impossible energies, and tossing such particles back and forth between themselves. A real photon has exactly zero mass, but the mass of a virtual photon can be absolutely anything. What we mean by “virtual particles” are subtle distortions in the wave function of a collection of quantum fields…but everyone calls them particles [in order to keep their names simple] (Carroll 2019, p. 316).

For more presentation of the ontological implications of quantum field theory, see the last section of the supplementary article “Frequently Asked Questions about Time.”

Entanglement and Non-Locality

Classical theories imply locality, the feature that says an object is influenced immediately and directly only by its immediate surroundings. All the interpretations of quantum theory other than the Many-Worlds Interpretation imply the universe is not local. One particle can be coordinated with a distant particle instantly. Einstein discovered this phenomenon. Technically, it is called “quantum entanglement.” Some physicists speak of it as “spooky action at a distance,” and many scientists have attributed this phrase to Einstein himself, but he never said it; only other scientists and science reporters say it. For Einstein, it cannot be spooky action because it is not action. Entanglement is, though, a correlation over a distance.

If some properties of two particles somehow become entangled, this does not mean that, if you move one of them, then the other one moves, too. It is not that kind of entanglement. It is about a particle’s suddenly having a definite property it did not previously have. This entanglement leads to non-locality. A quantum measurement of a certain property of one member of an entangled pair of particles will instantaneously or nearly instantaneously determine the value of that property found by any similar measurement that will eventually be made on the other member of the pair, no matter how far away and how close in time to the first measurement. This is very unintuitive, but the only reasonable explanation is that neither particle has a definite value for the property until the first one is measured, after which the second one’s value is almost immediately fixed. This is at least spooky correlation at a distance.

For example, suppose two electrons have entangled spins, so that if one has spin-up when measured, then the other always has spin-down when measured in the same direction, even if the particles are very far away from each other and both are measured at about the same time. The most important feature here is that the values of the spin properties of the entangled pair were not fixed at the time they became entangled. The value of the spin for the first electron is random or in a superposition of up and down, and only a measurement of spin will fix its value. It might be up; it might be down, but measuring the spin of the first particle to be up immediately fixes the value of spin of the second particle to be down. This initial randomness prevents use of the correlation for sending a useful signal.
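The combination of perfect anti-correlation with individually random outcomes can be illustrated with a toy simulation. This is mere classical bookkeeping of the quantum prediction for same-axis measurements, not a hidden-variable model; on the quantum account neither value exists before the first measurement:

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def measure_entangled_pair():
    """Same-axis spin measurements on a maximally entangled pair:
    each single outcome is random, but the pair is always opposite."""
    first = random.choice(['up', 'down'])
    second = 'down' if first == 'up' else 'up'
    return first, second

trials = [measure_entangled_pair() for _ in range(10_000)]
perfectly_anticorrelated = all(a != b for a, b in trials)
fraction_up = sum(a == 'up' for a, _ in trials) / len(trials)
print(perfectly_anticorrelated, round(fraction_up, 2))
```

Because the fraction of up outcomes on either side stays near one half no matter what is done to the distant partner, the correlation by itself carries no usable signal.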

Here is another way of describing the odd situation of the two entangled particles. The first is Alice’s particle; the second is Bob’s. Because of the correlation, any pair of their measurements will be found to be up-down or else down-up. Alice can look at her system and instantly learn about Bob’s, and she would like to use this fact to communicate quickly with Bob. They agree on the secret code that if Bob measures his electron to be spin-down, then he should buy the junk bonds; otherwise he should not buy them. Suppose Alice wants to use their secret code to tell Bob to buy the bonds. Unfortunately, Alice cannot force her particle always to be up, as a means of causing the second particle to be down. She might measure her particle to be down, and that would send the wrong stock signal to Bob. So, the correlation cannot be used for communication or action or causality. The limitation on the speed of communications, actions, and causal influences that holds in special relativity is preserved even in quantum theory.

In 1935, Erwin Schrödinger said:

Measurements on (spatially) separated systems cannot directly influence each other—that would be magic.

Einstein agreed. Yet the magic seems to exist.

Becoming entangled is a physical process, and it comes in degrees. The above discussion presumed a high level of entanglement.

Ontologically, the key idea about quantum entanglement is that if a particle becomes entangled with one or more other particles within a system, then it loses some of its individuality. The whole system is more than the sum of its sub-parts. The state of an entangled group of particles is not determined by the sum of the states of each separate particle, and vice versa. If you have the maximum information about the state of an entangled system of particles, you know hardly anything about the state of any individual particle. In that sense, quantum mechanics has led to the downfall of reductionism.

It is easy to create entangled pairs. Colliding two energetic photons will produce an electron and an anti-electron whose spins along some axis are entangled. Most entanglement occurs over a short distance. But in order to explore this “magical” feature of the quantum world, researchers have separated two entangled particles by a great distance and measured their spins at the same time. This way, the first measurement outcome cannot have directly affected the second measurement outcome via some ordinary signal sent between them, because the signal would have had to move faster than light to arrive by the time the second measurement is made. Nevertheless, the transmission of coordinated behavior happens in zero time or in nearly zero time. It is hard for those of us influenced by the manifest image to believe that the two entangled electrons did not start out with the spins that they were later measured to have, but careful observations have repeatedly confirmed this nonlocality. It has been shown repeatedly that any assumption that the two entangled particles started out with definite spin values is inconsistent with the data produced in the observations.

But entanglement needs to be better understood. The philosopher David Albert has commented that “In order to make sense of this ‘instantaneity,’ it looks as if there is a danger that one may require an absolute notion of simultaneity of exactly the kind that the special theory of relativity denied.”

Leonard Susskind has emphasized that it is not just particles that can become entangled. Parts of space can be entangled with each other, and it is this entanglement that “holds space together.” Some researchers have concluded that, because quantum theory implies that non-locality occurs almost everywhere, non-locality is the default, and what needs to be explained is any occurrence of locality.

Approximate Solutions

Like the equations of the theory of relativity, the equations of quantum theory are very difficult to solve and use except in very simple situations. The equations cannot be used directly in today’s computers. There have been many Nobel-Prize winning advances in chemistry by finding methods of approximating quantum theory in order to simulate the results of chemical activity within a computer. For one example, Martin Karplus won the Nobel Prize for chemistry in 2013 for creating approximation methods for computer programs that describe the behavior of the retinal molecule in our eye’s retina. The molecule has almost 160 electrons, but he showed that, for describing how light strikes the molecule and begins the chain reaction that produces the electrical signals that our brain interprets during vision, chemists can successfully use an approximation; they need to pay attention only to the molecule’s outer electrons, that is, to the electrons in the electron cloud that is farthest out from the nucleus.

a. Standard Model

The Standard Model of particle physics was proposed in the 1970s, and subsequently it has been revised and well tested. The Model is designed to describe elementary particles and the physical laws that govern them. The Standard Model is really a loose collection of theories about different particle fields, and it describes all known non-gravitational fields. It is our civilization’s most precise and powerful theory of physics.

The theory sets limits of what exists and what can happen. It implies that a particle can be affected by some forces but not others. It implies that a photon cannot decay into two photons. It implies that protons attract electrons and never repel them. It also implies that every proton consists in part of two up quarks and one down quark that interact with each other by exchanging gluons. The gluons “glue” the particles together via the strong nuclear force just as photons glue electrons to protons via the electromagnetic force. Gravitons, the carrier particles for gravity, glue a moon to a planet and a planet to a star. Unlike how Isaac Newton envisioned forces, all forces are transmitted by particles. That is, all forces have carrier particles that “carry” the force from one place to another. The gluons are massless and transmit the strong force at nearly light speed; this force “glues” the quarks together inside a proton. More than 90% of the mass of the proton consists in a combination of virtual quarks, virtual antiquarks and virtual gluons. Because the virtual particles exist over only very short time scales, they are too difficult to detect by any practical experiment, and so they are called “virtual particles.” However, this word “virtual” does not imply “not real.”

The properties of spacetime points that serve to distinguish any particle from any other are a spacetime point’s values for mass, spin, and charge at that point. Nothing else. There are no other differences among what is at a point, so in that sense fundamental physics is very simple. Charge, though, is not simply electromagnetic charge. There are three kinds of color charge for the strong nuclear force, and two kinds of charge for the weak nuclear force.

Except for gravity, the Standard Model describes all the universe’s forces. Strictly speaking, these theories are about interactions rather than forces. A force is just one kind of interaction. Another kind of interaction does not involve forces but rather it changes one kind of particle into another kind. The neutron, for example, changes its appearance depending on how it is probed. The weak interaction can transform a neutron into a proton. It is because of transformations like this that the concepts of something being made of something else and of one thing being a part of a whole become imprecise for very short durations and short distances. So, classical mereology—the formal study of parts and the wholes they form—fails.

Interaction in the field of physics is very exotic. When a particle interacts with another particle, the two particles exchange other particles, the so-called carriers of the interactions. So, when milk is spilled onto the floor, what is going on is that the particles of the milk and the particles in the floor and the particles in the surrounding air exchange a great many carrier particles with each other, and the exchange is what is called “spilling milk onto the floor.” Yet all these varied particles are just tiny fluctuations of fields. This scenario indicates one important way in which the scientific image has moved very far away from the manifest image.

According to the Standard Model, but not according to general relativity theory, all particles must move at light speed c unless they interact with other fields. All the particles in your body, such as its protons and electrons, would move at the speed c if they were not continually interacting with the Higgs Field. The Higgs Field can be thought of as being like a “sea of molasses” that slows down all protons and electrons and gives them the mass and inertia they have. Neutrinos are not affected by the Higgs Field, but they move at slightly less than c because they are slightly affected by the field of the weak interaction.

As of the first quarter of the twenty-first century, the Standard Model is incomplete because it cannot account for gravity or dark matter or dark energy or the fact that there is more matter than anti-matter. When a new version of the Standard Model does all this, then it will perhaps become the long-sought “theory of everything.”

4. Big Bang

The classical Big Bang Theory implies that the universe once was extremely small, extremely dense, extremely hot, nearly uniform, and expanding; and it had extremely high energy density and severe curvature of its spacetime at all scales. Now the universe has lost all these properties except one: it is still expanding. Some cosmologists believe time began with the Big Bang, at the famous cosmic time t = 0, but the Big Bang Theory itself does not imply anything about when time began, nor whether anything was happening before the Big Bang, although those features could be added into a revised theory of the Big Bang.

The Big Bang explosion was a rapid expansion of space itself, not an expansion of something in a pre-existing void. Think of the expansion as being due to the creation of new space everywhere.

The Big Bang Theory is only a theory of the observable universe, not of the whole universe. The observable universe is the part of the universe that is in principle observable by creatures on Earth. But surely there is more than we can in principle observe. Scientists have no well-confirmed idea about the universe as a whole; it might or might not be like the observable universe.

The Big Bang Theory was very controversial when it was created in the 1920s. Before the 1960s, physicists were unsure whether proposals about cosmic origins were pseudoscientific and so should not be discussed in a well-respected physics journal. By 1930, there was general agreement among cosmologists that the universe was expanding, but it was not until the 1970s that there was general agreement that the Big Bang Theory is correct. The theory’s primary competitor during this time was the steady state theory. That theory allows space to expand in volume while this expansion is compensated for by providing spontaneous creation of matter in order to keep the universe’s overall density constant over time. This spontaneous creation violated the increasingly attractive principle of the conservation of energy.

The Big Bang explosion began approximately 13.8 billion years ago (although a minority of cosmologists suggest the universe might be as young as 11.4 billion years). At that time, the observable universe would have had an ultramicroscopic volume. The explosion created new space, and this explosive process of particles flying away from each other continues to create new space today. In fact, in 1998, the classical theory of the Big Bang was revised to say the expansion rate has been accelerating slightly for the last five billion years due to the pervasive presence of dark energy. Dark energy has this name because so little is known about it other than that its amount per unit volume stays constant as space expands. That is, it does not dilute. There are two possibilities for what it is: what is referred to as the “cosmological constant,” or “the energy of the vacuum.” First, it might be:

a nonzero ground-state energy of the universe that will exist indefinitely into the future. Or second, it could be energy stored in yet another invisible background scalar field in the universe. If this is the case, then the next obvious question is, will this energy be released in yet another, future inflationary-like phase transition as the universe continues to cool down? At this time the answer is up for grabs” (Lawrence M. Krauss, The Greatest Story Ever Told—So Far: Why Are We Here?).

One hopes that, if it is the latter of the two possibilities, then that phase transition will not happen very soon.

The Big Bang Theory in some form or other (with or without inflation) is accepted by nearly all cosmologists, astronomers, astrophysicists, and philosophers of physics, but it is not as firmly accepted as is the theory of relativity.

The Big Bang Theory originated with several people, although Edwin Hubble’s very careful observations in 1929 of galaxy recession from Earth were the most influential pieces of evidence in its favor. He showed that on average the farther a galaxy is from Earth, the faster it recedes from Earth. In 1922, the Russian physicist Alexander Friedmann discovered that the general theory of relativity allows an expanding universe. Unfortunately, Einstein reacted to this discovery by saying this is a mere physical possibility and not a feature of the actual universe. He later retracted this claim, thanks in large part to the influence of Hubble’s data. The Belgian physicist Georges Lemaître suggested in 1927 that there is some evidence the universe is expanding, and he defended his claim using previously published measurements to show a pattern that the greater the distance of a galaxy from Earth the greater the galaxy’s speed away from Earth. He calculated these speeds from the Doppler shifts in their light frequency, as did Hubble.
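The pattern Lemaître and Hubble found is summarized by the linear relation v = H₀d between a galaxy’s distance and its average recession speed. A minimal sketch, assuming the illustrative present-day value H₀ ≈ 70 km/s per megaparsec (the precise value is still debated, and the function name is invented here):

```python
H0 = 70.0  # Hubble constant in km/s per megaparsec (illustrative value)

def recession_speed(distance_mpc):
    """Average recession speed, in km/s, of a galaxy at the
    given distance from Earth in megaparsecs, via v = H0 * d."""
    return H0 * distance_mpc

# A galaxy 100 megaparsecs away recedes at about 7000 km/s on average:
print(recession_speed(100))  # 7000.0
```

Doubling the distance doubles the average recession speed, which is just the linearity of the relation Hubble’s data revealed.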

Currently, space is expanding because most clusters of galaxies are flying away from each other, even though molecules, planets, and galaxies themselves are not now expanding. Eventually, according to the most popular version of the Big Bang Theory, in the very distant future, even these objects will expand away from each other and all structures of particles will be annihilated, leaving only an expanding soup of elementary particles as the universe chills and approaches thermodynamic equilibrium.

The acceptance of the theory of relativity has established that space curves locally near all masses. However, the theory of relativity has no implications about curvature of space at the cosmic level. The universe presumably has no edge, but the observable universe does. The observable universe is a sphere containing 350 billion large galaxies; it is called “our Hubble Bubble” and also “our pocket universe.” Its diameter is about 93 billion light years, and it is rapidly growing every day.

The Big Bang Theory presupposes that the ultramicroscopic-sized observable universe at a very early time had an extremely large curvature, but most cosmologists believe that the universe has straightened out and now no longer has any significant spatial curvature on the largest scale of billions of light years. Also, astronomical observations reveal that the current distribution of matter in the universe tends towards uniformity as the scale increases. At very large scales it is homogeneous and isotropic.

Here is a picture that displays the evolution of the observable universe since the Big Bang—although the picture displays only two spatial dimensions of it. Time is increasing to the right while space increases both up and down and in and out of the picture:

Big Bang graphic

Attribution: NASA/WMAP Science Team

Clicking on the picture will produce an expanded picture with more detail. (The picture shows only two spatial dimensions of the three in our universe.)

The term Big Bang does not have a precise definition. It does not always refer to a single, first event; rather, it more often refers to a brief duration of early events as the universe underwent a rapid expansion. In fact, the idea of a first event is primarily a product of accepting the theory of relativity, which is known to fail in the limit as the universe’s volume approaches zero, the so-called singularity. Actually, the Big Bang Theory itself is not a specific theory, but rather a framework for more specific Big Bang theories.

Astronomers on Earth detect microwave radiation arriving in all directions. It is the cooled-down heat from the Big Bang. More specifically, it is electromagnetic radiation produced about 380,000 years after the Big Bang, when the universe first turned transparent. Mapping the microwave radiation gives us a picture of the universe in its infancy. At that time, the universe had cooled down to 3,000 kelvins, which was cool enough to form atoms and to allow photons for the first time to move freely without being immediately reabsorbed by neighboring particles. This primordial electromagnetic radiation has now reached Earth as the universe’s most ancient light. Because of space’s expansion during the light’s travel to Earth, the radiation has cooled and dimmed, and its wavelength has increased and become microwave radiation with a corresponding temperature of only 2.73 kelvins above absolute zero. The microwaves’ wavelength is about two millimeters, small compared to the 100-millimeter wavelength of the microwaves in kitchen ovens. Measuring this incoming Cosmic Microwave Background (CMB) radiation reveals it to be extremely uniform in all directions in the sky.

Extremely uniform, but not perfectly uniform. The CMB radiation varies very slightly with the direction from which it is viewed; the variation is about a ten-thousandth of a degree of temperature. These small temperature fluctuations in the currently arriving radiation indicate fluctuations in the density of the matter of the early plasma, and so they are probably the origin of what later became today’s galaxies and the voids between them, because the high-density regions contracted under the pull of gravity and formed stars. The temperature fluctuations, in turn, probably began much earlier as quantum effects.
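The cooling figures just mentioned also fix how much space has stretched since the ancient light was released. As an illustration only, the sketch below uses the 3,000 K and 2.73 K figures quoted above, together with the standard idealization that the radiation's temperature falls in exact inverse proportion to the cosmic scale factor:

```python
# Radiation temperature scales as 1/a, where a is the cosmic scale factor.
T_recombination = 3000.0   # kelvins, when the universe became transparent
T_today = 2.73             # kelvins, measured CMB temperature today

# Factor by which wavelengths (and distances) have grown since recombination.
stretch = T_recombination / T_today
print(f"Space has stretched by a factor of about {stretch:.0f}")  # about 1099
```

This stretch factor of roughly 1,100 is what cosmologists call the redshift of the CMB.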

After the early rapid expansion ended, the universe’s expansion rate became constant and comparatively low for billions of years. This rate is now accelerating slightly because there is another source of expansion: the repulsion of dark energy. The influence of dark energy was insignificant for billions of years, but its key feature is that it does not dilute as space expands. So, finally, after about seven or eight billion years of expansion following the Big Bang, dark energy became an influential factor and started to significantly accelerate the expansion. Its influence continues to grow; for example, the diameter of today’s observable universe will double in about 10 billion years. This influence from dark energy is shown in the above diagram by the presence of the curvature that occurs just below and before the abbreviation “etc.” Future curvature will be much greater. Most cosmologists believe this dark energy is the energy of space itself.

The initial evidence for dark energy came from observations in 1998 of Doppler shifts of supernovas. These observations are best explained by the assumption that distances between supernovas are increasing at an accelerating rate. Because of this rate increase, any receding galaxy cluster that is currently 100 light-years away from our Milky Way will be more than 200 light-years away in another 13.8 billion years, and it will be moving away from us much faster than it is now. One day, it will be receding so fast that it becomes invisible, because its recession speed will exceed the speed of light. Given enough time, every galaxy other than the Milky Way will become invisible. After that, the stars in the Milky Way will gradually become invisible, the more distant ones disappearing first. We will lose sight of all our neighbors.
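The 100-to-200 light-year claim can be checked with simple exponential growth, using the earlier remark that the observable universe's diameter doubles in about 10 billion years. This is only an illustrative sketch; it assumes the expansion is purely exponential at that doubling rate, which is an idealization of the dark-energy-dominated future:

```python
doubling_time = 10.0   # billions of years for distances to double (from the text)
elapsed = 13.8         # billions of years into the future
d0 = 100.0             # light-years, current distance to the receding cluster

# Exponential expansion: d(t) = d0 * 2**(t / doubling_time)
d_future = d0 * 2 ** (elapsed / doubling_time)
print(f"Distance after {elapsed} billion years: about {d_future:.0f} light-years")
```

The result, roughly 260 light-years, is indeed "more than 200 light-years."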

The universe is currently expanding, so everything is moving a bit away from everything else. The effect is not currently significant except at the level of galaxy clusters moving farther from other galaxy clusters as new space is created between them. But the effect is accelerating, and so someday all solar systems, and ultimately even all configurations of elementary particles, will expand and break apart. We approach the heat death of the universe, the big chill.

The term “our observable universe” and the synonymous term “our Hubble Bubble” refer to everything that some person on Earth could in principle observe. Cosmologists presume that there are distant places in the universe from which an astronomer could see things that are not observable from here on Earth. Physicists agree that, because of this reasoning, there exist objects that are in the universe but not in our observable universe. Because those unobservable objects are also products of our Big Bang, cosmologists assume that they are similar to the objects we on Earth can observe: that those objects form atoms and galaxies, and that time behaves there as it does here. But there is no guarantee that this convenient assumption is correct. Occam’s Razor suggests it is correct, but that is the sole basis for the claim. So, it is more accurate to say the classical Big Bang Theory implies that the observable universe once was extremely small, dense, hot, and so forth.

Because the Big Bang happened about 13.8 billion years ago, you might think that no observable object can be more than 13.8 billion light-years from Earth, but this inference fails to take into account that the universe has been expanding all that time. The distance between galaxies has been increasing while the light was in transit. That is why astronomers can see about 45 billion light-years in any direction, and not merely 13.8 billion light-years.
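The 45-billion-light-year figure can be reproduced by adding up the light's path through the expanding space. The sketch below numerically integrates the standard flat Lambda-CDM comoving-distance formula D = c ∫ dz / H(z); the Hubble constant and density parameters used are typical published values, assumed here for illustration rather than taken from this article:

```python
import math

C = 299792.458   # speed of light, km/s
H0 = 67.7        # Hubble constant, km/s per megaparsec (assumed typical value)
OMEGA_M = 0.31   # matter density parameter (assumed)
OMEGA_L = 0.69   # dark-energy density parameter (assumed)

def comoving_distance_gly(z_max=3000.0, steps=100_000):
    """Comoving distance out to redshift z_max, in billions of light-years,
    by trapezoidal integration of c * dz / H(z) in a flat Lambda-CDM model."""
    def inv_E(z):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)
    dz = z_max / steps
    total = 0.5 * (inv_E(0.0) + inv_E(z_max))
    for i in range(1, steps):
        total += inv_E(i * dz)
    d_mpc = (C / H0) * total * dz    # distance in megaparsecs
    return d_mpc * 3.2616e6 / 1e9    # 1 megaparsec = 3.2616 million light-years

print(f"Edge of the observable universe: about "
      f"{comoving_distance_gly():.0f} billion light-years away")
```

The integral comes out near 45 billion light-years, even though the light itself traveled for only 13.8 billion years.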

When contemporary physicists speak of the age of our universe and of the time since our Big Bang, they are implicitly referring to cosmic time measured in the cosmological rest frame. This is time measured in a unique reference frame in which the average motion of all the galaxies is stationary and the Cosmic Microwave Background radiation is as close as possible to being the same in all directions. This frame is not one in which the Earth is stationary. Cosmic time is time measured by a clock that would be sitting as still as possible while the universe expands around it. In cosmic time, t = 0 years is when the Big Bang began, and t = 13.8 billion years is our present. If you were at rest at the spatial origin in this frame, then the Cosmic Microwave Background radiation on a very large scale would have about the same average temperature in any direction.

The cosmic rest frame is a unique, privileged reference frame for astronomical convenience, but there is no reason to suppose it is otherwise privileged. It is not the frame sought by the A-theorist who believes in a unique present, nor by Isaac Newton, who believed in absolute rest, nor by James Clerk Maxwell, who believed in an aether that is at rest and that waves whenever light passes through it.

The cosmic frame’s spatial origin point is described as follows:

In fact, it isn’t quite true that the cosmic background heat radiation is completely uniform across the sky. It is very slightly hotter (i.e., more intense) in the direction of the constellation of Leo than at right angles to it…. Although the view from Earth is of a slightly skewed cosmic heat bath, there must exist a motion, a frame of reference, which would make the bath appear exactly the same in every direction. It would in fact seem perfectly uniform from an imaginary spacecraft traveling at 350 km per second in a direction away from Leo (towards Pisces, as it happens)…. We can use this special clock to define a cosmic time…. Fortunately, the Earth is moving at only 350 km per second relative to this hypothetical special clock. This is about 0.1 percent of the speed of light, and the time-dilation factor is only about one part in a million. Thus to an excellent approximation, Earth’s historical time coincides with cosmic time, so we can recount the history of the universe contemporaneously with the history of the Earth, in spite of the relativity of time.

Similar hypothetical clocks could be located everywhere in the universe, in each case in a reference frame where the cosmic background heat radiation looks uniform. Notice I say “hypothetical”; we can imagine the clocks out there, and legions of sentient beings dutifully inspecting them. This set of imaginary observers will agree on a common time scale and a common set of dates for major events in the universe, even though they are moving relative to each other as a result of the general expansion of the universe…. So, cosmic time as measured by this special set of observers constitutes a type of universal time… (Davies 1995, pp. 128-9).
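Davies's "one part in a million" figure is easy to verify. For speeds small compared with the speed of light, the time-dilation factor γ exceeds 1 by approximately v²/2c². A quick check using the 350 km/s figure from the quotation:

```python
import math

v = 350.0        # km/s, Earth's speed relative to the cosmic rest frame
c = 299792.458   # km/s, speed of light

# Exact special-relativistic time-dilation factor.
gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)

print(f"v/c = {v / c:.2e}")            # about 0.1 percent of light speed
print(f"gamma - 1 = {gamma - 1:.1e}")  # about 7e-07, roughly one part in a million
```

So, as Davies says, Earth's historical time coincides with cosmic time to an excellent approximation.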

It is a convention that cosmologists agree to use the cosmic time of this special reference frame, but it is an interesting fact and not a convention that our universe is so organized that there is such a useful cosmic time available to be adopted by the cosmologists. Not all physically possible spacetimes obeying the laws of general relativity can have this sort of cosmic time.

In the 2020s, the standard model of cosmology, and thus of the Big Bang, is known as the lambda-CDM model or Λ-CDM model. Lambda (Λ) is the cosmological constant representing the dark energy that accelerates the expansion, and CDM is cold dark matter. The cold dark matter is expected by some physicists to consist of as-yet-undiscovered weakly interacting massive particles, called WIMPs. A competing theory implies that the dark matter consists of fuzzy, ultralight particles called axions.

a. Cosmic Inflation

According to one somewhat popular revision of the classical Big Bang Theory, the cosmic inflation theory, the universe was created from quantum fluctuations in an inflaton field; the field then underwent a cosmological phase transition for some unknown reason, causing an exponentially accelerating expansion of space; and then, for some other unknown reason, the inflation stopped very soon after it began. After that, the universe continued expanding at a more or less constant rate for billions of years.

By the time that inflation was over, every particle was left in isolation, surrounded by a vast expanse of empty space extending in every direction. And then—only a fraction of a fraction of an instant later—space was once again filled with matter and energy. Our universe got a new start and a second beginning. After a trillionth of a second, all four of the known forces were in place, and behaving much as they do in our world today. And although the temperature and density of our universe were both dropping rapidly during this era, they remained mind-bogglingly high—all of space was at a temperature of 10¹⁵ degrees. Exotic particles like Higgs bosons and top quarks were as common as electrons and photons. Every last corner of space teemed with a dense plasma of quarks and gluons, alongside many other forms of matter and energy. After expanding for another millionth of a second, our universe had cooled down enough to enable quarks and gluons to bind together forming the first protons and neutrons (Dan Hooper, At the Edge of Time, p. 2).

About half of cosmologists do not believe in cosmic inflation. They hope there is another explanation of the phenomena that inflation theory explains. The theory provides an explanation for (i) why there is currently so little curvature of space on large scales (the flatness problem), (ii) why the microwave radiation that arrives on Earth from all directions is so uniform (the cosmic horizon problem), (iii) why point-like magnetic monopoles are not found almost everywhere (the monopole problem), and (iv) why we have been unable to detect the proton decay that has been predicted (the proton decay problem). It is difficult to solve these problems in any way other than by assuming inflation.

The theory of primordial cosmic strings has been the major competitor to the theory of cosmic inflation, but the above problems are more difficult to solve with strings and without inflation, and the anisotropies of the Cosmic Microwave Background (CMB) radiation are consistent with inflation but not with primordial cosmic strings. The theory of inflation is accepted by a great many members of the community of professional cosmologists, but it is not as firmly accepted as the Big Bang Theory. Princeton cosmologist Paul Steinhardt and Neil Turok of the Perimeter Institute are two of inflation’s noteworthy opponents, although Steinhardt once made important contributions to the creation of inflation theory. One of their major complaints is that at the time of the Big Bang there should have been a great many long-wavelength gravitational waves created, and we now have the technology that should have detected these waves, but we find no evidence for them.

According to the theory of inflation, assuming the Big Bang began at time t = 0, the epoch of inflation (the epoch of radically repulsive gravity) began at about t = 10⁻³⁶ seconds and lasted until about t = 10⁻³³ seconds, during which time the volume of space increased by a factor of 10²⁶, and any initial unevenness in the distribution of energy was almost entirely smoothed out, at least from the large-scale perspective, somewhat as blowing up a balloon removes its initial folds and creases so that a small section of it looks flat when viewed close up.
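The 10²⁶ volume factor can be translated into the "e-folds" that inflation theorists usually quote, where the number of e-folds N is the natural logarithm of the growth in linear size. The sketch below takes the article's volume figure at face value; note that some accounts quote 10²⁶ as the growth in linear size instead, which would triple the answer:

```python
import math

volume_factor = 1e26                      # growth in volume during inflation (from the text)
linear_factor = volume_factor ** (1 / 3)  # corresponding growth in linear size
e_folds = math.log(linear_factor)         # N = ln(a_end / a_start)

print(f"Linear size grew by about {linear_factor:.1e}")  # about 4.6e+08
print(f"That is about {e_folds:.0f} e-folds")            # about 20
```
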

Although the universe at the beginning of inflation was actually much smaller than the size of a proton, think of it instead as having been the size of a marble. Then during the inflation period this marble-sized object expands abruptly to a gigantic sphere whose radius is the distance that now would reach from Earth to the nearest supercluster of galaxies. This would be a spectacular change in something marble-sized.

The speed of this inflationary expansion was much faster than light speed. However, this fast expansion speed does not violate Einstein’s general theory of relativity because this theory places no limits on the speed of expansion of space itself.

At the end of that inflationary epoch, at about t = 10⁻³³ seconds, the inflation stopped. In more detail, what this means is that the explosive material decayed for some unknown reason, leaving only normal matter with attractive gravity. Meanwhile, our universe continued to expand, although now at a slow, nearly constant rate. It went into its “coasting” phase. Regardless of any previous curvature in our universe, by the time the inflationary period ended, the overall structure of space on the largest scales had very little spatial curvature, and its space was extremely homogeneous. Today, we see evidence that the universe is homogeneous on its largest scale.

But at the very beginning of the inflationary period, there surely were some very tiny imperfections due to the earliest quantum fluctuations in the inflaton field. These quantum imperfections inflated into small perturbations or slightly bumpy regions at the end of the inflationary period. The densest regions attracted more material than the less dense regions, and these dense regions would eventually turn into future galaxies. The less dense regions would eventually evolve into the voids between the galaxies. Those early quantum fluctuations have now left their traces in the very slight hundred-thousandth of a degree differences in the temperature of the cosmic microwave background radiation at different angles as one now looks out into space from Earth with microwave telescopes.

Let’s re-describe the process of inflation. Before inflation began, for some as yet unknown reason the universe contained an unstable inflaton field or false vacuum field. For some other, as yet unknown reason, this energetic field expanded and cooled and underwent a spontaneous phase transition (somewhat analogous to what happens when cooling water spontaneously freezes into ice). That phase transition caused the highly repulsive primordial material to hyper-inflate exponentially in volume for a very short time. To re-describe this yet again, during the primeval inflationary epoch, the gravitational field’s stored, negative, repulsive, gravitational energy was rapidly released, and all space wildly expanded. At the end of this early inflationary epoch at about t = 10⁻³³ seconds, the highly repulsive material decayed for some as yet unknown reason into ordinary matter and energy, and the universe’s expansion rate stopped increasing exponentially, and the expansion rate dropped precipitously and became nearly constant. During the inflationary epoch, the entropy continually increased, so the second law of thermodynamics was not violated.

Alan Guth described the inflationary period this way:

There was a period of inflation driven by the repulsive gravity of a peculiar kind of material that filled the early universe. Sometimes I call this material a “false vacuum,” but, in any case, it was a material which in fact had a negative pressure, which is what allows it to behave this way. Negative pressure causes repulsive gravity. Our particle physics tells us that we expect states of negative pressure to exist at very high energies, so we hypothesize that at least a small patch of the early universe contained this peculiar repulsive gravity material which then drove exponential expansion. Eventually, at least locally where we live, that expansion stopped because this peculiar repulsive gravity material is unstable; and it decayed, becoming normal matter with normal attractive gravity. At that time, the dark energy was there, the experts think. It has always been there, but it’s not dominant. It’s a tiny, tiny fraction of the total energy density, so at that stage at the end of inflation the universe just starts coasting outward. It has a tremendous outward thrust from the inflation, which carries it on. So, the expansion continues, and as the expansion happens the ordinary matter thins out. The dark energy, we think, remains approximately constant. If it’s vacuum energy, it remains exactly constant. So, there comes a time later where the energy density of everything else drops to the level of the dark energy, and we think that happened about five or six billion years ago. After that, as the energy density of normal matter continues to thin out, the dark energy [density] remains constant [and] the dark energy starts to dominate; and that’s the phase we are in now. We think about seventy percent or so of the total energy of our universe is dark energy, and that number will continue to increase with time as the normal matter continues to thin out. 
(World Science U Live Session: Alan Guth, published November 30, 2016 at https://www.youtube.com/watch?v=IWL-sd6PVtM.)

Before about t = 10⁻⁴⁶ seconds, there was a single basic force rather than the four we have now. The four basic forces (or basic interactions) are: the force of gravity, the strong nuclear force, the weak force, and the electromagnetic force. At about t = 10⁻⁴⁶ seconds, the energy density of the primordial field was down to about 10¹⁵ GeV, which allowed spontaneous symmetry breaking (analogous to the spontaneous phase change in which water cools enough to spontaneously change to ice); this phase change created the gravitational force as a separate basic force. The other three forces had not yet appeared as separate forces.

Later, at t = 10⁻¹² seconds, there was even more spontaneous symmetry breaking. First the strong nuclear force, then the weak nuclear force, and finally the electromagnetic force became separate forces. For the first time, the universe now had exactly four separate forces. At t = 10⁻¹⁰ seconds, the Higgs field turned on. This slowed down many kinds of particles by giving them mass so they no longer moved at light speed.

Much of the considerable energy left over at the end of the inflationary period was converted into matter, antimatter, and radiation, such as quarks, antiquarks, and photons. The universe’s temperature escalated with this new radiation; this period is called the period of cosmic reheating. Matter-antimatter pairs of particles combined and annihilated, removing from the universe all the antimatter and almost all the matter. At t = 10⁻⁶ seconds, this matter and radiation had cooled enough that quarks combined together and created protons and neutrons. After t = 3 minutes, the universe had cooled sufficiently to allow these protons and neutrons to start combining strongly to produce hydrogen, deuterium, and helium nuclei. At about t = 379,000 years, the temperature was low enough (around 2,700 degrees C) for these nuclei to capture electrons and to form the initial hydrogen, deuterium, and helium atoms of the universe. With these first atoms coming into existence, the universe became transparent in the sense that short-wavelength light (about a millionth of a meter) was now able to travel freely without soon being absorbed by surrounding particles. Due to the expansion of the universe since then, this early light’s wavelength has increased and the light is today invisible on Earth because it is at a much longer wavelength than it was 379,000 years ago. That radiation is now detected on Earth as having a wavelength of 1.9 millimeters, and it is called the cosmic microwave background radiation or CMB. That energy is continually arriving at the Earth’s surface from all directions. It is almost homogeneous and almost isotropic.

As the universe expands, the CMB radiation loses energy; but this energy is not lost from the universe, nor is the law of conservation of energy violated. Energy is conserved because the same amount of energy goes into expanding the space.

In the literature in both physics and philosophy, descriptions of the Big Bang often speak of it as if it were the first event, but the Big Bang Theory does not require there to be a first event, an event that had no prior event. Any description mentioning the first event is a philosophical position, not something demanded by the scientific evidence. Physicists James Hartle and Stephen Hawking once suggested that looking back to the Big Bang is just like following the positive real numbers back to ever-smaller positive numbers without ever reaching the smallest positive one. There isn’t a smallest positive number. If Hartle and Hawking are correct that time is strictly analogous to this, then the Big Bang had no beginning point event, no initial time.

The classical Big Bang Theory is based on the assumption that the universal expansion of clusters of galaxies can be projected all the way back to a singularity, to a zero volume at t = 0. The assumption is faulty. Physicists now agree that the projection to a smaller volume must become untrustworthy for any time less than the Planck time. If a theory of quantum gravity ever gets confirmed, it is expected to provide more reliable information about the Planck epoch from t = 0 to the Planck time, and it may even allow physicists to answer the questions, “What caused the Big Bang?” and “Did anything happen before then?”

For a short lecture by Guth on these topics aimed at students, see https://www.youtube.com/watch?v=ANCN7vr9FVk.

b. Eternal Inflation and Many Worlds

Although there is no consensus among physicists about whether there is more than one universe, many of the Big Bang inflationary theories are theories of eternal inflation, of the eternal creation of more Big Bangs and thus more universes. The theory is called the theory of chaotic inflation, the theory of the inflationary multiverse, and occasionally the many-worlds theory or the multiverse theory (although these last two names are also used for Hugh Everett’s quite different quantum theory). The key idea is that once inflation gets started it cannot easily be turned off.

The inflaton field is the fuel of our Big Bang and of all of the other Big Bangs. Advocates of eternal inflation say that not all the inflaton fuel is used up in producing just one Big Bang, so the remaining fuel is available to create other Big Bangs, at an exponentially increasing rate because the inflaton fuel increases much faster than it gets used. Presumably, there is no reason why this process should ever end, so there will be a potentially infinite number of universes in the multiverse. Also, there is no good reason to suppose our actual universe was the first one. Technically, whether one Big Bang occurred before or after another is not well defined.

A helpful mental image here is to think of the multiverse as a large, expanding space filled with bubbles of all sizes, all of which are growing. Each bubble is its own universe, and each might have its own physical constants, its own number of dimensions, even some laws of physics different from ours. In some of these universes, there may be no time at all. Regardless of whether a single bubble universe is inflating or no longer inflating, the space between the bubbles is inflating and more bubbles are being born at an exponentially increasing rate. Because the space between bubbles is inflating, nearby bubbles are quickly hurled apart. That implies there is a low probability that our bubble universe contains any empirical evidence of having interacted with a nearby bubble.

After any single Big Bang, eventually the hyper-inflation ends within that universe. We say its bit of inflaton fuel has been used up. However, after the hyper-inflation ends, the expansion within that universe does not. Our own expanding bubble was produced by our Big Bang 13.8 billion years ago. It is called the Hubble Bubble.

The inflationary multiverse is not the quantum multiverse predicted by the many-worlds interpretation of quantum theory. The many-worlds interpretation says every possible outcome of a quantum measurement persists in a newly created world, a parallel universe. If you turn left when you could have turned right, then two universes are instantly created, one in which you turned left, and a different one in which you turned right. A key feature of both the inflationary multiverse and the quantum multiverse is that the wave function does not collapse when a measurement occurs. Unfortunately both theories are called the multiverse theory as well as the many-worlds theory, so a reader needs to be alert to the use of the term. The Everettian Theory is the theory of the quantum multiverse but not of the inflationary multiverse.

The original theory of inflation was created by Guth and Linde in the early 1980s. The theory of eternal inflation with a multiverse was created by Linde in 1983 by building on some influential work by Gott and Vilenkin. The multiplicity of universes of the inflationary multiverse also is called parallel worlds, many worlds, alternative universes, alternate worlds, and branching universes—many names denoting the same thing. Each universe of the multiverse normally is required to use some of the same physics (there is no agreement on how much) and all the same mathematics. This restriction is not required by a logically possible universe of the sort proposed by the philosopher David Lewis.

New energy is not required to create these inflationary universes, so there are no implications about whether energy is or is not conserved in the multiverse.

Normally, philosophers of science say that what makes a theory scientific is not that it can be falsified (as the philosopher Karl Popper proposed), but rather that there can be experimental evidence for it or against it. Because it is so difficult to design experiments that would provide evidence for or against the multiverse theories, many physicists complain that their fellow physicists who are developing these theories are doing technical metaphysical speculation, not physics. However, the response from defenders of multiverse research is usually that they can imagine someday, perhaps in future centuries, running crucial experiments, and, besides, the term physics is best defined as being whatever physicists do professionally.

5. Infinite Time

Is time infinitely divisible? Yes, because general relativity theory and quantum theory require time to be a continuum. But this answer will change to “no” if these theories are eventually replaced by a Core Theory that quantizes time. “Although there have been suggestions by some of the best physicists that spacetime may have a discrete structure,” Stephen Hawking said in 1996, “I see no reason to abandon the continuum theories that have been so successful.” Twenty-five years later, the physics community had become much less sure that Hawking was correct.

Did time begin at the Big Bang, or was there a finite or infinite time period before our Big Bang? The answer is unknown. There are many theories that imply an answer to the question, but the major obstacle in choosing among them is that the theories cannot be tested practically.

Stephen Hawking and James Hartle said the difficulty of knowing whether the past and future are infinite in duration turns on our ignorance of whether the universe’s positive energy is exactly canceled out by its negative energy. All the energy of gravitation and spacetime curvature is negative. If the total of the universe’s energy is non-zero and if quantum mechanics is to be trusted, including the law of conservation of energy, then time is infinite in the past and future. Here is the argument for this conclusion. The law of conservation of energy implies that energy can change forms but that the total cannot change; so, if the total is non-zero now, it can never become zero in the future and can never have been zero in the past, for any change between a zero and a non-zero total would violate the law. Therefore, if the total of the universe’s energy is non-zero and quantum mechanics is to be trusted, then there always have been states whose total energy is non-zero, and there always will be. That suggests there can be no first instant or last instant, and thus that time is eternal.

There is no solid evidence that the total is non-zero, but a slim majority of the experts favor a non-zero total, though their confidence in this is not strong. Assuming the total is non-zero, there is no favored theory of the universe’s past, but the favored theory of its future is the big chill theory. The big chill theory implies the universe just keeps getting chillier forever as space expands and its contents become more dilute, so there always will be changes and thus new events produced from old events.

Here are more details of the big chill theory. The last star will burn out in 10¹⁵ years. Then all the stars and dust within each galaxy will fall into black holes. Then the material between galaxies will fall into black holes as well, and finally, in about 10¹⁰⁰ years, all the black holes will evaporate, leaving only a soup of elementary particles that gets less dense and therefore “chillier” as the universe’s expansion continues. The microwave background radiation will redshift more and more into longer-wavelength radio waves. Future space will expand toward thermodynamic equilibrium. But because of vacuum energy, the temperature will only approach, and never quite reach, zero on the Kelvin scale. Thus the universe descends into a “big chill,” having the same amount of total energy it always has had.

Here is some final commentary about the end of time:

In classical general relativity, the Big Bang is the beginning of spacetime; in quantum general relativity—whatever that may be, since nobody has a complete formulation of such a theory as yet—we don’t know whether the universe has a beginning or not.

There are two possibilities: one where the universe is eternal, one where it had a beginning. That’s because the Schrödinger equation of quantum mechanics turns out to have two very different kinds of solutions, corresponding to two different kinds of universe.

One possibility is that time is fundamental, and the universe changes as time passes. In that case, the Schrödinger equation is unequivocal: time is infinite. If the universe truly evolves, it always has been evolving and always will evolve. There is no starting and stopping. There may have been a moment that looks like our Big Bang, but it would have only been a temporary phase, and there would be more universe that was there even before the event.

The other possibility is that time is not truly fundamental, but rather emergent. Then, the universe can have a beginning. …And if that’s true, then there’s no problem at all with there being a first moment in time. The whole idea of “time” is just an approximation anyway (Carroll 2016, 197-8).

Back to the main “Time” article for references and citations.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

British Empiricism

‘British Empiricism’ is a name traditionally used to pick out a group of eighteenth-century thinkers who prioritised knowledge via the senses over reason or the intellect and who denied the existence of innate ideas. The name most notably includes John Locke, George Berkeley, and David Hume. The counterpart to British Empiricism is traditionally considered to be Continental Rationalism, advocated by Descartes, Spinoza, and Leibniz, all of whom lived in Continental Europe beyond the British Isles and all of whom embraced innate ideas. This article characterizes empiricists more broadly as those thinkers who accept Locke’s Axiom that there is no idea in the mind that cannot be traced back to some particular experience. It covers British-Irish philosophy from the seventeenth, eighteenth, and nineteenth centuries. As well as exploring the traditional connections between empiricism, metaphysics, and epistemology, it examines how British empiricists dealt with issues in moral philosophy and with the existence and nature of God. The article identifies some challenges to the standard understanding of British Empiricism by including early modern thinkers from typically marginalised groups, especially women. Finally, in showing that there is nothing uniquely British about being an empiricist, it examines a particular case study of the eighteenth-century philosopher Anton Wilhelm Amo, the first African to receive a doctorate in Europe.

Table of Contents

  1. Introduction
    1. Historiography
  2. The Origins of Empiricism
    1. Precursors to Locke
    2. Locke
  3. Our Knowledge of the External World and Causation
    1. Berkeley on the Nature of the External World
    2. Hume on the Nature of Causation
    3. Shepherd on Berkeley and Hume
  4. Morality
    1. Hutcheson and the Moral Sense
    2. Hume on Taste and the Moral Sense
    3. Newcome on Pain, Pleasure, and Morality
  5. God and Free-Thinking
    1. Anthony Collins
    2. John Toland
    3. George Berkeley
  6. Anton Wilhelm Amo: A Case Study in the Limits of British Empiricism
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction

This article is called ‘British Empiricism’, but it could just as accurately have been titled ‘British-Irish Philosophy from the seventeenth to the nineteenth century and the Lockean Axiom’. The article focuses on the commitment to the Lockean version of the Peripatetic Axiom that is shared by many British and Irish thinkers in the seventeenth, eighteenth, and nineteenth centuries. Following John Locke (1632–1704), virtually all the empiricist thinkers considered in this article accept that “nothing is in the intellect that was not first in the senses” (De veritate q. 2 a. 3 arg. 19), to use Thomas Aquinas’s (1225–1274) phrasing of what is known as the Peripatetic Axiom (see Cranefield 1970 for more on the origin of the phrase).

While the shared acceptance of this axiom is a unifying feature for the thinkers considered in this article, it is worth starting off with some problematization of the term ‘British Empiricism’. The term ‘British’ here is used in a sense common in the early modern period which covers both what in the early twenty-first century is the United Kingdom of Great Britain and Northern Ireland and the Republic of Ireland—and thus includes thinkers such as the Ardagh-born John Toland (1670–1722) and the Kilkenny-born George Berkeley (1685–1753). The term ‘British’ here also excludes the many British colonies, meaning that the scope of this article is not global but Western European. Nor is the treatment of ‘empiricism’ here exhaustive, and empiricism itself is not confined to ‘Britain’. In other words, this article does not discuss all British thinkers who are committed to the Peripatetic Axiom, and nor do we claim that such a commitment only exists among British thinkers (see also section 6). We further problematize the term by discussing its historiography (section 1.1). This helps to explain why we chose to keep (and use) the term and how the issues and thinkers considered in this article were selected. After all, it is important to be transparent about the fact that an article like this, which focuses on a philosophical tradition, tells a particular story. This inevitably involves authorial choices, concerning both the protagonists and the content considered, that are shaped by factors like the authors’ own introduction to that tradition; we outline these choices below.

Section 2 considers the history of the Peripatetic axiom and Locke’s interpretation of it, which here is called the Lockean Axiom.

Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience.

Subsequent sections consider how this axiom, accepted in some form by all the thinkers below, was applied to a variety of questions. Section 3 discusses its application to our knowledge of the external world, focusing on George Berkeley (1685–1753), David Hume (1711–1776), and Mary Shepherd (1777–1847). Section 4 focuses on how the axiom influenced moral philosophy in the period, focusing on Hume, Francis Hutcheson (1694–1746), and Susanna Newcome (1685–1763). Section 5 examines the application of the axiom to our knowledge of God, focusing on Berkeley, Toland, and Anthony Collins (1676–1729). The final section (section 6) focuses on the limitations of the narrative developed here by considering the case of Anton Wilhelm Amo (c. 1703–1759). Amo is committed to a version of the Lockean Axiom, and thus there is a strong reason to consider him within the narrative developed here. However, including Amo comes at the price of challenging the moniker ‘British’ and thus another feature that determined the selection.

In other words, the purpose of including Amo is twofold. First, it highlights the limits of our narrative. Second, it points to the arbitrary nature of any historical narrative concerning ‘British Empiricism’. This results from the fact, which we highlight in the next section, that (‘British’) ‘Empiricism’ is an external ascription applied by scholars to certain philosophers – not an expression of a common identity these philosophers took themselves to share. In other words, it is an analyst’s category and not an actors’ one. As such, any narrative using this category is always, more or less explicitly, guided by the assumptions, interests, values, and goals of the scholar or analyst employing it. In an attempt to be as transparent as possible about these assumptions, as well as to bolster the case for our arbitrariness claim, we consider the historiography of the term ‘empiricism’ in the next section. This will also serve to shed further light on the nature and scope of the narrative we develop here, and the ways in which it deviates from the standard narrative.

a. Historiography

A crucial thing to note about both the term ‘British Empiricism’ and what is traditionally thought of as its counterpart ‘Continental Rationalism’ is that they are both anachronisms in the previously introduced sense of being analysts’, and not actors’, categories. To put it differently, none of the thinkers considered in this article, nor thinkers like René Descartes (1596–1650), Baruch Spinoza (1632–1677), or Gottfried Wilhelm Leibniz (1646-1716), who are usually thought of as ‘rationalists,’ used these terms to describe themselves. These thinkers did not think of themselves as working in unified traditions that were opposed to each other. Take the case of Berkeley for instance: while Berkeley critically reacts to Descartes (for example, Letter 44), he is even more critical of Locke. As a case in point, consider his rejection of the existence of abstract ideas in the Introduction to A Treatise Concerning the Principles of Human Knowledge. In fact, we know of no place in Berkeley’s work where he would clearly suggest that he sees himself working in some sort of tandem with Locke, against the likes of Descartes or Leibniz. Leibniz even writes about Berkeley’s Principles that “[t]here is much here that is correct and close to my own view” (AG 307). At the same time Leibniz defends the notion of innate ideas against Locke (see New Essays, G VI), but he also has a critical attitude towards Cartesianism on a variety of issues (see, for example, Anfray 2019 for a concise overview). In summary, the interrelations between these various actors (Berkeley, Locke, Descartes, and Leibniz in this instance) are complex; and it would be a stretch to suggest they saw themselves in two opposing camps.

The fact that it is highly doubtful that ‘empiricists’ (and ‘rationalists’) perceived themselves as such is important. It raises the question of why it is still often taken to be the case that there were two antagonistic philosophical traditions in early modern Europe, epitomized by Descartes, Leibniz, and Spinoza on the one hand, and Berkeley, Hume, and Locke on the other. What is more, there is evidence that the contrast between these traditions, as we know it today, was invented in the 1850s by the German historian Kuno Fischer (1824–1907) (see Mercer 2020, 73; for more on the rise of these labels see also Loeb 2010, Norton 1981, Vanzo 2016).

However, despite its complicated history, and further potential challenges which we discuss towards the end of this section, we believe retaining the label ‘British Empiricism’ is fruitful as long as one is fully aware of the fact that it is an analyst’s category. Importantly, there needs to be transparency about the criteria that are used to group certain thinkers together. In our case, the thinkers considered here are all, with one exception, British or Irish in the previously outlined sense and share a commitment to the Lockean Axiom, that ‘there is no idea in the mind that cannot be traced back to some particular experience’. This axiom was developed in response to the notion that humans possess innate ideas or innate knowledge (whether that be of mathematical/geometrical truths, or of God), which had previously been endorsed by Plato and was defended in the seventeenth century by thinkers like Descartes, later Cartesians such as Nicolas Malebranche (1638–1715), and Leibniz (for Locke’s metaphysics and epistemology, see, for example, Ayers 1991, Bennett 1971, Chappell 1992, Jolley 1999, Mackie 1976, Yolton 1956, Wilson 1999).

Locke, and subsequent thinkers who would go on to be characterised as empiricists, rejected this innatist notion. Indeed, it is standard to view responses to this question, of whether there are innate ideas in the human mind, as a central dividing line between empiricists and rationalists more generally. Thus, in an attempt to bridge the gap between the old standard narrative and new ways of speaking about the history of early modern philosophy, we keep this starting point, yet use it to tell a different story in terms of the actors and issues considered. This we deem to be important because of the exclusionary tendencies of the traditional early modern canon. By this we mean the fact that the voices of women and other marginalized groups were often systematically excluded when the early modern canon was formed (not to mention that many of the philosophers that became part of the canon held problematic views on issues pertaining to sex, gender, class, race, or species) (see, for example, O’Neill 1998; Conley 2006; Shapiro 2016; Hutton 2021; Lapointe and Heck 2023). Thus, it is crucial that any new narrative about ‘British Empiricism’ considers non-canonical (that is, traditionally underrepresented) thinkers as well. With that in mind, our decision to focus on the Lockean Axiom is significant because it allows us to integrate non-canonical thinkers such as Collins, Toland, Shepherd, and Newcome alongside the traditional ‘big three’ of Locke, Berkeley, and Hume. Additionally, focusing on this axiom enables us to consider a larger variety of issues compared to the standard narrative, which focuses primarily on our knowledge of the external world (covered in section 3). For, as will become evident in the subsequent sections, the interests of even Berkeley, Locke, and Hume go well beyond this epistemological issue and encompass, for example, theological and moral questions.

Yet, even if our narrative is more inclusive than the standard story, it is nonetheless important to note its limitations. In closing this section, we illustrate this point with the case of comparatively well-known British women philosophers from the early modern period who do not neatly fall into the category of ‘empiricism’ – either in our use of the term or in its more traditional sense.

It might seem obvious that an article focusing on the Lockean Axiom, as we have called it, does not discuss Margaret Cavendish (1623–1673). After all, Cavendish died over a decade before the Essay was published. However, a comprehensive account of philosophy in early modern Britain cannot afford to neglect such a prolific writer. Over her lifetime, Cavendish wrote numerous philosophical treatises, plays, and poems, as well as novels (perhaps most famously The Blazing World in 1668). Yet Cavendish, perhaps at this stage the most ‘canonical’ woman in early modern philosophy, does not fit neatly into either the ‘empiricist’ or ‘rationalist’ camp. She is critical of Descartes on several issues, including his views on the transfer of motion (which she rejects in favor of an account of self-motion as ubiquitous throughout nature) and his dualism (see her Observations upon Experimental Philosophy and Grounds of Natural Philosophy (both published in 1668); for discussion of Cavendish’s system of nature see Boyle 2017, Lascano 2023, Detlefsen 2006, Cunning 2016). But she is also committed to some (possibly weak) form of ‘innatism’ (discussed in section 2.2), whereby all parts of nature, including humans, have an innate knowledge of God’s existence. Note that (as discussed in section 2.1) there is a version of the story of ‘empiricism’ that can be told that brings Thomas Hobbes into the fold. Despite being contemporaneous with Hobbes, Cavendish’s metaphysical and epistemological commitments make it difficult to do the same with her. Thus, by framing the story of early modern British philosophy as one concerned with ‘empiricism’, there is a danger of excluding Cavendish. As recent scholars like Marcy Lascano (2023) have argued, this motivates developing alternative stories that feature Cavendish and other women as protagonists – ones that might focus on ‘vitalism’, for instance – alongside more traditional narratives.

Another case in point is Mary Astell (1666–1731). One way of telling the story of ‘empiricism’ is as a tradition that formed in opposition to Cartesianism. But if an opposition to Cartesianism is overemphasized, then a thinker like Astell is likely to fall through the cracks. For even though Astell was writing during Locke’s lifetime and critically engages with him when developing her views on education, love, and theology (see, for example, A Serious Proposal to the Ladies, Parts I and II. Wherein a Method is offer’d for the Improvement of their Minds from 1694 and 1697, or The Christian Religion, As Profess’d by a Daughter Of the Church of England from 1705), she is quite explicitly committed to a form of substance dualism that shares many features in common with that of Descartes (see Atherton 1993 and Broad 2015).

While it may be hard, as we have suggested, to incorporate Cavendish or Astell into a traditional ‘empiricist’ narrative, there are several thinkers that might more easily fit under that label. Take the case of Anne Conway (1631–1679), who is as critical of ‘rationalists’ like Descartes and Spinoza (along with other figures like Hobbes) in her Principles of the Most Ancient and Modern Philosophy (for example, chap. 7) as any of the ‘usual suspects’, such as Berkeley or Locke (for more on Conway’s philosophical system, see Hutton 2004; Thomas 2017; Lascano 2023). But since Conway is not focused on the Peripatetic Axiom, and instead wants to offer a philosophical system that can explain the nature of mind and matter as well as how God and creation are related, it is hard to place her in the narrative developed in this article. The same holds for someone like Damaris Masham (1658–1708) who – despite knowing Locke and corresponding with Leibniz and Astell – is not overly concerned with the Lockean Axiom. Rather, Masham focuses on moral issues as well as love and happiness (see, for example, Discourse Concerning the Love of God from 1696 and her Occasional Thoughts from 1705), arguing for a notion of humans as social and rational beings (for more on Masham’s social philosophy, see Broad 2006 and 2019; Frankel 1989; Hutton 2014 and 2018; Myers 2013). Finally, our focus on the Lockean Axiom means that even someone like Mary Wollstonecraft is hard to incorporate into the narrative. While Wollstonecraft is deeply influenced by Locke’s views on education and love, which play an important role in the background of her Vindication of the Rights of Woman from 1792, her focus is on women’s rights. There is no obvious sense in which she is an ‘empiricist’ – on either a traditional conception of that term or the way we have conceived it in this article (that is, as committed to the Lockean Axiom) (see Bahar 2002; Bergès 2013; Bergès and Coffee 2016; Falco 1996; Sapiro 1992).

Wollstonecraft’s case is of particular interest because it illustrates that one can even be a Lockean of sorts and still not fit the bill, as it were. In turn, this emphasizes that any narrative that scholars develop will have to make tough choices about who to include, which is why it is so important to be transparent about the reasoning behind these choices. We strongly believe that this must be kept in mind when reading this article and engaging in both teaching and scholarship in the history of philosophy more generally.

In sum, we have strived to present here a narrative that does justice to the existing tradition while correcting some of its main flaws (in particular, its exclusionary tendencies) in terms of issues and thinkers considered. Nonetheless, it is important to be mindful of the fact that this narrative is just one of many stories that could be told about British philosophy from the seventeenth to the nineteenth century. After all, each narrative – no matter its vices and virtues – will have to deal with the fact that it is arbitrary in the sense of being the product of a particular analyst’s choices. It might well be the case that other scholars deem it better to forgo these labels altogether in research and teaching (see, for example, Gordon-Roth and Kendrick 2015).

2. The Origins of Empiricism

a. Precursors to Locke

As noted in the previous section, this article on ‘British Empiricism’ will focus on a particular narrative that takes Locke’s Essay Concerning Human Understanding as a starting point for the ‘British empiricist’ tradition. Inevitably, there is a degree of arbitrariness in this decision – as we suggested in the previous section, such is the case with any historical narrative that chooses some thinkers or ideas and not others. Nonetheless, we think that this particular narrative has the theoretical virtue of allowing us to expand the canon of ‘British empiricism’ and discuss a greater range of topics (covering moral philosophy and theology, for example, as well as epistemology and metaphysics).

Even if ‘empiricism’ is tied to an acceptance of some version of the ‘Peripatetic Axiom’ (as it is in this article), it is important to note that ‘empiricism’ is neither uniquely British nor a uniquely early modern phenomenon, and Locke was not the first early modern thinker to draw heavily from the ‘Peripatetic Axiom’ in his approach to knowledge. In this section, we briefly outline the history of the ‘Peripatetic Axiom’ prior to Locke before introducing Locke’s usage of it as espoused in the Essay. We do so by charting the emergence of this ‘Peripatetic Axiom’ which, in a very general form, is as follows:

Peripatetic Axiom: there is nothing in the intellect not first in the senses.

The name comes from the axiom’s association with Aristotle (see Gasser-Wingate 2021), the ‘Peripatetic’ philosopher; so-called because he liked to philosophise while walking. We will argue that, in the hands of Locke, the Peripatetic Axiom, which has a long history, was turned into the Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience (which we discuss in greater detail in section 2.2).

Prior to Locke, the axiom can be found in the writings of medieval Aristotelian writers including Thomas Aquinas (1225–1274) and Roger Bacon, other early modern writers like Thomas Hobbes (1588–1679), and perhaps even in the work of Ancient Greek thinkers like Aristotle (ca. 384-322 BCE) and Heraclitus (ca. 500 BCE). Our contention is that, in Locke’s Essay, the Peripatetic Axiom took on a particular shape that would go on to be hugely influential in seventeenth- and eighteenth-century philosophy, especially in Britain. One reason for this is that Locke’s Essay was extremely widely read in Britain; for example, it was a standard set text for philosophy in British universities.

For the purposes of the discussion in this article, we take empiricists to be those thinkers who are committed, in some form or another, to the view that all knowledge (everything that is ‘in the mind’) can be traced back to some kind of experience. Often, ‘experience’ is construed in terms of sense-perception, although, as we will find, in Locke’s Essay, ‘experience’ covers both outward sense experience and inward, introspective experience of the operations and contents of one’s own mind – what Locke calls ‘reflection’ (Essay 2.1.2). Thus, Locke can be thought of as having expanded the scope of what can be ‘experienced’, compared to many of his early modern, medieval, and ancient predecessors.

There is some evidence of something close to a commitment to ‘empiricism’ – perhaps a kind of ‘proto-empiricism’ – in Pre-Socratic writers such as Heraclitus, Empedocles (ca. 495–435 BCE), or Xenophanes (ca. 570–475 BCE), although their writings make it hard to determine whether they were committed to a recognisable form of the Peripatetic Axiom or were simply resistant to thinkers like Parmenides (ca. 515–445 BCE), who argued that the senses are unreliable and that a priori reasoning is the only appropriate way to grasp the nature of reality. Similarly, Aristotle rejects his teacher Plato’s (427–347 BCE) account of knowledge as recollection and the theory of innate ideas that follows from it. Plato had argued that our knowledge of, for example, mathematical principles is in fact knowledge of the Forms (Republic 510c1–511b2). The Forms – perfect, idealised, abstract entities which inhabit a ‘Realm of Forms’ distinct from our own world of sense experience – can be accessed, according to Plato, by recollection or intuition. Aristotle rejects this account of knowledge as recollection (for example, APo. 100a) – a move that would later be repeated by Locke in his own discussion of innate ideas in Book I of the Essay. Instead, Aristotle claims that “to gain light on things imperceptible we must use the evidence of perceptible things” (EN 1104a13–14). Similarly, Aristotle rejects the idea, found in thinkers like Parmenides and Plato, that reality can be understood through a priori reasoning alone, claiming instead that “we should accept what is evident to the senses rather than reasoning” (GA 760b29–33). Like later thinkers who accept the Peripatetic Axiom, such as Locke and Hume, Aristotle argues that – since inquiry is limited by what we are able to experience – when it comes to certain observable phenomena, we may, at best, be able to arrive at possible causes (Meteor 344a5–7).

In medieval thought, we begin to find explicit formulations of the Peripatetic Axiom. Note that, despite being called ‘Peripatetic’, the axiom is more explicitly articulated by later followers of Aristotle. Perhaps the most famous follower of Aristotle in Western philosophy, Thomas Aquinas, claims that “without sense perception no one can either learn anything new, nor understand matters already learned” (In DA 3.13 [para. 791]). In other words, according to Aquinas, we only learn new things via sense-perception. Clearly, this implies that there is nothing (new) in the mind that is not first in the senses. Similarly, another medieval thinker who pre-empts some of the ideas that would go on to be central to Locke’s view, Roger Bacon (1215–1292), writes that “without experience nothing can be sufficiently known” (OM 6.1). This is not quite the same as the claim that there is no knowledge (at all) without experience, but is still an endorsement of the crucial, necessary role that experience plays in knowledge acquisition that is central to the empiricist tradition.

Perhaps the most significant immediate precursor to Locke – in the context of the history of the Peripatetic Axiom – is Thomas Hobbes. Hobbes commits himself to the Peripatetic Axiom when he writes, in Leviathan (1651), that “there is no conception in a man’s mind, which hath not at first, totally, or by parts, been begotten upon the organs of Sense” (Leviathan, 1.1). Indeed, arguably one could tell a somewhat different story of early modern (or even ‘British’) ‘empiricism’ that takes Hobbes as its starting point. As Peter Nidditch explains, Hobbes (along with the French philosopher Pierre Gassendi (1592–1655)) “first produced in the modern era, especially in his Leviathan and De Corpore, a philosophy of mind and cognition that built on empiricist principles” (Nidditch 1975, viii). Nidditch goes on to suggest, speculatively, that it is most likely Hobbes’ reputation – as a highly unorthodox thinker, at best, and a secret atheist, at worst – that prevented him, retrospectively, from being seen as the ‘father of empiricism’ in the standard narrative. Whatever the explanation, it is Locke rather than Hobbes who would go on to be widely read and highly influential in Britain, and elsewhere, in the seventeenth and eighteenth centuries. As Nidditch puts it: “The Essay gained for itself a unique standing as the most thorough and plausible formulation of empiricism – a viewpoint that it caused to become an enduring powerful force” (Nidditch 1975, vii). Due to the Essay’s widespread influence, we focus on the role that Locke, rather than Hobbes, played in the development of British thought during these centuries; a role which would go on to be seen as so important that it even became possible, in hindsight, to speak of a more or less unified group and label it ‘British empiricism’. As we have suggested, there is a story to be told about Hobbes and empiricism, but it is one that, for the most part, we do not tell here (see section 1).

b. Locke

As was noted in the introduction, the question of whether there are innate ideas in the human mind is often seen as a central dividing line between empiricism and rationalism as they are standardly construed. While we pointed out the various issues with this standard narrative, our narrative also makes use of the issue of innatism. Crucially, though, our focus is less on finding a dividing line and more on finding a common denominator in the views of mainly ‘British’ and ‘Irish’ philosophers (for more on issues concerning the ‘British’ moniker, see section 6). With that in mind, let us turn to the issue of innatism and the way Locke deals with it.

Locke characterises his innatist opponents’ position like so: “It is an established Opinion amongst some Men, That there are in the Understanding certain innate Principles; some primary Notions…as it were stamped upon the Mind of Man, which the Soul receives in its very first Being; and brings into the world with it,” (Essay, 1.2.5).

Whether or not this is a fair characterisation of his opponents’ views, as Locke sees it, the term ‘innate’ suggests that, on the innatist account, human beings are quite literally born with some in-built knowledge – some principles or propositions that the mind need not acquire but already possesses. In short, on this view, prior to any experience – that is, at the very first instant of its having come into existence – the human mind knows something. Locke develops two lines of argument against the innatist position, which will be referred to in what follows as (1) the Argument from Superfluousness and (2) the Argument from Universal Assent.

The Argument from Superfluousness proceeds as follows:

It would be sufficient to convince unprejudiced Readers of the falseness of this Supposition, if I should only shew (as I hope I shall in the following Parts of this Discourse) how Men, barely [that is, only] by the Use of their natural Faculties, may attain to all the Knowledge they have, without the help of any innate Impressions. (Essay, 1.2.1)

Locke’s point here is that all it takes to convince an ‘unprejudiced reader’ (that is, one who is willing to be swayed by reasonable argument) of the falseness of innatism is evidence that all knowledge can be traced back to instances in which our human “natural Faculties” – that is, our faculties of sense-perception and reflection – were in use. This argument thus depends upon the plausibility of Locke’s claim that all knowledge can be traced back to some kind of experience. We leave aside the Argument from Superfluousness for the moment since we discuss this claim in greater detail below.

In contrast, the Argument from Universal Assent is a standalone argument that does not depend upon any additional claims about the sources of human knowledge. Locke claims that if the human mind possessed certain principles innately then there would surely have to be certain spoken or written propositions that all human beings would assent to. In other words, if there were an innate principle X such that all human beings, regardless of their lives and experiences, knew X, then when confronted with a written or verbal statement of X (“X”), all human beings would agree that “X” is true. For example, let us assume for the moment that murder is wrong is a principle that is innately known to the human mind. Locke’s point is that, if presented with a written or verbal statement of “murder is wrong”, surely all human beings would assent to it.

And yet, Locke argues, this does not seem to be true of this or any other principle (evidenced, for example, by the fact that people do, in fact, commit murder). He writes: “[this] seems to me a Demonstration that there are none such [innate principles of knowledge]: Because there are none to which all Mankind gives an Universal assent” (Essay, 1.2.4). If by ‘demonstrates’, here, Locke means that it logically follows that, since there are no universally assented-to propositions, there must not be any innately known principles, he is not quite right. For there might be other reasons why certain propositions are not universally assented to—perhaps not everyone understands the statements they are being presented with, or perhaps they are lying (perhaps murderers know murder is wrong, but commit it nonetheless). At best, the Argument from Universal Assent provides a probable case against innatism, or places the burden proof on the innatist to explain why there are no universally assented-to propositions, or else neutralises the converse view (which Locke thinks his opponents subscribe to; see Essay, 1.2.4) that the existence of innate principles can be proven by appealing to the existence of universally assented-to propositions. And, of course, Locke’s reasoning also depends upon the truth of the claim that there are, in fact, no universally assented=to propositions (perhaps people have just not had the chance to assent to them yet, because they have not yet been articulated). 
Given all these mitigating factors, it seems most charitable to suggest that Locke is simply hoping to point out the implausibility, or even absurdity, of the innatist position – especially given an increasing societal awareness, in the seventeenth century, of cultural relativity in societies and religions outside Europe (Essay, 1.4.8), not to mention the fact that neither Plato nor Aristotle, nor any other pre-Christians, would have assented to propositions like ‘God exists’ or ‘God is to be worshipped’ which, Locke claims, are paradigm cases of so-called ‘innate principles’ (Essay, 1.4.8).

Having, to his own satisfaction at least, provided one argument against the innatist position, Locke develops an account of the sources of human knowledge that supports the Argument from Superfluousness – by showing how all human knowledge can be traced back to some kind of experience. In contrast to innatists, Locke maintains that at birth the human mind is a blank slate or ‘tabula rasa’. If we picture the mind as a “white Paper, void of all characters”, Locke asks, “How comes it to be furnished?” (Essay, 2.1.2). He responds: “I answer, in one word, From Experience: In that, all our Knowledge is founded; and from that ultimately derives itself” (Essay, 2.1.2).

Locke then divides experience into two subcategories with respective mental faculties: ‘sensation’ and ‘reflection’ (Essay, 2.1.2). Concerning sensation, he writes:

Our Senses, conversant about particular sensible Objects, do convey into the Mind, several distinct Perceptions of things, according to those various ways, wherein those Objects do affect them: And thus we come by those Ideas, we have of Yellow, White, Heat, Cold, Soft, Hard, Bitter, Sweet, and all those which we call sensible qualities. (Essay, 2.1.3)

Our ideas of sensation, Locke explains, are those which pertain to the qualities of things we perceive via the (five) external senses: the objects of vision, touch, smell, hearing, and taste. But of course, this does not exhaust the objects of the mind – we can also have ideas of things that are not perceived by the ‘outward’ senses. As Locke writes:

The other Fountain, from which experience furnisheth the Understanding with Ideas, is the Perception of the Operations of our own Minds within us, as it is employ’d about the Ideas it has got; which Operations, when the Soul comes to reflect on, and consider, do furnish the Understanding with another set of Ideas, which could not be had from the things without: and such are, Perception, Thinking, Doubting, Believing, Reasoning, Knowing, Willing, and all the different actings of our own Mind. (Essay 2.1.4)

In a sense, then, Locke’s point is this: While we standardly talk as though we ‘experience’ only those things that can be perceived by the senses, in actual fact we also experience the operations of our own mind as well as things external to it. We can, that is, observe ourselves thinking, doubting, believing, reasoning, and so on – and we can observe ourselves perceiving, too (this claim is contentious: Do we really observe ourselves perceiving, or are we simply aware of ourselves perceiving?).

Locke’s aim is to establish that no object of knowledge, no ‘idea’ (Essay, 1.1.8), can fail to be traced back to one of these two ‘fountains’ of knowledge. Locke thereby commits himself to a particular formulation of the ‘Peripatetic Axiom’ (discussed in section 2.1). While the ‘Peripatetic Axiom’ – found in the medieval Aristotelians and in Hobbes – states that ‘there is nothing in the intellect not first in the senses,’ Locke’s claim, which is central to the way ‘empiricism’ is construed in this article, is:

Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience.

The Lockean Axiom would go on to be very influential in seventeenth- and eighteenth-century thought, especially in Britain.

3. Our Knowledge of the External World and Causation

This section focuses on the application of the Lockean Axiom (there is no idea in the mind that cannot be traced back to some particular experience) to our knowledge of the external world. In doing so it most closely resembles the standard narrative of ‘British empiricism’ because the focus rests on Berkeley’s rejection of materialism and Hume’s denial of necessary connection. However, in contrast to the standard narrative, we close this section by emphasizing how Mary Shepherd, who is said to have read Locke’s Essay when she was eight years old (Jeckyl 1894, 217), rejects both positions – although, as will become evident, in doing so she draws not from the Lockean Axiom but from two causal principles.

a. Berkeley on the Nature of the External World

In A Treatise Concerning the Principles of Human Knowledge (1710/34) and Three Dialogues between Hylas and Philonous (1713/34), Berkeley defends the doctrine he is most famous for: Immaterialism. In a nutshell, Berkeley holds that everything that exists is either an immaterial mind or an idea (for example, PHK §§ 25–27). Hence his commitment to the notorious dictum esse est percipi aut percipere (“To be is to be perceived or to perceive”) (compare NB 429, 429a; PHK § 3).

Two key features of his argument for immaterialism are Berkeley’s claims that the “existence of an idea consists in being perceived” (PHK § 3) and that “an idea can be like nothing but an idea” (PHK § 8). Since Berkeley is convinced that sense perception works via resemblance (for example, Works II, 129; TVV § 39) (see Fasko and West 2020; Atherton 1990; West 2021), and since we know that (most) objects of human knowledge are ideas – either “imprinted on the senses” or “formed by help of memory and imagination” (PHK § 1) – he argues that we can infer that the objects in the external world also must be ideas or collections of ideas (PHK §§ 1–8). After all, according to Berkeley, when we say something like ‘the table exists’, we mean that it can be perceived. And what is perceived is, after all, an idea (PHK § 3; see Daniel 2021; Fields 2011; Jones 2021; Rickless 2013; Saporiti 2006).

It is important to note that, in developing this argument, Berkeley implicitly draws on the Lockean Axiom that there is no idea that cannot be traced back to some particular experience. For Berkeley’s point is that our experience of the external world and its objects clearly suggests that they only exist when they are perceived. That is, when we trace back our ideas of things in the external world to the experiences we have of them, we come to understand that these ‘things’ are also ideas.

Berkeley fortifies his case for immaterialism by rejecting what is, to his mind, the only viable alternative: Materialism. More specifically, Berkeley argues against the existence of a (Lockean) material substance. In doing so, he again draws from the Lockean Axiom – and, in that sense, uses Locke’s own claim against him – by raising the question of whether we even have an idea of material substance in the first place. Berkeley then claims that even materialists, like Locke on his reading, must accept that we do not; for, as they themselves admit, there is nothing we can say about it (DHP 261). The reason we do not have an idea of material substance, Berkeley contends, is that there is no such thing in the first place and, thus, no experience of such a thing (and where there is no experience, there can be no idea). In fact, Berkeley believes that the very notion of such a thing would be “repugnant” (DHP 232; PHK § 17). As he puts it:

I have no reason for believing the existence of matter. I have no immediate intuition thereof: neither can I mediately from my sensations, ideas, notions, actions or passions, infer an unthinking, unperceiving, inactive substance, either by probable deduction, or necessary consequence. (DHP 233)

Even worse, assuming the existence of a material substance leads to skepticism concerning the existence of the external world and ultimately also God’s existence (that is, it leads to atheism; compare also PHK § 92), because it leaves one “ignorant of the true nature of every thing”, and indeed “you know not whether any thing really exists, or whether there are any true natures at all” (DHP 229). When challenged by his imagined opponent with the argument that we also have no idea of God or other minds (see also section 4.3) – and thus no reason to assume they exist – Berkeley appeals to the (first personal) experience we can have of these entities (DHP 233). This is consistent with the Lockean Axiom which, while it does entail that every idea can be traced back to an experience, does not entail that every experience must lead to an idea.

In sum, in arguing for his immaterialism Berkeley makes implicit use of the Lockean Axiom inasmuch as he draws from it to establish that the external world and its objects must consist of ideas, because our experience of the external world and its objects is such that they consist of perceivable things. The Lockean Axiom also plays a role in Berkeley’s argument against the existence of material substance, in that the lack of experience of matter is taken to explain the lack of a corresponding idea – and an analysis of the notion shows its repugnancy.

b. Hume on the Nature of Causation

 At least in the context of contemporary Western thought, Hume’s account of causation is perhaps one of the best known and most discussed theories to have come out of the early modern period (see, for example, Garrett 2015; Bell 2008; Beauchamp and Rosenberg 1981). In An Enquiry Concerning Human Understanding (1748), Hume sets out to demonstrate that causal relations – or what he calls ‘necessary connections’ – are not something that we experience in the world around us (see Noxon 1973 or Traiger 2006 for a discussion of the development of Hume’s thought and the relation between the Treatise and the EHU). Rather, Hume claims, we form the idea or concept of causation in our mind as a result of repeated experiences of ‘causes’ preceding ‘effects’, and the ‘sentiment’ that such repeated experiences generate in us (EHU 7). In other words, on Hume’s view, we feel as though certain events or objects (like smoke and fire) are necessarily connected, by a causal relation, because we see them occur in conjunction with one another repeatedly. But, strictly speaking, Hume argues, we do not experience any such causal relations and thus cannot know with certainty that the two things are necessarily connected – at best, we can have probable knowledge. What is important, for the concerns of this article, is that Hume’s reasoning for this view is premised upon a version of the Lockean Axiom: There is no idea in the mind that cannot be traced back to some particular experience. In other words, it is Hume’s ‘empiricism’ (in the sense that we have used the term in this article) that leads him to arrive at his skeptical account of causation. For an ‘empiricist’, knowledge is dependent upon experience – and Hume’s point in the EHU is that we cannot experience causation. We run through Hume’s argument in more detail below.

Hume begins section 2 of the EHU (where his discussion of the origin of ideas takes place) by establishing what has come to be known as ‘the Copy Principle’ (for further discussion, see Coventry and Seppalainen 2012; Landy 2006 and 2012). The Copy Principle concerns the relation between what Hume calls ‘impressions’ and ‘ideas.’ The crucial thing for our purposes is that, for Hume, ‘impression’ refers (amongst other things) to any direct experience or sense-perception we have of an external object. When I look outside my window and see the sun, for instance, I am receiving an ‘impression’ of the sun. That is, the sun is ‘impressing’ itself upon my sense organs, similarly to a stamp that impresses an insignia upon wax. ‘Ideas,’ on the other hand, are what are left behind, in the mind, by such impressions; Hume’s use of the term ‘idea’ is thus slightly different to that of Locke or Berkeley, who both use ‘idea’ in a way that also encompasses Humean impressions. When I remember the sun, as I lie in bed at night, I am having an ‘idea’ of the sun. And, similarly, if I lie in bed and imagine tomorrow’s sun, I am also forming an ‘idea’ of it. In terms of our experiences of them, impressions and ideas are differentiated by their degrees of vividness and strength: my impression of the sun, for instance, will be stronger and more vivid (perhaps brighter) than my idea of the sun. As Hume puts it:

These faculties [of memory and imagination] may mimic or copy the perceptions of the senses; but they never can entirely reach the force and vivacity of the original sentiment. The utmost we say of them, even when they operate with greatest vigour, is, that they represent their object in so lively a manner, that we could almost say we feel or see it: But, except the mind be disordered by disease or madness, they never can arrive at such a pitch of vivacity, as to render these perceptions altogether undistinguishable. (EHU 2.1, 17)

An idea might somewhat resemble the strength or vividness of an impression but, Hume claims, an idea of the sun and an impression of the sun (unless one’s mind is ‘disordered’) will never be entirely indistinguishable.

The Copy Principle entails that every (simple) idea is a copy of an impression. Hume writes:

It seems a proposition, which will not admit of much dispute, that all our ideas are nothing but copies of our impressions, or, in other words that it is impossible for us to think of anything which we have not antecedently felt, either by our external or internal senses. (EHU 7.1.4, 62)

This principle is strongly empiricist in character and closely related to both the Lockean Axiom and the Peripatetic Axiom, which states that there is nothing in the intellect not first in the senses. Like the Lockean Axiom, the Copy Principle (as articulated in this passage) tells us that if I have an idea of X, then I must previously have had an experience, or ‘impression’, of X.

For Hume, all of this makes the issue of where we get our idea of causation extremely pressing. Hume denies that we do in fact have any impressions of causation or ‘necessary connections’ between things:

When we look about us to external objects…we are never able in a single instance, to discover any power of necessary connexion; any quality which binds the effect to the cause, and renders one an infallible consequence of the other. We only find that one does actually, in fact, follow the other. (EHU 7.1.6, 63)

Consider the case of a white billiard ball rolling along a table and knocking a red ball. Hume asks: can you in fact experience or perceive the ‘necessary connection’ (or causal relation) that makes it the case that when the white ball knocks the red ball the red ball moves away? His answer is no: what you experience, strictly speaking, is a white ball moving and then a red ball moving. But if we do not have an impression of causation, in such instances, why do we have an idea of causation?

Hume concludes that while we do not have an outward impression of causation, because we repeatedly experience uniform instances of, for example, smoke following fire, or red balls moving away from white balls, we come to feel a new impression which Hume calls a ‘sentiment’. That is, we feel as though we are experiencing causation – even though, in strict truth, we are not. This new feeling or sentiment is “a customary connexion in the thought or imagination between one object and its usual attendant; and this sentiment is the original of that idea which we seek for” (EHU 7.2.30, 78). In other words, while our idea of causation or necessary connection cannot be traced back to a specific impression, it can nonetheless be traced back to experience more generally. Repeated uniform experience, Hume claims, induces us to generate the idea of causation – and is the foundation of our ‘knowledge’ of cause-and-effect relations in the world around us. In line with the Lockean Axiom, then, Hume’s view is that we would have no idea of causation, were it not for our experience of certain events or objects (‘causes’) regularly preceding others (‘effects’).

c. Shepherd on Berkeley and Hume

The previous subsections have established that Berkeley and Hume both draw on the Lockean Axiom – that there is no idea that cannot be traced back to some particular experience – in important ways. Both thinkers take the absence of particular experiences (of the external world or of causation) to entail not only that there is no corresponding idea but also that the things in question (material substance or necessary connections) do not exist. In this section we consider how Mary Shepherd rejects both Berkeley’s immaterialism and Hume’s skeptical account of causation. As will become evident, however, Shepherd does so not by drawing on the Lockean Axiom – which does not play any role in her account of the mind – but by using two causal principles that she introduces in her works. Shepherd is thus an example of the limits of the narrative developed here. For even though she conceives of Locke as her closest ‘philosophical ally’ (LoLordo 2020, 9), Shepherd concludes that, in order to refute Berkeley and Hume, one needs to consider the issue of causation first – and not issues concerning (mental) representation. For Shepherd believes that even (mental) representation and the mental content it allows for ought ultimately to be understood in causal terms.

Shepherd’s first causal principle, the so-called Causal Principle (CP), holds that “nothing can ‘begin its own existence’” (for example, ERCE 94). Second, the Causal Likeness Principle (CLP) states that “like causes, must generate like Effects” (for example, ERCE 194). It is important to note that the CLP is a biconditional, as Shepherd claims in her second book Essays on the Perception of an External Universe (1827) that “like effects must have like causes” (EPEU 99).

Shepherd defends both principles in her first book, Essay on the Relation of Cause and Effect (1824). The main aim of this work is to refute a Humean account of causation as constant conjunction. In particular, Shepherd wants to establish, against Hume, that causes and effects are necessarily connected (ERCE 10). While the details of Shepherd’s argument can be put aside for now, the crucial thing to note is that she does not draw from the Peripatetic Axiom or the Lockean Axiom. Instead, Shepherd focuses on rejecting Hume’s theory of mental representation and his claim that the possibility of separating cause and effect in thought tells us something about their actual relation (Bolton 2010 and 2019; Landy 2020a and 2020b). Crucially, this rejection of Hume, in turn, fortifies her case for her two causal principles – both of which play a crucial role in arguing against Berkeley.

Meanwhile, in rejecting Berkeley’s version of immaterialism, Shepherd contends that we have sensations of solidity and extension (EPEU 218) and, drawing from the CP, we know that these must have a cause. Since we know the mind to be one cause of sensations (for example, EPEU 14–15), there must also be another cause for these sensations. Thus, we can come to know that matter (which she also calls ‘body’) is the “continually exciting cause, for exhibition of the perception of extension and solidity on the mind in particular” (EPEU 155) and that matter is “unperceived extended impenetrability” (LMSM 697). In other words, the causal connection between our mental content and the external world allows Shepherd to draw inferences about its objects, which show them not to be ideational, that is, not to merely consist of ideas as Berkeley, for instance, would have it. (While Shepherd thus clearly rejects a Berkeleyan brand of immaterialism (see Atherton 1996; Rickless 2018), it is not clear whether she is opposed to all kinds of immaterialism whatsoever; as Boyle (2020, 101) points out, ‘(im-)material’ seems to be a “label” for capacities, and it is unclear whether more than capacities exist in Shepherd’s metaphysics.)

In sum, Shepherd is a fitting end point for this part of the narrative not only because she closely engages with Berkeley and Hume (and their applications of the Lockean Axiom) but also because Locke is such a close philosophical ally for her—although scholars have noted that Shepherd sometimes advances an idiosyncratic reading of Locke (Boyle 2023; LoLordo 2022). Even more to the point, Shepherd suggests that her theory is a ‘modified Berkeleian theory’ (LMSM 698) and thus aligns herself explicitly with a key figure of the ‘standard’ narrative of British empiricism.

Thus, despite the fact that the Lockean Axiom does not play a role in Shepherd’s argumentation, and in fact it is unclear what she thinks about it, there are good reasons to consider her within this narrative. For Shepherd’s philosophy focuses on key figures within this narrative to the point where she aligns herself implicitly and explicitly with at least two of them.

4. Morality

One of the most interesting upshots of the widespread acceptance of the Lockean Axiom, or what we might call Locke’s ‘empiricist’ philosophy, in Britain and Ireland during the eighteenth century is the effect it had on theorising about morality, specifically concerning the question of where we get our moral ideas (like good, bad, right, wrong, virtuous, and vicious) from. The Lockean Axiom dictates that there is no idea that cannot be traced back to some particular experience. While that might fit nicely with how we get our ideas of concepts like colour, sound, or touch (and any other ideas that can be traced to sense perception), ideas like justice/injustice, good/bad, or right/wrong do not seem to be easily traceable to some particular experience. It does not seem controversial to suggest that ‘redness’ or ‘loudness’ are qualities we can experience in the world around us, but it is much less obvious that we experience qualities such as ‘goodness’, ‘badness’, ‘rightness’, or ‘wrongness’. For a start, while – barring cases of, for example, blindness, deafness, or any other sensory deficiency – there is likely to be agreement about an object’s colour or the volume of a sound, there is, generally speaking, considerable disagreement when it comes to the goodness/badness or rightness/wrongness of an action. The same applies in the case of beauty and other aesthetic qualities, and there is a great deal that could be said about ‘empiricist’ approaches to aesthetics (we do not discuss these issues here, but for discussion of Hume’s aesthetics see, for example, Costello 2007, Gracyk 1994, Townsend 2001, and for discussion of Hutcheson’s aesthetics see, for example, Shelley 2013, Michael 1984, Kivy 2003).

This section looks at three thinkers’ views on morality and examines the role that the Lockean Axiom played in their theorising. All three are important figures in the history of (Western) ethics. Francis Hutcheson was one of the first philosophers to apply the Lockean Axiom to questions of morality and, though he was Irish born, would go on to be known as a central figure in the so-called ‘Scottish Enlightenment’ (his parents were Scottish Presbyterians and he would spend most of his career in Scotland). David Hume pre-empts discussions of utility in ethical theorising that would come to the fore in the work of Mill and Bentham and develops the idea of a sense of ‘taste’ which allows us to perceive the moral characteristics of persons and actions. Meanwhile, Susanna Newcome (1685-1763) has recently been identified (Connolly 2021) as one of the earliest thinkers to defend what is recognisably a form of utilitarianism.

a. Hutcheson and the Moral Sense

In An Inquiry into the Original of Our Ideas of Beauty and Virtue (1725), Francis Hutcheson explicitly acknowledges the indebtedness of his discussion of morality (as well as beauty) to Locke (for example, Inquiry, 1.VII). He begins the Inquiry by defining sensations as “[t]hose Ideas which are rais’d in the Mind upon the presence of external Objects, and their acting upon our Bodys” and adds that “We find that the Mind in such Cases is passive, and has not Power directly to prevent the Perception or Idea” (Inquiry, 1.I). A little later, Hutcheson explains that “no Definition can raise any simple Idea which has not been before perceived by the Senses” (Inquiry, 1. IV). In making these claims, Hutcheson is committing himself to a version of the Lockean Axiom, the claim that there is no idea in the mind that cannot be traced to some particular experience – strictly speaking, this should read ‘simple idea’, since Hutcheson’s view is that all simple ideas must be traced back to some experience – compound ideas might be the product of reason.

Hutcheson’s commitment to the Lockean Axiom leads him to conclude that humans have a “Moral Sense” (see Frankena 1955; Harris 2017) as well as external senses of seeing, hearing, touching, tasting, and smelling. In fact, in his Essay on the Nature and Conduct of the Passions and Affections (1742), Hutcheson claims we have a range of ‘internal’ senses including a “Publick Sense”, concerned with the happiness of others, a “Sense of Honour”, and a sense of “decency and dignity” (Essay, 5-30). This is understandable given that, for Hutcheson, a sensation is ‘an idea raised in the mind upon the presence of external objects’ – and it is external objects, or more often external people (and their actions), that raise in us ideas of right, wrong, good, bad, justice, or injustice.

In the Essay, Hutcheson lays out a line of reasoning which justifies this view: “If we may call every Determination of our Minds to receive Ideas Independently on our Will, and to have Perceptions of Pleasure and Pain, A SENSE, we shall find many other Senses besides those commonly explained” (Essay, 5). His point is this: a sense is a ‘determination’ or faculty of the mind by means of which it receives (passively) certain kinds of ideas. Our sense of vision, for instance, is where we get our visual ideas, for example, ideas of colour or brightness/darkness. Our olfactory sense is where we get our ideas of smell such as sourness, putridness, and so on. However, if we can identify ideas that cannot be traced back to one of the five external senses – vision, hearing, taste, touch, smell – Hutcheson argues, then there must be another sense, an internal sense, by means of which the mind has received that idea. Such is the case with our ideas of good, bad, right, wrong, and so on. Since these ideas cannot be traced to any of the five external senses – because we do not literally see, hear, taste, touch, or smell good or bad, or right or wrong – we can infer that there must be a moral sense by which the mind has received them. Hutcheson describes this moral sense as that by which “we perceive Virtue, or Vice in our selves, or others” (Essay, 20). That is, through our naturally built-in moral sense, humans can detect virtue and vice. Note that this view implies that virtue and vice, and relatedly notions like good, bad, right, wrong, justice, and injustice, are qualities out there to be sensed. But what is it exactly that we are perceiving with our moral sense? And how does the human mind perceive virtue and vice in ourselves and other people?

For Hutcheson, the answer is that our ideas of virtue, vice, and other moral concepts are grounded in perceptions of pleasure and pain. Indeed, as the quotation above suggests, for Hutcheson, all perceptions are accompanied by a feeling of pleasure or pain. Some objects excite pleasure or pain in us, Hutcheson explains, even when we cannot see any “Advantage or Detriment the Use of such Objects might tend: Nor would the most accurate Knowledge of these things vary either the Pleasure or Pain of the Perception” (Inquiry, 1.VI). That is, some objects are naturally pleasurable or painful to sense – and such objects, according to Hutcheson, are beautiful or ugly, respectively. Similarly, the actions of some people generate pleasure or pain in us, and this is what determines whether we characterise those people as virtuous or vicious. Hutcheson maintains that it is a moral sense that generates our ideas of virtue or vice (just as it is an aesthetic sense that generates ideas of beauty or ugliness), rather than, say, a judgement or act of reason, because those ideas do “not arise from any Knowledge of Principles, Proportions, Causes, or of the Usefulness of the Object” (Inquiry, 1.XII). Instead, just as we are ‘struck’ with the colour of an object or the pitch of a sound, we are ‘struck’ by the rightness or wrongness, or virtuousness or viciousness, of a person or action.

In short, Hutcheson’s view is that we feel a kind of pleasure or displeasure in response to certain character traits or actions which determines whether we characterise them as virtuous or vicious. For example, one might feel pleasure witnessing an act of charity, or displeasure witnessing an act of cruelty. In the former case, an idea of virtue (or goodness, or rightness) is raised in our minds, while in the latter it is an idea of vice (or badness, or wrongness). In so doing, Hutcheson provides an empiricist account of the origins of ideas concerning moral concepts, that is, one that draws on the Lockean Axiom.

b. Hume on Taste and the Moral Sense

Like Hutcheson, Hume is interested in identifying the source of our ideas of moral concepts like virtue, vice, justice, injustice, right, and wrong. And, like Hutcheson, Hume arrives at the view that such ideas are derived from some kind of moral sense, which he calls ‘taste’ (see, for example, T 3.3.6; Shelley 1998). (Another similarity with Hutcheson is that many of Hume’s claims about our sense of morality are paralleled in his discussion of beauty—including the claim that we have a sense of beauty.) In An Enquiry Concerning the Principles of Morals (1751), Hume’s account of moral sense, or taste, is part of a wider discussion of whether it is reason or sentiment, that is, feeling, that gives us our principles of morality. Hume lays out the debate like so:

There has been a controversy started of late…concerning the general foundation of MORALS; whether they can be derived from REASON, or from SENTIMENT; whether we attain the knowledge of them by a chain of argument and induction, or by an immediate feeling and finer internal sense; whether, like all sound judgment of truth and falsehood, they should be the same to every rational intelligent being; or whether, like the perception of beauty and deformity, they be founded entirely on the particular fabric and constitution of the human species. (EPM 1.3, 170)

In other words, the question is: do we reach conclusions about what is right or wrong in the same way we reach the conclusion of a mathematical formula, or do we reach such conclusions in the way we arrive at judgements about what counts as beautiful? The question is significant because, Hume claims, if our moral principles are more like judgements of beauty, then they might not, strictly speaking, be objective. They might instead be grounded in specific human values, concerns, desires, and judgements. Whereas if they are more like conclusions arrived at using reasoning, such as mathematical conclusions, Hume claims, then they can be more appropriately described as objective.

Hume opts for a decidedly ‘empiricist’ approach in answering this question – that is, he draws from the Lockean Axiom – which ultimately leads him to reject the claim that moral principles are the product of reason. He explains that in the sciences, or ‘natural philosophy’, thinkers “will hearken to no arguments but those which are derived from experience” (EPM 1.10, 172). The same ought to be true, he claims, in ethics. In line with the Lockean Axiom, Hume then suggests that we ought to “reject every system of ethics, however subtile [that is, subtle] or ingenious, which is not founded on fact and observation” (ibid.)—that is, the previously mentioned experience which underlies the arguments must be tied to the world we can perceive by our senses. Thus, like the natural philosophers of the Royal Society in London (the natural scientists he is referring to here), who rejected armchair theorising about nature in favour of going out and making observations, Hume’s aim is to arrive at an account of the origin of our moral principles that is based on observations of which traits or actions people do, in practice, deem virtuous or vicious—and why they do so.

What Hume claims to find is that traits like benevolence, humanity, friendship, gratitude, and public spirit—in short, all those which “proceed from a tender sympathy with others, and a generous concern for our kind and species” (EPM 2.1.5, 175)—receive the greatest approbation, that is, approval or praise. What all these character traits, which Hume calls the ‘social virtues’, have in common is their utility (see also Galvagni 2022 for more on Hume’s notion of virtue). This is no coincidence, Hume argues, for “the UTILITY, resulting from the social virtues, forms, at least, a part of their merit, and is one source of that approbation, and regard so universally paid to them” (EPM 2.2.3, 176). This leads Hume to develop the following line of reasoning: There is a set of traits, or ‘social virtues’, that are deemed the most praiseworthy by society. People who exhibit these traits are characterised as ‘virtuous.’ What these virtuous traits have in common is that they promote the interests of—that is, are useful to—society at large. Thus, Hume concludes, utility is at the heart of morality. This conclusion would go on to influence later thinkers like Jeremy Bentham and John Stuart Mill and is central to the normative ethical theory utilitarianism—which we also discuss in section 4.3 in relation to Susanna Newcome.

What role does the moral sense, or taste, play in Hume’s account of the origins of morality? His answer is that taste serves to motivate us to action, based on the pleasure that comes with approbation or the displeasure that comes with condemnation. He writes: “The hypothesis which we embrace is plain. It maintains that morality is determined by sentiment. It defines virtue to be whatever mental action or quality gives to a spectator the pleasing sentiment of approbation; and vice the contrary” (EPM, Appendix 1, I, 289). In other words, Hume’s point is that we enjoy responding to a person or action with approval and do not enjoy, and may even take displeasure from, responding to persons or actions with blame or condemnation. Thus, like Hutcheson, Hume thinks that ideas we receive via our moral sense are accompanied by feelings of pleasure or pain. This is a claim about human psychology and, again, an idea that would go on to play an important role in the utilitarian ethics of Bentham and Mill—especially the idea, known as ‘psychological hedonism’, that humans are driven by a desire for pleasure and to avoid pain. Hume himself seems to endorse a kind of psychological hedonism when he claims that if you ask someone “why he hates pain, it is impossible he can ever give any [answer]” (EPM, Appendix 1, V, 293).

In line with the Lockean Axiom, Hume concludes that it cannot be reason alone that is the source of our moral principles. Again, like Hutcheson, Hume thinks that we sense—immediately, prior to any rational judgement—rightness or wrongness, or virtue or vice, in certain persons or actions. What is more, he argues, reason alone is “cool and disengaged” and is thus “no motive to action”, whereas taste, or moral sense, is a motive for action since it involves a feeling of pleasure or pain (EPM, Appendix 1, V, 294). For that reason, Hume concludes that, in morality, taste “is the first spring or impulse to desire or volition” (ibid.).

c. Newcome on Pain, Pleasure, and Morality

There is no explicit commitment to the Lockean Axiom in the writings of Susanna Newcome. However, what we do find in Newcome is a development of the idea, also found in Hutcheson and Hume, that moral theorising is rooted in experiences of pleasure and pain—ideas which, as we found in sections 4.1 and 4.2, are themselves premised upon an acceptance of the Lockean Axiom. Thus, in Newcome (as in Shepherd), we find a thinker deeply influenced by the work of others who did adhere to the Lockean Axiom. In a sense, then, Newcome’s work is indirectly influenced by that Axiom. What we also find in Newcome is a bridge between the ‘empiricism’ of Locke, and thinkers like Hutcheson and Hume, who accept the Lockean Axiom, and the later utilitarianism of Jeremy Bentham and John Stuart Mill. For these reasons, Newcome’s ethical thinking merits inclusion in this article on ‘empiricism’ and the story of the development of the Lockean Axiom we have chosen to tell (see section 1).

In An Enquiry into the Evidence of the Christian Religion (1728/1732), Newcome provides the basis for a normative ethical theory that looks strikingly similar to the utilitarianism later, and more famously, defended by Jeremy Bentham and John Stuart Mill. For this reason, Connolly (2021) argues that Newcome—whose work pre-dates that of both Bentham and Mill—could plausibly be identified as the first utilitarian. Newcome bases her claims about ethics on claims about our experiences of pleasure and pain. What is also interesting, for our present purposes, is that Newcome identifies ‘rationality’ with acting in a way that maximises pleasure or happiness. Consequently, on Newcome’s view, we can work out what actions are rational by paying attention to which actions lead to experiences of pleasure or happiness—and the same applies to irrational actions, which lead to experiences of pain or unhappiness. In the remainder of this section, we outline Newcome’s views on pleasure and pain, happiness and unhappiness, and rational and irrational action.

Newcome begins her discussion of ethics by claiming that pleasure and pain cannot be defined (Enquiry, II.I). She explains that happiness and misery are not the same as pleasure and pain. Rather, she claims, happiness is “collected Pleasure, or a Sum Total of Pleasure” while misery is “collected Pain, or a Sum Total of Pain” (Enquiry, II.II–III). In other words, happiness is made up of feelings of pleasure, and misery is made up of feelings of pain. One is in a state of happiness when one is experiencing pleasure, and one is in a state of misery when one is experiencing pain. Newcome then goes on to commit herself to what has come to be known as ‘psychological hedonism’ (see section 4.2): the view that humans are naturally driven to pursue pleasure and avoid pain. As she puts it, “To all sensible Beings Pleasure is preferable to Pain” (Enquiry, III.I) and “If to all sensible Beings Pleasure is preferable to Pain, then all such Beings must will and desire Pleasure, and will an Avoidance of Pain” (Enquiry, III.II). Newcome then moves from these claims about what humans naturally pursue or avoid to a claim about what is most ‘fit’ for us. Like later utilitarians such as Bentham and Mill, Newcome thus bases her normative ethical theory—that is, her account of how we ought to act—on psychological hedonism, an account of how we naturally tend to act. She writes: “What sensible Beings must always prefer, will, and desire, is most fit for them” (Enquiry, III.III) and “What sensible Beings must always will contrary to, shun and avoid, is most unfit for them” (Enquiry, III.IV). She concludes: “Happiness is then in its own Nature most fit for sensible Beings” and “Misery is in its own Nature most unfit for them” (Enquiry, III.V–VI).

As we noted at the beginning of this section, Newcome does not explicitly commit herself to the Lockean Axiom that there is no idea in the mind that cannot be traced back to some particular experience. Nonetheless, it is true to say that Newcome arrives at her conception of how humans ought to act on the basis of claims about experience. As we saw, Newcome’s view is that pleasure and pain cannot be defined. Her view seems to be that we all just know what it is to feel pleasure and experience pain, through experience. In much the same way that one could not convey an accurate notion of light or darkness to someone blind from birth, or of loud and quiet to someone deaf from birth, Newcome’s view seems to be that the only way to know what pleasure and pain are is to have pleasurable and painful experiences. And it is on the basis of such experiences that Newcome, in turn, arrives at her conception of happiness, misery, and ‘fit’ or ‘unfit’ actions—that is, the kinds of actions that are ‘right’ or ‘wrong’, respectively, for us to perform. As we suggested above, Newcome’s moral philosophy is also noteworthy in that she identifies rational actions with those which are conducive to pleasure. She explains: “As Reason is that Power of the Mind by which it finds Truth, and the Fitness and Unfitness of Things, it follows, that whatever is True or Fit, is also Rational, Reasonable, or according to Reason” (Enquiry, IV).

And she adds that “all those Actions of Beings which are Means to their Happiness, are rational” (Enquiry, IV.V). In Newcome, then, we find not only a normative ethical theory but also an account of rational action that is grounded, ultimately, in our experience of things. Rational action is action conducive to happiness, and happiness is the accumulation of pleasure. We work out what actions are rational or irrational, then, by appealing to our experience of pleasure or pain.

5. God and Free-Thinking

This section focuses on the application of the Lockean Axiom to questions concerning the existence of God and the divine attributes of wisdom, goodness, and power—a crucial issue for philosophers during the early modern period, when theological issues were seen as just as important as other philosophical or scientific issues.

If you accept the Lockean Axiom, this seems to pose a problem for talk of God and his attributes (although it is worth noting that Locke does not seem to see it that way; rather, he thinks the idea of God is on the same epistemic footing as our idea of other minds (Essay 2.23.33–35)). As the ‘free-thinking’ philosopher Anthony Collins (1676–1729) argues, if all the ideas in our minds can be traced back to some particular experience, and if we cannot experience God directly (as orthodox Christian teachings, particularly in the Anglican tradition, would have it), then it seems impossible that we could have an idea of God. But if we cannot have an idea of the deity, one might worry, how can we know or learn anything about God? And what does that mean for the Bible, which is supposed to help us do just that? Thus, thinkers like Collins would argue, while you can have faith in God’s existence, whether this is a reasonable or justified belief is an entirely different question.

A potential rebuttal to Collins’ way of arguing, however, is to point to divine revelations in the form of miracles or other Christian mysteries. Perhaps miracles do constitute instances in which those present can, or could, experience God, or divine actions. This kind of response is attacked by another free-thinker, John Toland (1670–1722), who argues that religious mysteries cannot even be objects of belief because they are inconceivable. For example, the idea that God is father, son, and holy spirit all at once is something that seems both inconceivable and contrary to reason. Against these lines of reasoning, more orthodox thinkers like George Berkeley—who, crucially, accepts the Lockean Axiom as well (see section 3.1)—argue that even though we cannot have an idea of God, we can nonetheless experience the deity through our experience of the divine creation, nature. We outline Collins’, Toland’s, and Berkeley’s views on God, and their relation to ‘empiricism’, in the subsections below.

a. Anthony Collins

Anthony Collins had a close friendship with Locke, but he adopted the Lockean Axiom to advance his free-thinking agenda. Like Toland (see 5.2), Collins is concerned with defending the right to make “use of the understanding, in endeavouring to find out the meaning of any proposition whatsoever, in considering the nature of evidence for or against it, and in judging of it according to the seeming force or weakness of the evidence” (Discourse, 3).

Crucially, this process ought not to be interfered with by authority figures, particularly religious ones. Rather, everyone needs to be able to judge the evidence freely on their own (Discourse 3–21).

When it comes to applying the Lockean Axiom to questions concerning God’s existence and the divine attributes, Collins takes a concession by an orthodox cleric, Archbishop William King (1650–1729), as his starting point. King writes that “it is in effect agreed on all hands, that the Nature of God, as it is in it self, is incomprehensible by human Understanding; and not only his Nature, but likewise his Powers and Faculties” (Sermon § 3). While experience is not explicitly mentioned here, the underlying thought is that God is mysterious because we cannot experience God himself or the divine attributes. In other words, we do not have an idea of God because we cannot experience God. Thus, Collins argues that the word ‘God’ is empty (that is, does not denote anything in the world) and that when we say something like ‘God is wise,’ this is basically meaningless (Vindication, 12–13). In particular, Collins emphasizes that because of this lack of experience and the subsequent emptiness of the term, it becomes impossible to prove the existence of God against atheists. For the term cannot refer to more than a “general cause or effect” (Vindication, 13)—something that, he thinks, even atheists agree exists (Vindication, 14). They would only deny that this cause is wise, or would refuse the notion that this cause is immaterial, equating it instead with the “Material Universe” (Vindication, 14). To put it differently, Collins comes close to using the Lockean Axiom to advance atheism. At the very least, he makes it evident that accepting this axiom undermines fundamental theological commitments, because God and the divine attributes are generally held to be beyond the realm of creaturely experience and thus whatever idea we have of God must be empty (for discussion of Collins’ philosophy and the question of whether he is an atheist, see O’Higgins 1970, Taranto 2000, and Agnesina 2018).

As we discuss in the next subsection, a similar way of arguing can also be found in John Toland’s Christianity not Mysterious (1696), which might have been an influence on Collins’ thinking. In contrast to Collins, though, Toland puts more emphasis on the connection between the Lockean Axiom and language, something that he also adopts from Locke.

b. John Toland

John Toland was an Irish-born deist who was raised as a Roman Catholic but converted to Anglicanism (the predominant denomination in Britain at the time) in his twenties. Throughout his writing career, Toland challenged figures in positions of authority. In Christianity not Mysterious, Toland takes aim at the Anglican clergy; this, ultimately, led to a public burning of several copies of the book by a hangman and Toland fleeing Dublin.

As mentioned in the previous section, Toland argues in a similar way to Collins in Christianity not Mysterious. This is no surprise if we consider that Toland highly esteemed Locke and accepts the Lockean Axiom. In fact, and, again, similarly to Collins, he implicitly draws on the axiom (or rather its contraposition) to argue against the religious mysteries of Christianity, such as the virgin birth of Jesus Christ or the latter’s resurrection from the dead. These events are mysterious in the sense that they cannot be explained without invoking a supernatural power because they conflict with the way things ‘naturally’ are. In line with such an understanding, Toland defines mysteries as “a thing of its own Nature inconceivable, and not to be judg’d by our ordinary Faculties and Ideas” (CNM, 93). The underlying idea is that mysteries are beyond the realm of our experience and that we cannot have an idea of any mystery because we cannot experience them—and so Toland says that “a Mystery expresses Nothing by Words that have no Ideas at all” (CNM, 84). In saying this, Toland intends to follow Locke in holding that every meaningful word must stand for an idea and as such can be traced to some experience. As Locke says: “He that hath Names without Ideas, wants meaning in his Words, and speaks only empty Sounds” (Essay 3.10.31). On this basis Toland argues that terms referring to mysteries are empty or meaningless because there can be no experiences of them. For instance, Toland criticises the doctrine of the Holy Trinity on this ground, as well as arguing that it is supported neither by the Bible nor by any other form of divine revelation (CNM, § 3)—the existence of which he does not reject outright (compare CNM, 12).

In keeping with his critical attitude towards (religious) authorities, Toland claims that the Holy Trinity and other mysteries are an invention of “priest-craft” (CNM, 100) and nothing but a tool for submission. This point ties into his overall emancipatory aim of arguing for the right of everyone to use their reason in order to interpret the Bible on their own, without interference by religious authorities (CNM, 5–14). For Toland believes that every reasonable person is capable of understanding the Bible because reason is God-given. As Toland puts this point when addressing a potential clerical reader: “The uncorrupted Doctrines of Christianity are not above their [that is, the lay people’s] Reach or Comprehension, but the Gibberish of your Divinity Schools they understand not” (CNM, 87).

In short, Toland makes use of Lockean insights to tackle what were difficult and important theological questions of the day. By implicitly drawing on the Lockean Axiom and a broadly Lockean understanding of meaning, he argues against an overreach of clerical authority and against the existence of religious mysteries. For Toland, it holds that if something is really part of Christianity, it must also be accessible by our God-given reason (see also Daniel 1984 and the essays in Toland 1997 for more on Toland’s position).

c. George Berkeley

Throughout his life, Berkeley was very concerned with battling atheism or ideas which he thought undermined Christian teachings. His Principles of Human Knowledge was dedicated to identifying and rejecting the “grounds” for “Atheism and Irreligion” (Works II, 20). He also defends the idea that vision (NTV1709 § 147) or nature (NTV1732 § 147) is a divine language in his New Theory of Vision. Yet his most elaborate defense of the idea that we can experience God through nature is found in the fourth dialogue of Alciphron; or the Minute Philosopher (1732/52), which is a set of philosophical dialogues. In a nutshell, Berkeley argues that we have no direct access to, or experience of, any other mind, including the minds of our fellow human beings or rational agents. Nonetheless, most of us believe that other rational agents exist. The reason for this, Berkeley contends, is that these agents exhibit “signs” of their rationality which we can experience. Most notably, they communicate with us using language (AMP 4.5–7). Berkeley then argues (AMP 4.8–16) that nature—that is, everything we see, hear, smell, taste, and touch—literally forms a divine language (there are competing interpretations of this divine language; see, for example, Fasko 2021 and Pearce 2017). This language not only shows that God (as a rational agent) exists, but also displays the divine goodness by providing us with “a sort of foresight which enables us to regulate our actions for the benefit of life. And without this we should be eternally at a loss” (PHK § 31). For example, God ensures that where there is fire there is smoke and, in this way, ‘tells’ us there is fire nearby when we see smoke. Berkeley thereby objects to the line of reasoning introduced at the beginning of this section—that we cannot have an idea of God because we cannot experience the deity—by showing that there is a sense in which we experience God, via the divine language that constitutes nature.

Thus, Berkeley not only accepts the Lockean Axiom, but also accepts Collins’s point that we cannot immediately experience God. What he rejects is the notion that there are no mediate signs for God’s existence, because nature, as a divine language, is abundant with them.

While Alciphron provides evidence of God’s existence, Berkeley’s account of how we know (something) about God’s nature can be found in the Three Dialogues. There, he explains:

[T]aking the word ‘idea’ in a large sense, my soul may be said to furnish me with an idea, that is, an image or likeness of God, though indeed extremely inadequate. For all the notion I have of God is obtained by reflecting on my own soul, heightening its powers, and removing its imperfections. (DHP 231–32)

In other words, by reflecting on my own mind, heightening its powers, and removing its imperfections, I can get a sense of what God’s mind must be like. Combined with the claims in Alciphron, Berkeley thus offers an account of knowledge of both God’s existence and God’s nature.

In the seventh dialogue of Alciphron, Berkeley tackles the challenge issued by Toland. Berkeley argues that it is not problematic that some words do not signify ideas and thus their meaning cannot be traced back to some experience. In fact, Berkeley argues, our everyday language is full of such words. These words still have a meaning because they serve a purpose:

[T]here may be another use of words besides that of marking and suggesting distinct ideas, to wit, the influencing our conduct and actions, which may be done either by forming rules for us to act by, or by raising certain passions, dispositions, or emotions in our minds (AMP 7.5).

Berkeley thus deems it irrelevant for the meaningfulness of a term whether it refers to ideas that are ultimately grounded in experience. Rather, its meaning needs to be judged by the function it serves. When it comes to the mysteries that Toland attacked, Berkeley argues that it is irrelevant that we cannot experience them; as long as talking about them serves the right function, such talk is still meaningful (AMP 7.14–31) (see Jakapi 2002; West 2018).

6. Anton Wilhelm Amo: A Case Study in the Limits of British Empiricism

We argued in the first section of this article that considering the Peripatetic axiom, or more precisely the Lockean Axiom, allows for a more inclusive and diverse alternative story than the standard narrative of ‘British Empiricism’, which focuses solely on Locke, Berkeley, and Hume. This was our motivation for moving away from the standard narrative and focusing on the Lockean Axiom. The advantages of the narrative presented here are that it can incorporate a wider variety of issues and thinkers. However, we also pointed out that the narrative told here is neither exclusive nor exhaustive. Rather than this being a fault specific to our chosen narrative, we think it is an inevitable consequence of developing narratives that include some figures or ideas and exclude others.

This final section’s aim is to further put this narrative into perspective—not least, to make it abundantly clear that we do not intend to replace the standard narrative with the ‘correct’ story of ‘British Empiricism’. Rather, our aim is to illustrate that we are forced to tell stories that involve difficult choices—choices which ought, nonetheless, to be deliberate (and transparent)—and to show what kinds of stories can be told and what the limitations of narratives, such as the one developed here, are. In the following, we therefore first introduce a fringe case—that is, a thinker who could, on certain readings, be regarded as an ‘empiricist’—in the form of Anton Wilhelm Amo (1703–1756), the first African to receive a doctorate in Europe and a figure who is increasingly of interest to Early Modern scholars (for example, Wiredu 2004, Emma-Adamah 2015, Meyns 2019, Menn and Smith 2020, Smith 2015, Walsh 2019, West 2022).

The aim in doing so is to demonstrate that the Peripatetic Axiom transcended the boundaries of early modern Britain and that it was quite possible for thinkers on the continent to have just as much (if not more) in common with, for example, Locke as with Descartes (in turn, this indicates that the traditional story of ‘empiricism versus rationalism’ cannot simply be replaced with ‘Lockeanism versus Cartesianism’). The case of Amo also puts pressure on the cohesiveness of the concept of ‘British Empiricism’—in short, there is nothing uniquely British about being an ‘empiricist’ (that is, accepting the Peripatetic Axiom or the Lockean Axiom). We begin with a very brief overview of Amo’s philosophy before drawing out the tension between, on the one hand, Amo’s commitment to the Peripatetic Axiom and, on the other, the difficulty that arises if we try to place him in the ‘empiricist’ tradition. The case of Amo, we think, shows that there simply is not—in any realist sense—any fact of the matter about whether this or that philosopher is or is not an ‘empiricist.’

Anton Wilhelm Amo wrote four texts during his lifetime: the Inaugural Dissertation on the Impassivity of the Human Mind, the Philosophical Disputation Containing a Distinct Idea of those Things that Pertain either to the Mind or to Our Living and Organic Body (both written in 1734), a Treatise on the Art of Philosophising Soberly and Accurately, and On the Rights of Moors in Europe (his first text, published in 1729, which, sadly, is now lost). The three surviving texts outline Amo’s account of the mind-body relation, which is substance dualist, and his theory of knowledge. Specifically, the Inaugural Dissertation and the Philosophical Disputation both defend a roughly Cartesian account of the mind-body relation and mind-body interaction. Amo is critical of certain elements of Descartes’ view—in particular, the idea that the mind can ‘suffer’ with the body, that is, passively experience sensations (ID, 179–81). Yet, while he is critical, Amo’s aim is not to dismiss but to fix these kinds of issues with Descartes’ dualism (Nwala 1978, 163; Smith 2015, 219). While it is not clear-cut, there is therefore a case to be made for thinking of Amo as a ‘Cartesian’—if, by ‘Cartesian’, we mean something like a thinker who sets out to augment Descartes’ worldview in order to defend or support it. He is certainly not an outright critic. At the very least, it would be difficult to place Amo in the ‘empiricist’ tradition—at least as it is typically construed—given the underlying Cartesian flavour of his philosophical system.

What makes Amo an interesting ‘fringe’ case for ‘empiricism’—and, indeed, Cartesianism too—is his explicit commitment to the Peripatetic Axiom (see, for example, Treatise, 139, 141, 146). Like Hobbes and Locke—as well as Aristotelian scholastics like Aquinas before them (see section 2.1)—Amo maintains that there is nothing in the intellect not first in the senses. Other Cartesians, like Antoine Arnauld, for example, explicitly rejected the Peripatetic Axiom. As Arnauld puts it, “It is false…that all of our ideas come through our senses” (Arnauld 1970, 7). Now, it is worth noting that Amo is not a lone outlier. Other Cartesians, like Robert Desgabets or Pierre-Sylvain Régis, also accepted the Peripatetic Axiom—thus, there are further fringe cases. Nonetheless, Amo’s body of work is well suited to illustrating the limitations of our narrative and, in fact, of any narrative that makes use of ‘empiricism’, or related notions like ‘Cartesianism’, as labels. For, on the one hand, Amo has in common with traditional ‘empiricists’, like Locke, a commitment to the Peripatetic Axiom. But on the other, he wants to defend and improve the philosophical system of someone (that is, Descartes) who has come to epitomize like no other what ‘rationalism’ is about.

One might demand to know: ‘Well, is Amo an empiricist or not?’ But what this discussion shows, we contend, is that when it comes to Amo, or others like Desgabets or Régis, there is no simple (or ‘right’) answer. The answer depends on what is meant by ‘empiricist’—and this, in turn, might depend upon the context in which that concept is being employed or the use to which it is being put.

In that sense, Amo’s body of work illustrates a fundamental problem, or ‘danger’, attending the attribution of any “-ism”—that is, an analyst’s category rather than an actor’s—particularly if these positions are taken to be dichotomous to others: such attributions risk obfuscating important similarities or differences between thinkers’ ideas, or simply omitting interesting thinkers or ideas because they do not fit the story—and, crucially, not because they are undeserving of attention.

There are other reasons to think of someone like Amo as a particularly significant figure when it comes to examining, and revising, the historical canon—and categories like ‘empiricism’ in particular. In light of growing interest in and demand for non-Western figures, and thinkers from typically marginalised backgrounds, both in teaching and scholarship, Amo—the first African to receive a doctorate in Europe—has attracted considerable attention. But in what context should we teach or write about Amo? Continuing to think in terms of the standard narrative of ‘British Empiricism’ versus ‘Continental Rationalism’ will, as the above discussion showed, not make it easy to incorporate Amo’s work into syllabi or research—precisely because there is no objective fact of the matter about whether he is one or the other. And, as we have already suggested, Amo is not alone; this is true of many figures who, not coincidentally, have never quite found a place in the standard early modern canon. We think there are ways to incorporate figures like Amo into our familiar narratives—for instance, construing ‘empiricism’ in terms of an adherence to the Peripatetic Axiom does, in that sense, make Amo an ‘empiricist’—but such cases also provide reasons to think that we ought to take a serious look at what purpose those narratives serve and whether we, as scholars and educators, want them to continue to do so. New narratives are available and might better serve our aims, and correspond with our values, in teaching and scholarship going forward.

7. References and Further Reading

When citing primary sources, we have always aimed to use the canonical or most established forms. In the cases where there are no such forms, we have used abbreviations that seemed sensible to us. Also, if a text was not originally written in English, we have utilized standard translations. Finally, we want to note that on each of these figures and issues there is far more high-quality scholarship than we were able to point to in this article. The references we provide are merely intended to be a starting point for anyone who wants to explore these figures and issues in more detail.

a. Primary Sources

  • Amo, Anton Wilhelm. Anton Wilhelm Amo’s Philosophical Dissertations on Mind and Body. Edited by Justin E. H. Smith and Stephen Menn. Oxford: Oxford University Press, 2020.
  • The first critical translation of Amo’s work espousing his philosophy of mind.

  • Amo, Anton Wilhelm. Treatise on the Art of Philosophising Soberly and Accurately (with commentaries). In T. U. Nwala (Ed.), William Amo Centre for African Philosophy. University of Nigeria, 1990.
  • Amo’s most systematic text in which he offers a guide to logic and fleshes out his account of the mind-body relation and philosophy of mind.

  • Aristotle. [APo.], Posterior Analytics, trans. Hugh Tredennick, in Aristotle: Posterior Analytics, Topica, Loeb Classical Library, Cambridge, MA; London: William Heinemann, 1964, pp. 2–261.
  • One of the most prominent English translations of Aristotle’s famous work on science.

  • Aristotle. [EN], The Nicomachean Ethics, trans. H. Rackham, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1947.
  • One of the most prominent English translations of Aristotle’s famous work on ethics.

  • Aristotle. [GA], De la génération des animaux, ed. Pierre Louis, Collection des Universités de France, Paris: Les Belles Lettres, 1961; trans. A. L. Peck, in Aristotle, Generation of Animals, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1953.
  • One of the most prominent English translations of Aristotle’s famous work on biology.

  • Aristotle. [Meteor.], Meteorologica, trans. H. D. P. Lee, Loeb Classical Library, London: William Heinemann, Cambridge, MA: Harvard University Press, 1962.
  • One of the most prominent English translations of Aristotle’s famous work on the elements.

  • Aristotle. The Complete Works of Aristotle. Edited by Jonathan Barnes. Princeton: Princeton University Press, 1984.
  • Standard English translation used by scholars of Aristotle’s complete works.

  • Arnauld, Antoine. La Logique, ou L’Art de penser. Flammarion, 1970.
  • Edition of Arnauld and Nicole’s logic textbook.

  • Arnauld, Antoine and Nicole, Pierre. Logic, or, The art of thinking in which, besides the common, are contain’d many excellent new rules, very profitable for directing of reason and acquiring of judgment in things as well relating to the instruction of for the excellency of the matter printed many times in French and Latin, and now for publick good translated into English by several hands. London: Printed by T.B. for H. Sawbridge, 1685.
  • Early English translation of this important text for the so-called Port-Royal Logic; an influential logic textbook.

  • Aquinas, Thomas. [DA] A Commentary on Aristotle’s De anima. Edited by Robert Pasnau. New Haven, CT: Yale University Press, 1999.
  • English translation of Aquinas’ commentary on Aristotle’s famous text on the soul.

  • Aquinas, Thomas. Truth. Translated by Robert W. Mulligan, James V. McGlynn, and Robert W. Schmidt. 3 volumes. Indianapolis: Hackett, 1994.
  • English translation of Aquinas’ disputed questions on truth (Quaestiones disputatae de veritate).

  • Astell, Mary. A Serious Proposal to the Ladies. Parts I and II. Edited by P. Springborg. Ontario: Broadview Literary Texts, 2002.
  • Argues for women’s education and offers a way for women to improve their critical thinking skills.

  • Astell, Mary. The Christian Religion, As Profess’d by a Daughter of the Church of England. In a Letter to the Right Honourable, T.L. C.I., London: R. Wilkin, 1705.
  • Introduces Astell’s religious and philosophical views and continues her feminist project.

  • Bacon, Francis. The Works. Edited by J. Spedding, R. L. Ellis, and D. D. Heath. 15 volumes. London: Houghton Mifflin, 1857–1900.
  • First edition of Bacon’s works, still in use by scholars.

  • Bacon, Roger. [OM] The ‘Opus Maius’ of Roger Bacon. Edited by Robert Belle Burke. 2 volumes. New York: Russell & Russell, 1928.
  • One of, if not the, most important of Bacon’s works, attempting to cover all aspects of natural science.

  • Berkeley, George. The Correspondence of George Berkeley. Edited by Marc A. Hight. Cambridge: Cambridge University Press, 2013.
  • Most comprehensive edition of Berkeley’s correspondence with friends, family, and contemporary thinkers.

  • Berkeley, George. The Works of George Berkeley, Bishop of Cloyne. Edited by A. A. Luce and T. E. Jessop. 9 volumes. London: Thomas Nelson and Sons, 1948–1957.
  • Currently the standard scholarly edition of Berkeley’s writings.

  • Cavendish, Margaret. Observations upon Experimental Philosophy, Edited by Eileen O’Neill. Cambridge: Cambridge University Press, 2001.
  • Cavendish’s critique of the experimental philosophy of the Royal Society in London, and a defence of her own philosophical system.

  • Cavendish, Margaret. Grounds of Natural Philosophy. Edited by Anne M. Thell. Peterborough, Canada: Broadview Press, 2020.
  • The most detailed articulation of Cavendish’s ‘vitalist’ philosophical system of nature.

  • Cavendish, Margaret. The Blazing World and Other Writings. London: Penguin Classics, 1994.
  • Cavendish’s fantasy novel, which critiques the Royal Society and was published alongside her Observations.

  • Collins, Anthony. A Discourse of Free-thinking: Occasion’d by the Rise and Growth of a Sect Call’d Free-thinkers. London, 1713.
  • A defence of the right to think for oneself on any question.

  • Collins, Anthony. A vindication of the divine attributes In some remarks on his grace the Archbishop of Dublin’s sermon, intituled, Divine predestination and foreknowledg consistent with the freedom of man’s will. H. Hills, and sold by the booksellers of London and Westminster, 1710.
  • A critique of Archbishop King’s sermon, arguing that King’s position is effectively no different from atheism.

  • Conway, Anne. The Principles of the Most Ancient and Modern Philosophy. Translated by J. C[lark]. London, 1692.
  • First English translation of Conway’s only known book introducing her metaphysics and system of nature.

  • Hobbes, Thomas. Leviathan, with selected variants from the Latin edition of 1668. Edited by Edwin Curley. Indianapolis: Hackett, 1994
  • Hobbes’ influential political treatise, in which he also defends materialism and an ‘empiricist’ theory of knowledge.

  • Hume, David. Enquiries concerning Human Understanding and concerning the Principles of Morals, edited by L. A. Selby-Bigge, 3rd ed. revised by P. H. Nidditch, Oxford: Clarendon Press, 1975.
  • Standard scholarly edition of Hume’s two Enquiries, in which he lays out his epistemology and his moral philosophy.

  • Hume, David. A Treatise of Human Nature. Edited by L. A. Selby-Bigge, 2nd edition revised by P. H. Nidditch. Oxford: Clarendon Press, 1975.
  • Standard scholarly edition of Hume’s famous work in which he lays out his account of human nature and begins to develop an account of the human mind.

  • Hutcheson, Francis. An Inquiry into the Original of Our Ideas of Beauty and Virtue. Edited by Wolfgang Leidhold. Indianapolis: Liberty Fund, 2004.
  • Hutcheson’s influential text on ethics and aesthetics, in which he argues that we have both a moral sense and a sense of beauty.

  • Hutcheson, Francis. An Essay on the Nature and Conduct of the Passions, with Illustrations on the Moral Sense. Edited by Aaron Garret. Indianapolis: Liberty Fund, 2002.
  • A text outlining Hutcheson’s moral philosophy.

  • King, William. Archbishop King’s Sermon on Predestination. Edited by David Berman and Andrew Carpenter. Cadenus Press: Dublin, 1976.
  • A sermon on predestination revolving around the issue of divine attributes and the way we can meaningfully talk about these attributes and God’s nature.

  • Leibniz, Gottfried Wilhelm. Die philosophischen Schriften. Edited by Carl Immanuel Gerhardt. 7 volumes. Weidmann: Berlin, 1875–90.
  • Standard scholarly edition of Leibniz’s philosophical writings.

  • Locke, John. An Essay concerning Human Understanding. Edited by Peter H. Nidditch. Oxford: Clarendon Press, 1975.
  • Standard scholarly edition of Locke’s most famous work, providing his description of the human mind.

  • Masham, Damaris. Occasional Thoughts in Reference to a Vertuous or Christian Life, London: A. and J. Churchil, 1705.
  • Masham’s second book develops the views of the Discourse in relation to practical morality.

  • Masham, Damaris. A Discourse Concerning the Love of God, London: Awnsham and John Churchill, 1696.
  • Argues that humans are social and rational as well as motivated by love of happiness.

  • Newcome, Susanna. An Enquiry into the Evidence of the Christian Religion. By a Lady [i.e. S. Newcome]. The second edition, with additions. London: William Innys, 1732.
  • Newcome’s book espousing her views on morality and a defence of the Christian religion.

  • Plato. Plato: Complete Works. Edited by John M. Cooper. Indianapolis: Hackett, 1997.
  • A standard English edition of Plato’s complete works.

  • Shepherd, Mary. Essays on the Perception of an External Universe, and Other Subjects connected with the Doctrine of Causation. London: John Hatchard and Son, 1827.
  • Shepherd’s second book introducing her metaphysics by establishing that there is an independently and continuously existing external world.

  • Shepherd, Mary. An Essay upon the Relation of Cause and Effect, controverting the Doctrine of Mr. Hume, concerning the Nature of the Relation; with Observations upon the Opinions of Dr. Brown and Mr. Lawrence, Connected with the Same Subject. London: printed for T. Hookham, Old Bond Street, 1824.
  • Shepherd’s first book introducing her notion of causation by way of rejecting a Humean notion of causation.

  • Toland, John. John Toland’s Christianity Not Mysterious: Text, Associated Works, and Critical Essays. Edited by Philip McGuinness, Alan Harrison, and Richard Kearney. Dublin: Lilliput Press, 1997.
  • Critical edition of one of Toland’s most famous works, which argues that nothing that is part of Christianity is above or beyond reason.

  • Wollstonecraft, Mary. A Vindication of the Rights of Woman with Strictures on Political and Moral Subjects. Edited by Sylvana Tomaselli, in A Vindication of the Rights of Men with A Vindication of the Rights of Woman and Hints, Cambridge: Cambridge University Press, 1995.
  • Critical edition of Wollstonecraft’s groundbreaking work arguing for women’s rights.

b. Secondary Sources

  • Agnesina, Jacopo. The philosophy of Anthony Collins: free-thought and atheism. Paris: Honoré Champion, 2018.
  • Consideration of Collins’ philosophy with a focus on the question whether he is an atheist.

  • Anfray, Jean-Pascal. “Leibniz and Descartes.” In The Oxford Handbook of Descartes and Cartesianism, edited by Steven Nadler, Tad M. Schmaltz, and Delphine Antoine-Mahut, 721–37. Oxford: Oxford University Press, 2019.
  • Essay considering the complicated relationship between two rationalists.

  • Atherton, Margaret. “Lady Mary Shepherd’s Case Against George Berkeley.” British Journal for the History of Philosophy 4 (1996): 347–66. Doi: 10.1080/09608789608570945
  • First article to discuss and evaluate Shepherd’s criticism of Berkeley.

  • Atherton, Margaret, ed. Women philosophers of the early modern period. Indianapolis/Cambridge: Hackett, 1994.
  • Groundbreaking volume presenting excerpts from the works of various women philosophers, intended for inclusion in the classroom.

  • Atherton, Margaret. “Cartesian reason and gendered reason.” In A mind of one’s own edited by Louise Antony and Charlotte Witt, 21-37, Boulder, CO: Westview Press, 1993.
  • Argues, against first-generation feminist critiques, that Cartesianism held emancipatory potential for some female thinkers.

  • Atherton, Margaret. Berkeley’s Revolution in Vision. Ithaca: Cornell University Press, 1990.
  • The most comprehensive study of Berkeley’s theory of vision and philosophy of perception.

  • Ayers, Michael. Locke: Epistemology and Ontology. London: Routledge, 1991.
  • An in-depth discussion of Locke’s theory of knowledge and metaphysics.

  • Bahar, Saba. Mary Wollstonecraft’s Social and Aesthetic Philosophy: An Eve to Please Me. New York: Palgrave, 2002.
  • Sustained discussion of the way that aesthetic considerations (pertaining to the presentation of women) play a crucial role for Wollstonecraft’s feminist project.

  • Beauchamp, T.L. and A. Rosenberg. Hume and the Problem of Causation. Oxford: Oxford University Press, 1981.
  • Classical study of the Humean notion of causation and its problems.

  • Bell, Martin. “Hume on Causation.” In The Cambridge Companion to Hume’s Treatise, edited by Donald C. Ainslie, and Annemarie Butler, 147–76, Cambridge: Cambridge University Press, 2015.
  • Consideration of Hume’s view of causation, highlighting the centrality of this issue for understanding his philosophical system.

  • Bennett, Jonathan. Locke, Berkeley, Hume: Central Themes. Oxford: Oxford University Press, 1971.
  • Classic study of the three so-called empiricists, highlighting issues discussed by all of these thinkers.

  • Bergès, Sandrine, and Coffee, Alan. The Social and Political Philosophy of Mary Wollstonecraft. Oxford: Oxford University Press, 2016.
  • Essays that distinctively consider Wollstonecraft as a philosopher and relate her to her intellectual context as well as contemporary debates.

  • Bergès, Sandrine. The Routledge guidebook to Wollstonecraft’s A Vindication of the Rights of Woman. London: Routledge, 2013.
  • Contributions introducing readers to Wollstonecraft’s famous work on women’s rights and hence also to the origins of feminist thought.

  • Bolton, Martha Brandt. “Lady Mary Shepherd and David Hume on Cause and Effect.” In Feminist History of Philosophy: The Recovery and Evaluation of Women’s Philosophical Thought, edited by Eileen O’Neill and Marcy P. Lascano, 129–52. Cham: Springer, 2019.
  • Sustained discussion of the differing understandings of causation in Hume and Shepherd.

  • Bolton, Martha. “Causality and Causal Induction: The Necessitarian Theory of Lady Mary Shepherd.” In Causation and Modern Philosophy, edited by Keith Allen and Tom Stoneham, 242–61. New York: Routledge, 2010.
  • Classical article on Shepherd’s idiosyncratic notion of causation and the way she departs from Hume.

  • Boyle, Deborah. Mary Shepherd: A Guide. Oxford: Oxford University Press, 2023.
  • First book-length treatment of Shepherd’s metaphysics, discussing her core commitments and pointing to helpful secondary literature.

  • Boyle, Deborah. “Mary Shepherd on Mind, Soul, and Self.” Journal of the History of Philosophy 58, no. 1 (2020): 93–112. Doi: 10.1353/hph.2020.0005
  • First sustained discussion of Shepherd’s philosophy of mind.

  • Boyle, Deborah A. The well-ordered universe: The philosophy of Margaret Cavendish. Oxford: Oxford University Press, 2018.
  • In-depth discussion of Cavendish’s metaphysics.

  • Broad, Jacqueline. “Damaris Masham on Women and Liberty of Conscience.” In Feminist History of Philosophy: The Recovery and Evaluation of Women’s Philosophical Thought, edited by Eileen O’Neill and Marcy P. Lascano, 319–36. Cham: Springer, 2019.
  • One of the first considerations of Masham’s views on women and the ethics of toleration.

  • Broad, Jacqueline. The philosophy of Mary Astell: An early modern theory of virtue. Oxford: Oxford University Press, 2015.
  • Argues that Astell’s ethical goals are at the center of her philosophical project and help to unite some of her seemingly diverging commitments.

  • Broad, Jacqueline. “A woman’s influence? John Locke and Damaris Masham on moral accountability.” Journal of the History of Ideas 67, no. 3 (2006): 489–510. https://www.jstor.org/stable/30141038
  • Considers the influence Masham had on Locke’s notion of moral accountability.

  • Chappell, Vere, ed. Essays on Early Modern Philosophy, John Locke—Theory of Knowledge. London: Garland Publishing, 1992.
  • Contributions on a broad variety of issues pertaining to Locke’s theory of knowledge, ranging from triangles to memory.

  • Conley, John J. “Suppressing Women Philosophers: The Case of the Early Modern Canon.” Early Modern Women: An Interdisciplinary Journal 1, no. 1 (2006): 99-114. Doi: 10.1086/EMW23541458
  • Consideration of the exclusion of women from the history of philosophy with a focus on the challenges of their reintegration.

  • Connolly, Patrick J. “Susanna Newcome and the Origins of Utilitarianism.” Utilitas 33, no. 4 (2021): 384–98. Doi: 10.1017/S0953820821000108
  • One of the few scholarly works on Newcome arguing that she occupies a noteworthy position at the dawn of utilitarianism.

  • Costelloe, Timothy M. Aesthetics and morals in the philosophy of David Hume. London: Routledge, 2013.
  • A broad discussion of Hume’s ethics and aesthetics.

  • Cranefield, Paul F. “On the Origin of the Phrase NIHIL EST IN INTELLECTU QUOD NON PRIUS FUERIT IN SENSU.” Journal of the History of Medicine and Allied Sciences 25, no. 1 (1970): 77–80. Doi: 10.1093/jhmas/XXV.1.77
  • Early article looking into the origin of the Peripatetic axiom as found in Locke.

  • Cruz, Maité. “Shepherd’s Case for the Demonstrability of Causal Principles.” Ergo: An Open Access Journal of Philosophy (forthcoming).
  • Argues that Shepherd endorses a broadly Lockean or Aristotelian substance metaphysics.

  • Cunning, David. Cavendish. London: Routledge, 2016.
  • An introduction to Cavendish’s life and philosophical contributions.

  • Daniel, Stephen H. George Berkeley and Early Modern Philosophy. Oxford: Oxford University Press, 2021.
  • Book-length treatment of Berkeley, relating his views to many other Early Modern figures and to Ramism.

  • Daniel, Stephen Hartley. John Toland: His methods, manners, and mind. Kingston/Montreal: McGill-Queen’s Press-MQUP, 1984.
  • One of only a few book-length studies of Toland and his philosophy.

  • Detlefsen, Karen. “Atomism, Monism, and Causation in the Natural Philosophy of Margaret Cavendish.” Oxford Studies in Early Modern Philosophy 3 (2006): 199–240. Doi: 10.1093/oso/9780199203949.003.0007
  • A paper covering Cavendish’s rejection of atomism and commitment to monism, and her theory of causation.

  • Emma-Adamah, Victor U. “Anton Wilhelm Amo (1703-1756) the African‐German philosopher of mind: an eighteenth-century intellectual history.” PhD diss., University of the Free State, 2015.
  • A doctoral dissertation on Amo’s account of the mind-body relation.

  • Falco, Maria J., ed. Feminist Interpretations of Mary Wollstonecraft. University Park PA: Penn State Press, 2010.
  • Includes contributions on the political and social impact of Wollstonecraft’s views.

  • Fasko, Manuel. Die Sprache Gottes: George Berkeleys Auffassung des Naturgeschehens. Basel/Berlin: Schwabe Verlag, 2021.
  • Detailed discussion of Berkeley’s divine language hypothesis arguing, contra Pearce, that only vision is the language of God.

  • Fasko, Manuel, and Peter West. “The Irish Context of Berkeley’s ‘Resemblance Thesis.’” Royal Institute of Philosophy Supplements 88 (2020): 7–31. Doi: 10.1017/S1358246120000089
  • Arguing for the importance of the notion that representation requires resemblance in Berkeley’s intellectual context.

  • Fields, Keota. Berkeley: Ideas, Immaterialism, and Objective Presence. Lanham: Lexington Books, 2011.
  • Discussion of Berkeley’s immaterialism in the context of Descartes’ notion of objective presence, which requires causal explanations of the content of ideas.

  • Frankel, Lois. “Damaris Cudworth Masham: A seventeenth century feminist philosopher.” Hypatia 4, no. 1 (1989): 80–90. Doi: 10.1111/j.1527-2001.1989.tb00868.x
  • Early article showing that Masham is a philosopher in her own right by expounding her feminist views.

  • Frankena, William. “Hutcheson’s Moral Sense Theory.” Journal of the History of Ideas (1955): 356–75. https://www.jstor.org/stable/2707637
  • Classic article on Hutcheson’s notion that we have a moral sense (much like a sense for seeing).

  • Galvagni, Enrico. “Secret Sentiments: Hume on Pride, Decency, and Virtue.” Hume Studies 47, no. 1 (2022): 131–55. Doi: 10.1353/hms.2022.0007
  • Discusses Hume’s account of decency and argues that it challenges standard virtue ethical interpretations of Hume.

  • Garrett, Don. “Hume’s Theory of Causation.” In The Cambridge Companion to Hume’s Treatise, edited by Donald C. Ainslie, and Annemarie Butler, 69–100, Cambridge: Cambridge University Press, 2015.
  • An introductory overview of Hume’s controversial theory of causation.

  • Gasser-Wingate, Marc. Aristotle’s Empiricism. Oxford: Oxford University Press, 2021.
  • An in-depth discussion of Aristotle’s view that all knowledge comes from perception.

  • Gordon‐Roth, Jessica, and Nancy Kendrick. “Including Early Modern Women Writers in Survey Courses: A Call to Action.” Metaphilosophy 46, no. 3 (2015): 364–79. Doi: 10.1111/meta.12137
  • Arguing for the importance of including women philosophers, not least because of the current underrepresentation of women in the discipline.

  • Gracyk, Theodore A. “Rethinking Hume’s standard of taste.” The Journal of Aesthetics and Art Criticism 52, no. 2 (1994): 169–82.
  • A novel reading of Hume’s account of our knowledge of beauty.

  • Harris, James A. “Shaftesbury, Hutcheson and the Moral Sense.” In The Cambridge History of Moral Philosophy, edited by Sacha Golob and Jens Timmermann, 325–37. Cambridge: Cambridge University Press, 2017. Doi: 10.1017/9781139519267.026
  • An introductory overview of Hutcheson’s account of the moral sense.

  • Hutton, Sarah. “Women, philosophy and the history of philosophy.” In Women Philosophers from the Renaissance to the Enlightenment, edited by Ruth Hagengruber and Sarah Hutton, 12–29. New York: Routledge, 2021.
  • A discussion of why and how women are omitted from many histories of philosophy.

  • Hutton, Sarah. “Liberty of Mind: Women Philosophers and the Freedom to Philosophize.” In Women and Liberty, 1600–1800: Philosophical Essays, edited by Jacqueline Broad and Karen Detlefsen, 123–37. Oxford: Oxford University Press, 2017.
  • A paper arguing that women in early modern philosophy construed liberty as ‘freedom of the mind.’

  • Hutton, Sarah. “Religion and sociability in the correspondence of Damaris Masham (1658–1708).” In Religion and Women in Britain, c. 1660-1760, edited by Sarah Apetrei and Hannah Smith, 117–30. London: Routledge, 2016.
  • A discussion of Masham’s religious and social views, as espoused in her correspondences.

  • Hutton, Sarah. Anne Conway: A woman philosopher. Cambridge: Cambridge University Press, 2004.
  • Detailed discussion of Conway’s philosophy and her intellectual context.

  • Jakapi, Roomet. “Emotive meaning and Christian mysteries in Berkeley’s Alciphron.” British Journal for the History of Philosophy 10, no. 3 (2002): 401–11. Doi: 10.1080/09608780210143218
  • Discusses the notion that Berkeley has an emotive theory of meaning.

  • Jolley, Nicholas. Locke, His Philosophical Thought. Oxford: Oxford University Press, 1999.
  • A broad discussion of Locke’s philosophical project.

  • Jones, Tom. George Berkeley: A Philosophical Life. Princeton: Princeton University Press, 2021.
  • The most comprehensive study of Berkeley’s life and intellectual context.

  • Kivy, Peter. The Seventh Sense: Francis Hutcheson and Eighteenth-Century British Aesthetics. Oxford: Clarendon Press, 2003.
  • An in-depth discussion of Hutcheson’s account of the sense of beauty.

  • Landy, David. “Shepherd on Hume’s Argument for the Possibility of Uncaused Existence.” Journal of Modern Philosophy 2, no. 1 (2020). Doi: 10.32881/jomp.128
  • Discusses Shepherd’s criticism of Hume’s argument.

  • Landy, David. “A Defense of Shepherd’s Account of Cause and Effect as Synchronous.” Journal of Modern Philosophy 2, no. 1 (2020). Doi: 10.32881/jomp.46
  • Important discussion of Shepherd’s account of synchronicity, defending this account against Humean worries.

  • Landy, David. “Hume’s theory of mental representation.” Hume Studies 38, no. 1 (2012): 23–54. Doi: 10.1353/hms.2012.0001
  • A novel interpretation of Hume’s account of how the mind represents external objects.

  • Landy, David. “Hume’s impression/idea distinction.” Hume Studies 32, no. 1 (2006): 119–39. Doi: 10.1353/hms.2011.0295
  • A discussion of Hume’s account of the relation between impressions and ideas.

  • Lascano, Marcy P. The Metaphysics of Margaret Cavendish and Anne Conway: Monism, Vitalism, and Self-Motion. Oxford: Oxford University Press, 2023.
  • Comprehensive discussion and comparison of Cavendish and Conway on three major themes in their philosophy.

  • Loeb, Louis E. Reflection and the stability of belief: essays on Descartes, Hume, and Reid. Oxford: Oxford University Press, 2010.
  • A discussion of the connections between Descartes, Hume, and Reid’s philosophies.

  • LoLordo, Antonia. Mary Shepherd. Cambridge: Cambridge University Press, 2022.
  • A broad overview of Shepherd’s philosophy, suitable for beginners.

  • LoLordo, Antonia, ed. Mary Shepherd’s Essays on the Perception of an External Universe. Oxford: Oxford University Press, 2020.
  • First critical edition of Shepherd’s 1827 book and 1832 paper.

  • Mackie, J. L. Problems from Locke, Oxford: Clarendon Press, 1971.
  • A discussion of the philosophical problems, relevant even today, that arise in Locke’s writing.

  • Mercer, Christia. “Empowering Philosophy.” In Proceedings and Addresses of the APA, vol. 94 (2020): 68–96.
  • An attempt to use philosophy’s past to empower its present and to promote a public-facing attitude to philosophy.

  • Meyns, Chris. “Anton Wilhelm Amo’s philosophy of mind.” Philosophy Compass 14, no. 3 (2019): e12571. Doi: 10.1111/phc3.12571
  • The first paper to provide a reconstruction of Amo’s philosophy of mind, suitable for beginners.

  • Michael, Emily. “Francis Hutcheson on aesthetic perception and aesthetic pleasure.” The British Journal of Aesthetics 24, no. 3 (1984): 241–55. Doi: 10.1093/bjaesthetics/24.3.241
  • A discussion of the sense of beauty and the feeling of pleasure in Hutcheson.

  • Myers, Joanne E. “Enthusiastic Improvement: Mary Astell and Damaris Masham on Sociability.” Hypatia 28, no. 3 (2013): 533–50. Doi: 10.1111/j.1527-2001.2012.01294.x
  • A discussion of the social philosophy of two early modern women.

  • Nwala, T. Uzodinma. “Anthony William Amo of Ghana on The Mind-Body Problem.” Présence Africaine 4 (1978): 158–65. Doi: 10.3917/presa.108.0158
  • An early attempt to reconstruct Amo’s response to the mind-body problem.

  • Noxon, J. Hume’s Philosophical Development. Oxford: Oxford University Press, 1973.
  • A discussion of the development and changes in Hume’s philosophy over his lifetime.

  • O’Higgins, James. Anthony Collins the Man and His Works. The Hague: Martinus Nijhoff, 1970.
  • Still one of the most detailed discussions of Collins’ philosophy and intellectual context in English.

  • O’Neill, Eileen. “Disappearing Ink: Early Modern Women Philosophers and Their Fate in History.” In Philosophy in a Feminist Voice: Critiques and Reconstructions, edited by Janet A. Kourany, 17–62. Princeton: Princeton University Press, 1998.
  • Groundbreaking paper demonstrating how women thinkers have been eradicated from the history of philosophy.

  • Pearce, Kenneth L. Language and the Structure of Berkeley’s World. Oxford: Oxford University Press, 2017.
  • Detailed consideration of Berkeley’s divine language hypothesis (that is, the notion that nature is the language of God).

  • Rickless, Samuel C. “Is Shepherd’s pen mightier than Berkeley’s word?” British Journal for the History of Philosophy 26, no. 2 (2018): 317–30. Doi: 10.1080/09608788.2017.1381584
  • Discussion of Shepherd’s criticism of Berkeley.

  • Rickless, Samuel C. Berkeley’s argument for idealism. Oxford: Oxford University Press, 2013.
  • Critically discusses Berkeley’s arguments for idealism.

  • Sapiro, Virginia. A vindication of political virtue: The political theory of Mary Wollstonecraft. Chicago: University of Chicago Press, 1992.
  • One of the first detailed discussions of Wollstonecraft’s political thought.

  • Saporiti, Katia. Die Wirklichkeit der Dinge. Frankfurt a. M.: Klostermann, 2006.
  • Critical examination of Berkeley’s metaphysics.

  • Seppalainen, Tom, and Angela Coventry. “Hume’s Empiricist Inner Epistemology: A Reassessment of the Copy Principle.” In The Continuum Companion to Hume, edited by Alan Bailey and Daniel Jayes O’Brien, 38–56. London: Continuum, 2012.
  • Looks at exactly how Hume’s ‘copy principle’ (the claim that all ideas are copies of impressions) works.

  • Shapiro, Lisa. “Revisiting the early modern philosophical canon.” Journal of the American Philosophical Association 2, no. 3 (2016): 365–83. Doi: 10.1017/apa.2016.27
  • Critical consideration of the standard narrative arguing for a more inclusive story in terms of figures and issues considered.

  • Shelley, James. “Empiricism: Hutcheson and Hume.” In The Routledge companion to aesthetics, edited by Berys Gaut and Dominic Lopes, 55–68. London: Routledge, 2005.
  • An overview of Hutcheson and Hume’s ‘empiricist’ approach to beauty and aesthetics.

  • Shelley, James R. “Hume and the Nature of Taste.” The Journal of Aesthetics and Art Criticism 56, no. 1 (1998): 29–38. Doi: 10.2307/431945
  • Focuses on the ‘normative force’ in Hume’s conception of taste.

  • Smith, Justin EH. Nature, human nature, and human difference: Race in early modern philosophy. Princeton: Princeton University Press, 2015.
  • Investigates the rise of the category of race in the Early Modern period.

  • Taranto, Pascal. Du déisme à l’athéisme: la libre-pensée d’Anthony Collins. Paris: Honoré Champion, 2000.
  • Discusses Collins’ writings and the question whether he is a (covert) atheist.

  • Thomas, Emily. “Time, Space, and Process in Anne Conway.” British Journal for the History of Philosophy 25, no. 5 (2017): 990–1010. Doi: 10.1080/09608788.2017.1302408
  • Discussion of Conway’s views in relation to Leibniz, arguing that Conway is ultimately closer to Henry More.

  • Townsend, Dabney. Hume’s aesthetic theory: Taste and sentiment. London: Routledge, 2013.
  • Close examination of Hume’s aesthetic theory.

  • Traiger, Saul, ed. The Blackwell Guide to Hume’s “Treatise”. Oxford: Blackwell, 2006.
  • Student guide to Hume’s famous work.

  • Walsh, Julie. “Amo on the Heterogeneity Problem.” Philosophers’ Imprint 19, no. 41 (2019): 1–18. http://hdl.handle.net/2027/spo.3521354.0019.041
  • A discussion of a problem facing Amo’s philosophy, about how the mind and body can be in unison if they are heterogeneous entities.

  • West, Peter. “Why Can An Idea Be Like Nothing But Another Idea? A Conceptual Interpretation of Berkeley’s Likeness Principle.” Journal of the American Philosophical Association 7, no. 4 (2021): 530–48. Doi: 10.1017/apa.2020.34
  • An account of why Berkeley thinks an idea can be like nothing but another idea.

  • West, Peter. “Mind-Body Commerce: Occasional Causation and Mental Representation in Anton Wilhelm Amo.” Philosophy Compass 17, no. 9 (2022). Doi: 10.1111/phc3.12872
  • An overview of secondary literature on Amo’s philosophy of mind so far, and a new reading of how his theory of mental representation works.

 

Author Information

Manuel Fasko
Email: manuel.fasko@unibas.ch
University of Basel
Switzerland

and

Peter West
Email: Peter.west@nulondon.ac.uk
Northeastern University London
United Kingdom

Susanne K. Langer (1895—1985)

Susanne K. Langer
Photo by Monozigote, CC BY-SA 4.0, via Wikimedia Commons

Susanne Langer was an American philosopher working across the analytic and continental divide in the fields of logic, aesthetics, and theory of mind. Her work connects in various ways to her central concerns of feeling and meaning.

Feeling, in Langer’s philosophy, encompasses the qualitative, sensory, and emotional aspects of human experience. It is not limited to mere emotional states but includes the entire range of sensory and emotional qualities that humans perceive and experience. Langer argues that feeling is not separate from rationality but, rather, an integral part of human intelligence and creativity.

In contrast to the logical positivists with whom she is sometimes associated, Langer argues for an expanded field of meaning. In contrast to the early Wittgenstein, who argues for a very limited field of meaning bounded by strict usage of language, Langer argues that symbolisms other than language are capable of expressing thoughts that language cannot.

Langer’s theory of feeling is closely tied to her theory of art, where she argues that artworks express forms of feeling. Artists use various elements, such as colours, shapes, sounds, and rhythms, to formulate feeling in their work, with each artwork being an art symbol. According to Langer, the artist’s task is to formulate the quality or gestalt of a particular feeling in their chosen medium.

In her broader philosophy of mind, Langer suggests that feeling is a fundamental aspect of human consciousness. She contends that feeling is not limited to individual emotions but is the basis for all forms of human thought, perception, and expression. In this sense, feeling serves as the foundation for higher-level cognitive processes, including symbolic thought and language.

Langer’s legacy includes her influential books on logic, philosophy of art, and theory of mind. Her position, whilst subject to minor terminological changes during her career, remains remarkably consistent over half a century, and the resulting vision is a bold and original contribution to philosophy. Her ideas in the philosophy of art have been engaged with by various philosophers, including Nelson Goodman, Malcolm Budd, Peter Kivy, Brian Massumi, and Jenefer Robinson. In neuroscience and psychology, her notion of feeling, and her conceptual framework of mind, have been made use of by figures including Antonio Damasio and Jaak Panksepp. Overall, Langer’s work has left a lasting impact on philosophy, with her insights into the role of feeling in human life continuing to resonate with contemporary scholars and researchers.

Langer’s inclusiveness and rigor have recommended her thought to the generations since her passing. In the arts and biosciences her ideas are becoming more widely known. Langer’s work is a model of synthetic conceptual thinking which is both broad and coherent.

Table of Contents

  1. Life and Work
  2. Feeling
  3. Logic
  4. The ‘New Key’ in Philosophy
  5. Theory of Art
  6. Theory of Mind
  7. Political Philosophy and Contribution to the ‘Modern Man’ Discourse
  8. Legacy
  9. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life and Work

Susanne K. Langer (née Knauth) grew up in the Upper West Side of Manhattan, New York. The second of five children born to affluent German immigrants, Langer spoke German at home and French at school, and later claimed she only felt fully comfortable with English by the time she reached high school. Formative experiences included family summer vacations to Lake George and family music recitals in which Langer played the cello.

She attended Radcliffe College from 1916 and was awarded her doctorate in 1926 (Radcliffe students took the same classes as their male counterparts at Harvard during this time, though they were taught separately; Harvard would not award men and women degrees on an equal basis until 1975). During Langer’s time at Radcliffe, she notably studied logic under Henry Sheffer, who introduced her to the ideas of Russell and the early Wittgenstein, as well as under Alfred North Whitehead, with Langer attending the lecture series which would become Process and Reality (1929). Whitehead would also supervise Langer’s doctoral thesis and write the introduction to her first book, The Practice of Philosophy (1930). Sheffer published very little, and Langer’s second book, An Introduction to Symbolic Logic (1937), is presented as putting forward Sheffer’s approach to logic, something Sheffer himself never did.

In 1921 Langer married William Langer, who would go on to become a scholar of European history, and the two spent much of their first year of marriage in Vienna. Langer lived the rest of her life in America, though she returned to Europe with her family in the summer of 1933 for a European tour and to visit Edmund Husserl in Schluchsee, Germany. The couple had two children but divorced in 1941, with Langer never remarrying.

In addition to the intellectual influences of Whitehead, Sheffer, and Wittgenstein, Langer was strongly taken by the ideas of Ernst Cassirer; they met and corresponded, with Langer going on to translate Cassirer’s Language and Myth (1946) into English.

Langer’s third book, Philosophy in a New Key (1942), sold more than half a million copies. Arguing that there had been a shift in many fields towards recognition of the role of the symbolic in human life, ritual, art and language, the book brought together findings from many areas and offered a conceptual framework within which to understand, in particular, language and music.

After her divorce, Langer moved to New York City and stayed there for a decade as she wrote her theory of art, Feeling and Form (1953). Langer had part-time and temporary positions at various academic departments, including Radcliffe (1926-42) and Columbia (1945-50), but she did not have a full-time academic post until 1954, when she took up the chair of the philosophy department at Connecticut College for Women. From 1962, she was funded by a grant from the Edgar J. Kaufmann Foundation for her major work on theory of mind, at which point she retired to concentrate on her writing. After this, she split her time between Old Lyme, Connecticut and summers in a wood cabin in Ulster County, New York. Due to ill health and, in particular, failing eyesight, she published a curtailed version of her final, third volume of Mind in 1982. She died in 1985.

2. Feeling

Langer’s notion of feeling underpins all her other work. Feeling tells organisms how they are doing in various categories of need, both internal and external. As Langer puts it:

Feeling is the constant, systematic, but private display of what is going on in our own system, the index of much that goes on below the limen of sentience, and ultimately of the whole organic process, or life, that feeds and uses the sensory and cerebral system. (Langer, 1967)

Langer’s basic analytical unit of life is the act, which she considers in terms of phases. Langer repeatedly acknowledges the futility of drawing hard dividing lines in the natural sciences. Her preference instead is to find centres of activity which hold together because they are functional. An act is a functional unit, and can be considered on dramatically different scales, from cell to organ, to organism and ecosystem. Feeling is anything that can be felt, which is to say that it is a felt phase of an act. Feeling is the mark of at least primitive mentality or mentation, though not, at least in known non-human animals, mind. Feeling is also, in Langer’s work, the origin of logic: she argues for an expanded logical field of meaning which includes feeling, treating it not as an irrational disturbance in an organism, and she writes that only a highly emotional animal could have developed the methods of logic. Lastly, there are unconscious processes, but there is no unconscious feeling: whatever can be felt is felt consciously, as the definition of feeling as a felt phase of an act emphasises.

Langer describes how a phase is not a thing but a mode of appearance, explaining that when iron, for instance, is heated to become red-hot, redness is a phase, a mode of appearance, rather than being a new entity. When the iron is cooled the redness vanishes; Langer claims that, similarly, feeling is like this redness of the iron, mere appearance that has no independent existence. This is not to deny the importance of these appearances, however, since they are what the organism has to guide its negotiation with both its internal and external environment. Langer considers the notion of feelings to be a reification: the process of feeling does not result in ontologically distinct products.

To the extent that an organism is able to react to a stimulus, it is able to feel. There are processes that may appear animated, such as leaves blowing along a path, or water bubbling up from a geyser, but in these examples the processes are entirely dictated by the external environment rather than being active agents seeking to maintain certain balances. If the stimuli in these examples cease, the wind for the leaf and the heat for the geyser, the animation would cease too, and immediately.

Animals feel: they feel their internal and external environment, they feel their own responses to the environment, and they feel the environment’s response to their actions. On human feeling, Langer writes:

Pure sensation—now pain, now pleasure—would have no unity, and would change the receptivity of the body for future pains and pleasures only in rudimentary ways. It is sensation remembered and anticipated, feared or sought or even imagined and eschewed that is important in human life. It is perception molded by imagination that gives us the outward world we know. And it is the continuity of thought that systematizes our emotional reactions into attitudes with distinct feeling tones, and sets a certain scope for an individual’s passions. In other words: by virtue of our thought and imagination we have… a life of feeling. (Langer, 1953)

Langer’s ideas are distinguished from those of the Classical Associationists; feeling is far from being a passive or neutral process, as Langer here stresses the feedback loop of imagination and perception in giving us access to the world. In stressing the continuity of the life of feeling, Langer is stressing the continuity of consciousness—not entirely unbroken in human experience, but normatively present. Feeling, for Langer, is the driving force of consciousness, motivating, among other functions, imagining and seeking and remembering.

This view of feeling leads to a particular view of consciousness: not as an emergent property of complex organisms such as humans but as a continuum along which there are simpler and more complex consciousnesses; whatever being is capable of feeling has at least a primitive awareness of, at a minimum, its sensory environment. Langer therefore considers even very simple organisms to be feeling, that is, to be constantly attaining psychical phases of sensory acts, and she takes this to constitute mental activity. Langer describes this as mentation until it reaches the high development that it does in humans, which is the point at which this activity passes the threshold to be considered mind.

The clear question arising from this is what, if not consciousness, accounts for the gulf between animal mentation and the human mind. For Langer, the answer is symbolic thought.

Many animals are capable of reacting appropriately to signs (in later works Langer calls these signals), but, in known examples, only humans respond symbolically to the environment. A sign or signal, for Langer, is a symptom of an event; this can be natural, as in footprints signifying that a person or animal has walked a certain way, or artificial, as in a school bell signifying that the end of the school day has come. Symbols, by contrast, call attention primarily to concepts rather than objects; Langer writes that if someone says “Napoleon,” the correct response is not to look around for him but to ask “What about Napoleon?” The symbolic therefore allows people to imagine non-actual situations, including other times and places and the speculative.

Langer considers both emotion and logic to be high developments of feeling. Langer writes that logic is a device for leading people between intuitions, these intuitions being meaningful personal understandings (see the next section for a fuller discussion of Langer’s logic). Langer does not have a fully developed theory of emotion, though she refers not infrequently to emotional situations in individual people and groups. Her notion of feeling is certainly compatible with the use that is made of it by scientists such as Jaak Panksepp and Antonio Damasio, though it does not necessitate their ideas of emotion.

Langer’s notion of art concerns feeling as well: she argues that artworks present forms of feeling for contemplation. The purpose of art is to render pre-reflexive experience available to consciousness so that it can be reflected (rather than merely acted) upon. It is this knowledge of feeling that artworks are meant to help us acquire, educationally, socially, and cross-culturally. We have access, in life and in art, to forms only, from which we extrapolate meaning. In life, the forms of feeling are too embedded in practical situations for us to contemplate them. When art is viewed as art, the experience of it is disinterested, and the forms are isolated from practical situations.

Despite Langer’s emphasis on embodiment, she also clearly emphasises cognitive evaluations. As in many other areas, Langer’s work can be seen to bridge perspectives that are often considered incompatible: in this case, that emotion is either fundamentally embodied or fundamentally cognitive:

Certainly in our history, presumably for long ages – eons, lasting into present times – the human world has been filled more with creatures of fantasy than of flesh and blood. Every perceived object, scene, and especially every expectation is imbued with fantasy elements, and those phantasms really have a stronger tendency to form systematic patterns, largely of a dramatic character, than factual impressions. The result is that human experience is a constant dialectic of sensory and imaginative activity – a making of scenes, acts, beings, intentions and realizations such as I believe animals do not encounter. (Langer, 1972)

Langer here clearly believes cognitive evaluations matter—beliefs, whether about ghosts and monsters and gods or about why the bus is late and what might be done about it, and especially expectations, which determine to a surprising extent what is perceived. Langer also stresses here the dynamic real-time mixing of sensory and imaginative activity, disposing the holder of these expectations towards certain kinds of experience.

This emphasis on feeling in Langer has clear parallels to her contemporary John Dewey, who focused on experience similarly. These parallels have been drawn out most thoroughly by Robert Innis in his monograph on Langer.

3. Logic

Langer’s most distinctive contribution to the philosophy of logic is her controversial claim of a presentational logic that operates differently from, but is no less reasonable than, traditional logic. This presentational logic functions by association rather than by logical implication (as in traditional logic) or causality; nonetheless, Langer considers it also to be a logic because presentational forms contain relational patterns. Langer first put forward this idea in her doctoral dissertation in 1926, ‘A Logical Analysis of Meaning’, in which Langer investigated the meaning of meaning from the starting point that the dictionary definition of meaning seems to have little to do with meaning in art or religion.

Langer developed this idea further in her first book, The Practice of Philosophy (1930), in which she also situated philosophy in relation to science. Arguing that analysis is an indispensable part of any complex understanding, she distinguished between the empirical sciences which pursue facts and the rational sciences which instead pursue meanings—the latter exemplified by mathematics and logic. These rational sciences, Langer claimed, are the foundation of the ‘higher’ and more concrete subjects of ethics and metaphysics. Langer points out, for instance, that it was in studying numbers that philosophers gained the understanding they needed to approach more accurately the concept of infinity, and that Zeno’s paradox—that matter in its eternal motion is really at rest—is solved by a clear understanding of the continuum.

Aspects of Langer’s views here are heavily influenced by logical positivism, an impression likely to be strengthened in the reader’s mind by Langer’s positive discussion of Bertrand Russell and of the early Wittgenstein of the Tractatus. One feature that Langer shares with logical positivism, for instance, is her view that philosophy is a critique of language. But even in this first book, published at approximately the peak of logical positivism’s popularity, Langer explicitly distinguishes her views from those of logical positivism. Already at this point, Langer is insisting on the importance of an interpretant in the meaning relation, reinserting the aspect of personal experience which logical positivism had carefully removed.

One of Langer’s contributions to the logic of signs and symbols is the claim that the semantic power of language is predicated on the lack of any rival interest in vocables. Langer uses the example of an actual peach to replace the spoken word ‘plenty’, and she argues that we are too interested in peaches for this to be effective: the peach would be both distracting and wasted. It is the irrelevance of vocables for any other purpose than language that leads to the transparency of spoken language, where meaning appears to flow through the words.

Langer’s textbook, An Introduction to Symbolic Logic (1937), was written expressly to take students to the point where they could tackle Russell and Whitehead’s Principia Mathematica (1910-13). This textbook contains not only instruction on the formal aspects of symbolic logic, Boolean as well as that of Principia Mathematica, but also extensive philosophical discussion on metaphor, exemplification, generalization and abstraction. As well as achieving this task, the book functions as an introduction to how Sheffer practiced logic, since he did not publish such a text.

Sheffer had followed Josiah Royce in considering logic to be a relational structure rather than dealing solely with inference. Langer takes this notion and follows it through its implications, paying special attention to the distinctions between types of logic and meaning.

From one perspective, Langer’s view is very radical, since expanding the notion of meaning to logical resemblance incorporates huge swathes of life which had been dismissed by many of the thinkers she cites most, such as Russell and the early Wittgenstein, as nonsense. However, this emphasis on the structure of relations can also be seen as a form of hylomorphism, connecting Langer’s views to a tradition which stretches back to Aristotle.

4. The ‘New Key’ in Philosophy

Langer’s next book, Philosophy in a New Key (1942), might be thought of as her central work, in that it serves as a summation and development of her previous work in logic and an expanded field of meaning, but also gives early formulation to her ideas in all the fields which would preoccupy her for the rest of her career, including art and theory of mind, but also touching on linguistics, myth, ritual, and ethnography.

In the book Langer claims that figures as diverse as Freud, Cassirer, Whitehead, Russell, and Wittgenstein are all engaged in a shared project to understand the nature of human symbolization. Along the way, Langer touches on a wide variety of subjects of philosophical interest: her theory of depiction, for instance, is presented, along with a speculative account of the early development of language and the relation of fantasy to rational thought.

Langer justifies the exploration of all these different topics in a single text by relating them all to a single idea: that across a wide range of humanities subjects there had been, in the late 19th and early 20th centuries, a fundamental shift in the intellectual framework within which work was done, and that this shift was related in every case to an expanded appreciation of the nature of human symbolization. Langer describes this shift using the musical metaphor of a key change—hence Philosophy in a New Key. In her introduction, Langer offers a brief account of previous shifts in philosophy such as, for instance, the Cartesian notion of looking at reality as a dichotomy of inner experience and outer world.

Langer refers to her theory of the symbolic as a semantic theory, which proved controversial, as her theory includes but is not limited to language. This is the expanded field of meaning that Langer sought to describe and provide conceptual scaffolding for. Where Wittgenstein’s Tractatus Logico-Philosophicus famously ends with the statement that “whereof one cannot speak, thereof one must be silent” (Wittgenstein, 1922), Langer argues that language is only one of many symbolisms, albeit one of particular importance, and that other symbolisms, including myth, ritual, and art, can form thoughts which language is incapable of. Whether or not Langer is correct depends not only on whether the semantic can be broadened in this way, so that it does not need a corresponding syntax, for instance, but also on whether there are thoughts which language is not capable of expressing.

Langer’s distinction between discursive and presentational symbolic forms in Philosophy in a New Key has received extensive discussion. Briefly, discursive forms are to be read and interpreted successively, whereas presentational forms are to be read and interpreted as a whole. Another important difference is that in discursive symbolisms the individual elements have independent meaning whereas in non-discursive symbolisms they do not; words have independent meaning even when isolated from a wider text or utterance, whilst lines, colours and shapes isolated from an artwork do not.

Scientific language and mathematical proofs are straightforwardly discursive, whereas photographs and paintings are straightforwardly presentational. Some less intuitive but still important applications of this distinction exist, however, with novels and poems, for instance, being considered presentational forms by Langer; despite being formed with language, the artwork functions as a whole, and cannot be judged without considering the whole. On the other hand, graphs and charts function discursively, despite being visual.

Langer’s discussion of ritual is related to her careful reading of Ernst Cassirer, whom Langer met and corresponded with, and who considered Philosophy in a New Key to be the book on art which corresponded to his three-volume Philosophy of Symbolic Forms (1923-29). Langer would translate and write the introduction for the English-language edition of Cassirer’s Language and Myth (1946). Considering rain dances, for instance, Langer discusses them neither as a dishonest trick of tribal seniors nor as magic. Instead, the group activity is seen as symbolic:

Rain-making may well have begun in the celebration of an imminent shower after long drought; that the first harbinger clouds would be greeted with entreaty, excitement, and mimetic suggestion is obvious. The ritual evolves while a capricious heaven is making up its mind. Its successive acts mark the stages that bring the storm nearer. (Langer, 1942)

Langer notes, moreover, that participants do not try to make it snow in mid-summer, nor to ripen fruits entirely out of season. Instead, the elements are either present or imminent, and participants encourage them.

Langer’s treatment of music in the book is notable, defending critic Clive Bell’s famous claim in Art (1914) that art is ‘significant form’. Langer argues that the sense in which this is true is that music is a symbolic form without fixed conventionally assigned meanings—she calls music an “unconsummated symbolism.” (Langer, 1942) Langer dismisses both the hedonic theory of art and the contagion theory, arguing instead that music expresses the composer’s knowledge of feeling, an idea she attempts to elucidate and which she attributes to numerous European composers, critics, and philosophers including Wagner, Liszt and Johann Adam Hiller.

Philosophy in a New Key might also be thought to be central because Langer’s later theory of art is explicitly introduced on its cover as being derived from Philosophy in a New Key, and subsequently her Mind trilogy is introduced as having come out of her research on living form that informed her philosophy of art. Langer herself frequently refers back to Philosophy in a New Key in her later works, whereas The Practice of Philosophy never went beyond the first edition, with Langer in her later life turning down requests by the publisher to put it back in print.

The book was unexpectedly popular; despite its enthusiastic public reception, however, it was largely neglected by the academic community at the time. The book’s success may partly explain Langer’s relative prominence within musical aesthetics compared to her relative neglect in the aesthetics of other artforms, since her treatment of music in Philosophy in a New Key is much fuller than her brief and scattered comments on other artforms. Langer was well aware of this, and indeed the subsequent work, Feeling and Form, gives separate and sustained attention to a wide variety of artforms.

5. Theory of Art

After Philosophy in a New Key’s popular success in giving an account of music, Langer generalised its account to a theory of all the arts. Feeling and Form (1953) is split into three major parts: the first deals with introductory matters; part two, by far the largest part of the book, gives separate and sustained attention to each of the artforms dealt with in turn, including painting, sculpture, architecture, music, dance, poetry, myth and legend, prose fiction, comedic drama and tragic drama (there is also a short appendix on film at the end of the book); then, in part three, Langer gives her general account.

Helpfully, in part three she compares her ideas in detail to those of R. G. Collingwood, whose The Principles of Art (1938) had appeared just fifteen years before; this does much to locate Langer’s position. The final chapter of Feeling and Form considers art from the point of view of its public, considering the educational and social role of art, in a way that both ties Feeling and Form into the sections on ritual and myth in Philosophy in a New Key and anticipates some arguments Langer would make in Volumes 1 and 3 of Mind. The theory of art presented here is based primarily on Feeling and Form, but also includes elements and quotes from the two other later books in which Langer discusses art at length: Problems of Art (1957) and Mind: Volume 1 (1967).

Langer’s theory states that artworks present forms of feeling. This is possible because both feeling and artistic elements are experienced as qualitative gradients; the forms of each are congruent. Feeling may be complex or simple—more or fewer gradients can be experienced simultaneously; artworks, similarly, may present many gradients at once or very few. In either case, there is a unity to the feeling or artwork—an overall quality. It is this quality of feeling that an artist tries to express when creating a work, negotiating the artistic elements.

Artists work by weighing qualities in the forming artwork—a formulation that seems to capture practices as diverse as traditional easel painting or the selection of ready-mades, a composer writing a symphony or a rock band writing a song, or theatre directors giving feedback to actors on blocking or actors improvising a scene of street theatre. “Artistic forms,” Langer writes, “are more complex than any other symbolic forms we know. They are, indeed, not abstractable from the works that exhibit them. We may abstract a shape from an object that has that shape, by disregarding color, weight and texture, even size; but to the total effect that is an artistic form, the color matters, the thickness of lines matters, and the appearance of texture and weight.” (Langer, 1957) The value of art is intrinsic to the work rather than lying in its use as a medium of communication, and it is the sensuous qualities of the work which give the viewer access to the meaning (literary work being experienced in the sensuous imagination).

Langer holds that artworks are each a symbol expressive of human feeling. By expression—to press out—Langer means projection; she uses the example of horns projected from the head of a reindeer. An art object is therefore a projection of feeling: not spontaneous feeling, but the artist’s knowledge of feeling. Langer’s Expressivism, moreover, does not insist on melodrama and high emotion. Whilst it could be argued that Langer’s concept of expression differs too significantly from others in the Expressivist tradition to be called such, Langer herself writes that she, Croce, and Collingwood, as well as Bell, Fry, and Cassirer, are embarked on a shared project. (Langer, 1953) So long as it is remembered that Langer does not claim that artworks express emotion, the grouping seems fair; Langer’s account concerns expressive form articulating knowledge of feeling rather than a contagious and spontaneous outpouring.

Langer writes that artists try to express a unitary gestalt:

What any true artist – painter or poet, it does not matter – tries to “re-create” is not a yellow chair, a hay wain or a morally perplexed prince, as a “symbol of his emotion,” but that quality which he has once known, the emotional “value” that events, situations, sounds or sights in their passing have had for him. He need not represent those same items of his experience, though psychologically it is a natural thing to do if they were outstanding forms; the rhythm they let him see and feel may be projected in other sensible forms, perhaps even more purely. When he finds a theme that excites him it is because he thinks that in his rendering of it he can endow it with some such quality, which is really a way of feeling. (Langer, 1967)

Langer believes that people feel, and artists have developed special sensitivity to feeling, and when working in an artistic mode, they seek to articulate what they have felt, so that the resulting artwork seems to possess the same quality as the feeling the artist has in mind (remembering that consciousness is a fundamentally embodied process for Langer, feeling raised above the “limen of sentience”). Langer stresses that the artist need not have experienced the feeling, but they must be capable of imagining it.

Langer distinguishes between what she calls primary and secondary illusions. A primary illusion is what an artform stably presents—so painting, sculpture, and architecture must present virtual space whilst a piece of music must present virtual time. This is the contextual framework within which artistic elements exist. Further primary illusions include virtual powers (dance) and virtual memory (literature). Primary illusions do not come and go, and are not a presentation of gradients; because of this they are not the site of particular interest in most artworks—Langer, for instance, criticises the work of Russian artist Malevich for generating a sense of space in his “magic squares” but nothing else. Secondary illusions, by contrast, present gradients; artworks can function because of the congruence between gradients of feeling and gradients in artworks. Gradients are projected into artworks, and while there are no set rules for how this is done, it is possible to analyse an artwork to see how its effects have been achieved. By stressing the absence of rules of projection, what Langer means is that the results of these analyses cannot be generalised and reapplied—this is one major way in which art images are distinguished from models, which generally do have a single stable rule of projection; the salience of a gradient depends on the artwork. The relationship of secondary illusions to primary illusion is that of feeling to a life of feeling.

Feeling and Form did not find the success of its predecessor, yet it has been mentioned or taught in some aesthetics programmes in the UK and US; perhaps surprisingly, it also seems to have been featured in university aesthetics syllabuses in China and India. Feeling and Form has also been made use of by philosophers seeking to put forward accounts of particular artforms. Robert Hopkins, for instance, has offered a limited defence of her ideas of virtual kinetic volume in sculpture as found in Feeling and Form.

Philosopher Paul Guyer has suggested, in his sub-chapter on Langer in his History of Modern Aesthetics, that the reason for the neglect of Feeling and Form may be timing: its publication in 1953 coincided with that of Wittgenstein’s Philosophical Investigations, a text which preoccupied philosophy departments for decades. Accounts such as Langer’s, which offered a single function that all artworks are meant to perform (expression, in Langer’s case), were not in keeping with the intellectual fashion for proceduralist theories, such as George Dickie’s institutional theory of art or Arthur Danto’s historical account.

Langer produced two other books in this phase of her career. Problems of Art (1957) is a transcribed and edited collection of Langer’s talks on art to different audiences. Because she addresses different sorts of audiences, from the general and non-specialist to the technical, her position on many points is made clearer by the different registers in which she speaks. She had had four years since the publication of Feeling and Form in which to distil many of her ideas into clearer form. The book also contains a reprint of her important and otherwise difficult-to-find 1951 essay in honour of Henry Sheffer, ‘Abstraction in Science and Abstraction in Art’.

Secondly, Langer produced Reflections on Art: A Source Book of Writings by Artists, Critics, and Philosophers (1958). This latter book is a collection of writings on art which Langer considered to be both important and otherwise hard to find. Whilst invaluable in tracing influences on Langer’s ideas, Reflections on Art is not particularly helpful as an introductory text because of its focus on, in particular, metaphor and expression, at the expense of a wider survey of writings on art.

6. Theory of Mind

In the first volume of Mind (1967), Langer sets out the problem as she sees it: the mind-body problem resists solution because it is built on faulty premises (mind is not, in fact, metaphysically distinct from body), and behaviorism in psychology has led to an avoidance of the most pressing issues of the discipline. To tackle this, Langer puts forward the thesis, which she planned to substantiate over three volumes, that the whole of animal and human life, including law, the arts, and the sciences, is a development of feeling. The result is a biologically grounded theory of mind, a conceptual structure within which work in the life sciences can be integrated.

Furthermore, Langer claims that it is possible to know the character of the mind by studying the history of art, which shows the development and variety of feeling in its objectified forms. Langer proceeds, then, to first take issue with the ‘idols of the laboratory’—jargon, controlled experiment, and objectivity, claiming that each of these has its place but has held back progress in the life sciences. Holding that each of these weaknesses is philosophical, Langer argues that scientific knowledge ultimately aims to explain phenomena, and that at a pre-scientific level work is motivated and guided by images which present the phenomenal character of a particular dynamic experience. Images are susceptible to analysis in a way that feeling itself is not. Here Langer calls the art symbol a “systematic device whereby observations can be made, combined, recorded and judged, elements distinguished and imaginatively permuted, and, most important, data exhibited and shared, impressions corroborated.” (Langer, 1967) This is material art history seen as a data set, a treasure trove for psychological research.

Langer goes on to explore artistic projection, the artistic idea, and abstraction in art and science before considering living form—that functional artworks need a semblance of livingness, something Aristotle already remarked upon as the single most important characteristic of art. The image of mind that art provides can be used by those studying the mind to test the validity of their accounts.

This then sets up Langer’s discussion of acts and the growth and evolution of acts. Langer coins a new term, pressions, to name the class of relations which hold between acts and situations, such as impression, expression and suppression. Langer sees the evolution of life as fundamentally the evolution of acts, and sees the dominance of both mechanical models and imputing an agent such as God or Nature to situations as antithetical to serious understanding of this process.

The second volume of Mind deals with a single section of her project, ‘The Great Shift’ from animal mentation to human mind. Starting with plankton, Langer considers progressively more complex examples, addressing topics including instinct and the growth of acts. Langer seeks neither to deny animal feeling nor to anthropomorphise animal feeling and behaviour. She draws on Jakob von Uexküll’s idea of the animal ambient—that differing sensory abilities lead to different animals living in different experiential spaces even if they share the same actual space.

Langer discusses the migration of birds and other animals, arguing that animal travels should be seen as round trips, and migration as an elaboration of the same: a round trip with a long stopover. Also discussed are the parent-young relations of dolphins and the alleged use of language by chimpanzees. Langer brings a large amount of empirical material to bear on these issues before moving on to consider the specialisation of man. She argues that Homo sapiens has been successful because of specialisation, against the argument that the species is a generalist. Langer considers the shape of the human foot, finding in it no evidence that humans ever lived entirely in trees. The role of the foot in facilitating bipedality, upright posture and a larger brain is discussed, as is the hand as a sense organ. Langer then stresses a hugely important feature of the human brain: it is able to complete impulses on a virtual level instead of needing to enact them in the actual world, which liberates the brain for conceptual thought.

Langer discusses dreaming and argues that the evidence suggests the brain requires constant activation, which is what has driven its increase in size and function. She then links the biologically grounded model of mentation she has drawn so far with the components of symbolization, discussing how mental abstraction is affected by memory, the origin of imagination, and the origins of speech in expression rather than communication.

Langer claims then that speech is necessary for social organisation and that all natural languages are socially adequate. Langer discusses the dangers of the imaginative capacity of humanity, and the feeling of reality, before discussing morality—a concern she notes is peculiar to man.

The final volume of Mind is not what Langer had planned: she had intended it to culminate in an epistemological theory and a metaphysics, but owing to poor health and failing eyesight she left the final section of the book as only a brief outline.

What the third volume accomplishes, however, is to make connections between the model of man as the symbolic animal, which had been achieved by the end of the second volume, and various anthropological data relating to tribes, city states, and other societies. The focus of the third volume is considerably broadened to accommodate symbolic mind in society, and Langer by necessity only offers glimpses into this; Adrienne Dengerink Chaplin calls it a “holistic, biologically based, philosophical anthropology.” (Dengerink Chaplin, 2020)

Langer also offers a view in the philosophy of religion: “even as the power of symbolic thought creates the danger of letting the mind run wild, it also furnishes the saving counterbalance of cultural restraint, the orientating dictates of religion.” (Langer, 1953) A religious community and religious symbols keep a rein on individuation, strengthening social bonds; the loss of these religious frameworks in the modern world is a large part of the disorientation of modern life.

As the trajectory of her intellectual career intersected with Wittgenstein’s at several important junctures, it is of interest that she gives a brief verdict on his Philosophical Investigations: that it is a despairing resort to behaviourism.

7. Political Philosophy and Contribution to the ‘Modern Man’ Discourse

Langer’s contribution to political philosophy has received little attention, and her interest in it is certainly minor compared to her substantial interests in logic, the arts, and theory of mind. It consists of chapters on the structure of society in Philosophy in a New Key and the third volume of Mind, and, most notably, articles on the political danger of outdated symbolism governing societies in ‘The Lord of Creation’ (1944), and on what might be done to tackle the persistence of international war in ‘World Law and World Reform’ (1951).

‘The Lord of Creation’ essentially presents the arguments of Philosophy in a New Key through the lens of political philosophy and sociology. Symbolisation, Langer argues, is the source of the distinctiveness of human society—whilst animals, intelligent or not, live very realistic lives, humans are characteristically unrealistic: “magic and exorcism and holocausts—rites that have no connection with common-sense methods of self-preservation.” (Langer, 1944) This is because, Langer claims, people live lives in which there is a constant dialectic of sensory and imaginative activity, so that fantastic elements permeate our experience of reality: “The mind that can see past and future, the poles and the antipodes, and guess at obscure mechanisms of nature, is ever in danger of seeing what is not there, imagining false and fantastic causes, and courting death instead of life.” (Langer, 1944) This human condition has become a human crisis, according to Langer, because scientific progress has led to such upheavals in human living, especially in terms of the symbols which previously gave a shared context to human life.

Industrialisation, secularisation and globalisation have within two centuries, and in many places less, led to a poverty in the governing symbols available to humanity, according to Langer. People are now living together without overarching societal ties of religion or ethnicity, and are left with the vague notion of nationality to unite them, a concept Langer has little patience for, considering it to be a degraded tribalism:

At first glance it seems odd that the concept of nationality should reach its highest development just as all actual marks of national origins – language, dress, physiognomy, and religion – are becoming mixed and obliterated by our new mobility and cosmopolitan traffic. But it is just the loss of these things that inspires this hungry seeking for something like the old egocentric pattern in the vast and formless brotherhood of the whole earth. (Langer, 1944)

The problem is not merely industrial warfare, for Langer, but industrial warfare at a time when ‘modern man’ is simultaneously symbolically impoverished.

‘World Law and World Reform’ is a densely argued twelve pages; Langer argues that whilst civil war is a failure of institutions, and as such not ineradicable, international war is, by nature, institutional. What she means by this is that the power of nation states is backed up by the threat and use of force, and it is the display and use of this force which enables diplomacy. Langer dismisses the notion of popular demand for war, arguing that it is diplomats—here she lists kings, presidents, premiers, other leading personages and their cabinets—who prepare and make war: “The threat of violence is the accepted means of backing claims in the concert of nations, as suit and judgement are in civil life.” (Langer, 1951)

Langer argues that this situation is the result of an essentially tribal philosophy of government, which did relatively little damage in the past, but which has the potential to end human life on earth since the invention of atomic weapons. Her solution is the creation and empowerment of a world judiciary, which would be invested with the power to adjudicate and enforce its decisions. She acknowledges that the United Nations is the most notable international institution of her era and lists five reforms which would make it suitable to perform the role of this world judiciary: “1) Extend membership to all nations; 2) Make the General Assembly a legislative body with power to adopt a constitution; 3) Give the World Court the power of summons, and make its decisions binding; 4) Set up a high secretariat (or other executive) to administer world interests; 5) Internationalize all armed force, setting up a federal guard (not enlisted by national units) and allowing the several nationals national police guards of their own, for domestic use.” (Langer, 1951)

Langer is not optimistic about these steps happening in short order, but she argues that historical parallels exist, and that the steps need not happen in one go and can be worked towards as a far-sighted goal. Her historical parallels are action to combat the Black Death and, later, to end legal child labour. In both of these situations, a pre-existing social malady became intolerable due to social changes which exacerbated it, and it was this which prompted social reform. Langer argues that, similarly, properly constituted world courts could bring an end to international war.

8. Legacy

Because of Langer’s many temporary academic positions, and her focus on research instead of teaching from the mid-1950s onwards, her legacy is mainly to be found in her publications, especially books, rather than in direct influence on students. Having said this, numerous individuals who would go on to be influential in their fields studied with Langer, including artist Eva Hesse and philosopher Arthur Danto. Danto would write the preface to the abridged version of Mind.

Langer herself is a subject of growing interest, with research being undertaken into her life, career, and index card system. The Susanne K. Langer Circle is an international group of scholars with interest in Langer’s work and life and is affiliated with Utrecht University. It hosted the first international conference on Langer’s work in 2022.

Langer’s textbook Introduction to Symbolic Logic was the first introductory book on the subject and made the methods of symbolic logic much more accessible. Randall Auxier has published an updated version of this with many more exercises and expanded discussion.

In philosophy of art, Langer’s ideas on expression have been engaged with by a range of prominent thinkers in the philosophy of music, including Malcolm Budd, Peter Kivy, and Jenefer Robinson. Nelson Goodman’s positions on many issues, in particular those he discusses in Languages of Art (1968), are influenced by Langer’s ideas, something Goodman half acknowledges in his introduction, though Goodman somewhat disingenuously cites Langer directly only as Cassirer’s translator.

Philosopher Brian Massumi has engaged with Langerean thought, particularly her work in Feeling and Form, discussing her ideas on motif and, especially, semblance, writing “Langer has probably gone further than any other aesthetic philosopher toward analyzing art-forms not as “media” but according to the type of experiential event they effect.” (Massumi, 2011) Massumi’s treatment of Langer has, in turn, been an influential point of reference for younger philosophers engaging with her thought.

Jaron Lanier, the ‘father of virtual reality’, attributes the term ‘virtual world’ to Langer; computing and virtual reality pioneer Ivan Sutherland had the Langer of the Feeling and Form era in mind. Here is the first reference to a virtual world in Langer. She is discussing architecture, in particular how one nomadic camp may be set up in the same geographical area where one from another culture used to be, but the sense is extremely evocative when considered in light of virtual reality:

A place, in this non-geographical sense, is a created thing, an ethnic domain made visible, tangible, sensible. As such it is, of course, an illusion. Like any other plastic symbol, it is primarily an illusion of self-contained, self-sufficient, perceptual space. But the principle of organization is its own: for it is organized as a functional realm made visible —the center of a virtual world, the “ethnic domain,” and itself a geographical semblance. (Langer, 1953)

Lanier made the change from virtual world to virtual reality, but the fundamental notion is Langerean. Pioneering media theorist Marshall McLuhan similarly seems to have had Langer in mind, occasionally citing her, when considering how media reshape and reconstitute experience (again, the above quote is suggestive here when considering McLuhan’s famous dictum “the medium is the message”).

In neuroscience, several notable figures have referred approvingly in print to Langer’s ideas on feeling, including Jaak Panksepp, Gerald Edelman, and Antonio Damasio. The latter refers to his notion of background feeling as being what Langer describes, though he arrived at it independently. In psychology, Fred Levin writes that Langer anticipated by decades the notion of feeling that the biosciences would adopt.

9. References and Further Reading

a. Primary Sources

  • A Logical Analysis of Meaning, doctoral thesis, Radcliffe College, 1926.
    • Unpublished thesis making the case for an expanded understanding of meaning which includes religion and the arts, argues that philosophy is the clarification of concepts.
  • The Practice of Philosophy. New York: Henry Holt, 1930.
    • Explains Langer’s perspective on what it is to do philosophy, and its distinction from and relation to other fields, including science, mathematics and logic, and art.
  • An Introduction to Symbolic Logic. New York: Allen and Unwin, 1937. Second revised edition, New York, Dover, 1953.
    • Textbook aiming to take beginners to the point of being able to tackle Russell and Whitehead’s Principia Mathematica.
  • Philosophy in a New Key: A Study in the Symbolism of Reason, Rite and Art. Cambridge, MA: Harvard University Press, 1942.
    • Langer’s most influential book—drawing together researches in psychology, art, ritual, language and logic to claim that there had been a recent philosophical shift predicated on an expanded awareness of the symbolic.
  • ‘The Lord of Creation’. In Fortune Magazine (1944).
    • Popular treatment discussing how the power of symbolisation is both a strength and source of the precariousness of human society.
  • ‘Abstraction in Science and Abstraction in Art’. In Structure, Method and Meaning: Essays in Honor of Henry M. Sheffer, edited by Paul Henle, Horace M. Kallen and Susanne K. Langer, 171–82. New York: Liberal Arts Press, 1951.
    • Defends the thesis that scientific abstraction concerns generalisation whilst artistic abstraction specifies unique objects which are forms of feeling.
  • ‘World Law and World Reform’. The Antioch Review, Vol. 11, No. 4 (Winter, 1951).
    • Langer’s most sustained political philosophy defending the implementation of empowered world courts.
  • Feeling and Form: A Theory of Art Developed from Philosophy in a New Key. New York: Charles Scribner’s, 1953.
    • An Expressivist theory of art which discusses numerous artforms in detail before generalising the conclusions.
  • Problems of Art: Ten Philosophical Lectures. New York: Charles Scribner’s, 1957.
    • Accessible collection of lectures to different audiences on art topics.
  • Reflections on Art: A Source Book of Writings by Artists, Critics, and Philosophers. Editor. Baltimore, MD: Johns Hopkins University Press, 1958.
    • Langer’s choice of aesthetics readings with introduction.

b. Secondary Sources

  • Auxier, Randall E. ‘Susanne Langer on Symbols and Analogy: A Case of Misplaced Concreteness?’ Process Studies 26 (1998): 86–106.
    • Suggests a modification to Langer’s account of symbols and considers this part of her account in relation to that of both Whitehead and Cassirer.
  • Auxier, Randall E. Logic: From Images to Digits. Ronkonkoma: Linus Learning, 2021.
    • An accessible and updated version of Langer’s symbolic logic, separating it from the implied metaphysics of the original.
  • Browning, Margaret M. ‘The Import of Feeling in the Organization of Mind’ in Psychoanalytic Psychology, Vol. 33, No. 2 (2016), pp. 284–298.
    • Pursues and defends a Langerean view of feeling from a neuroscientific perspective.
  • Browning, Margaret M. ‘Our Symbolic Minds: What Are They Really?’ in The Psychoanalytic Quarterly, Vol. 88, No. 1 (2019), pp. 25–52.
    • Discusses intersubjectivity from a Langerean perspective.
  • Budd, M. Music and the Emotions, London: Routledge, 1985.
    • A serious critique of Langer’s musical aesthetics.
  • Dengerink Chaplin, Adrienne. The Philosophy of Susanne Langer: Embodied Meaning in Logic, Art, and Feeling, London: Bloomsbury Academic, 2019.
    • Monograph on Langer with a particular focus on the influence of Wittgenstein, Whitehead, Sheffer and Cassirer on the development of Langer’s thought.
  • Dryden, Donald. ‘The Philosopher as Prophet and Visionary: Susanne Langer’s Essay on Human Feeling in the Light of Subsequent Developments in the Sciences’. Journal of Speculative Philosophy 21, No. 1 (2007): 27–43.
    • A brief summary of some of the applications of Langer’s theory of mind with a view to defending the applicability of the Langerean view.
  • Dryden, Donald. ‘Susanne Langer and William James: Art and the Dynamics of the Stream of Consciousness’. Journal of Speculative Philosophy, New Series, 15, No. 4 (1 January 2001): 272–85.
    • Traces commonalities and distinctions in the ideas on thinking and feeling of James and Langer.
  • Dryden, Donald. ‘Whitehead’s Influence on Susanne Langer’s Conception of Living Form’, Process Studies 26, No. 1–2 (1997): 62–85.
    • A clear account of what Langer does and does not take from Whitehead particularly concerning act form.
  • Gaikis, L. (ed.) The Bloomsbury Handbook of Susanne K. Langer. London: Bloomsbury Academic, 2024.
    • Featuring an extensive collection of major scholars on Langer, this book elucidates her transdisciplinary connections and insights across philosophy, psychology, aesthetics, history, and the arts.
  • Ghosh, Ranjan K. Aesthetic Theory and Art: A Study in Susanne K. Langer. Delhi: Ajanta Books International, 1979.
    • Doctoral dissertation on Langer which takes the unusual step, in an appendix, of applying her theories to specific artworks.
  • Hopkins, R. ‘Sculpture’ in Jerrold Levinson (ed.), The Oxford Handbook of Aesthetics. Oxford University Press. pp. 572–582 (2003).
    • Criticizes and offers a limited defence of Langer’s notion of virtual kinetic volume in sculpture.
  • Innis, Robert E. Susanne Langer in Focus: The Symbolic Mind. Bloomington: Indiana University Press, 2009.
    • The first English-language monograph on Langer; particularly helpful in locating Langer in relation to the pragmatist tradition.
  • Lachmann, Rolf. Susanne K. Langer: die lebendige Form menschlichen Fühlens und Verstehens. Munich: W. Fink, 2000.
    • The first monograph on Langer (German language).
  • Lachmann, Rolf. “From Metaphysics to Art and Back: The Relevance of Susanne K. Langer’s Philosophy for Process Metaphysics.” Process Studies, Vol. 26, No. 1–2, Spring-Summer 1997, 107–25.
    • English-language summary by Lachmann of his above book.
  • Massumi, B. Semblance and Event: Activist Philosophy and the Occurrent Arts. Cambridge, MA: The MIT Press, 2011.
    • An aesthetics of interactive art, ephemeral art, performance art, and art intervention. The titular semblance is Langerean and the early part of the book features an extended discussion on and from ideas taken from Feeling and Form.
  • Nelson, Beatrice K. ‘Susanne K. Langer’s Conception of “Symbol” – Making Connections through Ambiguity’. Journal of Speculative Philosophy, New Series 8, No. 4 (1 January 1994): 277–96.
    • Considers what is involved and at stake in Langer’s synthetic project.
  • Reichling, Mary. ‘Susanne Langer’s Concept of Secondary Illusion in Music and Art’. Journal of Aesthetic Education 29, No. 4 (1 December 1995): 39–51.
    • Opening up of the philosophical discussion on secondary illusions with reference to specific works and art criticism.
  • Sargeant, Winthrop. ‘Philosopher in a New Key’. New Yorker, 3 December 1960.
    • New Yorker profile on Langer.
  • Saxena, Sushil. Hindustani Sangeet and a Philosopher of Art: Music, Rhythm, and Kathak Dance Vis-À-Vis Aesthetics of Susanne K. Langer. New Delhi: D. K. Printworld, 2001.
    • Applies Langerean aesthetics to a type of music Langer did not discuss.
  • Schultz, William. Cassirer and Langer on Myth: An Introduction. London: Routledge, 2000.
    • Discussion of literary myths in Cassirer and Langer, both commonalities and distinctions in their positions.
  • van der Tuin, Iris. ‘Bergson before Bergsonism: Traversing “Bergson’s Failing” in Susanne K. Langer’s Philosophy of Art’. Journal of French and Francophone Philosophy 24, No. 2 (1 December 2016): 176–202.
    • Considers Feeling and Form in relation to the philosophy and reception of Henri Bergson.


Author Information

Peter Windle
Email: peterwindle@gmail.com
University of Kent
United Kingdom

Aristotle: Epistemology

For Aristotle, human life is marked by special varieties of knowledge and understanding. Where other animals can only know that things are so, humans are able to understand why they are so. Furthermore, humans are the only animals capable of deliberating in a way that is guided by a conception of a flourishing life. The highest types of human knowledge also differ in having an exceptional degree of reliability and stability over time. These special types of knowledge constitute excellences of the soul, and they allow us to engage in characteristic activities that are integral to a good human life, including the study of scientific theories and the construction of political communities.

Aristotle’s central interest in epistemology lies in these higher types of knowledge. Among them, Aristotle draws a sharp division between knowledge that aims at action and knowledge that aims at contemplation, valuing both immensely. He gives a theory of the former, that is, of practically oriented epistemic virtues, in the context of ethics (primarily in the sixth book of the Nicomachean Ethics [Nic. Eth.], which is shared with the Eudemian Ethics [Eud. Eth.]), and he gives a theory of the latter both there and in the Posterior Analytics [Post. An.], where the topic of epistemology is not sharply distinguished from the philosophy of science. Lower types of knowledge and other epistemically valuable states are treated piecemeal, as topics like perception, memory and experience arise in these texts as well as in psychological, biological, and other contexts.

Although Aristotle is interested in various forms of error and epistemic mistakes, his theory of knowledge is not primarily a response to the possibility that we are grossly deceived, or that the nature of reality is radically different from the way we apprehend it in our practical dealings and scientific theories. Instead, Aristotle takes it for granted that we, like other animals, enjoy various forms of knowledge, and sets out to enumerate their diverse standards, objects, purposes and relative value. He emphasizes the differences among mundane forms of knowledge such as perception and higher forms such as scientific theorizing, but he also presents an account on which the latter grows organically out of the former. His pluralism about knowledge and his sensitivity to the different roles various forms of knowledge play in our lives give his theory enduring relevance and interest.

Table of Contents

  1. Knowledge in General
  2. Perception
  3. Memory
  4. Experience
  5. Knowledge as an Intellectual Virtue
    1. The Division of the Soul
    2. Scientific Virtues
      1. Theoretical Wisdom
      2. Demonstrative Knowledge
      3. Non-Demonstrative Scientific Knowledge
    3. Practical Knowledge and the Calculative Virtues
      1. Craft
      2. Practical Wisdom
  6. References and Further Reading
    1. Bibliography

1. Knowledge in General

Knowledge in a broad sense (gnōsis, from whose root the word “knowledge” derives; sometimes also eidenai) is enjoyed by all animals from an early stage in their individual development (Generation of Animals [Gen. An.] I 23, 731a30–4). In Aristotle’s usage, it includes everything from a worm’s capacity to discriminate hot and cold to the human ability to explain a lunar eclipse or contemplate the divine (for representative usages, see Post. An. I 1, 71a1–2; II 8, 93a22; II 19, 99b38–9). However, Aristotle shows comparatively little interest in knowledge in this broad sense. The Aristotelian corpus has no surviving treatise devoted to knowledge in all generality, and there is no evidence that Aristotle ever authored such a text. His main interest is in more specific kinds of knowledge. Nevertheless, a few features of Aristotle’s view regarding knowledge in general deserve comment.

First, it is relatively clear that he takes gnōsis to be at least factive (although this is disputed by Gail Fine). That is, if someone (or some animal) has gnōsis that something is the case, then that thing is true. Plausibly, Aristotle takes gnōsis to be not only true cognition, but cognition that is produced by a faculty like perception which reliably yields truths. This makes it tempting to compare Aristotle’s general view of knowledge with contemporary forms of reliabilism such as Ernest Sosa’s or John Greco’s, though the reliability of gnōsis is not a point Aristotle stresses.

Second, Aristotle also treats most kinds of knowledge as relatives (Physics [Phys.] VII 3, 247b1–3). A relative, in Aristotle’s metaphysical scheme, is an entity which is essentially of something else (Cat. 7, 6a37). One example is that of a double, since a double is essentially the double of something else (Cat. 7, 6a39–b1). Likewise, knowledge is essentially knowledge of something-or-other (Cat. 7, 6b5), be it an external particular (De Anima [De An.] II 5, 417b25–7), a universal within the soul (De An. II 5, 417b22–3; compare Phys. VII 3, 247b4–5, 17–18), or the human good (Nic. Eth. VI 5, 1140a25–8). It is fundamental to Aristotle’s way of thinking about knowledge that it is in this way object directed, where the notion of object is a broad one that includes facts, particulars, theories and ethical norms. Aristotle frequently characterizes different types of knowledge by the types of objects they are directed at.

Third, for Aristotle, knowledge generally builds upon itself. In many cases, learning amounts to reconceiving the knowledge we already have, or coming to understand it in a new way (Post. An. I 1, 71b5–8). Further, Aristotle notes that the knowledge we gain when we learn something is often closely connected to the knowledge that we need to already have in order to learn this. For instance, in order to gain a proper geometrical understanding of why a given triangle has internal angles that sum to 180 degrees, we must already know that triangles in general have this angle sum and know that this particular figure is a triangle, whereupon it may be asked: what is this if not already to know that the particular triangle has this angle sum (Post. An. I 1, 71a19–27; cf. Pr. An. II 21, 67a12–22)? Likewise, in order to arrive at knowledge of what something is, that is, of its definition, we must perform an inquiry that involves identifying and scrutinizing things of the relevant kind. That requires knowing that the relevant things exist; however, how can we identify these things if we do not know what defines instances of that kind (Post. An. II 7, 92b4–11; II 8, 93a19–22)?

Aristotle identifies such questions with a famous puzzle raised in Plato’s Meno: how can we search for anything that we do not already know (Post. An. I 1, 71a29, compare Pr. An. II 21, 67a21–2)? Either we already know it, in which case we do not need to look for it, or we do not know it, in which case we do not know what we are seeking to learn and we will therefore not recognize it when we have found it (Meno 80e).

As David Bronstein and Gail Fine have shown, much of Aristotle’s epistemology is structured around this challenge. Aristotle is confident that we can distinguish the prior knowledge required for various types of learning from what we seek to learn; hence, for Aristotle, the puzzle in the Meno amounts to a challenge to articulate what prior knowledge various kinds of learning depend upon. The picture of learning and inquiry we get from Aristotle is, consequently, a thoroughly cumulative one. Typically, we learn by building on and combining what we already know rather than going from a state of complete ignorance to a state of knowledge. Aristotle is concerned to detail the various gradations in intellectual achievement that exist between mundane knowledge and full scientific or practical expertise.

This approach, however, raises a different worry. If we can only gain knowledge by building on knowledge we already have, then the question arises: where does our learning begin? Plato’s answer, at least as Aristotle understands it, is that we have innate latent beliefs in our souls which we can recollect and hence come to know (Post. An. II 19, 99b25–6). Aristotle rejects this view, taking it to require, implausibly, that we have more precise cognitive states in us than we are aware of (Post. An. II 19, 99b26–7). Instead, he adverts to perception as the type of knowledge from which higher cognitive states originate (Post. An. II 19, 99b34–5; cf. Met. I 1, 980a26–7). At least the most rudimentary types of perception allow us to gain knowledge without drawing on any prior knowledge. Thus, for Aristotle, everything learned (both for us and for other animals) starts with perception, such that any lack in perception must necessarily result in a corresponding lack in knowledge (Post. An. I 18, 81a38–9). Depending on the intellectual capabilities of a given animal, perception may be the highest type of knowledge available, or the animal may naturally learn from it, ascending to higher types of knowledge from which the animal can learn in turn (Post. An. II 19, 99a34–100a3; Met. I 1, 980a27–981b6).

2. Perception

For Aristotle, perception is a capacity to discriminate that is possessed by all human and non-human animals (Post. An. II 19, 99b36–7; De An. II 2, 413b2; Gen. An. I 23, 731a30–4), including insects and grubs (Met. I 1, 980a27–b24; De An. II 2, 413b19–22). Every animal possesses at least the sense of touch, even though some may lack other sensory modalities (De An. II 2, 413b8–10, 414a2–3). Each sense has a proper object which only that perceptual modality can detect as such (De An. II 6, 418a9–12): color for sight, sound for hearing, flavor for taste, odor for smell, and various unspecified objects for touch (De An. II 6, 418a12–14). For Aristotle, perception is not, however, limited to the proper objects of the sensory faculties. He allows that we and other animals also perceive a range of other things: various common properties which can be registered by multiple senses, such as shape, size, motion and amount (De An. II 6, 418a17–18), incidental objects such as a pale thing or even the fact that the pale thing is the son of Diares (De An. II 6, 418a21), and possibly even general facts such as that fire is hot (Met. I 1, 981b13; but see below).

Aristotle holds that we are never, or at least only very rarely, in error about the proper objects of perception, like color, sound, flavor, and so on (De An. II 6, 418a12; De An. III 3, 428b18–19). We are, however, regularly mistaken about other types of perceptual objects (De An. III 3, 428b19–25). While I can be mistaken, for instance, about the identity of the red thing I am perceiving (Is it an ember? Is it a glowing insect? Is it just an artifact of the lighting?), I am usually not mistaken that I am seeing red. In Aristotle’s language, this is to say that I am more often in error regarding the incidental objects of perception (De An. III 3, 428b19–22). The common objects of perception are, in his view, even more prone to error (De An. III 3, 428b22–5); for example, I can easily misperceive the size of the red thing or the number of red things there are.

Aristotle gives an account of the way perception works which spans physiology, epistemology and philosophy of mind. In order for perception to occur, there must be an external object with some quality to be perceived and a perceptual organ capable of being affected in an appropriate way (De An. II 5, 417b20–1, 418a3–5). Aristotle posits that each sense organ is specialized and can only be affected in specific ways without being harmed. This explains both why different sensory modalities have different proper objects and why overwhelming stimuli can disable or damage these senses (De An. II 12, 424a28–34; III 2, 426a30–b3; III 13, 435b4–19). Perception takes place when the sensory organ is altered within its natural bounds, in such a way as to take on the sensible quality of the object perceived. In this way, the perceptual organ takes on the sensible form of the object without its matter (De An. II 12, 424a17–19). Much debate has revolved around whether Aristotle means that the organ literally takes on the sensible property (whether, for instance, the eye literally becomes red upon seeing red), or whether Aristotle means that it does so rather in some metaphorical or otherwise attenuated sense.

Some animals, Aristotle holds, have no other form of knowledge except perception. Such animals, in his view, only have knowledge when they are actually perceiving (Post. An. II 19, 99b38–9); they know only what is present to them when their perceptual capacities are in play. The same holds for human perceptual knowledge. If we can be said to have knowledge on account of our merely perceiving something, then this is knowledge we have only at the time when this perception is occurring (Pr. An. II 21, 67a39–67b1; compare Met. VII 15, 1039b27–30). A person has, for instance, perceptual knowledge that Socrates is sitting only when actually perceiving Socrates in a seated position. It follows that we cease to have this knowledge as soon as we cease to perceive the thing that we know by perception (Nic. Eth. VI 3, 1139b21–22; Topics [Top.] V 3, 131b21–22).

For Aristotle, this represents a shortcoming of perceptual knowledge. Perceptual knowledge is transitory or unstable in a way that knowledge ideally is not, since knowledge is supposed to be a cognitive state which we can rely upon (Categories [Cat.] 8, 8b27–30; Post. An. I 33, 89a5–10). Perception is also lacking as a form of knowledge in other ways. Higher types of knowledge confer a grasp of the reasons why something is so, but perception at best allows us to know that something is so (Met. I 1, 981b12–13; Post. An. I 31, 88a1–2). The content of perception is also tied to a particular location and time: what I perceive is that this thing here has this property now (Post. An. I 31, 87b28–30). Even if the content of my perception is a fact like the fact that fire is hot (rather than that this fire is hot), a perceptual experience cannot, according to Aristotle, tell me that fire is in general hot, since that would require me to understand why fire is hot (Met. I 1, 981b13).

Hence, while knowledge begins with perception, the types of knowledge which are most distinctively human are the exercise of cognitive abilities that far surpass perception (Gen. An. I 23, 731a34–731b5). In creatures like us, perception ignites a curiosity that prompts repeated observation of connected phenomena and leads us through a series of more demanding cognitive states that ideally culminate in scientific knowledge or craft (Met. I 1, 980a21–27; Post. An. I 31, 88a2–5; Post. An. II 19, 100a3–b5). The two most important of these intermediate states are memory and experience. Let us turn to these, before considering the types of knowledge that Aristotle considers to be virtues of the soul.

3. Memory

For Aristotle, perception provides the prior knowledge needed to form memories. The capacity to form memories allows us to continue to be aware of what we perceived in the past once the perceived object is no longer present, and thus to enjoy forms of knowledge that do not depend on the continued presence of their objects. Learning from our perceptions in order to form memories thus constitutes an important step in the ascent from perception to higher types of knowledge. With the formation of a memory, we gain epistemic access to the contents of our perceptions that transcends the present moment and place.

Aristotle distinguishes memory from recollection. Whereas recollection denotes an active, typically conscious “search” (On Memory and Recollection [De Mem.] 453a12), memory is a cognitive state that results passively from perception (De Mem. 453a15). In order to form a memory, the perceived object must leave an impression in the soul, like a stamp on a tablet (De Mem. 450a31–2). This requires the soul to be in an appropriate receptive condition, a condition which Aristotle holds to be absent or impaired in both the elderly and the very young (De Mem. 450a32–b7). Aristotle, however, denies that a memory is formed simultaneously with the impression of the perceived object, since we do not remember what we are currently perceiving; we have memories only of things in the past (De Mem. 449b24–26). He infers that there must be a lapse of time between the perceptual impression and the formation of a memory (De Mem. 449b28, 451a24–5, 29–30).

The fact that memory requires an impression raises a puzzle, as Aristotle notices: if perception is necessarily of a present object, but only an impression left by the object is present in our memory, do we really remember the same things that we perceive (De Mem. 450a25–37, 450b11–13)? His solution is to introduce a representational model of memory. The impression formed in us by a sensory object is a type of picture (De Mem. 450a29–30). Like any picture, it can be considered either as a present artifact or as a representation of something else (De Mem. 450b20–5). When we remember something, we access the representational content of this impression-picture. Memory thus requires a sensory impression, but it is not of the sensory impression; it is of the object this impression depicts (De Mem. 450b27–451a8).

While the capacity to form memories represents a cognitive advance over perception thanks to its cross-temporal character, memory is still a rudimentary form of knowledge, which Aristotle takes not to belong to the intellect strictly speaking (De An. I 4, 408b25–9; De Mem. 450a13–14). Memories need not possess any generality (although Aristotle does not seem to rule out the possibility of remembering generalizations), nor does memory as such tell us the reasons why things are so. A more venerable cognitive achievement than memory which, however, still falls short of full scientific knowledge or craft, is what Aristotle calls “experience” (empeiria).

4. Experience

Memories constitute the prior knowledge required to gain experience, which we gain by means of consciously or unconsciously grouping memories of the same thing (Post. An. II 19, 100a4–6; Met. I 1, 980b28–981a1). The type of knowledge we gain in experience confers practical success; in some cases, the practical efficacy (which is not to say the overall value) of this type of knowledge surpasses that of scientific knowledge (Met. I 1, 981a12–15). Aristotle emphasizes the pivotal role of experience in the acquisition of knowledge of scientific principles (Pr. An. I 30, 46a17–20; Post. An. II 19, 100a6), but he considers the proper grasp of scientific principles to be a strictly different (and more valuable) kind of knowledge.

Experience thus sits midway between the awareness of the past we enjoy by way of memory and the explanatory capacity we have in scientific knowledge. Aristotle’s characterization of the content of the knowledge we have in experience has given rise to divergent interpretations. He contrasts experience with the “art” that a scientifically informed doctor has as follows:

[T]o have a judgment that when Callias was ill of this disease this did him good, and similarly in the case of Socrates and in many individual cases, is a matter of experience; but to judge that it has done good to all persons of a certain constitution, marked off in one class, when they were ill of this disease, e.g. to phlegmatic or bilious people when burning with fever–this is a matter of art. (Met. I 1, 981a7–12, trans. Ross)

On one traditional reading, the contrast Aristotle wishes to draw here concerns the generality of what one knows in experience and in scientific knowledge respectively. A person with scientific knowledge knows a universal generalization (for example, “all phlegmatic people are helped by such-and-such a drug when burning with a fever”), whereas a person with experience knows only a string of particular cases which fall under this generalization (“Socrates was helped by such-and-such a drug when burning with a fever”, “Callias was helped by such-and-such a drug when burning with a fever”, and so on).

What distinguishes the experienced person from someone who merely remembers these things, however, is that the memories of the experienced person are grouped or connected (Post. An. II 19, 100a4–6; Met. I 1, 980b28–981a1). Precisely what this grouping or connection comes to is not made clear by the text, but one point suggested by the passage above is that it allows one to competently treat new cases by comparison with old ones. An experienced person would thus, in this example, be able to prescribe the correct drug if, for instance, Polus should arrive with a fever and be of the relevant constitution to benefit from it. The experienced person will do this, however, by comparing Polus with Socrates and Callias, not by means of an explicit grasp of the universal generalization that all phlegmatic people benefit from this drug when suffering from a fever (or even that most of them do). The person with experience thus has a capacity to generalize, but not yet any explicit grasp of the underlying generalization.

One problem for this reading is that outside of this passage Aristotle describes generalizations, even scientifically accurate ones, as things known by experience. In particular, Aristotle describes scientific explananda as things known by experience, where these are taken to be general facts like the fact that round wounds heal more slowly (Post. An. I 13, 79a14–16 with Met. I 1, 981a28–30; Historia Animalium [Hist. An.] VIII 24, 604b25–7; Pr. An. I 30, 46a17–27; and, possibly, Post. An. II 19, 100a3–8). According to Pieter Sjoerd Hasper and Joel Yurdin, the content of experience does not differ from that of scientific knowledge in being any less general or less accurate than scientific knowledge. Instead, what one has experience of is fully precise scientific facts, but what one lacks is a grasp of their causes. On this view, Aristotle’s point in the passage quoted is that someone with experience knows that a certain treatment is effective for all feverish patients who are phlegmatic, but the person does not know why. Experience thus gives one knowledge of scientific explananda; further inquiry or reflection is however needed to have properly scientific knowledge, which requires a grasp of the causes of what one knows by experience.

On either of these interpretations, experience can be seen to contribute a further dimension to the temporal reach of our knowledge. Where memory allows us to retain perceptual knowledge, and thus extends our knowledge into the past, experience extends our knowledge into the future. A person with experience has not only a retrospective grasp of what has cured certain patients; this person has learned from this knowledge what will cure (or is likely to cure) the next patient with the relevant malady, either by direct comparison with previous cases or by grasping the relevant generalization. Since experience presupposes memory, an experienced person has knowledge whose reach extends both backward and forward in time.

5. Knowledge as an Intellectual Virtue

A virtue, for Aristotle, is a particular respect in which a thing is excellent at being what it is or doing what it is meant to do. If, with Aristotle, we suppose that not only our characters but also our intellects can be in better or worse conditions, it makes sense to talk about virtues of intellect as well as virtues of character.

Unlike contemporary virtue epistemologists, who tend to identify knowledge as a type of success issuing from intellectual virtues, Aristotle directly identifies the most desirable types of knowledge with certain intellectual virtues. This has an important effect on his epistemology. A virtue is a kind of stable condition, something a person is qualified with over a period of time rather than (primarily) a thing a person may be said to have or lack on a given occasion. The identification of the highest types of knowledge with virtues thus leads Aristotle to think of these kinds of knowledge as abilities. The relevant abilities include not just practical ones (like building a house) but also purely intellectual abilities, most importantly the ability to contemplate.

Since intellectual virtues must be stable states of the intellect, the best types of knowledge are also those that are difficult to acquire and, conversely, cannot be easily lost or forgotten (Cat. 8, 8b26–9a10). This does not hold of memories (which are easily formed and routinely forgotten) and even less so of perceptual knowledge (which, as we have seen, is for Aristotle a type of knowledge we have just when we are actually perceiving). Only the type of knowledge that is the outcome of protracted instruction or research counts as knowledge in the sense of a virtue (Nic. Eth. VI 7, 1141a18–22). Further, Aristotle thinks we only have this type of knowledge of necessary generalizations which belong to an axiomatizable theory, on the one hand, and of practically pertinent generalizations together with particular facts about their implementation, on the other. His reasons for this view are connected with his division of the human soul.

a. The Division of the Soul

Aristotle takes the human soul to have distinct parts corresponding to our various capacities. He divides the soul first into a rational and a non-rational part. The non-rational part of the soul accounts for the capacities we share with other animals. This part of the soul is divided into a vegetative part, which represents capacities for growth and nutrition, and a part representing the capacities we share with other animals but not with plants.

The rational part of our soul accounts for those capacities by which we seek to grasp truth, capacities which Aristotle takes to be limited to humans and the divine. By “truth”, Aristotle means both the theoretical truth of things that hold independently of us and the practical “truth” of an action or intention that accords with our rational desires (Nic. Eth. VI 2, 1139a26–31). Accordingly, Aristotle divides the rational soul into a calculative and a scientific part corresponding to the different types of truth we seek to grasp (Nic. Eth. VI 1, 1139a6–15; compare Pol. VII 14, 1333a24–5). The calculative part of the rational soul is responsible for the cognitive component of our practical deliberation, while the scientific part of the soul is responsible for our grasp of what we seek to know for its own sake.

Each part of the soul can, for Aristotle, be in a better or a worse condition. Aristotle notes that this also holds of the nutritive part and the capacities for perception, but he shows little interest in the perfection of these capacities in normative contexts, since they are not distinctively human (Nic. Eth. I 7, 1097b33–5; I 13, 1102a32–b3). Perfecting the non-rational part of the soul is, for humans, to acquire virtues of character, such as courage, temperance and magnanimity. These are acquired, if at all, through a process of habituation beginning in childhood (Nic. Eth. II 1, 1103a25–6, b23–5). To perfect the rational part of the soul, on the other hand, is to acquire what Aristotle calls the “intellectual virtues” (Nic. Eth. I 13, 1103a3–5; VI 1, 1138b35–39a1). Such virtues are also acquired only gradually and over a long period of time, but “mostly as a result of instruction” (didaskalia, Nic. Eth. II 1, 1103a15) rather than habituation.

In addition to taking the calculative and the scientific parts of the soul to be concerned with practical and theoretical truth respectively, Aristotle also distinguishes them according to the modal statuses of the truths that they grasp. The virtues of the calculative part of the soul are excellences for grasping truth concerning what is contingent or can be otherwise (Nic. Eth. VI 1, 1139a8), whereas the virtues of the scientific part of the soul concern “things whose principles cannot be otherwise” (Nic. Eth. VI 1, 1139a7–8). This careful formulation leaves open the possibility that the scientific soul may grasp contingencies so long as the things about which it grasps these contingencies have principles which are necessary. There are, for instance, necessary principles which govern the eclipse of the moon, so that one can have scientific knowledge of the eclipse of the moon even though the moon is not always or necessarily eclipsed (Post. An. I 8, 75b33–6; compare I 31, 87b39–88a5; II 8, 93a35–93b3). Aristotle, however, tends to treat such cases as secondary, taking the primary objects of the scientific part of the soul to be strict and exceptionless necessities.

Aristotle takes there to be different intellectual capacities devoted to the grasp of truths of differing modal statuses for a variety of reasons. On the one hand, he thinks of action as the manipulation of truth. If I fashion some planks of wood into a table, I am making it true that these planks are a table (which, before I begin, is false). It follows that intellectual capacities that are directed towards action must have contingent truths as their objects, since if something cannot be otherwise, then a fortiori it cannot come to be otherwise by someone’s agency (Nic. Eth. VI 2, 1139a36–b11).

Conversely, Aristotle takes only necessary truths to be appropriate objects for the form of knowledge that pertains to the scientific part of the soul. The best condition for this part of the soul is one that allows someone to contemplate the truth freely and at will (De An. II 5, 417b23–5). This means that it ought not to need to monitor, intervene in or otherwise “check on” how things stand in the world with respect to what we know. Aristotle thinks that if one could have this sort of knowledge of a contingent state of affairs, then this state of affairs might change without our awareness, pulling the rug, as it were, out from under our knowledge (Nic. Eth. VI 3, 1139b21–2). For instance, if I could have scientific knowledge that Socrates is sitting, and Socrates gets up without me noticing, then I would suddenly no longer know that Socrates is sitting (since it would no longer be true that Socrates is sitting). Hence, if my knowledge is guaranteed to remain knowledge just by me having learned it in the appropriate way, then what I know must be a state of affairs that does not change, and this will be so if scientific knowledge is of necessities.

b. Scientific Virtues

i. Theoretical Wisdom

Wisdom (sophia) is Aristotle’s name for the best condition of the scientific part of the soul (Nic. Eth. VI 1, 1139a16; compare Met. I 2, 983a9–10) and the “most precise of the kinds of knowledge” (Nic. Eth. VI 7, 1141a17–8). This is the state that we are in when our soul grasps the best objects in the universe in the most intellectually admirable way, enabling us to contemplate these objects with total comprehension (Nic. Eth. VI 7, 1141a20–1, 1141b2–8; X 7, 1177a32–b24; Met. I 1, 981b25–982a3). In the best objects, Aristotle surely intends to include God (Met. I 2, 983a4–5) and possibly also the celestial bodies or other things studied in the books of the Metaphysics. He makes clear that humans and their polities are not among these most venerable things: we are in his view plainly not “the best thing there is in the universe” (Nic. Eth. VI 7, 1141a21). Humans and their goals may be the most fitting objects of practical knowledge (Nic. Eth. VI 7, 1141b4–15), but there are better things to contemplate.

To this extent, theoretical wisdom is a distinctively disinterested type of knowledge. It is the limiting case of the type of knowledge we seek when we want to understand something for its own sake rather than because it benefits us or has practical utility, and Aristotle associates it strongly with leisure (scholē) (Met. I 1, 982a14–16; Nic. Eth. VI 12; Pol. VII 14, 1333a16–b5). This does not mean, however, that it is neutral with respect to its ethical value. On the contrary, Aristotle takes the person with superlative wisdom to be “superlatively happy” (Nic. Eth. X 8, 1179a31), and the pursuit of theoretical wisdom is undoubtedly a central component of the good life in his view.

Aristotle also holds that wisdom can be practically advantageous in more mundane ways. He recounts a story about Thales putting his philosophical knowledge to work so as to predict an excellent olive crop and amassing a fortune by buying up all of the oil presses and then loaning them out at a profit (Pol. I 11, 1259a5–23). Yet he stresses both that sophia is not for the sake of such practical advantages (Nic. Eth. VI 7, 1141b2–8) and that it does not require its possessor to be practically wise (Nic. Eth. VI 7, 1141b20–1). He depicts Thales as amassing this wealth to show that “philosophers could easily become wealthy if they wished, but this is not their concern” (Pol. I 11, 1259a15–18).

The best kind of theoretical knowledge has, for Aristotle, the structure of an axiomatic science. One has the best theoretical orientation towards the world when one grasps how each fact of a science concerning the highest things follows from the principles of the highest things (Met. I 2, 982a14–16). Wisdom thus divides into two components, scientific knowledge of certain principles (nous) and the type of scientific knowledge that consists in grasping a scientific proof or “demonstration” issuing from these principles (Nic. Eth. VI 7, 1141a17–20). Someone with the virtue of wisdom understands why the basic principles of theology (or whatever science deals with the best things) are the basic principles of that science, and is also able to prove, in axiomatic fashion, every other theorem in that science on the basis of these principles.

While wisdom is for Aristotle the best kind of theoretical knowledge, he does not hold that this sort of knowledge ought to form a foundation for all other kinds of knowledge or even all other scientific knowledge. This is because he holds that each kind of thing is only properly understood when we understand it according to its own, specific principles (Post. An. I 2, 71b23–25, 72a6; I 6, 74b24–26; I 7; Met. I 3, 983a23–25; Phys. I 1, 184a1–15). Knowledge of the first principles of the highest science might give someone a general understanding of a range of other things (Met. I 2, 982a7–10, 23–24; Nic. Eth. VI 7, 1141a12–15)—it might explain, for instance, why animals move at all by saying that they move in imitation of divine motion—but this sort of general understanding is, for Aristotle, no substitute for the specific kind of understanding we have when we grasp, for example, the mechanics of a particular animal’s motion or the function of this motion in its peculiar form of life.

For this reason, Aristotle takes each scientifically explicable domain to be associated with its own dual virtues of demonstrative and non-demonstrative scientific knowledge. The virtues of demonstrative and non-demonstrative knowledge are, therefore, not characteristics which a person can be said to simply have or to lack in general. Instead, someone might possess the virtues of scientific knowledge with respect to, say, geometry and lack them with respect to, say, human physiology. While one type of scientific knowledge might assist in the acquisition of another, and perhaps even provide some of its principles (Post. An. I 7, 75b14–17; compare I 9, 76a16–25), Aristotle insists that there is a different virtuous state associated with each distinct scientific domain (Nic. Eth. VI 10, 1143a3–4). He does, however, take all such virtues to share a common axiomatic structure, which he lays out in the Posterior Analytics in the course of giving a theory of demonstration.

ii. Demonstrative Knowledge

A demonstration (apodeixis), for Aristotle, is a deductive argument whose grasp imparts scientific knowledge of its conclusion (Post. An. I 2, 71b18–19). Aristotle takes it for granted that we possess a distinctive kind of knowledge by way of deductive reasoning and asks what conditions a deductive argument must satisfy in order to confer scientific knowledge. His primary model for this type of knowledge is mathematics, which, alongside geometrical construction, included the practice of providing a deductive argument from basic principles to prove that the construction satisfies the stated problem. Aristotle however seeks to generalize and extend this model to broadly “mathematical” sciences like astronomy and optics, and, with some qualifications, to non-mathematical sciences like botany and meteorology. His theory of knowledge in the Posterior Analytics (especially the first book) investigates this ideal knowledge state by asking what conditions an argument must satisfy in order to be a demonstration.

Aristotle observes, to begin, that not all deductive arguments are demonstrations (Post. An. I 2, 71b24–26). In particular, an argument from false premises does not confer knowledge of its conclusion (Post. An. I 2, 71b26–27). The notion of demonstration is not, however, simply the notion of a sound deductive argument, since even sound arguments do not provide knowledge of their conclusions unless the premises are already known. Moreover, even sound arguments from known premises may not provide the best kind of knowledge of the conclusion. Aristotle holds that in order to impart the best kind of knowledge of a necessary truth, an argument must establish this truth on the basis of principles that properly pertain to the type of thing the demonstration concerns. A demonstration of some astronomical fact must, for instance, proceed from properly astronomical principles (Pr. An. I 30, 46a19–20). This rules out, on the one hand, arguments from accidental and “chance” features of an object (Post. An. I 6, especially 74b5–12, 75a28–37; I 30) and, on the other, arguments from the principles of a different science (Post. An. I 7, 75b37–40).

Two requirements for demonstration, then, are that the premises be true and that they be non-accidental facts belonging to the relevant science. Assuming that all principles of a science are true and non-accidental, this reduces to the condition that a demonstration be from principles belonging to the relevant science. This, however, is still not a sufficient condition for an argument to be a demonstration. Aristotle famously contrasts the following two arguments (Post. An. I 13, 78a30–7):

Argument One

Things that do not twinkle are near;
the planets do not twinkle;
therefore, the planets are near.

Argument Two

What is near does not twinkle;
the planets are near;
therefore, the planets do not twinkle.

Here by “twinkle” we should understand the specific astronomical phenomenon whereby a celestial body’s visual intensity modulates in the way that a distant star’s does on a clear night, and we should understand both of these arguments to quantify over visible celestial bodies. If we do, then both of these arguments are sound. The planets are near the earth (relative to most astronomical bodies), and they do not display the astronomical property of twinkling. It is also true that bodies which are relatively close to us, as compared to the stars, fail to display this effect, so the first premise of Argument Two is true. Further, only visible celestial bodies which are near to us fail to twinkle, so the first premise of Argument One is also true. All of the premises in these two arguments are also, in Aristotle’s view, properly astronomical facts. To this extent they both establish that their respective conclusions hold as a matter of astronomical science.

The latter argument is in Aristotle’s view superior, however, in that it establishes not only that the conclusion holds but also why it does. In a completed theory of astronomy, the non-twinkling of the planets might be explained by recourse to their nearness to us, for example by adding that other celestial bodies obstruct the light issuing from more distant ones. Argument Two conveys this explanation by presenting the immediate cause of the conclusion, nearness, as a middle term shared between the two premises. The two premises in the argument thus not only prove the conclusion; they jointly explain what makes the conclusion true.

On the other hand, while the facts that non-twinkling celestial bodies are near and that all planets are non-twinkling celestial bodies do provide perfectly legitimate grounds to infer that the planets are near, an argument from these premises provides little insight into why the planets are near. The fact that the planets are near might take significant work to establish (it might even be established using a chain of reasoning such as that in Argument One), but it would be a confusion, in Aristotle’s view, to think that the soundness of Argument One and the scientific character of its premises shows that the non-twinkling of the planets explains their nearness. The order of explanation runs rather in the opposite direction: they do not twinkle because they are near. In a completed science of astronomy as Aristotle conceives it, where it is assumed that the gross distance of all celestial bodies from the earth is eternally fixed, the nearness of the planets would presumably be treated as a fundamental given from which other things may be explained, not as a fact requiring explanation.

Someone is in a better cognitive condition with respect to a given fact, Aristotle evidently holds, if that person not only knows that it is true but also grasps why it is true. Aristotle does not argue for this position, but it is not difficult to imagine what reasons he might give. We naturally desire not just to know but to understand; curiosity is sated by explanation rather than sheer fact. Further, understanding confers stability on what we know, and Aristotle takes stability to be a desirable quality of knowledge (Cat. 8, 8b28–30; Post. An. I 33, 89a5–10). If I understand why the planets must not twinkle (rather than knowing that this is so but having no idea why), then I will be less likely to give up this belief in light of an apparent observation to the contrary, since to do so would require me to also revise my beliefs about what I take to be the explanation. This is especially so if I understand how this fact is grounded, as Aristotle requires of demonstrative knowledge, in the first principles of a science, since renouncing that piece of knowledge would then require me to renounce the very principles of my scientific theory.

Hence, the type of deduction which places one in the best cognitive condition with respect to an object must be explanatory of its conclusion in addition to being a sound argument with premises drawn from the correct science. The notion of explanation Aristotle works with in laying down this condition is a resolutely objective one. Scientific explanations are not just arguments that someone, or some select class of people, finds illuminating. They are the best or most appropriate kinds of explanations available for the fact stated in the conclusion because they argue from what is prior to the conclusion in the order of nature (Phys. I.1, 184a10–23). Further, the fact that a given set of premises explains their conclusion need not be obvious or immediately clear. Aristotle leaves open the possibility that it might be a significant cognitive achievement to see that the premises of a given demonstration explain its conclusion (Post. An. I 7, 76a25–30).

When someone does grasp demonstrative premises as explanatory of a given demonstrative conclusion, the argument is edifying because it tracks some objective fact about how things stand with the relevant kind (celestial bodies, triangles, and so on). Aristotle describes the way that scientific knowledge correctly tracks the order of things in terms of “priority” (Post. An. I 2, 71b35–72a6). Borrowing the terminology of contemporary metaphysics, we might gloss this by saying that demonstrations reveal facts about grounding in a way that not all deductive arguments do. The second syllogism is better than the first one because the fact that the planets are near, together with the relevant universal generalization about the optical behavior of near bodies, grounds the fact that the planets do not twinkle. These premises are in an objective sense responsible for the fact that the planets do not twinkle. Given the assumption that grounding is asymmetric, the premises of the first syllogism cannot also ground its conclusion.

Aristotle’s account of the specific content and logical form of scientific principles is notoriously obscure. Key texts, which do not obviously stand in agreement, are Post. An. I 2, I 4, I 10, II 19 and Pr. An. I 30. Nevertheless, there are a few key ideas which Aristotle remains consistently committed to. First, a particularly important type of principle is one that states what something is, or its definition (Post. An. I 2, 72a22). The centrality accorded to this type of principle suggests a project of grounding all truths of a given science in facts about the essences of the kind or kinds that this science concerns. Aristotle however seems aware of problems with such a rigid view, and admits non-specific or “common” axioms into demonstrative sciences (Post. An. I 2, 72a17–19; I 10, 76b10–12). These include axioms like the principle of non-contradiction, which are in some sense assumed in every science (Post. An. I 2, 72a15–17; I 11, 77a30), as well as those like the axiom that equals taken from equals leave equals, which can be given both an arithmetical and a geometrical interpretation (Post. An. I 10, 76a41–b1; I 11, 77a30).

Aristotle also briefly discusses the profile of conviction (pistis) that someone with scientific knowledge ought to display. At least some of the principles of a demonstration, Aristotle holds, should be “better known” to the expert scientist than their conclusions, and the scientist should be “more convinced” of them (Post. An. I 2, 72a25–32). This is motivated in part by the idea that the principles of demonstrations are supposed to be the source or grounds for our knowledge of whatever we demonstrate in science. Aristotle also connects it with the requirement that someone who grasps a demonstration should be “incapable of being persuaded otherwise” (Post. An. I 2, 72b3). Someone with demonstrative knowledge in the fullest sense will never renounce their beliefs under dialectical pressure, and Aristotle thinks this requires someone to be supremely confident in the principles that found her demonstrations.

iii. Non-Demonstrative Scientific Knowledge

The claim that the principles of demonstrations must be better known than their conclusions generates a problem. If the best way to know something theoretically is by demonstration, and the premises of demonstrations must be better known than their conclusions, then the premises of demonstrations will themselves need to be demonstrated. However, these demonstrations in turn will also have premises requiring demonstration, and so on. A regress looms.

Aristotle canvasses three possible responses to this problem. First, demonstrations might extend back infinitely: there might be an infinite number of intermediate demonstrations between a conclusion and its first principles (Post. An. I 3, 72b8–10). Aristotle dismisses this solution on the grounds that we can indeed have demonstrative knowledge (Post. An. I 1, 71a34–72b1), and that we could not have it if having it required surveying an infinite series of arguments (Post. An. I 3, 72b10–11).

Two other views, which Aristotle attributes to two unnamed groups of philosophers, are treated more seriously. Both assume that chains of demonstrations terminate. One group says that they terminate in principles which cannot be demonstrated and consequently cannot be known (or at least, not in the demanding way required by scientific knowledge (Post. An. I 3, 72b11–13)). The other holds that demonstrations “proceed in a circle or reciprocally” (Post. An. I 3, 72b17–18): the principles are demonstrated, but from premises which are in turn demonstrated (directly or indirectly) from them.

Aristotle rejects both of these views, since both possibilities run afoul of the requirement that the principles of demonstrations be better known than their conclusions. The first alternative maintains that the principles are not known, or at least not in any scientifically demanding way, while the second requires that the principles in turn be demonstrated from (and hence not better known than) other demonstrable facts.

Aristotle’s solution is to embrace the claim that the principles are indemonstrable but to deny that this implies they are not known or understood in a rigorous and demanding way. There is no good reason, Aristotle maintains, to hold that demonstrative knowledge is the only or even the best type of scientific knowledge (Post. An. I 3, 72b23–5); it is only the best kind of scientific knowledge regarding what can be demonstrated. There is a different kind of knowledge regarding scientific principles. Aristotle sometimes refers to this as “non-demonstrative scientific knowledge” (Post. An. I 3, 72b20; I 33, 88b36) and identifies or associates it strongly with nous (Post. An. I 33, 88b35; II 19, 100b12), which is translated variously as comprehension, intellection and insight.

Aristotle is therefore a foundationalist insofar as he takes all demonstrative knowledge to depend on a special sort of knowledge of indemonstrable truths. It should be stressed, however, that Aristotle’s form of foundationalism differs from the types that are now more common in epistemology.

First, Aristotle professes foundationalism only regarding demonstrative knowledge in particular: he does not make any similar claim that perceptual knowledge has a foundation in, for instance, the perception of sense data, nor does he extend the claim to practical knowledge or to our knowledge of scientific principles.

Second, as we have seen, Aristotle’s view is that scientific knowledge is domain-specific. Expert knowledge in one science does not automatically confer expert knowledge in any other, and Aristotle explicitly rejects the idea of a “super-science” containing the principles for all other sciences (Post. An. I 32). Hence, Aristotle defends what we might term a “local” foundationalism about scientific knowledge. Our knowledge of geometry will have one set of foundations, our knowledge of physics another, and so on (compare Post. An. I 7; I 9, 75b37–40; 76a13–16; Nic. Eth. VI 10, 1143a3–4).

Third, the faculty which provides our knowledge of the ultimate principles of demonstrative knowledge is, as we have seen, itself a rational faculty, albeit one which does not owe its knowledge to demonstration. Hence, Aristotle does not take the foundation of our demonstrative knowledge to be “brute” or “given”; his claim, more modestly, is that our knowledge of scientific principles must be of a different kind than demonstrative knowledge.

Finally, Aristotle’s foundationalism should not be taken to imply that we need to have knowledge of principles prior to discovering any other scientific facts. In at least some cases, Aristotle takes knowledge of scientific explananda to be acquired first (Post. An. II 2, 90a8–9), by perception or induction (Post. An. I 13, 78a34–5). Only later do we discover the principles which allow us to demonstrate them, and thus enjoy scientific knowledge of them.

Aristotle does not say much about the specific character of our knowledge of principles, and its nature has been the subject of much debate. As we have seen, Aristotle requires the principles to be “better known” (at least in part) than their demonstrative consequences, and he also refers to this type of knowledge as “more precise” (Post. An. II 19, 99b27) than demonstrative knowledge. Some scholars take Aristotle’s view to be that the principles are self-explanatory, while others take the principles to be inexplicable.

His views about the way we acquire knowledge of first principles have also been subject to varying interpretations. Traditionally, Aristotle’s view was taken to be that we learn the first principles by means of exercising nous, understood as a capacity for abstracting intelligible forms from the impressions left by perception. Subsequent scholars have pointed to the dearth of textual evidence for ascribing such a view to Aristotle in the Posterior Analytics, however. Aristotle calls the state we are in when we know first principles nous (Post. An. II 19, 100b12), but he does not claim that we learn first principles by means of exercising a capacity called nous.

A second possibility is that Aristotle thinks we obtain knowledge of scientific principles through some form of dialectic—a competitive argumentative practice outlined in the Topics that operates with different standards and procedures than scientific demonstration. Another view, defended by Marc Gasser-Wingate, is that our knowledge of the first principles is both justified and acquired by what Aristotle calls “induction” (epagōgē)—a non-deductive form of scientific argument in which we generalize from a string of observed cases or instances.

Some scholars also divide the question about how we first come to know the first principles from questions about what justifies this knowledge in the context of a science. One suggestion is that Aristotle takes the justification for nous to consist in a recognition of the principles as the best explanations for other scientific truths. On one version of this view, forcefully defended by David Charles, knowledge of first principles is not acquired prior to our knowledge of demonstrable truths; rather, we gain the two in lockstep as we engage in the process of scientific explanation. On other versions of this view, we come to know the first principles in some less demanding way before we come to appreciate their explanatory significance and thus have proper scientific knowledge of them. David Bronstein, who defends a version of the latter view, argues that Aristotle recommends a range of special methods for determining first principles, including, importantly, a rehabilitation of Plato’s method of division.

c. Practical Knowledge and the Calculative Virtues

Wisdom and scientific knowledge (demonstrative and non-demonstrative) are the excellences of the scientific part of our soul, that part of us devoted to the contemplation of unchanging realities. Aristotle takes the type of knowledge that we employ in our dealings with other people and our manipulation of our environment to be different in kind from these types of knowledge, and gives a separate account of their respective justification, acquisition, and purpose.

The goal of practical knowledge is to enable us to bring about changes in the world. Where attaining theoretical knowledge is a matter of bringing one’s intellect into conformity with unchanging structures in reality, practical knowledge involves a bidirectional relationship between one’s intellect and desires, on the one hand, and the world, on the other. As practical knowers we seek not only to conform our desires and intellects to facts about what is effective, ethical and practically pertinent; we also seek to conform the world to what we judge to be such. Hence, practical knowledge can have as its objects neither necessities (since no one can coherently decide to, for example, change the angle sum of a triangle (Nic. Eth. III 2, 1112a21–31)) nor what is in the past (someone might make a decision to sack Troy, but “no one decides to have sacked Troy” (Nic. Eth. VI 2, 1139b7)). Only present and future contingencies are, in Aristotle’s view, possible objects of practical knowledge.

Aristotle distinguishes two activities that are enabled by practical thinking: action (praxis) and production (poiēsis) (Nic. Eth. VI 2, 1139a31–b5). Production refers to those doings whose end lies outside of the action itself (Nic. Eth. VI 5, 1140b6–7), the paradigm of which is the fashioning of a craft object like a shoe or a house. Aristotle recognizes that not all of our doings fit this mold, however: making a friend, doing a courageous act, and other activities laden with ethical significance fit the model of manufacturing a product only with strain. Such activities do aim to bring about changes in reality, but their end is not separate from the action itself. In performing a courageous act, say, I am, in Aristotle’s view, simply aiming to exercise the virtue of courage appropriately. Praxis is Aristotle’s name for doings such as these. It refers to a distinctively human kind of action, one not shared by other animals (Nic. Eth. VI 2, 1139a20; III 3, 1112b32; III 5, 1113b18), involving deliberation and judgment that an action is the best way of fulfilling one’s goals (Nic. Eth. VI 2, 1139a31).

Both action and production require more than just knowledge in order to be performed well. In particular, the best kind of action also requires the doer to be virtuous, and this, for Aristotle, has a desiderative as well as an epistemic component. Someone is only virtuous if that person desires the right things, in the right way, to the right extent (Nic. Eth. II 3, 1104b3–13; II 9 1109b1–5; III 4 1113a31–3; IV 1 1120a26–7; X I, 1172a20–3). Further, Aristotle does not take the starting points of our practical knowledge to be themselves objects of practical knowledge. It is not part of someone’s technical expertise to know that, for example, a sword is to be made or a patient to be healed; rather, a blacksmith or a doctor in her capacity as such takes for granted that these ends are to be pursued, and it is the job of her practical knowledge to determine actions which bring them about (Nic. Eth. III 3, 1112b11–16). The proper ends of actions, meanwhile, are given by virtue (Nic. Eth. VI 12, 1144a8, 20), and the virtues are habituated into us from childhood (Nic. Eth. II 2, 1103a17–18, 23–26, 1103b23–25). Nevertheless, Aristotle takes certain types of knowledge to be indispensable for engaging in action and production in the best possible ways. He identifies these types of knowledge with the intellectual virtues of practical wisdom (phronēsis) and craft (technē).

i. Craft

Craft (technē) is Aristotle’s name for the type of knowledge that perfects production (Nic. Eth. VI 4, 1140a1–10; Met. IX 2, 1046a36–b4). Aristotle mentions a treatise on craft which seems to have given a treatment of it roughly parallel to the treatment of scientific knowledge he gives in the Posterior Analytics (Nic. Eth. VI 4, 1140a2–3; compare VI 3, 1139b32), but this treatise is lost. Aristotle’s views on technē must be pieced together from scattered remarks and an outline of this treatise’s contents in the Nicomachean Ethics (VI 4).

As with scientific knowledge, Aristotle does not take craft to be a monolithic body of knowledge. He holds, sensibly enough, that a different type of technical knowledge is required for bringing into being a different kind of object. Aristotle’s stock example of a technē is the construction of houses (Nic. Eth. VI 4, 1140a4). In constructing a house, a craftsperson begins with the form of the house in mind together with a desire to bring one about, and practical knowledge is what enables this to lead to the actual presence of a house (Met. VII 7, 1032a32–1032b23; Phys. II 2, 194a24–27; II 3, 195b16–25).

In order for this to occur, a person with craft must know the “true prescription” pertaining to that practice (Nic. Eth. VI 4, 1140a21), that is, the general truths concerning how the relevant product is to be brought about. In the case of housebuilding, these might include the order in which various housebuilding activities need to be carried out, the right materials to use for various parts, and the correct methods for joining different types of materials. While a merely “experienced” housebuilder might manage to bring about a house without such prescriptions, they would not, in Aristotle’s view, bring about a house in the best or most felicitous way, and hence could not be said to operate according to the craft of housebuilding.

Aristotle indicates that these prescriptions fit together in a causal or explanatory way (Met. A 1, 981a1–3, 28–981b6; Post. An. II 19, 100a9). This view is plausible. Someone with the best kind of knowledge about how to bring about some product will presumably not only know what should be done but also understand why that is the correct thing to do. Such understanding, after all, has not only theoretical interest but also practical benefit. Suppose, for instance, that the craft of housebuilding prescribes that one should bind bricks using straw. Someone who understands why this is prescribed will be in a better position to know what else can be substituted should straw be unavailable, or even when it may be permissible to omit the binding agent. None of this is to say that a practitioner of a craft requires the same depth of understanding as someone with scientific knowledge, however. A technician of housebuilding does not need to know, for example, the chemical or physical principles which explain why and how binding agents work at a microscopic scale.

Given that it involves a kind of understanding, knowing the craft’s correct prescriptions in the way required by craft is a significant intellectual accomplishment. Nevertheless, this is not sufficient for having craft knowledge, according to Aristotle. Someone with craft knowledge must also have a “productive disposition” (Nic. Eth. VI 3, 1140a20–1), that is, a tendency to actually produce the goods according to these prescriptions when they have the desire to do so. Aristotle makes this disposition a part of craft knowledge itself, and not merely an extra condition required for practicing the craft, for at least three reasons.

First, someone does not count as having craft knowledge if that person has only a theoretical grasp of how houses are to be made, for example. Having craft knowledge requires knowledge of how to build houses, and Aristotle thinks that this sort of knowledge is only available to someone with a disposition to actually build them. Second, unlike mathematical generalizations, the prescriptions grasped in technē are not exceptionless necessities (Nic. Eth. VI 4, 1140a1–2; compare Nic. Eth. VI 2, 1139a6–8). Hence, simply knowing these prescriptions (even if one has every intention of fulfilling them) is not in itself sufficient for an ability to actually bring about the relevant product. One must have an ability to recognize when a rule of thumb about, say, the correct materials to use in building a house fails to apply. Aristotle thinks of this type of knowledge as existing in a disposition to apply the prescriptions correctly rather than as an auxiliary theoretical prescription.

A third, related reason is that the process of production requires one to make particular decisions that go beyond what is specified in the prescriptions given by that craft. Thus, even where the craft prescription instructs the builder to, for instance, separate cooking and sleeping quarters or to have a separate top floor for this kind of house, it may not specify the specific arrangement of these quarters or the precise elevation of the second floor. The ability to make such decisions in the context of practicing a craft is, for Aristotle, conferred by the productive disposition involved in craft knowledge rather than by the grasp of additional prescriptions.

ii. Practical Wisdom

Practical wisdom is the central virtue of the calculative part of the soul. This type of knowledge makes one excellent at deliberation (Nic. Eth. VI 9, 1142b31–3; VI 5, 1140b25). Since deliberation is Aristotle’s general term for reasoning well in practical circumstances, practical wisdom is also the type of knowledge that perfects action (praxis). More generally, practical wisdom is the intellectual virtue “concerned with things just and fine and good for a human being” (Nic. Eth. VI 12, 1143b21–22). It includes, or is closely allied with, a number of related types of practical knowledge that inform ethical behavior: good judgment (gnōmē), which Aristotle characterizes as a sensitivity to what is reasonable in a given situation (Nic. Eth. VI 12, 1143a19–24); comprehension (sunesis), an ability to discern whether a given action or statement accords with practical wisdom (Nic. Eth. VI 10, 1143a9–10); and practical intelligence (nous, related to, but distinct from the theoretical virtue discussed above), which allows one to spot or recognize practically pertinent particulars (Nic. Eth. VI 11, 1143b4–5).

Practical wisdom thus serves to render action rather than production excellent. One important difference between practical wisdom and craft immediately follows. Whereas in craft someone performs an action for the purpose of creating something “other” than the production itself, the end of practical wisdom is the perfection of the action itself (Nic. Eth. VI 5, 1140b6–7). Nevertheless, in many respects, Aristotle’s view of practical wisdom is modeled on his view of craft knowledge. Like craft knowledge, the goal of practical wisdom is to effect some good change rather than simply to register the facts as they stand. In addition, like craft, this type of knowledge involves both a grasp of general prescriptions governing the relevant domain and an ability to translate these generalities into concrete actions. In the case of practical wisdom, the domain is the good human life generally (Nic. Eth. VI 8, 1141b15; Nic. Eth. VI 12, 1143b21–2), and the actions which it enables are ethically good actions. Hence, the general prescriptions associated with practical wisdom concern the living of a flourishing human life, rather than any more particular sphere of action. Practical wisdom also, like craft, involves an ability to grasp the connections between facts, but in a way that is specifically oriented towards action (Nic. Eth. VI 2 1139a33–1139b5; Nic. Eth. VI 7, 1141b16–20).

Some of the complications involved in moving from general ethical prescriptions to concrete actions also mirror those regarding the movement from a craft prescription to the production of a craft object. For one, Aristotle holds that many or all general truths in ethics likewise hold only for the most part (Nic. Eth. I 3, 1094b12–27; 1098a25–34). The ethical prescription, for instance, to reciprocate generosity is a true ethical generalization, even if not an exceptionless one (Nic. Eth. IX 2, 1164b31). If ethical norms permit of exceptions, then knowing these norms will not always be sufficient for working out the ethical thing to do. A further epistemic capacity will be required in order to judge whether general ethical prescriptions apply in the concrete case at hand, and this is plausibly one function of phronēsis.

Aristotle also describes phronēsis as a capacity to work out what furthers good ends (Nic. Eth. VI 5, 1140a25–9; VI 9, 1142b31–3). He distinguishes it from the trait of cleverness, a form of means-end reasoning that is indifferent to the ethical quality of the ends in question (Nic. Eth. VI 12, 1144a23–5). Phronēsis is an ability to further good ends in particular, and to further them in the most appropriate ways. It has also been argued that phronēsis has the function of recognizing, not only the means to one’s virtuous ends, but also what would constitute the realization of those ends in the first place. For instance, I might have the intention to be generous, but it is another thing to work out what it means to be generous to this friend at this time under these circumstances. This is parallel to the way that one needs, in, say, seeking to construct a house, to decide which particular type of house to construct given the constraints of location and resources.

One crucial difference between craft knowledge and practical wisdom is, however, the following. Whereas it suffices for craft knowledge to find a means to an end which is in accord with the goals of that craft, a practically wise person must find a way of realizing an ethical prescription which is in accord with all of the ethical virtues (Nic. Eth. VI 12, 1144a29–1144b1). This is a considerable practical ability in its own right, especially when the demands of different virtues come into conflict, as they might, for instance, when the just thing to do is not (or not obviously) the same as the kind or the generous thing to do. Practical wisdom thus requires, first, that one has all of the virtues so as to be sensitive to their various demands (Nic. Eth. VI 13, 1145a1–2). Over and above the possession of the virtues, practical wisdom calls for an ability to navigate their various requirements and arbitrate between them in concrete cases. In this way, it constitutes a far higher achievement than craft knowledge, since a person with practical wisdom grasps and succeeds in coordinating all of the goods constitutive of a human life rather than merely those directed towards the production of some particular kind of thing or the attainment of some specific goal.

6. References and Further Reading

Two good overviews of Aristotle’s views about knowledge, with complementary points of emphasis, are Taylor (1990) and Hetherington (2012). Bolton (2012) emphasizes Aristotle’s debt to Plato in epistemology. Fine (2021) is one of the few to treat Aristotle’s theory of knowledge in all generality at significant length, but readers should be aware that some of her central theses are not widely supported by other scholars. More advanced but nevertheless accessible pieces on Aristotle’s epistemology and philosophy of science may be found in Smith (2019), Anagnostopoulos (2009) and Barnes (1995).

The most in-depth study of Aristotle’s theory of scientific knowledge in the Posterior Analytics is Bronstein (2016), which focuses on the prior knowledge requirement and reads Aristotle’s views as a response to Meno’s paradox. See also Angioni (2016) on Aristotle’s definition of scientific knowledge in the Posterior Analytics. McKirahan (1992) and Barnes (1993) both provide useful commentary on the Posterior Analytics. See Barnes (1969), Burnyeat (1981), Lesher (2001) and Pasnau (2013) for views concerning whether Aristotle’s theory in the Posterior Analytics is best viewed as an epistemology, a philosophy of science, or something else. Sorabji (1980) also contains penetrating discussions of many specific issues in Aristotle’s epistemology and philosophy of science. For scholarly issues, Berti (1981) is still an excellent resource.

On Aristotle’s scientific method more generally, see Lennox (2021), Bolton (1987) and Charles (2002). For how we acquire knowledge of first principles, important contributions include Kahn (1981) and Bayer (1997) (who defend a view close to the traditional one), Irwin (1988) (who argues for the importance of a form of dialectic in coming to know first principles), and Gasser-Wingate (2016) (who argues for the role of induction and perception). Morison (2019) as well as Bronstein (2016) discuss at length the nature of knowledge of first principles and its relationship to nous in Aristotle.

Shields (2016) provides an excellent translation and up-to-date commentary on the De Anima. Kelsey (2022) gives a novel reading of De Anima as a response to Protagorean relativism. For Aristotle’s views on perception, see Modrak (1987) and Marmodoro (2014). Gasser-Wingate (2021) argues for an empiricist reading of Aristotle, against the rationalist reading of Frede (1996). On the more specific issue of whether Aristotle takes perception to involve a literal change in the sense organ, one can start with Caston (2004), Sorabji (1992) and Burnyeat (2002).

For the Nicomachean Ethics, Broadie and Rowe (2002) provide useful, if partisan, philosophical introduction and commentary, while Reeve (2014) provides extensive cross-references to other texts. For Aristotle’s views about practical wisdom, Russell (2014) and Reeve (2013) are useful starting points. Walker (2018) gives a prolonged treatment of Aristotle’s views about contemplation and its alleged “uselessness”, and Ward (2022) provides interesting background on the religious context of Aristotle’s views.

a. Bibliography

  • Anagnostopoulos, Georgios (ed.). 2009. A Companion to Aristotle. Sussex: Wiley-Blackwell.
  • Angioni, Lucas. 2016. Aristotle’s Definition of Scientific Knowledge. Logical Analysis and History of Philosophy 19: 140–66.
  • Barnes, Jonathan. 1969. Aristotle’s Theory of Demonstration. Phronesis 14: 123–52.
  • Barnes, Jonathan. 1993. Aristotle: Posterior Analytics. Oxford: Clarendon Press.
  • Barnes, Jonathan (ed.). 1995. The Cambridge Companion to Aristotle. Cambridge: Cambridge University Press.
  • Bayer, Greg. 1997. Coming to Know Principles in Posterior Analytics II.19. Apeiron 30: 109–42.
  • Berti, Enrico (ed.). 1981. Aristotle on Science: The Posterior Analytics. Proceedings of the Eighth Symposium Aristotelicum Held in Padua from September 7 to 15, 1978. Padua: Editrice Antenore.
  • Bolton, Robert. 1987. Definition and Scientific Method in Aristotle’s Posterior Analytics and Generation of Animals. In Philosophical Issues in Aristotle’s Biology. Cambridge: Cambridge University Press.
  • Bolton, Robert. 1997. Aristotle on Essence and Necessity. Proceedings of the Boston Area Colloquium in Ancient Philosophy (edited by John J. Cleary) 13:113–38. Leiden: Brill.
  • Bolton, Robert. 2012. Science and Scientific Inquiry in Aristotle: A Platonic Provenance. In The Oxford Handbook of Aristotle (edited by Christopher Shields), 46–59. Oxford: Oxford University Press.
  • Bolton, Robert. 2014. Intuition in Aristotle. In Rational Intuition: Philosophical Roots, Scientific Investigations, 39–54. Cambridge: Cambridge University Press.
  • Bolton, Robert. 2018. The Search for Principles in Aristotle. In Aristotle’s Generation of Animals: A Critical Guide (edited by Andrea Falcon and David Lefebvre), 227–48. Cambridge: Cambridge University Press.
  • Broadie, Sarah, and Christopher Rowe. 2002. Nicomachean Ethics. Philosophical Introduction and Commentary by Sarah Broadie (translated by Christopher Rowe). New York: Oxford University Press.
  • Bronstein, David. 2010. Meno’s Paradox in Posterior Analytics 1.1. Oxford Studies in Ancient Philosophy 38: 115–41.
  • Bronstein, David. 2012. The Origin and Aim of Posterior Analytics II.19. Phronesis 57(1): 29–62.
  • Bronstein, David. 2016. Aristotle on Knowledge and Learning: The Posterior Analytics. Oxford: Oxford University Press.
  • Bronstein, David. 2020. Aristotle’s Virtue Epistemology. In What the Ancients Offer to Contemporary Epistemology (edited by Stephen Hetherington and Nicholas Smith), 157–77. New York: Routledge.
  • Burnyeat, Myles. 1981. Aristotle on Understanding Knowledge. In Aristotle on Science: The Posterior Analytics (edited by Enrico Berti). Padua: Editrice Antenore.
  • Burnyeat, Myles. 2002. De Anima II 5. Phronesis 47(1): 28–90.
  • Burnyeat, Myles. 2011. Episteme. In Episteme, Etc. Essays in Honour of Jonathan Barnes (edited by Benjamin Morison and Katerina Ierodiakonou), 3–29. Oxford: Oxford University Press.
  • Byrne, Patrick. 1997. Analysis and Science in Aristotle. Albany: State University of New York Press.
  • Caston, Victor. 2004. The Spirit and the Letter: Aristotle on Perception. In Metaphysics, Soul and Ethics: Themes from the Work of Richard Sorabji (edited by Ricardo Salles), 245–320. Oxford University Press.
  • Charles, David. 2002. Aristotle on Meaning and Essence. Oxford: Oxford University Press.
  • Fine, Gail. 2021. Aristotle on Knowledge. In Essays in Ancient Epistemology, 221–32. Oxford University Press.
  • Frede, Michael. 1996. Aristotle’s Rationalism. In Rationality in Greek Thought (edited by Michael Frede and Gisela Striker), 157–73. Oxford University Press.
  • Gasser-Wingate, Marc. 2016. Aristotle on Induction and First Principles. Philosophers’ Imprint 16(4): 1–20.
  • Gasser-Wingate, Marc. 2019. Aristotle on the Perception of Universals. British Journal for the History of Philosophy 27(3): 446–67.
  • Gasser-Wingate, Marc. 2021. Aristotle’s Empiricism. New York: Oxford University Press.
  • Goldin, Owen. 1996. Explaining an Eclipse: Aristotle’s Posterior Analytics 2.1–10. Ann Arbor: The University of Michigan Press.
  • Greco, John. 2010. Achieving Knowledge. Cambridge: Cambridge University Press.
  • Hasper, Pieter Sjoerd, and Joel Yurdin. 2014. Between Perception and Scientific Knowledge: Aristotle’s Account of Experience. In Oxford Studies in Ancient Philosophy (edited by Brad Inwood), 47:119–50.
  • Hetherington, Stephen (ed.). 2012. Aristotle on Knowledge. In Epistemology: The Key Thinkers, 50–71. London: Continuum.
  • Hintikka, Jaakko. 1967. Time, Truth and Knowledge in Ancient Greek Philosophy. American Philosophical Quarterly 4(1): 1–14.
  • Irwin, Terence. 1988. Aristotle’s First Principles. Oxford: Clarendon Press.
  • Kahn, Charles. 1981. The Role of Nous in the Cognition of First Principles in Posterior Analytics II 19. In Aristotle on Science: The Posterior Analytics. Proceedings of the Eighth Symposium Aristotelicum Held in Padua from September 7 to 15, 1978 (edited by Enrico Berti). Padua: Editrice Antenore.
  • Kelsey, Sean. 2022. Mind and World in Aristotle’s de Anima. Cambridge, UK: Cambridge University Press.
  • Kiefer, Thomas. 2007. Aristotle’s Theory of Knowledge. London: Continuum.
  • Kosman, Aryeh. 2013. Understanding, Explanation, and Insight in Aristotle’s Posterior Analytics. In Virtues of Thought, 7–26. Cambridge: Harvard University Press.
  • Lennox, James G. 2021. Aristotle on Inquiry. Cambridge: Cambridge University Press.
  • Lesher, James H. 2001. On Aristotelian Ἐπιστήμη as ‘Understanding’. Ancient Philosophy 21(1): 45–55.
  • Lorenz, Hendrik. 2014. Understanding, Knowledge and Inquiry in Aristotle. In The Routledge Companion to Ancient Philosophy, 290–303. New York: Routledge.
  • Malink, Marko. 2013. Aristotle on Circular Proof. Phronesis 58(3): 215–48.
  • Marmodoro, Anna. 2014. Aristotle on Perceiving Objects. New York: Oxford University Press.
  • McKirahan, Richard. 1992. Principles and Proofs. Princeton: Princeton University Press.
  • Modrak, Deborah K. W. 1987. Aristotle: The Power of Perception. Chicago: University of Chicago Press.
  • Morison, Benjamin. 2012. Colloquium 2: An Aristotelian Distinction Between Two Types of Knowledge. In Proceedings of the Boston Area Colloquium of Ancient Philosophy (edited by Gary Gurtler and William Wians), 27:29–63.
  • Morison, Benjamin. 2019. Theoretical Nous in the Posterior Analytics. Manuscrito 42(4): 1–43.
  • Pasnau, Robert. 2013. Epistemology Idealized. Mind 122: 987–1021.
  • Reeve, C. D. C. 2013. Aristotle on Practical Wisdom: Nicomachean Ethics VI. Cambridge: Harvard University Press.
  • Reeve, C. D. C. 2014. Aristotle: Nicomachean Ethics. Indianapolis: Hackett.
  • Russell, Daniel C. 2014. Phronesis and the Virtues (NE VI 12–13). In The Cambridge Companion to Aristotle’s Nicomachean Ethics (edited by Ronald Polansky), 203–20. New York: Cambridge University Press.
  • Shields, Christopher. 2016. Aristotle. De Anima. Oxford: Clarendon Press.
  • Smith, Nicholas D. (ed.). 2019. The Philosophy of Knowledge: A History (Vol. I: Knowledge in Ancient Philosophy). London: Bloomsbury Academic.
  • Sorabji, Richard. 1980. Necessity, Cause and Blame. Perspectives on Aristotle’s Theory. Ithaca: Cornell University Press.
  • Sorabji, Richard. 1992. Intentionality and Physiological Processes: Aristotle’s Theory of Sense-Perception. In Essays on Aristotle’s De Anima (edited by Martha C. Nussbaum and Amelie Oksenberg Rorty), 195–225. Clarendon Press.
  • Sosa, Ernest. 2010. Knowing Full Well. Princeton: Princeton University Press.
  • Taylor, C. C. W. 1990. Aristotle’s Epistemology. In Epistemology (edited by Stephen Everson), 116–42. Cambridge: Cambridge University Press.
  • Walker, Matthew D. 2018. Aristotle on the Uses of Contemplation. Cambridge: Cambridge University Press.
  • Ward, Julie K. 2022. Searching for the Divine in Plato and Aristotle: Philosophical Theoria and Traditional Practice. Cambridge: Cambridge University Press.


Author Information

Joshua Mendelsohn
Email: jmendelsohn@luc.edu
Loyola University Chicago
U. S. A.

History of Utilitarianism

The term “utilitarianism” is most commonly used to refer to an ethical theory or a family of related ethical theories. It is taken to be a form of consequentialism, which is the view that the moral status of an action depends on the kinds of consequences the action produces. Stated this way, consequentialism is not committed to any view of what makes certain outcomes desirable. A consequentialist could claim (rather absurdly) that individuals have a moral obligation to cause as much suffering as possible. Similarly, a consequentialist could adopt an ethical egoist position, holding that individuals are morally required to promote their own interests. Utilitarians have their own position on these matters. They claim that it is utility (such as happiness, or well-being) which makes an outcome desirable, and that an outcome with greater utility is morally preferable to one with less. Contrary to the ethical egoist, the utilitarian is committed to everyone’s interests being regarded as equally morally important.

These features are fairly uncontroversial among utilitarians, but other features are the subject of considerable dispute. How “utility” should be understood is contested. The favoured ways of understanding utilitarianism have varied significantly since Jeremy Bentham—seen as the “father of utilitarianism”—produced the first systematic treatise on the view. There have also been proponents of views that resemble utilitarianism throughout history, dating back to the ancient world.

This article begins by examining some of the ancient forerunners to utilitarianism, identifying relevant similarities to the position that eventually became known as utilitarianism. It then explores the development of what has been called “classical utilitarianism”. Despite the name, “classical utilitarianism” emerged in the 18th and 19th centuries, and it is associated with Jeremy Bentham and John Stuart Mill. Once the main features of the view are explained, some common historical objections and responses are considered. Utilitarianism as a social movement, particularly influential in the 19th century, is then discussed, followed by a review of some of the modifications of utilitarianism in the 20th century. The article ends with a reflection on the influence of utilitarianism since then.

Table of Contents

  1. Precursors to Utilitarianism in the Ancient World
    1. Mozi
    2. Epicureanism
  2. The Development of Classical Utilitarianism
    1. Hutcheson
    2. Christian Utilitarianism
    3. French Utilitarianism
  3. Classical Utilitarianism
    1. Origin of the Term
    2. Bentham
    3. Features of Classical Utilitarianism
      1. Consequentialism
      2. Hedonism
      3. Aggregation
      4. Optimific (‘Maximising’)
      5. Impartiality
      6. Inclusivity
    4. Early Objections and Mill’s Utilitarianism
      1. Dickens’ Gradgrindian Criticism
      2. The ‘Swine’ Objection and ‘Higher Pleasures’
      3. Demandingness
      4. Decision Procedure
  4. The Utilitarian Movement
  5. Utilitarianism in the 20th Century
    1. Hedonism and Welfarism
    2. Anscombe and ‘Consequentialism’
    3. Act versus Rule
    4. Satisficing and Scalar Views
  6. Utilitarianism in the Early 21st Century
  7. References and Further Reading

1. Precursors to Utilitarianism in the Ancient World

While utilitarianism became a refined philosophical theory (and the term “utilitarianism” was first used) in the 18th century, positions which bear strong similarities to utilitarianism have been deployed throughout history. For example, similarities are sometimes drawn between utilitarianism and the teachings of Aristotle, the Buddha and Jesus Christ. In this section, two views from the ancient world are considered. The first is that of Mozi, who is sometimes described as the first utilitarian (though this is disputed). The second is that of Epicurus, whose hedonism influenced the development of utilitarianism.

a. Mozi

Mozi (c. 400s-300s B.C.E.)—also known as Mo-Tzu, Mo Di and Mo Ti—led the Mohist school in Chinese philosophy, which, alongside the Confucian school, was one of the two major schools of thought during the Warring States period (403-221 B.C.E.). In this article, some salient similarities between his ethical outlook and utilitarianism will be observed. For a more detailed discussion of Mozi’s philosophy, including how appropriate it is to view him as a utilitarian, see the article devoted to his writings.

Utilitarians are explicit about the importance of impartiality, namely that the well-being of any one individual is no more important than the well-being of anyone else. This is also found in Mozi’s writings. The term jian’ai is often translated as “universal love”, but it is better understood as impartial care or concern. This notion is regarded as the cornerstone of Mohism. The Mohists saw excessive partiality as the central obstacle to good behaviour. The thief steals because they do not sufficiently care for the person they steal from, and rulers instigate wars because they care more for their own good than for the people whose countries they invade. Thus, Mozi implored his followers to “replace partiality with impartiality”.

His emphasis on the importance of impartiality bears striking similarities to arguments later made by Bentham and Sidgwick. Mozi’s impartiality is like the utilitarian’s in that it implies inclusivity and equality. Every person’s interests are morally important, and they are equally important.

A second clear similarity between Mohists and utilitarians is the focus on consequences when considering the justifications for actions or practices. Unlike the Confucians, who saw rituals and customs as having moral significance in themselves, Mozi held that a custom should be rejected unless it serves some useful purpose. For example, it was customary at the time to spend large quantities of resources on funeral rites, but Mozi criticised this practice because it conferred no practical benefit. This scrutiny of the status quo, and willingness to reform practices deemed unbeneficial, is something found repeatedly among utilitarians in the 18th century and beyond (see section 4).

A particularly interesting suggestion made by Mozi is that belief in ghosts and spirits should be encouraged. He claimed that historically, a belief in ghosts who would punish dishonesty or corrupt behaviour had motivated people to act well. Observing scepticism about ghosts in his own time, Mozi thought people consequently felt free to act poorly without fear of punishment: “If the ability of ghosts and spirits to reward the worthy and punish the wicked could be firmly established as fact, it would surely bring order to the state and great benefit to the people” (The Mozi, chapter 31).

Mozi approves of belief in the existence of ghosts, whether or not they actually exist, because of the useful consequences of this belief. This suggestion that utility may count in favour of believing falsehoods is reminiscent of a claim by Henry Sidgwick (1838-1900). Sidgwick was a utilitarian, but he acknowledged that the general public might be happier if they did not believe utilitarianism was true. If that were the case, Sidgwick suggested that the truth of utilitarianism should be kept secret, and some other moral system that makes people happier be taught to society generally. This controversial implication—that it might be morally appropriate to mislead the general public when it is useful—is radical, but it is a reasonable inference from this type of moral view, which Mozi embraced.

A significant difference between Mozi and the utilitarians of the 18th century is the theory of the good he endorsed. Mozi sought to promote a range of goods, specifically order, wealth and a large population. Classical utilitarians, however, regarded happiness or pleasure as the only good. This view was presented shortly after Mozi, in Ancient Greece.

b. Epicureanism

The Epicureans, led by Epicurus (341-271 B.C.E.), were (alongside the Stoics and the Skeptics) one of the three major Hellenistic schools of philosophy. The Epicureans were hedonistic, which means that they saw pleasure as the only thing that was valuable in itself, and pain (or suffering) as the only ultimately bad thing.

This commitment is shared by later utilitarians, and it can be seen in slogans like “the greatest happiness of the greatest number”, which was later used by Francis Hutcheson and popularised by Bentham (though he later disliked it as too imprecise).

Though the Epicureans saw pleasure as the only good, the way they understood pleasure was somewhat different to the way one might imagine pleasure today. They realised that the most intense pleasures, perhaps through eating large amounts of tasty food or having sex, are short-lived. Eating too much will lead to pain further down the line, and appetites for sex dwindle. Even if appetites do not fade, becoming accustomed to intense pleasures may lead to sadness (a mental pain) further down the line if one’s desires cannot be satisfied. Thus, Epicurus endorsed finding pleasure in simple activities that could be reliably maintained for long periods of time. Rather than elaborate feasts and orgies, Epicurus recommended seeking joy in discussion with friends, developing tastes that could easily be satisfied and becoming self-sufficient.

A particular difference between the Epicurean view of pleasure and the view of later hedonists is that Epicurus regards a state of painlessness—being without any physical pains or mental disturbances—as one of pleasure. In particular, Epicurus thought we should aim towards a state of ataraxia, a state of tranquillity or serenity. For this reason, the Epicurean view is similar to a version of utilitarianism sometimes known as negative utilitarianism, which claims that morality requires agents to minimise suffering, as opposed to the emphasis typical utilitarians place on promoting happiness.

Epicurus also differed from utilitarians in terms of the scope of his teachings. His guidance was fairly insular, amounting to something like egoistic hedonism—one that encouraged everyone to promote their own personal pleasure. Epicurus encouraged his followers to find comfort with friends, and make their families and communities happy. This is a stark difference from the attitude of radical reform exhibited by Jeremy Bentham and his followers, who intended to increase the levels of happiness all over the world, rather than merely in the secluded garden that they happened to inhabit.

Epicurean teaching continued long after Epicurus’ death, with Epicurean communities flourishing throughout Greece. However, with the rise of Christianity, the influence of Epicureanism waned. There are several reasons that may explain this. The metaphysical picture of the world painted by Epicureans was one lacking in divine providence, which was seen as impious. Furthermore, the Epicurean attitude towards pleasure was often distorted, and portrayed as degrading and animalistic. This criticism, albeit unfair, would go on to be a typical criticism of utilitarianism (see 3.d.ii). Due to these perceptions, Epicureanism was neglected in the Middle Ages.

By the 15th century, this trend had begun to reverse. The Italian Renaissance philosopher Lorenzo Valla (1407-1457) was influenced by Epicurus and the ancient Epicurean Lucretius (99-55 B.C.E.). Valla defended Epicurean ideas, particularly in his work, On Pleasure, and attempted to reconcile them with Christianity. Thomas More (1478-1535) continued the rehabilitation of hedonism. In Utopia (1516), More describes an idyllic society, where individuals are guided by the quest for pleasure. The Utopian citizens prioritised spiritual pleasures over animalistic ones, which may have made this view more amenable to More’s contemporaries. Later still, the French philosopher Pierre Gassendi (1592-1655) embraced significant portions of Epicurean thinking, including the commitment to ataraxia (tranquillity) as the highest pleasure. The Renaissance revival of Epicureanism paved the way for the development of utilitarianism.

2. The Development of Classical Utilitarianism

In the 17th and early 18th centuries, philosophical positions that are recognisably utilitarian gained prominence. None of the following thinkers labelled themselves “utilitarians” (the word had not yet been introduced), and whether some should properly be described in this way is a matter of some dispute, but each held views with significant utilitarian features, and each has an important place in the intellectual history.

a. Hutcheson

Francis Hutcheson (1694-1746) was a Scots-Irish philosopher sometimes seen as the first true utilitarian. Geoffrey Scarre (1996) suggests that Hutcheson deserves the title of “father of British utilitarianism” (though Bentham is more typically described in this kind of way). As with many attributions of this sort, this is heavily contested. Colin Heydt, for instance, suggests Hutcheson should not be classified as a utilitarian. Regardless, his contribution to the development of utilitarian thought is undisputed.

Hutcheson was a moral sense theorist. This means he thought that human beings have a special faculty for detecting the moral features of the world. The moral sense gives a person a feeling of pleasure when they observe pleasure in others. Further, the sense approves of actions which are benevolent. Benevolent actions are those that aim towards the general good.

One particular passage that had significant influence on utilitarians can be found in Hutcheson’s Inquiry Concerning the Original of Our Ideas of Virtue or Moral Good (1725):

In the same manner, the moral evil, or vice, is as the degree of misery, and number of sufferers; so that, that action is best, which procures the greatest happiness for the greatest numbers; and that, worst, which, in like manner, occasions, misery.

The phrase, “greatest happiness for the greatest number(s)” became one of the major slogans of utilitarianism. This seems to be the first appearance of the phrase in English (though it was used decades previously by Leibniz). Because of this position, it is easy to see how Hutcheson can be interpreted as a utilitarian.

One important distinction between Hutcheson and utilitarians, however, is that he views the motives of individuals as what is valuable, rather than the state of affairs the action brings about. Whereas utilitarians view happiness itself as good, Hutcheson thinks it is the motives identified by our moral sense (which aim at happiness), which are good.

Hutcheson anticipates something similar to Mill’s higher/lower pleasures distinction (see 3.d.ii). In his posthumously published A System of Moral Philosophy, he says there are “a great variety of pleasures of different and sometimes inconsistent kinds, some of them also higher and more durable than others” (1755). Hutcheson associates dignity and virtuous action with the higher pleasures, and claims that “the exercise of virtue, for some short period, provided it is not succeeded by something vicious, is of incomparably greater value than the most lasting sensual pleasures”. These “higher” pleasures include social and intellectual activities, and are held to trump “lower” pleasures, like food and sex. Hutcheson is aware, however, that pleasures are “generally blended”. Lower pleasures may be accompanied by socialising, moral qualities, or friendship.

This appreciation for the variety and combinations of pleasure adds a rich texture to Hutcheson’s account. However, these intricacies may indicate a further difference between his view and utilitarianism. For the utilitarian, for a certain type of activity to be more valuable than another, this must be explained in terms of pleasure. Hutcheson, however, seems to determine which pleasures are higher and lower based on prior views he harbours about which activities are noble. He supposes that people who possess “diviner faculties and fuller knowledge” will be able to judge which pleasures are better, and thus which it is better to engage in and promote in others.

Hutcheson is further distinct from utilitarians in that it is unclear whether he is actually trying to provide a theory of right action. He notes that our moral sense can discern which actions are best and worst, but he does not explicitly link this to an account of what it is our duty to do, or what it would be wrong for us not to do. This could be viewed simply as something Hutcheson omitted, but alternatively could be interpreted as a version of scalar utilitarianism (see section 5.d).

b. Christian Utilitarianism

Utilitarianism today is usually seen as a secular doctrine. From Bentham onwards, utilitarians typically attempted to describe their worldview without referring to any theistic commitments. In the 18th century, however, there was a distinct branch of early utilitarians who gave theistic justifications for their position. Participants in this strand are sometimes referred to as “Anglican utilitarians”. Richard Cumberland (1631-1718) was an early example of this, and was later followed by John Gay (1699-1745), Soame Jenyns (1704-1787), Joseph Priestley (1733-1804), and William Paley (1743-1805). Paley’s Principles of Moral and Political Philosophy (1785) was the first work to bring utilitarianism to a wider audience, and it remained the most discussed example of utilitarianism well into the 19th century.

Cumberland was a natural law theorist, which is to say he held that moral truths are determined by or can be derived from features of the world, including the nature of human beings. In Cumberland’s view, because human beings find pleasure good and pain bad, they can discern that God wills that they promote pleasure and diminish pain. In A Treatise of the Laws of Nature (1672), he writes:

Having duly pondered on these matters to the best of our ability, our minds will be able to bring forth certain general precepts for deciding what sort of human actions may best promote the common good of all beings, and especially of rational beings, in which the proper happiness of each is contained. In such precepts, provided they be true and necessary, is the law of nature contained.

So, armed only with empirical facts about the world, like experiences of pleasure and pain, and our possessing the faculty of reason, Cumberland claimed that it was possible to ascertain that human beings have a God-given duty to promote the general happiness.

While secular versions of utilitarianism came to dominate the tradition, this type of argument for utilitarianism actually has some distinct advantages. Notably, it can provide simple answers to the question “Why be moral?”. Everyone may value their own happiness, so this provides everyone with a reason to act in ways that increase their own happiness. However, there are instances where promoting one’s own personal happiness seems to conflict with the common good. John Gay issued a challenge for secular versions of utilitarianism to explain why an agent in such a position has reason to sacrifice their own happiness to help others: “But how can the Good of Mankind be any Obligation to me, when perhaps in particular Cases, such as laying down my Life, or the like, it is contrary to my Happiness?” (Concerning the Fundamental Principle of Virtue or Morality, 1731).

For the Anglican utilitarian, this question is resolved easily. While it might appear that an individual’s happiness is best promoted by a selfish act contrary to the public good, this is only because rewards of the afterlife have not been taken into account. When someone recognises the infinite rewards for complying with God’s will (or infinite punishments for defying it), they will realise that acting in the interests of the common good (promoting the general happiness) is actually in their best interests. This kind of solution to the problem of moral motivation is not available for secular utilitarians.

Although theistically grounded versions of utilitarianism may stand on firmer ground when it comes to the problem of moral motivation, there are costs too. There are challenges to the existence of an all-powerful creator (see arguments for atheism). Even if those are avoided, the natural law reasoning championed by the Anglican utilitarians might not be persuasive. The inference from what kinds of things people enjoy to a specific divine purpose of human beings (for example, Priestley claims that we can discover that God “made us to be happy”) is one that might be scrutinised. Furthermore, the theistic utilitarian faces a version of the Euthyphro problem: is happiness good because God desires it, or does God desire happiness because it is good?

The Anglican utilitarians foresaw some of the problems that would become serious areas of discussion for later utilitarians. In Priestley, for instance, one can find a discussion of what would later be known as the “demandingness objection” (discussed in section 3.d.iii).

William Paley’s utilitarianism is of historical interest because he discussed several features of the view that have concerned utilitarians and their critics since. For example, he raised the question of whether certain types of action usually deemed to be evil, such as bribery or deceit, might be regarded as morally good if they lead to good consequences:

It may be useful to get possession of a place…or of a seat in parliament, by bribery or false swearing: as by means of them we may serve the public more effectually than in our private station. What then shall we say? Must we admit these actions to be right, which would be to justify assassination, plunder and perjury; or must we give up our principle, that the criterion of right is utility? (The Principles of Moral and Political Philosophy, 1785: 854).

In his answer to this question, Paley suggests a form of what would later be known as rule-utilitarianism (discussed further in section 5.c). He suggests that two types of consequences of an action can be distinguished—the general consequences and the particular consequences. The particular consequence is what follows from a specific action, that is, bribing someone on a given occasion. The general consequence is what follows from acting on that rule, and it is the general consequence Paley views as more important. Paley suggests that, in considering whether bribery to gain a political position is right, one should think about the consequences if everyone accepted a rule where bribery was allowed. Once this is taken into account, Paley argues, it will become apparent that bribery is not useful.

Like Epicurus, Paley is somewhat dismissive of animalistic pleasures, but his explanation for this differs. He makes a distinction between pleasures, which are fleeting, and happiness, which he seems to regard as possessed over longer periods of time:

Happiness does not consist in the pleasures of sense, in whatever profusion or variety they be enjoyed. By the pleasures of sense, I mean, as well the animal gratifications of eating, drinking, and that by which the species is continued, as the more refined pleasures of music, painting, architecture, gardening, splendid shows, theatric exhibitions; and the pleasures, lastly, of active sports, as of hunting, shooting, fishing, etc. (Principles of Moral and Political Philosophy, 35)

He claims these bodily pleasures do not contribute to happiness because they are too fleeting and “by repetition, lose their relish”. Rather, Paley sees happiness as consisting in social activities, the exercise of our faculties, and good health. Paley might then be seen as suggesting that happiness is something one does, rather than something one experiences. He also emphasises the importance of “prudent constitution of the habits” (which bears similarities to Aristotelian ethics). This distinguishes Paley somewhat from the classical utilitarians, who regarded pleasure as a mental state, and happiness as consisting in pleasure and the absence of pain.

William Paley is also somewhat distinctive due to his conservative values. Unlike Bentham and his followers, who were radical reformers, Paley found the status quo satisfactory. One explanation for this is that he thought happiness was relatively evenly distributed across society. He did not think, for instance, that the wealthy were significantly happier than the poor. This followed from his view of happiness: he thought the wealthy and the poor had fairly equal access to social activities, the exercise of their faculties, and good health.

In his discussions of what acts should be regarded as criminal and what the punishments should be, he does appeal to utility, but also regularly to scripture. As a consequence, Paley’s position on many social issues is one that would now be considered extremely regressive. For example, he favoured financial penalties for women guilty of adultery (but did not suggest the same for men) and argued that we should not pursue leisure activities (like playing cards or frequenting taverns) on the Sabbath. Like many of the later utilitarians, Paley did argue that slavery should be abolished, criticising it as an “odious institution”, but he was in favour of a “gradual” emancipation.

The Anglican utilitarians were extremely influential. Bentham was familiar with their work, citing Joseph Priestley in particular as a major inspiration. Many of the discussions that later became strongly associated with utilitarianism originated here (or were at least brought to a wider audience). An obvious difference between many of the Anglican utilitarians and the later (Benthamite) utilitarians is the conservatism of the former. (One notable exception is perhaps found in Priestley, who celebrated the French Revolution. This reaction was met with such animosity—his chapel was destroyed in a riot—that he emigrated to America.) The Anglican utilitarians were committed to the traditional role of the church and did not endorse anything like the kind of radical reform championed by Bentham and his followers.

c. French Utilitarianism

The development of utilitarianism is strongly associated with Britain. John Plamenatz described the doctrine as “essentially English”. However, a distinctly utilitarian movement also took place in 18th-century France. Of the French utilitarians, Claude Helvétius (1715-1771) and François-Jean de Chastellux (1734-1788) are of particular interest.

While the dominant form of utilitarianism in Britain in the 18th century was the Anglican utilitarianism of John Gay (see 2.b), the French utilitarians argued from no divine commitments. Helvétius’ De l’Esprit (1758) was ordered to be burned due to its apparently sacrilegious content. The secular character of French utilitarianism is historically noteworthy. As mentioned above (section 2.b), one advantage of theistically-grounded utilitarianism is that it solves the problem of moral motivation: one should promote the well-being of others because God desires it, and, even if one is fundamentally self-interested, it is in one’s interests to please God (because one’s happiness in the afterlife depends on God’s will). Without the appeal to God, giving an account of why anyone should promote the general happiness, rather than their own, becomes a serious challenge.

Helvétius poses an answer to this challenge. He accepts that the general good is what we should promote, but also, influenced by the Hobbesian or Mandevillian view of human nature, holds that people are generally self-interested. So, people should promote the general good, but human nature will mean that they will promote their individual goods. Helvétius takes this to show that we need to design our laws and policies so that private interest aligns with the general good. If everyone’s actions will be directed towards their own good, as a matter of human nature, “it is only by incorporating personal and general interest, that they can be rendered virtuous.” For this reason, he claims that morality is a frivolous science, “unless blended with policy and legislation”. Colin Heydt identifies this as the key insight that Bentham takes from Helvétius.

Taking this commitment seriously, Helvétius considered what it took to make a human life happy, and what circumstances would be most likely to bring this about. He approached this with a scientific attitude, suggesting “that ethics ought to be treated like all the other sciences. Its methods are those of experimental physics”. But this raises the question of how policy and legislation should be designed to make people happy.

Helvétius thought that to be happy, people needed to have their fundamental needs met. In addition to this, they needed to be occupied. Wealthy people may often find themselves bored, but the “man who is occupied is the happy man”. So, the legislator should seek to ensure that citizens’ fundamental needs are met, but also that they are not idle, because he viewed labour as an important component in the happy life. Helvétius treats the suggestion that labour is a negative feature of life with scorn, claiming:

“To regard the necessity of labour as the consequence of an original sin, and a punishment from God, is an absurdity. This necessity is, on the contrary, a favour from heaven” (A Treatise on Man: His Intellectual Faculties and Education, volume 2).

Furthermore, certain desires and dispositions are conducive to an individual’s happiness, so the legislator should encourage citizens to develop psychologically in a certain way. For instance, people should be persuaded that they do not need excessive wealth to be happy, and that, in fact, luxury does not enhance the happiness of the rich. Because of this, he proposed institutional restrictions on what powers, privileges, and property people could legally acquire. In addition, Helvétius suggested that education should serve to shape citizens’ beliefs about what they should even want; that is, people could be taught (or indoctrinated) not to want anything that would not be conducive to the public good.

As poverty does negatively affect the happiness of the poor, Helvétius defended limited redistribution of wealth. One specific suggestion he offered was to force families that have shrunk in size to relinquish some of their land to families that have grown. Exactly how best to move from a state of misery (which he thought most people were in) to a state of happiness would vary from society to society, so particular suggestions may have limited application. Helvétius nonetheless urged that this transformation should take place and that it might involve changing how people think.

In Chastellux’s work, the view that governments should act primarily to promote public happiness is explicit. In his De la Félicité publique (1774), he says:

It is an indisputable point, (or at least, there is room to think it, in this philosophical age, an acknowledged truth) that the first object of all governments, should be to render the people happy.

Accepting this, Chastellux asked how this should be done. What is most noteworthy in Chastellux is that he pursued a historical methodology, examining what methods of governments had been most successful in creating a happy populace, so that the more successful efforts might be emulated and developed. From his observations, Chastellux claimed that no society so far had discovered the best way to ensure the happiness of its citizens, but he does not find this disheartening. He notes that even if all governments had aimed at the happiness of their citizens, it would “be no matter of astonishment” that they had so far failed, because human civilisation is still in its infancy. He harbours optimism that the technological developments of the future could help improve the quality of life of the poorest in society.

While the historical methodology found in Chastellux may be questionable (Geoffrey Scarre describes it as “fanciful and impressionistic”), it showed a willingness to utilise empirical measures in determining what is most likely to promote the general happiness.

Of the French utilitarians, Helvétius had the greatest influence on later developments in Britain; he was regularly acknowledged by Jeremy Bentham, William Godwin, and John Stuart Mill. The conviction that good legislation and policies can be created remains the crucial motivation of utilitarians in the political realm. In Helvétius, we can also see the optimism of the radical reformer utilitarians, in his hope that “wise laws would be able without doubt to bring about the miracle of a universal happiness”.

3. Classical Utilitarianism

While many thinkers were promoting recognisably utilitarian ideas long before him, it is Jeremy Bentham who is credited with providing the first systematic account of utilitarianism in his Introduction to the Principles of Morals and Legislation (1789).

a. Origin of the Term

The word “utilitarianism” is not used in Jeremy Bentham’s Introduction to the Principles of Morals and Legislation (IPML). There he introduces the ‘principle of utility’, that “principle which approves or disapproves of every action whatsoever, according to the tendency it appears to have to augment or diminish the happiness of the party whose interest is in question; or, what is the same thing in other words, to promote or to oppose that happiness”. Bentham borrows the term “utility” from David Hume’s Treatise of Human Nature (1739-1740). There, Hume argues that for any character trait viewed as a virtue, this can be explained by the propensity of that trait to cause happiness (‘utility’). Bentham later reported that upon reading this, he “felt as if scales had fallen from my eyes”.

The first recorded use of the word “utilitarianism” comes in a letter Bentham wrote in 1781. The term did not catch on immediately. In 1802, in another letter, Bentham was still resisting the label “Benthamite” and encouraging the use of “utilitarian” instead. While Bentham seems to have originated the term, this does not seem to have been common knowledge. John Stuart Mill, in Utilitarianism (1861), notes that he found the term in an 1821 John Galt novel. He was using it as early as 1822, when he formed the ‘Utilitarian Society’, a group of young men who met every two weeks for three and a half years. After this, the term entered common parlance.

b. Bentham

As well as providing what became the common name of the view, Jeremy Bentham (1748-1832) is credited with making utilitarianism a systematic ethical view. His utilitarian inclinations were sparked when he read Joseph Priestley’s Essay on Government (1768), and he claims that the “greatest happiness of the greatest number” is the measure of right and wrong in his Fragment on Government (1776). It is in IPML, however, that the ideas are presented most clearly and explicitly.

In IPML, Bentham defines utility as “that property in any object, whereby it tends to produce benefit, advantage, pleasure, good, or happiness”. In the opening of IPML, Bentham makes clear his view that utility (pleasure and pain) determines the rightness or wrongness of an action. He states:

Nature has placed mankind under the governance of two sovereign masters, pain and pleasure. It is for them alone to point out what we ought to do, as well as determine what we shall do. On the one hand the standard of right and wrong, on the other the chain of causes and effects, are fastened to their throne. They govern us in all we do, in all we say, in all we think: every effort we can make to throw off our subjection, will serve but to demonstrate and confirm it.

As well as emphasising hedonism as the standard of rightness (normative hedonism), Bentham seems here committed to a certain view about our motivation. He not only claims that the rightness or wrongness of an action is determined by pain/pleasure, but also that these notions determine what we will do. Specifically, following Hobbes, Bentham thought that everyone is, as a matter of fact, always motivated by their own happiness, a form of psychological egoism. If we accept the ought-implies-can principle, the idea that we can only be required to act in ways that it is actually possible for us to act, this is a difficult position to reconcile with the claim that we ought to promote the general happiness. If human beings are necessarily always motivated by their own self-interest, imploring them to promote the interests of others seems futile.

Bentham was aware of this sort of objection. One type of response he gives is to claim that we should ensure, where possible, that society is structured so that when individuals act in their own interests, this is conducive to the general happiness. This answer is reminiscent of the strategy deployed by Helvétius (section 2.c). When the incentive and punitive structures in society are arranged in this way, self-interested actions benefit the wider community. Second, he suggests that individuals do benefit from living in a community where the general good is promoted. This amounts to a denial that any self-interested action actually clashes with the general good. Many find this implausible, since any action that served the general good at a cost to the agent would be a counterexample. The move is rendered unnecessary if psychological egoism is abandoned, and given some of the arguments against that view, Bentham’s utilitarianism may be better off without the psychological claim.

One of the ideas Bentham is known for is the “hedonic calculus” or “felicific calculus” (though Bentham never himself used either of these terms). The crux of this is the thought that to determine the value of an action, one can use a kind of moral ledger. On one side of the ledger, the expected good effects of the action and how good they are can be added up. On the other side, the bad effects of the action can be added. The total value of the negative effects can then be subtracted from the value of the positive effects, giving the total value of the action (or policy). This idea was first introduced by Pierre Bayle (1647-1706), though Bentham adds considerable depth to the idea.

In considering how to value a quantity of pleasure (or pain), Bentham observed that we can evaluate it with regard to seven dimensions or elements. These are the pleasure’s:

(1) intensity (how strong the pleasure is)

(2) duration (how long the pleasure lasts)

(3) certainty/uncertainty (the probability it will occur)

(4) propinquity or remoteness (how soon the pleasure will occur)

(5) fecundity (how likely it is to be followed by further pleasures)

(6) purity (how likely it is to be followed or accompanied by pains)

(7) extent (the number of persons it extends to)

Bentham included a poem in the second edition of IPML, so that people could remember these dimensions:

Intense, long, certain, speedy, fruitful, pure –
Such marks in pleasures and in pains endure.
Such pleasures seek if private be thy end:
If it be public, wide let them extend
Such pains avoid, whichever be thy view:
If pains must come, let them extend to few.

On Bentham’s view, these are all the features we must know of a certain pleasure. Importantly, even a frivolous game, if it turns out to have the same intensity, duration, and so forth, is just as good as intellectual pursuits. He says this explicitly about the game push-pin (a children’s game where players try to hit each other’s pins on a table): “Prejudice apart, the game of push-pin is of equal value with the arts and sciences of music and poetry”. Notably, this view set him apart from those who claimed a difference in kind between types of pleasures, like John Stuart Mill (see section 3.d.ii).

While Bentham does suggest that this kind of happiness arithmetic would be successful in determining what actions are best, he does not suggest that we consider every factor of every possible action in advance of every given action. This would obviously be excessively time consuming, and could result in a failure to act, which would often be bad in terms of utility. Rather, we should use our experience as a guide to what will likely promote utility best.
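
To make the ledger idea concrete, here is a toy sketch in code. It is an invented simplification, not Bentham’s own procedure, and every number in it is hypothetical: intensity is multiplied by duration, discounted by certainty (probability) and scaled by extent (the number of persons affected), with pains entered as negative intensities. Fecundity and purity could be modelled as further expected effects appended to the same list.

```python
# Illustrative sketch of a Benthamite "moral ledger" (hypothetical weighting,
# not Bentham's own formula). Each expected effect of an action is scored by
# intensity * duration, weighted by its probability ("certainty") and by the
# number of people affected ("extent"); pains carry negative intensity.

def effect_value(intensity, duration, certainty, extent=1):
    """Signed value of one expected pleasure (+) or pain (-)."""
    return intensity * duration * certainty * extent

def action_value(effects):
    """Total value of an action: summed good effects minus bad effects."""
    return sum(effect_value(**e) for e in effects)

# Paley's bribery case (section 2.b), with invented numbers: a certain
# private gain versus the probable general harm of a rule permitting bribery.
bribe = [
    {"intensity": 5, "duration": 2, "certainty": 1.0},                 # particular gain to the briber
    {"intensity": -3, "duration": 4, "certainty": 0.5, "extent": 10},  # general harm if the rule spreads
]
print(action_value(bribe))  # -50.0
```

On these invented figures the sure private gain of 10 is outweighed by the probable general harm of 60, so the ledger totals -50 and the action is judged wrong, mirroring Paley’s verdict that the general consequences dominate.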

Though the phrase “greatest happiness of the greatest number” has become strongly associated with utilitarianism and is used by Bentham in earlier works, he later distanced himself from it, because in it “lurks a source of misconception”. One interpretation of the expression suggests we should ascertain the largest number of people benefited by an action (the greatest number) and benefit those as much as possible, no matter what the effects are on the remainder. For instance, we could imagine a policy that enslaved 1% of the population for the benefit of the 99%, greatly benefiting that majority but making the enslaved miserable. A policy like this, which ignores entirely the well-being of some, is certainly not what Bentham intended. He later speaks simply of the “greatest happiness principle”, the requirement to promote the greatest happiness across the whole community.

Bentham was an active reformer. He argued for radical political changes, including the right to vote for women, significant prison reforms, the abolition of slavery, the elimination of capital punishment, and sexual freedom. Each of these was argued for on grounds of utility. Bentham gained a number of intellectual followers. One of the most notable of these was James Mill (1783-1836), a major figure in 19th-century philosophy and economics. His reputation was international, attracting the attention of Karl Marx (1818-1883), and he is still seen as one of the most important figures in utilitarianism, though today he is overshadowed by his son, John Stuart. John Stuart Mill (1806-1873) met Bentham when he was two years old and, under the influences of Bentham and his father, became one of utilitarianism’s fiercest champions. John Stuart Mill’s defence of utilitarianism is still the most widely read today (discussed in more depth in 3.d).

c. Features of Classical Utilitarianism

It is a matter of some dispute what features make a moral theory appropriate for the name utilitarianism. The core features mentioned here are those commonly associated with classical utilitarianism. It is not clear how many of those associated with utilitarianism, even in 19th century Britain, actually accepted classical utilitarianism, that is, who thought the correct moral theory possessed these six features. For instance, though John Stuart Mill is regarded as the man who did most to popularise the view, he rejected elements of this picture, as he explicitly rejected the requirement to maximise utility (see Jacobson 2008 for a discussion of how Mill deviates from this orthodox picture). Regardless of how many actually held it, the view consisting of these claims has become the archetype of utilitarianism. The more a moral view departs from these, the less likely it is to be deemed a version of utilitarianism.

i. Consequentialism

Views are classed as consequentialist if they place particular emphasis on the role of the outcomes of actions, rather than on features intrinsic to the actions themselves (for example, whether they involve killing, deception, kindness, or sympathy), as forms of deontology do, or on what the actions might reveal about the character of the agent performing them (as does virtue ethics).

Classical utilitarianism is uncontroversially consequentialist. Later variations, such as rule-utilitarianism (see section 5.c), which regard consequences as having an important role, are less easily categorised. Versions of utilitarianism that do not assess actions solely in terms of the utility they produce are sometimes referred to as indirect forms of utilitarianism.

ii. Hedonism

Following the Epicureans, classical utilitarianism regards pleasure as the only thing that is valuable in itself. Pleasure is the “utility” in classical utilitarianism. On this view, actions are morally better if they result in more pleasure, and worse if they result in less.

Hedonists differ on how they understand pleasure. The Epicureans, for instance, regarded a state of tranquillity (ataraxia) as a form of pleasure, and one that should be pursued because it is sustainable. Classical utilitarians typically regard pleasure as a mental state which the individual experiences as positive. Bentham evaluated pleasures across his seven elements, but importantly thought no pleasure was superior in kind to any other. For example, the pleasure from eating fast food is no less valuable than the pleasure one may attain from reading a great novel, though they may differ in terms of sustainability (one might become ill fairly quickly from eating fast food) or propinquity (pleasure from fast food comes quickly, whereas it may take some time to come to appreciate complex prose). This parity of pleasures was something John Stuart Mill disagreed with, leading to a notable difference in their views (see 3.d.ii).

Many contemporary utilitarians, recognising issues with hedonism, have instead adopted welfarism, the weaker claim that the only thing that is intrinsically valuable is well-being, that is, whatever it is that makes a life go well. Well-being could be given a hedonistic analysis, as in classical utilitarianism, but alternatively a preference-satisfaction view (which states that one’s well-being consists in having one’s preferences satisfied) or an objective-list view (which states that lives go well or badly depending on how well they satisfy a set list of criteria) could be adopted.

iii. Aggregation

The utilitarian thinks that everyone’s individual pleasure is good, but they also think it makes sense to evaluate how good an outcome is by adding together all the respective quantities of pleasure (and pain) of the individuals affected. Imagine that we can assign a numerical value to how happy every person is (say 10 is as happy as you could be, zero is neither happy nor unhappy, and -10 is as unhappy as you could be). The aggregative claim holds that we can simply add these quantities together for each action to see which is the best.
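
As a toy illustration of the aggregative claim (all the scores below are invented), outcomes can be compared simply by summing the individual happiness scores on the -10 to 10 scale just described:

```python
# Aggregation: an outcome's value is the plain sum of individual happiness
# scores on a -10..10 scale; who holds which score drops out of the comparison.

def outcome_value(scores):
    return sum(scores)

equal_outcome = [3, 3, 3, 3]      # everyone moderately happy
unequal_outcome = [-10, 8, 8, 8]  # one person miserable, three very happy

print(outcome_value(equal_outcome))    # 12
print(outcome_value(unequal_outcome))  # 14: aggregation ranks this outcome higher
```

That the second outcome scores higher despite one person’s misery is precisely the feature that critics of aggregation object to.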

One of the criticisms sometimes made of utilitarianism is that it ignores the separateness of persons. When we choose actions based on aggregated sums of happiness, we no longer think about individuals as individuals. Instead, they are treated more like happiness containers. A related complaint is that determining the best outcome by adding together the happiness scores of every individual can obscure extremes that might be morally relevant. This has implications that many find counterintuitive, such as that this method may judge an outcome where one person undergoes horrific torture to be a good outcome, so long as enough other people are happy.

iv. Optimific (‘Maximising’)

Hedonists believe pleasure is the only good. Aggregation commits utilitarians to the idea that the pleasures and pains of different people can be added to compare the value of outcomes. One could accept these claims without thinking that a moral agent must always do the best. Classical utilitarianism does hold that one is required to perform the best action. In other words, classical utilitarianism is a maximising doctrine (“maximising” is another word introduced into English by Jeremy Bentham).

Maximising views are controversial. One reason for this is that they eliminate the possibility of supererogatory actions, that is, actions that are beyond the call of duty. For example, we might think donating most of your income to charity would be a wonderful and admirable thing to do, but not something that is usually required. The maximiser claims that you must do the best action, and this is the case even if doing so is really difficult, or really costly, for the person acting.

Some of the most persistent criticisms of utilitarianism concern how much it demands. In response, some of the 20th-century revisions of the view sought to abandon this element, for example, satisficing versions and scalar views (5.d).

v. Impartiality

Utilitarians embrace a form of egalitarianism. No individual’s well-being is more important than any other’s. Because of this, utilitarians believe that it is just as important to help distant strangers as it is to help people nearby, including one’s friends or family. As Mill puts it, utilitarianism requires an agent “to be as strictly impartial as a disinterested and benevolent spectator”.

In fact, sometimes impartiality may require a person to help a stranger instead of a loved one. William Godwin (1756-1836) highlighted this in a famous example. He described a scenario where a fire broke out, and a bystander was able to save either Archbishop Fénelon (a famous thinker and author of the time) or a chambermaid. Godwin argued that because of Fénelon’s contributions to humanity, a bystander would be morally required to save him. Moreover, Godwin claimed, one would be required to save Fénelon even if the chambermaid was one’s mother.

This requirement for strict impartiality strikes many as uncomfortable, or even alienating. When challenged, Godwin defended his position, but insisted that scenarios where this kind of sacrifice is required would be rare. In most instances, he thought, people do happen to be more able to bring happiness to themselves or their loved ones, because of greater knowledge or increased proximity. In this way, some partial treatment, like paying more attention to one’s friends or family, can be defended impartially.

vi. Inclusivity

The classical utilitarian accepts the hedonist commitment that happiness is what is valuable. It is a separate question whose happiness should count. Utilitarians answer this with the most inclusive answer possible—everyone’s. Any subject that is capable of pleasure or pain should be taken into consideration.

This has some radical implications. As well as human beings, many animals can also experience pleasure or pain. On this topic, one passage from Bentham is regularly deployed by defenders of animal rights:

It may come one day to be recognized, that the number of legs, the villosity of the skin, or the termination of the os sacrum, are reasons equally insufficient for abandoning a sensitive being to the same fate. What else is it that should trace the insuperable line? Is it the faculty of reason, or perhaps, the faculty for discourse? …the question is not, Can they reason? nor, Can they talk? but, Can they suffer? (IPML, chapter XVII)

Reasoning of this sort extends the domain of morally relevant beings further than many were comfortable with. Bentham was not alone among utilitarians in suggesting that non-human life should be taken into moral consideration. In his Utilitarianism, Mill noted that lives full of happiness and free from pain should be “secured to all mankind; and not to them only, but, so far as the nature of things admits, to the whole sentient creation.” This emphasis on the importance of the well-being of animal life, as well as human life, has persisted into contemporary utilitarian thought.

d. Early Objections and Mill’s Utilitarianism

In the 19th century, knowledge of utilitarianism spread throughout society. This resulted in many criticisms of the view. Some of these were legitimate challenges to the view, which persist in some form today. Others, however, were based upon mistaken impressions.

In 1861, frustrated by what he saw as misunderstandings of the view, John Stuart Mill published a series of articles in Fraser’s Magazine, introducing the theory and addressing some common misconceptions. This was later published as a book, Utilitarianism (1863). Mill was somewhat dismissive of the importance of this work. In letters, he described it as a “little treatise”, and barely mentioned it in his Autobiography (unlike all his other major works). Despite this, it is the most widely consulted defence of utilitarianism.

Here are some of the early criticisms of utilitarianism, and Mill’s responses.

i. Dickens’ Gradgrindian Criticism

In the 19th century, utilitarianism was perceived by some of its detractors as cold, calculating, and unfeeling. In his 1854 novel, Hard Times, Charles Dickens portrays a caricature of a utilitarian in the character of Thomas Gradgrind. Gradgrind, who is described explicitly as a utilitarian, is introduced as follows:

Thomas Gradgrind, sir. A man of realities. A man of facts and calculations. A man who proceeds upon the principle that two and two are four, and nothing over, and who is not to be talked into allowing for anything over. Thomas Gradgrind, sir—peremptorily Thomas—Thomas Gradgrind. With a rule and a pair of scales, and the multiplication table always in his pocket, sir, ready to weigh and measure any parcel of human nature, and tell you exactly what it comes to. It is a mere question of figures, a case of simple arithmetic. You might hope to get some other nonsensical belief into the head of George Gradgrind, or Augustus Gradgrind, or John Gradgrind, or Joseph Gradgrind (all supposititious, non-existent persons), but into the head of Thomas Gradgrind—no, sir!

The reputation of utilitarians for being joyless and overly fixated on precision was so established that John Stuart Mill addressed this misconception in Utilitarianism (1861). Mill complains that the opponents of utilitarianism have mistakenly supposed that the view opposes pleasure, which he describes as an “ignorant blunder”. This view of the position may come, in part, from its name, and the focus on utility, or what is useful or functional, terms seldom associated with happiness.

Despite Mill’s frustrations with this criticism, the colloquial use of the word “utilitarian” continued to carry a similar connotation long after his death. In an episode of the sitcom Seinfeld, for example, Elaine notes that while the female body is aesthetically appealing, “The male body is utilitarian — it’s for getting around. It’s like a Jeep” (1997). The implication is that utilitarian objects are functional rather than fun. This association may be unfortunate and unfair, as Mill argues, but it has been a persistent one.

This particular criticism may be unfortunate, but aspects of it, such as the focus on measurement and arithmetic, foreshadow some of utilitarianism’s later criticisms, like John Rawls’ (1921-2002) suggestion that it cannot appreciate the separateness of persons, or Bernard Williams’ (1923-2003) complaint that the view insists that people regard themselves as merely nodes in a utility calculus.

ii. The ‘Swine’ Objection and ‘Higher Pleasures’

Another criticism regularly levelled against utilitarianism was that it is unfit for humans, because the focus on pleasure would not allow for the pursuit of uniquely human goods. This criticism was also made (unfairly) of the Epicureans. It suggested that the hedonist would endorse a life consisting entirely in eating, sleeping, and having sex, devoid of more sophisticated activities like listening to music, playing card games, or enjoying poetry. The allegation is that the utilitarian proffers an ethics for swine, undignified for human beings. Consequently, the opponent suggests, the view must be rejected.

There are several ways a utilitarian could respond to this. They could adopt the Epicurean strategy of suggesting that the animalistic pleasures are just as good but are not sustainable. If you try to spend all your time eating delicious food, your appetite will run out, and you may make yourself sick. Pleasures of the mind, however, can be pursued for longer. Someone able to take pleasure in poetry or music can also satisfy that taste more readily: indulging pleasures of these sorts does not require scarce resources, and so they are less vulnerable to contingent environmental factors. A bad harvest may ruin one’s ability to enjoy a certain food, but it would not tarnish one’s ability to enjoy a piece of music or think about philosophy. This is the type of response that would satisfy Bentham, who thought that no type of pleasure was intrinsically better than another (that push-pin “is of equal value with the arts and sciences of music and poetry”).

Mill disagreed with Bentham on this matter, claiming instead that “some kinds of pleasure are more desirable and more valuable than others”. On his view, the pleasure gained from appreciating a sophisticated poem or an opera could be better than the pleasure from push-pin, even if both instances had the same duration, were equally intense, and had no additional relevant consequences.

This was a controversial aspect of Mill’s utilitarianism, and many found his justification for it unconvincing. He suggested that someone who had experienced two different kinds of pleasure would be able to discern which was of higher quality. Some people may not be able to appreciate some forms of pleasure, because of ignorance or a lack of intelligence, just as animals are not capable of enjoying a great novel. But, according to Mill, it is generally better to be the intelligent person than the fool, and better to be a human than a pig, even a happy one: “It is better to be a human being dissatisfied than a pig satisfied; better to be Socrates dissatisfied than a fool satisfied. And if the fool, or the pig, is of a different opinion, it is only because they only know their own side of the question” (Mill, Utilitarianism, chapter 2).

Mill’s suggestion, however, invites scrutiny. Many people do opt for “lower” pleasures rather than “higher” ones, even when capable of enjoying both. One might also wonder whether some mixture of different kinds of pleasures might be preferable to restricting oneself to pleasures more closely associated with the intellect and reasoning (which Mill regards as superior). Mill does not consider this possibility, nor the possibility that different people may simply have different preferences regarding these kinds of pleasure, without that indicating any superiority or inferiority. Mill’s proposal raises many questions, so a utilitarian may find that the simpler, Benthamite ‘quantitative hedonism’ is preferable to Mill’s ‘qualitative hedonism’.

While this aspect of Mill’s utilitarianism is contentious, a similar type of argument is still utilised to justify the claim that animals have a different moral status (see also the discussion of animals and ethics).

iii. Demandingness

Because of the classical utilitarian commitment to maximisation, utilitarianism is sometimes accused of being excessively demanding. Everyone is required, according to the classical utilitarian, to bring about the most happiness. If an individual can best serve the general utility by living an austere, self-sacrificial life, this is what the utilitarian calculus demands. However, this strikes many as counterintuitive. According to common-sense moral thinking, people can use their time in myriad ways without having morally failed, but the maximiser states that one must always do the very best. Morality then threatens to encroach on every decision.

Mill was aware of this criticism. He identified two particular ways this might be a concern.

First, utilitarianism may be seen to require that moral agents are always thinking about duty, that this must be the motive in every action a person performs. Thinking about morality must be central in all a person’s decisions. This, he claims, is a mistake. Mill argues that the business of ethics is people’s conduct, not whether they act because of a conscious desire to bring about the greatest utility. He provides an example to illustrate this. If a bystander notices someone drowning, what matters is that they save them, whatever their reasons might be:

He who saves a fellow creature from drowning does what is morally right, whether his motive be duty, or the hope of being paid for his trouble: he who betrays the friend that trusts him, is guilty of a crime, even if his object be to serve another friend to whom he is under greater obligations. (Utilitarianism, chapter 2)

Here, Mill makes a distinction between the moral worth of the action and the moral worth of an agent. As far as the action is concerned, the drowning person being rescued is what matters. Whether the person doing the saving is an admirable person might depend on whether they did it for noble reasons (like preventing suffering) or selfish reasons (like the hope of some reward), but utilitarianism is primarily concerned with what actions one should do. In other places, Mill does talk extensively about what makes a virtuous person, and this is strongly connected to his utilitarian commitments.

Second, Mill was aware of the worry that utilitarianism might dominate one’s life. If every action one performs must maximise utility, will this not condemn one to be constantly acting for the sake of others, to the neglect of the things that make one’s own life meaningful? Mill was dismissive of this worry, claiming that “the occasions on which any person (except one in a thousand) has it in his power to do this on an extended scale, in other words, to be a public benefactor, are but exceptional”. Sometimes, one might find oneself in a situation where one could save a drowning stranger, but such scenarios are rare. Most of the time, Mill thought, one individual does not have the ability to affect the happiness of others to any great degree, so they can focus on improving their own situation, or the situations of their friends or families.

In the 19th century, this response may have been more satisfactory, but today it seems wildly implausible. Due to the existence of effective charities, and the ability to send resources around the world instantly, an affluent person can make enormous differences to the lives of people halfway around the world. This could be in terms of providing food to countries experiencing famine, inoculations against debilitating illnesses, or simply money to alleviate extreme poverty. In his time, perhaps Mill could not have been confident that small sums of money could prevent considerable suffering, but today’s middle classes have no such excuse.

Because of technological developments, for many people in affluent countries, maximising happiness may require living a very austere life, while giving most of their resources to the world’s poorest people. This appears implausible to many people, and this intuition forms the basis of one of the major objections to utilitarianism today. Some have responded to this by moving to rule, satisficing, or scalar forms of utilitarianism (see section 5).

iv. Decision Procedure

The utilitarian claims that the right action is that which maximises utility. When an agent acts, they should act in a way that maximises expected utility. But how do they determine this? One way is to consider every possible action one might do, and for each one, think about all the consequences one might expect (with appropriate weightings for how likely each consequence would be), come up with an expected happiness value for each action, and then pick the one with the highest score. However, this sounds like a very time-consuming process. This will often be impossible, as time is limited. Is this a problem for utilitarians? Does it make the view impractical?
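The expected-utility procedure described above can be sketched in a few lines of code. The actions, outcomes, utility values, and probabilities here are purely hypothetical illustrations, not drawn from the utilitarian literature; the sketch only shows the shape of the calculation:

```python
# A toy sketch of the expected-utility decision procedure:
# weight each possible outcome's utility by its probability,
# sum these per action, and pick the action with the highest total.

def expected_utility(outcomes):
    """Sum of utility * probability over an action's possible outcomes."""
    return sum(utility * probability for utility, probability in outcomes)

# Each (hypothetical) action maps to a list of (utility, probability) pairs.
actions = {
    "donate to charity": [(10, 0.8), (0, 0.2)],   # expected utility: 8.0
    "buy a coffee":      [(2, 1.0)],              # expected utility: 2.0
    "stay home":         [(1, 0.9), (-1, 0.1)],   # expected utility: 0.8
}

best_action = max(actions, key=lambda a: expected_utility(actions[a]))
print(best_action)  # -> "donate to charity"
```

Even this toy version makes the objection vivid: in real life the space of possible actions and consequences is vast and the probabilities unknown, which is exactly the practical difficulty the critic presses.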

Mill was aware of this concern, that “there is not time, previous to action, for calculating and weighing the effects of any line of conduct on the general happiness.” However, Mill thinks this objection obscures relevant information gained throughout human history. As people have acted in all sorts of ways, with varying results, any person today can draw upon humanity’s wealth of knowledge of causes and effects, as well as from their own experiences. This background knowledge provides reasons to think that some actions are likely to be more conducive to happiness than others. Often, Mill thinks, an agent will not need to perform any calculations of utility to determine which actions best promote happiness; it will just be obvious.

Mill ridicules the suggestion that individuals would be completely ignorant of what actions they must do if they were to adopt utilitarianism. There would, of course, be no need to contemplate on each occasion whether theft or murder promotes utility—and even if there were, he suggests that this would still not be particularly puzzling. Dismissing this criticism with some derision, Mill notes that “there is no difficulty in proving any ethical standard whatever to work ill, if we suppose universal idiocy to be conjoined with it”.

However, this kind of objection relates to an interesting question. Should a utilitarian endorse reasoning like a utilitarian? Mill suggests that it is preferable on many occasions to make use of rules that have been previously accepted. But how does one determine when to use a rule and when to perform a utility calculation? Some of Mill’s remarks about how to use rules have prompted commentators to regard him as a rule-utilitarian (see section 5.c). Utilitarianism also seems to allow for the possibility that no one should believe that utilitarianism is true. If, for instance, it turns out that the world would be a happier place if everyone accepted a Kantian ethical theory, the utilitarian should, by their own lights, favour a world where everyone believes Kant’s theory. Henry Sidgwick (1838-1900) took this possibility seriously, defending the idea that perhaps only an “enlightened few” should know the truth about morality, keeping it hidden from the masses.

Utilitarians can say that the truth of their view does not depend on what the correct decision procedure is. Whether performing a utility calculus or simply acting on common-sense morality leads to most happiness, they can still say that the right actions are those that lead to happiness being maximised, that is, that utilitarianism is the correct theory. However, given that utilitarians do tend to care about how people should act, and want to change behaviours, the question of how one should decide what to do is pertinent. Exactly what the relationship between utilitarianism and practical reasoning is, or should be, according to utilitarians, is a persisting question.

4. The Utilitarian Movement

Today, utilitarianism is regarded primarily as a moral theory which can be used to determine the obligations of an individual in a situation. This focus on individual morality gives an inaccurate impression of the Utilitarian movement (‘Utilitarianism’ with a capital ‘U’ will be used to indicate the movement, as distinct from the moral theory) in the 18th and 19th centuries. The Utilitarians were keenly focused on social change. This took the form of revising social policy with the aim of improving the general happiness. Bentham is explicit on the first page of Introduction to the Principles of Morals and Legislation that the principle of utility applies not only to actions of private individuals, but also to “every measure of government”. Helvétius was similarly minded, emphasising the importance of laws that could make people happy, as well as ways to change people, so that they could be made happy more easily.

The Utilitarian project was an ambitious one. Every policy, every law, every custom was open to scrutiny. If it was deemed not conducive to general happiness, the Utilitarians suggested it should be disregarded or replaced. Because they were so willing to disregard customs—even those the general community placed high value on—the Utilitarians were a radical group. This section discusses some of the policies supported by Utilitarians.

A common plea from Utilitarians, deemed radical at the time, was for women’s suffrage. A notable example of this comes from Harriet Taylor (1807-1858). Taylor befriended and later married John Stuart Mill, and she is regarded as a prominent Utilitarian in her own right. She had a significant influence on Mill’s writing (exactly how much influence she had is a matter of dispute, though Mill said in his introduction to On Liberty, “Like all that” he had “written for many years, it belongs as much to her as to” him). In Taylor’s Enfranchisement of Women (1851), she argues that women should have equal political rights to men, including the right to vote and to serve on juries. In fact, Taylor’s arguments call for equal access to all spheres of public life. In particular, she claimed women should be able to enter all professions, including running for political office.

In the same essay, Taylor condemned slavery. This was another point Utilitarians were largely united on. Bentham also criticised slavery on the grounds that it had negative effects on the general happiness, and when abolition was discussed in parliament, he actively opposed compensating slave-traders for their losses. John Stuart Mill was also vocal on the topic of slavery and the just treatment of former slaves. As a Member of Parliament, Mill chaired the Jamaica Committee, which aimed to prosecute Governor Eyre of Jamaica, who used excessive and deadly force in suppressing an uprising at Morant Bay in 1865. This pitted Mill against many prominent intellectuals, including his contemporary (and sometimes friend) Thomas Carlyle (1795-1881). Mill received assassination threats for his position, which was seen by many as overly sympathetic towards the Black Jamaicans.

Like his wife, John Stuart Mill also campaigned for the rights of women. He thought not only that society would benefit considerably from the liberation of women, but also that there would be an “unspeakable gain in private happiness to the liberated half of the species; the difference to them between a life of subjection to the will of others, and a life of rational freedom”. As well as making the case in his book The Subjection of Women (which drew heavily upon material from his wife’s previous work), Mill spoke passionately in favour of expanding suffrage in Parliament. This cause clearly moved Mill, who was reportedly arrested as a teenager for distributing information about contraception. Henry Sidgwick was also an active campaigner, particularly regarding education reform. He became one of the leading voices advocating for access to higher education for women and was one of the organisers of “Lectures for Ladies” at Cambridge, which, in 1871, led to the formation of Newnham College, an all-women’s college (at the time, women were not allowed to attend the university).

Jeremy Bentham, in the early 1800s, wrote essays defending sexual freedom. He was motivated by the harsh way that society treated homosexuals and thought there could be no utilitarian justification for this. While many members of the public may have been offended by these behaviours, they were not harmful, but the restrictions and punishments faced by the marginalised groups were.

Utilitarians were also vocal in defense of animal welfare. Bentham argued that the feature relevant for whether an entity has moral status is “is not, Can they reason? nor, Can they talk? but, Can they suffer?”. Mill, despite famously arguing that humans can appreciate “higher pleasures” than animals, is insistent that animal welfare is relevant. He thought it obvious that, for a utilitarian, any practice that led to more animal suffering than human pleasure was immoral, thus it seems likely he would have opposed factory farming practices.

Not all of the proposals endorsed by Utilitarians are looked on quite so favourably with a modern eye. While John Stuart Mill argued, from utilitarian principles, for a liberal democratic state, he suggested that those arguments did not apply to “barbarians” who were “unfit for representative government”. Infamously, Mill considered India unsuitable for democracy, and is seen by some as an apologist for the British Empire for defending this kind of view.

Another infamous proposal from the Utilitarians comes from Bentham in the domain of prison reform. Bentham suggested an innovative prison design known as the “panopticon” (1787), which was intended to be humane and efficient. A panopticon prison is circular, with cells around the edges and an inspector’s lodge in the middle, situated so that every cell is visible from it. Blinds on the lodge, however, would prevent the prisoners from seeing whether they were being watched, or even whether a guard was present, at any given time. The mere possibility of being watched at any time, Bentham thought, would suffice to ensure good behaviour. He also thought that this would prevent guards from mistreating prisoners, as that too would be visible. The panopticon was later popularised and criticised by Michel Foucault in Discipline and Punish; it is notorious for imposing psychological punishment on inmates, since never knowing whether one is being watched can be psychologically stressful. For better or worse, the panopticon anticipated many developments in surveillance present in early 21st-century society.

In each of these proposals, the Utilitarians insisted that policies, laws, or customs must be justified by their effects. If the effects were positive, they were good and could be maintained. If the effects were negative, they should be dispensed with. This attitude, and the radical political ambition, characterised Utilitarianism as a movement.

5. Utilitarianism in the 20th Century

Despite its many detractors, utilitarianism in one form or another continued to hold sway as one of the major moral approaches throughout the 20th century. Philippa Foot (1920-2010) claimed in 1985 that it “tends to haunt” even those who reject the view. That being said, during the 20th century, new criticisms of the view emerged, and previous objections were explored in considerably more depth. This resulted in additional complications to the view, novel defences, and variations on the classical view.

In this section, some of the major 20th-century developments for utilitarianism are discussed. Some views that have previously been described under the heading of “utilitarianism” are omitted, because they veer too far from the core view. For example, G. E. Moore’s “ideal utilitarianism”, despite its name, departs significantly from the central utilitarian commitments, so it is not included here (it is typically regarded as a non-utilitarian form of consequentialism).

a. Hedonism and Welfarism

The hedonism embraced by classical utilitarianism is controversial. Some of the reasons for this have already been discussed, such as the suggestion that the claim that pleasure is all that matters is crude, or a doctrine “worthy of swine”. An additional complaint is that hedonism offers an impoverished theory of the good, ignoring values such as achievement or authenticity. One thought experiment that exemplifies this is the “experience machine” given by Robert Nozick (1938-2002):

Suppose there were an experience machine that would give you any experience you desired. Superduper neuropsychologists could stimulate your brain so that you would think and feel you were writing a great novel, or making a friend, or reading an interesting book. All the time you would be floating in a tank, with electrodes attached to your brain. Should you plug into this machine for life, pre-programming your life’s experiences? (Nozick, Anarchy, State & Utopia, 1974)

Nozick supposes that many people would be reluctant to plug into the machine. Given that the machine could guarantee more pleasurable experiences than life outside it could, this suggests that people value something other than simply the pleasurable sensations. If some of the things that one would miss out on inside the machine (like forming relationships or changing the world in various ways) are valuable, this suggests that hedonism—the claim that only pleasure matters—is false.

In the 20th century, as a result of rejecting the hedonistic component, several utilitarians modified their view, such that utility could be understood differently. One way to change this is to suggest that the classical view is right that it is important that a person’s life goes well (their well-being), and also that this is the only thing that matters morally, but that it gets something wrong about what makes a person’s life go well. Rather than just a matter of how much pleasure a life contains, we might think well-being is best understood in another way. If a view holds that the well-being of individuals—however this is best understood—is the only moral value, it is welfarist.

One account of well-being regards preferences as especially important, such that a person’s life is made better by their preferences being satisfied. This view, which when joined to utilitarianism is known as preference utilitarianism, is able to evade the problems caused by the experience machine, because some of our preferences are not just to experience certain sensations, but to do things and to have relationships. These preferences would remain unsatisfied in an artificial reality, so the preference utilitarian could regard a person’s life as going less well as a result (even if they do not know it).

However, preference utilitarianism has problems of its own. For instance, some preferences simply do not seem that important. John Rawls (1921-2002) imagines a case of an intellectually gifted person, whose only desire is to count blades of grass. According to preference-satisfaction theories of well-being, if such a person is able to spend all their time grass-counting, their life is as good as it can be. Yet many have the intuition that this life is lacking some important features, like participating in social relationships or enjoying cultural pursuits. If there is some value lacking in the life of the grass-counter, this implies something wrong with the preference-satisfaction account of well-being.

Another objection against preference utilitarianism concerns preferences a person no longer has. If someone has a preference for something to happen, then forgets about it, never to find out whether it occurs, does this actually make their life go better? To take this to an extreme, does a person’s life improve if one of their preferences is satisfied after they die? Utilitarians who are more hedonistically inclined find this implausible. Peter Singer, one of utilitarianism’s most famous defenders, previously endorsed preference utilitarianism, but has since abandoned this in favour of hedonistic utilitarianism.

b. Anscombe and ‘Consequentialism’

G.E.M. Anscombe (1919-2001) was an influential figure in 20th century philosophy. She was not a utilitarian but was responsible for significant changes in how utilitarianism was discussed. In ‘Modern Moral Philosophy’ (1958), Anscombe expressed extremely critical views about the state of moral philosophy. She thought the notion of morality as laws or rules that one must follow made little sense in a secular world; that without a divine law-maker (God), injunctions to or prohibitions against acting some way lacked authority. She was similarly critical of Kant, claiming that the idea that one could legislate for oneself was “absurd”. Among other things, her paper—and Anscombe’s general rejection of the major ethical theories of her day—sparked renewed interest in Aristotelian ethical thinking and the development of virtue ethics.

Anscombe also criticised utilitarianism as a “shallow philosophy” because it purported always to be able to give clear-cut answers. She claimed that in ethics borderline cases are ubiquitous. In such cases there is no obvious answer, and even if there is a correct answer, it might be something one should be conflicted about.

Anscombe’s criticisms of utilitarians since Sidgwick were particularly scathing. She claimed that they held a view of intention that meant everything that was foreseen was intended—a view she thought was “obviously incorrect”. Anscombe invented the term “consequentialism” as a name for the view she was critical of, distinguishing this from “old-fashioned Utilitarianism”. After Anscombe, “consequentialism” became a broader label than utilitarianism. As well as the classical view outlined above, “consequentialism” allowed for different conceptions of the good. For example, a view that thought that only consequences matter, but held that—as well as happiness or well-being—beauty is intrinsically valuable would be consequentialist, but not utilitarian (this is why G.E. Moore’s “ideal utilitarianism” has not been discussed in this article, as he makes claims of this sort). Today, the term “consequentialism” is used more often by philosophers than “utilitarianism”, though many of those identifying as consequentialists either embrace or sympathise with utilitarianism.

c. Act versus Rule

In the 20th century, a distinction that had been noted previously was scrutinised and given a name: the act/rule distinction. Versions of rule-utilitarianism had been given before the 20th century. The rule-utilitarian claims that, rather than examining the consequences of a particular action to determine its ethical status, one should consider whether it is compatible with a set of rules that would have good consequences if (roughly) most people accepted them.

The term “rule-utilitarian” was not in popular use until the second half of the 20th century, but the central claim—that the rules one is acting in accordance with determine the moral status of one’s actions—was much older. George Berkeley (1685-1753) is sometimes suggested to have offered the first formulation of rule-utilitarianism. He suggested that we should design rules that aim towards the well-being of humanity, that “The Rule is framed with respect to the Good of Mankind, but our Practice must be always shaped immediately by the Rule”.

Later in the 18th century, William Paley (1743-1804) also suggested something like rule-utilitarianism, in response to the problem that his view would seemingly condone horrible behaviours, like lying one’s way to a powerful position, or murder, if the consequences were good enough. Paley rejected this by claiming that the consequences of the rule should be considered. If one were willing to lie or cheat or steal in order to promote the good, Paley suggested, this would license others to lie, cheat, or steal in other situations. If others did, from this precedent, decide that lying, cheating, and stealing were permissible, this would have bad consequences, particularly when people did these things for nefarious reasons. Thus, Paley reasoned, these behaviours should be prohibited. Later still, in his Utilitarianism, John Stuart Mill proposed what some have interpreted as a form of rule-utilitarianism, though this interpretation is controversial.

While principles that can properly be regarded as rule-utilitarian were proposed before, it was in the 20th century that these views received the name “rule-utilitarianism” and were given extensive scrutiny.

Before considering some of the serious objections to rule-utilitarianism, it is worth noting that the view has some apparent advantages over classical act-utilitarianism. Act-utilitarians have difficulty making sense of prohibitions resulting from rights. Jeremy Bentham famously described the idea of natural rights as “nonsense upon stilts”, but this is a controversial position. It is often argued that we do have rights, and that these are unconditional and inalienable, such as the right to bodily autonomy. If one person has a right to bodily autonomy, this is understood as requiring that others do not use their body in certain ways, regardless of the consequences. However, basic act-utilitarianism cannot make sense of this. In a famous example, Judith Jarvis Thomson (1929-2020) imagines a surgeon who realises they could save the lives of five patients by killing a healthy person who happens to be the right blood type. Assuming the surgeon could avoid any special negative consequences of killing an innocent healthy person (perhaps they can make the killing look like an accident, to prevent the public panicking about murderous surgeons), an act-utilitarian seems committed to the view that the surgeon should kill the one in order to save the five. The rule-utilitarian, however, has a neat response. They can suggest that a set of rules that gives people rights over their own bodies—rights that preclude surgeons killing them even if they have useful organs—leads to more happiness overall, perhaps because of the feeling of safety or self-respect that this might result in. So the rule-utilitarian can say such a killing was wrong, even if on this particular occasion it would have resulted in the best consequences.

Another potential advantage for rule-utilitarians is that they may have an easier time avoiding giving extremely demanding moral verdicts. For the act-utilitarian, one must always perform the action which has the best consequences, regardless of how burdensome this might be. Given the state of the world today, and how much people in affluent countries could improve the lives of those living in extreme poverty with small sums of money, act-utilitarianism seems to imply that affluent people in developed nations must donate the vast majority of their disposable income to those in extreme poverty. If buying a cup of coffee does not have expected consequences as good as donating the money to the Against Malaria Foundation to spend on mosquito nets, the act-utilitarian claims that buying the cup of coffee is morally wrong (because of the commitment to maximising). Rule-utilitarians can give a different answer. They consider what moral rule would be best for society. One of the reasons act-utilitarianism is so burdensome for a given individual is that the vast majority of people give nothing or very little. However, if every middle-class person in developed nations donated 10% of their income, this might be sufficient to eliminate extreme poverty. So perhaps that would be the rule a rule-utilitarian would endorse.

Despite these advantages, rule-utilitarianism has many problems of its own. One issue pertains to the strength of the rules. Consider a rule prohibiting lying. This might seem like a good rule for a moral code. However, applying this rule in a case where a would-be murderer asks for the location of a would-be victim would seemingly have disastrous consequences (Kant is often ridiculed for his absolutist stance in this case). One response would be to make the rules more specific. Perhaps “do not lie” is too broad, and the rule “do not lie, unless doing so saves a life” is better. But if all rules should be made more and more complicated whenever this leads to rules with better consequences, this defeats the purpose of having rules. As J. J. C. Smart (1920-2012) pointed out, the view then seems to collapse into a version of act-utilitarianism. In Smart’s words:

 I conclude that in every case if there is a rule R the keeping of which is in general optimific, but such that in a special sort of circumstances the optimific behaviour is to break R, then in these circumstances we should break R…. But if we do come to the conclusion that we should break the rule…what reason remains for keeping the rule?  (Smart, ‘Extreme and Restricted Utilitarianism’, 1956)

On the other hand, one might suggest that the rules stand, and that lying is wrong in this instance. However, this looks like an absurd position for a utilitarian to take, as they claim that what matters is promoting good consequences, yet they will be forced to endorse an action with disastrous consequences. If they suggest rule-following even when the consequences are terrible, this is difficult to reconcile with core consequentialist commitments, and looks like—in Smart’s terms—“superstitious rule worship”. Is it not incoherent to suggest that only the consequences matter, but also that sometimes one should not try to bring about the best consequences? The rule-utilitarian thus seems to face a dilemma. Of the two obvious responses available, one leads to a collapse into act-utilitarianism and the other leads to incoherence.

Richard Brandt (1910-1997) was the first to offer a rigorous defence of rule-utilitarianism, and he suggests one way of responding to the above criticism. The rules, he proposes, should be of a fairly simple sort, like “do not lie” and “do not steal”, but in extreme scenarios these rules can be suspended. When a murderer arrives at the door asking for the location of one’s friends, this is an extreme case, so ordinary rules can be suspended and disaster averted. A version of this strategy, where the correct set of rules includes an “avoid disaster” rule, is defended by the contemporary rule-consequentialist Brad Hooker (Hooker’s own view is not strictly rule-utilitarian because his code includes a prioritarian caveat: he thinks there is some moral importance to prioritising the worst-off in society, over and above the contribution this makes to total well-being).

A second problem for rule-utilitarians concerns issues relating to partial compliance. If everyone always acted morally decently and followed the rules, this would mean that certain rules would not be required. For instance, there would be no rules needed for dealing with rule-breakers. But it is not realistic to think that everyone will always follow the rules. So, what degree of compliance should a rule-utilitarian cater for when devising their rules? Whatever answer is given to this is likely to look arbitrary. Some rule-utilitarians devise the rules not in terms of compliance, but acceptance or internalisation. Someone may have accepted the rules but, because of weakness of will or a misunderstanding, still break the rules. Formulating the view this way means that the resulting code will incorporate rules for rule-breakers.

A further dispute concerns whether rule-utilitarianism should really be classified as a form of utilitarianism at all. Because the rightness of an action is only connected to consequences indirectly (via whether or not the action accords to a rule and whether the rule relates to the consequences in the right way), it is sometimes argued that this should not count as a version of utilitarianism (or consequentialism) at all.

d. Satisficing and Scalar Views

A common objection to act-utilitarianism is that, by always requiring the best action, it demands too much. In ordinary life, people do not view each other as failing whenever they do something that does not maximise utility. One response to this is to reconstrue utilitarianism without the claim that an agent must always do the best. Two attempts at such a move will be considered here. One replaces the requirement to do the best with a requirement to do at least good enough. This is known as satisficing utilitarianism. A second adjustment removes obligation entirely. This is known as scalar utilitarianism.

Discussions of satisficing were introduced into moral philosophy by Michael Slote, who found maximising versions of utilitarianism unsatisfactory. Satisficing versions of utilitarianism hope to provide more intuitive verdicts. When someone does not give most of their money to an effective charity, which may be the best thing they could do, they might still do something good enough by giving some donation or helping the needy in other ways. According to the satisficing utilitarian, there is a standard which actions can be measured against. A big problem for satisficing views arises when they are challenged to say how this standard is arrived at—how do they figure out what makes an action good enough? Simple answers to the question have major issues. If, for instance, they suggest that everyone should bring about consequences at least 90% as good as they possibly can, this suggests someone can always permissibly do only 90% of the best. But in some cases, doing what brings about 90% of the best outcome looks really bad. For example, if 10 people are drowning, and an observer can decide how many to save without any cost to themselves, picking 9—and allowing one to die needlessly—would be a monstrous decision. Many sophisticated versions of satisficing utilitarianism have been proposed, but none so far has escaped some counterintuitive implications.

The problem of where to set the bar is not one faced by the scalar utilitarians, as they deny that there is a bar. The scalar utilitarian acknowledges that what makes actions better or worse is their effects on people’s well-being but shuns the application of “rightness” and “wrongness”. This approach avoids problems of being overly or insufficiently demanding, because it makes no demands. The scalar view avoids deontic categories, like permissible, impermissible, required, and forbidden. Why might such a view seem appealing? For one thing, the categories of right and wrong are typically treated as binary—the act-utilitarian says actions are either right or wrong, a black-and-white matter. If the moral quality of actions is in fact richly textured, this binary treatment might look unsatisfactory. Furthermore, using the blunt categories of “right” and “wrong”, someone confident that they have acted rightly may become morally complacent. Unless you are doing the very best, there is room for improvement, scope for doing better, which can be obscured by viewing acts as merely permissible or impermissible. While some utilitarians have found this model attractive, abandoning “right” and “wrong” is a radical move, and perhaps unhelpful. It might seem very useful, for instance, for some actions to be regarded as forbidden. Similarly, an account of morality which sets the boundaries of permissible action may be much more useful for regulating behaviour than one which treats morality merely as a matter of degree.

6. Utilitarianism in the Early 21st Century

In moral theory, discussions of utilitarianism have been partly subsumed under discussions of consequentialism. As typically classified, utilitarianism is simply a form of consequentialism, so any problems that a theory faces in virtue of being consequentialist are also faced by utilitarian views. Some consequentialists will also explicitly reject the label of “utilitarianism” because of its commitment to a hedonistic or welfarist account of the good. Brad Hooker, for example, endorses a rule-consequentialism where not only the total quantity of happiness matters (as the utilitarian would suggest), but where the distribution of happiness is also non-instrumentally important. This allows him to claim that a world with slightly less overall happiness, but where the poorest are happier, is all-things-considered better than a world with more total happiness, but where the worst-off are miserable.

While many of the discussions concern consequentialism more broadly, many of the arguments involved in these discussions still resemble those from the 19th century. The major objections levelled against consequentialism in the early 21st century—for example, whether it demands too much, whether it can account for rights or justice, or whether it allows partial treatment in a satisfactory way—target its utilitarian aspects.

The influence of utilitarian thinking and the Utilitarian movement is still observable. One place where Utilitarian thinking is particularly conspicuous is in the Effective Altruism movement. Like the 19th century Utilitarians, Effective Altruists ask which interventions in the world will actually make a difference, and they promote the behaviours that do the most good. Groups such as Giving What We Can urge individuals to pledge a portion of their income to effective charities. What makes a charity effective is determined by rigorous scientific research to ascertain which interventions have the best prospects for improving people’s lives. Like the classical utilitarians and their predecessors, they answer the question of “what is good?” by asking “what is useful?”. In this respect, the spirit of utilitarianism lives on.

7. References and Further Reading

  • Ahern, Dennis M. (1976): ‘Is Mo Tzu a Utilitarian?’, Journal of Chinese Philosophy, 3, 185-193.
    • A discussion about whether the utilitarian label is appropriate for Mozi.
  • Anscombe, G. E. M. (1958): ‘Modern Moral Philosophy’, Philosophy, 33(124), 1-19.
    • Influential paper where Anscombe criticises various forms of utilitarianism popular at the time she was writing, and also introduces the word “consequentialism”.
  • Bentham, Jeremy (1776): A Fragment on Government, F. C. Montague (ed.) Oxford: Clarendon Press (1891).
    • One of the first places utilitarian thinking can be seen in Bentham’s writings.
  • Bentham, Jeremy (1787): ‘Panopticon or The Inspection House’, in The Panopticon Writings. Ed. Miran Bozovic (London: Verso, 1995). p. 29-95
    • This is where Bentham proposes his innovative prison model, the “panopticon”. It also includes lengthy discussions of how prisoners should be treated, as well as proposals for hospitals, “mad-houses” and schools.
  • Bentham, Jeremy (1789): An Introduction to the Principles of Morals and Legislation., Oxford: Clarendon Press, 1907.
    • Seen as the first rigorous account of utilitarianism. It begins by describing the principle of utility, and it continues by considering applications of the principle in morality and legal policy.
  • Brandt, R. B. (1959): Ethical Theory, Englewood-Cliffs, NJ: Prentice Hall.
    • This book offers a clear formulation of rule-utilitarianism, and it is one of the earliest resources that refers to the view explicitly as “rule-utilitarianism”.
  • Chastellux, François-Jean de (1774): De la Félicité publique, (“Essay on Public Happiness”), London: Cadell; facsimile reprint New York: Augustus Kelley, 1969.
    • This book is where Chastellux investigates the history of human societies in terms of their successes (and failures) in securing happiness for their citizens.
  • Cumberland, Richard (1672): A Treatise of the Laws of Nature (De Legibus Naturae), selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
    • Here Cumberland discusses the nature of things, and introduces his natural law view, which leads to some utilitarian-like conclusions.
  • Dabhoiwala, Faramerz (2014): ‘Of Sexual Irregularities by Jeremy Bentham—review’, The Guardian,  https://www.theguardian.com/books/2014/jun/26/sexual-irregularities-morality-jeremy-bentham-review.
    • Article about a recent book discussing Bentham’s position on sexual ethics.
  • De Lazari-Radek, Katarzyna and Singer, Peter (2014): The Point of View of the Universe, Oxford University Press.
    • An exposition of Henry Sidgwick’s utilitarianism, considering his view in light of contemporary ethical discussions.
  • Dickens, Charles (1854): Hard Times, Bradbury & Evans.
    • Novel featuring Thomas Gradgrind—a caricature of a utilitarian.
  • Foot, Philippa (1985): ‘Utilitarianism and the Virtues’, Mind, 94(374), 196-209.
    • Foot—an opponent of utilitarianism—notes how utilitarianism has been extremely persistent. She suggests that one reason for this is that utilitarianism’s opponents have been willing to grant that it makes sense to think of objectively better and worse “states of affairs”, and she scrutinises this assumption.
  • Gay, John (1731): Concerning the Fundamental Principle of Virtue or Morality, selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
    • This includes Gay’s challenge to secular versions of utilitarianism, to explain moral motivation.
  • Helvétius, Claude (1777): A Treatise on Man, His Intellectual Faculties, and His Education, 2 vols., London: B. Law and G. Robinson.
    • Published after Helvétius’ death, this work includes lengthy discussions of how society may be altered to better promote happiness.
  • Heydt, Colin (2014): ‘Utilitarianism before Bentham’, in The Cambridge Companion to Utilitarianism, Cambridge: Cambridge University Press, pp. 16-37. doi:10.1017/CCO9781139096737.002
    • This paper describes the intellectual development of utilitarianism, drawing attention to the non-utilitarian origins, as well as the distinct religious and secular variations of utilitarianism in Britain, and the French utilitarians.
  • Hooker, Brad (2000): Ideal Code, Real World: A Rule-consequentialist Theory of Morality. Oxford University Press.
    • This book offers a rigorous defence of rule-consequentialism. Hooker’s account is not rule-utilitarian (because he claims that some priority should be given to the worst-off in society), but he offers defences against all the major objections to rule-utilitarianism.
  • Hruschka, Joachim (1991): ‘The Greatest Happiness Principle and Other Early German Anticipations of Utilitarian Theory’, Utilitas, 3, 165-177.
    • Hruschka dispels some myths about the origins of the term “greatest happiness for the greatest number”, and he explores the history of the idea in Germany prior to the development of utilitarianism in Britain.
  • Hutcheson, Francis (1725): Inquiry Concerning the Original of Our Ideas of Virtue or Moral Good, treatise II of An Inquiry into the Original of our Ideas of Beauty and Virtue, selection printed in British Moralists 1650-1800 (1991), D.D. Raphael (ed.), Hackett.
    • This work provides a detailed account of Hutcheson’s moral and aesthetic theory.
  • Hutcheson, Francis (1755): A System of Moral Philosophy, three volumes, London.
    • Published after Hutcheson’s death, this book was written specifically for students. It further develops Hutcheson’s moral thinking, and it includes a discussion of different kinds of pleasures.
  • Jacobson, Daniel (2008): ‘Utilitarianism without Consequentialism: The Case of John Stuart Mill’, Philosophical Review, 117(2), 159-191.
    • This article makes a case for distinguishing the view of John Stuart Mill and his contemporaries from consequentialism as the view is discussed today. It locates “Utilitarianism” within a certain socio-historical context and identifies ways in which its commitments differ from those of contemporary “consequentialism”.
  • MacAskill, William (2015): Doing Good Better: Effective Altruism and How You Can Make a Difference, Random House.
    • An introduction to the Effective Altruism movement, which can be seen as an intellectual descendent of the Utilitarians.
  • Mill, John Stuart (1861): Utilitarianism, originally published in Fraser’s Magazine, now widely available, e.g., https://www.utilitarianism.net/books/utilitarianism-john-stuart-mill/1
    • This is an attempt from John Stuart Mill to demonstrate that utilitarianism is much more appealing than critics at the time implied. This is often seen today as the foundational text for utilitarianism, though Mill did not seem to regard it as highly as some of his other works, like On Liberty and Considerations on Representative Government.
  • Mill, John Stuart (1867): ‘House of Commons Speech’, Hansard. https://hansard.parliament.uk/Commons/1867-05-20/debates/c38e8bdb-704c-4952-9375-e33d7967a5a4/Clauses34ProgressMay17?highlight=%22conceding%20to%22#contribution-b39e743f-6b70-45e4-82c4-8ac642f8fd18
    • A lengthy speech given by Mill as an MP arguing for suffrage for women.
  • Mozi (2010): The Mozi: A Complete Translation, Ian Johnston (trans.), The Chinese University Press.
    • A translated version of Mozi’s work, accompanied by commentary.
  • Nozick, Robert (1974): Anarchy, State & Utopia, New York: Basic Books.
    • In this book, as well as his general account of the requirements of justice, Nozick introduces the example of the “experience machine”, which is often thought to demonstrate a problem for hedonism.
  • O’Keefe, Tim (2009): Epicureanism, Acumen Publishing.
    • O’Keefe discusses the teachings of Epicurus. As well as Epicurean ethics, this includes large discussions of Epicurean thoughts on metaphysics and epistemology.
  • Paley, William (1785): Principles of Moral and Political Philosophy, Boston: Richardson and Lord (1821).
    • Paley’s Principles of Moral and Political Philosophy was the most influential work of utilitarianism for much of the 19th century. It also includes an early defence of what would later be termed rule-utilitarianism.
  • Priestley, Joseph (1768): Essay on the First Principles of Government, London.
    • In this work, Priestley claims that the greatest happiness for the greatest number is the measure of right and wrong. Bentham says this influenced him significantly.
  • Railton, Peter (1984): ‘Alienation, Consequentialism and the Demands of Morality’, Philosophy & Public Affairs, 13(2), 134-171.
    • Elaborates a complaint relating to utilitarian decision procedure, and how this may lead to alienation. Railton offers a distinction between “objective” and “subjective” versions of consequentialism, endorsing the former.
  • Rawls, John (1971): A Theory of Justice, Cambridge, MA: Harvard University Press.
    • When developing his influential theory of justice, Rawls criticises the inability of classical utilitarianism to properly appreciate the individual nature of persons.
  • Rosen, Frederick (2003): Classical Utilitarianism from Hume to Mill, London: Routledge.
    • This book traces the influence of the idea that utility is the basis of morality and justice, starting from Hume. It includes many of the figures discussed in this article in significantly more depth. It also devotes two chapters to considering the notion of utility as found in the works of Adam Smith.
  • Scarre, Geoffrey (1996): Utilitarianism, London: Routledge.
    • This book provides a wonderful discussion of utilitarianism. The first few chapters of the book were extremely useful in the creation of this article.
  • Schultz, Bart and Varouxakis, Georgios (2005): Utilitarianism and Empire, Oxford: Lexington.
    • This book is a collection of essays that consider the relationship between Utilitarianism—particularly as a social movement—and the British Empire. It explores the criticisms that early Utilitarians, like Jeremy Bentham and John Stuart Mill, were racist, insufficiently critical of slavery, and served as apologists for the British Empire.
  • Slote, Michael (1984): ‘Satisficing Consequentialism’, Proceedings of the Aristotelian Society, 58, 139-163.
    • This article marks the introduction of satisficing views, which remove the feature of maximising from utilitarianism, instead claiming that it is (at least) sometimes permissible to perform actions which do not have the best consequences, but which are good enough.
  • Smart, J. J. C and Williams, Bernard (1973): Utilitarianism: For & Against, Cambridge University Press.
    • A pair of essays for and against utilitarianism. Williams’ part includes his objection that utilitarianism undermines the integrity of moral agents, which has been very influential.
  • Taylor, Harriet (1851): ‘Enfranchisement of Women’, available here: https://www.utilitarianism.net/books/enfranchisement-of-women-harriet-taylor-mill
    • Harriet Taylor’s essay arguing for the legal equality of women.
  • Thomson, Judith Jarvis (1976): ‘Killing, Letting Die and The Trolley Problem’, The Monist, 59(2), 204-217.
    • This paper uses the case of a surgeon who must decide whether to kill one healthy person to save five, which has been used since to show problems utilitarianism has with making sense of rights. It also introduces the term “trolley problem” for a type of case that has become commonplace in moral philosophy.

 

Author Information

Joe Slater
Email: Joe.Slater@glasgow.ac.uk
University of Glasgow
United Kingdom

Moral Perception

It is a familiar thought that many of our beliefs are directly justified epistemically by perception. For example, someone sees what looks to her to be a cat on the mat, and on this basis she is justified in believing “There is a cat on the mat.” This article explores the idea that our moral beliefs can be justified empirically in a similar manner. More precisely, it focuses on canonical moral perception (CMP), which restricts the relevant perceptual experiences to sensory perceptual experiences, such as vision, touch, taste, smell, and hearing. For ease of exposition, this article uses visual perceptual experiences as the sensory modality of choice.

We should be interested in the viability of such a thesis for several reasons. First, if CMP is a plausible epistemology of justification of moral beliefs, then it is uniform with a broader perceptual epistemology and therefore comes with ready-made responses to skeptical challenges to morality. Second, CMP avoids over-intellectualising moral epistemology, and it explains how it is that lay people have justified moral beliefs. Third, CMP, if true, has interesting implications for our methodology of investigating morality. In effect, CMP states that experience comes first, contrary to how some (but not all) rival views characterize moral epistemology as starting from the armchair.

First, the thesis of CMP is presented in detail. The following section considers prima facie arguments in favor of CMP, namely considerations of epistemic uniformity and the role of experience in moral inquiry. Next, the article discusses prima facie arguments against CMP, namely the problem of counterfactual justification, the causal objection, and the ‘looks’ objection. Finally, the article presents arguments for CMP that draw from the philosophy of perception and the philosophy of mind, and it concludes that much of the debate surrounding CMP is continuous with debates in the general philosophy of perception and the philosophy of mind.

Table of Contents

  1. The Central Thesis
  2. The Prima Facie Case for Moral Perception
    1. Moral Perception and Epistemic Uniformity
    2. The Role of Experience in Moral Inquiry
  3. The Prima Facie Case Against Moral Perception
    1. Justification of Counterfactual Moral Beliefs
    2. The Causal Objection
    3. The ‘Looks’ Objection
  4. Arguments from Philosophy of Perception
    1. High-Level Contents in Perception
    2. Phenomenal Contrast Arguments
    3. Phenomenal Contrast and Parity Considerations
    4. Cognitive Penetration
    5. The Mediation Challenge
    6. Moral Perception and Wider Debates in The Philosophy of Perception
  5. Summary: Looking Forward
  6. References and Further Reading

1. The Central Thesis

Suppose upon returning home one evening, someone encounters a stranger harming a senior citizen for entertainment. As they witness this act, they form the belief that what they are witnessing is morally wrong. Assuming that the belief is epistemically justified, it remains a question what the source of justification for this particular moral belief is. One answer is that perceptual states (such as sight and hearing) provide the justification. This thesis is called canonical moral perception:

CMP: Some moral beliefs are non-inferentially justified by sensory perceptual experiences.

To be clear, CMP claims that some moral beliefs are non-inferentially justified by sensory perceptual experiences. This leaves open the possibility of multiple sources for the justification of moral beliefs while showing that there is an interesting debate here regarding the possibility of CMP, since rivals of the view will deny that any moral beliefs are justified in such a way. For purposes of exposition, this article uses vision as the perceptual state of choice, but it should be kept in mind that this is not to convey that vision is the only source of perceptual justification for moral beliefs. Despite the fact that emotions are sometimes spoken of as if they are a kind of perception, this article does not consider emotional perception in any detail. Someone who endorses CMP may be called a ‘perceptualist.’

Fundamentally, the epistemic contribution of perception is to justify belief and to serve as a justificatory regress stopper. Given that justification for some beliefs bottoms out in perceptual experience, and that some moral beliefs are justified but not on the basis of other beliefs, CMP extends perceptual justification to the moral domain. CMP is thus a foundationalist theory of the justification of moral beliefs, and this article treats it as such. Other foundationalist views, such as intuitionism and emotional perceptualism, have their own ways of handling the regress problem that differ from canonical moral perception. In particular, the perceptualist holds (at least) that what is essential to perception is its representational nature, the phenomenological character of perceptual experience, and its role as a non-inferential source of justification, and will stop the regress by appeal to those characteristics. Intuitionists and emotional perceptualists may agree that some of those characteristics are essential to their preferred justificatory source as well, but the story of how their regress stoppers work will differ according to how emotions and intuitions differ from perception. For example, emotional perceptualists may say that what is special about emotional perceptual states is that they are valenced, and that this valence plays a special role in their justificatory story.

Furthermore, this article assumes on behalf of the perceptualist a phenomenal dogmatist account of foundationalism of the kind espoused by Jim Pryor, on which someone is immediately, but defeasibly, justified by their perceptual experience (Pryor 2000). Phenomenal dogmatism is not a very strong foundationalism, in that it does not require an infallibly known belief to ground all the remaining knowledge one may possess. Rather, what phenomenal dogmatism grants us is the claim that representational seeming states justify the corresponding beliefs simply in virtue of one’s having those seeming states. Insofar as one may be concerned about challenges to a general foundationalist picture, the perceptualist will follow Pryor in responding to those objections.

Some of the philosophers mentioned in this article will talk about theories of perceptual moral knowledge, and most of what this article says will be compatible with those theories. A perceptually justified moral belief in the absence of defeaters is perceptual moral knowledge, after all.

2. The Prima Facie Case for Moral Perception

a. Moral Perception and Epistemic Uniformity

Considerations of uniformity and economy within epistemology might push one towards adopting CMP over its more traditional rivals, such as intuitionism. CMP entails that the methodology of obtaining justified moral beliefs does not differ in any significant or substantial way from the way justification is gained by perceptual experiences in other domains. That is, just as one forms the justified belief that there is a cat in the room by seeing that there is in fact a cat in the room, one forms the justified belief that some act is wrong by perceiving the wrongness of the act. This leads to the consideration of uniformity: if there is no special methodology that differentiates justified moral beliefs from justified beliefs in other domains, then the need for positing a special source of justification, such as the intellectual seemings of the intuitionist, is moot. Another advantage of CMP is that it gives us a foundationalist epistemology, thereby avoiding regress and circularity worries regarding justification. To be clear, the advantages mentioned are shared with some rival accounts of moral epistemology, so they are not unique advantages but rather considerations that keep CMP a live theory.

b. The Role of Experience in Moral Inquiry

CMP captures the role that experience seems to play in moral inquiry. If we consider how non-philosophers form most of their moral beliefs, it is unlikely that the sole basic source is a priori reasoning. Most people do not sit in an armchair and contemplate runaway trolleys, yet it seems that most individuals have justified basic moral beliefs. When an individual is asked to explain how they know that an action is wrong, a common answer among lay people is that they saw the wrongness of that action. CMP takes this statement at face value, and considering that moral philosophers are not different in kind from the typical human being, we might think that when engaging in a moral thought experiment the philosopher is making use of past moral observations.

If we are persuaded that experience plays a role in answering moral questions, then a natural thought is that particular moral beliefs are among the most epistemically basic; particular moral beliefs form part of our evidential bedrock. They are basic in the sense that from justified particular moral beliefs we can infer additional justified moral beliefs, but we cannot make an inference in the opposite direction. For example, one basic justified particular moral belief for the perceptualist may be a very specific claim such as, ‘The instance of a father hugging his child I witnessed yesterday is morally good.’ From this particular experience of goodness, once we return to the armchair and ponder whether fathers hugging their children is good, we might inductively infer a more general statement such as ‘It is usually good for fathers to hug their children.’ In short, we draw from experience to reach conclusions about more abstract moral questions. Sarah McGrath motivates CMP with these considerations in mind (2018, 2019). As McGrath explains:

[A] significant part of our most fundamental evidence for [moral] theorizing consists in singular moral judgments that we know to be true. But I also think that there is a fairly widespread tendency to neglect this fact, and to think that our evidence, or what we ultimately have to go on in our ethical theorizing, consists exclusively of judgments with more general content (2018).

To expand on this: it is a common self-conception among moral philosophers that the methodology of moral inquiry is to consider cases or action types, form judgments about those cases, and reach general moral principles (such as ‘It is usually good for fathers to hug their children’, or ‘All things being equal, it is wrong to intentionally cause harm’) that are broadly applicable; judgments about further specific cases can then be formed by applying the more general principles. As McGrath points out, when considering the morality of an action type, we often draw upon our past experiences of tokens of that action to make moral judgments. To illustrate this, we can imagine an agent who yesterday saw the goodness of a father hugging a child, and then the next day is presented with a thought experiment that asks the agent to consider a near-identical scenario. Presumably, this agent will judge the hugging once again to be good, and this judgment will be based on the past observations they made the day before. Thus, CMP denies that intuitions about general moral principles reached in the armchair are always methodologically prior to experience in moral theorizing.

If only intuitions about general moral principles were epistemically basic, then making use of particular moral judgments in this way would be epistemically mistaken. However, drawing on past observations to reach judgments on thought experiments about fathers hugging their children, or even the trolley problem, is not obviously an epistemic misstep. In fact, we often draw on past observations and experiences to give advice on problems that our friends and family experience. Rather than drawing on general principles to advise a friend to end her relationship, we usually appeal to previous relationships we have been through to make such a judgment. These are common and legitimate ways of forming moral beliefs, and CMP is the most natural epistemic explanation of our practice of moral inquiry as we find it.

That said, we may worry about cases where background knowledge informs our experience of a situation; it may seem strange that we can have the kind of experientially justified moral beliefs CMP promises while at the same time recognizing that background knowledge changes what we may justifiably believe on the basis of our perceptual experiences. For example, imagine the father hugging his child again, but now with the background information that the father has a criminal record of incest. There are two ways for the perceptualist to handle cases where background knowledge informs the observation. The first is to stick with Pryor-style phenomenal dogmatism, on which the perceptual seeming of goodness delivers prima facie justification for believing the hugging is morally good, but this justification is defeated by the additional knowledge of the father’s criminal record. The second option is to lean into the phenomenon of cognitive penetration, and answer that the background knowledge changes the perceptual experience of the father hugging the child from one of goodness to one of badness, since on this option our propositional attitudes contour our perceptual experience. In sum, there are two possible ways for the perceptualist to answer this kind of concern, but adjudicating between them is beyond the scope of this article.

3. The Prima Facie Case Against Moral Perception

a. Justification of Counterfactual Moral Beliefs

Although CMP provides a theory of justification in actual situations, situations in which you see a morally valenced act, we might wonder what the theory says about justification of moral beliefs gained via thought experiments or reading fiction. Call the kind of justification gained in these instances counterfactual justification. Both Hutton and Wodak challenge CMP to provide an account of how one can have counterfactual moral justification (Hutton 2021; Wodak 2019). The challenge takes the following general form: By hypothesis, CMP explains moral justification in localized, everyday cases. However, we do not receive justification for moral beliefs solely through sensory perception, since we can have counterfactual moral justification. So, CMP is an incomplete explanation of the sources of moral justification. Because CMP cannot capture cases where we receive justification through literature or thought experiments, an epistemological theory that can provide a unified explanation of both counterfactual justification and justification gained in everyday cases is preferable on the grounds of parsimony. The following two paragraphs present particular versions of this challenge.

Hutton asks us to consider a case of someone reading a book depicting the brutalities of slavery, stipulating that they have an emotional response to the scenarios depicted in the book. Here, no perception is present (other than of words on a page), but there is a strong emotional response, and plausibly, Hutton claims, the individual reading the book forms the justified moral belief that slavery is wrong. The upshot of Hutton’s argument is that CMP cannot explain the source of justification in the case of literature, while emotion can explain the source of justification both in moral beliefs formed from reading literature and in everyday cases.

Like Hutton, Wodak notes that much of our moral inquiry is a priori, and intuitionism is far better suited to capture instances where our justified moral beliefs come from imagining scenarios. When sitting in the armchair imagining a trolley scenario, when we form the justified moral belief that pulling the lever is the right action, we can ask what justifies the belief, and Wodak states “The intuitionist can explain this very easily: our intuitions can concern actual and hypothetical cases” (Wodak 2019). That is, the intuitionist’s story for justification stays the same between imagined cases and cases we encounter on the street. CMP cannot appeal to perceptual justification because in thought experiments there is no perception of the scenario. Because CMP lacks resources to explain the source of the justification, and intuitionism can explain the source of justification in both thought-experiments and everyday cases, Wodak concludes that intuitionism should be preferred on the grounds of parsimony.

While it is true that CMP by itself is unable to capture counterfactual justification, and while this provides some prima facie consideration against the view, this should not be cause for alarm on the part of the advocate of CMP. Recall that CMP states that some of our moral beliefs are perceptually justified, not that all moral beliefs are justified in such a way. The advocate of CMP has the option of making a disjunctive response to challenges from counterfactual justification such as those made by Wodak and Hutton. This response needs to be made with care; the advocate of CMP should avoid introducing an account of counterfactual justification that suffices to explain actual justification as well. Even though the challenge to provide a story for counterfactual justification has yet to be fully answered, there are other considerations in favor of adhering to CMP.

b. The Causal Objection

The causal objection argues that we cannot perceive moral properties because we cannot be put in a causal relation with them. That is, one might think that moral properties are causally inert, and for this reason we cannot perceive them. Put in the form of an argument, the causal objection appears as:

    1. To perceive some property, one must be placed in the appropriate causal relation with that property.
    2. One can never be put in the proper causal relation with moral properties.
    3. One cannot perceive moral properties.

McBrayer responds to the causal objection by pointing out that on three of the most popular realist accounts of moral properties, premise two comes out false (McBrayer 2010). These three proposals are (i) treating moral properties as secondary properties, (ii) treating moral properties as natural properties, and (iii) treating moral properties as non-natural properties.

When moral properties are held to be secondary properties (properties that, under appropriate viewing conditions, are perceived as such), premise two fails, as an analogy between colors and moral properties demonstrates. We can imagine looking at a chair under midday light and perceiving it to be brown. What causally contributes to our perceptual experience is not the brownness of the chair (due to the nature of secondary properties), but the other properties of the chair. Nonetheless, perceiving the chair results in knowledge of the chair’s color, so we are still put in an appropriate causal relation with the property of brownness. In the case of moral properties, stipulated to be secondary properties, we will be placed in the same causal relation with them as we are with colors. Under ideal viewing circumstances, we will be placed in a causal relation with the base properties (such as a father hugging a child) and perceive the goodness of that action. In short, if we take moral properties to be secondary properties, the response to the causal objection is a common-cause style of explanation.

If one takes a reductionist naturalist account of the moral properties, matters are even simpler. Because moral properties are identical to natural properties, the explanation as to how we are able to be in the proper causal relation with them is the same explanation as to how we are able to be in the proper causal relationship with chairs, cars, and human actions.

Finally, according to McBrayer, non-naturalism about moral properties avoids the causal objection as well. What the proponent of the causal objection wants is a non-accidental connection between our perceptual beliefs and the moral facts, and an account that delivers such a non-accidental connection suffices to defuse the causal objection. This is so even if the connection is not causal, strictly speaking. To see this, first note that we are stipulating the supervenience principle: the moral facts necessarily supervene on the natural facts, such that there is no change in the moral without a change in the natural. Assuming that we can see supervening properties, the accidentality is eliminated because whenever we see a supervening property we see the natural property that serves as its base, and that natural property stands in the proper causal relation that satisfies the causal constraint.

The causal objection is an instance of a general challenge to the perception of high-level properties, namely a challenge from explanatory superfluity. This challenge is as follows. One might think that we cannot be put in a causal relation with high-level properties, and so we do not perceive them. There is no need to claim that we are in a causal relation with trees when being in a causal relation with the lower-level properties of trees is sufficient for justified tree belief; further causal contact would be an instance of overdetermination. To put the objection in a slightly different way, if our perceptual states are in a causal relation with the property of being a pine tree, then the content of our perceptual experience of a pine tree would be causally overdetermined. There is no reason to think that our perceptual experiences are overdetermined, so our perceptual states are not in a causal relation with the property of being a pine tree. It is not clear how worried the defender of CMP should be about this objection. Because the causal objection shares strong features with the causal exclusion problem of mind-body interaction, which provides a framework for addressing these issues, the objection may not carry much weight (Kim 1993, Yablo 2003).

c. The ‘Looks’ Objection

If perception justifies some moral beliefs, then this is presumably because there is a phenomenological character, a what-it-is-likeness, when perceiving moral properties. The ‘looks objection’ claims that this is not the case: we do not have perceptual justification of moral beliefs because there is no phenomenological character for moral properties (Huemer 2005, Reiland 2021). The argument is commonly structured this way:

    1. A moral belief is perceptually justified only if there is some way that a moral property looks.
    2. Moral properties have no look.
    3. No moral beliefs are perceptually justified.

We can deny the ‘looks’ objection by rejecting premise one or premise two, or by arguing that the conclusion does not follow. Because ‘looks’ is ambiguous in the argument, one strategy for denying the objection is to interpret ‘looks’ in various ways and see whether the argument remains sound. McBrayer (2010a) tackles the ‘looks’ objection by considering several possible readings of ‘looks’ other than the phenomenal ‘looks’ mentioned above. The upshot of McBrayer’s strategy is that on all interpretations of ‘looks’ he considers, the objection fails. McBrayer settles on a reading of ‘looks’ that is supposed to provide the strongest version of the objection. This is the ‘normally looks’ reading, which understands the way something looks as the way it resembles something else. If we substitute ‘normally looks’ into premise two, we get:

2′. Moral properties do not normally look like anything.

Even with ‘normally looks’, the objection fails, for the following reasons. When ‘normally looks’ is read as normally looking a certain way to multiple people, the argument founders because many non-moral properties, assuming they have a normal look, do not appear the same way to multiple people. For example, imagine a group of individuals looking at a car from different viewpoints; there is no single way the car appears to all of them. Yet if a car has a normal look but can appear different ways to different individuals, then there is no principled reason to think that rightness cannot appear different ways yet have a normal look as well. Understood in this cross-person sense, 2′ comes out false. Similarly, when 2′ is read as concerning the way a thing normally looks to an individual, the objection still fails. Even if 2′ is true, it is only true of low-level properties such as colors, since no matter what angle you view red from, it always looks the same. Many high-level properties, such as danger, do not have a way of normally looking to an individual. But, assuming we are perceptually justified in judgments of danger despite its disparate looks, such as a rattlesnake looking dangerous and a loaded gun looking dangerous, premise one does not hold: we may still be perceptually justified in a belief about a property even if there is no particular look for that property. Finally, if an opponent argues that there is a complex and ineffable way that high-level properties normally look, then this strategy is open to the defender of moral perception as well, so 2′ again comes out false. On all readings McBrayer considers, the ‘looks’ objection is unsound.

Proponents of the ‘looks’ objection may be unsatisfied with McBrayer’s response, however. The kind of ‘looks’ likely intended by opponents of CMP is phenomenal ‘looks’. That is, the what-it-is-likeness of perceiving something, such as what it is like to perceive a car or a cat, is the intended meaning of ‘looks’ in the argument; indeed, ‘looks’ was characterized as the phenomenal kind in the opening paragraph of this section. However, McBrayer omits this reading of ‘looks’, and so misses the most plausible and strongest version of the objection. It remains for contemporary defenders of CMP to provide an account of what the phenomenological ‘looks’ of moral properties are like. Until such an account is provided, the looks objection remains a live challenge.

Whatever this account may be, it will also provide a general strategy for answering a general looks objection in the philosophy of perception. This objection is the same as the looks objection listed above, but with instances of ‘moral’ replaced with ‘high-level property’, and it concludes that our high-level property beliefs are not perceptually justified (McGrath 2017). If an account succeeds in articulating what the phenomenal look of a high-level property is, or in motivating the belief that high-level properties have one, then it provides a framework for CMP to use in answering the moral looks objection.

4. Arguments from Philosophy of Perception

While the prima facie arguments provide initial motivation for CMP, the thesis is ultimately about the epistemic deliverances of our sensory faculties. Accordingly, much of the debate about the viability of CMP parallels debates in the general philosophy of perception. This section surveys the arguments for and against moral perception that draw on empirical perceptual psychology and the general philosophy of perception.

a. High-Level Contents in Perception

A natural move for the moral perceptualist in defense of the claim that we are non-inferentially justified by perception is to argue that we see moral properties. The perceptualist here means to be taken literally, just as we see the yellow of a lemon or the shape of a motorcycle. If we do perceive moral properties, then a very straightforward epistemic story can be told. With this story the perceptualist aims to explain how perceptual moral justification works in the same way as perceptual justification for beliefs about ordinary objects. For example, the explanation for how someone knows there is a car before them is that they see a car and form the corresponding belief that there is a car. The story for the justification of moral beliefs will be that someone sees the wrongness of some action and forms the corresponding belief that the action is wrong (absent defeaters). The perceptualist will typically flesh out this move by assuming an additional epistemic requirement, called the Matching Content Constraint (MCC):

MCC: If your visual experience E gives you immediate justification to believe some external world proposition that P, then it’s a phenomenal content of E that P (Silins 2011).

The MCC states that one is non-inferentially justified only if there is a match in contents between a perceiver’s perceptual state and doxastic state (their belief). The reason perceptual contents matter to CMP is that if perceptual contents include moral properties, then one has a perceptual experience of those moral properties, and if one has an experience of those moral properties then a story for a non-inferential perceptual justification of moral beliefs is in hand, which is no different from our perceptual justification of other objects. On the other hand, if there is a mismatch between our perceptual contents and our moral beliefs, then we may find a non-inferentialist perceptual epistemology such as CMP to be implausible.

Given the MCC, the perceptualist needs it to be the case that perceptual experience includes high-level contents, such as being a car, being a pine tree, or being a cause of some effect. If perceptual experiences do contain high-level contents, then the inclusion of moral contents in perceptual experiences is a natural theoretical next step, barring a principled reason for exclusion. After all, if we commit to arguing that we perceive causation and carhood, extending the contents of perception to rightness (and wrongness) does not appear to require too large a stretch of the imagination. The extension of perceptual experiences to include moral contents meets the matching content constraint, and it clears the way for arguing for CMP. However, if the contents of our perceptual experiences are restricted to low-level contents, such as colors, shapes, depth, and motion (although what counts as a low-level content may vary between theorists), the defense of CMP becomes much trickier.

Holding onto CMP because one accepts a high-level theory of content comes with its own risk. If a thin view of contents turns out to be the correct account of perceptual content, such that what makes up the content of our perceptual states are color arrays, shapes, depth and motion, then CMP appears to lose much of its motivation. It would be theoretically awkward to insist that moral contents show up if content about cars, pine trees, and causation are incapable of doing so. And if moral properties do not appear in the contents of perceptual experience, then a simple story as to how we can have perceptual justification for moral beliefs is lost.

Even if perception has neither high-level contents nor moral contents, this does not mean that CMP is a failed theory of moral epistemology. Sarah McGrath provides a story as to how we can have perceptual moral beliefs in the absence of high-level contents in perception (2018, 2019). This story is an externalist one; the source of the justification comes from a Bayesian account of the adjustment of priors (the probability that a belief is true) given non-moral observations, rather than from any experiential contents of morality itself. Through perceptual training and experience, our perceptual system is trained to detect morally relevant stimuli, such as the whimper of pain a dog may voice when kicked. On McGrath’s view, then, one is perceptually justified in a moral belief when the perceptual system reliably tracks the moral facts. The upshot for the defender of CMP is that there is much theorizing to be done about the compatibility between CMP and the thin-content view, and McGrath’s view shows one way to reconcile the two.

b. Phenomenal Contrast Arguments

An argument for thinking that we do perceive moral properties, as well as other high-level properties, is the argument from phenomenal contrast. Susanna Siegel develops a kind of phenomenal contrast argument as a general strategy for arguing that the contents of our perception are far richer than a thin view of contents would allow (2006, 2011, 2017). A phenomenal contrast argument works as follows. We are asked to imagine two scenarios, one in which a property is present, and a contrast scenario in which the same property is absent. If the intuition about these cases is that the perceptual phenomenology differs for a perceiver across the two scenarios, then one can argue that what best explains the difference in experience is the presence or absence of the property, which makes a difference to what is perceptually experienced. The reason an advocate of CMP would want to use this strategy is that if there is a phenomenal contrast between two cases, then there is an explanatory gap that CMP fills; if there is a moral experience in one case but not in a similar case, CMP can explain the difference by saying that a moral property is perceived in one case but not in the other, thus explaining the phenomenal difference.

To better illustrate phenomenal contrast, here is a concrete example from Siegel arguing that causation appears in the contents of perception (2011). Imagine two cases, in both of which we are placed behind a shade and see the silhouettes of two objects. In the control case, we see one silhouette bump into the other, and the second object begins to roll. In the contrast case, whenever one silhouette begins to move towards the other, the other silhouette begins to move as well, keeping a steady distance from the first. If we have the intuition that these two cases are phenomenally different for a perceiver, then, Siegel argues, the best explanation for this difference is that causation is perceptually represented in the former case and not the latter, whereas competitors who deny that causation appears in the content would have to find some alternative, and more complicated, explanation for the contrast.

The phenomenal contrast argument has been wielded to argue for moral contents specifically by Preston Werner (2014). Werner asks us to imagine two different individuals, a neurotypical individual and an emotionally empathetic dysfunctional individual (EEDI), coming across the same morally-valenced scenario. Let this scenario be a father hugging a child. When the neurotypical individual comes upon the scene of the father hugging his child, this individual is likely to be moved and have a variety of physiological and psychological responses (such as feeling the “warm fuzzies”). When the EEDI comes upon the scene of the father hugging his child, however, they will be left completely cold, lacking the physiological and psychological responses the neurotypical individual underwent. This version of the phenomenal contrast argument purports to show that what best accounts for the experiential difference between these two individuals is that the neurotypical individual is able to perceptually represent the moral goodness of the father hugging the child, thus explaining the emotional reaction, whereas the EEDI was left cold because of their inability to perceptually represent moral goodness. If this argument is successful, then we have reason to think that moral properties appear in the contents of experience.

One might object here that Werner is not following the general methodology that Siegel sets out for phenomenal contrast. Werner defends his case as a phenomenal contrast by arguing, first, that making use of two different scenarios would be too controversial to be fruitful because of the difference between learning to recognize morally valenced situations and having the recognitional disposition to recognize pine trees, and, second, that the two individuals in the scenario are sufficiently similar in that they both have generally functional psychologies, but interestingly different in that the EEDI lacks the ability to properly emotionally respond to situations. Still, we might wonder about the use of an EEDI in this phenomenal contrast case. Although the EEDI possesses much of the same cognitive architecture as the neurotypical individual, the EEDI also differs in significant respects. First, an immediate explanation of the difference might appeal to emotions, rather than perceptual experiences: the EEDI lacks the emotions requisite for moral experiences. Second, the EEDI makes for a poor contrast if they lack the moral concepts needed to recognize moral properties in the first place. Third, the use of an EEDI as a contrast may prove problematic because the exact nature of an EEDI is unclear; claiming that the best explanation of the two individuals’ differing experiences is a representational difference may be premature in the face of numerous and conflicting theories about the pathology of an EEDI. That is, because an EEDI’s perceptual system is identical to that of the neurotypical individual, the EEDI may still perceptually represent moral properties but fail to respond to or recognize them for some other reason. If this hypothesis is correct, then the use of an EEDI is illegitimate because it does not capture the purported experiential difference.

c. Phenomenal Contrast and Parity Considerations

Even if CMP gets the right result, this does not rule out that other views can explain the phenomenology as well. For example, Pekka Väyrynen claims that inferentialism provides a better explanation of moral experiences, particularly regarding explanations of different experiences in phenomenal contrast scenarios (2018). To show this, Väyrynen first provides a rival hypothesis to a perceptualist account, which is as follows. When we see a father hugging his child, our experience of moral goodness is a representation that “results from an implicit habitual inference or some other type of transition in thought which can be reliably prompted by the non-moral perceptual inputs jointly with the relevant background moral beliefs” (Väyrynen 2018). This rival hypothesis aims to explain the phenomenological experiences targeted by phenomenal contrast arguments by stating that rather than moral properties appearing in our perceptual contents, what happens when we have a moral experience is that past moral learning, in conjunction with the non-moral perceptual inputs, forms a moral belief downstream from perception.

To see how this might work in a non-moral case, we can consider the following vignette, Fine Wine (Väyrynen 2018):

Greg, an experienced wine maker, reports that when he samples wine he perceives it as having various non-evaluative qualities which form his basis for classifying it as fine or not. Michael, a wine connoisseur, says that he can also taste fineness in wine.

Väyrynen asks if Michael has a perceptual experience of a property, in this case, fineness, that Greg cannot pick up on, and argues that there is no difference in perceptual experience. Granting that Greg and Michael’s experiences of the wine can differ, we need not appeal to Michael being able to perceive the property of fineness in order to explain this difference. What explains the difference in phenomenology, according to Väyrynen, is that Michael’s representations of fineness are “plausibly an upshot of inferences or some other reliable transitions in thought…” (Väyrynen 2018). Väyrynen’s hypothesis aims to reveal the phenomenal contrast argument as lacking the virtue of parsimony. That is, the perceptualist is using more theoretical machinery than needed to explain the difference in phenomenal experiences. The phenomenal contrast argument explains the difference in phenomenology between two individuals by claiming that moral properties appear in the contents of perception. Väyrynen’s rival hypothesis is supposed to be a simpler and more plausible alternative that explains why we may think high-level contents are in perception. First, it explains what appears to be a difference in perceptual experience as a difference in doxastic experience (a difference in beliefs). Second, because the difference is in doxastic experience, Väyrynen’s hypothesis does not commit to high-level contents in perception. Everyone who is party to this debate agrees on the existence of low-level perceptual contents and doxastic experiences, so to endorse high-level contents is to take on board an extra commitment. All things being equal, it is better to explain a phenomenon with fewer theoretical posits. In other words, Väyrynen’s hypothesis gets better explanatory mileage than the perceptualist’s phenomenal contrast argument.

Consider a moral counterpart of Fine Wine, in which Greg and Michael witness a father hugging his child. Greg rarely engages in moral theorizing, but he classifies the action as morally good based on some of the non-moral features he perceives. Michael, on the other hand, is a world-class moral philosopher who claims he can see the goodness or badness of actions. The perceptualist will say that Michael perceives goodness, but Greg is perceptually lacking such that he cannot pick up on moral properties. The perceptualist who makes use of phenomenal contrast arguments is committed to saying that Michael’s perceptual system has been trained to detect moral properties and has moral contents in perceptual experience, whereas Greg has to do extra cognitive work to make a moral judgment. Väyrynen’s rival hypothesis, on the other hand, need not claim that we perceptually represent moral properties; it can instead explain the difference in phenomenology by appealing to the implicit inferences one may make in response to non-moral properties to which one has a trained sensitivity. According to Väyrynen’s hypothesis, Michael’s cognitive system is trained to make implicit inferences in response to certain non-moral properties, whereas Greg needs to do a bit more explicit cognitive work to make a moral judgment. What seems like a difference in perceptual experience is explained away as a difference in post-perceptual experience.

Väyrynen’s hypothesis also challenges Werner’s phenomenal contrast argument above, as it has an explanation for the EEDI’s different phenomenological experience. The neurotypical individual has the moral experience because of implicit inferences being made, but the EEDI fails to have the same experience because the EEDI lacks a sensitivity to the moral properties altogether, failing to draw the inferences the neurotypical individual is trained to make. In short, the rival hypothesis explains the difference in phenomenological experience by differences in belief, rather than in perception.

d. Cognitive Penetration

It is already clear how low-level contents make it into perception: perceptual scientists are familiar with the rod and cone cells that make up the retina and process incoming light, as well as with how that information is used by the early visual system. It is less clear how high-level contents make their way into perceptual experience. If perception does contain high-level contents, then a mechanism is required to explain how such contents make it into perceptual experience. The mechanism of choice for philosophers of perception and cognitive scientists is cognitive penetration. Cognitive penetration is a psychological hypothesis claiming that at least some of an individual’s perceptual states are shaped by that individual’s propositional attitudes, such as beliefs, desires, and fears. Put another way, cognitive penetration is the claim that perceptual experience is theory-laden.

To understand how cognitive penetration is supposed to work, consider another phenomenal contrast case. Imagine that you are working at a nature conservation center and are unfamiliar with the plant known as Queen Anne’s lace. While working at the center, you are told by your supervisor that plants that look a certain way are Queen Anne’s lace. After repeated exposure to the plant, that a plant is Queen Anne’s lace becomes visually salient to you. In other words, your perceptual experience of Queen Anne’s lace before learning to recognize it differs from the perceptual experience you have of the plant after you have learned to recognize it. Cognitive penetration explains this shift: your beliefs about Queen Anne’s lace shape your perceptual experience, such that the property of being Queen Anne’s lace makes it into the content of your perception. In other words, the difference in perceptual experiences is explained by the difference in perceptual contents, which in turn is explained by perceptual experiences being mediated by propositional attitudes. We should take care to separate this from a similar thesis which claims that there is no change in perceptual experience after learning to recognize Queen Anne’s lace, but that the shift in the phenomenology (the what-it-is-likeness) of looking at the plant is explained by changes in post-perceptual experience, such as having new beliefs about the plant. Cognitive penetration claims that the phenomenological difference is between perceptual experiences, and that it is the beliefs about Queen Anne’s lace that change the perceptual experience.

Cognitive penetration is an attractive option for the perceptualist because it provides a mechanism to explain how moral properties make their way into the contents of perception. Consequently, the perceptualist’s theory of how we see the rightness or wrongness of actions will be identical to the story about Queen Anne’s lace above: an individual learns about morality from their community and forms moral beliefs, which in turn prime the perceptual system to perceive moral properties. Once the perceptualist has cognitive penetration, they have a story for how moral properties enter the contents of perception, and with it an elegant epistemology of moral justification. This epistemology respects the matching content constraint, which states that in order for a belief to be justified by perception, the contents of the belief must match the contents of perception. The perceptualist may then say that we have foundational perceptual justification for our moral beliefs, in the same way that we have foundational perceptual justification for tree beliefs. Just as we see that there is a tree before us, we see that an action is wrong.

e. The Mediation Challenge

The perceptualist’s use of cognitive penetration has led to challenges to the view on the grounds that cognitive penetration, the thesis that propositional attitudes influence perceptual experiences, leads to counterintuitive consequences. One of the most prominent challenges to the possibility of moral perception comes from Faraci, who argues that if cognitive penetration is true, then CMP must be false (Faraci 2015). To motivate the argument that no moral justification is grounded in perception, Faraci defends a principle he calls mediation:

If perceptions of X are grounded in experiences as of Y, then perceptions of X produce perceptual justification only if they are mediated by background knowledge of some relation between X and Y. (Faraci 2015)

What mediation states is that one can have perceptual justification for some high-level property only if the experience of that higher-level property is in some way informed by knowledge of its relation to the lower-level properties in which it is grounded. To motivate the plausibility of mediation, Faraci appeals to the non-moral example of seeing someone angry. If Norm sees that Vera is angry, presumably he knows that she is angry because he sees her furrowed brow and scowl, and he knows that a furrowed brow and a scowl are the kind of behavior that indicates that someone is angry. In an analogous moral case, someone witnessing a father hugging his child knows that they are seeing a morally good action only if they have the relevant background knowledge, the relevant moral beliefs, connecting parental affection with goodness. If the witness did not possess the moral bridge principle that parental affection is good, then the witness would not know that they had seen a morally good action. The thrust of the argument is that if mediation is plausible in the non-moral case, then it is plausible in the moral case as well. And if mediation is plausible in the moral case, then CMP is an implausible account of moral epistemology, because it will need to appeal to background moral knowledge not gained in perceptual experience to explain how we see that the father hugging the child is a morally good action.

Faraci considers three possible ways for the defender of moral perception to avoid the appeal to background knowledge. The first option is to claim that the moral bridge principles are themselves known through perceptual experiences. In the case of the father hugging the child, then, we would antecedently have had a perceptual experience that justified the belief that parental affection is good. The problem with this response is that it leads to a regress: we would need further background knowledge connecting parental affection and goodness (such as that parental affection causes pleasure, and pleasure is good), and experientially gained knowledge of each further bridge principle.

The second option is that one could already know some basic moral principles a priori. The problem with this response should be apparent, since if we know some background principles a priori, then this is to concede the argument to Faraci that none of our most basic moral knowledge is known through experience.

Finally, someone could try to argue that one comes to know a moral bridge principle by witnessing an action multiple times and noticing its correlation with perceived goodness. The problem with this is that if each individually viewed action is perceived as being good, then background knowledge is already informing us of the goodness of that act. If so, then the mediation challenge has not been properly answered, and CMP has not been shown to be a plausible epistemology of morality.

One way to defend CMP in response to Faraci’s challenge is to follow Preston Werner’s claim that mediation is too strong and to offer a reliabilist account of justification that is compatible with a weaker reading of mediation. Werner considers a weak reading and a strong reading of Faraci’s mediation condition (Werner 2018). He rejects the strong reading on the grounds that, while it may make Faraci’s argument against the plausibility of moral perception work, it overgeneralizes to cases of perception of ordinary objects. Werner points out that the strong reading of mediation requires that we be able to make explicit the background knowledge behind the perceptual judgments we make; if we perceive a chair, the strong reading requires that we be able to articulate the ‘chair theory’ informing our perceptual experience, otherwise our perceptual judgment that there is a chair is unjustified. Because the vast majority of non-mereologists cannot make explicit a ‘chair theory’ informing their perception, the strong reading yields the verdict that the vast majority of us are unjustified in our perceptual judgment that there is a chair.

The weak reading that Werner offers is what he calls “thin-background knowledge”, which he characterizes as “subdoxastic information that can ground reliable transitions from perceptual information about some property Y to perceptual information as of some other property X” (Werner 2018). The upshot is that a pure perceptualist epistemology of morality is compatible with the thin-background knowledge reading of mediation: We do not need to have access to the subdoxastic states that ground our perceptual judgments in order for us to know that our perceptual judgments are justified. Werner’s response to Faraci, in summary, is that a pure perceptualist epistemology is plausible because thin-background knowledge gives us an explanation as to how our perceptual moral judgments are in good epistemic standing.

f. Moral Perception and Wider Debates in the Philosophy of Perception

A broader lesson from the mediation challenge, however, is that many of the issues facing CMP are instances of arguments that appear in general debates over the epistemology and metaphysics of perception. In the case of Faraci (2015), the argument is a particular instance of a wider concern about cognitive penetration.

What the mediation challenge reflects is a general concern about epistemic dependence and epistemic downgrade in relation to cognitive penetration. In particular, the mediation principle is an instance of the general challenge of epistemic dependence:

A state or process, e, epistemically depends upon another state, d, with respect to content c if state or process e is justified or justification-conferring with respect to c only if (and partly because) d is justified or justification-conferring with respect to c. (Cowan 2014, 674)

The reason one might worry about epistemic dependence in connection with cognitive penetration is that the justification-conferring state in instances of cognitive penetration might be the belief states shaping the perceptual experiences, rather than the perceptual experiences themselves. If so, then a perceptual epistemology of all high-level contents is doubtful, since what does the justificatory work in identifying pine trees will be either training or reflection on patterns shared between trees, neither of which lends itself to a perceptual story.

There is another general worry for cognitive penetration: epistemic downgrade. Assuming cognitive penetration is true, even if one were able to explain away epistemic dependence, one might still think that our perceptual justification is held hostage by the beliefs that shape our experiences. For illustration, suppose we believe that anyone wearing a hoodie is carrying a knife. If we see someone in a hoodie pull out a cellphone, our belief may shape our perceptual state such that our perceptual experience is of the person in the hoodie pulling a knife. We then believe that the person is pulling a knife. Another example of epistemic downgrade is that of anger:

Before seeing Jack, Jill fears that Jack is angry at her. When she sees him, her fear causes her to have a visual experience in which he looks angry at her. She goes on to believe that he is angry (Siegel 2019, 67).

In both cases there appears to be an epistemic defect: a belief shapes a perceptual experience, which in turn provides support for the very belief that shaped it. It is an epistemically vicious feedback loop. The worry about epistemic downgrade and high-level contents should be clear. In the case of morality, our background beliefs may be false, which will in turn shape our moral perceptual experiences to be misrepresentative. This appears to provide a defeater for perceptual justification of moral beliefs, and it forces the perceptualist to defend the moral background beliefs themselves, which may turn out to be an a priori exercise, defeating the a posteriori character of justified moral belief that the perceptualist wanted.

One way to avoid these epistemic worries is for the moral perceptualist to endorse some form of epistemic dogmatism, the claim that seemings (perceptual or doxastic) provide immediate, prima facie, defeasible justification for belief. The perceptualist who adopts this strategy can argue that the worry about epistemic dependence is misplaced: although the presence of high-level content is causally dependent on the influence of background beliefs, on the dogmatist theory the justification for a belief epistemically depends only on the perceptual experience itself. To see this, consider the following analogy: if one is wearing sunglasses, the perceptual experience one has will depend on the sunglasses one is wearing, but one’s perceptual beliefs are not justified by the sunglasses; rather, they are justified by the perceptual experience itself (Pryor 2000). For concerns about epistemic downgrade, the perceptualist may give a similar response: one is defeasibly justified in a perceptual belief until one is made aware of a defeater, which in this case is the vicious feedback loop. To be clear, no moral perceptualist has made use of this response in print, as most opt for a kind of externalist account of perceptual justification. But the dogmatist response is made in debates in general perceptual epistemology, and because debates about the epistemic effects of cognitive penetration in moral perception are instances of that general debate, the dogmatist strategy is available should the moral perceptualist wish to use it.

Apart from the epistemic difficulties cognitive penetration incurs, because cognitive penetration is a thesis about the structure of human cognitive architecture, it must withstand scrutiny from cognitive science and empirical psychology. A central assumption of the cognitive science and psychology of perception is that the perceptual system is modular, or informationally encapsulated. Cognitive penetration assumes the opposite, since it claims that beliefs influence perceptual experience. Because cognitive penetration holds that the perceptual system is non-modular and receives input from the cognitive system, it falls upon advocates of the hypothesis to show that there is empirical support for the thesis. The problem is that most empirical tests purporting to demonstrate effects of cognitive penetration are questionable. The results have been challenged as either explainable by other psychological effects, such as effects of attention, or dismissed on the grounds of poor methodology and failure to replicate (Firestone and Scholl 2016). Furthermore, in the case of perceptual learning, cognitive penetration predicts changes in the neurophysiology of the cognitive system rather than in the perceptual system, as it would be new beliefs that explain learning to recognize an object. Research in perceptual neurophysiology shows the opposite: perceptual learning is accompanied by changes in the neurophysiology of the perceptual system (Connolly 2019). The viability of CMP, insofar as it depends on cognitive penetration for high-level contents, is subject not only to epistemic pressures but also to empirical fortune.

5. Summary: Looking Forward

For moral epistemologists, a foundationalist epistemology that provides responses to skeptical challenges is highly desirable. While a variety of theories of moral epistemology provide foundations, CMP grounds our justification in the perceptual faculty with which we are all familiar and provides a unified story for all perceptual justification.

The overall takeaway is that the arguments made by both defenders and critics of CMP are instances of general issues in the philosophy of perception. The lesson for CMP is that the way forward is to pay close attention to the general philosophy of perception literature. Because the CMP literature itself is still in early development, attending to the general issues will prevent the advocate of CMP from repeating mistakes made in the general literature, as well as open potential pathways for developing CMP in interesting and novel ways.

6. References and Further Reading

  • Audi, Robert. 2013. Moral Perception. Princeton University Press.
    • A book length defense of CMP. A good example of the kind of epistemic ecumenicism a perceptualist may adopt.
  • Bergqvist, Anna, and Robert Cowan (eds.). 2018. Evaluative Perception. Oxford: Oxford University Press.
    • Collection of essays on the plausibility of CMP and emotional perception.
  • Church, Jennifer. 2013. “Moral Perception.” Possibilities of Perception (pp. 187-224). Oxford: Oxford University Press.
    • Presents a Kantian take on moral perception.
  • Crow, Daniel. 2016. “The Mystery of Moral Perception.” Journal Of Moral Philosophy 13, 187-210.
    • Challenges moral perception with a reliability challenge.
  • Connolly, Kevin. 2019. Perceptual Learning: The Flexibility of the Senses. Oxford: Oxford University Press.
    • Discusses the findings of the neuroscience and psychology of perception in relation to theses in the philosophy of mind. Chapter 2 argues against cognitive penetration.
  • Cowan, Robert. 2014. “Cognitive Penetrability and Ethical Perception.” Review of Philosophy and Psychology 6, 665-682.
    • Discusses the epistemic challenges posed to moral perception by cognitive penetration. Focuses on epistemic dependence.
  • Cowan, Robert. 2015. “Perceptual Intuitionism.” Philosophy and Phenomenological Research 90, 164-193.
    • Defends the emotional perception of morality.
  • Cowan, Robert. 2016. “Epistemic perceptualism and neo-sentimentalist objections.” Canadian Journal of Philosophy 46, 59-81.
    • Defends the emotional perception of morality.
  • Faraci, David. 2015. “A hard look at moral perception.” Philosophical Studies 172, 2055-2072.
  • Faraci, David. 2019. “Moral Perception and the Reliability Challenge.” Journal of Moral Philosophy 16, 63-73.
    • Responds to Werner 2018. Argues that moral perception has a reliability challenge.
  • Firestone, Chaz, and Brian J. Scholl. 2016a. “Cognition Does Not Affect Perception: Evaluating the Evidence for ‘Top-down’ Effects.” Behavioral and Brain Sciences 39.
    • Challenges studies that purport to demonstrate the effects of cognitive penetration.
  • Firestone, Chaz, and Brian J. Scholl. 2016b. “‘Moral Perception’ Reflects Neither Morality Nor Perception.” Trends in Cognitive Sciences 20, 75-76.
    • Response to Gantman and Van Bavel 2015.
  • Fodor, Jerry. 1983. The Modularity of Mind. Cambridge, Massachusetts: MIT Press.
    • Argues for the informational encapsulation of the perceptual system.
  • Gantman, Ana P. and Jay J. Van Bavel. 2014. “The moral pop-out effect: Enhanced perceptual awareness of morally relevant stimuli.” Cognition 132, 22-29.
    • Argues that findings in perceptual psychology support moral perception.
  • Gantman, Ana P. and Jay J. Van Bavel. 2015. “Moral Perception.” Trends in Cognitive Sciences 19, 631-633.
  • Huemer, Michael. 2005. Ethical Intuitionism. New York: Palgrave MacMillan.
    • Section 4.4.1 presents the ‘looks’ objection.
  • Hutton, James. 2022. “Moral Experience: Perception or Emotion?” Ethics 132, 570-597.
  • Kim, Jaegwon. 1993. Supervenience and Mind: Selected Philosophical Essays. Cambridge: Cambridge University Press.
    • A collection of essays discussing the causal exclusion problem.
  • McBrayer, Justin P. 2010a. “A limited defense of moral perception.” Philosophical Studies 149, 305–320.
  • McBrayer, Justin P. 2010b. “Moral perception and the causal objection.” Ratio 23, 291-307.
  • McGrath, Matthew. 2017. “Knowing what things look like.” Philosophical Review 126, 1-41.
    • Presents a general version of the ‘looks’ objection.
  • McGrath, Sarah. 2004. “Moral Knowledge by Perception.” Philosophical Perspectives 18, 209-228.
    • An early formulation of CMP, discusses the epistemic motivations for the view.
  • McGrath, Sarah. 2018. “Moral Perception and its Rivals.” In Anna Bergqvist and Robert Cowan (eds.), Evaluative Perception (pp. 161-182). Oxford: Oxford University Press.
  • McGrath, Sarah. 2019. Moral Knowledge. Oxford: Oxford University Press.
    • Chapter 4 is a presentation of CMP that does not require high-level contents. Chapter 1 is a criticism of some views on the methodology of moral inquiry.
  • Pylyshyn, Zenon. 1999. “Is Vision Continuous with Cognition? The Case for Cognitive Impenetrability of Visual Perception.” Behavioral and Brain Sciences 22, 341-365.
  • Pryor, James. 2000. “The Skeptic and the Dogmatist.” Noûs 34, 517-549.
    • Early presentation of phenomenal dogmatism. Responds to epistemic concerns about the theory-ladenness of perception.
  • Reiland, Indrek. 2021. “On experiencing moral properties.” Synthese 198, 315-325.
    • Presents a version of the ‘looks’ objection.
  • Siegel, Susanna. 2006. “Which properties are represented in perception.” In Gendler, Tamar S. & John Hawthorne (eds.), Perceptual Experience (pp. 481-503). Oxford: Oxford University Press.
    • Argues that perceptual experience includes high-level contents.
  • Siegel, Susanna. 2011. The Contents of Visual Experience. Oxford: Oxford University Press.
    • Book length defense of high-level contents in perceptual experience.
  • Siegel, Susanna. 2012. “Cognitive Penetrability and Perceptual Justification.” Noûs 46.
    • Discusses the issue of epistemic downgrade.
  • Siegel, Susanna. 2019. The Rationality Of Perception. Oxford: Oxford University Press.
    • Chapter 4 is a discussion of epistemic downgrade and responds to criticisms of the problem.
  • Siegel, Susanna & Byrne, Alex. 2016. “Rich or thin?” In Bence Nanay (ed.), Current Controversies in Philosophy of Perception (pp. 59-80). New York: Routledge-Taylor & Francis.
    • Byrne and Siegel debate whether or not there are high-level perceptual contents.
  • Väyrynen, Pekka. 2018. “Doubts about Moral Perception.” In Anna Bergqvist and Robert Cowan (eds.), Evaluative Perception (pp. 109-128). Oxford: Oxford University Press.
  • Werner, Preston J. 2016. “Moral Perception and the Contents of Experience.” Journal of Moral Philosophy 13, 294-317.
  • Werner, Preston J. 2017. “A Posteriori Ethical Intuitionism and the Problem of Cognitive Penetrability.” European Journal of Philosophy 25, 1791-1809.
    • Argues that synchronic cognitive penetration is a problem for CMP, but diachronic cognitive penetration is epistemically harmless.
  • Werner, Preston J. 2018. “Moral Perception without (Prior) Moral Knowledge.” Journal of Moral Philosophy 15, 164-181.
    • Response to Faraci 2015.
  • Werner, Preston J. 2018. “An epistemic argument for liberalism about perceptual content.” Philosophical Psychology 32, 143-159.
    • Defends the claim that there are high-level contents in perception by arguing that it best explains some findings in perceptual psychology, such as facial recognition.
  • Werner, Preston J. 2020. “Which Moral Properties Are Eligible for Perceptual Awareness?” Journal of Moral Philosophy 17, 290-319.
    • Discusses which moral properties we can perceive, concludes that we perceive at least pro-tanto evaluative properties.
  • Wodak, Daniel. 2019. “Moral perception, inference, and intuition.” Philosophical Studies 176, 1495-1512.
  • Yablo, Stephen. 2005. “Wide Causation.” In Stephen Yablo (ed.), Thoughts: Papers on Mind, Meaning, and Modality. Oxford: Oxford University Press.
    • Presents a solution to the causal exclusion problem. Argues that mental states are causally efficacious in a ‘wide’ sense in that they would still be explanatorily valuable even if the ‘thin’ causes, the physical states, were different.


Author Information

Erich Jones
Email: Jones.7269@buckeyemail.osu.edu
The Ohio State University
U. S. A.

William Godwin (1756–1836)

Following the publication of An Enquiry Concerning Political Justice in 1793 and his most successful novel, Caleb Williams, in 1794, William Godwin was briefly celebrated as the most influential English thinker of the age. At the time of his marriage to the writer Mary Wollstonecraft in 1797, the achievements and influence of both writers, as well as their personal happiness together, seemed likely to extend into the new century. It was not to be. The war with revolutionary France and the rise of a new spirit of patriotic fervour turned opinion against reformers, and Godwin became a target. Following Wollstonecraft’s death in September 1797, a few days after the birth of their daughter, Mary, Godwin published a candid memoir of Wollstonecraft that ignited an increasingly strident propaganda campaign against them both. He published a third edition of Political Justice and a second major novel, St. Leon, but the tide was clearly turning. And while he continued writing into old age, he never again achieved the success, nor the financial security, he had enjoyed in the 1790s. Today he is most often referenced as the husband of Mary Wollstonecraft, as the father of Mary Wollstonecraft Shelley (the author of Frankenstein and The Last Man), and as the founding father of philosophical anarchism. He also deserves to be remembered as a significant philosopher of education.

In An Enquiry Concerning Political Justice, Godwin argues that individuals have the power to free themselves from the intellectual and social restrictions imposed by government and state institutions.  The argument starts with the very demanding requirement that we assess options impartially and rationally. We should act only according to a conviction that arises from a conscientious assessment of what would contribute most to the general good. Incorporated in the argument are principles of impartiality, utility, duty, benevolence, perfectionism, and, crucially, independent private judgment.

Godwin insists that we are not free, morally or rationally, to make whatever choices we like. He subscribes to a form of necessitarianism, but he also believes that choices are constrained by duty and that one’s duty is always to put the general good first. Duties precede rights; rights are simply claims we make on people who have duties towards us. Ultimately, it is the priority of the principle of independent private judgment that produces Godwin’s approach to education, to law and punishment, to government, and to property. Independent private judgment generates truth, and therefore virtue, benevolence, justice, and happiness. Anything that inhibits it, such as political institutions or modes of government, must be replaced by progressively improved social practices.

When Godwin first started An Enquiry Concerning Political Justice, he intended it to explore how government can best benefit humanity. He and the publisher George Robinson wanted to catch the wave of interest created by the French Revolution itself and by Edmund Burke’s Reflections on the Revolution in France, which so provoked British supporters of the revolution. Robinson agreed to support Godwin financially while he worked, with the understanding that he would send sections of the work as he completed them. This meant that the first chapters were printed before he had fully realised the implications of his arguments. The inconsistencies that resulted were addressed in subsequent editions. His philosophical ideas were further revised and developed in The Enquirer (1797), Thoughts Occasioned by a Perusal of Dr. Parr’s Spital Sermon (1801), Of Population (1820), and Thoughts on Man (1831), and in his novels. He also wrote several works of history and biography, and wrote or edited several texts for children, which were published by the Juvenile Library that he started with his second wife, Mary Jane Clairmont.

Table of Contents

  1. Life
  2. Godwin’s Philosophy: An Enquiry Concerning Political Justice
    1. Summary of Principles
    2. From Private judgment to Political Justice
  3. Educational Implications of Godwin’s philosophy
    1. Progressive Education
    2. Education, Epistemology, and Language
    3. Education, Volition, and Necessitarianism
    4. Government, the State, and Education
  4. Godwin’s Philosophical Anarchism
    1. Introduction
    2. Punishment
    3. Property
    4. Response to Malthus
  5. Godwin’s Fiction
    1. Caleb Williams (1794)
    2. St. Leon: A Tale of the Sixteenth Century (1799)
  6. Conclusion
  7. References and Further Reading
    1. Works by William Godwin
      1. Early Editions of An Enquiry Concerning Political Justice
      2. Other Editions of An Enquiry Concerning Political Justice
      3. Collected Editions of Godwin’s Works and Correspondence
      4. First Editions of Other Works by Godwin
      5. Online Resources
      6. Other Editions of Selected Works by Godwin
    2. Biographies of Godwin
    3. Social and Historical Background
    4. Other Secondary Sources in Philosophy, Education, Fiction, and Anarchism

1. Life

William Godwin was born in 1756 in Wisbech in Cambridgeshire, England, the seventh of thirteen children. His father was a Dissenting minister; his mother was the daughter of a successful shipowner. Godwin was fond of his lively mother, less so of his strictly Calvinist father. He was a pious and academically precocious boy, readily acquiring a close knowledge of the Old and New Testaments. After three years at a local school, where he read widely, learned some Latin and developed a passion for the classics, he moved at the age of 11 to Norwich to become the only pupil of the Reverend Samuel Newton. Newton was an adherent of Sandemanianism, a particularly strict form of Calvinism. Godwin found him pedantic and unjustly critical. The Calvinist doctrines of original sin and predestination weighed heavily. Calvinism left emotional scars, but it also shaped his thinking. This was evidenced, Godwin later stated, in the errors of the first edition of Political Justice: its tendency to stoicism regarding pleasure and pain, and its inattention to feeling and private affections.

After a period as an assistant teacher of writing and arithmetic, Godwin began to develop his own ideas about education and to take an interest in contemporary politics. When Godwin’s father died in 1772, his mother paid for her clever son to attend the New College, a Dissenting Academy in Hoxton, north of the City of London. By then Godwin had become, somewhat awkwardly, a Tory, a supporter of the aristocratic ruling class. Dissenters generally supported the Whigs, not least because they opposed the Test Acts, which prohibited anyone who was not an Anglican communicant from holding a public office. At Hoxton Godwin received a more comprehensive higher education than he would have received at Oxford or Cambridge universities (from which Dissenters were effectively barred). The pedagogy was liberal, based on free enquiry, and the curriculum was wide-ranging, covering psychology, ethics, politics, theology, philosophy, science, and mathematics. Hoxton introduced Godwin to the rational dissenting creeds, Socinianism and Unitarianism, to which philosophers and political reformers such as Joseph Priestley and Richard Price subscribed.

Godwin seems to have graduated from Hoxton with both his Sandemanianism and Toryism in place. But the speeches of Edmund Burke and Charles James Fox, the leading liberal Whigs, impressed him and his political opinions began to change. After several attempts to become a Dissenting minister, he accepted that congregations simply did not take to him; and his religious views began a journey through deism to atheism. He was influenced by his reading of the French philosophes. He settled in London, aiming to make a living from writing, and had some early encouragement. Having already completed a biography of William Pitt, Earl of Chatham, he now contributed reviews to the English Review and published a collection of sermons. By 1784 he had published three minor novels, all quite favourably reviewed, and a satirical pamphlet entitled The Herald of Literature, a collection of spoof ‘extracts’ from works purporting to be by contemporary writers. He also contemplated a career in education, for in July 1783 he published a prospectus for a small school that he planned to open in Epsom, Surrey.

For the next several years Godwin was able to earn a modest living as a writer, thanks in part to his former teacher at Hoxton, Andrew Kippis, who commissioned him to write on British and Foreign History for the New Annual Register. The work built him a reputation as a competent political commentator and introduced him to a circle of liberal Whig politicians, publishers, actors, artists, and authors. Then, in 1789, events in France raised hopes for radical reform in Great Britain. On November 4 Godwin was present at a sermon delivered by Richard Price which, while primarily celebrating the Glorious Revolution of 1688, anticipated many of the themes of Political Justice: universal justice and benevolence; rationalism; and a war on ignorance, intolerance, persecution, and slavery. The special significance of the sermon is that it roused Edmund Burke to write Reflections on the Revolution in France, which was published in November 1790. Godwin had admired Burke, and he was disappointed by this furious attack on the Revolution and by its support for custom, tradition, and aristocracy.

He was not alone in his disappointment. Thomas Paine’s Rights of Man, and Mary Wollstonecraft’s A Vindication of the Rights of Men were early responses to Burke. Godwin proposed to his publisher, George Robinson, a treatise on political principles, and Robinson agreed to sponsor him while he wrote it. Godwin’s ideas veered over the sixteen months of writing towards the philosophical anarchism for which the work is best known.

Political Justice, as Godwin declared in the preface, was the child of the French Revolution. As he finished writing it in January 1793, the French Republic declared war on the Kingdom of Great Britain. It was not the safest time for an anti-monarchist, anti-aristocracy, anti-government treatise to appear. Prime Minister William Pitt thought the two volumes too expensive to attract a mass readership; otherwise, the Government might have prosecuted Godwin and Robinson for sedition. In fact, the book sold well and immediately boosted Godwin’s fame and reputation. It was enthusiastically reviewed in much of the press and keenly welcomed by radicals and Dissenters. Among his many new admirers were young writers with whom Godwin soon became acquainted: William Wordsworth, Robert Southey, Samuel Taylor Coleridge, and a very youthful William Hazlitt.

In 1794 Godwin wrote two works that were impressive and successful in different ways. The novel Things as They Are: or The Adventures of Caleb Williams stands out as an original exploration of human psychology and the wrongs of society. Cursory Strictures on the Charge delivered by Lord Chief Justice Eyre to the Grand Jury first appeared in the Morning Chronicle newspaper. Pitt’s administration had become increasingly repressive, charging supporters of British reform societies with sedition. On May 12, 1794, Thomas Hardy, the chair of the London Corresponding Society (LCS), was arrested and committed with six others to the Tower of London; then John Thelwall, a radical lecturer, and John Horne Tooke, a leading light in the Society for Constitutional Information (SCI), were arrested.  The charge was High Treason, and the potential penalty was death. Habeas Corpus had been suspended, and the trials did not begin until October. Godwin had attended reform meetings and knew these men. He was especially close to Thomas Holcroft, the novelist and playwright. Godwin argued in Cursory Strictures that there was no evidence that the LCS and SCI were involved in any seditious plots, and he accused Lord Chief Justice Eyre of expanding the definition of treason to include mere criticism of the government. ‘This is the most important crisis in the history of English liberty,’ he concluded. Hardy was called to trial on October 25, and, after twelve days, the jury returned a verdict of not guilty. Subsequently, Horne Tooke and Thelwall were tried and acquitted, and others were dismissed.  Godwin’s article was considered decisive in undermining the charge of sedition. In Hazlitt’s view, Godwin had saved the lives of twelve innocent men (Hazlitt, 2000: 290). The collapse of the Treason Trials caused a surge of hope for reform, but a division between middle-class intellectuals and the leaders of labouring class agitation hastened the decline of British Jacobinism. 
This did not, however, end the anti-Jacobin propaganda campaign, nor the satirical attacks on Godwin himself.

A series of essays, published as The Enquirer: Reflections on Education, Manners and Literature (1797), developed a position on education equally opposed to Jean-Jacques Rousseau’s progressivism (in Emile) and to traditional education. Other essays modified or developed ideas from Political Justice. One essay, ‘Of English Style’, describes clarity and propriety of style as the ‘transparent envelope’ of thoughts. Another essay, ‘Of Avarice and Profusion’, prompted the Rev. Thomas Malthus to respond with his An Essay on the Principle of Population (1798).

At the lodgings of a mutual friend, the writer Mary Hays, Godwin became reacquainted with a woman he had first met in 1791 at one of the publisher Joseph Johnson’s regular dinners, when he had wanted to converse with Thomas Paine rather than with her. Since then, Mary Wollstonecraft had spent time in revolutionary Paris, fallen in love with an American businessman, Gilbert Imlay, and given birth to a daughter, Fanny. Imlay first left her, then sent her on a business mission to Scandinavia. This led to the publication of Letters Written During a Short Residence in Sweden, Norway and Denmark (1796). She had completed A Vindication of the Rights of Woman in 1792, a more substantial work than her earlier A Vindication of the Rights of Men. She had also recently survived a second attempt at suicide. Having previously published Mary: A Fiction in 1788, she was working on a second novel, The Wrongs of Woman: or, Maria. A friendship soon became a courtship. When Mary became pregnant, they chose to get married and to brave the inevitable ridicule, both previously having condemned the institution of marriage (in Godwin’s view it was ‘the worst of monopolies’). They were married on March 29, 1797. They worked apart during the daytime, Godwin in a rented room near their apartment in Somers Town, St. Pancras, north of central London, and came together in the evening.

Godwin enjoyed the dramatic change in his life: the unfamiliar affections and the semi-independent domesticity. Their daughter was born on August 30. The birth itself went well but the placenta had broken apart in the womb; a doctor was called to remove it, and an infection took hold. Mary died on September 10. At the end she said of Godwin that he was ‘the kindest, best man in the world’. Heartbroken, he wrote that he could see no prospect of future happiness: ‘I firmly believe that there does not exist her equal in the world. I know from experience we were formed to make each other happy’. He could not bring himself to attend the funeral in the churchyard of St. Pancras Church, where just a few months earlier they had married.

Godwin quickly threw himself into writing a memoir of Wollstonecraft’s life. Within a few weeks he had completed a work for which he was ridiculed at the time, and for which he has been criticised by historians who feel that it delayed the progress of women’s rights. The Memoirs of the Author of a Vindication of the Rights of Woman (1798) is a tender tribute, and a frank attempt to explore his own feelings, but Godwin’s commitment to complete candour meant that he underestimated, or was insensitive to, the likely consequence of revealing ‘disreputable’ details of Mary’s past, not least that Fanny had been born out of wedlock. It was a gift to moralists, humourists, and government propagandists.

Godwin was now a widower with a baby, Mary, and a toddler, Fanny, to care for. With help from a nursemaid and, subsequently, a housekeeper, he settled into the role of affectionate father and patient home educator. However, he retained a daily routine of writing, reading, and conversation. A new novel was to prove almost as successful as Caleb Williams. This was St. Leon: A Tale of the Sixteenth Century. It is the story of an ambitious nobleman disgraced by greed and an addiction to gambling, then alienated from society by the character-corrupting acquisition of alchemical secrets. It is also the story of the tragic loss of an exceptional wife and of domestic happiness: it has been seen as a tribute to Wollstonecraft and as a correction to the neglect of the affections in Political Justice.

The reaction against Godwin continued into the new century, with satirical attacks coming from all sides. It was not until he read a serious attack by his friend Dr. Samuel Parr that he was stung into a whole-hearted defence, engaging also with criticisms by James Mackintosh and Thomas Malthus. Thoughts Occasioned by the Perusal of Dr. Parr’s Spital Sermon was published in 1801. His replies to Mackintosh and Malthus were measured, but his response to Parr was more problematic, making concessions that could be seen as undermining the close connection between truth and justice that is crucial to the argument of Political Justice.

Since Mary Wollstonecraft’s death, Godwin had acquired several new friends, including Charles and Mary Lamb, but he clearly missed the domesticity he had enjoyed so briefly; and he needed a mother for the girls. The story goes that Godwin first encountered his second wife in May 1801, shortly before he started work on the reply to Dr. Parr. He was sitting reading on his balcony when he was hailed from next door: ‘Is it possible that I behold the immortal Godwin?’ Mary Jane Clairmont had two children, Charles and Jane, who were similar in age to Fanny and Mary. Godwin’s friends largely disapproved – they found Mary Jane bad-tempered and artificial – but Godwin married her, and their partnership endured until his death.

Godwin had a moderate success with a Life of Chaucer, failed badly as a dramatist, and completed another novel, Fleetwood, or the New Man of Feeling (1805), but he was not earning enough to provide for his family by his pen alone. He and Mary Jane conceived the idea of starting a children’s bookshop and publishing business. For several years the Juvenile Library supplied stationery and books of all sorts for children and schools, including history books and story collections written or edited by ‘Edward Baldwin’, Godwin’s own name being considered too notorious. Despite some publishing successes, such as Charles and Mary Lamb’s Tales from Shakespeare, the bookshop never really prospered. As he slipped into serious debt, Godwin felt he was slipping also into obscurity. In 1809 he wrote an Essay on Sepulchres: A Proposal for Erecting some Memorial of the Illustrious Dead in All Ages on the Spot where their Remains have been Interred. The Essay was generally well-received, but the proposal was ignored. With the Juvenile Library on the point of collapse, the family needed a benefactor who could bring them financial security.

Percy Bysshe Shelley was just twenty, recently expelled from Oxford University for atheism, and newly married and disinherited, when in January 1812 he wrote a fan letter to a philosopher he had not been sure was still living. His reading of Political Justice at school had ‘opened to my mind fresh & more extensive views’, he wrote. Shelley went off to Ireland to agitate for independence and distribute his pamphlet An Address to the Irish People. Godwin disapproved of the inflammatory tone, but invited Shelley and his wife, Harriet, to London. They eventually arrived in October and Shelley and Godwin thereafter maintained a friendly correspondence. Shelley’s first major poem, Queen Mab, with its Godwinian themes and references, was published at this time. During 1813, as he and Shelley continued to meet, Godwin saw a good deal of a new friend and admirer, Robert Owen, the reforming entrepreneur and philanthropist. Hazlitt commented that Owen’s ideas of Universal Benevolence, the Omnipotence of Truth and the Perfectibility of Human Nature were exactly those of Political Justice. Others thought Owen’s ‘socialism’ was Godwinianism by another name. As Godwin pleaded with friends and admirers for loans and deferrals to help keep the business afloat, the prospect of a major loan from Shelley was thwarted by Sir Timothy Shelley withholding his son’s inheritance when he turned twenty-one.

Godwin’s troubles took a different turn when Mary Godwin, aged sixteen, returned from a stay with friends in Scotland looking healthy and pretty. Harriet Shelley was in Bath with a baby. Shelley dined frequently with the Godwins and took walks with Mary and Jane. Soon he was dedicating an ode to ‘Mary Wollstonecraft Godwin’. On June 26 Mary declared her love as they lay together in St. Pancras Churchyard, beside her mother’s grave, Jane lingering nearby. By July Shelley had informed Harriet that he had only ever loved her as a brother. Godwin was appalled and remonstrated angrily, but early on the morning of July 28 he found a letter on his dressing table: Mary had eloped with Shelley, and they had taken Jane with them.

Godwin’s life over the next eight years, until Shelley’s tragic death in 1822, was far less dramatic or romantic than the lives of Mary and Shelley, or of Claire (as Jane now called herself). Their travels in Europe, the births and deaths of several children, including Claire’s daughter by Lord Byron, and the precocious literary achievements (Shelley’s poems and Mary’s novel Frankenstein) are well known. Meanwhile, in London, Mary Wollstonecraft’s daughter, Fanny, was left unhappily behind. The atmosphere at home was tense and gloomy. Godwin refused to meet Mary and her lover until they were married, although the estrangement did not stop him accepting money from Shelley. A protracted struggle ensued, with neither party appearing to live up to Godwinian standards of candour and disinterestedness. Then, in October 1816, Fanny left the family home, ostensibly to travel to Ireland to visit her aunts (Wollstonecraft’s sisters). In Swansea, she killed herself by taking an overdose of laudanum. She was buried in an unmarked pauper’s grave, Godwin being fearful of further scandal connected with himself and Wollstonecraft. Shortly after this, Harriet Shelley’s body was pulled from the Serpentine in London. Shelley and Mary could now marry, and before long they escaped to Italy, with Claire (Jane) still in tow.

Despite these troubles and the precarious position of the Juvenile Library, Godwin managed to complete another novel, Mandeville, A Tale of the Seventeenth Century in England (1817). He took pride in his daughter’s novel and in his son-in-law’s use of Godwinian ideas in his poems. At the end of 1817, Godwin began his fullest response to Malthus. It took him three years of difficult research to complete Of Population. Meanwhile, his financial difficulties had reached a crisis point. He besieged Shelley in Italy with desperate requests to fulfil his promised commitments, but Shelley had lost patience and refused. The money he had already given, he complained, ‘might as well have been thrown into the sea’. A brief reprieve allowed the Godwins to move, with the Juvenile Library, to better premises. Then came the tragedy of July 8th, 1822. Shelley drowned in rough seas in the Gulf of La Spezia. Mary Shelley returned to England in 1823 to live by her pen. In 1826 she published The Last Man, a work, set in the twenty-first century, in which an English monarch becomes a popular republican leader only to survive a world-wide pandemic as the last man left alive. Godwin’s influence is seen in the ambition and originality of her speculative fiction.

Godwin himself worked for the next five years on a four-volume History of the Commonwealth—the period between the execution in 1649 of Charles I and the restoration in 1660 of Charles II. He describes the liberty that Cromwell and the Parliamentarians represented as a means, not an end in itself; the end is the interests and happiness of the whole: ‘But, unfortunately, men in all ages are the creatures of passions, perpetually prompting them to defy the rein, and break loose from the dictates of sobriety and speculation.’

In 1825, Godwin was finally declared bankrupt, and he and Mary Jane were relieved of the burden of the Juvenile Library. They moved to cheaper accommodation. Godwin had the comfort of good relations with his daughter and grandson. He hoped for an academic position with University College, which Jeremy Bentham had recently helped to establish, but was disappointed. He worked on two further novels, Cloudesley and Deloraine. In 1831 came Thoughts on Man, a collection of essays in which he revisited familiar philosophical topics. In 1834, the last work to appear in his lifetime was published. Lives of the Necromancers is a history of superstition, magic, and credulity, in which Godwin laments that we make ourselves ‘passive and terrified slaves of the creatures of our imagination’. A collection of essays on religion, published posthumously, made similar points but commended a religious sense of awe and wonder in the presence of nature.

The 1832 Reform Bill’s extension of the male franchise pleased Godwin. In 1833, the Whig government awarded him a pension of £200 a year and a residence in New Palace Yard, within the Palace of Westminster parliamentary estate—an odd residence for an anarchist. When the Palace of Westminster was largely destroyed by fire, in October 1834, the new Tory Government renewed his pension, even though he had been responsible for fire safety at Westminster and the upkeep of the fire engine. He spent the last years of his life in relative security with Mary Jane, mourning the deaths of old friends and meeting a new generation of writers. He died at the age of eighty on April 7, 1836. He was buried in St. Pancras Churchyard, in the same grave as Mary Wollstonecraft. When Mary Shelley died in 1851, her son and his wife had Godwin’s and Wollstonecraft’s remains reburied with her in the graveyard of St. Peter’s Church in Bournemouth, on the south coast.

2. Godwin’s Philosophy: An Enquiry Concerning Political Justice

Note: references to An Enquiry Concerning Political Justice (PJ) give the volume number and page number of the two-volume 1798 third edition, whose pagination matches the 1946 University of Toronto Press facsimile edition, ed. F. E. L. Priestley. This is followed by the book and chapter number of the first edition (for example, PJ II: 497; Bk VIII, vi). Page numbers of other works are those of the first edition.

a. Summary of Principles

The first edition of An Enquiry Concerning Political Justice was published in 1793. A second edition was published in 1796 and a third in 1798. Despite the modifications in the later editions, Godwin considered that ‘the spirit and the great outlines of the work remain untouched’ (PJ I, xv; Preface to second edition). Arguably, he was underplaying the significance of the changes. They make clear that pleasure and pain are the only bases on which morality can rest, that feeling, rather than reason or judgment, is what motivates action, and that private affections have a legitimate place in our rational deliberations.

The modifications are incorporated in the ‘Summary of Principles’ (SP) that he added to the start of the third edition (PJ I, xxiii–xxvii). The eight principles are:

(1) ‘The true object of moral and political disquisition, is pleasure or happiness.’ Godwin divides pleasures between those of the senses and those that are ‘probably more exquisite’, such as the pleasures of intellectual feeling, sympathy, and self-approbation. The most desirable and civilized state is that in which we have access to all these diverse sources of pleasure and possess a happiness ‘the most varied and uninterrupted’.

(2) ‘The most desirable condition of the human species, is a state of society.’ Although government was intended to secure us from injustice and violence, in practice it embodies and perpetuates them, inciting passions and producing oppression, despotism, war, and conquest.

(3) ‘The immediate object of government is security.’ But, in practice, the means adopted by government restrict individual independence, limiting self-approbation and our ability to be wise, useful, or happy. Therefore, the best kind of society is one in which there is as little as possible encroachment by government upon individual independence.

(4) ‘The true standard of the conduct of one man to another is justice.’ Justice is universal: it requires us to aim to produce the greatest possible sum of pleasure and happiness and to be impartial.

(5) ‘Duty is the mode of proceeding, which constitutes the best application of the capacity of the individual, to the general advantage.’ Rights are claims which derive from duties; they include claims on the forbearance of others.

(6) ‘The voluntary actions of men are under the direction of their feelings.’  Reason is a controlling and balancing faculty; it does not cause actions but regulates ‘according to the comparative worth it ascribes to different excitements’—therefore, it is the improvement of reason that will produce social improvements.

(7) ‘Reason depends for its clearness and strength upon the cultivation of knowledge.’ As improvement in knowledge is limitless, ‘human inventions, and modes of social existence, are susceptible of perpetual improvement’. Any institution that perpetuates particular modes of thinking or conditions of existence is pernicious.

(8) ‘The pleasures of intellectual feeling, and the pleasures of self-approbation, together with the right cultivation of all our pleasures, are connected with the soundness of understanding.’ Prejudices and falsehoods are incompatible with soundness of understanding, which is connected, rather, with free enquiry and free speech (subject only to the requirements of public security). It is also connected with simplicity of manners and leisure for intellectual self-improvement: consequently, an unequal distribution of property is not compatible with a just society.

b. From Private Judgment to Political Justice

Godwin claims there is a reciprocal relationship between the political character of a nation and its people’s experience. He rejects Montesquieu’s suggestion that political character is caused by external contingencies such as the country’s climate. Initially, Godwin seems prepared to argue that good government produces virtuous people. He wants to establish that the political and moral character of a nation is not static; rather, it is capable of progressive change. Subsequently, he makes clear that a society of progressively virtuous people requires progressively less governmental interference. He is contesting Burke’s arguments for tradition and stability, but readers who hoped that Godwin would go on to argue for a rapid, or violent, revolution were to be disappointed. There is even a Burkean strain in his view that sudden change can risk undoing political and social progress by breaking the interdependency between people’s intellectual and emotional worlds and the social and political worlds they inhabit. He wants a gradual march of opinions and ideas. The restlessness he argues for is intellectual, and it is encouraged in individuals by education.

Unlike Thomas Paine and Mary Wollstonecraft in their responses to Burke, Godwin rejects the language of rights. Obligations precede rights and our fundamental obligation is to do what we can to benefit society as a whole. If we do that, we act justly; if we act with a view to benefit only ourselves or those closest to us, we act unjustly. A close family relationship is not a sufficient reason for a moral preference, nor is social rank. Individuals have moral value according to their potential utility. In a fire your duty would be to rescue someone like Archbishop Fénelon, a benefactor to humankind, rather than, say, a member of your own family. (Fénelon’s 1699 didactic novel The Adventures of Telemachus, Son of Ulysses criticised European monarchies and advocated universal brotherhood and human rights; it influenced Rousseau’s philosophy of education.) It seems, then, that it is the consequences of one’s actions that make them right or wrong, that Godwin’s moral philosophy is a form of utilitarianism. However, Mark Philp (1986) argues that Godwin’s position is more accurately characterised as a form of perfectionism: one’s intentions matter and these, crucially, are improvable.

What makes our intentions improvable is our capacity for private judgment. As Godwin has often been unfairly described, both in his own day and more recently, as a cold-hearted rationalist, it is important to clarify what he means by ‘judgment’. It involves a scrupulous process of weighing relevant considerations (beliefs, feelings, pleasures, alternative opinions, potential consequences) in order to reach a reasonable conclusion. In the third edition (SP 6–8), he implies that motivating force is not restricted to feelings (passions, desires), but includes preferences of all kinds. The reason/passion binary is resisted. An existing opinion or intellectual commitment might be described as a feeling, as something which pleases us and earns a place in the deliberative process. In his Reply to Parr, Godwin mentions that the choice of saving Fénelon could be viewed as motivated by the love of the man’s excellence or by an eagerness ‘to achieve and secure the welfare and improvement of millions’ (1801: 41). Furthermore, any kind of feeling that comes to mind thereby becomes ratiocinative or cognitive; the mind could not otherwise include it in the comparing and balancing process. Godwin rejects the reason/passion binary most explicitly in Book VIII of Political Justice, ‘On Property’. The word ‘passion’, he tells us, is mischievous, perpetually shifting its meaning. Intellectual processes that compare and balance preferences and other considerations are perfectible (improvable); the idea that passions cannot be corrected is absurd, he insists. The only alternative position would be that the deliberative process is epiphenomenal, something Godwin could not accept. (For the shifting meaning of ‘passion’ in this period, and its political significance, see Hewitt, 2017.)

Judgments are unavoidably individual in the sense that the combination of relevant considerations in a particular case is bound to be unique, and also in the sense that personal integrity and autonomy are built into the concept of judgment. If we have conscientiously weighed all the relevant considerations, we cannot be blamed for trusting our own judgment over that of others or the dictates of authority. Nothing—no person or institution, certainly not the government—can provide more perfect judgments. Only autonomous acts, Godwin insists, are moral acts, regardless of actual benefit. Individual judgments are fallible, but our capacity for good judgment is perfectible (SP 6). Although autonomous and impartial judgments might not produce an immediate consensus, conversations and a conscientious consideration of different points of view help us to refine our judgment and to converge on moral truths.

In the first edition of Political Justice, it is the mind’s predisposition for truth that motivates our judgments and actions; in later editions, when it is said to be feelings that motivate, justice still requires an exercise of impartiality, a divestment of our own predilections (SP 4). Any judgment that fails the impartiality test would not be virtuous because it would not be conducive to truth. Godwin is not distinguishing knowledge from mere belief by specifying truth and justified belief conditions; rather, he is specifying the conditions of virtuous judgments: they intentionally or consciously aim at truth and impartiality. A preference for the general good is the dominant motivating passion when judgments are good and actions virtuous. The inclusion in the deliberation process of all relevant feelings and preferences arises from the complexity involved in identifying the general good in particular circumstances. Impartiality demands that we consider different options conscientiously; it does not preclude sometimes judging it best to benefit our friends or family.

Is the development of human intellect a means to an end or an end in itself? Is it intrinsically good? Is it the means to achieving the good of humankind or is the good of humankind the development of intellect? If the means and the end are one and the same, then, as Mark Philp (1986) argues, Godwin cannot be counted, straightforwardly at least, a utilitarian, even though the principle of utility plays a major role in delineating moral actions. If actions and practices with the greatest possible utility are those which promote the development of human intellect, universal benevolence and happiness must consist in providing the conditions for intellectual enhancement and the widest possible diffusion of knowledge. The happiest and most just society would be the one that achieved this for all.

When the capacity for private judgment has been enhanced, and improvements in knowledge and understanding have been achieved, individuals will no longer require the various forms of coercion and constraint that government and law impose on them, and which currently inhibit intellectual autonomy (SP 3). In time, Godwin speculates, mind could be so enhanced in its capacities that it will conquer physical processes such as sleep, even death. At the time he was mocked for such speculations, but their boldness is impressive: science and medicine have greatly prolonged the average lifespan; farm equipment (as he foretold) really can plough fields without human control; and research continues into the feasibility (and desirability) of immortality.

Anticipating the arguments of John Stuart Mill, Godwin argues that truth is generated by intellectual liberty and the duty to speak candidly and sincerely in robust dialogue with others whose judgments differ from one’s own. Ultimately, a process of mutual individual and societal improvement would evolve, including changes in opinion. Godwin’s anarchistic vision of future society anticipates the removal of the barriers to intellectual equality and justice and the widest possible access to education and to knowledge.

3. Educational Implications of Godwin’s Philosophy

a. Progressive Education

Godwin’s interest in progressive education was revealed as early as July 1783 when the Morning Herald published An Account of the Seminary. This was the prospectus for a school—‘For the Instruction of 12 Pupils in the Greek, Latin, French and English Languages’—that he planned to open in Epsom, Surrey. It is unusually philosophical for a school prospectus. It asserts, for example, that when children are born their minds are tabula rasa, blank sheets susceptible to impressions; that by nature we are equal; that freedom can be achieved by changing our modes of thinking; that moral dispositions and character derive from education and from ignorance. The school’s curriculum would focus on languages and history, but the ‘book of nature’ would be preferred to human compositions. The prospectus criticizes Rousseau’s system for its inflexibility and existing schools for failing to accommodate children’s pursuits to their capacities. Small group tuition would be preferred to Rousseauian solitary tutoring. Teachers would not be fearsome: ‘There is not in the world,’ Godwin writes, ‘a truer object of pity than a child terrified at every glance, and watching with anxious uncertainty the caprices of a pedagogue’. Although nothing transpired because too few pupils were recruited, the episode reveals how central education was becoming to Godwin’s political and social thinking. In the Index to the third edition of Political Justice, there are references to topics such as education’s effects on the human mind, arguments for and against a national education system, the danger of education being a producer of fixed opinions and a tool of national government. Discussions of epistemological, psychological, and political questions with implications for education are frequent. What follows aims to synthesize Godwin’s ideas about education and to draw out some implications.

Many of Godwin’s ideas about education are undoubtedly radical, but they are not easily assimilated into the child-centred progressivism that traces its origin back to Rousseau. Godwin, like Wollstonecraft, admired Rousseau’s work, but they both took issue with aspects of the model of education described in Emile, or On Education (1762). Rousseau believed a child’s capacity for rationality should be allowed to grow stage by stage, not be forced. Godwin sees the child as a rational soul from birth. The ability to make and to grasp inferences is essential to children’s nature, and social communication is essential to their flourishing. Children need to develop, and to refine, the communication and reasoning skills that will allow them to participate in conversations, to learn, and to start contributing to society’s progressive improvement. A collision of opinions in discussions refines judgment. This rules out a solitary education of the kind Emile experiences. Whatever intellectual advancement is achieved, diversity of opinion will always be a condition of social progress, and discussion, debate, disagreement (‘conversation’) will remain necessary in education.

Unlike Rousseau, Godwin does not appear to be especially concerned with stages of development, with limits to learning or reading at particular ages. He is not as concerned as Rousseau is about the danger of children being corrupted by what they encounter. We know that his own children read widely and were encouraged to write, to think critically, to be imaginative. They listened and learned from articulate visitors such as Coleridge. Godwin’s interest in children’s reading encouraged him to start the Juvenile Library. One publication was an English dictionary, to which Godwin prefixed A New Guide to the English Tongue. He hoped to inspire children with the inclination to ‘dissect’ their words, to be clear about the primary and secondary ideas they represent. The implication is that the development of linguistic judgment is closely connected with the development of epistemic judgment, with the capacity for conveying truths accurately and persuasively. The kind of interactive dialogue that he believes to be truth-conducive would require mutual trust and respect. There would be little point in discussion, in a collision of ideas, if one could not trust the other participants to exercise the same linguistic and epistemic virtues as oneself. Judgment might be private but education for Godwin is interpersonal.

A point on which Godwin and Rousseau agree is that children are not born in sin, nor do they have a propensity to evil. Godwin is explicit in connecting their development with the intellectual ethos of their early environment, the opinions that have had an impact on them when they were young. Some of these opinions are inevitably false and harmful, especially in societies in which a powerful hierarchy intends children to grow up taking inequalities for granted. As their opinions and thinking develop through early childhood to adulthood, it is important that individuals learn to think independently and critically in order to protect themselves from false and corrupt opinions.

Godwin does not advocate the kind of manipulative tutoring to which Rousseau’s Emile is subjected; nor does he distinguish between the capacities or needs of boys and girls in the way that Rousseau does in his discussion of the education appropriate to Emile’s future wife, Sophie. According to Rousseau, a woman is formed to please a man, to be subjected to him, and therefore requires an education appropriate to that role. Mary Wollstonecraft, in Chapter 3 of A Vindication of the Rights of Woman, had similarly rejected Rousseau’s differentiation. Another difference is that, whereas Rousseau intends education to produce citizens who will contribute to an improved system of government, Godwin intends education to produce individuals with the independence of mind to contribute to a society that requires only minimal governmental or institutional superintendence.

b. Education, Epistemology, and Language

Underlying Godwin’s educational thinking are important epistemological principles. In acquiring skills of communication, understanding, reasoning, discussion, and judgment, children acquire the virtue of complete sincerity or truthfulness. Learning is understanding, not memorisation. Understanding is the percipience of truth and requires sincere conviction. One cannot be said to have learned or to know or to have understood something, and one’s conduct cannot properly be guided by it, unless one has a sincere conviction of its truth. The connection between reason and conduct is crucial. Correct conduct is accessible to reason, to conscientious judgment. When they are given reasons for acting one way rather than another, children must be open to being convinced. This suggests that pedagogy should emphasise explanation and persuasion rather than monological direct instruction. Moral education is important in regard to conduct, but, as all education prepares individuals to contribute to the general good, all education is moral education.

Godwin gives an interesting analysis of the concept of truth, especially in the second and third editions of Political Justice. Children will need to learn that private judgment cannot guarantee truth. Not only are judgments clearly fallible, but—at least by the third edition—‘truth’ for Godwin does not indicate a transcendental idea, with an existence independent of human minds or propositions. ‘True’ propositions are always tentative, correctable on the basis of further evidence. The probability of a proposition being true can only be assessed by an active process of monitoring available evidence. Although Godwin frequently refers to truth, misleadingly perhaps, as ‘omnipotent’, he can only mean that the concept provides a standard, a degree of probability that precludes reasonable doubt. This suggests that ‘conviction’ is an epistemic judgment that there is sufficient probability to warrant avowal.

The reason why Godwin tends to emphasize truth rather than knowledge may be that we cannot transmit knowledge because we cannot transmit the rational conviction that would turn a reception of a truth into the epistemic achievement of knowing. Each recipient of truths must supply their own conviction via their own private judgment. Godwin insists that we should take no opinions on trust without independent thought and conviction. Judgments need to be refreshed to ensure that what was in the general interest previously still is. When we bind ourselves to the wisdom of our ancestors, to articles of faith or outdated teachings, we are inhibiting individual improvement and the general progress of knowledge. Conviction comes with a duty to bear witness, to pass on the truth clearly and candidly in ‘conversations’. The term ‘conversation’ implies a two-way, open-ended exchange, with at least the possibility of challenge. Integrity would not permit a proposition with an insufficient degree of probability to be conveyed without some indication of its lesser epistemic status, as with conjectures or hearsay. In modern terms, appreciating the difference in the epistemic commitments implicated by different speech acts, such as assertions, confessions, and speculations, would be important to the child’s acquisition of linguistic and epistemic skills or virtues.

c. Education, Volition, and Necessitarianism

Another aspect of Godwin’s philosophy that makes children’s education in reasoning and discussion important is his account of volition and voluntary choice. If a judgment produced no volition, it could be overruled by hidden or unconscious feelings or desires, and there would be no prospect of developing self-control. Disinterested deliberation would be a delusion and moral education would be powerless. Although Godwin made concessions concerning reason’s role in the motivation of judgments and actions, and in time developed doubts about the potential for improving the human capacity for impartiality, he did not alter the central point that it is thoughts that are present to mind, cognitive states with content, that play a role in motivation. Not all thoughts are inferences. By the time passions or desires, or any kind of preference, become objects of awareness, they are ratiocinative; the intellect is necessarily involved in emotion and desire. This ensures there is a point in developing critical thinking skills, in learning to compare and balance conscientiously whatever preferences and considerations are present to mind.

Godwin admits that some people are more able than others to conquer their appetites and desires; nevertheless, he thinks all humans share a common nature and can, potentially, achieve the same level of self-control, allowing judgment to dominate. This suggests that learning self-control should be an educational priority. Young people are capable of being improved, not by any form of manipulative training, coercion, or indoctrination, but by an education that promotes independence of mind through reflective reading and discussion. He is confident that a society freed from governmental institutions and power interests would strengthen individuals’ resistance to self-love and allow them to identify their own interests with the good of all. It would be through education that they would learn what constitutes the general good and, therefore, what their duties are. Although actions motivated by a passion for the general good are virtuous, they still require a foundation in knowledge and understanding.

The accusation that Godwin had too optimistic a view of the human capacity for disinterested rationality and self-control was one made by contemporaries, including Thomas Malthus. In later editions of Political Justice, reason is represented as a capacity for deliberative prudence, a capacity that can be developed and refined even to the extent of exercising control over sexual desire. Malthus doubted that most people would ever be capable of the kind of prudence and self-control that Godwin anticipated. Malthus’s arguments pointed towards a refusal to extend benevolence to the poor and oppressed; Godwin’s pointed towards generosity and equity.

The influence on Godwin’s perfectionism of the rational Dissenters, especially Richard Price and Joseph Priestley, is most apparent in the first edition of Political Justice. He took from them, and also from David Hartley and Jonathan Edwards, the doctrine of philosophical necessity, according to which a person’s life is part of a chain of causes extending through eternity ‘and through the whole period of his existence, in consequence of which it is impossible for him to act in any instance otherwise than he has acted’ (PJ I: 385; Bk IV, vi). Thoughts, and therefore judgments, are not exceptions: they succeed each other according to necessary laws. What stops us from being mere automatons is the fact that experience creates habits of mind which compose our moral and epistemic character, the degree of perfection in our weighing of preferences in pursuit of truth. The more rational, or perfect, our wills have become, the more they subordinate other considerations to truth. But the course of our lives, including our mental deliberations, is influenced by our desires and passions and by external intrusions, including by government, so to become autonomous we need to resist distortions and diversions. Experience and active participation in candid discussion help to develop our judgment and cognitive capacities, and as this process of improvement spreads through society, the need for government intervention and coercion reduces.

In revising this account of perfectionism and necessitarianism for the second and third editions of Political Justice, Godwin attempts to keep it compatible with the more positive role he then allows desire and passion. The language shifts towards a more Humean account of causation, whereby regularity and observed concurrences are all we are entitled to use in explanations and predictions, and patterns of feeling are more completely absorbed into our intellectual character. Godwin’s shift towards empiricism and scepticism is apparent, too, in the way truth loses much of its immutability and teleological attraction. This can be viewed as a reformulation rather than a diminution of reason, at least in so far as the changes do not diminish the importance of rational autonomy. We think and act autonomously, Godwin might say, when our judgments are in accordance with our character—that is, with our individual combination of moral and epistemic virtues and vices, which we maintain or improve by conscientiously monitoring and recalibrating our opinions and preferences. Autonomy requires that we do not escape the trajectory of our character but do try to improve it.

It is important to Godwin that we can make a conceptual distinction between voluntary and involuntary actions. He would not want young people to become fatalistic as a consequence of learning about scientific determinism, and yet he did not believe people should be blamed or made to suffer for their false opinions and bad actions: the complexity in the internal and environmental determinants of character is too great for that. Wordsworth, for one, accepted the compatibility of these positions. ‘Throw aside your books of chemistry,’ Hazlitt reports him saying to a student, ‘and read Godwin on Necessity’ (Hazlitt, 2000: 280).

d. Government, the State, and Education

For Godwin, progress towards the general good is delineated by progressive improvement in education and the development of private judgment. The general good is sometimes referred to by Godwin in utilitarian terms as ‘happiness’, although he avoids the Benthamite notion of the greatest happiness of the greatest number; and there is no question of pushpin being as good as poetry. A just society is a happy society for all, not just because individual people are contented but because they are contented for a particular reason: they enjoy a society, an egalitarian democracy, that allows them to use their education and intellectual development for the general good, including the good of future generations. A proper appreciation of the aims of education will be sufficient inspiration for children to want to learn; they will not require the extrinsic motivation of rewards and sanctions.

Godwin’s critique of forms of government, in Book V of Political Justice, is linked to their respective merits or demerits in relation to education. The best form of government is the one that ‘least impedes the activity and application of intellectual powers’ (PJ II: 5; Bk V, i). A monarchy gives power to someone whose judgment and understanding have not been developed by vulnerability to the vicissitudes of fortune. All individuals need an education that provides access not only to books and conversation but also to experience of the diversity of minds and characters. The pampered, protected education of a prince inculcates epistemic vices such as intellectual arrogance and insouciance. He is likely to be misled by flatterers and be saved from rebellion only by the servility, credulity, and ignorance of the populace. No one person, not even an enlightened and virtuous despot, can match a deliberative assembly for breadth of knowledge and experience. A truly virtuous monarch, even an elected one, would immediately abolish the constitution that brought him to power. Any monarch is in the worst possible position to choose the best people for public office or to take responsibility for errors, and yet his subjects are expected to be guided by him rather than by justice and truth.

Similar arguments apply to aristocracies, to presidential systems, to any constitution that invests power in one person or class, that divides rulers from the people, including by a difference in access to education. Heredity cannot confer virtue or wisdom; only education, leisure and prosperity can explain differences of that kind. In a just society no one would be condemned to stupidity and vice. ‘The dissolution of aristocracy is equally in the interest of the oppressor and the oppressed. The one will be delivered from the listlessness of tyranny, and the other from brutalising operation of servitude’ (PJ II: 99; Bk V, xi).

Godwin recognises that democracy, too, has weaknesses, especially representative democracy. Uneducated people are likely to misjudge characters, be deceived by meretricious attractions or dazzled by eloquence. The solution is not epistocracy but an education for all that allows people to trust their own judgment, to find their own voice. Representative assemblies might play a temporary role, but when the people as a whole are more confident and well-informed, a direct democracy would be preferable. Secret ballots encourage timidity and inconstancy, so decisions and elections should be decided by an open vote.

The close connection between Godwin’s ideas about education and his philosophical anarchism is clear. Had he been less sceptical about government involvement in education, he might have embraced more immediately implementable education policies. His optimism derives from a belief that the less interference there is by political institutions, the more likely people are to be persuaded by arguments and evidence to prefer virtue to vice, impartial justice to self-love. It is not the “whatever is, is right” optimism of Leibniz, Pope, Bolingbroke, Mandeville, and others; clearly, things can and should be better than they are. Complacency about the status quo benefits only the ruling elites. The state restricts reason by imposing false standards and self-interested values that limit the ordinary person’s sense of his or her potential mental capacities and contribution to society. Godwin’s recognition of a systemic denial of a voice to all but an elite suggests that his notion of political and educational injustice compares with what Miranda Fricker (2007) calls epistemic injustice. Social injustice for Godwin just is epistemic injustice in that social evils derive from ignorance, systemic prejudices, and inequalities of power; and epistemic injustice, ultimately, is educational injustice.

A major benefit of the future anarchistic society will be the reduction in drudgery and toil, and the increase in leisure time. Godwin recognises that the labouring classes especially are deprived of time in which to improve their minds. He welcomes technology such as printing, which helps to spread knowledge and literacy, but abhors such features of industrialisation as factories, the division of labour that makes single purpose machines of men, women, and children, and a commercial system that keeps the masses in poverty and makes a few opulently wealthy. Increased leisure and longevity create time for education and help to build the stock of educated and enlightened thinkers. Social and cultural improvement results from this accretion. Freed from governmental interference, education will benefit from a free press and increased exposure to a diversity of opinion. Godwin expresses ‘the belief that once freed from the bonds of outmoded ideas and educational practices, there was no limit to human abilities, to what men could do and achieve’ (Simon, 1960: 50). It is a mistake, Godwin writes towards the end of Political Justice, to assume that inequality in the distribution of what conduces to the well-being of all, education included, is recognised only by the ‘lower orders’. The beneficiaries of educational inequality, once brought to an appreciation of what constitutes justice, will inevitably initiate change. The diffusion of education will be initiated by an educated elite, but local discussion and reading groups will play a role: the educated and the less educated bearing witness to their own knowledge, passing it on and learning from frank conversation.

Unlike Paine and Wollstonecraft, Godwin does not advocate a planned national or state system of mass education. Neither the state nor the church could be trusted to develop curricula and pedagogical styles that educate children in an unbiased way. He is wary of the possibility of a mass education system levelling down, of reducing children to a “naked and savage equality” that suits the interests of the ruling elite. Nor could we trust state-accredited teachers to be unbiased or to model open-mindedness and explorative discussion. He puts his faith, rather, in the practices of a just community, one in which a moral duty to educate all children is enacted without restraint. Presumably, each community would evolve its own practices and make progressive improvements. The education of its children, and of adults, would find a place within the community’s exploration of how to thrive without government regulation and coercion. Paine wanted governmental involvement in a mass literacy movement, and Wollstonecraft wanted a system of coeducational schools for younger children, but Godwin sees a danger in any proposal that systematizes education.

Godwin’s vision of society does not allow him to specify in any detail a particular curriculum. Again, to do so would come too close to institutionalising education, inhibiting local democratic choice and diversity. He does, however, advocate epistemic practices which have pedagogical implications. Children should be taught to venerate truth, to enquire, to present reasons for belief, to reject as prejudice beliefs unsupported by evidence, to examine objections. ‘Refer them to reading, to conversation, to meditation; but teach them neither creeds nor catechisms, neither moral nor political’ (PJ II: 300; Bk VI, viii). In The Enquirer he writes: ‘It is probable that there is no one thing that it is of eminent importance for a child to learn. The true object of juvenile education, is to provide, against the age of five and twenty, a mind well regulated, active, and prepared to learn’ (1797: 77-78).

In the essay ‘Of Public and Private Education’, Godwin considers the advantages and disadvantages of education by private tutor rather than by public schooling. He concludes by wondering whether there might be a middle way: ‘Perhaps an adventurous and undaunted philosophy would lead to the rejecting them altogether, and pursuing the investigation of a mode totally dissimilar’ (1797: 64). His criticisms of both are reinforced in his novel Mandeville, in which the main character is educated privately by an evangelical minister, and then sent, unhappily, to Winchester College; he experiences both modes as an imposition on his liberty and natural dispositions. Certainly, Godwin’s ideas rule out traditional schools, with set timetables and curricula, with authoritarian teachers, ‘the worst of slaves’, whose only mode of teaching is direct instruction, and deferential pupils who ‘learn their lessons after the manner of parrots’ (1797: 81). The first task of a teacher, Godwin suggests in the essay ‘Of the Communication of Knowledge’, is to provide pupils with an intrinsic motive to learn—that is, with ‘a perception of the value of the thing learned’ (1797: 78). This is easiest if the teacher follows the pupil’s interests and facilitates his or her enquiries. The teacher’s task then is to smooth the pupil’s path, to be a consultant and a participant in discussions and debates, modelling the epistemic and linguistic virtues required for learning with and from each other. The pupil and the ‘preceptor’ will be co-learners and the forerunners of individuals who, in successive generations, will develop increasingly wise and comprehensive views.

In Godwin’s view, there will never be a need for a national system of pay or accreditation, but there will be a need, in the short term, for leadership by a bourgeois educated elite. It is interesting to compare this view with Coleridge’s idea of a ‘clerisy’, a permanent national intellectual elite, most fully developed in On the Constitution of the Church and State (1830). The term ‘clerisy’ refers to a state-sponsored group of intellectual and learned individuals who would diffuse indispensable knowledge to the nation, whose role would be to humanize, cultivate, and unify. Where Godwin anticipates an erosion of differences of rank and an equitable education for all, Coleridge wants education for the labouring classes to be limited, prudentially, to religion and civility, with a more extensive liberal education for the higher classes. The clerisy is a secular clergy, holding the balance between agricultural and landed interests on the one hand, and mercantile and professional interests on the other. Sages and scholars in the frontline of the physical and moral sciences would serve also as the instructors of a larger group whose role would be to disseminate knowledge and culture to every ‘parish’. Coleridge discussed the idea with Godwin, but very little in it could appeal to a philosopher who anticipated a withering away of the national state; nor could Godwin have agreed with the idea of a permanent intellectual class accredited and paid by the state, or with the idea of a society that depended for its unity on a permanently maintained intelligentsia. Coleridge’s idea put limits on the learning of the majority and denied them the freedom, and the capacity, to pursue their own enquiries and opinions—as did the national education system that developed in Britain in the nineteenth and twentieth centuries.

Godwin’s educational ideas have had little direct impact. They were not as well-known as those of Rousseau to later progressivist educational theorists and practitioners. He had, perhaps, an over-intellectualised conception of children’s development, and too utopian a vision of the kind of society in which his educational ideas could flourish. Nevertheless, it is interesting that his emphasis on autonomous thinking and critical discussion, on equality and justice in the distribution of knowledge and understanding, and his awareness of how powerful interests and dominant ideologies are insinuated through education, are among the key themes of modern educational discourse. The way in which his ideas about education are completely integral to his anarchist political philosophy is one reason why he deserves attention from philosophers of education, as well as from political theorists.

4. Godwin’s Philosophical Anarchism

a. Introduction

Godwin was the first to argue for anarchism from first principles. The examination of his ideas about education has introduced important aspects of his anarchism, including the preference for local community-based practices, rather than any national systems or institutions. His anarchism is both individualistic and socially oriented. He believes that the development of private judgment enables an improved access to truth, and truth enables progression towards a just society. Monarchical and aristocratic modes of government, together with any form of authority based on social rank or religion, are inconsistent with the development of private judgment. Godwin’s libertarianism in respect of freedom of thought and expression deserves recognition, but his commitment to sincerity and candour, to speech that presumes to assert as true only what is epistemically sound, means that not all speech is epistemically responsible. Nor is all listening responsible: free speech, like persuasive argument, requires a fair-minded and tolerant reception. To prepare individuals and society for the responsible exercise of freedom of thought and expression is a task for education.

Godwin was a philosophical anarchist. He did not specify ways in which like-minded people should organise or build a mass movement. Even in the 1790s, when the enthusiasm for the French Revolution was at its height, he was cautious about precipitating unrest. With regard to the practical politics of his day, he was a liberal Whig, never a revolutionary. But the final two Books of Political Justice take Godwin’s anarchism forward with arguments concerning crime and punishment (Book VII) and property (Book VIII). It is here that some of his most striking ideas are to be found, and where he engages with practical policy issues as well as with philosophical principles.

b. Punishment

Godwin sees punishment as inhumane and cruel. In keeping with his necessitarianism, he cannot accept that criminals make a genuinely free choice to commit a crime: ‘the assassin cannot help the murder he commits any more than the dagger’ (PJ II: 324; Bk VII, i). Human beings are not born into sin, but neither are they born virtuous. Crime is caused environmentally, by social circumstances, by ignorance, inequality, oppression. When the wealthy acknowledge this, they will recognise that if their circumstances and those of the poor were reversed, so, too, would be their crimes. Therefore, Godwin rejects the notions of desert and retributive justice. Only the future benefit that might result from punishment matters, and he finds no evidence that suffering is ever beneficial. Laws, like all prescriptions and prohibitions, condemn the mind to imbecility, alienating it from truth, inviting insincerity when obedience is coerced. Laws, and all the legal and penal apparatus of states, weaken us morally and intellectually by causing us to defer to authority and to ignore our responsibilities.

Godwin considers various potential justifications of punishment. It cannot be justified by the future deterrent effect on the same offender, for a mere suspicion of criminal conduct would justify it. It cannot be justified by its reformative effect, for patient persuasion would be more genuinely effective. It cannot be justified by its deterrent effect on non-offenders, for then the greatest possible suffering would be justified because that would have the greatest deterrent effect. Any argument for proportionality is absurd, for how could a proportionate sentence be determined when there are so many variables of motivation, intention, provocation, and harm done? Laws and penal sentences are too inflexible to produce justice. Prisons are seminaries of vice, and hard labour, like slavery of any kind, is evil. Only for the purposes of temporary restraint should people ever be deprived of their liberty. A radical alternative to punishment is required.

The development of individuals’ capacities for reason and judgment will be accompanied by a gradual emancipation from law and punishment. The community will apply its new spirit of independence to advance the general good. Simpler, more humane and just practices will emerge. The development of private judgment will enable finer distinctions, and better understanding, to move society towards genuine justice. When people trust themselves and their communities to shoulder responsibility as individuals, they will learn to be ‘as perspicacious in distinguishing, as they are now indiscriminate in confounding, the merit of actions and characters’ (PJ II: 412; Bk VI, viii).

c. Property

Property, Godwin argues, is responsible for oppression, servility, fraud, malice, revenge, fear, selfishness, and suspicion. The abolition—or, at least, transformation—of property will be a key achievement of a just society. If I have a superfluity of loaves and one loaf would save a starving neighbour’s life, to whom does that loaf justly belong? Equity is determined by benefit or utility: ‘Every man has a right to that, the exclusive possession of which being awarded to him, a greater sum of benefit or pleasure will result, than could have arisen from its being otherwise appropriated’ (PJ II: 423; Bk VIII, i).

It is not just a question of subsistence, but of all indispensable means of improvement and happiness. It includes the distribution of education, skills, and knowledge. The poor are kept in ignorance while the rich are honoured and rewarded for being acquisitive, dissipated, and indolent. Leisure would be more evenly distributed if the rich man’s superfluities were removed, and this would allow more time for intellectual improvement. Godwin’s response to the objection that a superfluity of property generates excellence—culture, industry, employment, decoration, arts—is that all these would increase if leisure and intellectual cultivation were evenly distributed. Free from oppression and drudgery, people would discover new pleasures and capacities. They would see the benefit of their own exertions to the general good ‘and all will be animated by the example of all’ (PJ II: 488; Bk VIII, iv).

Godwin addresses another objection to his egalitarianism in relation to property: the impossibility of rendering equality permanent. We might see equality as desirable but lack the capacity to sustain it; human nature will always reassert itself. To this Godwin’s response is that equality can be sustained if the members of the community are sufficiently convinced that it is just and that it generates happiness. Only the current ‘infestation of mind’ could watch inequality dissolve and happiness increase, and yet be willing to sacrifice those gains. In time people will grow less vulnerable to greed, flattery, fame, and power, and more attracted to simplicity, frugality, and truth.

But if we choose to receive no more than our just share, why should we impose this restriction on others, intruding on their moral independence? Godwin replies that moral error needs to be censured frankly and contested by argument and persuasion, but we should govern ourselves ‘through no medium but that of inclination and conviction’ (PJ II: 497; Bk VIII, vi). If a conflict between the principle of equality and the principle of independent judgment appears, priority should go to the latter. The proper way to respect other people’s independence of mind is to engage them in discussion and seek to persuade them. Conversation remains, for Godwin, the most fertile source of improvement. If people trust their own opinions and resist all challenges to them, they serve the community, because the worst possible state of affairs would be a clockwork uniformity of opinion. This is why education should not seek to cast the minds of children in a particular mould.

In a society built on anarchist principles, property will no longer provide an excuse for the exploitation of other people’s time and labour; but it will still exist to the extent that each person retains items required for their welfare and day-to-day subsistence. They should not be selfish or jealous of them. If two people dispute an item, Godwin writes, let justice, not law, decide between them. All will serve on temporary juries for resolving disputes or agreeing on practices, and all will have the capacity to do so without fear or favour.

d. Response to Malthus

The final objection to his egalitarian strictures on property in Political Justice is addressed in the chapter ‘Of the objection to this system from the principle of population’ (Bk VIII, vii). The objection raises the possibility that an egalitarian world might become too populous to sustain human life. Godwin argues that if this were to threaten human existence, people would develop the strength of mind to overcome the urge to propagate. Combined with the banishment of disease and increased longevity—even perhaps the achievement of immortality—the nature of the world’s population would change. Long life, extended education, a progressive improvement in concentration, a reduced need for sleep, and other advances, would result in a rapid increase in wisdom and benevolence. People would find ways to keep the world’s population at a sustainable level.

This chapter, together with the essay ‘Of Avarice and Profusion’ (The Enquirer, 1797), contributed to Thomas Malthus’s decision to write An Essay on the Principle of Population, first published in 1798. Malthus argued that Godwin was too optimistic about social progress. They met and discussed the question amicably, and a response was included in Godwin’s Reply to Dr Parr, but his major response, Of Population, was not published until 1820, by which time Malthus’s Essay was into its fifth, greatly expanded, edition. Godwin argues against Malthus’s geometrical ratio for population increase and his arithmetical ratio for the increase in food production, drawing where possible on international census figures. He looks to mechanisation, to the untapped resources of the sea, to an increase in crop cultivation rather than meat production, and to chemistry’s potential for producing new foodstuffs. With regard to sexual passions, he repeats his opinion from Political Justice that men and women are capable of immense powers of restraint, and with regard to the Poor Laws, which Malthus wished to abolish, he argues that they were better for the poor than no support at all. Where Malthus argued for lower wages for the poor, Godwin argued for higher pay, to redistribute wealth and stimulate the economy.

When Malthus read Of Population, he rather sourly called it ‘the poorest and most old-womanish performance to have fallen from a writer of note’. The work shows that Godwin remained positive about the capacity of humankind to overcome misery and to achieve individual and social improvement. He knew that if Malthus was right, hopes for radical social progress, and even short-term relief for the poor and oppressed, were futile.

5. Godwin’s Fiction

a. Caleb Williams (1794)

Godwin wrote three minor novels before he wrote Political Justice. They had some success, but nothing like that of the two novels he completed in the 1790s. Caleb Williams and St. Leon were not only the most successful and intellectually ambitious of his novels but were also the two that relate most closely to his philosophical work of the 1790s. He wrote two more novels that were well received: Fleetwood; or, The New Man of Feeling (1805) and Mandeville, a Tale of the Seventeenth Century in England (1817). His final two novels, Cloudsley (1830) and Deloraine (1831), were more romantic and less successful.

Things As They Are; or The Adventures of Caleb Williams is both a study of individual psychology and a continuation, or popularization, of Godwin’s critical analysis of English society in Political Justice. It explores how aristocracy insinuates authority and deference throughout society. One of the two main characters, Falkland, is a wealthy philanthropist whose tragic flaw is a desire to maintain at any cost his reputation as an honourable and benevolent gentleman. The other, Caleb, is his bright, self-educated servant with insatiable curiosity. Caleb admires Falkland, but he begins to suspect that it was his master who murdered the uncouth and boorish neighbouring squire, Barnabas Tyrrel. When the opportunity arises for him to search the contents of a mysterious chest in Falkland’s library, Caleb cannot resist. He is discovered by Falkland and learns the truth from him. Not only was Falkland the murderer, but he had allowed innocent people to die for the crime. He is driven to protect his reputation and honour at any cost. Caleb is chased across the country, and around Europe, by Falkland’s agents. He is resourceful and courageous in eluding them, but Falkland’s power and resources are able to wear him down and bring him to court, where Falkland and Caleb face each other. They are both emotionally, psychologically, and physically exhausted. In different ways, both have been persecuted and corrupted by the other, and yet theirs is almost a love story. The trial establishes the facts as far as they interest the law, but it is not the whole truth: not, from a moral perspective, in terms of true guilt and innocence, and not from a psychological perspective.

Caleb’s adventures during his pursuit across Britain and Europe allow us to see different aspects of human character and psyche, and of the state of society. Caleb recounts his adventures himself, allowing the reader to infer the degree to which he is reliably understanding and confessing his own moral and psychological decline. He espouses principles of sincerity and candour, but his narrative shows the difficulty of being truly honest with oneself. The emotional and mental effects of his persecution are amplified by his growing paranoia.

The novel was recognised as an attack on values and institutions embedded in English society, such as religion, law, prisons, inequality, social class, the abuse of power, and aristocratic notions of honour. One of the more didactic passages occurs when Caleb is visited in prison by Thomas, a fellow servant. Thomas looks at the conditions in which Caleb is kept—shackled and without even straw for a bed—and exclaims, ‘Zounds, I have been choused. They told me what a fine thing it was to be an Englishman, and about liberty and property, and all that there; and I find it is all flam’ (2009: 195). In another episode, Caleb encounters a group of bandits. Their leader, Raymond, justifies their activities to Caleb: ‘We undertake to counteract the partiality and iniquity of public institutions. We, who are thieves without a licence, are at open war with another set of men, who are thieves according to law… we act, not by choice, but only as our wise governors force us to act’ (2009: 209).

It is also a story of communication failure, of mutual distrust and resentment that could have been resolved by conversation. Caleb’s curiosity made him investigate the chest for himself, rather than openly confront Falkland with his suspicions. Both men have failed to exercise their private judgment independently of the values and expectations of their social situation. By the end of the novel, any hope of resolution has evaporated: a frank and rational discussion at the right time could have achieved it. It was, at least in part, the social environment—social inequality—that created their individual characters and the communication barrier.

As well as themes from Political Justice, there are echoes of the persecution and surveillance of British radicals at the time of writing and of the false values, as Godwin saw them, of Burke’s arguments in favour of tradition and aristocracy, of ‘things as they are’. It is not surprising that the novel was especially praised by readers with radical views. In his character sketch of Godwin (in The Spirit of the Age), Hazlitt wrote that ‘no one ever began Caleb Williams that did not read it through: no one that ever read it could possibly forget it, or speak of it after any length of time but with an impression as if the events and feelings had been personal to himself’ (Hazlitt, 2000: 288).

b. St. Leon: A Tale of the Sixteenth Century (1799)

Despite its historical setting, St. Leon is as concerned as Caleb Williams is with the condition of contemporary society and with themes from Political Justice. Gary Kelly (1976) has coupled St. Leon with Caleb Williams as an English Jacobin novel (together with works by Elizabeth Inchbald, Robert Bage, and Thomas Holcroft), and Pamela Clemit (1993) classes them as Rational or Godwinian novels (together with works by Mary Shelley and the American novelist Charles Brockden Brown). They are certainly philosophical novels. St. Leon is also an historical novel in that its setting in sixteenth century Europe is accurately depicted, and it is a Gothic novel in that it contains mystery, horror, arcane secrets, and dark dungeons. B. J. Tysdahl (1981) refers to its ‘recalcitrant Gothicism’. When Lord Byron asked why he did not write another novel, Godwin replied that it would kill him. ‘And what matter,’ Byron replied, ‘we should have another St. Leon’.

The central character and narrator, St. Leon, is as imbued with the values of his own country, class, and period as Falkland. At the start of the novel, he is a young French nobleman in thrall to chivalric values and anxious to create a great reputation as a knight. A high point of his youth is his attendance at the Field of the Cloth of Gold in 1520, when Francis I of France and Henry VIII of England met in awe-inspiring splendour, as if to mark the end of medievalism. A low point is when the French are defeated at the Battle of Pavia. St. Leon’s education had prepared him for a chivalric way of life; its passing leaves him unprepared for a world with more commercial values. His hopes of aristocratic glory are finally destroyed by an addiction to gambling. He loses his wealth and the respect of his son, Charles, and might have lost everything had he been married to a less extraordinary woman. Marguerite sees their financial ruin as a blessing in disguise, and for a period the family enjoys domestic contentment in a humble setting in Switzerland.

This changes when St. Leon encounters a stranger who has selected him to succeed to the possession of arcane knowledge. The alchemical secrets he is gifted—the philosopher’s stone and the elixir of life—restore his wealth and give him immortality. He seizes the opportunity to make amends to his family and to society by becoming the world’s benefactor. But the gift turns out to be a curse. His wife dies, his philanthropic schemes fail, and he becomes an outcast, mistrusted and alienated forever. Generations pass; St. Leon persists but sees himself as a monster undeserving of life. Only by unburdening himself of the alchemical knowledge, as the stranger had done, could he free himself to die. Otherwise, he must live forever a life of deceit and disguise. As the narrator, he cannot provide clues even to the recipients of his narration, in whatever age we might live. We pity him, but we cannot entirely trust him: even as a narrator he is suspect. As in Caleb Williams, the impossibility of candour and truthfulness is shown to be corrupting, and as in Mary Shelley’s Frankenstein, unique knowledge and a unique form of life are shown to bring desolation in the absence of affection, trust, and communication.

We can interpret St. Leon as a renewal of Godwin’s critique of Burke and of the British mixture of tradition and commercialism. We can see in Marguerite a tribute to Mary Wollstonecraft. Is there also, as Gary Kelly suggests (1976: 210), a parallel between the radical philosophers of the late eighteenth century—polymaths like Joseph Priestley and Richard Price, perhaps, or Godwin and Wollstonecraft themselves—and the alchemical adept whose knowledge and intentions society suspects and is unprepared for? Writing St. Leon so shortly after the death of Wollstonecraft, when he is enduring satirical attacks, Godwin must have felt himself in danger of becoming isolated and insufficiently appreciated. We can see the novel as pessimistic, reflecting Godwin’s doubts about the potential for radical change in his lifetime. But Godwin well knew that alchemy paved the way for chemical science, so perhaps the message is more optimistic: what seems like wishful thinking today will lead us to tomorrow’s accepted wisdom.

6. Conclusion

Godwin died on the cusp of the Victorian age, having played a part in the transition from the Enlightenment to Romanticism. His influence persisted as Political Justice reached a new, working-class readership through quotation in Owenite and Chartist pamphlets and a cheap edition published in 1842, and his ideas were discussed at labour movement meetings. His novels influenced Dickens, Poe, Hawthorne, Balzac, and others. According to Marshall (1984: 392), Marx knew of Godwin through Engels, but disagreed with his individualism and about which social class would be the agent of reform. Of the great anarchist thinkers who came after him, Bakunin does not refer to him; Tolstoy does, though he may not have read him directly; Kropotkin, however, hailed him as the first to define the principles of anarchism.

Godwin’s political philosophy can appear utopian, and his view of the potential for human improvement naively optimistic, but his ideas still have resonance and relevance. As a moral philosopher, he has not received sufficient credit for his version of utilitarian principles, contemporaneous with Bentham’s, a version that anticipates John Stuart Mill’s. He was both intellectually courageous in sticking to his fundamental principles, and conscientious in admitting to errors. Unlike Malthus, he believed the conditions of the poor and oppressed could and should be improved. He was confident that an egalitarian democracy free of government interference would allow individuals to thrive. One of his most important contributions to social and political theory is his analysis of how educational injustice is a primary source of social injustice. The journey to political justice begins and ends with educational justice.

7. References and Further Reading

a. Works by William Godwin

i. Early Editions of An Enquiry Concerning Political Justice

  • 1793. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. First edition. 2 vols. London: G.G. and J. Robinson.
  • 1796. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. Second edition. 2 vols. London: G.G. and J. Robinson.
  • 1798. An Enquiry Concerning Political Justice, and Its Influence on General Virtue and Happiness. Third edition. 2 vols. London: G.G. and J. Robinson.

ii. Other Editions of An Enquiry Concerning Political Justice

  • 1946. An Enquiry Concerning Political Justice. F. E. L. Priestley (ed). 3 vols. Toronto: University of Toronto Press.
    • This is a facsimile of the third edition. Volume 3 contains variants from the first and second editions.
  • 2013. An Enquiry Concerning Political Justice. Mark Philp (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
    • This is based on the text of the 1793 first edition. In addition to an introduction by Mark Philp, it includes a chronology of Godwin’s life and explanatory notes.
  • 2015. Enquiry Concerning Political Justice: And Its Influence On Morals And Happiness. Isaac Kramnick (ed.). London: Penguin.
    • This is based on the text of the 1798 third edition. It includes the Summary of Principles. Introduction and Editor’s Notes by Isaac Kramnick.

iii. Collected Editions of Godwin’s Works and Correspondence

  • 1992. Collected Novels and Memoirs of William Godwin. 8 vols. Mark Philp (ed.). London: Pickering and Chatto Publishers, Ltd.
    • A scholarly series that includes Memoirs of the Author of a Vindication of the Rights of Woman as well as the text of all Godwin’s fiction and some unpublished pieces.
  • 1993. Political and Philosophical Writings of William Godwin, 7 Volumes, Mark Philp (ed.). London, Pickering and Chatto Publishers Ltd.
    • A scholarly edition of Godwin’s principal political and philosophical works, including some previously unpublished pieces. Volume 1 includes a complete bibliography of Godwin’s works and political essays. Volume 2 contains the remaining political essays. Volume 3 contains the text of the first edition of Political Justice; volume 4 contains variants from the second and third editions. Volumes 5 and 6 contain educational and literary works, including The Enquirer essays. Volume 7 includes Godwin’s final (unfinished) work, published posthumously: The Genius of Christianity Unveiled.
  • 2011, 2014. The Letters of William Godwin. Volume 1: 1778–1797, Volume 2: 1798–1805. Pamela Clemit (ed.). Oxford: Oxford University Press.
    • A projected six volume series.

iv. First Editions of Other Works by Godwin

  • 1783. An Account of the Seminary That Will Be Opened on Monday the Fourth Day of August at Epsom in Surrey. London: T. Cadell.
  • 1784. The Herald of Literature, as a Review of the Most Considerable Publications That Will Be Made in the Course of the Ensuing Winter. London: J. Murray.
  • 1794a. Cursory Strictures on the Charge Delivered by Lord Chief Justice Eyre to the Grand Jury. London: D. I. Eaton.
  • 1794b. Things As They Are; or The Adventures of Caleb Williams. 3 vols. London: B. Crosby.
  • 1797. The Enquirer: Reflections on Education, Manners and Literature. London: G.G. and J. Robinson.
  • 1798. Memoirs of the Author of a Vindication of the Rights of Woman. London: J. Johnson.
  • 1799. St. Leon, A Tale of the Sixteenth Century. 4 vols. London: G.G. and J. Robinson.
  • 1801. Thoughts Occasioned by the Perusal of Dr. Parr’s Spital Sermon, Preached at Christ Church, April 15, 1800: Being a Reply to the Attacks of Dr. Parr, Mr. Mackintosh, the Author of an Essay on Population, and Others. London: G.G. and J. Robinson.
  • 1805. Fleetwood; or, The New Man of Feeling. 3 vols. London: R. Phillips.
  • 1817. Mandeville, a Tale of the Seventeenth Century in England. 3 vols. London: Longman, Hurst, Rees, Orme and Brown.
  • 1820. Of Population. An Enquiry Concerning the Power of Increase in the Numbers of Mankind, Being an Answer to Mr. Malthus’s Essay on That Subject. London: Longman, Hurst, Rees, Orme and Brown.
  • 1824. History of the Commonwealth of England from Its Commencement to Its Restoration. 4 vols. London: H. Colburn.
  • 1831. Thoughts on Man, His Nature, Productions, and Discoveries. Interspersed with Some Particulars Respecting the Author. London: Effingham Wilson.

v. Online Resources

  • 2010. The Diary of William Godwin. Victoria Myers, David O’Shaughnessy, and Mark Philp (eds.). Oxford: Oxford Digital Library. http://godwindiary.bodleian.ox.ac.uk/index2.html.
    • Godwin kept a diary from 1788 to 1836. It is held by the Bodleian Library, University of Oxford, as part of the Abinger Collection. Godwin recorded meetings, topics of conversation, and his reading and writing in succinct notes.

vi. Other Editions of Selected Works by Godwin

  • 1986. Romantic Rationalist: A William Godwin Reader. Peter Marshall (ed.). London: Freedom Press.
    • Contains selections from Godwin’s works, arranged by theme.
  • 1988. Caleb Williams. Maurice Hindle (ed.). London: Penguin Books.
  • 1994. St. Leon. Pamela Clemit (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
  • 2005. Godwin on Wollstonecraft: Memoirs of the Author of a Vindication of the Rights of Woman. Richard Holmes (ed.). London: Harper Perennial.
  • 2009. Caleb Williams. Pamela Clemit (ed.). Oxford World’s Classics. Oxford: Oxford University Press.
  • 2019. Fleetwood. Classic Reprint. London: Forgotten Books.
  • 2019. Mandeville: A Tale of the Seventeenth Century in England. Miami, Fl: Hard Press Books.

b. Biographies of Godwin

  • Brailsford, H N. 1951. Shelley, Godwin and Their Circle. Second edition. Home University Library of Modern Knowledge. Oxford: Oxford University Press.
  • Brown, Ford K. 1926. The Life of William Godwin. London: J. M. Dent and Sons.
  • Clemit, Pamela (ed.). 1999. Godwin. Lives of the Great Romantics III: Godwin, Wollstonecraft and Mary Shelley by their Contemporaries. Volume 1. London: Pickering and Chatto.
  • Goulbourne, Russell, Higgins, David (eds.). 2017. Jean-Jacques Rousseau and British Romanticism: Gender and Selfhood, Politics and Nation. London: Bloomsbury.
  • Hazlitt, William. 2000. ‘William Godwin’ in The Fight and Other Writings. Tom Paulin (ed.). London: Penguin.
  • Locke, Don. 1980. A Fantasy of Reason: The Life and Thought of William Godwin. London: Routledge and Kegan Paul.
    • This is described as a ‘philosophical biography’.
  • Marshall, Peter. 1984. William Godwin. New Haven: Yale University Press.
    • A new edition is entitled William Godwin: Philosopher, Novelist, Revolutionary. PM Press, 2017. The text appears to be unchanged. A standard biography.
  • Paul, Charles Kegan. 1876. William Godwin: His Friends and Contemporaries. 2 vols. London: H.S. King.
    • An early and thorough biography, with important manuscript material.
  • St Clair, William. 1989. The Godwins and the Shelleys: The Biography of a Family. London: Faber and Faber.
  • Thomas, Richard Gough. 2019. William Godwin: A Political Life. London: Pluto Press.
  • Woodcock, George. 1946. William Godwin: A Biographical Study. London: Porcupine Press.

c. Social and Historical Background

  • Butler, Marilyn. 1984. Burke, Paine, Godwin and the Revolution Controversy. Cambridge: Cambridge University Press.
  • Grayling, A. C. 2007. Towards the Light: The Story of the Struggles for Liberty and Rights. London: Bloomsbury.
  • Hay, Daisy. 2022. Dinner with Joseph Johnson: Books and Friendship in a Revolutionary Age. London: Chatto and Windus.
    • A study of the regular dinners held by the radical publisher, whose guests included Godwin, Wollstonecraft, Fuseli, Blake, and many other writers, artists, and radicals.
  • Hewitt, Rachel. 2017. A Revolution in Feeling: The Decade that Forged the Modern Mind. London: Granta.
  • Norman, Jesse. 2013. Edmund Burke: Philosopher, Politician, Prophet. London: William Collins.
  • Philp, Mark. 2020. Radical Conduct: Politics, Sociability and Equality in London 1789–1815. Cambridge UK: Cambridge University Press.
    • A study of the radical intellectual culture of the period and of Godwin’s position within it.
  • Simon, Brian. 1960. Studies in the History of Education, 1780–1870. London: Lawrence and Wishart.
  • Tomalin, Claire. 1974. The Life and Death of Mary Wollstonecraft. London: Weidenfeld and Nicolson.
  • Uglow, Jenny. 2014. In These Times: Living in Britain Through Napoleon’s Wars 1798–1815. London: Faber and Faber.

d. Other Secondary Sources in Philosophy, Education, Fiction, and Anarchism

  • Bottoms, Jane. 2004. ‘“Awakening the Mind”: The Educational Philosophy of William Godwin’. History of Education 33 (3): 267–82.
  • Claeys, Gregory. 1983. ‘The Concept of “Political Justice” in Godwin’s Political Justice.’ Political Theory 11 (4): 565–84.
  • Clark, John P. 1977. The Philosophical Anarchism of William Godwin. Princeton: Princeton University Press.
  • Clemit, Pamela. 1993. The Godwinian Novel. Oxford: Clarendon Press.
  • Crowder, George. 1991. Classical Anarchism: The Political Thought of Godwin, Proudhon, Bakunin and Kropotkin. Oxford: Oxford University Press.
  • Eltzbacher, Paul. 1960. Anarchism: Seven Exponents of the Anarchist Philosophy. London: Freedom Press.
  • Fleisher, David. 1951. William Godwin: A Study of Liberalism. London: Allen and Unwin.
  • Fricker, Miranda. 2007. Epistemic Injustice: Power and the Ethics of Knowing. Oxford: Oxford University Press.
  • Kelly, Gary. 1976. The English Jacobin Novel 1780–1805. Oxford: Clarendon Press.
  • Knights, B. 1978. The Idea of the Clerisy in the Nineteenth Century. Cambridge UK: Cambridge University Press.
  • Lamb, Robert. 2006. ‘The Foundations of Godwinian Impartiality’. Utilitas 18 (2): 134–53.
  • Lamb, Robert. 2009. ‘Was William Godwin a Utilitarian?’ Journal of the History of Ideas 70 (1): 119–41.
  • Maniquis, Robert, Myers, Victoria (eds.). 2011. Godwinian Moments: From Enlightenment to Romanticism. Toronto: University of Toronto Press/Clark Library UCLA.
  • Marshall, Peter. 2010. Demanding the Impossible: A History of Anarchism. Oakland, CA: PM Press.
  • Mee, Jon. 2011. ‘The Use of Conversation: William Godwin’s Conversable World and Romantic Sociability’. Studies in Romanticism 50 (4): 567–90.
  • Monro, D.H. 1953. Godwin’s Moral Philosophy. Oxford: Oxford University Press.
  • O’Brien, Eliza, Stark, Helen, Turner, Beatrice (eds.) 2021. New Approaches to William Godwin: Forms, Fears, Futures. London: Palgrave/MacMillan.
  • Philp, Mark. 1986. Godwin’s Political Justice. London: Duckworth.
    • A detailed analysis of Godwin’s major philosophical work.
  • Pollin, Burton R. 1962. Education and Enlightenment in the Works of William Godwin. New York: Las Americas Publishing Company.
    • Still the most thorough study of Godwin’s educational thought.
  • Scrivener, Michael. 1978. ‘Godwin’s Philosophy Re-evaluated’. Journal of the History of Ideas 39: 615–26.
  • Simon, Brian (ed.). 1972. The Radical Tradition in Education in Great Britain. London: Lawrence and Wishart.
  • Singer, Peter, Leslie Cannold, Helga Kuhse. 1995. ‘William Godwin and the Defence of Impartialist Ethics’. Utilitas 7 (1): 67–86.
  • Suissa, Judith. 2010. Anarchism and Education: A Philosophical Perspective. Second edition. Oakland, CA: PM Press.
  • Tysdahl, B. J. 1981. William Godwin as Novelist. London: Athlone Press.
  • Weston, Rowland. 2002. ‘Passion and the “Puritan Temper”: Godwin’s Critique of Enlightened Modernity’. Studies in Romanticism 41 (3): 445–70.
  • Weston, Rowland. 2013. ‘Radical Enlightenment and Antimodernism: The Apostasy of William Godwin (1756–1836)’. Journal for the Study of Radicalism 7 (2): 1–30.

Author Information

Graham Nutbrown
Email: gn291@bath.ac.uk
University of Bath
United Kingdom

Noam Chomsky (1928–)

Noam Chomsky is an American linguist who has had a profound impact on philosophy. Chomsky’s linguistic work has been motivated by the observation that nearly all adult human beings have the ability to effortlessly produce and understand a potentially infinite number of sentences. For instance, it is very likely that before now you have never encountered this very sentence you are reading, yet if you are a native English speaker, you easily understand it. While this ability often goes unnoticed, it is a remarkable fact that every developmentally normal person gains this kind of competence in their first few years, no matter their background or general intellectual ability. Chomsky’s explanation of these facts is that language is an innate and universal human property, a species-wide trait that develops as one matures in much the same manner as the organs of the body. A language is, according to Chomsky, a state obtained by a specific mental computational system that develops naturally and whose exact parameters are set by the linguistic environment that the individual is exposed to as a child. This definition, which is at odds with the common notion of a language as a public system of verbal signals shared by a group of speakers, has important implications for the nature of the mind.

Over decades of active research, Chomsky’s model of the human language faculty—the part of the mind responsible for the acquisition and use of language—has evolved from a complex system of rules for generating sentences to a more computationally elegant system that consists essentially of just constrained recursion (the ability of a function to apply itself repeatedly to its own output). What has remained constant is the view of language as a mental system that is based on a genetic endowment universal to all humans, an outlook that implies that all natural languages, from Latin to Kalaallisut, are variations on a Universal Grammar, differing only in relatively unimportant surface details. Chomsky’s research program has been revolutionary but contentious, and critics include prominent philosophers as well as linguists who argue that Chomsky discounts the diversity displayed by human languages.

Chomsky is also well known as a champion of liberal political causes and as a trenchant critic of United States foreign policy. However, this article focuses on the philosophical implications of his work on language. After a biographical sketch, it discusses Chomsky’s conception of linguistic science, which often departs sharply from other widespread ideas in this field. It then gives a thumbnail summary of the evolution of Chomsky’s research program, especially the points of interest to philosophers. This is followed by a discussion of some of Chomsky’s key ideas on the nature of language, language acquisition, and meaning. Finally, there is a section covering his influence on the philosophy of mind.

Table of Contents

  1. Life
  2. Philosophy of Linguistics
    1. Behaviorism and Linguistics
    2. The Galilean Method
    3. The Nature of the Evidence
    4. Linguistic Structures
  3. The Development of Chomsky’s Linguistic Theory
    1. Logical Constructivism
    2. The Standard Model
    3. The Extended Standard Model
    4. Principles and Parameters
    5. The Minimalist Program
  4. Language and Languages
    1. Universal Grammar
    2. Plato’s Problem and Language Acquisition
    3. I vs. E languages
    4. Meaning and Analyticity
    5. Kripkenstein and Rule Following
  5. Cognitive Science and Philosophy of Mind
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life

Avram Noam Chomsky was born in Philadelphia in 1928 to Jewish parents who had immigrated from Russia and Ukraine. He manifested an early interest in politics and, from his teenage years, frequented anarchist bookstores and political circles in New York City. Chomsky entered the University of Pennsylvania at the age of 16, but he initially found his studies unstimulating. After meeting the mathematical linguist Zellig Harris through political connections, Chomsky developed an interest in language, taking graduate courses with Harris and, on his advice, studying philosophy with Nelson Goodman. Chomsky’s 1949 undergraduate honors thesis, on Modern Hebrew, would form the basis of his 1951 MA thesis, also from the University of Pennsylvania. Although Chomsky would later have intellectual fallings out with both Harris and Goodman, they were major influences on him, particularly in their rigorous approach, informed by mathematics and logic, which would become a prominent feature of his own work.

After earning his MA, Chomsky spent the next four years with the Society of Fellows at Harvard, where he had applied largely because of his interest in the work of W.V.O. Quine, a Harvard professor and major figure in analytic philosophy. This would later prove to be somewhat ironic, as Chomsky’s work developed into the antithesis of Quine’s behaviorist approach to language and mind. In 1955, Chomsky was awarded his doctorate and became an assistant professor at the Massachusetts Institute of Technology, where he would continue to work as an emeritus professor even after his retirement in 2002. Throughout this long tenure at MIT, Chomsky produced an enormous volume of work in linguistics, beginning with the 1957 publication of Syntactic Structures. Although his work initially met with indifference or even hostility, including from his former mentors, it gradually altered the very nature of the field, and Chomsky grew to be widely recognized as one of the most important figures in the history of language science. Since 2017, he has been a laureate professor in the linguistics department at the University of Arizona.

Throughout his career, Chomsky has been at least as prolific in social, economic, and political criticism as in linguistics. Chomsky became publicly outspoken about his political views with the escalation of the Vietnam War, which he always referred to as an “invasion”. He was heavily involved in the anti-war movement, sometimes risking both his professional and personal security, and was arrested several times. He remained politically active and, among many other causes, was a vocal critic of US interventions in Latin America during the 1980s, the reaction to the September 2001 attacks, and the invasion of Iraq. Chomsky has opposed, since his early youth, the capitalist economic model and supported the Occupy movement of the early 2010s. He has also been an unwavering advocate of intellectual freedom and freedom of speech, a position that has at times pitted him against other left-leaning intellectuals and caused him to defend the rights of others who have very different views from his own. Despite the speculations of many biographers, Chomsky has always denied any connection between his work in language and politics, sometimes quipping that someone was allowed to have more than one interest.

In 1949, Chomsky married the linguist Carol Doris Chomsky (née Schatz), a childhood friend from Philadelphia. They had three children and remained married until her death in 2008. In 2014, Chomsky married Valeria Wasserman, a Brazilian professional translator.

2. Philosophy of Linguistics

Chomsky’s approach to linguistic science, indeed his entire vision of what the subject matter of the discipline consists of, is a sharp departure from the attitudes prevalent in the mid-20th century. To simplify, prior to Chomsky, language was studied as a type of communicative behavior, an approach that is still widespread among those who do not accept Chomsky’s ideas. In contrast, his focus is on language as a type of (often unconscious) knowledge. The study of language has, as Chomsky states, three aspects: determining what the system of knowledge a language user has consists of, how that knowledge is acquired, and how that knowledge is used. A number of points in Chomsky’s approach are of interest to the philosophy of linguistics and to the philosophy of science more generally, and some of these points are discussed below.

a. Behaviorism and Linguistics

When Chomsky was first entering academia in the 1950s, the mainstream school of linguistics for several decades had been what is known as structuralism. The structuralist approach, endorsed by Chomsky’s mentor Zellig Harris, among others, concentrated on analyzing corpora, or records of the actual use of a language, either spoken or written. The goal of the analysis was to identify patterns in the data that might be studied to yield, among other things, the grammatical rules of the language in question. Reflecting this focus on language as it is used, structuralists viewed language as a social phenomenon, a communicative tool shared by groups of speakers. Structuralist linguistics might well be described as consisting of the study of what happens between a speaker’s mouth and a listener’s ear; as one well-known structuralist put it, “the linguist deals only with the speech signal” (Bloomfield, 1933: 32). This is in marked contrast to Chomsky and his followers, who concentrate on what is going on in the mind of a speaker and who look there to identify grammatical rules.

Structuralist linguistics was itself symptomatic of behaviorism, a paradigm dominant at midcentury, championed in psychology by B.F. Skinner and in philosophy by W.V.O. Quine. Behaviorism held that science should restrict itself to observable phenomena. In psychology, this meant seeking explanations entirely in terms of external behavior without discussing minds, which are, by their very nature, unobservable. Language was to be studied in terms of subjects’ responses to stimuli and their resulting verbal output. Behaviorist theories were often formed on the basis of laboratory experiments in which animals’ behavior was shaped by conditioning, using food rewards or electric shocks. It was thought that human behavior could be similarly explained in terms of conditioning that shapes reactions to specific stimuli. This approach perhaps reached its zenith with the publication of Skinner’s Verbal Behavior (1957), which sought to reduce human language to conditioned responses. According to Skinner, speakers are conditioned as children, through training by adults, to respond to stimuli with an appropriate verbal response. For example, a child might realize that if they see a piece of candy (the stimulus) and respond by saying “candy”, they might be rewarded by adults with the desired sweet, reinforcing that particular response. For an adult speaker, the pattern of stimulus and response could be very complex, and what specific aspect of a situation is being responded to might be difficult to ascertain, but the underlying principle was held to be the same.

Chomsky’s scathing 1959 review of Verbal Behavior has actually become far better known than the original book. Although Chomsky conceded to Skinner that the only data available for the study of language consisted of what people say, he denied that meaningful explanations were to be found at that level. He argued that in order to explain a complex behavior, such as language use, exhibited by a complex organism such as a human being, it is necessary to inquire into the internal organization of the organism and how it processes information. In other words, it was necessary to make inferences about the language user’s mind. Elsewhere, Chomsky likened the procedure of studying language to what engineers would do if confronted with a hypothetical “black box”, a mysterious machine whose input and output were available for inspection but whose internal functioning was hidden. Merely detecting patterns in the output would not be accepted as real understanding; instead, that would come from inferring what internal processes might be at work.

Chomsky particularly criticized Skinner’s theory that utterances could be classified as responses to subtle properties of an object or event. The observation that human languages seem to exhibit stimulus-freedom goes back at least to Descartes in the 17th century, and about the same time as Chomsky was reviewing Skinner, the linguist Charles Hockett (later one of Chomsky’s most determined critics) suggested that this is one of the features that distinguish human languages from most examples of animal communication. For instance, a vervet monkey will give a distinct alarm call any time she spots an eagle and at no other times. In contrast, a human being might say anything or nothing in response to any given stimulus. Viewing a painting, one might say “Dutch”, “clashes with the wallpaper”, “tilted”, “hanging too low”, “beautiful”, “hideous”, “remember our camping trip last summer?”, “or whatever else might come to our minds when looking at a picture” (Chomsky, 1959: 2). What aspect of an object, event, or environment triggers a particular response rather than another can only be explained in mental terms. The most relevant fact is what the speaker is thinking about, so a true explanation must take internal psychology into account.

Chomsky’s observation concerning speech was part of his more general criticism of the behaviorist approach. Chomsky held that attempts to explain behavior in terms of stimuli and responses “will be in general a vain pursuit. In all but the most elementary cases, what a person does depends in large measure on what he knows, believes, and anticipates” (Chomsky, 2006: xv). This was also meant to apply to the behaviorist and empiricist philosophy exemplified by Quine. Although Quine has remained important in other aspects of analytic philosophy, such as logic and ontology, his behaviorism is largely forgotten. Chomsky is widely regarded as having inaugurated the era of cognitive science as it is practiced today, that is, as a study of the mental.

b. The Galilean Method

Chomsky’s fundamental approach to doing science was and remains different from that of many other linguists, not only in his concentration on mentalistic explanation. One approach to studying any phenomenon, including language, is to amass a large amount of data, look for patterns, and then formulate theories to explain those patterns. This method, which might seem like the obvious approach to doing any type of science, was favored by structuralist linguists, who valued the study of extensive catalogs of actual speech in the world’s languages. The goal of the structuralists was to provide descriptions of a language at various levels, starting with the analysis of pronunciation and eventually building up to a grammar for the language that would be an adequate description of the regularities identifiable in the data.

In contrast, Chomsky’s method is to concentrate not on a comprehensive analysis but rather on certain crucial data, or data that is better explained by his theory than by its rivals. This sort of methodology is often called “Galilean”, since it takes as its model the work of Galileo and Newton. These physicists, judiciously, did not attempt to identify the laws of motion by recording and studying the trajectories of as many moving objects as possible. In the normal course of events, the exact paths traced by objects in motion are the results of the complex interactions of numerous phenomena such as air resistance, surface friction, human interference, and so on. As a result, it is difficult to clearly isolate the phenomena of interest. Instead, the early physicists concentrated on certain key cases, such as the behavior of masses in free fall or even idealized fictions such as objects gliding over frictionless planes, in order to identify the principles that, in turn, could explain the wider data. For similar reasons, Chomsky doubts that the study of actual speech—what he calls performance—will yield theoretically important insights. In a widely cited passage (Chomsky, 1962: 531), he noted that:

Actual discourse consists of interrupted fragments, false starts, lapses, slurring, and other phenomena that can only be understood as distortions of an underlying idealized pattern.

Like the ordinary movements of objects observable in nature, which Galileo largely ignored, actual speech performance is likely to be the product of a mass of interacting factors, such as the social conventions governing the speech exchange, the urgency of the message and the time available, the psychological states of the speakers (excited, panicked, drunk), and so on, of which purely linguistic phenomena will form only a small part. It is the idealized patterns concealed by these effects and the mental system that generates those patterns—the underlying competence possessed by language users—that Chomsky regards as the proper subject of linguistic study. (Although the terms competence and performance have been superseded by the I-Language/E-Language distinction, discussed in 4.c. below, these labels are fairly entrenched and still widely used.)

c. The Nature of the Evidence

Early in his career (1965), Chomsky specified three levels of adequacy that a theory of language should satisfy, and this has remained a feature of his work. The first level is observational, to determine what sentences are grammatically acceptable in a language. The second is descriptive, to provide an account of what the speaker of the language knows, and the third is explanatory, to give an explanation of how such knowledge is acquired. Only the observational level can be attained by studying what speakers actually say, which cannot provide much insight into what they know about language, much less how they came to have that knowledge. A source of information about the second and third levels, perhaps surprisingly, is what speakers do not say, and this has been a focus of Chomsky’s program. This negative data is drawn from the judgments of native speakers about what they feel they can’t say in their language. This is not, of course, in the sense of being unable to produce these strings of words or of being unable, with a little effort, to understand the intended message, but simply a gut feeling that “you can’t say that”. Chomsky himself calls these interpretable but unsayable sentences “perfectly fine thoughts”, while the philosopher Georges Rey gave them the pithier name “WhyNots”. Consider the following examples from Rey 2022 (the “*” symbol is used by linguists to mark a string that is ill-formed in that it violates some principle of grammar):

(1) * Who did John and kiss Mary? (Compared to John, and who kissed Mary? and who-initial questions like Who did John kiss?)

(2) * Who did stories about terrify Mary? (Compared to stories about who terrified Mary?)

Or the following question/answer pairs:

(3) Which cheese did you recommend without tasting it? * I recommended the brie without tasting it. (Compared to… without tasting it.)

(4) Have you any wool? * Yes, I have any wool.

An introductory linguistics textbook provides two further examples (O’Grady et al. 2005):

(5) * I went to movie. (Compared to I went to school.)

(6) * Mary ate a cookie, and then Johnnie ate some cake, too. (Compared to Mary ate a cookie, and then Johnnie ate a cookie too/ate a snack too.)

The vast majority of English speakers would balk at these sentences, although they would generally find it difficult to say precisely what the issue is (the textbook challenges the reader to try to explain). Analogous “WhyNot” sentences exist in every language yet studied.

What Chomsky holds to be significant about this fact is that almost no one, aside from those who are well read in linguistics or philosophy of language, has ever been exposed to (1)–(6) or any sentences like them. Analysis of corpora shows that sentences constructed along these lines virtually never occur, even in the speech of young children. This makes it very difficult to accept the explanation, favored by behaviorists, that we recognize them to be unacceptable as the result of training and conditioning. Since children do not produce utterances like (1)–(6), parents never have a chance to explain what is wrong, to correct them, and to tell them that such sentences are not part of English. Further, since they are almost never spoken by anyone, it is vanishingly unlikely that a parent and child would overhear them so that the parent could point them out as ill-formed. Neither is this knowledge learned through formal instruction in school. Instruction in not saying sentences like (1)–(6) is not a part of any curriculum, and an English speaker who has never attended a day of school is as capable of recognizing the unacceptability of (1)–(6) as any college graduate.

Examples can be multiplied far beyond (1)–(6); there are indefinitely many strings of English words (or words of any language) that are comprehensible but unacceptable. If speakers are not trained to recognize them as ill-formed, how do they acquire this knowledge? Chomsky argues that this demonstrates that human beings possess an underlying competence capable of forming and identifying grammatical structures—words, phrases, clauses, and sentences—in a way that operates almost entirely outside of conscious awareness, computing over structural features of language that are not actually pronounced or written down but which are critical to the production and understanding of sentences. This competence and its acquisition are the proper subject matter for linguistic science, as Chomsky defines the field.

d. Linguistic Structures

An important part of Chomsky’s linguistic theory (although it is an idea that predates him by several decades and is also endorsed by some rival theories) is that it postulates structures that lie below the surface of language. The presence of such structures is supported by, among other evidence, considering cases of non-linear dependency between the words in a sentence, that is, cases where a word modifies another word that is some distance away in the linear order of the sentence as it is pronounced. For instance, in the sentence (from Berwick and Chomsky, 2017: 117):

(7) Instinctively, birds who fly swim.

we know that instinctively applies to swim rather than fly, indicating an unspoken connection that bypasses the three intervening words and which the language faculty of our mind somehow detects when parsing the sentence. Chomsky’s hypothesis of a dedicated language faculty—a part of the mind existing for the sole purpose of forming and interpreting linguistic structures, operating in isolation from other mental systems—is supported by the fact that nonlinguistic knowledge does not seem to be relied on to arrive at the correct interpretation of sentences such as (7). Try replacing swim with play chess. Although you know that birds instinctively fly and do not play chess, your language faculty provides the intended meaning without any difficulty. Chomsky’s theory would suggest that this is because that faculty parses the underlying structure of the sentence rather than relying on your knowledge about birds.

According to Chomsky, the dependence of human languages on these structures can also be observed in the way that certain types of sentences are produced from more basic ones. He frequently discusses the formation of questions from declarative sentences. For instance, any English speaker understands that the question form of (8) is (9), and not (10) (Chomsky, 1986: 45):

(8) The man who is here is tall.

(9) Is the man who is here tall?

(10) * Is the man who here is tall?

What rule does a child learning English have to grasp to know this? To a Martian linguist unfamiliar with the way that human languages work, a reasonable initial guess might be to move the fourth word of the sentence to the front, which is obviously incorrect. To see this, change (8) to:

(11) The man who was here yesterday was tall.

A more sophisticated hypothesis might be to move the second auxiliary verb in the sentence, is in the case of (8), to the front. But this is also not correct, as more complicated cases show:

(12) The woman who is in charge of deciding who is hired is ready to see him now.          

(13) * Is the woman who is in charge of deciding who hired is ready to see him now?

In fact, in no human language do transformations from one type of sentence to another require taking the linear order of words into account, although there is no obvious reason why they shouldn’t. A language that works on a principle such as switch the first and second words of a sentence to indicate a question is certainly imaginable and would seem simple to learn, but no language yet cataloged operates in such a way.

The correct rule in the cases of (8) through (13) is that the question is formed by moving the auxiliary verb (is) occurring in the verb phrase of the main clause of the sentence, not the one in the relative clause (a clause modifying a noun, such as who is here). Thus, knowing that (9) is the correct question form of (8) or that (13) is wrong requires sensitivity to the way that the elements of a sentence are grouped together into phrases and clauses. This is something that is not apparent on the surface of either the spoken or written forms of (8) or (12), yet a speaker with no formal instruction grasps it without difficulty. It is the study of these underlying structures and the way that the mind processes them that is the core concern of Chomskyan linguistics, rather than the analysis of the strings of words actually articulated by speakers.
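The contrast between the linear rule and the structure-dependent rule can be made concrete in a short sketch. Everything below is an invented illustration: the clause tags are hand-supplied stand-ins for the structure a speaker’s mind (or a real parser) would compute, and the function names are hypothetical.

```python
# Sentence (8), "The man who is here is tall", with each word hand-tagged as
# belonging to the main clause or to the relative clause "who is here".
SENTENCE = [
    ("the", "main"), ("man", "main"),
    ("who", "rel"), ("is", "rel"), ("here", "rel"),
    ("is", "main"), ("tall", "main"),
]

def front_first_aux(words):
    """The Martian's linear hypothesis: front the linearly first auxiliary."""
    words = list(words)
    i = next(k for k, (w, _) in enumerate(words) if w == "is")
    return [words.pop(i)] + words

def front_main_aux(words):
    """The structure-dependent rule: front the main clause's auxiliary."""
    words = list(words)
    i = next(k for k, (w, tag) in enumerate(words) if w == "is" and tag == "main")
    return [words.pop(i)] + words

def render(words):
    return " ".join(w for w, _ in words)

print(render(front_first_aux(SENTENCE)))  # yields the ill-formed (10)
print(render(front_main_aux(SENTENCE)))   # yields the correct (9)
```

The linear rule produces the unacceptable * Is the man who here is tall?, while the structure-sensitive rule produces the correct question, mirroring the argument in the text.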

3. The Development of Chomsky’s Linguistic Theory

Chomsky’s research program, which has grown to involve the work of many other linguists, is closely associated with generative linguistics. This name refers to the project of identifying sets of rules—grammars—that will generate all and only the sentences of a language. Although explicit rules eventually drop out of the picture, replaced by more abstract “principles”, the goal remains to identify a system that can produce the potentially infinite number of sentences of a human language using the resources contained in the mind of a speaker, which are necessarily finite.

Chomsky’s work has implications for the study of language as a whole, but his concentration has been on syntax, the branch of linguistic science concerned with the grammars that govern the production of acceptable sentences in a language and distinguish them from unacceptable strings of words. Syntax contrasts with semantics, the part of linguistics concerned with the meaning of words and sentences, and pragmatics, which studies the use of language in context.

Although the methodological principles have remained constant from the start, Chomsky’s theory has undergone major changes over the years, and various iterations may seem, at least on a first look, to have little obvious common ground. Critics present this as evidence that the program has been stumbling down one dead end after another, while Chomsky asserts in response that rapid evolution is characteristic of new fields of study and that changes in a program’s guiding theory are evidence of healthy intellectual progress. Five major stages of development might be identified, corresponding to the subsections below. Each stage, it has been alleged, builds on the previous ones; superseded iterations should be regarded not as false but as replaced by a more complete explanation.

a. Logical Constructivism

Chomsky’s theory of language began to be codified in the 1950s, first set down in a massive manuscript that was later published as The Logical Structure of Linguistic Theory (1975) and then partially in the much shorter and more widely read Syntactic Structures (1957). These books differed significantly from later iterations of Chomsky’s work in that they were more of an attempt to show what an adequate theory of natural language would need to look like than to fully work out such a theory. The focus was on demonstrating how a small set of rules could operate over a finite vocabulary to generate an infinite number of sentences, as opposed to identifying a psychologically realistic account of the processes actually occurring in the mind of a speaker.

Even before Chomsky, since at least the 1930s, the structure of a sentence was thought to consist of a series of phrases, such as noun phrases or verb phrases. In Chomsky’s early theory, two sorts of rules governed the generation of such structures. Basic structures were given by rewrite rules, procedures that indicate the more basic constituents of structural components. For example,

S → NP VP

indicates that a noun phrase, NP, followed directly by a verb phrase, VP, constitute a sentence, S. “NP → N” indicates that a noun may constitute a noun phrase. Eventually, the application of these rewrite rules stops when every constituent of a structure has been replaced by a lexical item, a word such as Albert or meows. Transformation rules alter those basic structures in various ways to produce structures corresponding to complex sentences. Importantly, certain transformation rules allowed recursion, a concept central to computer science and mathematical logic, by which a rule can be applied to its own output an unlimited number of times (for instance, in mathematics, one can start with 0 and apply the recursive function add 1 repeatedly to yield the natural numbers 0, 1, 2, 3, and so forth). The presence of recursive rules allows the embedding of structures within other structures, such as placing Albert meows under Leisa thinks to get Leisa thinks Albert meows. This could then be placed under Casey says that to produce Casey says that Leisa thinks Albert meows, and so on. Embedding can be done as many times as desired, so that recursive rules can produce sentences of any length and complexity, an important requirement for a theory of natural language. Recursion has not only remained central to subsequent iterations of Chomsky’s work but, more recently, has come to be seen as the defining characteristic of human languages.
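The interplay of rewrite rules and recursion can be sketched concretely. The grammar below is an invented toy modeled on the examples above, not any grammar Chomsky proposed; its recursive VP option is what permits unbounded embedding of the Leisa thinks Albert meows sort, and the depth cap is added only so that random generation halts.

```python
import random

# Toy rewrite rules in the spirit of "S -> NP VP". Keys are nonterminals;
# anything not listed as a key (Albert, meows, that, ...) is a terminal word.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V"], ["V", "that", "S"]],  # second option is recursive: S inside VP
    "N":  [["Albert"], ["Leisa"], ["Casey"]],
    "V":  [["meows"], ["thinks"], ["says"]],
}

def generate(symbol="S", depth=0, max_depth=3):
    """Expand a symbol by repeatedly applying rewrite rules until only
    terminal words remain. Returns the sentence as a list of words."""
    if symbol not in GRAMMAR:        # terminal word: nothing left to rewrite
        return [symbol]
    options = GRAMMAR[symbol]
    if depth >= max_depth:           # cap recursion so generation terminates
        options = [o for o in options if "S" not in o] or options
    words = []
    for part in random.choice(options):
        words.extend(generate(part, depth + 1, max_depth))
    return words

print(" ".join(generate()))  # e.g. "Casey says that Albert meows"
```

Raising `max_depth` allows arbitrarily deep embeddings, illustrating how a finite rule set generates an unbounded set of sentences.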

Chomsky’s interest in rules that could be represented as operations over symbols reflected influence from philosophers inclined towards formal methods, such as Goodman and Quine. This is a central feature of Chomsky’s work to the present day, even though subsequent developments have also taken psychological realism into account. Some of Chomsky’s most impactful research from his early career (late 50s and early 60s) was the invention of formal language theory, a branch of mathematics dealing with languages consisting of an alphabet of symbols from which strings could be formed in accordance with a formal grammar, a set of specific rules. The Chomsky Hierarchy provides a method of classifying formal languages according to the complexity of the strings that could be generated by the language’s grammar (Chomsky 1956). Chomsky was able to demonstrate that natural human languages could not be produced by the lowest level of grammar on the hierarchy, contrary to many linguistic theories popular at the time. Formal language theory and the Chomsky Hierarchy have continued to have applications both in linguistics and elsewhere, particularly in computer science.
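The point about levels of the hierarchy can be illustrated with a standard textbook example (offered here as an illustration, not something drawn from Chomsky’s own texts): the language of strings consisting of n a’s followed by exactly n b’s. A context-free grammar generates it easily, but no grammar at the lowest, finite-state level can, because recognizing it requires keeping an unbounded count, and a finite-state device has only finitely many states. Nested dependencies in natural-language sentences pose the same difficulty for finite-state descriptions.

```python
def is_anbn(s):
    """Recognize { a^n b^n : n >= 0 }: n a's followed by exactly n b's.

    This language is context-free but provably not regular: matching the
    count of b's to the count of a's requires unbounded memory.
    """
    half = len(s) // 2
    return (len(s) % 2 == 0
            and s[:half] == "a" * half
            and s[half:] == "b" * half)

print(is_anbn("aaabbb"))  # True: three a's, three b's
print(is_anbn("aabbb"))   # False: the counts do not match
```

The recognizer above uses arbitrary-length slicing, in effect an unbounded counter, which is exactly the resource a finite-state grammar lacks.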

b. The Standard Model

Chomsky’s 1965 landmark work, Aspects of the Theory of Syntax, which devoted much space to philosophical foundations, introduced what later became known as the “Standard Model”. While the theory itself was in many respects an extension of the ideas contained in Syntactic Structures, there was a shift in explanatory goals as Chomsky addressed what he calls “Plato’s Problem”, the mystery of how children can learn something as complex as the grammar of a natural language from the sparse evidence they are presented with. The sentences of a human language are infinite in number, and no child ever hears more than a tiny subset of them, yet they master the grammar that allows them to produce every sentence in their language. (“Plato’s Problem” is an allusion to Plato’s Meno, a discussion of similar puzzles surrounding geometry. Section 4.b provides a fuller discussion of the issue as well as more recent developments in Chomsky’s model of language acquisition.) This led Chomsky, inspired by early modern rationalist philosophers such as Descartes and Leibniz, to postulate innate mechanisms that would guide a child in this process. Every human child was held to be born with a mental system for language acquisition, operating largely subconsciously, preprogrammed to recognize the underlying structure of incoming linguistic signals, identify possible grammars that could generate those structures, and then to select the simplest such grammar. It was never fully worked out how, on this model, possible grammars were to be compared, and this early picture has subsequently been modified, but the idea of language acquisition as relying on innate knowledge remains at the heart of Chomsky’s work.

An important idea introduced in Aspects was the existence of two levels of linguistic structure: deep structure and surface structure. A deep structure contains structural information necessary for interpreting sentence meaning. Transformations on a deep structure—moving, deleting, and adding elements in accordance with the grammar of a language—yield a surface structure that determines the way that the sentence is pronounced. Chomsky explained (in a 1968 lecture) that,

If this approach is correct in general, then a person who knows a specific language has control of a grammar that generates the infinite set of potential deep structures, maps them onto associated surface structures, and determines the semantic and phonetic interpretations of these abstract objects (Chomsky, 2006: 46).

Note that, for Chomsky, the deep structure was a grammatical object that contains structural information related to meaning. This is very different from conceiving of a deep structure as a meaning itself, although a theory to that effect, generative semantics, was developed by some of Chomsky’s colleagues (initiating a debate acrimonious enough to sometimes be referred to as “the linguistic wars”). The names and exact roles of the two levels would evolve over time, and they were finally dropped altogether in the 1990s (although this is not always noticed, a matter that sometimes confuses the discussion of Chomsky’s theories).

Aspects was also notable for the introduction of the competence/performance distinction, or the distinction between the underlying mental systems that give a speaker mastery of her language (competence) and her actual use of the language (performance), which will seldom fully reflect that mastery. Although these terms have technically been superseded by E-language and I-language (see 4.c), they remain useful concepts in understanding Chomsky’s ideas, and the vocabulary is still frequently used.

c. The Extended Standard Model

Throughout the 1970s, a number of technical changes, aimed at simplification and consolidation, were made to the Standard Model set out in Aspects. These gradually led to what became known as the “Extended Standard Model”. The grammars of the Standard Model contained dozens of highly specific transformation rules that successively rearranged elements of a deep structure to produce a surface structure. Eventually, a much simpler and more empirically adequate theory was arrived at by postulating only a single operation that moved any element of a structure to any place in that structure. This operation, move α, was subject to many “constraints” that limited its applications and therefore restrained what could be generated. For instance, under certain conditions, parts of a structure form “islands” that block movement (as when who is blocked from moving from the conjunction in John and who had lunch? to give *Who did John and have lunch?). Importantly, the constraints seemed to be highly consistent across human languages.

Grammars were also simplified by cutting out information that seemed to be specified in the vocabulary of a language. For example, some verbs must be followed by nouns, while others must not. Compare I like coffee and She slept to * I like and * She slept a book. Knowing which of these strings are correct is part of knowing the words like and slept, and it seems that a speaker’s mind contains a sort of lexicon, or dictionary, that encodes this type of information for each word she knows. There is no need for a rule in the grammar to state that some verbs need an object and others do not, which would just be repeating information already in the lexicon. The properties of the lexical items are therefore said to “project” onto the grammar, constraining and shaping the structures available in a language. Projection remains a key aspect of the theory, so that lexicon and grammar are thought to be tightly integrated.

Chomsky has frequently described a language as a mapping from meaning to sound. Around the time of the Extended Standard Model, he introduced a schema whereby grammar forms a bridge between the Phonetic Form, or PF, the form of a sentence that would actually be pronounced, and the Logical Form, or LF, which contained the structural specification of a sentence necessary to determine meaning. To consider an example beloved by introductory logic teachers, Everyone loves someone might mean that each person loves some person (possibly a different person in each case), or it might mean that there is some one person that everyone loves. Although these two sentences have identical PFs, they have different LFs.

Linking the idea of LF and PF to that of deep structure and surface structure (now called D-structure and S-structure, and with somewhat altered roles) gives the “T-model” of language:

     D-structure
          |
   transformations
          |
PF – S-structure – LF

As the diagram indicates, the grammar generates the D-structure, which contains the basic structural relations of the sentence. The D-structure undergoes transformations to arrive at the S-structure, which differs from the PF in that it still contains unpronounced “traces” in places previously occupied by an element that was then moved elsewhere. The S-structure is then interpreted in two ways: phonetically as the PF and semantically as the LF. The PF is passed from the language system to the cognitive system responsible for producing actual speech. The LF, which is not a meaning itself but contains structural information needed for semantic interpretation, is passed to the cognitive system responsible for semantics. This idea of syntactic structures and transformations over those structures as mediating between meaning and physical expression has been further developed and simplified, but the basic concept remains an important part of Chomsky’s theories.

d. Principles and Parameters

In the 1980s, the Extended Standard Model would develop into what is perhaps the best-known iteration of Chomskyan linguistics, first referred to as “Government and Binding”, after Chomsky’s book Lectures on Government and Binding (1981). Chomsky developed these ideas further in Barriers (1986), and the theory took on the more intuitive name “Principles and Parameters”. The fundamental idea was quite simple. As with previous versions, human beings have in their minds a computational system that generates the syntactic structures linking meanings to sounds. According to Principles and Parameters Theory, all of these systems share certain fixed settings (principles) for their core components, explaining the deep commonalities that Chomsky and his followers see between human languages. Other elements (parameters) are flexible and have values that are set during the language learning process, reflecting the variations observable across different languages. An analogy can be made with an early computer of the sort that was programmed by setting the position of switches on a control panel: the core, unchanging circuitry of the computer is analogous to principles, the switches to parameters, and the program created by one of the possible arrangements of the switches to a language such as English, Japanese, or St’at’imcets (although this simple picture captures the essence of early Principles and Parameters, the details are a great deal more complicated, especially considering subsequent developments).

Principles are the core aspects of language, including the dependence on underlying structure and lexical projection, features that the theory predicts will be shared by all natural human languages. Parameters are aspects with binary settings that vary from language to language. Among the most widely discussed parameters, which might serve as convenient illustrations, are the Head and Pro-Drop parameters.

A head is the key element that gives a phrase its name, such as the noun in a noun phrase. The rest of the phrase is the complement. It can be observed that in English, the head comes before the complement, as in the noun phrase medicine for cats, where the noun medicine is before the complement for cats; in the verb phrase passed her the tea, the verb passed is first, and in the prepositional phrase in his pocket, the preposition in is first. But consider the following Japanese sentence (Cook and Newsom, 1996: 14):

(14) E wa kabe ni kakatte imasu
picture [subject marker] wall on hanging is
The picture is hanging on the wall.

Notice that the head of the verb phrase, the verb kakatte imasu, comes after its complement, kabe ni, and ni (on) is a postposition that occurs after its complement, kabe. English and Japanese thus represent different settings of a parameter, the Head (or Head Directionality) Parameter. Although this and other parameters are set during a child’s development by the language they hear around them, it seems that very little exposure is needed to fix the correct value. It is taken as evidence of this that mistakes with head positioning are vanishingly rare; English-speaking children almost never make mistakes like * The picture the wall on is at any point in their development.
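As a toy illustration (the function and its names are invented here and are not part of any linguistic formalism), the effect of a single binary head-directionality setting can be sketched in a few lines of Python:

```python
# Toy illustration of the Head Directionality Parameter: the same
# head/complement pair is linearized differently under a single
# binary setting. (Names invented for illustration only.)

def linearize(head, complement, head_initial=True):
    """Order a head and its complement according to the parameter setting."""
    return f"{head} {complement}" if head_initial else f"{complement} {head}"

# English-like setting: head before complement.
print(linearize("in", "his pocket", head_initial=True))   # in his pocket

# Japanese-like setting: head after complement.
print(linearize("ni", "kabe", head_initial=False))        # kabe ni
```

The same head/complement pairs come out in opposite orders under the two settings, mirroring the English and Japanese examples above.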

The Pro-Drop Parameter concerns the fact that certain languages can leave the pronoun subject of a sentence implicit, to be recovered from context. For instance, in Italian, a pro-drop language, the following sentences are permitted (Cook and Newson, 1996: 55).

(15) Sono il tricheco
be (1st-person-present) the walrus
I am the walrus.

 

(16) E’ pericoloso sporger-si
be (3rd-person-present) dangerous lean-out (reflexive)
It is dangerous to lean out. [a warning posted on trains]

On the other hand, the direct English translations * Am the walrus and * Is dangerous to lean out are ungrammatical, reflecting a different parameter setting, “non-pro-drop”, which requires an explicit subject for sentences.

A number of other, often complex, differences beyond whether subjects must be included in all sentences were thought to come from the settings of Pro-Drop and the way it interacts with other parameters. For example, it has been observed that many pro-drop languages allow the normal order of subjects and verbs to be inverted; Cade la notte is acceptable in Italian, unlike its direct translation in English, * falls the night. However, this feature is not universal among pro-drop languages, and it was theorized that whether it is present or not depends on the settings of other parameters.

Examples such as these reflect the general theme of Principles and Parameters, in which “rules” of the sort that had been postulated in Chomsky’s previous work are no longer needed. Instead of syntactic rules present in a speaker’s mental language faculty, the particular grammar of a language was hypothesized to be the result of the complex interaction of principles, the setting of parameters, and the projection properties of lexical items. As a relatively simple example, there is no need for an English-speaking child to learn a bundle of related rules such as noun first in a noun phrase, verb first in a verb phrase, and so on, or for a Japanese-speaking child to learn the opposite rules for each type of phrase; all of this is covered by the setting of the Head Parameter. As Chomsky (1995: 388) puts it,

A language is not, then, a system of rules but a set of specifications for parameters in an invariant system of principles of Universal Grammar. Languages have no rules at all in anything like the traditional sense.

This outlook represents an important shift in approach, one that is often not fully appreciated by philosophers and other non-specialists. Many scholars still assume that Chomsky and his followers regard languages as particular sets of rules internally represented by speakers, rather than as principles that are realized in the mind without being explicitly represented in the brain.

This outlook led many linguists, especially during the last two decades of the 20th century, to hope that the resemblances and differences between individual languages could be neatly explained by parameter settings. Language learning also seemed much less puzzling, since it was now thought to be a matter, not of learning complex sets of rules and constraints, but rather of setting each parameter, of which there were at one time believed to be about twenty, to the correct value for the local language, a process that has been compared to the children’s game of “twenty questions”. It was even speculated that a table could be established where languages could be arranged by their parameter settings, in analogy to the periodic table on which elements could be placed and their chemical properties predicted by their atomic structures.

Unfortunately, as the program developed, things did not prove so simple. Researchers failed to reach a consensus on what parameters there are, what values they can take, and how they interact, and there seemed to be vastly more of them than initially believed. Additionally, parameters often failed to have the explanatory power they were envisioned as having. For example, as discussed above, it was originally claimed that the Pro-Drop parameter explained a large number of differences between languages with opposite settings for that parameter. However, these predictions were made on the basis of an analysis of several related European languages and were not fully borne out when checked against a wider sample. Many linguists now see the parameters themselves as emerging from the interactions of “microparameters” that explain the differences between closely related dialects of the same language and which are often found in the properties of individual words projecting onto the syntax. There is ongoing debate as to the explanatory value of parameters as they were originally conceived.

During the Principles and Parameters era, Chomsky sharpened the notions of competence and performance into the dichotomy of I-language and E-language. An I-language is a state of the language system in the mind of an individual speaker, while an E-language, which corresponds to the common notion of a language, is a publicly shared system such as “English”, “French”, or “Swahili”. Chomsky was sharply critical of the study of E-languages, deriding them as poorly defined entities that play no role in the serious study of linguistics—a controversial attitude, as E-languages are what many linguists regard as precisely the subject matter of their discipline. This remains an important point in his work and will be discussed more fully in 4.c. below.

e. The Minimalist Program

From the beginning, critics have argued that the rule systems Chomsky postulated were too complex to be plausibly grasped by a child learning a language, even if important parts of this knowledge were innate. Initially, the replacement of rules by a limited number of parameters in the Principles and Parameters paradigm seemed to offer a solution: instead of an unwieldy set of rules, the child needed only to fix the settings of some parameters. But, while it was initially hoped that twenty or so parameters might be identified, the number has grown to the point where, although there is no exact consensus, it is too large to offer much hope of a simple explanation of language learning, and microparameters complicate the picture further.

The Minimalist Program was initiated in the mid-1990s partially to respond to such criticisms by continuing the trend towards simplicity that had begun with the Extended Standard Theory, with the goal of the greatest possible degree of elegance and parsimony. The minimalist approach is regarded by advocates not as a full theory of syntax but rather as a program of research working towards such a theory, building on the key features of Principles and Parameters.

In the Minimalist Program, syntactic structures corresponding to sentences are constructed using a single operation, Merge, that combines a head with a complement, for example, merging Albert with will meow to give Albert will meow. Importantly, Merge is recursive, so it can be applied over and over to give sentences of any length. For instance, the sentence just discussed can be merged with thinks to give thinks Albert will meow and then again with Leisa to form the sentence Leisa thinks Albert will meow. Instead of elements within a structure moving from one place to another, a structure merges with an element already inside of it, and the redundant copy is then deleted; a question can be formed from Albert will meow by first producing will Albert will meow and then deleting the lower will to yield will Albert meow? In order to prevent the production of ungrammatical strings, Merge must be constrained in various ways. The main constraints are postulated to be lexical, coming from the syntactic features of the words in a language. These features control which elements can be merged together, which cannot, and when merging is obligatory, for instance, to provide an object for a transitive verb.
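The recursive character of Merge can be sketched in code. This is a deliberate simplification that omits the labels, features, and constraints of actual minimalist analyses; it shows only the core point that the output of Merge can itself be merged again:

```python
# Minimal sketch of recursive Merge: each application combines two
# syntactic objects into a new binary structure, and because the output
# can itself be merged again, structures of unbounded depth can be built.

def merge(a, b):
    """Combine two syntactic objects into a single binary structure."""
    return (a, b)

vp = merge("will", "meow")                # ('will', 'meow')
s1 = merge("Albert", vp)                  # ('Albert', ('will', 'meow'))
s2 = merge("Leisa", merge("thinks", s1))  # the whole of s1 is embedded again
print(s2)
# ('Leisa', ('thinks', ('Albert', ('will', 'meow'))))
```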

During the Minimalist Program era, Chomsky has worked on a more specific model for the architecture of the language faculty, which he divides into the Faculty of Language, Broad (FLB) and the Faculty of Language, Narrow (FLN). The FLN is the syntactic computational system that had been the subject of Chomsky’s work from the beginning, now envisioned as using a single operation, that of recursive Merge. The FLB is postulated to include the FLN, but additionally the perceptual-articulatory system that handles the reception and production of physical messages (spoken or signed words and sentences) and the conceptual-intentional system that handles interpreting the meaning of those messages. In a schema similar to a flattened version of the T-model, the FLN forms a bridge between the other systems of the FLB. Incoming messages are given a structural form by the FLN that is passed to the conceptual-intentional system to be interpreted, and the reverse process allows thoughts to be articulated as speech. The different structural levels, D-structure and S-structure, of the T-model are eliminated in favor of maximal simplicity (the upside-down T is now just a flat line). The FLN is held to have a single level on which structures are derived through Merge, and two interfaces connected to the other parts of the FLB.

One important implication of this proposed architecture is the special role of recursion. The perceptual-articulatory system and conceptual-intentional system have clear analogs in other species, many of which can obviously sense and produce signals and, in at least some cases, seem to be able to link meanings to them. Chomsky argues that, in contrast, recursion is uniquely human and that no system of communication among non-human animals allows users to limitlessly combine elements to produce a potential infinity of messages. In many ways, Chomsky is just restating what had been an important part of his theory from the beginning, which is that human language is unique in being productive or capable of expressing an infinity of different meanings, an insight he credits to Descartes. This makes recursion the characteristic aspect of human language that sets it apart from anything else in the natural world, and a central part of what it is to be human.

The status of recursion in Chomsky’s theory has been challenged in various ways, sometimes with the claim that some human language has been observed to be non-recursive (discussed below, in 4.a). That recursion is a uniquely human ability has also been called into question by experiments in which monkeys and corvids were apparently trained in recursive tasks under laboratory conditions. On the other hand, it has also been suggested that if the recursive FLN really does not have any counterpart among non-human species, it is unclear how such a mechanism might have evolved. This last point is only the latest version of a long-running objection that Chomsky’s ideas are difficult to reconcile with the theory of evolution since he postulates uniquely human traits for which, it is argued by critics, there is no plausible evolutionary history. Chomsky counters that it is not unlikely that the FLN appeared as a single mutation, one that would be selected due to the usefulness of recursion for general thought outside of communication. Providing evolutionary details and exploring the relationship between the language faculty and the physical brain have formed a large part of Chomsky’s most recent work.

The central place of recursion in the Minimalist Program also brought about an interesting change in Chomsky’s thoughts on hypothetical extraterrestrial languages. During the 1980s, he speculated that alien languages would be unlearnable by human beings since they would not share the same principles as human languages. As such, one could be studied as a natural phenomenon in the way that humans study physics or biology, but it would be impossible for researchers to truly learn the language in the way that field linguists master newly encountered human languages. More recently, however, Chomsky hypothesized that since recursion is apparently the core, universal property of human language and any extraterrestrial language will almost certainly be recursive as well, alien languages may not be that different from our own, after all.

4. Language and Languages

As a linguist, Chomsky has always been concerned, of course, primarily with language. His study of this phenomenon eventually led him not only to formulate theories that were very much at odds with those held at one time by the majority of linguists and philosophers, but also to take a fundamentally different view of the very thing, language, that was being studied and theorized about. Chomsky’s views have been influential, but many of them remain controversial today. This section discusses some of Chomsky’s important ideas that will be of interest to philosophers, especially concerning the nature and acquisition of language, as well as meaning and analyticity, topics that are traditionally the central concerns of the philosophy of language.

a. Universal Grammar

Perhaps the single most salient feature of Chomsky’s theory is the idea of Universal Grammar (UG). This is the central aspect of language that he argues is shared by all human beings—a part of the organization of the mind. Since it is widely assumed that mental features correspond, at some level, to physical features of the brain, UG is ultimately a biological hypothesis: it would be part of the genetic inheritance that all humans are born with.

In terms of Principles and Parameters Theory, UG consists of the principles common to all languages and which will not change as the speaker matures. UG also consists of parameters, but the values of the parameters are not part of UG. Instead, parameters may change from their initial setting as a child grows up, based on the language she hears spoken around her. For instance, an English-speaking child will learn that every sentence must have a subject, setting her Pro-Drop parameter to a certain value, the opposite of the value it would take for a Spanish-speaking child. While the Pro-Drop parameter is a part of UG, this particular setting of the parameter is a part of English and other languages where the subject must be overtly included in the sentence. All of the differences between human languages are then differences in vocabulary and in the settings of parameters, but they are all organized around a common core given by UG.

Chomsky has frequently stated that the important aspects of human languages are set by UG. From a sufficiently detached viewpoint, for instance, that of a hypothetical Martian linguist, there would only be minor regional variations of a single language spoken by all human beings. Further, the variations between languages are predictable from the architecture of UG and can only occur within narrowly constrained limits set by that structure. This was a dramatic departure from the assumption, largely unquestioned until the mid-20th century, that languages can vary virtually without limit and in unpredictable ways. This part of Chomsky’s theory has remained controversial, with some authorities on crosslinguistic work, such as the British psycholinguist Stephen Levinson (2016), arguing that it discounts real and important differences among languages. Other linguists argue the exact contrary: that data from the study of languages worldwide backs Chomsky’s claims. Because the debate ultimately concerns invisible mental features of human beings and how they relate to unpronounced linguistic structures, the interpretation of the evidence is not straightforward, and both sides claim that the available empirical data supports their position.

The theory of UG is an important aspect of Chomsky’s ideas for many reasons, among which is that it clearly sets his theories apart as different from paradigms that had previously been dominant in linguistics. This is because UG is not a theory about behavior or how people use language, but instead about the internal composition of the human mind. Indeed, for Chomsky and others working within the framework of his ideas, language is not something that is spoken, signed, or written but instead exists inside of us. What many people think of as language —externalized acts of communication —are merely products of that internal mental faculty. This in turn has further implications for theories of language acquisition (see 4.b) and how different languages should be identified (4.c).

An important implication of UG is that it makes Chomsky’s theories empirically testable. A common criticism of his work is that because it abstracts away from the study of actual language use to seek underlying idealized patterns, no evidence can ever count against it. Instead, apparent counterexamples can always be dismissed as artifacts of performance rather than the competence that Chomsky was concerned with. If correct, this would be problematic since it is widely agreed that a good scientific theory should be testable in some way. However, this criticism is often based on misunderstandings. A linguist dismissing an apparent failure of the theory as due to performance would need to provide evidence that performance factors, rather than a flaw in the underlying theory of competence, really are responsible. Further, if a language were discovered to be organized around principles that contravened those of UG, then many of the core aspects of Chomsky’s theories would be falsified. Although candidate languages have been proposed, all of them are highly controversial, and none is anything close to universally accepted as falsifying UG.

In order to count as a counterexample to UG, a language must actually breach one of its principles; it is not enough that a principle merely not be displayed. As an example, one of the principles is what is known as structure dependence: when an element of a linguistic structure is moved to derive a different structure, that movement depends on the structure and its organization into phrases. For instance, to arrive at the correct question form of The cat who is on the desk is hungry, it is the is in the main clause, the one before hungry, that is moved to the front of the sentence, not the one in the relative clause (between who and on). However, in some languages, for instance Japanese, elements are not moved to form questions; instead, a question marker (ka) is added at the end of the sentence. This does not make Japanese a counterexample to the UG principle that movement is always structurally dependent. Japanese simply does not exercise this principle when forming questions, but neither is the principle violated. A counterexample to UG would be a language that moved elements but did so in a way that did not depend on structure, for instance, by always moving the third word to the front or inverting the word order to form a question.
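The contrast between a structure-dependent rule and a structure-blind alternative can be made concrete with a toy sketch. The nested list below is a hand-made stand-in for a genuine parse, and the function names are invented for illustration:

```python
# Toy contrast between a structure-dependent question rule (front the
# auxiliary of the MAIN clause) and a structure-blind one (front the
# linearly first auxiliary). The nested list is a hand-made stand-in
# for a real parse of "the cat who is on the desk is hungry".

sentence = ["the", "cat", ["who", "is", "on", "the", "desk"], "is", "hungry"]

def flatten(node):
    """Turn a nested bracketing back into a flat string of words."""
    return [w for x in node
            for w in (flatten(x) if isinstance(x, list) else [x])]

def structure_dependent_question(tree):
    """Front the auxiliary that is a top-level element of the clause,
    skipping any auxiliary buried inside an embedded relative clause."""
    i = tree.index("is")  # the relative clause is a sublist, so this finds the main-clause 'is'
    return ["Is"] + flatten(tree[:i] + tree[i + 1:])

def structure_blind_question(words):
    """Front the linearly first 'is', ignoring phrase structure."""
    i = words.index("is")
    return ["Is"] + words[:i] + words[i + 1:]

print(" ".join(structure_dependent_question(sentence)))
# Is the cat who is on the desk hungry        (grammatical)
print(" ".join(structure_blind_question(flatten(sentence))))
# Is the cat who on the desk is hungry        (ungrammatical)
```

Only the rule that consults the structure yields the grammatical question; the linear rule pulls the auxiliary out of the relative clause.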

A case that generated a great deal of recent controversy has been the claim that Pirahã, a language with a few hundred speakers in the Brazilian rain forest, lacks recursion (Everett 2005). This has been frequently presented as falsifying UG, since recursion is the most important principle, indeed the identifying feature, of human language, according to the Minimalist Program. This alleged counterexample received widespread and often incautious coverage in the popular press, at times being compared to the discovery of evidence that would disprove the theory of relativity.

This assertion that Pirahã has no recursion has itself been frequently challenged, and its status remains unclear. But there is also a lack of agreement on whether, if true, the claim would invalidate UG or whether it would just be a case similar to the one discussed above, the absence of movement in Japanese when forming questions, where a principle is simply not exercised. Proponents of Chomsky’s ideas counter that UG is a theory of mental organization and underlying competence, a competence that may or may not be put fully to use. The fact that the Pirahã are capable of learning Portuguese (the majority language in Brazil) shows that they have the same UG present in their minds as anyone else. Chomsky points out that there are numerous cases of human beings choosing not to exercise some sort of biological capacity that they have. Chomsky’s own example is that although humans are biologically capable of swimming, many would drown if placed in water. It has been suggested by sympathetic scholars that this example is not particularly felicitous, as swimming is not an instinctive behavior for humans, and a better example might be monks who are sworn to celibacy. Debate has continued concerning this case, with some still arguing that if a language without recursion would not be accepted as evidence against UG, it is difficult to imagine what could be.

b. Plato’s Problem and Language Acquisition

One of Chomsky’s major goals has always been to explain the way in which human children learn language. Since he sees language as a type of knowledge, it is important to understand how that knowledge is acquired. It is remarkable that children acquire something as complex as the grammar and vocabulary of a language at all, let alone with the speed and accuracy that they do, at an age when they cannot yet learn how to tie their shoes or do basic arithmetic. The mystery is deepened by the difficulty that adults, who are usually much better learners than small children, have with acquiring a second language.

Chomsky addressed this puzzle in Knowledge of Language (1986), where he called it “Plato’s Problem”. This name is a reference to Plato’s Meno, a dialog in which Socrates guides a young boy, without a formal education, into producing a fairly complex geometric proof, apparently from the child’s own mental resources. Considering the difficult question of where this apparent knowledge of geometry came from, Plato, speaking through Socrates, concludes that it must have been present in the child already, although dormant until the right conditions were presented for it to be awakened. Chomsky would endorse largely the same explanation for language acquisition. He also cites Leibniz and Descartes as holding similar views concerning important areas of knowledge.

Chomsky’s theories regarding language acquisition are largely motivated by what has become known as the “Poverty of the Stimulus Argument,” the observation that the information about their native language that children are exposed to seems inadequate to explain the linguistic knowledge that they arrive at. Children only ever hear a small subset of the sentences that they can produce or understand. Furthermore, the language that they do hear is often “corrupt” in some way, such as the incomplete sentences frequently used in casual exchanges. Yet on this basis, children somehow master the complex grammars of their native languages.

Chomsky pointed out that the Poverty of the Stimulus makes it difficult to maintain that language is learned through the same general-purpose learning mechanisms that allow a human being to learn about other aspects of the world. There are many other factors that he and his followers cite to underline this point. All developmentally normal children worldwide are able to speak their native languages at roughly the same age, despite vast differences in their cultural and material circumstances or the educational levels of their families. Indeed, language learning seems to be independent of the child’s own cognitive abilities, as children with high IQs do not learn the grammar of their language faster, on average, than others. There is a notable lack of explicit instruction; analyses of speech corpora show that adult correction of children’s grammar is rare, and it is usually ineffective when it does occur. Considering these factors together, it seems that the way in which human children acquire language requires an explanation in a way that learning, say, table manners or putting shoes on does not.

The solution to this puzzle is, according to Chomsky, that language is not learned through experience but is innate. Children are born with Universal Grammar already in them, so the principles of language are present from birth. What remains is “merely” learning the particularities of the child’s native language. Because language is a part of the human mind, a part that each human being is born with, a child learning her native language is just undergoing the process of shaping that part of her mind into a particular form. In terms of the Principles and Parameters Theory, language learning is setting the value of the parameters. Although subsequent research has shown that things are more complicated than the simple setting of switches, the basic picture remains a part of Chomsky’s theory. The core principles of UG remain unchanged as the child grows, while peripheral elements are more plastic and are shaped by the linguistic environment of the child.

Chomsky has sometimes put the innateness of language in very strong terms and has stated that it is misleading to call language acquisition “language learning”. The language system of the mind is a mental organ, and its development is similar, Chomsky argues, to the growth of bodily organs such as the heart or lungs, an automatic process that is complete at some point in a child’s development. The language system also stabilizes at a certain point, after which changes will be relatively minor, such as the addition of new words to a speaker’s vocabulary. Even many of those who are firm adherents to Chomsky’s theories regard such statements as incautious. It is sometimes pointed out that while the growth of organs does not require having any particular experiences, proper language development requires being exposed to language within a certain critical period in early childhood. This requirement is evidenced by tragic cases of severely neglected children who were denied the needed input and, as a result, never learned to speak with full proficiency.

It has also been pointed out that even the rationalist philosophers whom Chomsky frequently cites did not seem to view innate and learned as mutually exclusive. Leibniz (1704), for instance, stated that arithmetical knowledge is “in us” but still learned, drawn out by demonstration and testing on examples. It has been suggested that some such view is necessary to explain language acquisition. Since humans are not born able to speak in the way that, for example, a horse is able to run within hours of birth, some learning seems to be involved, but those sympathetic to Chomsky regard the Poverty of the Stimulus as ruling out simply acquiring language completely from external sources. According to this view, we are born with language inside of us, but the proper experiences are required to draw that knowledge out and make it available.

The idea of innate language is not universally accepted. The behaviorist theory that language learning is a result of social conditioning, or training, is no longer considered viable. But it is a widely held view that general statistical learning mechanisms, the same mechanisms by which a child learns about other aspects of the world and human society, are responsible for language learning, with only the most general features of language being innate. These sorts of theories tend to have the most traction in schools of linguistic thought that reject the idea of Universal Grammar, maintaining that no deep commonalities hold across human languages. On such a view, there is little about language that can be said to be shared by all humans and therefore innate, so language would have to be acquired by children in the same way as other local customs. Advocates of Chomsky’s views counter that such theories cannot be upheld given the complexity of grammar and the Poverty of the Stimulus, and that the very fact that language acquisition occurs given these considerations is evidence for Universal Grammar. The degree to which language is innate remains a highly contested issue in both philosophy and science.

Although the application of statistical learning mechanisms to machine learning programs, such as OpenAI’s ChatGPT, has proven incredibly successful, Chomsky points out that the architecture of such programs is very different from that of the human mind: “A child’s operating system is completely different from that of a machine learning program” (Chomsky, Roberts, and Watumull, 2023). This difference, Chomskyans maintain, precludes drawing conclusions about the use or acquisition of language by humans on the basis of studying these models.

c. I vs. E languages

Perhaps the way in which Chomsky’s theories differ most sharply from those of other linguists and philosophers is in his understanding of what language is and how a language is to be identified. Almost from the beginning, he has been careful to distinguish speaker performance from underlying linguistic competence, which is the target of his inquiry. During the 1980s, this methodological point would be further developed into the I-language/E-language distinction.

A common concept of what an individual language is, explicitly endorsed by philosophers such as David Lewis (1969), Michael Dummett (1986), and Michael Devitt (2022), is a system of conventions shared between speakers to allow coordination. On this view, language is a public entity used for communication. It is something like this that most linguists and philosophers of language have in mind when they talk about “English” or “Hindi”. Chomsky calls this concept of language E-language, where the “E” stands for external and extensional. What is meant by “extensional” is somewhat technical and will be discussed later in this subsection. “External” refers to the idea just discussed, where language is a public system that exists externally to any of its speakers. Chomsky points out that such a notion is inherently vague, and it is difficult to point to any criteria of identity that would allow one to draw firm boundaries that could be used to tell one such language apart from another. It has been observed that people living near border areas often cannot be neatly categorized as speaking one language or the other; Germans living near the Dutch border are comprehensible to the nearby Dutch but not to many Germans from the southern part of Germany. Based on the position of the border, we say that they are speaking “German” rather than “Dutch” or some other E-language, but a border is a political entity with negligible linguistic significance. Chomsky (1997: 7) also called attention to what he calls “semi-grammatical sentences”, such as the following string of words:

(17) *The child seems sleeping.

Although (17) is clearly ill-formed, most “English” speakers will be able to assign some meaning to it. Given these conflicting facts, there seems to be no answer to whether (17) or similar strings are part of “English”.

Based on considerations like those just mentioned, Chomsky derides E-languages as indistinct entities that are of no interest to linguistic science. The real concept of interest is that of an I-language, where the “I” refers to intensional and internal. “Intensional” is in opposition to “extensional”, and will be discussed in a moment. “Internal” means contained in the mind of some individual human being. Chomsky defines language as a computational system contained in an individual mind, one that produces syntactic structures that are passed to the mental systems responsible for articulation and interpretation. A particular state of such a system, shaped by the linguistic environment it is exposed to, constitutes an I-language. Because all I-languages contain Universal Grammar, they will all resemble each other in their core aspects, and because more peripheral parts of language are set by the input received, the I-languages of two members of the same linguistic community will resemble one another even more closely. For Chomsky, for whom the study of language is ultimately the study of the mind, it is the I-language that is the proper topic of concern for linguists. When Chomsky speaks of “English” or “Swahili”, this is to be understood as shorthand for a cluster of characteristics that are typically displayed by the I-languages of people in a particular linguistic community.

This rejection of external languages as worthy of study is closely related to another point where Chomsky goes against a widely held belief in the philosophy of language: he does not accept the common hypothesis that language is primarily a means of communication. The idea of external languages is largely motivated by the widespread theory that language is a means for interpersonal communication, something that evolved so that humans could come together, coordinate to solve problems, and share ideas. Chomsky responds that, in addition to communication, language serves many uses: speaking silently to oneself for mental clarity, aiding memorization, solving problems, planning, and conducting other activities that are entirely internal to the individual. There is no reason to emphasize one of these purposes over any other. Communication is one purpose of language, an important one, to be sure, but it is not the purpose.

Besides the internal/external dichotomy, there is the intensional/extensional distinction, referring to two different ways that sets might be specified. The extension of a set is the collection of elements in that set, while the intension is the rule or condition by which the set is defined and its members are distinguished from non-members. For instance, the set {1, 2, 3} has as its extension the numbers 1, 2, and 3. The intension of the same set might be the first three positive integers, or the square roots of 1, 4, and 9, or the first three divisors of 6; indeed, an infinite number of intensions might generate the same extension.
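
The many-intensions, one-extension point can be made concrete in a few lines of code. The following is an illustrative sketch (the numeric bounds are arbitrary choices for the example, not part of the definitions): three different rules, each a different intension, pick out exactly the same extension.

```python
# Three different intensional definitions (rules) that all pick out
# the same extension (set of members), namely {1, 2, 3}.

# Intension 1: the first three positive integers.
first_three_positives = {n for n in range(1, 100) if n <= 3}

# Intension 2: the square roots of 1, 4, and 9.
square_roots = {n for n in range(1, 100) if n * n in (1, 4, 9)}

# Intension 3: the first three divisors of 6 (its divisors are 1, 2, 3, 6).
divisors_of_six = sorted(n for n in range(1, 7) if 6 % n == 0)[:3]

# Despite the different rules, the extension is identical in each case.
assert first_three_positives == square_roots == set(divisors_of_six) == {1, 2, 3}
print(first_three_positives)  # {1, 2, 3}
```

Nothing about the resulting set records which rule produced it, which is precisely why an extensional specification loses information that an intensional one carries.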

Applying this concept to languages, a language might be defined extensionally in terms of the sentences of the language or intensionally in terms of the grammar that generates all of those sentences but no others. While Chomsky favors the second approach, he attributes the first to two virtually opposite traditions. Structuralist linguists, who place great value on studying corpora, and other linguists and philosophers who focus on the actual use of language define a language in terms of the sentences attested in corpora and those that fit similar patterns. A very different tradition consists of philosophers of language known as “Platonists”, exemplified by Jerrold Katz (1981, 1985) and Scott Soames (1984), former disciples of Chomsky. On this view, every possible language is a mathematical object, a set of possible sentences that really exist in the same abstract sense that sets of numbers do. Some of these sets happen to be the languages that humans speak.

Both of these extensional approaches are rejected by Chomsky, who maintains that language is an aspect of the human mind, so what is of interest is the organization of that part of the mind, the I-language. This is an intensional approach, since a particular I-language will constitute a grammar that will produce a certain set of sentences. Chomsky argues that both extensional approaches, the mathematical and the usage-based, are insufficiently focused on the mental to be of explanatory value. If a language is an abstract mathematical object, a set of sentences, it is unclear how humans are supposed to acquire knowledge of such a thing or to use it. The usage-based approach, as a theory of behavior, is insufficiently explanatory because any real explanation of how language is acquired and used must be in mental terms, which means looking at the organization of the underlying I-language.
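
The intensional route can be illustrated with a toy program. This is a schematic sketch, not a grammar anyone has proposed: the rule names and vocabulary are invented for the example. The point is that a finite rule system (the intension) determines a set of sentences (the extension), and a recursive rule lets that set grow without bound.

```python
# A toy grammar: a finite, intensional specification that generates
# a set of sentences. The PP rule is recursive, so without the depth
# bound the generated set would be infinite.
import itertools

RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "PP"]],
    "PP": [["near", "NP"]],          # recursion: PP contains NP contains PP ...
    "VP": [["sleeps"]],
    "N":  [["cat"], ["mat"]],
}

def generate(symbol, depth):
    """Yield every word sequence derivable from `symbol` within a depth bound."""
    if symbol not in RULES:
        yield [symbol]               # a terminal word
        return
    if depth == 0:
        return                       # prune unboundedly deep derivations
    for expansion in RULES[symbol]:
        parts = [list(generate(s, depth - 1)) for s in expansion]
        for combo in itertools.product(*parts):
            yield [word for seq in combo for word in seq]

sentences = {" ".join(s) for s in generate("S", 5)}
print(sorted(sentences))
```

With the bound raised, sentences like "the cat near the mat near the cat sleeps" appear as well, which is the sense in which a finite intensional object can determine an infinite extension.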

While many who study language accept the concept of the I-language and agree with its importance, Chomsky’s complete dismissal of E-languages as worthy of study has not been widely endorsed. E-languages, even if they are ultimately fictions, seem to be necessary ones for disciplines such as sociolinguistics or for the historical analysis of how languages have evolved over time. Further, having vague criteria of identity does not automatically disqualify a class of entities from being used in science. For example, the idea of species is open to many of the same criticisms concerning vagueness that Chomsky directs at E-languages, and its status as a real category has been debated, but the concept often plays a useful role in biology.

d. Meaning and Analyticity

It might be said that the main concern of the philosophy of language is the question of meaning. How is it that language corresponds to, and allows us to communicate about, states of affairs in the world or to describe possible states of affairs? A related question is whether there are such things as analytic truths, that is, sentences that are (as they were often traditionally characterized) necessarily true by virtue of meaning alone. It might seem like anyone who understands all the words in:

(18) If Albert is a cat, then Albert is an animal.

knows that it has to be true, just in virtue of knowing what it means. Appeals to such knowledge were frequently the basis for explaining our apparent a priori knowledge of logic and mathematics and for what came to be known as “analytic philosophy” in the 20th century. But the exact nature and scope of this sort of truth and knowledge are surprisingly hard to clarify, and many philosophers, notably Quine (1953) and Fodor (1998), argue that allegedly analytic statements are no different from any other belief that is widely held, such as:

(19) The world is more than a day old.

On this outlook, not only are apparently analytic truths open to revision just like any other belief, but the entire idea of determinate meanings becomes questionable.

As mentioned earlier, Chomsky’s focus has been not on meaning but instead on syntax, the grammatical rules that govern the production of well-formed sentences, considered largely independent of their meanings. Much of the critical data for his program has consisted of unacceptable sentences, the “WhyNots,” such as:

(20) * She’s as likely as he’s to get ill. (Rey 2022)

Sentences like (20), or (1)-(6) in 2.c above, are problematic, not because they have no meaning or have an anomalous meaning in some way, but because of often subtle issues under the surface concerning the syntactic structure of the sentence. Chomsky frequently argued that syntax is independent of meaning, and a theory of language should be able to explain the syntactic data without entering into questions of meaning. This idea, sometimes called “the autonomy of syntax”, is supported by, among other evidence, considering sentences such as:

(21) Colorless green ideas sleep furiously. (Chomsky 1965: 149)

which makes no sense if understood literally but is immediately recognizable as a grammatical sentence in English. Whether syntax is entirely independent of meaning and use has proven somewhat contentious, with some arguing that, on the contrary, questions of grammaticality cannot be separated from pragmatic and semantic issues. However, the distinction fits well with Chomsky’s conception of I-language, an internal computational device that produces syntactic structures that are then passed to other mental systems. These include the conceptual-intentional system responsible for assigning meaning to the structures, a system that interfaces with the language faculty but is not itself part of that faculty, strictly speaking.

Despite his focus on syntax, Chomsky does frequently discuss questions of meaning, at least from 1965 on. Chomsky regards the words (and other lexical items, such as prefixes and suffixes) that a speaker has stored in her lexicon as bundles of semantic, syntactic, and phonetic features, indicating information about meaning, grammatical role, and pronunciation. Some features that Chomsky classifies as syntactic may seem to be more related to meaning, such as being abstract. Folding these features into syntax seemed to be supported by the observation that, for example,

(22) * A very running person passed us.

is anomalous because very requires an abstract complement in such a context (a very interesting person is fine). In Aspects of the Theory of Syntax (1965), he also introduced the notion of “selectional rules” that identify sentences such as:

(23) Golf plays John (1965: 149)

as “deviant”. A particularly interesting example is:

(24) Both of John’s parents are married to aunts of mine. (1965: 77)

In 1965, (24) might have seemed to be analytically false, but in the 21st century, such a sentence may very well be true!

One popular theory of semantics is that the meaning of a sentence consists of its truth conditions, that is, the state of affairs that would make the sentence true. This idea, associated with the philosopher of language Donald Davidson (1967), might be said to be almost an orthodoxy in the study of semantics, and it certainly has an intuitive appeal. To know what The cat is on the mat means is to know that this sentence is true if and only if the cat is indeed on the mat. Starting in the late 1990s, Chomsky would challenge this picture of meaning as an oversimplification of the way that language works.

According to Chomsky’s view, also developed by Paul Pietroski (2005), among others, the sentences of a language do not, themselves, have truth conditions. Instead, sentences are tools that might be used, among other things, to make statements that have truth values relative to their context of use. Support for this position is drawn from the phenomenon of polysemy, where the same word might be used with different truth-conditional roles within a single sentence, such as in:

(25) The bank was destroyed by the fire and so moved across the street. (Chomsky 2000: 180)

where the word bank is used to refer to both a building and a financial institution. There is also open texture, a phenomenon by which the meaning of a word might be extended in multiple ways, many of which might have once been impossible to foresee (Waismann 1945). An oft-cited example is mother: in modern times, unlike in the past, it is possible that two women, the woman who produces the ovum and the woman who carries the fetus, may both be called mothers of the child. One might also consider the way that a computer, at one time a human being engaged in performing computations, was easily extended to cover electronic machines that are sometimes said to think, something that was also at one time reserved for humans.

Considering these phenomena, it seems that the traditional idea of words as having fixed “meanings” might be better replaced by the idea of words as “filters or lenses, providing ways of looking at things and thinking about the products of our minds” (Chomsky 2000: 36), or, as Pietroski (2005) puts it, as pointers in conceptual space. A speaker uses the structures made available by her I-language in order to arrange these “pointers” in such a way as to convey information, producing statements that might be assigned truth values given the context. But a speaker is hardly constrained to her I-language, which might be supplemented by resources such as gestures, common knowledge, shared cultural background, or sensitivity to the listener’s psychology and ability to fill in gaps. Consider a speaker nodding towards a picture of the Eiffel Tower and saying “been there”; to the right audience, under the right circumstances, this is a perfectly clear statement with a determinate truth value, even though the I-language, which produces structures corresponding to grammatical sentences, has been overridden in the interests of efficiency.

It has been suggested (Rey 2022) that this outlook on meaning offers a solution to the question of whether there are sentences that are analytically true and that are distinct from merely strongly held beliefs. Sentences such as If Albert is a cat, he is an animal may be analytic in the sense that, in the lexicon accessed by the I-language, [animal] is a feature of cat (as argued by Katz 1990). On the other hand, the I-language might be overruled in the face of future evidence, such as discovering that cats are really robots from another planet (as Putnam 1962 imagined). These two apparently opposing facts can be accommodated by the open texture of the word cat, which might come to be used in cases where it does not, at present, apply.

Chomsky, throughout his long career, seems to have frequently vacillated concerning the existence of analytic truths. Early on, as in Aspects (1965), he endorses analyticity, citing (24) and similar examples. At other times, he seems to echo Quine, at one point (1975) stating that the study of meaning cannot be dissociated from systems of belief. More recently (1997), he explicitly allows for analytic truths, arguing that necessary connections obtain between the concepts denoted by the lexicons of human languages. For example, “If John persuaded Bill to go to college, then Bill at some point decided or intended to go to college… this is a truth of meaning” (1997: 30). This is to say that it is an analytic truth based on a relation that obtains between the concepts persuade and intend. Ultimately, though, Chomsky regards analyticity as an empirical issue, not one to be settled by considering philosophical intuitions but rather through careful investigation of language acquisition, crosslinguistic comparison, and the relation of language to other cognitive systems, among other evidence. Currently, he holds that allowing for analytic truths based on relations between concepts seems more promising than alternative proposals, but this is an empirical question to be resolved through science.

Finally, mention should be made of the way that Chomsky connects considerations of meaning with “Plato’s Problem”, the question of how small children manage to do something as difficult as learning language. Chomsky notes that the acquisition of vocabulary poses this problem “in a very sharp form” (1997: 29). During the peak periods of language learning, children learn several words a day, often after hearing them a single time. Chomsky accounts for this rapid acquisition in the same way as the acquisition of grammar: what is being learned must already be in the child. The concepts themselves are innate, and what a child is doing is simply learning what sounds people in the local community use to label concepts she already possesses. Chomsky acknowledges that this idea has been criticized. Hilary Putnam (1988), for example, asks how evolution could have possibly had the foresight to equip humans with a concept such as carburetor. Chomsky’s response is simply that this certainly seems surprising, but that “the empirical facts appear to leave open a few other possibilities” (1997: 26). Conceptual relations, like those mentioned above between persuades and intends, or between chase and follow with the intent of staying on one’s path, are, Chomsky asserts, grasped by children on the basis of virtually no evidence. He concludes that this indicates that children approach language learning with an intuitive understanding of important concepts, such as intending, causing something to happen, having a goal, and so on.

Chomsky suggests a parallel to his theory of lexical acquisition in the Nobel Prize-winning work of the immunologist Niels Jerne. The number of antigens (substances that trigger the production of antibodies) in the world is so enormous, including man-made toxins, that it may seem absurd to propose that immune systems would have evolved to have an innate supply of specific antibodies. However, Jerne’s work upheld the theory that an animal could not be stimulated to make an antibody in response to a specific antigen unless it had already produced such an antibody before encountering the antigen. In fact, Jerne’s (1985) Nobel speech was entitled “The Generative Grammar of the Immune System”.

Chomsky’s theories of innate concepts fit with those of some philosophers, such as Jerry Fodor (1975). On the other hand, this approach has been challenged by other philosophers and by linguists such as Stephen Levinson and Nicholas Evans (2009), who argue that the concepts labeled by words in one language very seldom map neatly onto the vocabulary of another. This is sometimes true even of very basic terms, such as the English preposition “in”, which has no exact counterpart in, for example, Korean or Tzeltal, languages that instead have a range of words that more specifically identify the relation between the contained object and the container. This kind of evidence is understood by some linguists to cast doubt on the idea that childhood language acquisition is a matter of acquiring labels for preexisting universal concepts.

e. Kripkenstein and Rule Following

This subsection introduces the “Wittgensteinian Problem”, one of the most famous philosophical objections to Chomsky’s notion of an underlying linguistic competence. Chomsky himself stated that out of the various criticisms his theory had received over the years, “this seems to me to be the most interesting” (1986: 223). Inspired by Ludwig Wittgenstein’s cryptic statement that “no course of action could be determined by a rule, because every course of action could be made out to accord with the rule” (1953: §201), Saul Kripke (1982) developed a line of argument that entailed a deep skepticism about the nature of rule-following activities, including the use of language. Kripke is frequently regarded as having gone beyond what Wittgenstein might have intended, so his argument is often attributed to a composite figure, “Kripke’s Wittgenstein” or “Kripkenstein”. A full treatment of this fascinating, but lengthy and complex, argument is beyond the scope of this article (the interested reader might consult the article “Kripke’s Wittgenstein”). It can be summarized as asserting that, in a case where a person seems to be following a rule, there are no facts about her that determine which rule she is actually following. To take Kripke’s example, if someone seems to be adding numbers in accordance with the normal rules of addition but then gives a deviant answer, say 68 + 57 = 5, there is no way to establish that she was not actually performing an operation called quaddition instead, which is like addition except that it gives an answer of 5 for any equation involving numbers larger than 57. Kripke claims that any evidence, including her own introspection, that she was performing addition and made a bizarre mistake is equally compatible with the hypothesis that she was performing quaddition. Ultimately, he concludes, there is no way to settle such questions, even in theory; there is simply no fact of the matter about what rule is being followed.
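
Kripke's arithmetical example can be rendered as a short sketch (the function names are illustrative; the definition of quaddition follows Kripke's, which returns 5 whenever either argument reaches 57). The point the code makes is that any finite history of computations with small numbers is consistent with both functions.

```python
# Kripke's "quaddition" (quus): agrees with ordinary addition on every
# sum involving numbers below 57, but returns 5 otherwise.

def plus(x, y):
    return x + y

def quus(x, y):
    return x + y if x < 57 and y < 57 else 5

# A finite history of past computations, all with small numbers...
history = [(x, y) for x in range(57) for y in range(57)]

# ...on which the two functions are indistinguishable.
assert all(plus(x, y) == quus(x, y) for x, y in history)

# Only a new, larger case separates them; the history alone cannot
# settle which rule the speaker was following all along.
print(plus(68, 57), quus(68, 57))  # 125 5
```

The skeptical twist, of course, is that whatever finite evidence is added, a suitably gerrymandered rival rule can always be defined to agree with it, so the code illustrates the setup of the problem rather than a way out of it.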

The relevance of Kripke’s argument to Chomsky’s linguistic theory is that it directly confronts his notion of language as an internalized system of rules (or, in later iterations, a system of principles and parameters that gives rise to rules that are not themselves represented). According to Chomsky’s theory, a grammatical error is explained as a performance issue, for example, a mistake brought on by inattention or distraction that causes a deviation from the system of rules in the mind of the speaker. According to Kripke, calling this a deviation from those rules, rather than an indication that different rules (or no rules) are being followed, is like trying to decide the question of addition vs. quaddition. Kripke asserted that there is no fact of the matter in the linguistic case, either, any more than in the example of addition and quaddition. Therefore, “it would seem that the use of the ideas of rules and competence in linguistics needs serious reconsideration” (1982: 31).

An essential part of Chomsky’s response to Kripke’s criticism was that the question of what is going on inside a speaker is no different in principle than any other question investigated by the sciences. Given a language user, say Jones, “We then try… to construct a complete theory, the best one we can, of relevant aspects of how Jones is constructed” (1986: 237). Such a theory would involve specifying that Jones incorporates a particular language, consisting of fixed principles and the setting of parameters, and that he follows the rules that would emerge from the interactions of these factors. Any particular theory like this could be proven wrong (Chomsky notes, “This has frequently been the case”), and, therefore, such a theory is an empirically testable one that can be found to be correct or incorrect. That is, given a theory of the speaker’s underlying linguistic competence, whether she is making a mistake or the theory is wrong is “surely as ascertainable as any other fact about a complex system” (Rey 2020: 125). What would be required is an acceptable explanation of why a mistake was made. The issues here are very similar to those surrounding Chomsky’s adaptation of the “Galilean Method” (see 2.b above) and the testability, or lack thereof, of his theories in general (see 4.a).

5. Cognitive Science and Philosophy of Mind

Because Chomsky regards language as a part of the human mind, his work has inevitably overlapped with both cognitive science and philosophy of mind. Although Chomsky has not ventured far into general questions about mental architecture outside of the areas concerned with language, his impact has been enormous, especially concerning methodology. Prior to Chomsky, the dominant paradigm in both philosophy and cognitive science was behaviorism, the idea that only external behavior could be legitimately studied and that the mind was a scientifically dubious entity. In extreme cases, most notably Quine (1960), the mind was regarded as a fiction best dropped from serious philosophy. Chomsky began receiving widespread notice in the 1950s for challenging this orthodoxy, arguing that it was a totally inadequate framework for the study of language (see 2.a, above), and he is widely held to have dramatically altered the scientific landscape by reintroducing the mind as a legitimate object of study.

Chomsky has remained committed throughout his career to the view that the mind is an important target of inquiry. He cautions against what he calls “methodological dualism” (2000: 135), the view that the study of the human mind must somehow proceed differently than the study of other natural phenomena. Although Chomsky says that few contemporary philosophers or scientists would overtly admit to following such a principle, he suggests that in practice it is widespread.

Chomsky postulates that the human mind contains a language faculty, or module, a biological computer that operates largely independently of other mental systems to produce and parse linguistic structures. This theory is supported by the fact that we, as language users, apparently systematically perform highly complex operations, largely subconsciously, in order to derive appropriate structures that can be used to think and communicate our thoughts and to parse the incoming structures underlying messages from other language users. These activities point to the presence of a mental computational device that carries them out. This has been interpreted by some as strong evidence for the computational theory of mind, essentially the idea that the entire mind is a biological computer. Chomsky himself cautions against such a conclusion, stating that the extension from the language module to the whole mind is as yet unwarranted.

In his work over the last two decades, Chomsky has dealt more with questions of how the language faculty relates to the mind more broadly, as well as the physical brain, questions that he had previously not addressed extensively. Most recently, he proposed a scheme by which the language faculty, narrowly defined, or FLN, consists only of a computational device responsible for constructing syntactic structures. This device provides a bridge between the two other systems that constitute the language faculty more broadly, one of which is responsible for providing conceptual interpretations for the structures of the FLN, the other for physical expression and reception. Thus, while, in this view, the actual language faculty plays a narrow role, it is a critical one that allows the communication of concepts. The FLN itself works with a single operation, merge, which combines two elements. This operation is recursive, allowing elements to be merged repeatedly. He suggests that the FLN, which is the only part of the system unique to humans, evolved due to the usefulness of recursion not only for communication but also for planning, navigation, and other types of complex thought. Because the FLN is thought to have no analog among other species, recursion is theorized to be an important characteristic of human thought, which gives it its unique nature.
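
The recursive character of merge can be given a bare-bones illustration. This is a schematic sketch, not Chomsky's formal definition: representing a merged object as a plain pair is an assumption of the example, and real minimalist analyses involve labels, features, and movement that are omitted here.

```python
# Merge takes two syntactic objects and forms a new one. Because the
# output is itself a syntactic object, merge can reapply to its own
# output without bound; that reapplication is the recursion.

def merge(a, b):
    return (a, b)

# Building the phrase "read the book" bottom-up:
dp = merge("the", "book")   # first merge: the determiner phrase
vp = merge("read", dp)      # second merge: the DP is itself an input

print(vp)  # ('read', ('the', 'book'))
```

Even this toy version shows why a single combinatorial operation suffices for unbounded structure: nothing stops `vp` from being fed into a further merge, yielding ever deeper embeddings.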

While the FLN interfaces with other mental systems, passing syntactic structures between them, the system itself is postulated to carry out its operations in isolation. This follows from Chomsky’s view of syntax as largely autonomous from questions of meaning and also from the way that linguistic knowledge seems to be specialized and independent of our general knowledge about the world. For instance, we can recognize a sentence such as:

(26) On later engines, fully floating gudgeon pins are fitted (Cook and Newsom 1998: 83).

as well-formed, despite the fact that most readers will not know what it means. This concept of a specialized language faculty, which has been a constant in Chomsky’s work almost from the start, represents a substantive commitment to the “modularity of mind”, a thesis that the mind consists, at least in part, of specialized and autonomous systems. There is debate among cognitive scientists and in the philosophy of psychology regarding the degree to which this picture is accurate, as opposed to the idea that mental processes result from the interaction of general faculties, such as memory and perception, which are not domain-specific in the way of the hypothesized language faculty.

It should be emphasized that the language faculty Chomsky hypothesizes is mental; it is not a specific physical organ in the brain, as, for example, the hippocampus is. Asking where it is in the brain is something like asking where a certain program is in a computer; both emerge from the functioning of many physical processes that may be scattered in different locations throughout the entire physical device. At the same time, although Chomsky’s theory concerns mental systems and their operations, this is intended as a description, at a high level of abstraction, of computational processes instantiated in the physical brain. Opponents of Chomsky’s ideas frequently point out that there has been little progress in actually mapping these mental systems onto the brain. Chomsky acknowledges that “we do not really know how [language] is actually implemented in neural circuitry” (Berwick and Chomsky 2017: 157). However, he also holds that this is entirely unsurprising, given that neuroscience, like linguistics, is as yet in its infancy as a serious science. Even in much simpler cases, such as insect navigation, where researchers carry out experiments and genetic manipulations that cannot be performed on humans, “we still do not know in detail how that computation is implemented” (2017: 157).

In his most recent publications, Chomsky has worked towards unifying his theories of language and mind with neuroscience and theories of the physical brain. He has at times expressed pessimism about the possibility of fully unifying these fields, which would require explaining linguistic and psychological phenomena completely in terms of physical events and structures in the brain. While he holds that this may be possible at some point in the distant future, it may require a fundamental conceptual shift in neuroscience. He cautions that it is also possible that such a unification may never be completely possible. Chomsky points to Descartes’ discussion of the “creative” nature of human thought and language, which is the observation that in ordinary circumstances the use of these abilities is “innovative without bounds, appropriate to circumstances but not caused by them” (Chomsky 2014: 1), as well as our apparent possession of free will. Chomsky suggests that it is possible that such phenomena may be beyond our inherent cognitive limitations and impossible for us to ever fully understand.

6. References and Further Reading

a. Primary Sources

Chomsky has been a highly prolific author who has written dozens of books explaining and promoting his theories. Although almost all of them are of great interest to anyone interested in language and mind, including philosophers, they vary greatly in the degree to which they are accessible to non-specialists. The following is a short list of some of the relatively non-technical works of philosophical importance:

  • Chomsky, N. 1956. “Three Models for the Description of Language”. IRE Transactions on Information Theory 2(3): 113–124.
    • The earliest presentation of the Chomsky Hierarchy.
  • Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton and Company.
  • Chomsky, N. 1959. “A Review of B.F. Skinner’s Verbal Behavior”. Language 35(1): 26–58.
  • Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
    • While many of the exact proposals about syntax are dated, this contains what is still one of the best summaries of Chomsky’s ideas concerning language acquisition and the connections he sees between his program and the work of the early modern rationalist philosophers.
  • Chomsky, N. 1968. “Quine’s Empirical Assumptions”. Synthese 19(1/2): 53–68.
    • A critique of Quine’s philosophical objections.
  • Chomsky, N. 1975. The Logical Structure of Linguistic Theory. Berlin: Springer.
    • The earliest statement of Chomsky’s theory, now somewhat outdated, originally circulated as a typescript in 1956.
  • Chomsky, N. 1981. Lectures on Government and Binding. The Hague: Mouton.
  • Chomsky, N. 1986. Barriers. Cambridge, MA: MIT Press.
  • Chomsky, N. 1986. Knowledge of Language: Its Nature, Origin and Use. Westport, CT: Praeger.
    • Contains Chomsky’s response to “Kripkenstein”, as well as the first discussion of I-languages and E-languages.
  • Chomsky, N. 1988. Language and Problems of Knowledge: The Managua Lectures. Cambridge, MA: MIT Press.
    • A series of lectures for a popular audience that introduces Chomsky’s linguistic work.
  • Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
  • Chomsky, N. 1997. “Language and Problems of Knowledge”. Teorema 16(2): 5–33.
    • This is probably the best short introduction to Chomsky’s ideas on the nature and acquisition of language, especially the E-language/I-language distinction.
  • Chomsky, N. 2000. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
    • It is philosophically interesting in that it contains a significant discussion of Chomsky’s views on contemporary trends in the philosophy of language, particularly his rejection of “externalist” theories of meaning.
  • Hauser, M.; Chomsky, N.; Fitch, T. 2002. “The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?”. Science 298: 1569–1579.
    • A good summary of directions in generative linguistics, including proposals about the structure of the language faculty in terms of FLN/FLB.
  • Chomsky, N. 2006. Language and Mind. Cambridge: Cambridge University Press.
    • Also contains valuable historical context.
  • Chomsky, N. 2014. “Science, Mind and Limits of Understanding”. The Science and Faith Foundation, The Vatican. https://chomsky.info/201401/.
  • Berwick, R. and Chomsky, N. 2016. Why Only Us: Language and Evolution. Cambridge, MA: MIT Press.
    • It is valuable as a non-technical look at the current state of Chomsky’s theories as well as a discussion of the evolutionary development of language.
  • Keating, B. 2020. “Noam Chomsky: Is it Possible to Communicate with Extraterrestrials”. YouTube. https://www.youtube.com/watch?v=n7mvUw37g-U.
    • Chomsky discusses hypothetical extraterrestrial languages and the possibility of communicating with aliens.
  • Chomsky, N., Roberts, I., and Watumull, J. “Noam Chomsky: The False Promise of ChatGPT”. New York Times. March 8, 2023.

b. Secondary Sources

There is a vast secondary literature surrounding Chomsky that seeks to explain, develop, and often criticize his theories. The following is a small sampling of works interesting to non-specialists. After a list of sources that cover Chomsky’s work in general, sources that are relevant to more specific aspects are listed by the section of this article they were referenced in or apply to.

  • General: 
    • Cook, V. and Newson, M. 1996. Chomsky’s Universal Grammar: An Introduction. Malden, MA: Blackwell.
      • Very clear introduction to Chomsky’s theories and their importance to linguistic science. The first three chapters are especially valuable to non-specialists.
    • Rey, G. 2020. Representation of Language: Philosophical Issues in a Chomskyan Linguistics. Oxford: Oxford University Press.
      • A useful and thorough overview of the philosophical implications of Chomsky’s theories, particularly regarding the philosophy of science and the philosophy of mind, as well as a summary of the core linguistic theory.
    • Scholz, B., Pelletier, F., Pullum, G., and Nefdt, R. 2022. “Philosophy of Linguistics”. The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.).
      • This article is an excellent critical comparison of Chomsky’s theories on language and linguistic science with the major rival approaches.
  • Life:
    • Rai, M. 1995. Chomsky’s Politics. London: Verso.
    • Cohen, J., and Rogers, J. 1991. “Knowledge, Morality and Hope: The Social Thought of Noam Chomsky.” New Left Review. I/187: 5–27.
  • Philosophy of Linguistics:
    • Bloomfield, L. 1933. Language. New York: Holt, Rinehart, and Winston.
    • Hockett, C. 1960. “The Origin of Speech”. Scientific American 203: 88–111.
    • Quine, W. 1960. Word and Object. Cambridge, MA: MIT Press.
    • Skinner, B. 1957. Verbal Behavior. New York: Appleton-Century-Crofts.
  • The Development of Chomsky’s Linguistic Theory:
    • Baker, M. 2001. The Atoms of Language. New York: Basic Books
      • Easily readable presentation of Principles and Parameters Theory.
    • Harris, R. 2021. The Linguistics Wars. Oxford: Oxford University Press.
    • Liao, D., et al. 2022. “Recursive Sequence Generation in Crows”. Science Advances 8(44).
      • Summarizes recent challenges to Chomsky’s claim that recursion is uniquely human.
    • Tomalin, M. 2006. Linguistics and the Formal Sciences: The Origins of Generative Grammar. Cambridge, UK: Cambridge University Press.
      • Provides interesting historical background connecting Chomsky’s early work with contemporary developments in logic and mathematics.
  • Technical (Generative Grammar):
    • Lasnik, H. 1999. Minimalist Analysis. Malden, MA: Blackwell.
    • Lasnik, H. 2000. Syntactic Structures Revisited. Cambridge, MA: MIT Press.
    • Lasnik, H. and Uriagereka, J. 1988. A Course in GB Syntax. Cambridge, MA: MIT Press.
  • Criticisms of Universal Grammar:
    • Evans, N. and Levinson, S. 2009. “The Myth of Language Universals: Language Diversity and its Importance for Cognitive Science”. Behavioral and Brain Sciences 32(5): 429–492.
    • Levinson, S. 2016. “Language and Mind: Let’s Get the Issues Straight!”. Making Sense of Language (Blum, S., ed.). Oxford: Oxford University Press, pages 68–80.
  • Language and Languages (relevant to the debate over the I-language/E-language distinction):
    • Devitt, M. 2022. Overlooking Conventions: The Trouble with Linguistic Pragmatism. Oxford: Oxford University Press.
    • Dummett, M. 1986. “A Nice Derangement of Epitaphs: Some Comments on Davidson and Hacking”. Truth and Interpretation (Lepore, E., ed.). Oxford: Blackwell.
    • Katz, J. 1981. Language and Other Abstract Objects. Lanham, MD: Rowman and Littlefield.
    • Katz, J. 1985. The Philosophy of Linguistics. Oxford: Oxford University Press.
    • Lewis, D. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
    • Soames, S. 1984. “Linguistics and Psychology”. Linguistics and Philosophy 7: 155–179.
  • Meaning and Analyticity:
    • Davidson, D. 1967. “Truth and Meaning”. Synthese 17(3): 304–323.
    • Fodor, J. 1998. Concepts: Where Cognitive Science Went Wrong. Oxford: Oxford University Press.
    • Katz, J. 1990. The Metaphysics of Meaning. Oxford: Oxford University Press.
    • Pietroski, P. 2005. “Meaning Before Truth”. Contextualism in Philosophy: Knowledge, Meaning and Truth. Oxford: Oxford University Press.
    • Putnam, H. 1962. “It Ain’t Necessarily So”. Journal of Philosophy LIX: 658–671.
    • Quine, W. 1953. “Two Dogmas of Empiricism”. From a Logical Point of View. Cambridge, MA: Harvard University Press.
    • Rey, G. 2022. “The Analytic/Synthetic Distinction”. The Stanford Encyclopedia of Philosophy (Spring 2023 Edition), Edward N. Zalta & Uri Nodelman (eds.).
      • See especially the supplement specifically on Chomsky and analyticity.
    • Waismann, F. 1945. “Verifiability”. Proceedings of the Aristotelian Society 19.
  • Language Acquisition and the Theory of Innate Concepts:
    • Fodor, J. 1975. The Language of Thought. Scranton, PA: Crowell.
    • Jerne, N. 1985. “The Generative Grammar of the Immune System”. Science 229: 1057–1059.
    • Putnam, H. 1988. Representation and Reality. Cambridge, MA: MIT Press.
  • “Kripkenstein” and Rule-Following:
    • Kripke, S. 1982. Wittgenstein on Rules and Private Language. Cambridge, MA: Harvard University Press.
    • Wittgenstein, L. 1953. Philosophical Investigations (Anscombe, G. translator). Oxford: Blackwell.
  • On Pirahã:
    • Everett, D. 2005. “Cultural Constraints on Grammar and Cognition in Pirahã”. Current Anthropology 46(4): 621–646.
      • The original claim that a language without recursion had been identified, allegedly showing Universal Grammar to be false.
    • Hornstein, N. and Robinson, J. 2016. “100 Ways to Misrepresent Noam Chomsky”. Current Affairs.
      • Representative responses to Everett from those in Chomsky’s camp assert that even if his claims are correct, they would not represent a counterexample to Universal Grammar.
    • McWhorter, J. 2016. “The bonfire of Noam Chomsky: journalist Tom Wolfe targets the acclaimed linguist”. Vox.
      • Linguist John McWhorter provides a very understandable summary of the issues and assesses the often incautious way that the case has been handled in the popular press.
    • Nevins, A., Pesetsky, D., Rodrigues, C. 2009. “Pirahã Exceptionality: A Reassessment”. Language 85(2): 355 –404.
      • Technical article criticizing Everett’s assessment of Pirahã syntax.
  • Other:
    • Lakoff, G. 1971. “On Generative Semantics”. Semantics (Steinberg, D. and Jakobovits, L., eds.). Cambridge, UK: Cambridge University Press.
      • An important work critical of Chomsky’s “autonomy of syntax”.
  • Cognitive Science and Philosophy of Mind:
    • Rey, G. 1997. Contemporary Philosophy of Mind. Hoboken: Wiley-Blackwell.
      • Covers Chomsky’s contributions in this area, particularly regarding the downfall of behaviorism and the development of the computational theory of mind.

 

Author Information

Casey A. Enos
Email: cenos@georgiasouthern.edu
Georgia Southern University
U. S. A.

Humean Arguments from Evil against Theism

Arguments from evil are arguments against Theism, which is broadly construed as the view that there is a supremely powerful, knowledgeable, and good creator of the universe. Arguments from evil attempt to show that there is a problem with Theism. Some arguments from evil try to show that Theism is known to be false, while others try to show that Theism is known to be probably false, or unreasonable, or that there is strong evidence against it. Arguments from evil are part of the project of criticizing religions, and because religions offer comprehensive worldviews, arguments from evil are also part of the project of evaluating which comprehensive worldviews are true or false.

Humean arguments from evil take their argumentative strategy from Philo’s argument from evil in part XI of Hume’s Dialogues Concerning Natural Religion. Philo’s argumentative strategy is distinctive in that it is fundamentally explanatory in nature. Philo takes as his data for explanation the good and evil we know about. He asks which hypothesis about a creator best explains that data. He argues that the good and evil we know about is not best explained by Theism but some rival hypothesis to Theism. In this way, the good and evil we know about provides a reason for rejecting Theism.

This article surveys Humean arguments from evil. It begins by explaining Philo’s original argument from evil as well as some potential drawbacks of that argument. Then it turns to more fully explaining the distinctive features of Humean arguments from evil in comparison to other arguments from evil. It highlights three features in particular: they appeal to facts about good and evil, they are comparative, and they are abductive. The remainder of the article articulates a modern, prototypical Humean argument inspired by the work of Paul Draper. It explains the idea that the good and evil we know about is better explained by a rival to Theism called the “Hypothesis of Indifference,” roughly, the hypothesis that there is no creator who cares about the world one way or the other. It then shows how to strengthen Humean arguments from evil by providing additional support for the rival hypothesis to Theism. Finally, it examines four prominent objections to Humean arguments.

This article focuses on Humean arguments that try to show that Theism is known to be false, or probably false, or unreasonable to believe. These kinds of Humean arguments are ambitious, as they try to draw an overall conclusion about Theism itself. But there can also be more modest Humean arguments that try to show that some evidence favors a rival to Theism without necessarily drawing any overall conclusions about Theism itself. This article focuses on ambitious Humean arguments rather than these modest Humean arguments mostly because ambitious Humean arguments are the ones contemporary philosophers have focused on. But it is important to keep in mind that Humean arguments from evil—like arguments from evil more generally—come in different shapes and sizes and may have different strengths and weaknesses.

Table of Contents

  1. Philo’s Argument from Evil
  2. Distinctive Features of Humean Arguments
  3. Modern Humean Arguments
  4. Strengthening Humean Arguments
  5. Criticisms of Humean Arguments
    1. Objection 1: Limited Biological Roles
    2. Objection 2: Naturalism and Normativity
    3. Objection 3: God’s Obligations
    4. Objection 4: Skeptical Theism
  6. References and Further Reading

1. Philo’s Argument from Evil

Natural theology is the attempt to provide arguments for the existence of God by only appealing to natural facts—that is, facts that are not (purportedly) revealed or otherwise supernatural. Three of the traditional arguments for the existence of God—the ontological argument, the cosmological argument, and the teleological argument—belong to the project of natural theology. Conversely, natural atheology is the attempt to provide arguments against the existence of God by appealing to natural (non-supernatural, non-revealed) facts as well.

Hume’s Dialogues Concerning Natural Religion is a classic work of natural atheology. In the dialogue, the interlocutors assume that there is a creator (or creators) of the world; they advance arguments about the nature or character of this creator. Most of the dialogue—parts II-VIII—discusses design arguments for the existence of God whereas later parts—parts X-XI—discuss arguments from evil. In the dialogue, Philo offers a variety of critiques of prototypical theistic ideas. (Because it is controversial whether Philo speaks for Hume—and if so, where—this article attributes the reasoning to Philo.)

In part X, the interlocutors discuss what is called a “logical” or “incompatibility” argument from evil. They begin by describing various facts about good and evil they have observed. For instance, many people experience pleasure in life; but oftentimes they also experience great pain; the strong prey upon the weak; people use their imaginations not just for relaxation but to create new fears and anxieties; and so forth. They consider whether those facts are logically inconsistent with the existence of a creator with infinite power, wisdom, and goodness. Ultimately Philo does not think it would be reasonable to infer the existence of such a creator from those facts; but he does concede that they are logically consistent with the existence of such a creator (X.35; XI.4, 12). But Philo’s concession is not his last word on the subject.

In part XI, Philo constructs a different argument from evil. Philo begins by articulating additional claims about good and evil he takes himself to know. Most of these additional claims concern causes of suffering that seem to be unnecessary—for example, variations in weather cause suffering, yet seem to serve no purpose; pain teaches animals and people how to act, but it seems that pleasure would be just as effective at motivating people to act; and so forth. Given these claims, Philo considers what we can reasonably infer about the creator (or creators) of the universe. He considers four potential hypotheses:

  1. The creator(s) of the universe are supremely good.
  2. The creator(s) of the universe are supremely malicious.
  3. The creator(s) of the universe have some mixture of both goodness and malice.
  4. The creator(s) of the universe have neither goodness nor malice.

In evaluating these hypotheses, Philo uses a Humean principle of reasoning that “like effects have like causes.” In other words, the only kinds of features it is reasonable to infer from an effect to its cause(s) are features that would be similar between the two. (He uses this principle throughout the Dialogues; see also II.7, II.8, II.14, II.17, V.1, VI.1.) Using this principle, he argues that of these hypotheses the fourth is “by far the most probable” (XI.15). He rejects the first and the second because the causes of the universe would be too dissimilar to the universe itself. The world is mixed, containing both good and evil. Thus, one cannot infer that the cause of the world contains no evil—as the first hypothesis suggests—or contains no good—as the second hypothesis suggests. Those causes are too dissimilar. He also rejects the third hypothesis. He assumes that if the cause of the universe involved some mixture of goodness and malice, this would be because some of the creators of the universe were good and some were malicious. And, he assumes, the universe would then be like a battlefield between them. But the regularity of the world suggests the universe is not a battlefield between dueling creators. Having ruled out the first three hypotheses, Philo concludes that the fourth is the most probable. As Philo himself says of this hypothesis, using language that is graphic both now and then (XI.13):

The whole [of the universe] presents nothing but the idea of a blind nature, impregnated by a great vivifying principle, and pouring forth from her lap, without discernment or parental care, her maimed and abortive children.

Philo’s conclusion has both a weak and a strong interpretation. In the strong interpretation, Philo is concluding that we can reasonably believe something about the nature of the creator(s), namely, that they are indifferent. In a weak interpretation, Philo is concluding that of these four hypotheses, the fourth is the most probable—but it may not be sufficiently probable to reasonably believe. Either way, the most reasonable hypothesis is that the creator has neither goodness nor malice.

At first blush, it might not be obvious how Philo’s conclusion provides a reason for rejecting Theism. In fact, it might look like Philo is just concerned to undermine an argument from our knowledge of good and evil to Theism. And, one might point out, undermining an argument for a conclusion is not the same thing as providing a reason for rejecting that conclusion. To see how Philo’s conclusion provides a reason for rejecting Theism, notice two things. First, Philo is not merely claiming something purely negative, like that some argument for Theism fails. Rather, he is also claiming something positive, namely, that the fourth hypothesis—where the creator has neither goodness nor malice—is the most reasonable of the four considered, given our knowledge of good and evil. Second, that hypothesis is inconsistent with Theism, which maintains (at the very least) that God is supremely good. Since the most reasonable thing to believe, given that data, is inconsistent with Theism, then that data provides a reason for rejecting Theism. In this way, Philo is not simply undermining an argument for Theism; he is also providing a reason for rejecting Theism.

Philo’s specific argument has received a mixed reaction both historically and in the early 21st century. From a contemporary perspective, there are at least two drawbacks to Philo’s specific argument. First, Philo and his interlocutors assume that there is a creator (or creators) of the universe. Thus, they only consider hypotheses that imply that there is a creator (or creators) of the universe. But many contemporary naturalists and atheists do not assume that there is any creator at all. From a contemporary perspective, it would be better to consider a wider range of hypotheses, including some that do not imply that there is a creator. Second, when evaluating hypotheses, Philo relies on the Humean principle that “like effects have like causes.” But many contemporary philosophers reject such principles. Insofar as Philo’s reasoning assumes Hume’s own principles, it inherits the various problems philosophers have identified with them.

But even if Philo’s specific argument suffers from drawbacks, his argumentative strategy is both distinctive and significant. Thus, one might mount an argument that shares several of the distinctive features of his argumentative strategy without committing oneself to the specific details of Philo’s own argument. Toward the end of the 20th and beginning of the 21st century, Paul Draper did exactly that, constructing arguments against Theism that utilize Philo’s argumentative strategy while relying on a more modern epistemology. It is natural to call these arguments Humean arguments since their strategy originates in a dialogue written by Hume—even if modern defenses of them vary from Hume’s original epistemology. The next section describes in more detail several of the distinctive features of Philo’s argumentative strategy.

2. Distinctive Features of Humean Arguments

First, many arguments from evil focus exclusively on facts about evil. Some arguments focus on our inability to see reasons that would justify God’s permission of those evils (Martin (1978), Rowe (1979)). Other arguments focus on the horrific nature of such evils (Adams (1999)). By contrast, Humean arguments from evil focus on facts about both good and evil. The focus on both good and evil is appropriate and significant.

The focus on good and evil is appropriate because, if God exists, God cares about preventing evil but also bringing about what is good. The focus on good and evil is significant because it provides a richer set of data with which to reason about the existence of God. For it is conceivable that facts about evil provide some evidence against the existence of God, but facts about good provide even stronger evidence for the existence of God, thereby offsetting that evidence. Or, alternatively, it is conceivable that facts about evil provide little to no evidence against the existence of God, but facts about good and evil together provide strong evidence against the existence of God. By focusing on both good and evil, Humean arguments provide a richer set of data to reason about the moral character of a purported creator.

Second, Humean arguments compare Theism to some rival hypothesis that is inconsistent with Theism. Normally, the rival hypothesis is more specific than the denial of Theism. For instance, Philo’s argument considered rival hypotheses to Theism that are fairly specific. And we can distinguish between different Humean arguments on the basis of the different rival hypotheses they use.

There is an important advantage of using a specific rival hypothesis to Theism. The simplest rival to Theism is the denial of Theism. But consider all of the views that are inconsistent with Theism. That set includes various forms of naturalism, but also pantheism, panentheism, non-theistic idealisms, various forms of pagan religions, and perhaps others yet. So, the denial of Theism is logically equivalent to the disjunction of these various theories. But it is not at all obvious what a disjunction of these various theories will predict. By contrast, it is normally more obvious what a more specific, rival hypothesis to Theism predicts. Thus, by focusing on a more specific rival hypothesis to Theism, it is easier to compare Theism to that rival.

Third, Humean arguments are best understood abductively. They compare to what degree a specific rival to Theism better explains, or otherwise predicts, some data. Even Philo’s own argument could be understood abductively: the hypothesis that there is a supremely good creator does not explain the good and evil Philo observes because such a creator would be too dissimilar from the mixed good and evil he observes. To be clear, Humean arguments need not claim that the rival actually provides the best explanation of those facts. Rather, their claim is more modest, but with real bite: a rival to Theism does a better job of explaining some facts about good and evil.

Some Humean arguments may stop here with a comparison between Theism and a specific rival hypothesis. But many Humean arguments are more ambitious than that: they try to provide a reason for rejecting Theism. This feature of such Humean arguments deserves further clarification. Sometimes abductive reasoning is characterized as “inference to the best explanation.” In a specific inference to the best explanation, one infers that some hypothesis is true because it is part of the best explanation of some data. Such Humean arguments need not be understood as inference to the best explanation in this sense. Though it is not as catchy, some Humean arguments could be understood as “inference away from a worse explanation.” Some body of data gives us reason to reject Theism because some hypothesis other than Theism does a better job of explaining that data and that hypothesis is inconsistent with Theism. Notice that a specific rival to Theism can do a better job of explaining that data even if some other hypothesis does an even better job yet.

Lastly, Humean arguments are evidential arguments from evil, not logical arguments from evil. More specifically, Humean arguments do not claim that some known facts are logically inconsistent with Theism. Rather, they claim that some known facts are strong evidence against Theism. Logical arguments from evil have an important methodological feature. If some known fact is logically inconsistent with Theism, then it does not matter what evidence people muster for Theism—we already know that Theism is false. By contrast, evidential arguments may need to be evidentially shored up. Even if the arguments are successful in providing strong evidence against Theism, it may be that there is also strong evidence in favor of Theism as well. This difference between evidential arguments and logical arguments is relevant in section 4 which indicates how to strengthen Humean arguments.

3. Modern Humean Arguments

This section explains a modern, prototypical Humean argument. The author who has done the most to develop Humean arguments is Paul Draper, and the argument in this section is inspired by his work without being an interpretation of any specific argument he has given. Humean arguments compare Theism to some specific rival to Theism; and different Humean arguments may use different specific rivals to compare to Theism. Consequently, it is important to begin by clarifying what specific rival is used to generate Humean arguments.

This article uses the term Hypothesis of Indifference. The Hypothesis of Indifference is the claim that it is not the case that the nature or condition of life on earth is the result of a creator (or creators) who cares positively or negatively about that life. The Hypothesis of Indifference is a natural hypothesis to focus on for several reasons. First, it is inconsistent with Theism, but is more specific than just the denial of Theism. Second, it does not imply that there is a creator. Third, it is consistent with metaphysical naturalism, the view that there are no supernatural facts. These last two reasons are important to a modern audience—many people believe that there is no creator of the universe, and many philosophers accept metaphysical naturalism.

The central claim of this Humean argument is this: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does. This article refers to this claim as Central Claim. Central Claim does not claim that the Hypothesis of Indifference perfectly predicts the good and evil we know about. It does not even claim that the Hypothesis of Indifference is the best explanation of the good and evil we know about. Rather, it claims that in comparison to Theism, the Hypothesis of Indifference does a much better job of predicting the good and evil we know about.

The comparison in Central Claim is an antecedent comparison. That is, it compares what the Hypothesis of Indifference and Theism predict about good and evil independently of our actual knowledge of that good and evil. We are to set aside, or bracket, our actual knowledge of good and evil and ask to what degree each hypothesis—the Hypothesis of Indifference, Theism—predicts what we know.
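In Draper-inspired presentations, an antecedent comparison of this kind is often stated probabilistically. As a rough sketch (the symbols O, HI, and T are introduced here only for illustration, with O standing for a report of the good and evil we know about, HI for the Hypothesis of Indifference, and T for Theism), Central Claim becomes a comparison of antecedent epistemic probabilities:

```latex
% Central Claim as a comparison of antecedent probabilities:
% the data O are far less surprising given the Hypothesis of
% Indifference (HI) than given Theism (T).
\Pr(O \mid \mathit{HI}) \gg \Pr(O \mid T)
```

Here both probabilities are assessed while bracketing our actual knowledge of O, mirroring the bracketing just described.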

This procedure of antecedent comparison is not unique to Humean arguments. It is frequently used in the sciences. A classic example of the same procedure is the retrograde movement of Mars. Using the naked eye, Mars seems to move “backwards” through the sky. Some astronomers argued that the retrograde motion of Mars was better explained by heliocentrism than geocentrism. But in making their arguments, they first set aside what they already knew about the retrograde motion of Mars. Rather, they asked to what degree each hypothesis would predict the retrograde motion of Mars before considering whether Mars exhibits retrograde motion.

There are different strategies one might use to defend Central Claim. One strategy appeals to what is normally called our background knowledge. This is knowledge we already have “in the background.” Such knowledge is frequently relied upon when we are evaluating claims about evidence, prediction, explanation, and the like. For instance, suppose I hear a loud repeating shrieking noise from my kitchen. I will immediately take that as evidence that there is smoke in my kitchen and go to investigate. However, when I take that noise as evidence of smoke in my kitchen, I rely upon a huge range of knowledge that is in the background, such as: loud repeating shrieking noises do not happen at random; that noise is not caused by a person or pet; there is a smoke detector in my kitchen; smoke detectors are designed to emit loud noises in the presence of smoke; and so on. I rely on this background knowledge—implicitly or explicitly—when I take that noise as evidence of smoke in my kitchen. For instance, if I lacked all of that background knowledge, it is very unlikely I would immediately take that noise as evidence of smoke in my kitchen.

The background-knowledge strategy for defending Central Claim has four parts. First, one argues that our background knowledge supports certain kinds of predictions about good and evil. Second, one argues that those predictions are, to a certain degree, accurate. Third, one argues that the Hypothesis of Indifference does not interfere with or undermine those predictions. Finally, one argues that Theism does interfere with or undermine those predictions, producing more inaccurate predictions. The end result, then, is that the combination of the Hypothesis of Indifference with our background knowledge does a better job of predicting the data of good and evil than the combination of Theism with our background knowledge.
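Letting k stand for our background knowledge and O for the data of good and evil (notation introduced here only for illustration, not fixed by this article), the four parts of the strategy can be sketched as:

```latex
% A sketch of the four-part background-knowledge strategy,
% with k = background knowledge and O = the data of good and evil.
\begin{align*}
&\text{(1)--(2): } \Pr(O \mid k) \text{ is reasonably high (k yields fairly accurate predictions);}\\
&\text{(3): } \Pr(O \mid \mathit{HI} \wedge k) \approx \Pr(O \mid k) \quad \text{(the Hypothesis of Indifference does not interfere);}\\
&\text{(4): } \Pr(O \mid T \wedge k) \ll \Pr(O \mid k) \quad \text{(Theism undermines the predictions).}\\
&\text{Together: } \Pr(O \mid \mathit{HI} \wedge k) \gg \Pr(O \mid T \wedge k).
\end{align*}
```

The sketch makes explicit why the strategy needs all four parts: dropping either (3) or (4) would leave the final comparison unsupported.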

This strategy can be implemented in various ways. One way of implementing it appeals to our background knowledge of the biological role or function of pleasure and pain (Draper (1989)). Specifically, our background knowledge predicts that pleasure and pain will play certain adaptive roles or functions for organisms. And when we consider the pleasure and pain we know about, we find that it frequently plays those kinds of roles. For instance, warm sunlight on the skin is pleasant, but also releases an important vitamin (vitamin D); rotten food normally produces an unpleasant odor; extreme temperatures that are bad for the body are also painful to experience for extended durations; and so forth. So, our background knowledge makes certain predictions about the biological role or function of pleasure and pain, and those predictions are fairly accurate.

The Hypothesis of Indifference does not interfere with, or undermine, those predictions, as it does not imply the existence of a creator who has moral reasons for deviating from the biological role of pleasure and pain. By contrast, Theism does interfere with, and undermine, those predictions. For pleasure is good and pain is bad. Thus, given Theism, one might expect pleasure and pain to play moral or religious roles or functions. The exact nature of those moral or religious roles might be open to debate. But they might include things like the righteous receiving happiness or perhaps good people getting the pleasure they deserve. Similarly, given Theism, one might expect pain to not play certain biological roles if it does not simultaneously play moral or religious roles. For instance, given Theism, one might not expect organisms that are not moral agents to undergo intense physical pain (regardless of whether that pain serves a biological role). In this way, Theism may interfere with the fairly accurate predictions from our background knowledge. Thus, the combination of the Hypothesis of Indifference and our background knowledge does a better job of predicting some of our knowledge of good and evil—namely, the distribution of pleasure and pain—than the combination of Theism and our background knowledge.

A second strategy for defending Central Claim utilizes a thought experiment (compare Hume, Dialogues XI.4; Dougherty and Draper (2013); Morriston (2014)). Imagine two alien creatures who are of roughly human intelligence and skill. One of them accepts Theism, and the other accepts the Hypothesis of Indifference. But neither of them knows anything about the condition of life on earth. They first make predictions about the nature and quality of life on earth, then they learn about the accuracy of their predictions. One might argue that the alien who accepts the Hypothesis of Indifference will do a much better job predicting the good and evil on earth than the alien who accepts Theism. But as it goes for the aliens, so it goes for us: the Hypothesis of Indifference does a much better job of predicting the good and evil we know about than Theism does.

The alien who accepts Theism might be surprised as it learns about the actual good and evil of life on earth. For the alien’s acceptance of Theism gives it reason to expect a better overall balance of good and evil than we know about. By contrast, the alien who accepts the Hypothesis of Indifference might not be surprised by the good and evil that we know about because the Hypothesis of Indifference does not imply the existence of a creator with a moral reason for influencing the good and evil the earth has. So the alien’s acceptance of the Hypothesis of Indifference does not give it a reason for anticipating any particular distribution of good and evil. Thus, the alien accepting the Hypothesis of Indifference might not be surprised to discover the specific good and evil it does in fact know about.

Recall that Central Claim involves an antecedent comparison—it compares the degree to which two hypotheses predict some data antecedently to our actual knowledge of that data. This thought experiment models the idea of an antecedent comparison by having the aliens not actually know the relevant data of good and evil. Their ignorance of the good and evil models our “bracketing” of our own knowledge.

Having considered some defenses of Central Claim, we can now formulate some Humean arguments that use Central Claim as a premise. One Humean argument goes like this:

Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.

Therefore, the good and evil we know about is evidence favoring the Hypothesis of Indifference over Theism.

This argument is valid. But the inference of this argument is modest on two fronts. First, evidence comes in degrees, from weak evidence to overwhelming evidence. The conclusion of this argument merely states that the good and evil we know about is evidence favoring one hypothesis over another without specifying the strength of that evidence. Second, this conclusion is consistent with a wide range of views about what is reasonable for us to believe. The conclusion is consistent with views like: it is reasonable to believe Theism; it is reasonable to believe the Hypothesis of Indifference; it is not reasonable to believe or disbelieve either. To be sure, this argument still asserts Central Claim; and as we will see in section 5, a number of authors have objected to Central Claim and to arguments for it. But the conclusion drawn from Central Claim is quite modest. Perhaps for these reasons, defenders of Humean arguments from Philo to the present have tended to advance versions with more ambitious conclusions.
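The structure of this modest inference can be sketched in Bayesian terms. This reconstruction is standard for Draper-style arguments but is a gloss, not the article's own formulation. Write T for Theism, HI for the Hypothesis of Indifference, and E for the good and evil we know about; by Bayes' theorem, the posterior odds equal the prior odds multiplied by the likelihood ratio:

```latex
\frac{P(HI \mid E)}{P(T \mid E)} \;=\; \frac{P(HI)}{P(T)} \cdot \frac{P(E \mid HI)}{P(E \mid T)}
```

Central Claim says the likelihood ratio P(E | HI)/P(E | T) is much greater than 1, so learning E shifts the odds toward HI whatever the prior odds were. That is exactly the modest conclusion: E favors HI over T, with no verdict yet about which hypothesis is more probable overall.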

Consider the following simple Humean argument against Theism:

Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.

Therefore, Theism is probably false.

This argument does not draw a conclusion comparing Theism to some rival. Rather, it draws a conclusion about Theism itself. In this way it is more ambitious than the argument just considered. What makes this Humean argument a simple Humean argument is that it only has one premise—Central Claim. However, this argument is not valid, and there are several reasons for thinking it is not very strong. The next section explains what those reasons are and how to strengthen Humean arguments by adding additional premises to produce a better (and arguably valid) argument.

4. Strengthening Humean Arguments

Suppose that Central Claim is true. Then a rival hypothesis (Hypothesis of Indifference) to a hypothesis (Theism) does a much better job predicting some data (what we know about good and evil). However, that fact on its own might not make it reasonable to believe the rival hypothesis (Hypothesis of Indifference) or disbelieve the relevant hypothesis (Theism). For the rival hypothesis might have other problems such as being ad hoc or not predicting other data (compare Plantinga (1996)).

An analogy will be useful in explaining these points. Suppose I come home to find that one of the glass windows on the back door of my home has been broken. These facts are “data” that I want to explain. One hypothesis is that the kids next door were playing and accidentally broke the glass with a ball (Accident Hypothesis). A rival hypothesis is that a burglar broke into my home by breaking the glass (Burglar Hypothesis). Now, the Burglar Hypothesis better predicts the data. If a burglar is going to break into my home, an effective way to do so is to break the glass on the door and thereby unlock it. By contrast, the Accident Hypothesis does a worse job predicting the data. Even if the kids were playing, the ball might not hit my door. And even if the ball did hit the door, it might not hit the glass with enough force to break it. So, in this case, the rival hypothesis (Burglar Hypothesis) to a hypothesis (Accident Hypothesis) does a much better job predicting some data (the broken glass on my back door). Does it thereby follow that it is reasonable for me to believe the rival hypothesis (Burglar Hypothesis), or that it is unreasonable for me to believe the hypothesis (Accident Hypothesis)?

No, or at least, not yet. First, the Burglar Hypothesis is much less simple than the Accident Hypothesis. I already know that there are kids next door who like to play outside. I do not already know that there is a burglar who wants to break into my home. So the Burglar Hypothesis is committed to the existence of more things than I already know about. That makes the Burglar Hypothesis less ontologically simple. Second, the Burglar Hypothesis might do a worse job predicting other data that I know. Suppose, for instance, there is a baseball rolling around inside my home, and nothing has been stolen. The Accident Hypothesis does a much better job predicting this data than the Burglar Hypothesis. So even if the Burglar Hypothesis better predicts some data, on its own, that would not make it reasonable for me to believe the Burglar Hypothesis or to disbelieve the Accident Hypothesis.

Returning to Humean arguments, suppose Central Claim is true so that a rival to Theism, specifically the Hypothesis of Indifference, better predicts the good and evil we know about. It may not yet follow that it is reasonable to believe the Hypothesis of Indifference or disbelieve Theism. For it may be that the rival is much less simple than Theism. Or it may be that the rival to Theism does a much worse job predicting other data that we know about.

To strengthen Humean arguments, additional premises can be added (compare Dougherty and Draper (2013), Perrine and Wykstra (2014), Morriston (2014)). For instance, an additional premise might be Simplicity Claim: the Hypothesis of Indifference is just as simple, if not more so, than Theism. Another premise might be Not-Counterbalanced Claim: there is no body of data we know about that Theism does a much better job predicting than the Hypothesis of Indifference. The strengthened argument looks like this:

Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.

Simplicity Claim: the Hypothesis of Indifference is just as simple, if not more so, than Theism.

Not-Counterbalanced Claim: there is no body of data we know about that Theism does a much better job predicting than the Hypothesis of Indifference.

Therefore, Theism is false.

This argument is a stronger argument than the simple one-premise argument from the previous section. Arguably, it is valid. (Whether it is valid depends partly on the relationship between issues like simplicity and probability; but see Dougherty and Draper (2013: 69) for an argument that it is valid.)
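One way to see why the strengthened argument is arguably valid is probabilistic; the following sketch is in the spirit of Draper (1989) rather than a formulation given in the text. Suppose Simplicity Claim warrants taking the prior probability of the Hypothesis of Indifference to be at least that of Theism, Central Claim gives P(E | HI) ≫ P(E | T), and Not-Counterbalanced Claim ensures no other evidence reverses the comparison. Since the two hypotheses are incompatible, their posteriors sum to at most 1, so:

```latex
P(T \mid E) \;<\; P(HI \mid E) \;\le\; 1 - P(T \mid E)
\quad\Longrightarrow\quad
P(T \mid E) \;<\; \tfrac{1}{2}
```

That is, on the evidence considered, Theism comes out more probably false than true.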

Premises like Simplicity Claim and Not-Counterbalanced Claim are not always defended in discussion of arguments from evil. But they can be defended by pressing into service other work in the philosophy of religion. For instance, natural theologians try to provide evidence for the existence of God by appealing to facts we know about. Critics argue that such evidence does not support Theism or, perhaps, supports Theism only to a limited degree. These exchanges are relevant to evaluating Not-Counterbalanced Claim. To be sure, Humean arguments compare Theism to some rival. So other work in philosophy of religion might not straightforwardly apply if it does not consider a rival to Theism or considers a different rival than the one used in the relevant Humean argument.

These additional premises strengthen Humean arguments because Humean arguments are not logical or incompatibility arguments. That is, they do not claim that the good and evil we know about is logically inconsistent with Theism. Rather, they are abductive arguments. They claim that what we know about good and evil is evidence against Theism because some rival to Theism better predicts or explains it. But in evaluating how well a hypothesis explains some data, it is often important to also consider further facts about the hypothesis—such as how simple it is or whether it is otherwise known to be false or problematic.

Lastly, some might think that the relation between simple and strengthened Humean arguments is just a matter of whether we have considered some evidence against Theism or all the relevant evidence for and against Theism. After all, considering some of the evidence and considering all of it are two different tasks, and the first can be done without the second. However, the relation between simple and strengthened Humean arguments is a little more complex than that, for certain methodological reasons.

Each of the premises of a strengthened Humean argument involves a comparison of Theism with a specific rival to Theism. But the specific choice of the rival might make it easier to defend some of the comparisons while simultaneously making it harder to defend other comparisons. For instance, the Hypothesis of Indifference does not posit any entity that has the ability or desire to influence life on earth. Some defenders of Central Claim might use that feature to argue that the Hypothesis of Indifference has better predictive fit than Theism with regard to the good and evil we know about. But exactly because the Hypothesis of Indifference does not posit any entity that has the ability or desire to influence life on earth, it may have worse predictive fit when it comes to the fine-tuning of the universe, the existence of life at all, the existence of conscious organisms, the existence of moral agents, and other potential evidence. So picking the Hypothesis of Indifference might make it easier to defend some premises of a strengthened Humean argument (perhaps Central Claim) while also making it harder to defend other premises of a strengthened Humean argument (perhaps Not-Counterbalanced Claim).

As such, the relationship between a simple and strengthened Humean argument is more complex. It is not simply a matter of considering one potential pool of evidence and then considering a larger pool of evidence. Rather, the choice of a specific rival to Theism is relevant to an evaluation of both simple and strengthened Humean arguments. Some specific rivals might make it easier to defend a simple Humean argument while also making it harder to defend a strengthened Humean argument (or vice versa). Defenders of Humean arguments have to carefully choose a specific rival that balances simplicity and predictive strength to challenge Theism.

5. Criticisms of Humean Arguments

Like all philosophical arguments, Humean arguments have received their fair share of criticisms. This section describes a handful of criticisms and potential responses to those criticisms. These criticisms are all criticisms of Central Claim (or premises like it). Consequently, these objections could be lodged against simple Humean arguments and strengthened Humean arguments—as well as the “modest” Humean argument mentioned at the end of section 3. (For a discussion of historical responses to Hume’s writing on religion, see Pyle (2006: chapter 5).)

a. Objection 1: Limited Biological Roles

Some authors object to the biological role argument for Central Claim (Plantinga (1996), Dougherty and Draper (2013)). Consider the wide range of pleasure and pain we know about. For instance, I get pleasure out of reading a gripping novel, listening to a well-crafted musical album, or tasting the subtle flavors of a well-balanced curry. Likewise, consider the pain of self-sacrifice, the displeasure of a hard workout, or the frustration of seeing a coworker still fail to fill in standardized forms correctly. The objection is that these pleasures and pains do not seem to serve any biological roles.

Defenders of Humean arguments might respond in two ways. First, they might distinguish between the pleasure and pain of humans and of non-human animals. It might be that the pleasure and pain in non-human animals is much more likely to play a biological role than the pleasure and pain in humans. Thus, overall, pleasure and pain are more likely to play a biological role. Second, they might point out that Central Claim does not imply that the Hypothesis of Indifference does a good job explaining pleasure and pain. Rather, it implies that the Hypothesis of Indifference does a much better job than Theism. Thus, from the mere fact that some pleasures and pains do not seem to serve any biological roles it would not follow that Theism does a better job of predicting pleasure and pain than the Hypothesis of Indifference.

b. Objection 2: Naturalism and Normativity

Humean arguments maintain that what we know about good and evil is better predicted or explained by some rival to Theism than by Theism itself. On a simple understanding, what we know about good and evil includes claims like: it is bad that stray cats starve in the winter. However, some critics argue that the best explanation of the existence of good and evil is Theism itself. That is, they might argue that a purely naturalistic world, devoid of any supernatural reality, does a much worse job predicting the existence of good and evil than a claim like Theism. The argument here is abductive: there might not be any contradiction in claiming that the world is purely naturalistic and that there is good and evil. Nonetheless, a purely naturalistic hypothesis does a much worse job predicting or explaining good and evil than Theism. Thus, these critics argue, premises like Central Claim are false, since Theism does a much better job of explaining the existence of good and evil than naturalistic alternatives to Theism (see Lauinger (2014) for an example of this criticism).

Note that this objection only applies to certain kinds of Humean arguments. Specifically, it only applies to Humean arguments that implicitly or explicitly assume a rival to Theism that is a purely naturalistic hypothesis. However, not all rivals to Theism need be purely naturalistic hypotheses. For instance, some of the rivals that Philo considered are not purely naturalistic. Nonetheless, many contemporary authors do accept a purely naturalistic worldview and would compare that worldview with a Theistic one.

In response, defenders of Humean arguments might defend metaethical naturalism. According to metaethical naturalism, normative facts, including facts about good and evil, are natural facts. Defenders of Humean arguments might argue that given metaethical naturalism, a purely naturalistic worldview does predict, to a high degree, normative facts. Determining whether this response succeeds, though, would require a foray into complex issues in metaethics.

c. Objection 3: God’s Obligations

Many philosophers and ordinary people assume that if Theism is true, then God has certain obligations to us. For instance, God is obligated not to bring about evil for us for no reason at all. These obligations might be based in God’s nature or in some independent order. Either way, God is required to treat us in certain ways. The idea that if Theism is true, then God has certain obligations to us is a key idea in defending arguments from evil, including Humean arguments from evil. For instance, one of the defenses of Central Claim from above said that Theists might be surprised at the distribution of good and evil we know about. They might be surprised because they expect God to prevent that evil, since God has an obligation to prevent it and, being all-powerful, God could prevent it. In this way, defenses of Central Claim (and premises like it) may implicitly assume that if Theism is true, then God has certain obligations to us.

However, some philosophers reject the claim that God has certain obligations to us (Adams (2013), Murphy (2017)). On these views, God might have a justifying reason to prevent evils and harms to us; but God does not have requiring reasons of the sort generated by obligations. There are different arguments for these views, and they are normally quite complex. But the arguments normally articulate a conception of God on which God is not a moral agent in the same way an average human person is a moral agent. And if God is not required to prevent evils and harms to us, God is closer to Hume’s “indifferent creator.” Just as an indifferent creator may, if they so desire, improve the lives of humans and animals, so too God may, if God so desires, improve the lives of humans and animals. But neither God nor the indifferent creator must do so.

Defenders of Humean arguments may respond to these arguments by simply criticizing these conceptions of God. Defenders of Humean arguments might argue that those conceptions are false or subtly incoherent. Alternatively, they might argue that those conceptions of God make it more difficult to challenge premises like Not-Counterbalanced Claim. For if God only has justifying reasons for treating us in certain ways, there might be a wide range of potential ways God would allow the world to be. But if there is a wide range of potential ways God would allow the world to be, then Theism does not make very specific predictions about how the world is. In this way, critics of Humean arguments may make it easier to challenge a premise like Central Claim but at the cost of making it harder to challenge a premise like Not-Counterbalanced Claim.

d. Objection 4: Skeptical Theism

Perhaps some of the most persistent critics of Humean arguments are skeptical theists (van Inwagen (1991), Bergmann (2009), Perrine and Wykstra (2014), Perrine (2019)). While there are many forms of skeptical theism, a unifying idea is that even if God were to exist, we should be skeptical of our ability to predict what the universe is like—including what the universe is like regarding good and evil. Skeptical theists develop and apply these ideas to a wide range of arguments against Theism, including Humean arguments.

Skeptical theistic critiques of Humean arguments can be quite complex. Here the critiques are simplified into two parts that form a simple modus tollens structure. The first part is to argue that there are certain claims that we cannot reasonably disbelieve or otherwise reasonably rule out. (In other words, we should be skeptical of their truth.) The second part is to argue that if we are reasonable in believing Central Claim (or something like it), then it is reasonable for us to disbelieve those claims. Since it is not reasonable for us to disbelieve those claims, it follows that we are not reasonable in believing Central Claim (or something like it).

For the first part, consider a claim like this:

Limitation. God is unable to create a world with a better balance of good and evil without sacrificing other morally significant goods.

Skeptical theists argue that it is not reasonable for us to believe that Limitation is false; rather, we should be skeptical of its truth or falsity. One might argue that it is reasonable for us to believe that Limitation is false because it is hard for us to identify the relevant morally significant goods. But skeptical theists argue that this is a poor reason for disbelieving Limitation since God is likely to have created the world with many morally significant goods that are obscure to us. One might argue that it is reasonable for us to believe that Limitation is false because it is easy for us to imagine or conceive of a world in which it is false. But skeptical theists argue that this is a poor reason for disbelieving Limitation because conceivability is an unreliable guide to possibility when it comes to complex claims like Limitation. In general, skeptical theists argue that our grasp of the goods and evils there are, as well as how they are connected, is too poor for us to reasonably disbelieve something like Limitation. In this way, they are skeptical of our access to all of the reasons God might have that are relevant to the permission of evil.

The second part of the skeptical theist’s critique is that if it is not reasonable for us to believe Limitation is false, then it is not reasonable for us to believe Central Claim is true. This part of the skeptical theist’s critique may seem surprising. Central Claim is a comparison between two hypotheses. Limitation is not comparative. Nonetheless, skeptical theists think they are importantly related. To see how they might relate, an analogy might be useful.

Suppose Keith is a caring doctor. How likely is it that Keith will cut a patient with a scalpel? At first blush, it might seem that it is extremely unlikely. Caring doctors do not cut people with scalpels! But on second thought, it is natural to think that whether Keith will cut a patient with a scalpel depends upon the kinds of reasons Keith has. If Keith has no compelling medical reason to do so, then given that Keith is a caring doctor, it is extremely unlikely Keith will cut a patient with a scalpel. But if Keith does have a compelling reason—he is performing surgery or a biopsy, for instance—then even if Keith is a caring doctor, it is extremely likely he will cut a patient with a scalpel. Now suppose someone claims that Keith will not cut a patient with a scalpel. That person is committed to a further claim: that Keith lacks a compelling medical reason to cut the patient with a scalpel. After all, even a caring doctor will cut a patient with a scalpel if there is a compelling medical reason to do so.
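The point of the analogy can be put with the law of total probability (a formalisation of my own; the article itself stays informal). Write Cut for Keith cutting the patient, C for Keith being a caring doctor, and R for Keith having a compelling medical reason to cut:

```latex
P(\mathit{Cut} \mid C) \;=\; P(\mathit{Cut} \mid C, R)\,P(R \mid C) \;+\; P(\mathit{Cut} \mid C, \neg R)\,P(\neg R \mid C)
```

The first conditional probability is high and the second is low, so the overall estimate hinges on P(R | C). The skeptical theist's parallel claim is that we are in no position to estimate the analogue of that term for God, namely the probability that God has a morally significant reason to permit the evil we observe.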

So, reconsider:

Central Claim: the Hypothesis of Indifference does a much better job predicting the good and evil we know about than Theism does.

There are several arguments one can give for Central Claim. But most of them utilize a simple idea: if Theism is true, there is a God who has reason for preventing the suffering and evil we know about, but if the Hypothesis of Indifference is true, there is no creator with such reasons. But, skeptical theists claim, God might have reasons for permitting suffering and evil if by doing so God can achieve other morally significant goods. Thus, to claim that God would prevent the suffering and evil we know about assumes that God could create a world with a better balance of good and evil without sacrificing other morally significant goods. (Compare: to claim that Keith, the kindly doctor, would not cut a patient with a scalpel assumes that Keith lacks a compelling medical reason to cut the patient with a scalpel.) Thus, if it is reasonable for us to believe Central Claim, it must also be reasonable for us to disbelieve:

Limitation. God is unable to create a world with a better balance of good and evil without sacrificing other morally significant goods.

After all, God might create a world with this balance of good and evil if it were necessary for other morally significant goods. But at this point, the first part of the skeptical theistic critique is relevant. For the skeptical theist claims that it is not reasonable for us to disbelieve Limitation. To do that, we would have to have a better understanding of the relationship between goods and evils than we do. Since it is not reasonable for us to reject Limitation, it is not reasonable for us to accept Central Claim.

As indicated earlier, the skeptical theist’s critique is quite complex. Nonetheless, some defenders of Humean arguments think that the criticism fails because the reasons skeptical theists give for doubting Central Claim can be offset or cancelled out. The defenders of Humean arguments reason by parity here. Suppose that the skeptical theist is right and that, for all we know, God could not have created a better balance of good and evil without sacrificing other morally significant goods. And suppose that the skeptical theist is right that this gives us a reason for doubting Central Claim. Well, that skepticism cuts both ways. For all we know, God could have created a better balance of good and evil without sacrificing other morally significant goods. By parity, that gives us a reason for accepting Central Claim. Thus, the skepticism of skeptical theism gives us both a reason to doubt Central Claim and a reason for accepting Central Claim. These reasons offset or cancel each other out. But once we set aside these offsetting reasons, we are still left with strong reasons for accepting Central Claim—namely, the reasons given by the arguments of section II. So, the skeptical theist’s critique does not ultimately succeed.

6. References and Further Reading

  • Adams, Marilyn McCord. (1999). Horrendous Evils and the Goodness of God. Cornell University Press.
  • Develops and responds to an argument from evil based on horrendous evils.

  • Adams, Marilyn McCord. (2013). “Ignorance, Instrumentality, Compensation, and the Problem of Evil.” Sophia. 52: 7-26.
  • Argues that God does not have obligations to us to prevent evil.

  • Bergmann, Michael. (2009). “Skeptical Theism and the Problem of Evil.” In Thomas Flint and Michael Rea, eds., The Oxford Handbook of Philosophical Theology. Oxford University Press.
  • A general introduction to skeptical theism that also briefly criticizes Humean arguments.

  • Hume, David. (1779). Dialogues Concerning Natural Religion, Part XI.
  • The original presentation of a Humean argument.

  • Dougherty, Trent and Paul Draper. (2013). “Explanation and the Problem of Evil.” In Justin McBrayer and Daniel Howard-Snyder, eds., The Blackwell Companion to the Problem of Evil. Blackwell Publishing.
  • A debate on Humean arguments.

  • Draper, Paul. (1989). “Pain and Pleasure: An Evidential Problem for Theists.” Nous. 23: 331-350.
  • A classic modern presentation of a Humean argument.

  • Draper, Paul. (2013). “The Limitation of Pure Skeptical Theism.” Res Philosophica. 90.1: 97-111.
  • A defense of Humean arguments from skeptical theistic critiques.

  • Draper, Paul. (2017). “Evil and the God of Abraham, Anselm, and Murphy.” Religious Studies. 53: 564-72.
  • A defense of Humean arguments from the criticism that God lacks obligations to us.

  • Lauinger, William. (2014). “The Neutralization of Draper-Style Evidential Arguments from Evil.” Faith and Philosophy. 31.3: 303-324.
  • A critique of Humean arguments that good and evil better fit with Theism than naturalism.

  • Martin, Michael. (1978). “Is Evil Evidence Against the Existence of God?” Mind. 87.347: 429-432.
  • A brief argument that our inability to see God’s reasons for permitting suffering is evidence against Theism.

  • Morriston, Wes. (2014). “Skeptical Demonism: A Failed Response to a Humean Challenge.” In Trent Dougherty and Justin McBrayer, eds., Skeptical Theism. Oxford University Press.
  • A defense of a Humean argument from Skeptical Theism.

  • Murphy, Mark. (2017). God’s Own Ethics. Oxford: Oxford University Press.
  • A criticism of Humean arguments from the claim that God lacks obligations to us.

  • O’Connor, David. (2001). Hume on Religion. Routledge, chapter 9.
  • A modern discussion of Philo’s argument from evil that discusses the weak and strong interpretations.

  • Perrine, Timothy and Stephen Wykstra. (2014). “Skeptical Theism, Abductive Atheology, and Theory Versioning.” In Trent Dougherty and Justin McBrayer, eds., Skeptical Theism. Oxford University Press.
  • A skeptical theistic critique of Humean arguments, focusing on the methodology of the arguments.

  • Perrine, Timothy. (2019). “Skeptical Theism and Morriston’s Humean Argument from Evil.” Sophia. 58: 115-135.
  • A skeptical theistic critique of Humean arguments that defends them from the offsetting objection.

  • Pitson, Tony. (2008). “The Miseries of Life: Hume and the Problem of Evil.” Hume Studies. 34.1: 89-114.
  • A historical discussion of Hume’s views on the relation between the problem of evil and natural theology and atheology.

  • Plantinga, Alvin. (1996). “On Being Evidentially Challenged.” In Daniel Howard-Snyder, ed., The Evidential Argument From Evil. Bloomington, IN: Indiana University Press.
  • An argument that Humean arguments need to be strengthened to be cogent.

  • Pyle, Andrew. (2006). Hume’s Dialogues Concerning Natural Religion. Continuum.
  • A modern commentary on Hume’s Dialogue that provides a discussion of its historical place and reception.

  • Van Inwagen, Peter. (1991 [1996]). “The Problem of Evil, the Problem of Air, and the Problem of Silence.” Reprinted in Daniel Howard-Snyder, ed., The Evidential Argument From Evil. Bloomington, IN: Indiana University Press.
  • An earlier skeptical theistic critique of Humean arguments.


Author Information

Timothy Perrine
Email: tp654@scarletmail.rutgers.edu
Rutgers University
U. S. A.

The Metaphysics of Nothing

This article is about nothing. It is not the case that there is no thing that the article is about; nevertheless, the article does indeed explore the absence of referents as well as referring to absence. Nothing is said to have many extraordinary properties, but in predicating anything of nothingness we risk contradicting ourselves. In trying to avoid such misleading descriptions, nothingness could be theorised as ineffable, though that theorisation is itself an attempt to describe it. Maybe nothingness is dialetheic, or maybe there are no things that are dialetheic, since contradictions are infamous for leading to absurdity. Contradictions and nothingness can explode very quickly into infinity, giving us everything out of nothing. So, perhaps nothing is something after all.

This article considers different metaphysical and logical understandings of nothingness via an analysis of the presence/absence distinction, by considering nothing first as the presence of absence, second as the absence of presence, third as both a presence and an absence, and fourth as neither a presence nor an absence. In short, it analyses nothingness as a noun, a quantifier, a verb, and a place, and it postulates nothingness as a presence, an absence, both, and neither.

Table of Contents

  1. Introduction—Nothing and No-thing
  2. Nothing as Presence of Absence
  3. No-thing as Absence of Presence
    1. Eliminating Negation
    2. Eliminating True Negative Existentials
    3. Eliminating Referring Terms
    4. Eliminating Existentially Loaded Quantification
  4. Beyond the Binary—Both Presence and Absence
    1. Dialectical Becoming
    2. Dialetheic Nothing
  5. Beyond the Binary—Neither Presence nor Absence
    1. The Nothing Noths
    2. Absolute Nothing
  6. Conclusion
  7. References and Further Reading

1. Introduction—Nothing and No-thing

Consider the opening sentence:

“This article is about nothing.”

This has two readings:

(i) This article is about no-thing (in that there is no thing that this article is about).

(ii) This article is about Nothing (in that there is something that this article is about).

The first reading (i) is a quantificational reading about the (lack of) quantity of things that this article is about. ‘Quantificational’ comes from ‘quantifier’, where a quantifier is a quantity term that ranges over entities of a certain kind. In (i), the quantity is none, and the entities that there are none of are things. This reading is referred to throughout the article as ‘no-thing’ (hyphenated, rather than the ambiguous ‘nothing’) to highlight this absence of things. The second reading (ii) is a noun phrase about the identity of the thing that this article is about. This reading is referred to throughout the article as ‘Nothing’ (capitalised, again avoiding the ambiguous ‘nothing’) to highlight the presence of a thing. In going from (i) to (ii), we have made a noun out of a quantity (a process we can call ‘nounification’). We have given a name to the absence, Nothing, giving it a presence. Sometimes this presence is referred to as ‘nothingness’, but that locution is avoided here since usually the ‘-ness’ suffix in other contexts indicates a quality or way of being, rather than a being itself (compare the redness of a thing to red as a thing, for example), and as such ‘nothingness’ is reserved for describing the nothing-y state of the presence Nothing and the absence no-thing.
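The contrast between the two readings can be made vivid in standard first-order notation (a sketch of my own; the article does not rely on this formalism). Let a name this article and let About(a, x) say that a is about x:

```latex
\text{(i)}\;\; \neg \exists x\, \mathrm{About}(a, x)
\qquad
\text{(ii)}\;\; \exists x\, \bigl(x = \mathit{Nothing} \,\wedge\, \mathrm{About}(a, x)\bigr)
```

On (i) the quantifier does all the work and no term refers; on (ii) ‘Nothing’ functions as a singular term, and the nounification is explicit in the identity clause.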

It is important not to conflate these readings, and neither can be reduced to the other. To demonstrate their distinctness, consider that (i) and (ii) have different truth values: (ii) is true whilst (i) is false, for it is not the case that this article is not about anything (that is, it is not the case that there is no x such that this article is about x). Were (i) true, the article would be very short indeed (or even empty), bereft of a topic and perhaps bereft of meaning. I intend to do better than that. My intentional states are directed towards Nothing, hence the truth of (ii): there is indeed a topic of this article, and that topic—the subject, or even object, of it—is Nothing.

There has been much debate over whether it is legitimate to nounify the quantificational reading of no-thing. Those who are sceptical would say that the ambiguous ‘nothing’ is really not ambiguous at all and should only be understood as a (lack of) quantity, rather than a thing itself. They might further argue that it is just a slip of language that confuses us into taking Nothing to be a thing, and that some of the so-called paradoxes of nothingness arise from such illegitimate nounification, dissolving into mere linguistic confusions once the ambiguity is exposed. The dialogues between characters in Lewis Carroll’s Alice in Wonderland and Through the Looking Glass are often cited as exemplars of such slippage and confusion. For instance [with my own commentary in square brackets]:

“‘I see nobody [that is, no-body as a quantifier] on the road’, said Alice.

‘I only wish I had such eyes’, the King remarked in a fretful tone.

‘To be able to see Nobody! [that is, Nobody as a noun] And at that distance too! Why, it’s as much as I can do to see real people [that is, somebodyness, rather than nobodyness, as states], by this light!’” (1871 p234)

Here, the term under consideration is ‘nobody’, and the same treatment applies to this as ‘nothing’ (in that we can disambiguate ‘nobody’ into the quantificational no-body and nounified Nobody). Alice intended to convey that there were no-bodies (an absence of presence) in quantitative terms. But the King then nounifies the quantifier, moving to a presence of absence, and applauds Alice on her apparent capacity to see Nobody.

Making this shift from things to bodies is helpful because bodies are less abstract than things (presumably you are reading this article using your body, your family members have bodies, animals have bodies, and so you have an intuitive understanding of what a body is). Once we have determined what is going on with no-body and every-body, we can apply it to no-thing and every-thing. So, consider now ‘everybody’. When understood as a quantifier, every-body is taken to mean all the bodies in the relevant domain of quantification (where a domain of quantification can be understood as the selection of entities that our quantifier terms range over). Do all those bodies, together, create the referent of Everybody as a noun? In other words, does Everybody as a noun refer to all the bodies within the quantitative every-body? One of the mistakes made by the likes of the King is to treat the referent of the noun as itself an instance of the type of entity the quantifier term is quantifying over. This is clear with respect to bodies, as Everybody is not the right sort of entity to be a body itself. All those bodies, together, do not themselves constitute a body (unless your understanding of what a body is can accommodate such a conglomerate monster). Likewise, Nobody, when understood alongside its quantifier reading of no-body as a lack of bodies, is not itself a body (as, by definition, it has no bodies). So, the King, who is able to see only ‘real people’, makes a category mistake in taking Nobody to be, presumably, ‘unreal people’. Nobody, like Everybody, is quite simply not the right category of entity to instantiate or exemplify people-hood or bodyness, or to be a body itself.

The lesson we have learnt from considering ‘nobody’ is that nounifying the quantifier (no-body) does not create an entity (Nobody) of the kind that is being quantified over (bodies). So, returning to the more general terms ‘nothing’ and ‘everything’, are they the right kind of entities to be things themselves? Do Nothing and Everything, as nouns, refer to things, the same category of thing that their quantifier readings of no-thing and every-thing quantify over? The level of generality we are working with when talking of things makes it more difficult to diagnose what is going on in these cases (by comparison with Nobody and Everybody, for example).

To help, we can apply the lessons learnt from Alfred Tarski (1944): when talking of these entities as things, we are doing so within a higher order or level of language—a metalanguage—in order to avoid paradox. We can see how this works with the Liar Paradox. Consider the following sentence, call it ‘S’: ‘This sentence is false’. Suppose S is true: then what S says holds, and since S says of itself that it is false, S is false. Suppose instead that S is false: then what S says fails to hold, so it is not the case that S is false, and S is true. Either way we are driven into contradiction: S is true if and only if S is false. Tarski’s trick is to distinguish levels of language. S belongs to the first level or order of language—the object language—and any sentence about S, such as the sentence ‘S*’: ‘S is true’, belongs to a higher level or order of language—the metalanguage. By distinguishing the level of language that S is talking in when it says it ‘… is false’ from the level of language that S* is talking in when it says that S ‘is true’, we avoid the contradiction of having S be true and false at the same time within the same level. The truth and falsity appealed to in S are of the object language, and the truth and falsity appealed to in S* are of the metalanguage.
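
Tarski's hierarchy can be set out schematically. The level subscripts below are my own shorthand rather than Tarski's notation; the point is only that each truth (or falsity) predicate belongs to the level above the sentences it applies to:

```latex
% S* is a metalanguage (level 1) sentence about the object-language
% (level 0) sentence S:
S^{*}\colon\ \mathrm{True}_{1}(\ulcorner S \urcorner)

% The T-schema, stated in the metalanguage, for any sentence \varphi
% of the object language:
\mathrm{True}_{1}(\ulcorner \varphi \urcorner) \leftrightarrow \varphi

% The Liar sentence would need a falsity predicate applicable within
% level 0 to its own sentences; the hierarchy provides no such predicate.
```

On this picture the Liar is not refuted but disallowed: no level of language contains a truth predicate for itself.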

Applying Tarski’s trick to Nothing, perhaps Nothing cannot be considered a thing at the same level as the things it is not, just as Everything cannot be considered a thing at the same level as all the things it encapsulates. As quantifier terms, no-thing and every-thing quantify over things in the first level or order of the object language. As nouns, Nothing and Everything can only be considered things themselves in the higher level or order of the metalanguage, which speaks about the object language. The ‘things’ (or lack thereof) quantified over by every-thing and no-thing belong to the object language, whereas the type of ‘thing’ that Everything and Nothing are belongs to the metalanguage. This avoids Nothing being a thing of the same type as the things there are no-things of.

Finally, then, with such terminology and distinctions in hand, we are now in a position to understand the difference between the presence of an absence (Nothing, noun) and the absence of a presence (no-thing, quantifier). Lumped into these two theoretical categories are the related positions of referring to a non-existing thing and failing to refer to any thing at all (positions which, whilst they differ in important ways, share illuminating similarities that justify their joint treatment). Each of these approaches is explored in turn, before the article describes other ways in which one can derive (and attempt to avoid deriving) the existence of some-thing from no-thing.

2. Nothing as Presence of Absence

When we sing that childhood song, ‘There’s a hole in my bucket, dear Liza’, the lyrics can be interpreted as straightforwardly meaning that there really is, there really exists, a hole in the bucket, and it is to that hole that the lyrics refer. Extrapolating existence in this sort of way from our language is a Quinean (inspired by the work of W. V. O. Quine) criterion for deriving ontological commitments, and specifically Quine argued that we should take to exist what our best scientific theories refer to. Much of our language is about things, and according to the principle of intentionality, so are our thoughts, in that they are directed towards or refer to things. (Of course, not all language and thought point to things: for example, in the lyrics above, the words ‘a’ and ‘in’ do not pick out entities in the way that ‘bucket’ and ‘Liza’ do. The question is whether ‘hole’ and ‘nothing’ function more like nonreferential ‘a’ and ‘in’ or referential ‘bucket’ and ‘Liza’.)

In our perceptual experiences and in our languages and theories we can find many examples of seeming references to nothingness, including to holes, gaps, lacks, losses, absences, silences, voids, vacancies, emptiness, and space. If we take such experiences, thoughts, and language at face value, then nothingness, in its various forms, is a genuine feature of reality. Jean-Paul Sartre is in this camp, and, in Being and Nothingness, he argues that absences can be the objects of judgements. Famously, Sartre described the situation in which he arrived late for his appointment with Pierre at a café, and ‘sees’ the absence of Pierre (because Pierre is who he is expecting to see, and the absence of Pierre frustrates that expectation and creates a presence of that absence—Sartre does not also ‘see’ the absence of me, because he was not expecting to see me). Relatedly, and perhaps more infamously, Alexius Meinong takes non-existent things to have some form of Being, such that they are to be included in our ontology, though Meinongians—those inspired by Meinong—disagree on what things specifically should be taken as non-existent.

So, what things should we take to exist? Consider the Eleatic principle, which states that only causes are real. Using this principle, Leucippus noted that voids have causal power, and generalised that nonbeings are causally efficacious, such that they are as real as atoms and beings in general. When we sing, on the part of Henry, his complaints to dear Liza that the water is leaking from his bucket, the hole is blamed as being the cause of this leakage, and from this we might deduce the hole’s existence (the presence of an absence with causal powers). Similarly, we might interpret Taoists as believing that a wide variety of absences can be causes (for example, by doing no-thing—or as little as possible to minimise disruption to the natural way of the Tao—which is considered the best course of ‘(in)action’), and as such are part of our reality. As James Legge has translated from the Tao Te Ching: “Vacancy, stillness, placidity, tastelessness, quietude, silence, and non-action, this is the level of heaven and earth, and the perfection of the Tao and its characteristics” (1891 p13).

Roy Sorensen (2022) has gone to great lengths to describe the ontological status of various nothings, and his book on the topic (aptly named Nothing) opens with the following interesting case from when the Mona Lisa was stolen from the Louvre in Paris. Apparently, at the time, more Parisians visited the Louvre to ‘see’ the absence of the Mona Lisa than had previously visited to see its presence, and the ‘wall of shame’ where the Mona Lisa once hung was kept vacant for weeks to accommodate demand. The Parisians regarded this presence of the absence of the Mona Lisa as something that could be photographed, and they aimed to get a good view of this presence of absence for such a photo, complaining that they could not ‘see’ it if their view was obstructed. Applying the Eleatic principle, the principle of intentionality, a criterion for ontological commitment, or other such metaphysical tests to this scenario (as with Sartre’s scenario) may provide a theoretical basis for interpreting the ‘object’ of the Parisians’ hype (and the missing Pierre) as a presence of absence (of presence)—a thing, specifically, a Nothing.

Interpreting Nothing as a presence of absence requires us to understand Nothing as a noun that picks out such a presence of absence. If there is no such presence of this nothingness, and instead such a state is simply describing where something is not, then it is to be understood as an absence of presence via a quantificational reading of there being no-thing that there is. It can be argued that the burden of proof is on the latter position, which denies Nothing as a noun, to argue that there is only absence of a presence rather than a presence of absence. Therefore, in what follows, we pay close attention to this sceptical view to determine whether we can get away with nothingness as an absence, where there is no-thing, rather than there being a presence of Nothing as our language and experience seem to suggest.

3. No-thing as Absence of Presence

Returning to Liza and that leaking bucket, instead of there being a hole in the bucket, we could reinterpret the situation as the bucket having a certain perforated shape. Rather than there being a presence of a hole (where the hole is an absence), we could say that there is an absence of bucket (where the bucket is a presence) at the site of the leaking water. Such a strategy can be used not only to avoid the existence of holes as things themselves, but also to reinterpret other negative states in positive ways. For example, Aristotle, like Leucippus, argues from the Eleatic principle in saying that omissions can be causes, but to avoid the existence of omissions themselves this seeming causation-by-absence must be redescribed within the framework of Being. As such, negative nothings are just placeholders for positive somethings.

We can see a parallel move happen with Augustine, who treats Nothing as a linguistic confusion—where others took there to be negative things (presences of an absence), Augustine redescribed those negative things as mere lacks of positive things (absences of a presence). For example, Mani thought ‘evil’ names a substance, but Augustine says ‘evil’ names an absence of goodness, just as ‘cold’ names the absence of heat. Saying that evil exists is as misleading as saying cold exists, as absences are mere privations, and privations of presences specifically. Adeodatus and his father, Augustine, argue similarly: Adeodatus says ‘nihil’ refers to what is not, and in response his father says that to refer to what is not is simply to fail to refer (see Sorensen 2022 p175). This interpretation of language is speculated to have been imported from Arab grammarians and to have been influenced by Indian languages, where negative statements such as ‘Ostriches do not fly’ are understood as metacognitive remarks that warn us not to believe in ostrich flight, rather than as descriptions of the non-flight of ostriches (again see Sorensen 2022 p176 and p181).

Bertrand Russell attempted to generalise this interpretation of negative statements by reducing all negative truths to positive truths (1985). For example, he tried to paraphrase ‘the cat is not on the mat’ as ‘there is a state of affairs incompatible with the cat being on the mat’. But of course, this paraphrase still makes use of negation with respect to ‘incompatible’, which simply means ‘not compatible’; and even when he tried to model ‘not p’ as an expression of ‘disbelief that p’, this too requires negation in the form of believing that something is not the case (or not believing that something is the case). This ineliminability of negation, and of the negative facts we find it in, meant that Russell eventually abandoned this project and (in a famous lecture at Harvard) conceded that irreducibly negative facts exist. Dorothy Wrinch (1918) jests at the self-refuting nature of such positions that try to eliminate the negative, saying that it is “a little unwise to base a theory on such a disputable point as the non-existence of negative facts”. So can we eliminate Nothing in favour of no-thing? Can we try, like Russell, to avoid the presence of negative absences like Nothing, and instead appeal only to the absence of positive presences like no-thing? Can we escape commitment to the new thing created by nounifying no-thing into Nothing? Can no-thing do all the work that Nothing does? Consider various strategies.

a. Eliminating Negation

Despite Russell’s attempt, it seems we cannot eliminate negative facts from our natural language. But from the point of view of formal languages, like that of logic, negation is in fact dispensable. Take, for example, the pioneering work of Christine Ladd-Franklin. In 1883, her dissertation put forward an entire logical system based on exclusion, built around an operator of incompatibility—in effect NAND, which reads ‘not both … and …’. This closely resembles the work of Henry Sheffer, who later, in 1913, demonstrated that all of the logical connectives can be defined in terms of the dual of disjunction, which he named NOR (short for NOT OR, ‘neither … nor …’), or the dual of conjunction, which was (confusingly) named NAND (short for NOT AND, ‘either not … or not …’) and has come to be known as the Sheffer stroke. The Sheffer stroke, like Ladd-Franklin’s earlier operator, does away with the need for a symbolic representation of negation. Another example of such a method is in Alonzo Church’s formal language, whereby the propositional constant f was stipulated to always be false (1956, §10); negation can then be defined in terms of f as such: ~A =df A → f. If we can do away with formal negation, then perhaps this mirrors the possibility of doing away with informal negation, including Nothing.
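
That negation is definable from these connectives can be checked mechanically with truth tables. The following sketch is my own illustration; the function names are not historical notation:

```python
# Truth-table check that negation is definable without a primitive
# negation symbol, via NAND, NOR, and Church's constant-plus-implication.

def nand(a, b):
    """Sheffer stroke on truth values: 'not both a and b'."""
    return not (a and b)

def nor(a, b):
    """'Neither a nor b' (the dual connective)."""
    return not (a or b)

def implies(a, b):
    """Material implication: a -> b."""
    return (not a) or b

F = False  # Church's propositional constant f, stipulated always false

for p in (True, False):
    assert nand(p, p) == (not p)     # ~p defined as p NAND p
    assert nor(p, p) == (not p)      # ~p defined as p NOR p
    assert implies(p, F) == (not p)  # ~p defined as p -> f

print("negation recovered from NAND, NOR, and 'p -> f'")
```

Since each definition agrees with negation on every truth-value assignment, the negation sign is eliminable from the formal vocabulary without loss.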

An issue with using this general method of escaping negative reality concerns what are known as ‘true negative existentials’ (for example, ‘Pegasus does not exist’). Using Sheffer’s NAND, this becomes ‘Pegasus exists NAND Pegasus exists’, which is read ‘either it is not the case that Pegasus exists or it is not the case that Pegasus exists’, and which we would want to be true. But in classical logic every singular term, including ‘Pegasus’, must refer to something in the domain, so ‘Pegasus exists’ comes out true, and the NAND sentence comes out false, as each side of the NAND (that is, ‘Pegasus exists’) is true. As we shall see, this is a persistent problem which has motivated many alternatives to the classical logic setup.

Another issue concerns whether the concept of negation has really been translated away in these cases, or whether negation has just become embedded within the formal language elsewhere under the guise of some sort of falsehood, ever present in the interpretation. This questioning of the priority of the concept of negation was put forward by Martin Heidegger, when he asks: “Is there Nothing only because there is ‘not’, i.e. negation? Or is it the other way round? Is there negation and ‘not’ only because there is Nothing?” (1929 p12) Heidegger’s answer is that “‘Nothing’ is prior to ‘not’ and negation” (ibid.), and so whilst ‘not’ and negation may be conceptually eliminable because they are not primitive, ‘Nothing’ cannot be so. Try as we might to rid ourselves of Nothing, we will fail, even if we succeed in ridding our formal language of ‘not’ and negation. We shall now turn to more of these eliminative methods.

b. Eliminating True Negative Existentials

The riddle, or paradox, of non-being describes the problem of true negative existentials, where propositions like ‘Pegasus does not exist’ are true but seem to bring with them some commitment to an entity ‘Pegasus’. As we learn from Plato’s Parmenides, “Non-being is… being something that is not, – if it’s going not to be” (1996 p81). It is thus self-defeating to say that something, like Pegasus, does not exist, and so it is impossible to speak of what there is not (but even this very argument negates itself). What do we do in such a predicament?

In the seminal paper ‘On What There Is’ (1948), Quine described this riddle of non-being as ‘Plato’s Beard’—overgrown, full of non-entities beyond necessity, to be shaved off with Ockham’s Razor. The problem arises because we bring a thing into existence in order to deny its existence. It is as if we are pointing towards something, and accusing what we are pointing at of not being there to be pointed at. This is reflected in the classical logic that Quine endorsed, where both ‘there is’ and ‘there exists’ are expressed by means of the ‘existential quantifier’ (∃), which is, consequently, interpreted as having ontological import. As a result, such formal systems render the statement ‘There is something that does not exist’ false, nonsensical, inexpressible, or contradictory. How can we get around this issue, in order to rescue the truth of negative existentials like ‘Pegasus does not exist’ without formalising it as ‘Pegasus—an existent thing—does not exist’?

This issue closely resembles the paradox of understanding Nothing—in referring to nothingness as if it were something. As Thales argues, thinking about nothing makes it something, so there can only truly be nothing if there is no one to contemplate it (see Frank Close 2009 p5). The very act of contemplation, or the very act of referring, brings something into existence, and turns no-thing into some-thing, which is self-defeating for the purposes of acknowledging an absence or denying existence. In his entry on ‘Nothingness’ in The Oxford Companion to the Mind, Oliver Sacks summarises the difficulty in the following way: “How can one describe nothingness, not-being, nonentity, when there is, literally, nothing to describe?” (1987 p564)

c. Eliminating Referring Terms

Bertrand Russell (1905) provides a way to ‘describe nothingness’ by removing the referent from definite descriptions. Russell analyses true negative existentials such as ‘The present King of France does not exist’ as ‘It is not the case that there is exactly one thing that is a present King of France’. By transforming definite descriptions into quantificational terms, we do not end up referring to an entity in order to deny its existence—rather, the lack of an entity that meets the description ensures the truth of the negative existential. Quine (1948) takes this method a step further by rendering all names as disguised descriptions, and thereby analyses ‘Pegasus does not exist’ as more accurately reading ‘The thing that pegasizes does not exist’. Such paraphrasing away of referring devices removes the problem of pointing to an entity when asserting its nonexistence, thereby eliminating the problem of true negative existentials.
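
Russell's paraphrase can be set out in quantificational notation. The following is a hedged reconstruction (with K for 'is a present King of France'); the formula is standard in later presentations rather than a quotation from Russell 1905:

```latex
% 'The present King of France exists' (there is exactly one K):
\exists x \bigl( Kx \land \forall y\, (Ky \to y = x) \bigr)

% 'The present King of France does not exist' is its negation:
\neg \exists x \bigl( Kx \land \forall y\, (Ky \to y = x) \bigr)
```

Notice that no singular term remains in the analysis: the description has dissolved into quantifiers, predicates, and identity, so nothing is left to refer to a King whose existence is being denied.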

However, such methods are not without criticism, with some claiming their resolutions are worse than the problems they were initially trying to resolve. As Karel Lambert argues, they come with their own problems and place “undue weight both on Russell’s controversial theory of descriptions as the correct analysis of definite descriptions and on the validity of Quine’s elimination of grammatically proper names” (1967 p137). Lambert proposes that, instead of ridding language of singular terms via these questionable means, one could rid singular terms of their ontological import. He creates a system of ‘free logic’ whereby singular terms like names need not refer in order to be meaningful, and propositions containing such empty terms can indeed be true. Therefore, ‘Pegasus does not exist’ may be meaningful and true even whilst ‘Pegasus’ does not refer, without contradiction or fancy footwork via paraphrasing into definite descriptions and quantificational statements.

Lambert (1963) also insists that such a move to free logic is required in order to prevent getting something from nothing, where an existential claim is derived from a corresponding universal claim even though the predicate in use is not true of anything in the domain. The tempting inference is the traditional subaltern one, whereby what is true of all things is true of some (or particular) things, for example:

∀x(Fx → Gx)

∃x(Fx & Gx)

If no thing in the domain is F, the premise that all Fs are Gs is vacuously true, yet the conclusion that some Fs are Gs requires an x that is F and G from a domain in which no thing was F to start with. Traditional logic validated this inference only by assuming that every predicate applies to at least one thing, just as classical logic validates the related inference from ‘everything is F’ to ‘something is F’ only by stipulating that domains are non-empty. Rather than such ad hoc limitations of the validity of these inferences to domains that include (at least) things that are F (or that are, more generally, simply not empty), Lambert instead proposes his system of free logic, where there need not be a thing in the domain for statements to be true.
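
The situation can be checked by brute-force model evaluation: when nothing is F, the universal premise is vacuously true while the existential conclusion is false, so the conclusion cannot be legitimately derived without extra assumptions about the domain. A sketch (my own illustration; the domain and predicate extensions are arbitrary choices):

```python
# Evaluate the premise and conclusion in a finite model where the
# extension of F is empty.

domain = {"a", "b", "c"}
F = set()     # no thing in the domain is F
G = {"a"}

# Premise: ∀x(Fx → Gx) — vacuously true when nothing is F.
premise = all((x not in F) or (x in G) for x in domain)

# Conclusion: ∃x(Fx & Gx) — requires a witness that is both F and G.
conclusion = any((x in F) and (x in G) for x in domain)

print(premise, conclusion)  # → True False: a countermodel to the inference
```

The true-premise, false-conclusion verdict is exactly the countermodel that motivates either stipulating non-empty extensions or moving to a free (inclusive) logic.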

But what about Nothing? Is ‘Nothing’ a referring term? For Rudolf Carnap, asking such a question is “based on the mistake of employing the word ‘nothing’ as a noun, because in ordinary language it is customary to use it in this form in order to construct negative existential statements… [E]ven if it were admissible to use ‘nothing’ as a name or description of an entity, still the existence of this entity would be denied by its very definition” (1959 p70). Many have argued against the first part of Carnap’s argument, to show that there are occurrences of ‘Nothing’ as a noun which cannot be understood in quantificational terms or as the null object without at least some loss of meaning (see, for example, Casati and Fujikawa 2019). Nevertheless, many have agreed with the second part of Carnap’s argument that even as a noun ‘Nothing’ would fail to refer to an existent thing (see, for example, Oliver and Smiley 2013). But if Nothing does not refer to an existent thing, what then is this encyclopaedia article about?

As Maria Reicher (2022) states, “One of the difficulties of this solution, however, is to give an account of what makes such sentences true, i.e., of what their truthmakers are (given the principle that, for every true sentence, there is something in the world that makes it true, i.e., something that is the sentence’s truthmaker).” The truthmaker of my opening sentence ‘This article is about nothing’ might then be that Nothing is what this article is about, even when Nothing is the name for the nounified no-thing. The problematic situation we seem to find ourselves in is this: Without an entity that the statement is about, the statement lacks a truthmaker; but with an entity that the statement is about, the statement becomes self-refuting in denying that very entity’s existence. But there is another option. ‘Nothing’ may not refer to an existent thing, yet this need not entail the lack of a referent altogether, because instead perhaps ‘Nothing’ refers to a non-existent thing, as we shall now explore.

d. Eliminating Existentially Loaded Quantification

Meinong’s ‘Theory of Objects’ (1904) explains how we can speak meaningfully and truthfully about entities that do not exist. Meinongians believe that we can refer to non-existent things, and talk of them truthfully, due to quantifying over them and having them as members in our domains of quantification. When we speak of non-existent things, then, our talk refers to entities in the domain that are non-existent things. So it is not that our language can be true without referring at all (as in free logic), but rather that our language can be true without referring to an existent thing (where instead what is referred to is a non-existent thing, which acts as a truthmaker). This approach grants that flying horses do not exist, but this does not imply that there are no flying horses. According to the Meinongian, there are flying horses, and they (presumably) belong to the class of non-existent things, where Pegasus is one of them. This class of non-existent things might also include the present King of France, Santa Claus, the largest prime number, the square circle, and every/any-thing you could possibly imagine if taken to not exist—maybe even Nothing.

So, for the Meinongian, naïvely put, there are existents and non-existents. Both are types of ‘thing’, and the over-arching category covering these things is ‘being’. All existent things have being, but not all being things have existence. And perhaps in such an account, Nothing could have ‘being’ regardless of its non/existence. Since Meinongians quantify over both existent and non-existent things, their quantification over domains containing both such things must be ontologically neutral (namely, by not having existential import), and they can differentiate between the two types of things by employing a predicate for existence (E!) which existent things instantiate and non-existent things do not. The classical universal and existential quantifiers (∀ and ∃) can then be defined using the neutral universal and particular quantifiers (Λ and Σ) together with the existence predicate, as quantification restricted to the existents:

∀x(φ) =df Λx(E!x → φ)

∃x(φ) =df Σx(E!x & φ)

‘All existent things are F’ can be written as such:

∀x(Fx) =df Λx(E!x → Fx)

And ‘Some existent things are F’ can be written as such:

∃x(Fx) =df Σx(E!x & Fx)

Using these neutral quantifiers, we can then say, without contradiction, that some things do not exist, as such:

Σx(~E!x)
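
These definitions can be illustrated with a toy model. The following sketch is my own; the domain members and the extension of E! are stipulations for illustration only:

```python
# Toy model of neutral vs existentially loaded quantification.

domain = {"Socrates", "Pegasus", "Santa Claus"}
exists = {"Socrates"}  # extension of the existence predicate E!

def sigma(pred):
    """Σ: neutral particular quantifier, ranging over the whole domain."""
    return any(pred(x) for x in domain)

def lam(pred):
    """Λ: neutral universal quantifier, ranging over the whole domain."""
    return all(pred(x) for x in domain)

def loaded_some(pred):
    """∃x(Fx) =df Σx(E!x & Fx)."""
    return sigma(lambda x: x in exists and pred(x))

def loaded_all(pred):
    """∀x(Fx) =df Λx(E!x → Fx)."""
    return lam(lambda x: (x not in exists) or pred(x))

# 'Some things do not exist', Σx(~E!x): true, and no contradiction arises.
print(sigma(lambda x: x not in exists))        # → True
# '∃x(~E!x)' ('some existent thing does not exist'): false, as expected.
print(loaded_some(lambda x: x not in exists))  # → False
```

The neutral quantifiers range over the whole domain, while the loaded ones see only the existents, which is why ‘some things do not exist’ can come out true without paradox.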

Despite these definitions, it would be erroneous to describe Meinongianism as “the way of the two quantifiers” (Peter van Inwagen 2003 p138). This is because the ontologically loaded quantifier ∃ can be considered as being restricted to existents, and so differs from Σ only with respect to what is in the domain, that is, its range. Such a restriction of the domain can be understood as part and parcel of restricting what is to count as a ‘thing’, where, for Quine, every-(and only)-thing(s) exists.

One need not be a Meinongian to treat the quantifiers as ontologically neutral, however. For example, Czeslaw Lejewski argues that the existentially non-committal ‘particular quantifier’ is “a nearer approximation to ordinary usage” and claims to “not see a contradiction in saying that something does not exist” (1954 p114). Another way to free the quantifiers of their ontological import is to demarcate ontological commitment from quantificational commitment, as in the work of Jody Azzouni (2004). Even the very basic idea of quantificational commitment leading to a commitment to an object in the domain of quantification can be challenged, by taking the quantifiers to be substitutional rather than objectual. In a substitutional interpretation, a quantificational claim is true not because there is an object in the domain that it is true of, but because there is a term in the language that it is true of (for an early pioneer of substitutional quantification, see Ruth Barcan-Marcus 1962).
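
The contrast between objectual and substitutional readings can likewise be made concrete. A toy sketch, entirely my own stipulation: objectually, ‘something is F’ needs an object in the domain; substitutionally, it needs only a term t of the language such that the sentence ‘t is F’ is true.

```python
# Toy contrast between objectual and substitutional quantification.
# The domain, the stock of terms, and the true sentences are stipulated.

domain = {"Socrates"}              # the objects there are
terms = ["Socrates", "Pegasus"]    # the singular terms of the language
true_sentences = {"Socrates is wise", "Pegasus is winged"}

def objectual_some(pred):
    """True iff some OBJECT in the domain satisfies the predicate."""
    return any(pred(x) for x in domain)

def substitutional_some(predicate_word):
    """True iff some TERM t makes the sentence 't is <word>' true."""
    return any(f"{t} is {predicate_word}" in true_sentences for t in terms)

print(objectual_some(lambda x: x == "Pegasus"))  # → False: no such object
print(substitutional_some("winged"))             # → True: a term suffices
```

On the substitutional reading, the quantificational claim is made true by a feature of the language rather than by an object in the domain, which is why it carries no immediate ontological commitment.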

In contrast to these alternative systems, for Quine (1948), “to be is to be the value of a bound variable”, which simply means to be quantified over by a quantifier, which further simplified means to be in the domain of quantification. An ontology, then, can be read straight from the domain, which contains (only) the existent things, which happens to be all the ‘things’ that there are. As we have seen, this is problematic with respect to understanding nonexistence. But that is not all. Ladd-Franklin (1912 p653), for example, argues that domains are just ‘fields of thought’, and thus the domain of discourse may vary, and it cannot simply be assumed to contain all of (and only) the things that exist in our reality. Even when the field of thought is physics, or whatever our best science may be, the domain of quantification still leaves us none the wiser with respect to what there is in reality. As Mary Hesse argues, “it is precisely what this domain of values is that is often a matter of dispute within physics” (1962 p243). Indeed, she continues, the very act of axiomatizing a theory in order to answer the question ‘what are the values of its variables?’ implies the adoption of a certain interpretation, which in turn is equivalent to the decisions involved in answering the question ‘what are entities?’ Therefore, one cannot informatively answer ‘what is there?’ with ‘the values of the bound variables’. Extrapolating from the domain is thus no guide to reality: it can give us some-thing from no-thing, regardless of whether every-thing includes more than every (existent) thing. And we cannot infer the existence of Nothing from ‘Nothing’.

4. Beyond the Binary—Both Presence and Absence

As we shall now see, the supposed choice between the binary options of understanding ‘nothing’ as Nothing (a noun, presence of absence) or no-thing (a quantifier, absence of presence) can itself be challenged. To get to that point, we first introduce the dialectical process of Becoming in which Nothing participates, and then turn to dialetheic understandings of the contradictory nature of Nothing.

a. Dialectical Becoming

In G. W. F. Hegel’s dialectics, a particular pattern is followed when it comes to conceptual analysis. To start, a positive concept is introduced as the ‘thesis’. Then, that positive concept is negated to create the ‘antithesis’ which opposes the thesis. The magic happens when the positive concept and the negative concept are unified to create a third concept, the ‘synthesis’ of the thesis and antithesis. When this dialectic of thesis-antithesis-synthesis is applied to the topic of this article, the resulting pattern is Being-Nothing-Becoming. To start, Hegel took Being as the positive thesis, which he stated is ‘meant’ to be the concept of presence. Negating this thesis of Being, we get what he stated is ‘meant’ to be the concept of absence, namely, Nothing, as the antithesis.

It is important to note that for Hegel the difference between Being and Nothing is only “something merely meant” (1991 remark to §87) in that we do mean to be highlighting different things when we use the term ‘Nothing’ rather than ‘Being’ or vice versa, but in content they are actually the same. What is the content of Being and Nothing, then, that would equate them in this extensional manner? Well, as purely abstract concepts, Being and Nothing are said to have no further determination, in that Being asserts bare presence, and Nothing asserts bare absence. Given that both are bare, and thus undetermined, they have the same (lack of) properties or content. (Compare the situation with the morning star and evening star—these terms were employed to mean different things, but actually they both refer to Venus.)

There is a presence to Nothing in its asserting absence, and there is an absence to Being in its empty presence. As Julie Maybee (2020) has described, “Being’s lack of determination thus leads it to sublate itself and pass into the concept of Nothing”, and this movement goes both ways. In contemplating the bidirectional relationship between Being and Nothing, we enter the dialectical moment of synthesis that unifies and combines them into a state of Becoming. To Become is to go from Being to Nothing or from Nothing to Being, as we do when we consider their equally undefined content. But despite their extensional similarity (in what content they pick out), intensionally (in their intended definitional meaning) Being and Nothing are different. Any contradiction that may arise from their synthesis can thus be avoided by reference to this difference. But what if such contradictions provide a more accurate understanding of nothingness, better reflecting its paradoxical nature? This is the idea we will now take up.

b. Dialetheic Nothing

Heidegger pointed out that in speaking of Nothing we make it into something and thereby contradict ourselves. Much like in that dialectical moment of synthesis, we posit Nothing as a being—as a thing—even though by our quantificational understanding that is precisely what it is not (see Krell 1977 p98f). Where can we go from here? Does this mean it is impossible to speak of Nothing without instantaneous self-defeat, by turning Nothing into not-no-thing, namely, some-thing? To this, Graham Priest adds, “One cannot, therefore, say anything of nothing. To say anything, whether that it is something or other, or just that it is, or even to refer to it at all, is to treat it as an object, which it is not” (2002 p241, emphasis in original).

Of course, Priest did say something about Nothing, as did Heidegger, and as does this article. It therefore is not impossible to talk of it. Perhaps the lesson to learn is that any talk of it will be false, because the very act of doing so turns it into what it is not. This would be a kind of error-theory of Nothing: whatever theorising is done about it will be in error, by virtue of postulating an object to be theorised where there is no object. But this will not do once we consider statements that motivate such a theory, like ‘Nothing is not an object’, which the error-theorist would want to be true in order for all (other) statements about Nothing to be false. Can we not even say that we cannot say anything about Nothing, then? Nor say that?

These problems reflect issues of ineffability. To be ineffable is to not be able to be effed, where to be effed is to be described in some way. Start with the idea that Nothing is ineffable, because in trying to describe it (a no-thing) we end up turning it into some-thing (a thing) that it is not. But the statement that Nothing is ineffable is self-refuting, since to say ‘Nothing is ineffable’ is to say something about Nothing, namely, that it is ineffable. Furthermore, if it is true that Nothing is ineffable, then it is not true that no-thing is ineffable, because Nothing is. So, to repeat, can the (in)effability of nothingness be effed? And what about effing that?

Ludwig Wittgenstein’s Tractatus is also an example of trying to eff the ineffable, via a self-conscious process of ‘showing’ rather than ‘saying’ what cannot be said, on pain of rendering it all meaningless. Wittgenstein’s work explores (among other things) the limits of our language in relation to the limits of our world, and the messy paths that philosophical reflection on our language can take us down. Applying this to Nothing, it might be that the contradictions that arise from attempts to express nothingness reflect contradictions in its very nature. And maybe when we get caught up in linguistic knots trying to understand Nothing it is because Nothing is knotty (which pleasingly rhymes with not-y). Perhaps then we need not try to find a way out of contradictions that stem from analysing nothingness if those contradictions are true. So, is it true that Nothing is both an object and not an object? Is it true that Nothing is both a thing and no-thing? Whilst this would not be Wittgenstein’s remedy, Priest’s answer is ‘yes’: we ought to bite this bullet and accept the paradoxical nature of Nothing at face value. To treat such a contradiction as true, one must endorse a dialetheic metaphysics, with a paraconsistent logic to match, where Nothing is a dialetheia.

5. Beyond the Binary—Neither Presence nor Absence

a. The Nothing Noths

As we have seen, when contemplating nothingness, we can quickly go from no-thing to Nothing, which is no longer a ‘nothing’ due to being some-thing. When we turn towards nothingness, it turns away from us by turning itself into something else. This makes nothingness rather active, or rather re-active, in a self-destructive sort of way. As Heidegger put it, “the nothing itself noths or nihilates” (1929 p90).

Carnap was vehemently against such metaphysical musings, claiming that they were meaningless (1959 p65-67). Indeed, Heidegger and the Vienna Circle (of which Carnap was a leading and central figure) were in opposition in many ways, not least with respect to Heidegger’s antisemitism and affiliation with the Nazis in contrast with the Vienna Circle’s large proportion of Jewish and socialist members (see David Edmonds 2020 for the relationship between the political and philosophical disputes).

Taking a somewhat mediating position on the logical side of things, Oliver and Smiley (2013) consider ‘the nothing noths’ as “merely a case of verbing a noun” and argue: “If ‘critiques’ is what a critique does, and ‘references’ is what a reference does, ‘nichtet’ is what das Nichts does. The upshot of all this is that ‘das Nichts nichtet’ [‘the nothing noths’] translates as ‘zilch is zilch’ or, in symbols, ‘O=O’. Far from being a metaphysical pseudo-statement, it is a straightforward logical truth” (p611). If verbing a noun is legitimate, what about nouning a quantifier? If ‘Criticisms’ is the name for all criticisms, and ‘References’ is the name for all references, then is not ‘Everything’ the name for every-thing, and likewise ‘Nothing’ the name for no-thing? Such an understanding would make the path to such entities quite trivial, a triviality that ‘straightforward logical truths’ share. But if we have learnt anything about Nothing so far, it is surely that it is a long way (at least 8,000 words away) from being trivial.

Heidegger avoids charges of triviality by clarifying that Nothing is “‘higher’ than or beyond all ‘positivity’ and ‘negativity’” (see Krummel 2017 p256 which cites Beiträge). This resonates with Eastern understandings of true nothingness as irreducible to and outside of binary oppositions, which is prominent in the views of Nishida Kitarō from the Kyoto School. What are they good for? ‘Absolute nothing’ (and more).

b. Absolute Nothing

When Edwin Starr sang that war was good for absolutely nothing (1970), the message being conveyed was that there was no-thing for which war was good. This was emphasised and made salient by the ‘absolutely’. When we are analysing nothingness, we might likewise want to emphasise that what we are analysing is absolutely nothing. But what would that emphasis do? In what way does our conception of nothingness change when we make its absoluteness salient?

For the Kyoto School, this ‘absolute’ means cutting off oppositional understandings, in a bid to go beyond relativity. The way we comprehend reality is very much bound up in such oppositions: life/death, yes/no, true/false, black/white, man/woman, good/bad, acid/alkaline, high/low, left/right, on/off, 0/1, even/odd, this/that, us/them, in/out, hot/cold… and challenging such binaries is an important part of engaging in critical analysis to better grasp the complexities of reality. But these binaries may very well include opposites we have been relying upon in our understanding of nothingness, namely, presence/absence, thing/no-thing, no-thing/Nothing, binary/nonbinary, relative/absolute, and so forth. It seems whatever concept or term or object we hold (like Hegel’s ‘thesis’), we can negate it (like Hegel’s ‘antithesis’), making a set of opposites. What then can be beyond such oppositional dialectic? Nothing. (Or is it no-thing?)

Zen Buddhism explains that true nothingness is absolute, not relative—beyond the realm of things. Our earlier attempts at elucidating Nothing and no-thing were very much conceptually related to things, and so to get a truer, more absolute nothingness, we must go beyond no-thing/thing and no-thing/Nothing. Only once detached from all contrasts do we have absolute nothingness.

Nishida says absolute negation (zettai hitei 絶対否定) is beyond the affirmative/negative itself, and so is a rejection of what it colloquially represents: true negation is thereby a negation of negation. This is not the double-negation of classical logic (whereby something being not not true is for that something to be true) and it is not the mealy-mouthed multiple-negation of conversation (whereby not disliking someone does not entail liking them but rather just finding them incredibly annoying, for example). Instead, this negation of negation leaves the realm of relativity behind, it goes beyond (or negates) that which can be negated to enter the absolute realm. No-thing can be absolute without being absolved of any defining opposition that would render it merely relative. And so Nothing can only be absolute when it goes beyond the binaries that attempt to define it in the world of being. This does not place the absolute nothingness in the realm of nonbeing; rather, absolute nothingness transcends the being/nonbeing distinction.

Without anything to define absolute nothingness in relation to, it is quite literally undefined. As such, Nothing cannot be made into a subject or object that could be judged, and so is completely undetermined. It would not make sense, then, to interpret ‘absolute nothing’ as a thing, because that would bring it into the purview of predication. Instead, Nishida (2000 p467, 482) speaks of it as a place: “the place of absolute nothing” (zettai mu no basho) or “the place of true nothing” (shin no mu no basho). Within this place is every determination of all beings, and as such it is infinitely determined. But this is in contradiction with its status as being completely undetermined, beyond the realm of relative definition. Is absolute nothingness really beyond the realm of relative definition if it is defined in contrast to relativity, namely, as absolute? It seems that we have stumbled upon contradictions and binaries again. (Ask yourself: Can we avoid them? Ought we avoid them?) Like the dialetheic understanding of Nothing, this absolute nothingness is effed as ineffable in terms of what it is and is not. And like the nothing-that-noths, this absolute nothingness is active, but rather than nihilating anything that comes in its path, it creates every-thing.

6. Conclusion

This article has analysed nothingness as a noun, a quantifier, a verb, and a place. It has postulated nothingness as a presence, an absence, both, and neither. Through an exploration of metaphysical and logical theories that crossed the analytic/continental and East/West divides, it started with nothing, got something, and ended up with everything. What other topic could be quite as encompassing? Without further ado, and after much ado about nothing, let us conclude the same way that Priest does in his article ‘Everything and Nothing’ (which hopefully you, the reader, will now be able to disambiguate):

“Everything is interesting; but perhaps nothing is more interesting than nothing” (Gabriel and Priest 2022 p38).

7. References and Further Reading

  • Jody Azzouni (2004) Deflating Existential Consequence: A Case for Nominalism, Oxford University Press.
  • Ruth Barcan-Marcus (1962) ‘Interpreting Quantification’, Inquiry, V: 252–259.
  • Filippo Casati and Naoya Fujikawa (2019) ‘Nothingness, Meinongianism and Inconsistent Mereology’, Synthese, 196.9: 3739–3772.
  • Rudolf Carnap (1959) ‘The Elimination Of Metaphysics Through Logical Analysis of Language’, A. Pap (trans.) in A. J. Ayer (ed.) Logical Positivism, New York: Free Press, 60–81.
  • Lewis Carroll (1871) Through the Looking-Glass and What Alice Found There, in M. Gardner (ed.) The Annotated Alice: The Definitive Edition, Harmondsworth: Penguin, 2000.
  • Alonzo Church (1956) Introduction to Mathematical Logic, Princeton University Press.
  • Frank Close (2009) Nothing: A Very Short Introduction, Oxford University Press.
  • David Edmonds (2020) The Murder of Professor Schlick: The Rise and Fall of the Vienna Circle, Princeton University Press.
  • Suki Finn (2018) ‘The Hole Truth’, Aeon.
  • Suki Finn (2021) ‘Nothing’, Philosophy Bites. https://podcasts.google.com/feed/aHR0cHM6Ly9waGlsb3NvcGh5Yml0ZXMubGlic3luLmNvbS9yc3M.
  • Suki Finn (2023) ‘Nothing To Speak Of’, Think, 22.63: 39–45.
  • Markus Gabriel and Graham Priest (2022) Everything and Nothing, Polity Press.
  • G. W. F. Hegel (1991) The Encyclopedia Logic: Part 1 of the Encyclopaedia of Philosophical Sciences, T. F. Geraets, W. A. Suchting, and H. S. Harris (trans.), Indianapolis: Hackett.
  • Martin Heidegger (1929) ‘What is Metaphysics?’, in (1949) Existence and Being, Henry Regnery Co.
  • Mary Hesse (1962) ‘On What There Is in Physics’, British Journal for the Philosophy of Science, 13.51: 234–244.
  • Peter van Inwagen (2003) ‘Existence, Ontological Commitment, and Fictional Entities’, in Michael Loux and Dean Zimmerman (eds.) The Oxford Handbook of Metaphysics, Oxford University Press, 131–157.
  • David F. Krell (ed.) (1977) Martin Heidegger: Basic Writings, New York: Harper & Row.
  • John W. M. Krummel (2017) ‘On (the) nothing: Heidegger and Nishida’, Continental Philosophy Review, 51.2: 239–268.
  • Christine Ladd-Franklin (1883) ‘The Algebra of Logic’, in Charles S. Peirce (ed.) Studies in Logic, Boston: Little, Brown & Co.
  • Christine Ladd-Franklin (1912) ‘Implication and Existence in Logic’, The Philosophical Review, 21.6: 641–665.
  • Karel Lambert (1963) ‘Existential Import Revisited’, Notre Dame Journal of Formal Logic, 4.4: 288–292.
  • Karel Lambert (1967) ‘Free Logic and the Concept of Existence’, Notre Dame Journal of Formal Logic 8.1-2: 133–144.
  • James Legge (1891) The Writings of Chuang Tzu, Oxford University Press.
  • Czeslaw Lejewski (1954) ‘Logic and Existence’, British Journal for the Philosophy of Science, 5: 104–19.
  • Julie E. Maybee (2020) ‘Hegel’s Dialectics’, The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.), <https://plato.stanford.edu/archives/win2020/entries/hegel-dialectics/>.
  • Alexius Meinong (1904) ‘Über Gegenstandstheorie’, in Alexius Meinong (ed.) Untersuchungen zur Gegenstandstheorie und Psychologie, Leipzig: J. A. Barth.
  • Kitarō Nishida (2000) Nishida Kitarō zenshū [Collected works of Nishida Kitarō], Tokyo: Iwanami.
  • Alex Oliver and Timothy Smiley (2013) ‘Zilch’, Analysis, 73.4: 601–613.
  • Plato (1996) Parmenides, A. K. Whitaker (trans.) Newburyport, MA: Focus Philosophical Library.
  • Graham Priest (2002) Beyond the Limits of Thought, Oxford University Press.
  • W.V.O. Quine (1948) ‘On What There Is’, The Review of Metaphysics, 2.5: 21–38.
  • Maria Reicher (2022) ‘Non-existent Objects’, The Stanford Encyclopedia of Philosophy, Edward N. Zalta and Uri Nodelman (eds.), URL = <https://plato.stanford.edu/archives/win2022/entries/non-existent-objects/>.
  • Bertrand Russell (1905) ‘On Denoting’, Mind, 14: 479–493.
  • Bertrand Russell (1985) The Philosophy of Logical Atomism, La Salle, IL: Open Court.
  • Oliver Sacks (1987) ‘Nothingness’, in Richard L. Gregory (ed.) The Oxford Companion to the Mind, Oxford University Press.
  • Jean-Paul Sartre (1956) Being and Nothingness: An Essay on Phenomenological Ontology, Hazel E. Barnes (trans.), New York: Philosophical Library.
  • Henry Sheffer (1913) ‘A Set of Five Independent Postulates for Boolean Algebras, with Applications to Logical Constants’, Transactions of the American Mathematical Society, 14: 481–488.
  • Roy Sorensen (2022) Nothing: A Philosophical History, Oxford University Press.
  • Edwin Starr (1970) War, Motown: Gordy Records.
  • Alfred Tarski (1944) ‘The Semantic Conception of Truth’, Philosophy and Phenomenological Research, 4.3: 341–376.
  • Ludwig Wittgenstein (1961) Tractatus Logico-Philosophicus, D. F. Pears and B. F. McGuinness (trans.), New York: Humanities Press.
  • Dorothy Wrinch (1918) ‘Recent Work In Mathematical Logic’, The Monist, 28.4: 620–623.

 

Author Information

Suki Finn
Email: suki.finn@rhul.ac.uk
Royal Holloway University of London
United Kingdom

Impossible Worlds

Actual facts abound, and actual propositions are true because there is a world, the actual world, that the propositions correctly describe. Possibilities abound as well. The actual world reveals what there is, but it is far from clear that it also reveals what there might be. Philosophers have been aware of this limitation and have introduced the notion of a possible world. Finally, impossibilities abound too, because possibilities turn out not to exhaust modal space as a whole. Besides the actual facts and facts about the possible, there are facts about what is impossible. In order to explain these, philosophers have introduced the notion of an impossible world.

This article is about impossible worlds. First, there is a presentation of the motivations for postulating impossible worlds as a tool for analysing impossible phenomena. This apparatus seems to deliver great advances in modal logic and semantics, but at the same time it gives rise to metaphysical issues concerning the nature of impossible worlds. Discourse about impossible worlds is explained in Sections 2 and 3. Section 4 provides an overview of the theories discussed in the academic literature, and Section 5 summarises the drawbacks of those theories. Section 6 takes a closer look at the logical structure of impossible worlds, and Section 7 discusses the connection between impossible worlds and hyperintensionality.

Table of Contents

  1. Introduction
  2. The First Argument for Impossible Worlds
  3. Impossible Worlds and Their Applications
  4. The Metaphysics of Impossible Worlds
  5. Troubles with Impossible Worlds
  6. The Logic of Impossible Worlds
  7. Impossible Worlds and Hyperintensionality
  8. Conclusion
  9. References and Further Readings

1. Introduction

Modal notions are those such as ‘possibility’, ‘necessity’, and ‘impossibility’, whose analysis requires a different account than that of so-called indicative notions. To compare the two: indicative propositions are about this world, the world that obtains, and all the true indicative propositions describe the world completely. Modal propositions are about the world as well, although in a different sense. They are about its modal features or, put otherwise, about alternatives to it. Philosophers call these alternatives possible worlds.

For a start, it is important to consider the distinction between pre-theoretical and theoretical terms. Pre-theoretical terms are terms we handle before we engage in philosophical theorizing. Theoretical terms, on the other hand, are introduced by philosophers via sets of definitions, usually framed in terms that we already understand in advance. The debate about possible worlds can be understood along similar lines. The word ‘world’ is here a theoretical notion that differs from the word as we use it in everyday life. In everyday use, the world is everything we live in and interact with. The philosophical ‘world’ represents the world and is one of many such representations; its uniqueness rests on its correctly representing the world. ‘Actual world’, ‘possible world’, as well as ‘impossible world’ are thus theoretical terms.

An example will be helpful here. Consider the following proposition:

(1)  Canberra is the capital of Australia.

Given the constitutional order of Australia, (1) is true because Canberra is the capital of Australia. In contrast, the proposition:

(2)  Melbourne is the capital of Australia

is false, because it is not the case that Melbourne is the capital. So (1) and (2) are factual claims, because they describe the constitutional order in Australia. Consider, however, the following proposition:

(3)  Melbourne could be the capital of Australia.

At first sight, (3) also appears to be about our world in some sense, yet it displays structurally different features from (1) and (2). Why is this so? Some philosophers dismiss this question by rejecting its coherence. Others propose a positive solution by means of other worlds. In the following two sections I provide two arguments for doing so.

2. The First Argument for Impossible Worlds

In his Counterfactuals (1973), David Lewis states the following:

I believe, and so do you, that things could have been different in countless ways. But what does this mean? Ordinary language permits the paraphrase: there are many ways things could have been besides the way they actually are. I believe that things could have been different in countless ways; I believe permissible paraphrases of what I believe; taking the paraphrase at its face value, I therefore believe in the existence of entities that might be called ‘ways things could have been.’ I prefer to call them ‘possible worlds’. (Lewis 1973: 84)

Takashi Yagisawa builds on Lewis’s view as follows:

There are other ways of the world than the way the world actually is. Call them ‘possible worlds.’ That, we recall, was Lewis’ argument. There are other ways of the world than the ways the world could be. Call them ‘impossible worlds’. (Yagisawa 1988: 183)

These two quotes reflect a need for an analysis of modality in terms of worlds. While Lewis postulates possible worlds as the best tool for analysing modal propositions, Yagisawa extends the framework by adding impossible worlds. Lewis, in other words, accepts:

(P) It is possible that P if and only if there is a possible world, w, such that at w, P.

and:

(I) It is impossible that P if and only if there is no possible world, i, such that at i, P.

as definitions of possibility and impossibility.
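
The contrast can be put schematically as follows (the notation $W_{\mathrm{poss}}$, $W_{\mathrm{imp}}$, and $\Vdash$ is introduced here for convenience and is not drawn from the passages quoted above):

```latex
\begin{align*}
\Diamond P &\iff \exists w \in W_{\mathrm{poss}}\,(w \Vdash P) \tag{P}\\
\neg\Diamond P &\iff \neg\exists w \in W_{\mathrm{poss}}\,(w \Vdash P) \tag{I}\\
\neg\Diamond P &\iff \exists i \in W_{\mathrm{imp}}\,(i \Vdash P) \tag{I*}
\end{align*}
```

On (I), impossibility is the mere absence of a verifying possible world; on the impossible-worlds alternative (I*), it is the presence of a verifying impossible world.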

An alternative analysis of impossibility extends the space of worlds and, in addition to possible worlds, commits to impossible worlds. As a consequence, proponents of impossible worlds formulate a dilemma in the form of modus tollens and modus ponens respectively:

    1. If we endorse arguments for the existence of possible worlds, then, with all needed changes made, we should endorse the same kind of argument for the existence of impossible worlds.
    2. There are arguments that disqualify impossible worlds from being acceptable entities.

Therefore:

There are no possible worlds. (By modus tollens.)

Or:

1*. If we endorse arguments for the existence of possible worlds, then mutatis mutandis, we should endorse the same kind of argument for the existence of impossible worlds.

2*. There are arguments that establish possible worlds as acceptable entities.

Therefore:

There are impossible worlds. (By modus ponens.)

The case for impossible worlds starts from the assumption that if the paraphrase argument justifies belief in worlds as ways things could have been, then the same argument justifies belief in worlds as ways things could not have been. The second reason is the applicability of impossible worlds. I will discuss some applications of impossible worlds in the next section.

3. Impossible Worlds and Their Applications

It is commonly regarded as a platitude that the introduction of theoretical terms ought to be accompanied by their theoretical utility. Moreover, a theoretical term should not serve to solve one particular problem only. Instead, its applications should range over various philosophical phenomena and contribute systematically to their explanation.

The theoretical usefulness of possible worlds has been proven in the analysis of de re as well as de dicto modalities (see the article on Frege’s Problem: Referential Opacity, Section 2), as well as in the analysis of counterfactual conditionals, propositional attitudes, intensional entities, and relations between philosophical theories. Given their applicability, possible worlds have turned out to be a useful philosophical approach to longstanding philosophical problems.

To begin with, representing properties and propositions as sets of their instances (possible individuals and possible worlds, respectively) offered many advantages in philosophy. In particular, impossible worlds provide a more nuanced explanation of modality in a way that an unadulterated possible worlds framework does not. Like possible worlds, impossible worlds are ‘localisers’, albeit ones where impossible things happen. Consider these two statements:

(4)  2 + 2 = 5

and

(5)  Melbourne both is and is not in Australia.

(4), according to a possible worlds semantic treatment, does not hold in any possible world, because possible worlds are worlds at which only possible things happen. Likewise, there is no possible world at which Melbourne both is and is not in Australia. Given these two facts, and assuming the widely accepted, although disputable, view of propositions as sets of possible worlds, (4) and (5) are ontologically one and the same proposition: the empty set. However, (4) and (5) are about different subject matters, namely arithmetic and geography. In order not to conflate these two (impossible) subjects, one way out is presented by impossible worlds: there is an impossible world at which (4) is true and (5) is false, and vice versa.
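
The collapse, and its repair, can be sketched in a toy model (the world names and truth assignments below are invented purely for illustration; this is a sketch of the set-theoretic point, not a formal semantics from the literature):

```python
# Toy model: a proposition is the set of worlds at which it is true.
# World names ("w1", "i1", ...) and truth assignments are hypothetical.

possible_worlds = {"w1", "w2"}

# Which worlds verify each sentence: no possible world verifies either.
true_at = {
    "2 + 2 = 5": set(),
    "Melbourne is and is not in Australia": set(),
}

def proposition(sentence, worlds):
    """Return the proposition expressed: the set of worlds where it is true."""
    return frozenset(w for w in worlds if w in true_at[sentence])

p4 = proposition("2 + 2 = 5", possible_worlds)
p5 = proposition("Melbourne is and is not in Australia", possible_worlds)
assert p4 == p5 == frozenset()  # collapse: one and the same (empty) proposition

# Enlarge the space with impossible worlds, each verifying exactly one sentence.
impossible_worlds = {"i1", "i2"}
true_at["2 + 2 = 5"].add("i1")
true_at["Melbourne is and is not in Australia"].add("i2")

all_worlds = possible_worlds | impossible_worlds
q4 = proposition("2 + 2 = 5", all_worlds)
q5 = proposition("Melbourne is and is not in Australia", all_worlds)
assert q4 != q5  # the two impossible propositions are now distinguished
```

With impossible worlds in the space, (4) and (5) determine different sets of worlds, and so can be treated as distinct propositions about distinct subject matters.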

The well-known reductio ad absurdum mode of argument is another, although controversial, reason for taking impossible worlds seriously (for a more detailed exposition, see the article on Reductio ad Absurdum). Such arguments start with certain assumptions and then, via logically valid steps, lead to a contradiction, thereby showing that at least one assumption is impossible. Yet the impossible assumption gives rise to a string of mutually interconnected and meaningful premises. Some proponents of impossible worlds insist that unless we take such impossible assumptions seriously, reductio ad absurdum arguments could not play such a crucial role in philosophical reasoning. For the opposite view, according to which mathematical practice does not depend on using counterfactuals, see Williamson (2007, 2017). For a more substantive discussion of the reductio ad absurdum and impossible worlds, see also Berto & Jago (2019, especially Chapter XII).

Whatever the machinery behind the reductio ad absurdum argument is, there is nonetheless a strong reason to postulate impossible worlds for the analysis of a certain sort of counterfactual conditional. According to the most prevalent theory, a counterfactual is true if and only if there is some possible world w′ such that (i) the antecedent and the consequent of the conditional are both true at w′, and (ii) no possible world w at which the antecedent is true but the consequent is not true is more similar to the actual world than w′; a counterfactual whose antecedent is true at no possible world counts as vacuously true. Clearly, such an account falls short in analysing counterpossible conditionals unless we either deny their possible worlds interpretation (Fine 2012), admit that they are trivially true (Lewis 1973, Williamson 2007), treat the putative triviality by other means (Vetter 2016), or simply accept impossible worlds. To demonstrate the problem, here is a pair of famous examples, originally from Nolan (1997):
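
Schematically, and as a standard simplification (the similarity ordering $<_{@}$, read ‘is more similar to the actual world than’, is notation introduced here):

```latex
A \mathbin{\Box\!\!\rightarrow} C \text{ is true} \iff
\neg\exists w\,(w \Vdash A)
\;\text{ or }\;
\exists w'\,\bigl[\,w' \Vdash A \wedge C \,\wedge\, \neg\exists w\,(w \Vdash A \wedge \neg C \wedge w <_{@} w')\,\bigr]
```

Since a counterpossible antecedent is true at no possible world, every counterpossible falls under the first (vacuous) disjunct and comes out true, whence the threat of triviality.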

(6) If Hobbes had (secretly) squared the circle, sick children in the mountains of South America at the time would have cared.

(7) If Hobbes had (secretly) squared the circle, sick children in the mountains of South America at the time would not have cared.

Although intuitions are often controversial in philosophy, there is something intriguing about (7). Namely, although its antecedent is impossible, we seem to take (7) to be true. For, in fact, no sick children would have cared if the antecedent had been true, since this would have made no difference to sick children whatsoever. By the same reasoning, (6) is intuitively false; for again, no sick children would have cared if the antecedent had been true. Consequently, the occurrence of these distinct truth values requires an analysis that can distinguish them, and an impossible worlds analysis is one candidate.

Disagreements in metaphysical disputes display another feature of impossibility. Metaphysicians argue with each other about lots of issues. For instance, they variously disagree about the nature of properties. Suppose that trope theory is the correct theory of properties and so is necessarily true (see the article on Universals). Then both the theory of properties as transcendent universals and the theory of properties as immanent universals are (a) impossible, and (b) distinct. But they are true in the same possible worlds (that is, none), and to distinguish these two views in terms of where they are true requires impossible worlds. Similarly, proponents of modal realism and modal ersatzism disagree about the nature of possible worlds (see the article on Modal Metaphysics). But they both agree that if either of these theories is true, it is true in all possible worlds; necessarily so. By this reasoning, one’s opponent’s claim is necessarily wrong; she defends an impossible hypothesis. For more details on this (and other issues) see Nolan (1997) and Miller (2017).

Although theories of fiction abound, analyses of fiction in terms of possible worlds dominate. According to such analyses, what happens in a work of fiction happens at a set of possible worlds, full stop. The problem, however, is that fiction fairly often hosts impossible events.

For instance, ‘Sylvan’s Box’ (Priest 1997) is a short story about an object which is inconsistent because it is both empty and non-empty. The usual treatment of such stories uses the terminology of worlds which realise what is stated in the story. However, Priest claims, any interpretation of the story in terms of subsets of internally consistent sets of possible worlds (see Lewis 1978) misrepresents the story.

Of course, these applications of impossible worlds are not exhaustive and, as we will see in Section 5, impossible worlds have limitations. Let us, however, suppose that the case for impossible worlds is irresistible, and that they are, at least to some extent, as applicable as possible worlds are. If so, one must still consider the cost of such a commitment. Since the theoretical application of any entity brings with it an ontological burden, an optimal trade-off between application and ontological commitment must be sought, and impossible worlds are an excellent example of such a trade-off. The next section overviews several metaphysical issues about impossible worlds.

4. The Metaphysics of Impossible Worlds

The introduction of theoretical entities requires a view about their metaphysical nature. The introduction of impossible worlds is not an exception: it requires an answer to the question of what impossible worlds are and, additionally, of how impossible worlds differ from possible worlds. We can call these the identification question and the kind question, respectively.

The identification question concerns the nature of impossible worlds. Like proponents of possible worlds, proponents of impossible worlds disagree about their metaphysical nature and divide into several camps. To start with realism about worlds, these views share the common idea that, whatever worlds are, they exist. Probably the most prominent version of such realism is genuine modal realism. While modal realism is the thesis that possible worlds exist, genuine modal realism claims, moreover, that possible worlds exist in the very same way as ‘we and our surroundings’ do; they are as concrete as we, buildings, animals, and cars are. What is more, every individual exists in one possible world only (for more on transworld identity, see the article on David Lewis). The actual world has temporal and spatial dimensions and, consequently, every possible world fulfils this requirement. According to genuine modal realism, then, possible worlds are concrete spatiotemporal entities.

Another version of modal realism is presented by Kris McDaniel (2004). His strategy is to withdraw Lewis's commitment to individuals existing in one possible world only. Instead, he allows an individual to exist in many worlds and thus to bear the exists at relation to more than one world. This so-called modal realism with overlap is still genuine realism, because it accepts concrete possible worlds and their inhabitants.

A modified version of modal realism is presented by Yagisawa (2010). Under the name of modal dimensionalism, Yagisawa postulates so-called metaphysical indices, which represent the spatial, temporal, and modal dimensions of the world. According to Yagisawa, the world has spatial, temporal, and, additionally, modal dimensions, in the same way that I have my own spatial, temporal, and modal dimensions. Namely, my temporal dimension includes, among other things, me as a child, me nine minutes ago, and me in the future. My spatial dimension is the space occupied by my hands, my head, and the rest of my body. My modal dimension includes my possible stages of being a president, a football player, and so forth.

A more moderate version of modal realism is modal ersatzism. Like genuine modal realism, modal ersatzism takes possible worlds to be existent entities (see again the article on Modal Metaphysics), yet denies that they have spatiotemporal dimensions. Naturally, such a brand of realism attracts fans of a less exotic ontology, because possible worlds are identified with surrogates already accepted elsewhere in philosophy: complete and consistent sets of propositions or sentences, complete and consistent properties, or complete and consistent states of affairs. Usually, these entities are non-concrete in nature and are parts of the actual world (the view is sometimes called actualism). For an excellent overview of the various kinds of ersatzism, see Divers (2002).

Finally, there are views according to which worlds do not in fact exist. Under the name of modal anti-realism, such views reject modal realism for primarily epistemological reasons, although they deny neither the meaningfulness of modal talk nor the accuracy of its worlds semantics. Although modal anti-realism is less widespread in the literature, several positive proposals have demonstrated its prospects. For instance, Rosen (1990) proposes a strategy of ‘fictionalising’ the realist's positions in the shape of useful fictions. Although his primary target is genuine modal realism, it is easy to generalise the idea to other versions of modal realism.

The kind question asks whether possible and impossible worlds are of the same metaphysical category or fall under metaphysically distinct categories. To the extent that we identify possible worlds with a certain kind of entity (the identification question) and accept impossible worlds for one reason or another, the response to the kind question predetermines our views about the nature of impossible worlds.

A positive response to the kind question is put forward in Priest (1997). As he puts it, anyone who accepts a particular theory of possible worlds, be they concrete entities, abstract entities, or non-existent entities, has no cogent reason to posit an ontological difference between merely possible and impossible worlds (see Priest 1997: 580–581). The idea is expressed by the so-called parity thesis, which says that theories of the nature of possible worlds should apply equally to impossible worlds.

Now, particular versions of modal realism together with the parity thesis lead to specific views of impossible worlds. To begin with genuine modal realism, extended genuine modal realism accepts concrete possible and impossible worlds. These worlds are spatiotemporal entities, and whatever is impossible holds in some concrete impossible world. For the idea of paraphrasing Lewis’s original argument from ways, see Naylor (1986) and Yagisawa (1988).

Modal dimensionalism as well as modal realism with overlap find their impossible alternatives relatively easily. In the former, I simply have impossible stages as well. In the latter, an individual can have mutually incompatible properties at two different worlds. For example, an individual, a, bears the exists at relation to a world at which a is round, and bears the exists at relation to another world at which a is square, thus representing the situation ‘a is round and square’. Since it is impossible to be both round and square, this is an impossible situation.

A moderate version of modal realism, modal ersatzism, combined with the parity thesis is, so to speak, in an easier position. Given that the ersatzer's metaphysical commitments, be they sets, sentences, propositions, or what have you, are already assumed to exist, it is only one step further to introduce impossible worlds as their incomplete and inconsistent counterparts without incurring any additional ontological commitments.

Proponents of the negative response to the kind question, on the other hand, deny the parity thesis. Impossible worlds, according to them, are a distinct kind of entity. Interestingly, such a metaphysical stance allows for a ‘recombination’ of philosophically competitive positions. For instance, hybrid genuine modal realism, indicated in Restall (1997) and Divers (2002) and further developed in Berto (2010), posits concrete possible worlds as the best representation of possible phenomena, but abstract impossible worlds as the ‘safest’ representation of impossible phenomena. In other words, what is possible happens in concrete possible worlds as genuine modal realism conceives them, while what is impossible is represented by more moderate ontological commitments. In particular, possible worlds are concrete, and impossible worlds are, according to hybrid genuine modal realism, sets of propositions modelled in accordance with genuine modal realism. Notably, hybrid genuine modal realism is only one of many options for opponents of the parity thesis. As mentioned earlier, the hybrid approach to modality allows us to interpret the possibility/impossibility pair in terms of distinct metaphysical categories and, depending on the choice of category, to explicate the duality via the identification question (possible tropes/inconsistent sets; maximal properties/impossible fictions; or other alternatives). Given that this variety of versions remains an underdeveloped region of modal metaphysics in the early twenty-first century, it is a challenge for the future to fill in the gaps in the literature.

5. Troubles with Impossible Worlds

Undoubtedly, any introduction of suspicious entities into philosophy comes with problems, and impossible worlds are no exception. Besides attracting incredulous stares, they face abundant philosophical arguments against them.

A general argument against impossible worlds points to the analysis of modality. For, as far as the goal is to provide an account of modal concepts in more graspable notions, the introduction of impossible worlds puts the accuracy of the analysis at stake. Recall the initial impossibility schema (I):

(I) It is impossible that P if and only if there is no possible world, i, such that at i, P.

An impossible worlds reading substitutes the occurrence of ‘no possible world’ with ‘an impossible world’, along the lines of (I*):

(I*) It is impossible that P if and only if there is an impossible world, i, such that at i, P.

(I*) mimics the structure of (P), and proponents of impossible worlds may be tempted by it. However, (I*) is only superficially tempting. For, although (P) and (I*) are both biconditionals, it is hard to accept the right-to-left direction of (I*). For instance, although it is impossible that A & ~A, the conjuncts themselves may be contingent and so, by (P), true in some possible world. Yet A also holds at the impossible world at which A & ~A holds, so the right-to-left direction of (I*) would wrongly classify the contingent A as impossible. Such a disanalogy between (P) and (I*) makes impossible worlds of little use in the theory of impossibility in the first place.
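The worry can be put schematically. The following is a sketch, not the article's own formalism, writing ‘at w: A’ for truth at a world and Poss/Imp for possibility and impossibility of worlds:

```latex
% (P): it is possible that A iff A holds at some possible world
\Diamond A \;\leftrightarrow\; \exists w\,(\mathrm{Poss}(w) \wedge \text{at } w{:}\ A)

% Right-to-left half of (I*): if A holds at some impossible world, A is impossible
\exists i\,(\mathrm{Imp}(i) \wedge \text{at } i{:}\ A) \;\rightarrow\; \neg\Diamond A

% Counterexample: let i be an impossible world at which A \wedge \neg A holds.
% Then A holds at i, so the schema yields \neg\Diamond A.
% But A may be contingent, so by (P) we also have \Diamond A. Contradiction.
```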

Other problems concern particular theories of modality. Starting with extended modal realism, Lewis himself did not feel the need to dedicate much space to its rejection, for two reasons. The first is that it is doubtful that one can provide an extensional, non-modal analysis of modality and, at the same time, distinguish possible worlds from impossible worlds without making use of modal notions. The second is that a restricting modifier, like ‘in a world’, works by limiting domains of implicit and explicit quantification to a certain part of all that there is, and therefore has no effect on the truth-functional connectives (Lewis 1986, 7, fn. 3). By this, Lewis means that insofar as you admit an impossible thing in some impossible world, you thereby admit impossibility into reality. Since this is an unacceptable conclusion, Lewis rejects the extended version of his modal realism via a simple argument:

1. There is a concrete impossible world at which (A & ~A)

2. At w (A & ~A) if and only if at w A & ~(at w A)

3. The right-hand side of (2) is literally a true contradiction

4. The Law of Non-Contradiction is an undisputable logical principle.

C. There are no concrete impossible worlds.

For Lewis, restricting modifiers work by limiting domains of implicit and explicit quantification to a certain part of all there is. Therefore, ‘On the mountain both P and Q’ is equivalent to ‘On the mountain P, and on the mountain Q’; likewise, ‘On the mountain not P’ is equivalent to ‘Not: on the mountain P’. As a result, ‘On the mountain both P and not P’ is equivalent to the overt contradiction ‘On the mountain P, and not: on the mountain P’. In other words, there is no difference between a contradiction within the scope of the modifier and a plain contradiction that has the modifier within it. See Lewis (1986: 7, fn. 3) for a full exposition of this argument.

Modal dimensionalism is not without problems either. Jago (2013) argues that adding an impossible stage of ‘Martin's being a philosopher and not a philosopher’ to my modal profile generates undesired consequences, for modal stages are subject to existential quantification in the same way that actual stages are. And since both actual and modal stages exist, they instantiate inconsistencies, full stop. For responses, see Yagisawa (2015) as well as Vacek (2017).

Modal realism with overlap has its problems too. A simple counterexample to it relies on the (usually) indisputable necessity of identity and on Leibniz's law, the principle that identical objects share all their properties. The argument goes as follows: it is impossible for Richard Routley not to be Richard Sylvan, because they are one and the same person (in 1983 Richard Routley adopted the last name ‘Sylvan’):

    1. It is impossible that ∼ (Routley = Sylvan)

Therefore, there is an impossible world i where ∼ (Routley = Sylvan). Now, take the property ‘being a logician’. It is impossible for Routley but not Sylvan to be a logician which, by modal realism with overlap’s lights, means that Routley, but not Sylvan, bears the being a logician relation to a world i. Generalising the idea,

    2. for some property P, in i Routley has P, but Sylvan does not.

However, by Leibniz’s law, it follows that ∼ (Routley = Sylvan). And that is absurd.
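The argument can be regimented in a short sketch (with r for Routley, s for Sylvan, and L for the property of being a logician; the numbering here is for exposition only):

```latex
1.\quad r = s \text{ actually; by the necessity of identity, } \Box(r = s) \\
2.\quad \text{Hence } \neg(r = s) \text{ holds only at an impossible world } i \\
3.\quad \text{Overlap represents } i \text{ by: } L(r) \text{ at } i \;\wedge\; \neg L(s) \text{ at } i \\
4.\quad \text{Leibniz's law: } r = s \;\rightarrow\; \forall P\,(P(r) \leftrightarrow P(s)) \\
5.\quad \text{From 3 and 4: } \neg(r = s) \text{ simpliciter, contradicting 1}
```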

What about modal ersatzism? Recall that this alternative to (extended) modal realism takes possible worlds to be existent entities of a more modest kind. The move from ersatz possible worlds to ersatz impossible worlds, together with the parity thesis, leads to the inheritance of the various problems of ersatz theories. One such problem is the failure of the reductive analysis of modality: as Lewis argues, any ersatzist theory must at some point appeal to primitive modality and thus give up the project of analysing modality in non-modal terms. Another problem is that entities like states of affairs, properties, and propositions are intensional in nature and thus do not contribute to a fully extensional analysis. For scepticism about intensional entities, see Quine (1956); for more problems with modal ersatzism, see Lewis (1986: ch. 3).

Modal fictionalism can be a way of avoiding the realist's problems. For, if ‘according to the possible worlds fiction’ explains possibility, then ‘according to the possible and impossible worlds fiction’ offers a finer-grained analysis with no exotic ontological commitments. But again, such a relatively easy move from possibility to impossibility faces the threat of inheriting the problems of modal fictionalism. One such difficulty is that fictionalism is itself committed to weird abstract objects, to wit, ‘stories’. Another worry concerns the story operator itself: unless the operator is understood as primitive, it should receive an analysis in more basic terms. And the same applies to the ‘according to the possible and impossible worlds fiction’ operator.

Moreover, even if modal fictionalists provide us with an account of their fiction operator, it will probably face the same importation problem that the modal realist does. The argument goes as follows. First, suppose that what holds in the fiction is closed under classical logic. Second, then, if something is true in the fiction, so are any of its classical consequences. Third, given the explosion principle (everything follows from a contradiction), an inconsistent fiction implies that every sentence is true in the fiction. Fourth, take an arbitrary sentence A; by the previous step, ‘according to the fiction, A’ is true. Fifth, given that the fictionalist analyses actual truth via the fiction, ‘according to the fiction, A’ implies: actually A. But it seems literally false to say that any arbitrary sentence is actually true. For more details, see Jago (2014).
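The steps of the importation argument can be compressed into a short derivation. This is a sketch only, writing F(·) as shorthand for ‘according to the possible and impossible worlds fiction’ and ⊢_CL for classical consequence:

```latex
1.\quad F(A \wedge \neg A) && \text{the fiction is inconsistent} \\
2.\quad F(A) \text{ and } A \vdash_{\mathrm{CL}} B \;\Rightarrow\; F(B) && \text{closure under classical consequence} \\
3.\quad A \wedge \neg A \vdash_{\mathrm{CL}} B \text{ for arbitrary } B && \text{explosion} \\
4.\quad F(B) \text{ for arbitrary } B && \text{from 1--3} \\
5.\quad \text{actually } B, \text{ for arbitrary } B && \text{via the fictionalist's biconditional}
```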

The hybrid view has its limitations too. One limitation is that the view introduces two ontological categories and is therefore less parsimonious than theories following the parity thesis. Moreover, as Vander Laan (1997, 600) points out, there does not seem to be any ontological principle which would justify two different ontological categories for one modal language, namely the language of possibility and impossibility.

Yet, there are at least two responses available to the hybrid view. First, its proponents might simply claim that if the best theory of modality plays out that way, that is, if the theory which best systematises our intuitions about modality vindicates such a distinction, the objection is illegitimate. Second, even the ersatzer faces the same objection: the actual world has two different interpretations and, consequently, falls under two different ontological categories. It can be understood either as us and all our (concrete) surroundings, or as an abstract representation thereof.

Undoubtedly, there is much more to be said about the metaphysics of impossible worlds. Since they come in various versions, one might worry whether any systematic account of such entities is available. Be that as it may, the story does not end with metaphysics. Besides semantic applications of impossible worlds and their metaphysical interpretation, there are logical criteria which complicate their story even more. The next section therefore discusses the logical boundaries (if any) of impossible worlds.

6. The Logic of Impossible Worlds

One might wonder how far impossibility goes because, one might think, impossible worlds have no logical borders. One way to think of impossible worlds is as so-called ‘logic violators’. According to this definition, impossible worlds are worlds where the laws of a logic fail. I use the indefinite article here because it is an open question what the correct logic is. Suppose we grant classical logic an exclusive status among logics. Then impossible worlds are worlds where the laws and principles of classical logic cease to hold, and the proper description of the logical behaviour of impossible worlds requires a different logic.

We might therefore wonder whether there is a logic under which impossible worlds are closed. One candidate is paraconsistent logic. Such logics are not explosive: it is not the case that anything follows from contradictory premises. Formally, paraconsistent logics deny the principle α, ~α |= β, and their proponents argue that there are (impossible) worlds at which inconsistent events happen. Given their denial of the explosion principle, paraconsistent logics seem to be the right tool for an accurate and appropriate analysis of such phenomena. For an extensive discussion of paraconsistent logics, see Priest, Beall, and Armour-Garb (2004).
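To see exactly what is given up, here is the standard classical derivation of explosion, sketched step by step; paraconsistent logics typically block the final step, disjunctive syllogism:

```latex
1.\quad A \wedge \neg A && \text{premise} \\
2.\quad A && \text{from 1, } \wedge\text{-elimination} \\
3.\quad \neg A && \text{from 1, } \wedge\text{-elimination} \\
4.\quad A \vee B && \text{from 2, } \vee\text{-introduction} \\
5.\quad B && \text{from 3 and 4, disjunctive syllogism}
```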

However, some examples show that even paraconsistent logics are not sufficient for describing the plenitude of the impossible. For example, paraconsistent logics usually preserve at least some principles of classical logic (see the article on Paraconsistent Logic) and thus cannot treat the impossibilities of their violations. A solution would be to introduce a weaker alternative which violates those principles too. But even this manoeuvre seems not to be enough because, as Nolan (1997) puts it, there is a tension between the need for at least some logical principles on one side and the impossibility of their failure on the other: ‘if for any cherished logical principle there are logics available where that principle fails… if there is an impossible situation for every way things cannot be, there will be impossible situations where even the principles of (any) subclassical logics fail’ (Nolan 1997, 547). In other words, if we think of a weaker logic as validating fewer arguments, we easily end up with logical nihilism (Russell 2018). Another option is to admit a plurality of logics (Beall & Restall 2006) or, controversially, to accept the explosion principle and fall into trivialism: the view that every proposition is true (Kabay 2008).

7. Impossible Worlds and Hyperintensionality

Let me finish with the question of the place of impossibility in reality. The question is whether impossibility is a matter of reality or a matter of representing it; in other words, are impossible matters representational or non-representational? While much of the literature treats impossibility as a matter of representation, some authors have located the failure of necessary equivalence, that is, the failure of substituting extensionally as well as intensionally equivalent terms, within the world.

To be more precise, levels of analysis ascend from the extensional, to the intensional, to the hyperintensional. Nolan (2014) suggests that a position in a sentence is extensional if expressions with the same extension can be substituted into that position without changing the truth-value of the sentence. An intensional position is then a non-extensional one such that expressions that are necessarily co-extensional are freely substitutable in it while preserving the sentence's truth-value. Finally, a hyperintensional position is neither extensional nor intensional: substituting necessary equivalents in it may fail to preserve the truth-value of the sentence. Apparently, the introduction of impossible worlds moves philosophical analyses to the hyperintensional level, since even when A and B are necessarily equivalent (be the necessity logical, mathematical, or metaphysical), substituting one for the other may result in a difference in truth-value. But if that is so, and if some hyperintensional phenomena are non-representational, then impossibility is a part of reality itself.

There are several cases which both display worldly features and are hyperintensional. For instance, some counterfactual conditionals with impossible antecedents are non-representational (Nolan 2014). Also, Schaffer (2009) contrasts the supervenience relation with the grounding relation and concludes that there are substantive grounding questions regarding mathematical entities and the relations between them, whereas the corresponding supervenience claims turn out to be vacuously true. Explanation as a hyperintensional phenomenon might be understood non-representationally as well, namely as an asymmetric relation between the explanans and its necessarily equivalent explanandum. Among other things, some dispositions (Jenkins & Nolan 2012), the notion of intrinsicality (Nolan 2014), the notion of essence (Fine 1994), and omissions (Bernstein 2016) might be understood in the same way. Indeed, all these examples are subject to criticism, but the reader might at least feel some pressure to distinguish between ‘merely’ representational and non-representational hyperintensionality. For more details, see Nolan (2014) and Berto & Jago (2019) and, for an alternative approach to hyperintensionality, Duží, Jespersen, Kosterec, and Vacek (2023).

8. Conclusion

Impossible worlds have been with us, at least implicitly, since the introduction of possible worlds. The reason for this is the equivalence of the phrases ‘it is possible’ and ‘it is not impossible’, or ‘it is impossible’ and ‘it is not possible’. The controversies about impossible worlds can also be understood as a sequel to the controversies about possible worlds. In the beginning, possible worlds were hard to understand, and this produced some difficult philosophical debates. It is therefore no surprise that impossible worlds have come to follow the same philosophical path.

9. References and Further Readings

  • Beall, J. & Restall, G. (2006). Logical Pluralism, Oxford: Oxford University Press.
  • A developed account of a position according to which there is more than one (correct) logic.

  • Bernstein, S. (2016). Omission Impossible, Philosophical Studies, 173, pp. 2575–2589.
  • A view according to which omissions with impossible outcomes play an explanatory role.

  • Berto, F. (2008). Modal Meinongianism for Fictional Objects, Metaphysica 9, pp. 205–218.
  • A combination of Meinongian tradition and impossible worlds.

  • Berto, F. (2010). Impossible Worlds and Propositions: Against the Parity Thesis, Philosophical Quarterly 60, pp. 471–486.
  • A version of modal realism which distinguishes distinct impossible propositions, identifies impossible worlds as sets and avoids primitive modality.

  • Berto, F. & Jago, M. (2019). Impossible Worlds, Oxford: Oxford University Press.
  • A detailed overview of theories of impossible worlds.

  • Divers, J. (2002). Possible Worlds, London: Routledge.
  • A detailed overview of the possible world ontologies.

  • Duží, M.; Jespersen, B.; Kosterec, M.; Vacek, D. (eds.). (2023). Transparent Intensional Logic, College Publications.
  • A detailed survey of the foundations of Transparent Intensional Logic.

  • Fine, K. (1994). Essence and Modality: The Second Philosophical Perspectives Lecture, Philosophical Perspectives 8, pp. 1–16.
  • An influential argument that essence cannot be understood in purely modal terms.

  • Fine, K. (2012). Counterfactuals Without Possible Worlds, Journal of Philosophy 109: 221–246.
  • The paper argues that counterfactuals raise a serious difficulty for possible worlds semantics.

  • Jago, M. (2013). Against Yagisawa’s Modal Realism, Analysis 73, pp. 10–17.
  • This paper attacks modal dimensionalism from both possibility and impossibility angles.

  • Jago, M. (2014). The Impossible: An Essay on Hyperintensionality, Oxford: Oxford University Press.
  • A detailed overview of the history, as well as the current state of impossible worlds discourse.

  • Jenkins, C.S. & Nolan, D. (2012). Disposition Impossible, Noûs, 46, pp. 732–753.
  • An original account of impossible dispositions.

  • Kabay, P. D. (2008). A Defense of Trivialism, PhD thesis, University of Melbourne.
  • A defence of trivialism, on the basis that there are good reasons for thinking that trivialism is true.

  • Kiourti, I. (2010). Real Impossible Worlds: The Bounds of Possibility, Ph.D. thesis, University of St Andrews.
  • A defence of Lewisian impossible worlds. It provides two alternative extensions of modal realism by adding impossible worlds.

  • Lewis, D. (1973). Counterfactuals, Cambridge, MA: Harvard University Press.
  • One of the first explicit articulations of modal realism and its analysis of counterfactual conditionals.

  • Lewis, D. (1978). Truth in Fiction, American Philosophical Quarterly 15, pp. 37–46.
  • An approach which aims at dispensing with inconsistent fictions via the method of union or the method of intersection. According to Lewis, we can explain away an inconsistent story via maximally consistent fragments of it.

  • Lewis, D. (1986). On the Plurality of Worlds, Oxford: Blackwell.
  • A detailed defence of modal realism, including an overview of arguments against modal ersatzism.

  • McDaniel, K. (2004). Modal Realism with Overlap, Australasian Journal of Philosophy 82, pp. 137–152.
  • An approach according to which the worlds of modal realism overlap, resulting in transworld identity.

  • Miller, K. (2017). A Hyperintensional Account of Metaphysical Equivalence, Philosophical Quarterly 67: 772–793.
  • This paper presents an account of hyperintensional equivalency in terms of impossible worlds.

  • Naylor, M. (1986). A Note on David Lewis’ Realism about Possible Worlds, Analysis 46, pp. 28–29.
  • One of the first modus tollens arguments given in response to modal realism.

  • Nolan, D. (1997). Impossible Worlds: A Modest Approach, Notre Dame Journal of Formal Logic 38, pp. 535–572.
  • Besides giving an original account of counterpossible conditionals, this paper introduces the strangeness of impossibility condition: any possible world is more similar (nearer) to the actual world than any impossible world.

  • Nolan, D. (2014). Hyperintensional Metaphysics, Philosophical Studies 171, pp. 149–160.
  • A survey of hyperintensional metaphysics, arguing that some hyperintensional phenomena are non-representational.

  • Priest, G. (1997). Sylvan’s Box: A Short Story and Ten Morals, Notre Dame Journal of Formal Logic, 38, pp. 573–582.
  • A short story which is internally inconsistent, yet perfectly intelligible.

  • Priest, G., Beall, J. C., & Armour-Garb, B. (eds.). (2004), The Law of Non-Contradiction, Oxford: Oxford University Press.
  • A collection of papers dedicated to the defence as well as the rejection of the law of non-contradiction.

  • Russell, G. (2018). Logical Nihilism: Could There Be No Logic?, Philosophical Issues 28: 308–324.
  • A proposal according to which there is no logic at all.

  • Schaffer, J. (2009). On What Grounds What, in D. Chalmers, D. Manley, and R. Wasserman (eds.), Metametaphysics: New Essays on the Foundations of Ontology, Oxford: Oxford University Press, pp. 347–383.
  • A defence of the grounding relation as providing a philosophical explanation.

  • Quine, W. V. (1956). Quantifiers and Propositional Attitudes, Journal of Philosophy 53, pp. 177–187.
  • According to Quine, propositional attitude constructions are ambiguous, yet an intensional analysis of them does not work.

  • Restall, G. (1997). Ways Things Can’t Be, Notre Dame Journal of Formal Logic 38: 583–96.
  • In the paper, Restall identifies impossible worlds with sets of possible worlds.

  • Rosen, G. (1990). Modal Fictionalism, Mind 99, pp. 327–354.
  • An initial fictionalist account of modality, ‘parasiting’ on the advantages of modal realism, while avoiding its ontological commitments.

  • Vacek, M. (2017). Extended Modal Dimensionalism, Acta Analytica 32, pp. 13–28.
  • A defence of modal dimensionalism with impossible worlds.

  • Vander Laan, D. (1997). The Ontology of Impossible Worlds, Notre Dame Journal of Formal Logic 38, pp. 597–620.
  • A theory of impossible worlds as maximal inconsistent classes of propositions, as well as a critique of various alternative positions.

  • Vetter, B. (2016). Counterpossibles (not only) for Dispositionalists, Philosophical Studies 173: 2681–2700
  • A proposal according to which the non-vacuity of some counterpossibles does not require impossible worlds.

  • Williamson, T. (2017). Counterpossibles in Semantics and Metaphysics, Argumenta 2: 195–226.
  • A substantial contribution to the semantics of counterpossible conditionals.

  • Yagisawa, T. (1988). Beyond Possible Worlds, Philosophical Studies 53, pp. 175–204.
  • An influential work about the need for impossible worlds, especially with regard to modal realism.

  • Yagisawa, T. (2010). Worlds and Individuals, Possible and Otherwise, Oxford: Oxford University Press.
  • A detailed account of modal dimensionalism and its ontological, semantic and epistemological applications.

  • Yagisawa, T. (2015). Impossibilia and Modally Tensed Predication, Acta Analytica 30, pp. 317–323.
  • The paper provides responses to several arguments against modal dimensionalism.

 

Author Information

Martin Vacek
Email: martin.vacek@savba.sk
Institute of Philosophy at the Slovak Academy of Sciences
Slovakia

Boethius (480-524)

Boethius was a prolific Roman scholar of the early sixth century AD who played an important role in transmitting Greek science and philosophy to the medieval Latin world. His most influential work is The Consolation of Philosophy. Boethius left a deep mark on Christian theology and provided the basis for the development of mathematics, music, logic, and dialectic in the medieval Latin schools. He devoted his life to political affairs as the first minister of the Ostrogothic regime of Theodoric in Italy, while pursuing Greek wisdom in devoted translations, commentaries, and treatises.

During the twentieth century, his academic modus operandi and his Christian faith became matters of renewed discussion. There are many reasons to believe his academic work was not a servile translation of Greek sources.

The Contra Eutychen is the most original work by Boethius: original both in its speculative solution and in its methodology of applying hypothetical and categorical logic to the analysis of terms, propositions, and arguments. The Consolation of Philosophy is also original, though many scholars restrict its originality to its methodology and its arrangement of elements, attributing its content to the Neoplatonic school of Iamblichus, Syrianus, and Proclus. Boethius was primarily inspired by Plato, Aristotle, and Pythagoras. His scientific, mathematical, and logical works are not original, as he himself recognized.

Table of Contents

  1. Life
  2. Time
  3. Writings
    1. Literary Writings 
      1. The Consolation of Philosophy
    2. Theological Treatises
    3. Scientific Treatises
    4. Logical Writings
      1. Translations
      2. Commentaries
      3. Treatises
        1. On the Division
        2. On the Topics
        3. On the Hypothetical Syllogisms
      4. Treatises on Categorical Syllogisms
        1. The De Syllogismo Categorico
        2. The Introductio ad Syllogismos Categoricos
  4. Influence of the Treatises
  5. His Sources
  6. References and Further Reading

1. Life

Anicius Manlius Severinus Boethius (c. 480-524 AD), Boethius, was a prominent member of the gens Anicia, a family with a strong presence in republican and imperial Roman life. From the time of Constantine its members were converts and advocates of the doctrine of the Christian church of Rome. The study of Latin epigraphy (compare Martindale 1980, p. 232) and some biographical details about his childhood given by Boethius himself (Consolation of Philosophy ii, 3, 5) suggest that his father was another Boethius, Narius Manlius Boethius, who was praetorian prefect, then prefect of Italy, and finally consul and patrician in 487 AD (compare Cameron 1981, pp. 181-183). It is not clear whether this Boethius is the one who was prefect of Alexandria in 457 AD, but Courcelle (1970, p. 299, n.1) suggested as much to lend more weight to his hypothesis that Boethius could have used his social position to travel to Athens or Alexandria to learn Greek and deepen his study of philosophy and theology. What seems more likely is that Boethius’ grandfather was the Boethius who was murdered by Valentinian III in 454 AD (compare Martindale 1980, p. 231).

After his father’s death, which occurred when Boethius was a child, he received the protection of Quintus Aurelius Symmachus Memmius, who belonged to a very influential family of the Roman nobility. Later, Boethius married Symmachus’s daughter, Rusticiana, sealing a family alliance that was disturbing to Theodoric, the Ostrogoth king, who had come to Italy to impose authority and governance on the collapsed Western Empire at the request of Flavius Zeno, the Eastern Roman Emperor. Boethius’ political commitment to Rome is attested not only by his public office of magister officiorum, the highest political rank that could be exercised in the reign of Theodoric, but also by the education and cursus honorum of his two sons, Symmachus and Boethius, who became senators (Consolation of Philosophy ii, 3, 8; 4, 7).

The prestige of Boethius in sixth-century Rome is attested not only by the honors granted him during his youth (some of which were denied to his older fellows; compare Consolation of Philosophy ii, 3), but also by the requests from friends and relatives for commentaries and treatises explaining difficult matters. In addition, Cassiodorus (Magnus Aurelius Cassiodorus), well known for founding the monastery of Vivarium in 529 AD, reports a scientific mission entrusted to Boethius by Theodoric: presenting a horologium, a clock regulated by a measured flow of water, to Gundobad, king of the Burgundians (compare Variae I, 45 and I, 10. Mommsen ed. 1894).

2. Time

Theodoric must have been an ominous character for the Romans, perhaps the lesser evil. The difficulty of moving from the pure ideal of Rome to Theodoric’s nascent eclectic culture must have marked the context in which Boethius lived. By this time the unity of the Western Roman Empire was fragile, and political power was continuously disputed by various Germanic warlords, from Genseric, the Vandal king, in 455 AD until Theodoric, the Ostrogoth king, in 526 AD.

It was Theodoric who organized a more stable government and achieved greater political unity among the leaders of the two dominant ethnic groups, the Romans and the Ostrogoths. In 493 Theodoric established in Ravenna, northern Italy, the political and diplomatic capital of his government after defeating Odoacer there, a campaign planned by the Emperor Flavius Zeno in Constantinople as punishment for Odoacer’s failure to respect the authority of the Eastern Empire.

Theodoric’s reign (he died in 526, two years after Boethius) kept the administrative structure of the Roman Empire and sustained a joint government between the two main ethnic and political groups. Theodoric was not an entirely uneducated man (though see Excerpta Valesiana II, 79. Moreau ed. 1968) and would have gained familiarity with Greek culture during his stay in Constantinople as a hostage from the age of eight; it is known that, whatever his motivation, he regularly respected the Roman civil institutions (but see the opinion of Anderson 1990, pp. 111-115). Boethius himself delivered a panegyric to Theodoric during the ceremony in which Boethius’ two sons were elected consuls (Consolation of Philosophy ii, 3, 30).

But the association of the two powers, the military power of Theodoric and the political power of Rome, had many reasons to turn adverse. By this time, Boethius must have been not only the most influential Roman politician in the Ostrogoth government but also the most distinguished public figure of the Roman class. The personal and political opposition was, after all, deep and irreconcilable. The Arianism of Theodoric and the Catholicism of Boethius clashed after 518, when Justin was appointed Roman emperor of the East: Justin abolished the Henoticon, embarked on a policy of restoring the Catholic faith of the Council of Chalcedon, and began a rapprochement with Rome (Matthews 1981, p. 35). The most difficult years came as the aging Theodoric grew worried about the fate of his non-Catholic Eastern allies and about his own stability in Italy. Around 524 AD, Boethius was accused of treason by Theodoric himself, without the right to be defended by the Roman Senate, which was also accused of treason (compare also Excerpta Valesiana II, 85-87. Moreau ed. 1968). He was quickly imprisoned near Pavia, where he remained until his execution.

The detailed circumstances of the accusation have never been entirely clear to posterity, even though Boethius gives a summary of them in his Consolation of Philosophy i, 4. In essence, the charge was treason against the Ostrogoth government by seeking an alliance with Justin in Constantinople. The evidence adduced for this charge included Boethius’ defense of the Senate against the accusation of protecting the senator Albinus (who had been accused of the same charge before), and the exhibition of some letters sent to Justin containing expressions hostile to Theodoric and his regime. Boethius calls these letters apocryphal (Consolation of Philosophy i, 4). Probably Albinus was not in secret negotiations with the Eastern Empire, and Boethius was guilty of nothing more than wishing to clear the Senate of treason and concealment. Nevertheless, he was accused and punished for the alleged conspiracy, at the mercy of a violent, despotic king who allowed him no proper defense and never proved the charge against him. The execution of Boethius came quickly, and the murder of his father-in-law, Symmachus, followed soon after, as did the abuse and death of Pope John I. During his imprisonment, Boethius wrote his masterpiece, The Consolation of Philosophy, which was not only a work of great influence in the Middle Ages and the Renaissance, but one of the most respected works of human creativity.

3. Writings

Boethius’ writings divide into three kinds: philosophical, theological, and scientific. The scientific writings divide in turn into mathematical and logical works. The relationship between Boethius and his sources remains complex. His completed works were traditionally regarded as original, and the disorganized and incomplete shape of some of them, especially the scientific treatises, was explained by his imprisonment and execution. Many twentieth-century scholars, however, believe that this classical description applies only to the Theological treatises and partly to the Consolation of Philosophy, since elsewhere Boethius depends on his sources more than he produces original work. Yet this opinion overgeneralizes from the situation of the scientific writings, and the truth lies somewhere in between.

a. Literary Writings

i. The Consolation of Philosophy

Boethius’ philosophical work is identified with his Consolatio Philosophiae, which combines stylistic refinement, through the interleaving of prose and poetry, with philosophical ideas set within a conceptual framework based on a Neoplatonic interpretation of Aristotle and some subtle touches of Christian philosophy (although this has been a matter of discussion). The unexpected visit of Lady Philosophy in his prison allows for a dialogue with a wonderful counterpoint between human opinion and the wisdom of Lady Philosophy, although Boethius says that Lady Philosophy is merely the announcer of the light of truth (IV, 1, 5). The themes raised by The Consolation of Philosophy, such as the nature of fortune, human happiness, the existence of God and evil, and human freedom and divine providence, became the focus of attention for the Christian metaphysics of the Latin Middle Ages.

In Book I, Boethius briefly reviews his political life and the reasons for his accusation and imprisonment, showing that he is fully aware of who accused him. In Book II, he discusses the nature of fortune and the reasons why no one should trust in it. In Book III he argues (in a sense already different from what we might expect from the philosophy of Plato and Aristotle) that true happiness (beatitudo) is identified with divinity itself, whose nature is unique and simple. He identifies the highest good (perfectum bonum) with the father of all things (III, 10, 43), and maintains that it is not possible to possess happiness without first having access to the highest good. The difference between his theory of happiness and that of Aristotle and Plato is that Boethius places God as a sine qua non condition for the possession of happiness, implying that every man must trust in God’s provident wisdom in order to be happy. In Book IV, he addresses the problem of the existence of evil in the realm of one who knows and can do everything (IV, 1, 12: in regno scientis omnia potentis omnia). The allusion to the kingdom of God (regnum dei) is highly significant for the work’s implicit Christianity, especially because Boethius completes the allusion with the metaphor of the gold and clay vessels that the master of the house disposes, a symbol found in the Letters of Saint Paul (2 Timothy 2, 20; 2 Corinthians 4, 7; and Romans 9, 21) and cited extensively by the Patristic authors. In Book V, Boethius examines one of the most complex problems of post-Aristotelian philosophy: the compatibility of human freedom and divine foreknowledge (divina praescientia). Boethius’ treatment would be of great theoretical value for later philosophy, and traces of his discussion can be seen in Thomas Aquinas, Valla, and Leibniz (compare Correia (2002a), pp. 175-186).

Neoplatonic influence has been discerned in the Consolation, especially that of Proclus (412-485 AD) and of Iamblichus. But this fact is not enough to affirm that Boethius in the Consolation merely follows Neoplatonic authors. The issue is whether there is an implicitly Christian philosophy in this work. The absence of the name of Christ and of Christian authors has led some scholars to believe that the Consolation is not a work of Christian philosophy, and Boethius’ Christianity has even been doubted on this account (compare Courcelle, 1967, pp. 7-8). Added to this is the consideration that, if Boethius was a Christian, he would presumably have sought consolation in the Christian faith rather than in pagan philosophy. However, it must be remembered that the genre of philosophical consolation, a form of logotherapy, was traditional in Greek philosophy. Crantor of Soli, Epicurus, Cicero, and Seneca had written consolations about the loss of life, exile, and other ills that affect the human spirit. Cicero in his Tusculan Disputations (3.76) even shows that the different philosophical schools were committed to the task of consoling the dejected, and he recognizes the various strategies the schools applied according to how they conceived the place of human beings in the universe. Boethius was surely aware of this tradition (Cicero wrote his consolation for himself), and on this assumption Boethius’ Consolation of Philosophy fits within consolation as a universal genre, together with the themes of universal human grief (evil, destiny, fortune, unhappiness). At the same time, Boethius would be renewing this literary genre as a Christian one, since Lady Philosophy does not lead Boethius’ spirit towards pagan philosophy in general, but rather towards a new philosophy that deserves to be called Christian. We see this not only in the evocations of Saint Paul’s letters and the new theory of happiness, but also when, in Book V, Boethius identifies God with the efficient principle (de operante principio) capable of creating from nothing (V, 1, 24-29). Hence, he adapts Aristotle’s definition of chance by incorporating the role of divine providence (providentia) in disposing all things in their times and places (locis temporibusque disponit: V, 1, 53-58).

b. Theological Treatises

The Opuscula sacra or Theological treatises are original efforts to resolve some theological controversies of his time, which were absorbed by Christological issues and by the Acacian schism (485-519). Basically, they are anti-heretical writings, in which a refined set of Aristotelian concepts and arguments is put in favor of the Chalcedonian formula on the unity of God and against both Nestorius’ dyophysitism and Eutyches’ monophysitism. Boethius claims to be seeking an explanation of these issues using Aristotle’s logic. This makes him a forerunner of those theologians who ground theological speculation in logic. The following five treatises are now accepted as authentic: (1) De Trinitate, (2) Whether the Father and the Son and the Holy Spirit are substantially predicated of divinity, (3) How substances are good in virtue of their existence without being substantially good, (4) Treatise against Eutyches and Nestorius, and (5) De Fide Catholica. The most original and influential of Boethius’ theological treatises is the Contra Eutychen.

Because of the absence of explicit Christian doctrines in the Consolation of Philosophy, the authenticity of the theological treatises was doubted by some scholars in the early modern era. But Alfred Holder discovered a fragment of Cassiodorus in the manuscript of Reichenau, later published by H. Usener (1877), in which the existence of these treatises and their attribution to Boethius is reported. Cassiodorus served as senator with Boethius and succeeded him in the office of magister officiorum in Theodoric’s government. Cassiodorus mentions that Boethius wrote “a book on the Trinity, some chapters of dogmatic teaching, and a book against Nestorius” (compare Anecdoton Holderi, p. 4, 12-19. Usener ed. 1877). This discovery confirmed not only the authenticity of Boethius’ theological treatises but also cleared the doubts over whether Boethius was a Christian. The Treatise against Eutyches and Nestorius has been recognized as the most original of Boethius’ theological treatises (Mair, 1981, p. 208). By the year 518 Boethius had translated, commented on, and treated a large part of Aristotle’s Organon (compare De Rijk’s chronology, 1964). Thus, Boethius makes use of Aristotelian logic as an instrument. In Contra Eutychen, he uses all the resources relevant to the subject in question: division and definition, hypothetical syllogisms, distinction of ambiguous meanings of terms, and the detection and resolution of the fallacies involved. This is accompanied by the idea that human intelligence can store arguments for or against a certain thesis so as to possess a copia argumentorum (a stock of arguments; p. 100, 126-130), suggesting that there can be several arguments demonstrating the same point under discussion, a matter reminiscent of Aristotle’s Topics.
Thus, Boethius gives a perfect order of exposition, rigorously declared at the beginning of the central discussion of the treatise: 1) define nature and person, and distinguish these two concepts by means of a specific difference; 2) expose the extreme errors of the positions of Nestorius and Eutyches; 3) present the middle path of the solution of the Catholic faith. Boethius’ solution is the Catholic solution to the theological problem of the two natures of Christ. On his view, Christ is one person and two natures, the divine and the human, which are perfect and united without being confused. He is thus consubstantial with humanity and consubstantial with God.

c. Scientific Treatises

Within the scientific writings, we find mathematical and logical works. Boethius left scientific writings on arithmetic, geometry, and music; no work on astronomy has survived, but Cassiodorus (Variae I, 45, 4) attributed to him one based on Ptolemy. Similarly, Cassiodorus attributes to him a work on geometry with a translation of Euclid’s Elementa, but what we now count as Boethius’ writing on geometry does not correspond to Cassiodorus’ description. His logical works concern demonstrative and inventive logic. A treatise on division (De divisione) has also been credited to him (compare Magee 1998), but one on definition (De definitione) has been rejected as authentic and attributed to Marius Victorinus (Usener, 1877). Boethius devotes three types of writings to logic: translations, commentaries, and treatises.

Boethius uses the term quadrivium (De Institutione Arithmetica I, 1, 28) to refer to arithmetic, geometry, music, and astronomy, which reveals that he may have been engaged not only in the development of these sciences but also in their teaching. However, his works on logic do not reveal whether this plan also covered the other disciplines of the trivium, grammar and rhetoric.

The scientific writings of Boethius occupied an important place in the education of Latin Christendom. The influence of these treatises on the medieval quadrivium, and even on the early modern tradition, was such that only Descartes’ analytical geometry, Newton’s physics, and Newton’s and Leibniz’s infinitesimal calculus were able to prevail over the Boethian scientific tradition.

It is known that Boethius’ approach to arithmetic and music is speculative and mathematical. Arithmetic is understood as the science of numbers and does not necessarily include calculation. And music is a theoretical doctrine of proportion and harmony, having nothing directly to do with music-making or performance techniques. In De Institutione musica I, 2, 20-23, Boethius distinguishes three types of music: cosmic (mundana), human (humana), and instrumental, ordered according to their universality. The mundana is the most universal, since it corresponds to celestial harmony and the order of the stars: some stars rotate lower, others higher, but all form a whole with one another. It is followed by human music, which is what we, as humans, experience and reproduce directly within ourselves. It is song, the melodies created by poetry. It is responsible for our own harmony, especially the harmonious conjunction between the sensitive and the intellectual parts of our nature, just as the bass and treble voices articulate in musical consonance. The third is instrumental music, generated by the tension of a string, by the breath of air, or by percussion.

At the beginning of his De Institutione Musica (I, 10, 3-6), following Nicomachus of Gerasa, Boethius adopts uncritically not only Pythagoras’ theory of music but also the supernatural context in which Pythagoras announces the origin of music through a divine revelation, given by the symmetric and proportional sounds coming from a blacksmith’s hammers. The marked Pythagorean tendency of his theory of music prevents Boethius from giving a richer account of music that would include the more empirical approach of Aristoxenus, whom Boethius criticizes just as he criticizes the Stoics in logic.

d. Logical Writings

Boethius left three kinds of works on logic: translations, commentaries, and treatises. Their content revolves mainly around Aristotle’s logical writings: Categories, De Interpretatione, Prior Analytics, Posterior Analytics, Topics, and Sophistical Refutations, traditionally called the Organon. But even if Boethius wanted to devote works to each one, he did not complete the task.

i. Translations

As a translator, Boethius shows consummate artistry. His translations are literal and systematic. They do not lack the force of the Greek, and they never spoil the style of the Latin. His literal method of translation has been compared to that developed later by William of Moerbeke (who translated some works of Aristotle and other Greek commentators) for the use and study of Thomas Aquinas. Boethius’ translations from Greek are so systematic that scholars can often determine what Greek term lies behind a given Latin word. Boethius’ translations are edited in Aristoteles Latinus (1961-1975). Translations of every work of Aristotle’s Organon have been found. In addition to these works, Boethius translated the Isagoge of Porphyry, an introduction (eisagoge is the Greek term for ‘introduction’) to Aristotle’s Categories.

In these translations, Boethius exceeded the art of Marius Victorinus, who had earlier translated into Latin Aristotle’s Categories and De Interpretatione, as well as Porphyry’s Isagoge. Boethius himself pointed out certain errors and confusions in Marius Victorinus and informs us that Vetius Praetextatus’ translation of Aristotle’s Prior Analytics, rather than being a translation of Aristotle’s text, is a paraphrase of the paraphrase made by Themistius of this Aristotelian work (compare Boethius in Int 2, p. 3; Meiser ed. 1877-1880). The translation of Greek works into Latin was common. Apuleius of Madaura, a Latin writer of the second century AD born and settled in North Africa, had translated the arithmetic of Nicomachus of Gerasa and written an abridgement of Aristotelian logic. In general, we can say that Boethius saw very clearly the importance of systematic translations of Greek philosophy and science into Latin as an educational service to the nascent Latin Christianity of Europe.

ii. Commentaries

Even if Boethius planned to comment on the complete Organon, he finished only the following:

    • On Porphyry’s Isagoge (In Porphyrii Isagogen, two editions).
    • On Aristotle’s Categories (In Aristotelis Categorias, two editions).
    • On Aristotle’s De Interpretatione (In Aristotelis Peri hermeneias, two editions).
    • On the Topics of Cicero (In Ciceronis Topica, one edition).

Though no commentaries on the Posterior Analytics, Topics, or Sophistical Refutations exist, this does not mean that Boethius was unaware of these works. In his Introductio ad syllogismos categoricos (p. 48, 2), when dealing with singular propositions, he seems to follow explanations closely related to a commentary on the Sophistical Refutations. Even if his plan of producing a double commentary on every work is not original, he explained this modus operandi: the first edition contains everything that is simple to understand, and the second edition focuses on everything that is more subtle and requires deeper, longer explanation.

The influence of these commentaries on medieval education was enormous, as they contain key concepts that became central to both the logica vetus and medieval philosophy. In fact, his commentaries on Porphyry’s Isagoge contain the so-called problem of universals (Brandt ed. 1906, p. 24, 159), and his commentaries on De Interpretatione give the linguistic and semantic basis of the long tradition of logical analysis among medieval thinkers up to Peter Abelard. Additionally, his commentaries on Cicero’s Topics were influential in the history of logic and the sciences by dividing logic into demonstrative and dialectical branches, underlining the distinction between Aristotle’s Analytics and Topics.

Boethius’ commentaries often proceed through long explanations, but they contain valuable information on the history of logic, as they build upon many doctrines of earlier commentators on Aristotle. The commentary on Aristotle’s logic had a long Greek tradition, and Boethius knew how to select those commentators and doctrines that improve the understanding of Aristotle’s text. In that tradition, the earlier author exercised an important influence over the later. However, there is important evidence that Boethius is not following a continuous copy of any one of the earlier Greek commentators.

iii. Treatises

Boethius not only translated and commented on the works of Aristotle and Porphyry, but also wrote some monographs or logical treatises that differ from his commentaries, for they are not intended to provide the correct interpretation of Aristotle’s text but to improve the theory itself. If we leave aside the De definitione, five treatises are recognized:

    • On Division (De divisione liber)
    • On Categorical syllogism (De syllogismo categorico)
    • Introduction to categorical syllogisms (Introductio ad syllogismos categoricos)
    • On Topical Differences (De Topicis differentiis)
    • On hypothetical syllogisms (De hypotheticis syllogismis).

1. On the Division

Boethius’ De divisione transmitted the Aristotelian doctrine of division, that is, the procedure of dividing a genus into subordinate species. The aim of division is to define (compare Magee 1998). For example, the genus animal may be divided into the species rational and non-rational.

In Aristotle’s works there are examples of divisions (for example, Politics 1290b21, De generatione et corruptione 330a1), which shows that Boethius accepted this method of definition regardless of the fact that its origin was Platonic. The logical procedure was also appreciated by the first Peripatetics; the proof is that, as Boethius reports at the beginning of this treatise, Andronicus of Rhodes published a book on division because of its considerable interest to Peripatetic philosophy (De divisione 875D; compare also Magee 1998, pp. xxxiv-xliii). Also, the Neoplatonic philosopher Plotinus studied Andronicus’ book, and Porphyry adapted its contents in commenting on Plato’s Sophist (De divisione 876D). The species of division recounted by Boethius are as follows: any division is either secundum se or secundum accidens. The first has three branches: (i) a genus into species (for example, animal is divided into rational and non-rational); (ii) a whole into its parts (for example, the parts of a house); and (iii) a term into its meanings (for example, ‘dog’ means a quadruped capable of barking, a star in Orion, and an aquatic animal). The division secundum accidens is also threefold: (i) a subject into its accidents (for example, men into black, white, and an intermediate color); (ii) accidents into subjects (for example, among the things we seek, some belong to the soul and some to the body); and finally, (iii) accidents into accidents (for example, among white things some are liquid and some are solid).

It is worth noting that not all genus-species divisions are dichotomous, as they were for the Platonists, because the Peripatetic philosophers also accepted that a genus can be divided into three or more species, the general condition for a division to be correct being that it must never have fewer than two species and never infinitely many (De divisione 877C-D). This, it seems, is one of the differences between Aristotle and the Platonists. In fact, Aristotle criticizes the Platonists’ dependence on dichotomous divisions by arguing that if all divisions were dichotomous, then the number of animal species would have to be even, a multiple of two (Aristotle, Parts of Animals I, 3, 643a16-24).
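
The sixfold classification and the well-formedness condition just described can be put into a short illustrative program. This is a modern sketch, not anything found in Boethius: the Python representation, the helper name, and the filled-in parts of the house example are assumptions made only to display the structure of the doctrine.

```python
# Illustrative (hypothetical) encoding of Boethius' species of division in
# De divisione. Labels follow the article's summary; the data layout is a
# modern convenience, not Boethius' own presentation.
DIVISIONS = {
    "secundum se": [
        ("genus into species", ("animal", ["rational", "non-rational"])),
        ("whole into parts", ("house", ["foundations", "walls", "roof"])),
        ("term into meanings", ("dog", ["barking quadruped",
                                        "star in Orion",
                                        "aquatic animal"])),
    ],
    "secundum accidens": [
        ("subject into accidents", ("man", ["black", "white", "intermediate"])),
        ("accidents into subjects", ("things sought", ["of the soul", "of the body"])),
        ("accidents into accidents", ("white things", ["liquid", "solid"])),
    ],
}

def correct_division(members):
    """Boethius' condition (De divisione 877C-D): a correct division has
    at least two members and never infinitely many (any finite list here)."""
    return len(members) >= 2

# Every example above satisfies the condition, and, against the Platonists,
# divisions need not be dichotomous: 'dog' divides into three meanings.
assert all(correct_division(members)
           for branch in DIVISIONS.values()
           for _, (_, members) in branch)
```

The point of the sketch is only that the Peripatetic constraint is a lower bound of two plus finiteness, not a requirement of exactly two.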

2. On the Topics

Boethius’ idea of logic is complex and by no means reduces to formal demonstration. When he refers to logic as such (compare In Isagogen 138, 4-143, 7; De Top Diff 1173C; and In Ciceronis topica I 2, 6-8), he distinguishes between demonstrative and dialectical syllogisms and criticizes the Stoics for leaving out the dialectical part of logic and maintaining a narrower conception of it. In fact, Boethius does not reduce logic to demonstration but divides it into two parts: judgement and the discovery of arguments. Since he identifies the former with the Analytics and the latter with the Topics, the division serves to reconcile these two main procedures of logic. Logic is both demonstration and the justification of reasonable premises, as the syllogism can manage necessary or merely possible matters.

In the In Ciceronis Topica, Boethius comments on Cicero’s Topics. The objective of this work is to take up Ciceronian forensic cases and explain them within his understanding of the Peripatetic tradition of Aristotle’s Topics. Boethius’ notion of a topic is based on what seems to be the Theophrastean notion: a universal proposition, primitive and indemonstrable, and known in and of itself (Stump, 1988, pp. 210-211). A topic is accepted as true through human experience, and its function is to serve as a premise within the argument sought. The topic may lie within or outside the argumentation. One example in the treatise (1185C) appears to be autobiographical: the question whether to be ruled by a king is better than to be ruled by a consul. According to Boethius, one should argue thus: the rule of a king lasts longer than the government maintained by a consul. If we assume that both governments are good, it must be said that a good that lasts longer is better than one that lasts a shorter time. Consequently, to be ruled by a king is better than to be governed by a consul. This argument clearly displays the topic or precept: goods that last longer are more valuable than those that last a shorter time. Within the argument it works as an indemonstrable proposition. Boethius often calls such a proposition a maximal proposition (propositio maxima).

Boethius called dialectic the discipline that studies this type of argumentation. A syllogism can be categorical or hypothetical, but it is dialectical if the matter of its premises is only credible and not demonstrative. In De Top Diff 1180C, Boethius introduces a general classification of arguments in which demonstrative arguments can be non-evident to human opinion and nevertheless demonstratively true. In fact, our science contains innumerable non-evident affirmations that are entirely demonstrable. Dialectical arguments, on the other hand, are evident to human opinion but may lack demonstration.

Boethius devotes the entire Book 5 of this commentary to discussing dialectical hypothetical syllogisms, and here, as in his treatise on hypothetical syllogisms, the role of belief (fides) is quite important in defining dialectical arguments in general, as is explained further in the following section.

3. On the Hypothetical Syllogisms

The De hypothetico syllogismo (DHS), perhaps originally titled by Boethius De hypotheticis syllogismis, as Brandt (1903, p. 38) suggested, was published in Venice in 1492 (1st ed.) and 1499 (2nd ed.). This double edition was the basis for the editions of Basel (1546 and 1570) and the subsequent publication by J.P. Migne in Patrologia Latina, vols. 63 and 64 (1st ed. 1847; 2nd ed. 1860), which appears to be a reprint of the Basel edition. The editions of 1492 and 1499 form the editio princeps, which was regularly used for the study of this work until the critical revision of the text by Obertello (1969). DHS is the most original and complete surviving ancient treatise on hypothetical logic. It was not systematically studied during medieval times, but it enjoyed a renaissance in the twentieth century through the works of Dürr (1951), Maroth (1979), Obertello (1969), and others.

According to the conjecture of Brandt (1903, p. 38), it was written by Boethius between 510 and 523, but De Rijk (1964, p. 159) maintains that it was written between 516 and 522. In DHS Boethius does not follow any text of Aristotle but rather Peripatetic doctrines. This is because Aristotle wrote nothing about hypothetical syllogisms, although he was aware of the difference between categorical and hypothetical propositions. Thus, De Interpretatione 17a15-16 states that “A single statement-making sentence is either one that reveals a single thing or one that is single in virtue of a connective” (Ackrill’s translation, 1963), and later (17a20-22) he adds, “Of these the one is a simple statement, affirming or denying something of something, the other is compounded of simple statements and is a kind of composite sentence” (Ackrill’s translation, 1963). Even though Aristotle promised to explain how categorical and hypothetical syllogisms are related to each other (compare Prior Analytics 45b19-20 and 50a39-b1), he never did.

Aristotle only developed a syllogistic logic of simple or categorical propositions, that is, propositions saying something of something (for example, “Virtue is good”). The syllogism with conditional premises (for example, “The man is happy, if he is wise”) was covered by the first associates of Aristotle, Theophrastus and Eudemus (DHS I, 1,3). Boethius’ DHS contains the most complete surviving information about this Peripatetic development. The theory is divided into two parts: disjunctive and connective propositions. A connective (conditional) proposition has the form “If P, then Q”, where P and Q are simple propositions. A disjunctive proposition has the form “Either P or Q”. Boethius presents two indemonstrable syllogisms for each part. The first disjunctive syllogism is “It is P or it is Q. But it is not P. Therefore, it is Q”; the second, “It is P or it is Q. But it is not Q. Therefore, it is P”. As to connectives, the first syllogism is “If it is P, then it is Q. But it is P. Then, it is Q”, and the second is “If it is P, then it is Q. But it is not Q. Then, it is not P”. Boethius accepts that “It is P or it is Q” is equivalent to “If it is not P, then it is Q”. Accordingly, Boethius leaves implicit the concordance between hypothetical and disjunctive syllogisms:

First disjunctive syllogism: It is P or it is Q; but it is not P; therefore, it is Q.

First hypothetical syllogism: If it is not P, it is Q; but it is not P; therefore, it is Q.

Second disjunctive syllogism: It is P or it is Q; but it is not Q; therefore, it is P.

Second hypothetical syllogism: If it is not P, it is Q; but it is not Q; therefore, it is P.

The theory also develops more complex syllogisms and classifies them into modes. For example, DHS II, 11, 7 correctly states: “The eighth mode is what forms this proposition: ‘If it is not a, it is not b; and if it is not b, it is not c; but it is c; therefore, it must be a’”.
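If Boethius’ conditional is read as the modern material conditional (an assumption of the reconstruction, since the Boethian conditional is arguably stronger than truth-functional implication), these forms can be checked mechanically by enumerating truth-value assignments. The following sketch verifies the first disjunctive syllogism, the second hypothetical syllogism, and the eighth mode:

```python
from itertools import product

def implies(p, q):
    """Material conditional: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q

def valid(premises, conclusion, n_vars):
    """A form is valid iff no assignment makes every premise true and the conclusion false."""
    return all(conclusion(*vs)
               for vs in product([True, False], repeat=n_vars)
               if all(pr(*vs) for pr in premises))

# First disjunctive syllogism: it is P or it is Q; but it is not P; therefore it is Q.
assert valid([lambda p, q: p or q, lambda p, q: not p], lambda p, q: q, 2)

# Second hypothetical syllogism: if it is not P, it is Q; but it is not Q; therefore it is P.
assert valid([lambda p, q: implies(not p, q), lambda p, q: not q], lambda p, q: p, 2)

# Eighth mode (DHS II, 11, 7): if not a, not b; if not b, not c; but it is c; therefore a.
assert valid([lambda a, b, c: implies(not a, not b),
              lambda a, b, c: implies(not b, not c),
              lambda a, b, c: c],
             lambda a, b, c: a, 3)
```

All three assertions pass, which is only to say that these Peripatetic modes are truth-functionally sound under the modern reading; it does not settle the historical question of how Boethius himself understood the conditional.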

Boethius’ development does not use conjunctions, and this marks an important difference between the Stoic theory and the original Peripatetic development. This fact leads Boethius to deny the hypothetical affirmation “If it is P, then it is Q” by attaching the negative particle to the consequent: thus, “If it is P, then it is not Q” (DHS I, 9,7). This is an internal negation, in contrast to Stoic negation, which is external or propositional, since it applies the negative particle to the entire proposition. This explains why he does not consider Stoic axioms based on conjunction in DHS, as he did in his In Ciceronis Topica, V.

The question of whether Boethius is right in believing that the theory comes from Theophrastus and other Peripatetics is still difficult to answer. Speca (2001, p. 71) doubts that we can presently be certain of its Peripatetic provenance, because the sources cannot be traced back further than the end of the second century AD, and by then the hypothetical theory was already conflated terminologically with Stoic terminology. He is right, if we look at Boethius’ examples, like “It is day, then it is light”, and so forth, which come from the Stoic school. On the other hand, Bobzien (2002 and 2002a) has supported the contrary view and is inclined to believe in the historical accuracy of Boethius’ account.

The scrupulous view of Speca (2001) is methodologically safe, but it is worth noticing that there are at least three important differences between Boethius’ hypothetical syllogistic and Stoic logic. One is negation: Peripatetic hypothetical negation follows the pattern of categorical negation; the negative particle must be placed before the most important part of the proposition, which in the case of a conditional proposition is the consequent. Thus, as said, the negation of “If P, then Q” will be “If P, then not Q”. Stoic negation places the negative particle before the entire proposition, so that the negation will be “It is not the case that if P, then Q”.

The second difference is that Boethius, in his DHS, distinguishes material and formal conclusions just as he does in his treatises on categorical logic (compare DHS I, iv, 1-2; 3; and I, ii, 1-7; II, ii, 7). In a hypothetical syllogism, to affirm the consequent is fallacious, but if the terms mutually exclude each other (as if they had an impossible matter) and the third hypothetical mood is given (“If it is not P, it is Q”), there will be a syllogism. Boethius gives the example “If it is not day, it is night. It is night. Therefore, it is not day”. But the conclusion does not obtain if ‘white’ and ‘black’ are substituted for P and Q. Thus, a syllogism, either categorical or hypothetical, is logically valid if it does not depend on a specific matter of the propositions to be conclusive. Material syllogisms, on the contrary, either categorical or hypothetical, are valid only under certain matters within a certain form; they are not logical conclusions, for they are not valid universally, that is, in every propositional matter. Accordingly, Boethius (DHS II, iv, 2) distinguishes between the nature of the relation (natura complexionis) and the nature of the terms (natura terminorum).
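The distinction can be illustrated truth-functionally (a modern reconstruction, not Boethius’ own apparatus): the bare form of the day/night argument affirms the consequent and so admits a countermodel, yet every assignment respecting the matter of the terms, night holding exactly when day does not, also verifies the conclusion:

```python
from itertools import product

def implies(p, q):
    """Material conditional."""
    return (not p) or q

# Form: if it is not day, it is night; it is night; therefore it is not day.
premises = lambda day, night: implies(not day, night) and night
conclusion = lambda day, night: not day

# Formally the inference affirms the consequent, so a countermodel exists:
countermodels = [(d, n) for d, n in product([True, False], repeat=2)
                 if premises(d, n) and not conclusion(d, n)]
assert countermodels  # day=True, night=True refutes the bare form

# Restricted to the matter of the terms (night iff not day),
# no countermodel survives: the inference is materially valid.
assert all(conclusion(d, n)
           for d, n in product([True, False], repeat=2)
           if n == (not d) and premises(d, n))
```

The restriction `n == (not d)` is the sketch’s stand-in for the natura terminorum: ‘day’ and ‘night’ are exclusive and exhaustive, whereas ‘white’ and ‘black’ are merely exclusive, which is why the substitution fails there.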

The third difference lies in the function Boethius assigns to fides, belief (DHS I, 2,4; I, 2,5; II, 1,2). The role of fides is at the core of Boethius’ DHS. According to him, if someone argues through the first indemonstrable, or by any other hypothetical syllogism, he needs to confirm the minor premise, which is a belief. It is not the syllogism as such that is in doubt, but its conclusion, which is conditional on the truth of the categorical proposition. Boethius’ reason is the originality and primitiveness of categorical syllogisms. He calls categorical syllogisms ‘simple’ and hypothetical syllogisms ‘non-simple’, because the latter resolve into the former (DHS I, 2,4: Non simplices vero dicuntur quoniam ex simplicibus constant, atque in eosdem ultimos resolvuntur). The role of belief in Boethius’ theory of hypothetical syllogisms is also emphasized in his ICT; and if Stump (1988, pp. 210-211) is right in recognizing the activity of Theophrastus behind Boethius’ account of Aristotle’s Topics, then Theophrastus and the first Peripatetics could well stand behind DHS.

iv. Treatises on Categorical Syllogisms

The De syllogismo categorico (DSC) and Introductio ad syllogismos categoricos (ISC) are two treatises on categorical syllogisms composed by the young Boethius. Their contents are similar and almost parallel, a fact that has prompted various explanations during the early twenty-first century. They have greatly influenced the teaching of logic in medieval Western thought, especially the former, which is the only one of the two that contains syllogistic logic.

1. The De Syllogismo Categorico

DSC was written by Boethius early in his life, perhaps around 505 or 506 AD (for the chronology of Boethius’ works in logic, compare De Rijk 1964). Despite its importance, it did not receive a critical edition until the work by Thomsen Thörnqvist (2008a). In the oldest codices (for example, Orleans 267, p. 57), DSC is entitled “Introductio in syllogi cathegoricos”, but this title changed to De syllogismo categorico after the editions by Martianus Rota (Venice, 1543) and Henrichus Loritus Glareanus (Basel, 1546). The edition of Migne (1891) is based on these two sixteenth-century editions. During the twentieth century, most scholars corrected this title to De categoricis syllogismis, after Brandt (1903, p. 238, n. 4) argued for the plural.

The sources of DSC seem to be a certain introduction to categorical syllogistic that Porphyry had written to examine and approve the syllogistic theory of Theophrastus, whose principles are inspired by Aristotle’s Prior Analytics. This is suggested by what Boethius says at the end of the work (p. 101, 6-8): “When composing this on the introduction to the categorical syllogism as fully as the brevity of an introductory work would allow, I have followed Aristotle as my principal source and borrowed from Theophrastus and Porphyry occasionally” (Thomsen Thörnqvist transl.). The existence of a similar work by Theophrastus is confirmed by various ancient references; for example, Boethius attributes to him the work “On affirmation and negation” (in Int 2, 9, 25; Meiser ed.; also Alexander of Aphrodisias, in An Pr 367, 15 and so forth), and Alexander of Aphrodisias profusely cites Theophrastus’ own Prior Analytics (in An Pr 123, 19 and 388, 18; Wallies ed.; on the works of Theophrastus, see Bochenski 1947 and Sharples 1992, pp. 114-123). Moreover, J. Bidez, in his study of the life and works of Porphyry (compare Bidez 1923, p. 198, and Bidez 1964, p. 66*), confirms the existence of a work entitled “Introduction to categorical syllogisms” written by Porphyry.

DSC is divided into two books. In the first, Boethius reviews the theory of simple propositions in a way that recalls his commentaries on Aristotle’s De Interpretatione (ed. Meiser 1877-1880). However, DSC exceeds both the commentaries and what Aristotle teaches in his De Interpretatione, for it includes some additional matters: (i) the law of subalternation when reviewing the logical relationships of the Square of Opposition; (ii) a broader explanation of conversion, including conversion by contraposition (which Aristotle only developed for universal affirmative propositions); (iii) conversion by accident for universal negative propositions (which Aristotle did not include); and (iv) the division of simple propositions.

The second book is a synopsis of the central part of Aristotle’s theory of the syllogism (Prior Analytics I, 2-8) plus Theophrastus’ doctrine of indirect syllogistic moods. Theophrastus added five indirect moods to Aristotle’s four moods of the first figure. Medieval logicians knew these moods under the technical names Baralipton, Celantes, Dabitis, Fapesmo, and Frisesomorum. Moreover, the second book of DSC (69, 8-72, 11) contains a complete explanation of the definition of the syllogism, which recalls Alexander of Aphrodisias’ teaching in his commentary on Aristotle’s Topics. Again, DSC is more technical and elaborate than Aristotle’s Prior Analytics. In addition, Boethius’ explanation of the reduction of the imperfect moods of the second and third syllogistic figures to the first four moods of the first figure (Barbara, Celarent, Darii, and Ferio) is more systematic than Aristotle’s own explanations.

A careful reading of the logical contents of DSC also makes clear that Boethius (DSC 17, 10) follows a division of categorical propositions to define the three main logical operations of Aristotelian logic: the opposition of propositions (contradiction, contrariety, and subcontrariety); the conversion of propositions (simple, by accident, and by contraposition); and syllogisms, with their figures, syllogistic moods, and the main extensions of the first figure. This division is not Boethius’ own: Alexander of Aphrodisias (in An Pr 45, 9) already makes full use of it; there are remnants in Apuleius (PeriH 7, 9-14, p. 183) and Galen (Inst Log 6, 3); and it reappears in Boethius’ time in Ammonius (in An Pr 35, 26) and Philoponus (in An Pr 40, 31). It is also present in later authors.

After commenting on the definitions of the elements of simple propositions (name, verb, indefinite name and verb, and phrase), Boethius takes a pair of propositions and divides the cases as follows: a pair of simple propositions either has terms in common or does not. If the two propositions have no term in common, they bear no logical relation to each other. If they have terms in common, there is an alternative: either both terms are in common or only one term is. If both terms are in common, they either keep the same order or not: when the terms have the same order, the theory of opposition is stated; when they exchange their order, the theory of conversion is defined. If, on the other hand, the pair has only one term in common, the syllogistic theory appears.
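The division just described is in effect a small decision tree over shared terms. A minimal sketch, with terms modeled as plain strings, quantity and quality ignored, and the function name my own rather than anything in Boethius:

```python
def classify_pair(p1, p2):
    """Boethius' division (DSC 17, 10), sketched as a decision tree.

    Each proposition is modeled only by its (subject, predicate) terms;
    quantity and quality are ignored in this simplification."""
    (s1, pr1), (s2, pr2) = p1, p2
    common = {s1, pr1} & {s2, pr2}
    if not common:
        return "no logical relation"   # no shared term
    if len(common) == 2:
        if (s1, pr1) == (s2, pr2):
            return "opposition"        # same terms, same order
        return "conversion"            # same terms, order exchanged
    return "syllogistic"               # exactly one shared (middle) term

assert classify_pair(("man", "animal"), ("man", "animal")) == "opposition"
assert classify_pair(("man", "animal"), ("animal", "man")) == "conversion"
assert classify_pair(("man", "animal"), ("animal", "mortal")) == "syllogistic"
assert classify_pair(("man", "animal"), ("stone", "heavy")) == "no logical relation"
```

The sketch captures only the branching structure of the division; the actual theories of opposition, conversion, and the syllogism then apply within each branch.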

2. The Introductio ad Syllogismos Categoricos

Boethius is the author of DSC and ISC, two treatises on categorical logic. They show a notable similarity and are to some extent parallel, which raises the question of why Boethius wrote both. The first modern explanation proposed a strong dependence between them: Prantl (1855, I, p. 682, n. 80) believed that the first book of DSC was an excerpt of ISC. But the presence of syllogistic logic in the second book of DSC and its total absence in ISC is enough to contradict Prantl’s explanation, and Brandt (1903, p. 245) was right in refuting him. However, the reason why the treatises are so alike had not yet been found. Murari (1905) and McKinlay (1907) suggested that the second book of DSC (dedicated to syllogistic logic) was originally the second book of ISC, while the first book of DSC was not by Boethius but was attached to the codices later, in the Middle Ages. According to McKinlay’s later revision of his hypothesis (1938, p. 218), ISC should be identified with Boethius’ Institutio categorica, a work thought to be lost and mentioned by Boethius in his treatise on hypothetical syllogisms (833B).

McKinlay’s hypothesis lost support due to later works by De Rijk (1964, p. 39) and Magee (1998, pp. xvii-xix). In the early twenty-first century, in her critical edition of both treatises, Christina Thomsen Thörnqvist (2008a and 2008b) gave a new explanation. She thinks (2008a, p. xxxix) that ISC is a revision of the first book of DSC and that Boethius intended to revise both of DSC’s books, but this original plan was not completed: while writing the first book, he realized that he had gone too far in what was supposed to be nothing more than an introduction to Aristotle’s syllogistic logic. In this conjecture she follows Marenbon (2003, p. 49).

In any case, ISC differs from DSC not only in its absence of syllogistic logic. ISC (15.2) incorporates the notion of strict and non-strict definitions of the elements of the categorical proposition (name, verb, and so on). It takes a marked interest in proofs based on the matters of the proposition (29.18). It gives considerable attention to singular propositions, including material that was not in his commentaries (48.2). Additionally, ISC contains a crucial difference: the logic of indefinite propositions. It states their opposition (51.9) and their equivalences (62.9), and it develops conversion by contraposition in more detail (69.1).

The divisions of DSC and ISC

ISC cannot be the breviarium Boethius promised to write in his second commentary on Aristotle’s De Interpretatione (in Int 2, p. 251, 8-9), although Shiel (1958, p. 238) thinks the contrary; the decisive reason is that ISC contains more than Boethius’ commentaries on De Interpretatione do. The essence of ISC must instead be sought in its division.

After developing the linguistic section of Aristotle’s De Interpretatione, both ISC and DSC present their plans by establishing a division of a pair of categorical propositions. These divisions contain identical branches, but they also contain important differences. On the one hand, the division of ISC is not as complete as that of DSC, because it does not incorporate the theory of the syllogism; on the other, it is more specific than that of DSC in incorporating indefinite terms, on which DSC says nothing. The following description shows how the two divisions overlap and what the differences between them are:

On the one hand, if ISC were the first book of DSC, then the indefinite propositions (which only ISC develops) would play no part in the second book of DSC (which is only on syllogisms), and their introduction would accordingly be purposeless. On the other hand, if the plan of ISC were a revision of DSC’s two books, then Boethius would have been obliged to develop a theory of syllogisms with indefinite premises, which is unlikely, since ISC’s division does not contain syllogistic logic (despite ISC’s being an introduction to syllogistic). And even if one thinks that this could have been so, there are several doubts concerning the logical capacity of Boethius’ sources to do so, even though the issue was not unknown. Boethius indeed recounts (in Int 2, 12-26, p. 316) that Plato and others made conclusive syllogisms with negative premises, which Aristotle does not allow in his Prior Analytics (I, 4, 41b7-9). According to Boethius, this is possible because Plato in Theaetetus (186e3-4) knew that a negative categorical proposition can sometimes be replaced with the corresponding affirmation with an indefinite predicate term. Boethius (in Int 2, 9, p. 317) cites Alexander of Aphrodisias as one of the ancient authors dealing with syllogisms with indefinite premises, which is accurate, because Alexander, in his commentary on Aristotle’s Prior Analytics, quotes another syllogism of this sort (in An Pr 397, 5-14). Even Aristotle’s De caelo (269b29-31) has another example. However, this does not seem sufficient to believe that Boethius in his ISC was able to introduce a theory of syllogistic logic with indefinite premises. (On this point, compare I. M. Bochenski (1948), pp. 35-37; Thomas (1949), pp. 145-160; Álvarez & Correia (2012), pp. 297-306; and Correia (2001), pp. 161-174.)

4. Influence of the Treatises

DSC and ISC were transmitted together and never considered separately. There is no sign that either treatise was studied by medieval logicians and philosophers before the eleventh century (compare Van de Vyver, 1929, pp. 425-452).

The first text in which the influence of their teaching is clear is the twelfth-century anonymous Abbreviatio Montana. The other is the Dialectica of Peter Abelard. We know this not only because the name of Boethius is cited as the main source, but also because the division of propositions we have seen above is accepted and maintained by Abelard and by the anonymous author of the Abbreviatio.

Later on, the authority of these treatises is more evident. In the thirteenth century, Peter of Spain’s Summulae logicales adopted the indirect moods of the first figure and the doctrine of the matters of proposition (which can be traced in the history of logic as far back as Alexander of Aphrodisias and Apuleius), and it follows Boethius in the idea, originally found in Aristotle, of reducing the imperfect moods of the second and third syllogistic figures to the first four perfect moods of the first figure.

5. His Sources

The Contra Eutychen is the most original work by Boethius: original in its speculative solution and in its methodology of using hypothetical and categorical logic in the analysis of terms, propositions, and arguments. The Consolation of Philosophy is also original, though many authors restrict its originality to its methodology and to the arrangement of its elements, not to its content, which would represent the Neoplatonic school of Iamblichus, Syrianus, and Proclus. As to his inspiring figures, Boethius gives his most respectful words to Plato and Aristotle, but the figure of Pythagoras is also venerated in De Institutione musica (DIM I, 10-20).

As to his scientific writings, his mathematical and logical works are not original, and Boethius recognizes this. When dealing with these scientific matters, Boethius relies on specific Greek sources. In the mathematical disciplines, he follows the middle-Platonist Nicomachus of Gerasa (compare Bower, C., 1978, p. 5), although not everything comes from him (Barbera, A. 1991, pp. 1-3 and 48-49). In his De Institutione musica (IV, 1-2), he follows, with some changes (Barbera, ibid., pp. 38-60), the Sectio Canonis attributed to Euclid; and, in developing book V and part of book IV, he uses Ptolemy’s Harmonics (compare DIM V, 4, 25; V, 5, 5; V, 8, 13; V, 11, 1; V, 14, 21; V, 18, 24 et al.; also Redondo Reyes, 2002, p. cxv).

As to Aristotelian logic, he acknowledges his agreement with the Peripatetic doctrines as reviewed by the Neoplatonist Porphyry (compare Boethius, in Int 2, 24-27, p. 17, Meiser ed., 1877-1880), but it is also true that not everything comes from Porphyry, for Boethius also names Syrianus, Proclus’ master.

As to the sources of his logical works, the question, though far from resolved, shows a basic agreement in rejecting the hypothesis proposed by Pierre Courcelle (1969, p. 28) that they depend on the work of Ammonius Hermias in Alexandria. This same rebuttal undermines the widespread belief (also due to Courcelle) that Boethius studied Greek in Alexandria. Indeed, Courcelle followed Bidez (1923, pp. 189-201), who some years before had shown that Boethius’ logical commentaries (not the treatises) owed almost everything to Porphyry. But Courcelle (1969) made a valuable observation about this: Boethius also refers to Syrianus, the teacher of Proclus, who taught later than Porphyry. Accordingly, Courcelle proposed that the occurrence of post-Porphyrian authors was due to Boethius’ reliance on the school of Ammonius in Alexandria, for Boethius’ logical works were written between 500 and 524, and by this time the school of Athens had fallen into decline after the death of Proclus in 485, while Alexandria, where Ammonius taught, had flourished as the center of philological, philosophical, and medical studies. Courcelle showed several parallels in the texts, but these, as he also saw, implied only a common source. He therefore proposed that, in a passage of the second commentary on Aristotle’s De Interpretatione (in Int 2, 9, p. 361), the corrupt phrase sicut audivimus docet should be emended as follows: sicut Ammonius docet. Courcelle knew that the absence of the name of Ammonius in Boethius’ writings was the main objection to his hypothesis, but this emendation made it very convincing. He rejected, therefore, the emendation that Meiser had made earlier, in 1880, in the critical edition of Boethius’ commentaries on De Interpretatione (compare Praefatio, iv). Indeed, before Courcelle, Meiser had proposed emending the phrase to read sicut Eudemus docet.
Subsequent studies showed that the emendation of Meiser was correct because the doctrine in question was given by Eudemus.

The merit of Courcelle, however, was to place the problem of the sources of Boethius’ logical writings in the correct perspective. That is why Shiel (1958, pp. 217-244) offered a new explanation of this status quaestionis: he proposed that Boethius drew all his material, whether pre- or post-Porphyrian, from a Greek codex of Aristotle’s Organon carrying glosses and marginal notes, from which he translated all the comments and explanations. This singular hypothesis has seduced many scholars and has even been generalized into Boethius’ general modus operandi. Shiel’s hypothesis is plausible in some respects when applied to the works on logic, but it faces many problems when applied to other kinds of writing. Many scholars have accepted the existence of this manuscript in Boethius’ hands on the strength of his verbatim allusions (for example, in Int 2, 20-3, p. 250), although not all have accepted Shiel’s conclusions, which deny Boethius any originality by presenting him merely as a mechanical translator of these Greek glosses. And even though Shiel always referred to Boethius’ works on logic, it is easy to generalize the servile attitude in his scientific material to his other works; but the poems, the philosophical synthesis of the Consolation, and the logical analysis of Contra Eutychen have no parallel in earlier sources and are by themselves evidence of a lucid thinker.

According to Shiel (1990), Boethius’ logic comes from a copy of the commentary of Porphyry that was used in the school of Proclus in Athens. This copy was a codex containing Aristotle’s Organon with margins heavily annotated with comments and explanations. Magee has shown the difficulty of accepting the existence of this kind of codex before the ninth century AD (Magee, 1998, Introduction). On the other hand, some scholars find that Shiel’s hypothesis does not accurately apply to all the logical writings of Boethius, as Stump (1974, pp. 77-93) has argued in her defense of the commentaries on the Topics. Moreover, the absence of Proclus’ name in Boethius’ works on logic, even though Proclus made important contributions to logic, as in the case of the Canon of Proclus (compare Correia, 2002, pp. 71-84), raises new doubts about the accuracy of the formula given by Shiel.

6. References and Further Reading

  • Ackrill, J.L. 1963. Aristotle’s Categories and De Interpretatione. Translation with notes. Oxford: Oxford University Press.
  • Álvarez, E. & Correia, M. 2012. “Syllogistic with indefinite terms”, History and Philosophy of Logic, 33, 4, pp. 297-306.
  • Anderson, P. 1990. Transiciones de la antigüedad al feudalismo, Madrid: Siglo XXI.
  • Barbera, A. 1991. The Euclidean Division of the Canon. Greek and Latin Sources. Lincoln: The University of Nebraska Press.
  • Bidez, J. 1923. “Boèce et Porphyre”, in Revue Belge de Philologie et d’Histoire, 2, pp. 189-201.
  • Bidez, J. 1964. Vie de Porphyre. Le philosophe néoplatonicien, Hildesheim: G. Olms.
  • Bidez, J. 1984. “Boethius und Porphyrios”, in Boethius, M. Fuhrmann and J. Gruber (eds.), Wege der Forschung, Band 483, Darmstadt, pp. 133-145.
  • Bobzien, S. 2002. “The development of Modus Ponens in Antiquity: from Aristotle to the 2nd century AD”, Phronesis, vol. 47, 4, pp. 359-394.
  • Bobzien, S. 2002a. “A Greek Parallel to Boethius’ De Hypotheticis Syllogismis”, Mnemosyne 55, pp. 285-300.
  • Bochenski, I.M. 1947. La logique de Théophraste, 2nd ed., Fribourg: Librairie de l’Université.
  • Bochenski, I.M. 1948. “On the categorical syllogism”, in Dominican Studies, vol. I, 1, pp. 35-37.
  • Bower, C. 1978. “Boethius and Nicomachus: An essay concerning the sources of De Institutione Musica”, Vivarium, 6, 1, pp. 1-45.
  • Brandt, S. 1903. “Entstehungszeit und zeitliche Folge der Werke von Boethius”, in Philologus, 62, pp. 234-275.
  • Cameron, A. 1981. “Boethius’ father’s name”, in Zeitschrift für Papyrologie und Epigraphik, 44, pp. 181-183.
  • Chadwick, H. 1981. “Introduction”, in Gibson (1981), pp. 1-12.
  • Correia, M. 2002a. “Libertad humana y presciencia divina en Boecio”, Teología y Vida, XLIII, pp. 175-186.
  • Correia, M. 2001. “Boethius on syllogisms with negative premisses”, in Ancient Philosophy 21, pp. 161-174.
  • Correia, M. 2009. “The syllogistic theory of Boethius”, en Ancient Philosophy 29, pp. 391-405.
  • Correia, M. 2002. “El Canon de Proclo y la idea de lógica en Aristóteles”. Méthexis 15, pp. 71-84.
  • Courcelle, P. 1967. La Consolation de Philosophie dans la tradition littéraire. Antécédent et Postérité de Boèce. Etudes Augustiniennes. Paris: Editions du Centre National de la Recherche Scientifique. C.N.R.S.
  • Courcelle, P. 1969. Late Latin Writers and their Sources, Cambridge, Massachusetts: Harvard University Press (see: Les Lettres Grecques en Occident de Macrobe à Cassiodore, 2nd ed., Paris, 1948).
  • De Rijk, L. 1964. “On the chronology of Boethius’ works on logic (I and II)”, in Vivarium, vol. 2, parts 1 & 2, pp. 1-49 and 122-162.
  • Devereux, D. & Pellegrin, P. 1990. Biologie, logique et métaphysique chez Aristote, D. Devereux and P. Pellegrin (eds.), Paris: Editions du Centre National de la Recherche Scientifique (C.N.R.S.).
  • Dürr, K. 1951. The Propositional Logic of Boethius. Amsterdam: North-Holland Publishing. (Reprinted in 1980 by Greenwood Press, USA.)
  • Friedlein, G. 1867. Anicii Manlii Torquati Severini Boetii De Institutione Arithmetica libri duo. De Institutione Musica libri quinque. Accedit Geometria quae fertur Boetii. G. Friedlein (ed.). Leipzig: Teubner.
  • Fuhrmann, M. & Gruber, J. 1984. Boethius. M. Fuhrmann and J. Gruber (eds.). Wege der Forschung, Band 483, Darmstadt.
  • Gibson, M. 1981. Boethius, His Life, Thought and Influence. Gibson, M. (ed.). Oxford: Blackwell.
  • Isaac, I. 1953. Le Peri Hermeneias en Occident de Boèce à Saint Thomas. Histoire littéraire d’un traité d’Aristote, Paris.
  • Kaylor, N.H. & Phillips, P.E. 2012. A Companion to Boethius in the Middle Ages. Kaylor, N.H. & Phillips, P.E (eds.). Leiden/Boston : Brill.
  • Kretzmann, N. 1982. “Syncategoremata, exponibilia, sophismata”. The Cambridge History of Later Medieval Philosophy, pp. 211-214. Cambridge: Cambridge University Press.
  • Lloyd, G.E.R. 1990. “Aristotle’s Zoology and his Metaphysics: the status quaestionis. A critical review of some recent theories”, in Devereux & Pellegrin (1990), pp. 8-35.
  • Lukasiewicz, J. 1951. Aristotle’s Syllogistic. Oxford: Oxford University Press.
  • Magee, J. 1998. Anicii Manlii Severini Boethii De divisione liber. Critical edition, translation, prolegomena and commentary. Leiden/Boston/Koln: Brill.
  • Mair, J. 1981. “The text of the Opuscula Sacra”, pp. 206-213. In Gibson, M. (1981).
  • Marenbon, J. 2003. Boethius. Oxford: Oxford University Press.
  • Marenbon, J. 2009. The Cambridge Companion to Boethius. Cambridge: Cambridge University Press.
  • Maroth, M. 1979. “Die hypothetischen Syllogismen”, in Acta Antiqua 27, pp. 407-436.
  • Martindale, J.R. 1980. The Prosopography of the Later Roman Empire: A.D. 395-527. Cambridge: Cambridge University Press.
  • Matthews, J. 1981. Boethius, His Life, Thought and Influence, in M. Gibson (ed.), Oxford.
  • McKinlay, A.P. 1907. “Stylistic tests and the chronology of the works of Boethius”, in Harvard Studies in Classical Philology, XVIII, pp. 123-156.
  • McKinlay, A.P. 1938. “The De syllogismis categoricis and Introductio ad syllogismos categoricos of Boethius”, in Classical and Mediaeval Studies in Honor of E. K. Rand, pp. 209-219.
  • Meiser, C. 1877-1880. Anicii Manlii Severini Boetii Commentarii in Librum Aristotelis PERI ERMHNEIAS. Prima et secunda editio. C. Meiser (ed.), Leipzig.
  • Migne, J.-P. 1891. De Syllogismo Categorico, in Patrologia Latina, 64, vol. 2, J.-P. Migne (ed.), Paris.
  • Migne, J.-P. 1891. Introductio ad Syllogismos Categoricos, in Patrologia Latina, 64, vol. 2, J.-P. Migne (ed.), Paris.
  • Minio-Paluello, L. 1972. Opuscula. The Latin Aristotle, Amsterdam: A. Hakkert.
  • Minio-Paluello, L. 1965. Aristoteles Latinus, II, 1-2, L. Minio-Paluello (ed.), Paris: Desclée de Brouwer.
  • Mommsen, Th. 1894. Cassiodori Senatoris Variae, Monumenta Germaniae Historica, Auctorum Antiquissimorum Tomus XII, Mommsen, Th. (ed.), Berlin: Weidmann.
  • Moreau, J. & Velkov, V. 1968. Excerpta Valesiana. Moreau, J and Velkov, V. (eds.), Leipzig, Academia Scientiarum Germanica Berolinensis: Teubner.
  • Murari, R. 1905. Dante e Boezio, Contributo allo studio delle fonti Dantesche. Bologna: Nicola Zanichelli.
  • Mynors, R.A.B. 1963. Cassiodori Senatori Institutiones, R.A.B. Mynors (Ed.), Oxford: Clarendon Press.
  • Nuchelmans, G. 1973. Theories of the Proposition, Leiden: North-Holland.
  • Obertello, L.A.M. 1969. Severino Boezio De hypotheticis syllogismis. Testo, traduzione e commento. Brescia: Paideia Editrice.
  • Prantl, C., 1855. Geschichte der Logik im Abendlande, Leipzig, 1855-1870: G. Olms.
  • Prior, A.N. 1963. “The Logic of the Negative Terms in Boethius”, en Franciscan Studies, 13, vol. I, pp. 1-6.
  • Prior, A.N. 1962. Formal Logic, Oxford: Clarendon Press.
  • Rand, Stewart & Tester. 1990. Boethius, the Theological tractates and The Consolation of philosophy, Translated by H.F. Stewart, E.K. Rand and S.J. Tester. Cambridge Massachusetts/London, England. The Loeb Classical Library: Harvard University Press.
  • Redondo Reyes, P. 2002. La Harmónica de Claudio Ptolomeo: edición crítica, introducción traducción y comentario. PhD thesis, Murcia, España.
  • Sharples, R. 1992. Theophrastus of Eresus. Sources for his Life, Writings, Thought and Influence, vols. i-iii, W.W. Fortenbaugh, P.M. Huby, R.W. Sharples, D. Gutas (Eds.), together with A.D. Barker, J.J. Keaney, D.C. Mirhady, D. Sedley and M.G. Sollenberger. Leiden: Brill.
  • Shiel, J. 1990. “Boethius’ Commentaries on Aristotle, en Sorabji (1990): pp. 349-372, (también: Medieval and Renaissance Studies 4, 1958, pp. 217-44).
  • Sorabji, R. 1990. Aristotle Transformed. The Ancient Commentators and their Influence. Sorabji, R. (ed.). London: Duckworth.
  • Spade, P.V., 1982. The semantics of terms, en The Cambridge History of Later Medieval Philosophy, Cambridge 1982, pp. 190-1: Cambridge University Press.
  • Speca, A. 2001. Hypothetical syllogistic & Stoic logic. Leiden/Boston/Köln: Brill.
  • Stump, E., 1974. “Boethius’ Works on Topics”, en Vivarium, 12, 2, pp. 77-93.
  • Sullivan, M.W., 1967. Apuleian Logic. The Nature, Sources, and Influence of Apuleius’s Peri Hermeneias, en: Studies in Logic and the Foundations of Mathematics, Amsterdam: North-Holland.
  • Usener, H. 1877. Anecdoton Holderi: ein beitrag zur Geschichte Roms in ostgotischer Zeit. Leipzig: Teubner.
  • Thomas, P. 1908. Apulei Opera quae Supersunt, vol iii, Apulei Platonici Madaurensis De Philosophia Libri, liber PERI ERMHNEIAS, Thomas P. (ed.), pp. 176-194, Leipzig: Teubner.
  • Thomas, I., O.P. 1949. “CS(n): An Extension of CS”, in Dominican Studies, pp. 145-160.
  • Thomsen Thörnqvist, C. 2008a. Anicii Manlii Seuerini Boethii De syllogismo categorico. A critical edition with introduction, translation, notes and indexes. Studia Graeca et Latina Gothoburgensia LXVIII, University of Gothenburg: Acta Universitatis Gothoburgensis.
  • Thomsen Thörnqvist, C. 2008b. Anicii Manlii Severini Boethii Introductio ad syllogismos categoricos. A critical edition with introduction, commentary and indexes. Studia Graeca et Latina Gothoburgensia LXIX, University of Gothenburg: Acta Universitatis Gothoburgensis.
  • Van de Vyver, A., 1929. “Les étapes du développement philosophique du aut Moyen-Age”, Revue Belge de Philologie et d’Histoire, viii (1929), pp. 425-452. Brussels: Société pour Le Progrès des Études Philosophiques et Historiques.
  • Wallies (1883): Alexandri in Aristotelis Analyticorum Priorum Librum I Commentarium, M. Wallies (ed.), in Commentaria in Aristotelem Graeca, vol. 2.1, Berlín: G. Reimerum.

 

Author Information

Manuel Correia
Email: mcorreia@uc.cl
Pontifical Catholic University of Chile
Chile

Enactivism

The term ‘enaction’ was first introduced in The Embodied Mind, co-authored by Varela, Thompson, and Rosch and published in 1991. That seminal work provides the first original contemporary formulation of enactivism. Its authors define cognition as enaction, which they in turn characterize as the ‘bringing forth’ of domains of significance through organismic activity that has been itself conditioned by a history of interactions between an organism and its environment.

To understand mentality, however complex and sophisticated it may be, it is necessary to appreciate how living beings dynamically interact with their environments. From an enactivist perspective, there is no prospect of understanding minds without reference to such interactions because interactions are taken to lie at the heart of mentality in all of its varied forms.

Since 1991, enactivism has attracted interest and attention from academics and practitioners in many fields, and it is a well-established framework for thinking about and investigating mind and cognition. It has been articulated into several recognizably distinct varieties distinguished by their specific commitments. Some versions of enactivism, such as those put forward by Thompson and Di Paolo and others, focus on expanding and developing the core ideas of the original formulation of enactivism advanced by Varela, Thompson, and Rosch. Other versions of enactivism, such as sensorimotor knowledge enactivism and radical enactivism incorporate other ideas and influences in their articulation of enactivism, sometimes leaving aside and sometimes challenging the core assumptions of the original version of enactivism.

Table of Contents

  1. Core Commitments
  2. Contemporary Varieties of Enactivism
    1. Original Enactivism
      1. Biological Autonomy
      2. Bringing Forth Domains of Significance
      3. Phenomenological Connections
      4. Buddhist Connections
      5. Sense-Making
    2. Sensorimotor Knowledge Enactivism
    3. Radical Enactivism
  3. Forerunners
  4. Debates
  5. Applications and Influence
  6. Conclusion
  7. References and Further Reading

1. Core Commitments

What unifies different articulations of enactivism is that, at their core, they all look to living systems to understand minds, and they conceive of cognition as embodied activity. In enactivist terms, perceiving, imagining, remembering, and even the most abstract forms of thinking are to be understood, first and foremost, as organismic activities that dynamically unfold across time and space.

Enactivists conceive of the embodied cognitive activity that they take to constitute cognition as fundamentally interactive in at least two ways. First, the manner and style of any given bout of cognitive activity are conditioned by the cognizer’s prior history of engagement with environments and the particularities of the current environment with which they are actively engaged. Second, cognizers shape their environments and are, in turn, shaped by them in a variety of ways across multiple timescales.

A cornerstone commitment of enactivism is that minds arise and take shape through the precarious self-creating, self-sustaining, adaptive activities of living creatures as they regulate themselves by interacting with features of their environments. To take a central case, an organism’s characteristic patterns of sensorimotor interaction are deemed to be shaped by its prior history of active engagement with aspects of its environment. Its past engagements reinforce and tend to perpetuate its sensorimotor habits and tendencies. Yet organisms are not wholly creatures of past habit. Living beings always remain flexibly open to adjusting their repertoires and ways of doing things in novel ways. Cognition, which takes the form of patterns of open-ended, flexible, extended spatio-temporal activity, is thus deemed ‘autonomous’ in the sense that it unfolds in ways that are viable for sustaining itself and that are not externally regulated or pre-programmed.

Enactivists regard an organism’s environment as a domain of significance populated with items of relevance, not as a neutral setting that can be adequately characterized in, say, purely physical terms. Importantly, in this regard, organisms are said to ‘enact’ or ‘bring forth’ their ‘worlds’. Organisms not only adapt to and are shaped by their environments; they also dynamically fashion, curate, and adapt them. Through such activity and exchanges, both organisms and their environments are transformed and, in an important sense, brought into being. Enactivists often explicate this unprescribed bi-directional influence of organisms on their environments and vice versa, poetically, using the metaphor of “laying down a path in walking”.

Another signature enactivist idea is that qualitative, phenomenal aspects of lived experience—what it is like to experience something—are an achievement of organismic activity. To take a central case, perceptual experience arises and takes shape through an organism’s active exploration of aspects of its environment. It is through such engaged efforts and the specific ways they are carried out that organisms experience the world in particular ways. Accordingly, organismic activities of certain kinds are required to achieve phenomenal access to aspects of the world or for things to ‘show up’ or “to be present” phenomenally.

Minds, conceived in enactivist terms, operate in ways that are fundamentally unlike those of mechanisms that are driven entirely by externally sourced programs and algorithms. Enactivism thus sees itself as directly opposing views that understand cognition as essentially computational and representational in nature. In its original formulation, enactivism strongly rejects the idea that minds are in the business of collecting, transforming, and representing information sourced from a pre-given world that is assumed to exist independently of and prior to organisms. Strikingly, to conceive of cognition in line with the original version of enactivism entails holding that when organisms actively engage with aspects of their worlds, they always do so in mentality-constituting ways. Yet enactivists hold that such cognitive activity neither involves constructing representations of those worlds based on retrieved information nor depends on any kind of computational processing. So conceived, enactivism rejects the longstanding idea that the core business of cognition is to represent and compute, and, concomitantly, it rejects the familiar explanatory strategies of orthodox cognitive science.

Enactivism is a significant philosophical enterprise because, at least under standard interpretations, it offers a foundational challenge to cognitivist accounts of mind—those that conceive of mentality in representational and computational terms. Enactivists regard such conceptions of mind, which dominate much mainstream analytic philosophy and cognitive science, not only as resting on a mistaken theoretical foundation but as presenting a tempting picture of mentality that, practically, subverts efforts to develop a healthier and more accurate understanding of ourselves and our place in nature.

2. Contemporary Varieties of Enactivism

There are several importantly different versions of enactivism occupying the contemporary philosophical landscape.

a. Original Enactivism

The Embodied Mind by Varela, Thompson, and Rosch, published in 1991, is the locus classicus of enactivism. That landmark work is variously described as the initial formulation and the most influential statement of enactivism in recent times. It is credited with being “the first and among the most profound” of the many and various enactivist offerings that have followed in its wake (Kabat-Zinn 2016, p. xiii).

Enactivism, as originally formulated, is not a neatly defined or finished theory. It is variously described in the literature as a broad, emerging ‘perspective’, ‘approach’, ‘paradigm’, or ‘framework’ for understanding mind and cognition (see, for instance, Varela, Thompson and Rosch 1991; Baerveldt and Verheggen 2012; Stewart and others 2010; Gallagher 2017). Enactivism is not a finished product; it continues to evolve as new versions of enactivism emerge which adjust, add to, or reject certain core and peripheral commitments of the original version.

Though the original version of enactivism resists definition in terms of a set of central theses, it does have distinctive features. There are three key and recurring themes emphasized in the original statement of enactivism. The first theme is that understanding organismic biological autonomy is the key to understanding minds. Original enactivism assumes that there is deep continuity between life and mind, such that understanding the biological autonomy of organisms sheds direct light on cognition. The second theme is that minds cannot be understood without coming to terms with subjective, lived experience, and consciousness. The third theme is that non-Western traditions, and in particular, Buddhist philosophy and its practices of meditation and mindfulness, should play a significant role in reforming and rethinking the future sciences of the mind, both theoretically and practically.

The original version of enactivism put forward in The Embodied Mind has been successively developed and expanded upon in later works, mainly by Thompson, Di Paolo, and their co-authors (principally Thompson 2007, Froese and Di Paolo 2011, McGann and others 2013, Di Paolo and others 2017, Di Paolo and others 2018, Di Paolo 2018, 2021). Some speak of these works, collectively, as constituting and contributing to a variety of autopoietic enactivism (Hutto and Myin 2013, 2017, Ward and others 2017, Stapleton 2022). The label, which now has some purchase, was chosen because the original version of enactivism and those versions that seek directly to expand on it are united in looking to biological autonomy to understand the fundamentals of mindedness. Crucially, all enactivists of this stripe embrace the notion of autopoiesis—the self-creating and self-sustaining activity of living systems—as a common theoretical starting point, having been inspired by “the work of the biologists Humberto Maturana and Francisco Varela” (Baerveldt and Verheggen 2012, p. 165; see Maturana and Varela 1980, 1987). Nevertheless, the label autopoietic enactivism is contested (see, for example, Thompson 2018, Netland 2022). It is thought to be misleading because, although these enactivists build upon the work of Varela and Maturana, they have added significant resources, expanding upon and modifying the initial conception of autopoiesis in their efforts to explicate key aspects of biological autonomy, notably its teleological character (see, for instance, Thompson 2007, p. 127; Di Paolo 2009, p. 12; Di Paolo and others 2018, p. 37). As such, enactivists working on these topics deem autopoiesis, as originally conceived, to be, at most, necessary but insufficient for important world-involving forms of cognition (see Thompson 2007, pp. 149-150; see also p. 127). For these reasons, Barandiaran (2017) recommends the label autonomist enactivism instead.
However, given these nuances, it may be safer and more accurate to speak of these positions simply as variants of original enactivism.

The primary aim of the original version of enactivism was to address the problem of understanding how lived experience fits into the world as described by science, including cognitive science. On the face of it, the two appear irreconcilably different from one another. Thompson (2016) puts the apparent dilemma that motivated the first formulation of enactivism in terms of a hard choice: either “accept what science seems to be telling us and deny our experience… or hold fast to our experience and deny science” (p. xix).

The original version of enactivism was born from the aspiration of finding a way for cognitive science to give appropriate attention to lived experience. One of its key assumptions is that “we cannot begin to address… [the gap between science and experience] without relying on some kind of phenomenology, that is, on some kind of descriptive account of our experience in the everyday world” (Thompson 2016, p. xx).

Enactivism rejects mainstream conceptions of mind that strongly demarcate minds from bodies and environments. It holds that such conceptions are not justified and should be rethought. Enactivism aims to eradicate misleading dualisms that continue to dominate analytic philosophy of mind and much cognitive science. It aims to dissolve the mind-body problem by asking us to abandon our attachment to traditional dichotomies and to come to see that minds are not ultimately separate from bodies, environments, or others.

Original enactivism seeks to put the mind-body problem to rest once and for all. It also rejects the traditional input-output processing model of the mind, a model which pays homage, often explicitly, to the idea that the senses furnish minds with information about the external world. Original enactivism rejects this familiar characterization of mental activity, denying that minds ever pick up or process information from the environment. Concomitantly, original enactivism rejects the idea that minds are fundamentally information-processing systems that manipulate informational content by categorizing, conceptualizing, and schematizing it by means of representational-computational processes. By also pressing us to radically rethink key notions—self, nature, and science—original enactivism aims to usher in “a new kind of cognitive science” (Rosch 2016, p. xxxv). So conceived, enactivism seeks to revolutionize and massively reform the sciences of the mind.

Embracing original enactivism entails rethinking foundational mainstream theoretical assumptions that are prevalent in much analytic philosophy of mind and cognitive science. Importantly, in this vein, original enactivists advocate not only for changes to our theoretical mindset but also for changes in existing practices and approaches we use in the cognitive sciences and cognate domains that study and engage with minds. Thus, the original version of enactivism envisions that future sciences of the mind will recognize and work with “another mode of knowing not based on an observer and observed” (Rosch 2016, p. x). Original enactivism, thus, issues a normative demand to create a space in which those working to understand and expand our lived experience can speak to and understand empirically focused scientists of the mind. In such a setting, there would be a dynamic and interactive ‘circulation’ and cross-fertilization of theory and practice (Thompson 2016, 2017).

This is the sense in which original enactivism seeks to provide “a framework for a far-reaching renewal of cognitive science as a whole” (Stewart, Gapenne, and Di Paolo 2010, p. viii.).

It is an open question just how much of the ambition of original enactivism has been achieved, but it is undeniable that much has changed in the fields of philosophy and the sciences of the mind since its debut. Thompson (2016) summarizes the current state of the art.

The idea that there is a deep continuity in the principles of self-organization from the simplest living things to more complex cognitive beings — an idea central to Varela’s earlier work with neurobiologist Humberto Maturana — is now a mainstay of theoretical biology. Subjective experience and consciousness, once taboo subjects for cognitive science, are now important research topics, especially in cognitive neuroscience. Phenomenology now plays an active role in the philosophy of mind and experimental cognitive science. Meditation and mindfulness practices are increasingly used in clinical contexts and are a growing subject of investigation in behavioral psychology and cognitive neuroscience. And Buddhist philosophy is increasingly recognized as an important interlocutor in contemporary philosophy (p. xix).

Notably, there have been efforts to transform the way the science of intersubjectivity is itself conducted by getting researchers to participate, at once, both as subjects and objects of research. Details of this method, called PRISMA, are set out in De Jaegher and others (2017). Thompson (2017) praises this work for being “clearly animated by the full meaning of enaction as requiring not just a change in how we think but also in how we experience” (p. 43). For a related discussion of how cognitive science practice might change by giving due attention to dynamically evolving experience, see McGann (2022).

i. Biological Autonomy

All living systems—from simple cells to whole organisms, whether the latter are single-celled bacteria or human beings—actively individuate themselves from other aspects of their environments and maintain themselves by engaging in a constant “dynamical exchange of energy and matter that keeps the inside conditions just right for life to perpetuate itself” (Kabat-Zinn 2016, p. xiv). This is all part of the great game of life: staying far enough away from entropy, that is, thermodynamic equilibrium, to survive.

Enactivists emphasize the autopoietic character—the self-creating and self-individuating results—of the activity through which living systems adaptively produce and maintain vital boundaries and relationships between themselves and what lies beyond them (Varela and others, 1991; Thompson, 2007). Accordingly, “organisms actively and continuously produce a distinction between themselves and their environment where none existed before they appeared and where none will remain after they are gone” (Di Paolo and others 2018, p. 23).

What determines the boundaries of a given organism? Where does a given organism end and the environment begin? Enactivists seek to answer such questions by pointing to the fact that living systems are organizationally and operationally closed, which is to say that they are “constituted as a network of interdependent processes, where the behavior of the whole emerges from the interaction dynamics of its component parts” (Barandiaran 2017, p. 411, see also Di Paolo and Thompson 2014, Di Paolo and others 2018; Kirchhoff 2018a).

The basic idea of operational closure is that self-defining autopoietic processes can be picked out by the fact that they exist in mutually enabling networks of circular means-end activities, such that “all of the processes that make up the system are enabled by other processes in the system” (Di Paolo and others 2018, p. 25). Operational closure is evident in the self-sustaining autonomous activity of, for example, metabolic networks in living systems. External influences —such as, say, the effects of sunlight being absorbed by chlorophyll —are any influences that are not mutually enabled or produced by processes within such a closed system.

The exact boundaries of a self-producing, self-individuating living system can be flexible. In this regard, Di Paolo and others (2018) cite the capacity of some insects and spiders to breathe underwater for certain periods of time. They manage to do this by trapping air bubbles in the hair on their abdomens. In such cases, these environmental features become part of the self-individuating enabling conditions of the organism’s operationally closed network: “These bubbles function like external gills as the partial pressure of oxygen within the bubble, diminished by respiration, equilibrates with that of the water as the external oxygen flows in” (Di Paolo and others 2018, p. 28, see also Turner 2000).

When we consider concrete cases, it is evident that autopoietic processes of self-production and self-distinction require living systems to continuously adjust to features of their environment. This involves the “selective opening and selective rejection of material flows—in other words, an adaptive regulation of what goes in and what stays out” (Di Paolo and others 2018, p. 40).

Adaptive regulation requires flexibility. It requires simultaneous adjustments at multiple timescales and various levels, where each adjustment must be responsive to particular speeds and rhythms at the scale required to meet specific thresholds. This is why the business of being and staying alive is necessarily complex, forever unfinished, precarious, and restless (Di Paolo and others, 2017; 2018). Though there is room for error, minimally, organisms that survive and propagate must actively avoid engaging in behaviors that are overly maladaptive.

Enactivists hold that such adaptive activity is autonomous. Living systems establish their own norms of operation—norms that are not externally prescribed but arise naturally from the activity of staying alive and far from entropy. It is because organisms generate their own norms through their activities that enactivists speak of them as having an immanent teleology (Thompson 2007, Di Paolo and others 2018).

It is claimed that this notion of autonomy is the very core of enactivism (Thompson 2007, Barandiaran 2017, p. 409; Di Paolo and others, 2018, p. 23). It is regarded as a notion that, strictly speaking, goes “beyond autopoiesis” (Di Paolo and others 2018, p. 25).

Enactivists contend that the fact that living systems are autonomous in the precise sense just defined is what distinguishes them from wholly lifeless, heteronomous machines of the sort that are driven only by external, exogenous instructions. A core idea of enactivism is that “the living body is a self-organizing system”. To think of living bodies in this way “contrasts with viewing it as a machine that happens to be made of meat rather than silicon” (Rosch 2016, p. xxviii). In line with this understanding, enactivists hold that organismic processes “operate and self-organize historically rather than function” (Di Paolo and others 2018, p. 20). It is precisely because organisms must always be ready to adjust to new possibilities and circumstances that the self-organizing activity of living systems cannot be governed by instructions in a functionally pre-specified manner (see Barandiaran 2017, p. 411).

Enactivists hold that autonomous norm generation is a feature of all modes and styles of cognitive activity, not just of basic organismic self-production, self-individuation, and self-maintenance. Di Paolo and others (2018), for example, identify two important dimensions of autonomous self-regulation beyond the basic cycles of regulation that sustain living organisms: cycles of sensorimotor interactions involved in action, perception, and emotion, and cycles of intersubjectivity involved in social engagements with others (Di Paolo and others 2018, p. 22).

ii. Bringing Forth Domains of Significance

Connected with their understanding of biological autonomy, enactivists reject the idea that organisms simply adapt to features of a pre-existing, neutrally characterized physical world. Instead, they hold that organisms are attuned to features of environments or domains that are significant to them —environments that organisms themselves bring into being. It is on this basis that enactivists “conceive of mental life as the ongoing meaningful engagement between precariously constituted embodied agents and the worlds of significance they bring forth in their self-asserting activity” (Di Paolo and others 2018, p. 20). Hence, a central thesis of enactivism is that “cognition is not the grasping of an independent, outside world by a separate mind or self, but instead the bringing forth or enacting of a dependent world of relevance in and through embodied action” (Thompson 2016, p. xviii).

In this view, organisms and environments dynamically co-emerge. The autonomous adaptive activity of organisms “brings forth, in the same stroke, what counts as other, the organism’s world” (Thompson 2007, p. 153). The pre-existing world, as characterized by physics and chemistry, is not equivalent to an organism’s environment. The latter, which is effectively captured by von Uexküll’s (1957) notion of an Umwelt, is a subset of the physicochemical world that is relevant to the organism in question. This environmental domain of significance or relevance, enactivists hold, is brought into being through the activity of organisms themselves.

For example, sucrose serves as food for a bacterium only because it has certain physical and chemical properties. Yet without organisms that use it as a nutrient, sucrose, understood merely as something that exists as part of the physicochemical world, is not food. Hence, that it is food for bacteria depends not only, or even primarily, on the physicochemical properties of sucrose itself but chiefly on the existence and properties of bacteria—properties connected to the metabolic needs and processes that bacteria themselves brought into being. Although, taking the stance of scientists, we can and do speak of aspects of an organism’s environment using the language of physics and chemistry, describing them in organism-neutral terms, it is only if we recognize the significance that such worldly features have for the organism that we are able to pick out the aspects of the world that are relevant or important to it.

On the face of it, to suggest that organisms ‘bring forth’ or ‘enact’ their own environments may appear to be an extravagant thesis. Yet it finds support in the seminal work of biologists, principally Gould and Lewontin (1979), who question accounts of Darwinian adaptationism in two key respects. They reject construing natural selection as an external evolutionary force that separately targets and optimizes individuated organismic traits. They also reject the idea that natural selection fashions organisms to better compete against one another for the resources of the pre-existing physical world (for further details, see Godfrey-Smith 2001). In place of strong adaptationism, original enactivists propose to understand evolution in terms of natural drift, seeing it as a holistic, “ongoing process of satisfaction that triggers (but does not specify) change in the form of viable trajectories” (see the full summary in Varela and others 1991, pp. 196-197; see also Maturana and Mpodozis 2000).

A major focus of the critique of adaptationism is the rejection of the idea that a living creature’s environment is an external, “preexistent element of nature formed by autonomous forces, as a kind of theatrical stage on which the organisms play out their lives” (Lewontin and Levins 1997, p. 96, Lewontin 2000).

Putting pressure on the idea that organisms simply adapt to a neutrally characterized external world, Lewontin and Levins (1997) observe that not all worldly forces affect every organism equally. Some forces greatly affect certain organisms while mattering to other creatures hardly at all. The all-pervasive force of gravity provides a shining example. All middle-sized plants and animals must contend with it. Not only does gravity affect the musculoskeletal, respiratory, and circulatory systems of such organisms, but it also affects their individual biological cells. Gravity influences cell size and processes such as mechanotransduction—processes by which cells electrochemically respond, at micro-timescales, to mechanical features and forces in the environment. Hence, even at the microlevel, gravity matters for such cognitively important activities as hearing, proprioception, touch, and balance. Other organisms, however, owing to their size, must contend with and are shaped in their activities by other forces. For microorganisms, it is Brownian motion, not gravity, that matters most to their lives. It is reported that some microbes can survive the hypergravity of extraterrestrial, cosmic environments, which exert a gravitational force up to 400,000 times greater than that found on Earth (Deguchi and others 2011). This is one reason why bacteria “are ubiquitous, present in nearly every environment from the abyssal zone to the stratosphere at heights up to 60 km, from arctic ice to boiling volcanoes” (Sharma and Curtis 2022, p. 1).

These reminders support the enactivist claim that the relationship between organism and environment is dialectical —that the one cannot exist without the other. Maintaining that organisms and their environments reciprocally codetermine one another, defenders of this view of biological development hold that:

Environments are as much the product of organisms as organisms are of environments. There is no organism without an environment, but there is no environment without an organism. There is a physical world outside of organisms, and that world undergoes certain transformations that are autonomous. Volcanoes erupt, and the earth precesses on its axis of rotation. But the physical world is not an environment; only the circumstances from which environments can be made are (Lewontin and Levins 1997, p. 96).

Moreover, the relationship between organisms and their environments is not static; the two coevolve dynamically over time: “As the species evolves in response to natural selection in its current environment, the world that it constructs around itself is actively changed” (Thompson 2007, p. 150). Lewontin and Levins (1997) provide a range of examples of how organisms relate to and actively construct their environments. These include organisms regulating ambient temperatures through the metabolic production of shells of warm, moist air around themselves and plant roots producing humic acids that alter the physicochemical structure of soil to help them absorb nutrients.

Looking to these foundations, Rolla and Figueiredo (2021) further explicate the evolutionary dynamics by which organisms can be said to, literally, bring forth their worlds. Drawing on the notion of niche construction, theirs is an effort to show that “enactivism is compatible with the idea of an independent reality without committing to the claim that organisms have cognitive access to a world composed of properties specified prior to any cognitive activity”. For more on the notion of niche construction, and why it is thought to be needed, see Laland and others (2014), Laland and others (2016), and Werner (2020).

iii. Phenomenological Connections

In line with its overarching ambition, original enactivism seeks to give an account of “situated meaningful action that remains connected both to biology and to the hermeneutic and phenomenological studies of experience” (Baerveldt and Verheggen 2012, p. 165; see also Stapleton and Froese 2016, Netland 2022).

Original enactivism owes a great deal to the European tradition of phenomenology in that its account of virtual milieus and vital norms is inspired by Merleau-Ponty’s The Structure of Behaviour and, especially, his notion of “the lived body” (Kabat-Zinn 2016, p. xiii). Virtual milieus and their properties are not something found ‘objectively’ in the world; rather, they are enacted or brought forth by organisms. Organisms not only enact their environments —in the sense that sucrose might become food for certain creatures —they also enact their qualitative, felt experiences of the world. In this vein, enactivists advance the view that “our perceived world [the world as perceived]…is constituted through complex and delicate patterns of sensorimotor activity” (Varela and others, 1991, p. 164).

By appealing to arguments from biology, enactivists defend the view that organisms and their environments are bound together in ways that make it impossible to characterize one without reference to the other when it comes to understanding mental life. They apply the same reasoning when thinking about qualitative, phenomenally conscious aspects of mind, holding, for example, that “we will not be able to explain colour if we seek to locate it in a world independent of our perceptual capacities” (Varela and others, 1991, p. 164). This is not meant to be a rejection of mind-independent realism in favor of mind-dependent idealism. Defenders of the original version of enactivism offer this proposal explicitly as providing a ‘middle way’ between these familiar options. By their lights, “colours are not ‘out there’ independent of our perceptual and cognitive capacities…[but equally] colors are not ‘in here’ independent of our surrounding biological and cultural world” (Varela and others 1991, p. 172).

For enactivists, colors cannot be understood independently of the very particular ways that experiencing beings respond to specific kinds of worldly offerings. Accordingly, it is not possible to think about the nature of colors qua colors without also referencing those ways of interacting with environmental offerings. This claim rests on two assumptions. First, the way colors appear to organisms —the way they experience them —is essential to understanding the nature of colors as such. Second, such experiential properties are inescapably bound up with organismic ways of responding to aspects of their environments.

Importantly, though enactivists deny that colors are objective properties of the world independent of organisms that perceive them, they neither claim nor imply that colors are wholly mind-dependent properties in the sense associated with classical Berkeleyan idealism as it is standardly portrayed.

Furthermore, it is precisely because enactivists hold that an organism’s ways of responding to aspects of its environment are not inherently representational, or representationally mediated, that “color provides a paradigm of a cognitive domain that is neither pregiven nor represented but rather experiential and enacted” (Varela and others 1991, p. 171). This conclusion is meant to generalize, applying to all phenomenological structures and aspects of what is brought forth by organisms as domains of significance through their autonomous activity.

In this regard, in its original formulation, enactivism drew on “significant resources in the phenomenological tradition for rethinking the mind” (Gallagher 2017, p. 5). Apart from explicitly borrowing from Merleau-Ponty, Varela and others (1991) also aligned their project with other classic thinkers of the phenomenological tradition, such as Husserl and Sartre, to some extent.

For example, although the enactivists wished to steer clear of what Hubert Dreyfus interpreted as Husserl’s representationalist leanings, they acknowledge the prime importance of his efforts to “develop a specific procedure for examining the structure of intentionality, which [for him] was the structure of experience itself” (Varela and others 1991, p. 16). For this reason, they explicitly oppose and criticize the cognitivist conviction that there is “a fundamental distinction between consciousness and intentionality” (p. 56). By their lights, drawing such a distinction creates a mind-mind problem and disunifies our understanding of the cognizing subject.

Nevertheless, despite borrowing in key respects from the Western phenomenological tradition, when formulating their initial statement of enactivism, Varela and others (1991) also criticized that tradition for, allegedly, being overly theoretical in its preoccupations. According to their assessment at the time, phenomenology “had gotten bogged down in abstract, theoretical reflection and had lost touch with its original inspiration to examine lived experience in a rigorous way” (Thompson 2016, pp. xx-xxi). This critical take on phenomenology motivated the original enactivists to “turn to the tradition of Buddhist philosophy and mindfulness-awareness meditation as a more promising phenomenological partner for cognitive science” (Thompson 2007, p. 413).

In time, Thompson and Varela, at least in their analysis of the specious present and their work with Natalie Depraz, came to revise original enactivism’s negative verdict concerning phenomenology’s limitations. In his later writings, Thompson admits that the authors of The Embodied Mind wrongly gave short shrift to phenomenology. For example, conceding that they had relied too heavily on second-hand sources and had not given careful attention to the primary texts, Thompson makes clear that the original enactivists mistakenly held that Husserl sponsored an unwanted brand of representationalism (see Thompson 2007, appendix A; Thompson 2016).

Many contemporary enactivists, including Thompson, openly draw on and seek to renovate ideas from the phenomenological tradition, connecting them directly with current theorizing in the cognitive sciences (Gallagher 2005, Gallagher and Zahavi 2008/2021, Gallagher 2017). As Gallagher notes, for example, there has been new work in this vein on “Husserl’s concept of the ‘I can’ (the idea that I perceive things in my environment in terms of what I can do with them); Heidegger’s concept of the pragmatic ready-to-hand (Zuhanden) attitude (we experience the world primarily in terms of pre-reflective pragmatic, action-oriented use, rather than in reflective intellectual contemplation or scientific observation); and especially Merleau-Ponty’s focus on embodied practice” (Gallagher 2017, p. 5).

iv. Buddhist Connections

A major source of inspiration for the original enactivists is Buddhist philosophy and practice. Thompson remarks in an interview that, to his knowledge, The Embodied Mind “was the first book that related Buddhist philosophy to cognitive science, the scientific study of the mind, and the Western philosophy of mind” (Littlefair 2020).

Speaking on behalf of the authors of The Embodied Mind, Rosch reports that “We turned to Buddhism because, in our judgment, it provided what both Western psychology and phenomenology lacked, a disciplined and nonmanipulative method of allowing the mind to know itself—a method that we (in retrospect naively) simply called mindfulness” (Rosch 2016, p. xli). Despite having turned to Buddhist philosophy and psychology due to a mistaken assessment of what Western phenomenology had to offer, original enactivism continues to seek fruitful dialogues between Buddhist and Western traditions of philosophy of mind. Enactivism has helped to promote the recognition that phenomenological investigations need not be limited to work done in the European tradition.

There are potential gains to be had from conducting dialogues across traditions of thought for at least two reasons. Sometimes those working in a different tradition focus on phenomena unnoticed by other traditions. And sometimes those working in a different tradition offer novel observations about phenomena that are already of common interest. Recognizing the potential value of such dialogues, enactivists have a sustained interest in what Asian traditions of thought and practice have to offer when it comes to investigating and describing experience, and “in particular the various Buddhist and Hindu philosophical analyses of the nature of the mind and consciousness, based on contemplative mental training” (Thompson 2007, p. 474).

Inspired by these efforts at cross-fertilization, Varela initially formulated neurophenomenology, which was subsequently taken up by others (Varela 1996, 1999, Thompson 2007). Neurophenomenology was developed as a novel approach to the science of consciousness —one that incorporates empirical studies of mindful, meditative practice with the aim of getting beyond the hard problem of consciousness. Although, as a practical approach to the science of consciousness, neurophenomenology certainly breaks new ground, it has been criticized for failing to adequately address the theoretical roots of the hard problem of consciousness, which are grounded in particular metaphysical commitments (see, for example, Kirchhoff and Hutto 2016 and replies from commentators).

Another enactivist borrowing from Buddhist philosophy, of a more theoretical bent, is the claim that cognition and consciousness are absolutely groundless —that they are ultimately based only on empty co-dependent arising. Thompson (2016) reports that the original working title of The Embodied Mind was Worlds Without Grounds. That initial choice of title, though later changed, shows the centrality of the idea of groundlessness for the book’s authors. As Thompson explains, the notion of groundlessness in Buddhist philosophy is meant to capture the idea “that phenomena lack any inherent and independent being; they are said to be ‘empty’ of ‘own being’” (p. xviii).

The original enactivists saw a connection between the Buddhist notion of groundlessness and their view that cognition only arises through viable organismic activity and histories of interaction that are not predetermined. For them, the idea that cognition is groundless is supported by the conception of evolution as natural drift. Accordingly, they maintain that “our human embodiment and the world that is enacted by our history of coupling reflect only one of many possible evolutionary pathways. We are always constrained by the path we have laid down, but there is no ultimate ground to prescribe the steps that we take” (Varela and others 1991, p. 213). Or, as Thompson (2016) puts it, “Cognition as the enaction of a world means that cognition has no ground or foundation beyond its own history” (p. xviii).

Thompson (2021) has emphasized the apparently far-reaching consequences this view has for mainstream conceptions of science and nature. To take it fully on board is to hold that ultimate reality is ungraspable, that it is beyond conception, or that it is not ‘findable under analysis’. As such, he observes that, on the face of it, the traditional Mahāyāna Buddhist idea of ‘emptiness’ (śūnyatā—the lack of intrinsic reality) appears to be at odds with standard, realist, and objectivist conceptions of scientific naturalism. This raises a deep question of what taking these Buddhist ideas seriously might mean “for scientific thinking and practice” (Thompson 2021, p. 78). Others too have sought to work through the implications of taking enactivist ideas seriously when thinking about an overall philosophy of nature (Hutto and Satne 2015, 2018a, 2018b; Gallagher 2017, 2018b; Meyer and Brancazio 2022). These developments raise the interesting question: To what extent, and at what point, might enactivist revisions to our understanding and practice of science come into direct tension with and begin to undermine attempts to make the notion of autonomous agency credible by “providing a factual, biological justification for it” (Varela 1991, p. 79)?

v. Sense-Making

A foundational, signature idea associated with the original version of enactivism and its direct descendants is that the autonomous agency of living systems and what it entails are a kind of sense-making. The notion of sense-making made its debut in the title of a presentation that Varela delivered in 1981, and the idea first appeared in print when that presentation was published: “Order is order, relative to somebody or some being who takes such a stance towards it. In the world of the living, order is indeed inseparable from the ways in which living systems make sense, so that they can be said to have a world” (Varela 1984, p. 208; see Thompson 2011 for further discussion of the origins of the idea). The idea that living systems are sense-making systems has proved popular with many enactivists, although, interestingly, there is no explicit mention of sense-making in The Embodied Mind.

Sense-making is variously characterized in the literature. Sometimes it is characterized austerely, serving simply as another name for the autonomous activity of living systems. In other uses, it picks out, more contentiously, what is claimed to be directly entailed by the autonomous activity of living systems. In the latter uses, different authors attribute a variety of diverse properties to sense-making activity in their efforts to demonstrate how phenomenological aspects of mind derive directly from, or are otherwise somehow connected with, the autonomous agency of living systems. Making the case for links between life and mind can be seen, broadly, as a continuation of Varela’s project “to establish a direct entailment from autopoiesis to the emergence of a world of significance” (Di Paolo and others 2018, p. 32).

At its simplest, sense-making is used to denote the autonomous agency of living systems. For example, that is how the notion is used in the following passages:

Living is a process of sense-making, of bringing forth significance and value. In this way, the environment becomes a place of valence, of attraction and repulsion, approach or escape (Thompson 2007, p. 158).

Sense-making is the capacity of an autonomous system to adaptively regulate its operation and its relation to the environment depending on the virtual consequences for its own viability as a form of life (Di Paolo and others 2018, p. 33).

Such an identification is at play when it is said that “even the simplest organisms regulate their interactions with the world in such a way that they transform the world into a place of salience, meaning, and value —into an environment (Umwelt) in the proper biological sense of the term. This transformation of the world into an environment happens through the organism’s sense-making activity” (Thompson and Stapleton 2009, p. 25). However, Di Paolo and others (2017) go further, claiming that “it is possible to deduce from processes of precarious, material self-individuation the concept of sense-making” (p. 7).

Enactivists add to this basic explication of sense-making, claiming that the autonomous activity of living systems is equivalent to, invariably gives rise to, entails, or is naturally accompanied by a plethora of additional properties: having a perspective, intentionality, interpretation, making sense of the world, care, concern, affect, values, evaluation, and meaning.

Thompson (2007) explains that the self-individuating and identity-forging activity of living systems “establishes logically and operationally the reference point or perspective for sense-making and a domain of interactions” (p. 148). It is claimed that such autonomous sense-making activity establishes “a perspective from which interactions with the world acquire a normative status” (Di Paolo and others 2018, p. 32). Di Paolo and others (2017) appear to add something more to this explication when they take sense-making to be equivalent to an organism not only having a perspective on things but having “a perspective of meaning on the world invested with interest for the agent itself” (p. 7).

Thompson (2007) tells us that according to Varela, sense-making “is none other than intentionality in its minimal and original biological form” (Thompson 2007, p. 148; see Varela 1997a, Thompson 2004). This fits with the account of intentionality provided in The Embodied Mind, according to which “embodied action is always about or directed toward something that is missing… actions of the system are always directed toward situations that have yet to become actual” (Varela and others 1991, p. 205). In their classic statement of this view, the original enactivists held that intentionality “consists primarily in the directedness of action… to what the system takes its possibilities for action to be and to how the resulting situations fulfill or fail to fulfill these possibilities” (Varela and others 1991, pp. 205-206).

Talk of sense-making, despite the minimal operational definition provided above, is sometimes used interchangeably with the notion that organisms make sense of their environments. This locution is at the heart of Varela’s initial presentation of the view in Varela (1984), but others retain the language. Thompson (2007) tells us that “an autopoietic system always has to make sense of the world so as to remain viable” (pp. 147-148). He also tells us, “Life is thus a self-affirming process that brings forth or enacts its own identity and makes sense of the world from the perspective of that identity” (Thompson 2007, p. 153). Rolla and Huffermann (2021) describe enactivists as committed to the claim that “organisms make sense of their environments through autopoiesis and sensorimotor autonomy, thereby establishing meaningful environmental encounters” (p. 345).

Enactivists also regard sense-making as the basis for values and evaluations, as these, they claim, appear even in the simplest and most basic forms of life (see, for example, Rosch 2016). This claim connects with the enactivist assumption that all living things have intrinsic purposiveness and an immanent teleology (Thompson 2007, Di Paolo and others 2018, see also Gambarotto and Mossio 2022).

Certain things are adaptive or maladaptive for organisms, and, as such, through their active sense-making, they tend to be attracted to the former and repulsed by the latter (Thompson 2007, p. 154). Accordingly, it is claimed that organisms must evaluate whatever they encounter. For example, a sense-making system “… ‘evaluates’ the environmental situation as nutrient-rich or nutrient-poor” (Di Paolo and others 2018, p. 32). It is claimed that such evaluation is necessary given that the “organism’s ‘concern’… is to keep on going, to continue living” (Di Paolo and others 2018, p. 33). Moreover, it is held that the autonomous sense-making activity of organisms generates norms that “must somehow be accessible (situations must be accordingly discernible) by the organism itself” (Di Paolo and others 2018, p. 32). So conceived, we are told that “sense-making… lies at the core of every form of action, perception, emotion, and cognition, since in no instance of these is the basic structure of concern or caring ever absent. This is constitutively what distinguishes mental life from other material and relational processes” (Di Paolo and others 2018, p. 33).

Those who have sought to develop the idea of sense-making also maintain that “cognition is behaviour in relation to meaning… that the system itself enacts or brings forth on the basis of its autonomy” (Thompson 2007, p. 126). In this regard, Cappuccio and Froese (2014) speak of an organism’s “active constitution of a meaningful ‘world-environment’ (Umwelt)” (p. 5).

Importantly, Thompson (2007) emphasizes that sense-making activity not only generates its own meaning but also simultaneously responds to it. He tells us that “meaning is generated within the system for the system itself —that is, it is generated and at the same time consumed by the system” (p. 148). This idea comes to the fore when he explicates his account of emotional responding, telling us that “an endogenously generated response… creates and carries the meaning of the stimulus for the animal. This meaning reflects the individual organism’s history, state of expectancy, and environmental context” (Thompson 2007, p. 368). Similarly, in advancing her own account of enactive emotions, Colombetti (2010) also speaks of organismic “meaning generating” activity and describes the non-neural body as a “vehicle of meaning” (2010, p. 146; p. 147).

Di Paolo and his co-authors defend similar views, holding that “the concept of sense-making describes how living organisms relate to their world in terms of meaning” (Di Paolo and others 2017, p. 7); and that an organism’s engagements with features of the environment “are appreciated as meaningful by the organism” (Di Paolo and others 2018, p. 32).

Enactivists who defend these views about sense-making are keen to note that the kind of ‘meaning’ that they assume is brought forth and consumed by organisms is not to be understood in terms of semantic content, nor does it entail the latter. As such, the kind of meaning that they hold organisms bring forth is not in any way connected to or dependent upon mental representations as standardly understood. We are told, “if we wish to continue using the term representation, then we need to be aware of what sense this term can have for the enactive approach… Autonomous systems do not operate on the basis of internal representations; they enact an environment” (Thompson 2007, pp. 58-59). Indeed, in moving away from cognitivist assumptions, a major ambition of this variety of enactivism is to establish that “behavior…expresses meaning-constitution rather than information processing” (Thompson 2007, p. 71).

In sum, a main aspiration of original enactivism is to bring notions such as sense-making to bear to demonstrate how key observations about biological autonomy can ground phenomenological aspects of mindedness such as “concernful affect, caring attitudes, and meaningful engagements that underscore embodied experience” (Di Paolo and others 2018, p. 42). The sense-making interpretation of biological autonomy is meant to justify attributing basic structures of caring, concern, meaning, sense, and value to living systems quite generally (Di Paolo and others 2018, p. 22). Crucially, it is claimed of the original version of enactivism that through its understanding of “precarious autonomy, adaptivity, and sense-making, the core aspect of mind is naturalized” (Di Paolo and others 2018, p. 33).

In pursuing its naturalizing ambition, the original version of enactivism faces a particular challenge. Simply put, the weaker —more austere and deflated —its account of sense-making, the more credible it will be for the purpose of explicating the natural origins of minds, but it will be less capable of accounting for all aspects of mindedness. Contrariwise, the stronger —more fulsome and inflated —its account of sense-making, the more capable it will be of accounting for all aspects of mindedness, but the less credible it will be for the purpose of explicating the natural origins of minds.

For example, in their original statement of enactivism, Varela and others (1991) speak of the most primitive organisms enacting domains of ‘significance’ and ‘relevance’. They add that this implies that ‘some kind of interpretation’ is going on. Yet they are careful to emphasize that they use these terms advisedly: “this interpretation is a far cry from the kinds of interpretation that depend on experience” (p. 156). More recently, Stapleton (2022) maintains that:

The autopoietic enactivist is, of course, not committed to viewing the bacterium as experiencing the value that things in its environment have for it. Nor, to viewing the bacterium as purposefully regulating its coupling with the environment, where ‘purposeful’ is understood in the terms we normally use it—as implying some kind of reflection on a goal state and striving to achieve that goal state by behaving in a way in which one could have done otherwise (p. 168).

Even if it is accepted that all cognition lies along a continuum, anyone who acknowledges that there are significantly different varieties of cognition that have additional properties not exhibited by the most basic forms must face up to the ‘scaling up’ challenge. As Froese and Di Paolo (2009) ask, “Is it a question of merely adding more complexity, that is, of just having more of the same kind of organizations and mechanisms? Then why is it seemingly impossible to properly address the hallmarks of human cognition with only these basic biological principles?” (p. 441). In this regard, Froese and Di Paolo (2009) admit that even if the notion of sense-making is thought to be appropriate for characterizing the activity of the simplest living creatures, it still “cries out for further specification that can distinguish between different modes of sense-making” (p. 446).

With the scaling up challenge in sight, several enactivists have been working to explicate how certain, seemingly distinctive high-end human forms of sense-making relate to those of the most basic, primitive forms of life (Froese and Di Paolo 2009; De Jaegher and Froese 2009; Froese, Woodward and Ikegami 2013, Kee 2018). Working in this vein, Cuffari and others (2015) and Di Paolo and others (2018) have broken new ground by providing a sense-making account of human language in their efforts to dissolve the scaling-up problem and demonstrate the full scope and power of key ideas from the original version of enactivism.

b. Sensorimotor Knowledge Enactivism

At a first pass, what is sometimes called simply sensorimotor enactivism holds that perceiving and perceptual experience “isn’t something that happens in us, it is something we do” (Noë 2004, p. 216). Accordingly, perceiving and experiencing are “realized in the active life of the skillful animal” (Noë 2004, p. 227). Its main proponent, Alva Noë (2021), tells us:

The core claim of the enactive approach, as I understand it, and as this was developed in Noë, 2004, and also O’Regan and Noë, 2001… [is that] the presence of the world, in thought and experience, is not something that happens to us but rather something that we achieve or enact (p. 958).

This version of enactivism travels under various names in the literature, including the enactive approach (Noë 2004, 2009, 2012, 2016, 2021); sensorimotor theory (O’Regan and Noë 2001; Myin and O’Regan 2002; Myin and Noë 2005; O’Regan 2011); ‘the dynamic sensorimotor approach’ (Hurley and Noë 2003), which also drew on Hurley (1998); and ‘actionism’ (Noë 2012, 2016). In Noë (2021), the new label ‘sensorimotor knowledge enactivism’ was introduced to underscore the key importance of the idea that perceiving and perceptual experiences are grounded in a special kind of knowledge. Hence, a fuller and more precise explication of the core view of this version of enactivism is that experience of the world comes in the form of an understanding that is achieved through an active exploration of the world, which is mediated by practical knowledge of its relevant sensorimotor contingencies.

The emphasis on sensorimotor understanding and knowledge is what makes this version of enactivism distinctive. Sensorimotor knowledge enactivism holds that in order “to perceive, you must have sensory stimulation that you understand” (Noë 2004, p. 183; see also p. 180, p. 3). In explicating this view, Noë (2012) is thus at pains to highlight “the central role understanding, knowledge, and skill play in opening up the world for experience… the world is blank and flat until we understand it” (Noë 2012, p. 2). Later in the same book, he underscores this crucial point yet again, saying that:

According to the actionist (or enactive) direct realism that I am developing here, there is no perceptual experience of an object that is not dependent on the exercise by the perceiver of a special kind of knowledge. Perceptual awareness of objects, for actionist-direct realism, is an achievement of sensorimotor understanding. (Noë 2012, p. 65).

These claims also echo the original statement of the view, which tells us that “the central idea of our new approach is that vision is a mode of exploration of the world that is mediated by knowledge of what we call sensorimotor contingencies” (O’Regan and Noë 2001, p. 940, see also Noë 2004, p. 228).

Putting this together, Noë (2004) holds that “all perception is intrinsically thoughtful” (2004, p. 3). Accordingly, canonical forms of perceiving and thinking really just lie at different points along the same spectrum: “perception is… a kind of thoughtful exploration of the world, and thought is… a kind of extended perception” (Noë 2012, pp. 104-105). Sensorimotor knowledge enactivism thus asks us to think of the distinction between thought and perception as “a distinction among different styles of access to what there is… thought and experience are different styles of exploring and achieving, or trying to achieve, access to the world” (Noë 2012, pp. 104-105).

The view is motivated by the longstanding observation that we cannot achieve an accurate phenomenology of experience if we only focus on the raw stimulation and perturbation of sensory modalities. A range of considerations support this general position. A proper phenomenology of experience requires an account of what it is to grasp the perceptual presence of objects in the environment. But this cannot be accounted for solely by focusing on raw sensations. The visual experience of, say, seeing a tomato is an experience of a three-dimensional object that takes up space voluminously. This cannot be explained simply by appealing to what is passively ‘given’ to or supplied by the senses. For what is, strictly, provided to the visual system is only, at most, a partial, two-dimensional take of the tomato.

Empirical findings also reveal the need to distinguish between mere sensing and experiencing. It has been shown that it is possible to be sensorially stimulated in normal ways without this resulting in the experience of features or aspects of the surrounding environment in genuinely perceptual ways —in ways that allow subjects to competently engage with worldly offerings or to make genuinely perceptual reports. This is the situation, for example, for those first learning to manipulate sensory substitution devices (O’Regan and Noë 2001, Noë 2004, Roberts 2010).

There are longstanding philosophical and empirical reasons for thinking that something must be added to sensory stimulation to yield full-blown experience of worldly offerings and to enable organisms to engage with them successfully.

A familiar cognitivist answer is that the extra ingredient needed for perceiving comes in the form of inner images or mental representations. Sensorimotor knowledge enactivism rejects these proposals, denying that perceiving depends on mental representations, however rich and detailed. In this regard, sensorimotor knowledge enactivism also sets its face against the core assumption of the popular predictive processing accounts of cognition by holding that

the world does not show up for us “as it does because we project or interpret or confabulate or hypothesize… in something like the way a scientist might posit the existence of an unobserved force” (Noë 2012, p. 5).

Sensorimotor knowledge enactivism, by contrast, holds that perceptual experience proper is grounded in the possession and use of implicit, practical knowledge such that, when such knowledge is deployed properly, it constitutes understanding and allows organisms to make successful contact with the world.

Successfully perceiving the world and enjoying perceptual experiences of it are mediated and made possible by the possession and skillful deployment of a special kind of practical knowledge of sensorimotor contingencies, namely, knowledge of the ways in which stimulation of sense modalities changes, contingent upon aspects of the environment and the organism’s own activities.

Having the sensation of softness consists in being aware that one can exercise certain practical skills with respect to the sponge: one can, for example, press it, and it will yield under the pressure. The experience of the softness of the sponge is characterized by a variety of such possible patterns of interaction with the sponge, and the laws that describe these sensorimotor interactions we call, following MacKay (1962), the laws of sensorimotor contingency (O’Regan and Noë, 2001). (O’Regan and others, 2005, p. 56, emphasis added).

Knowledge of this special sort is meant to account for the expectations that perceivers have concerning how things will appear in the light of possible actions. It amounts to knowing how things will manifest themselves if the environment is perceptually explored in certain ways. At some level, so the theory claims, successful perceivers must have implicit mastery of relevant laws concerning sensorimotor contingencies.

Echoing ideas first set out in the original version of enactivism, sensorimotor knowledge enactivism holds that the phenomenal properties of experience—what-it-is-like properties—are not to be identified with extra ingredients over and above the dynamic, interactive responses of organisms. As such, its advocates hold that “we enact our perceptual experience: we act it out” (Noë 2004, p. 1). In line with the position advanced by other enactivists, Noë (2004) claims that:

Different animals inhabit different perceptual worlds even though they inhabit the same physical world. The sights, sounds, odors, and so on that are available to humans may be unavailable to some creatures, and likewise, there is much we ourselves cannot perceive. We lack the sensorimotor tuning and the understanding to encounter those qualities. The qualities themselves are not subjective in the sense of being sensations. We don’t bring them into existence. But only a very special kind of creature has the biological capacity, as it were, to enact them (p. 156).

On their face, some of the statements Noë makes about phenomenal properties appear to be of a wholly realist bent. For example, he says, “There is a sense in which we move about in a sea of perspectival properties and are aware of them (usually without thought or notice) whenever we are perceptually conscious. Indeed, to be perceptually conscious is to be aware of them” (Noë 2004, p. 167). Yet, he also appears to endorse a middle-way position that recognizes that the world can be understood as a domain of perceptual activity just as much as it can be understood as a place consisting of or containing the properties and facts that interest us (Noë 2004, p. 167).

It is against that backdrop that Noë holds, “Colours are environmental phenomena, and colour experience depends not only on movement-dependent but also on object-dependent sensorimotor contingencies… colour experience is grounded in the complex tangle of our embodied existence” (Noë 2004, p. 158). In the end, sensorimotor knowledge enactivism offers the following answer to the problem of consciousness: “How the world shows up for us depends not only on our brains and nervous systems but also on our bodies, our skills, our environment, and the way we are placed in and at home in the world” (Noë 2012, pp. 132–133).

Ultimately, “perceptual experience presents the world as being this way or that; to have experience, therefore, one must be able to appreciate how the experience presents things as being” (Noë 2004, p. 180). This is not something that is automatically done for organisms; it is something that they sometimes achieve. Thus, “The world shows up for us thanks to what we can do… We make complicated adjustments to bring the world into focus… We achieve access to the world. We enact it by enabling it to show up for us.… If I don’t have the relevant skills of literacy, for example, the words written on the wall do not show up for me” (Noë 2012, pp. 132–133).

So understood, sensorimotor knowledge enactivism resists standard representational accounts of perception, holding that “perceivings are not about the world; they are episodes of contact with the world” (Noë 2012, p. 64). It sponsors a form of enactive realism according to which the content of perceiving only becomes properly perceptual content that represents how things are when the skillful use of knowledge makes successful contact with the world. There is no guarantee of achieving that outcome. Hence, many attempts at perceiving might be groping, provisional efforts in which we only gain access to how things appear to be and not how they are.

On this view, “perception is an activity of learning about the world by exploring it. In that sense, then, perception is mediated by appearance” (Noë 2004, p. 166). Achieving access to the world via knowledgeable, skillful exploration is to discover the relevant patterns that reveal “how things are from how they appear” (Noë 2012, p. 164). Thus, “hearing, like sight and touch, is a way of learning about the world… Auditory experience, like visual experience, can represent how things are” (Noë 2004, p. 160).

Accordingly, Noë (2004) holds that the perceptual content of experience has a dual character: it presents the world as being a certain way and presents how things are experienced, capturing how things look, or sound, or feel from the vantage point of the perceiver. It is because Noë assumes perceptual content has both of these aspects that he is able to defend the view that perceptual experience is a “way of encountering how things are by making contact with how they appear to be” (Noë 2004, p. 164).

The key equation for how this is possible, according to sensorimotor knowledge enactivism, is as follows: “How [things] (merely) appear to be plus sensorimotor knowledge gives you how things are” (Noë 2004, p. 164). Put otherwise, “for perceptual sensation to constitute experience —that is, for it to have genuine representational content —the perceiver must possess and make use of sensorimotor knowledge” (Noë 2004, p. 17).

Even though knowledge and understanding lie at the heart of sensorimotor knowledge enactivism, Noë (2012) stresses that “your consciousness of… the larger world around you is not an intellectual feat” (Noë 2012, p. 6). He proposes to explain how to square these ideas by offering a putatively de-intellectualized account of knowledge and understanding, advancing a “practical, active, tool-like conception of concepts and the understanding” (Noë 2012, p. 105).

Sensorimotor knowledge enactivism bills itself as rejecting standard representationalism about cognition while also maintaining that perceptual experiences make claims or demands on how things are (Noë 2021). Since, to this extent, sensorimotor knowledge enactivism retains this traditional notion of representational content, at its core, Noë (2021) has come to regard the ‘real task’ for defenders of this view as “to rethink what representation, content, and the other notions are or could be” (p. 961).

It remains to be seen if sensorimotor knowledge enactivism can explicate its peculiar notions of implicit, practical understanding and representational content in sufficiently novel and deflated ways that can do all the philosophical work asked of them without collapsing into or otherwise relying on standard cognitivist conceptions of such notions. This is the longstanding major challenge faced by this version of enactivism (Block 2005, Hutto 2005).

c. Radical Enactivism

Radical enactivism, also known as radical enactive cognition or REC, saw its debut in Hutto (2005) and was developed and supported in subsequent publications (Menary 2006, Hutto 2008, 2011a, 2011c, 2013a, 2013c, 2017, 2020, Hutto and Myin 2013, 2017, 2018a, 2018b, 2021). It was originally proposed as a critical adjustment to the conservative tendencies of sensorimotor enactivism, as set out in O’Regan and Noë (2001), tendencies which were deemed to be at odds with the professed anti-representationalism of the original version of enactivism. Radical enactivism proposes an account of enactive cognition that rejects characterizing or explaining the most basic forms of cognition in terms of mediating knowledge. This is because radical enactivists deem it unlikely that such notions can be non-vacuously explicated or accounted for naturalistically.

Importantly, radical enactivism never sought to advance a wholly original, new type or brand of enactivism. Instead, its goal was always to identify a minimal core set of tenable yet non-trivial enactivist theses and defend them through analysis and argument.

Much of the work of radical enactivists is subtractive—it adds by cutting away, operating on the assumption that often less is more. The adopted approach is explicated in greater detail in Evolving Enactivism, wherein several non-enactivist proposals about cognition are examined in an effort to assess whether they could be modified and allied with radical enactivism. This process, known as RECtification, is one “through which… target accounts of cognition are radicalized by analysis and argument, rendering them compatible with a Radically Enactive account of Cognition” (Hutto and Myin 2017, p. xviii).

In advancing this cause, Hutto and Myin (2013) restrict radical enactivism’s ambitions to only promoting strong versions of what they call the Embodiment Thesis and the Developmental-Explanatory Thesis.

The Embodiment Thesis conceives of basic cognition in terms of concrete, spatio-temporally extended patterns of dynamic interaction between organisms and their environments. These interactions are assumed to take the form of individuals engaging with aspects of their environments across time, often in complex ways and on multiple scales. Radical enactivists maintain that these dynamic interactions are loopy, not linear. Such sensitive interactions are assumed, constitutively, to involve aspects of the non-neural bodies and environments of organisms. Hence, they hold that cognitive activity is not restricted to what goes on in the brain. In conceiving of cognition in terms of relevant kinds of world-involving organismic activity, radical enactivists characterize it as essentially extensive, not merely extended, in contrast to what Clark and Chalmers (1998) famously argued (see Hutto and Myin 2013; Hutto, Kirchhoff and Myin 2014).

The Developmental-Explanatory Thesis holds that mentality-constituting interactions are grounded in, shaped by, and explained by nothing more than the history of an organism’s previous interactions and features of its current environment. Sentience and sapience emerge, in the main, through repeated processes of organismic engagement with environmental offerings. An organism’s prolonged history of engaged encounters is the basis of its current embodied tendencies, know-how, and skills.

Radical enactivism differs from other versions of enactivism precisely in rejecting their more extravagant claims. It seeks to get by without the assumption that basic cognition involves mediating knowledge and understanding. Similarly, radical enactivism seeks to get by without assuming that basic cognition involves sense-making. It challenges the grounds for thinking that basic forms of cognition have the full array of psychological and phenomenological attributes associated with sense-making by other enactivists. Radical enactivists, for example, resist the idea that basic cognition involves organisms somehow creating, carrying, and consuming meanings.

Additionally, radical enactivists do not assume that intentionality and phenomenality are constitutively or inseparably linked. Its supporters do not endorse the connection principle according to which intentionality and phenomenal consciousness are taken to be intrinsically related (see Searle 1992, Ch. 7; compare Varela and others, 1991, p. 22). Instead, radical enactivists maintain that there can be instances of world-directed cognition that are lacking in phenomenality, even though, in the most common human cases, acts of world-directed cognition possess a distinctive phenomenal character (Hutto 2000, p. 70).

Most pivotally, radical enactivism thoroughly rejects positing representational contents at the level of basic mentality. One of its most signature claims, and one in which it agrees with original enactivism, is that basic forms of mental activity neither involve nor are best explained by the manipulation of contentful representations. Its special contribution has been to advance novel arguments designed to support the idea that organismic activity, conceived of as engaging with features of their environments in specifiable ways, suffices for the most basic kinds of cognition.

To encourage acceptance of this view, radical enactivists articulated the hard problem of content (Hutto 2013c, Hutto and Myin 2013, Hutto and Myin 2018a, 2018b). This hard problem, posed as a challenge to the whole field, rests on the observation that information understood only in terms of covariance does not constitute any kind of content. Hutto and Myin (2013) erect this observation into a principle and use it to reveal the hard choice dilemma that anyone seeking to give a naturalistic account of basic cognition must face. The first option is to rely only on the notion of information-as-covariance in securing the naturalistic credentials of explanatory resources, at the cost of not having adequate resources to explain the natural origins of the content that basic forms of cognition are assumed to have. The second option is to presuppose an expanded or inflated notion of information, one which can adequately account for the content of basic forms of cognition, at the cost of having to surrender its naturalistic credentials. Either way, so the analysis goes, it is not possible to give a naturalized account of the content of basic forms of cognition.

Providing a straight solution to the hard problem of content requires “explaining how it is possible to get from non-semantic, non-contentful informational foundations to a theory of content using only the resources of a respectable explanatory naturalism” (Hutto 2018, p. 245).

Hutto and Myin (2013) put existing naturalistic theories of content to the test, assessing their capacity to answer this challenge. As Salis (2022, p. 1) describes this work, they offer “an ensemble of reasons” for thinking naturalistic accounts of content will fail.

Radical enactivism wears the moniker ‘radical’ due to its interest in getting to the root of issues concerning cognition and its conviction that not all versions of enactivism have been properly steadfast in their commitment to anti-content, anti-representational views about the character of basic mindedness. For example, when first explicating their conception of the aboutness or intentionality of cognition as embodied action, the original enactivists note that the mainstream assumption is that “in general, intentionality has two sides: first, intentionality includes how the system construes the world to be (specified in terms of the semantic content of intentional states); second, intentionality includes how the world satisfies or fails to satisfy this construal (specified in terms of the conditions of satisfaction of intentional states)” (Varela and others 1991, p. 205). That mainstream notion of intentionality, which is tied to a particular notion of content, is precisely the kind of intentionality that radical enactivism claims does not feature in basic cognition. In providing compelling arguments against the assumption that basic cognition is contentful in that sense, radical enactivism’s primary ambition is to strengthen enactivism by securely radicalizing it.

Several researchers have since argued that the hard problem of content has already been solved or, at least, that it can be answered in principle or otherwise avoided (Miłkowski 2015, Raleigh 2018, Lee 2019, Ramstead and others 2020, Buckner 2021, Piccinini 2022). See Hutto and Myin (2017, 2018a, 2018b) and Segundo-Ortin and Hutto (2021) for assessments of these potential moves.

On the positive side of the ledger, radical enactivists contend that the kind of mindedness found at the roots of cognition can be fruitfully characterized as a kind of Ur-intentionality. It is a kind of intentionality that lacks the sort of content associated with truth or accuracy conditions (Hutto and Myin 2013, 2017, 2018a, Zahnoun 2020, 2021b, 2021c). Moreover, radical enactivists hold that we can adequately account for Ur-intentionality, naturalistically, using biosemiotics – a modified teleosemantics inspired, in the main, by Millikan (1984) but stripped of its problematic semantic ambitions. This proposed adjustment of Millikan’s theory was originally advanced in Hutto (1999) in the guise of a modest biosemantics that sought to explain forms of intentionality with only nonconceptual content. That version of the position was abandoned and later radicalized to become a content-free biosemiotics (see Hutto 2006, 2008, Ch. 3). The pros and cons of the Ur-intentionality proposal continue to be debated in the literature (Abramova and Villalobos 2015, De Jesus 2016, 2018, Schlicht and Starzak 2019, Legg 2021, Paolucci 2021, Zipoli Caiani 2022, Mann and Pain 2022).

Importantly, radical enactivists only put biosemiotics to the theoretical use of explicating the properties of non-contentful forms of world-involving cognition. Relatedly, they hold that when engaged in acts of basic cognition, organisms are often sensitive to covariant environmental information, even though it is a mere metaphor to say organisms process it. Although organisms are sensitive to relevant indicative, informational relationships, “these relationships were not lying about ready-made to be pressed into service for their purposes” (Hutto 2008, pp. 53–54). When it comes to understanding biological cognition, the existence of the relevant correspondences is not explained by appeals to ahistorical natural laws but by various selectionist forces.

As Thompson (2011b) notes, if radical enactivism’s account of biosemiotics is to find common ground with original enactivism and its direct descendants, it would have to put aside strong adaptationist views of evolution. In fact, although radical enactivism does place great explanatory weight on natural selection, it agrees with original enactivism at least to the extent that it does not hold that biological traits are individually optimized or selected for in isolation from one another so as to make organisms maximally fit to deal with features of a neutral, pre-existing world.

Radical enactivists accept that content-involving cognition exists even though they hold that our basic ways of engaging with the world and others are contentless. In line with this position, they have sought to develop an account of The Natural Origins of Content, a project pursued in several publications by Hutto and Satne (2015, 2017a, 2017b) and Hutto and Myin (2017). In these works, the authors have proposed that capacities for contentful speech and thought emerge with the mastery of distinctive socio-cultural practices—specifically, varieties of discursive practices with their own special norms. These authors also hold that the mastery of such practices introduces kinks into the cognitive mix, such as the capacity for ratio-logical reasoning (see, for example, Rolla 2021). Nevertheless, defenders of radical enactivism maintain that these kinks do not constitute a gap or break in the natural or evolutionary order (see Myin and Van den Herik 2020 for a defense of this position and Moyal-Sharrock 2021b for its critique). Instead, radical enactivists argue that the content-involving practices that enable the development of distinctively kinky cognitive capacities can be best understood as a product of constructed environmental niches (Hutto and Kirchhoff 2015). Rolla and Huffermann (2021) propose that, in fleshing out this account, radical enactivism could combine with the recent work on linguistic bodies by Di Paolo and others (2018) to understand the cognitive basis of language mastery, characterizing it as a kind of norm-infused and acquired shared know-how.

3. Forerunners

In the opening pages of Sensorimotor Life, its authors describe their contribution to the enactive literature as that of adding a ‘tributary to the flow of ideas’ which found its first expression in Varela, Thompson and Rosch’s The Embodied Mind. Making use of that metaphor, they also astutely note the value of looking “upstream to discover ‘new’ predecessors,” namely precursors to enactivism that can only be identified in retrospect: those which might qualify as “enactivists avant la lettre” (Di Paolo and others 2017, p. 3).

Enactivism clearly has “roots that predate psychology in its modern academic form” (Baerveldt and Verheggen 2012, p. 165). For example, in challenging early modern Cartesian conceptions of the mind as a kind of mechanism, it reaches back to a more Aristotelian vision of the mind that emphasizes its biological basis and features shared with all living things. Baerveldt and Verheggen (2012) also see clear links between enactivism and “a particular ‘radical’ tradition in Western Enlightenment thinking that can be traced at least to Spinoza” (p. 165). Gallagher argues that Anaxagoras should be considered the first enactivist based on his claim that human hands are what make humans the most intelligent of animals.

In the domain of biological ecology, there are clear and explicit connections between enactivism and the work of the German biologist Jakob von Uexküll, who introduced the notion of Umwelt, which had great influence in cybernetics and robotics. Resonances with enactivism can also be found in the work of Helmuth Plessner, a German sociologist and philosopher who studied with Husserl and authored Levels of Organic Life and the Human.

Another philosopher, Hans Jonas, who studied with both Heidegger and Husserl, stands out in this regard. As Di Paolo and others (2017) note, “Varela read his work relatively late in his career and was impressed with the resonances with his own thinking” (p. 3). In a collection of his essays, The Phenomenon of Life, very much in the spirit of the original version of enactivism, Jonas defends the view that there exists a deep, existential continuity between life and mind.

Many key enactivist ideas have also been advanced by central figures in the American pragmatist tradition. As Gallagher (2017) observes, “many of the ideas of Peirce, Dewey, and Mead can be considered forerunners of enactivism” (p. 5). Gallagher and Lindgren (2015) go a step further, maintaining that the pioneers of enactivism “could have easily drawn on the work of John Dewey and other pragmatists. Indeed, long before Varela and others (1991), Dewey (1896) clearly characterized what has become known as enactivism” (p. 392). See also Gallagher (2014), Gallagher and Miyahara (2012), and Barrett (2019).

In advocating the so-called actional turn, enactivists touch on recurrent themes of central importance in Wittgenstein’s later philosophy, in particular his emphasis on the importance of our animal nature, forms of life, and the fundamental importance of action for understanding mind, knowledge, and language use. Contemporary enactivists characterize the nature of minds and how they fundamentally relate to the world in ways that not only echo but, in many ways, fully concur with the later Wittgenstein’s trademark philosophical remarks on the same topics. Indeed, Moyal-Sharrock (2021a) goes so far as to say that “Wittgenstein is —and should be recognized to be —at the root of the important contemporary philosophical movement called enactivism” (p. 8). The connections between Wittgenstein and enactivism are set out by many other authors (Hutto 2013d, 2015c, Boncompagni 2013, Loughlin 2014, 2021a, 2021b, Heras-Escribano and others 2015. See also Loughlin 2021c, for a discussion of how some of Wittgenstein’s ideas might also challenge enactivist assumptions).

4. Debates

Enactivism bills itself as providing an antidote to accounts of cognition that “take representation as their central notion” (Varela and others 1991, p. 172). Most fundamentally, in proposing that minds, like all living systems, are distinguished from machines by their biological autonomy, it sees itself as opposed to and rejects computational theories and functionalist theories of mind, including extended functionalist theories of mind (Di Paolo and others 2017, Gallagher 2017). Enactivism thus looks to work in robotics in the tradition of Brooks (1991) and dynamical systems theory (Smith and Thelen 1994, Beer 1998, Juarrero 1999) for representation-free and model-free ways of characterizing and potentially explaining extensive cognitive activity (Kirchhoff and Meyer 2019, Meyer 2020a, 2020b).

In a series of publications, Villalobos and coauthors offer a sustained critique of enactivism for its commitment to biological autonomy on the grounds that its conception of mind is not sufficiently naturalistic. These critics deem enactivism’s commitment to teleology as the most problematic and seek to develop, in its place, an account of biological cognition built on a more austere interpretation of autopoietic theory (Villalobos 2013, Villalobos and Ward 2015, Abramova and Villalobos 2015, Villalobos and Ward 2016, Villalobos and Silverman 2018, Villalobos 2020, Villalobos and Razeto-Barry 2020, Villalobos and Palacios 2021).

An important topic in this body of work, taken up by Villalobos and Dewhurst (2017), is the proposal that enactivism may be compatible, despite its resistance to the idea, with a computational approach to cognitive mechanisms. This possibility seems plausible to some given the articulation of conceptions of computation that allow for computation without representation (see, for example, Piccinini 2008, 2015, 2020). For a critical response to the suggestion that enactivism is or should want to be compatible with a representation-free computationalism, see Hutto and others (2019) and Hutto and others (2020).

Several authors see great potential in allying enactivism and ecological psychology, a tradition in psychology initiated by James Gibson which places responsiveness to affordances at its center (Gibson 1979). In recent times, this possibility has become more attractive with the articulation of radical embodied cognitive science (Chemero 2009), which seeks to connect Gibsonian ideas with dynamical systems theory without invoking mental representations.

A joint ecological-enactive approach to cognition has been proposed in the form of the skilled intentionality framework (Rietveld and Kiverstein 2014, Bruineberg and Rietveld 2014, Kiverstein and Rietveld 2015, 2018, Bruineberg and others 2016, Rietveld, Denys and Van Westen 2018, Bruineberg, Chemero and Rietveld 2019). It seeks to provide an integrated basis for understanding the situated and affective aspects of the embodied mind, emphasizing that organisms must always be sensitive to multiple affordances simultaneously in concrete situations.

The task of ironing out apparent disagreements between enactivism and ecological psychology to forge a tenable alliance of these two traditions has also been actively pursued by others (see Heras-Escribano 2016, Stapleton 2016, Segundo-Ortin and others 2019, Heras-Escribano 2019, Crippen 2020, Heft 2020, Myin 2020, Ryan and Gallagher 2020, Segundo-Ortin 2020, McGann and others 2020, Heras-Escribano 2021, Jurgens 2021, Rolla and Novaes 2022).

A longstanding sticking point that has impeded a fully-fledged enactivist-ecological psychology alliance is the apparent tension between enactivism’s wholesale rejection of the notion that cognition involves information processing and the tendency of those in the ecological psychology tradition to talk of perception as involving the ‘pickup’ of information ‘about’ environmental affordances (see Varela and others 1991, pp. 201–204; Hutto and Myin 2017, p. 86). See also Van Dijk and others (2015). The use of such language can make it appear as if the Gibsonian framework is committed to the view that perceiving is a matter of organisms attuning to the covariant structures of a pre-given world. Notably, Baggs and Chemero (2021) attempt to directly address this obstacle to uniting the two frameworks (see also de Carvalho and Rolla 2020).

There have been attempts to take enactivist ideas seriously by some versions of predictive processing theories of cognition. In several publications, Andy Clark (2013, 2015, 2016) has sought to develop a version of predictive processing accounts of cognition that is informed, to some extent, by the embodied, non-intellectualist, action-orientated vision of mind promoted by enactivists.

Yet most enactivist-friendly advocates of predictive processing accounts of cognition tend to baulk when it comes to giving up the idea that cognition is grounded in models and mental representations. Clark (2015) tells us that he cannot imagine how to get by without such constructs when he rhetorically asks himself, “Why not simply ditch the talk of inner models and internal representations and stay on the true path of enactivist virtue?” (Clark 2015, p. 4; see also Clark 2016, p. 293). Whether a tenable compromise is achievable or whether there is a way around this impasse is a recurring and now prominent theme in the literature on predictive processing (see, for example, Gärtner and Clowes 2017, Constant and others 2021, Venter 2021, Constant and others 2022, Gallagher and others 2022, Gallagher 2022b).

Several philosophers have argued that it is possible to develop entirely non-representationalist predictive processing accounts of cognition that could be fully compatible with enactivism (Bruineberg and Rietveld 2014; Bruineberg, Kiverstein, and Rietveld 2016; Bruineberg and others 2018; Bruineberg and Rietveld 2019). This promised union comes in the form of what Venter (2021) has called free energy enactivism. The Free Energy Principle articulated by Friston (2010, 2011) maintains that what unites all self-organizing systems (including non-living systems) is that they work to minimize free energy. Many have sought to build similar bridges between enactivism and free energy theorizing (Kirchhoff 2015, Kirchhoff and Froese 2017, Kirchhoff and Robertson 2018, Kirchhoff 2018a, 2018b, Kirchhoff and others 2018, Robertson and Kirchhoff 2019, Ramstead and others 2020a, Hesp and others 2019). However, Di Paolo, Thompson, and Beer (2022) identify what they take to be fundamental differences between the enactive approach and the free energy framework that appear to make such a union unlikely, if not impossible.

5. Applications and Influence

Enactivism’s novel framework for conceiving of minds and our place in nature has proved fertile and fecund. Enactivism serves as an attractive philosophical platform from which many researchers and practitioners are inspired to launch fresh investigations into a great variety of topics—investigations that have potentially deep and wide-ranging implications for theory and practice.

In the domain of philosophy of psychology, beyond breaking new ground in our thinking about the phenomenality and intentionality of perception and perceptual experience, enactivism has generated many fresh lines of research. Enactivists have contributed to new thinking about: the nature of habits and their intelligence (for example, Di Paolo and others 2017; Ramírez-Vizcaya and Froese 2019; Zarco and Egbert 2019; Hutto and Robertson 2020); emotions and, especially, the distinction in the affective sciences between basic and non-basic emotions (for example, Colombetti and Thompson 2008; Hutto 2012; Colombetti 2014; Hutto, Robertson, and Kirchhoff 2018); pretense (Rucińska 2016, 2019; Weichold and Rucińska 2021, 2022); imagination (for example, Thompson 2007; Medina 2013; Hutto 2015a; Roelofs 2018; Facchin 2021); memory (for example, Hutto and Peeters 2018; Michaelian and Sant’Anna 2021); mathematical cognition (for example, Zahidi and Myin 2016; Gallagher 2017, 2019; Hutto 2019; Zahidi 2021); and social cognition, where enactivists have, in particular, advanced the proposal that the most basic forms of intersubjectivity take the form of direct, engaged interactions between agents, variously understood in terms of unprincipled embodied engagements scaffolded by narrative practices (Hutto 2006; Gallagher and Hutto 2008; see also Paolucci 2020; Hutto and Jurgens 2019), interaction theory (Gallagher 2005, 2017, 2020a), and participatory sense-making (De Jaegher and Di Paolo 2007; De Jaegher 2009).

In addition to stimulating new thinking about mind and cognition, enactive ideas have also influenced research on topics in many other domains, including: AI and technological development (Froese and Ziemke 2009; Froese and others 2012; Ihde and Malafouris 2019; Sato and McKinney 2022; Rolla and others 2022); art, music, and aesthetics (Noë 2015; Schiavio and De Jaegher 2017; Fingerhut 2018; Murphy 2019; Gallagher 2021; Høffding and Schiavio 2021); cognitive archaeology (Garofoli 2015, 2018, 2019; Garofoli and Iliopoulos 2018); cross-cultural philosophy (McKinney 2020; Janz 2022; Lai 2022); education and pedagogical design (Hutto and others 2015; Gallagher and Lindgren 2015; Abrahamson and others 2016; Hutto and Abrahamson 2022); epistemology (Vörös 2016; Venturinha 2016; Rolla 2018; De Jaegher 2021; Moyal-Sharrock 2021); ethics and values (Varela 1999a; Colombetti and Torrance 2009; Di Paolo and De Jaegher 2022); expertise and skilled performance (Hutto and Sánchez-García 2015; Miyahara and Segundo-Ortin 2022; Robertson and Hutto 2023); mental health, psychopathology, and psychiatry (Fuchs 2018; de Haan 2020; Jurgens and others 2020; Maiese 2022b, 2022c, 2022d); and rationality (Rolla 2021).

6. Conclusion

There can be no doubt that enactivism is making waves in today’s philosophy, cognitive science, and beyond the boundaries of the academy. Although still a young movement, enactivism has established itself as a force to be reckoned with in our thinking about mind, cognition, the world around us, and many other related topics. What remains to be seen is whether, and to what extent, different versions of enactivism will continue to develop productively, whether they will unite or diverge, whether they will find new partners, and, most crucially, whether enactivist ideas will continue to be actively taken up and widely influential. For now, this much is certain: the enactivist game is very much afoot.

7. References and Further Reading

  • Abrahamson, D., Shayan, S., Bakker, A., and Van der Schaaf, M. 2016. Eye-tracking Piaget: Capturing the Emergence of Attentional Anchors in the Coordination of Proportional Motor Action. Human Development, 58(4-5), 218–244.
  • Abramova, K. and Villalobos, M. 2015. The Apparent Ur-Intentionality of Living Beings and the Game of Content. Philosophia, 43(3), 651-668.
  • Baerveldt, C. and Verheggen, T. 2012. Enactivism. The Oxford Handbook of Culture and Psychology. Valsiner, J. (ed). Oxford: Oxford University Press. pp. 165–190.
  • Baggs, E. and Chemero, A. 2021. Radical Embodiment in Two Directions. Synthese, 198:S9, 2175–2190.
  • Barandiaran, X. E. 2017. Autonomy and Enactivism: Towards a Theory of Sensorimotor Autonomous Agency. Topoi, 36(3), 409–430.
  • Barandiaran, X. and Di Paolo, E. 2014. A Genealogical Map of the Concept of Habit. Frontiers in Human Neuroscience.
  • Barrett, L. 2019. Enactivism, Pragmatism … Behaviorism? Philosophical Studies. 176(3), 807–818.
  • Beer, R. 1998. Framing the Debate between Computational and Dynamical Approaches to Cognitive Science. Behavioral and Brain Sciences, 21(5), 630-630.
  • Boncompagni, A. 2020. Enactivism and Normativity: The case of Aesthetic Gestures. JOLMA – The Journal for the Philosophy of Language, Mind, and the Arts, 2(1):177-194.
  • Boncompagni, A. 2013. Enactivism and the ‘Explanatory Trap’: A Wittgensteinian Perspective. Methode – Analytic Perspectives, 2, 27-49.
  • Brooks, R. 1991. Intelligence without Representation. Artificial Intelligence. 47: 139-159.
  • Bruineberg, J., Chemero, A., and Rietveld, E. 2019. General Ecological Information supports Engagement with Affordances for ‘Higher’ Cognition. Synthese, 196(12), 5231–5251.
  • Bruineberg, J., Kiverstein, J., and Rietveld, E. 2016. The Anticipating Brain is Not a Scientist: The Free-Energy Principle from an Ecological-Enactive Perspective. Synthese, 195(6), 2417-2444.
  • Bruineberg, J., and Rietveld, E. 2014. Self-organisation, Free Energy Minimization, and Optimal Grip on a Field of Affordances. Frontiers in Human Neuroscience 8(599), 1-14. doi.org/10.3389/fnhum.2014.00599.
  • Buckner, C. 2021. A Forward-Looking Theory of Content. Ergo. 8:37. 367-401.
  • Burnett, M. and Gallagher, S. 2020. 4E Cognition and the Spectrum of Aesthetic Experience. JOLMA – The Journal for the Philosophy of Language, Mind, and the Arts. 1: 2. 157–176.
  • Candiotto, L. 2022. Loving the Earth by Loving a Place: A Situated Approach to the Love of Nature, Constructivist Foundations, 17(3), 179–189.
  • Cappuccio, M. and Froese, T. 2014. Introduction. In Cappuccio, M. and Froese, T. (eds.), Enactive Cognition at the Edge of Sense-Making: Making Sense of Nonsense. Basingstoke: Palgrave Macmillan. pp. 1-33.
  • Chemero, A. 2009. Radical Embodied Cognitive Science. Cambridge, MA: MIT Press.
  • Clark, A. 2015. Predicting Peace: The End of the Representation Wars: Reply to Madary. In Open MIND: 7(R), ed. T. Metzinger and J. M. Windt. MIND Group. doi: 10.15502/9783958570979.
  • Clark, A. 2016. Surfing Uncertainty: Prediction, Action, and the Embodied Mind. New York: Oxford University Press.
  • Colombetti, G. 2014. The Feeling Body: Affective Science Meets the Enactive Mind. Cambridge, MA, MIT Press.
  • Colombetti, G. 2010. Enaction, Sense-Making and Emotion. In Stewart J, Gapenne O, and Paolo, E.D. (eds.). Enaction: Toward a New Paradigm for Cognitive Science, Cambridge MA: MIT Press, 145-164.
  • Colombetti, G. and Torrance, S. 2009. Emotion and Ethics: An Inter-(En)active Approach. Phenomenology and the Cognitive Sciences, 8 (4): 505-526.
  • Colombetti, G. and Thompson, E. 2008. The Feeling Body: Towards an Enactive Approach to Emotion. In W. F. Overton, U. Müller and J. L. Newman (eds.), Developmental Perspectives on Embodiment and Consciousness. Erlbaum. pp. 45-68.
  • Constant, A., Clark, A., and Friston, K. 2021. Representation Wars: Enacting an Armistice Through Active Inference. Frontiers in Psychology.
  • Constant, A., Clark, A., Kirchhoff, M., and Friston, K. 2022. Extended Active Inference: Constructing Predictive Cognition Beyond Skulls. Mind and Language, 37(3), 373-394.
  • Crippen, M. 2020. Enactive Pragmatism and Ecological Psychology. Frontiers in Psychology, 11. 203–204.
  • Cuffari, E.C., Di Paolo, E.A., and De Jaegher, H. 2015. From Participatory Sense-Making to Language: There and Back Again. Phenomenology and the Cognitive Sciences, 14 (4), 1089-1125.
  • de Carvalho, E., and Rolla, G. 2020. An Enactive‐Ecological Approach to Information and Uncertainty. Frontiers in Psychology, 11, 1–11.
  • de Haan, S. 2020. Enactive Psychiatry. Cambridge, UK: Cambridge University Press.
  • De Jaegher, H. 2021. Loving and Knowing: Reflections for an Engaged Epistemology. Phenomenology and the Cognitive Sciences 20, 847–870.
  • De Jaegher, H. 2015. How We Affect Each Other: Michel Henry’s ‘Pathos-With’ and the Enactive Approach to Intersubjectivity. Journal of Consciousness Studies 22 (1-2), 112-132.
  • De Jaegher, H. 2013. Embodiment and Sense-Making in Autism. Frontiers in Integrative Neuroscience, 7, 15. doi:10.3389/fnint.2013.00015
  • De Jaegher, H. 2009. Social Understanding through Direct Perception? Yes, by Interacting. Consciousness and Cognition. 18 (2), 535-542.
  • De Jaegher, H. and Di Paolo, E.A. 2008. Making Sense in Participation: An Enactive Approach to Social Cognition. In Morganti, F. and others (eds.). Enacting Intersubjectivity. IOS Press.
  • De Jaegher, H. and Di Paolo, E. 2007. Participatory Sense-Making: An Enactive Approach to Social Cognition. Phenomenology and the Cognitive Sciences 6(4): 485-507.
  • De Jaegher, H, Di Paolo, E.A, and Gallagher, S. 2010. Can Social Interaction Constitute Social Cognition? Trends in Cognitive Sciences, 14 (10), 441-447.
  • De Jaegher, H., and Froese, T. 2009. On the Role of Social Interaction in Individual Agency. Adaptive Behavior, 17(5), 444‐460.
  • De Jaegher, H., Pieper, B., Clénin, D., and Fuchs, T. 2017. Grasping Intersubjectivity: An Invitation to Embody Social Interaction Research. Phenomenology and the Cognitive Sciences, 16(3), 491–523.
  • De Jesus, P. 2018. Thinking through Enactive Agency: Sense‐Making, Bio‐semiosis and the Ontologies of Organismic Worlds. Phenomenology and the Cognitive Sciences, 17(5), 861–887.
  • De Jesus, P. 2016. From Enactive Phenomenology to Biosemiotic Enactivism. Adaptive Behavior. 24(2): 130-146.
  • Degenaar, J., and O’Regan, J. K. 2017. Sensorimotor Theory and Enactivism. Topoi, 36, 393–407.
  • Deguchi, S., Shimoshige, H., Tsudome, M., Mukai, S., Corkery, R.W., Ito, S., and Horikoshi, K. 2011. Microbial Growth at Hyperaccelerations up to 403,627 × g. PNAS. 108:19. 7997-8002.
  • Dewey, J. 1922. Human Nature and Conduct: An Introduction to Social Psychology, 1st edn. New York: Holt.
  • Di Paolo, E. A. 2021. Enactive Becoming. Phenomenology and the Cognitive Sciences, 20, 783–809.
  • Di Paolo, E. A. 2018. The Enactive Conception of Life. In A. Newen, De L. Bruin, and S. Gallagher (eds.). The Oxford Handbook of 4E Cognition (pp. 71–94). Oxford: Oxford University Press.
  • Di Paolo, E. A. 2009. Extended Life. Topoi, 28, 9–21.
  • Di Paolo, E. A. 2005. Autopoiesis, Adaptivity, Teleology, Agency. Phenomenology and the Cognitive Sciences, 4, 429–452.
  • Di Paolo, E., Buhrmann, T., and Barandiaran, X. E. 2017. Sensorimotor Life. Oxford: Oxford University Press.
  • Di Paolo, E. A., Cuffari, E. C., and De Jaegher, H. 2018. Linguistic Bodies. The Continuity Between Life and Language. Cambridge, MA: MIT Press.
  • Di Paolo, E.A. and De Jaegher, H. 2022. Enactive Ethics: Difference Becoming Participation. Topoi. 41, 241–256.
  • Di Paolo, E.A. and De Jaegher, H. 2012. The Interactive Brain Hypothesis. Frontiers in Human Neuroscience.
  • Di Paolo, E.A., Rohde, M. and De Jaegher, H. 2010. Horizons for the Enactive Mind: Values, Social Interaction, and Play. In Stewart, J., Gapenne, O., and Di Paolo, E.A. (eds). Enaction: Toward a New Paradigm for Cognitive Science. Cambridge, MA: MIT Press.
  • Di Paolo, E. A. and Thompson, E. 2014. The Enactive Approach. In L. Shapiro (Ed.), The Routledge Handbook of Embodied Cognition (pp. 68–78). London: Routledge.
  • Di Paolo, E., Thompson, E. and Beer, R. 2022. Laying Down a Forking Path: Tensions between Enaction and the Free Energy Principle. Philosophy and the Mind Sciences. 3.
  • Facchin, M. 2021. Is Radically Enactive Imagination Really Contentless? Phenomenology and the Cognitive Sciences. 21. 1089–1105.
  • Fingerhut, J. 2018. Enactive Aesthetics and Neuroaesthetics. Phenomenology and Mind, 14, 80–97.
  • Froese, T. and Di Paolo, E.A. 2011. The Enactive Approach: Theoretical Sketches from Cell to Society. Pragmatics and Cognition, 19 (1), 1-36.
  • Froese, T., and Di Paolo, E.A. 2009. Sociality and the Life‐Mind Continuity Thesis. Phenomenology and the Cognitive Sciences, 8(4), 439-463.
  • Froese, T., McGann, M., Bigge, W., Spiers, A., and Seth, A.K. 2012. The Enactive Torch: A New Tool for the Science of Perception. IEEE Transactions on Haptics. 5(4): 365-375.
  • Froese, T., Woodward, A. and Ikegami, T. 2013. Turing Instabilities in Biology, Culture, and Consciousness? On the Enactive Origins of Symbolic Material Culture. Adaptive Behavior, 21 (3), 199-214.
  • Froese, T., and Ziemke, T. 2009. Enactive Artificial Intelligence: Investigating the Systemic Organization of Life and Mind. Artificial Intelligence, 173(3–4), 466–500.
  • Fuchs, T. 2018. Ecology of the Brain: The Phenomenology and Biology of the Embodied Mind. New York: Oxford University Press.
  • Gallagher, S. 2022b. Surprise! Why Enactivism and Predictive Processing are Parting Ways: The Case of Improvisation. Possibility Studies and Society.
  • Gallagher, S. 2021. Performance/Art: The Venetian Lectures. Milan: Mimesis International Edizioni.
  • Gallagher, S. 2020a. Action and Interaction. Oxford: Oxford University Press.
  • Gallagher, S. 2020b. Enactivism, Causality, and Therapy. Philosophy, Psychiatry, and Psychology, 27 (1), 27-28.
  • Gallagher, S. 2018a. Educating the Right Stuff: Lessons in Enactivist Learning. Educational Theory. 68 (6): 625-641.
  • Gallagher, S. 2018b. Rethinking Nature: Phenomenology and a Non-Reductionist Cognitive Science. Australasian Philosophical Review. 2 (2): 125-137
  • Gallagher, S. 2017. Enactivist Interventions: Rethinking the Mind. Oxford: Oxford University Press.
  • Gallagher, S. 2014. Pragmatic Interventions into Enactive and Extended Conceptions of Cognition. Philosophical Issues, 24 (1), 110-126.
  • Gallagher, S. 2005. How the Body Shapes the Mind. New York: Oxford University Press.
  • Gallagher, S. and Bower, M. 2014. Making Enactivism Even More Embodied. Avant, 5 (2), 232-247.
  • Gallagher, S. and Hutto, D. 2008. Understanding Others through Primary Interaction and Narrative Practice. In Zlatev, J., Racine, T., Sinha, C. and Itkonen, E. (eds). The Shared Mind: Perspectives on Intersubjectivity. John Benjamins. 17-38.
  • Gallagher, S., Hutto, D. and Hipólito, I. 2022. Predictive Processing and Some Disillusions about Illusions. Review of Philosophy and Psychology. 13, 999–1017.
  • Gallagher, S. and Lindgren, R. 2015. Enactive Metaphors: Learning through Full-Body Engagement. Educational Psychological Review. 27: 391–404.
  • Gallagher, S. and Miyahara, K. 2012. Neo-Pragmatism and Enactive Intentionality. In: Schulkin, J. (eds) Action, Perception and the Brain. New Directions in Philosophy and Cognitive Science. Palgrave Macmillan, London.
  • Garofoli, D. 2019. Embodied Cognition and the Archaeology of Mind: A Radical Reassessment. In Marie Prentiss, A. M. (ed). Handbook of Evolutionary Research in Archaeology. Springer. 379-405.
  • Garofoli, D. 2018. RECkoning with Representational Apriorism in Evolutionary Cognitive Archaeology. Phenomenology and the Cognitive Sciences.17, 973–995.
  • Garofoli, D. 2015. A Radical Embodied Approach to Lower Palaeolithic Spear-making. The Journal of Mind and Behavior. 1-25.
  • Garofoli, D. and Iliopoulos, A. 2018. Replacing Epiphenomenalism: A Pluralistic Enactive Take on the Metaplasticity of Early Body Ornamentation. Philosophy and Technology, 32, 215–242.
  • Gärtner, K. and Clowes, R. 2017. Enactivism, Radical Enactivism and Predictive Processing: What is Radical in Cognitive Science? Kairos. Journal of Philosophy and Science, 18(1). 54-83.
  • Gibson, J.J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
  • Godfrey‐Smith, P. 2001. Three Kinds of Adaptationism. In Hecht Orzack, S. H. (ed). Adaptationism and Optimality (pp. 335–357). Cambridge: Cambridge University Press.
  • Gould, S. J., and Lewontin, R.C. 1979. The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme. Proceedings of the Royal Society of London—Biological Sciences, 205(1161), 581–598.
  • Heft, H. 2020. Ecological Psychology and Enaction Theory: Divergent Groundings. Frontiers in Psychology.
  • Heras-Escribano, M. 2021. Pragmatism, Enactivism, and Ecological Psychology: Towards a Unified Approach to Post-Cognitivism. Synthese, 198 (1), 337-363.
  • Heras-Escribano, M. 2019. The Philosophy of Affordances. Basingstoke: Palgrave Macmillan.
  • Heras-Escribano, M. 2016. Embracing the Environment: Ecological Answers for Enactive Problems. Constructivist Foundations, 11 (2), 309-312.
  • Heras-Escribano, M, Noble, J., and De Pinedo, M. 2015. Enactivism, Action and Normativity: a Wittgensteinian Analysis. Adaptive Behavior, 23 (1), 20-33.
  • Hesp, C., Ramstead, M., Constant, A., Badcock, P., Kirchhoff, M., Friston, K. 2019. A Multi-scale View of the Emergent Complexity of Life: A Free-Energy Proposal. In: Georgiev, G., Smart, J., Flores Martinez, C., Price, M. (eds) Evolution, Development and Complexity. Springer Proceedings in Complexity. Springer, Cham.
  • Hipólito, I, Hutto, D.D., and Chown, N. 2020. Understanding Autistic Individuals: Cognitive Diversity not Theoretical Deficit. In Rosqvist, H., Chown, N., and Stenning, A. (eds). Neurodiversity Studies: A New Critical Paradigm, 193-209.
  • Høffding, S. and Schiavio. A. 2021. Exploratory Expertise and the Dual Intentionality of Music-Making. Phenomenology and the Cognitive Sciences, 20 (5): 811-829.
  • Hurley, S. L. 1998. Consciousness in Action. Cambridge, MA: Harvard University Press.
  • Hurley, S. and Noë, A. 2003. Neural Plasticity and Consciousness. Biology and Philosophy 18, 131–168.
  • Hutto, D. D. 2020. From Radical Enactivism to Folk Philosophy. The Philosophers’ Magazine. 88. 75-82.
  • Hutto, D.D. 2019. Re-doing the Math: Making Enactivism Add Up. Philosophical Studies. 176. 827–837.
  • Hutto, D.D. 2018. Getting into Predictive Processing’s Great Guessing Game: Bootstrap Heaven or Hell? Synthese, 195, 2445–2458.
  • Hutto, D.D. 2017. REC: Revolution Effected by Clarification. Topoi. 36:3. 377–391
  • Hutto, D.D. 2015a. Overly Enactive Imagination? Radically Re-imagining Imagining. The Southern Journal of Philosophy. 53. 68–89.
  • Hutto, D.D. 2015b. Contentless Perceiving: The Very Idea. In Wittgenstein and Perception. O’Sullivan, M. and Campbell, M. (eds). London: Routledge. 64-84.
  • Hutto, D.D. 2013a. Radically Enactive Cognition in our Grasp. In The Hand – An Organ of the Mind. Radman, Z. (ed). Cambridge, MA: MIT Press. 227-258.
  • Hutto, D.D. 2013b. Enactivism from a Wittgensteinian Point of View. American Philosophical Quarterly. 50(3). 281-302.
  • Hutto, D.D. 2013c. Psychology Unified: From Folk Psychology to Radical Enactivism. Review of General Psychology. 17(2). 174-178.
  • Hutto, D.D. 2012. Truly Enactive Emotion. Emotion Review. 4:1. 176-181.
  • Hutto, D.D. 2011a. Philosophy of Mind’s New Lease on Life: Autopoietic Enactivism meets Teleosemiotics. Journal of Consciousness Studies. 18:5-6. 44-64.
  • Hutto, D.D. 2011c. Enactivism: Why be Radical? In Sehen und Handeln. Bredekamp, H. and Krois, J. M. (eds). Berlin: Akademie Verlag. 21-44
  • Hutto, D.D. 2008. Folk Psychological Narratives: The Socio-Cultural Basis of Understanding Reasons. Cambridge, MA: The MIT Press.
  • Hutto, D.D. 2006. Unprincipled Engagement: Emotional Experience, Expression and Response. In Menary, R. (ed.), Radical Enactivism: Intentionality, Phenomenology and Narrative: Focus on the Philosophy of Daniel D. Hutto. Amsterdam: John Benjamins.
  • Hutto, D.D. 2005. Knowing What? Radical versus Conservative Enactivism. Phenomenology and the Cognitive Sciences. 4(4). 389-405.
  • Hutto, D.D. 2000. Beyond Physicalism. Philadelphia/Amsterdam: John Benjamins.
  • Hutto, D.D. 1999. The Presence of Mind. Philadelphia/Amsterdam: John Benjamins.
  • Hutto, D.D. and Abrahamson, D. 2022. Embodied, Enactive Education: Conservative versus Radical Approaches. In Movement Matters: How Embodied Cognition Informs Teaching and Learning. Macrine and Fugate (eds). Cambridge, MA: MIT Press.
  • Hutto, D., Gallagher, S., Ilundáin-Agurruza, J., and Hipólito, I. 2020. Culture in Mind – An Enactivist Account: Not Cognitive Penetration but Cultural Permeation. In Kirmayer, L. J., Kitayama, S., Worthman, C.M., Lemelson, R., and Cummings, C.A. (eds.), Culture, Mind, and Brain: Emerging Concepts, Models, Applications. New York, NY: Cambridge University Press. pp. 163–187.
  • Hutto, D.D. and Jurgens, A. 2019. Exploring Enactive Empathy: Actively Responding to and Understanding Others. In Matravers, D. and Waldow, A. (eds). Philosophical Perspectives on Empathy: Theoretical Approaches and Emerging Challenges. London: Routledge. pp. 111-128.
  • Hutto, D.D. and Kirchhoff, M. 2015. Looking Beyond the Brain: Social Neuroscience meets Narrative Practice. Cognitive Systems Research, 34, 5-17.
  • Hutto, D.D., Kirchhoff, M.D., and Abrahamson, D. 2015. The Enactive Roots of STEM: Rethinking Educational Design in Mathematics. Educational Psychology Review, 27(3), 371-389.
  • Hutto, D.D., Kirchhoff, M. and Myin, E. 2014. Extensive Enactivism: Why Keep it All In? Frontiers in Human Neuroscience. doi: 10.3389/fnhum.2014.00706.
  • Hutto, D.D. and Myin, E. 2021. Re-affirming Experience, Presence, and the World: Setting the RECord Straight in Reply to Noë. Phenomenology and the Cognitive Sciences. 20(5), 971-989.
  • Hutto, D.D. and Myin, E. 2018a. Much Ado about Nothing? Why Going Non-semantic is Not Merely Semantics. Philosophical Explorations. 21(2). 187–203.
  • Hutto, D.D. and Myin, E. 2018b. Going Radical. In The Oxford Handbook of 4E Cognition. Newen, A. Gallagher, S. and de Bruin, L. (eds). Oxford: Oxford University Press. pp. 95-116.
  • Hutto, D.D. and Myin, E. 2017. Evolving Enactivism: Basic Minds meet Content. Cambridge, MA: The MIT Press.
  • Hutto, D.D. and Myin, E. 2013. Radicalizing Enactivism: Basic Minds without Content. Cambridge, MA: The MIT Press.
  • Hutto, D. D., Myin, E., Peeters, A, and Zahnoun, F. 2019. The Cognitive Basis of Computation: Putting Computation In its Place. In Colombo, M. and Sprevak, M. (eds). The Routledge Handbook of The Computational Mind. London: Routledge. 272-282.
  • Hutto, D.D. and Peeters, A. 2018. The Roots of Remembering. Extensively Enactive RECollection. In New Directions in the Philosophy of Memory. Michaelian, K. Debus, D. Perrin, D. (eds). London: Routledge. pp. 97-118.
  • Hutto, D.D. and Robertson, I. 2020. Clarifying the Character of Habits: Understanding What and How They Explain. In Habits: Pragmatist Approaches from Cognitive Science, Neuroscience, and Social Theory. Caruana, F. and Testa, I. (eds). Cambridge: Cambridge University Press. pp. 204-222.
  • Hutto, D.D., Robertson, I and Kirchhoff, M. 2018. A New, Better BET: Rescuing and Revising Basic Emotion Theory. Frontiers in Psychology, 9, 1217.
  • Hutto, D.D., Röhricht, F., Geuter, U., and Gallagher, S. 2014. Embodied Cognition and Body Psychotherapy: The Construction of New Therapeutic Environments. Sensoria: A Journal of Mind, Brain and Culture. 10(1).
  • Hutto, D.D. and Sánchez-García, R. 2015. Choking RECtified: Enactive Expertise Beyond Dreyfus. Phenomenology and the Cognitive Sciences. 14:2. 309-331.
  • Hutto, D.D., and Satne, G. 2018a. Naturalism in the Goldilocks Zone: Wittgenstein’s Delicate Balancing Act. In Raleigh, T. and Cahill, K. (eds). Wittgenstein and Naturalism. London: Routledge. 56-76.
  • Hutto, D.D. and Satne, G. 2018b. Wittgenstein’s Inspiring View of Nature: On Connecting Philosophy and Science Aright. Philosophical Investigations. 41:2. 141-160.
  • Hutto, D.D., and Satne, G. 2017a. Continuity Scepticism in Doubt: A Radically Enactive Take. In Embodiment, Enaction, and Culture. Durt, C, Fuchs, T and Tewes, C (eds). Cambridge, MA. MIT Press. 107-126.
  • Hutto, D.D. and Satne, G. 2017b. Davidson Demystified: Radical Interpretation meets Radical Enactivism. Argumenta. 3:1. 127-144.
  • Hutto. D.D. and Satne, G. 2015. The Natural Origins of Content. Philosophia. 43. 521–536.
  • Ihde, D., and Malafouris, L. 2019. Homo Faber Revisited: Postphenomenology and Material Engagement Theory. Philosophy and Technology, 32(2), 195–214.
  • Janz, B. 2022. African Philosophy and Enactivist Cognition: The Space of Thought. Imprint Bloomsbury Academic.
  • Jonas, H. 1966. The Phenomenon of Life. Evanston: Northwestern University Press.
  • Juarrero, A. 1999. Dynamics in Action: Intentional Behavior as a Complex System. Cambridge: The MIT Press.
  • Jurgens, A., Chown, N, Stenning, A. and Bertilsdotter-Rosqvist, H. 2020. Neurodiversity in a Neurotypical World: An Enactive Framework for Investigating Autism and Social Institutions. In Rosqvist, H., Chown, N., and Stenning, A. (eds). Neurodiversity Studies: A New Critical Paradigm, 73-88.
  • Jurgens, A. 2021. Re-conceptualizing the Role of Stimuli: An Enactive, Ecological Explanation of Spontaneous-Response Tasks. Phenomenology and the Cognitive Sciences, 20 (5), 915-934.
  • Kabat-Zinn, J. 2016. Foreword to the Revised Edition. In Varela, F. J., Thompson, E., and Rosch, E. The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
  • Kee, H. 2018. Phenomenology and Naturalism in Autopoietic and Radical Enactivism: Exploring Sense-Making and Continuity from the Top Down. Synthese. pp. 2323–2343.
  • Kirchhoff, M. 2018a. Autopoiesis, Free Energy, and the Life–Mind Continuity Thesis, Synthese, 195 (6), 2519-2540.
  • Kirchhoff, M. 2018b. The Body in Action: Predictive Processing and the Embodiment Thesis. The Oxford Handbook of 4E Cognition. Oxford: Oxford University Press. pp. 243-260.
  • Kirchhoff, M. 2015. Species of Realization and the Free Energy Principle. Australasian Journal of Philosophy. 93 (4), 706-723.
  • Kirchhoff, M. and Froese, T. 2017. Where There is Life, There is Mind: In Support of a Strong Life-Mind Continuity Thesis. Entropy, 19 (4), 169.
  • Kirchhoff, M. and Hutto, D.D. 2016. Never Mind the Gap: Neurophenomenology, Radical Enactivism and the Hard Problem of Consciousness. Constructivist Foundations. 11 (2): 302–30.
  • Kirchhoff, M. and Meyer, R. 2019. Breaking Explanatory Boundaries: Flexible Borders and Plastic Minds. Phenomenology and the Cognitive Sciences, 18 (1), 185-204.
  • Kirchhoff, M., Parr, T., Palacios, E., Friston, K., and Kiverstein, J. 2018. The Markov Blankets of Life: Autonomy, Active Inference and the Free Energy Principle. Journal of The Royal Society Interface, 15 (138), 20170792.
  • Kirchhoff, M. D., and Robertson, I. 2018. Enactivism and Predictive Processing: A Non‐Representational View. Philosophical Explorations, 21(2), 264–281.
  • Kiverstein, J. D., and Rietveld, E. 2018. Reconceiving Representation‐Hungry Cognition: An Ecological‐Enactive Proposal. Adaptive Behavior, 26(4), 147–163.
  • Kiverstein, J. and Rietveld, E. 2015. The Primacy of Skilled Intentionality: On Hutto and Satne’s The Natural Origins of Content. Philosophia, 43 (3). 701–721.
  • Lai, K.L. 2022. Models of Knowledge in the Zhuangzi: Knowing with Chisels and Sticks. In Lai, K.L. (ed). Knowers and Knowledge in East-West Philosophy: Epistemology Extended. Basingstoke: Palgrave Macmillan 319-344.
  • Laland, K. N., Matthews, B., and Feldman, M.W. 2016. An Introduction to Niche Construction Theory. Evolutionary Ecology, 30(2), 191–202.
  • Laland, K., Uller, T., Feldman, M., Sterelny, K., Müller, G., Moczek, A., Jablonka, E., and Odling‐Smee, J. 2014. Does Evolutionary Theory Need a Rethink? Yes, Urgently. Nature, 514, 161–164.
  • Langland-Hassan, P. 2021. Why Pretense Poses a Problem for 4E Cognition (and How to Move Forward). Phenomenology and the Cognitive Sciences. 21. 1003–1021.
  • Langland-Hassan, P. 2022. Secret Charades: Reply to Hutto. Phenomenology and the Cognitive Sciences. 21. 1183–1187.
  • Lee, J. 2019. Structural Representation and the Two Problems of Content. Mind and Language. 34: 5. 606-626.
  • Legg, C. 2021. Discursive Habits: A Representationalist Re-Reading of Teleosemiotics. Synthese, 199(5), 14751-14768.
  • Lewontin, R. 2000. The Triple Helix: Gene, Organism, and Environment. Cambridge, MA: Harvard University Press.
  • Lewontin, R. and Levins, R. 1997. Organism and Environment. Capitalism Nature Socialism. 8: 2. 95-98.
  • Littlefair, S. 2020. Why Evan Thompson isn’t a Buddhist. Lion’s Roar: Buddhist Wisdom for Our Time. https://www.lionsroar.com/evan-thompson-not-buddhist/
  • Loughlin, V. 2014. Radical Enactivism, Wittgenstein and the Cognitive Gap. Adaptive Behavior. 22 (5): 350-359.
  • Loughlin, V. 2021a. 4E Cognitive Science and Wittgenstein. Basingstoke: Palgrave Macmillan.
  • Loughlin, V. 2021b. Why Enactivists Should Care about Wittgenstein. Philosophia 49(11–12).
  • Loughlin, V. 2021c. Wittgenstein’s Challenge to Enactivism. Synthese, 198 (Suppl 1), 391–404.
  • Maiese, M. 2022a. Autonomy, Enactivism, and Mental Disorder: A Philosophical Account. London: Routledge.
  • Maiese, M. 2022b. White Supremacy as an Affective Milieu. Topoi, 41 (5): 905-915.
  • Maiese, M. 2022c. Mindshaping, Enactivism, and Ideological Oppression. Topoi, 41 (2): 341-354.
  • Maiese, M. 2022d. Neoliberalism and Mental Health Education. Journal of Philosophy of Education. 56 (1): 67-77.
  • Malafouris, L. 2013. How Things Shape the Mind: A Theory of Material Engagement. Cambridge, MA: MIT Press.
  • Mann, S. and Pain, R. 2022. Teleosemantics and the Hard Problem of Content, Philosophical Psychology, 35:1, 22-46.
  • Maturana, H. R., and Varela, F. J. 1980. Autopoiesis and Cognition: The Realization of the Living. Boston: D. Reidel.
  • Maturana, H., and Varela, F. 1987. The Tree of Knowledge: The Biological Roots of Human Understanding. New Science Library/Shambhala Publications.
  • Maturana, H., and Mpodozis, J. 2000. The Origin of Species by Means of Natural Drift. Revista Chilena De Historia Natural, 73(2), 261–310.
  • McGann, M. 2022. Connecting with the Subject of our Science: Course-of-Experience Research Supports Valid Theory Building in Cognitive Science. Adaptive Behavior. doi:10.1177/10597123221094360.
  • McGann, M. 2021. Enactive and Ecological Dynamics Approaches: Complementarity and Differences for Interventions in Physical Education Lessons. Physical Education and Sport Pedagogy, 27(3): 1-14.
  • McGann, M. 2007. Enactive Theorists Do it on Purpose: Toward An Enactive Account of Goals and Goal-directedness. Phenomenology and the Cognitive Sciences, 6, 463–483.
  • McGann, M, De Jaegher, H. and Di Paolo, E.A. 2013. Enaction and Psychology. Review of General Psychology, 17 (2), 203-209.
  • McGann, M. Di Paolo, E.A., Heras-Escribano, M. and Chemero, A. 2020. Enaction and Ecological Psychology: Convergences and Complementarities. Frontiers in Psychology. 11:1982.
  • McKinney, J. 2020. Ecological-Enactivism Through the Lens of Japanese Philosophy. Frontiers Psychology. 11.
  • Medina, J. 2013. An Enactivist Approach to the Imagination: Embodied Enactments and ‘Fictional Emotions’. American Philosophical Quarterly 50.3: 317–335.
  • Merleau-Ponty, M. 1963. The Structure of Behavior. Pittsburgh: Duquesne University Press .
  • Meyer, R. 2020a. The Nonmechanistic Option: Defending Dynamical Explanations. The British Journal for the Philosophy of Science. 71 (3):959-985
  • Meyer, R. 2020b. Dynamical Causes. Biology and Philosophy, 35 (5), 1-21.
  • Meyer, R and Brancazio, N. 2022. Putting Down the Revolt: Enactivism as a Philosophy of Nature.
  • Michaelian, K. and Sant’Anna, A. 2021. Memory without Content? Radical Enactivism and (Post)causal theories of Memory. Synthese, 198 (Suppl 1), 307–335.
  • Miłkowski, M. 2015. The Hard Problem of Content: Solved (Long Ago). Studies in Logic, 41(1): 73-88.
  • Miyahara, K and Segundo-Ortin, M. 2022. Situated Self-Awareness in Expert Performance: A Situated Normativity account of Riken no Ken, Synthese. 200, 192. https://doi.org/10.1007/s11229-022-03688-w
  • Moyal-Sharrock, 2016. The Animal in Epistemology: Wittgenstein’s Enactivist Solution to the Problem of Regress. International Journal for the Study of Skepticism. 6 (2-3): 97-119
  • Moyal-Sharrock, 2021a. Certainty In Action: Wittgenstein on Language, Mind and Epistemology. London: Bloomsbury.
  • Moyal-Sharrock, D. 2021b. From Deed to Word: Gapless and Kink-free Enactivism. Synthese, 198 (Suppl 1), 405–425.
  • Murphy, M. 2019. Enacting Lecoq: Movement in Theatre, Cognition, and Life. Basingstoke: Palgrave Macmillan.
  • Myin. E. 2020. On the Importance of Correctly Locating Content: Why and How REC Can Afford Affordance Perception. Synthese, 198 (Suppl 1):25-39.
  • Myin, E., and O’Regan, K. J. 2002. Perceptual Consciousness, Access to Modality, and Skill Theories: A Way to Naturalize Phenomenology? Journal of Consciousness Studies, 9, 27–46
  • Myin, E and Van den Herik, J.C. 2020. A Twofold Tale of One Mind: Revisiting REC’s Multi-Storey Story. Synthese,198 (12): 12175-12193.
  • Netland, T. 2022. The lived, living, and behavioral sense of perception. Phenomenology and the Cognitive Sciences, https://doi.org/10.1007/s11097-022-09858-y
  • Noë, A. 2021. The Enactive Approach: A Briefer Statement, with Some Remarks on ‘Radical Enactivism’. Phenomenology and the Cognitive Sciences, 20, 957–970
  • Noë, A. 2015. Strange Tools: Art and Human Nature. New York: Hill and Wang
  • Noë, A. 2012. Varieties of Presence. Cambridge, MA: Harvard University Press.
  • Noë, A. 2009. Out of Our Heads: Why You Are not Your Brain and Other Lessons from the Biology of Consciousness. New York: Hill and Wang.
  • Noë, A. 2004. Action in Perception. Cambridge, MA: MIT Press.
  • Øberg, G. K., Normann, B. and S Gallagher. 2015. Embodied-Enactive Clinical Reasoning in Physical Therapy. Physiotherapy Theory and Practice, 31 (4), 244-252.
  • O’Regan, J. K. 2011. Why Red Doesn’t Sound Like a Bell: Understanding the Feel of Consciousness. Oxford: Oxford University Press.
  • O’Regan, J. K., and Noë, A. 2001. A Sensorimotor Account of Vision and Visual Consciousness. Behavioral and Brain Sciences, 24, 883–917.
  • O’Regan, J. K., Myin, E., and Noë, A. 2005. Skill, Corporality and Alerting Capacity in an Account of Sensory Consciousness. Progress in Brain Research, 150, 55–68.
  • Paolucci, C. 2021. Cognitive Semiotics. Integrating Signs, Minds, Meaning and Cognition, Cham Switzerland: Springer.
  • Paolucci, C. 2020. A Radical Enactivist Approach to Social Cognition. In: Pennisi, A., Falzone, A. (eds). The Extended Theory of Cognitive Creativity. Perspectives in Pragmatics, Philosophy and Psychology. Cham: Springer.
  • Piccinini, G. 2022. Situated Neural Representations: Solving the Problems of Content. Frontiers in Neurorobotics. 14 April 2022, Volume 16.
  • Piccinini, G. 2020. Neurocognitive Mechanisms: Explaining Biological Cognition. New York: Oxford University Press.
  • Piccinini, G. 2015. Physical Computation: A Mechanistic Account. New York: Oxford University Press.
  • Piccinini. G. 2008. Computation without Representation. Philosophical Studies. 137 (2), 205-241.
  • Raleigh, T. 2018. Tolerant Enactivist Cognitive Science, Philosophical Explorations, 21:2, 226-244.
  • Ramírez-Vizcaya, S., and Froese, T. 2019. The Enactive Approach to Habits: New Concepts for the Cognitive Science of Bad Habits and Addiction. Frontiers in psychology, 10, 301.
  • Ramstead, MJD, Kirchhoff, M, and Friston, K. 2020a. A Tale of Two Densities: Active Inference is Enactive Inference. Adaptive Behavior. 28(4):225-239.
  • Ramstead, MJD, Friston, K, Hipolito, I. 2020b. Is the Free-Energy Principle a Formal Theory of Semantics? From Variational Density Dynamics to Neural and Phenotypic Representations. Entropy, 22(8), 889.
  • Reid, D. 2014. The Coherence of Enactivism and Mathematics Education Research: A Case Study. AVANT. 5:2. 137-172.
  • Rietveld, E., Denys, D. and Van Westen, M. 2018. Ecological-Enactive Cognition as Engaging with a Field of Relevant Affordances: The Skilled Intentionality Framework (SIF). In A. Newen, L. L. de Bruin and S. Gallagher (Eds.). Oxford Handbook of 4E Cognition. Oxford: Oxford University Press, 41-70.
  • Rietveld, E. and Kiverstein, J. 2014. A Rich Landscape of Affordances. Ecological Psychology, 26(4), 325-352.
  • Robertson, I. and Hutto, D. D. 2023. Against Intellectualism about Skill. Synthese201(4), 143.
  • Robertson, I. and Kirchhoff, M. D. (2019). Anticipatory Action: Active Inference in Embodied Cognitive Activity. Journal of Consciousness Studies, 27(3-4), 38-68.
  • Roelofs, L. 2018. Why Imagining Requires Content: A Reply to A Reply to an Objection to Radical Enactive Cognition. Thought: A Journal of Philosophy. 7 (4):246-254.
  • Rolla, G. 2021. Reconceiving Rationality: Situating Rationality into Radically Enactive Cognition. Synthese, 198(Suppl 1), pp. 571–590.
  • Rolla, G. 2018. Radical Enactivism and Self-Knowledge. Kriterion,141, pp. 732-743
  • Rolla, G. and Figueiredo, N. 2021. Bringing Forth a World, Literally. Phenomenology and the Cognitive Sciences.
  • Rolla, G. and Huffermann, J. 2021. Converging Enactivisms: Radical Enactivism meets Linguistic Bodies. Adaptive Behavior. 30(4). 345-359.
  • Rolla, G., and Novaes, F. 2022. Ecological-Enactive Scientific Cognition: Modeling and Material Engagement. Phenomenology and the Cognitive Sciences, 21, pp. 625–643.
  • Rolla, G., Vasconcelos, G., and Figueiredo, N. 2022. Virtual Reality, Embodiment, and Allusion: An Ecological-Enactive Approach. Philosophy and Technology. 35: 95.
  • Rosch, E. 2016. Introduction to the Revised Edition. In The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
  • Rucińska, Z. 2019. Social and Enactive Perspectives on Pretending. Avant. 10:3. 1-27.
  • Rucińska, Z. 2016. What Guides Pretence? Towards the Interactive and the Narrative Approaches. Phenomenology and the Cognitive Sciences. 15: 117–133.
  • Ryan Jr, K.J. and S. Gallagher. 2020. Between Ecological Psychology and Enactivism: Is There Resonance? Frontiers in Psychology, 11, 1147.
  • Salis, P. 2022. The Given and the Hard problem of Content. Phenomenology and the Cognitive Sciences.
  • Sato, M. and McKinney, J. 2022. The Enactive and Interactive Dimensions of AI: Ingenuity and Imagination Through the Lens of Art and Music. Artificial Life. 28 (3): 310–321.
  • Schiavio, A. and De Jaegher, H. 2017. Participatory Sense-Making in Joint Musical Practice. Lesaffre, M. Maes, P-J, Marc Leman, M. (eds). The Routledge Companion to Embodied Music Interaction. London: Routledge, 31-39.
  • Schlicht, T. and Starzak, T. 2019. Prospects of Enactivist Approaches to Intentionality and Cognition. Synthese, 198 (Suppl 1): 89-113.
  • Searle, J. 1992. The Rediscovery of the Mind. Cambridge: The MIT Press.
  • Segundo-Ortin, M. 2020. Agency From a Radical Embodied Standpoint: An Ecological-Enactive Proposal. Frontiers in Psychology.
  • Segundo-Ortin, M, Heras-Escribano, M, and Raja, V. 2019. Ecological Psychology is Radical Enough. A Reply to Radical Enactivists. Philosophical Psychology, 32 (7), 1001-102340.
  • Segundo-Ortin, M and Hutto, D.D. 2021. Similarity-based Cognition: Radical Enactivism meets Cognitive Neuroscience. Synthese, 198 (1), 198, 5–23.
  • Seifert, L, Davids, K, Hauw, D and McGann, M. 2020. Editorial: Radical Embodied Cognitive Science of Human Behavior: Skill Acquisition, Expertise and Talent Development. Frontiers in Psychology 11.
  • Sharma, G. and Curtis, P.D. 2022. The Impacts of Microgravity on Bacterial Metabolism. Life (Basel). 12(6): 774.
  • Smith, L.B., and, E.Thelen. 1994. A Dynamic Systems Approach to the Development of Cognition and Action. Cambridge, MA: MIT Press.
  • Stapleton, M. 2022. Enacting Environments: From Umwelts to Institutions. In Lai, K.L. (ed). Knowers and Knowledge in East-West Philosophy: Epistemology Extended. Basingstoke: Palgrave Macmillan. 159-190.
  • Stapleton, M. and Froese, T. 2016. The Enactive Philosophy of Embodiment: From biological Foundations of Agency to the Phenomenology of Subjectivity. In M. García-Valdecasas, J. I. Murillo, and N. F. Barrett (Eds.), Biology and Subjectivity: Philosophical Contributions to non-Reductive Neuroscience. (pp. 113–129). Cham: Springer.
  • Stewart, O. Gapenne, and E. A. Di Paolo (Eds.). 2010. Enaction: Toward a New Paradigm for Cognitive Science (pp. 183–218). Cambridge: The MIT Press.
  • Thompson, E. 2021. Buddhist Philosophy and Scientific Naturalism. Sophia, 1-16.
  • Thompson, E. 2018. Review: Evolving Enactivism: Basic Minds Meet Content. Notre Dame Philosophical Reviews. http://ndpr.nd.edu/reviews/ evolving-enactivism-basic-minds-meet-content/
  • Thompson, E. 2017. Enaction without Hagiography. Constructivist Foundations, 13(1), 41–44.
  • Thompson, E. 2016. Introduction to the Revised Edition. In Varela, F. J., Thompson, E., and Rosch, E. The Embodied Mind: Cognitive Science and Human Experience. Revised Edition (6th ed.). Cambridge, MA: MIT Press.
  • Thompson, E. 2011a. Living Ways of Sense-Making, Philosophy Today: SPEP Supplement. 114-123.
  • Thompson, E. 2011b. Précis of Mind in Life. Journal of Consciousness Studies.18. 10-22.
  • Thompson, E. 2011c. Reply to Commentaries. Journal of Consciousness Studies.18. 176-223.
  • Thompson, E. 2007. Mind in Life: Biology, Phenomenology, and the Sciences of Mind. Cambridge, MA: Harvard University Press.
  • Thompson, E. 2005. Sensorimotor Subjectivity and the Enactive Approach to Experience. Phenomenology and the Cognitive Sciences, 4: 407-427.
  • Thompson, E. and Stapleton, M. 2009. Making Sense of Sense-Making: Reflections on Enactive and Extended Mind Theories. Topoi, 28: 23-30.
  • Turner, J. S. 2000. The Extended Organism: The Physiology of Animal-Built Structures. Cambridge, MA: Harvard University Press.
  • Van Dijk, L, Withagen, R.G. Bongers, R.M. 2015. Information without Content: A Gibsonian Reply to Enactivists’ Worries. Cognition, 134. pp. 210-214.
  • Varela, F. J. 1999a. Ethical Know-How: Action, Wisdom, and Cognition. Stanford, CA: Stanford University Press.
  • Varela, F.J. 1999b. The Specious Present: A Neurophenomenology of Time Consciousness. In J. Petitot, F. J. Varela, and B. R. M. Pachoud (Eds.), Naturalizing Phenomenology (pp. 266–314). Stanford: Stanford University Press
  • Varela, F. J. 1996. Neurophenomenology: A methodological remedy for the hard problem. Journal of Consciousness Studies, 4, 330–349
  • Varela F. J. 1991. Organism: A Meshwork of Selfless Selves. In: Tauber A. I. (ed.), Organism and the Origins of Self. Dordrecht: Kluwer, 79–107.
  • Varela, F.J. 1984. Living Ways of Sense-Making: A Middle Path for Neuroscience. In: P. Livingstone (Ed.), Order and Disorder: Proceedings of the Stanford International Symposium, Anma Libri, Stanford, pp.208-224.
  • Varela, F. J. 1979. Principles of Biological Autonomy. New York: Elsevier.
  • Varela, F. J., Thompson, E., and Rosch, E. 1991. The Embodied Mind: Cognitive Science and Human Experience. Cambridge: MIT Press.
  • Venter, E. 2021. Toward an Embodied, Embedded Predictive Processing Account. Frontiers in Psychology. doi: 10.3389/fpsyg.2021.543076
  • Venturinha, N. 2016. Moral Epistemology, Interpersonal Indeterminacy and Enactivism. In Gálvez, J.P. (ed). Action, Decision-Making and Forms of Life. Berlin, Boston: De Gruyter, pp. 109-120.
  • Villalobos, M. 2020. Living Beings as Autopoietic Bodies. Adaptive Behavior, 28 (1), 51-58.
  • Villalobos, M. 2013. Enactive Cognitive Science: Revisionism or Revolution? Adaptive Behavior, 21 (3), 159-167.
  • Villalobos, M., and Dewhurst, J. 2017. Why Post‐cognitivism Does Not (Necessarily) Entail Anti‐Computationalism. Adaptive Behavior, 25(3), 117–128.
  • Villalobos, M. and Palacios, S. 2021. Autopoietic Theory, Enactivism, and Their Incommensurable Marks of the Cognitive. Synthese, 198 (Suppl 1), 71–87.
  • Villalobos, M. and Razeto-Barry, P. 2020. Are Living Beings Extended Autopoietic Systems? An Embodied Reply. Adaptive Behavior, 28 (1), 3-13.
  • Villalobos, M. and Silverman, D. 2018. Extended Functionalism, Radical Enactivism, and the Autopoietic Theory of Cognition: Prospects for a Full Revolution in Cognitive Science. Phenomenology and the Cognitive Sciences, 17 (4), 719-739.
  • Villalobos, M., and Ward, D. 2016. Lived Experience and Cognitive Science: Reappraising Enactivism’s Jonasian Turn. Constructivist Foundations, 11, 204–233
  • Villalobos, M. and Ward, D. 2015. Living Systems: Autopoiesis, Autonomy and Enaction. Philosophy and Technology, 28 (2), 225-239.
  • Vörös, S., Froese, T., and Riegler, A. 2016. Epistemological Odyssey: Introduction to Special Issue on the Diversity of Enactivism and Neurophenomenology. Constructivist Foundations, 11(2), 189–203.
  • Ward, D., Silverman, D., and Villalobos, M. 2017. Introduction: The Varieties of Enactivism. Topoi, 36(3), 365–375.
  • Weichold, M. and Rucińska, Z. 2021. Pretense as Alternative Sense-making: A Praxeological Enactivist Account. Phenomenology and the Cognitive Sciences. 21. 1131–1156.
  • Weichold, M. and Rucińska, Z. 2022 Praxeological Enactivism vs. Radical Enactivism: Reply to Hutto. Phenomenology and the Cognitive Sciences. 21. 1177-1182.
  • Werner, K. 2020. Enactment and Construction of the Cognitive Niche: Toward an Ontology of the Mind‐World Connection. Synthese, 197(3), 1313–1341.
  • Zahidi, K. 2021. Radicalizing Numerical Cognition. Synthese, 198 (Suppl 1): S529–S545.
  • Zahidi, K., and Myin, E. 2016. Radically Enactive Numerical Cognition. In G. Etzelmüller and C. Tewes (Eds.), Embodiment in Evolution and Culture (pp. 57–72). Tübingen, Germany: Mohr Siebeck.
  • Zahnoun F. 2021a. The Socio-Normative Nature of Representation. Adaptive Behavior. 29(4): 417-429.
  • Zahnoun, F. 2021b. Some Inaccuracies about Accuracy Conditions. Phenomenology and the Cognitive Sciences.
  • Zahnoun, F. 2021c. On Representation Hungry Cognition (and Why We Should Stop Feeding It). Synthese, 198 (Suppl 1), 267–284.
  • Zahnoun, F. 2020. Truth or Accuracy? Theoria 86 (5):643-650.
  • Zarco M., Egbert M. D. 2019. Different Forms of Random Motor Activity Scaffold the Formation of Different Habits in a Simulated Robot. In Fellermann H., Bacardit J., Goni-Moreno A., Fuchslin M. (Eds.) The 2019 Conference on Artificial Life. No. 31, 582-589
  • Zipoli Caiani, S. 2022. Intelligence Involves Intensionality: An Explanatory Issue for Radical Enactivism (Again). Synthese. 200: 132.

 

Author Information

Daniel D. Hutto
Email: ddhutto@uow.edu.au
University of Wollongong
Australia

Hunhu/Ubuntu in the Traditional Thought of Southern Africa

The term Ubuntu/Botho/Hunhu is a Zulu/Xhosa/Ndebele/Sesotho/Shona word referring to the moral attribute of a person, who is known in the Bantu languages as Munhu (among the Shona of Zimbabwe), Umuntu (among the Ndebele of Zimbabwe and the Zulu/Xhosa of South Africa), Muthu (among the Tswana of Botswana) and Omundu (among the Herero of Namibia), to name just a few of the Bantu tribal groupings. Though the term has a wider linguistic rendering in almost all the Bantu languages of Southern Africa, it has attracted particular philosophical attention in Zimbabwe and South Africa, especially in the early twenty-first century, for the simple reason that both countries needed home-grown philosophies to move forward following the political disturbances caused by the liberation war and apartheid respectively. Philosophically, the term Ubuntu emphasises the importance of the group or community, and it finds its clearest expression in the Nguni/Ndebele phrase umuntu ngumuntu ngabantu, whose Shona equivalent is munhu munhu muvanhu (a person is a person through other persons). This article critically reflects on hunhu/ubuntu as a traditional and/or indigenous philosophy by focussing particularly on its distinctive features, its components and how it is deployed in the public sphere.

Table of Contents

  1. Introduction
  2. About the Sources
  3. Hunhu/Ubuntu and Ethno-Philosophy
  4. The Deployment of Hunhu/Ubuntu in the Public Sphere
  5. The Distinctive Qualities/Features of Hunhu/Ubuntu
  6. The Components of Hunhu/Ubuntu
    1. Hunhu/Ubuntu Metaphysics
    2. Hunhu/Ubuntu Ethics
    3. Hunhu/Ubuntu Epistemology
  7. Conclusion
  8. References and Further Reading

1. Introduction

The subject of Hunhu/Ubuntu has generated considerable debate in public and private intellectual discussions, especially in South Africa and Zimbabwe, where the major focus has been on whether Hunhu/Ubuntu can compete with other philosophical worldviews and whether it can solve the socio-political challenges facing the two countries. Hunhu/Ubuntu is also a key theme in African philosophy, as it places an imperative on the importance of group or communal existence as opposed to the West’s emphasis on individualism and individual human rights. Thus, Hunhu/Ubuntu, as an aspect of African philosophy, prides itself on the idea that the benefits and burdens of the community must be shared in such a way that no one is prejudiced and that everything is done to put the interests of the community ahead of the interests of the individual. To this end, the traditional philosophical meaning of the term Ubuntu/Botho/Hunhu is sought, and its importance in the academy is highlighted and explained. The article also looks at how the concept is deployed in the public sphere. It provides an elaborate analysis of the qualities/features of Hunhu/Ubuntu as exemplified by John S. Pobee’s expression Cognatus ergo sum (I am related by blood, therefore I exist). Finally, the article outlines and explains the components cognate to Hunhu/Ubuntu as an aspect of ethno-philosophy, namely Hunhu/Ubuntu Metaphysics, Hunhu/Ubuntu Ethics and Hunhu/Ubuntu Epistemology.

2. About the Sources

Many scholars have written about Ubuntu, and it is only fair to limit our discussion to those who have taken an interest in the philosophical meaning of the term in Southern African thought. In this category we have first-generation scholars of Ubuntu such as Mogobe Bernard Ramose (1999; 2014), who is credited with defining Ubuntu as humaneness; Stanlake Samkange and Tommie Marie Samkange (1980), who link Hunhu/Ubuntu with the idea of humanism; and Desmond Tutu (1999), who sees Ubuntu as a conflict-resolution philosophy. These three are regarded as first-generation scholars of Ubuntu because, historically, they are among the first black philosophers hailing from Africa to write about Hunhu/Ubuntu as a philosophy. They began writing as early as the 1980s and early 1990s, and they regarded Ubuntu, as inspired by traditional southern African thought, as a human quality or an attribute of the soul.

We also have second-generation scholars of Ubuntu such as Michael Onyebuchi Eze (2010), who is credited with a critical historicisation of the term Ubuntu; Michael Battle (2009), who is credited with deep insights into the linguistic meaning of the term as well as his famous claim that Ubuntu is a gift to the Western world; Fainos Mangena (2012a; 2012b), who is credited with defining Hunhu/Ubuntu and extracting from it the idea of the Common Moral Position (CMP); Thaddeus Metz (2007), whose search for a basic principle that would define African ethics has attracted a great deal of academic attention; and Christian B. N. Gade (2011; 2012; 2013), who has taken the discourse of Hunhu/Ubuntu to another level by examining the historical development of discourses on Ubuntu as well as the meaning of Ubuntu among South Africans of African Descent (SAADs). Finally, we have Martin H. Prozesky, who has outlined some of the distinctive qualities/features of Hunhu/Ubuntu philosophy that are important for this article.

3. Hunhu/Ubuntu and Ethno-Philosophy

In order to define Ubuntu and show its nexus with ethno-philosophy, it is important that we first define ethno-philosophy. To this end, Zeverin Emagalit defines ethno-philosophy as a system of thought that deals with the collective worldviews of diverse African people as a unified form of knowledge based on myths, folk wisdom and the proverbs of the people. From this definition we can pick out two important points: the first is that ethno-philosophy is a “system of thought”; the second is that the collective worldviews of diverse African people constitute “a unified form of knowledge.” This means that the diversity that characterises African people, in terms of geographical location, history and ethnicity, does not take away the fact that Africans have “a unified form of knowledge” that is based on group identity or community. This is what qualifies Ubuntu as an important aspect of ethno-philosophy.

This section defines Ubuntu and traces its historical roots in Southern African cultures. To begin with, the term Ubuntu comes from a group of sub-Saharan languages known as Bantu (Battle, 2009: 2). It is used to describe the quality or essence of being a person amongst many sub-Saharan tribes of the Bantu language family (Onyebuchi Eze, 2008: 107). While Battle does not make reference to the Shona equivalent of Ubuntu and links the words Ubuntu and Bantu through their common root –ntu (human), Ramose uses the Zulu/isiNdebele word Ubuntu concurrently with its Shona equivalent, hunhu, to denote the idea of existence. For Ramose, Hu– is ontological while –nhu is epistemological, and the same holds for Ubu– and –ntu (Ramose 1999: 50). Having lived in Africa and Zimbabwe, Ramose is able to know with some degree of certainty the ontological and epistemological status of the words hunhu and ubuntu; it sometimes takes an insider to discern the meanings of such words correctly.

Hunhu/ubuntu also says something about the character and conduct of a person (Samkange and Samkange 1980: 38). What this means is that hunhu/ubuntu is not only an ontological and epistemological concept; it is also an ethical concept. For Battle, Ubuntu is the interdependence of persons for the exercise, development and fulfilment of their potential to be both individuals and community. Desmond Tutu captures this aptly when he uses the Xhosa proverb ungamntu ngabanye abantu, whose Shona equivalent is munhu unoitwa munhu nevamwe vanhu (a person is made a person by other persons). Generally, this proverb, for Battle, means that each individual’s humanity is ideally expressed in relationship with others. This view was earlier expressed by Onyebuchi Eze (2008: 107), who put it thus:

More critical…is the understanding of a person as located in a community where being a person is to be in a dialogical relationship in this community. A person’s humanity is dependent on the appreciation, preservation and affirmation of other person’s humanity. To be a person is to recognize therefore that my subjectivity is in part constituted by other persons with whom I share the social world.

In regard to the proverbial character of Ubuntu, Ramose also remarks that, “Ubuntu is also consistent with the practices of African peoples as expressed in the proverbs and aphorisms of certain Nguni languages, specifically Zulu and Sotho” (Ramose quoted in van Niekerk 2013).

In his definition of Ubuntu, Metz (2007: 323) follows Tutu and Ramose when he equates Ubuntu with the idea of humanness and with the maxim that a person is a person through other persons. This maxim, for Metz, “has descriptive senses to the effect that one’s identity as a human being causally and even metaphysically depends on a community.” With this submission, Metz agrees with Ramose, Samkange and Samkange, and Gade that Ubuntu is about the group/community more than it is about the self.

It may be important, at this juncture, to briefly consider the historical roots of the term Ubuntu in order to buttress the foregoing. In his attempt to trace the history of the idea of Ubuntu, Michael Onyebuchi Eze (2010: 90) remarks that, when it comes to Ubuntu, “history adopts a new posture…where it is no longer a narrative of the past only but of the moment, the present and the future.” Other than asking a series of questions relating to “history as a narrative of the moment, present and future,” he does not adequately explain why this is so. Instead, he goes on to explain the view of “history as a narrative of the past.” As a narrative of the past, Onyebuchi Eze observes:

Ubuntu is projected to us in a rather hegemonic format; by way of an appeal to a unanimous past through which we may begin to understand the socio-cultural imaginary of the “African” people before the violence of colonialism; an imagination that must be rehabilitated in that percussive sense for its actual appeal for the contemporary African society (2010:93).

Onyebuchi Eze seems to be suggesting that there is too much romanticisation of the past in the conceptualisation and use of the term Ubuntu. He questions the idea of the unanimous character of Ubuntu before “the violence of colonialism,” reducing this idea to a kind of imagination that should have no place in contemporary African society. We are compelled to agree with him to that extent. Thus, unlike many scholars of Ubuntu who have tended to gloss over its limitations, Onyebuchi Eze is no doubt looking at the history of the concept with a critical eye. One of the key arguments he presents that is worthy of attention in this article concerns the status of the individual and that of the community in the definition and conceptualisation of Ubuntu.

While many Ubuntu writers have tended to glorify the community over and above the individual, Onyebuchi Eze (2008: 106) is of the view that “the individual and the community are not radically opposed in the sense of priority but engaged in contemporaneous formation.” Thus, while we agree with Onyebuchi Eze that the individual and the community together define Ubuntu, we maintain that their relationship is not one of equals: the individual is submerged within the community, and the interests and aspirations of the community matter more than those of the individual. This, however, should not be interpreted to mean that the individual plays a merely ancillary role in the definition of Ubuntu. The sections below consider how hunhu/ubuntu is deployed in the public sphere before outlining and explaining its qualities/features as an aspect of ethno-philosophy.

4. The Deployment of Hunhu/Ubuntu in the Public Sphere

Hunhu/Ubuntu has dominated public discourse, especially in Zimbabwe and South Africa, where it has been used to deal with both political and social differences. In Zimbabwe, for instance, hunhu/ubuntu was used to bring together the Zimbabwe African National Union Patriotic Front (ZANU PF) and the Patriotic Front Zimbabwe African People’s Union (PF ZAPU) after political tensions that led to the Midlands and Matabeleland disturbances of the early 1980s, which saw about 20,000 people killed by the North Korea-trained Fifth Brigade. The 1987 Unity Accord was concluded in the spirit of Ubuntu, with people putting aside their political differences to advance the cause of the nation.

The Global Political Agreement of 2008, which led to the formation of the Government of National Unity (GNU), also saw hunhu/ubuntu being deployed to deal with the political differences between ZANU PF and the Movement for Democratic Change (MDC) formations following the violent elections of June 2008. This violence had sown seeds of fear among the general Zimbabwean population, and it took hunhu/ubuntu to remove that fear and demonstrate the spirit of “I am because we are; since we are, therefore I am.” The point is that the two political parties needed each other in the interest of the development of the nation of Zimbabwe.

In South Africa, Desmond Tutu, who chaired the Truth and Reconciliation Commission (TRC) formed to investigate and deal with apartheid atrocities in the 1990s, demonstrated in his final report that it took Ubuntu for people to confess, forgive and forget. In his book No Future without Forgiveness, published in 1999, Tutu writes that “the single main ingredient that made the achievements of the TRC possible was a uniquely African ingredient – Ubuntu.” Tutu maintains that what constrained so many to choose to forgive rather than to demand retribution, to be magnanimous and ready to forgive rather than to wreak revenge, was Ubuntu (Tutu quoted in Richardson, 2008: 67). As Onyebuchi Eze (2011: 12) puts it, “the TRC used Ubuntu as an ideology to achieve political ends.” As an ideology, Ubuntu has been presented as a panacea for the socio-political problems affecting the continent of Africa, especially its southern part. This means that Ubuntu as a traditional thought has not been restricted to the academy but has also found its place in the public sphere, where it has been utilised to solve political conflicts and thereby bring about socio-political harmony. To underscore the importance of Ubuntu not only as an intellectual but also as a public good, Gabriel Setiloane (quoted in Vicencio, 2009: 115) remarks, “Ubuntu is a piece of home grown African wisdom that the world would do well to make its own.” This underlines the southern African roots of Ubuntu as a traditional thought.

5. The Distinctive Qualities/Features of Hunhu/Ubuntu

Martin H. Prozesky (2003: 5-6) has identified ten qualities characteristic of hunhu/ubuntu. Although this article utilises only Prozesky’s ten qualities, it is important to note that the philosophy of hunhu/ubuntu has more than ten qualities or characteristics. Our justification for using Prozesky’s ten is that they aptly capture the essence of Ubuntu as an aspect of ethno-philosophy. This article begins by outlining all ten before explaining four of them, namely humaneness, gentleness, hospitality and generosity. Prozesky’s ten qualities are as follows:

    1. Humaneness
    2. Gentleness
    3. Hospitality
    4. Empathy or taking trouble for others
    5. Deep Kindness
    6. Friendliness
    7. Generosity
    8. Vulnerability
    9. Toughness
    10. Compassion

Hunhu/ubuntu as an important aspect of ethno-philosophy is an embodiment of these qualities. While Ramose uses humaneness to define hunhu/ubuntu, Samkange and Samkange use humanism to define and characterise it. The impression one gets is that the former is similar to the latter, but this is far from the truth. With regard to the dissimilarity between humaneness and humanism, Gade (2011: 308) observes:

I have located three texts from the 1970s in which Ubuntu is identified as ‘African humanism.’ The texts do not explain what African humanism is, so it is possible that their authors understood African humanism as something different from a human quality.

Granted that this may be the case, the question then is: what is the difference between humaneness and humanism, and between African humaneness and African humanism, as aspects of hunhu/ubuntu philosophy? While humaneness may refer to the essence of being human, including the character traits that define it (Dolamo, 2013: 2), humanism is an ideology, an outlook or a thought system in which human interests and needs are given more value than the interests and needs of other beings (cf. Flexner, 1988: 645). Taken together, humaneness and humanism become definitive aspects of hunhu/ubuntu only if the prefix ‘African’ is added to them, yielding African humaneness and African humanism respectively. African humaneness would then entail that the qualities of selflessness and commitment to one’s group or community are more important than the selfish celebration of individual achievements and dispositions.

African humanism, on the other hand, would refer to an ideology, outlook or thought system that values peaceful co-existence and the valorisation of community. In other words, it is a philosophy that sees human needs, interests and dignity as of fundamental importance and concern (Gyekye 1997: 158). Gyekye maintains that African humanism “is quite different from the Western classical notion of humanism which places a premium on acquired individual skills and favours a social and political system that encourages individual freedom and civil rights” (1997: 158).

Thus, among the Shona people of Zimbabwe, the expression munhu munhu muvanhu, which in isiNdebele and Zulu translates to umuntu ngumuntu ngabantu, and which in English means “a person is a person through other persons,” best captures the idea of African humanism (cf. Mangena 2012a; Mangena 2012b; Shutte 2008; Tutu 1999).

Onyebuchi Eze (2011: 12) adds his voice to the definition and characterisation of African humanism when he observes that:

As a public discourse, Ubuntu/botho has gained recognition as a peculiar form of African humanism, encapsulated in the following Bantu aphorisms, like Motho ke motho ka batho babang; Umuntu ngumuntu ngabantu (a person is a person through other people). In other words, a human being achieves humanity through his or her relations with other human beings.

Whether one prefers humaneness or humanism, the bottom line is that the two are definitive aspects of the philosophy of hunhu/ubuntu, which places communal interests ahead of individual interests. Of course, this is a position which Onyebuchi Eze would not accept, given that in his view the community cannot be prioritised over the individual as:

The relation with ‘other’ is one of subjective equality, where the mutual recognition of our different but equal humanity opens the door to unconditional tolerance and a deep appreciation of the ‘other’ as an embedded gift that enriches one’s humanity (2011: 12).

Some believe that what distinguishes an African of black extraction from a Westerner is that the former is a communal being while the latter prides himself or herself on the idea of selfhood or individualism. To these people, the moment we take the individual and the community as subjective equals [as Onyebuchi Eze does], we end up failing to draw the line between what is African and what is fundamentally Western. Having defined humaneness, this article will now define and characterise the quality of gentleness as understood through hunhu/ubuntu. Gentleness encompasses softness of heart and the ability to sacrifice one’s time for others. Thus, being gentle means being tender-hearted and able to spend time attending to other people’s problems. Gentleness is a quality of the traditional thought of hunhu/ubuntu in that it resonates with John S. Mbiti’s dictum: “I am because we are, since we are therefore I am” (1969: 215). The point is that with gentleness, one’s humanity is inseparably bound to that of others. Eric K. Yamamoto (1997: 52) puts it differently in reference to the altruistic character of ubuntu philosophy when he remarks:

Ubuntu is the idea that no one can be healthy when the community is sick. Ubuntu says I am human only because you are human. If I undermine your humanity, I dehumanise myself.

Both the definition of gentleness provided above and Mbiti’s dictum are equivalent to Yamamoto’s understanding in that they emphasise otherness rather than the self. The attribute of hospitality also defines hunhu/ubuntu philosophy. Hospitality generally means taking care of your visitors in such a way that they feel comfortable to have you as their host, without the relationship being commercial. The Western understanding of hospitality, however, is such that the host goes out of his or her way to provide for the needs of his or her guests in return for some payment. This should not be interpreted to mean that Westerners are not hospitable outside of commerce. No doubt, they can also be hospitable, but it is the magnitude of hospitality that differs.

In the case of the Shona/Ndebele communities in Africa, where hospitality is given for free, as when one provides accommodation and food to a stranger at his or her home, the magnitude is high. Coming to the idea of hospitality in Africa, it is important to note that in traditional Shona/Ndebele society, when people had travelled a long journey looking for a relative, they would get tired before reaching their relative’s home, and it was common for them to be accommodated along the way for a day or two before they reached it. During their short stay, they would be provided with food, accommodation and warm clothes (if they happened to travel during the winter season).

Among the Korekore-Nyombwe people of Northern Zimbabwe, strangers would be given water to drink before they asked for directions or for accommodation in transit. The thinking was that the stranger would have travelled a very long distance and was probably tired and thirsty, and so there was a need to give them water to quench their thirst. Besides, water (in Africa) symbolises life and welfare, and so by giving strangers water they were saying that life needed to be sustained and that, as Africans, we are “our brothers’ keepers.” Thus, hunhu/ubuntu hospitality derives its impetus from the understanding that the life and welfare of strangers are as important as our own lives and welfare.

Now, this is different from the idea of home and hospitality in Western cosmopolitan societies, where a home is a place of privacy. Most homes in the West have durawalls or high fences to maximise the privacy of the owner, and so a stranger cannot just walk in and be accommodated. This is quite understandable because in Western societies the individual is conceived of as the centre of human existence, and so there is a need to respect his or her right to privacy. In the West, the idea of a stranger walking into a private space is called trespassing, and one can be prosecuted for this act. And yet in African traditional thought, in general, and in Shona/Ndebele society, in particular, the idea of trespassing does not exist in that way.

In pre-colonial Shona/Ndebele society, by contrast, the community was at the centre of human existence, and that is why pre-colonial Shona/Ndebele people would easily accommodate strangers or visitors without asking many questions. Due to the colonisation of Africa, some Africans have adopted the Western style of individual privacy, but this is contrary to hunhu/ubuntu hospitality, which is still practised in most Shona/Ndebele rural communities today. The point is that philosophies of hospitality, identity and belonging are more clearly played out on the home front than in the public sphere.

The last attribute to be discussed in this section is generosity. Generally, generosity refers to freedom or liberality in giving (Flexner 1988: 550). The attribute of generosity in Southern African thought is best expressed proverbially. In Shona culture, for instance, there are proverbs that explain the generosity of the Shona people or vanhu. Some of these include: Muenzi haapedzi dura (A visitor does not finish food), Chipavhurire uchakodzwa (The one who gives too much will also receive too much), Chawawana idya nehama mutogwa unokangamwa (Share whatever you get with your relatives because strangers are very forgetful) and Ukama igasva hunazadziswa nekudya (Relations cannot be complete without sharing food).

These proverbs not only demonstrate that Bantu people are generous people, but the proverbs also say something about the hunhu/ubuntu strand that runs through the traditional thought of almost all the Bantu cultures of Southern Africa whereby everything is done to promote the interests of the group or community. The proverbs show that the Bantu people are selfless people as summarised by the Nguni proverb which we referred to earlier, which says: Umuntu ngumuntu ngabantu (a person is a person through other persons) or as they put it in Shona: Munhu munhu muvanhu. Without the attribute of generosity, it may be difficult to express one’s selflessness.

6. The Components of Hunhu/Ubuntu

This section outlines the components of hunhu/ubuntu traditional philosophy, showing how these differ from the branches of Western philosophy. These components will be outlined as hunhu/ubuntu metaphysics, hunhu/ubuntu ethics and hunhu/ubuntu epistemology. The objective is to show that while Western philosophy is persona-centric and is summarised by Descartes’ famous phrase Cogito ergo sum, which translates to English as “I think, therefore I am,” hunhu/ubuntu traditional philosophy is communo-centric and is summarised by Pobee’s famous dictum Cognatus ergo sum, which translates to English as “I am related by blood, therefore I exist.” In much simpler terms, while Western philosophy emphasises the self and selfhood through the promotion of individual rights and freedoms, hunhu/ubuntu traditional thought emphasises the importance of the group or community through the promotion of group or communal interests.

a. Hunhu/Ubuntu Metaphysics

Before defining and characterising hunhu/ubuntu metaphysics, it is important to begin by defining the term metaphysics itself. For lack of a better word in African cultures, the article will define metaphysics from the standpoint of Western philosophy. The article will then show that this definition, though it gives us a head-start, can only partially be applied to non-Western cultures. To begin with, in the history of Western philosophy, metaphysics is regarded as the most ancient branch of philosophy, and it was originally called first philosophy (Steward and Blocker 1987: 95). The term Metaphysics is but an accident of history, as it is thought to have resulted from an editor’s mistake as “he was sorting out Aristotle’s works in order to give them titles, several decades after Aristotle had died. It is thought that the editor came across a batch of Aristotle’s writings that followed The Physics and he called them Metaphysics, meaning After Physics” (1987: 96).

Metaphysics, then, is the branch of philosophy that deals with the nature of reality. It asks questions such as: What is reality? Is reality material/physical or an idea? As one tries to answer these questions, a world is opened that enables him or her to identify, name and describe the kinds of beings that exist in the universe. Thus, two words define being, namely ontology and predication. While pre-Socratics such as Thales, Anaximander, Anaximenes, Heraclitus and Parmenides defined being in terms of appearance and reality as well as change and permanence, Classical philosophers such as Socrates/Plato and Aristotle defined being in terms of form and matter.

While form was more real for Socrates/Plato and existed in a different realm from that of matter, Aristotle argued that form and matter together constituted substance, which was reality. Although the idea of being has always been defined in individual terms in the history of Western philosophy, it was given its definitive character by the French philosopher René Descartes, who defined it in terms of what he called Cogito ergo sum, which translates to English as “I think, therefore I am.” Thus, the individual character of Western philosophy was firmly established with the popularisation of Descartes’ Cogito. A question can be asked: Does this understanding of metaphysics also apply to non-Western cultures? The answer is yes and no. Yes, in the sense that in non-Western cultures being is also explained in terms of appearance and reality as well as change and permanence; and no, in the sense that non-Western philosophies, especially the hunhu/ubuntu traditional philosophy of Southern Africa, have a communal character, not an individual character. Having said this, what then is hunhu/ubuntu metaphysics?

Hunhu/ubuntu metaphysics is a component of hunhu/ubuntu traditional philosophy that deals with the nature of being as understood by people from Southern Africa. As we have already intimated, in Southern African traditional thought, being is understood in the communal, physical and spiritual sense. Thus, a human being is always in communion with other human beings as well as with the spiritual world. Sekou Toure (1959) calls this “the communion of persons,” whereby being is a function of the “us” or “we” as opposed to the “I” found in “the autonomy of the individuals” that is celebrated in the West and is especially evident in Descartes’ Cogito. Pobee (1979) defines the African being in terms of what he calls Cognatus ergo sum, which means “I am related by blood, therefore I exist.” What this suggests is that in Southern Africa, as in the rest of Sub-Saharan Africa, the idea of being is relational.

Coming to the communion of human beings with the spiritual world, it is important to remark that the idea of being finds its full expression through participation. Just as Socrates/Plato’s matter partakes in the immutable forms, being in Shona/Ndebele society depends solely on its relationship with the spiritual world, which is populated by ancestral spirits, avenging spirits, alien spirits and the greatest spiritual being, called Musikavanhu/Nyadenga/unkulunkulu (The God of Creation). The greatest being works with his lieutenants, the ancestors and other spirits, to protect the interests of the lesser beings, vanhu/abantu. In return, vanhu/abantu enact rituals of appeasement so that the relationship does not become a one-way interaction. It is, however, important to note that while Socratic/Platonic metaphysics is dualistic in character, hunhu/ubuntu metaphysics is onto-triadic or tripartite in character: it involves the Supreme Being (God), other lesser spirits (ancestral, alien and avenging) and human beings.

b. Hunhu/Ubuntu Ethics

Hunhu/ubuntu ethics refer to hunhu/ubuntu moral terms and phrases such as tsika dzakanaka kana kuti dzakaipa (good or bad behaviour), kuzvibata kana kuti kusazvibata (self-control or reckless behaviour), kukudza vakuru (respecting or disrespecting elders) and kuteerera vabereki (being obedient or disobedient to one’s immediate parents and the other elders of the community), among others. In Shona society they say: Mwana anorerwa nemusha kana kuti nedunhu (It takes a clan, village or community to raise a child). Having defined hunhu/ubuntu ethics, it is important to distinguish them from hunhu/ubuntu morality, which relates to the principles or rules that guide and regulate the behaviour of vanhu or abantu (human beings in the Shona/Ndebele sense of the word) within Bantu societies.

What distinguishes hunhu/ubuntu ethics from Western ethics is that the former are both upward-looking/transcendental and lateral, while the latter are only lateral. This section will briefly distinguish the upward-looking/transcendental dimension of hunhu/ubuntu ethics from the lateral dimension. By upward-looking/transcendental is meant that hunhu/ubuntu ethics are not confined to the interaction between humans; they also involve spiritual beings such as Mwari/Musikavanhu/Unkulunkulu (Creator God), Midzimu (ancestors) and Mashavi (alien spirits). Thus, hunhu/ubuntu ethics are spiritual, dialogical and consensual (cf. Nafukho 2006). By dialogical and consensual is meant that the principles that guide and regulate the behaviour of vanhu/abantu are products of the dialogue between spiritual beings and human beings and the consensus that they reach. By lateral is meant that these principles or rules are crafted solely to guide human interactions.

As Mangena (2012: 11) puts it, hunhu/ubuntu ethics proceed through what is called the Common Moral Position (CMP). The CMP is not a position established by one person, as is the case with Plato’s justice theory, Aristotle’s eudaimonism, Kant’s deontology or Bentham’s hedonism (2012: 11). With the CMP, the community is the source, author and custodian of moral standards, and personhood is defined in terms of conformity to these established moral standards, whose objective is to have a person who is communo-centric rather than one who is individualistic. In Shona/Ndebele society, for instance, respect for elders is one of the ways in which personhood can be expressed, with the goal being to uphold communal values. It is within this context that respect for elders is a non-negotiable matter, since elders are the custodians of these values and fountains of moral wisdom.

Thus, one is born and bred in a society that values respect for the elderly, and he or she has to conform. One important point to note is that the process of attaining the CMP is dialogical and spiritual in the sense that elders set moral standards in consultation with the spirit world which, as intimated earlier, is made up of Mwari/Musikavanhu/Unkulunkulu (Creator God) and Midzimu (ancestors), and these moral standards are upheld by society (2012: 12). These moral standards, which make up the CMP, are not forced on society, as the elders (who represent society), Midzimu (who convey the message to Mwari) and Mwari (who gives a nod of approval) ensure that the standards are there to protect the interests of the community at large.

Communities are allowed to exercise their free will but remain responsible for the choices they make as well as their actions. For instance, if a community chooses to ignore the warnings of the spirit world regarding an impending danger such as a calamity resulting from failure by that community to enact an important ritual that will protect members of that community from say, flooding or famine; then the community will face the consequences.

c. Hunhu/Ubuntu Epistemology

What is epistemology? In the Western sense of the word, epistemology deals with the meaning, source and nature of knowledge. Western philosophers differ when it comes to the sources of knowledge, with some arguing that reason is the source of knowledge while others view experience or the use of the senses as the gateway to knowledge. This article will not delve into these arguments, since they have been well rehearsed elsewhere; instead, it focuses on hunhu/ubuntu epistemology. However, one cannot define and characterise hunhu/ubuntu traditional epistemology without first defining and demarcating the province of African epistemology as opposed to Western epistemology.

According to Michael Battle (2009: 135), “African epistemology begins with community and moves to individuality.” Thus, the idea of knowledge in Africa resides in the community and not in the individuals that make up the community. Inherent in the powerful wisdom of Africa is the ontological need of the individual to know self and community (2009: 135), and discourses on hunhu/ubuntu traditional epistemology stem from this wisdom. As Mogobe Ramose (1999) puts it, “the African tree of knowledge stems from ubuntu philosophy. Thus, ubuntu is a well spring that flows within African notions of existence and epistemology in which the two constitute a wholeness and oneness.” Just like hunhu/ubuntu ontology, hunhu/ubuntu epistemology is experiential.

In Shona society, for instance, the idea of knowledge is expressed through listening to elders telling stories of their experiences as youths and how such experiences can be relevant to the lives of the youths of today. Sometimes they use proverbs to express their epistemology. The proverb Rega zvipore akabva mukutsva (Experience is the best teacher) is a case in point. One comes to know that promiscuity is bad when one was once involved in it and contracted a Sexually Transmitted Infection (STI) or suffered other bad consequences. No doubt, this person will be able to tell others that promiscuity is bad because of his or her experiences. The point is that hunhu/ubuntu epistemology is a function of experience. In Shona, they also say: Takabva noko kumhunga hakuna ipwa (We passed through the millet field and we know that there are no sweet reeds there). The point is that one gets to know that there are no sweet reeds in a millet field because he or she passed through it. One has to use the senses to discern knowledge.

7. Conclusion

In this article, the traditional philosophy of hunhu/ubuntu was defined and characterised with a view to showing that Africa has a traditional philosophy and ethic which are distinctively communal and spiritual. This philosophy was also discussed with reference to how it has been deployed in the public sphere in both Zimbabwe and South Africa. The key distinctive qualities of this traditional philosophy were spelt out as humaneness, gentleness, hospitality and generosity. The philosophy was also discussed within the context of its three main components, namely hunhu/ubuntu metaphysics, hunhu/ubuntu ethics and hunhu/ubuntu epistemology. In the final analysis, it was explained that these three components form the aspects of what is known today as traditional Southern African thought.

8. References and Further Reading

  • Appiah, K.A. 1992. In My Father’s House: Africa in the Philosophy of Culture. New York: Oxford University Press.
    • A thorough treatise of the idea of Africa in the philosophy of culture
  • Battle, M. 2009. Ubuntu: I in You and You in Me. New York: Seabury Publishing
    • A discussion of Ubuntu and how this idea has benefitted the Western world.
  • Dolamo, R. 2013.  “Botho/Ubuntu: The Heart of African Ethics.” Scriptura, 112 (1), pp.1-10
    • A thorough treatise on the notion of Ubuntu and its origin in Africa
  • Eze, M.O. 2011.  “I am Because You Are.” The UNESCO Courier, pp. 10-13
    • A philosophical analysis of the idea of ubuntu
  • Eze, M.O. 2010. Intellectual History in Contemporary South Africa.  New York: Palgrave Macmillan
    • A detailed outline of the definition and characterization of intellectual history in Contemporary Africa
  • Eze, M.O. 2008. “What is African Communitarianism? Against Consensus as a regulative ideal.” South African Journal of Philosophy. 27 (4), pp. 106-119
    • A philosophical discussion of the notions of community and individuality in African thought
  • Flexner, S et al. 1988. The Random House Dictionary. New York: Random House.
    • One of the best dictionaries used by academics
  • Gade, C.B.N. 2011. “The Historical Development of the Written Discourses on Ubuntu.” South African Journal of Philosophy, 30(3), pp. 303-330
    • A philosophical discussion of the historical development of the Ubuntu discourse in Southern Africa
  • Gade, C.B.N. 2012. “What is Ubuntu? Different Interpretations among South Africans of African Descent.” South African Journal of Philosophy, 31 (3), pp.484-503
    • A Case-study on how South Africans of African descent interpret ubuntu
  • Gade, C.B.N. 2013. “Restorative Justice and the South African Truth and Reconciliation Process.” South African Journal of Philosophy, 32(1), pp. 10-35
    • A philosophical discussion of the origins of the idea of Restorative Justice
  • Gyekye, K. 1997. Tradition and Modernity: Reflections on the African Experience. New York: Oxford University Press
    • A philosophical rendition of the concepts of tradition and modernity in Africa
  • Hurka, T. 1993. Perfectionism. New York: Oxford University Press
    • A discussion on the notion of perfectionism
  • Makinde, M.A. 1988. African Philosophy, Culture and Traditional Medicine. Athens: Africa Series number 53.
    • A thorough treatise on culture and philosophy in African thought
  • Mangena, F. 2012a. On Ubuntu and Retributive Justice in Korekore-Nyombwe Culture: Emerging Ethical Perspectives. Harare: Best Practices Books
    • A philosophical discussion of the place of Ubuntu and culture in the death penalty debate
  • Mangena, F. 2012b. “Towards a Hunhu/Ubuntu Dialogical Moral Theory.” Phronimon: Journal of the South African Society for Greek Philosophy and the Humanities, 13 (2), pp. 1-17
    • A philosophical discussion of the problems of applying Western ethical models in non-Western cultures
  • Mangena, F. 2014. “In Defense of Ethno-philosophy: A Brief Response to Kanu’s Eclecticism.” Filosofia Theoretica: A Journal of Philosophy, Culture and Religions, 3(1), pp. 96-107
    • A reflection on the importance of ethno-philosophy in the African philosophy debate
  • Mangena, F. 2015. “Ethno-philosophy as Rational: A Reply to Two Famous Critics.” Thought and Practice: A Journal of the Philosophical Association of Kenya, 6 (2), pp. 24-38
    • A reaction to the Universalists regarding the place of ethno-philosophy in African thought
  • Mbiti, J.S. 1969. African Religions and Philosophy. London: Heinemann
    • A discussion of community in African philosophy
  • Metz, T. 2007. “Towards an African Moral Theory.” The Journal of Political Philosophy, 15(3), pp. 321-341
    • A philosophical outline of what Thaddeus Metz perceives as African ethics
  • Nafukho, F.M. 2006. “Ubuntu Worldview: A Traditional African View of Adult Learning in the Workplace.” Advances in Developing Human Resources, 8(3), pp. 408-415
    • A thorough treatise on the three pillars of ubuntu
  • Pobee, J.S. 1979. Towards an African Theology. Nashville: Abingdon Press.
    • A theological discussion of the notions of community and individuality in African thought
  • Prozesky, M.H. 2003. Frontiers of Conscience: Exploring Ethics in a New Millennium. Cascades: Equinym Publishing
    • An outline of Ubuntu’s ten qualities
  • Ramose, M.B. 1999. African Philosophy through Ubuntu. Harare: Mond Books.
    • A thorough discussion on the nature and character of ubuntu
  • Ramose, M.B. 2007. “But Hans Kelsen was not born in Africa: A reply to Metz.” South African Journal of Philosophy, 26(4), pp. 347-355
    • Ramose’s response to Thaddeus Metz’s claim that African ethics lack a basic norm
  • Ramose, M.B. 2014b.  “Ubuntu: Affirming Right and Seeking Remedies in South Africa.” In: L Praeg and S Magadla (Eds.). Ubuntu: Curating the Archive (pp. 121-1346). Scottsville: University of KwaZulu Natal Press
    • A discussion of Ubuntu as affirming right and wrong in South Africa
  • Samkange, S and Samkange, T.M. 1980. Hunhuism or Ubuntuism: A Zimbabwean Indigenous Political Philosophy. Salisbury: Graham Publishing
    • A philosophical handbook on notions of Hunhu/Ubuntu in Zimbabwe
  • Steward, D and Blocker H.G. 1987. Fundamentals of Philosophy. New York: Macmillan Publishing Company
    • A discussion of key topics in Western philosophy
  • Shutte, A. 2008. “African Ethics in a Globalizing World.” In: R Nicolson (Ed.). Persons in Community: African Ethics in a Global Culture (pp. 15-34). Scottsville: University of KwaZulu Natal Press
    • A philosophical discussion of African ethics and its place in the globe
  • Taylor, D.F.P. 2013. “Defining Ubuntu for Business Ethics: A Deontological Approach.” South African Journal of Philosophy, 33(3), pp.331-345
    • An attempt to apply Ubuntu in the field of Business in Africa
  • Toure, S. 1959. “The Political Leader considered as the Representative of Culture.” http://www.blackpast.org/1959-sekou-toure-political-leader-considered-representative-culture
    • A discussion of the link between leadership, politics and culture in Africa
  • Tutu, D. 1999. No Future without Forgiveness. New York: Doubleday
    • A philosophical discussion of the findings of the Truth and Reconciliation Commission in South Africa
  • van Niekerk, J. 2013. “Ubuntu and Moral Value.” PhD Dissertation, Department of Philosophy, University of the Witwatersrand, Johannesburg
    • A philosophical rendition of the discourse of ubuntu and moral value.
  • Yamamoto, E.K. 1997. “Race Apologies.” Journal of Gender, Race and Justice, Vol. 1, pp. 47-88
    • A critical reflection on the nexus of Ubuntu, race, gender and justice
  • Villa-Vicencio, C. 2009. Walk with Us and Listen: Political Reconciliation in Africa. Cape Town: University of Cape Town Press
    • A philosophical discussion of political reconciliation in Africa.
  • Richardson, N. R. 2006. “Reflections on Reconciliation and Ubuntu.” In: R Nicolson (Ed.). Persons in Community: African Ethics in a Global Culture. Scottsville: University of KwaZulu Natal Press
    • A discussion on reconciliation in light of the Truth and Reconciliation Commission in South Africa.

 

Author Information

Fainos Mangena
Email: fainosmangena@gmail.com
University of Zimbabwe
Zimbabwe

History of African Philosophy

This article traces the history of systematic African philosophy from the early 1920s to date. In Plato’s Theaetetus, Socrates suggests that philosophy begins with wonder. Aristotle agreed. However, recent research shows that wonder may have different subsets. If that is the case, which specific subset of wonder inspired the beginning of systematic African philosophy? In the history of Western philosophy, there is the subset called thaumazein, interpreted as ‘awe’, and another called miraculum, interpreted as ‘curiosity’. History shows that these two subsets manifested in the African place as well, even during the pre-systematic era. However, there is now an idea appearing in recent African philosophy literature called ọnụma, interpreted as ‘frustration,’ which is regarded as the subset of wonder that jump-started systematic African philosophy. In the 1920s, a host of Africans who had gone to study in the West were just returning. They had experienced terrible racism and discrimination while in the West: they were referred to as descendants of slaves, as people from the slave colony, as sub-humans, and so on. On return to their native lands, they met the same maltreatment from the colonial officials. ‘Frustrated’ by colonialism and racialism as well as the legacies of slavery, they were jolted onto the path of philosophy—African philosophy—by what can be called ọnụma.

These ugly episodes of slavery, colonialism and racialism not only shaped the world’s perception of Africa; they also instigated a form of intellectual revolt from the African intelligentsia. The frustration with the colonial order eventually led to angry questions and reactions out of which African philosophy emerged, first in the form of nationalisms, and then in the form of ideological theorizations. But the frustration was borne out of the colonial caricature of Africa as culturally naïve, intellectually docile and rationally inept. This caricature was created by European scholars such as Kant, Hegel and, much later, Levy-Bruhl, to name just a few. It was the reaction to this caricature that led some African scholars returning from the West into the type of philosophizing one can describe as systematic, beginning with the identity of the African people, their place in history and their contributions to civilization. Dethroning the colonially-built episteme became a ready attraction for African scholars’ vexed frustrations. Thus began the history of systematic African philosophy, with the likes of JB Danquah, Meinrad Hebga, George James, SK Akesson, Aime Cesaire, Leopold Senghor, Kwame Nkrumah, Julius Nyerere, William Abraham, John Mbiti and others such as Placide Tempels and Janheinz Jahn, to name a few.

Table of Contents

  1. Introduction
  2. Criteria of African Philosophy
  3. Methods of African Philosophy
    1. The Communitarian Method
    2. The Complementarity Method
    3. The Conversational Method
  4. Schools of African Philosophy
    1. Ethnophilosophy School
    2. Nationalist/Ideological School
    3. Philosophic Sagacity
    4. Hermeneutical School
    5. Literary School
    6. Professional School
    7. Conversational School
  5. The Movements in African Philosophy
    1. Excavationism
    2. Afro-Constructionism/Afro-Deconstructionism
    3. Critical Reconstructionism/Afro-Eclecticism
    4. Conversationalism
  6. Epochs in African Philosophy
    1. Pre-systematic Epoch
    2. Systematic Epoch
  7. Periods of African Philosophy
    1. Early Period
    2. Middle Period
    3. Later Period
    4. New Era
  8. Conclusion
  9. References and Further Reading

1. Introduction

African philosophy as a systematic study has a very short history. This history is also a very dense one, since actors sought to do in a few decades what would have been better done over many centuries. As a result, they also did in later years what ought to have been done earlier, and vice versa, thus making the early and the middle periods overlap considerably. The reason for this compressed endeavor is not far-fetched. Soon after colonialism, actors realized that Africa had been sucked into the global matrix unprepared. During colonial times, the identity of the African was European; his thought system, standards and even his perception of reality were structured by the colonial shadow which stood towering behind him. It was easy for the African to position himself within these Western cultural appurtenances even though they had no real connection with his being.

The vanity of this presupposition and the emptiness of colonial assurances manifested soon after the towering colonial shadow vanished. Now, in the global matrix, it became shameful for the African to continue to identify himself within the European colonialist milieu. For one, he had just rejected colonialism, and for another, the deposed European colonialist made it clear that the identity of the African was no longer covered and insured by the European medium. So, actors realized suddenly that they had been disillusioned and had suffered severe self-deceit under the colonial temper. The question which trailed every African was, “Who are you?” Of course, the answers from the European perspective were savage, primitive, less than human, and so on. It was the urgent, sudden need to contradict these European positions that sent some post-colonial Africans in search of African identity. So, to discover or rediscover African identity in order to initiate a non-colonial or original history for Africa in the global matrix, and to start a course of viable economic, political and social progress that is entirely African, became one of the focal points of African philosophy. Here, the likes of Cesaire, Nkrumah and Leon Damas began articulating the negritude movement.

While JB Danquah (1928, 1944) and SK Akesson (1965) rationally investigated topics in African politics, law and metaphysics, George James (1954) reconstructed African philosophical history, and Meinrad Hebga (1958) probed topics in African logic. These figures represent some of the early African philosophers. Placid Tempels (1959), the European missionary, also elected to help and, in his controversial book, Bantu Philosophy, sought to create Africa’s own philosophy as proof that Africa has its own peculiar identity and thought system. However, it was George James who attempted a much more ambitious project in his work, Stolen Legacy. In this work, there were strong suggestions not only that Africa had philosophy but that so-called Western philosophy, the very bastion of European identity, was stolen from Africa. This claim was intended to make the proud European colonialists feel indebted to the humiliated Africans, but it was unsuccessful. That Greek philosophy had roots in Egypt does not imply, as some claim, that the Egyptians were high-melanated or that high-melanated Africans created Egyptian philosophy. The use of the term “Africans” in this work is in keeping with George James’ demarcation that precludes the low-melanated people of North Africa and refers to the high-melanated people south of the Sahara.

Besides those mentioned above, other Africans contributed ideas. Aime Cesaire, John Mbiti, Odera Oruka, Julius Nyerere, Leopold Senghor, Nnamdi Azikiwe, Kwame Nkrumah, Obafemi Awolowo, Alexis Kagame, Uzodinma Nwala, Emmanuel Edeh, Innocent Onyewuenyi, and Henry Olela, to name just a few, opened the doors of ideas. A few of the works produced sought to prove and establish the philosophical basis of Africa’s unique identity in the history of humankind, while others sought to chart a course for Africa’s true identity through unique political and economic ideologies. It can be stated that most of these endeavors fall under the early period.

For its concerns, the middle period of African philosophy is characterized by the Great Debate. Those who sought to clarify and justify the position held in the early period and those who sought to criticize and deny the viability of such a position entangled themselves in a great debate. Some of the actors on this front include C. S. Momoh, Robin Horton, Henri Maurier, Lansana Keita, Peter Bodunrin, Kwasi Wiredu, Kwame Gyekye, Richard Wright, Barry Hallen, Joseph Omoregbe, C. B. Okolo, Theophilus Okere, Paulin Hountondji, Gordon Hunnings, Odera Oruka and Sophie Oluwole, to name a few.

The middle period eventually gave way to the later period, which has as its focus the construction of an African episteme. Two camps rivaled each other, namely the Critical Reconstructionists, who are the evolved Universalists/Deconstructionists, and the Eclectics, who are the evolved Traditionalists/Excavators. The former seek to build an African episteme untainted by ethnophilosophy, whereas the latter seek to do the same by a delicate fusion of the relevant ideals of the two camps. In the end, Critical Reconstructionism ran into a brick wall when it became clear that whatever it produced could not truly be called African philosophy if it was all Western without African marks. The mere claim that it would be African philosophy simply because it was produced by Africans (Hountondji 1996 and Oruka 1975) would collapse like a house of cards under any argument. For this great failure, the influence of Critical Reconstructionism in the later period was whittled down, and it was later absorbed by its rival—Eclecticism.

The works of the Eclectics heralded the emergence of the New Era in African philosophy. The focus became conversational philosophizing, in which the production of a philosophically rigorous and original African episteme, better than what the Eclectics produced, occupied center stage.

Overall, the sum of what historians of African philosophy have done can be presented in the following two broad categorizations, to wit: the Pre-systematic epoch and the Systematic epoch. The former refers to Africa’s philosophical culture and the thoughts of anonymous African thinkers, and may include the problems of the Egyptian and Ethiopian legacies. The latter refers to the periods marking the return of Africa’s first eleven Western-trained philosophers, from the 1920s to date. This latter category could further be delineated into four periods:

    1. Early period 1920s – 1960s
    2. Middle period 1960s – 1980s
    3. Later period 1980s – 1990s
    4. New (Contemporary) period since the 1990s

Note, of course, that this does not commit us to saying that, before the early period, people in Africa never philosophized—they did. But one fact that must not be denied is that much of their thought was not documented in writing; most of what may have been documented is either lost or destroyed, and, as such, scholars cannot attest to its systematicity or sources. In other words, what this periodization shows is that African philosophy as a system first began in the late 1920s. There are, of course, documented writings in ancient Egypt, medieval Ethiopia, and so on. The historian Cheikh Anta Diop (1974) has documented some of these ideas. Among the popularly cited figures is St Augustine, born in present-day Algeria, who, being a Catholic priest of the Roman Church, was trained in Western-style philosophy and is counted amongst the medieval philosophers. Wilhelm Anton Amo, who was born in Ghana, West Africa, was sold into slavery as a little boy and later educated in Western-style philosophy in Germany, where he also practised. Zera Yacob and Walda Heywat were both Ethiopian philosophers with Arabic and European educational influences. The question is, are the ideas produced by these people indubitably worthy of the name ‘African philosophies’? Their authors may be Africans by birth, but how independent are their views from foreign influences? We observe from these questions that the best that can be expected is a heated controversy. It would be uncharitable to say to the European historian of philosophy that St Augustine or Amo was not one of their own. Similarly, it may be uncharitable to say to the African historian that Amo or Yacob was not an African. But does being an African translate to being an African philosopher? If we set sentiments aside, it would be less difficult to see that all there is in those questions is a controversy.
Even if there were any substance beyond controversy, were those isolated and disconnected views (most of which were sociological, religious, ethnological and anthropological) from Egypt, Rome, Germany and Ethiopia adequate to form a coherent corpus of African philosophy? The conversationalists, a contemporary African philosophical movement, have provided us with a via media out of this controversy. Rather than discard this body of knowledge as non-African philosophy, or uncritically accept it as African philosophy as the likes of Obi Oguejiofor and Anke Graness do, the conversationalists urge that it be discussed as part of the pre-systematic epoch, which also includes those whom Innocent Asouzu (2004) describes as the “Anonymous Traditional African Philosophers”. These are the ancient African philosophers whose names were forgotten through the passage of time, and whose ideas were transmitted through orality.

Because there are credible objections among African philosophers with regard to its inclusion in the historical chart of African philosophy, the Egyptian question (the idea that the creators of ancient Egyptian civilization were high-melanated Africans from south of the Sahara) will be included as part of the controversies of the pre-systematic epoch. The main objection is that even if the philosophers of the stolen legacy were able to prove a connection between Greece and Egypt, they could not prove in concrete terms that the Egyptians who created the philosophy stolen by the Greeks were high-melanated Africans, or that high-melanated Africans were Egyptians. The frustration and desperation that motivated such an ambitious effort in the ugly colonial era captured above are understandable, but any reasonable person, judging by the responses of time and events in the last few decades, would agree that it is high time Africans abandoned that unproven legacy and let go of that now helpless propaganda. If, however, some would want to retain it as part of African philosophy, it would fall within the pre-systematic era.

In this essay, the discussion will focus on the history of systematic African philosophy, touching prominently on the criteria, schools, movements and periods in African philosophy. As much as the philosophers of a given era may disagree, they are inevitably united by the problem of their epoch. That is to say, it is orthodoxy that each epoch is defined by a common focus or problem. The study of the history of philosophy can therefore be approached either through personalities or through periods, but whichever approach one chooses, one unavoidably runs into the person who had chosen the other. This is a sign of unity of focus. Thus, philosophers are those who seek to solve the problem of their time. In this presentation, the study of the history of African philosophy will be approached from the perspectives of criteria, periods, schools, and movements. The personalities will be discussed within these purviews.

2. Criteria of African Philosophy

To start with, more than three decades of debate on the status of African philosophy ended with the affirmation that African philosophy exists. But what is it that makes a philosophy African? Answers to this question polarized actors into two main groups, namely the Traditionalists and the Universalists. Whereas the Traditionalists aver that the study of the philosophical elements in the world-views of African peoples constitutes African philosophy, the Universalists insist that it has to be a body of analytic and critical reflections of individual African philosophers. Further probing of the issue during the debate produced, by its end, two contrasting criteria for what makes a philosophy “African”. First, the racial criterion: a philosophy is African if it is produced by Africans. This is the view held by people like Paulin Hountondji, Odera Oruka (in part), and the early Peter Bodunrin, derived from the two constituting terms—“African” and “philosophy”. African philosophy, following this criterion, is the philosophy done by Africans. This has been criticized as inadequate, incorrect and exclusivist. Second, the tradition criterion: a philosophy is “African” if it designates a non-racial-bound philosophical tradition, where the predicate “African” is treated as a solidarity term of no racial import and where the approach derives inspiration from an African cultural background or system of thought. It does not matter whether the issues addressed are African, or whether the philosophy is done by an African, insofar as it has universal applicability and emerged from the purview of the African system of thought. African philosophy would then be that rigorous discourse on African issues, or any issues whatsoever, from the critical eye of the African system of thought. Actors like Odera Oruka (in part), Meinrad Hebga, C. S. Momoh, Udo Etuk, Joseph Omoregbe, the later Peter Bodunrin, and Jonathan Chimakonam can be grouped here.
This criterion has also been criticized as courting uncritical elements of the past when it makes reference to the controversial idea of an African logic tradition. Further discussion on this is well beyond the scope of this essay. What is, however, common to the two criteria is that African philosophy is a critical discourse, by African philosophers, on issues that may or may not affect Africa—the purview of this discourse remains unsettled. Recently, the issue of language has come to the fore as crucial in determining the Africanness of a philosophy. Inspired by the works of Euphrase Kezilahabi (1985), Ngugi wa Thiong’o (1986), AGA Bello (1987), and Francis Ogunmodede (1998), to name just a few, the ‘language challenge’ is now taken as an important element in the affirmation of African philosophy. Advocates ask: should authentic African philosophy be done in African languages or in a foreign language with wider reach? Godfrey Tangwa (2017), Chukwueloka Uduagwu (2022) and Enyimba Maduka (2022) are some contemporary Africans who investigate this question, while Alena Rettova (2007) represents non-African philosophers who engage it.

3. Methods of African Philosophy

a. The Communitarian Method

This method speaks to the idea of mutuality, togetherness or harmony, of the type found in the classic expression of ubuntu, “a person is a person through other persons”, or in the saying credited to John Mbiti, “I am because we are, since we are, therefore I am”. Those who employ this method wish to demonstrate the mutual interdependence of variables or the relational analysis of variables. It is most prominent in the works of researchers working in the areas of ubuntu, personhood and communalism. Some of the scholars who employ this method include Ifeanyi Menkiti, Mogobe Ramose, Kwame Gyekye, Thaddeus Metz, Fainos Mangena, Leonhard Praeg, Bernard Matolino, Michael Eze, Olajumoke Akiode, Rianna Oelofsen, and so forth.

b. The Complementarity Method

This method was propounded by Innocent Asouzu, and it emphasizes the idea of the missing link. In it, no variable is useless. The system of reality is like a network in which each variable has an important role to play; that is, it complements and is, in turn, complemented, because no variable is self-sufficient. Each variable is then seen as a ‘missing link’ of reality to other variables. Here, method is viewed as a disposition or a bridge-building mechanism. As a disposition, it says a lot about the orientation of the philosopher who employs it. The method of complementary reflection seeks to bring together seemingly opposed variables into a functional unity. Other scholars whose works have followed this method include Mesembe Edet, Ada Agada, Jonathan Chimakonam and a host of others.

c. The Conversational Method

This is a formal procedure for assessing the relationships of opposed variables, in which thoughts are shuffled through disjunctive and conjunctive modes to constantly recreate fresh theses and anti-theses, each time at a higher level of discourse, without the expectation of a synthesis. The three principal features of this method are relationality, the idea that variables necessarily interrelate; contextuality, the idea that the relationship of variables is informed and shaped by contexts; and complementarity, the idea that seemingly opposed variables can complement rather than contradict each other. It is an encounter between philosophers of rival schools of thought, and between different philosophical traditions or cultures, in which one party called nwa-nsa (the defender or proponent) holds a position and another party called nwa-nju (the doubter or opponent) doubts or questions the veracity and viability of the position. On the whole, this method points to the idea of relationships among interdependent, interrelated and interconnected realities existing in a network whose peculiar truth conditions can more accurately and broadly be determined within specific contexts. This method was first proposed by Jonathan Chimakonam and endorsed by the Conversational School of Philosophy. Other scholars who now employ this method include Victor Nweke, Mesembe Edet, Fainos Mangena, Enyimba Maduka, Ada Agada, Pius Mosima, L. Uchenna Ogbonnaya, Aribiah Attoe, Leyla Tavernaro-Haidarian, Amara Chimakonam, Chukwueloka Uduagwu, Patrick Ben, and a host of others.

4. Schools of African Philosophy

a. Ethnophilosophy School

This is the foremost school in systematic African philosophy, which equated African philosophy with culture-bound systems of thought. For this, its enterprise was scornfully described as substandard, hence the term “ethnophilosophy.” The thoughts of members of the Excavationism movement like Placid Tempels and Alexis Kagame properly belong here, and their high point was in the early period of African philosophy.

b. Nationalist/Ideological School

The concern of this school was nationalist philosophical jingoism: to combat colonialism and to create a political philosophy and ideology for Africa from the indigenous traditional system as a project of decolonization. The thoughts of members of the Excavationism movement in the early period, like Kwame Nkrumah, Leopold Sedar Senghor and Julius Nyerere, can be brought under this school.

c. Philosophic Sagacity

There is also the philosophic sagacity school, whose main focus is to show that standard philosophical discourse existed, and still exists, in traditional Africa and can only be discovered through sage conversations. The chief proponent of this school was the brilliant Kenyan philosopher Odera Oruka, who took time to emphasize that Marcel Griaule’s similar programme is less sophisticated than his. Other adherents of this school include Gail Presbey, Anke Graness and the Cameroonian philosopher Pius Mosima. But since Oruka’s approach thrives on the method of oral interviews of presumed sages, whose authenticity can easily be challenged, what was produced may well be distanced from the sages and become the fruit of the interviewing philosopher. So, the sage connection and the tradition became questionable. Their enterprise falls within the movement of Critical Reconstructionism of the later period.

d. Hermeneutical School

Another prominent school is the hermeneutical school. Its focus is that the best approach to studying African philosophy is through interpretations of oral traditions and emerging philosophical texts. Theophilus Okere, Okonda Okolo, Tsenay Serequeberhan and Ademola Fayemi Kazeem are some of the major proponents and members of this school. The confusion, however, is that they reject ethnophilosophy, whereas the oral tradition and most of the texts available for interpretation are ethnophilosophical in nature. The works of Okere and Okolo feasted on ethnophilosophy. This school exemplifies the movement called Afro-constructionism of the middle period.

e. Literary School

The literary school’s main concern is to make a philosophical presentation of African cultural values through literary/fictional means. Proponents like Chinua Achebe, Okot p’Bitek, Ngugi wa Thiong’o and Wole Soyinka, to name a few, have been outstanding. Yet critics have found it convenient to identify their discourse with ethnophilosophy from a literary angle, thereby denigrating it as substandard. Their enterprise marks the movement of Afro-constructionism of the middle period.

f. Professional School

Perhaps the most controversial is the school variously described as the professional, universalist or modernist school. It contends that all the other schools are engaged in one form of ethnophilosophy or another, that standard African philosophy is critical, individual discourse, and that what qualifies as African philosophy must have universal merit and thrive on the method of critical analysis and individual discursive enterprise. It is not about talking; it is about doing. Some staunch, unrepentant members of this school include Kwasi Wiredu, Paulin Hountondji, Peter Bodunrin, Richard Wright and Henri Maurier, to name a few. They demolished all that had been built in African philosophy and built nothing as an alternative episteme. This school champions the movement of Afro-deconstructionism and the abortive Critical Reconstructionism of the middle and later periods, respectively.

Perhaps one of the deeper criticisms that can be leveled against the position of the professional school comes from C. S. Momoh’s scornful description of the school as African logical neo-positivism. Its members contend (1) that there is nothing as yet in African traditional philosophy that qualifies as philosophy and (2) that critical analysis should be the focus of African philosophy; but if so, what then is there to be critically analyzed? Adherents of the professional school are said to forget, in their overt copying of European philosophy, that analysis is a recent development in European philosophy which attained maturation only in the 19th century, after over 2000 years of historical evolution; their claim, therefore, requires some downsizing. Would they also grant that philosophy in Europe before the 19th century was not philosophy? The aim of this essay is not to offer criticisms of the schools but to present the historical journey of philosophy in the African tradition. It is in opposition to, and out of the need to fill the lacuna in, the enterprise of the professional school that the new school called the conversational school has emerged in African philosophy.

g. Conversational School

This new school thrives on fulfilling the yearning of the professional/modernist school for robust individual discourse, as well as fulfilling the conviction of the traditionalists that a thorough-going African philosophy has to be erected on the foundation of African thought systems. Its members make the most of the criterion that presents African philosophy as a critical tradition that prioritizes engagements between philosophers and cultures, and projects individual discourses from the methodological lenses and thought system of Africa, featuring the principles of relationality, contextuality and complementarity. The school has an ideological structure consisting of four aspects: its working assumption, that relationship and context are crucial to understanding reality; its main problem, called border lines, or the presentation of reality as binary opposites; its challenge, which is to trace the root cause of border lines; and its two main questions, which are: does difference amount to inferiority, and are opposites irreconcilable? Those whose writings fit into this school include Pantaleon Iroegbu, Innocent Asouzu, Chris Ijiomah, Godfrey Ozumba, Andrew Uduigwomen, Bruce Janz, Jennifer Vest, Jonathan Chimakonam, Fainos Mangena, Victor Nweke, Paul Dottin, Aribiah Attoe, Leyla Tavernaro-Haidarian, Maduka Enyimba, L. Uchenna Ogbonnaya, Isaiah Negedu, Christiana Idika, Ada Agada, Amara Chimakonam, Patrick Ben, Emmanuel Ofuasia, Umezurike Ezugwu, to name a few. Their projects promote partly the movement of Afro-eclecticism and fully the Conversationalism of the later and the new periods, respectively.

5. The Movements in African Philosophy

Four main movements can be identified in the history of African philosophy: Excavationism, Afro-Constructionism/Afro-Deconstructionism, Critical Reconstructionism/Afro-Eclecticism, and Conversationalism.

a. Excavationism

The Excavators are all those who sought to erect the edifice of African philosophy by systematizing African cultural world-views. Some of them aimed at retrieving and reconstructing a presumably lost African identity from the raw materials of African culture, while others sought to develop compatible political ideologies for Africa from the native political systems of African peoples. Members of this movement have all been grouped under the schools known as the ethnophilosophy and nationalist/ideological schools, and they thrived in the early period of African philosophy. Their concern was to build and demonstrate a unique African identity in various forms. A few of them include JB Danquah, SK Akesson, Placid Tempels, Julius Nyerere, John Mbiti, Alexis Kagame, Leopold Senghor, Kwame Nkrumah and Aime Cesaire.

b. Afro-Constructionism/Afro-Deconstructionism

The Afro-deconstructionists, sometimes called the Modernists or the Universalists, are those who sought to demote the edifice erected by the Excavators on the ground that its raw materials are substandard cultural paraphernalia. They are opposed to the idea of a unique African identity or culture-bound philosophy and prefer a philosophy that will integrate African identity with the identity of all other races. They never built this philosophy. Some members of this movement include Paulin Hountondji, Kwasi Wiredu, Peter Bodunrin, Marcien Towa, Fabien Eboussi Boulaga, Richard Wright and Henri Maurier, and partly Kwame Appiah. Their opponents are the Afro-constructionists, sometimes called the Traditionalists or Particularists, who sought to add rigor to and promote the works of the Excavators as true African philosophy. Some prominent actors in this movement include Ifeanyi Menkiti, Innocent Onyewuenyi, Henry Olela, Lansana Keita, C. S. Momoh, Joseph Omoregbe, Janheinz Jahn, Sophie Oluwole and, in some ways, Kwame Gyekye. Members of this twin-movement have variously been grouped under the ethnophilosophy, philosophic sagacity, professional, hermeneutical and literary schools, and they thrived in the middle period of African philosophy. This is also known as the period of the Great Debate.

c. Critical Reconstructionism/Afro-Eclecticism

A few Afro-deconstructionists of the middle period evolved into Critical Reconstructionists, hoping to reconstruct from scratch the edifice of an authentic African philosophy that would be critical, individualistic and universal. They hold that the edifice of ethnophilosophy, which they had demolished in the middle period, contained no critical rigor. Some of the members of this movement include Kwasi Wiredu, Olusegun Oladipo, Kwame Appiah, V. Y. Mudimbe, D. A. Masolo, Odera Oruka and, in some ways, Barry Hallen and J. O. Sodipo. Their opponents are the Afro-Eclectics, who evolved from the Afro-constructionism of the middle period. Unable to sustain their advocacy and the structure of ethnophilosophy they had constructed, they stepped down a little to say, “Maybe we can meaningfully combine some of the non-conflicting concerns of the Traditionalists and the Modernists.” They reason (1) that it is a fact that African traditional philosophy is not rigorous enough, as claimed by the Modernists; (2) that it is also a fact that the deconstructionist program of the Modernists did not offer, and is incapable of offering, an alternative episteme; and (3) that the rigor of the Modernists can perhaps be applied to the usable and relevant elements produced by the Traditionalists to produce the much elusive, authentic African philosophy. African philosophy for this movement therefore becomes a product of synthesis, resulting from the application of the tools of critical reasoning to the relevant traditions of the African life-world. A. F. Uduigwomen, Kwame Gyekye, Ifeanyi Menkiti, Kwame Appiah, Godwin Sogolo and Jay van Hook are some of the members of this movement. This movement played a vital reconciliatory role, the importance of which was not fully realized in African philosophy. Most importantly, they found a way out and laid the foundation for the emergence of Conversationalism. Members of this twin-movement thrived in the later period of African philosophy.

d. Conversationalism

The Conversationalists are those who seek to create an enduring corpus in African philosophy by engaging elements of tradition and individual thinkers in critical conversations. They emphasize originality, creativity, innovation, peer-criticism and cross-pollination of ideas in prescribing and evaluating their ideas. They hold that a new episteme in African philosophy can only be created by individual African philosophers who make use of the “usable past” and the depth of individual originality in finding solutions to contemporary demands. They lay emphasis not on analysis alone but also on critical rigor and what is now called arumaristics, a creative reshuffling of thesis and anti-thesis that spins out new concepts and thoughts. Further, their methodological ambience features principles such as relationality, contextuality and complementarity. Members of this movement thrive in the contemporary period, and their school can be called the conversational school. Some of the philosophers who have demonstrated this trait include Pantaleon Iroegbu, Innocent Asouzu, Chris Ijiomah, Godfrey Ozumba, Andrew Uduigwomen, Bruce Janz, Jonathan Chimakonam, Fainos Mangena, Jennifer Lisa Vest, L. Uchenna Ogbonnaya, Maduka Enyimba, Leyla Tavernaro-Haidarian, Aribiah Attoe, and so forth.

6. Epochs in African Philosophy

Various historians of African philosophy have delineated the historiography of African philosophy differently. Most, like Obenga, Abanuka, Okoro, Oguejiofor, Graness and Fayemi, have merely adapted the Western periodization model of ancient, medieval, modern and contemporary. But there is a strong objection to this model. Africa, for example, did not experience the medieval age as Europe did. The intellectual history of the ancient period of Europe shares little in common with that of ancient Africa. The same goes for the modern period. In other words, the names ancient, medieval and modern refer to actual historical periods in Europe with specific features in their intellectual heritage, which share very little in common with those exact dates in Africa. It thus makes no historical, let alone philosophical, sense to adopt such a model for African philosophy. Here, we have a classic case of what Innocent Asouzu calls “copycat philosophy”, which must be rejected. The conversationalists, therefore, propose a different model, one that actually reflects the true state of things. In this model, there are two broad categorizations, to wit: the Pre-systematic epoch and the Systematic epoch. The latter is further divided into four periods: the early, middle, later and contemporary periods.

a. Pre-systematic Epoch

This refers to the era from the time of the first Homo sapiens to the 1900s. The African philosophers studied here are those Innocent Asouzu describes as the “Anonymous Traditional African Philosophers”, whose names have been lost to history. They may also include the ancient Egyptians, Ethiopians and Africans who thrived in Europe in that era. The controversies surrounding the provenance of the philosophies of St Augustine and Anton Amo, as well as the Egyptian question, may also be included.

b. Systematic Epoch

This refers to the era from the 1920s to date, when systematicity—involving academic training, writing, publishing and engagement, inspired by African conditions and geared towards addressing those conditions—became central to philosophical practice in Africa south of the Sahara. This epoch can be further delineated into four periods: early, middle, later and contemporary.

7. Periods of African Philosophy

a. Early Period

The early period of African philosophy was an era of the movement called cultural/ideological excavation, aimed at retrieving and reconstructing African identity. The schools that emerged and thrived in this period were the ethnophilosophy and ideological/nationalist schools. Hegel wrote that the Sub-Saharan Africans had no high cultures and made no contributions to world history and “civilization” (1975: 190). Lucien Lévy-Bruhl also suggested that they were “pre-logical” (1947: 17). The summary of these two positions, which represent the colonial mindset, is that Africans have no dignified identity like their European counterparts. This could be seen in the British colonial system, which sought to erode the native thought system in the constitution of social systems in its colonies, and also in the French policy of assimilation. Assimilation, a concept credited to Chris Talbot (1837), rests on the idea of expanding French culture to the colonies outside of France in the 19th and 20th centuries. According to Betts (2005: 8), the natives of these colonies were considered French citizens as long as “French culture” and customs were adopted to replace the indigenous system. The purpose of the theory of assimilation, for Michael Lambert, was therefore to turn African natives into Frenchmen by educating them in the French language and culture (1993: 239-262).

During colonial times, the British, for example, educated their colonies in the British language and culture, systematically undermining the native languages and cultures. The products of this new social system were then given the impression that they were British, though second class: the king was their king, and the empire was also theirs. Suddenly, however, colonialism ended, and they found, to their chagrin, that they were treated as slave countries in the new post-colonial order. Their native identity had been destroyed, and their fake British identity had also been taken from them; what was left was amorphous and corrupt. It was in the heat of this confusion and frustration that African philosophers sought to retrieve and recreate the original African identity lost in the event of colonization. Ruch and Anyanwu therefore ask, “What is this debate about African identity concerned with and what led to it? In other words, why should Africans search for their identity?” Their response to the questions is as follows:

The simple answer to these questions is this: Africans of the first half of this (20th century) century have begun to search for their identity, because they had, rightly or wrongly, the feeling that they had lost it or that they were being deprived of it. The three main factors which led to this feeling were: slavery, colonialism and racialism. (1981: 184-85)

Racialism, as Ruch and Anyanwu believed, may have sparked it off, and slavery may have dealt the heaviest blow, but it was colonialism that entrenched it. Ironically, it was the same colonialism, at its conclusion, that opened the eyes of Africans by stirring the hornet’s nest. An African can never be British or French, even with the colonially imposed language and culture. With this shock, the post-colonial African philosophers of the early period set out in search of Africa’s lost identity.

In 1954, George G. M. James published his monumental work Stolen Legacy. In it, he attempted to prove that the Egyptians were the true authors of Western philosophy; that Pythagoras, Socrates, Plato and Aristotle plagiarized the Egyptians; that the authorship of the individual doctrines of the Greek philosophers is mere speculation perpetuated chiefly by Aristotle and executed by his school; and that the African continent gave the world its civilization, knowledge, arts and sciences, religion and philosophy, a fact destined to produce a change in the mentality of both the European and African peoples. In James’ words:

In this way, the Greeks stole the legacy of the African continent and called it their own. And as has already been pointed out, the result of this dishonesty had been the creation of an enormous world opinion; that the African continent has made no contribution to civilization, because her people are backward and low in intelligence and culture…This erroneous opinion about the Black people has seriously injured them through the centuries up to modern times in which it appears to have reached a climax in the history of human relations. (1954: 54)

These robust intellectual positions, supported by evidential and well-thought-out arguments, quickly heralded a shift in the intellectual culture of the world. However, there was one problem George James could not fix: he could not prove that the people of North Africa (Egyptians), who were the true authors of ancient art, sciences, religion and philosophy, were high-melanated Africans, as can be seen in his hopeful but inconsistent conclusion:

This is going to mean a tremendous change in world opinion, and attitude, for all people and races who accept the new philosophy of African redemption, i.e. the truth that the Greeks were not the authors of Greek philosophy; but the people of North Africa; would change their opinion from one of disrespect to one of respect for the black people throughout the world and treat them accordingly. (1954: 153)

It is not clear how the achievements of North Africans (Egyptians) can redeem black Africans. This is also the problem with Henri Olela’s article “The African Foundations of Greek Philosophy”.

However, in Onyewuenyi’s The African Origin of Greek Philosophy, an ambitious attempt emerges to fill this lacuna in the argument for a new philosophy of African redemption. In the first part of chapter two, he reduces Greek philosophy to Egyptian philosophy, and in the second part, he attempts to further reduce the Egyptians of the time to high-melanated Africans. There are, however, two holes he could not fill. First, Egypt is the world’s oldest standing country, and one that tells its own story, in different forms, for itself; at no point did the Egyptians, or other historians, describe them as a wholly high-melanated people. Second, if the Egyptians were at one time wholly high-melanated, why are they now wholly low-melanated? Given the failure of this group of scholars to prove that high-melanated Africans were the authors of Egyptian philosophy, one must either abandon the Egyptian legacy or discuss it as one of the precursor arguments to systematic African philosophy until more evidence emerges.

There were other scholars of the early period who tried more reliable ways to assert African identity by establishing a native African philosophical heritage. Some examples include J. B. Danquah, who produced The Akan Doctrine of God (1944); Meinrad Hebga, who wrote “Logic in Africa” (1958); and S. K. Akesson, who published “The Akan Concept of Soul” (1965). Another is Tempels, who authored Bantu Philosophy (1959). They all proved that rationality was an important feature of traditional African culture. By systematizing Bantu philosophical ideas, Tempels confronted the racist orientation of the West, which depicted Africa as a continent of semi-humans. In fact, Tempels showed latent similarities in the spiritual inclinations of the Europeans and their African counterparts. In the opening passage of his work, he observed that the European who has taken to atheism quickly returns to a Christian viewpoint when suffering or pain threatens his survival. In much the same way, he says, the Christian Bantu returns to the ways of his ancestors when confronted by suffering and death. So, spiritual orientation or thinking is not found only in Africa.

In his attempt to explain the Bantu understanding of being, Tempels admits that this might not be the same as the understanding of the European. Instead, he argues that the Bantu construction is as much rational as that of the European. In his words:

So, the criteriology of the Bantu rests upon external evidence, upon the authority and dominating life force of the ancestors. It rests at the same time upon the internal evidence of experience of nature and of living phenomena, observed from their point of view. No doubt, anyone can show the error of their reasoning; but it must none the less be admitted that their notions are based on reason, that their criteriology and their wisdom belong to rational knowledge. (1959: 51)

Tempels obviously believes that the Bantu, like the rest of the African peoples, possess a rationality which undergirds their philosophical enterprise. The error in their reasoning is only obvious in the light of European logic. But Tempels was mistaken in his supposition that the Bantu system is erroneous. The Bantu categories only differ from those of the Europeans in terms of logic, which is why a first-time European onlooker would misinterpret them as irrational or spiritual. Hebga demonstrates this and suggests the development of African logic. Thus, the racist assumption that Africans are less intelligent, which Tempels rejected with one hand, was smuggled in with the other. For this, and for his other errors—such as his depiction of Bantu ontology in terms of vital force and his arrogant claim that the Bantu could not write their philosophy without the intervention of the European—some African philosophers, like Paulin Hountondji and Innocent Asouzu to name just a few, criticized Tempels. Asouzu, for one, speaks of the “Tempelsian Damage” in African philosophy to refer to the undue and erroneous influence which Bantu Philosophy has had on contemporary Africans. For example, Tempels makes a case for Africa’s true identity, which, for him, could be found in African religion, within which African philosophy (ontology) is subsumed. In his words, “being is force, force is being”. This went on to influence the next generation of African philosophers, like the Rwandan Alexis Kagame, whose work The Bantu-Rwandan Philosophy (1956) offers similar arguments, thus further strengthening the claims made by Tempels, especially from an African’s perspective. The major criticism against their industry remains the association of their thoughts with ethnophilosophy, where ethnophilosophy is a derogatory term. A much more pointed criticism was offered more recently by Innocent Asouzu in his work Ibuanyidanda: New Complementary Ontology (2007).
His criticism was not directed at the validity of the thoughts they expressed, or at whether Africa could boast of a rational enterprise such as philosophy, but at the logical foundation of their thoughts. Asouzu quarrels with Tempels for allowing his native Aristotelian orientation to influence his construction of African philosophy and lambasts Kagame for following suit instead of correcting Tempels’ mistake. The principle of bivalence evident in the Western thought system lay at the background of their constructions.

Another important philosopher in this period is John Mbiti. His work African Religions and Philosophy (1969) avidly educated those who doubted that Africans possessed their own identities before the arrival of the European, by excavating and demonstrating the rationality in the religious and philosophical enterprises of African cultures. He boldly declared: “We shall use the singular, ‘philosophy’ to refer to the philosophical understanding of African peoples concerning different issues of life” (1969: 2). His presentation of time in African thought exemplifies the pattern of excavation in his African philosophy. Although his studies focus primarily on the Kikamba and Gikuyu tribes, he observes that there are similarities in many African cultures, just as Tempels did earlier. He subsumes African philosophy in African religion on the assumption that African peoples do not know how to exist without religion. This idea is also shared by William Abraham in his book The Mind of Africa, as well as by Tempels’ Bantu Philosophy. African philosophy, on Mbiti’s treatment, could be likened to Tempels’ vital force, of which African religion is the outer cloak. The obvious focus of the book is on African views about God, political thought, the afterlife, culture or world-view, and creation; the philosophical aspects lie within these religious over-coats. Thus, Mbiti establishes that the true, and lost, identity of the African could be found within his religion. Another important observation Mbiti made was that this identity is communal and not individualistic. Hence, he states, “I am because we are and since we are therefore I am” (1969: 108). Therefore, the African has to re-enter his religion to find his philosophy and the community to find his identity.
But just like Kagame, Mbiti was unduly and erroneously influenced both by Tempels and the Judeo-Christian religion in accepting the vital force theory and in cloaking the African God with the attributes of the Judeo-Christian God.

This is a view shared by William Abraham. He shares Tempels’ and Mbiti’s view that the high-melanated African peoples have many similarities in their cultures, though his studies focus on the culture and political thought of the Akan of present-day Ghana. Another important aspect of Abraham’s work is that he subsumed African philosophical thought in African culture, taking, as Barry Hallen described it, “an essentialist interpretation of African culture” (2002: 15). Thus, for Abraham, as for Tempels and Mbiti, the lost African identity could be found in the seabed of African indigenous culture, in which religion features prominently.

On the other hand, there were those who sought to retrieve and establish, once again, Africa’s lost identity through economic and political means. Some names discussed here include Kwame Nkrumah, Leopold Senghor and Julius Nyerere. These actors felt that the African could never be truly decolonized unless he found his own system of living and social organization; one cannot be African while living like the European. The question that guided their study, therefore, became, “What system of economic and social engineering will suit us and project our true identity?” Nkrumah advocates African socialism, which, according to Barry Hallen, is an original social, political and philosophical theory of African origin and orientation. This system is forged from the traditional, communal structure of African society, a view strongly projected by Mbiti. Like Amilcar Cabral and Julius Nyerere, Nkrumah suggests that a return to the African cultural system, with its astute moral values, communal ownership of land, and humanitarian social and political engineering, holds the key to Africa rediscovering her lost identity. Systematizing this process will yield what he calls the African brand of socialism. In most of his books, he projects the idea that Africa’s lost identity is to be found in African native culture, within which lies African philosophical thought and an identity shaped by communal orientation. Some of his works include Neo-colonialism: The Last Stage of Imperialism (1965), I Speak of Freedom: A Statement of African Ideology (1961), Africa Must Unite (1970), and Consciencism (1965).

Leopold Sedar Senghor of Senegal charted a course similar to that of Nkrumah. In his works Negritude et Humanisme (1964) and Negritude and the Germans (1967), Senghor traced Africa’s philosophy of social engineering down to African culture, which he said is communal and laden with brotherly emotion. This is different from the European system, which he says is individualistic, having been marshaled purely by reason. He opposed the French colonial principle of assimilation, which aimed at turning Africans into Frenchmen by eroding and replacing African culture with French culture. African culture and languages are the bastions of African identity, and it is in this culture that he found the pedestal for constructing a political ideology that would project African identity. Senghor is in agreement with Nkrumah, Mbiti, Abraham and Tempels in many ways, especially with regard to the basis of Africa’s true identity.

Julius Nyerere of Tanzania is another philosopher of note in the early period of African philosophy. In his books Uhuru na Ujamaa: Freedom and Socialism (1964) and Ujamaa: The Basis of African Socialism (1968), he sought to retrieve and establish Africa’s true identity through economic and political means. For him, Africans cannot regain their identity unless they are first free, and freedom (Uhuru) transcends independence: cultural imperialism has to be overcome. And what better way to achieve this than by developing a socio-political and economic ideology from the petals of African native culture and its traditional values of togetherness and brotherliness? Hence, Nyerere proposes Ujamaa, meaning familyhood—the “being-with” philosophy, or the “we” spirit instead of the “I” spirit (Okoro 2004: 96). In the words of Barry Hallen, “Nyerere argued that there was a form of life and system of values indigenous to the culture of pre-colonial Africa, Tanzania in particular, that was distinctive if not unique and that had survived the onslaughts of colonialism sufficiently intact to be regenerated as the basis for an African polity” (2002: 74). Thus, for Nyerere, the basis of African identity is African culture, which is communal rather than individualistic. Nyerere was in agreement with the other actors of this period on the path to full recovery of Africa’s lost identity. Some of the philosophers of this era not treated here include Aime Cesaire, Nnamdi Azikiwe, Obafemi Awolowo, Amilcar Cabral, and two foreigners, Janheinz Jahn and Marcel Griaule.

b. Middle Period

The middle period of African philosophy was an era of the twin movements called Afro-constructionism and Afro-deconstructionism, otherwise known as the Great Debate, when two rival schools—the Traditionalists and the Universalists—clashed. While the Traditionalists sought to construct an African identity based on excavated African cultural elements, the Universalists sought to demolish that architectonic structure by associating it with ethnophilosophy. The schools that thrived in this era include the Philosophic Sagacity, Professional/Modernist/Universalist, Hermeneutical and Literary schools.

An important factor of the early period was that the thoughts on Africa’s true identity generated arguments that fostered the emergence of the middle period of African philosophy. These arguments resulted from questions that can be summarized as follows: (1) Is it proper to take for granted the sweeping assertion that all of Africa’s cultures share a few basic elements in common? It was this assumption that had necessitated the favorite phrase of the early period, “African philosophy”, rather than “African philosophies”. (2) Does Africa or African culture contain a philosophy in the strict sense of the term? (3) Can African philosophy emerge from the womb of African religion, world-view and culture? Answers, and objections to answers, soon took the shape of a debate, characterizing the middle period as the era of the Great Debate in African philosophy.

This debate was between members of Africa’s new crop of intellectual radicals. On the one hand were the demoters, and on the other the promoters, of the African philosophy established by the league of early-period intellectuals. The former sought to criticize this new philosophy of redemption, gave it the derogatory tag “ethnophilosophy” and consequently denigrated the African identity founded on it as a savage and primitive identity. At the other end, the promoters sought to clarify and defend this philosophy and to justify the African identity rooted in it as true and original.

For clarity, the assessment of the debate era will begin from the middle instead of the beginning. In 1978, Odera Oruka, a Kenyan philosopher, presented a paper at the William Amo Symposium held in Accra, Ghana, on the topic “Four Trends in Current African Philosophy”, in which he grouped voices on African philosophy into four schools, namely ethnophilosophy, philosophic sagacity, the nationalistic-ideological school and professional philosophy. In 1990, he wrote another work, Sage Philosophy: Indigenous Thinkers and the Modern Debate on African Philosophy, in which he added two further schools, bringing the number to six schools in African philosophy. Those two additions are the hermeneutic and the artistic/literary schools.

Those who uphold philosophy in African culture are the ethnophilosophers, and they include the actors treated as members of the early period of African philosophy and their followers or supporters in the Middle Period. Some would include C. S. Momoh, Joseph Omoregbe, Lansana Keita, Olusegun Oladipo, Gordon Hunnings, Kwame Gyekye, M. A. Makinde, Emmanuel Edeh, Uzodinma Nwala, K. C. Anyanwu and later E. A. Ruch, to name a few. The philosophic sagacity school, to which Oruka belongs, also accommodates C. S. Momoh, C. B. Nze, J. I. Omoregbe, C. B. Okolo and T. F. Mason. The nationalist-ideological school consists of those who sought to develop indigenous socio-political and economic ideologies for Africa. Prominent members include Julius Nyerere, Leopold Senghor, Kwame Nkrumah, Amilcar Cabral, Nnamdi Azikiwe and Obafemi Awolowo. The professional philosophy school insists that African philosophy must be done with professional philosophical methods such as analysis, critical reflection and logical argumentation, as it is in Western philosophy. Members of this school include: Paulin Hountondji, Henri Maurier, Richard Wright, Peter Bodunrin, Kwasi Wiredu, early E. A. Ruch, R. Horton, and later C. B. Okolo. The hermeneutic school recommends interpretation as a method of doing African philosophy. A few of its members include Theophilus Okere, Okonda Okolo, Tsenay Serequeberhan, Godwin Sogolo and partly J. Sodipo and B. Hallen. The Artistic/Literary school philosophically discusses the core of African norms in literary works, and includes Chinua Achebe, Okot P’Bitek, Ngugi wa Thiong’o, Wole Soyinka, Elechi Amadi and F. C. Ogbalu.

Also, in 1989, C. S. Momoh, in his The Substance of African Philosophy, outlined five schools, namely African logical neo-positivism, the colonial/missionary school of thought, the Egyptological school, the ideological school and the purist school. The article in which he did so was titled “Nature, Issues and Substance of African Philosophy” and was reproduced in Jim Unah’s Metaphysics, Phenomenology and African Philosophy (1996).

In comparing Momoh’s delineations with Oruka’s, it can be said that the purist school encompasses Oruka’s ethnophilosophy, artistic/literary school and philosophic sagacity; African logical neo-positivism encompasses the professional philosophy and hermeneutical schools; and the ideological and colonial/missionary schools correspond to Oruka’s nationalistic-ideological school. The Egyptological school, therefore, has no counterpart in Oruka’s scheme. Momoh sees it as a school that takes African philosophy to be synonymous with Egyptian philosophy or, at least, as originating from it. This view of Egyptian philosophy as African philosophy is expressed in the writings of George James, Innocent Onyewuenyi and Henry Olela.

Welding all these divisions together are the perspectives of Peter Bodunrin and Kwasi Wiredu. In the introduction to his 1985 edited volume Philosophy in Africa: Trends and Perspectives, Bodunrin created two broad schools covering all the subdivisions in both Oruka and Momoh, namely the Traditionalist and Modernist schools. While the former includes Africa’s rich culture and past in the mainstream of African philosophy, the latter excludes them. Kwasi Wiredu made the same type of division, specifically Traditional and Modernist, in his paper “On Defining African Philosophy” in C. S. Momoh’s (1989) edited volume. Also, A. F. Uduigwomen created two broad schools, namely the Universalists and the Particularists, in his “Philosophy and the Place of African Philosophy” (1995). These can be equated to Bodunrin’s Modernist and Traditionalist schools, respectively. The significance of Uduigwomen’s contribution to the Great Debate rests on the new school he evolved as a compromise between the Universalist and Particularist schools (1995/2009: 2-7). As Uduigwomen defines it, this Eclectic school accommodates discourses pertaining to African experiences, culture and world-view as parts of African philosophy, provided those discourses are critical, argumentative and rational. In other words, the so-called ethnophilosophy can comply with the analytic and argumentative standards that people like Bodunrin, Hountondji and Wiredu insist upon. Some later African philosophers revived Uduigwomen’s Eclectic school as a much more decisive approach to African philosophy (Kanu 2013: 275-87). It is the era dominated by Eclecticism and meta-philosophy that is tagged the “Later period” in the history of African philosophy. For perspicuity, therefore, the debate between these two broad schools shall be addressed as between the perspectives of the Traditionalist or Particularist and the Modernist or Universalist.

The reader will by now have understood the perspectives from which the individual philosophers of the middle period debated. Hence, when Richard Wright published his critical essay “Investigating African Philosophy” and Henri Maurier published his “Do We Have an African Philosophy?”, denying the existence of African philosophy, at least as yet, the reader understands why Lansana Keita’s “The African Philosophical Tradition”, C. S. Momoh’s “African Philosophy… Does It Exist?” and J. I. Omoregbe’s “African Philosophy: Yesterday and Today” were offered as critical responses. When Wright arrived at the conclusion that the problems surrounding the study of African philosophy were so great that others were effectively prevented from any worthwhile work until their resolution, and Henri Maurier responded to the question “Do we have an African Philosophy?” with “No! Not Yet!” (1984: 25), one would understand why Lansana Keita took it upon himself to provide concrete evidence that Africa had, and still has, a philosophical tradition. In his words:

It is the purpose of this paper to present evidence that a sufficiently firm literate philosophical tradition has existed in Africa since ancient times, and that this tradition is of sufficient intellectual sophistication to warrant serious analysis…it is rather…an attempt to offer a defensible idea of African philosophy. (1984: 58)

Keita went on in that paper to excavate intellectual resources to prove his case, but it was J. I. Omoregbe who tackled the demoters on every front. Of particular interest are his critical commentaries on the position of Kwasi Wiredu and others who share Wiredu’s opinion that what is called African philosophy is not philosophy but, at best, community thought. Omoregbe argues that the logic and method of African philosophy need not be the same as those of Western philosophy, to which the demoters cling. In his words:

It is not necessary to employ Aristotelian or the Russellian logic in this reflective activity before one can be deemed to be philosophizing. It is not necessary to carry out this reflective activity in the same way that the Western thinkers did. Ability to reason logically and coherently is an integral part of man’s rationality. The power of logical thinking is identical with the power of rationality. It is therefore false to say that people cannot think logically or reason coherently unless they employ Aristotle’s or Russell’s form of logic or even the Western-type argumentation. (1998: 4-5)

Omoregbe was addressing the position of most members of the Modernist school who believed that African philosophy must follow the pattern of Western philosophy if it were to exist. As he cautions:

Some people, trained in Western philosophy and its method, assert that there is no philosophy and no philosophizing outside the Western type of philosophy or the Western method of philosophizing (which they call “scientific” or “technical”). (1998: 5)

Philosophers like E. A. Ruch (in some of his earlier writings), Peter Bodunrin, C. B. Okolo and Robin Horton were direct recipients of Omoregbe’s criticism. Robin Horton’s “African Traditional Thought and Western Science” is a two-part essay that sought, in the long run, to expose the rational ineptitude of African thought. On the question of logic in African philosophy, Robin Horton’s “Traditional Thought and the Emerging African Philosophy Department: A Comment on the Current Debate” first stirred the hornet’s nest and was ably challenged by Gordon Hunnings’ “Logic, Language and Culture” as well as by Omoregbe’s “African Philosophy: Yesterday and Today”. Earlier, Meinrad Hebga’s “Logic in Africa” had done insightful ground-clearing on the matter. Later, C. S. Momoh’s “The Logic Question in African Philosophy”, Udo Etuk’s “The Possibility of an African Logic” and Jonathan C. Okeke’s “Why Can’t There Be an African Logic” made impressions. This logic question has been gathering new momentum in African philosophical discourse: Jonathan O. Chimakonam (2020) has put together a new edited collection that compiles some of the seminal essays in the logic question debate.

On the philosophical angle, Kwasi Wiredu’s “How Not to Compare African Traditional Thought with Western Thought” responded to the lopsided earlier effort of Robin Horton but ended up making its own criticisms of the status of African philosophy, which, for Wiredu, was yet to attain maturation. In his words: “[M]any traditional African institutions and cultural practices, such as the ones just mentioned, are based on superstition. By ‘superstition’ I mean a rationally unsupported belief in entities of any sort” (1976: 4-8 and 1995: 194). In his Philosophy and an African Culture, Wiredu was more pungent. He caricatured much of the discourse on African philosophy as community thought or folk thought unqualified to be called philosophy. For him, there had to be a practised distinction between “African philosophy as folk thought preserved in oral traditions and African philosophy as critical, individual reflection, using modern logical and conceptual techniques” (1980: 14). Olusegun Oladipo supports this in his Philosophy and the African Experience. As he puts it:

But this kind of attitude is mistaken. In Africa, we are engaged in the task of the improvement of “the condition of men”. There can be no successful execution of this task without a reasonable knowledge of, and control over, nature. But essential to the quest for knowledge of, and control over, nature are “logical, mathematical and analytical procedures” which are products of modern intellectual practices. The glorification of the “unanalytical cast of mind” which a conception of African philosophy as African folk thought encourages, would not avail us the opportunity of taking advantage of the theoretical and practical benefits offered by these intellectual procedures. It thus can only succeed in making the task of improving the condition of man in Africa a daunting one. (1996: 15)

Oladipo also shares similar thoughts in his The Idea of African Philosophy. African philosophy, for some of the Modernists, is practised in a debased sense, a position the Traditionalists consider opinionated. Later, E. A. Ruch and K. C. Anyanwu, in their African Philosophy: An Introduction to the Main Philosophical Trends in Contemporary Africa, attempted to excavate the philosophical elements in folklore and myth. C. S. Momoh’s “The Mythological Question in African Philosophy” and K. C. Anyanwu’s “Philosophical Significance of Myth and Symbol in Dogon World-View” further reinforced the position of the Traditionalists (cf. Momoh 1989 and Anyanwu 1989).

However, it took Paulin Hountondji, in his African Philosophy: Myth and Reality, to drive the final nail into the coffin. African philosophy, for him, must be done in the same frame as Western philosophy, principles, methodologies and all. K. C. Anyanwu conceded that Western philosophy is one of the challenges facing African philosophy, but argued that this only calls for the systematization of African philosophy, not its decimation. He made these arguments in his paper “The Problem of Method in African Philosophy”.

Other arguments set Greek standards for authentic African philosophy, as can be found in Odera Oruka’s “The Fundamental Principles in the Question of ‘African Philosophy’ (I)” and Hountondji’s “African Wisdom and Modern Philosophy”. These were readily met by Lansana Keita’s “African Philosophical Systems: A Rational Reconstruction”, J. Kinyongo’s “Philosophy in Africa: An Existence” and even P. K. Roy’s “African Theory of Knowledge”. For every step the Modernists took, the Traditionalists replied with two, an exchange that lingered until the early 1990s, when a phase of disillusionment began to set in and quell the debate. Actors on both fronts had by then begun to reach a new consciousness, realizing that a step had to be taken beyond the debate. Even Kwasi Wiredu, who had earlier justified the debate with his insistence that “without argument and clarification, there is strictly no philosophy” (1980: 47), had to admit that it was time to do something else. For him, African philosophers had to go beyond talking about African philosophy and get down to actually doing it.

It was with this new orientation, which emerged from the disillusionment of the protracted debate, that the later period of African philosophy was born in the 1980s. As the Igbo proverb has it, the music makers, almost unanimously, were changing the rhythm, and the dancers had to change their dance steps. One of the high points of the disillusionment was the emergence of the Eclectic school in the next period, called the Later Period of African philosophy.

c. Later Period

This period of African philosophy heralds the emergence of movements that can be called Critical Reconstructionism and Afro-Eclecticism. For the Deconstructionists of the middle period, the focus shifted from deconstruction to the reconstruction of African episteme in a universally integrated way; for the Eclectics, finding a reconcilable middle path between traditional African philosophy and modern African philosophy was paramount. The former thus advocate a shift away from both entrenched ethnophilosophy and the universalist hue, toward a reconstructed African episteme distinct from imposed Westernism and uncritical ethnophilosophy. So, both the Critical Reconstructionists and the Eclectics advocate one form of reconstruction or the other: the former desire a new episteme untainted by ethnophilosophy, while the latter press for the reconciliation of central and relevant ideals.

Not knowing how to proceed with this task was a telling problem for all advocates of critical reconstruction in African philosophy, such as V. Y. Mudimbe, Eboussi Boulaga, Olusegun Oladipo, Franz Crahay, Jay van Hook, Godwin Sogolo, and Marcien Towa, to name a few. At the dawn of the era, these African legionnaires pointed out, in different terms, that reconstructing African episteme was imperative. But more urgent was the need first to analyse the haggard philosophical structure patched into existence with the cement of perverse dialogues. It appeared inexorable to these scholars and others of the time that none of this could succeed outside the shadow of Westernism. For whatever one writes, if it is effectively free from ethnophilosophy, then it is either contained in Western discourse or, at the very least, proceeds from its logic. And if it is already contained in the Western narrative or proceeds from its logic, what then makes it African? This became something of a dead-end for this illustrious group, whose positions struggled to evolve.

Intuitively, almost every analyst knows that discussing what has been discussed in Western philosophy, or taking a cue from Western philosophy, does not absolutely negate or vitiate what is produced as African philosophy. But how is this to be effectively justified? This appears to be the Achilles heel of the Critical Reconstructionists of the later period in African philosophy. Their massive failure to go beyond recommendation and actually engage in reconstruction delayed their emergence as a school of thought in African philosophy. The diversionary trend which occurred at this point ensured that the later period, which began with the two rival camps of Critical Reconstructionists and Eclectics, ended with only the Eclectics left standing. Thus, dying in embryo, Critical Reconstructionism was absorbed into Eclecticism.

The campaign for Afro-reconstructionism had first emerged in the late 1980s in the writings of Peter Bodunrin, Kwasi Wiredu, V. Y. Mudimbe, and Lucius Outlaw, and much later in Godwin Sogolo, Olusegun Oladipo, and Jay van Hook, even though principals like Marcien Towa and Franz Crahay had hinted at it much earlier. The insights of the latter two never rang bells beyond the earshot of identity reconstruction, which was the echo of their time. Wiredu’s cry for conceptual decolonization and Hountondji’s call for the abandonment of the ship of ethnophilosophy were in the spirit of Afro-reconstructionism of the episteme. Yet none of the Afro-reconstructionists except Wiredu was able to truly chart a course for reconstruction. His was linguistic, even though the significance of his campaign was never truly appreciated. His 1998 work “Toward Decolonizing African Philosophy and Religion” was a clearer recapitulation of his works of the preceding years.

Beyond this modest line, no other reconstructionist crusader of the time went beyond deconstruction and problem identification. Almost spontaneously, Afro-reconstructionism evolved into Afro-eclecticism in the early 1990s when the emerging Critical Reconstructionism ran into a brick wall of inactivity. The argument seems to say: if it is not philosophically permissible to employ an alternative logic or methods different from those of the West, perhaps we can make do with a merger of the approaches identified in African philosophy following the deconstructions. These approaches are the various schools of thought, ranging from the ethnophilosophy, philosophic sagacity, ideological, universalist, and literary schools to the hermeneutic school, which were deconstructed into two broad approaches, namely the Traditionalist school and the Modernist school, also called the Particularist and the Universalist schools.

Eclectics, therefore, are those who think that the effective integration or complementation of the African native system and the Western system could produce a viable synthesis that is first African and then modern. Andrew Uduigwomen, the Nigerian philosopher, could be regarded as the founder of this school in African philosophy. In his 1995 work “Philosophy and the Place of African Philosophy,” he gave official birth to Afro-eclecticism. Identifying the Traditionalist and Modernist schools as the Particularist and Universalist schools, he created the eclectic school by carefully unifying their goals from the ruins of the deconstructed past.

Uduigwomen states that the eclectic school holds that an intellectual romance between the Universalist conception and the Particularist conception will give rise to an authentic African philosophy. The Universalist approach will provide the necessary analytic and conceptual framework for the Particularist school. Since, according to Uduigwomen, this framework cannot thrive in a vacuum, the Particularist approach will, in turn, supply the raw materials or indigenous data needed by the Universalist approach. From the submission of Uduigwomen above, one easily detects that eclecticism for him entails employing Western methods in analyzing African cultural paraphernalia.

However, Afro-Eclecticism is not without problems. The first is that Uduigwomen did not supply a yardstick for determining what is to be admitted and what must be left out of the corpus of African tradition. Not everything can meet the standard of genuine philosophy, nor should the philosophical selection be arbitrary. Hountondji, a persistent critic of traditionalist efforts, once called Tempels’ Bantu philosophy a sham: for him, it was not African or Bantu philosophy but Tempels’ philosophy dressed in African paraphernalia. This criticism could be extended to the vision of Afro-eclecticism. On the contrary, it could be argued that if Hountondji agrees that the synthesis contains even a little African paraphernalia, then it is something new and, in this respect, can claim the tag of African philosophy. However, it remains to be shown how philosophical that little African paraphernalia is.

Other notable Eclectics include Batholomew Abanuka, Udobata Onunwa, C. C. Ekwealor and, much later, Chris Ijiomah. Abanuka posits in his 1994 work that a veritable way to do authentic African philosophy would be to recognize the unity of individual things and, by extension, of theories in ontology, epistemology and ethics. There is a basic identity among these because they are connected and can be unified. Following C. S. Momoh (1985: 12), Abanuka went on in A History of African Philosophy to argue that synthesis should be the ultimate approach to doing African philosophy. This position is shared by Onunwa at a micro level: he says that realities in the African world-view are inter-connected and inter-dependent (1991: 66-71). Ekwealor and Ijiomah also believe in synthesis, noting that these realities are broadly dualistic, being physical and spiritual (cf. Ekwealor 1990: 30 and Ijiomah 2005: 76 and 84). So, it would be an anomaly to think of African philosophy as chiefly an exercise in analysis rather than synthesis. The ultimate methodological approach to African philosophy, therefore, has to reflect a unity of methods above all else.

Eclecticism survived into the contemporary period of African philosophy in conversational forms. Godfrey Ozumba and Jonathan Chimakonam on Njikoka philosophy, E. G. Ekwuru and later Innocent Egwutuorah on Afrizealotism, and even Innocent Asouzu on Ibuanyidanda ontology all represent, in a small way, forms of eclectic thinking. However, these theories are grouped under the New Era on account of the time of their emergence and their robust conversational structure.

The purest development of eclectic thinking in the later period is found in Pantaleon Iroegbu’s Uwa ontology. He posits uwa (worlds) as an abstract generic concept with fifteen connotations and six zones. Everything is uwa, is in uwa and can be known through uwa. For him, while the fifteen connotations are the different senses and aspects which the concept of uwa carries in African thought, the six zones are the spatio-temporal locations of the worlds in terms of their inhabitants. He adds that these six zones are dualistic, comprising the earthly and the spiritual, and that they are dynamic and mutually related. Thus, Iroegbu suggests that the approach to authentic African philosophy could consist in the conglomeration of uwa. This demonstrates a veritable eclectic method in African philosophy.

One of the major shortcomings of the Eclecticism of the later period is that it leads straight to applied philosophy. Following this approach makes it almost impossible for those who come after to do original and abstract philosophizing for its own sake. Eclectic theories and methods confine one to their internal dynamics, on the assumption that for a work to be regarded as authentic African philosophy, it must follow the rules of Eclecticism. The wider implication is that while creativity might blossom, innovation and originality are stifled. Because of pertinent problems such as these, further evolutions in African philosophy became inevitable. The Kenyan philosopher Odera Oruka had magnified the case for individual rather than group philosophizing, a case variously made earlier by Peter Bodunrin, Paulin Hountondji and Kwasi Wiredu, who further admonished African philosophers to stop talking about and start doing African philosophy. And V. Y. Mudimbe, in his The Invention of Africa…, suggested the development of an African conversational philosophy and the reinvention of Africa by its philosophers, to undermine the Africa that Europe invented. Lewis Gordon’s essay “African Philosophy’s Search for Identity: Existential Considerations of a Recent Effort”, and the works of Outlaw and Sogolo, suggest a craving for a new line of development for African philosophy: a new approach that is critical, engaging and universal while still being African. This, in particular, is the spirit of the conversational thinking that was beginning to grip African philosophers in the late 1990s, when Gordon wrote his paper. By the turn of the millennium, influences from these thoughts crystallized into a new mode of thinking, which then metamorphosed into conversational philosophy. The New Era in African philosophy was thus heralded, and its focus and orientation became conversational philosophy.

d. New Era

This period of African philosophy began in the late 1990s and took shape by the turn of the millennium. The orientation of this period is conversational philosophy, and conversationalism is the movement that thrives in it. The University of Calabar has emerged as the international headquarters of this new movement, hosting various workshops, colloquia and conferences in African philosophy under the auspices of a revolutionary forum called the Conversational/Calabar School of Philosophy. This forum can fairly be described as revolutionary for the radical way it turned the fortunes of African philosophy around. When different schools and actors were still groping about, the new school provided a completely new and authentically African approach to doing philosophy. Hinged on the triple principles of relationality (that variables necessarily interrelate), contextuality (that the relationships of variables occur in contexts) and complementarity (that seemingly opposed variables can complement rather than merely contradict each other), it formulated new methodologies (complementary reflection and the conversational method) and developed original systems to inaugurate a new era in the history of African philosophy.

The Calabar School begins its philosophical inquiry with the assumptions that (a) relationships are central to understanding the nature of reality, and (b) each of these relationships must be contextualized and studied as such. It also identifies border lines as the main problem of the 21st century. By border lines is meant the divisive lines we draw between realities in order to establish them as binary opposites. These lines lead to all marginal problems, such as racism, sexism, classism, creedoism, etc. To address these problems, the School raises two questions: Does difference amount to inferiority? And are opposites irreconcilable? Within the Calabar School of Philosophy, some prominent theories have emerged to respond to the border lines problem and the two questions that trail it. Theoretic contributions of the Calabar School include uwa ontology (Pantaleon Iroegbu), ibuanyidanda or complementary philosophy (Innocent Asouzu), harmonious monism (Chris Ijiomah), Njikoka philosophy (Godfrey Ozumba), conceptual mandelanism (Mesembe Edet), conversational thinking (Jonathan Chimakonam), consolation philosophy (Ada Agada), predeterministic historicity (Aribiah Attoe), and the personhood-based theory of right action (Amara Chimakonam). All these theories speak to the method of conversational philosophy, which is defined by the focus on studying the relationships existing between variables and by the active engagement between individual African philosophers in the creation of critical narratives therefrom, whether through engaging the elements of tradition, straightforwardly producing new thoughts, or engaging other individual thinkers. It thrives on incessant questioning geared toward the production of new concepts, opening up new vistas and sustaining the conversation.

Some of the African philosophers whose works follow this trajectory have, ironically, emerged in the Western world, notably in America. The American philosopher Jennifer Lisa Vest is one of them; another is Bruce Janz. These two, to name just a few, suggest that the highest purification of African philosophy is to be realized in conversational-styled philosophizing. However, it was the Nigerian philosopher Innocent Asouzu who went beyond the earlier botched attempt of Leopold Senghor and transcended the foundations of Pantaleon Iroegbu and C. S. Momoh to erect a new model of African philosophy that is conversational. The New Era, therefore, is the beginning of conversational philosophy.

Iroegbu, in his Metaphysics: The Kpim of Philosophy, inaugurated the reconstructive and conversational approach in African philosophy. He studied the relationships between the zones and connotations of uwa and, from these, engaged previous writers in a critical conversation out of which he produced his own thought (Uwa ontology), bearing the stamp of African tradition and thought systems but remarkably different in approach and method from ethnophilosophy. Frantz Fanon had highlighted the importance of sourcing African philosophical paraphernalia from African indigenous culture. This is corroborated in a way by Lucius Outlaw in his African Philosophy: Deconstructive and Reconstructive Challenges, in which Outlaw advocates the deconstruction of the European-invented Africa, to be replaced by a reconstruction done by conscientious Africans free from the grip of colonial mentality (1996: 11). Whereas Wiredu’s crusade sought to deconstruct the invented Africa, actors in the New Era of African philosophy seek to reconstruct it through the conversational approach.

Iroegbu and Momoh inaugurated this drive, but it is Asouzu who has made the most of it. His theory of Ibuanyidanda ontology, or complementary reflection, maintains that “to be” simply means to be in a mutual, complementary relationship (2007: 251-55). Every being, therefore, is a variable with the capacity to join a mutual interaction. In this capacity, every being is seen as a missing link of reality, serving other missing links in the network of realities. One immediately suspects the apparent contradiction that might arise from the fusion of two opposed variables when considered logically. But the logic of this theory is not two-valued classical logic; it is the three-valued system of logic developed in Africa (cf. Asouzu 2004, 2013; Ijiomah 2006, 2014, 2020; Chimakonam 2012, 2013, 2014a, 2017, 2018, 2019, 2020). In this system, the two standard values are sub-contraries rather than contradictories, thereby facilitating the effective complementation of variables. The possibility of the two standard values merging to form the third value in the complementary mode is what makes Ezumezu logic, one of the systems developed in the Calabar School, a powerful tool of thought.

A good number of African philosophers are tuning their works to the conversational style. Elsewhere in Africa, Michael Eze, Fainos Mangena, Bernard Matolino, Motsamai Molefe, Anthony Oyowe, Thaddeus Metz and Leonhard Praeg do this when they engage with the idea of ubuntu ethics and ontology, except that they fall short of studying relationships. Like all these scholars, the champions of the new conversational orientation are building a new edifice by reconstructing the deconstructed domain of thought of the later period of African philosophy. The central approach is conversation, as a relational methodology. By studying relationships and engaging other African philosophers, entities or traditions in creative struggle, they hope to reconstruct the deconstructed edifice of African philosophy. Hence, the New Era of African philosophy is safe from the retrogressive, perverse dialogues which characterized the early and middle periods.

Also, with the critical deconstruction that occurred in the latter part of the middle period and the attendant eclecticism that emerged in the later period, the stage was set for the formidable reconstructions and conversational encounters that marked the arrival of the New Era of African philosophy.

8. Conclusion

The development of African philosophy through the periods yields two vital conceptions of African philosophy: on the one hand, African philosophy is a critical engagement with tradition and with individual thinkers; on the other, it is a critical construction of futurity. When individual African philosophers engage tradition critically in order to ascertain its logical coherence and universal validity, they are doing African philosophy; and when they employ the tools of African logic in doing this, they are doing African philosophy. On the second conception, when African philosophers study relationships and engage in critical conversations with one another, constructing new thoughts on matters that concern Africa but are nonetheless universal and projected from African native thought systems, they are doing African philosophy. So, authentic African philosophy is not just a future project; it can also continue from the past.

On the whole, this essay discussed the journey of African philosophy from its beginnings, focusing on the criteria, schools and movements of the African philosophical tradition. The historical account of the periods of African philosophy ran from the early period through the middle and later periods to the new period, taking particular interest in robust individual contributions. Some questions still trail the development of African philosophy, among them: Must African philosophy be tailored to the pattern of Western philosophy, even on less definitive issues? If African philosophy is found to differ in approach from Western philosophy, so what? Are logical issues likely to play any major role in the structure and future of African philosophy? What is the future direction of African philosophy? Is the problem of the language of African philosophy a pregnant one? Would conversations in contemporary African philosophy totally eschew perverse dialogue? What shall be the rules of engagement in African philosophy? These questions are likely to shape the next lines of thought in African philosophy.

9. References and Further Reading

  • Abanuka, Batholomew. A History of African Philosophy. Enugu: Snaap Press, 2011.
    • An epochal discussion of African philosophy.
  • Abraham, William. The Mind of Africa. Chicago: University of Chicago Press, 1962.
    • A philosophical discussion of culture, African thought and colonial times.
  • Achebe, Chinua. Morning yet on Creation Day. London: Heinemann, 1975.
    • A philosophical treatment of African tradition and colonial burden.
  • Anyanwu, K. C. “Philosophical Significance of Myth and Symbol in Dogon World-view”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
    • A discussion of the philosophical elements in an African culture.
  • Akesson, Sam. K. “The Akan Concept of Soul”. African Affairs: The Journal of the Royal African Society, 64(257), 280-291.
    • A discourse on African metaphysics and philosophy of religion.
  • Akiode, Olajumoke. “African philosophy, its questions, the place and the role of women and its disconnect with its world”. African Philosophy and the Epistemic Marginalization of Women; edited by Jonathan O. Chimakonam and Louise du Toit. Routledge, 2018.
    • A critical and Afro-feminist discussion of the communalist orientation in African philosophy.
  • Aristotle. Metaphysica. Translated into English under the editorship of W. D. Ross. Vol. VIII, second edition. Oxford: Clarendon Press, 1926. Online edition, 982b.
    • A translation of Aristotle’s treatise on metaphysics.
  • Asouzu, Innocent I. Ibuanyidanda: New Complementary Ontology Beyond World-Immanentism, Ethnocentric Reduction and Impositions. Münster: Litverlag, 2007.
    • An African perspectival treatment of metaphysics or the theory of complementarity of beings.
  • Asouzu, Innocent I. The Method and Principles of Complementary Reflection. Calabar University Press, 2004.
    • A formulation of the method and theory of Complementary Reflection.
  • Asouzu, Innocent I. Ibuanyidanda (Complementary Reflection) and Some Basic Philosophical Problems in Africa Today: Sense Experience, “ihe mkpuchi anya” and the Super-maxim. Münster: Litverlag, 2013.
    • A further discussion on the theory, method and logic of complementary Reflection.
  • Attoe, Aribiah David. “Examining the Method and Praxis of Conversationalism,” in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2022.
    • A broad examination of the method of conversational thinking.
  • Babalola, Yai. “Theory and Practice in African Philosophy: The Poverty of Speculative Philosophy. A Review of the Work of P. Hountondji, M. Towa, et al.” Second Order, 2. 2. 1977.
    • A Critical review of Hountondji and Towa.
  • Bello, A. G. A. Philosophy and African Language. Quest: Philosophical Discussions. An International African journal of Philosophy, Vol 1, No 1, Pp5-12, 1987.
    • A critical engagement on the subject of language of philosophy.
  • Betts, Raymond. Assimilation and Association in French Colonial Territory 1890 to 1915. (First ed. 1961), Reprinted. Nebraska: University of Nebraska Press, 2005
    • A discourse on French colonial policies.
  • Bodunrin, Peter. “The Question of African Philosophy”. Richard Wright (ed) African Philosophy: An Introduction 3rd ed. Lanham: UPA, 1984.
    • A discourse on the nature and universal conception of African philosophy.
  • Cesaire Aime. Return to My Native Land. London: Penguin Books, 1969.
    • A presentation of colonial impact on the mind of the colonized.
  • Chimakonam, Jonathan O. “On the System of Conversational Thinking: An Overview”. Arụmarụka: Journal of Conversational Thinking, 1(1), 2021, pp. 1-45.
    • A detailed discussion of the main components of Conversational Thinking.
  • Chimakonam, Jonathan O., ed. Logic and African Philosophy: Seminal Essays in African Systems of Thought. Delaware: Vernon Press, 2020.
    • A collection of selected seminal papers on the African logic debate.
  • Chimakonam, Jonathan O. Ezumezu: A System of Logic for African Philosophy and Studies. Cham: Springer Nature, 2019.
    • A theoretic formulation of the system of Ezumezu logic.
  • Chimakonam, Jonathan O. “The ‘Demise’ of Philosophical Universalism and the Rise of Conversational Thinking in Contemporary African Philosophy”. Method, Substance, and the Future of African Philosophy, ed. Edwin Etieyibo. Cham: Springer Nature, 2018. 135-160.
    • A critique of philosophical universalism.
  • Chimakonam, Jonathan O. “Conversationalism as an Emerging Method of Thinking in and Beyond African Philosophy”. Acta Academica, Vol 2, 2017a, pp. 11-33.
    • A methodological presentation of Conversational thinking.
  • Chimakonam, Jonathan O. “What is Conversational Philosophy? A Prescription of a New Theory and Method of Philosophising in and Beyond African Philosophy”. Phronimon, Vol 18, 2017b, pp. 115-130.
    • An intercultural formulation of the Conversational method.
  • Chimakonam, Jonathan O. “The Criteria Question in African Philosophy: Escape from the Horns of Jingoism and Afrocentrism”. Atuolu Omalu: Some Unanswered Questions in Contemporary African Philosophy, ed. Jonathan O. Chimakonam. Lanham: University Press of America, 2015a. Pp. 101-123.
    • A discussion of the Criteria of African philosophy.
  • Chimakonam, Jonathan O. “Addressing Uduma’s Africanness of a Philosophy Question and Shifting the Paradigm from Metaphilosophy to Conversational Philosophy”. Filosofia Theoretica: Journal of African Philosophy, Culture and Religions, Vol 4, No 1, 2015b. 33-50.
    • An engagement with Uduma on his Africanness of philosophy question from a conversational viewpoint.
  • Chimakonam, Jonathan O. “Conversational Philosophy as a New School of Thought in African Philosophy: A Conversation with Bruce Janz on the Concept of Philosophical Space”. Confluence: Online Journal of World Philosophies, 2015c. 9-40.
    • A rejoinder to Bruce Janz on the concept of philosophical space.
  • Chimakonam, Jonathan O. “Transforming the African Philosophical Place through Conversations: An Inquiry into the Global Expansion of Thought (GET)”. South African Journal of Philosophy, Vol. 34, No. 4, 2015d. 462-479.
    • A formulation of some basic principles of conversational thinking.
  • Chimakonam, Jonathan O. “Ezumezu: A Variant of Three-valued Logic—Insights and Controversies”. Paper presented at the Annual Conference of the Philosophical Society of Southern Africa. Free State University, Bloemfontein, South Africa. Jan. 20-22, 2014.
    • An articulation of the structure of Ezumezu/African logic tradition.
  • Chimakonam, Jonathan O. “Principles of Indigenous African Logic: Toward Africa’s Development and Restoration of African Identity”. Paper presented at the 19th Annual Conference of the International Society for African Philosophy and Studies [ISAPS], ‘50 Years of OAU/AU: Revisiting the Questions of African Unity, Identity and Development’. Department of Philosophy, Nnamdi Azikiwe University, Awka. 27th – 29th May, 2013.
    • A presentation of the principles of Ezumezu/African logic tradition.
  • Chimakonam, Jonathan O. “Integrative Humanism: Extensions and Clarifications”. Integrative Humanism Journal, 3.1, 2013.
    • Further discussions on the theory of integrative humanism.
  • Chimakonam, Jonathan O. and Uti Ojah Egbai. “The Value of Conversational Thinking in Building a Decent World: The Perspective of Postcolonial Sub-Saharan Africa”. Dialogue and Universalism, Vol XXVI, No 4, 2016. 105-117.
  • Danquah, J. B. Gold Coast: Akan Laws and Customs and the Akim Abuakwa Constitution. London: G. Routledge & Sons, 1928.
    • A discourse on African philosophy of law.
  • Danquah, J. B. The Akan Doctrine of God: A Fragment of Gold Coast Ethics and Religion. London: Cass, 1944.
    • A discourse on African metaphysics, ethics and philosophy of religion.
  • Diop, Cheikh Anta. The African Origin of Civilization: Myth or Reality. Mercer Cook Transl. New York: Lawrence Hill & Company, 1974.
  • Du Bois, W. E. B. The Souls of Black Folk. (1903). New York: Bantam Classic edition, 1989.
    • A discourse on race and cultural imperialism.
  • Edeh, Emmanuel. Igbo Metaphysics. Chicago: Loyola University Press, 1985.
    • An Igbo-African discourse on the nature of being.
  • Egbai, Uti Ojah & Jonathan O. Chimakonam. “Why Conversational Thinking Could Be an Alternative Method for Intercultural Philosophy”. Journal of Intercultural Studies, 40:2, 2019. 172-189.
    • A discussion of conversational thinking as a method of intercultural philosophy.
  • Enyimba, Maduka. “On how to do African Philosophy in African Language: Some Objections and Extensions”. Philosophy Today, 66. 1, 2022. Pp. 25-37.
    • A discussion on how to do African philosophy using an African language.
  • Ekwealor, C. “The Igbo World-View: A General Survey”. The Humanities and All of Us. Emeka Oguegbu (ed) Onitsha: Watchword, 1990.
    • A philosophical presentation of Igbo life-world.
  • Etuk, Udo. “The Possibility of African logic”. The Third Way in African Philosophy, Olusegun Oladipo (ed). Ibadan: Hope Publications, 2002.
    • A discussion of the nature and possibility of African logic.
  • Fayemi, Ademola K. “African Philosophy in Search of Historiography”. Nokoko: Journal of Institute of African Studies. 6. 2017. 297-316.
    • A historiographical discussion of African philosophy.
  • Fanon, Frantz. The Wretched of the Earth. London: The Chaucer Press, 1965.
    • A critical discourse on race and colonialism.
  • Graness, Anke. “Writing the History of Philosophy in Africa: Where to Begin?”. Journal of African Cultural Studies. 28. 2. 2015. 132-146.
    • A Eurocentric historicization of African philosophy.
  • Graness, A., & Kresse, K. eds., Sagacious Reasoning: H. Odera Oruka in memoriam, Frankfurt: Peter Lang, 1997.
    • A collection of articles on Oruka’s Sage philosophy.
  • Griaule, Marcel. Conversations with Ogotemmêli. London: Oxford University Press for the International African Institute, 1965.
    • An interlocutory presentation of African philosophy.
  • Gyekye, Kwame. An Essay in African Philosophical Thought: The Akan Conceptual Scheme. Cambridge: Cambridge University Press, 1987.
    • A discussion of philosophy from an African cultural view point.
  • Hallen, Barry. A Short History of African Philosophy. Bloomington: Indiana University Press, 2002.
    • A presentation of the history of African philosophy from thematic and personality perspectives.
  • Hallen, B. and J. O. Sodipo. Knowledge, Belief and Witchcraft: Analytic Experiments in African Philosophy. Palo Alto, CA: Stanford University Press, 1997.
    • An analytic discourse of the universal nature of themes and terms in African philosophy.
  • Hebga, Meinrad. “Logic in Africa”. Philosophy Today, Vol. 11, No. 4 (1958).
    • A discourse on the structure of African logical tradition.
  • Hegel, Georg. Lectures on the Philosophy of World History. Cambridge: Cambridge University Press, reprint 1975.
    • Hegel’s discussion of his philosophy of world history.
  • Horton, Robin. “African Traditional Religion and Western Science” in Africa 37: 1 and 2, 1967.
    • A comparison of African and Western thought.
  • Horton, Robin. “Traditional Thought and the Emerging African Philosophy Department: A Comment on the Current Debate” in Second Order: An African Journal of Philosophy vol. III No. 1, 1977.
    • A logical critique of the idea of African philosophy.
  • Hountondji, Paulin. African Philosophy: Myth and Reality. Second Revised ed. Bloomington: Indiana University Press, 1996.
    • A critique of ethnophilosophy and an affirmation of African philosophy as a universal discourse.
  • Hunnings, Gordon. “Logic, Language and Culture”. Second Order: An African Journal of Philosophy, Vol.4, No.1. (1975).
    • A critique of classical logic and its laws in African thought and a suggestion of African logical tradition.
  • Ijiomah, Chris. “An Excavation of a Logic in African World-view”. African Journal of Religion, Culture and Society. 1. 1. (August, 2006): pp.29-35.
    • An extrapolation on a possible African logic tradition.
  • Iroegbu, Pantaleon. Metaphysics: The Kpim of Philosophy. Owerri: International Universities Press, 1995.
    • A conversational presentation of the theory of being in African philosophy.
  • Jacques, Tomaz. “Philosophy in Black: African Philosophy as a Negritude”. Discursos Postcoloniales Entorno Africa. CIEA7, No. 17, 7th Congress of African Studies.
    • A critique of the rigor of African philosophy as a discipline.
  • James, George. Stolen Legacy: Greek Philosophy is Stolen Egyptian Philosophy. New York: Philosophical Library, 1954.
    • A philosophical discourse on race, culture, imperialism and colonial deceit.
  • Jahn, Janheinz. Muntu: An Outline of Neo-African Culture. New York: Grove Press, 1961.
    • A presentation of a new African culture as a synthesis and as philosophically relevant and rational.
  • Jewsiewicki, Bogumil. “African Historical Studies: Academic Knowledge as ‘usable past’ and Radical Scholarship”. The African Studies Review. Vol. 32. No. 3, December, 1989.
    • A discourse on the value of African tradition to modern scholarship.
  • Kanu, Ikechukwu. ‘Trends in African Philosophy: A Case for Eclectism.’ Filosofia Theoretica: A Journal of African Philosophy, Culture and Religion, 2(1), 2013. pp. 275-287.
    • A survey of the trends in African philosophy with a focus on Eclectism.
  • Keita, Lansana. “The African Philosophical Tradition”. Wright, Richard A., ed. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
    • An examination of African philosophical heritage.
  • Keita, Lansana. “Contemporary African Philosophy: The Search for a Method”. Tsanay Serequeberhan (ed) African Philosophy: The Essential Readings. New York: Paragon House, 1991.
    • An analysis of methodological issues in and basis of African philosophy.
  • Kezilahabi, Euphrase. African Philosophy and the Problem of Literary Interpretation. Unpublished Ph.D Dissertation. University of Wisconsin, Madison, 1985.
    • A doctoral dissertation on the problem of literary interpretation in African philosophy.
  • Lambert, Michael. “From Citizenship to Négritude: Making a Difference in Elite Ideologies of Colonized Francophone West Africa”. Comparative Studies in Society and History, Vol. 35, No. 2. (Apr., 1993), pp. 239–262.
    • A discourse on the problems of colonial policies in Francophone West Africa.
  • Gordon, Lewis. “African Philosophy’s Search for Identity: Existential Considerations of a Recent Effort”. The CLR James Journal, Winter 1997, pp. 98-117.
    • A survey of the identity crisis of African philosophical tradition.
  • Apostel, Leo. African Philosophy. Belgium: Scientific Publishers, 1981.
    • An Afrocentrist presentation of African philosophy.
  • Levy-Bruhl, Lucien. Primitive Mentality. Paris: University of France Press, 1947.
    • A Eurocentrist presentation of non-European world.
  • Makinde, M.A. “Philosophy in Africa”. The Substance of African Philosophy. C.S. Momoh, ed. Auchi: African Philosophy Projects’ Publications, 2000.
    • A discourse on the practice and relevance of philosophy in Africa.
  • Mangena, Fainos. “The Fallacy of Exclusion and the Promise of Conversational Philosophy in Africa”, in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2022.
    • A discourse on the significance of conversational thinking.
  • Masolo, D. A. African Philosophy in Search of Identity. Bloomington: Indiana University Press, 1994.
    • An individual-based presentation of the history of African philosophy.
  • Maurier, Henri. “Do We have an African Philosophy?”. Wright, Richard A., ed. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
    • A critique of Ethnophilosophy as authentic African philosophy.
  • Mbiti, John. African Religions and Philosophy. London: Heinemann, 1969.
    • A discourse on African philosophical culture.
  • Momoh, Campbell. “Canons of African Philosophy”. Paper presented at the 6th Congress of the Nigerian Philosophical Association. University of Ife, July 31- August 3, 1985.
    • A presentation of the major schools of thought in African philosophy.
  • Momoh, Campbell, ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
    • A collection of essays on different issues in African philosophy.
  • Momoh, Campbell. “The Logic Question in African Philosophy”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
    • A defense of the thesis of a possible African logic tradition.
  • Mosima, P. M. Philosophic Sagacity and Intercultural Philosophy: Beyond Henry Odera Oruka, Leiden/Tilburg: African Studies Collection 62/Tilburg University, 2016.
  • Mudimbe, V. Y. The Invention of Africa: Gnosis, Philosophy and the Order of Knowledge (African Systems of Thought). Bloomington: Indiana University Press, 1988.
    • A discourse on culture, race, Eurocentrism and modern Africa as an invention of Western scholarship.
  • Nkrumah, Kwame. I Speak of Freedom: A Statement of African Ideology. London: Mercury Books, 1961.
    • A discourse on political ideology for Africa.
  • Nkrumah, Kwame. Towards Colonial Freedom. London: Heinemann. (First published in 1945), 1962.
    • A discussion of colonialism and its negative impact on Africa.
  • Nwala, Uzodinma. Igbo Philosophy. London: Lantern Books, 1985.
    • An Afrocentrist presentation of Igbo-African philosophical culture.
  • Nyerere, Julius. Freedom and Unity. Dar es Salaam: Oxford University Press, 1986.
    • A discussion of a postcolonial Africa that should thrive on freedom and unity.
  • Nyerere, Julius. Freedom and Socialism. Dar es Salaam: Oxford University Press, 1986.
    • A discourse on the fundamental traits of African socialism.
  • Nyerere, Julius. Ujamaa—Essays on Socialism. Dar es Salaam, Tanzania: Oxford University Press, 1986.
    • A collection of essays detailing the characteristics of African brand of socialism.
  • Obenga, Theophile. “Egypt: Ancient History of African Philosophy”. A Companion to African Philosophy. Ed. Kwasi Wiredu. Malden: Blackwell Publishing, 2004.
    • An Afrocentrist historicization of African philosophy.
  • Oelofsen, Rianna. “Women and ubuntu: Does ubuntu condone the subordination of women?” African Philosophy and the Epistemic Marginalization of Women; edited by Jonathan O. Chimakonam and Louise du Toit. Routledge, 2018.
    • A feminist discourse on ubuntu.
  • Ogbalu, F.C. Ilu Igbo: The Book of Igbo Proverbs. Onitsha: University Publishing Company, 1965.
    • A philosophical presentation of Igbo-African proverbs.
  • Ogbonnaya, L. Uchenna. “How Conversational Philosophy Profits from the Particularist and the Universalist Agenda”, in Chimakonam Jonathan O., E Etieyibo, and I Odimegwu (eds). Essays on Contemporary Issues in African Philosophy. Cham: Springer, 2023.
    • A conversational perspective on particularism and universalism.
  • Oguejiofor, J. Obi. “African Philosophy: The State of its Historiography”. Diogenes. 59. 3-4, 2014. 139-148.
    • A Euro-historical adaptation of African philosophy.
  • Ogunmodede, Francis. 1998. ‘African philosophy in African language.’ West African Journal of Philosophical Studies, Vol 1. Pp3-26.
    • A discourse on doing African philosophy in African languages.
  • Okeke, J. Chimakonam. “Why Can’t There be an African logic?”. Journal of Integrative Humanism. 1. 2. (2011). 141-152.
    • A defense of a possible African logic tradition and a critique of critics.
  • Okere, Theophilus. “The Relation between Culture and Philosophy,” in Uche 2, 1976.
    • A discourse on the differences and similarities between culture and philosophy.
  • Okere, Theophilus. African Philosophy: A Historico-Hermeneutical Investigation of the Conditions of Its Possibility. Lanham, Md.: University Press of America, 1983.
    • A hermeneutical discourse on the basis of African philosophy.
  • Okolo, Chukwudum B. Problems of African Philosophy. Enugu: Cecta Nigeria Press, 1990.
    • An x-ray of the major hindrances facing African philosophy as a discipline.
  • Okoro, C. M. African Philosophy: Question and Debate, A Historical Study. Enugu: Paqon Press, 2004.
    • A historical presentation of the great debate in African philosophy.
  • Oladipo, Olusegun. (ed) The Third Way in African Philosophy. Ibadan: Hope, 2002.
    • A collection of essays on the topical issues in African philosophy of the time.
  • Oladipo, Olusegun. Core Issues in African Philosophy. Ibadan: Hope Publications, 2006.
    • A discussion of central issues of African philosophy.
  • Olela, Henry. “The African Foundations of Greek Philosophy”. Wright, Richard A., ed. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
    • An Afrocentrist presentation of African philosophy as the source of Greek philosophy.
  • Oluwole, Sophie. Philosophy and Oral Tradition. Lagos: Ark Publications, 1999.
    • A cultural excavationist programme in African philosophy.
  • Omoregbe, Joseph. “African Philosophy: Yesterday and Today”. African Philosophy: An Anthology. Emmanuel Eze (ed.), Massachusetts: Blackwell, 1998.
    • A survey of major issues in the debate and a critique of the Universalist school.
  • Onunwa, Udobata. “Humanism: The Bedrock of African Traditional Religion and Culture”. Religious Humanism. Vol. XXV, No. 2, Spring 1991, Pp 66 – 71.
    • A presentation of Humanism as the basis for African religion and culture.
  • Onyewuenyi, Innocent. African Origin of Greek Philosophy: An Exercise in Afrocentrism. Enugu: SNAAP Press, 1993.
    • An Afrocentrist presentation of philosophy as a child of African thought.
  • Oruka, H. Odera. “The Fundamental Principles in the Question of ‘African Philosophy,’ I.” Second Order 4, no. 1: 44–55, 1975.
    • A discussion of the main issues in the debate on African philosophy.
  • Oruka, H. Odera. “Four Trends in African Philosophy.” In Philosophy in the Present Situation of Africa, edited by Alwin Diemer. Wiesbaden, Germany: Franz Steiner Verlag, 1981. (First published in 1978.)
    • A breakdown of the major schools of thought in the debate on African philosophy.
  • Oruka, H. Odera. Sage Philosophy: Indigenous Thinkers and the Modern Debate on African Philosophy. Leiden: E. J. Brill. 1990.
    • A survey of the journey so far in African philosophy and the identification of two additional schools of thought.
  • Osuagwu, I. Maduakonam. African Historical Reconsideration: A Methodological Option for African Studies, the North African Case of the Ancient History of Philosophy; Amamihe Lecture 1. Owerri: Amamihe, 1999.
    • A Euro-historical adaptation of African philosophy.
  • Outlaw, Lucius. “African ‘Philosophy’? Deconstructive and Reconstructive Challenges.” In his On Race and Philosophy. New York and London: Routledge. 1996.
    • A presentation of African philosophy as a tool for cultural renaissance.
  • Plato. Theætetus, 155d, p. 37.
    • Contains Plato’s theory of knowledge.
  • Presbey, G.M. “Who Counts as a Sage? Problems in the Future Implementation of Sage Philosophy,” in: Quest – Philosophical Discussions: An International African Journal of Philosophy/Revue Africaine Internationale de Philosophie, Vol. XI, No. 1, 1997. 2:52-66.
  • Rettova, Alena. Afrophone Philosophies: Reality and Challenge. Zdenek Susa Stredokluky, 2007.
    • A Eurocentric discussion on Afrophone philosophies.
  • Ruch, E. A. and Anyawnu, K. C. African Philosophy: An Introduction to the Main Philosophical Trends in Contemporary Africa. Rome: Catholic Book Agency, 1981.
    • A discussion on racialism, slavery, colonialism and their influence on the emergence of African philosophy, in addition to current issues in the discipline.
  • Sogolo, Godwin. Foundations of African Philosophy. Ibadan: Ibadan University Press, 1993.
    • A discussion of the logical, epistemological and metaphysical grounds for African philosophy project.
  • Sogolo, Godwin. 1990. Options in African philosophy. Philosophy. 65. 251: 39-52.
    • A critical and eclectic proposal in African philosophy.
  • Tangwa, Godfrey. ‘Revisiting the Language Question in African Philosophy’. The Palgrave Handbook of African Philosophy. Eds. Adesinya Afolayan and Toyin Falola. Pp 129-140. New York: Springer Nature, 2017.
    • A discourse on the language problem in African philosophy.
  • Tavernaro-Haidarian, Leyla. “Deliberative Epistemology: Towards an Ubuntu-based Epistemology that Accounts for a Prior Knowledge and Objective Truth,” South African Journal of Philosophy. 37(2), 229-242, 2018.
    • A conversational perspective on ubuntu-based epistemology.
  • Tempels, Placide. Bantu Philosophy. Paris: Présence Africaine.
    • A theorization on Bantu philosophy.
  • Towa, Marcien. “Conditions for the Affirmation of a Modern African Philosophical Thought”. Tsanay Serequeberhan (ed) African Philosophy: The Essential Readings. New York: Paragon House, 1991.
    • A presentation of important factors required for the emergence of African philosophy as a discipline.
  • Uduagwu, Chukwueloka. “Doing Philosophy in the African Place: A Perspective on the Language Challenge”. Jonathan Chimakonam et al (eds), Essays on Contemporary Issues in African Philosophy. Cham, Springer, 2023.
    • A discourse on the language problem in African philosophy.
  • Uduigwomen, F. Andrew. “Philosophy and the Place of African Philosophy”. A. F. Uduigwomen ed. From Footmarks to Landmarks on African Philosophy. 2nd Ed. (First published 1995.) Lagos: O. O. P., 2009.
    • A collection of essays on different issues in African philosophy.
  • Uduma, Orji. “Can There be an African Logic?” in A. F. Uduigwomen (ed.) From Footmarks to Landmarks on African Philosophy. Lagos: O. O. P. Ltd, 2009.
    • A critique of a culture-bound logic in African thought.
  • Uduma, Orji. “Between Universalism and Cultural Identity: Revisiting the Motivation for an African Logic”. A paper delivered at an International Conference of the Council for Research in Values and Philosophy (Washington, D.C., USA) at the University of Cape Coast, Cape Coast, Ghana, 3–5 February, 2010.
    • A critique of a culture-bound logic in African thought and a presentation of logic as universal.
  • Van Hook, Jay M. “African Philosophy and the Universalist Thesis”. Metaphilosophy. 28. 4: 385-396, 1997.
    • A critique of the universalist thesis in African philosophy.
  • Van Hook, Jay M. The Universalist Thesis Revisited: What Direction for African Philosophy in the New Millennium? In Thought and Practice in African Philosophy, ed. G. Presbey, D. Smith, P. Abuya and O. Nyarwath, 87-93. Nairobi: Konrad Adenauer Stiftung, 2002.
    • A further critique of the universalist thesis in African philosophy.
  • Vest, J. L. 2009. ‘Perverse and necessary dialogues in African philosophy’, in: Thought and practice: a journal of the philosophical association of Kenya. New series, Vol.1 No.2, December, pp. 1-23.
    • A discussion of the proper direction and focus of African philosophy in the new age.
  • Wamba-ia Wamba, E. “Philosophy in Africa: Challenges of the African Philosopher,” in African Philosophy: The Essential Readings. New York: Paragon House, 1991.
    • A discussion of the technical problems of African philosophy as a discipline.
  • wa Thiong’o, Ngugi. Decolonizing the Mind: The Politics of Language in African Literature. London: J. Curry and Portsmouth, N. H: Heinemann, 1986.
    • A discourse on Eurocentrism, Africa’s decolonization and cultural imperialism.
  • Winch, Peter. “Understanding a Primitive Society”. American Philosophical Quarterly. No. 1, 1964.
    • A discussion and a defense of the rationality of primitive people.
  • Wiredu, Kwasi. Philosophy and an African Culture. Cambridge and New York: Cambridge University Press, 1980.
    • A discussion of the philosophical elements in an African culture and a call for a universalizable episteme for African philosophy.
  • Wiredu, Kwasi. “How Not to Compare African Thought with Western Thought.” Ch’Indaba no. 2 (July–December 1976): 4–8. Reprinted in African Philosophy: An Introduction, edited by R. Wright. Washington, D.C.: University Press of America, 1977; and in African Philosophy: Selected Readings, edited by Albert G. Mosley. Englewood Cliffs, N.J.: Prentice Hall, 1995.
    • A critique of Robin Horton’s comparison of African and Western thought.
  • Wiredu, Kwasi. “Our Problem of Knowledge: Brief Reflections on Knowledge and Development in Africa”. African Philosophy as Cultural Inquiry. Ivan Karp and D. A. Masolo (eds). Bloomington, Indiana: Indiana University Press, 2000.
    • A discussion on the role of knowledge in the development of Africa.
  • Wiredu, Kwasi. Cultural Universals and Particulars: An African Perspective. Bloomington: Indiana University Press, 1996.
    • A collection of essays on sundry philosophical issues pertaining to comparative and cross-cultural philosophy.
  • Wiredu, Kwasi. Conceptual Decolonization in African Philosophy. Ed. Olusegun Oladipo. Ibadan: Hope Publications, 1995.
    • A discussion of the importance and relevance of the theory of conceptual decolonization in African philosophy.
  • Wiredu, Kwasi. “On Defining African Philosophy”. C. S. Momoh ed. The Substance of African Philosophy. Auchi: APP Publications, 1989.
    • A discourse on the parameters of the discipline of African philosophy.
  • Wright, Richard A., ed. “Investigating African Philosophy”. African Philosophy: An Introduction. 3rd ed. Lanham, Md.: University Press of America, 1984.
    • A critique of the existence of African philosophy as a discipline.

 

Author Information

Jonathan O. Chimakonam
Email: jchimakonam@unical.edu.ng
University of Calabar
Nigeria

Spinoza: Free Will and Freedom

Baruch Spinoza (1632-1677) was a Dutch Jewish rationalist philosopher who is most famous for his Ethics and Theological-Political Treatise. Although influenced by Stoicism, Maimonides, Machiavelli, Descartes, and Hobbes, among others, he developed distinct and innovative positions on a number of issues in metaphysics, epistemology, ethics, politics, biblical hermeneutics, and theology. He is also known as a pivotal figure in the development of Enlightenment thinking. Some of his most notorious claims and most radical views surround issues concerning determinism and free will. Spinoza was an adamant determinist, and he denied the existence of free will. This led to much controversy concerning his philosophy in subsequent centuries. He was, in fact, one of the first modern philosophers to both defend determinism and deny free will. Nevertheless, his philosophy champions freedom, both ethically and politically. It provides an ethics without free will but one that leads to freedom, virtue, and happiness. Prima facie, such an ethical project might seem paradoxical, but Spinoza distinguished between free will, which is an illusion, and freedom, which can be achieved. A thorough familiarity with Spinoza’s views on determinism, free will, freedom, and moral responsibility resolves this apparent paradox of an ethics without free will.

Table of Contents

  1. Spinoza’s Determinism
  2. Spinoza on Free Will
  3. Spinoza on Human Freedom
  4. The Free Man and the Way to Freedom
  5. Spinoza on Moral Responsibility
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Spinoza’s Determinism

Contrary to many of his predecessors and contemporaries, Spinoza is an adamant and notorious determinist. For him, nature is thoroughly determined. While there are many different varieties of determinism, Spinoza is committed to causal determinism, or what is sometimes called nomological determinism. Some commentators argue that Spinoza is also a necessitarian, that is, that he holds that the actual world is the only one possible (see IP33; for an overview, see Garrett 1991). In any case, as a causal determinist, Spinoza certainly argues that events are determined by previous events or causes (which are themselves determined by still earlier events or causes, and so on) following the laws of nature. Spinoza clearly expresses that all events are determined by previous causes:

Every singular thing, or anything which is finite and has a determinate existence, can neither exist nor be determined to produce an effect unless it is determined to exist and produce an effect by another cause, which is also finite and has a determinate existence; and again, this cause can neither exist nor be determined to produce an effect unless it is determined to exist and produce an effect by another, which is also finite and has a determinate existence, and so on, to infinity. (IP28)

Here, Spinoza is arguing for an infinite chain of finite causes for any given effect, or, as he puts it, any singular thing which exists. Spinoza demonstrates the above proposition in his (in)famous geometrical method, which requires starting with definitions and axioms, demonstrating propositions from them, and building upon previous demonstrations. His commitment to causal determinism is already displayed in Axiom 3 of Part I: “From a given determinate cause, the effect follows necessarily; and conversely, if there is no determinate cause, it is impossible for an effect to follow.” Surprisingly, Spinoza uses only this axiom to demonstrate the previous proposition, IP27, that “a thing which has been determined by God to produce an effect cannot render itself undetermined.” His demonstration simply refers to Axiom 3: “This proposition is evident from A3.” So, it is clear that Spinoza thinks that every effect has a cause, but why he holds this view is not yet clear.

Understanding why Spinoza is committed to causal determinism requires an examination of his larger philosophical commitments. First, Spinoza is a rationalist, and as a rationalist, he holds that everything is, in principle, explainable or intelligible. This is to say that everything that exists and everything that occurs have a reason to be or to happen, and that this reason can be known and understood. This is known as the principle of sufficient reason, after Leibniz’s formulation. Secondly, Spinoza is committed to naturalism, at least a kind of naturalism that argues that there are no explanations or causes outside of nature. This is to say, there are no supernatural causes, and all events can be explained naturally with respect to nature and its laws. Spinoza’s rationalism and naturalism are in evidence when he argues for the necessary existence of the one infinite substance (IP11), God or Nature (Deus sive Natura), which is the immanent (IP18) and efficient cause (IP25) of all things.

The existence of everything cannot be a brute fact for Spinoza, nor does it make sense to him to postpone the reason for existence by referring to a personal God as the creator of all. Rather, he argues that the one substance (“God” or “Nature” in Spinoza’s terminology, but in the following just “God,” with the caveat that Nature is implied) is the cause of itself and necessarily exists. “God, or a substance consisting of infinite attributes, each of which expresses eternal and infinite essence, necessarily exists” (IP11). In his alternate demonstration for this proposition, he explicitly uses the principle of sufficient reason: “for each thing there must be assigned a cause, or reason, both for its existence and for its nonexistence” (417). The one substance, or God, is the cause of itself, or, as he defines it, “that whose essence involves existence, or that whose nature cannot be conceived except as existing” (ID1).

This necessary existence of God entails the necessity by which every individual thing is determined. This is because Spinoza is committed to substance monism, or the position that there is only one substance. This is markedly different from his rationalist predecessor, Descartes, who, though also arguing that only God is properly speaking an independent substance (Principles I, 51), held that there were indefinitely many substances of two kinds: bodies, or res extensa, and thoughts, or res cogitantes (Principles I, 52). Spinoza, though, defines God as one substance consisting of infinite attributes. An attribute is “what the intellect perceives of a substance as constituting its essence” (ID4). By “infinite” here, Spinoza refers primarily to a totality rather than a numerical infinity, so that the one substance has all possible attributes. Spinoza goes on to indicate that the human intellect knows two attributes, namely extension and thought (IIA5). Besides the one substance and its attributes, Spinoza’s ontology includes what he calls modes. Modes are defined as “affections of a substance or that which is in another thing through which it is also conceived” (ID5). Furthermore, Spinoza distinguishes between infinite modes (IP23) and finite modes, the latter generally taken to be all the singular finite things, such as apples, books, or dogs, as well as ideas of these things, thus also the human body and its mind.

There is much scholarly controversy about the question of how substance, attributes, and infinite and finite modes all relate to each other. Of particular contention is the relation between the finite modes and the one infinite substance. A more traditional interpretation of Spinoza’s substance monism takes finite modes to be parts of God, such that they are properties which inhere in the one substance, with the implication of some variety of pantheism, or the doctrine that everything is God. Edwin Curley, however, influentially argues that finite modes should be taken merely as causally and logically dependent on the one infinite substance, that is, God, which itself is causally independent, following Spinoza’s argument of substance as cause of itself or involving necessary existence (IP1-IP11). According to this interpretation, God is identified with its attributes (extension and thought) as the most general structural features of the universe with infinite modes, following necessarily from the attributes and expressing the necessary general laws of nature (for instance, Spinoza identifies the immediate infinite mode of the attribute of extension with motion and rest in Letter 64, 439). On this causal-nomological interpretation of substance, God is the cause of all things but should only be identified with the most general features of the universe rather than with everything existing, for instance the finite modes (Curley 1969, esp. 44-81).

There is, however, resistance to this causal interpretation of the relation between substance and finite modes (see Bennett 1984, 92-110; 1991; Nadler 2008). Jonathan Bennett argues against Curley’s interpretation—returning to the more traditional relation of modes as properties that inhere in a substance—by taking Spinoza’s proposition IP15 more literally: “Whatever is, is in God, and nothing can be, or be conceived without God.” Bennett identifies the finite modes as ways in which the attributes are expressed adjectivally (that is, this region of extension is muddy), keeping closer to Spinoza’s use of “mode” as “affections of God’s attributes… by which God’s attributes are expressed in a specific and determinate way” (IP25C). But as Curley points out, Bennett’s interpretation has some difficulty explaining the precise relation of finite modes to infinite modes and attributes, the latter having an immediate causal relation to God (Curley 1991, 49). Leaving aside the larger interpretive controversies, the issue here is that God and its attributes, being infinite and eternal, cannot be the direct or proximate cause of finite modes, though God is the cause of everything, including finite modes. Spinoza writes: “From the necessity of the divine nature there must follow infinitely many things in infinitely many modes (that is, everything that can fall under an infinite intellect)” (IP16). For this reason, Spinoza’s argument for determinism seems to recognize an infinite chain of finite causes and a finite chain of infinite causes. The former has already been referred to when Spinoza argues in IP28 that any particular finite thing is determined to exist or produce an effect by another finite cause “and so on, ad infinitum.” Indeed, in his demonstration, Spinoza states that God, being infinite and eternal, could not be the proximate cause of finite things. Further, in the Scholium to this proposition, Spinoza explains that God is the proximate cause of only those things produced immediately by him, which in turn are infinite and eternal (eternal here indicating necessity, as in IP10S, 416). That is, Spinoza does indeed argue, in IP21-P23, that that which follows from the absolute nature of any of God’s attributes must likewise be infinite and eternal.

Some commentators interpret God as being the proximate cause (through its attributes) of the infinite modes, which are understood as part of the finite chain of infinite causes associated with the most basic laws of nature. While Spinoza does not write directly of the “laws of nature” in this discussion in the Ethics, he does so in the Theological-Political Treatise (TTP) in his discussion of miracles. Here Spinoza argues that nothing happens outside of the universal laws of nature, which for him are the same as God’s will and decree. Spinoza writes “But since nothing is necessarily true except by the divine decree alone, it follows quite clearly that the universal laws of nature are nothing but decrees of God, which follow from the necessity and perfection of the divine nature” (TTP VI.9). He goes on to argue that if a miracle were conceived as an occurrence contrary to the universal laws of nature, it would be contradictory in itself and mean that God was acting contrary to his own nature. From this passage, it is clear that Spinoza equates what follows from God’s nature with the universal laws of nature, which are eternal and immutable. For this reason, God’s attributes and the infinite modes are often identified with the most general features of the universe, expressing the laws of nature.

We tend to use “laws of nature” when referring to physical laws. Spinoza, however, holds that God can be understood under the attribute of extension or the attribute of thought, that is, God is both extended (IIP2) and thinking (IIP1). For this reason, laws of nature exist not only in the attribute of extension but also in that of thought. Bodies and ideas both follow the laws of nature. Bodies are finite modes of extension, while ideas are finite modes of thought. Accordingly, he argues that “the order and connection of ideas are the same as the order and connection of things” (IIP7). This is Spinoza’s famous “parallelism,” though he never uses this term. While there is much controversy concerning how to interpret this identity, Spinoza indicates that the extended thing and the thinking thing are one and the same thing expressed under two different attributes or conceived from two different perspectives (IIP7S). For this reason, a body, or an extended mode, and its correlating idea, or a thinking mode, are one and the same thing conceived from different perspectives, namely through the attributes of extension or thought.

This claim has two significant consequences. First, when Spinoza indicates that each singular finite thing is determined to exist and to produce an effect by another singular finite thing ad infinitum, this applies to ideas as well as bodies. For this reason, just as bodies and their motion or rest are the cause of other bodies and their motion or rest—in accordance with universal laws of nature, namely the laws of physics—ideas are the cause of other ideas (IIP9) in accordance with universal laws of nature, presumably psychological laws. Second, being one and the same thing, bodies and ideas do not interact causally. That is to say, the order and connection of ideas are one and the same as the order and connection of bodies, but ideas cannot bring about the motion or rest of bodies, nor can bodies bring about the thinking of ideas. Spinoza writes “The body cannot determine the mind to thinking, and the mind cannot determine the body to motion, to rest, or to anything else if there is anything else” (IIIP2). It is clear, then, that both bodies and ideas are causally determined within their respective attributes and that there is no interaction between them. This will have a significant consequence for Spinoza’s understanding of free will versus freedom.

The most challenging consequence of these positions is Spinoza’s blunt denial of contingency in IP29, where he states: “In nature there is nothing contingent, but all things have been determined from the necessity of the divine nature to exist and produce an effect in a certain way.” To recall, finite modes of the one infinite substance (in the case of the attributes of extension or thought, bodies and ideas) are determined to exist by a finite cause (that is, another body or idea), which is further determined to exist by another cause, and so on to infinity. Furthermore, though the connection between singular things and God (conceived as the one eternal, infinite substance) is complex, ultimately, God is the cause of everything that exists, and everything is determined according to the universal and necessary laws of nature expressed by the infinite modes and the other fundamental features of the attributes of God, as mentioned above. In other words, for Spinoza, every event is necessitated by previous causes and the laws of nature.

2. Spinoza on Free Will

Because he is a determinist, Spinoza denies the existence of free will; in contemporary terms, this makes him an incompatibilist as well as a determinist. Contemporary discussions of free will center mostly on the question of whether free will, and thereby moral responsibility, is compatible with determinism. There are two dominant positions. Incompatibilism claims that free will and/or moral responsibility are incompatible with determinism because the latter prohibits free choice and thus accountability. Some incompatibilists, namely libertarians, claim that, because human beings do have free will and we hold each other accountable for our actions, the world is not thoroughly determined. Other incompatibilists hold that if the world is determined, then there is no free will, though they may remain agnostic about whether the world is determined. The opposing camp, compatibilism, claims that free will and/or moral responsibility are compatible with determinism, and compatibilists too can remain agnostic about whether the world is determined.

Spinoza’s position cannot easily be sorted into this scheme because he distinguishes between free will (libera voluntas) and freedom (libertas). It is very clear that he denies free will because of his determinism: “In the mind there is no absolute, or free, will, but the mind is determined to will this or that by a cause which is also determined by another, and this again by another, and so to infinity” (IIP48). His denial of free will is also, however, a consequence of his conception of the will. In the Scholium to IIP48, Spinoza explains that by “will” he means “a faculty of affirming or denying and not desire” (IIP48S, 484). That is to say, Spinoza here wants to emphasize will as a cognitive power rather than a conative one. In this respect, he seems to be following Descartes, who also understands the will as a faculty of affirming and denying, which, coupled with the understanding, produces judgments. However, Spinoza quickly qualifies against Descartes that the will is not, in fact, a faculty at all, but a universal notion abstracted from singular volitions: “we have demonstrated that these faculties are universal notions which are distinguished from the singulars from which we form them” (IIP48S, 484). Spinoza is here referring to his earlier explanation in the Ethics of the origin of “those notions called universals, like man, horse, dog, and the like” (IIP40S, 477). For Spinoza, these universal notions are imaginary or fictions that are formed “because so many images are formed at one time in the human body that they surpass the power of imagining.” The resulting universal notion combines what all of the singulars agree on and ignores distinctions.

Spinoza is making two bold and related claims here. First, there is no real faculty of will, that is, a faculty of affirming and denying. Rather, the will is a created fiction, a universal that adds to the illusion of free will. Second, the will is simply constituted by the individual volitions—our affirmations and denials—and these volitions are simply the very ideas themselves. For this reason, Spinoza claims that the will is the same as the intellect (or mind) (IIP49C). Therefore, it is not an ability to choose this or that as in the traditional understanding, and certainly not an ability to choose between alternative courses of action arbitrarily. It is not even an ability to affirm or deny, as Descartes claimed. Descartes, in explaining error in judgment, distinguishes the intellect from the will. Thus, with his claim that the will is the same as the intellect, Spinoza is directly criticizing the Cartesian view of free will. We will return to this criticism after examining Spinoza’s view of the human mind.

For Spinoza, the human mind is the idea of an actually existing singular thing (IIP11), namely the body (IIP13). So, for instance, my mind is the idea of my body. As mentioned above, Spinoza holds that the order and connection of ideas are the same as the order and connection of things (IIP7) insofar as God is understood through both the attribute of extension and the attribute of thought. This entails that for every body, there is an idea that has that body as its object, and this idea is one and the same as that body, although conceived under a different attribute. On the other hand, Spinoza also characterizes the human mind as a part of the infinite intellect of God (IIP11C) understood as the totality of ideas. For this reason, Spinoza explains that when the human mind perceives something, God has this idea “not insofar as he is infinite, but insofar as he is explained through the nature of the human mind, or insofar as he constitutes the essence of the human mind,” that is, as an affection or finite mode of the attribute of thought.

While Spinoza says the mind is the idea of the body, he also recognizes that the human body is considered an individual composed of multiple other bodies that form an individual body by the preservation of the ratio of motion and rest (II Physical Interlude, P1 and L5). Accordingly, every body that composes the individual’s body also has a correlative idea. Therefore, the mind is made up of a multitude of ideas just as the body is made up of a multitude of bodies (IIP15). Furthermore, when the human body interacts with the other bodies external to it, or has what Spinoza calls affections, ideas of these affections (the affections caused by external bodies in the individual human body) become part of the mind and the mind regards the external body as present (IIP16 and IIP17). These ideas of the affections, however, involve both the nature of the human body and that of the external body. Spinoza calls these “affections of the human body whose ideas present external bodies as present to us” images. He continues that “when the mind regards bodies in this way, we shall say that it imagines” (IIP17S, 465). Note here that Spinoza avers that images are the affections of the body caused by other bodies, and although they do not always “reproduce the figures of things”, he calls having the ideas of these affections of the body imagining.

As we can see, for Spinoza, the mind is a composite idea that is composed of ideas of the body and ideas of the body’s affections, which involve both the human body and the external body (and ideas of these ideas as well (IIP20)). Without these ideas of the affections of our body “the human mind does not know the human body, nor does it know that it exists, except through ideas of the affections by which the body is affected” (IIP19). At the same time, Spinoza explains that whenever the human mind perceives something, God has the idea of this thing together with the human mind (IIP11C); but God has the idea which constitutes the human mind only “insofar as he is considered to be affected by the idea of another singular thing” (IIP19D). That is, on the one hand, as explained in IP28, finite singular things come into existence or produce an effect through other finite singular things; on the other hand, to the extent that all things are modes of the one substance, each effect is at the same time caused by God. Though most of our knowledge of the body and the external world comes from ideas of affections, Spinoza claims that these ideas of the body and its affections are for the most part inadequate, that is, incomplete, partial, or mutilated, and therefore not clear and distinct. Spinoza writes “Insofar as he [God] also has the idea of another thing together with the human mind, we say that the human mind perceives the thing only partially, or inadequately” (IIP11C).

Spinoza argues that for the most part we only have inadequate knowledge (cognitio) of the state of our body, of external bodies that affect our body, and of our own mind (as ideas of ideas of our body) (IIP26C, IIP27, and IIP28). Our knowledge concerning our body and its affections and the external bodies affecting our body and our own mind is, therefore, limited in its distinctness. While it is not always entirely clear what Spinoza means by inadequate knowledge or an inadequate idea, he defines an adequate idea as “an idea which, insofar as it is considered in itself, without relation to an object, has all the properties, or intrinsic denominations, of a true idea” (IID4). Avoiding the epistemic problems of a correspondence theory of truth, Spinoza argues we can form adequate ideas insofar as “every idea which in us is absolute, or adequate and perfect, is true” (IIP34). An inadequate idea is an incomplete, partial, or mutilated idea, and Spinoza argues that “falsity consists in the privation of knowledge which inadequate, or mutilated and confused, ideas involve” (IIP35).

Returning to Spinoza’s claim that the will is the same as the intellect, the mind is just constituted by all the individual ideas. To say that the will is the same as the intellect means that, for Spinoza, the will as the sum of individual volitions is just the sum of these individual ideas which compose the mind. What Spinoza has in mind is that our ideas, which constitute our mind, already involve affirmations and negations. There is no special faculty needed. To give a simple example, while sitting in a café, I see my friend walk in, order a coffee, and sit down. Perceiving all this is to say that my mind has ideas of the affections of my body caused by external bodies (which is also to say that there is in God the idea of my mind together with the ideas of other things). All these ideas are inadequate, incomplete, or partial. Because I perceive my friend, the idea of the affection of my body affirms that she is present in the café, drinking coffee, sitting nearby. I am not choosing to affirm these ideas, according to Spinoza, but the very ideas already involve affirmations. As I am distracted by other concerns, such as reading a book, these ideas continue to involve the affirmation of her being present in the café, regardless of whether she is in fact still there. If I look up and see her again, this new idea reaffirms her presence. But if I look up and she has gone, the new idea negates the previous idea.

Spinoza seems to hold that ideas involve beliefs. This is what Spinoza means when he says that the ideas themselves involve affirmations and negations. Rather than the will choosing to assent or deny things, the will is only the individual volitions that are in fact the individual ideas, which always already involve affirmation and/or negation. To be sure, even knowledge as simple as my friend’s presence will involve a complex of indefinite affirmations and negations, everything from the general laws of nature to mundane facts about daily life. A consequence of ideas as involving affirmation and negation is that error does not result from affirming judgments that are false but rather is a consequence of inadequate knowledge (IIP49SI, 485). Unfortunately, most of our ideas are inadequate. In the above example, it can easily be the case that I continue to have the idea of my friend’s presence when she is no longer in the café, because I will have this idea as long as no other idea negates it (IIP17C).

For Spinoza, therefore, the will is not free and is the same as the intellect. He is aware that this is a strange teaching, explicitly pointing out that most people do not recognize its truth. The reason for this failure to recognize the doctrine that the will is not free can, however, be understood both as an epistemic and a global confusion. Epistemically, most people do not understand that an idea involves an affirmation or negation, but they believe the will is free to affirm or deny ideas. According to Spinoza, “because many people either completely confuse these three – ideas, images, and words – or do not distinguish them accurately, or carefully enough, they have been completely ignorant of this doctrine concerning the will” (IIP49SII, 485-86). First, some people confuse ideas with images “which are formed in us from encounters with bodies.” Images, for Spinoza, are physical and extended, and are, therefore, not ideas. But these people take the ideas to be formed by the direct relation between the mind and body. This has two results: a) ideas of things of which no image can be formed are taken to be “only fictions which we feign from free choice of the will”. In other words, some ideas are not understood as ideas (which involve affirmation and negation) caused by other ideas but as choices of the free will; b) these people “look on ideas, therefore, as mute pictures on a panel,” which do not involve affirmation or negation but are affirmed and denied by the will. Second, some people confuse words with ideas or with the affirmation involved in the ideas. Here they confuse affirmations and negations with willfully affirming or denying in words. Spinoza points out that they cannot affirm or deny something contrary to what the very idea in the mind affirms or negates. They can only affirm or deny in words what is contrary to an idea. 
In the above example, I can deny in words that my friend is in the café, but these words will not be a negation of the idea which I had while perceiving her as being in the café. For Spinoza, images and words are both extended things and not ideas. This confusion, however, has hindered people from realizing that ideas in themselves already involve affirmations and negations.

Spinoza further explains these confusions and defends his view against possible objections. It is here that Spinoza launches his attack on the Cartesian defense of free will and its involvement in error. Before turning to these possible objections and Spinoza’s replies, a brief overview of Descartes’ view of the will is helpful. In the Fourth Meditation, Descartes explains error through the different scopes of the intellect and the will. The former is limited, since we have only limited knowledge, that is, clear and distinct ideas, while the will extends in its application to anything whatsoever and is thus infinite. Descartes writes, “This is because the will simply consists in our ability to do or not do something (that is, to affirm or deny, to pursue or avoid), or rather, it consists simply in the fact that when the intellect puts something forward for affirmation or denial, for pursuit or avoidance, our inclinations are such that we do not feel we are determined by any external force” (57). Descartes continues, however, that freedom of the will does not consist in indifference. The more the will is inclined toward the truth and goodness of what the intellect presents to it, the freer it is. Descartes’ remedy against error is the suspension of judgment whenever the intellect cannot perceive the truth or goodness clearly and distinctly. Descartes, therefore, understands the will as a faculty of choice, which can affirm or deny freely to make judgments upon ideas presented by the intellect. Though the will is freer when it is based on clear and distinct ideas, it still has an absolute power of free choice in its ability to affirm or deny.

Turning to the possible objections to Spinoza’s view of the will brought up in IIP49S, the first common objection concerns the alleged different scope of the intellect and the will. Spinoza disagrees that the “faculty of the will” has a greater scope than the “faculty of perception”. Spinoza argues that this only seems to be the case because: 1) if the intellect is taken to only involve clear and distinct ideas, then it will necessarily be more limited; and 2) the “faculty of the will” is itself a universal notion “by which we explain all the singular volitions, that is, it is what is common to them all” (488). Under this view of the will, the power of assenting seems infinite because it employs a universal idea of affirmation that seems applicable to everything. Nevertheless, this view of the will is a fiction. Against the second common objection, that we know from experience that we can suspend judgment, Spinoza denies that we have the power to do so. What actually happens when we seem to hold back our judgment is nothing but an awareness that we lack adequate ideas. Therefore, suspension of judgment is nothing more than perception and not an act of free volition. Spinoza provides examples to illustrate his argument, among them that of a child who imagines a winged horse. The child, unlike an adult who has ideas that exclude the existence of winged horses, will not doubt the existence of the winged horse until he learns the inadequacy of such an idea. Spinoza is careful to note that perceptions themselves are not deceptive. But they do already involve affirmation independently of their adequacy. For this reason, if nothing negates the affirmation of a perception, the perceiver necessarily affirms the existence of what is perceived.

The third objection is that, since it seems that it is equally possible to affirm something which is true as to affirm something which is false, the affirmation cannot spring from knowledge but from the will. Therefore, the will must be distinct from the intellect. In reply to this, Spinoza reminds us that the will is something universal, which is ascribed to all ideas because all ideas affirm something. As soon as we turn to particular cases, the affirmation involved in the ideas is different. Moreover, Spinoza “denies absolutely” that we need the same power of thinking to affirm something as true which is true as we would need in the case of affirming something as true which is false. An adequate or true idea is perfect and has more reality than an inadequate idea, and therefore the affirmation involved in an adequate idea is different from that of an inadequate idea. Finally, the fourth objection refers to the famous Buridan’s ass, which is caught equidistant from two piles of feed. A human in such an equilibrium, if he had no free will, would necessarily die. Spinoza, rather humorously, responds, “I say that I grant entirely that a man placed in such an equilibrium (namely, who perceives nothing but thirst and hunger and such food and drink as are equally distant from him) will perish of hunger and thirst. If they ask me whether such a man should be thought an ass rather than a man, I say that I do not know – just as I also do not know how highly we should esteem one who hangs himself, or children, fools, and madmen, and so on” (IIP49S, 490).

Besides answering the common objections to his identification of the will with the intellect, Spinoza also provides an explanation for the necessary origin of our illusory belief that the will is free (see Melamed 2017). Spinoza alludes to this illusion a number of times. In the Ethics, it first occurs in the Appendix to Part 1 when he argues against natural teleology. He writes that,

All men are born ignorant of the causes of things, and that they all want to seek their own advantage and are conscious of this appetite. From these it follows, first, that men think themselves free, because they are conscious of their volitions and their appetites, and do not even think in their dreams, of the causes by which they are disposed to wanting and willing because they are ignorant of those causes. (440)

That is, because human beings are 1) ignorant of the causes of their volitions but 2) conscious of their desires, they necessarily believe themselves to be free. Hence, free will is an illusion born of ignorance. In a correspondence with Schuller, Spinoza provides a vivid image of the illusion of free will, writing that a stone, when put into motion, if it could judge, would believe itself free to move, though it is determined by external forces. Human beings’ belief in free will, Spinoza suggests, arises in exactly the same way. Spinoza even writes that “because this prejudice is innate in all men, they are not so easily freed from it” (Letter 58, 428).

Spinoza has another extensive discussion of free will as a result of ignorance in the scholium of IIIP2 in the Ethics. The proposition states “The body cannot determine the mind to thinking, and the mind cannot determine the body to motion, to rest, or to anything else (if there is anything else)” (IIIP2). Spinoza’s parallelism holds that the mind and the body are one and the same thing conceived through different attributes, so there is no inter-attribute causality. The order and connection of ideas are the same as the order and connection of bodies, but it is not possible to explain the movement of bodies in terms of the attribute of thought, nor is it possible to explain the thinking of ideas through the attribute of extension. Spinoza is well aware that this will be unacceptable to most people who believe their will is free and that it is the mind which causes the body to move: “They are so firmly persuaded that the body now moves, now is at rest, solely from the mind’s command, and that it does a great many things which depend only on the mind’s will and its art of thinking” (IIIP2S, 494-95).

Against this prejudice, Spinoza defends his position by pointing out 1) that human beings are so far quite ignorant of the mechanics of the human body and its workings (for instance, the brain) and 2) that human beings cannot explain how the mind can interact with the body. He further elucidates these points by responding to two objections taken from experience.

But they will say [i] that – whether or not they know by what means the mind moves the body – they still know by experience that unless the human mind were capable of thinking, the body would be inactive. And then [ii], they know by experience, that it is in the mind’s power alone both to speak and to be silent, and to do many other things, which they therefore believe to depend on the mind’s decision. (495)

In response to the first objection, Spinoza argues that while it is true that the body cannot move if the mind is not thinking, the contrary, that the mind cannot think if the body is inactive, is equally true, for they are, after all, one and the same thing conceived through different attributes. Against the widespread disbelief, though, that “the causes of buildings, of painting, and of things of this kind, which are made only by human skill, should be able to be deduced from the laws of Nature alone, insofar as it is considered corporeal” (496), Spinoza responds by reaffirming that humans are not yet aware of what the human body can do according to its own laws. He gives an interesting example of sleepwalkers doing all kinds of actions, none of which they recall when they are awake.

Concerning the second objection, that humans apparently speak (a physical action) from the free power of the mind, which would indicate that the mind controls the body, Spinoza states that humans have just as much control over their words as over their appetites. He points out that they can hold their tongue only in cases of a weak inclination to speak, just as they can resist indulging a weak inclination to certain pleasures. But when it comes to stronger inclinations, humans often suffer from akrasia, or weakness of will. Again, they believe themselves to be free when, in fact, they are driven by causes they do not know. He points to:

[The infant believing] he freely wants the milk; the angry child that he wants vengeance; and the timid, flight. So, the drunk believes it is from a free decision of the mind that he speaks the things he later, when sober, wishes he had not said. So, the madman, the chatterbox, the child, and great many people of this kind believe they speak from a free decision of the mind, when really they cannot contain their impulse to speak. (496)

Here again, Spinoza argues that humans believe themselves free because they are conscious of their own desires but ignorant of the causes of them. Relating the will back to the body, he then states that, since minds and bodies are one and the same, decisions of the mind are the same as the appetites and determinations of the body, understood under different attributes.

Finally, Spinoza points out that humans could not even speak unless they recollected words, though recollecting or forgetting is itself not at will, that is, by the free power of the mind. So, at most, the power of the mind would consist only in deciding to speak or not to speak. However, Spinoza counters that humans often dream they are speaking and in their dreams believe that they do this freely, though they are not in fact speaking. In general, when humans are dreaming, they believe they are freely making many decisions, but in fact they are doing nothing. Spinoza asks pointedly:

So, I should very much like to know whether there are in the mind two kinds of decisions – those belonging to our fantasies and those that are free? And if we do not want to go that far in our madness, it must be granted that this decision of the mind, which is believed to be free, is not distinguished from the imagination itself, or the memory, nor is it anything beyond that affirmation which the idea, insofar as it is an idea, necessarily involves. And so the decisions of the mind arise by the same necessity as the ideas of things which actually exist. Those, therefore, who believe that they speak or are silent or do anything from a free decision of the mind, dream with open eyes. (497)

One final point concerning the illusion of free will: Spinoza uses belief in free will as one of his examples of error in IIP35S. IIP35 states that “falsity consists in the privation of knowledge which inadequate, or mutilated and confused, ideas involve.” In the Scholium, he reiterates the now familiar cause of the belief in free will, namely, that humans are conscious of their volitions but ignorant of the causes which determine their volitions. However, Spinoza here is not just claiming that we have an inadequate knowledge of the causes of our volitions leading us to err in thinking the will is free. He makes the stronger claim that because our knowledge of the will is inadequate, we cannot help but imagine that our will is free, that is, we cannot help but experience our will as free in some way, even if we know that it is not.

This can be seen from the second example of error that he uses. When looking at the sun, we imagine that it is rather close. But, Spinoza argues, the problem is not merely that we err about the sun’s distance. The problem is that we imagine (that is, we have an idea of the affection of our body as affected by the sun) or experience the sun as being two hundred feet away regardless of whether we adequately know the true distance. Even knowing the sun’s true distance from our body, we will always experience it as being about two hundred feet away. Similarly, even if we adequately understand that our will is not free but that each of our volitions is determined, we will still experience it as free. The reason for this is explained in IIP48S, where Spinoza argues that the will, understood as an absolute faculty, is a “complete fiction or metaphysical being, or universal” which we nevertheless form necessarily. As mentioned above, universals are formed when the body, overloaded with images through affections, has its power of imagining surpassed, and a notion is formed by focusing on similarities and ignoring a great many of the differences between its ideas. Spinoza’s point here, in emphasizing the inevitability of error due to the prevalence of imagination and the limited scope of our reason, is that humans cannot escape the illusion of free will.

3. Spinoza on Human Freedom

While Spinoza denies that the will is free, he does consider human freedom (libertas humana) possible. Given the caveat just described, this freedom must be understood as limited. For Spinoza, freedom is the end of human striving. He often equates freedom with virtue, happiness, and blessedness (beatitudo), the more familiar end of human activity (for an overview, see Youpa 2010). Spinoza does not understand freedom as a capacity for choice, that is, as liberum arbitrium (free choice), but rather as consisting in acting as opposed to being acted upon. For Spinoza, freedom is ultimately constituted by activity. In Part I of the Ethics, Spinoza defines the free thing: “that thing is called free which exists from the necessity of its nature alone, and is determined to act by itself alone. But a thing is called necessary, or rather compelled, which is determined by another thing to exist and produce an effect in a certain and determinate manner” (ID7). According to this definition, only God, properly speaking, is absolutely free, because only God exists from the necessity of his nature and is determined to act from his nature alone (IP17 and IP17C2). Nevertheless, Spinoza argues that freedom is possible for human beings insofar as they act: “I say we act when something happens, in us or outside of us, of which we are the adequate cause, that is, (by D1), when something in us or outside of us follows from our nature, which can be clearly and distinctly understood through it alone” (IIID2). IIID1 gives the definition of adequate cause: “I call that cause adequate whose effect can be clearly and distinctly perceived through it.” From these definitions, we can see that if human freedom is constituted by activity, then freedom will be constituted by having clear and distinct ideas or adequate knowledge.

Above, it was seen that for Spinoza, will and intellect are one and the same. The will is nothing but singular volitions, which are ideas. These ideas already involve affirmation and negation (commonly ascribed to the faculty of will). In Part II, when arguing against the Cartesian view of the will, Spinoza treats the will as a supposed “faculty of affirming and denying” in order to dispel the universal notion of a free will. In Part III, in his discussion of affects, he provides a fuller description of the will and the affective nature of ideas, providing the tools for his discussion of human freedom. By “affect,” Spinoza understands “the affections of the body by which the body’s power of acting is increased or diminished, aided or restrained, and at the same time, the ideas of these affections” (IIID3). Accordingly, he concludes that “if we can be the adequate cause of any of these affections, I understand by the affect an action; otherwise, a passion.” There is thus a close connection between activity and adequate ideas, as well as between passions and inadequate ideas (IIIP3).

Since most of our knowledge involves ideas of affections of the body, which are inadequate ideas, human beings undergo many things, and the mind suffers many passions until the human body is ultimately destroyed. Nevertheless, Spinoza argues that “each thing, as far as it can by its own power, strives to persevere in its own being” (IIIP6). This is Spinoza’s famous conatus principle, by which each individual strives to preserve its being or maintain what might be called homeostasis. In fact, Spinoza argues that the conatus, or striving, is the very essence of each thing (IIIP7). Furthermore, this striving is the primary affect, appetite, or desire. The conatus, or striving, when related solely to the mind, is understood as the will. When the conatus is conceived as related to both mind and body, Spinoza calls it appetite, and when humans are conscious of their appetite, he calls it desire (IIIP9S). Hence, Spinoza defines desire as “man’s very essence, insofar as it is conceived to be determined, from any given affection of it, to do something” (Def. Aff. I).

The conatus is central to Spinoza’s entire moral psychology, from which he derives his theory of affects, his theory of freedom, and his ethical and political theories. In arguing that any human individual is fundamentally striving (conatus) to persevere in being, Spinoza follows Hobbes’ moral psychology. In the Leviathan, Hobbes introduces the English counterpart of the concept: “the small beginnings of motion within the body of man, before they appear in walking, speaking, and other visible action, are commonly called endeavor [conatus]. This endeavor, when it is toward something which causes it, is called appetite or desire” (Leviathan VI.1-2). Such desire or voluntary motion does not spring from a free will, Hobbes argues, but has its origin in the motion of external bodies imparting their motion to the human body, producing sensation. That is, Hobbes already equates the conatus with the will. Hobbes likewise derives a taxonomy of passions from the conatus, albeit one that is far less sophisticated and complex than Spinoza’s. Furthermore, Hobbes holds that the entire life of human beings consists of an endless desire for power, by which he understands “the present means to attain some future apparent good” (Leviathan X.1). This desire for power ends only with the eventual death of the individual (Leviathan XI.2). For Hobbes, humans are, for the most part, led by their passions, as, for instance, in the construction of a commonwealth from the state of nature, in which they are led by the fear of death and the hope for a better life (Leviathan XIII.14), though, of course, reason provides the means by which the construction of the state is possible. While there are many parallels between Hobbes’ and Spinoza’s psychology, Hobbes understands the conatus entirely as physical, explained by a materialistic mechanical philosophy. In contrast, for Spinoza, the conatus is both physical and psychological, according to his parallelism. And notwithstanding his focus on ethics, Spinoza’s account of the affects often emphasizes psychological explanations.

From desire, that is, the conscious appetite of striving, Spinoza derives two other primary affects, namely joy and sadness. Spinoza describes joy as the passage or transition of the mind from a lesser to a greater perfection or reality, and sadness as the opposite, the passage of the mind from a greater to a lesser perfection or reality (IIIP11S). The affect of joy as related to both mind and body, he calls “pleasure or cheerfulness,” that of sadness “pain or melancholy.” IIIP11 underlines Spinoza’s parallelism with respect to his theory of affects: “the idea of anything that increases or diminishes, aids or restrains, our body’s power of acting, increases or diminishes our power of thinking.” In these basic definitions, Spinoza employs the concept of perfection or reality (the two are equated in IID6). What he means by this can be grasped rather intuitively. The more perfection or reality an individual has, the more power it has to persevere in being, or the more the individual is capable of acting and thinking. When this power increases through a transition to greater perfection, the individual experiences joy. But if it decreases to lesser perfection, it experiences sadness.

Spinoza holds that from these three main affects all others, in principle, can be deduced or explained. However, the variety of affects depends not only on the individual but also on all the external circumstances under which they strive. Still, Spinoza provides explanations of the major human affects and their origin in other affects. The first affects he deduces from joy and sadness are love and hate. Whatever an individual imagines to increase their power and cause joy, they love; and whatever they imagine to decrease their power and cause sadness, they hate: “Love is nothing but joy with the accompanying idea of an external cause, and hate is nothing but sadness with the accompanying idea of an external cause” (IIIP13S). Accordingly, human beings strive to imagine those things (that is, to have ideas of the affections of their body caused by those things) that increase their power of acting and thinking (IIIP12), causing joy, while avoiding imagining things that decrease their power of acting and thinking, causing sadness. Like Hobbes, Spinoza holds that human beings strive to increase their power, though Spinoza understands this specifically as a power to act and indeed to think.

Furthermore, because “the human body can be affected in many ways in which its power of acting is increased or diminished, and also in others which render its power of acting neither greater nor less” (III Post.I), there are many things which become the accidental cause of joy or sadness. In other words, it can happen that an individual loves or hates something not according to what actually causes joy (or an increase in power) or sadness (or a decrease in power), but rather according to what merely appears to bring joy or sadness. This is possible because human beings are usually affected by two or more things at once, one or more of which may increase or decrease their power or cause joy or sadness, while others have no effect. Moreover, an individual, remembering experiences of joy or sadness accidentally related to certain external causes, can come to love and hate many things by association (IIIP14). Indeed, Spinoza holds that there are as many kinds of joy, sadness, and desire as there are objects that can affect us (IIIP56), noting the well-known excessive desires of gluttony, drunkenness, lust, greed, and ambition.

Spinoza ultimately develops a rich taxonomy of passions and their mixtures, including the more common anger, envy, hope, fear, and pride, but also gratitude, benevolence, remorse, and wonder, to name a few. Not only does he define these passions, but he also gives an account of their logic, which is paramount for understanding their origin and thereby ultimately overcoming them. True to his promise in the preface to the third part, Spinoza treats the affects “just as if it were a question of lines, planes, and bodies” (492). Initially and broadly, Spinoza discusses those affects that are passions because we experience them when we are acted upon. Human beings are passive in their striving to persevere in their being due to their inadequate ideas about themselves, their needs, and external things. Therefore, their striving to imagine what increases their power and to avoid imagining what decreases it often fails, leading to a variety of affects of sadness. In contrast to traditional complaints about the weakness of humans with respect to their affects, however, Spinoza argues that “apart from the joy and desire which are passions, there are other affects of joy and desire which are related to us insofar as we act” (IIIP58), and that all such affects related to humans insofar as they act are ones of joy or desire and not sadness. Of course, this makes sense, as sadness is the transition from greater to lesser perfection and a decrease in the power of acting or thinking.

Spinoza’s theory of affects provides the foundation for his theory of human freedom, because ultimately freedom involves maximizing acting and minimizing being acted upon, that is, having active affects and not suffering passions. Recall that for Spinoza only God is absolutely free, because only God is independent as a self-caused substance and acts according to the necessity of his own nature, and because Spinoza defines a free thing as one “which exists from the necessity of its nature alone, and is determined to act by itself alone.” Human beings cannot be absolutely free. But insofar as they act, they are the adequate cause of their actions. This is to say that the action “follows from their nature, which can be clearly and distinctly understood through it alone” (IIID2). Therefore, when human beings act, they are free. This is opposed to being acted upon, or having passions, in which case humans are only the inadequate or partial cause and are not acting according to their nature alone but are determined by something outside of themselves (see Kisner 2021). Therefore, the more human beings act, the freer they are; the more they suffer from passions, the less free they are.

Thus, Spinoza understands freedom in terms of activity as opposed to passivity, acting as opposed to being acted upon, or being the adequate cause of something as opposed to the inadequate cause of something: “I call that cause adequate whose effect can be clearly and distinctly perceived through it. But I call it partial or inadequate if its effect cannot be understood through it alone” (IIID1). From the perspective of the attribute of thought, being the adequate cause of an action is a function of having adequate ideas or true knowledge. He writes, “Our mind does certain things [acts] and undergoes other things, namely, insofar as it has adequate ideas, it necessarily does certain things, and insofar as it has inadequate ideas, it necessarily undergoes other things” (IIIP1). Spinoza’s reasoning here is that when the mind has an adequate idea, this idea is adequate in God insofar as God constitutes the mind through the adequate idea. Thus, the mind is the adequate cause of the effect because the effect can be understood through the mind alone (by the adequate idea) and not through something outside of the mind. But in the case of inadequate ideas, the mind is not the adequate cause of something, and thus the inadequate idea is, in God, the composite of the idea of the human mind together with the idea of something else. For this reason, the effect cannot be understood as being caused by the mind alone. Thus, the mind is the inadequate or partial cause. While this is Spinoza’s explanation of how being an adequate cause involves having adequate knowledge, there is some controversy among scholars about the status of humans having adequate ideas and true knowledge.

In Part II of the Ethics, in IIP40S2, Spinoza differentiates three kinds of knowledge, which he calls imagination, reason, and intuitive knowledge. The first kind, imagination, mentioned above, has its sources in bodies affecting the human body and the ideas of these affections, or perception and sensation. It also includes associations with these things by signs or language. This kind of knowledge is entirely inadequate or incomplete, and Spinoza often writes that it has “no order for the intellect” or follows from the “common order of Nature,” that is, it is random and based on association. Passions, or passive affects, fall within the realm of imagination because imaginations are quite literally the result of the body being acted upon by other things, or, what is the same, the ideas of these affections. The other two kinds of knowledge are adequate. Reason is knowledge that is derived from the knowledge of the common properties of all things, what Spinoza calls “common notions”. His thinking here is that there are certain properties shared by all things and that, being equally in the part and in the whole, these properties can only be conceived adequately (IIP38 and IIP39). The ideas of these common properties cannot but be adequate in God when God is thinking the idea that constitutes the human mind together with the idea which constitutes other things in perception. Also, those ideas that are deduced from adequate ideas are themselves adequate (IIP40). The common notions, therefore, are the foundation of reasoning.

Some commentators, however, have pointed out that it seems impossible for humans to have adequate ideas. Michael Della Rocca, for instance, argues that having an adequate idea seems to involve knowledge of the entire causal history of a particular thing, which is not possible (Della Rocca 2001, 183, n. 29). This is because of Spinoza’s axiom that “the knowledge of an effect depends on and involves the knowledge of its cause” (IA4), and, as we have seen, finite singular things are determined to exist and produce an effect by another finite singular thing, and so on ad infinitum. Thus, adequate knowledge of anything would require adequate knowledge of all the finite causes in the infinite series. Eugene Marshall avoids this problem by arguing that it is possible to have adequate knowledge of the infinite modes (Marshall 2011, 31-36), which some commentators take, for Spinoza, to be the concern of the common notions (Curley 1988, 45fn; Bennett 1984, 107). Indeed, Spinoza argues that humans have adequate knowledge of God’s eternal and infinite essence (IIP45-P47), which would include knowledge of the attributes and infinite modes. Intuitive knowledge is also adequate, though it is less clear what specifically it entails. Spinoza defines it as “a kind of knowing [that] proceeds from an adequate idea of the formal essence of certain attributes of God to the adequate knowledge of the formal essence of things” (IIP40S2, 478). Here, Spinoza does indicate knowledge of the essence of singular things, returning to the above problem. Marshall, however, points out that Spinoza is not referring to the essences of finite modes as existing in duration (that is, in time), which would require knowledge of the causal history of a finite mode. Rather, he suggests that Spinoza here speaks of the idea of the essence of things sub specie aeternitatis, that is, of things considered as existing in the eternal attributes of God (Marshall 2011, 41-50). Furthermore, rational knowledge and intuitive knowledge are related (Spinoza argues that rational knowledge encourages intuitive knowledge) yet distinct (VP28).

Rational knowledge and intuitive knowledge, because they involve adequate ideas, are necessary for human freedom. Again, this is because human freedom is constituted by activity, and humans act when they are the adequate cause of something that follows from their nature (IIID2). Moreover, humans can be the adequate cause, in part, when the mind acts or has adequate ideas (IIIP1). This is how Spinoza explains the possibility of human freedom metaphysically. However, human freedom, which Spinoza equates with virtue and blessedness, is the end of human striving, that is, the ongoing project of human existence. The essence of a human being, the conatus, is the striving to persevere in being and consequently to increase the power of acting and thinking, and this increase brings about the affect of joy. This increase in the power of acting and thinking can occur passively (the passion of joy) when human beings strive from inadequate ideas, or it can occur actively when human beings strive from adequate ideas, that is, from reason and intuitive knowledge. The more human beings strive for adequate ideas or act rationally in accordance with their own nature, the freer they are and the greater is their power of acting and thinking and the consequent joy. Therefore, reason and intuitive knowledge are paramount for freedom, virtue, and blessedness (VP36S) (see Soyarslan 2021).

For Spinoza, human freedom is very different from free will as ordinarily understood. It is not a faculty or ability apart from the intellect. Rather, it is a striving for a specific way of life defined by activity, reason, and knowledge instead of passivity and ignorance. Determinism is not opposed to this view of freedom, since freedom is understood as acting according to one’s own nature and not being compelled by external forces, especially passions. In this respect, Spinoza’s view bears similarities, in different respects, to the views of freedom held by Hobbes and by the Stoa. For Hobbes, a materialist, freedom properly applies only to bodies and concerns the absence of external impediments to the motion of a body. Accordingly, a human is called free who, “in those things which by his strength and wit he is able to do, is not hindered to do what he has a will to” (Leviathan XXI.1-2). However, Spinoza’s view of freedom differs substantially from Hobbes’s in that he has a more extensive view of what it means to be impeded by external forces, recognizing that the order of ideas and the order of bodies are one and the same. For the Stoa, generally speaking, freedom consists in living a rational life according to nature. If one lives according to nature, which is rational, one can be free despite the fact that nature is determined, because one conforms one’s desires to the order of nature through virtue. A famous illustration of this understanding of freedom is that of a dog tied to a cart. If the dog willingly follows the cart that is pulling it, it acts freely; if it resists the motion of the cart, being pulled along nonetheless, it lacks freedom (Long 1987, 386). For Spinoza, freedom does not conflict with determinism either, as long as human beings are active and not passive. Likewise, the greatest impediments to freedom are the passions, which can so overcome the power of an individual that the individual is in bondage, a slave. Spinoza famously writes: “Man’s lack of power to moderate and restrain the affects I call bondage. For the man who is subject to affects is under the control, not of himself, but of fortune, in whose power he so greatly is that often, though he sees the better for himself, he is still forced to follow the worse” (IV Preface, 543). In these lines, Spinoza not only presents the problem that the passions pose for human thriving but also situates this problem within the context of the classic enigma of akrasia, or weakness of will.

In the first 18 propositions of Part IV of the Ethics, entitled “Of Human Bondage, or the Power of the Affects,” Spinoza aims to explain “the causes of man’s lack of power and inconstancy, and why men do not observe the precepts of reason” (IVP18S, 555). First, he sets up the general condition that human beings, being a part of nature, are necessarily acted upon by other things (IVP2). Their power in striving to persevere in being is limited and surpassed by the power of other things in nature (IVP3). Therefore, it is impossible for them to be completely free or to act only in accordance with their own nature (IVP4). Accordingly, Spinoza admits, “from this it follows that man is necessarily always subject to passions, that he follows and obeys the common order of Nature, and accommodates himself to it as much as the nature of things requires” (IVP4C). This, of course, is the reason that human freedom is always limited and requires constant striving. Human beings are constantly beset by passions, but what is worse is that the power of a passion is defined by the power of external causes in relation to an individual’s power (IVP5). This is to say, human beings can be overwhelmed by the power of external causes in such a way that “the force of any passion or affect can surpass the other actions, or powers of a man, so that the affect stubbornly clings to the man” (IVP6). This can be easily understood from the universal human experiences of grief and loss, envy and ambition, great love and hatred, as well as from any form of addiction and excessive desire for pleasures. Such passions, and even lesser ones, are hard to regulate and can interrupt our striving for a good life or even the completion of the simple tasks of daily life.

In IVP7, Spinoza touches on the main issue in akrasia, writing that “an affect cannot be restrained or taken away except by an affect opposite to and stronger than the affect to be restrained”. Here we can see why merely knowing what is good or best does not restrain an affect, and why humans often see the better course of action but pursue the worse. The issue is that, for Spinoza, a true or adequate idea does not restrain a passion unless it is also an affect that increases the individual’s power of acting (IVP14). Furthermore, an affect’s power is compounded by its temporal and modal relationship to the individual. For instance, temporally, an affect whose cause is imagined to be present is stronger than if it were not (IVP9), and stronger if its object is imagined as imminent rather than far in the future, or as recently past rather than in distant memory (IVP10). Likewise, modally, an affect toward something humans view as necessary is more intense than if they view it as possible or contingent (IVP11).

Because the power of affects is temporally and modally conditioned, and because an affect can be restrained only by an opposite and more powerful affect, it is often the case that a desire that comes from true knowledge or adequate ideas is still overcome by passions (IVP15). This can be easily seen in a desire for some future good that is overcome by the longing for pleasures of the moment (IVP16), as is so often the case. However, “a desire that arises from joy is stronger, all things being equal, than one which arises from sadness” (IVP18). That joy is more powerful than sadness is prima facie a good thing, except that, in order to overcome the passions and achieve the good life, true knowledge of good and evil in the affects is necessary. Spinoza’s conception of the good life, or what he calls blessedness, consists in essence in overcoming this domination of the passions and living a life of the mind, which is the life of freedom (see James 2009). Thus, Spinoza provides guidance for how such a good life can be achieved in Parts IV and V of the Ethics, namely in the ideal exemplar of the free man and the so-called remedies for the passions.

4. The Free Man and the Way to Freedom

In the preface to Part IV of the Ethics, Spinoza introduces the idea of a model of human nature, or the “free man”. The free man is understood as an exemplar to which humans can look to decide whether an action is good or evil (there is some controversy over the status of the free man; see, for instance, Kisner 2011, 162-78; Nadler 2015; Homan 2015). Spinoza is often interpreted as a moral anti-realist because of some of his claims about moral values. For instance, he writes, “We neither strive for, nor will, neither want, nor desire anything because we judge it to be good; on the contrary, we judge it to be good because we strive for it, will it, want it, and desire it” (IIIP9S). And “by good here I understand every kind of joy, and whatever leads to it, and especially what satisfies any kind of longing, whatever that may be. And by evil, every kind of sadness, and especially what frustrates longing” (IIIP39S, 516). However, as anything can be the accidental cause of joy or sadness (IIIP15), it would seem that good and evil, or at least some goods and evils, are relative to the individual, as is the case for Hobbes. Moreover, Spinoza indicates that in nature there is nothing good or evil in itself. He writes, “As far as good and evil are concerned, they also indicate nothing positive in things, considered in themselves, nor are they anything other than modes of thinking or notions we form because we compare things to one another” (IV Preface, 545) (for an overview of Spinoza’s meta-ethics, see Marshall 2017).

Nevertheless, in Part IV of the Ethics, Spinoza redefines good and evil. Good is now understood as what is certainly known to be useful to us, and evil as what is certainly known to prevent the attainment of some good (IVD1 and IVD2). What does Spinoza mean here by “useful”? What is useful to a human individual is what will allow them to persevere in being and increase their power of acting and thinking, especially according to their own nature, or “what will really lead a man to greater perfection” (IVP18S, 555). This new definition of good as what is really useful is distinguished from mere joy or pleasure, which, insofar as it prevents us from attaining some other good, can be an evil. For Spinoza, the most useful thing for humans is virtue (IVP18S), by which they can attain greater perfection, or greater power of acting and thinking. In order to clarify what is really useful and good, Spinoza proposes the idea of the free man “as a model of human nature which we may look to”. For this reason, he also defines good relative to this model, writing, “I shall understand by good, what we certainly know is a means by which we may approach nearer and nearer to the model of human nature we set before ourselves” (IV Preface, 545).

With this model of human nature in mind, Spinoza then goes on to give an analysis of what is good and evil in the affects. Generally speaking, all passions that involve sadness, that is, affects that decrease the perfection or reality of an individual and consequently the ability of the mind to think and the body to act, are evil (IVP41). For instance, hate towards other humans is never good (IVP45), and all species of such hate, such as envy, disdain, and anger, are evil (IVP45C2). Also, any affects that are mixed with sadness, such as pity (IVP50), or that are vacillations of the mind, like hope and fear (IVP47), are not good in themselves. In contrast, all affects that are joyful, that is, which increase the reality or perfection of an individual and consequently the ability of the mind to think and the body to act, are directly good. Spinoza qualifies this, however: the net increase or decrease in the power of the individual has to be taken as a whole, under its particular conditions, and over time. For instance, the passion of joy or pleasure might be excessive (IVP43) or relate to only one part of an individual (IVP60), and the power of passions, being defined by the power of external causes, can easily overcome our power of acting and thinking as a whole and thus lead to greater sadness. Likewise, some sadness and pain might be good to the extent that they prevent a greater sadness or pain by restraining excessive desires (IVP43). It can easily be seen that love, which is a species of joy, if excessive, can be evil. Spinoza writes:

Sickness of the mind and misfortunes take their origin, especially, from too much love towards a thing which is liable to many variations and which we can never fully possess. For no one is disturbed or anxious concerning anything unless he loves it, nor do wrongs, suspicions, and enmities arise except from love for a thing which no one can really fully possess. (VP20S, 606)

Here again, it can be seen that, though joy in itself is directly good, it is often problematic as a passion and sometimes leads to sadness. Nevertheless, there is an interesting asymmetry here. While human beings’ passivity often leads them to experience passions that are varieties of sadness, there are certain passions of joy that can, all things being equal, increase the power of an individual. This asymmetry explains how human beings can increase their power of thinking and acting before they are able to act from adequate ideas. It is therefore important to note that joyful passions, qua passions, can be good and increase activity, and insofar as they increase our power of acting, they contribute to freedom (see Goldenbaum 2004; Kisner 2011, 168-69). In this respect, the view of the passions developed by Spinoza, undoubtedly influenced by Stoicism, differs from the general Stoic view. For the Stoa, virtue is living according to reason. The goal of the Stoic sage is to reach ataraxia, a state of mental tranquility, through apatheia, a state in which one is not affected by passions (pathē), which by definition are bad. By contrast, Spinoza explicitly understands passions of joy, all things being equal, as good.

Moreover, Spinoza also emphasizes that there are many things external to the human individual that are useful and therefore good, including all the things that preserve the body (IVP39) and allow it to optimally interact with the world (IVP40): “It is the part of a wise man, I say, to refresh and restore himself in moderation with pleasant food and drink, with scents, with the beauty of green plants, with decorations, music, sport, the theater, and other things of this kind, which anyone can use without injury to another” (IVP45S, 572). Most significant in the category of external goods are other human beings. While other humans can be one of the greatest sources of conflict and turmoil insofar as they are subject to passions (IVP32-34), Spinoza also thinks that “there is no singular thing in Nature which is more useful to man than a man who lives according to the guidance of reason” (IVP35C). For this reason, Spinoza recognizes, similarly to Aristotle, that good political organization and friendship are foundational to the good life: freedom, virtue, and blessedness (IVP73, for instance).

Leaving aside the many things in nature that are useful and good for human freedom despite being external to the individual, what is ultimately constitutive of human freedom is active affect or, what is the same, rational activity, that is, striving to persevere in being through the guidance of reason and understanding. Actions are affects that are related to the mind insofar as it understands them, and all such affects are joyful (IIIP59). Nor can desires arising from reason ever be excessive (IVP61). Thus, active joy and desire are always good. Spinoza equates the human striving to persevere in being through the guidance of reason with virtue, which he understands as power, following Machiavelli’s virtù, though for Spinoza this power is acting from reason and understanding. The conatus is thus intimately related to virtue and is indeed its foundation. Spinoza writes: “The striving to preserve oneself is the first and only foundation of virtue” (IVP22C). When we strive to persevere in being, we seek our own advantage, pursuing what is useful (and therefore good) (IVP19) for increasing our power of acting and thinking. The more we pursue our own true advantage, the more virtue we have (IVP20).

Initially, this apparent egoism may seem like an odd foundation for virtue. However, virtue is the human power to persevere in being, and Spinoza qualifies: “A man cannot be said absolutely to act from virtue insofar as he is determined to do something because he has inadequate ideas, but only insofar as he is determined because he understands” (IVP23). So, virtue, properly speaking, is seeking one’s advantage according to knowledge and striving to persevere in being through the guidance of reason (IVP24). Furthermore, Spinoza argues that what we desire from reason is understanding (IVP26), and the only things that we know to be certainly good or evil are those things which lead us to understanding or prevent it (IVP27). Virtue, therefore, is a rational activity, or active affect, by which we strive to persevere in our being, increasing our power of acting and thinking, through the guidance of reason. Spinoza calls this virtue fortitudo, or “strength of character,” which he further divides into animositas, or “tenacity,” and generositas, or “nobility.” Tenacity is the desire to preserve one’s being through the dictates of reason alone. Nobility, likewise, is the desire to aid others and join them in friendship through the dictates of reason alone (IIIP59S). These two general virtues are both defined as a “desire to strive” to live according to the dictates of reason, that is, to live a rational life of understanding and of pursuing what is really to the advantage of the individual.

Though Spinoza does not give a systematic taxonomy of the two sets of virtues, certain specific virtues (and vices) can be found throughout the Ethics (for more, see Kisner 2011, 197-214). Neither does he give an exhaustive list of the “dictates of reason,” though many of these too can be gleaned from the text (see LeBuffe 2010, 177-179), for instance, when he states that “he who lives according to the guidance of reason strives, as far as he can, to repay the other’s hate, anger, and disdain towards him with love and nobility” (IVP46). However, since there is nothing good or evil in nature in itself, the exemplar of the free man is used to consider, in any particular case, what is good and evil from the perspective of the life of freedom and blessedness, or happiness. Similar to Aristotle’s phronimos, who is the model of phronesis for discerning virtue in practice, Spinoza’s “free man” can be interpreted as an exemplar to whom an individual can look in order to discern what is truly useful for persevering in being, and what is detrimental to leading a good life defined by rational activity and freedom. In IVP67-IVP73, the so-called “free man propositions,” Spinoza provides an outline of some dictates of reason derived from the exemplar of the free man. Striving to emulate the free man, an individual should not fear death (IVP67), use virtue to avoid danger (IVP69), avoid the favors of the ignorant (IVP70), be grateful (IVP71), always be honest (IVP72), and live a life in community rather than in solitude (IVP73). Ultimately, the exemplar of the free man is meant to provide a model for living a free life, avoiding negative passions by striving to live according to the dictates of reason. However, Spinoza is well aware, as some commentators have pointed out, that the state of the free man, as one who acts entirely from the dictates of reason, may not be entirely attainable for human individuals.
In paragraph XXXII of the Appendix to Part IV, he writes “But human power is very limited and infinitely surpassed by the power of external causes. So we do not have the absolute power to adapt things outside us to our use. Nevertheless, we shall bear calmly those things which happen to us contrary to what the principles of our advantage demand, if we are conscious that we have done our duty, that the power we have could not have extended itself to the point where we could have avoided those things, and that we are a part of the whole of nature, whose order we follow.”

In the final part of the Ethics, Spinoza proposes certain remedies to the passions, which he understands as the tools available to reason to overcome them, “the means, or way, leading to freedom.” In general, Spinoza thinks that the more an individual’s mind is made up of adequate ideas, the more active and free the individual is, and the less they will be subject to passions. For this reason, the remedies against the passions focus on activity and understanding. Spinoza outlines five general remedies for the passions:

I. In the knowledge itself of the affects;

II. In the fact that it [the mind] separates the affects from the thought of an external cause, which we imagine confusedly;

III. In the time by which the affections related to things we understand surpass those related to things we conceive confusedly or in a mutilated way;

IV. In the multiplicity of causes by which affections related to common properties or to God are encouraged;

V. Finally, in the order by which the mind can order its affects and connect them to one another. (VP20S, 605)

The suggested techniques rely on Spinoza’s parallelism, stated in IIP7, that the order of ideas is the same as the order of things. For this reason, Spinoza argues that “in just the same way as thoughts and ideas of things are ordered and connected in the mind, so the affections of the body, or images of things, are ordered and connected in the body” (VP1). Therefore, all the techniques suggested by Spinoza involve ordering the ideas according to adequate knowledge, through reason and intuitive knowledge. In this way, the individual becomes more active, and therefore freer, in being a necessary part of nature.

Spinoza’s first and foundational remedy involves an individual fully understanding their affects so as to obtain self-knowledge. Passive affects, or passions, are, after all, based on inadequate knowledge. Spinoza’s suggestion here is to move from inadequate to adequate knowledge by attempting to fully understand a passion, that is, to understand its cause. This is possible because, just as the mind is the idea of the body and has ideas of the affections of the body, it can also think ideas of ideas of the mind (IIP20). These ideas are connected to the mind in the same way as the mind is connected to the body (IIP21). Understanding a passion, then, is thinking about the ideas of the ideas of the affections of the body. Attempting to understand a passion has two main effects. First, by the very act of thinking about the passion, the individual is already more active. Second, by fully understanding their affect, an individual can change it from a passion to an action because “an affect which is a passion ceases to be a passion as soon as we form a clear and distinct idea of it” (VP3).

Spinoza’s argument for the possibility of this relies on the fact that all ideas of the affections of the body involve some ideas that we can form adequately, that is, ideas of the common properties of all things: the common notions, or reason (VP4). So, by understanding affects, that is, by thinking ideas of the ideas of the affections of the body, and particularly by thinking of the causes of the affections of the body, we can form adequate ideas (which follow from our nature) and strive to transform passions into active affects. Spinoza does qualify that we can form only some adequate ideas of the affections of the body, underlining that such understanding of the passions is limited, but he also writes that “each of us has—in part, at least, if not absolutely—the power to understand himself and his affects, and consequently, the power to bring it about that he is less acted on by them” (VP4S, 598). Since “the appetite by which a man is said to act, and that by which he is said to be acted on are one and the same” (VP4S, 598), anything an individual does from a desire which is a passion can also be done from a rational affect.

Interconnected with the first remedy, Spinoza’s second remedy recommends the separation of the affect from the idea of the external cause. VP2 reads: “If we separate emotions, or affects, from the thought of an external cause and join them to other thoughts, then the love, or hate, towards the external cause is destroyed, as are the vacillations of the mind arising from these affects.” For Spinoza, love and hate are joy and sadness with an accompanying idea of the external cause. Here he indicates that by separating the affect from the thought of an external cause that we understand inadequately, and by understanding the affect, as mentioned above, by forming some adequate ideas about it, we destroy the love or hate of the external cause. As mentioned earlier, anything can be the accidental cause of joy and sadness (IIIP15), and therefore of love and hate. Furthermore, the strength of an affect is defined by the power of the external cause in relation to our own power (IVP5). Separating the passion from the external cause allows for understanding the affect in relation to the ideas of the mind alone. It might be difficult to grasp in the abstract what Spinoza means by separating the affect from the external cause, but consider the example of the jealous lover. Spinoza defines jealousy as “a vacillation of the mind born of love and hatred together, accompanied by the idea of another who is envied” (IIIP35S). The external causes accompanying the joy and sadness are the beloved and the (imagined) new lover who is envied. By separating the affect from the idea of the external cause, Spinoza is suggesting that a jealous lover could come to terms with the jealousy and form some clear and distinct ideas about it, that is, form some adequate ideas that reduce the power of the passion.
Spinoza’s third remedy involves the fact that “affects aroused by reason are, if we take account of time, more powerful than those related to singular things we regard as absent” (VP7). Simply put, “time heals all wounds,” but Spinoza gives an account of why this is so. Whereas passions are inadequate ideas that diminish with the absence of the external cause (we have other ideas that exclude the imagining of the external object), an affect related to reason involves the common properties of things, “which we always regard as present” (VP7D). Therefore, over time, rational affects are more powerful than passions. The mechanism of this remedy is readily seen in a variety of passions, from heartbreak to addiction.

Spinoza’s fourth and fifth remedies are more concerned with preventing the mind from being adversely affected by passions than with overcoming a specific passion that already exists. The fourth remedy involves relating an affect to a multitude of causes, because “if an affect is related to more and different causes, which the mind considers together with the affect itself, it is less harmful, we are less acted on by it, and we are affected less toward each cause than is the case with another equally great affect, which is related only to one cause or to fewer causes” (VP9). This is the case because, when considering that affect, the mind is engaged in thinking a multitude of different ideas, that is, its power of thinking is increased, and it is freer. Again, this remedy is, in large part, related to the first, foundational one. In understanding our affects, we form some adequate ideas and understand the cause of the affect, in part, from these ideas. Insofar as these adequate ideas are common notions concerning the common properties of things, we relate the affects to many things that can engage the mind. Spinoza ultimately claims that “the mind can bring it about that all the body’s affections, or images of things, are related to the idea of God” (VP14), for the mind has an adequate idea of the essence of God (IIP47). Because these affections are related to adequate ideas and follow from our own nature, they are affects of joy accompanied by the idea of God. In other words, all affections of the body can encourage an intellectual love of God. For Spinoza, “he who understands himself and his affects clearly and distinctly loves God, and does so the more, the more he understands himself and his affects” (VP15). This is a large part of how Spinoza conceives of the joyful life of reason and understanding that he calls blessedness.

Finally, the fifth remedy involves the fact that, as Spinoza argues, “so long as we are not torn by affects contrary to our nature, we have the power of ordering and connecting the affections of the body according to the order of the intellect” (VP10). What this amounts to is that the more adequate ideas the mind has, the less it will be affected by negative passions and the more it will order its ideas according to reason instead of the common order of nature. Spinoza’s suggestion is to “conceive of right principles of living, or sure maxims of life,” which we can constantly look to when confronted by the common occurrences and emotional disturbances of life. For instance, Spinoza gives the example of how to avoid being suddenly overwhelmed by hatred: prepare oneself by meditating “frequently on the common wrongs of men, and how they may be warded off best by nobility” (VP10S). This provides the practical mechanism by which we can use the virtues of tenacity and nobility to live a free life (see Steinberg 2014). All the remedies Spinoza mentions allow an individual to be rationally responsive to their environment rather than simply led by their emotions, and insofar as they are led by reason and adequate knowledge, they are free.

5. Spinoza on Moral Responsibility

The discussion of free will and freedom is often concerned with moral responsibility because free will is generally considered a necessary condition for moral responsibility. Moral responsibility is taken to be the condition under which an individual can be praised and blamed, rewarded and punished, for their actions. Spinoza’s view on responsibility is complex and little commented upon. He indeed avers that praise and blame are only a result of the illusion of free will: “Because they think themselves free, those notions have arisen: praise and blame, sin and merit” (I Appendix, 444). Though Spinoza does not speak directly of moral responsibility, he does not completely disavow the idea of responsibility because of his denial of free will. In his correspondence with Oldenburg, he makes clear that he does think individuals are responsible for their actions despite lacking free will, though his sense of responsibility is untraditional. Oldenburg asks Spinoza to explain some passages in the Theological-Political Treatise that seem, by equating God with Nature, to imply the elimination of divine providence, free will, and thereby moral responsibility. Spinoza indeed denies the traditional view of divine providence as one of free choice by God. For Spinoza, absolute freedom is acting from the necessity of one’s nature (ID7), and God is free precisely in that everything follows from the necessity of the divine nature. God does not arbitrarily choose to create the cosmos, as is traditionally argued.

In Letter 74, Oldenburg writes: “I shall say what most distresses them. You seem to build on a fatal necessity of all things and actions. But, once that has been asserted and granted, they say the sinews of all laws, of all virtue and religion, are cut, and all rewards and punishments are useless. They think that whatever compels or implies necessity excuses. Therefore, they think no one will be inexcusable in the sight of God” (469). Oldenburg points out the classical argument against determinism, namely that it makes reward and punishment pointless: if human beings have no free will, then they have no control over their lives, and if they have no control over their lives, then there is no justification for punishment or reward. All actions become excusable if they are outside the control of individuals. However, in his response to Oldenburg, Spinoza maintains the significance of reward and punishment even within a deterministic framework. He states,

This inevitable necessity of things does not destroy either divine or human laws. For whether or not the moral teachings themselves receive the form of law or legislation from God himself, they are still divine and salutary. The good which follows from virtue and the love of God will be just as desirable whether we receive it from God as a judge or as something emanating from the necessity of the divine nature. Nor will the bad things which follow from evil actions and affects be any less to be feared because they follow from them necessarily. Finally, whether we do what we do necessarily or contingently, we are still led by hope and fear. (Letter 75, 471)

Spinoza makes two points here. The first is that all reward and punishment are natural consequences of actions. Even if everything is determined, actions have good and evil consequences, and these are the natural results of actions. Determinism does not eliminate reward and punishment because there are determined consequences that are part of the natural order; whereas traditional views tie responsibility to free will, Spinoza here indicates that reward and punishment are justified by the power, or right, of nature. The second point is that these consequences can regulate human behavior because human beings are led by the hope for some good and the fear of some evil. Determinism does not destroy the law but rather gives it a framework for being effective. Spinoza seems to be advocating something like a consequentialist theory of responsibility: what matters is that reward and punishment can act as a deterrent to bad behavior or a motivation for desired behavior. In this passage, then, reward and punishment remain justified from a social and political standpoint even without free will (see Kluz 2015).

To understand Spinoza’s points better, we have to examine his view of law. Spinoza thinks that law is dependent either on natural necessity, that is, the laws of nature, or on human will. However, because human beings are a part of nature, human law will also be a part of natural law. Moreover, he also thinks that the term “law” is more commonly applied to human experience. He writes, “Commonly nothing is understood by law but a command which men can either carry out or neglect—since law confines human power under certain limits, beyond which that power extends, and does not command anything beyond human powers.” For this reason, Spinoza qualifies, “Law seems to need to be defined more particularly: that it is a principle of living man prescribes to himself or to others for some end” (TTP IV.5). Spinoza further divides law into human and divine law. By “human law,” he specifically means “a principle of living which serves only to protect life and the republic” (TTP IV.9), or what we might call “political” or “civil” law. By “divine law,” he specifically means that “which aims only at the supreme good, that is, the true knowledge and love of God” (TTP IV.9), or what we might call “religious” or “moral” law. The different ends of the law are what distinguish human law from divine law: the first concerns providing security and stability in social life; the second concerns providing happiness and blessedness, which are defined by virtue and freedom. For this reason, “divine law” in Spinoza’s sense concerns what leads to the supreme good for human beings, that is, the rule of conduct that allows humans to achieve freedom, virtue, and happiness. This law Spinoza propounds in the moral precepts of the Ethics mentioned above.
These laws follow from human nature, that is, they describe what is, in fact, good for human individuals in their striving to persevere in their being, based upon rational knowledge of human beings and nature in general, with the free man as the exemplar toward which they strive.

However, it is not the case that all individuals can access and follow the “divine law” through reason alone; therefore, traditionally, divine law took the form of divine commandments ensconced within a system of reward and punishment (while still including, more or less, what Spinoza indicates by “divine law”). For Spinoza, what is true in Holy Scripture and “divine law” can also be gained by adequate knowledge because “divine law” is a rule of conduct men lay down for themselves that “aims only at the supreme good, that is, the true knowledge and love of God” (TTP IV.9). That is to say, “divine law” follows from human nature, which is a part of Nature. But while the free man follows these moral precepts because he rationally knows what is, in fact, advantageous for him, other individuals follow moral precepts because they are led by their passions, namely the hope for some good or the fear of some evil, that is, reward and punishment. Though reward and punishment are, ultimately, the same for the free man and for other individuals, the free man is led by reason while other individuals are led by imagination, that is, by inadequate ideas or passions. Likewise, human law, that is, political law, uses a system of reward and punishment to regulate human behavior through hope and fear. Human law provides security and stability for the state in which human individuals co-exist and punishes those who transgress the laws. Moreover, just as in the case of “divine law,” the free man follows human law because he rationally knows his advantage, while other individuals are more led by their passions. Returning to Spinoza’s response to Oldenburg: determinism does not do away with law, moral or political, because the utility of the law, that is, the great advantages that following the law provides for the individual and the community and the disadvantages that result from transgressing it, is retained whether or not human beings have free will.
Ultimately, for Spinoza, moral precepts and the law are ensconced in a system of reward and punishment that is necessary for regulating human behavior even without free will.

6. References and Further Reading

All translations are from The Collected Works of Spinoza, Vol. I and II, ed. and trans. Edwin Curley.

a. Primary Sources

  • Descartes, Rene. The Philosophical Writings of Descartes, Vol. I and II, trans. John Cottingham et al. (Cambridge: Cambridge University Press, 1985).
  • Hobbes, Thomas. The Leviathan with Selected Variants from the Latin Edition of 1668, ed. Edwin Curley. (Indianapolis: Hackett Publishing Company, 1994).
  • Long, A. A., and D. N. Sedley, trans., The Hellenistic Philosophers, Vol. 1: Translations of the Principal Sources, with Philosophical Commentary. (Cambridge: Cambridge University Press, 1987).
  • Spinoza, Baruch. The Collected Works of Spinoza, Vol. I and II, ed. and trans. Edwin Curley. (Princeton: Princeton University Press, 1985).

b. Secondary Sources

  • Bennett, Jonathan. A Study of Spinoza’s Ethics. (Indianapolis: Hackett, 1984).
  • Bennett, Jonathan. “Spinoza’s Monism: A Reply to Curley”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 53-59.
  • Curley, Edwin. Spinoza’s Metaphysics: An Essay in Interpretation. (Cambridge: Harvard University Press, 1969).
  • Curley, Edwin. Behind the Geometrical Method. (Princeton: Princeton University Press, 1985).
  • Curley, Edwin. “On Bennett’s Interpretation of Spinoza’s Monism”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 35-52.
  • De Dijn, Herman. Spinoza: The Way to Wisdom. (West Lafayette, IN: Purdue University Press, 1996).
  • Della Rocca, Michael. Representation and the Mind-Body Problem in Spinoza. (Oxford: Oxford University Press, 1996).
  • Gatens, Moira. “Spinoza, Law and Responsibility”, in Spinoza: Critical Assessments of Leading Philosophers Vol.III, ed. by Genevieve Lloyd. (London: Routledge, 2001), 225-242.
  • Garrett, Don. “Spinoza’s Necessitarianism”, in God and Nature: Spinoza’s Metaphysics, ed. Yirmiyahu Yovel. (Leiden: E.J. Brill, 1991), 197-218.
  • Goldenbaum, Ursula. “The Affects as a Condition of Human Freedom in Spinoza’s Ethics”, in Spinoza on Reason and the “Free Man”, edited by Yirmiyahu Yovel. (New York: Little Room Press, 2004), 149-65.
  • Goldenbaum, Ursula, and Christopher Kluz, eds. Doing without Free Will: Spinoza and Contemporary Moral Problems. (New York: Lexington, 2015).
  • Hübner, Karolina. “Spinoza on Being Human and Human Perfection”, in Essays on Spinoza’s Ethical Theory, eds. Mathew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 124-142.
  • Homan, Matthew. “Rehumanizing Spinoza’s Free Man”, in Doing without Free Will: Spinoza and Contemporary Moral Problems, eds. Ursula Goldenbaum and Christopher Kluz (New York: Lexington, 2015), 75-96.
  • James, Susan. “Freedom, Slavery, and the Passions”, in The Cambridge Companion to Spinoza’s Ethics, ed. by Olli Koistinen. (Cambridge: Cambridge University Press, 2009), 223-41.
  • Kisner, Mathew. Spinoza on Human Freedom: Reason, Autonomy and the Good Life. (Cambridge: Cambridge University Press, 2011).
  • Kisner, Mathew, and Andrew Youpa eds. Essays on Spinoza’s Ethical Theory. (Oxford: Oxford University Press, 2014).
  • Kisner, Matthew. “Spinoza’s Activities: Freedom without Independence”, in Freedom, Action, and Motivation in Spinoza’s “Ethics”, ed. Noa Naaman-Zauderer. (London: Routledge, 2021), 37-61.
  • Kluz, Christopher. “Moral Responsibility without Free Will: Spinoza’s Social Approach”, in Doing without Free Will: Spinoza and Contemporary Moral Problems, eds. Ursula Goldenbaum and Christopher Kluz (New York: Lexington, 2015), 1-26.
  • LeBuffe, Michael. From Bondage to Freedom: Spinoza on Human Excellence. (Oxford: Oxford University Press, 2010).
  • Marshall, Colin. “Moral Realism in Spinoza’s Ethics”, in Cambridge Critical Guide to Spinoza’s Ethics, ed. Yitzhak Melamed. (Cambridge: Cambridge University Press, 2017), 248-265.
  • Marshall, Eugene. The Spiritual Automaton: Spinoza’s Science of the Mind. (Oxford: Oxford University Press, 2014).
  • Melamed, Yitzhak. “The Causes of our Belief in Free Will: Spinoza on Necessary, ‘Innate,’ yet False Cognition”, in Cambridge Critical Guide to Spinoza’s Ethics, ed. Yitzhak Melamed. (Cambridge: Cambridge University Press, 2017).
  • Naaman-Zauderer, Noa, ed. Freedom, Action, and Motivation in Spinoza’s “Ethics”. (London: Routledge, 2021).
  • Nadler, Steven. “Whatever is, is in God: substance and things in Spinoza’s metaphysics”, in Interpreting Spinoza: Critical Essays, ed. Charles Huenemann. (Cambridge: Cambridge University Press, 2008), 53-70.
  • Nadler, Steven. “On Spinoza’s Free Man”, Journal of the American Philosophical Association, Volume 1, Issue 1, Spring 2015, 103-120.
  • Rutherford, Donald. “Deciding What to Do: The Relation of Affect and Reason in Spinoza’s Ethics”, in Freedom, Action, and Motivation in Spinoza’s “Ethics”, ed. Noa Naaman-Zauderer. (London: Routledge, 2021), 133-151.
  • Soyarslan, Sanem. “From Ordinary Life to Blessedness: The Power of Intuitive Knowledge in Spinoza’s Ethics”, in Essays on Spinoza’s Ethical Theory eds. Mathew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 236-257.
  • Steinberg, Justin. “Following a Recta Ratio Vivendi: The Practical Utility of Spinoza’s Dictates of Reason”, in Essays on Spinoza’s Ethical Theory, eds. Mathew Kisner and Andrew Youpa. (Oxford: Oxford University Press, 2014), 178-196.
  • Youpa, Andrew. “Spinoza’s Theory of the Good”, in The Cambridge Companion to Spinoza’s Ethics, ed. Olli Koistinen. (Cambridge: Cambridge University Press, 2010), 242-257.
  • Youpa, Andrew. The Ethics of Joy: Spinoza on the Empowered Life. (Oxford: Oxford University Press, 2019).
  • Yovel, Yirmiyahu, ed. Spinoza on Reason and the “Free Man”. (New York: Little Room Press, 2004).

Author Information

Christopher Kluz
Email: christopherkluz@cuhk.edu.cn
The Chinese University of Hong Kong, Shenzhen
China

Leibniz: Modal Metaphysics

Gottfried Wilhelm Leibniz (1646-1716) served as the natural end of the rationalist tradition on the European continent, which included Descartes, Spinoza, and Malebranche. His philosophy was one of the major influences on Kant. Although Leibniz had many philosophical and intellectual interests, he was arguably most concerned with reconciling the freedom required for moral responsibility with the determinism that seemed to be entailed by the new sciences being developed at the time. In fact, in several important writings, including the Theodicy, Leibniz refers to “the free and the necessary and their production as it relates to the origin of evil” as one of the “famous labyrinths where our reason very often goes astray.”

To address this labyrinth, Leibniz developed one of the most sophisticated accounts of compatibilism in the early modern period. Compatibilism is the view that freedom and determinism are compatible and not mutually exclusive. Free actions are fully determined and yet not necessary: they could have been otherwise, had God created another possible world instead. According to Leibniz, free actions, whether performed by God or by humans, are those that are intelligent, spontaneous, and contingent. He developed a framework of possible worlds that is most helpful in understanding the third and most complex criterion, contingency.

Leibniz’s theory of possible worlds went on to influence some of the standard ways in which modal metaphysics is analyzed in contemporary Anglo-American analytic philosophy. The theory of possible worlds that he developed and utilized in his philosophy was extremely nuanced and had implications for many different areas of his thought, including, but not limited to, his metaphysics, epistemology, jurisprudence, and philosophy of religion. Although Leibniz’s Metaphysics is treated in a separate article, this article is primarily concerned with Leibniz’s modal metaphysics, that is, with his understanding of the modal notions of necessity, contingency, and possibility, and their relation to human and divine freedom. For more specific details on Leibniz’s logic and possible worlds semantics, especially as it relates to the New Essays Concerning Human Understanding and to the Theodicy, please refer to “Leibniz’s Logic.”

Table of Contents

  1. The Threat of Necessitarianism
  2. Strategies for Contingency
    1. Compossibility
    2. Infinite Analysis
    3. God’s Choice and Metaphysical and Moral Necessity
    4. Absolute and Hypothetical Necessity
  3. Complete Individual Concepts
  4. The Containment Theory of Truth and Essentialism
    1. Superessentialism
    2. Moderate Essentialism
    3. Superintrinsicalness
  5. Leibnizian Optimism and the “Best” Possible World
  6. Compatibilist Freedom
    1. Human Freedom
    2. Divine Freedom
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. The Threat of Necessitarianism

Necessitarianism is the view according to which everything that is possible is actual, or, to put this in the language of possible worlds, there is only one possible world and it is the actual world. Not only is everything determined, but it is also metaphysically impossible that anything could be otherwise. In the seventeenth century, Baruch Spinoza was the paradigmatic necessitarian. According to Spinoza, insofar as everything follows from the nature of God with conceptual necessity, things could not possibly be other than they are. For Spinoza, necessitarianism had ethical implications—given that it is only possible for the universe to unfold in one way, we ought to learn to accept the way that the world is so that we can live happily. Happiness, Spinoza thought, consists partly and importantly in the rational acceptance of the fully determined nature of existence.

Spinoza’s necessitarianism follows directly from his conception of God and his commitment to the principle of sufficient reason, the thesis that there is a cause or reason why everything is the way it is rather than otherwise. In rejecting the anthropomorphic conception of God, he held instead that God is identical with Nature and that all things are, in some sense, in God. While Leibniz rejected the pantheistic/panentheistic understanding of God that Spinoza held, Leibniz’s view of God nevertheless compelled him to necessitarianism, at least in his early years. This article later reconsiders whether Leibniz’s mature views also commit him to necessitarianism. Consider the following letter that he wrote to Magnus Wedderkopf in 1671. Leibniz writes:

Since God is the most perfect mind, however, it is impossible for him not to be affected by the most perfect harmony, and thus to be necessitated to the best by the very ideality of things…Hence it follows that whatever has happened, is happening, or will happen is best and therefore necessary, but…with a necessity that takes nothing away from freedom because it takes nothing away from the will and the use of reason (A. II. I, 117; L 146).

In this early correspondence, Leibniz reasons that since God’s nature is essentially good, he must, by necessity, only do that which is best. It is impossible for God to do less than the best. After his meeting with Spinoza in 1676, Leibniz’s views related to modality began to shift and became much more nuanced. He went on to develop several strategies for securing contingency and thereby rejecting this early necessitarian position. In his mature metaphysics, Leibniz maintained that God acts for the best, but rejected the claim that God acts for the best by necessity. How, though, did he attempt to reconcile these positions?

2. Strategies for Contingency

a. Compossibility

Leibniz’s first and arguably most important strategy for maintaining contingency is to argue that worlds are not possible with respect to God’s will; rather, worlds are intrinsically possible or impossible. If they were possible only with respect to God’s will, the argument from the letter to Wedderkopf would still be applicable—since God is committed to the best by his own essential nature, there is only one possible world, the actual world which is best. Instead, Leibniz maintains that worlds by their very nature are either possible or impossible. He writes in a piece dated from 1680 to 1682 called On Freedom and Possibility:

Rather, we must say that God wills the best through his nature. “Therefore,” you will say, “he wills by necessity.” I will say, with St. Augustine, that such necessity is blessed. “But surely it follows from this that things exist by necessity.” How so? Since the nonexistence of what God wills to exist implies a contradiction? I deny that this proposition is absolutely true, for otherwise that which God does not will would not be possible. For things remain possible, even if God does not choose them. Indeed, even if God does not will something to exist, it is possible for it to exist, since, by its nature, it could exist if God were to will it to exist. “But God cannot will it to exist.” I concede this, yet, such a thing remains possible in its nature, even if it is not possible with respect to the divine will, since we have defined as in its nature possible anything that, in itself, implies no contradiction, even though its coexistence with God can in some way be said to imply a contradiction (Grua 289; AG 20-21).

According to Leibniz, worlds are possible just in case they are compossible. An individual thing is possible when its properties are logically consistent. For example, winged horses are possible because there is nothing self-contradictory about a horse with wings. But a winged wingless horse would be internally incoherent. By contrast, compossibility is a feature of sets of things, like worlds, rather than individual things. So, when Leibniz insists that worlds are possible by their own nature, he means that the things in that world do not conflict with one another. For example, there is nothing self-contradictory about an unstoppable force or an immovable object. But those objects could not exist in the same world together because their natures would be inconsistent with one another—they rule each other out. So, while there is a possible world with an unstoppable force and a possible world with an immovable object, there is no possible world with both an unstoppable force and an immovable object.
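Put schematically (a modern gloss rather than Leibniz’s own notation), the difference between the possibility of an individual and the compossibility of a collection might be rendered:

```latex
\begin{align*}
\mathrm{Possible}(x) \;&\iff\; \text{the properties of } x \text{ are jointly consistent}\\
\mathrm{Compossible}(x_1,\dots,x_n) \;&\iff\; \text{the combined properties of } x_1,\dots,x_n \text{ are jointly consistent}
\end{align*}
```

On this rendering, the unstoppable force and the immovable object are each possible taken singly, yet the pair is not compossible.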

Although Leibniz often analyzes compossibility as a logical relation holding between the created essences of any given world, he sometimes treats it as a relation between the created essences and the laws of nature which God has decreed in each world. He writes in his correspondence to Arnauld:

I think there is an infinity of possible ways in which to create the world, according to the different designs which God could form, and that each possible world depends on certain principal designs or purposes of God which are distinctive of it, that is, certain primary free decrees (conceived sub ratione possibilitatis) or certain laws of the general order of this possible universe with which they are in accord and whose concept they determine, as they do also the concepts of all the individual substances which must enter into this same universe (G. II, 51; L 333).

Passages like this suggest that, on this second analysis, compossibility requires more than the mere logical consistency of a world’s essences: a set of essences forms a possible world only if a single set of laws of nature governs them all.

Although there are several different ways to analyze Leibniz’s notion of compossibility, there is good reason to think that he believed that preserving the intrinsic nature of the possibility of worlds was crucial to salvaging contingency. At one point he even suggests that contingency would be destroyed without such an account. He writes to Arnauld:

I agree there is no other reality in pure possibles than the reality they have in the divine understanding…For when speaking of possibilities, I am satisfied that we can form true propositions about them. For example, even if there were no perfect square in the world, we would still see that it does not imply a contradiction. And if we wished absolutely to reject pure possibles, contingency would be destroyed; for, if nothing were possible except what God actually created, then what God created would be necessary, in the case he resolved to create anything (G. II, 45; AG 75).

Importantly, the possibility of worlds is outside the scope of God’s will. God does not determine what is possible, any more than he determines mathematical, logical, or moral truths.

b. Infinite Analysis

Another strategy for understanding necessity and contingency is through Leibniz’s theory of infinite analysis. According to Leibniz, necessity and contingency are not defined in terms of possible worlds in the way that is common in contemporary metaphysics. According to the standard understanding in contemporary metaphysics, a proposition is possible just in case it is true in some possible world, and a proposition is necessary just in case it is true in every possible world. But for Leibniz, a proposition is necessary if and only if it can be reduced to an identity statement in a finite number of steps. Propositions are contingent just in case it would take an infinite number of steps to reduce the statement to an identity statement. He writes in a piece from 1686 called On Contingency:

Necessary truths are those that can be demonstrated through an analysis of terms, so that in the end they become identities, just as in algebra an equation expressing an identity ultimately results from the substitution of values. That is, necessary truths depend upon the principle of contradiction. Contingent truths cannot be reduced to the principle of contradiction; otherwise everything would be necessary and nothing would be possible other than that which actually attains existence (Grua 303; AG 28).
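The two accounts of necessity and contingency just contrasted can be summarized schematically. The notation below is a modern gloss and not Leibniz’s own:

```latex
\begin{align*}
\textbf{Possible-worlds account:}\quad &\Box p \iff p \text{ is true in every possible world}\\
&\Diamond p \iff p \text{ is true in some possible world}\\[4pt]
\textbf{Leibniz's account:}\quad &\Box p \iff p \text{ reduces to an identity in finitely many steps}\\
&p \text{ is contingent} \iff \text{the reduction of } p \text{ requires infinitely many steps}
\end{align*}
```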

To see how the theory of infinite analysis works, recall that Leibniz holds that every truth is an analytic truth. Every true proposition is one where the concept of the predicate is contained in the concept of the subject. One way to understand this reduction is to ask, “Why is this proposition true?” Since every truth is an analytic truth, every truth is like, “A bachelor is an unmarried male.” So why is it true that a bachelor is an unmarried male? It is true because the concept of the predicate, unmarried male, is contained in the concept of the subject, bachelor. A bachelor just is an unmarried male.

How would the theory of infinite analysis work for explaining contingency though? Consider the following propositions:

    1. 1+1=2
    2. Judas is the betrayer of Christ.

The first proposition is a simple mathematical truth that almost everyone in the 17th and 18th centuries would consider to be a necessary truth. For Leibniz, it is a necessary truth because it can be reduced to an identity statement in a finite number of steps; that is, we could move from 1+1=2 to 1+1=1+1 in a straightforward manner. We could make a similar move for other mathematical and logical truths that are even more straightforward. The law of identity, that “A is identical to A,” for example, is another truth that would take a finite number of steps to reduce to an identity.

The second proposition is an example of a contingent truth because the reduction would take an infinite number of steps to reach an identity statement. To understand how this analysis occurs, consider why it is true that Judas is the betrayer of Christ. This analysis would require reasons for Judas’s nature and his existence. Judas exists because God understood in his infinite wisdom that the best possible world would be one where Judas betrays Christ and Christ suffers. And why is Judas part of the best possible world? The only way to answer that question would be for God to compare the actual world with the infinite plurality of other possible worlds—an analysis that would take an infinite number of steps, even for God. Put simply, the sufficient reason for Judas’s contingent existence is that it is deemed to be best by God.

Importantly, Leibniz holds that not even God could complete the infinite analysis discursively; instead, God completes the analysis intuitively, in one feat of the mind. He writes in On Contingency:

For in necessary propositions, when the analysis is continued indefinitely, it arrives at an equation that is an identity; that is what it is to demonstrate a truth with geometrical rigor. But in contingent propositions one continues the analysis to infinity through reasons for reasons, so that one never has a complete demonstration, though there is always, underneath, a reason for the truth, but the reason is understood completely only by God, who alone traverses the infinite series in one stroke of the mind (Grua 303; AG 28).

c. God’s Choice and Metaphysical and Moral Necessity

Another strategy for salvaging contingency is not at the level of worlds, nor in God’s will, but at the level of God’s wisdom; that is, in the choice to actualize certain substances instead of others. Leibniz holds that we must take the reality of God’s choice seriously. As he writes in the Theodicy, “The nature of things, if taken as without intelligence and without choice, has in it nothing sufficiently determinant” (G. VI, 322; H 350).

Even if the plurality of worlds remain possible in themselves as the first strategy holds, or propositions are contingent because of the infinite analysis theory as the second strategy holds, God’s choice still plays an important role in the causal and explanatory chain of events leading to the actualization of a world. In this way, Leibniz’s modal metaphysics stands again in stark contrast to Spinoza. For Spinoza, the world just is God, and in some sense, all things are in God. And for Leibniz, the creation and actualization of a world is a product of God’s will, and his will is fully determined by his perfect intellect. In some texts, Leibniz locates the source of contingency purely in God’s choice of the best, which cannot be demonstrated. And since the choice of the best cannot be demonstrated, God’s choice is contingent. He writes in On Contingency:

Assuming that the proposition “the proposition that has the greater reason for existing [that is, being true] exists [that is, is true]” is necessary, we must see whether it then follows that the proposition that has the greater reason for existing [that is, being true] is necessary. But it is justifiable to deny the consequence. For, if by definition a necessary proposition is one whose truth can be demonstrated with geometrical rigor, then indeed it could be the case that this proposition is demonstrable: “every truth and only a truth has greater reason,” or this: “God always acts with the highest wisdom.” But from this one cannot demonstrate the proposition “contingent proposition A has greater reason [for being true]” or “contingent proposition A is in conformity with divine wisdom.” And therefore it does not follow that contingent proposition A is necessary. So, although one can concede that it is necessary for God to choose the best, or that the best is necessary, it does not follow that what is chosen is necessary, since there is no demonstration that it is the best (Grua 305; AG 30).

Related to God’s choice is the distinction between moral and metaphysical necessity. Moral necessity is used by Leibniz in several different writings, beginning with his earliest jurisprudential writings up to and including his Theodicy. In the 17th century, moral necessity was very often understood in terms of the legal use of “obligation,” a term which Leibniz also applied to God. He writes in the Nova Methodus from 1667:

Morality, that is, the justice or injustice of an act, derives however from the quality of the acting person in relation to the action springing from previous actions, which is described as moral quality. But just as the real quality is twofold in relation to action: the power of acting (potential agendi), and the necessity of acting (necessitas agendi); so also the moral power is called right (jus), the moral necessity is called obligation (obligatio) (A. VI. i. 301).

Leibniz echoes this sentiment into the 1690s in other jurisprudential writings. In the Codex Juris from 1693, Leibniz insists that “Right is a kind of moral power, and obligation is a moral necessity” (G. III. 386; L 421). In short, Leibniz remained remarkably consistent throughout his career in holding that “right” and “obligation” are moral qualities that provide the capacity to do what is just.

Importantly, right and obligation are not just related notions—they have force on each other. As Leibniz writes in the Nova Methodus, “The causes of right in one person are a kind of loss of right in another and it concerns the process of acquiring an obligation. Conversely, the ways of losing an obligation are causes of recovering a right, and can be defined as liberation” (A. VI. vi, 305-306). The significance of the fact that a right imposes an obligation cannot be overstated. It is precisely for this reason that we can undertake the theodicean project in the first place. We have proper standing to ask for an explanation for God’s permission of suffering because we have a right to the explanation. And we have a right to the explanation because God is morally necessitated or obligated to create. For a point of comparison, contrast this with God’s response to Job when he demands an explanation for his own suffering. God responds, “Who has a claim against me that I must pay? Everything under heaven belongs to me” (Job 41:11). God does not provide an explanation for Job’s suffering because Job does not have proper standing to request such an explanation.

Leibniz contrasts moral necessity with metaphysical necessity. In the Theodicy, he describes “metaphysical necessity, which leaves no place for any choice, presenting only one possible object, and moral necessity, which obliges the wisest to choose the best” (G. VI, 333; H 367). This distinction becomes important for Leibniz because it allows him to say that God’s choice to create the best of all possible worlds is morally necessary, but not metaphysically necessary. God is morally bound to create the best world due to his divine nature, but since there are other worlds which are possible in themselves, his choice is not metaphysically necessary. Leibniz writes again in the Theodicy, “God chose between different courses all possible: thus, metaphysically speaking, he could have chosen or done what was not the best; but he could not morally speaking have done so” (G. VI, 256; H 271).

Some commentators insist that the dichotomy between metaphysical and moral necessity is illusory: either it is necessary that God create the best of all possible worlds, or it is not. Nevertheless, Leibniz took moral necessity to do both logical and theological work. Only with moral necessity could he preserve both the goodness and wisdom of God. If moral necessity is vacuous, then Leibniz would seem to be committed to necessitarianism.

d. Absolute and Hypothetical Necessity

One final strategy for understanding contingency is to make use of a well-known distinction between absolute and hypothetical necessity. This strategy was most fully utilized in Leibniz’s correspondence with Arnauld in the mid-1680s. Arnauld was deeply concerned about the implications of the theory of complete individual concepts for freedom. Since Leibniz held that every individual contains within itself complete truths about the universe, past, present, and future, it seems that there can be no room for freedom. If it is included in Judas’s concept from the moment the universe was created that he would ultimately betray Christ, then it seems as if it was necessary that he do so; Judas could not have done otherwise. Leibniz’s response draws on the distinction between absolute and hypothetical necessity. Consider the following propositions:

    1. Necessarily, Caesar crosses the Rubicon.
    2. Necessarily, if Caesar exists, then he crosses the Rubicon.

Leibniz would deny the first proposition, but readily accept the second. He denies the first because it is not a necessary truth that Caesar crosses the Rubicon; the proposition is not comparable to necessary truths like those of mathematics and logic, which reduce to identity statements and whose denials are self-contradictory. He accepts the second because, although it is bound up in Caesar’s essence that he crosses the Rubicon, it does not follow that he does so with absolute necessity; it is necessary that Caesar crosses the Rubicon only on the hypothesis that Caesar exists. And, of course, Caesar might not have existed at all. God might have actualized a world without Caesar because such worlds are compossible, that is, possible in themselves. This is what Leibniz means when he claims that contingent truths are certain, but not necessary. To use a simple analogy, once God pushes over the first domino, it is certain that the chain of dominoes will fall, but God might have pushed over a completely different set of dominoes instead. Once a series is actualized, the laws of the series govern it with certainty. And yet the series is not metaphysically necessary, since there are other series that God could have actualized instead, were it not for his divine benevolence. Leibniz writes in the Discourse on Metaphysics from 1686:

And it is true that we are maintaining that everything that must happen to a person is already contained virtually in his nature or notion, just as the properties of a circle are contained in its definition; thus the difficulty still remains. To address it firmly, I assert that connection or following is of two kinds. The one whose contrary implies a contradiction is absolutely necessary; this deduction occurs in eternal truths, for example, the truths of geometry. The other is necessary only ex hypothesi and, so to speak, accidentally, but it is contingent in itself, since its contrary does not imply a contradiction. And this connection is based not purely on ideas of God’s simple understanding, but on his free decrees and on the sequence of the universe (A. VI. iv, 1546-1547; AG 45).

Absolute necessity, then, applies to necessary truths that are outside the scope of God’s free decrees, and hypothetical necessity applies to contingent truths that are within the scope of God’s free decrees.
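In modern modal notation (again a gloss, not Leibniz’s own), the distinction is one of scope: the necessity attaches to the whole conditional rather than to the consequent:

```latex
\begin{align*}
\text{Accepted (hypothetical necessity):}\quad &\Box(\text{Caesar exists} \rightarrow \text{Caesar crosses the Rubicon})\\
\text{Denied (absolute necessity):}\quad &\Box(\text{Caesar crosses the Rubicon})
\end{align*}
```

Inferring the second from the first would require that Caesar’s existence be itself necessary, which Leibniz denies, since God could have actualized a world without Caesar.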

3. Complete Individual Concepts

According to Leibniz, one of the basic features of a substance is that every substance has a “complete individual concept” (CIC, hereafter). The CIC is an exhaustive account of every single property of each substance. He writes in the Discourse on Metaphysics, “the nature of an individual substance or of a complete being is to have a notion so complete that it is sufficient to contain and to allow us to deduce from it all the predicates of the subject to which this notion is attributed” (A. VI. iv, 1540; AG 41). From this logical conception of substance, Leibniz argues that the properties included in the CIC span the past, present, and future. The CIC informs what is sometimes referred to as Leibniz’s doctrine of marks and traces. He illustrates this thesis using the example of Alexander the Great in the Discourse, writing:

Thus, when we consider carefully the connection of things, we can say that from all time in Alexander’s soul there are vestiges of everything that has happened to him and marks of everything that will happen to him and even traces of everything that happens in the universe, even though God alone could recognize them all (A. VI. iv, 1541; AG 41).

According to Leibniz, then, in analyzing any single substance, God would be able to understand every other substance in the universe, since every substance is conceptually connected to every other substance. For example, in analyzing the concept of Jesus, God would also be able to understand the concept of Judas. Because it is part of Jesus’s CIC that he was betrayed by Judas, it is also part of Judas’s CIC that he will betray Jesus. Every truth about the universe could be deduced this way as well. If a pebble were to fall off a cliff on Neptune in the year 2050, that too would be included in Jesus’s CIC. To use one image of which Leibniz is quite fond, every drop in the ocean is connected to every other drop in the ocean, even though the ripples from those drops could only be understood by God. He writes in the Theodicy:

For it must be known that all things are connected in each one of the possible worlds: the universe, whatever it may be, is all of one piece, like an ocean: the least movement extends its effect there to any distance whatsoever, even though this effect become less perceptible in proportion to the distance. Therein God has ordered all things beforehand once for all, having foreseen prayers, good and bad actions, and all the rest; and each thing as an idea has contributed, before its existence, to the resolution that has been made upon the existence of all things; so that nothing can be changed in the universe (any more than in a number) save its essence or, if you will, save its numerical individuality. Thus, if the smallest evil that comes to pass in the world were missing in it, it would no longer be this world; which nothing omitted and all allowance made, was found the best by the Creator who chose it (G. VI. 107-108; H 128).

In addition to describing substances as possessing a CIC, Leibniz also refers to the essential features of a substance as perception and appetition. These features are explained in more detail in an article on Leibniz’s Philosophy of Mind. In short, though, Leibniz held that every single representation of each substance is already contained within itself from the moment it is created, such that the change from one representation to another is brought about by its own conatus. The conatus, or internal striving, is what Leibniz refers to as the appetitions of a substance. Leibniz writes in the late Principles of Nature and Grace:

A monad, in itself, at a moment, can be distinguished from another only by its internal qualities and actions, which can be nothing but its perceptions (that is, the representation of the composite, or what is external, in the simple) and its appetitions (that is, its tendencies to go from one perception to another) which are the principles of change (G. VI. 598; AG 207).

Because every perception of the entire universe is contained within each substance, the entire history of the world is already fully determined. This is the case not just for the actual world after the act of creation, but it is true for every possible world. In fact, the fully determined nature of every possible world is what allows God in his infinite wisdom to actualize the best world. God can assess the value of every world precisely because the entire causal history, past, present, and future is already set.

4. The Containment Theory of Truth and Essentialism

The main article on Leibniz describes his epistemological account in more general terms, but Leibniz’s theory of truth has implications for freedom, so some brief comments bear mentioning. According to Leibniz, propositions are true not in virtue of their correspondence to the world, but in virtue of the relationship between the subject and the predicate. The “predicate in notion principle” (PIN, hereafter), as he describes it to Arnauld, is the view according to which “In every true affirmative proposition, whether necessary or contingent, universal or particular, the notion of the predicate is in some way included in that of the subject. Praedicatum inest subjecto; otherwise I do not know what truth is” (G. II, 56; L 337). For example, “Judas is the betrayer of Christ” is true not because there is a Judas who betrays Christ in the actual world, but because the predicate “betrayer of Christ” is contained in the concept of the subject, Judas. Judas’s essence, his thisness, or haecceity, to use the medieval terminology, is partly defined by his betrayal of Christ.

The PIN theory of truth, however, poses significant problems for freedom. After all, if it is part of Judas’s essence that he is the betrayer of Christ, then it seems that Judas must betray Christ. And if Judas must betray Christ, then it seems that he cannot do otherwise. And if he cannot do otherwise, then Judas cannot be morally responsible for his actions. Judas cannot be blameworthy for betraying Christ if doing so was part of his very essence. And yet, despite this difficulty, Leibniz maintained a compatibilist theory of freedom, on which Judas’s actions were certain, but not necessary.

Since Leibniz holds that every essence can be represented by God as having a complete concept and that every proposition is analytically true, he maintains that every property is essential to a substance’s being. Leibniz, therefore, straightforwardly adopts an essentialist position. Essentialism is the metaphysical view according to which some properties of a thing are essential to it, such that if it were to lose that property, the thing would cease to exist. Leibniz’s essentialism has been a contested issue in the secondary literature during the first few decades of the twenty-first century. The next section of this article highlights three of the more dominant and interesting interpretations of Leibniz’s essentialism in his mature philosophy: superessentialism, moderate essentialism, and superintrinsicalness.

a. Superessentialism

The most straightforward way of interpreting Leibniz’s mature ontology is that he agrees with the thesis of superessentialism. According to superessentialism, every property is essential to an individual substance’s CIC such that if the substance were to lack any property at all, then the substance would not exist. Leibniz often explains his superessentialist position in the context of explaining God’s actions. For example, in one passage he writes, “You will object that it is possible for you to ask why God did not give you more strength than he has. I answer: if he had done that, you would not exist, for he would have produced not you but another creature” (Grua 327).

In his correspondence with Arnauld, Leibniz makes use of the notion of “possible Adams” to explain what looks very much like superessentialism. In describing another possible Adam, Leibniz stresses to Arnauld the importance of taking every property to be part of a substance, or else we would only have an indeterminate notion, not a complete and perfect representation of him. This fully determinate notion is the way in which God conceives of Adam when evaluating which set of individuals to create when a world is actualized. Leibniz describes this perfect representation to Arnauld, “For by the individual concept of Adam I mean, to be sure, a perfect representation of a particular Adam who has particular individual conditions and who is thereby distinguished from an infinite number of other possible persons who are very similar but yet different from him…” (G. II, 20; LA. 15). The most natural way to interpret this passage is along the superessentialist reading such that if there were a property that were not essential to Adam, then we would have a “vague Adam.” Leibniz even says as much to Arnauld. He writes:

We must not conceive of a vague Adam, that is, a person to whom certain attributes of Adam belong, when we are concerned with determining whether all human events follow from positing his existence; rather we must attribute to him a notion so complete that everything that can be attributed to him can be deduced from it (G. II, 42; AG 73).

The notion of “vague Adams” is further described in a famous passage from the Theodicy. Leibniz describes the existence of other counterparts of Sextus in other possible worlds, that, though complete concepts in their own way, are nevertheless different from the CIC of Sextus in the actual world. Leibniz writes:

I will show you some, wherein shall be found, not absolutely the same Sextus as you have seen (that is not possible, he carries with him always that which he shall be) but several Sextuses resembling him, possessing all that you know already of the true Sextus, but not all that is already in him imperceptibly, nor in consequence all that shall yet happen to him. You will find in one world a very happy and noble Sextus, in another a Sextus content with a mediocre state, a Sextus, indeed, of every kind and endless diversity of forms (G. VI, 363; H 371).

These passages describing other possible Adams and other possible Sextuses suggest that Leibniz was committed to the very strong thesis of superessentialism. Because every property is essential to an individual’s being, every substance is world-bound; that is, each substance only exists in its own world. If any property of an individual were different, then the individual would cease to exist, but there are also an infinite number of other individuals that vary in different degrees, which occupy different worlds. For example, a Judas who was more loyal and did not ultimately betray Christ would not be the Judas of the actual world. Importantly, one small change would also ripple across and affect every other substance in the universe as well. After all, a loyal Judas who does not betray Christ would also mean that Christ was not betrayed, so it would affect his complete concept and essence as well. Put simply, on the superessentialist interpretation of Leibniz’s metaphysics, due to the complete interconnectedness of all things, if any single property of an individual in the world were different than it is, then every substance in the world would be different as well.

The most important worry that Arnauld had about Leibniz’s philosophy was the way in which essentialism threatens freedom. Arnauld thought that human freedom must entail the ability to do otherwise. In the language of possible worlds, this means that an individual is free only if they do otherwise in another possible world. Of course, such a view requires the very same individual to exist in another possible world. According to Arnauld, Judas was free in his betrayal of Christ because there is another possible world where Judas does not betray Christ. Freedom requires the actual ability to do otherwise. But Arnauld worried that, on Leibniz’s superessentialism, it is not really Judas who refrains from betraying Christ in another possible world but only a counterpart, an individual very similar to him, so we cannot truly say that Judas’s action was free. Leibniz anticipates this sort of objection in the Discourse, writing, “But someone will say, why is it that this man will assuredly commit this sin? The reply is easy: otherwise it would not be this man” (A. VI. iv, 1576; AG 61). Leibniz, like most classical compatibilists, argues that the actual ability to do otherwise is not a necessary condition for freedom. All that is required is the hypothetical ability to do otherwise. A compatibilist like Leibniz would insist that Judas’s action is nevertheless free even though he cannot do otherwise: if Judas’s past or the laws of nature were different, then he might not have betrayed Christ. Framing freedom in these hypothetical terms is what allows Leibniz to say that the world is certain, but not necessary.

Leibniz’s motivation for superessentialism is driven partly by theodicean concerns. The basic issue in the classical problem of evil is the apparent incompatibility between a perfectly loving, powerful, and wise God on the one hand and cases of suffering on the other. Why would God permit Jesus to suffer? Leibniz’s answer here as it relates to superessentialism is twofold. First, while Jesus’s suffering is indeed tragic, Leibniz contends that it is better for Jesus to exist and suffer than not to exist at all. Second, because of the complete interconnectedness of all things, without Jesus’s suffering, the entire history of the world would be different. Jesus’s suffering is very much part of the calculus when God is discerning which world is the best. And importantly, God does not choose that Jesus suffer; he only chooses a world in which Jesus suffers. He writes in the Primary Truths from 1689:

Properly speaking, he did not decide that Peter sin or that Judas be damned, but only that Peter who would sin with certainty, though not with necessity, but freely, and Judas who would suffer damnation would attain existence rather than other possible things; that is, he decreed that the possible notion become actual (A. VI. iv, 1646; AG 32).

b. Moderate Essentialism

Despite the evidence for interpreting Leibniz as a superessentialist, there is also textual support for the claim that superessentialism is simply too strong a thesis. One reason to adopt a weaker version of essentialism is to be logically consistent with transworld identity, the thesis that individuals can exist across possible worlds. Some commentators like Cover and O’Leary-Hawthorne argue for the weaker essentialist position on the grounds that superessentialism cannot utilize the scholastic distinction between essential and accidental properties, of which Leibniz sometimes makes use. According to moderate essentialism, Leibniz holds that properties attributed to the species are essential in one way and properties attributed to individuals are essential in a different way.

The weaker thesis of moderate essentialism is the view that only monadic properties are essential to an individual substance, and relational or extrinsic properties should be reducible to monadic properties. The result of this view is that an individual is not “world-bound”; that is, a counterpart of that individual might exist in another possible world, and the essential properties of that individual are what designate it across possible worlds. What follows then is that Jesus, for example, could be said to be free in giving himself up in the Garden of Gethsemane because in another possible world, a counterpart of Jesus did not give himself up. Problematically though, Leibniz explicitly mentions in one of the letters to Arnauld that the laws of nature are indeed a part of an individual’s CIC. Leibniz writes to Arnauld, “As there exist an infinite number of possible worlds, there exists also an infinite number of laws, some peculiar to one world, some to another, and each possible individual contains in the concept of him the laws of his world” (G. II, 40; LA 43).

To reconcile the passages where Leibniz suggests that individuals are world-bound, some commentators argue that it is logically consistent to hold that only the perception or expression of the other substance must exist, but not the substance itself. And since monads are “windowless,” that is, causally isolated, the other substance need not exist at all. In his late correspondence with Des Bosses, Leibniz suggests this very thing, namely, that God could create one monad without the rest of the monads in that world. Leibniz writes:

My reply is easy and has already been given. He can do it absolutely; he cannot do it hypothetically, because he has decreed that all things should function most wisely and harmoniously. There would be no deception of rational creatures, however, even if everything outside of them did not correspond exactly to their experiences, or indeed if nothing did, just as if there were only one mind… (G. II, 496; L 611).

The letter to Des Bosses provides compelling evidence for moderate essentialism, but it does not entail it. In fact, conceiving of God’s ability to create only one monad in the actual world with only the expressions of every other substance is perfectly consistent with the superessentialist interpretation. The substances need not actually exist in order to support the claim that every property of a CIC is necessary for that substance. Put differently, if it were part of Peter’s CIC that he denied Christ three times, it need not follow that Christ actually existed for this property to hold, so long as the perceptions of Christ follow from the states of Peter’s substance.

c. Superintrinsicalness

One final variation of essentialism which we might attribute to Leibniz is called superintrinsicalness. This thesis, defended primarily by Sleigh, states that every individual substance has all its properties intrinsically. This view is distinct from moderate essentialism in a very important way. According to superintrinsicalness, both monadic and extrinsic properties are essential to an individual’s CIC. But, contrary to the superessentialist thesis, the properties that compose an individual’s CIC could be different; that is, some components of a substance’s CIC are necessary, and some are contingent. Leibniz writes in the Discourse:

For it will be found that the demonstration of this predicate of Caesar is not as absolute as those of numbers or of geometry, but that it supposes the sequence of things that God has freely chosen, a sequence based on God’s first free decree always to do what is most perfect and on God’s decree with respect to human nature, following out of the first decree, that man will always do (although freely) that which appears to be best. But every truth based on these kinds of decrees is contingent, even though it is certain; for these decrees do not change the possibility of things, and, as I have already said, even though it is certain that God always chooses the best, this does not prevent something less perfect from being and remaining possible in itself, even though it will not happen, since it is not its impossibility but its imperfection which causes it to be rejected. And nothing is necessary whose contrary is possible (A. VI. iv, 1548; AG 46).

One of the consequences of this view is that a substance’s CIC is contingent on the will of God. For example, on this view, it is a logical possibility that Adam could have had a completely different set of properties altogether. And since a substance could have had a completely different CIC, and relational properties are part of that CIC, superintrinsicalness denies that substances are world-bound. Since Leibniz denies world-bound individuals on this interpretation, he would not need the sort of counterpart theory that comes along with the superessentialist reading. After all, Leibniz’s depiction of counterparts states that there are individuals in other possible worlds that, though very similar, are numerically distinct from each other. But on the superintrinsicalness thesis, it may be the case that an individual in another possible world is identical to an individual in the actual world.

There is some textual evidence supporting superintrinsicalness as well. Leibniz writes to Arnauld, “Thus, all human events could not fail to occur as in fact they did occur, once the choice of Adam is assumed; but not so much because of the individual concept of Adam, although this concept includes them, but because of God’s plans, which also enter into the individual concept of Adam” (G. II, 51; LA 57). And yet, if a substance could have had a different CIC, then the notion of a haecceity becomes meaningless. The haecceity serves to individuate substances across possible worlds. If the haecceity could be different than it is, then the concept loses its purpose. We could not pick out the Caesar of this world in another possible world if the very thing that makes Caesar who he is could change.

And yet, if Leibniz accepted superintrinsicalness, then he would have had an easy response to Arnauld’s worry that the complete concept doctrine diminishes the possibility of freedom. Leibniz could have just responded to Arnauld that Judas freely betrayed Christ because, in another possible world, he did not betray Christ; although his haecceity in the actual world determined that he would betray Christ, the haecceity in another possible world may be different such that he did not betray Christ. But this is not the response that Leibniz gives. Instead, he draws on some of the strategies for contingency in defending a compatibilist view of freedom that were discussed earlier.

5. Leibnizian Optimism and the “Best” Possible World

To paraphrase Ivan in The Brothers Karamazov, “The crust of the earth is soaked by the tears of the suffering.” Events like the Thirty Years War deeply affected Leibniz. His theodicean project was an attempt at an explanation and justification for God’s permission of such suffering. Why would a perfectly wise, powerful, and good God permit suffering? And even if we were to grant that God must permit suffering to allow for greater goods such as compassion and empathy, why must there be so much of it? Would the world not have been better with less suffering? The crux of Leibniz’s philosophical optimism was that creating this world was the best that God could do—it was metaphysically impossible for the world to be better than it is. And so, God is absolved of responsibility for not creating something better. But how could Leibniz maintain a position in such contrast to our intuitions that the world could be better with less suffering?

Arguably the most famous part of Leibniz’s philosophy is his solution to the problem of evil. The problem of evil is the most significant objection to classical theism, and it is one that Leibniz developed an entire system of possible worlds to address. He argues that God freely created the best of all possible worlds from amongst an infinite plurality of alternatives. Voltaire mocked such optimism in his Candide, suggesting that, at best, if this really is the best world that God could create, then God certainly is not worth much reverence and, at worst, that God does not exist at all. But what exactly did Leibniz mean by the “best” possible world? And was Voltaire’s criticism warranted? Leibniz has several responses to the problem of evil which draw on his complex theory of possible worlds.

First, the basis for Voltaire’s misinterpretation is the false assumption that the actual world is morally best. Instead, Leibniz contends that the world is metaphysically best. But how are these “moral” and “metaphysical” qualifications related to one another? After all, Leibniz sometimes remarks, as he does in the Discourse, that “God is the monarch of the most perfect republic, composed of all minds, and the happiness of this city of God is his principal purpose” (A. VI. iv, 1586; AG 67). And yet at other times, as in the Theodicy, he contends that “The happiness of rational creatures is one of the aims God has in view; but it is not his whole aim, nor even his ultimate aim” (G. VI, 169-170; H 189). It seems then that Leibniz is, at least on the face of it, unsure how much God is concerned with the happiness of creation. Happiness is a “principal” purpose of God, and yet not an “ultimate aim.”

One way to reconcile these apparently disparate positions is to be clearer about what Leibniz means by happiness. Leibniz often reminds the reader that the actual world is not best in virtue of guaranteeing every substance the most pleasurable existence. Rather, he holds, as he does in the Confessio, that “Happiness is the state of mind most agreeable to it, and nothing is agreeable to a mind outside of harmony” (A. VI. iii, 116; CP 29). Put differently, the best of all possible worlds is metaphysically best because it is the world where rational minds can contemplate the harmonious nature of creation. Leibniz goes into more detail in The Principles of Nature and Grace, writing:

It follows from the supreme perfection of God that in producing the universe he chose the best possible plan, containing the greatest variety together with the greatest order; the best arranged situation, place and time; the greatest effect produced by the simplest means; the most power, the most knowledge, the most happiness and goodness in created things of which the universe admitted (G. VI, 603).

In short, Leibniz holds that while there is concern with the happiness of minds during the act of creation, the kind of happiness that God wishes to guarantee is not physical pleasure or the absence of physical pain, but instead the rational recognition that the actual world is the most harmonious.

Second, Leibniz contends that “best” does not mean “perfect” or even “very good.” While it is true that we oftentimes have no idea why bad things sometimes happen to good people and why good things sometimes happen to bad people, what we can be sure of is that God, as an ens perfectissimum, a most perfect being, chose this world because it was the best. And it is the best because it contains the most variety and plurality of substances governed by the fewest laws of nature. He writes in the Discourse:

One can say, in whatever manner God might have created the world, it would always have been regular and in accordance with a certain general order. But God has chosen the most perfect world, that is, the one which is at the same time the simplest in hypotheses and richest in phenomena (A. VI. iv, 1538; AG 39).

Even if we were to grant that Leibniz means something particular by “best,” how should we understand the criteria that the “best” world is the one that is richest in phenomena and governed by the simplest laws?

It is critical to note that Leibniz has more than one criterion for the best possible world. If there were only one criterion, such as the concern for the happiness of creatures, then there is a problem of maximization: for whatever world God created, he could have created another world with more happiness. And since God could always create a better world, he could never act for the best, for there is no best. But since there is a world, either this is not the best of all possible worlds, or there is no maximally perfect being. Malebranche (and Aquinas) held that there was no best world, and Leibniz wished to distance himself from their views. He writes in the Discourse, “They [the moderns like Malebranche] imagine that nothing is so perfect that there is not something more perfect—this is an error” (A. VI. iv, 1534; AG 37).

Rather than maximizing one feature of a world, which would be impossible, Leibniz reasons that God must optimize the competing criteria of richness of phenomena, simplicity of laws, and abundance of creatures. He writes in the Discourse:

As for the simplicity of the ways of God, this holds properly with respect to his means, as opposed to the variety, richness, and abundance, which holds with respect to his ends or effects. And the one must be in balance with the other, as are the costs of a building and the size and beauty one demands of it (A. VI. iv, 1537; AG 39).

God, like an architect with unlimited resources, must nevertheless weigh competing variables to optimize the best creation.

Even if we grant that God considers competing variables in creating the best world, we might still wonder why these particular variables are the ones that matter. Although it is unclear why Leibniz chose variety, richness, and abundance as the criteria, he points to simplicity as a possible overarching principle. Unfortunately, simplicity alone will not do, for it would be simpler to have only one substance rather than an abundance of substances. It seems then that simplicity, in conjunction with a world that is worthy of the majesty of God, forms the underlying criterion for the best of all possible worlds.

The notion of simplicity is critical for Leibniz’s theodicean account. In fact, simplicity is the key concept that sets Leibniz’s account of God’s justice directly in line with his contemporary, Nicolas Malebranche. Leibniz remarks at one point that Malebranche’s theodicean account reduces in most substantial ways to his own. He writes in the Theodicy, “One may, indeed, reduce these two conditions, simplicity and productivity, to a single advantage, which is to produce as much perfection as is possible: thus Father Malebranche’s system in this point amounts to the same as mine” (G. VI, 241; H 257). The similarities of their accounts are readily apparent. Consider Malebranche’s remark that “God, discovering in the infinite treasures of his wisdom an infinity of possible worlds…, determines himself to create that world…that ought to be the most perfect, with respect to the simplicity of the ways necessary to its production or to its conservation” (OCM. V, 28).

Third, Leibniz appeals to intellectual humility and insists that our intuition that this is not the best possible world is simply mistaken. If we had God’s wisdom, then we would understand that this is the best possible world. Part of the appeal to intellectual humility is also the recognition that God evaluates the value of each world in its totality. In just the same way that it would be unfair to judge the quality of a film by looking at a single frame of the reel, Leibniz reasons that it is also unfair to judge the quality of the world by any singular instance of suffering. And given our relatively small existence in the enormous history of the universe, even long periods of suffering should be judged with proper context. World wars, global pandemics, natural disasters, famine, genocide, slavery, and total climate catastrophe are immense tragedies to be sure, but they mean relatively little in the context of the history of the universe.

The recognition that these cases of suffering mean little should not be interpreted to imply that they mean nothing. A perfectly benevolent God cares about the suffering of every part of creation, and yet God must also weigh that suffering against the happiness and flourishing of the entirety of the universe, past, present, and future. Moreover, Leibniz reasons that every bit of suffering will ultimately lead to a greater good that redeems or justifies it. To use the language of the contemporary literature in the philosophy of religion, there is no “gratuitous evil.” Every case of evil ultimately helps improve the value of the entire universe. In a mature piece called the Dialogue on Human Freedom and the Origin of Evil, Leibniz writes:

I believe that God did create things in ultimate perfection, though it does not seem so to us considering the parts of the universe. It’s a bit like what happens in music and painting, for shadows and dissonances truly enhance the other parts, and the wise author of such works derives such a great benefit for the total perfection of the work from these particular imperfections that it is much better to make a place for them than to attempt to do without them. Thus, we must believe that God would not have allowed sin nor would he have created things he knows will sin, if he could not derive from them a good incomparably greater than the resulting evil (Grua 365-366; AG 115).

6. Compatibilist Freedom

a. Human Freedom

Leibniz was deeply concerned with how to properly understand freedom. In one sense, though, his hands were tied; given his fundamental commitment to the principle of sufficient reason as one of the “great principles of human reason” (G. VI, 602), Leibniz was straightforwardly committed to determinism. Since the principle of sufficient reason rules out causes that are isolated from the causal series, one of the paradigmatic marks of thoroughgoing libertarian accounts of free will, the most that Leibniz could hope for was a kind of compatibilist account of freedom. And indeed, Leibniz, like most of his contemporaries, openly embraced the view that freedom and determinism are compatible.

According to the account of freedom developed in his Theodicy, free actions are those that satisfy three individually necessary and jointly sufficient conditions—they must be intelligent, spontaneous, and contingent. He writes in the Theodicy:

I have shown that freedom according to the definition required in the schools of theology, consists in intelligence, which involves a clear knowledge of the object of deliberation, in spontaneity, whereby we determine, and in contingency, that is, in the exclusion of logical or metaphysical necessity (G. VI, 288; H 288).

Leibniz derives the intelligence and spontaneity conditions from Aristotle, but adds contingency as a separate requirement. For an action to be free, Leibniz contends that the agent must have “distinct knowledge of the object of deliberation” (G. VI, 288; H 288), meaning that the agent must have knowledge of their action and also of alternative courses of action. For an action to be spontaneous, the agent’s action must derive from an internal source and not be externally caused. There is a sense in which every action is spontaneous, in that each substance is causally isolated, or “windowless,” with respect to every other substance. And finally, actions must be contingent; that is, they must exclude logical or metaphysical necessity.

b. Divine Freedom

It was not just human freedom, though, that Leibniz treated as intelligent, spontaneous, and contingent. In fact, one of the most remarkably consistent parts of Leibniz’s thought, going back to his jurisprudential writings in the 1660s all the way through to his mature views on metaphysics and philosophical theology, is that the gap between humans and God is a difference of degree and not of kind. There is nothing substantively different between humans and God. It is for precisely this reason that he insists in his natural law theory that we can discern the nature of justice and try to implement it in worldly affairs. Justice for humans ought to mirror the justice of God.

The implication of this theological view is that God is free in the same way that humans are free; God is perfectly free because his actions are also intelligent, spontaneous, and contingent. Since God is omniscient, he has perfect perceptions of the entire universe, past, present, and future. Since God determines his own actions without any external coercion, he is perfectly spontaneous. And since there is an infinite plurality of worlds, possible in themselves, from which God could choose, his actions are contingent. Leibniz reasons that since God meets each of these conditions in the highest sense, God is perfectly free. And even though God is invariably led toward the Good, this is in no way an infringement on his freedom. He writes in the Theodicy:

…It is true freedom, and the most perfect, to be able to make the best use of one’s free will, and always to exercise this power, without being turned aside either by outward force or by inward passions, whereof the one enslaves our bodies and the other our souls. There is nothing less servile and more befitting the highest degree of freedom than to be always led towards the good, and always by one’s own inclination, without any constraint and without any displeasure. And to object that God therefore had need of external things is only a sophism (G. VI. 385; H 386).

Even with this mature account of freedom in place, Leibniz may still face the very same problem that concerned him prior to his meeting with Spinoza in 1676. If God’s nature requires him to do only the best, and assuming that there is only one uniquely best world, then it follows that the only possible world is the actual world. God’s essential nature and the fact of a uniquely best world entail that God must create the best. And so, we may end up back in the necessitarian position after all, albeit in a somewhat different way than Spinoza. Although Leibniz endorses the anthropomorphic conception of God that Spinoza denies, both philosophers hold that God’s nature necessitates, in some way, that there is only one possible world, the actual world. Ultimately, it is up to us to decide whether the strategies for contingency and the account of human and divine freedom that Leibniz developed over the course of his long and illustrious career are successful enough to avoid the necessitarian threat about which he was so concerned.

7. References and Further Reading

a. Primary Sources

  • [A] Sämtliche Schriften und Briefe. Ed. Deutsche Akademie der Wissenschaften. Darmstadt, Leipzig, Berlin: Akademie Verlag, 1923. Cited by series, volume, page.
  • [AG] Philosophical Essays. Translated and edited by Roger Ariew and Dan Garber. Indianapolis: Hackett, 1989.
  • [CP] Confessio Philosophi: Papers Concerning the Problem of Evil, 1671–1678. Translated and edited by Robert C. Sleigh, Jr. New Haven, CT: Yale University Press, 2005.
  • [G] Die Philosophischen Schriften von Gottfried Wilhelm Leibniz. Edited by C.I. Gerhardt. Berlin: Weidmann, 1875-1890. Reprint, Hildesheim: Georg Olms, 1978. Cited by volume, page.
  • [Grua] Textes inédits d’après les manuscrits de la bibliothèque provinciale de Hanovre. Edited by Gaston Grua. Paris: Presses Universitaires, 1948. Reprint, New York and London: Garland Publishing, 1985.
  • [H] Theodicy: Essays on the Goodness of God, the Freedom of Man and the Origin of Evil. Translated by E.M. Huggard. La Salle, IL: Open Court, 1985.
  • [L] Philosophical Papers and Letters. Edited and translated by Leroy E. Loemker. 2nd Edition. Dordrecht: D. Reidel, 1969.
  • [LA] The Leibniz-Arnauld Correspondence. Edited by H.T. Mason. Manchester: Manchester University Press, 1967.
  • [OCM] Œuvres complètes de Malebranche (20 volumes). Edited by A. Robinet. Paris: J. Vrin, 1958–84.

b. Secondary Sources

  • Adams, Robert Merrihew. Leibniz: Determinist, Theist, Idealist. New York: Oxford University Press, 1994.
  • Bennett, Jonathan. Learning from Six Philosophers Vol. 1. New York: Oxford University Press, 2001.
  • Blumenfeld, David. “Is the Best Possible World Possible?” Philosophical Review 84, No. 2, April 1975.
  • Blumenfeld, David. “Perfection and Happiness in the Best Possible World.” In Cambridge Companion to Leibniz. Edited by Nicholas Jolley. Cambridge: Cambridge University Press, 1994.
  • Broad, C.D. Leibniz: An Introduction. Cambridge: Cambridge University Press, 1975.
  • Brown, Gregory and Yual Chiek. Leibniz on Compossibility and Possible Worlds. Cham, Switzerland: Springer, 2016.
  • Brown, Gregory. “Compossibility, Harmony, and Perfection in Leibniz.” The Philosophical Review 96, No. 2, April 1987.
  • Cover, J.A. and John O’Leary-Hawthorne. Substance and Individuation in Leibniz. Cambridge: Cambridge University Press, 1999.
  • Curley, Edwin. “Root of Contingency.” In Leibniz: A Collection of Critical Essays. Edited by Harry Frankfurt. New York: Doubleday, 1974.
  • D’Agostino, Fred. “Leibniz on Compossibility and Relational Predicates.” The Philosophical Quarterly 26, No. 103, April 1976.
  • Hacking, Ian. “A Leibnizian Theory of Truth.” In Leibniz: Critical and Interpretative Essays, edited by Michael Hooker. Minneapolis: University of Minnesota Press, 1982.
  • Horn, Charles Joshua. “Leibniz and Impossible Ideas in the Divine Intellect” In Internationaler Leibniz-Kongress X Vorträge IV, Edited by Wenchao Li. Hannover: Olms, 2016.
  • Horn, Charles Joshua. “Leibniz and the Labyrinth of Divine Freedom.” In The Labyrinths of Leibniz’s Philosophy. Edited by Aleksandra Horowska. Peter Lang Verlag, 2022.
  • Koistinen, Olli, and Arto Repo. “Compossibility and Being in the Same World in Leibniz’s Metaphysics.” Studia Leibnitiana 31, 1999.
  • Look, Brandon. “Leibniz and the Shelf of Essence.” The Leibniz Review 15, 2005.
  • Maher, Patrick. “Leibniz on Contingency.” Studia Leibnitiana 12, 1980.
  • Mates, Benson. “Individuals and Modality in the Philosophy of Leibniz.” Studia Leibnitiana 4, 1972.
  • Mates, Benson. “Leibniz on Possible Worlds.” Leibniz: A Collection of Critical Essays, edited by Harry Frankfurt, 335-365. Notre Dame: University of Notre Dame Press, 1976.
  • Mates, Benson. The Philosophy of Leibniz: Metaphysics and Language. New York: Oxford University Press, 1986.
  • McDonough, Jeffrey. “Freedom and Contingency.” The Oxford Handbook of Leibniz. New York: Oxford University Press, 2018.
  • McDonough, Jeffrey. “The Puzzle of Compossibility: The Packing Strategy.” Philosophical Review 119, No. 2, 2010.
  • Merlo, Giovanni. “Complexity, Existence, and Infinite Analysis.” Leibniz Review 22, 2012.
  • Messina, James and Donald Rutherford. “Leibniz on Compossibility.” Philosophy Compass 4, No. 6, 2009.
  • Mondadori, Fabrizio. “Leibniz and the Doctrine of Inter-World Identity.” Studia Leibnitiana 7, 1975.
  • Mondadori, Fabrizio. “Reference, Essentialism, and Modality in Leibniz’s Metaphysics.” Studia Leibnitiana 5, 1973.
  • Rescher, Nicholas. Leibniz: An Introduction to His Philosophy. Totowa, New Jersey: Rowman and
  • Littlefield, 1979.
  • Rescher, Nicholas. Leibniz’s Metaphysics of Nature. Dordrecht, 1981.
  • Rescher, Nicholas. The Philosophy of Leibniz. Englewood Cliffs, NJ: Prentice Hall, 1967.
  • Rowe, William. Can God Be Free? New York: Oxford University Press, 2006.
  • Russell, Bertrand. A Critical Exposition on the Philosophy of Leibniz, 2nd ed. London: George Allen and Unwin, 1937. Reprint London: Routledge, 1997.
  • Rutherford, Donald. Leibniz and the Rational Order of Nature. Cambridge: Cambridge University Press, 1995.
  • Rutherford, Donald. “The Actual World.” The Oxford Handbook of Leibniz. New York: Oxford University Press, 2018.
  • Sleigh, Robert C., Jr. Leibniz and Arnauld: A Commentary on Their Correspondence. New Haven: Yale University Press, 1990.
  • Wilson, Margaret D. “Compossibility and Law.” In Causation in Early Modern Philosophy: Cartesianism, Occasionalism, and Pre-Established Harmony. Edited by Steven Nadler. University Park, Pennsylvania: Pennsylvania State University Press, 1993.
  • Wilson, Margaret D. “Possible Gods.” Review of Metaphysics 32, 1978/79.

 

Author Information

Charles Joshua Horn
Email: jhorn@uwsp.edu
University of Wisconsin Stevens Point
U. S. A.

Faith: Contemporary Perspectives

Faith is a trusting commitment to someone or something. Faith helps us meet our goals, keeps our relationships secure, and enables us to retain our commitments over time. Faith is thus a central part of a flourishing life.

This article is about the philosophy of faith. There are many philosophical questions about faith, such as: What is faith? What are its main components or features? What are the different kinds of faith? What is the relationship between faith and other similar states, such as belief, trust, knowledge, desire, doubt, and hope? Can faith be epistemically rational? Practically rational? Morally permissible?

This article addresses these questions. It is divided into three main parts. The first is about the nature of faith. This includes different kinds of faith and various features of faith. The second discusses the way that faith relates to other states. For example, what is the difference between faith and hope? Can someone have faith that something is true even if they do not believe it is true? The third discusses three ways we might evaluate faith: epistemically, practically, and morally. While faith is not always rational or permissible, this section covers when and how it can be. The idea of faith as a virtue is also discussed.

This article focuses on contemporary work on faith, largely since the twentieth century. Historical accounts of faith are also significant and influential; for an overview of those, see the article “Faith: Historical Perspectives.”

Table of Contents

  1. The Nature of Faith
    1. Types of Faith
      1. Attitude-Focused vs. Act-Focused
      2. Faith-That vs. Faith-In
      3. Religious vs. Non-Religious
      4. Important vs. Mundane
    2. Features of Faith
      1. Trust
      2. Risk
      3. Resilience
      4. Going Beyond the Evidence
  2. Faith and Other States
    1. Faith and Belief
      1. Faith as a Belief
      2. Faith as Belief-like
      3. Faith as Totally Different from Belief
    2. Faith and Doubt
    3. Faith and Desire
    4. Faith and Hope
    5. Faith and Acceptance
  3. Evaluating Faith
    1. Faith’s Epistemic Rationality
      1. Faith and Evidence
      2. Faith and Knowledge
    2. Faith’s Practical Rationality
    3. Faith and Morality/Virtue
  4. Conclusion
  5. References and Further Reading

1. The Nature of Faith

As we saw above, faith is a trusting commitment to someone or something. While this definition is a good start, it leaves many questions unanswered. This section is on the nature of faith and is divided into two subsections. The first covers distinctions among different kinds of faith and the second explores features of faith.

a. Types of Faith

This subsection outlines distinctions among different kinds of faith. It focuses on four distinctions: attitude-focused faith vs. act-focused faith, faith-that vs. faith-in, religious vs. non-religious faith, and important vs. mundane faith.

i. Attitude-Focused vs. Act-Focused

One of the most important distinctions is faith as an attitude compared to faith as an action. Faith, understood as an attitude, is similar to attitudes like beliefs or desires. In the same way that you might believe that God exists, you might have faith that God exists. Both are attitudes (things in your head), rather than actions (things you do). Call this attitude-focused faith.

Attitude-focused faith is thought to involve at least two components (Audi 2011: 79). The first is a belief-like, or cognitive, component. This could simply be a belief. While some contend that faith always involves belief, others argue that faith can involve something weaker, but still belief-like: some confidence that the object of faith is true, thinking it is likely to be true, supported by the evidence, or the most likely of the options under consideration. Either way, attitude-focused faith involves something belief-like. For example, if you have faith that your friend will win their upcoming basketball game, you will think there is at least a decent chance they will win. It does not make sense to have faith that your friend’s team will win if you are convinced that they are going to get crushed. Later, this article returns to questions about the exact connection between faith and belief, but it is relatively uncontroversial that attitude-focused faith involves a belief-like component.

The second component of attitude-focused faith is a desire-like, or conative, component. Attitude-focused faith involves a desire for, or a positive evaluation of, its object. Returning to our example, if you have faith that your friend will win their upcoming game, then you want them to win the game. You do not have faith that they will win if you are cheering for the other team or if you want them to lose. This example illustrates why attitude-focused faith plausibly involves desire; this article returns to this later as well.

A second kind of faith is not in your head, but an action. This kind of faith is similar to taking a “leap of faith”—an act of trust in someone or something. For example, if your friend promises to pick you up at the airport, waiting for them rather than calling a taxi demonstrates faith that they will pick you up. Walking across a rickety bridge demonstrates faith that the bridge will hold you. Doing a trust fall demonstrates faith that someone will catch you. Call this type of faith an act of faith, or action-focused faith.

On some views, such as Kvanvig’s (2013), faith is a disposition. In the same way that glass is disposed to shatter (even if it never actually shatters), on dispositional views of faith, having faith is a matter of being disposed to do certain things (even if the faithful never actually do them). The view that faith is a disposition could be either attitude-focused or action-focused. Faith might be a disposition to act in certain ways, maybe ways that demonstrate trust or involve risk. This type of faith would be action-focused (see Kvanvig 2013). Faith might instead be a disposition to have certain attitudes: to believe certain propositions, be confident in them, and/or desire that they be true. This type of faith would be attitude-focused (see Byerly 2012).

What is the relationship between attitude-focused faith and action-focused faith? They are distinct states, but does one always lead to the other? One might think that, in the same way that beliefs and desires cause actions (for example, your belief that there is food in the fridge and your desire for food lead you to open the fridge), attitude-focused faith will cause (or dispose you toward) action-focused faith, as attitude-focused faith is made up of belief- and desire-like states (see Jackson 2021). On the other hand, we may not always act on our beliefs and our desires. So one question is: could you have attitude-focused faith without action-focused faith?

A related question is whether you could have action-focused faith without attitude-focused faith. Could you take a leap of faith without having the belief- and desire-like components of attitude-focused faith? Speak (2007: 232) provides an example that suggests that you could take a leap of faith without a corresponding belief. Suppose Thomas was raised in circumstances that instilled a deep distrust of the police. Thomas finds himself in an unsafe situation and a police officer is attempting to save him; Thomas needs to jump from a dangerous spot so the officer can catch him. While the officer has provided Thomas with evidence that he is reliable, Thomas cannot shake the belief instilled from his upbringing that the police are not trustworthy. Nonetheless, Thomas jumps. Intuitively, Thomas put his faith in the officer, even without believing that the officer is trustworthy.

Generally, you can act on something, even rationally, if you have a lot to gain if it is true, even if you do not believe that it is true. Whether this counts as action-focused faith without attitude-focused faith, however, will depend on the relationship between faith and belief, a question addressed in a later section.

ii. Faith-That vs. Faith-In

A second distinction is between faith-that and faith-in. Faith-that is faith that a certain proposition is true. Propositions are true or false statements, expressed by declarative sentences. So 1+1=2, all apples are red, and God exists are all propositions. In the case of faith, you might have faith that a bridge will hold you, faith that your friend will pick you up from the airport, or faith that God exists. Faith-that is similar to other propositional attitudes, like belief and knowledge. This suggests that attitude-focused faith is a species of faith-that, since the attitudes closely associated with faith, like belief and hope, are propositional attitudes.

There’s also faith-in. Faith-in is not faith toward propositions, but faith toward persons or ideals. For example, you might have faith in yourself, faith in democracy, faith in your spouse, faith in a political party, or faith in recycling.

Some instances of faith can be expressed as both faith-that and faith-in. For example, theistic faith might be described as faith-that God exists or faith-in God. You might also have faith-that your spouse is a good person or faith-in your spouse. There are questions about the relationship between faith-that and faith-in. For example, is one more fundamental? Do all instances of faith-that reduce to faith-in, or vice versa? Or are they somewhat independent? Is there a significant difference between faith-in X, and faith-that a proposition about X is true?

iii. Religious vs. Non-Religious

A third distinction is between religious faith and secular faith. The paradigm example of religious faith is faith in God or gods, but religious faith can also include: faith that certain religious doctrines are true, faith in the testimony of a religious leader, faith in a Scripture or holy book, or faith in the church or in a religious group. In fact, according to one view that may be popular in certain religious circles, “faith” is simply belief in religious propositions (see Swindal 2021).

However, faith is not merely religious—there are ample examples of non-religious faith. This includes the faith that humans have in each other, faith in secular goals or ideals, and faith in ourselves. It is a mistake to think that faith is entirely a religious thing or reserved only for the religious. Faith is a trusting commitment—and this can involve many kinds of commitments. This includes religious commitment, but also includes interpersonal commitments like friendship or marriage, intrapersonal commitments we have to ourselves or our goals, and non-personal commitments we have to ideals or values.

One reason this distinction is important is that some projects have good reason to focus on one or the other. For example, on some religious traditions, like the Christian tradition, faith is a condition for salvation. But presumably, not any kind of faith will do—religious faith is required. One project in Christian philosophical theology provides an analysis of the religious faith that is closely connected to salvation (see Bates 2017). Projects like these have good reason to set secular faith aside. Others may have a special interest in secular faith and thus set religious faith aside.

This article considers both religious and non-religious faith. While they are different in key ways, they both involve trusting commitments, and many contemporary accounts of faith apply to both.

iv. Important vs. Mundane

A final distinction is between important faith and mundane faith. Important faith involves people, ideals, or values that are central to your life goals, projects, and commitments. Examples of important faith include religious faith, faith in your spouse, or faith in your political or ethical values. In most cases, important faith is essential to your life commitments and often marks values or people that you build your life around.

But not all faith is so important. You might have faith that your office chair will hold you, faith that your picnic will not be rained out, or faith that your spouse’s favorite football team will win their game this weekend. These are examples of mundane faith. While mundane faith still plausibly involves some kind of trusting commitment, this commitment is less important and more easily given up. You may have a weak commitment to your office chair. But—given it is not a family heirloom—if the chair started falling apart, you would quickly get rid of it and buy a new one. So important faith is associated with your central, life-shaping commitments, and mundane faith is associated with casual commitments that are more easily given up.

One might distinguish between objectively important faith—faith in objectively valuable objects—and subjectively important faith—faith in objects that are important to a particular individual but may or may not be objectively valuable. For example, some critics of religion might argue that while religious faith might be subjectively important to some, it is nonetheless not objectively important.

While this article focuses mostly on important faith, some of what is discussed also applies to mundane faith, but it may apply to a lesser degree. For example, if faith involves a desire, then the desires associated with mundane faith may be weaker. Now, consider features of faith.

b. Features of Faith

This subsection discusses four key features of faith: trust, risk, resilience, and going beyond the evidence. These four features are often associated with faith. They are not necessarily synonymous with faith, and not all accounts of faith give all four a starring role. Nonetheless, they play a role in understanding faith and its effects. Along the way, this article considers specific accounts that closely associate faith with each feature.

i. Trust

The first feature of faith is trust. As we have noted, faith is a trusting commitment. Trust involves reliance on another person. This can include, for example, believing what they say, depending on them, or being willing to take risks that hinge on them coming through for you. Faith and trust are closely connected, and some even use “faith” and “trust” synonymously (Bishop 2016).

The close association between faith and trust lends itself nicely to a certain view of faith: faith is believing another’s testimony. Testimony is another’s reporting that something is true. Accounts that connect faith and testimony are historically significant, tracing back to Augustine, Locke, and Aquinas. Recent accounts of faith as believing another’s testimony include Anscombe (2008) and Zagzebski (2012). Anscombe, for example, says to have faith that p is to believe someone that p. Religious faith might be believing God’s testimony or the testimony of religious leaders. Interpersonal faith might be believing the testimony of your friends or family.

Plausibly, trust is a key feature—likely the key feature—of interpersonal faith. Faith in others involves trusting another person: this includes faith in God or gods, but also faith in other people and faith in ourselves. It is plausible that even propositional faith can be understood in terms of trust. For example, propositional faith that your friend will pick you up from the airport involves trusting your friend. Even in mundane cases propositional faith could be understood as trust: if you have faith it will be sunny tomorrow, you trust it will be sunny tomorrow.

ii. Risk

Faith is also closely related to risk. William James (1896/2011) discusses a hiker who gets lost. She finally finds her way back to civilization, but as she is walking, she encounters a deep and wide crevice on the only path home. Suppose that, to survive, she must jump this crevice, and it is not obvious that she can make the jump. She estimates that she has about a 50/50 chance. She has two choices: she can give up and likely die in the wilderness. Or she can take a (literal) leap of faith and do her best to make it across the crevice. This decision to jump involves a risk: she might fail to make it to the other side and fall to her death.

Risk involves making a decision in a situation where some bad outcome is possible but uncertain. Jumping a wide crevice involves the possible bad outcome of falling in. Gambling involves the possible bad outcome of losing money. Buying a stock involves the possible bad outcome of its value tanking.

If faith is connected to risk, this suggests two things about faith. First, faith is associated with a degree of uncertainty. For example, if one has faith that something is true, then one is uncertain regarding its truth or falsity. Second, faith is exercised in cases where there is a potentially bad outcome. The outcome might involve the object of faith’s being false, unreliable, or negative in some other way. For example, if you have faith that someone will pick you up at the airport, there is the possibility that they do not show up. If you have faith in a potential business partner, there is the possibility that they end up being dishonest or difficult to work with.

These examples illustrate the connection between risk and action-focused faith. When we act in faith, there is usually some degree of uncertainty involved and a potentially bad outcome. If you have action-focused faith that your spouse will pick you up, you wait for them and do not call a taxi; you risk waiting at the airport for a long time, and maybe even missing an important appointment, if your spouse does not show. If you have action-focused faith that someone is a good business partner, you dedicate time, money, and energy to your shared business; you risk wasting all those resources if they turn out to be dishonest or impossible to work with. Or you might have action-focused faith that God exists and dedicate your life to God, which risks wasting your life if God does not exist.

Attitude-focused faith may also involve risk: some kind of mental risk. William James (1896/2011) discusses our two epistemic goals: believe truth and avoid error. We want to have true beliefs, but if that were all we cared about, we should believe everything. We want to avoid false beliefs, but if that were all we cared about, we should believe nothing. Much of the ethics of belief is about balancing these two goals, and this balance can involve a degree of mental risk. For example, suppose you have some evidence that God exists, but your evidence is not decisive, and you also recognize that there are some good arguments that God does not exist. While it is safer to withhold judgment on whether God exists, you also could miss out on a true belief. Instead, you might take a “mental” risk, and go ahead and believe that God exists. While you are not certain that God exists, and believing risks getting it wrong, you also face a bad outcome if you withhold judgment: missing out on a true belief. By believing that God exists in the face of indecisive evidence, you take a “mental” or “attitude” risk. James argues that this kind of mental risk can be rational (“lawful”) when “reason does not decide”—our evidence does not make it obvious that the statement believed is true or false—and we face a “forced choice”—we have to commit either way.

The view that faith involves an attitude-risk closely resembles John Bishop’s account of faith, which is inspired by insights from James. Bishop (2007) argues that faith is a “doxastic venture” (doxastic meaning belief-like). Bishop’s view is that faith involves believing beyond the evidence. Bishop argues that certain propositions (including what he calls “framework principles”) are evidentially undecidable, meaning our evidence cannot determine whether the claim is true or false. In these cases, you can form beliefs for non-evidential reasons—for example, beliefs can be caused by desires, emotions, affections, and so forth. This non-evidential believing enables you to believe beyond the evidence (see also Ali 2013).

iii. Resilience

A third feature of faith is resilience. Faith’s resilience stems from the connection between faith and commitment. Consider some examples. If you have faith that your favorite team will win their upcoming game, you have some kind of commitment to that team. If you have faith that God exists, this involves a religious commitment. You might commit to finishing a degree, learning a new instrument, maintaining a marriage, or practicing a religion. These commitments can be difficult to keep—you get discouraged, doubt yourself or others, your desires and passions fade, and/or you get counterevidence that makes you wonder if you should have committed in the first place. Faith’s resilience helps you overcome these obstacles and keep your commitments.

Lara Buchak’s (2012) risky commitment view of faith brings risk and commitment together. On Buchak’s view, faith involves stopping one’s search for evidence and making a commitment. Once this commitment is made, you will maintain that commitment, even in the face of new counterevidence. For example, suppose you are considering making a religious commitment. For Buchak, religious faith involves stopping your search for evidence regarding whether God exists and taking action: making the commitment. Of course, this does not mean that you can no longer consider the evidence or have to stop reading philosophy of religion, but you are not looking for new evidence to decide whether to make (or keep) the commitment. Once you’ve made this religious commitment, you will continue in that commitment even if you receive evidence against the existence of God—at least, to a degree.

The literature on grit is also relevant to faith’s resilience. Grit, a phenomenon discussed by both philosophers and psychologists, is the ability to persevere to achieve long-term, difficult goals (Morton and Paul 2019). It takes grit to train for a marathon, survive a serious illness, or remain married for decades. Matheson (2018) argues that faith is gritty, and this helps explain how faith can be both rational and voluntary. Malcolm and Scott (2021) argue that faith’s grit helps the faithful be resilient to a variety of challenges. Along similar lines, Jackson (2021) argues that the belief- and desire-like components of faith explain how faith can help us keep our long-term commitments, in light of both epistemic and affective obstacles.

iv. Going Beyond the Evidence

A final feature of faith is that it goes beyond the evidence. This component is related to faith’s resilience. Faith helps you maintain your commitments because it goes beyond the evidence. You might receive counterevidence that makes you question whether you should have committed in the first place. For example, you might commit to a certain major, but a few months in, realize the required classes are quite difficult and demanding. You might wonder whether you are cut out for that field of study. Or you might have a religious commitment, but then encounter evidence that an all-good, all-loving God does not exist—such as the world’s serious and terrible evils. In either case, faith helps you continue in your commitment in light of this counterevidence. And if the evidence is misleading—so you are cut out for the major, or God does exist—then this is a very good thing.

The idea that faith goes beyond the evidence raises questions about rationality. How can faith go beyond the evidence but still be rational? Is it not irrational to disrespect or ignore evidence? This article returns to this question later, but for now, note that there is a difference between going beyond the evidence and going against the evidence. Going beyond the evidence might look like believing or acting when the evidence is decent but imperfect. Bishop’s account, for example, is a way that faith might “venture” beyond the evidence (2007). However, this does not mean faith goes against the evidence, requiring you to believe something that you have overwhelming evidence is false.

Some do argue that faith goes against the evidence. They fall into two main camps. The first camp thinks that faith goes against the evidence, and this is a bad thing; faith is harmful, and we should avoid having faith at all costs. The New Atheists, such as Richard Dawkins and Sam Harris, have a view like this (but see Jackson 2020). The second camp thinks that faith goes against the evidence but that is actually a good thing. This view is known as fideism. Kierkegaard argued for fideism, and he thought that faith is valuable because it is absurd: “The Absurd, or to act by virtue of the absurd, is to act upon faith” (Journals, 1849). Nonetheless, Kierkegaard thought having faith is one of the highest ideals to which one can aspire. This article returns to the idea that faith “goes beyond the evidence” in Section 3.

2. Faith and Other States

This section is about the relationship between faith and related attitudes, states, or actions: belief, doubt, desire, hope, and acceptance. Unlike the features just discussed, these states are normally not part of the definition or essence of faith. Nonetheless, these states are closely associated with faith. Appreciating the ways that faith is similar to, but also different from, these states provides a deeper understanding of the nature of faith.

a. Faith and Belief

When it comes to attitudes associated with faith, many first think of belief. Believing something is taking it to be the case or regarding it as true. Belief is a propositional attitude: an attitude taken toward a statement that is either true or false.

What is the relationship between faith and belief? Since belief is propositional, it is also natural to focus on propositional faith; so what is the relationship between belief that p and faith that p? More specifically: does belief that p entail faith that p? And: does faith that p entail belief that p? The answer to the first question is no; belief does not entail propositional faith. This is because propositional faith involves a desire-like or affective component; belief does not. You might believe that there is a global pandemic or believe that your picnic was rained out. However, you do not have faith that those things are true, because you do not desire them to be true.

The second question—whether propositional faith entails belief—is significantly more controversial. Does faith that p entail belief that p? Answers to this question divide into three main views. Those who say yes normally argue that faith is a kind of belief. The no camp divides into two groups. The first group argues that faith does not have to involve belief, but it involves something belief-like. A final group argues that faith is something totally different from belief. This article considers each view in turn. (See Buchak 2017 for a very helpful, more detailed taxonomy of various views of faith and belief.)

i. Faith as a Belief

On some views, faith is a belief. Call these “doxastic” views of faith. We have discussed two doxastic views already. The first is the view that faith is simply belief in a religious proposition; it was noted that, if intended as a general theory of faith, this seems narrow, as one can have non-religious faith. (But it may be more promising as an account of religious faith.) The second view is Anscombe’s (2008) and Zagzebski’s (2012) view that faith is a belief based on testimony, discussed in the previous section on trust. A third view traces back to Augustine and Calvin, and is more recently defended by Plantinga (2000). On this view, faith is a belief that is formed through a special mental faculty known as the sensus divinitatis, or the “sense of the divine.” For example, you might watch a beautiful sunset and form the belief that there is a Creator; you might be in danger and instinctively cry out to God for help. (Although Plantinga is also sympathetic to views that connect faith and testimony; see Plantinga 2000: ch. 9.)

Note two things about doxastic views. First, most doxastic views add other conditions in addition to belief. For instance, as we have discussed, it is widely accepted that faith has an affective, desire-like component. So on one doxastic view, faith involves a belief that p and a desire for p. You could also add other conditions: for example, faith is associated with dispositions to act in certain ways, take certain risks, or trust certain people. What unites doxastic views is that faith is a kind of belief; faith is belief-plus.

Second, the view that faith entails belief does not require you to accept that faith is a belief. You could have a view on which faith is not a belief, but every time you have faith that a statement is true, you also believe it—faith and belief “march in step” (analogy: just because every animal with a heart also has a kidney does not mean hearts are kidneys). So another view in the family of doxastic views is that faith is not a belief, but always goes along with belief.

ii. Faith as Belief-like

Some resist the idea that faith entails belief. Daniel Howard-Snyder (2013) provides several arguments against doxastic views of faith. First, Howard-Snyder argues that if one can have faith without belief, this makes sense of the idea that faith is compatible with doubt. Doubting might cause you to give up a belief, but Howard-Snyder argues that you can maintain your faith even in the face of serious doubts. Second, other belief-like attitudes can play belief’s role: for example, you could think p is likely, be confident in p, think p is more likely than not, and so forth. If you do not flat-out believe that God exists, but are confident enough that God exists, Howard-Snyder argues that you can still have faith that God exists. A final argument that you can have faith without belief involves real-life examples of faith without belief. Consider the case of Mother Teresa. Mother Teresa went through a “dark night of the soul” in her later life. During this dark time, in her journals, she confessed that her doubts were so serious that at times, she did not believe that God existed. Nonetheless, she maintained her commitment and dedication to God. Many would not merely say she had faith; Mother Teresa was a paradigm example of a person of faith. This again supports the idea that you can have faith without belief. In general, proponents of non-doxastic views do not want to exclude those who experience severe, belief-prohibiting doubts from having religious faith. In fact, one of the functions of faith is to help you keep your commitments in the face of such doubts.

Howard-Snyder’s positive view is that faith is “weakly doxastic.” Faith does not require belief but requires a belief-like attitude, such as confidence, thinking likely, and so forth. He adds other conditions as well; in addition to a belief-like attitude, he thinks that faith that p requires a positive view of p, a positive desire-like attitude toward p, and resilience to new counterevidence against the truth of p.

In response to Howard-Snyder, Malcolm and Scott (2017) defend the view that faith entails belief. While they agree with Howard-Snyder that faith is compatible with doubt, they point out that belief is also compatible with doubt. It is not uncommon or odd to say things like “I believe my meeting is at 3 pm, but I’m not sure,” or “I believe that God exists, but I have some doubts about it.” Malcolm and Scott go on to argue that faith without belief, especially religious faith without belief, is a form of religious fictionalism. Fictionalists speak about and act on something for pragmatic reasons, but they do not believe the claims that they are acting on and speaking about. For example, you might go to church, pray, or recite a creed, but you do not believe that God exists or what the creed says—you merely do those things for practical reasons. Malcolm and Scott argue that there is something suspicious about this, and there is reason to think that fictionalists do not have genuine faith. They conclude that faith entails belief, and more specifically, religious faith requires the belief that God exists.

This debate will not be settled here, but note that there are various responses that the defender of the weakly doxastic view of faith could provide. Concerning the point about doubt, a proponent of weak doxasticism might argue that faith is compatible with more doubt than belief. Even if belief is compatible with some doubt—as it seems fine to say, “I believe p but there’s a chance I’m wrong”—it seems like faith is compatible with even more doubt—more counterevidence or lower probabilities. On fictionalism, Howard-Snyder (2018) responds that religious fictionalism is a problem only if the fictionalist actively believes that the claims they are acting on are false. However, if they are in doubt but moderately confident, or think the claims are likely, even if they do not believe the claims, it is more plausible that fictionalists can have faith. You might also respond by appealing to some of the distinctions discussed above: for example, perhaps religious faith entails belief, but non-religious faith does not.

iii. Faith as Totally Different from Belief

A third view pulls faith even further away from belief. On this view, faith does not entail belief, nor does faith entail something belief-like, but instead, faith is totally different from belief. This view is often known as the pragmatist view of faith.

This article returns to these views later, but here is a summary. Some authors argue that faith only involves accepting, or acting as if, something is true (Swinburne 1981; Alston 1996). Others argue that faith is a disposition to act in service of an ideal (Dewey 1934; Kvanvig 2013), or that faith involves pursuing a relationship with God (Buckareff 2005). Some even argue that faith is incompatible with belief; for example, Pojman (1986) argues that faith is profound hope, and Schellenberg (2005) argues that faith is imaginative assent. Both argue that one cannot have faith that p if one believes that p.

Pragmatist views depart drastically from both doxastic and weakly doxastic accounts of faith. Faith does not even resemble belief, but is something totally unlike belief, and more closely related to action, commitment, or a disposition to act.

There are two ways to view the debate between doxastic, weakly doxastic, and pragmatic views of faith. One possibility is that there is a single thing, “faith,” and there are various views about what exactly faith amounts to: is faith a belief, similar to a belief, or not at all like belief? Another possibility, however, is that there are actually different kinds of faith. Plausibly, both doxastic and weakly doxastic views are describing attitude-focused faith, and pragmatic views of faith are describing action-focused faith. This second possibility does not mean there are no interesting debates regarding faith. It still leaves open whether attitude-focused faith requires belief, or merely something belief-like, and if the latter, what those belief-like attitudes can be, and how weak they can be. It also leaves open which view of action-focused faith is correct. However, you may not have to choose between pragmatist views on the one hand, and doxastic or weakly doxastic views on the other; each view may simply be describing a different strand of faith.

b. Faith and Doubt

One might initially think that faith and doubt are opposed to each other. That is, those with faith will never doubt, or if they do doubt, their faith is weak. However, if you agree with the points made in the previous section—Howard-Snyder’s argument that faith is compatible with doubt, and Malcolm and Scott’s point that belief is also compatible with doubt—there is reason to reject the view that faith and doubt are completely opposed to each other.

Howard-Snyder (2013: 359) distinguishes between two ways of doubting. First, you might simply doubt p. Howard-Snyder says that this involves an inclination to disbelieve p. If you doubt that it will rain tomorrow, you will tend to disbelieve that it will rain tomorrow. This type of doubt—doubting p—might be in tension with, or even inconsistent with faith. Even those who deny that faith entails belief nonetheless think that faith is not consistent with disbelief; you cannot have faith that p if you think p is false (but see Whitaker 2019 and Lebens 2021).

However, not all doubt is closely associated with disbelief. You might instead be in doubt about p, or have some doubts about p. Moon (2018) argues that this type of doubt involves (roughly) thinking you might be wrong. In these cases, you are pulled in two directions—maybe you believe something, but then receive some counterevidence. Moon argues that this second kind of doubt is compatible with belief (2018: 1831), and Howard-Snyder argues that it is compatible with faith. Howard-Snyder says, “Being in doubt is no impediment to faith. Doubt is not faith’s enemy; rather, the enemies of faith are misevaluation, indifference or hostility, and faintheartedness” (2013: 370).

Thus, there is good reason to think that having doubts is consistent with faith. Those who deny that faith entails belief might argue that faith is compatible with more doubts than belief. What is more, faith may be a tool that helps us maintain our commitments in light of doubts. For example, Jackson (2019) argues that evidence can move our confidence levels around, but it does not always change our beliefs. Suppose John is happily engaged and will be married soon, and based on his and his spouse’s sincerity and commitment, he has faith that they will not get divorced. Then, he learns that half of all marriages end in divorce. Learning this should lower his confidence that they will remain committed, causing him to have doubts that his marriage will last. However, this counterevidence does not mean he should give up his faith or the commitment. His faith in himself and his spouse can help him maintain the commitment, even in light of the counterevidence and resulting doubts.

c. Faith and Desire

Recall that attitude-focused faith involves a desire for, or a positive evaluation of, the object of faith. If you have faith that your friend will win her upcoming race, then you want her to win; it does not make sense to claim to have faith she will win if you hope she will lose. Similarly, you would not have faith that your best friend has cancer, or that your father will continue smoking. A large majority of the authors writing on the philosophy of faith maintain that faith involves a positive evaluation of its object (Audi 2011: 67; Howard-Snyder 2013: 362–3). Even action-focused faith may involve desire. While it is more closely identified with actions than with attitudes, it could still involve or be associated with desires or pro-attitudes.

Malcolm and Scott (2021) challenge the orthodox view that faith entails desire or positivity. They argue that, while faith might often involve desire, the connection is not seamless. For example, you might have faith that the devil exists or faith that hell is populated—not because you want these to be true, but because these doctrines are a part of your religious commitment. You might find these doctrines confusing and difficult to swallow, and even hope that they are false, but you trust that God has a plan or reason to allow these to be true. Malcolm and Scott argue that faith in such cases does not involve positivity toward its object—and in fact, it may involve negativity.

Furthermore, crises of faith can involve the loss of desire for the object of faith. There has been much talk about how faith that p can be resilient in light of counterevidence: evidence that p is false. But what about evidence that p would be a bad thing? One might question their religious commitment, say, not because they doubt God’s existence, but because they doubt that God’s existence would be a good thing, or that God is worth committing to (see Jackson 2021). Malcolm and Scott argue that if one can maintain faith through a crisis of faith, this provides another reason to think that faith may not always involve positivity.

Note that more attention has been paid to the specifics of faith’s belief-like component than faith’s desire-like component. Many authors mention the positivity of faith, motivate it with a few examples, and then move on to other topics. But many similar questions that arise regarding faith and belief could also be raised regarding faith and desire. For example: does faith that p entail a desire for p? What if someone has something weaker than a desire, such as a second-order desire (a desire to desire p)? Or some desire for p, but also some desire for not-p? Could these people have faith? Can other attitudes play the role of desire in faith, such as a belief that p is good?

If you are willing to weaken the relationship between faith and desire, you could agree with Malcolm and Scott that the idea that faith entails desire is too strong, but nonetheless accept that a version of the positivity view is correct. Similar to a weakly doxastic account of faith, you could have a weakly positive account of faith and desire: faith’s desire-like condition could include things like second-order desires, conflicting desires, pro-attitudes, or beliefs about the good. In a crisis of faith, the faithful may have second-order desires or some weaker desire-like attitude. The prospect of weakly positive accounts of faith should be further explored. And in general, more attention should be paid to the relationship between faith and desire. In the religious case, this connection is related to the axiology of theism, the question of whether we should want God to exist (see The Axiology of Theism).

d. Faith and Hope

Faith and hope are often considered alongside each other, and for good reason. Like faith, hope also has a desire-like component and a belief-like component. The desire-like component in both attitudes is similar—whether you have faith that your friend will win their game or hope that they will win their game, you want them to win the game.

However, hope’s belief-like component is arguably weaker than faith’s. Hope that a statement is true merely requires thinking that statement is possibly true; it can be extremely unlikely. Even if there is a 95% chance of rain tomorrow, you can still hope your picnic will not be rained out. Hope’s belief-like component could be one of two things: a belief that p is possible, or a non-zero credence in p. (Credence is a measure of subjective probability—the confidence you have in the truth of some proposition. Credences are measured on a scale from 0 to 1, where 0 represents certainty that a proposition is false, and 1 represents certainty that it is true.) So if you hope that p, you cannot believe p is impossible or have a credence of 0 in p (certainty that p is false). At the same time, it seems odd to hope for things of which you are certain. You do not hope that 1+1=2 or hope that you exist, even if you desire those to be true. Then, as Martin (2013: 69) notes, hope that p may be consistent with any credence in p between, but excluding, 1 and 0.

Thus, on the standard view of hope, hope consists of two things: a desire for p to be true and a belief that p is possible (or non-zero credence). (See Milona 2019 for a recent defense of the standard view. Some argue that hope has additional components; for details of recent accounts of hope, see Rioux 2021.) Contrast this with faith. Unlike hope, faith that a statement is true is not compatible with thinking the statement is extremely unlikely or almost definitely false. If there is a 95% chance of rain tomorrow, you should not—and most would not—have faith that it will be sunny tomorrow. The chance of rain is just too high. But this does not preclude hoping that it will be sunny. Thus, you can hope that something is true when it is so unlikely that you cannot have faith.

This carves out a unique role for hope. Sometimes, after you make a commitment, you get lots of counterevidence challenging your basis for that commitment—counterevidence so strong that you must give up your faith. However, simply because you have to give up your faith does not mean you have to give up hope. You might hope your missing sibling is alive, even in light of evidence that they are dead, or hope that you will survive a concentration camp, or hope that you can endure a risky treatment for a serious illness. And resorting to hope does not always mean you should give up your commitment. Hope can, in general, underlie our commitments when we do not have enough evidence to have faith (see Jackson 2021).

While faith and hope are distinct in certain ways, Pojman (1986) argues that faith is a certain type of hope: profound hope. Pojman is not interested in casual hope—like hope your distant cousin will get the job he applied for—but is focused on the hope that is deep and central to our life projects. In addition to the two components of hope discussed above, profound hope also involves a disposition to act on p, an especially strong desire for p to be true, and a willingness to take great risks to bring p about. Pojman’s view draws on a connection between attitude-focused faith and action-focused faith, as Pojman’s account gives a central role to risky action. Those convinced by the idea that faith requires a bit more evidence than hope may also want to add a condition to Pojman’s view: the belief-like component of faith-as-hope must be sufficiently strong, as faith might require more than merely taking something to be possible.

e. Faith and Acceptance

Accepting that p is acting as if p. When you accept a proposition, you treat it as true in your practical reasoning, and when you make decisions, act as if p were true. According to Jonathan Cohen (1992: 4), when one accepts a proposition, one “includes that proposition… among one’s premises for deciding what to do or think in a particular context.” Often, we accept what we believe and believe what we accept. You believe coffee will wake you up, so you drink it when you are tired in the morning. You believe your car is parked north of campus, so you walk that way when you leave the office.

Sometimes, however, you act as if something is true even though you do not believe it. Say you are a judge in a court case, and the evidence is enough to legally establish that a particular suspect did it “beyond a reasonable doubt.” Suppose, though, you have other evidence that they are innocent, but it is personal, such that it cannot legally be used in a court of law. You may not be justified in believing they are guilty, but for legal reasons, you must accept that they are guilty and issue the “guilty” verdict. In other cases, you believe something, but do not act as if it is true. Suppose you are visiting a frozen lake with your young children, and they want to go play on the ice. You may rationally believe the ice is thick and safe, but refuse to let your children play, accepting that the ice will break, because of how bad it would be if they fell in.

Several authors have argued that faith and acceptance are closely connected. Alston (1996) argues that acceptance, rather than belief, is one of the primary components of faith. That is, those with faith may or may not believe the propositions of faith, but they act as if they are true. A similar view is Swinburne’s pragmatist faith. On Swinburne’s (1981) view, faith is acting on the assumption that p. Like Alston, Swinburne also maintains that faith does not require belief. Schellenberg’s (2005) view also gives acceptance a prominent place in faith. On Schellenberg’s view, faith is imaginative assent. If you have faith that p, you deliberately imagine p to be true, and, guided by this imaginative picture, you act on the truth of p. So Schellenberg’s picture of faith is imaginative assent plus acceptance. While these authors argue that acceptance is necessary for faith, most do not think it is sufficient; the faithful fulfill other conditions, including a pro-attitude towards the object of faith.

A final view is that faith involves a kind of allegiance. Allegiance is an action-oriented submission to a person or ideal. Dewey (1934) and Kvanvig (2013) defend the allegiance view of faith, on which the faithful are more characterized by their actions than their attitudes. The faithful are marked by their loyalty and committed action to the object of faith; in many cases, this could look like accepting certain propositions of faith, even if one does not believe them. Bates (2017) also proposes a model of Christian faith as allegiance, but for Bates, faith requires both a kind of intellectual assent (something belief-like) and allegiance, or enacted loyalty and obedience to God.

Whether these views that give acceptance or action a central role in faith are weakly doxastic or pragmatic depends on one’s view of acceptance: is acceptance a belief-like state or an action-like state? Since acceptance is acting as if something is true, and you can accept a proposition even if you think it is quite unlikely, these views are arguably better characterized as pragmatic. However, some acceptance views, like Bates’, that involve both acceptance and something belief-like, may be doxastic or weakly doxastic.

3. Evaluating Faith

Thus far, this article has focused on the nature of faith. Section 1 covered types of faith and features of faith. Section 2 covered the way faith compares and contrasts with other related attitudes and actions. This final section is about evaluating faith. This section discusses three modes of evaluation: epistemic, practical, and moral.

Note that, like other attitudes and actions, faith is sometimes rational and sometimes irrational, sometimes permissible and sometimes impermissible. Just as beliefs can be rational or irrational, so can faith. Not all faith should be evaluated in the same way: the rationality of faith depends on several factors, including the nature of faith and the object of faith. Drawing on some of the above accounts of the nature of faith, this article discusses various answers to the question of why and when faith could be rational, and why and when it could be irrational.

a. Faith’s Epistemic Rationality

Our first question is whether faith can be epistemically rational, and if so, when and how. Epistemic rationality is rationality that is aimed at getting at the truth and avoiding error, and it is associated with justified belief and knowledge. An epistemically rational belief has characteristics like being based on evidence, being reliably formed, being a candidate for knowledge, and being the result of a dependable process of inquiry. Paradigm examples of beliefs that are not epistemically rational are ones based on wishful thinking, hasty generalizations, or emotional attachment.

Epistemic rationality is normally applied to attitudes, like beliefs, so faith’s epistemic rationality primarily concerns faith as a mental state. This article also focuses on propositional faith, and it divides the discussion of faith’s epistemic rationality into two parts: evidence and knowledge.

i. Faith and Evidence

Before discussing faith, it might help to discuss the relationship between evidence and epistemic rationality. It is widely thought that epistemically rational people follow the evidence. While the exact relationship between evidence and epistemic rationality is controversial, many endorse what is called evidentialism, the view that you are epistemically rational if and only if you proportion your beliefs to the evidence.

We have seen that faith is resilient: it helps us keep our commitments in the face of counterevidence. Given faith’s resilience, it is natural to think that faith goes beyond the evidence (or involves a disposition to go beyond the evidence). But would having faith then violate evidentialism? Can faith be both perfectly proportioned to the evidence and go beyond the evidence? Answers to these questions fall into three main camps, taking different perspectives on faith, evidence, and evidentialism.

The first camp, mentioned previously, maintains that faith violates evidentialism because it goes beyond the evidence; but evidentialism is a requirement of rationality; thus, faith is irrational. Fideists and the New Atheists may represent such a view. However, you might think that the idea that faith is always irrational is too strong, and that, instead, faith is more like belief: sometimes rational and sometimes irrational. Those who think faith can be rational fall into two camps.

The second camp holds that rational faith does not violate evidentialism and that there are ways to capture faith’s resilience that respect evidentialism. For example, consider Anscombe’s and Zagzebski’s view that faith is believing another’s testimony. On this view, faith is based on evidence, and rational faith is proportioned to the evidence: testimonial evidence. Of course, this assumes that testimony is evidence, but this is highly plausible: many of our geographical, scientific, and even everyday beliefs are based on testimony. Most of our scientific beliefs are not based on experiments we did ourselves—they are based on results reported by scientists. We trust their testimony. We believe geographical facts about the shape of the globe and things about other countries even though we have never traveled there ourselves—again, based on testimony. We ask people for directions on the street and believe our family and friends when they report things to us. Testimony is an extremely important source of evidence, and without it, we would be in the dark about a lot of things.

In what sense does faith go beyond the evidence, on this view? Well, sometimes, we have only testimony to go on. We may not have the time or ability to verify what someone tells us through outside sources, and we may be torn about whether to trust someone. In choosing to take someone’s word for something, we go beyond the evidence. At the very least, we go beyond certain kinds of evidence, in that we do not require outside verifying evidence. One worry for this view, however, is that it makes faith straightforwardly based on evidence, and thus cannot sufficiently explain faith’s resilience, or how faith goes beyond the evidence.

A second view on which rational faith goes beyond the evidence without violating evidentialism draws on a view in epistemology known as epistemic permissivism: the view that sometimes, the evidence allows for multiple different rational attitudes toward a proposition. In permissive cases, where your evidence does not point you one way or another, there is an evidential tie between two attitudes. You can then choose to hold the faithful attitude, consistent with, but not required by, your evidence. This does not violate evidentialism, as the faithful attitude is permitted by, and in that sense fits, your evidence. At the same time, faith goes beyond the evidence in the sense that the faithful attitude is not strictly required by your evidence.

Consider two concrete examples. First, suppose your brother is accused of a serious crime. Suppose that there are several good, competing explanations of what happened. It might be rational for you to withhold belief, or even believe your brother is guilty, but you could instead choose the explanation of the evidence that supports your brother’s innocence. This demonstrates faith that your brother is innocent without violating the evidence, since believing that he is innocent is a rational response to the data.

Or suppose you are trying to decide whether God exists. The evidence for (a)theism is complicated and difficult to assess, and there are good arguments on both sides. Suppose, because the evidence is complicated in this way, you could be rational as a theist (who believes God exists), atheist (who believes God does not exist), or agnostic (who is undecided on whether God exists). Say you go out on a limb and decide to have faith that God exists. You are going beyond the evidence, but you are also not irrational, since your evidence rationally permits you to be a theist. Again, this is a case where rational faith respects evidentialism, but also goes beyond the evidence. (Note that, depending on how evidentialism is defined, this response may better fit under the third view, discussed next. Some strong versions of evidentialism are inconsistent with permissivism, and on some versions of the permissivist theory of faith, non-evidential factors can break evidential ties, so things besides evidence affect rational belief.) Attempts to reconcile faith’s resilience with evidentialism include, for example, Jackson (2019) and Dormandy (2021).

The third and final camp holds the view that faith, in going beyond the evidence, violates evidentialism, but this does not mean that faith is irrational. (James 1896/2011 and Bishop 2007 may well be characterized as proponents of this view, as they explicitly reject Clifford’s evidentialism). For example, you might maintain that evidentialism applies to belief, but not faith. After all, it is natural to think that faith goes beyond the evidence in a way that belief does not. To maintain evidentialism about belief, proponents of this view would need to say that rational faith is inconsistent with belief. Then, faith might be subject to different, non-evidentialist norms, but could still be rational and go beyond the evidence.

A second family of views that rejects evidentialism but maintains faith’s rationality comprises externalist views. Externalists maintain that epistemic justification depends on factors that are external to the person—for example, your belief that there is a cup on the desk can be rational if it is formed by a reliable perceptual process, whether or not you have evidence that there is a cup. Plantinga in particular is an externalist who thinks epistemic justification (or “warrant”) is a matter of functioning properly. Plantinga (2000) argues that religious beliefs can be properly basic: rational even if not based on an argument. Plantinga’s view involves the sensus divinitatis: a sense of the divine that, when functioning properly, causes people to form beliefs about God (for example, “There is a Creator”; “God exists”; “God can help me”), especially in particular circumstances (for example, in nature, when in need of help, and so forth). These beliefs can be rational, even if not based on argument, and may be rational without any evidence at all.

That said, the view that religious belief can be properly basic does not, by itself, conflict with evidentialism. If a religious belief is based on experiential evidence, but not arguments, it can still be rational according to an evidentialist. Externalist views that deny evidentialism make a stronger claim: that religious belief can be rational without argument or evidence (see Plantinga 2000: 178).

Externalist views—at least ones that reject evidentialism—may be able to explain how rational faith goes beyond the evidence; evidence is not required for faith (or belief) to be epistemically rational. Even so, most externalist views include a no-defeater condition: if you get evidence that a belief is false (a defeater), that can affect, or even preclude, your epistemic justification. For example, you might form a warranted belief in God based on the sensus divinitatis but then begin to question why a loving, powerful God would allow the world’s serious and seemingly pointless evils; this counterevidence could remove the warrant for your belief in God. Generally, externalist views may need a story about how faith can be resilient in the face of counterevidence to fully capture the idea that faith goes beyond the evidence.

We have seen three views about the relationship between faith, evidence, and evidentialism. On the first view, evidentialism is true, and faith does not respect evidentialism, so faith is irrational. On the second, evidentialism is true, and rational faith goes beyond the evidence in a way that respects evidentialism. On the final view, evidentialism is false, so faith does not have to be based on evidence; this makes space for rational faith to go beyond the evidence. Now, we turn to a second topic concerning the epistemology of faith: faith and knowledge.

ii. Faith and Knowledge

Epistemology is the study of knowledge. Epistemologists mostly focus on propositional knowledge: knowledge that a proposition is true. For example, you might know that 1+1=2 or that it is cold today. Knowledge involves at least three components: justification, truth, and belief. If you know that it is cold today, you believe that it is cold today, it is indeed cold today, and your belief that it is cold today is epistemically justified. (While these three components are necessary for knowledge, many think they are not sufficient, due to Gettier’s (1963) famous counterexamples to the justified true belief account of knowledge.) Note that knowledge is a high epistemic ideal. When a belief amounts to knowledge, it is not merely justified, but it is also true. Many epistemologists also think that knowledge requires a high degree of justification, for example, quite good evidence.

There are three main views about the relationship between faith and knowledge. The first is that propositional faith is a kind of knowledge. Plantinga’s view lends itself to a view of faith along these lines, as Plantinga’s story about proper function is ultimately an account of knowledge. Plantinga’s view is inspired by Calvin, who defines faith as a “firm and certain” knowledge of God (Institutes III, ii, 7:551). If Plantinga is right that (undefeated) theistic beliefs, formed reliably by properly functioning faculties in the right conditions, amount to knowledge, then Plantinga’s view might be rightfully characterized as one on which faith is (closely tied to) knowledge. Relatedly, Aquinas discusses a kind of faith that resembles knowledge, but is ultimately “midway between knowledge and opinion” (Summa Theologica 2a2ae 1:2).

On a second view, propositional faith is not a kind of knowledge, but can amount to knowledge in certain circumstances. For example, one might hold that faith may be consistent with less evidence or justification than is required for knowledge, or that faith does not require belief. Thus, one could have faith that p—even rationally—even if one does not know that p. Keep in mind that knowledge is a high epistemic bar, so meeting this bar for knowledge may not be required for faith to be rational—faith that p might be rational even if, for example, p is false, so p is not known. However, faith that p may amount to knowledge when it meets the conditions for knowledge: p is justifiedly believed, true, and not Gettiered.

On a final view, faith that p is inconsistent with knowing p. For example, Howard-Snyder (2013: 370) suggests that for faith, one’s evidence is often “sub-optimal.” Along similar lines, Alston (1996: 12) notes that “[F]aith-that has at least a strong suggestion of a weak epistemic position vis-a-vis the proposition in question.” Since knowledge sets a high epistemic bar (the proposition in question must enjoy a high degree of justification, be true, and so forth), faith may play a role when your epistemic position is too poor to know. And if you know p, faith that p is not needed. This fits well with Kant’s famous remarks: “I have… found it necessary to deny knowledge, in order to make room for faith” (Preface to the Second Edition of the Critique of Pure Reason, 1787/1933: 29). On this third view, then, if you have faith that p, you do not know p, and if you know p, faith that p is unnecessary.

As noted, many epistemologists focus on knowledge-that: knowing that a proposition is true. However, there are other kinds of knowledge: knowledge-how, or knowing how to perform some action, such as riding a bike, and knowledge-who, or knowing someone personally. There has been some interesting work on non-propositional knowledge and faith: see Sliwa (2018) for knowledge-how, and Benton (2018) for knowledge-who. Note that non-propositional knowledge might better fit with non-propositional faith, such as faith-in. This raises several interesting questions, such as: does faith in God require interpersonal knowledge of God? And how does this relate to the belief that God exists? The relationship between non-propositional knowledge and faith merits further exploration.

b. Faith’s Practical Rationality

A second question is whether faith can be practically rational, and if so, when and how. Practical rationality, unlike epistemic rationality, is associated with what is good for you: what fulfills your desires and leads to your flourishing. Examples of practically rational actions include brushing your teeth, saving for retirement, pursuing your dream job, and other things conducive to meeting your goals and improving your life (although see Ballard 2017 for an argument that faith’s practical and epistemic rationality are importantly connected).

Practical rationality is normally applied to actions. Thus, it makes the most sense to evaluate action-focused faith for practical rationality. In particular, acceptance, or acting as if a proposition is true, is often associated with action-focused faith. Thus, this article focuses on what makes accepting a proposition of faith practically rational, and whether leaps of faith can be practically rational but go beyond the evidence.

Elizabeth Jackson’s (2021) view of faith focuses on how acceptance-based faith can be practically rational in light of counterevidence. Jackson notes that, on two major theories of rational action (the belief-desire view and the decision-theory view), rational action is caused by two things: beliefs and desires. If it is rational for you to go to the fridge, this is because you want food (a desire) and you believe there is food in the fridge (a belief). But you can believe and desire things to a greater or lesser degree; you might rationally act on something because you have a strong desire for it, even though you consider it unlikely. Suppose your brother goes missing. He has been missing for a long time, and there is a lot of evidence he is dead, but you think there is some chance he might be alive. Because it would be so good if he were alive and you found him, you have action-focused faith that he is alive: you put up missing posters, spend lots of time searching for him, and so forth. The goodness of finding him again makes this rational, despite your counterevidence. Or consider another example: you might rationally accept that God exists, by practicing a religion, participating in prayer and liturgy, and joining a spiritual community, even if you have strong evidence against theism. This is because you stand to gain a great deal if you accept that God exists and God does exist, and you have little to lose if God does not exist.

Arguably, then, it is even easier for practically rational faith to go beyond the evidence than it is for epistemically rational faith. Taking an act of faith might be practically rational even if one has little evidence for the proposition one is accepting. Practically rational action depends both on your evidence and on what is at stake, and it can be rational to act as if something is true even if your evidence points the other way. In this way, practically rational faith can be resilient in light of counterevidence: what you lose in evidence can be made up for in desire.

Of course, this does not mean that faith is always practically rational. Both your beliefs/evidence and your desires/what is good for you can render faith practically irrational. For example, if you became certain your brother was dead (perhaps his body was found), then acting as if your brother is still alive would be practically irrational. Similarly, faith could be practically irrational if its object is not good for your flourishing: for example, faith that you will get back together with an abusive partner.

However, since it can be rational to accept that something is true even if you have overwhelming evidence that it is false, practically rational acts of faith go beyond (and even against) the evidence. For other related decision-theoretic accounts of how practically rational faith can go beyond the evidence, see Buchak (2012) and McKaughan (2013).
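The decision-theoretic point behind these accounts can be made concrete with a toy expected-utility calculation. This is only an illustrative sketch: the function name and the payoff numbers below are hypothetical choices of mine, not taken from Jackson, Buchak, or McKaughan; they merely show how a high payoff can rationally outweigh a low probability.

```python
# A minimal expected-utility sketch of the decision-theoretic idea above.
# All names and numbers are hypothetical illustrations, not from the article.

def expected_utility(prob_true, payoff_if_true, payoff_if_false):
    """Expected utility of an act, given the probability that p is true."""
    return prob_true * payoff_if_true + (1 - prob_true) * payoff_if_false

# The missing-brother case: the evidence makes "he is alive" unlikely (say 0.1),
# but finding him alive would be enormously good, while searching costs little.
search = expected_utility(0.1, payoff_if_true=100, payoff_if_false=-5)   # 5.5
give_up = expected_utility(0.1, payoff_if_true=0, payoff_if_false=0)     # 0.0

# Acting in faith maximizes expected utility despite the counterevidence.
assert search > give_up
```

On these (stipulated) numbers, searching has higher expected utility than giving up even though the proposition acted on is probably false, which is the sense in which practically rational faith can go beyond, and even against, the evidence.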

c. Faith and Morality/Virtue

The third and final way to evaluate faith is from a moral perspective. There is a family of questions regarding the ethics of faith: whether and when is faith morally permissible? Is faith ever morally obligatory? Is it appropriate to regard faith as a virtue? Can faith be immoral?

We normally ask what actions, rather than what mental states, are obligatory/permissible/wrong. While virtues are not themselves actions, they are (or lead to) dispositions to act. In either case, it makes sense to morally evaluate action-focused faith. (However, some argue for doxastic wronging, that is, the view that beliefs can morally wrong others. If they can, this suggests beliefs—and perhaps other mental states—can be morally evaluated. This may open up space to morally evaluate attitude-focused faith as well.)

As with the epistemic and practical cases, it would be wrong to think that all cases of faith fit into one moral category. Faith is not always moral: faith in an evil cause or evil person can be immoral. But faith is not always immoral, and may sometimes be morally good: faith in one’s close friends or family members, or faith in causes like world peace or ending world hunger, seems morally permissible, if not morally obligatory.

One of the most widely discussed topics on the ethics of faith is faith as a virtue (see Aquinas, Summa Theologiae II-II, q. 1-16). Faith is often taken to be both a virtue in general and, in the Christian tradition, a theological virtue (along with hope and charity). For reasons just discussed, the idea that faith is a virtue by definition seems incorrect. Faith is not always morally good—it is possible to have faith in morally bad people or causes, and to have faith with morally bad effects. (This is why the discussion of faith as a virtue belongs in this section, rather than in previous sections on the nature of faith.)

This raises the question: Can faith satisfy the conditions for virtue? According to Aristotle, a virtue is a positive character trait that is demonstrated consistently, across situations and across time. Virtues are acquired freely and deliberately and bring benefits to both the virtuous person and to their community. For example, if you have the virtue of honesty, you will be honest in various situations and also over time; you will have acquired honesty freely and deliberately (not by accident), and your honesty will bring benefits both to yourself and those in your community. Thus, assuming this orthodox Aristotelian definition of virtue, when faith is a virtue, it is a stable character trait, acquired freely and deliberately, that brings benefits to both the faithful person and their community.

There have been several discussions of the virtue of faith in the literature. Anne Jeffrey (2017-a) argues that there is a tension between common assumptions about faith and Aristotelian virtue ethics. Specifically, some have argued that part of faith’s function depends on a limitation or an imperfection in the faithful person (for example, keeping us steadfast and committed in light of doubts or misguided affections). However, according to the Aristotelian view, virtues are traits held by fully virtuous people who have perfect practical knowledge and always choose the virtuous action. Taken together, these two views create a challenge for the idea that faith is a virtue, as faith seems to require imperfections or limitations incompatible with virtue. While this tension could be resolved by challenging the idea that faith’s role necessarily involves a limitation, Jeffrey instead argues that we should re-conceive Aristotelian virtue ethics and embrace the idea that even people with limitations can possess and exercise virtues. In another paper, Jeffrey (2017-b) argues that we can secure the practical rationality and moral permissibility of religious faith—which seems necessary if faith is a virtue—by appealing to the idea that faith is accompanied by another virtue, hope.

There is a second reason to think that the theological virtues—faith, hope, and charity—may not perfectly fit into the Aristotelian mold. While Aristotelian virtues are freely acquired by habituation, some thinkers suggest that theological virtues are infused immediately by God, rather than acquired over time (Aquinas, Summa Theologiae II-II, q. 6). While some may conclude from this that faith, along with the other theological virtues, is not a true virtue, this may further support Jeffrey’s suggestion that Aristotle’s criteria for virtue may need to be altered or reconceived. Or perhaps there are two kinds of virtues: Aristotelian acquired virtues and theological infused virtues, each with their own characteristics.

A final topic that has been explored is the question of how virtuous faith interacts with other virtues. The relationship between faith and humility is widely discussed. Several authors have noted that prima facie, faith seems to be in tension with humility: faith involves taking various risks (both epistemic and action-focused risks), but in some cases, those risks may be a sign of overconfidence, which can be in tension with exhibiting humility (intellectual or otherwise). In response to this, both Kvanvig (2018) and Malcolm (2021) argue that faith and humility are two virtues that balance each other out. Kvanvig argues that humility is a matter of where your attention is directed (say, not at yourself), and this appropriately directed attention can guide faithful action. Malcolm argues that religious faith can be understood as a kind of trust in God—specifically, a reliance on God’s testimony, which, when virtuous, exhibits a kind of intellectual humility.

4. Conclusion

Faith is a trusting commitment to someone or something. There are at least four distinctions among kinds of faith: attitude-focused faith vs. act-focused faith, faith-that vs. faith-in, religious vs. non-religious faith, and important vs. mundane faith (Section 1.a). Trust, risk, resilience, and going beyond the evidence are all closely associated with faith (Section 1.b). Considering faith’s relationship to attitudes, states, or actions—belief, doubt, desire, hope, and acceptance—sheds further light on the nature of faith (Section 2). There are three main ways we might evaluate faith: epistemically, practically, and morally. While faith is not always epistemically rational, practically rational, or morally permissible, we have seen reason to think that faith can be positively evaluated in many cases (Section 3).

5. References and Further Reading

  • Ali, Zain. (2013). Faith, Philosophy, and the Reflective Muslim. London, UK: Palgrave Macmillan.
  • Alston, William. (1996). “Belief, Acceptance, and Religious Faith.” In J. Jordan and D. Howard-Snyder (eds.), Faith, Freedom, and Rationality pp. 3–27. Lanham, MD: Rowman and Littlefield.
  • Anscombe, G. E. M. (2008). “Faith.” In M. Geach and L. Gormally (eds.), Faith in a Hard Ground. Exeter: Imprint Academic, 11–19.
  • Audi, Robert. (2011). Rationality and Religious Commitment. New York: Oxford University Press.
  • Ballard, Brian. (2017). “The Rationality of Faith and the Benefits of Religion.” International Journal for the Philosophy of Religion 81: 213–227.
  • Bates, Matthew. (2017). Salvation by Allegiance Alone. Grand Rapids: Baker Academic.
  • Benton, Matthew. (2018). “God and Interpersonal Knowledge.” Res Philosophica 95(3): 421–447.
  • Bishop, John. (2007). Believing by Faith: An Essay in the Epistemology and Ethics of Religious Belief. Oxford: OUP.
  • Bishop, John. (2016). “Faith.” Stanford Encyclopedia of Philosophy. Edward N. Zalta (ed.) https://plato.stanford.edu/entries/faith/
  • Buchak, Lara. (2012). “Can it Be Rational to Have Faith?” In Jake Chandler & Victoria Harrison (eds.), Probability in the Philosophy of Religion, pp. 225–247. Oxford: Oxford University Press.
  • Buchak, Lara. (2017). “Reason and Faith.” In The Oxford Handbook of the Epistemology of Theology (edited by William J. Abraham and Frederick D. Aquino), pp. 46–63. Oxford: OUP.
  • Buckareff, Andrei A. (2005). “Can Faith Be a Doxastic Venture?” Religious Studies 41: 435–445.
  • Byerly, T. R. (2012). “Faith as an Epistemic Disposition.” European Journal for Philosophy of Religion, 4(1): 109–128.
  • Cohen, Jonathan. (1992). An Essay on Belief and Acceptance. New York: Clarendon Press.
  • Dewey, John. (1934). A Common Faith. New Haven, CT: Yale University Press.
  • Dormandy, Katherine. (2021). “True Faith: Against Doxastic Partiality about Faith (in God and Religious Communities) and in Defense of Evidentialism.” Australasian Philosophical Review 5(1): 4–28.
  • Gettier, Edmund. (1963). “Is Justified True Belief Knowledge?” Analysis 23(6): 121–123.
  • Howard-Snyder, Daniel. (2013). “Propositional Faith: What it is and What it is Not.” American Philosophical Quarterly 50(4): 357–372.
  • Howard-Snyder, Daniel. (2018). “Can Fictionalists Have Faith? It All Depends.” Religious Studies 55: 1–22.
  • Jackson, Elizabeth. (2019). “Belief, Credence, and Faith.” Religious Studies 55(2): 153–168.
  • Jackson, Elizabeth. (2020). “The Nature and Rationality of Faith.” A New Theist Response to the New Atheists (Joshua Rasmussen and Kevin Vallier, eds.), pp. 77–92. New York: Routledge.
  • Jackson, Elizabeth. (2021). “Belief, Faith, and Hope: On the Rationality of Long-Term Commitment.” Mind. 130(517): 35–57.
  • Jeffrey, Anne. (2017-a). “How Aristotelians Can Make Faith a Virtue.” Ethical Theory and Moral Practice 20(2): 393–409.
  • Jeffrey, Anne. (2017-b). “Does Hope Morally Vindicate Faith?” International Journal for Philosophy of Religion 81(1-2): 193–211.
  • James, William. (1896/2011). “The Will to Believe.” In J. Shook (ed.) The Essential William James, pp. 157–178. New York: Prometheus Books.
  • Kvanvig, Jonathan. (2013). “Affective Theism and People of Faith.” Midwest Studies in Philosophy 37: 109–128.
  • Kvanvig, Jonathan. (2018). Faith and Humility. Oxford: OUP.
  • Lebens, Samuel. (2021). “Will I Get a Job? Contextualism, Belief, and Faith.” Synthese 199(3-4): 5769–5790.
  • Malcolm, Finlay. (2021). “Testimony, Faith, and Humility.” Religious Studies 57(3): 466–483.
  • Malcolm, Finlay and Michael Scott. (2017). “Faith, Belief, and Fictionalism.” Pacific Philosophical Quarterly 98(1): 257–274.
  • Malcolm, Finlay and Michael Scott. (2021). “True Grit and the Positivity of Faith.” European Journal of Analytic Philosophy 17(1): 5–32.
  • Martin, Adrienne M. (2013). How We Hope: A Moral Psychology. Princeton: Princeton University Press.
  • Matheson, Jonathan. (2018). “Gritty Faith.” American Catholic Philosophical Quarterly 92(3): 499–513.
  • McKaughan, Daniel. (2013). “Authentic Faith and Acknowledged Risk: Dissolving the Problem of Faith and Reason.” Religious Studies 49: 101–124.
  • Milona, Michael. (2019). “Finding Hope.” The Canadian Journal of Philosophy 49(5): 710–729.
  • Moon, Andrew. (2018). “The Nature of Doubt and a New Puzzle about Belief, Doubt, and Confidence.” Synthese 195(4): 1827–1848.
  • Paul, Sarah K., and Jennifer M. Morton. (2019). “Grit.” Ethics 129: 175–203.
  • Plantinga, Alvin. (2000). Warranted Christian Belief. New York: Oxford University Press.
  • Pojman, Louis. (1986). “Faith Without Belief?” Faith and Philosophy 3(2): 157–176.
  • Rettler, Brad. (2018). “Analysis of Faith.” Philosophy Compass 13(9): 1–10.
  • Rioux, Catherine. (2021). “Hope: Conceptual and Normative Issues.” Philosophy Compass 16(3): 1–11.
  • Schellenberg, J.L. (2005). Prolegomena to a Philosophy of Religion. Ithaca: Cornell University Press.
  • Sliwa, Paulina. (2018). “Know-How and Acts of Faith.” In Matthew A. Benton, John Hawthorne, and Dani Rabinowitz (eds.), Knowledge, Belief, and God: New Insights in Religious Epistemology, pp. 246–263. Oxford: Oxford University Press.
  • Speak, Daniel. (2007). “Salvation Without Belief.” Religious Studies 43(2): 229–236.
  • Swinburne, Richard. (1981). “The Nature of Faith.” In R. Swinburne, Faith and Reason, pp. 104–124. Oxford: Clarendon Press.
  • Swindal, James. (2021). “Faith: Historical Perspectives.” Internet Encyclopedia of Philosophy. https://iep.utm.edu/faith-re/
  • Whitaker, Robert K. (2019). “Faith and Disbelief.” International Journal for Philosophy of Religion 85: 149–172.
  • Zagzebski, Linda Trinkaus. (2012). “Religious Authority.” In L. T. Zagzebski, Epistemic Authority: A Theory of Trust, Authority, and Autonomy in Belief, pp. 181–203. Oxford: Oxford University Press.


Author Information

Elizabeth Jackson
Email: lizjackson111@ryerson.ca
Toronto Metropolitan University
Canada

Pseudoscience and the Demarcation Problem

The demarcation problem in philosophy of science refers to the question of how to meaningfully and reliably separate science from pseudoscience. Both the terms “science” and “pseudoscience” are notoriously difficult to define precisely, except in terms of family resemblance. The demarcation problem has a long history, tracing back at least to a speech given by Socrates in Plato’s Charmides, as well as to Cicero’s critique of Stoic ideas on divination. Karl Popper was the most influential modern philosopher to write on demarcation, proposing his criterion of falsifiability to sharply distinguish science from pseudoscience. Most contemporary practitioners, however, agree that Popper’s suggestion does not work. In fact, Larry Laudan suggested that the demarcation problem is insoluble and that philosophers would be better off focusing their efforts on something else. This led to a series of responses to Laudan and new proposals on how to move forward, collected in a landmark edited volume on the philosophy of pseudoscience. After the publication of this volume, the field saw a renaissance characterized by a number of innovative approaches. Two such approaches are particularly highlighted in this article: treating pseudoscience and pseudophilosophy as BS, that is, “bullshit” in Harry Frankfurt’s sense of the term, and applying virtue epistemology to the demarcation problem. This article also looks at the grassroots movement often referred to as scientific skepticism and at its philosophical bases.

Table of Contents

  1. An Ancient Problem with a Long History
  2. The Demise of Demarcation: The Laudan Paper
  3. The Return of Demarcation: The University of Chicago Press Volume
  4. The Renaissance of the Demarcation Problem
  5. Pseudoscience as BS
  6. Virtue Epistemology and Demarcation
  7. The Scientific Skepticism Movement
  8. References and Further Readings

1. An Ancient Problem with a Long History

In the Charmides (West and West translation, 1986), Plato has Socrates tackle what contemporary philosophers of science refer to as the demarcation problem, the separation between science and pseudoscience. In that dialogue, Socrates is referring to a specific but very practical demarcation issue: how to tell the difference between medicine and quackery. Here is the most relevant excerpt:

SOCRATES: Let us consider the matter in this way. If the wise man or any other man wants to distinguish the true physician from the false, how will he proceed? . . . He who would inquire into the nature of medicine must test it in health and disease, which are the sphere of medicine, and not in what is extraneous and is not its sphere?

CRITIAS: True.

SOCRATES: And he who wishes to make a fair test of the physician as a physician will test him in what relates to these?

CRITIAS: He will.

SOCRATES: He will consider whether what he says is true, and whether what he does is right, in relation to health and disease?

CRITIAS: He will.

SOCRATES: But can anyone pursue the inquiry into either, unless he has a knowledge of medicine?

CRITIAS: He cannot.

SOCRATES: No one at all, it would seem, except the physician can have this knowledge—and therefore not the wise man. He would have to be a physician as well as a wise man.

CRITIAS: Very true. (170e-171c)

The conclusion at which Socrates arrives, therefore, is that the wise person would have to develop expertise in medicine, as that is the only way to distinguish an actual doctor from a quack. Setting aside that such a solution is not practical for most people in most settings, the underlying question remains: how do we decide whom to pick as our instructor? What if we mistake a school of quackery for a medical one? Do quacks not also claim to be experts? Is this not a hopelessly circular conundrum?

A few centuries later, the Roman orator, statesman, and philosopher Marcus Tullius Cicero published a comprehensive attack on the notion of divination, essentially treating it as what we would today call a pseudoscience, and anticipating a number of arguments that have been developed by philosophers of science in modern times. As Fernandez-Beanato (2020a) points out, Cicero uses the Latin word “scientia” to refer to a broader set of disciplines than the English “science.” His meaning is closer to the German word “Wissenschaft,” which means that his treatment of demarcation potentially extends to what we would today call the humanities, such as history and philosophy.

Being a member of the New Academy, and therefore a moderate epistemic skeptic, Cicero writes: “As I fear to hastily give my assent to something false or insufficiently substantiated, it seems that I should make a careful comparison of arguments […]. For to hasten to give assent to something erroneous is shameful in all things” (De Divinatione, I.7 / Falconer translation, 2014). He thus frames the debate on unsubstantiated claims, and divination in particular, as a moral one.

Fernandez-Beanato identifies five modern criteria that often come up in discussions of demarcation and that are either explicitly or implicitly advocated by Cicero: internal logical consistency of whatever notion is under scrutiny; degree of empirical confirmation of the predictions made by a given hypothesis; degree of specificity of the proposed mechanisms underlying a certain phenomenon; degree of arbitrariness in the application of an idea; and degree of selectivity of the data presented by the practitioners of a particular approach. Divination fails, according to Cicero, because it is logically inconsistent, it lacks empirical confirmation, its practitioners have not proposed a suitable mechanism, said practitioners apply the notion arbitrarily, and they are highly selective in what they consider to be successes of their practice.

Jumping ahead to more recent times, arguably the first modern instance of a scientific investigation into allegedly pseudoscientific claims is the case of the famous Royal Commissions on Animal Magnetism appointed by King Louis XVI in 1784. One of them, the so-called Society Commission, was composed of five physicians from the Royal Society of Medicine; the other, the so-called Franklin Commission, comprised four physicians from the Paris Faculty of Medicine, as well as Benjamin Franklin. The goal of both commissions was to investigate claims of “mesmerism,” or animal magnetism, being made by Franz Mesmer and some of his students (Salas and Salas 1996; Armando and Belhoste 2018).

Mesmer was a medical doctor who began his career with a questionable study entitled “A Physico-Medical Dissertation on the Influence of the Planets.” Later, he developed a theory according to which all living organisms are permeated by a vital force that can, with particular techniques, be harnessed for therapeutic purposes. While mesmerism became popular and influential for decades between the end of the 18th century and the full span of the 19th century, it is now considered a pseudoscience, in large part because of the failure to empirically replicate its claims and because vitalism in general has been abandoned as a theoretical notion in the biological sciences. Interestingly, though, Mesmer clearly thought he was doing good science within a physicalist paradigm and distanced himself from the more obviously supernatural practices of some of his contemporaries, such as the exorcist Johann Joseph Gassner.

For the purposes of this article, we need to stress the importance of the Franklin Commission in particular, since it represented arguably the first attempt in history to carry out controlled experiments. These were largely designed by Antoine Lavoisier, complete with a double-blind protocol in which neither subjects nor investigators knew which treatment they were dealing with at any particular time, the allegedly genuine one or a sham control. As Stephen Jay Gould (1989) put it:

The report of the Royal Commission of 1784 is a masterpiece of the genre, an enduring testimony to the power and beauty of reason. … The Report is a key document in the history of human reason. It should be rescued from its current obscurity, translated into all languages, and reprinted by organizations dedicated to the unmasking of quackery and the defense of rational thought.

Not surprisingly, neither Commission found any evidence supporting Mesmer’s claims. The Franklin report was printed in a run of 20,000 copies and widely circulated in France and abroad, but this did not stop mesmerism from becoming widespread, with hundreds of books published on the subject in the period 1766-1925.

Arriving now to modern times, the philosopher who started the discussion on demarcation is Karl Popper (1959), who thought he had formulated a neat solution: falsifiability (Shea no date). He reckoned that—contra popular understanding—science does not make progress by proving its theories correct, since it is far too easy to selectively accumulate data that are favorable to one’s pre-established views. Rather, for Popper, science progresses by eliminating one bad theory after another, because once a notion has been proven to be false, it will stay that way. He concluded that what distinguishes science from pseudoscience is the (potential) falsifiability of scientific hypotheses, and the inability of pseudoscientific notions to be subjected to the falsifiability test.

For instance, Einstein’s theory of general relativity survived a crucial test in 1919, when one of its most extraordinary predictions—that light is bent by the presence of gravitational masses—was spectacularly confirmed during a total eclipse of the sun (Kennefick 2019). This did not prove that the theory is true, but it showed that it was falsifiable and, therefore, good science. Moreover, Einstein’s prediction was unusual and very specific, and hence very risky for the theory. This, for Popper, is a good feature of a scientific theory, as it is too easy to survive attempts at falsification when predictions based on the theory are mundane or common to multiple theories.

In contrast with the example of the 1919 eclipse, Popper thought that Freudian and Adlerian psychoanalysis, as well as Marxist theories of history, are unfalsifiable in principle; they are so vague that no empirical test could ever show them to be incorrect, if they are incorrect. The point is subtle but crucial. Popper did not argue that those theories are, in fact, wrong, only that one could not possibly know if they were, and they should not, therefore, be classed as good science.

Popper became interested in demarcation because he wanted to free science from a serious issue raised by David Hume (1748), the so-called problem of induction. Scientific reasoning is based on induction, a process by which we generalize from a set of observed events to all observable events. For instance, we “know” that the sun will rise again tomorrow because we have observed the sun rising countless times in the past. More importantly, we attribute causation to phenomena on the basis of inductive reasoning: since event X is always followed by event Y, we infer that X causes Y.

The problem as identified by Hume is twofold. First, unlike deduction (as used in logic and mathematics), induction does not guarantee a given conclusion, it only makes that conclusion probable as a function of the available empirical evidence. Second, there is no way to logically justify the inference of a causal connection. The human mind does so automatically, says Hume, as a leap of imagination.

Popper was not satisfied with the notion that science is, ultimately, based on a logically unsubstantiated step. He reckoned that if we were able to reframe scientific progress in terms of deductive, not inductive logic, Hume’s problem would be circumvented. Hence falsificationism, which is, essentially, an application of modus tollens (Hausman et al. 2021) to scientific hypotheses:

If P, then Q
Not Q
Therefore, not P

For instance, if General Relativity is true, then we should observe a certain deviation of light coming from the stars when their rays pass near the sun (during a total eclipse or under similarly favorable circumstances). We do observe the predicted deviation. Therefore, we have (currently) no reason to reject General Relativity. However, had the observations carried out during the 1919 eclipse not aligned with the prediction, then there would have been sufficient reason, according to Popper, to reject General Relativity based on the above syllogism.
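The modus tollens schema on which falsificationism rests is a theorem of propositional logic, and it can be checked formally. As an illustration (the theorem name here is my own), a one-line proof in Lean 4:

```lean
-- Modus tollens: from "if P then Q" and "not Q", infer "not P".
-- Given a hypothetical proof hp of P, applying h would yield Q,
-- which contradicts hq; hence P must be false.
theorem modus_tollens {P Q : Prop} (h : P → Q) (hq : ¬Q) : ¬P :=
  fun hp => hq (h hp)
```

The proof makes Popper’s point vivid: the inference from a failed prediction (not Q) to the falsity of the hypothesis (not P) is deductively valid, with no appeal to induction.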

Science, on this view, does not make progress one induction, or confirmation, after the other, but one discarded theory after the other. And as a bonus, thought Popper, this looks like a neat criterion to demarcate science from pseudoscience.

In fact, it is a bit too neat, unfortunately. Plenty of philosophers after Popper (for example, Laudan 1983) have pointed out that a number of pseudoscientific notions are eminently falsifiable and have been shown to be false—astrology, for instance (Carlson 1985). Conversely, some notions that are currently considered to be scientific are also—at least temporarily—unfalsifiable (for example, string theory in physics: Hossenfelder 2018).

A related issue with falsificationism is presented by the so-called Duhem-Quine theses (Curd and Cover 2012), two allied propositions about the nature of knowledge, scientific or otherwise, advanced independently by physicist Pierre Duhem and philosopher Willard Van Orman Quine.

Duhem pointed out that when scientists think they are testing a given hypothesis, as in the case of the 1919 eclipse test of General Relativity, they are, in reality, testing a broad set of propositions constituted by the central hypothesis plus a number of ancillary assumptions. For instance, while the attention of astronomers in 1919 was on Einstein’s theory and its implications for the laws of optics, they also simultaneously “tested” the reliability of their telescopes and camera, among a number of more or less implicit additional hypotheses. Had something gone wrong, their likely first instinct, rightly, would have been to check that their equipment was functioning properly before taking the bold step of declaring General Relativity dead.

Quine, later on, articulated a broader account of human knowledge conceived as a web of beliefs. Part of this account is the notion that scientific theories are always underdetermined by the empirical evidence (Bonk 2008), meaning that different theories will be compatible with the same evidence at any given point in time. Indeed, for Quine it is not just that we test specific theories and their ancillary hypotheses. We literally test the entire web of human understanding. Certainly, if a test does not yield the predicted results we will first look at localized assumptions. But occasionally we may be forced to revise our notions at larger scales, up to and including mathematics and logic themselves.

The history of science presents good illustrations of how the Duhem-Quine theses undermine falsificationism. Classic among them are the twin tales, from the 19th century, of the spectacular discovery of a new planet and the equally spectacular failure to discover an additional one.

Astronomers had uncovered anomalies in the orbit of Uranus, at that time the outermost known planet in the solar system. These anomalies did not appear, at first, to be explainable by standard Newtonian mechanics, and yet nobody thought even for a moment to reject that theory on the basis of the newly available empirical evidence. Instead, mathematician Urbain Le Verrier postulated that the anomalies were the result of the gravitational interference of an as yet unknown planet, situated outside of Uranus’ orbit. The new planet, Neptune, was in fact discovered on the night of 23-24 September 1846, thanks to the precise calculations of Le Verrier (Grosser 1962).

The situation repeated itself shortly thereafter, this time with anomalies discovered in the orbit of the innermost planet of our system, Mercury. Again, Le Verrier hypothesized the existence of a hitherto undiscovered planet, which he named Vulcan. But Vulcan never materialized. Eventually astronomers really did have to jettison Newtonian mechanics and deploy the more sophisticated tools provided by General Relativity, which accounted for the distortion of Mercury’s orbit in terms of gravitational effects originating with the Sun (Baum and Sheehan 1997).

What prompted astronomers to react so differently to two seemingly identical situations? Popper would have recognized the two similar hypotheses put forth by Le Verrier as being ad hoc and yet somewhat justified given the alternative, the rejection of Newtonian mechanics. But falsificationism has no tools capable of explaining why it is that sometimes ad hoc hypotheses are acceptable and at other times they are not. Nor, therefore, is it in a position to provide us with sure guidance in cases like those faced by Le Verrier and colleagues. This failure, together with wider criticism of Popper’s philosophy of science by the likes of Thomas Kuhn (1962), Imre Lakatos (1978), and Paul Feyerabend (1975), paved the way for a crisis of sorts for the whole project of demarcation in philosophy of science.

2. The Demise of Demarcation: The Laudan Paper

A landmark paper in the philosophy of demarcation was published by Larry Laudan in 1983. Provocatively entitled “The Demise of the Demarcation Problem,” it sought to dispatch the whole field of inquiry in one fell swoop. As the next section shows, the outcome was quite the opposite, as a number of philosophers responded to Laudan and reinvigorated the whole debate on demarcation. Nevertheless, it is instructive to look at Laudan’s paper and at some of his motivations for writing it.

Laudan was disturbed by the events that transpired during one of the classic legal cases concerning pseudoscience, specifically the teaching of so-called creation science in American classrooms. The case, McLean v. Arkansas Board of Education, was debated in 1982. Some of the fundamental questions that the presiding judge, William R. Overton, asked expert witnesses to address were whether Darwinian evolution is a science, whether creationism is also a science, and what criteria are typically used by the pertinent epistemic communities (that is, scientists and philosophers) to arrive at such assessments (LaFollette 1983).

One of the key witnesses on the evolution side was philosopher Michael Ruse, who presented Overton with a number of demarcation criteria, one of which was Popper’s falsificationism. According to Ruse’s testimony, creationism is not a science because, among other reasons, its claims cannot be falsified. In a famous and very public exchange with Ruse, Laudan (1988) objected to the use of falsificationism during the trial, on the grounds that Ruse must have known that that particular criterion had by then been rejected, or at least seriously questioned, by the majority of philosophers of science.

It was this episode that prompted Laudan to publish his landmark paper aimed at getting rid of the entire demarcation debate once and for all. One argument advanced by Laudan is that philosophers have been unable to agree on demarcation criteria since Aristotle and that it is therefore time to give up this particular quixotic quest. This is a rather questionable conclusion. Arguably, philosophy does not make progress by resolving debates, but by discovering and exploring alternative positions in the conceptual spaces defined by a particular philosophical question (Pigliucci 2017). Seen this way, falsificationism and modern debates on demarcation are a standard example of progress in philosophy of science, and there is no reason to abandon a fruitful line of inquiry so long as it keeps being fruitful.

Laudan then argues that the advent of fallibilism in epistemology (Feldman 1981) during the nineteenth century spelled the end of the demarcation problem, as epistemologists now recognize no meaningful distinction between opinion and knowledge. Setting aside the fact that the notion of fallibilism far predates the nineteenth century and goes back at least to the New Academy of ancient Greece, it may be the case, as Laudan maintains, that many modern epistemologists do not endorse the notion of an absolute and universal truth; but such a notion is not needed for any serious project of science-pseudoscience demarcation. All one needs is that some “opinions” are far better established, by way of argument and evidence, than others, and that scientific opinions tend to be dramatically better established than pseudoscientific ones.

It is certainly true, as Laudan maintains, that modern philosophers of science see science as a set of methods and procedures, not as a particular body of knowledge. But the two are tightly linked: the process of science yields reliable (if tentative) knowledge of the world. Conversely, the processes of pseudoscience, such as they are, do not yield any knowledge of the world. The distinction between science as a body of knowledge and science as a set of methods and procedures, therefore, does nothing to undermine the need for demarcation.

After a by now de rigueur criticism of the failure of positivism, Laudan attempts to undermine Popper’s falsificationism. But even Laudan himself seems to realize that the limits of falsificationism do not deal a death blow to the notion that there are recognizable sciences and pseudosciences: “One might respond to such criticisms [of falsificationism] by saying that scientific status is a matter of degree rather than kind” (Laudan 1983, 121). Indeed, that seems to be the currently dominant position of philosophers who are active in the area of demarcation.

The rest of Laudan’s critique boils down to the argument that no demarcation criterion proposed so far can provide a set of necessary and sufficient conditions to define an activity as scientific, and that the “epistemic heterogeneity of the activities and beliefs customarily regarded as scientific” (1983, 124) means that demarcation is a futile quest. This article now briefly examines each of these two claims.

Ever since Wittgenstein (1958), philosophers have recognized that any sufficiently complex concept will not likely be definable in terms of a small number of necessary and jointly sufficient conditions. That approach may work in basic math, geometry, and logic (for example, definitions of triangles and other geometric figures), but not for anything as complex as “science” or “pseudoscience.” This implies that single-criterion attempts like Popper’s should indeed finally be set aside, but it does not imply that multicriterial or “fuzzy” approaches will not be useful. Again, rather than a failure, this shift should be regarded as evidence of progress in this particular philosophical debate.

Regarding Laudan’s second claim from above, that science is a fundamentally heterogeneous activity, this may or may not be the case; the jury is still very much out. Some philosophers of science have indeed suggested that there is a fundamental disunity to the sciences (Dupré 1993), but this is far from being a consensus position. Even if true, a heterogeneity of “science” does not preclude thinking of the sciences as a family resemblance set, perhaps with distinctly identifiable sub-sets, similar to the Wittgensteinian description of “games” and their subdivision into fuzzy sets including board games, ball games, and so forth. Indeed, some of the authors discussed later in this article have made this very same proposal regarding pseudoscience: there may be no fundamental unity grouping, say, astrology, creationism, and anti-vaccination conspiracy theories, but they nevertheless share enough Wittgensteinian threads to make it useful for us to talk of all three as examples of broadly defined pseudosciences.

3. The Return of Demarcation: The University of Chicago Press Volume

Laudan’s 1983 paper had the desired effect of convincing a number of philosophers of science that it was not worth engaging with demarcation issues. Yet, in the meantime, pseudoscience remained a conspicuous social phenomenon, one that was having increasingly pernicious effects, for instance in the case of HIV, vaccine, and climate change denialism (Smith and Novella 2007; Navin 2013; Brulle 2020). It was probably inevitable, therefore, that philosophers of science who felt that their discipline ought to make positive contributions to society would, sooner or later, go back to the problem of demarcation.

The turning point was an edited volume entitled The Philosophy of Pseudoscience: Reconsidering the Demarcation Problem, published in 2013 by the University of Chicago Press (Pigliucci and Boudry 2013). The editors and contributors consciously and explicitly set out to respond to Laudan and to begin the work necessary to make progress (in something like the sense highlighted above) on the issue.

The first five chapters of The Philosophy of Pseudoscience take the form of various responses to Laudan, several of which hinge on the rejection of the strict requirement for a small set of necessary and jointly sufficient conditions to define science or pseudoscience. Contemporary philosophers of science, it seems, have no trouble with inherently fuzzy concepts. As for Laudan’s contention that the term “pseudoscience” does only negative, potentially inflammatory work, this is true and yet no different from, say, the use of “unethical” in moral philosophy, which few if any have thought of challenging.

The contributors to The Philosophy of Pseudoscience also readily admit that science is best considered as a family of related activities, with no fundamental essence to define it. Indeed, the same goes for pseudoscience as, for instance, vaccine denialism is very different from astrology, and both differ markedly from creationism. Nevertheless, there are common threads in both cases, and the existence of such threads justifies, in part, philosophical interest in demarcation. The same authors argue that we should focus on the borderline cases, precisely because there it is not easy to neatly separate activities into scientific and pseudoscientific. There is no controversy, for instance, in classifying fundamental physics and evolutionary biology as sciences, and there is no serious doubt that astrology and homeopathy are pseudosciences. But what are we to make of some research into the paranormal carried out by academic psychologists (Jeffers 2007)? Or of the epistemically questionable claims often, but not always, made by evolutionary psychologists (Kaplan 2006)?

The 2013 volume sought a consciously multidisciplinary approach to demarcation. Contributors include philosophers of science, but also sociologists, historians, and professional skeptics (meaning people who directly work on the examination of extraordinary claims). The group saw two fundamental reasons to continue scholarship on demarcation. On the one hand, science has acquired a high social status and commands large amounts of resources in modern society. This means that we ought to examine and understand its nature in order to make sound decisions about just how much trust to put into scientific institutions and proceedings, as well as how much money to pump into the social structure that is modern science. On the other hand, as noted above, pseudoscience is not a harmless pastime. It has negative effects on both individuals and societies. This means that an understanding of its nature, and of how it differs from science, has very practical consequences.

The Philosophy of Pseudoscience also tackles issues of history and sociology of the field. It contains a comprehensive history of the demarcation problem followed by a historical analysis of pseudoscience, which tracks down the coinage and currency of the term and explains its shifting meaning in tandem with the emerging historical identity of science. A contribution by a sociologist then provides an analysis of paranormalism as a “deviant discipline” violating the consensus of established science, and one chapter draws attention to the characteristic social organization of pseudosciences as a means of highlighting the corresponding sociological dimension of the scientific endeavor.

The volume explores the borderlands between science and pseudoscience, for instance by deploying the idea of causal asymmetries in evidential reasoning to differentiate between what are sometimes referred to as “hard” and “soft” sciences, arguing that misconceptions about this difference explain the higher incidence of pseudoscience and anti-science connected to the non-experimental sciences. One contribution looks at the demographics of pseudoscientific belief and examines how the demarcation problem is treated in legal cases. One chapter recounts the story of how at one time the pre-Darwinian concept of evolution was treated as pseudoscience in the same guise as mesmerism, before eventually becoming the professional science we are familiar with, thus challenging a conception of demarcation in terms of timeless and purely formal principles.

A discussion focusing on science and the supernatural includes the provocative suggestion that, contrary to recent philosophical trends, the appeal to the supernatural should not be ruled out from science on methodological grounds, as it is often done, but rather because the very notion of supernatural intervention suffers from fatal flaws. Meanwhile, David Hume is enlisted to help navigate the treacherous territory between science and religious pseudoscience and to assess the epistemic credentials of supernaturalism.

The Philosophy of Pseudoscience includes an analysis of the tactics deployed by “true believers” in pseudoscience, beginning with a discussion of the ethics of argumentation about pseudoscience, followed by the suggestion that alternative medicine can be evaluated scientifically despite the immunizing strategies deployed by some of its most vocal supporters. One entry summarizes misgivings about Freudian psychoanalysis, arguing that we should move beyond assessments of the testability and other logical properties of a theory, shifting our attention instead to the spurious claims of validation and other recurrent misdemeanors on the part of pseudoscientists. It also includes a description of the different strategies used by climate change “skeptics” and other denialists, outlining the links between new and “traditional” pseudosciences.

The volume includes a section examining the complex cognitive roots of pseudoscience. Some of the contributors ask whether we actually evolved to be irrational, describing a number of heuristics that are rational in domains ecologically relevant to ancient Homo sapiens, but that lead us astray in modern contexts. One of the chapters explores the non-cognitive functions of super-empirical beliefs, analyzing the different attitudes of science and pseudoscience toward intuition. An additional entry distinguishes between two mindsets about science and explores the cognitive styles relating to authority and tradition in both science and pseudoscience. This is followed by an essay proposing that belief in pseudoscience may be partly explained by theories about the ethics of belief. There is also a chapter on pseudo-hermeneutics and the illusion of understanding, drawing inspiration from the cognitive psychology and philosophy of intentional thinking.

A simple search of online databases of philosophical peer-reviewed papers clearly shows that the 2013 volume has succeeded in countering Laudan’s 1983 paper, yielding a flourishing of new entries in the demarcation literature in particular, and in the newly established subfield of the philosophy of pseudoscience more generally. This article now turns to a brief survey of some of the prominent themes that have so far characterized this Renaissance of the field of demarcation.

4. The Renaissance of the Demarcation Problem

After the publication of The Philosophy of Pseudoscience collection, an increasing number of papers have been published on the demarcation problem and related issues in philosophy of science and epistemology. It is not possible to discuss all the major contributions in detail, so what follows is intended as a representative set of highlights and a brief guide to the primary literature.

Sven Ove Hansson (2017) proposed that science denialism, often considered a different issue from pseudoscience, is actually one form of the latter, the other form being what he terms pseudotheory promotion. Hansson examines in detail three case studies: relativity theory denialism, evolution denialism, and climate change denialism. The analysis is couched in terms of three criteria for the identification of pseudoscientific statements, previously laid out by Hansson (2013). A statement is pseudoscientific if it satisfies the following:

  1. It pertains to an issue within the domains of science in the broad sense (the criterion of scientific domain).
  2. It suffers from such a severe lack of reliability that it cannot at all be trusted (the criterion of unreliability).
  3. It is part of a doctrine whose major proponents try to create the impression that it represents the most reliable knowledge on its subject matter (the criterion of deviant doctrine).

On these bases, Hansson concludes that, for example, “The misrepresentations of history presented by Holocaust deniers and other pseudo-historians are very similar in nature to the misrepresentations of natural science promoted by creationists and homeopaths” (2017, 40). In general, Hansson proposes that there is a continuum between science denialism at one end (for example, regarding climate change, the Holocaust, the general theory of relativity, etc.) and pseudotheory promotion at the other end (for example, astrology, homeopathy, iridology). He identifies four epistemological characteristics that account for the failure of science denialism to provide genuine knowledge:

  • Cherry picking. One example is Conservapedia’s entry listing alleged counterexamples to the general theory of relativity. Never mind that, of course, even a cursory inspection of such “anomalies” turns up only mistakes or misunderstandings.
  • Neglect of refuting information. Again concerning general relativity denialism, the proponents of the idea point to a theory advanced by the Swiss physicist Georges-Louis Le Sage that gravitational forces result from pressure exerted on physical bodies by a large number of small invisible particles. That idea might have been reasonably entertained when it was proposed, in the 18th century, but not after the devastating criticism it received in the 19th century—let alone the 21st.
  • Fabrication of fake controversies. Perhaps the most obvious example here is the “teach both theories” mantra so often repeated by creationists, which was adopted by Ronald Reagan during his 1980 presidential campaign. The fact is, there is no controversy about evolution within the pertinent epistemic community.
  • Deviant criteria of assent. For instance, in the 1920s and ‘30s, special relativity was accused of not being sufficiently transpicuous, and its opponents went so far as to attempt to create a new “German physics” that would not use difficult mathematics and would, therefore, be accessible to everyone. Both Einstein and Planck ridiculed the whole notion that science ought to be transpicuous in the first place. The point is that part of the denialist’s strategy is to ask for impossible standards in science and then use the fact that such demands are not met (because they cannot be) as “evidence” against a given scientific notion. This is known as the unobtainable perfection fallacy (Gauch 2012).

Hansson lists ten sociological characteristics of denialism: that the focal theory (say, evolution) threatens the denialist’s worldview (for instance, a fundamentalist understanding of Christianity); complaints that the focal theory is too difficult to understand; a lack of expertise among denialists; a strong predominance of men among the denialists (that is, lack of diversity); an inability to publish in peer-reviewed journals; a tendency to embrace conspiracy theories; appeals directly to the public; the pretense of having support among scientists; a pattern of attacks against legitimate scientists; and strong political overtones.

Dawes (2018) acknowledges, with Laudan (1983), that there is a general consensus that no single criterion (or even small set of necessary and jointly sufficient criteria) is capable of distinguishing science from pseudoscience. However, he correctly maintains that this does not imply that there is no multifactorial account of demarcation, situating different kinds of science and pseudoscience along a continuum. One such criterion is that science is a social process, which entails that a theory is considered scientific because it is part of a research tradition that is pursued by the scientific community.

Dawes is careful in rejecting the sort of social constructionism endorsed by some sociologists of science (Bloor 1976) on the grounds that the sociological component is just one of the criteria that separate science from pseudoscience. Two additional criteria have been studied by philosophers of science for a long time: the evidential and the structural. The first refers to the connection between a given scientific theory and the empirical evidence that provides epistemic warrant for that theory. The second is concerned with the internal structure and coherence of a scientific theory.

Science, according to Dawes, is a cluster concept grouping a set of related, yet somewhat differentiated, kinds of activities. In this sense, his paper reinforces an increasingly widespread understanding of science in the philosophical community (see also Dupré 1993; Pigliucci 2013). Pseudoscience, then, is also a cluster concept, similarly grouping a number of related, yet varied, activities that attempt to mimic science but do so within the confines of an epistemically inert community.

The question, therefore, becomes, in part, one of distinguishing scientific from pseudoscientific communities, especially when the latter closely mimic the former. Take, for instance, homeopathy. While it is clearly a pseudoscience, the relevant community is made of self-professed “experts” who even publish a “peer-reviewed” journal, Homeopathy, put out by a major academic publisher, Elsevier. Here, Dawes builds on an account of scientific communities advanced by Robert Merton (1973). According to Merton, scientific communities are characterized by four norms, all of which are lacking in pseudoscientific communities: universalism, the notion that class, gender, ethnicity, and so forth are (ideally, at least) treated as irrelevant in the context of scientific discussions; communality, in the sense that the results of scientific inquiry belong (again, ideally) to everyone; disinterestedness, not because individual scientists are unbiased, but because community-level mechanisms counter individual biases; and organized skepticism, whereby no idea is exempt from critical scrutiny.

In the end, Dawes’s suggestion is that “We will have a pro tanto reason to regard a theory as pseudoscientific when it has been either refused admission to, or excluded from, a scientific research tradition that addresses the relevant problems” (2018, 293). Crucially, however, what is or is not recognized as a viable research tradition by the scientific community changes over time, so that the demarcation between science and pseudoscience is itself liable to shift as time passes.

One author who departs significantly from what otherwise seems to be an emerging consensus on demarcation is Angelo Fasce (2019). He rejects the notion that there is any meaningful continuum between science and pseudoscience, or that either concept can fruitfully be understood in terms of family resemblance, going so far as accusing some of his colleagues of “still engag[ing] in time-consuming, unproductive discussions on already discarded demarcation criteria, such as falsifiability” (2019, 155).

Fasce’s criticism hinges, in part, on the notion that gradualist criteria may create problems in policy decision making: just how much does one activity have to be close to the pseudoscientific end of the spectrum in order for, say, a granting agency to raise issues? The answer is that there is no sharp demarcation because there cannot be, regardless of how much we would wish otherwise. In many cases, said granting agency should have no trouble classifying good science (for example, fundamental physics or evolutionary biology) as well as obvious pseudoscience (for example, astrology or homeopathy). But there will be some borderline cases (for instance, parapsychology? SETI?) where one will just have to exercise one’s best judgment based on what is known at the moment and deal with the possibility that one might make a mistake.

Fasce also argues that “Contradictory conceptions and decisions can be consistently and justifiably derived from [a given demarcation criterion]—i.e. mutually contradictory propositions could be legitimately derived from the same criterion because that criterion allows, or is based on, ‘subjective’ assessment” (2019, 159). Again, this is probably true, but it is also likely an inevitable feature of the nature of the problem, not a reflection of the failure of philosophers to adequately tackle it.

Fasce (2019, 62) states that there is no historical case of a pseudoscience turning into a legitimate science, which he takes as evidence that there is no meaningful continuum between the two classes of activities. But this does not take into account the case of pre-Darwinian evolutionary theories mentioned earlier, nor the many instances of the reverse transition, in which an activity initially considered scientific has gradually turned into a pseudoscience. Examples include alchemy (although its relationship with chemistry is historically complicated), astrology, phrenology, and, more recently, cold fusion—with the caveat that whether the latter notion ever reached scientific status is still being debated by historians and philosophers of science. These occurrences would seem to point to the existence of a continuum between the two categories of science and pseudoscience.

One interesting objection raised by Fasce is that philosophers who favor a cluster concept approach do not seem to be bothered by the fact that such a Wittgensteinian take has led some authors, like Richard Rorty, all the way down the path of radical relativism, a position that many philosophers of science reject. Then again, Fasce himself acknowledges that “Perhaps the authors who seek to carry out the demarcation of pseudoscience by means of family resemblance definitions do not follow Wittgenstein in all his philosophical commitments” (2019, 64).

Because of his dissatisfaction with gradualist interpretations of the science-pseudoscience landscape, Fasce (2019, 67) proposes what he calls a “metacriterion” to aid in the demarcation project. This is actually a set of four criteria, two of which he labels “procedural requirements” and two “criterion requirements.” The latter two are mandatory for demarcation, while the first two are not necessary, although they provide conditions of plausibility. The procedural requirements are: (i) that demarcation criteria should entail a minimum number of philosophical commitments; and (ii) that demarcation criteria should explain current consensus about what counts as science or pseudoscience. The criterion requirements are: (iii) that mimicry of science is a necessary condition for something to count as pseudoscience; and (iv) that all items of demarcation criteria be discriminant with respect to science.

Fasce (2018) has used his metacriterion to develop a demarcation criterion according to which pseudoscience: (1) refers to entities and/or processes outside the domain of science; (2) makes use of a deficient methodology; (3) is not supported by evidence; and (4) is presented as scientific knowledge. This turns out to be similar to a previous proposal by Hansson (2009). Fasce and Picó (2019) have also developed a scale of pseudoscientific belief based on the work discussed above.

Another author pushing a multicriterial approach to demarcation is Damian Fernandez‐Beanato (2020b), whom this article already mentioned when discussing Cicero’s early debunking of divination. He provides a useful summary of previous mono-criterial proposals, as well as of two multicriterial ones advanced by Hempel (1951) and Kuhn (1962). The failure of these attempts is what in part led to the above-mentioned rejection of the entire demarcation project by Laudan (1983).

Fernandez‐Beanato suggests improvements on a multicriterial approach originally put forth by Mahner (2007), consisting of a broad list of accepted characteristics or properties of science. The project, however, runs into significant difficulties for a number of reasons. First, like Fasce (2019), Fernandez-Beanato wishes for more precision than is likely possible, in his case aiming at a quantitative “cut value” on a multicriterial scale that would make it possible to distinguish science from non-science or pseudoscience in a way that is compatible with classical logic. It is hard to imagine how such quantitative estimates of “scientificity” may be obtained and operationalized. Second, the approach assumes a unity of science that is at odds with the above-mentioned emerging consensus in philosophy of science that “science” (and, similarly, “pseudoscience”) actually picks out a family of related activities, not a single epistemic practice. Third, Fernandez-Beanato rejects Hansson’s (and other authors’) notion that any demarcation criterion is, by necessity, temporally limited because what constitutes science or pseudoscience changes with our understanding of phenomena. But it seems hard to justify Fernandez-Beanato’s assumption that “Science … is currently, in general, mature enough for properties related to method to be included into a general and timeless definition of science” (2019, 384).

Kåre Letrud (2019), like Fasce (2019), seeks to improve on Hansson’s (2009) approach to demarcation, but from a very different perspective. He points out that Hansson’s original answer to the demarcation problem focuses on pseudoscientific statements, not disciplines. The problem with this, according to Letrud, is that Hansson’s approach does not take into sufficient account the sociological aspect of the science-pseudoscience divide. Moreover, following Hansson—again according to Letrud—one would get trapped into a never-ending debunking of individual (as distinct from systemic) pseudoscientific claims. Here Letrud invokes the “Bullshit Asymmetry Principle,” also known as “Brandolini’s Law” (named after the Italian programmer Alberto Brandolini, to whom it is attributed): “The amount of energy needed to refute BS is an order of magnitude bigger than to produce it.” Going pseudoscientific statement by pseudoscientific statement, then, is a losing proposition.

Letrud notes that Hansson (2009) adopts a broad definition of “science,” along the lines of the German Wissenschaft, which includes the social sciences and the humanities. While Fasce (2019) thinks this is problematically too broad, Letrud (2019) points out that a broader view of science implies a broader view of pseudoscience, which allows Hansson to include in the latter not just standard examples like astrology and homeopathy, but also Holocaust denialism, Bible “codes,” and so forth.

According to Letrud, however, Hansson’s original proposal does not do a good job differentiating between bad science and pseudoscience, which is important because we do not want to equate the two. Letrud suggests that bad science is characterized by discrete episodes of epistemic failure, which can occur even within established sciences. Pseudoscience, by contrast, features systemic epistemic failure. Bad science can even give rise to what Letrud calls “scientific myth propagation,” as in the case of the long-discredited notion that there are such things as learning styles in pedagogy. It can take time, even decades, to correct examples of bad science, but that does not ipso facto make them instances of pseudoscience.

Letrud applies Lakatos’s (1978) distinction of core vs. auxiliary statements for research programs to core vs. auxiliary statements typical of pseudosciences like astrology or homeopathy, thus bridging the gap between Hansson’s focus on individual statements and Letrud’s preferred focus on disciplines. For instance: “One can be an astrologist while believing that Virgos are loud, outgoing people (apparently, they are not). But one cannot hold that the positions of the stars and the character and behavior of people are unrelated” (Letrud 2019, 8). The first statement is auxiliary; the second is core.

To take homeopathy as an example, a skeptic could decide to spend an inordinate amount of time (according to Brandolini’s Law) debunking individual statements made by homeopaths. Or, more efficiently, the skeptic could target the two core principles of the discipline, namely potentization theory (that is, the notion that more diluted solutions are more effective) and the hypothesis that water holds a “memory” of substances once present in it. Letrud’s approach, then, retains the power of Hansson’s, but zeros in on the more foundational weakness of pseudoscience—its core claims—while at the same time satisfactorily separating pseudoscience from regular bad science. The debate, however, is not over, as more recently Hansson (2020) has replied to Letrud emphasizing that pseudosciences are doctrines, and that the reason they are so pernicious is precisely their doctrinal resistance to correction.

5. Pseudoscience as BS

One of the most intriguing papers on demarcation to appear in the course of what this article calls the Renaissance of scholarship on the issue of pseudoscience is entitled “Bullshit, Pseudoscience and Pseudophilosophy,” authored by Victor Moberger (2020). Moberger has found a neat (and somewhat provocative) way to describe the profound similarity between pseudoscience and pseudophilosophy: in a technical philosophical sense, it is all BS.

Moberger takes his inspiration from the famous essay by Harry Frankfurt (2005), On Bullshit. As Frankfurt puts it: “One of the most salient features of our culture is that there is so much bullshit.” (2005, 1) Crucially, Frankfurt goes on to differentiate the BSer from the liar:

It is impossible for someone to lie unless he thinks he knows the truth. … A person who lies is thereby responding to the truth, and he is to that extent respectful of it. When an honest man speaks, he says only what he believes to be true; and for the liar, it is correspondingly indispensable that he consider his statements to be false. For the bullshitter, however, all these bets are off: he is neither on the side of the true nor on the side of the false. His eye is not on the facts at all, as the eyes of the honest man and of the liar are. … He does not care whether the things he says describe reality correctly. (2005, 55-56)

So, while both the honest person and the liar are concerned with the truth—though in opposite manners—the BSer is defined by his lack of concern for it. This lack of concern is of the culpable variety, so that it can be distinguished from other activities that involve not telling the truth, like acting. This means two important things: (i) BS is a normative concept, meaning that it is about how one ought to behave or not to behave; and (ii) the specific type of culpability that can be attributed to the BSer is epistemic culpability. As Moberger puts it, “the bullshitter is assumed to be capable of responding to reasons and argument, but fails to do so” (2020, 598) because he does not care enough.

Moberger does not make the connection in his paper, but since he focuses on BSing as an activity carried out by particular agents, and not as a body of statements that may be true or false, his treatment falls squarely into the realm of virtue epistemology (see below). We can all arrive at the wrong conclusion on a specific subject matter, or unwittingly defend incorrect notions. And indeed, to some extent we may all, more or less, be culpable of some degree of epistemic misconduct, because few if any people are the epistemological equivalent of sages, ideally virtuous individuals. But the BSer is pathologically epistemically culpable. He incurs epistemic vices and he does not care about it, so long as he gets whatever he wants out of the deal, be that to be “right” in a discussion, or to further his favorite a priori ideological position no matter what.

Accordingly, the charge of BSing—in the technical sense—has to be substantiated by serious philosophical analysis. The term cannot simply be thrown out there as an insult or an easy dismissal. For instance, when Kant famously disagreed with Hume on the role of reason (primary for Kant, subordinate to emotions for Hume) he could not just have labelled Hume’s position as BS and moved on, because Hume had articulated cogent arguments in defense of his take on the subject.

On the basis of Frankfurt’s notion of BSing, Moberger carries out a general analysis of pseudoscience and even pseudophilosophy. He uses the term pseudoscience to refer to well-known examples of epistemic malpractice, like astrology, creationism, homeopathy, ufology, and so on. According to Moberger, the term pseudophilosophy, by contrast, picks out two distinct classes of behaviors. The first is what he refers to as “a seemingly profound type of academic discourse that is pursued primarily within the humanities and social sciences” (2020, 600), which he calls obscurantist pseudophilosophy. The second, a “less familiar kind of pseudophilosophy is usually found in popular scientific contexts, where writers, typically with a background in the natural sciences, tend to wander into philosophical territory without realizing it, and again without awareness of relevant distinctions and arguments” (2020, 601). He calls this scientistic (Boudry and Pigliucci 2017) pseudophilosophy.

The bottom line is that pseudoscience is BS with scientific pretensions, while pseudophilosophy is BS with philosophical pretensions. What pseudoscience and pseudophilosophy have in common, then, is BS. While both pseudoscience and pseudophilosophy suffer from a lack of epistemic conscientiousness, this lack manifests itself differently, according to Moberger. In the case of pseudoscience, we tend to see a number of classical logical fallacies and other reasoning errors at play. In the case of pseudophilosophy, instead, we see “equivocation due to conceptual impressionism, whereby plausible but trivial propositions lend apparent credibility to interesting but implausible ones.”

Moberger’s analysis provides a unified explanatory framework for otherwise seemingly disparate phenomena, such as pseudoscience and pseudophilosophy. And it does so in terms of a single, more fundamental, epistemic problem: BSing. He then proceeds by fleshing out the concept—for instance, differentiating pseudoscience from scientific fraud—and by responding to a range of possible objections to his thesis, for example that the demarcation of concepts like pseudoscience, pseudophilosophy, and even BS is vague and imprecise. It is so by nature, Moberger responds, adopting the already encountered Wittgensteinian view that complex concepts are inherently fuzzy.

Importantly, Moberger reiterates a point made by other authors before, and yet very much worth reiterating: any demarcation in terms of content between science and pseudoscience (or philosophy and pseudophilosophy), cannot be timeless. Alchemy was once a science, but it is now a pseudoscience. What is timeless is the activity underlying both pseudoscience and pseudophilosophy: BSing.

There are several consequences of Moberger’s analysis. First, that it is a mistake to focus exclusively, sometimes obsessively, on the specific claims made by proponents of pseudoscience as so many skeptics do. That is because sometimes even pseudoscientific practitioners get things right, and because there simply are too many such claims to be successfully challenged (again, Brandolini’s Law). The focus should instead be on pseudoscientific practitioners’ epistemic malpractice: content vs. activity.

Second, what is bad about pseudoscience and pseudophilosophy is not that they are unscientific, because plenty of human activities are not scientific and yet are not objectionable (literature, for instance). Science is not the ultimate arbiter of what has or does not have value. While this point is hardly controversial, it is worth reiterating, considering that a number of prominent science popularizers have engaged in this mistake.

Third, pseudoscience does not lack empirical content. Astrology, for one, has plenty of it. But that content does not stand up to critical scrutiny. Astrology is a pseudoscience because its practitioners do not seem to be bothered by the fact that their statements about the world do not appear to be true.

One thing that is missing from Moberger’s paper, perhaps, is a warning that even practitioners of legitimate science and philosophy may be guilty of gross epistemic malpractice when they criticize their pseudo counterparts. Too often so-called skeptics reject unusual or unorthodox claims a priori, without critical analysis or investigation, for example in the notorious case of the so-called Campeche UFOs (Pigliucci 2018, 97-98). From a virtue epistemological perspective, it comes down to the character of the agents. We all need to push ourselves to do the right thing, which includes mounting criticisms of others only when we have done our due diligence to actually understand what is going on. Therefore, a small digression into how virtue epistemology is relevant to the demarcation problem now seems to be in order.

6. Virtue Epistemology and Demarcation

Just like there are different ways to approach virtue ethics (for example, Aristotle, the Stoics), so there are different ways to approach virtue epistemology. What these various approaches have in common is the assumption that epistemology is a normative (that is, not merely descriptive) discipline, and that intellectual agents (and their communities) are the sources of epistemic evaluation.

The assumption of normativity very much sets virtue epistemology as a field at odds with W.V.O. Quine’s famous suggestion that epistemology should become a branch of psychology (see Naturalistic Epistemology): that is, a descriptive, not prescriptive discipline. That said, however, virtue epistemologists are sensitive to input from the empirical sciences, first and foremost psychology, as any sensible philosophical position ought to be.

A virtue epistemological approach—just like its counterpart in ethics—shifts the focus away from a “point of view from nowhere” and onto specific individuals (and their communities), who are treated as epistemic agents. In virtue ethics, the actions of a given agent are explained in terms of the moral virtues (or vices) of that agent, like courage or cowardice. Analogously, in virtue epistemology the judgments of a given agent are explained in terms of the epistemic virtues of that agent, such as conscientiousness, or gullibility.

Just like virtue ethics has its roots in ancient Greece and Rome, so too can virtue epistemologists claim a long philosophical pedigree, including but not limited to Plato, Aristotle, the Stoics, Thomas Aquinas, Descartes, Hume, and Bertrand Russell.

But what exactly is a virtue, in this context? Again, the analogy with ethics is illuminating. In virtue ethics, a virtue is a character trait that makes the agent an excellent, meaning ethical, human being. Similarly, in virtue epistemology a virtue is a character trait that makes the agent an excellent cognizer. Here is a partial list of epistemological virtues and vices to keep handy:

Epistemic virtues Epistemic vices
Attentiveness Close-mindedness
Benevolence (that is, principle of charity) Dishonesty
Conscientiousness Dogmatism
Creativity Gullibility
Curiosity Naïveté
Discernment Obtuseness
Honesty Self-deception
Humility Superficiality
Objectivity Wishful thinking
Parsimony
Studiousness
Understanding
Warrant
Wisdom

Linda Zagzebski (1996) has proposed a unified account of epistemic and moral virtues that would cast the entire science-pseudoscience debate in more than just epistemic terms. The idea is to explicitly bring to epistemology the same inverse approach that virtue ethics brings to moral philosophy: analyzing right actions (or right beliefs) in terms of virtuous character, instead of the other way around.

For Zagzebski, intellectual virtues are actually to be thought of as a subset of moral virtues, which would make epistemology a branch of ethics. The notion is certainly intriguing: consider a standard moral virtue, like courage. It is typically understood as being rooted in the agent’s motivation to do good despite the risk of personal danger. Analogously, the virtuous epistemic agent is motivated by wanting to acquire knowledge, in pursuit of which goal she cultivates the appropriate virtues, like open-mindedness.

In the real world, sometimes virtues come in conflict with each other, for instance in cases where the intellectually bold course of action is also not the most humble, thus pitting courage and humility against each other. The virtuous moral or epistemic agent navigates a complex moral or epistemic problem by adopting an all-things-considered approach with as much wisdom as she can muster. Knowledge itself is then recast as a state of belief generated by acts of intellectual virtue.

Reconnecting all of this more explicitly with the issue of science-pseudoscience demarcation, it should now be clearer why Moberger’s focus on BS is essentially based on a virtue ethical framework. The BSer is obviously not acting virtuously from an epistemic perspective, and indeed, if Zagzebski is right, also from a moral perspective. This is particularly obvious in the cases of pseudoscientific claims made by, among others, anti-vaxxers and climate change denialists. It is not just the case that these people are not being epistemically conscientious. They are also acting unethically because their ideological stances are likely to hurt others.

A virtue epistemological approach to the demarcation problem is explicitly adopted in a paper by Sindhuja Bhakthavatsalam and Weimin Sun (2021), who both provide a general outline of how virtue epistemology may be helpful concerning science-pseudoscience demarcation. The authors also explore in detail the specific example of the Chinese practice of Feng Shui, a type of pseudoscience employed in some parts of the world to direct architects to build in ways that maximize positive “qi” energy.

Bhakthavatsalam and Sun argue that discussions of demarcation do not aim solely at separating the usually epistemically reliable products of science from the typically epistemically unreliable ones that come out of pseudoscience. What we want is also to teach people, particularly the general public, to improve their epistemic judgments so that they do not fall prey to pseudoscientific claims. That is precisely where virtue epistemology comes in.

Bhakthavatsalam and Sun build on work by Anthony Derksen (1993) who arrived at what he called an epistemic-social-psychological profile of a pseudoscientist, which in turn led him to a list of epistemic “sins” that pseudoscientists regularly engage in: lack of reliable evidence for their claims; arbitrary “immunization” from empirically based criticism (Boudry and Braeckman 2011); assigning outsized significance to coincidences; adopting magical thinking; contending to have special insight into the truth; tendency to produce all-encompassing theories; and uncritical pretension in the claims put forth.

Conversely, one can arrive at a virtue epistemological understanding of science and other truth-conducive epistemic activities. As Bhakthavatsalam and Sun (2021, 6) remind us: “Virtue epistemologists contend that knowledge is non‐accidentally true belief. Specifically, it consists in belief of truth stemming from epistemic virtues rather than by luck. This idea is captured well by Wayne Riggs (2009): knowledge is an ‘achievement for which the knower deserves credit.’”

Bhakthavatsalam and Sun discuss two distinct yet, in their mind, complementary (especially with regard to demarcation) approaches to virtue ethics: virtue reliabilism and virtue responsibilism. Briefly, virtue reliabilism (Sosa 1980, 2011) considers epistemic virtues to be stable behavioral dispositions, or competences, of epistemic agents. In the case of science, for instance, such virtues might include basic logical thinking skills, the ability to properly collect data, the ability to properly analyze data, and even the practical know-how necessary to use laboratory or field equipment. Clearly, these are precisely the sort of competences that are not found among practitioners of pseudoscience. But why not? This is where the other approach to virtue epistemology, virtue responsibilism, comes into play.

Responsibilism is about identifying and practicing epistemic virtues, as well as identifying and staying away from epistemic vices. The virtues and vices in question are along the lines of those listed in the table above. Of course, we all (including scientists and philosophers) engage in occasionally vicious, or simply sloppy, epistemological practices. But what distinguishes pseudoscientists is that they systematically tend toward the vicious end of the epistemic spectrum, while what characterizes the scientific community is a tendency to hone epistemic virtues, both by way of expressly designed training and by peer pressure internal to the community. Part of the advantage of thinking in terms of epistemic vices and virtues is that one then puts the responsibility squarely on the shoulders of the epistemic agent, who becomes praiseworthy or blameworthy, as the case may be.

Moreover, a virtue epistemological approach immediately provides at least a first-level explanation for why the scientific community is conducive to the truth while the pseudoscientific one is not. In the latter case, comments Cassam:

The fact that this is how [the pseudoscientist] goes about his business is a reflection of his intellectual character. He ignores critical evidence because he is grossly negligent, he relies on untrustworthy sources because he is gullible, he jumps to conclusions because he is lazy and careless. He is neither a responsible nor an effective inquirer, and it is the influence of his intellectual character traits which is responsible for this. (2016, 165)

In the end, Bhakthavatsalam and Sun arrive, by way of their virtue epistemological approach, to the same conclusion that we have seen other authors reach: both science and pseudoscience are Wittgensteinian-type cluster concepts. But virtue epistemology provides more than just a different point of view on demarcation. First, it identifies specific behavioral tendencies (virtues and vices) the cultivation (or elimination) of which yield epistemically reliable outcomes. Second, it shifts the responsibility to the agents as well as to the communal practices within which such agents operate. Third, it makes it possible to understand cases of bad science as being the result of scientists who have not sufficiently cultivated or sufficiently regarded their virtues, which in turn explains why we find the occasional legitimate scientist who endorses pseudoscientific notions.

How do we put all this into practice, involving philosophers and scientists in the sort of educational efforts that may help curb the problem of pseudoscience? Bhakthavatsalam and Sun articulate a call for action at both the personal and the systemic levels. At the personal level, we can virtuously engage with both purveyors of pseudoscience and, likely more effectively, with quasi-neutral bystanders who may be attracted to, but have not yet bought into, pseudoscientific notions. At the systemic level, we need to create the sort of educational and social environment that is conducive to the cultivation of epistemic virtues and the eradication of epistemic vices.

Bhakthavatsalam and Sun are aware of the perils of engaging defenders of pseudoscience directly, especially from the point of view of virtue epistemology. It is far too tempting to label them as “vicious,” lacking in critical thinking, gullible, and so forth and be done with it. But basic psychology tells us that this sort of direct character attack is not only unlikely to work, but near guaranteed to backfire. Bhakthavatsalam and Sun claim that we can “charge without blame” since our goal is “amelioration rather than blame” (2021, 15). But it is difficult to imagine how someone could be charged with the epistemic vice of dogmatism and not take that personally.

Far more promising are two different avenues: the systemic one, briefly discussed by Bhakthavatsalam and Sun, and the personal not in the sense of blaming others, but rather in the sense of modeling virtuous behavior ourselves.

In terms of systemic approaches, Bhakthavatsalam and Sun are correct that we need to reform both social and educational structures so that we reduce the chances of generating epistemically vicious agents and maximize the chances of producing epistemically virtuous ones. School reforms certainly come to mind, but also regulation of epistemically toxic environments like social media.

As for modeling good behavior, we can take a hint from the ancient Stoics, who focused not on blaming others, but on ethical self-improvement:

If a man is mistaken, instruct him kindly and show him his error. But if you are not able, blame yourself, or not even yourself. (Marcus Aurelius, Meditations, X.4)

A good starting point may be offered by the following checklist, which—in agreement with the notion that good epistemology begins with ourselves—is aimed at our own potential vices. The next time you engage someone, in person or especially on social media, ask yourself the following questions:

  • Did I carefully consider the other person’s arguments without dismissing them out of hand?
  • Did I interpret what they said in a charitable way before mounting a response?
  • Did I seriously entertain the possibility that I may be wrong? Or am I too blinded by my own preconceptions?
  • Am I an expert on this matter? If not, did I consult experts, or did I just conjure my own unfounded opinion?
  • Did I check the reliability of my sources, or just google whatever was convenient to throw at my interlocutor?
  • After having done my research, do I actually know what I’m talking about, or am I simply repeating someone else’s opinion?

After all, as Aristotle said: “Piety requires us to honor truth above our friends” (Nicomachean Ethics, book I), though some scholars have suggested that this was a rather unvirtuous comment aimed at his former mentor, Plato.

7. The Scientific Skepticism Movement

One of the interesting characteristics of the debate about science-pseudoscience demarcation is that it is an obvious example where philosophy of science and epistemology become directly useful in terms of public welfare. This, in other words, is not just an exercise in armchair philosophizing; it has the potential to affect lives and make society better. This is why we need to take a brief look at what is sometimes referred to as the skeptic movement—people and organizations who have devoted time and energy to debunking and fighting pseudoscience. Such efforts could benefit from a more sophisticated philosophical grounding, and in turn philosophers interested in demarcation would find their work to be immediately practically useful if they participated in organized skepticism.

That said, it was in fact a philosopher, Paul Kurtz, who played a major role in the development of the skeptical movement in the United States. Kurtz, together with Marcello Truzzi, founded the Committee for the Scientific Investigation of Claims of the Paranormal (CSICOP), in Amherst, New York in 1976. The organization changed its name to the Committee for Skeptical Inquiry (CSI) in November 2006 and has long been publishing the premier world magazine on scientific skepticism, Skeptical Inquirer. These groups, however, were preceded by a long history of skeptic organizations outside the US. The oldest skeptic organization on record is the Dutch Vereniging tegen de Kwakzalverij (VtdK), established in 1881. This was followed by the Belgian Comité Para in 1949, started in response to a large predatory industry of psychics exploiting the grief of people who had lost relatives during World War II.

In the United States, Michael Shermer, founder and editor of Skeptic Magazine, traced the origin of anti-pseudoscience skepticism to the publication of Martin Gardner’s Fads and Fallacies in the Name of Science in 1952. The French Association for Scientific Information (AFIS) was founded in 1968, and a series of groups got started worldwide between 1980 and 1990, including Australian Skeptics, Stichting Skepsis in the Netherlands, and CICAP in Italy. In 1996, the magician James Randi founded the James Randi Educational Foundation, which established a one-million-dollar prize to be given to anyone who could reproduce a paranormal phenomenon under controlled conditions. The prize was never claimed.

After the fall of the Berlin Wall, a series of groups began operating in Russia and its former satellites in response to yet another wave of pseudoscientific claims. This led to skeptic organizations in the Czech Republic, Hungary, and Poland, among others. The European Skeptic Congress was founded in 1989, and a number of World Skeptic Congresses have been held in the United States, Australia, and Europe.

Kurtz (1992) characterized scientific skepticism in the following manner: “Briefly stated, a skeptic is one who is willing to question any claim to truth, asking for clarity in definition, consistency in logic, and adequacy of evidence.” This differentiates scientific skepticism from ancient Pyrrhonian Skepticism, which famously made no claim to any opinion at all, while making it the intellectual descendant of the Skepticism of the New Academy, as embodied especially by Carneades and Cicero (Machuca and Reed 2018).

One of the most famous slogans of scientific skepticism, “Extraordinary claims require extraordinary evidence,” was first introduced by Truzzi. It can easily be seen as a modernized version of David Hume’s dictum that a wise person proportions his beliefs to the evidence (1748, Section X: “Of Miracles,” Part I), and it has been interpreted as an example of Bayesian thinking (McGrayne 2011).
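The Bayesian reading of Truzzi’s slogan can be made concrete with a toy calculation (an illustration added here for clarity, with arbitrary numbers, not part of the original discussion): a hypothesis with a very low prior probability needs evidence that would be very unlikely if the hypothesis were false before its posterior probability becomes appreciable.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    num = p_e_given_h * prior
    return num / (num + p_e_given_not_h * (1.0 - prior))

# An "extraordinary claim": prior of one in a million.
# Ordinary evidence (misleading 1 time in 100) barely moves the posterior:
ordinary = posterior(1e-6, 0.99, 0.01)        # roughly 1e-4
# "Extraordinary" evidence (misleading 1 time in 10**8) moves it decisively:
extraordinary = posterior(1e-6, 0.99, 1e-8)   # roughly 0.99
```

On this reading, Hume’s wise person who proportions belief to the evidence is simply conditionalizing: what matters is not the evidence in isolation, but how unlikely it would be were the claim false, weighed against the claim’s prior implausibility.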

According to another major, early exponent of scientific skepticism, astronomer Carl Sagan: “The question is not whether we like the conclusion that emerges out of a train of reasoning, but whether the conclusion follows from the premises or starting point and whether that premise is true” (1995).

Modern scientific skeptics take full advantage of the new electronic tools of communication. Two examples in particular are the Skeptics’ Guide to the Universe podcast published by Steve Novella and collaborators, which regularly reaches a large audience and features interviews with scientists, philosophers, and skeptic activists; and the “Guerrilla Skepticism” initiative coordinated by Susan Gerbic, which is devoted to the systematic improvement of skeptic-related content on Wikipedia.

Despite having deep philosophical roots, and despite the fact that some of its major exponents have been philosophers, scientific skepticism has an unfortunate tendency to find itself far more comfortable with science than with philosophy. Indeed, some major skeptics, such as author Sam Harris and scientific popularizers Richard Dawkins and Neil deGrasse Tyson, have been openly contemptuous of philosophy, thus giving the movement a bit of a scientistic bent. This is somewhat balanced by the interest in scientific skepticism of a number of philosophers (for instance, Maarten Boudry, Lee McIntyre) as well as by scientists who recognize the relevance of philosophy (for instance, Carl Sagan, Steve Novella).

Given the intertwining of not just scientific skepticism and philosophy of science, but also of social and natural science, the theoretical and practical study of the science-pseudoscience demarcation problem should be regarded as an extremely fruitful area of interdisciplinary endeavor—an endeavor in which philosophers can make significant contributions that go well beyond relatively narrow academic interests and actually have an impact on people’s quality of life and understanding of the world.

8. References and Further Readings

  • Armando, D. and Belhoste, B. (2018) Mesmerism Between the End of the Old Regime and the Revolution: Social Dynamics and Political Issues. Annales historiques de la Révolution française 391(1):3-26
  • Baum, R. and Sheehan, W. (1997) In Search of Planet Vulcan: The Ghost in Newton’s Clockwork Universe. Plenum.
  • Bhakthavatsalam, S. and Sun, W. (2021) A Virtue Epistemological Approach to the Demarcation Problem: Implications for Teaching About Feng Shui in Science Education. Science & Education 30:1421-1452. https://doi.org/10.1007/s11191-021-00256-5.
  • Bloor, D. (1976) Knowledge and Social Imagery. Routledge & Kegan Paul.
  • Bonk, T. (2008) Underdetermination: An Essay on Evidence and the Limits of Natural Knowledge. Springer.
  • Boudry, M. and Braeckman, J. (2011) Immunizing Strategies and Epistemic Defense Mechanisms. Philosophia 39(1):145-161.
  • Boudry, M. and Pigliucci, M. (2017) Science Unlimited? The Challenges of Scientism. University of Chicago Press.
  • Brulle, R.J. (2020) Denialism: Organized Opposition to Climate Change Action in the United States, in: D.M. Konisky (ed.) Handbook of U.S. Environmental Policy, Edward Elgar, chapter 24.
  • Carlson, S. (1985) A Double-Blind Test of Astrology. Nature 318:419-25.
  • Cassam, Q. (2016) Vice Epistemology. The Monist 99(2):159-180.
  • Cicero (2014) On Divination, in: Cicero—Complete Works, translated by W.A. Falconer, Delphi.
  • Curd, M. and Cover, J.A. (eds.) (2012) The Duhem-Quine Thesis and Underdetermination, in: Philosophy of Science: The Central Issues. Norton, pp. 225-333.
  • Dawes, G.W. (2018) Identifying Pseudoscience: A Social Process Criterion. Journal for General Philosophy of Science 49:283-298.
  • Derksen, A.A. (1993) The Seven Sins of Demarcation. Journal for General Philosophy of Science 24:17-42.
  • Dupré, J. (1993) The Disorder of Things: Metaphysical Foundations of the Disunity of Science. Harvard University Press.
  • Fasce, A. (2018) What Do We Mean When We Speak of Pseudoscience? The Development of a Demarcation Criterion Based on the Analysis of Twenty-One Previous Attempts. Disputatio 6(7):459-488.
  • Fasce, A. (2019) Are Pseudosciences Like Seagulls? A Discriminant Metacriterion Facilitates the Solution of the Demarcation Problem. International Studies in the Philosophy of Science 32(3-4):155-175.
  • Fasce, A. and Picó, A. (2019) Conceptual Foundations and Validation of the Pseudoscientific Belief Scale. Applied Cognitive Psychology 33(4):617-628.
  • Feldman, R. (1981) Fallibilism and Knowing that One Knows, The Philosophical Review 90:266-282.
  • Fernandez-Beanato, D. (2020a) Cicero’s Demarcation of Science: A Report of Shared Criteria. Studies in History and Philosophy of Science Part A 83:97-102.
  • Fernandez-Beanato, D. (2020b) The Multicriterial Approach to the Problem of Demarcation. Journal for General Philosophy of Science 51:375-390.
  • Feyerabend, P. (1975) Against Method: Outline of an Anarchistic Theory of Knowledge. New Left Books.
  • Frankfurt, H. (2005) On Bullshit. Princeton University Press.
  • Gardner, M. (1952) Fads and Fallacies in the Name of Science. Dover.
  • Gauch, H.G. (2012) Scientific Method in Brief. Cambridge University Press.
  • Gould, S.J. (1989) The Chain of Reason vs. The Chain of Thumbs, Natural History, 89(7):16.
  • Grosser, M. (1962) The Discovery of Neptune. Harvard University Press.
  • Hansson, S.O. (2009) Cutting the Gordian Knot of Demarcation. International Studies in the Philosophy of Science 23(3):237-243.
  • Hansson, S.O. (2013) Defining Pseudoscience—and Science, in: M. Pigliucci and M. Boudry (eds.), The Philosophy of Pseudoscience. University of Chicago Press, pp. 61-77.
  • Hansson, S.O. (2017) Science Denial as a Form of Pseudoscience. Studies in History and Philosophy of Science 63:39-47.
  • Hansson, S.O. (2020) Disciplines, Doctrines, and Deviant Science. International Studies in the Philosophy of Science 33(1):43-52.
  • Hausman, A., Boardman, F., and Kahane, H. (2021) Logic and Philosophy: A Modern Introduction. Hackett.
  • Hempel, C.G. (1951) The Concept of Cognitive Significance: A Reconsideration. Proceedings of the American Academy of Arts and Sciences 80:61–77.
  • Hossenfelder, S. (2018) Lost in Math: How Beauty Leads Physics Astray. Basic Books.
  • Hume, D. (1748) An Enquiry Concerning Human Understanding, online at https://davidhume.org/texts/e/.
  • Jeffers, S. (2007) PEAR Lab Closes, Ending Decades of Psychic Research. Skeptical Inquirer 31(3), online at https://skepticalinquirer.org/2007/05/pear-lab-closes-ending-decades-of-psychic-research/.
  • Kaplan, J.M. (2006) More Misuses of Evolutionary Psychology. Metascience 15(1):177-181.
  • Kennefick, D. (2019) No Shadow of a Doubt: The 1919 Eclipse That Confirmed Einstein’s Theory of Relativity. Princeton University Press.
  • Kuhn, T. (1962) The Structure of Scientific Revolutions. University of Chicago Press.
  • Kurtz, P. (1992) The New Skepticism. Prometheus.
  • LaFollette, M. (1983) Creationism, Science and the Law. MIT Press.
  • Lakatos, I. (1978) The Methodology of Scientific Research Programmes. Cambridge University Press.
  • Laudan, L. (1983) The Demise of the Demarcation Problem, in: R.S. Cohen and L. Laudan (eds.), Physics, Philosophy and Psychoanalysis. D. Reidel, pp. 111–127.
  • Laudan, L. (1988) Science at the Bar—Causes for Concern. In M. Ruse (ed.), But Is It Science? Prometheus.
  • Letrud, K. (2019) The Gordian Knot of Demarcation: Tying Up Some Loose Ends. International Studies in the Philosophy of Science 32(1):3-11.
  • Machuca, D.E. and Reed, B. (2018) Skepticism: From Antiquity to the Present. Bloomsbury Academic.
  • Mahner, M. (2007) Demarcating Science from Non-Science, in: T. Kuipers (ed.), Handbook of the Philosophy of Science: General Philosophy of Science—Focal Issues. Elsevier, pp. 515-575.
  • McGrayne, S.B. (2011) The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. Yale University Press.
  • Merton, R.K. (1973) The Normative Structure of Science, in: N.W. Storer (ed.), The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press, pp. 267-278.
  • Moberger, V. (2020) Bullshit, Pseudoscience and Pseudophilosophy. Theoria 86(5):595-611.
  • Navin, M. (2013) Competing Epistemic Spaces. How Social Epistemology Helps Explain and Evaluate Vaccine Denialism. Social Theory and Practice 39(2):241-264.
  • Pigliucci, M. (2013) The Demarcation Problem: A (Belated) Response to Laudan, in: M. Pigliucci and M. Boudry (eds.), The Philosophy of Pseudoscience. University of Chicago Press, pp. 9-28.
  • Pigliucci, M. (2017) Philosophy as the Evocation of Conceptual Landscapes, in: R. Blackford and D. Broderick (eds.), Philosophy’s Future: The Problem of Philosophical Progress. John Wiley & Sons, pp. 75-90.
  • Pigliucci, M. (2018) Nonsense on Stilts, 2nd edition. University of Chicago Press, pp. 97-98.
  • Pigliucci, M. and Boudry, M. (eds.) (2013) The Philosophy of Pseudoscience: Reconsidering the Demarcation Problem. University of Chicago Press.
  • Plato (1986) Charmides. Translated by T.G. West and G.S. West, Hackett Classics.
  • Popper, K. (1959) The Logic of Scientific Discovery. Hutchinson.
  • Riggs, W. (2009) Two Problems of Easy Credit. Synthese 169(1):201-216.
  • Sagan, C. (1995) The Demon-Haunted World. Ballantine.
  • Salas, D. and Salas, D. (translators) (1996) The First Scientific Investigation of the Paranormal Ever Conducted, Commissioned by King Louis XVI. Designed, conducted, & written by Benjamin Franklin, Antoine Lavoisier, & Others. Skeptic (Fall), pp. 68-83.
  • Shea, B. (no date) Karl Popper: Philosophy of Science. Internet Encyclopedia of Philosophy. https://iep.utm.edu/pop-sci/
  • Smith, T.C. and Novella, S.P. (2007) HIV Denial in the Internet Era. PLOS Medicine, https://doi.org/10.1371/journal.pmed.0040256.
  • Sosa, E. (1980) The Raft and the Pyramid: Coherence versus Foundations in the Theory of Knowledge. Midwest Studies in Philosophy 5(1):3-26.
  • Sosa, E. (2011) Knowing Full Well. Princeton University Press.
  • Wittgenstein, L. (1958) Philosophical Investigations. Blackwell.
  • Zagzebski, L.T. (1996) Virtues of the Mind: An Inquiry into the Nature of Virtue and the Ethical Foundations of Knowledge. Cambridge University Press.

 

Author Information

Massimo Pigliucci
Email: mpigliucci@ccny.cuny.edu
The City College of New York
U. S. A.

Substance

The term “substance” has two main uses in philosophy. Both originate in what is arguably the most influential work of philosophy ever written, Aristotle’s Categories. In its first sense, “substance” refers to those things that are object-like, rather than property-like. For example, an elephant is a substance in this sense, whereas the height or colour of the elephant is not. In its second sense, “substance” refers to the fundamental building blocks of reality. An elephant might count as a substance in this sense. However, this depends on whether we accept the kind of metaphysical theory that treats biological organisms as fundamental. Alternatively, we might judge that the properties of the elephant, or the physical particles that compose it, or entities of some other kind better qualify as substances in this second sense. Since the seventeenth century, a third use of “substance” has gained currency. According to this third use, a substance is something that underlies the properties of an ordinary object and that must be combined with these properties for the object to exist. To avoid confusion, philosophers often substitute the word “substratum” for “substance” when it is used in this third sense. The elephant’s substratum is what remains when you set aside its shape, size, colour, and all its other properties. These philosophical uses of “substance” differ from the everyday use of “substance” as a synonym for “stuff” or “material”. This is not a case of philosophers putting an ordinary word to eccentric use. Rather, “substance” entered modern languages as a philosophical term, and it is the everyday use that has drifted from the philosophical uses.

Table of Contents

  1. Substance in Classical Greek Philosophy
    1. Substance in Aristotle
    2. Substance in Hellenistic and Roman Philosophy
  2. Substance in Classical Indian Philosophy
    1. Nyaya-Vaisheshika and Jain Substances
    2. Upanishadic Substrata
    3. Buddhist Objections to Substance
  3. Substance in Medieval Arabic and Islamic Philosophy
    1. Al-Farabi
    2. Avicebron (Solomon ibn Gabirol)
  4. Substance in Medieval Scholastic Philosophy
    1. Thomas Aquinas
    2. Duns Scotus
  5. Substance in Early Modern Philosophy
    1. Descartes
    2. Spinoza
    3. Leibniz
    4. British Empiricism
  6. Substance in Twentieth-Century and Early-Twenty-First-Century Philosophy
    1. Criteria for Being a Substance
    2. The Structure of Substances
    3. Substance and the Mind-Body Problem
  7. References and Further Reading

1. Substance in Classical Greek Philosophy

The idea of substance enters philosophy at the start of Aristotle’s collected works, in the Categories 1a. It is further developed by Aristotle in other works, especially the Physics and the Metaphysics. Aristotle’s concept of substance was quickly taken up by other philosophers in the Aristotelian and Platonic schools. By late antiquity, the Categories, along with an introduction by Porphyry, was the first text standardly taught to philosophy students throughout the Roman world, a tradition that persisted in one form or another for more than a thousand years. As a result, Aristotle’s concept of substance can be found in works by philosophers across a tremendous range of times and places. Uptake of Aristotle’s concept of substance in Hellenistic and Roman philosophy was typically uncritical, however, and it is necessary to look to other traditions for influential challenges to and/or revisions of the Aristotelian concept.

a. Substance in Aristotle

The Categories centres on two ways of dividing up the kinds of things that exist (or, on some interpretations, the kinds of words or concepts for things that exist). Aristotle starts with a simple four-fold division. He then introduces a more complicated ten-fold division. Both give pride of place to the category of substances.

Aristotle draws the four-fold division in terms of two relations: that of existing in a subject in the way that the colour grey is in an elephant, and that of being said of a subject in the way that “animal” or “four-footed” is said of an elephant. Commentators often refer to these relations as inherence and predication, respectively.

Some things, Aristotle says, exist in a subject, and some are said of a subject. Some both exist in and are said of a subject. But members of a fourth group, substances, neither exist in nor are said of a subject:

A substance—that which is called a substance most strictly, primarily, and most of all—is that which is neither said of a subject nor in a subject, e.g. the individual man or the individual horse. (Categories, 2a11)

In other words, substances are those things that are neither inherent in, nor predicated of, anything else. A problem for understanding what this means is that Aristotle does not define the said of (predication) and in (inherence) relations. Aristotle (Categories, 2b5–6) does make it clear, however, that whatever is said of or in a subject, in the sense he has in mind, depends for its existence on that subject. The colour grey and the genus animal, for example, can exist only as the colour or genus of some subject—such as an elephant. Substances, according to Aristotle, do not depend on other things for their existence in this way: the elephant need not belong to some further thing in order to exist in the way that the colour grey and the genus animal (arguably) must. In this respect, Aristotle’s distinction between substances and non-substances approximates the everyday distinction between objects and properties.

Scholars tend to agree that Aristotle treats the things that are said of a subject as universals and other things as particulars. If so, Aristotle’s substances are particulars: unlike the genus animal, an individual elephant cannot have multiple instances. Scholars also tend to agree that Aristotle treats the things that exist in a subject as accidental and the other things as non-accidental. If so, substances are non-accidental. However, the term “accidental” usually signifies the relationship between a property and its bearer. For example, the colour grey is an accident of the elephant because it is not part of its essence, whereas the genus animal is not an accident of the elephant but is part of its essence. The claim that an object-like thing, such as a man, a horse, or an elephant, is non-accidental therefore seems trivially true.

Unlike the four-fold division, Aristotle’s ten-fold division does not arise out of the systematic combination of two or more characteristics such as being said of or existing in a subject. It is presented simply as a list consisting of substance, quantity, qualification, relative, where, when, being-in-a-position, having, doing, and being-affected. Scholars have long debated whether Aristotle had a system for arriving at this list of categories or whether he “merely picked them up as they occurred to him”, as Kant suggests (Critique of Pure Reason, Pt.2, Div.1, I.1, §3, 10).

Despite our ignorance about how he arrived at it, Aristotle’s ten-fold division helps clarify his concept of substance by providing a range of contrast cases: substances are not quantities, qualifications, relatives and so on, all of which depend on substances for their existence.

Having introduced the ten-fold division, Aristotle also highlights some characteristics that make substances stand out (Categories, 3b–8b): a substance is individual and numerically one, has no contrary (nothing stands to an elephant as knowledge stands to ignorance or justice to injustice), does not admit of more or less (no substance is more or less a substance than another substance, no elephant is more or less an elephant than another elephant), is not said in relation to anything else (one can know what an elephant is without knowing anything else to which it stands in some relation), and is able to receive contraries (an elephant can be hot at one time, cold at another). Aristotle emphasises that whereas substances share some of these characteristics with some non-substances, the ability to receive contraries while being numerically one is unique to substances (Categories, 4a10–13).

The core idea of a substance in the Categories applies to those object-like particulars that, uniquely, do not depend for their existence on some subject in which they must exist or of which they must be said, and that are capable of receiving contraries when they undergo change. That, at any rate, is how the Categories characterises those things that are “most strictly, primarily, and most of all” called “substances”. One complication must be noted. Aristotle adds that:

The species in which the things primarily called substances are, are called secondary substances, as also are the genera of these species. For example, the individual man belongs in a species, man, and animal is a genus of the species; so these—both man and animal—are called secondary substances. (Categories, 2a13)

Strictly, then, the Categories characterises two kinds of substances: primary substances, which have the characteristics we have looked at, and secondary substances, which are the species and genera to which primary substances belong. However, Aristotle’s decision to call the species and genera to which primary substances belong “secondary substances” is not typically adopted by later thinkers. When people talk about substances in philosophy, they almost always have in mind a sense of the term derived from Aristotle’s discussion of primary substances. Except where otherwise specified, the same is true of this article.

In singling out object-like particulars such as elephants as those things that are “most strictly, primarily and most of all” called “substance”, Aristotle implies that the term “substance” is no mere label, but that it signifies a special status. A clue as to what Aristotle has in mind here can be found in his choice of terminology. The Greek term translated “substance” is ousia, an abstract noun derived from the participle ousa of the Greek verb eimi, meaning—and cognate with—I am. Unlike the English “substance”, ousia carries no connotation of standing under or holding up. Rather, ousia suggests something close to what we mean by the word “being” when we use it as a noun. Presumably, therefore, Aristotle regards substances as those things that are most strictly and primarily counted as beings, as things that exist.

Aristotle sometimes refers to substances as hypokeimena, a term that does carry the connotation of standing under (or rather, lying under), and that is often translated with the term “subject”. Early translators of Aristotle into Latin frequently used a Latin rendering of hypokeimenon—namely, substantia—to translate both terms. This is how we have ended up with the English term “substance”. It is possible that this has contributed to some of the confusions that have emerged in later discussions, which have placed too much weight on the connotations of the English term (see section 5.c).

Aristotle also discusses the concept of substance in a number of other works. If these have not had the same degree of influence as the Categories, their impact has nonetheless been considerable, especially on scholastic Aristotelianism. Moreover, these works add much to what Aristotle says about substance in the Categories, in some places even seeming to contradict it.

The most important development of Aristotle’s concept of substance outside the Categories is his analysis of material substances into matter (hyle) and form (morphe)—an analysis that has come to be known as hylomorphism (though only since the late nineteenth century). This analysis is developed in the Physics, a text dedicated to things that undergo change, and which therefore, unsurprisingly, also has to do with substances. Given the distinctions drawn in the Categories, one might expect Aristotle’s account of change to simply say that change occurs when a substance gains or loses one of the things that is said of or that exists in it—before its bath, the elephant is hot and grey, but afterwards, it is cool and mud-coloured. However, Aristotle also has the task of accounting for substantial change, that is, the coming to be or ceasing to exist of a substance. An old tradition in Greek philosophy, beginning with Parmenides, suggests that substantial change should be impossible, since it involves something coming from nothing or vanishing into nothing. In the Physics, Aristotle addresses this issue by analysing material substances into the matter they are made of and the form that organises that matter. This allows him to explain substantial change. For example, when a vase comes into existence, the pre-existing clay acquires the form of a vase, and when it is destroyed, the clay loses the form of a vase. Neither process involves something coming from or vanishing into nothing. Likewise, when an elephant comes into existence, pre-existing matter acquires the form of an elephant. When an elephant ceases to exist, the matter loses the form of an elephant, becoming (mere) flesh and bones.

Aristotle returns to the topic of substance at length in the Metaphysics. Here, much to the confusion of readers, Aristotle raises the question of what is most properly called a “substance” afresh and considers three options: the matter of which something is made, the form that organises that matter, or the compound of matter and form. Contrary to what was said in the Categories and the Physics, Aristotle seems to say that the term “substance” applies most properly not to a compound of matter and form such as an elephant or a vase, but to the form that makes that compound the kind of thing it is. (The form that makes a hylomorphic compound the kind of thing it is, such as the form of an elephant or the form of a vase, is referred to as a substantial form, to distinguish it from accidental forms such as size or colour). Scholars do not agree on how to reconcile this position with that of Aristotle’s other works. In any case, it should be noted that it is Aristotle’s identification of substances with object-like particulars such as elephants and vases that has guided most later discussions of substance.

One explanation for Aristotle’s claim in the Metaphysics that it is the substantial form that most merits the title of “substance” concerns material change. In the Categories, Aristotle emphasises that substances are distinguished by their ability to survive through change. Living things, such as elephants, however, do not just change with respect to accidental forms such as temperature and colour. They also change with respect to the matter they are made of. As a result, it seems that if the elephant remains the same elephant over time, this must be in virtue of its having the same substantial form.

In the Metaphysics, Aristotle rejects the thesis that the term “substance” applies to matter. In discussing this thesis, he anticipates a usage that becomes popular from the seventeenth century onwards. On this usage, “substance” does not refer to object-like particulars such as elephants or vases; rather, it refers to an underlying thing that must be combined with properties to yield an object-like particular. This underlying thing is typically conceived as having no properties in itself, but as standing under or supporting the properties with which it must be combined. The application of the term “substance” to this underlying thing is confusing, and the common practice of favouring the word “substratum” in this context is followed here. The idea of a substratum that must be combined with properties to yield a substance in the ordinary sense is close to Aristotle’s idea of matter that must be combined with form. It is closer still to the concept of prime matter, which is traditionally (albeit controversially) attributed to Aristotle and which, unlike flesh or clay, is conceived as having no properties in its own right, except perhaps spatial extension. Though the concept of a substratum is not the same as the concept of substance in its original sense, it also plays an extremely important role in the history of philosophy, and one that has antecedents earlier than Aristotle in the Presocratics and in classical Indian philosophy, a topic discussed in section 2.b.

b. Substance in Hellenistic and Roman Philosophy

As noted in the previous section, in the Categories, Aristotle distinguishes two kinds of non-substance: those that exist in a subject and those that are said of a subject. He goes on to divide these further, into the ten categories from which the work takes its name: quantity, qualification, relative, where, when, being-in-a-position, having, doing, being-affected, and secondary substance (which we can count as non-substances for the reasons explained in section 1.a).

Although an enormous number of subsequent thinkers adopt the basic distinction between substances and non-substances, many omit the distinction between predication and inherence. That is, between non-substances that are said of a subject and non-substances that exist in a subject. Moreover, many compact the list of non-substances. For example, the late Neoplatonist Simplicius (480–560 C.E.) records that the second head of the Academy after Plato, Xenocrates (395/96–313/14 B.C.E.), as well as the eleventh head of the Peripatetic school, Andronicus of Rhodes (ca.60 B.C.E.), reduced Aristotle’s ten categories to two: things that exist in themselves, meaning substances, and things that exist in relation to something else, meaning non-substances.

In adopting the language of things that exist in themselves and those that exist in relation to something else, philosophers such as Xenocrates and Andronicus of Rhodes appear to have been recasting Aristotle’s distinction between substances and non-substances in a terminology that approximates that of Plato’s Sophist (255c). It can therefore be argued that the distinction between substances and non-substances that later thinkers inherit from Aristotle also has a line of descent from Plato, even if Plato devotes much less attention to the distinction.

The definition of substances as things that exist in themselves (kath’ auta or per se) is commonplace in the history of philosophy after Aristotle. The expression is, however, regrettably imprecise, both in the original Greek and in the various translations that have followed. For it is not clear what the preposition “in” is supposed to signify here. Clearly, it does not signify containment, as when water exists in a vase or a brick in a wall. It is plausible that the widespread currency of this vague phrase is responsible for the failure of the most influential philosophers from antiquity onwards to state explicit necessary and sufficient conditions for substancehood.

The simplification of the category of non-substances and the introduction of the Platonic in itself terminology are the main philosophical innovations respecting the concept of substance in Hellenistic and Roman philosophy. The concept would also be given a historic theological application when the Nicene Creed (ca.325 C.E.) defined the Father and Son of the Holy Trinity as consubstantial (homoousion) or of one substance. As a result, the philosophical concept of substance would play a central role in the Arian controversy that shaped early Christian theology.

Although Hellenistic and Roman discussions of substance tend to be uncritical, an exception can be found in the Pyrrhonist tradition. Sextus Empiricus records a Pyrrhonist argument against the distinction between substance and non-substance, which says, in effect, that:

  1. If things that exist in themselves do not differ from things that exist in relation to something else, then they too exist in relation to something else.
  2. If things that exist in themselves do differ from things that exist in relation to something else, then they too exist in relation to something else (for to differ from something is to stand in relation to it).
  3. Therefore, the idea of something that exists in itself is incoherent (see McEvilley 2002, 469).

While arguing against the existence of substances is not a central preoccupation of Pyrrhonist philosophy, it is a central concern of the remarkably similar Buddhist Madhyamaka tradition, and there is a possibility of influence in one direction or the other.

2. Substance in Classical Indian Philosophy

The concept of substance in Western philosophy derives from Aristotle via the ancient and medieval philosophical traditions of Europe, the Middle East and North Africa. Either the same or a similar concept is central to the Indian Vaisheshika and Jain schools, to the Nyaya school with which Vaisheshika merged and, as an object of criticism, to various Buddhist schools. This appears to have been the first time that the concept of substance was subjected to sustained philosophical criticism, anticipating and possibly influencing the well-known criticisms of the idea of substance advanced by early modern Western thinkers.

a. Nyaya-Vaisheshika and Jain Substances

There exist six orthodox schools of Indian philosophy (those that acknowledge the authority of the Vedas—the principal Hindu scriptures) and four major unorthodox schools. The orthodox schools include Vaisheshika and Nyaya, which appear to have begun as separate traditions, but which merged some time before the eleventh century. The founding text of the Vaisheshika school, the Vaisheshikasutra, is attributed to a philosopher named Kaṇāda and was composed sometime between the fifth and the second century B.C.E. Like Aristotle’s Categories, the focus of the Vaisheshikasutra is on how we should divide up the kinds of things that exist. The Vaisheshikasutra presents a three-fold division into substance (dravya), quality (guna), and motion (karman). The substances are divided, in turn, into nine kinds. These are the five elements—earth, water, fire, air, and aether—with the addition of time, space, soul, and mind.

The early Vaisheshika commentators, Praśastapāda (ca.6th century) and Candrānanda (ca.8th century), expand the Vaisheshikasutra’s three-category division into what has become a canonical list of six categories. The additional categories are universal (samanya), particularity (vishesha), and inherence (samavaya), concepts which are also mentioned in the Vaisheshikasutra, but which are not, in that text, given the same prominence as substance, quality, and motion (excepting one passage of a late edition which is of questionable authenticity).

The Sanskrit term translated as “substance”, dravya, comes from drú, meaning wood or tree, and therefore has a parallel etymology to Aristotle’s term for matter, hyle, which means wood in non-philosophical contexts. Nonetheless, it is widely recognised that the meaning of dravya is close to the meaning of Aristotle’s ousia: like Aristotle’s ousiai, dravyas are contrasted with quality and motion, and they are distinguished by their ability to undergo change and by the fact that other things depend on them for their existence. McEvilley (2002, 526–7) lists further parallels.

At the same time, there exist important differences between the Vaisheshika approach to substance and that of Aristotle. One difference concerns the paradigmatic examples. Aristotle’s favourite examples of substances are individual objects, and it is not clear that he would count the five classical elements, soul, or mind, as substances. (Aristotle’s statements on these themes are ambiguous and interpretations differ.) Moreover, Aristotle would not class space or time as substances. This, however, need not be taken to show that the Vaisheshika and Aristotelian concepts of substance are themselves fundamentally different. For philosophers who inherit Aristotle’s concept of substance often disagree with Aristotle about its extension in respects similar to Vaisheshika philosophers.

A second difference between the Vaisheshika approach to substance and Aristotle’s is that according to Vaisheshika philosophers, composite substances (anityadravya, that is noneternal substances), though they genuinely exist, do not persist through change. An individual atom of earth or water exists forever, but as soon as you remove a part of a tree, you have a new tree (Halbfass 1992, 96). A possible explanation for both differences between Vaisheshika and Aristotelian substances is that the former are not understood as compounds of matter and form but play rather a role somewhere between that of Aristotelian substances and Aristotelian matter.

Something closer to Aristotle’s position on this point is found in Jain discussions of substance, which appear to be indebted to the Vaisheshika notion, but which combine it with the idea of a vertical universal (urdhvatasamanya). The vertical universal plays a similar role to Aristotle’s substantial form, in that it accompanies an individual substance through nonessential modifications and can therefore account for its identity through material change.

The earliest parts of the Vaisheshikasutra are believed to have been authored between the fifth and second centuries B.C.E., with most parts being in place by the second century C.E. (Moise and Thite 2022, 46). This interval included a period of intense cultural exchange between Greece and India, beginning in the final quarter of the fourth century B.C.E. In view of the close parallels between the philosophy of Aristotle and that of the proponents of Vaisheshika, and of the interaction between the two cultures going on at this time, Thomas McEvilley (2002, 535) states that “it is possible to imagine stimulus diffusion channels” whereby elements of Vaisheshika’s thought “could reflect Greek, and specifically Peripatetic, influence”, including Aristotelian ideas about substance. However, it is also possible that the Vaisheshika and Aristotelian concepts of substance developed independently, despite their similarity.

b. Upanishadic Substrata

The paradigmatic examples of substances identified by Vaisheshika thinkers, like those identified by Aristotelians, are ordinary propertied things such as earth, water, humans and horses. Section 1.a noted that since the seventeenth century, the term “substance” has acquired another usage, according to which “substance” does not apply to ordinary propertied things, but to a putative underlying entity that is supposed to lack properties in itself but to combine with properties to yield substances of the ordinary sort. The underlying entity is often referred to as a substratum to distinguish it from substances in the traditional sense of the term. Although the application of the term “substance” to substrata only became well-established in the twentieth century, the idea that substances can be analysed into properties and an underlying substratum is very old and merits attention here.

As already mentioned, the idea of a substratum is exemplified by the idea of prime matter traditionally attributed to Aristotle. An earlier precursor of this idea is the Presocratic Anaximander, according to whom the apeiron underlies everything that exists. Apeiron is usually translated “infinite”; however, in this context, a more illuminating (albeit etymologically parallel) translation would be “unlimited” or “indefinite”. Anaximander’s apeiron is a thing conceived of in abstraction from any characteristics that limit or define its nature: it is a propertyless substratum. It is reasonable, moreover, to attribute essentially the same idea to Anaximander’s teacher, Thales. For although Thales identified the thing underlying all reality as water, and not as the apeiron, once it is recognised that “water” here is used as a label for something that need not possess any of the distinctive properties of water, the two ideas turn out to be more or less the same.

Thales was the first of the Presocratics and, therefore, the earliest Western philosopher to whom the idea of a substratum can be attributed. Thomas McEvilley (2002) argues that it is possible to trace the idea of a substratum still further back to the Indian tradition. First, McEvilley proposes that Thales’ claim that everything is water resembles a claim advanced by Sanatkumara in the Chandogya Upanishad (ca. 8th–6th century B.C.E.), which may well predate Thales. Moreover, just as we can recognise an approximation of the idea of a propertyless substratum in Thales’ claim, the same goes for Sanatkumara’s. McEvilley adds that even closer parallels can be found between Anaximander’s idea of the apeiron and numerous Upanishadic descriptions of brahman as that which underlies all beings, descriptions which, in this case, certainly appear much earlier.

The idea of substance in the sense of an underlying substratum can, therefore, be traced back as far as the Upanishads, and it is possible that the Upanishads influenced the Presocratic notion and, in turn, Aristotle. For there was significant Greek-Indian interchange in the Presocratic period, mediated by the Persian empire, and there is persuasive evidence that Presocratic thinkers had some knowledge of Upanishadic texts or of some unknown source that influenced both (McEvilley 2002, 28–44).

c. Buddhist Objections to Substance

The earliest sustained critiques of the notion of substance appear in Buddhist philosophy, beginning with objections to the idea of a substantial soul or atman. Early objections to the idea of a substantial soul are extended to substances in general by Nagarjuna, the founder of the Madhyamaka school, in around the second or third century C.E. As a result, discussions about substances would end up being central to the philosophical traditions across Eurasia in the succeeding centuries.

The earliest Buddhist philosophical texts are the discourses attributed to the Buddha himself and to his immediate disciples, collected in the Sutra Piṭaka. These are followed by the more technical and systematic Abhidharma writings collected in the Abhidhamma Piṭaka. The Sutra Piṭaka and the Abhidhamma Piṭaka are two of the three components of the Buddhist canon, the third being the collection of texts about monastic living known as the Vinaya Piṭaka. (The precise content of these collections differs in different Buddhist traditions, the Abhidhamma Piṭaka especially.)

The Sutra Piṭaka and the Abhidhamma Piṭaka both contain texts arguing against the idea of a substantial soul. According to the authors of these texts, the term atman is applied by convention to what is in fact a mere collection of mental and physical events. The Samyutta Nikaya, a subdivision of the Sutra Piṭaka, attributes a classic expression of this view to the Buddhist nun Vajira. Bhikkhu Bodhi (2000, 230) translates the relevant passage as follows:

Why now do you assume ‘a being’?
Mara, is that your speculative view?
This is a heap of sheer formations:
Here no being is found.

Just as, with an assemblage of parts,
The word ‘chariot’ is used,
So, when the aggregates exist,
There is the convention ‘a being’.

Although they oppose the idea of a substantial self, the texts collected in the Sutra Piṭaka and the Abhidhamma Piṭaka do not argue against the existence of substances generally. Indeed, Abhidharma philosophers analysed experiential reality into elements referred to as dharmas, which are often described in terms suggesting that they are substances (all the more so in later, noncanonical texts in the Abhidharma tradition).

The Madhyamaka school arose in response to Abhidharma philosophy as well as non-Buddhist schools such as Nyaya-Vaisheshika. In contrast to earlier Buddhist thought, its central preoccupation is the rejection of substances generally.

Madhyamaka means middle way. The school takes this name from its principal doctrine, which aims to establish a middle way between two opposing metaphysical views: realism (broadly the view that some things are ultimately real) and nihilism (the view that ultimately, nothing exists). Nagarjuna expresses the third alternative as the view that everything is characterised by emptiness (sunyata), which he explicates as the absence of svabhava. While svabhava has various interconnected meanings in Nagarjuna’s thought, it is mainly used to express the idea of substance understood as “any object that exists objectively, the existence and qualities of which are independent of other objects, human concepts, or interests” (Westerhoff 2009, 199).

Westerhoff (2009, 200–212) summarises several arguments against substance that can be attributed to Nagarjuna. These include an argument that substances could not stand in causal relations, an argument that substances could not undergo change, and an argument that there exists no satisfactory account of the relation between a substance and its properties. The first two appear to rule out substances only on the assumption that substances, if they exist at all, must stand in causal relations and undergo change, something that most, but not all, proponents of substances would hold. Regarding the self or soul, Nagarjuna joins with other Buddhist schools in arguing that what we habitually think of as a substantial self is in fact a collection of causally interconnected psychological and physical events.

The principal targets of Nagarjuna’s attacks on the concept of substance are Abhidharma and Nyaya-Vaisheshika philosophies. A central preoccupation of the Nyaya school is to respond to Buddhist arguments, including those against substance. It is possible that a secondary target is the concept of substance in Greek philosophy. As noted above, there is some evidence of influence between the Greek and Indian philosophical traditions in one or both directions. Greeks in India took a significant interest in Buddhism, with Greek converts contributing to Buddhist culture. The best known of these, Menander, a second century B.C.E. king of Bactria, is one of the two principal interlocutors in the Milindapanha, a Buddhist philosophical dialogue that includes a famous presentation of Vajira’s chariot analogy.

There also exist striking parallels between the arguments of the Pyrrhonists, as recorded by Sextus Empiricus in around 200 C.E., and the Madhyamaka school founded by Nagarjuna at about the same time (McEvilley 2002; Neale 2014). Diogenes Laertius records that Pyrrho himself visited India with Alexander the Great’s army, spending time in Taxila, which would become a centre of Buddhist philosophy. Roman historians record flourishing trade between the Roman empire and India. There was, therefore, considerable opportunity for philosophical interchange during the period in question. Nonetheless, arguing against the idea of substance does not seem to have been such a predominant preoccupation for the Pyrrhonists as it was for the Madhyamaka philosophers.

3. Substance in Medieval Arabic and Islamic Philosophy

Late antiquity and the Middle Ages saw a decline in the influence of Greco-Roman culture in and beyond Europe, hastened by the rise of Islam. Nonetheless, the tradition of beginning philosophical education with Aristotle’s logical works, starting with the Categories, retained an enormous influence in Middle Eastern intellectual culture. (Aristotle’s work was read not only in Greek but also in Syriac and Arabic translations from the sixth and ninth centuries respectively). The translation of Greek philosophical works into Arabic was accompanied by a renaissance in Aristotelian philosophy beginning with al-Kindi in the ninth century. Inevitably, this included discussions of the concept of substance, which is present throughout the philosophy of this period. Special attention is due to al-Farabi for an early detailed treatment of the topic and to Avicebron (Solomon ibn Gabirol) for his influential defence of the thesis that all substances must be material. Honourable mention is also due to Avicenna’s (Ibn Sina) floating-man argument, which is widely seen as anticipating Descartes’ (in)famous disembodiment argument for the thesis that the mind is an immaterial substance.

a. Al-Farabi

The resurgence of Aristotelian philosophy in the Arabic and Islamic world is usually traced back to al-Kindi. Al-Kindi’s works on logic (the subject area to which the Categories is traditionally assigned) have however been lost, and with them any treatment of substance they might have contained. Thérèse-Anne Druart (1987) identifies al-Farabi’s discussion of djawhar, in his Book of Letters, as the first serious Arabic study of substance. There, al-Farabi distinguishes between the literal use of djawhar (meaning gem or ore), metaphorical uses to refer to something valuable or to the material of which something is constituted, and three philosophical uses as a term for substance or essence.

The first two philosophical uses of djawhar identified by al-Farabi approximate Aristotle’s primary and secondary substances. That is, in the first philosophical usage, djawhar refers to a particular that is not said of and does not exist in a subject. For example, an elephant. In the second philosophical usage, it refers to the essence of a substance in the first sense. For example, the species elephant. Al-Farabi adds a third use of djawhar, in which it refers to the essence of a non-substance. For example, to colour, the essence of the non-substance grey.

Al-Farabi says that the other categories depend on those of first and second substances and that this makes the categories of first and second substances more perfect than the others. He reviews alternative candidates for the status of djawhar put forward by unnamed philosophers. These include universals, indivisible atoms, spatial dimensions, mathematical points, and matter. The idea appears to be that these could turn out to be superior candidates for substances because they are more perfect. However, with one exception, al-Farabi does not discover anything more perfect than primary and secondary substances.

The exception is as follows. Al-Farabi claims that it can be proved that there exists a being that is neither in nor predicated of a subject and that is not a subject for anything else either. This being, al-Farabi claims, is more worthy of the term djawhar than the object-like primary substances, insofar as it is still more perfect. Although al-Farabi indicates that it would be reasonable to extend the philosophical usage of djawhar in this way, he does not himself propose to break with the established use. Insofar as “more perfect” means “more fundamental”, we see here the tension mentioned at the beginning of this article between the use of the term “substance” for object-like things and its use for whatever is most fundamental.

b. Avicebron (Solomon ibn Gabirol)

Avicebron was an eleventh century Iberian Jewish Neoplatonist. In addition to a large corpus of poetry, he wrote a philosophical dialogue, known by its Latin name, Fons Vitae (Fountain of Life), which would have a great influence on Christian scholastic philosophy in the twelfth and thirteenth centuries.

Avicebron’s principal contribution to the topic of substance is his presentation of the position known as universal hylomorphism. As explained in section 1, Aristotle defends hylomorphism, the view that material substances are composed of matter (hyle) and form (morphe). However, Aristotle does not extend this claim to all substances. He leaves room for the view that there exist many substances, including human intellects, that are immaterial. By late antiquity, a standard interpretation of Aristotle emerged, according to which such immaterial substances do in fact exist. By contrast, in the Fons Vitae, Avicebron defends the thesis that all substances, with the only exception of God, are composed of matter and form.

There is a sense in which Avicebron’s universal hylomorphism is a kind of materialism: he holds that created reality consists solely of material substances. It is however important not to be misled by this fact. For although they argue that all substances, barring God, are composed of matter and form, Avicebron and other universal hylomorphists draw a distinction between the ordinary matter that composes corporeal substances and the spiritual matter that composes spiritual substances. Spiritual matter plays the same role as ordinary matter in that it combines with a form to yield a substance. However, the resulting substances do not have the characteristics traditionally associated with material entities. They are not visible objects that take up space. Hence, universal hylomorphism would not satisfy traditional materialists such as Epicurus or Hobbes, who defend their position on the basis that everything that exists must take up space.

Scholars do not agree on what the case for universal hylomorphism is supposed to be. Paul Vincent Spade (2008) suggests that it results from two assumptions: that only God is metaphysically simple in all respects, and that anything that is not metaphysically simple in all respects is a composite of matter and form. However, Avicebron does not explicitly defend this argument, and it is not obvious why something could not qualify as non-simple in virtue of being complex in some way other than involving matter and form.

4. Substance in Medieval Scholastic Philosophy

In the early sixth century, Boethius set out to translate the works of Plato and Aristotle into Latin. This project was cut short when he was executed by Theodoric the Great, but Boethius still did manage to translate Aristotle’s Categories and De Interpretatione. A century later, Isidore of Seville summarised Aristotle’s account of substance in the Categories in his Etymologiae, perhaps the most influential book of the Middle Ages, after the Bible. As a result, the concept of substance introduced in Aristotle’s Categories remained familiar to philosophers after the fall of the Western Roman Empire. Nonetheless, prior to the twelfth century, philosophy in the Latin West consisted principally in elaborating on traditional views, inherited from the Church Fathers and other familiar authorities. It is only from the twelfth century onwards that philosophers made novel contributions to the topic of substance, influenced by Arabic-Islamic philosophy and by the recovery of ancient works by Aristotle and others. The most important contributions are those of Thomas Aquinas and John Duns Scotus.

a. Thomas Aquinas

All the leading philosophers of this period adopted a version of Aristotle’s concept of substance. Many, and in particular those in the Franciscan order, such as Bonaventure, followed Avicebron in accepting universal hylomorphism. Aquinas’s main contribution to the topic of substance is his opposition to Avicebron’s position.

Aquinas endorses Aristotle’s definition of a substance as something that neither is said of, nor exists in, a subject, and he follows Aristotle in analysing material substances as composites of matter and form. However, Aquinas recognised a problem about how to square these views with his belief that some substances, including human souls, are immaterial.

Aquinas was committed to the view that, unlike God, created substances are characterised by potentiality. For example, before its bath, the elephant is actually hot but potentially cool. Aquinas takes the view that in material substances, it is matter that contributes potentiality. For matter is capable of receiving different forms. Since immaterial substances lack matter, it seems to follow that they also lack potentiality. Aquinas is happy to accept this conclusion respecting God, whom he regards as pure act. He is however not willing to say the same of other immaterial substances, such as angels and human souls, which he takes to be characterised by potentiality no less than material substances.

One solution would be to adopt the universal hylomorphism of Avicebron, but Aquinas rejects this position on the basis that the potentiality of matter, as usually understood, consists ultimately in its ability to move through space. If so, it seems that matter can only belong to spatial, and hence corporeal, beings (Questiones Disputate de Anima, 24.1.49.142–164).

Instead, Aquinas argues that although immaterial substances are not composed of matter and form, they are composed of essence and existence. In immaterial substances, it is their essence that contributes potentiality. This account of immaterial substances presupposes that existence and essence are distinct, an idea that had been anticipated by Avicenna as a corollary of his proof of God’s existence. Aquinas defends the distinction between existence and essence in De Ente et Essentia, though scholars disagree about how exactly the argument should be understood (see Gavin Kerr’s article on Aquinas’s Metaphysics).

Aquinas recognises that one might be inclined to refer to incorporeal potentiality as matter simply on the basis that it takes on, in spiritual substances, the role that matter plays in corporeal substances. However, he takes the view that this use of the term “matter” would be equivocal and potentially misleading.

A related, but more specific, contribution by Aquinas concerns the issue of how a human soul, if it is the form of a hylomorphic compound, can nonetheless be an immaterial substance in its own right, capable of existing without the body after its death. Aquinas compares the propensity of the soul to be embodied to the propensity of lighter objects to rise, observing that in both cases, the propensity can be obstructed while the object remains in existence. For more on this issue, see Christopher Brown’s article on Thomas Aquinas.

b. Duns Scotus

Like Aquinas, Scotus adopts the Categories’ account of substance. In contrast to earlier Franciscans, he agrees with Aquinas’s rejection of universal hylomorphism. Indeed, Scotus goes even further, claiming not only that form can exist without matter, but also that prime matter can exist without form. As a result, Scotus is committed to the view that matter has a kind of formless actuality, something that, in Aquinas’s system, looks like a contradiction.

Although he drops the doctrine of universal hylomorphism, Scotus maintains, against Aquinas, a second thesis concerning substances associated with Franciscan philosophers and often paired with universal hylomorphism: the view that a single substance can have multiple substantial forms (Ordinatio, 4).

According to Aquinas, a substance has only one substantial form. For example, the substantial form of an elephant is the species elephant. The parts of the elephant, such as its organs, do not have their own substantial forms. Because substantial forms are responsible for the identity of substances over time, this view has the counterintuitive consequence that when, for example, an organ transplant takes place, the organ acquired by the recipient is not the one that was possessed by the donor.

According to Scotus, by contrast, one substance can have multiple substantial forms. For example, the parts of the elephant, such as its organs, may each have their own substantial form. This allows followers of Scotus to take the intuitive view that when an organ transplant takes place, the organ acquired by the recipient is one and the same as the organ that the donor possessed, and not a new entity that has come into existence after the donor’s death. (Aristotle seems to endorse the position of Scotus in the Categories, and that of Aquinas in the Metaphysics.)

Scotus is also known for introducing the idea that every substance has a haecceity (thisness), that is, a property that makes it the particular thing that it is. In this, he echoes the earlier Vaisheshika idea of a vishesha (usually translated “particularity”) which plays approximately the same role (Kaipayil 2008, 79).

5. Substance in Early Modern Philosophy

Prior to the early modern period, Western philosophers tend to adopt both Aristotle’s definition of substance in the Categories and his analysis of material substances into matter and form. In the early modern period, this practice begins to change, with many philosophers offering new characterisations of substance, or rejecting the notion of substance entirely. The most influential contribution from this period is Descartes’ independence definition of substance. Although many earlier philosophers have been interpreted as saying that substances are things that have independent existence, Descartes appears to be the first prominent thinker to say this explicitly. Descartes’ influence, respecting this and other topics, was reinforced by Antoine Arnauld and Pierre Nicole’s Port-Royal Logic, which, towards the end of the seventeenth century, took the place of Aristotle’s Categories as the leading introduction to philosophy. Important contributions to the idea of substance in this period are also made by Spinoza, Leibniz, Locke and Hume, all of whom are known for resisting some aspect of Descartes’ account of substance.

a. Descartes

Substance is one of the central concepts of Descartes’ philosophy, and he returns to it on multiple occasions. In the second set of Objections and Replies to the Meditations on First Philosophy, Descartes advances a definition of substance that resembles Aristotle’s definition of substance in the Categories. This is not surprising given that Descartes underwent formal training in Aristotelian philosophy at the Royal College of La Flèche, France. In a number of other locations, however, Descartes offers what has been called the independence definition of substance. According to the independence definition, a substance is anything that could exist by itself or, equivalently, anything that does not depend on anything else for its existence (Oeuvres, vol. 7, 44, 226; vol. 3, 429; vol. 8a, 24).

Scholars disagree about how exactly we should understand Descartes’ independence definition. Some have argued that Descartes’ view is that substances must be causally independent, in the sense that they do not require anything else to cause them to exist. Another and maybe more popular view is that, for Descartes, substances are modally independent, meaning that the existence of a substance does not necessitate the existence of any other entity. This interpretation itself has several variants (see Weir 2021, 281–7).

In addition to offering a new definition of substance, Descartes draws a distinction between a strict and a more permissive sense of the term. A substance in the strict sense satisfies the independence definition without qualification. Descartes claims that there is only one such substance: God. For everything else depends on God for its existence. Descartes adds, however, that we can count as created substances those things that depend only on God for their existence. Descartes claims that finite minds and bodies qualify as created substances in this sense, whereas their properties (attributes, qualities and modes in his terminology) do not.

It is possible to view Descartes’ independence definition of substance as a disambiguation of Aristotle’s definition of substance in the Categories. Aristotle says that substances do not depend, for their existence, on any other being of which they must be predicated or in which they must inhere. He does not however say explicitly whether substances depend in some other way on other things for their existence. Descartes clarifies that they do not. This is consistent with, and may even be implied by, what Aristotle says in the Categories.

In another respect, Descartes’ understanding of substance departs dramatically from the Aristotelian orthodoxy of his day. For example, while Descartes accepts Aristotle’s claim that in the case of a living human, the soul serves as the form of the body, he exhibits little or no sympathy for hylomorphism beyond this. Rather than analysing material substances into matter and form like Aristotle, or substances in general into potency and act like Aquinas, Descartes proposes that every substance has, as its principal attribute, one of two properties—namely, extension or thought—and that all accidental properties of substances are modes of their principal attribute. For example, being elephant-shaped is a mode of extension, and seeing sunlight glimmer on a lake is a mode of thought. In contrast to the scholastic theory of real accidents, Descartes holds that these modes are only conceptually distinct from, and cannot exist without, the substances to which they belong.

One consequence is that Descartes appears to accept what has come to be known as the bundle view of substances: the thesis that, in his words, “the attributes all taken together are the same as the substance” (Conversation with Burman, 7). To put it another way, once we have the principal attribute of the elephant—extension—and all of the accidental attributes, such as its size, shape, texture and so on, we have everything that this substance comprises. These attributes do not need to be combined with a propertyless substratum. (The bundle view, in the relevant sense, contrasts with the substratum view, according to which a substance is composed of properties and a substratum. Sometimes, the term “bundle view” is used in a stronger sense, to imply that the properties that make up a substance could exist separately, but Descartes does not endorse the bundle view in this stronger sense.)

A further consequence is that Descartes could not accept the standard transubstantiation account of the eucharist, which depended on the theory of real accidents, and was obliged to offer a competing account.

In the late seventeenth century, two followers of Descartes, Antoine Arnauld and Pierre Nicole, set out to author a modern introduction to logic that could serve in place of the texts of Aristotle’s Organon, including the Categories. (The word “logic” is used here in a traditional sense that is significantly broader than the sense that philosophers at the beginning of the twenty-first century would attribute to it, including much of what these philosophers would recognise as metaphysics.) The result was La logique ou l’art de penser, better known as the Port-Royal Logic, a work that had an enormous influence on the next two centuries of philosophy. The Port-Royal Logic offers the following definition of substance:

I call whatever is conceived as subsisting by itself and as the subject of everything conceived about it, a thing. It is otherwise called a substance. […] This will be made clearer by some examples. When I think of a body, my idea of it represents a thing or a substance, because I consider it as a thing subsisting by itself and needing no other subject to exist. (30–31)

This definition combines Aristotle’s idea that a substance is the subject of other categories with Descartes’ claim that a substance does not need other things to exist. It is interesting to note here a shift in focus from what substances are to how they are conceived or considered. This reflects the general shift in focus from metaphysics to epistemology that characterised philosophy after Descartes.

b. Spinoza

Influential philosophers writing after Descartes tend to use Descartes’ views as a starting point, criticising or accepting them as they deem reasonable. Hence, a number of responses to Descartes’ account of substance appear in the early modern period.

In the only book published under his name in his lifetime, the 1663 Principles of Cartesian Philosophy, Spinoza endorses both Descartes’ definition of substance in the Second Replies (which is essentially Aristotle’s definition in the Categories) and the independence definition introduced in the Principles of Philosophy and elsewhere. Spinoza also endorses Descartes’ distinction between created and uncreated substances, his rejection of substantial forms and real accidents, and his division of substances into extended substances and thinking substances.

In the Ethics, published posthumously in 1677, Spinoza develops his own approach to these issues. Spinoza opens the Ethics by stating that “by substance I understand what is in itself and is conceived through itself”. Shortly after this, in the first of his axioms, he adds that “Whatever is, is either in itself or in another”. Spinoza’s contrast between substance, understood as those things that are in themselves, and non-substances, understood as those things that are in another, reflects the distinction introduced by Plato in the Sophist and taken up by countless later thinkers from antiquity onwards. As in the Port-Royal Logic, Spinoza’s initial definition of substance in terms of how it is conceived reflects the preoccupation of early modern philosophy with epistemology.

Spinoza clarifies the claim that a substance is conceived through itself by saying that it is that “the conception of which does not require for its formation the conception of anything else”. This might mean that something is a substance if and only if it is possible to conceive of its existing by itself. If so, then Spinoza’s definition might be interpreted as an epistemological rewriting of Descartes’ independence definition.

Spinoza purports to show, on the basis of various definitions and axioms, that there can only be one substance, and that this substance is to be identified with God. What Descartes calls created substances are really modes of God. This conclusion is sometimes represented as a radical departure from Descartes. This is misleading, however. For Descartes also holds that only God qualifies as a substance in the strict sense of the word “substance”. To this extent, Spinoza is no more monistic than Descartes.

Spinoza’s Ethics does however depart from Descartes in (i) not making use of a category of created substances, and (ii) emphasizing that those things that Descartes would class as created substances are modes of God. Despite this, Spinoza’s theory is not obviously incompatible with the existence of created substances in Descartes’ sense of the term, even if he does not make use of the category himself. It is plausibly a consequence of Descartes’ position that created substances are, strictly speaking, modes of God, even if Descartes does not state this explicitly.

c. Leibniz

In his Critical Thoughts on Descartes’ Principles of Philosophy, Leibniz raises the following objection to Descartes’ definition of created substances as things that depend only on God for their existence:

I do not know whether the definition of substance as that which needs for its existence only the concurrence of God fits any created substance known to us. […] For not only do we need other substances; we need our own accidents even much more. (389)

Leibniz does not explicitly explain here why substances should need other substances, setting aside God, for their existence. Still, his claim that substances need their own accidents is an early example of an objection that has been significantly influential in the twentieth- and twenty-first-century literature on substance. According to this objection, nothing could satisfy Descartes’ independence definition of substance because every candidate substance (an elephant or a soul, for example) depends for its existence on its own properties. This objection is further discussed in section 6.

In the Discourse on Metaphysics, Leibniz does provide a reason for thinking that created substances need other substances to exist. There, he begins by accepting something close to Aristotle’s definition of substance in the Categories: a substance is something of which other things are predicated, but which is not itself predicated of anything else. However, Leibniz claims that this characterisation is insufficient, and sets out a novel theory of substance, according to which the haecceity of a substance includes everything true of it (see section 4.b for the notion of haecceity). Accordingly, Leibniz holds that from a perfect grasp of the concept of a particular substance, one could derive all other truths.

It is not obvious how Leibniz arrives at this unusual conception of substance, but it is clear that if the haecceity of one substance includes everything that is true of it, this will include the relationships in which it stands to every other substance. Hence, on Leibniz’s view, every substance turns out to necessitate, and so to depend modally on, every other for its existence, a conclusion that contrasts starkly with Descartes’ position.

Leibniz’s view illustrates the fact that it is possible to accept Aristotle’s definition of substance in the Categories while rejecting Descartes’ independence definition. Leibniz clearly agrees with Aristotle that a substance does not have to be said of or to exist in something in the way that properties do. However, he holds that substances depend for their existence on other things in a way that contradicts Descartes’ independence definition.

Leibniz’s enormous corpus makes a number of other distinctive claims about substances. The most important of these are the characterisation of substances as unities and as things that act, both of which can be found in his New Essays on Human Understanding. These ideas have precursors as far back as Aristotle, but they receive special emphasis in Leibniz’s work.

d. British Empiricism

Section 1 mentions that since the seventeenth century, a new usage of the term “substance” becomes prevalent, on which it does not refer to an object-like thing, such as an elephant, but to an underlying substratum that must be combined with properties to yield an object-like thing. On this usage, an elephant is a combination of properties such as its shape, size and colour, and the underlying substance in which these properties inhere. The substance in this sense is often described as having no properties in itself, and therefore resembles Aristotelian prime matter more than the objects that serve as examples of substances in earlier traditions.

This new usage of “substance” is standardly traced back to Locke’s Essay Concerning Human Understanding, where he states that:

Substance [is] nothing, but the supposed, but unknown support of those qualities, we find existing, which we imagine cannot subsist, sine re substante, without something to support them, we call that support substantia; which, according to the true import of the word, is in plain English, standing under, or upholding. (II.23.2)

This and similar statements in Locke’s Essay initiated a longstanding tradition in which British empiricists, including Berkeley, Hume, and Russell, took for granted that the term “substance” typically refers to a propertyless substratum and criticised the concept on that basis.

Scholars debate whether Locke actually intended to identify substances with propertyless substrata. There exist two main interpretations. On the traditional interpretation, associated with Leibniz and defended by Jonathan Bennett (1987), Locke uses the word “substance” to refer to a propertyless substratum that we posit to explain what supports the collections of properties that we observe, although Locke is sceptical of the value of this idea, since it stands for something whose nature we are entirely ignorant of. (Those who believe that Locke intended to identify substances with propertyless substrata disagree regarding the further issue of whether Locke reluctantly accepts or ultimately rejects such entities.)

The alternative interpretation, defended by Michael Ayers (1977), agrees that Locke identifies substance with an unknown substratum that underlies the collections of properties we observe. However, on this view, Locke does not regard the substratum as having no properties in itself. Rather, he holds that these properties are unknown to us, belonging as they do to the imperceptible microstructure of their bearer. This microstructure is posited to explain why a given cluster of properties should regularly appear together. On this reading, Locke’s substrata play a similar role to Aristotle’s secondary substances or Jain vertical universals in that they are the essences that explain the perceptible properties of objects. The principal advantage of this interpretation is that it explains how Locke can endorse the idea of a substratum while recognising the (apparent) incoherence of the idea of something having no properties in itself. The principal disadvantages of this interpretation include the meagre textual evidence in its favour and its difficulty accounting for Locke’s disparaging comments about the idea of a substratum.

Forrai (2010) suggests that the two interpretations of Locke’s approach to substances can be reconciled if we suppose that Locke takes our actual idea of substance to be that of a propertyless substratum while holding that we only think of that substratum as propertyless because we are ignorant of its nature, which is in fact that of an invisible microstructure.

In the passages traditionally interpreted as discussing the idea of a propertyless substratum, Locke refers to it as the “idea of substance in general”. In other passages, Locke discusses our ideas of “particular sorts of substances”. Locke’s particular sorts of substances resemble the things referred to as substances in earlier traditions. His examples include humans, horses, gold, and water. These, Locke claims, are in fact just collections of simple ideas that regularly appear together:

We come to have the Ideas of particular sorts of Substances, by collecting such Combinations of simple Ideas, as are by Experience and Observation of Men’s Senses taken notice of to exist together, and are therefore supposed to flow from the particular internal Constitution, or unknown Essence of that Substance. (Essay, II.23.3)

The idea of an elephant, on this view, is really just a collection comprising the ideas of a certain colour, a certain shape, and so on. Locke seems to take the view that the distinctions we draw between different sorts of substances are somewhat arbitrary and conventional. That is, the word “elephant” may not refer to what the philosophers of the twentieth and twenty-first centuries would consider a natural kind. Hence, substances in the traditional sense turn out to be subject-dependent in the sense that the identification of some collection of ideas as a substance is not an objective, mind-independent fact, but depends on arbitrary choices and conventions.

Locke’s comments about substance, particularly those traditionally regarded as identifying substances with propertyless substrata, had a great influence on Berkeley and Hume, both of whom followed Locke in treating substances as substrata and in criticising the notion on this basis, while granting the existence of substances in the deflationary sense of subject-dependent collections of ideas.

Berkeley’s position is distinctive in that he affirms an asymmetry between perceptible substances, such as elephants and vases, and spiritual substances, such as human and divine minds. Berkeley agrees with Locke that our ideas of perceptible substances are really just collections of ideas and that we are tempted to posit a substratum in which these ideas exist. Unlike Locke, Berkeley explicitly says that in the case of perceptible objects at least, we should posit no such thing:

If substance be taken in the vulgar sense for a combination of qualities such as extension, solidity, weight, and the like […] this we cannot be accused of taking away; but if it be taken in the philosophic sense for the support of accidents or qualities without the mind, then I acknowledge that we take it away. (Principles, 1.37)

Berkeley’s rejection of substrata in the case of material objects is not necessarily due to his rejection of the idea of substrata in general, however. It may be that Berkeley rejects substrata for material substances only, and does so solely on the basis that, according to his idealist metaphysics, those properties that make up perceptible objects really inhere in the minds of the perceivers.

Whether or not Berkeley thinks that spiritual substances involve propertyless substrata is hard to judge and it is not clear that Berkeley maintains a consistent view on this issue. On the one hand, Berkeley’s published criticisms of the idea of a substratum tend to focus exclusively on material objects, suggesting that he is not opposed to the existence of a substratum in the case of minds. On the other hand, several passages in Berkeley’s notebooks assert that there is nothing more to minds than the perceptions they undergo, suggesting that Berkeley rejects the idea of substrata in the case of minds as well (see in particular his Notebooks, 577 and 580). The task of interpreting Berkeley on this point is complicated by the fact that the relevant passages are marked with a “+”, which some but not all scholars interpret as indicating Berkeley’s dissatisfaction with them.

Hume’s Treatise of Human Nature echoes Locke’s claim that we have no idea of what a substance is and that we have only a confused idea of what a substance does. Although Hume does not explicitly state that these criticisms are intended to apply to the idea of substances as propertyless substrata, commentators tend to agree that this is his intention (see for example Baxter 2015). Hume seems to agree with Locke (as traditionally interpreted) that we introduce the idea of a propertyless substratum in order to make sense of the unity that we habitually attribute to what are in fact mere collections of properties that regularly appear together. Hume holds that we can have no idea of this substratum because any such idea would have to come from some sensory or affective impression while, in fact, ideas derived from sensory and affective impressions are always of accidents—that is, of properties.

Hume grants that we do have a clear idea of substances understood as Descartes defines them, that is, as things that can exist by themselves. However, Hume asserts that this definition applies to anything that we can think of, and hence, that to call something a substance in this sense is not to distinguish it from anything else.

Hume further argues that we can make no sense of the idea of the inherence relation that is supposed to exist between properties and the substances to which they belong. For the inherence relation is taken to be the relation that holds between an accident and something without which it could not exist (as per Aristotle’s description of inherence in the Categories, for example). According to Hume, however, nothing stands in any such relation to anything else. For he makes it an axiom that anything that we can distinguish in thought can exist separately in reality. It follows that not only do we have no idea of a substratum, but no such thing can exist, either in the case of perceptible objects or in the case of minds. For a substratum is supposed to be that in which properties inhere. It is natural to see Hume’s arguments on this topic as the culmination of Locke’s more circumspect criticisms of substrata.

It follows from Hume’s arguments that the entities that earlier philosophers regarded as substances, such as elephants and vases, are in fact just collections of ideas, each member of which could exist by itself. Hume emphasises that, as a consequence, the mind really consists in successive collections of ideas. Hence, Hume adopts a bundle view of the mind and other putative substances not only in the moderate sense that he denies that minds involve a propertyless substratum, but in the extreme sense that he holds that they are really swarms of independent entities.

Hume’s rejection of the existence of complex substances, and his emphasis on the nonexistence of a substantial mind in particular, closely resemble the criticisms of substance advanced by Buddhist philosophers and described in section 2. It is possible that Hume was influenced by Buddhist thought on this and other topics during his stay at the Jesuit College of La Flèche, France, in 1735–37, through the Jesuit missionary Charles François Dolu (Gopnik 2009).

Although not himself a British empiricist (though see Stephen Priest’s (2007, 262 fn. 40) protest on this point), Kant developed an approach to substance in the tradition of Locke, Berkeley and Hume, with a characteristically Kantian twist. Kant endorses a traditional account of substance, according to which substances are subjects of predication and are distinguished by their capacity to persist through change. However, Kant adds that the category of substance is something that the understanding imposes upon experience, rather than something derived from our knowledge of things in themselves. For Kant, the category of substance is, therefore, a necessary feature of experience, and to that extent, it has a kind of objectivity. Kant nonetheless agrees with Locke, Berkeley (respecting material substances) and Hume that substances are subject-dependent. (See Messina (2021) for a complication concerning whether we might nonetheless be warranted in applying this category to things in themselves.)

While earlier thinkers beginning with Aristotle asserted that substances can persist through change, Kant goes further, claiming that substances exist permanently and that their doing so is a necessary condition for the unity of time. It seems to follow that for Kant, composites such as elephants or vases cannot be substances, since they come into and go out of existence. Given that Kant also rejects the existence of indivisible atoms in his discussion of the second antinomy, the only remaining candidate for a material substance in Kant appears to be matter taken as a whole. For an influential exposition, see Strawson (1997).

6. Substance in Twentieth-Century and Early-Twenty-First-Century Philosophy

The concept of substance lost its central place in philosophy after the early modern period, partly as a result of the criticisms of the British empiricists. However, the twentieth and early twenty-first centuries have seen a revival of interest in the idea, with several philosophers arguing that we need to accept the concept of substance to account for the difference between object-like and property-like things, or to account for which entities are fundamental, or to address a range of neighbouring metaphysical issues. Discussions have centred on two main themes: the criteria for being a substance, and the structure of substances. O’Conaill (2022) provides a detailed overview of both. Moreover, in the late twentieth century, the concept of substance has gained an important role in philosophy of mind, where it has been used to mark the difference between two kinds of mind-body dualism: substance dualism and property dualism.

a. Criteria for Being a Substance

As noted at the beginning of this article, the term “substance” has two main uses in philosophy. Some philosophers use this word to pick out those things that are object-like in contrast to things that are property-like (or, for some philosophers, event-like or stuff-like). Others use it to pick out those things that are fundamental, in contrast to things that are non-fundamental. Both uses derive from Aristotle’s Categories, which posits that the object-like things are the fundamental things. For some thinkers, however, object-like-ness and fundamentality come apart. When philosophers attempt to give precise criteria for being a substance, they tend to have one of two targets in mind. Some have in mind the task of stating what exactly makes something object-like, while others have in mind the task of stating what exactly makes something fundamental. Koslicki (2018, 164–7) describes the two approaches in detail. Naturally, this makes a difference to which criteria for being a substance seem reasonable, and occasionally this has resulted in philosophers talking past one another. Nonetheless, the hypothesis that the object-like things are the fundamental things is either sufficiently attractive, or sufficiently embedded in philosophical discourse, that there exists considerable overlap between the two approaches.

The most prominent criterion for being a substance in the philosophy of the beginning of the twenty-first century is independence. Many philosophers defend, and even more take as a starting point, the idea that what makes something a substance is the fact that it does not depend on other things. Philosophers differ, however, on what kind of independence is relevant here, and some have argued that independence criteria are unsatisfactory and that some other criterion for being a substance is needed.

The most common independence criteria for being a substance characterise substances in terms of modal (or metaphysical) independence. One thing a is modally independent of another thing b if and only if a could exist in the absence of b. The idea that substances are modally independent is attractive for two reasons. First, it seems that properties, such as shape, size or colour, could not exist without something they belong to—something they are the shape, size or colour of. In other words, property-like things seem to be modally dependent entities. By contrast, object-like things, such as elephants or vases, do not seem to depend on other things in this way. An elephant need not be the elephant of some elephant-having being. Therefore, one could argue for the claim that object-like things differ from property-like things by saying that the former are not modally dependent on other entities, while the latter are. Secondly, modally independent entities are arguably more fundamental than modally dependent entities. For example, it is tempting to say that modally independent entities are the basic elements that make up reality, whereas modally dependent entities are derivative aspects or ways of being that are abstracted from the modally independent entities.

Though attractive, the idea that substances are modally independent faces some objections. The most influential objection says that nothing is modally independent because nothing can exist without its own parts and/or properties (see Weir (2021, 287–291) for several examples). For example, an elephant might not have to be the elephant of some further, elephant-having being, but an elephant must have a size and shape, and countless material parts. An elephant cannot exist without a size, a shape and material parts, and so there is a sense in which an elephant is not modally independent of these things.

Several responses have been suggested. First, one might respond by drawing a distinction between different kinds of modal dependence (see, for example, Lowe 1998, 141; Koslicki 2018, 142–44). For instance, we might say that a is rigidly dependent on b if and only if a cannot exist without b, whereas a is generically dependent on entities of kind F if and only if a cannot exist without some entity of kind F. This allows us to distinguish between something that is weakly modally independent, in that there is no entity upon which it is rigidly dependent, and something that is strongly modally independent, in that there is no kind of entity on which it is generically dependent. It might then be argued that substances need only be weakly modally independent. Hence, the fact that an elephant cannot exist without having properties and parts of certain kinds will not disqualify it as a substance, so long as there is no particular, individual part or property that it must have. It is acceptable, for example, that an elephant must have countless carbon atoms as parts, so long as it can do without any given carbon atom (which, presumably, it can).

The problem with this response is that many putative examples of substances seem to have necessary parts or properties upon which they rigidly depend. For example, it is plausible that King Dutagamuna’s renowned elephant, Kandula, could have existed without some of his properties, such as that of exhibiting heroism at the siege of Vijitanagara. It is not plausible, however, that Kandula could have existed without certain other of his properties, such as that of being the unique member of the singleton set {Kandula}. This, however, does not seem like the kind of fact that should undermine Kandula’s claim to be a substance. Likewise, it is plausible that a given H2O molecule could not exist without the particular hydrogen atom it contains, and yet most philosophers would hesitate to conclude on this basis that an H2O molecule is not a substance.

A second kind of response to the dependence of substances on their properties and parts replaces modal independence with some other variety. One strategy of this kind appeals to the idea of a non-modal essence (see Fine 1994, 1995). Proponents of non-modal essences claim that things have essences that are narrower—that is, include less—than their necessary parts and properties. For example, it can be argued that although Kandula necessarily belongs to the set {Kandula}, this is not part of Kandula’s essence. After all, it is plausible that one could grasp what it is for Kandula to exist without ever thinking about the fact that Kandula belongs to the set {Kandula}. The fact that Kandula belongs to {Kandula} seems more like a side-effect of Kandula’s nature than a part of his nature. If we accept that things have non-modal essences, then it will be possible to propose that something is a substance if and only if it does not essentially depend on other entities—that is, if and only if no other entity is part of its non-modal essence.

The proposal that substances are essentially independent, in the sense specified, promises to get around the concern that Kandula fails to qualify as a substance because Kandula necessarily belongs to the set {Kandula}. However, other problems remain. For it is plausible that some entities of the sort that intuitively count as substances have some particular properties or parts essentially, and not merely necessarily. It is plausible, for example, that the particular hydrogen atom in a given H2O molecule is not only necessary to it but is also a part of its non-modal essence: a part of what it is for this H2O molecule to exist rather than some other H2O molecule is that it should contain this particular hydrogen atom. Yet, it is not obvious that this should disqualify the H2O molecule’s claim to be a substance.

Other responses that replace modal independence with some other variety include E. J. Lowe’s (1998; 2005) identity-independence and Benjamin Schnieder’s (2006) conceptual-independence criteria for substance. Like the essential-independence criterion, these get around at least some of the problems facing the simple modal independence criterion.

A more complex strategy is taken up by Joshua Hoffman and Gary Rosenkrantz (1997). Hoffman and Rosenkrantz introduce a hierarchy of categories with entity at level A, abstract and concrete at level B, and so on. After a lengthy discussion, they formulate the following definition:

x is a substance = df. x instantiates a level C category, C1, such that: (i) C1 could have a single instance throughout an interval of time, and (ii) C1’s instantiation does not entail the instantiation of another level C category which could have a single instance throughout an interval of time, and (iii) it is impossible for C1 to have an instance which has as a part an entity which instantiates another level C category, other than Concrete Proper Part, and other than Abstract Proper Part. (65)

For a full understanding of their approach, it is necessary to refer to Hoffman and Rosenkrantz’s text. However, the definition quoted is enough to illustrate how their strategy addresses the dependence of substances on their properties and parts. In short, Hoffman and Rosenkrantz retain a criterion of independence but qualify that criterion in two ways. First, on their definition, it is only necessary that some substances should satisfy the independence criterion. Substances that do not satisfy the criterion count as substances in virtue of being, in some other respect, the same kinds of entities as those that do. Secondly, even those substances that satisfy the independence criterion need only to be able to exist without a carefully specified class of entities, namely those belonging to a “level C category which could have a single instance throughout an interval of time”.

Hoffman and Rosenkrantz’s definition of substance is carefully tailored to avoid the objection that substances do depend on their properties and parts, as well as a number of other objections. A drawback is that they leave it unclear what it is that unifies the category of substances, given that they only require that some substances should satisfy their qualified independence criterion.

Perhaps the simplest response to the dependence of substances on their properties and parts maintains that while a substance must be independent of all other entities, “other entities” should be taken to refer to things that are not included in the substance. This approach is proposed by Michael Gorman (2006, 151) and defended at length by Weir (2021). According to this response, while it is true that an elephant cannot exist without a shape, a size and countless material parts, this does not mean that the elephant cannot exist by itself or without anything else in the sense required for it to be a substance. For the elephant’s shape, size, and material parts are included in it. By contrast, the reason why property-like things, such as the shape of the elephant, do not count as substances is that they are incapable of existing without something that is not included in them. The shape of the elephant, for example, can only exist by being the shape of something that includes more than just the shape. Weir (2021, 296) suggests that the fact that the elephant includes the shape and not vice versa can be seen from the fact that it is possible to start with the whole elephant and subtract elements such as its colour, weight and so on, until one is left with just the shape, whereas it is not possible to start with just the shape and, by subtracting elements, arrive at the whole elephant.

Several other objections to independence criteria deserve mention. First, if there exist necessary beings, such as numbers or God, then trivially, no candidate substance will be able to exist without them. Secondly, if particulars necessarily instantiate abstract universals (if, for example, an elephant necessarily instantiates universals, such as grey, concretum, or animal), then no candidate substance will be able to exist without abstract universals. Thirdly, if space and time are something over and above their occupants (as they are on substantivalist theories of space and time), then no spatial or temporal substance will be able to exist without these. Some of the strategies for dealing with the dependence of substances on their properties and parts can be transferred to these issues. Other strategies have also been proposed. There exists no consensus on whether one or more independence criteria can satisfactorily be defended against such objections.

Those who deny that an independence criterion is necessary for being a substance, or who hold that an independence criterion needs to be supplemented, have proposed alternative criteria. Two popular options have been subjecthood and unity.

In the Categories, Aristotle introduces substances as those things that are subjects of predication and inherence and are neither predicated of nor inherent in anything else. Since he characterises predication and inherence as dependence relations, many readers have inferred that substances are to be distinguished by their independence. However, philosophers who are hesitant about relying on independence criteria often focus on the initial claim that substances are subjects of predication and inherence that are not predicated of, nor inherent in, other things; or, as it is often put, substances are property bearers that are not themselves properties (see, for example, Heil 2012, 12–17).

One difficulty for the subjecthood or property-bearer criterion for being a substance is that it is vulnerable to the objection that the distinctions we draw between properties and subjects of properties are arbitrary. For example, instead of saying that there is an elephant in the room, we might say that the room is elephant-ish. If we do so, it will no longer be true that elephants are subjects of predication that are not predicated of other things. A proponent of the independence criterion is in a position to assert that our ordinary linguistic practices reflect a deeper metaphysical fact: the reason why we do not say that the room is elephant-ish is that the elephant does not depend for its existence on the room in the way that properties depend on their bearers. Those who rely on the subjecthood criterion by itself cannot reply in this way.

Since Leibniz, many philosophers have proposed that substances are distinguished, either partly or solely, by their high degree of unity. In its extreme form, the criterion of unity says that substances must be simples in the sense that they have no detachable parts. Heil (2012, 21) argues that the simplicity criterion follows from the assumption that substances are property-bearers. For according to Heil, no composite can genuinely bear a property. Schaffer (2010) argues for the simplicity of substances on the basis of parsimony. He proposes that duplicating all the simple entities and their relations to one another would be sufficient to duplicate the entire cosmos, and that if this is so, then there is no good reason to posit further entities beyond the simple entities. Schaffer also argues that the fundamental entities that we posit should be “freely recombinable”, in the sense that the intrinsic properties of one such entity do not constrain the intrinsic properties of another, and that this will only be so if the fundamental entities are simples.

It is widely agreed that even if substances need not be simples, they must nonetheless satisfy some criterion of unity that prevents mere groups or aggregates from counting as substances. (Schneider (2006) and Weir (2021) are liberal about counting aggregates as substances, however.) For example, Kathrin Koslicki (2018) defends a neo-Aristotelian view that, rather than employing an independence criterion as many Aristotelians do, accords to hylomorphic compounds the status of being substances on the basis of their exhibiting a special kind of unity. On Koslicki’s account, a structured whole is unified to the extent that its parts interact in such a way as to allow it to manifest team-work-requiring capacities, as when the eye’s interaction with the brain and other parts of an organism gives the organism a capacity for visual perception.

b. The Structure of Substances

A second theme that has regained prominence in the twentieth century and the first two decades of the twenty-first century concerns the structure of substances. Increasing attention has been given to the question of whether substances should be regarded as comprising two components: properties and a substratum. At the same time, many philosophers have revived elements of Aristotle’s analysis of material substances into form and matter. (Hylomorphism can be thought of as one particularly important version of the analysis into properties and substratum, or as a distinct but somewhat similar position.)

As noted in section 5.d, Locke (perhaps inadvertently) popularised the idea that the word “substance” refers to a propertyless substratum and that we should be sceptical about the coherence or use of the idea of substances so understood. This idea persisted into the twentieth century in the works of thinkers such as Bertrand Russell (1945, 211) and J. L. Mackie (1976, 77) and is partly responsible for a widespread hostility to substances in this period. Justin Broackes (2006) reviews this development and attempts to rescue the traditional idea of substance from its association with a propertyless substratum.

At the same time, a number of thinkers have come to the defence of the idea that substances can be analysed into properties and substratum. As a result, by the dawn of the twenty-first century, it has become commonplace to speak of two main views about the structure of substances: the bundle view and the substratum view. (As explained in section 5, the bundle view here is simply the view that a substance consists of properties with no substratum. It need not entail the more extreme claim that the properties of a substance can exist separately.)

A prominent argument for the substratum view says that something resembling a propertyless substratum is needed to contribute particularity to a substance. On the standard version of this view, universal properties must be instantiated in a bare particular. An early defence of bare particulars is advanced by Gustav Bergmann (1947); the view is then developed in works by, for example, David Armstrong (1978, 1997) and Theodore Sider (2006), and criticised by Andrew Bailey (2012). In this context, Armstrong draws a contrast between what he terms a thick particular which is “a thing taken along with all its properties” and a thin particular which is “a thing taken in abstraction from all its properties”. These correspond to the traditional idea of a substance and to that of a substratum of the bare-particular variety, respectively.

Although the idea of a bare particular can be seen as a version of Locke’s idea of a propertyless substratum, bare particulars are not typically introduced to play the role that Locke assigns to substrata—that of supporting properties. Rather, for Bergmann and others, the principal role of the bare particular is to account for the particularity of a substance whose other components, its properties, are all universals (things that can exist in multiple places at once). In this respect, the bare particular resembles the Vaisheshika vishesha and the Scotist haecceity.

A different line of argument for positing a substratum, advanced by C. B. Martin (1980), says that without a substratum to bind them together, we should expect the properties of an object to be capable of existing separately, like its parts, something that most philosophers believe that properties cannot do. Unlike the emphasis on the role of particularising, this line of argument may have some attraction for those who hold that properties are particulars rather than universals. One objection to Martin’s argument says that the properties in a bundle might depend on one another without depending on some further substratum (Denkel 1992).

While much of the discussion concerning the structure of substances has focused on the choice between the bundle view and the substratum view, some philosophers have also shown a revival of interest in Aristotle’s analysis of material substance into form and matter, including the prominent role he gives to substantial forms in determining the kinds to which substances belong.

The latter idea is given new life in Peter Geach (1962) and David Wiggins’ (2001) defence of the sortal-dependence of identity. A sortal is a term or concept that classifies an entity as belonging to a certain kind and that hence provides, like Aristotle’s substantial forms, an answer to the question “what is x?”. The claim that identity is sortal-dependent amounts to the claim that if some entity a at an earlier time is identical to an entity b at a later time, then there must be some sortal F such that a and b are the same F—the same elephant for example, or the same molecule. As a result, the conditions under which a and b count as identical will depend on what sortal F is: the criteria for being the same elephant have to do with the kind of things elephants are; the criteria for being the same molecule have to do with the kind of things molecules are. Geach goes further than Wiggins in arguing that identity is not just sortal-dependent but also sortal-relative, so that a might be the same F as b but not the same G as b. Wiggins argues that the sortal-relativity of identity must be rejected, given Leibniz’s law of the indiscernibility of identicals.
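The contrast between the two claims can be put schematically as follows. This is an illustrative formalization only, not notation used by Geach or Wiggins themselves; here “=_F” abbreviates “is the same F as”:

```latex
% Sortal-dependence (Wiggins 2001): any identity holds under some sortal F.
a = b \;\rightarrow\; \exists F \, (a =_{F} b)

% Sortal-relativity (Geach 1962): identity under one sortal need not
% transfer to another; a may be the same F as b but not the same G.
a =_{F} b \;\wedge\; \neg\,(a =_{G} b)

% Wiggins rejects sortal-relativity by appeal to Leibniz's law,
% the indiscernibility of identicals:
a = b \;\rightarrow\; \forall \varphi \, (\varphi(a) \leftrightarrow \varphi(b))
```

On this rendering, Wiggins’ objection is that if a really is the same F as b, then by Leibniz’s law whatever holds of a holds of b, leaving no room for a to fail to be the same G as b.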

The claim that identity is sortal-dependent implies that there is a degree of objectivity to the kinds under which we sort entities. It contrasts with the Lockean claims that the kinds that we employ are arbitrary and, as Leszek Kołakowski expresses it, that:

Nothing prevents us from dissecting surrounding material into fragments constructed in a manner completely different from what we are used to. Thus, speaking more simply, we could build a world where there would be no such objects as “horse”, “leaf”, “star”, and others allegedly devised by nature. Instead, there might be, for example, such objects as “half a horse and a piece of river”, “my ear and the moon”, and other similar products of a surrealist imagination. (1968, 47–8)

Insofar as Geach and Wiggins’ sortals play the role of Aristotle’s substantial forms, their claims about sortal-dependence can be seen as reviving elements of Aristotle’s hylomorphism in spirit if not in letter. Numerous works go further, in explicitly defending the analysis of material substances into matter and form. Examples include Johnston (2006), Jaworski (2011, 2012), Rea (2011), Koslicki (2018) and many others.

Early-twenty-first-century hylomorphists vary widely on the nature they attribute to forms, especially with respect to whether forms should be regarded as universals or particulars. Most, however, regard the form as the source of an object’s structure, unity, activity, and the kind to which it belongs. Motivations for reviving hylomorphic structure include its (putative) ability to differentiate between those composites that really exist and those that are mere aggregates, to account for change, and to make sense of the relationship between properties, their bearers, and resemblances between numerically distinct bearers (see, for example, Koslicki 2018, § 1.5). For these hylomorphists, as for Aristotle, the matter that the form organises need not be in itself propertyless, and thus, although hylomorphism can be viewed as one version of the substratum theory of substances, it can avoid the objection that the idea of an entity that is in itself propertyless is incoherent.

Critics of this sort of hylomorphism, such as Howard Robinson (2021), have questioned whether it can do this work while remaining consistent with the thesis that all events can be accounted for by physical forces (that is, the completeness of physics thesis). Robinson argues that if physics is complete, then forms cannot play any explanatory role.

c. Substance and the Mind-Body Problem

Philosophical work on the idea of substance typically arises as part of the project of describing reality in general. Yet, a more specific source of interest in substances has arisen in the context of philosophy of mind, where the distinction between substances and properties is used to distinguish between two kinds of dualism: substance dualism and property dualism.

The terms “substance dualism” and “property dualism” were hardly used before the 1970s (Michel et al. 2011). They appear to have gained prominence as a result of a desire among philosophers arguing for a revival of mind-body dualism to distinguish their position from the traditional forms of dualism endorsed by philosophers such as Plato and Descartes. Traditional dualists affirm that the mind is a nonphysical substance, something object-like that can exist separately from the body. By contrast, many twentieth-century and early-twenty-first-century proponents of dualism, beginning with Frank Jackson (1982), limit themselves to the claim that the mind involves nonphysical properties.

One advantage of positing nonphysical properties only is that this has allowed proponents of property dualism to represent their position as one that departs only slightly from popular physicalist theories and to distance themselves from the unfashionable idea that a person exists or might exist as a disembodied mind. At the same time, however, several philosophers have questioned whether it makes sense to posit nonphysical properties only, without nonphysical substances (for example, Searle 2002; Zimmerman 2010; Schneider 2012; Weir 2023). Several works, such as those collected in Loose et al. (2018), argue that substance dualism may have advantages over property dualism.

These discussions are complicated by the fact that at the beginning of the third decade of the twenty-first century, there still exists no consensus on how to define the notion of substance, and on what the distinction between substances and properties consists in. Hence, it is not always obvious what property-dualists take themselves to reject when they eschew nonphysical substances.

7. References and Further Reading

  • Aristotle. Categories and De Interpretatione. Edited and translated by J. L. Ackrill (1963). Oxford: Clarendon.
    • Contains Aristotle’s classic introduction of the concept of substance.
  • Aristotle. Aristotle’s Metaphysics. Edited and translated by W. D. Ross (1924). Oxford: Oxford University Press.
    • Develops and revises Aristotle’s account of the nature of substances.
  • Aristotle. Physics. Edited and translated by C. D. C. Reeve (2018). Indianapolis, IN: Hackett.
    • Explains change by analysing material substances into matter and form.
  • Armstrong, David. (1978). Universals and Scientific Realism. Cambridge: Cambridge University Press.
    • Contains a classic discussion of the bundle theory and the substratum theory.
  • Armstrong, David. (1997). A World of States of Affairs. Cambridge: Cambridge University Press.
    • Contains an influential discussion of thin (that is, bare) particulars.
  • Arnauld, Antoine and Pierre Nicole. (1662). Logic or the Art of Thinking. Edited and translated by J. V. Buroker (1996). Cambridge: Cambridge University Press.
    • A highly influential Cartesian substitute for Aristotle’s logical works, covering the concept of substance.
  • Aquinas, Thomas. De Ente et Essentia. Leonine Commission (Ed.), 1976. Rome: Vatican Polyglot Press.
    • Distinguishes essence from existence.
  • Aquinas, Thomas. Quaestiones Disputatae de Anima. Leonine Commission (Ed.), 1996. Rome: Vatican Polyglot Press.
    • Rejects universal hylomorphism.
  • Ayers, M. R. (1977). The Ideas of Power and Substance in Locke’s Philosophy (revised edition of a 1975 paper). In I. C. Tipton (Ed.), Locke on Human Understanding (pp. 77–104). Oxford: Oxford University Press.
    • Defends an influential interpretation of Locke on substances.
  • Bailey, A. M. (2012). No Bare Particulars. Philosophical Studies, 158, 31–41.
    • Rejects bare particulars.
  • Barney, S. A., W. J. Lewis, J. A. Beach and Oliver Berghoff. (2006). Introduction. In S. A. Barney, W. J. Lewis, J. A. Beach and O. Berghoff (Eds. & Trans.), The Etymologies of Isidore of Seville (pp. 1–2). Cambridge: Cambridge University Press.
    • Introduces Isidore of Seville’s Etymologies.
  • Baxter, Donald. (2015). Hume on Substance: A Critique of Locke. In P. Lodge & T. Stoneham (Eds.), Locke and Leibniz on Substance (pp. 45–62). New York, NY: Routledge.
    • An exposition of Hume on substance.
  • Bennett, Jonathan. (1987). Substratum. History of Philosophy Quarterly, 4(2), 197–215.
    • Defends the traditional interpretation of Locke on substance.
  • Bergmann, Gustav. (1947). Russell on Particulars. The Philosophical Review, 56(1), 59–72.
    • Defends bare particulars against Russell.
  • Berkeley, George. The Works of George Berkeley, Bishop of Cloyne. A. A. Luce and T. E. Jessop (Eds.), 1948–1957. London: Thomas Nelson and Sons.
  • Bhikku Bodhi (Trans.). (2000). The Connected Discourses of the Buddha: A New Translation of the Samyutta Nikaya. Somerville, MA: Wisdom Publications.
    • Contains the version of the chariot argument against substance attributed to the ancient Buddhist nun, Vajira.
  • Broackes, Justin. (2006). Substance. Proceedings of the Aristotelian Society, 106, 133–68.
    • Traces the historical confusion between substance and substratum and defends the former concept.
  • Descartes, René. The Philosophical Writings of Descartes (3 vols.). Edited and translated by J. Cottingham, R. Stoothoff, D. Murdoch, and A. Kenny (1984–1991). Cambridge: Cambridge University Press.
    • Contains Descartes’ influential claims about substance, including his independence definition.
  • Descartes, René. Conversation with Burman. Translated by J. Bennett (2017).  https://earlymoderntexts.com/assets/pdfs/descartes1648.pdf
    • Contains Descartes’ identification of the substance with its attributes.
  • Denkel, Arda. (1992). Substance Without Substratum. Philosophy and Phenomenological Research, 52(3), 705–711.
    • Argues that we can retain the concept of substance while rejecting that of a substratum.
  • Druart, Thérèse-Anne. (1987). Substance in Arabic Philosophy: Al-Farabi’s Discussion. Proceedings of the American Catholic Philosophical Association, 61, 88–97.
    • An exposition of al-Farabi on substance.
  • Fine, Kit. (1994). Essence and Modality. Philosophical Perspectives, 8, 1–16.
    • Defends the idea of non-modal essences.
  • Fine, Kit. (1995). Ontological Dependence. Proceedings of the Aristotelian Society, 95, 269–90.
    • Defends the idea of essential dependence.
  • Forrai, Gabor. (2010). Locke on Substance in General. Locke Studies, 10, 27–59.
    • Attempts to synthesise Bennett’s traditional and Ayers’ novel interpretations of Locke on substance.
  • Geach, Peter. (1962). Reference and Generality. Ithaca: Cornell University Press.
    • Defends the sortal-dependence and sortal-relativity of identity.
  • Gopnik, Alison. (2009). Could David Hume Have Known about Buddhism? Charles François Dolu, the Royal College of La Flèche, and the Global Jesuit Intellectual Network. Hume Studies, 35(1-2), 5–28.
    • Argues that Hume’s criticism of the idea of a substantial self may have been influenced by Buddhist philosophy.
  • Gorman, Michael. (2006). Independence and Substance. International Philosophical Quarterly, 46, 147–159.
    • Defends a definition of substances as things that do not inhere in anything.
  • Halbfass, Wilhelm. (1992). On Being and What There Is: Classical Vaisesika and the History of Indian Ontology. New York: SUNY Press.
    • Contains a very useful introduction to the concept of substance in classical Indian philosophy.
  • Hoffman, Joshua and Gary Rosenkrantz. (1996). Substance: Its Nature and Existence. London: Routledge.
    • A sustained examination and defence of a novel characterisation of substance.
  • Hume, David. A Treatise of Human Nature. Edited by D. F. Norton and M. J. Norton (2007).  Oxford: Clarendon Press.
    • Contains Hume’s influential objections to the idea of substance.
  • Isidore of Seville. Etymologies. Edited and translated by S. A. Barney, W. J. Lewis, J. A. Beach and O. Berghoff (2006). Cambridge: Cambridge University Press.
    • Played an important role in transmitting Aristotle’s characterisation of substance to medieval philosophers in the Latin West.
  • Jackson, Frank. (1982). Epiphenomenal Qualia. Philosophical Quarterly, 32(127), 127–36.
    • A classic defence of property-dualism.
  • Kaipayil, Joseph. (2008). An Essay on Ontology. Kochi: Karunikan.
    • Contains a discussion of the idea of substance in both Western and Indian philosophy.
  • Kant, Immanuel. (1787). Critique of Pure Reason. Edited and translated by N. K. Smith (2nd ed., 2007). Basingstoke: Palgrave Macmillan.
    • Contains Kant’s approach to the idea of substance and his comments on Aristotle’s Categories.
  • Kołakowski, Leszek. (1968). Towards a Marxist Humanism. New York: Grove Press.
    • Claims, contra Geach and Wiggins, that the kinds we divide the world into are arbitrary.
  • Koslicki, Kathrin. (2018). Form, Matter and Substance. Oxford: Oxford University Press.
    • Defends a unity criterion that attributes substancehood to hylomorphic compounds.
  • Leibniz, G. W. Critical Thoughts on the General Part of the Principles of Descartes. In L. Loemker (Ed. & Trans.), Gottfried Leibniz: Philosophical Papers and Letters (2nd ed., 1989). Alphen aan den Rijn: Kluwer.
    • Contains a criticism of Descartes’ independence definition of substance.
  • Leibniz, G. W. Discourse on Metaphysics. Edited and translated by G. Rodriguez-Pereyra (2020). Oxford: Oxford University Press.
    • Presents Leibniz’s idiosyncratic conception of substance.
  • Locke, John. An Essay Concerning Human Understanding. Edited by P. H. Nidditch (1975). Oxford: Oxford University Press.
    • Contains Locke’s critical discussion of substance and substratum.
  • Loose, Jonathan, Angus Menuge, and J. P. Moreland (Eds.). (2018). The Blackwell Companion to Substance Dualism. Oxford: Blackwell.
    • Collects works bearing on substance dualism.
  • Lowe, E. J. (1998). The Possibility of Metaphysics: Substance, Identity and Time. Oxford: Clarendon Press.
    • Discusses substance and defends Lowe’s identity-independence criterion.
  • Lowe, E. J. (2005). The Four-Category Ontology: A Metaphysical Foundation for Natural Science. Oxford: Clarendon Press.
    • Further develops Lowe’s account of substance.
  • Martin, C. B. (1980). Substance Substantiated. Australasian Journal of Philosophy, 58(1), 3–10.
    • Argues that we should posit a substratum to explain why the properties of a substance cannot exist separately.
  • McEvilley, Thomas. (2002). The Shape of Ancient Thought: Comparative Studies in Greek and Indian Philosophies. London: Simon & Schuster.
    • Compares ancient Greek and classical Indian philosophy on many issues including the nature of substances.
  • Messina, James. (2021). The Content of Kant’s Pure Category of Substance and its Use on Phenomena and Noumena. Philosophers’ Imprint, 21(29), 1–22.
    • An exposition of Kant on substance.
  • Michel, Jean-Baptiste, et al. (2011). Quantitative Analysis of Culture Using Millions of Digitized Books. Science, 331(6014), 176–182.
    • Records development of Google’s Ngram which provides data on the appearance of the terms “substance dualism” and “property dualism”.
  • Moise, Ionut and G. U. Thite. (2022). Vaiśeṣikasūtra: A Translation. London: Routledge.
    • The founding text of the Vaisheshika school.
  • Neale, Matthew. (2014). Madhyamaka and Pyrrhonism: Doctrinal, Linguistic and Historical Parallels and Interactions Between Madhyamaka Buddhism and Hellenic Pyrrhonism. Ph.D. Thesis, University of Oxford.
    • Discusses the relationship between Madhyamaka and Pyrrhonism.
  • O’Conaill, Donnchadh. (2022). Substance. Cambridge: Cambridge University Press.
    • A detailed overview of philosophical work on substance.
  • Plato. Sophist. Edited and translated by N. White (1993). Indianapolis, IN: Hackett.
    • Contains Plato’s distinction between things that exist in themselves and those that exist in relation to something else.
  • Priest, Stephen. (2007). The British Empiricists (2nd ed.). London: Routledge.
    • An exposition of the ideas of the British Empiricists on topics including that of substance.
  • Rea, Michael. (2011). Hylomorphism Reconditioned. Philosophical Perspectives, 25(1), 341–58.
    • Defends a version of hylomorphism.
  • Robinson, Howard. (2021). Aristotelian Dualism, Good; Aristotelian Hylomorphism, Bad. In P. Gregoric and J. L. Fink (Eds.), Encounters with Aristotelian Philosophy of Mind (pp. 283–306). London: Routledge.
    • Criticises hylomorphism.
  • Russell, Bertrand. (1945). History of Western Philosophy. London: George Allen and Unwin.
    • Rejects the idea of substances understood as substrata.
  • Schneider, Benjamin. (2006). A Certain Kind of Trinity: Dependence, Substance, Explanation. Philosophical Studies, 129, 393–419.
    • Defends a conceptual-independence criterion for substancehood.
  • Schneider, Susan. (2012). Why Property Dualists Must Reject Substance Physicalism. Philosophical Studies, 157, 61–76.
    • Argues that mind-body dualists must be substance dualists.
  • Scotus, John Duns. Opera Omnia. Edited by C. Balic et al. (1950–2013). Rome: Vatican Polyglot Press.
    • Contains Scotus’s influential discussions of substance.
  • Searle, John. (2002). Why I am Not a Property Dualist. Journal of Consciousness Studies, 9(12), 57–64.
    • Argues that mind-body dualists must be substance dualists.
  • Sider, Ted. (2006). Bare Particulars. Philosophical Perspectives, 20, 387–97.
    • Defends substrata understood as bare particulars.
  • Solomon ibn Gabirol. The Fount of Life (Fons Vitae). Translated by J. Laumakis (2014). Milwaukee, WI: Marquette University Press.
    • Presents Avicebron’s (Solomon ibn Gabirol’s) universal hylomorphism.
  • Spade, P. V. (2008). Binarium Famosissimum. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Fall 2008 Edition). <https://plato.stanford.edu/archives/fall2008/entries/binarium/>
    • Discusses the medieval case for universal hylomorphism.
  • Spinoza, Baruch. Principles of Cartesian Philosophy. Edited and translated by H. E. Wedeck (2014). New York: Open Road Integrated Media.
    • Contains Spinoza’s presentation of Descartes’ account of substance.
  • Spinoza, Baruch. Ethics: Proved in Geometrical Order. Edited by M. J. Kisner and translated by M. Silverthorne and M. J. Kisner (2018). Cambridge: Cambridge University Press.
    • Contains Spinoza’s account of substance and argument for substance monism.
  • Strawson, P. F. (1997). Kant on Substance. In P. F. Strawson, Entity and Identity and Other Essays (pp. 268–79). Oxford: Oxford University Press.
    • An exposition of Kant on substance.
  • Weir, R. S. (2021). Bring Back Substances!. The Review of Metaphysics, 75(2), 265–308.
    • Defends the idea of substances as things that can exist by themselves.
  • Weir, R. S. (2023). The Mind-Body Problem and Metaphysics: An Argument from Consciousness to Mental Substance. London: Routledge.
    • Argues that those who posit nonphysical properties to solve the mind-body problem must also posit nonphysical substances.
  • Westerhoff, Jan. (2009). Nagarjuna’s Madhyamaka: A Philosophical Introduction. Oxford: Oxford University Press.
    • An introduction to Nagarjuna’s philosophy.
  • Wiggins, David. (2001). Sameness and Substance Renewed. Cambridge: Cambridge University Press.
    • Defends the sortal-dependence of identity, but rejects the sortal-relativity of identity.
  • Zimmerman, Dean. (2010). From Property Dualism to Substance Dualism. Aristotelian Society Supplementary Volume, 84(1), 119–150.
    • Argues that mind-body dualists must be substance dualists.

 

Author Information

Ralph Weir
Email: rweir@lincoln.ac.uk
University of Lincoln
United Kingdom

Arthur Schopenhauer: Logic and Dialectic

For Arthur Schopenhauer (1788-1860), logic as a discipline belongs to the human faculty of reason, more precisely to the faculty of language. This discipline of logic breaks down into two areas. Logic or analytics is one side of the coin; dialectic or the art of persuasion is the other. The former investigates rule-oriented and monological language; the latter investigates result-oriented and persuasive language.

Analytics or logic, in the proper sense, is a science that emerged from the self-observation of reason and the abstraction of all content. It deals with formal truth and investigates rule-governed thinking. Two features make Schopenhauer’s logic distinctive: its reference to intuition, which leads him to use numerous geometric forms that are understood today as logic diagrams; and his aim of achieving the highest possible degree of naturalness, so that logic resembles mathematical proofs and, especially, everyday thinking.

From both his logic and his dialectic it follows that Schopenhauer did not actively work to develop a logical calculus: axiomatisation contradicts natural thinking, and it also contradicts mathematics, since in his view the foundations of mathematics should rest upon intuition rather than upon the rigor that algebraic characters are supposed to possess. However, the visualization of logic through diagrams and of geometry through figures is not intended to be empirical; rather, it concerns the imaginability of logical or mathematical forms. Schopenhauer is guided primarily by Aristotle with regard to naturalness, by Euler with regard to intuition, and by Kant with regard to the structure of logic.

Schopenhauer called dialectic ‘eristics’, the ‘art of persuasion’, and the ‘art of being right’. It has a practical dimension. Dialectic examines the forms of dialogue, especially arguments, in which speakers frequently violate logical and ethical rules in order to achieve their argumentative goals. In pursuing this, Schopenhauer starts from the premise that reason is neutral and can, therefore, be used as a basis for valid reasoning, although it can also be misused. In cases of misuse, speakers instrumentalize reason in order to appear right and prevail against one or more opponents. Even if some texts on dialectic contain normative formulations, Schopenhauer’s goal is not to motivate invalid reasoning, but to protect against it. As such, scientific dialectic is not an ironic or sarcastic discipline, but a protective tool in the service of Enlightenment philosophy.

Schopenhauer’s dialectic is far better known than his analytics, although in direct comparison it makes up the smaller part of his writings on logic in general. For this reason, and because most texts on dialectic build on analytics, the following article is not structured around the two sub-disciplines, but around Schopenhauer’s very different texts on logic in general. First, logic is positioned as a discipline within the philosophical system. Then, the Berlin Lectures, his main text on analytics and dialectic, is introduced and followed, in chronological order, by his shorter texts on analytics and dialectic. The final section outlines research topics.

Table of Contents

  1. Logic and System
    1. Schopenhauer’s Philosophical System
    2. Normativism or Descriptivism
    3. Logic within the System
    4. Schopenhauer’s Treatises on Logic and Dialectics
  2. Schopenhauer’s Logica Maior (the Berlin Lectures)
    1. Doctrine of Concepts and Philosophy of Language
      1. Translation, Use-Theory, and Contextuality
      2. Abstraction, Concretion, and Graphs
    2. Doctrine of Judgments
      1. Relational Diagrams
      2. Stoic Logic and Oppositional Geometry
      3. Conversion and Metalogic
      4. Analytic-Synthetic Distinction and the Metaphor of the Concept
    3. Doctrine of Inferences
      1. Foundations of Logic
      2. Logical Aristotelianism and Stoicism
      3. Naturalness in Logic
    4. Further Topics of Analytic
      1. Schopenhauer’s History of Logic
      2. Logic and Mathematics
      3. Hermeneutics
    5. Dialectic or Art of Persuasion
  3. Schopenhauer’s Logica Minor
    1. Fourfold Root
    2. World as Will and Representation I (Chapter 9)
    3. Eristic Dialectics
    4. World as Will and Representation II (Chapters 9 and 10)
    5. Parerga and Paralipomena II
  4. Research Topics
  5. References and Further Readings
    1. Schopenhauer’s Works
    2. Other Works

1. Logic and System

a. Schopenhauer’s Philosophical System

Schopenhauer’s main work is The World as Will and Representation (W I). This work represents the foundation and overview of his entire philosophical system (and also includes a minor treatise on logic). It was first published in 1819 and was accepted as a habilitation thesis at the University of Berlin shortly thereafter. W I was also the basis for the revised and elaborated version—the Berlin Lectures (BL), written in the early 1820s. It also appeared in a slightly revised version in a second and third edition (1844, 1859) accompanied by a second volume (W II) that functioned as a supplement or commentary. However, none of these later editions were as rich in content as the revision in the BL. All other writings—On the Fourfold Root of the Principle of Sufficient Reason (1813 as a dissertation, 1847), On the Will in Nature (1836, 1854), The Two Fundamental Problems of Ethics (1841, 1860), and Parerga and Paralipomena (1851)—can also be regarded as supplements to the W I or the BL.

Schopenhauer’s claim, made in W I (and also the BL), follows (early) modern and especially Kantian system criteria. He claimed that philosophy aims to depict, in one single system, the interrelationships between all the components that need to be examined. Following Kant, a good or perfect system is determined by whether it can describe all components of nature and mind without leaving any gaps, that is, whether all categories, principles, and topics have been listed. This claim to completeness becomes clear in Schopenhauer’s system, more precisely in W I or BL, each of which is divided into four books. The first book deals mainly with those topics that would, in contemporary philosophy, be assigned to epistemology, philosophy of mind, philosophy of science, and philosophy of language. The second book is usually understood as covering metaphysics and the philosophy of nature. The third book presents his aesthetics, and the fourth book his practical philosophy, including topics such as ethics, theory of action, philosophy of law, political philosophy, social philosophy, philosophy of religion, and so forth.

b. Normativism or Descriptivism

Schopenhauer’s system, as described above (Sect. 1.a), has not been uniformly interpreted in its 200-year history of reception, a factor that has also played a significant role in the reception of his logic. The differences between the individual schools of interpretation have become increasingly obvious since the 1990s and are a significant subject of discussion in research (Schubbe and Lemanski 2019). Generally speaking, one can differentiate between two extreme schools of interpretation (although not every contemporary study on Schopenhauer can be explicitly and unambiguously assigned to one of the following positions):

  1. Normativists understand Schopenhauer’s system as the expression of one single thought that is marked by irrationality, pessimism, obscurantism, and denial of life. The starting point of Schopenhauer’s system is Kant’s epistemology, which provides the foundation for traversing the various subject areas of the system (metaphysics, aesthetics, ethics). However, all topics presented in the system are only introductions (“Vorschulen”) to the philosophy of religion, which Schopenhauer proclaims is the goal of his philosophy, that is, salvation through knowledge (“Erlösung durch Erkenntnis”). Normativists are above all influenced by various philosophical schools or periods of philosophy such as late idealism (Spätidealismus), the pessimism controversy, Lebensphilosophie, and Existentialism.
  2. Descriptivists understand Schopenhauer’s philosophy as a logically ordered representation of all components of the world in one single system, without one side being valued more than the other. Depending on the subject, Schopenhauer’s view alternates between rationalism and irrationalism, between optimism and pessimism, between affirmation and denial of life, and so forth. Thus, there is no intended priority for a particular component of the system (although, particularly in later years, Schopenhauer’s statements became more and more emphatic). This school is particularly influenced by those researchers who have studied Schopenhauer’s intense relationship with empiricism, logic, hermeneutics, and neo-Kantianism.

c. Logic within the System

The structure of logic is determined by three sub-disciplines: the doctrines of concepts, judgments, and inferences. The main focus of Schopenhauerian logic, however, is not the doctrine of inferences in the sense of logical reasoning and proving; rather, his treatment of inference corresponds with his philosophy of mathematics. According to Schopenhauer, logical reasoning in particular is overrated, as people rarely put forward invalid inferences, though they often put forward false judgments. The intentional use of fallacies is an exception to this and is therefore studied by dialectics.

The evaluation of Schopenhauer’s logic depends strongly on the school of interpretation. Normativists have either ignored Schopenhauer’s logic or identified it with (eristic) dialectic, which in turn has been reduced to a normative “Art of Being Right” or “of Winning an Argument” (see below, Sect. 2.e, 3.c). No relevant contribution to Schopenhauer’s analytics from the school of normativists is therefore known, although there were certainly intriguing approaches to dialectics. As normativism was the more dominant school of interpretation until late in the 20th century, it shaped the public image of Schopenhauer as an enemy of logical and mathematical reasoning.

Descriptivists emphasize logic as both the medium of the system and the subject of a particular topic within the W I-BL system. The first book of W I-BL deals with representation and is divided into two sections (Janaway 2014): 1. Cognition (W I §§3–7, BL chap. 1, 2), 2. Reason (W I §§8–16, BL 3–5). Cognition refers to the intuitive and concrete, reason to the discursive and abstract representation. In the paragraphs on cognition, Schopenhauer examines the intuitive representation and its conditions, that is, space, time, and causality, while reason is built on cognition and is, therefore, the ‘representation of representation’. Schopenhauer examines three faculties of reason, which form the three sections of these paragraphs: 1. language, 2. knowledge, and 3. practical reason. Language, in turn, is then broken down into three parts: general philosophy of language, logic, and dialectics. (Schopenhauer defines rhetoric as, primarily, the speech of one person to many, and he rarely dealt with it in any substantial detail.) Following the traditional structure, Schopenhauer then divides logic into sections on concepts, judgments, and inferences. Logic thus fulfills a double role in Schopenhauer’s system: it is a topic within the entire system and it is the focus through which the system is organized and communicated. Fig. 1 shows this classification using W I as an example.
Figure 1: The first part of Schopenhauer’s system focusing on logic

However, this privileged role of logic only becomes obvious when Schopenhauer presents the aim of his philosophy. The task of his system is “a complete recapitulation, a reflection, as it were, of the world, in abstract concepts”, whereby the discursive system becomes a finite “collection [Summe] of very universal judgments” (W I, 109, BL, 551). Since, in Schopenhauer’s system, logic alone clarifies what concepts and judgments are, it is a very important component for understanding his entire philosophy. Schopenhauer, however, vehemently resists an axiomatic approach because in logic, mathematics, and, above all, philosophy, nothing can be assumed as certain; rather, every judgment may represent a problem. Philosophy itself must be allowed to be skeptical even about tautologies or laws (such as the laws of thought). This distinguishes it from other sciences. Logic and language cannot, therefore, be the foundation of science and philosophy, but are instead their means and instrument (see below, Sect. 2.c.i).

Through this understanding of the role of logic within the system, the difference between the two schools of interpretation can now also be determined: Normativists deny the privileged role attributed to logic as they regard the linguistic-logical representation as a mere introduction (“Vorschule”) to philosophical salvation at the end of the fourth book of W I or BL. This state of salvation is no longer to be described using concepts and judgments. In contrast, descriptivists stress that Schopenhauer’s entire system aims to describe the world and man’s attitude to the world with the help of logic and language. This also applies to the philosophy of religion and the treatises on salvation at the end of W I and BL. As emphasized by Wittgensteinians in particular, Schopenhauer also shows, ultimately, what can still be logically expressed and what can no longer be grasped by language (Glock 1999, 439ff.).

d. Schopenhauer’s Treatises on Logic and Dialectics

Schopenhauer’s whole oeuvre is thought to contain a total of six longer texts on logic of his own. Together with his notes on Schulze’s lectures, this yields, in chronological order, the following seven texts:

(1) In the summer semester of 1811, Schopenhauer attended Gottlob Ernst (“Aenesidemus”) Schulze’s lectures on logic and wrote several notes on Schulze’s textbook (d’Alfonso 2018). As these comments do not represent work by Schopenhauer himself, they are not discussed in this article. The same applies to Schopenhauer’s comments on other books on logic, such as those of Johann Gebhard Ehrenreich Maass, Johann Christoph Hoffbauer, Ernst Platner, Johann Gottfried Kiesewetter, Salomon Maimon et al. (Heinemann 2020), as well as his shorter manuscript notes published in the Manuscript Remains. (Schopenhauer made several references to his manuscripts in BL.)

(2) Schopenhauer’s first discussion of logic occurred in his dissertation of 1813 which presented a purely discursive reflection on some components of logic (concepts, truth, and so on). In particular, his reflections on the laws of thought were emphasized.

(3) For the first time in 1819, in § 9 of W I, Schopenhauer distinguished between analytics and dialectics in more detail. In the section on analytics, he specified a doctrine of concepts with the help of a few logic diagrams. However, he wrote in § 9 that this doctrine had already been fairly well explained in several textbooks and that it was, therefore, not necessary to load the memory of the ‘normal reader’ with these rules. In the section on dialectic, he sketches a large argument map for the first time. § 9 was only lightly revised in later editions; however, his last notes in preparation for a fourth edition indicate that he had planned a few more interesting changes and additions.

(4) During the 1820s, Schopenhauer took the W I system as a basis, supplemented the missing information from his previously published writings, and developed a system that eliminated some of the shortcomings and ambiguities of W I. The system within these manuscripts then served as a source for his lectures in Berlin in the early 1820s, that is, the BL. In the first book of the BL, there is a treatise on logic the size of a textbook.

(5) Eristic Dialectics is the title of a longer manuscript that Schopenhauer worked on in the late 1820s and early 1830s. This manuscript is one of Schopenhauer’s best-known texts, although it is unfinished. It takes many earlier approaches further, but the connection to analytics (and to logic diagrams) is missing in this small fragment on dialectics. With the end of his university career in the early 1830s, Schopenhauer’s intensive engagement with logic came to an end.

(6) It was not until 1844, in W II, that Schopenhauer supplemented the doctrine of concepts given in W I with a 20-page doctrine of judgment and inference. This, however, is no longer compatible with the earlier logic treatises written before 1830, as Schopenhauer repeatedly suggests new diagrammatic logics, which he does not illustrate. Given these changes, the published texts on logic look inconsistent.

(7) In 1851, Schopenhauer once again published a short treatise entitled “Logic and Dialectics” in the second volume of Parerga and Paralipomena. This treatise, however, only deals with some topics from the philosophy of logic in aphoristic style and, otherwise, focuses more strongly on dialectic. Few new insights are found here.

Since the rediscovery of the Berlin Lectures by descriptivists, a distinction has been made—in the sense of scholastic subdivision—between Logica Maior (Great Logic) and Logica Minor (Small Logic): Treatises (2), (3), (4), (5) and (6) belong to the Logica Minor and are discussed briefly in Section 3. (For more information see Lemanski 2021b, chap. 1.) The only known treatise on logic written by Schopenhauer that deserves to be called a Logica Maior is a manuscript from the Berlin Lectures written in the 1820s. This book-length text is the most rewarding to read of all the texts mentioned and is, therefore, discussed in more detail in Section 2.

2. Schopenhauer’s Logica Maior (the Berlin Lectures)

Until the early 21st century, due to the dominance of the normativists in Schopenhauer scholarship, the BL were considered just a didactic version of W I and were, therefore, almost completely ignored by researchers until intensive research on Schopenhauer’s logic began in the mid-2010s. These lectures are not only interesting from a historical perspective; they also contain many innovations and topics that are still worth discussing today, especially in the area of diagrammatic reasoning and logic diagrams. As Albert Menne, former head of the working group ‘Mathematical Logic’ at the Ruhr-Universität in Bochum, stated: “Schopenhauer has an excellent command of the rules of formal logic (much better than Kant, for example). In the manuscript of his Berlin Lectures, syllogistics, in particular, is thoroughly analyzed and explained using striking examples” (Menne 2002, 201–2).

The BL are a revised and extended version of W I made for the students and guests who attended his lectures in Berlin. The belief that such an elaboration has only minor value is, however, not reasonable. The extent, the content, and also the above-mentioned distinction between the exoteric-popular-philosophical and the esoteric-academic parts of Schopenhauer’s work suggest a different evaluation. In W I, Schopenhauer deals only casually with difficult academic topics such as logic or the philosophy of law; at the beginning of the BL, however, he states that these are the most important topics to teach prospective academics. Indeed, the titles he announced for the Berlin Lectures repeatedly indicate this focus on logic: the lecture given in the winter semester of 1821–22 is titled “Dianologie und Logik” (BL, XII; Regehly 2018). This raises the suspicion that research has hitherto ignored Schopenhauer’s most important textual version of his philosophical system, as the Berlin Lectures contain his complete system, including parts missing from W I that are very important for the academic interpretation of the system, such as logic and the philosophy of law.

The first edition of the BL was published by Franz Mockrauer in 1913, reprinted by Volker Spierling in 1986, and a new edition was published in four volumes between 2017 and 2022 by Daniel Schubbe, Daniel Elon, and Judith Werntgen-Schmidt. An English translation is not available. The manuscript of the BL is deposited in the Staatsbibliothek zu Berlin Preussischer Kulturbesitz and can be viewed online at http://sammlungen.ub.uni-frankfurt.de/schopenhauer/content/titleinfo/7187127.

The Logica Maior is found in chapter III of the Berlin Lectures (book I). Here, Schopenhauer begins with (a) a treatise on the philosophy of language that announces essential elements of the subsequent theory of concepts. Then, (b) based on the diagrammatic representation of concepts, he develops a doctrine of judgment. (c) The majority of the work then deals with inferences, in which syllogistic, Stoic logic (propositional logic), modal logic, and the foundation and naturalness of logic are discussed. Together with (d) the appendix, these are the topics that belong to analytics or logic in the proper sense. (e) Finally, he addresses several topics related to dialectics.

a. Doctrine of Concepts and Philosophy of Language

This section mainly deals with BL, 234–260. Schopenhauer begins his discussion of logic with a treatise on language, which is foundational to the subsequent treatise. Several aspects of this part of the Logica Maior have been investigated and discussed to date—namely (i.) translation, use-theory, and contextuality as well as (ii.) abstraction, concretion, and graphs—which are outlined in the following subsections.

i. Translation, Use-Theory, and Contextuality

Schopenhauer distinguishes between words and concepts: he considers words to be signs for concepts, and concepts to be abstract representations that rest on other concepts or on concrete representations (of something, that is, intuition). In order to make this difference explicit, Schopenhauer reflects on translation, as learning a foreign language and translating are the only ways to rationally understand how individuals learn abstract representations and how concepts develop and change over many generations within a particular language community.

In his translation theory, Schopenhauer defines three possible situations:

(1) The concept of the source language corresponds exactly to the concept of the target language (1:1 relation).

(2) The concept of the source language does not correspond to any concept of the target language (1:0 relation).

(3) The concept of the source language corresponds only partially to one or more concepts of the target language (1:(n−x)/n relation, where n is a natural number and x < n).
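As a sketch, the three relations can be modeled by treating each word’s ‘sphere’ as a set of contexts of use. The classification function and the sample spheres below are illustrative assumptions, not Schopenhauer’s own formulations:

```python
# Hedged sketch: Schopenhauer's three translation relations, with each word's
# "sphere" modeled as the set of contexts of use it covers. The context labels
# and the sample words are illustrative assumptions.

def translation_relation(source_sphere, target_sphere):
    """Classify the relation between a source- and a target-language word."""
    if source_sphere == target_sphere:
        return "1:1"      # (1) the concepts correspond exactly
    if not source_sphere & target_sphere:
        return "1:0"      # (2) no corresponding concept in the target language
    return "partial"      # (3) the spheres overlap without coinciding

honestum = {"virtuous", "honorable", "decent", "seemly"}
ehrenvoll = {"honorable", "glorious"}    # partially overlaps honestum
quark = {"fresh-dairy-product"}          # assumed word with no Latin counterpart

assert translation_relation(honestum, honestum) == "1:1"
assert translation_relation(honestum, ehrenvoll) == "partial"
assert translation_relation(honestum, quark) == "1:0"
```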

For Schopenhauer, the last relation is the most interesting one: it occurs frequently, causes many difficulties in the process of translation or language learning, and is the relation with which one can understand how best to learn words or the meaning of words. Remarkably, Schopenhauer developed three theories, arguments, or topics regarding the 1:(n−x)/n relation that have become important in modern logic, linguistics, and analytical philosophy, namely (a) spatial logic diagrams, (b) the use-theory of meaning, and (c) the context principle. (a)–(c) are combined in a passage of text on the 1:(n−x)/n translation:

[T]ake the word honestum: its sphere is never hit concentrically by that of any German word, such as Tugendhaft, Ehrenvoll, anständig, ehrbar, geziemend [that is, virtuous, honorable, decent, respectable, seemly]. They do not all hit concentrically: but as shown below:


That is why one learns not the true value of the words of a foreign language with the help of a lexicon, but only ex usu [by using]: by reading in the case of old languages, and by speaking and staying in the country in the case of new languages. Namely, it is only from the various contexts in which the word is found that one abstracts its true meaning, finds the concept that the word designates. (BL, 245f.)

To what extent the penultimate sentence corresponds to what is called the ‘use theory of meaning’, the last sentence of the quote to the so-called ‘context principle’, and to what extent these sentences are consistent with the corresponding theories of 20th-century philosophy of language is highly controversial. Lemanski (2016; 2017, 2021b) and Dobrzański (2017; 2020) see similarities with the formulations of, for example, Gottlob Frege and Ludwig Wittgenstein. However, Schroeder (2012) and Schumann (2020) reject the idea of this similarity, and Weimer (1995; 2018) sees only a representationalist theory of language in Schopenhauer. Dümig (2020) contradicts a use theory and a context principle for quite different reasons, placing Schopenhauer closer to mentalism and cognitivism, while Koßler (2020) argues for the co-existence of various theories of language in Schopenhauer’s oeuvre.

ii. Abstraction, Concretion, and Graphs

With (b) and (c) Schopenhauer not only comes close to the modern philosophy of ordinary language, but he may also be the first philosopher in history to have used (a) logic diagrams to represent semantics or ontologies of concepts (independent of their function in judgments). In his philosophy of language, he also uses logic diagrams to sketch the processes of conceptual abstraction. Schopenhauer intends to describe processes of abstraction that are initially based on concrete representation, that is, the intuition of a concrete object, from which increasingly abstract concepts have formed over several generations within a linguistic community.

Figure 2 (SBB-IIIA, NL Schopenhauer, Fasz. 24, 112r = BL, 257)

For example, Fig. 2 shows the ‘spheres’ of the words ‘grün’ (‘green’), ‘Baum’ (‘tree’), and ‘blüthetragend’ (‘flower-bearing’) using three circles. By intersecting the spheres of the concepts to be analyzed, the diagram represents all combinations of subclasses, that is, the eight regions from G ∩ T ∩ F down to ¬G ∩ ¬T ∩ ¬F.

There is a recognizable relationship with Venn diagrams here, as Schopenhauer uses the combination of the so-called ‘three-circle diagram’, a primary diagram in Venn’s sense. Schopenhauer distinguishes between an objective and a conceptual abstraction, as the following example illustrates: (1) GTF denotes a concept created by objective abstraction from an object of intuitive representation, that is, a concretum. The object it was abstracted from belongs to the set of objects that are green, flower-bearing trees. All further steps of abstraction are conceptual abstractions or so-called ‘abstracta’. In the course of generations, language users have recognized that there are also (2) representations that can only be described with GF, but not with T, more precisely, G ∩ F ∩ ¬T

(for example, a common daisy). In the next step (3), the concept F was excluded so that the abstract representation of G was formed, that is, G ∩ ¬T ∩ ¬F

(for example, bryophytes). Finally, (4) a purely negative concept was formed, whose property is neither G nor T nor F, more specifically, ¬G ∩ ¬T ∩ ¬F.

This region lies outside the conceptual sphere and, therefore, does not designate an abstractum or a concept anymore: it is merely a word without a definite meaning, such as ‘the absolute’, ‘substance’, and so forth (compare Xhignesse 2020).
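Read extensionally, the four abstraction steps can be sketched in code, with each concept taken as the set of objects falling under it; the sample objects below are assumed for illustration:

```python
# Hedged sketch of the abstraction chain in Schopenhauer's three-circle
# diagram. The objects and their marks (G = green, T = tree, F = flower-
# bearing) are illustrative assumptions.

objects = {
    "cherry-tree": {"G", "T", "F"},   # concretum in region GTF
    "daisy":       {"G", "F"},        # green and flower-bearing, not a tree
    "moss":        {"G"},             # merely green (e.g. bryophytes)
    "the-absolute": set(),            # falls under none of G, T, F
}

def extension(*marks):
    """Objects possessing all the given marks (the concept's sphere)."""
    return {o for o, m in objects.items() if set(marks) <= m}

GTF = extension("G", "T", "F")
GF = extension("G", "F")
G = extension("G")

# Each abstraction step (dropping a mark) strictly enlarges the sphere:
assert GTF < GF < G
# Step (4) drops every mark: nothing is excluded anymore, so for
# Schopenhauer the result is a word without a definite meaning.
assert extension() == set(objects)
```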

Fig. 3: Interpretation of Fig. 2

In addition to the three-circle diagram (Fig. 2) and the eight classes, the interpretation in Fig. 3 includes a graph illustrating the four steps mentioned above: (1) corresponds to v1, (2) is the abstraction e1 from v1 to v2, (3) is the abstraction e2 from v2 to v3, and (4) is the abstraction e3 from v3 to v4. In this example, the graph can be interpreted as directed, with v1 as the source vertex and v4 as the sink vertex. However, Schopenhauer also uses these diagrams in the opposite direction, that is, not only for abstraction but also for concretion. In both directions, the vertices in the graph represent concepts, whereas the edges represent abstraction or concretion. On account of the concretion, Schopenhauer has also been associated with the reism, concretism, and reification of the Lwów-Warsaw School (compare Dobrzański 2017; Lemanski and Dobrzański 2020).

b. Doctrine of Judgments

This section mainly focuses on BL, 260–293. Even though Schopenhauer had already used logic diagrams in his doctrine of concepts (see above, Sect. 2.a), he explicitly introduced them in his doctrine of judgment, making reference to Euler and others. Nevertheless, in some cases Schopenhauer’s logic diagrams are fundamentally different from Euler diagrams, so in the following, the first subsection defines the expression (i) ‘Schopenhauer diagrams’ or ‘relational diagrams’. Then subsection (ii) outlines how Schopenhauer applies these diagrams to Stoic logic and how they relate to oppositional geometry. Finally, subsection (iii) discusses Schopenhauer’s theory of conversion, his use of the term metalogic, and subsection (iv) discusses his diagrammatic interpretation of the analytic-synthetic distinction.

i. Relational Diagrams

The essential feature of Schopenhauer’s Logica Maior is that, for the most part, it is based on a diagrammatic representation. Schopenhauer learned the function and application of logic diagrams, at the latest, in Gottlob Ernst Schulze’s lectures. This is known because, although Schulze did not publish any diagrams in his textbook, Schopenhauer drew Euler diagrams and made references to Leonhard Euler in his notes on Schulze’s lectures (d’Alfonso 2018). Thus, as early as 1819, Schopenhauer published a logic of concepts based on circle diagrams in W I, § 9 (see below, Sect. 3.b), which he then reworked in the Logica Maior of the Berlin Lectures (BL, 272 et seqq.).

‘Diagrammatic representation’ and ‘logic diagrams’ are modern expressions for what Schopenhauer called ‘visual representation’ or ‘schemata’. Schopenhauer’s basic insight is that the relations of concepts in judgments are analogous to the circular lines in Euclidean space. One, therefore, only has to go through all possible circular relations and examine them according to their analogy to concept relations in order to obtain the basic forms of judgment on which all further logic is built. With critical reference to Kant, Schopenhauer calls his diagrammatic doctrine of judgment a ‘guide of schemata’ (Leitfaden der Schemata). As the following diagrams are intended to represent the basic relations of all judgments, they can also be called ‘relation diagrams’ (RD) as per Fig. 4.

Fig. 4.1 (RD1)

All R is all C.
All C is all R.

 

Fig. 4.2 (RD2)

All B is A.
Some A is B.
Nothing that is not A is B.
If B then A.

 

Fig. 4.3 (RD3)

No A is S.
No S is A.
Everything that is S is not A.
Everything that is A is not S.

 

Fig. 4.4 (RD4)

All A is C.
All S is C.
Nothing that is not C is A.
Nothing that is not C is S.

 

Fig. 4.5 (RD 5)

Some R is F.
Some F is R.
Some R is not F.
Some F is not R.

 

Fig. 4.6 (RD6)

All B is either o or i.

All six RDs form the basis on which to build all logic, that is, both Aristotelian and Stoic logic. Schopenhauer states that geometric forms were first used by Euler, Johann Heinrich Lambert, and Gottfried Ploucquet to represent the four categorical propositions of Aristotelian syllogistics: All x are y (RD2), Some x are y (RD5), No x are y (RD3), and Some x are not y (RD5). These three diagrams, together with RD1, result in the relations that Joseph D. Gergonne described almost simultaneously in his famous treatise of 1817 (Moktefi 2020). RD4 may have been inspired by Kant and Karl Christian Friedrich Krause, although there are clear differences in interpretation here. RD6 (Fig. 4.6), however, is probably Schopenhauer’s own invention, even though there were many precursors to these RDs prior to and during the early modern period that Schopenhauer did not know about. On account of the various influences, it might be better to speak of ‘Schopenhauer diagrams’ or ‘relational diagrams’ rather than of ‘Euler diagrams’ or ‘Gergonne relations’, and so forth.
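The readings that the text attaches to individual RDs can be checked in set terms. In this sketch, the small sample extensions are assumptions chosen only to realize each diagram’s configuration:

```python
# Hedged sketch: set-theoretic verification of the statements listed under
# some relation diagrams. The numeric extensions are assumed for illustration.

A = {1, 2, 3, 4}; B = {1, 2}     # RD2: sphere of B lies wholly inside A
S = {9, 10}                      # RD3: sphere of S disjoint from A
R = {1, 2, 5}; F = {2, 5, 6}     # RD5: spheres of R and F partially overlap

# RD2 licenses several readings of one and the same diagram:
assert B <= A                        # All B is A.
assert A & B                         # Some A is B.
assert all(x in A for x in B)        # Nothing that is not A is B.

# RD3:
assert not (A & S)                   # No A is S, and no S is A.

# RD5 (partial overlap, one of Gergonne's relations):
assert R & F and (R - F) and (F - R) # Some R is F; some R is not F; some F is not R.
```

That one configuration satisfies several sentences at once is exactly the ambiguity, or ‘observational advantage’, discussed below.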

Schopenhauer shows how each RD can express more than one piece of information. This ambiguity can be evaluated in different ways. In contemporary formal approaches, the ambiguity of logic diagrams is often considered a deficiency. In contrast, Schopenhauer considers this ambiguity more an advantage than a deficiency, as a few circles in one diagram can represent a multitude of complex linguistic expressions. In this way, Schopenhauer can be seen as a precursor of contemporary theories about the so-called ‘observational advantages’ of diagrams. As meaning only arises through use and context (see above) and as axioms can never be the starting point of scientific knowledge (see above), the ambiguity of logic diagrams is no problem for Schopenhauer. For him, a formal system of logic is unnecessary: he wanted to analyze ordinary, natural language with the help of diagrams.

ii. Stoic Logic and Oppositional Geometry

Nowadays, it is known that the relation diagrams described above can be transformed, under the definition of an arbitrary Boolean algebra, into diagrams showing the relations contrariety, contradiction, subcontrariety, and subalternation. The best-known of these diagrams, which are now gathered under the heading of ‘oppositional geometry’, is the square of opposition. Although no square of opposition has yet been found in Schopenhauer’s manuscripts, he did associate some of his RDs with the above-mentioned relations and in doing so also referred to “illustrations” (BL, 280, 287) that are no longer preserved in the manuscripts.

Schopenhauer went beyond Aristotelian logic with RD2 and RD6 and also attempted to represent Stoic logic with them, which in turn can be understood as a precursor of propositional logic (BL, 278–286). RD2 expresses hypothetical judgments (if …, then …), RD6 disjunctive judgments (either … or …). In particular, researchers have studied the RD6 diagrams, also called ‘partition diagrams’, more intensively. For Schopenhauer, the RDs for Stoic logic are similar to syllogistic diagrams. However, quantification does not initially play a major role here, as the diagrams are primarily intended to express transitivity (hypothetical judgments) or difference (disjunction). Only in his second step does Schopenhauer add quantification to the diagrams again (BL, 287 et seqq.). In this context, Schopenhauer treats the theory of oppositions on several pages (BL, 284–289); however, he merely indicates that the diagrammatic representation of oppositions would have to be further elaborated.

The basic RD6 in Fig. 4.6 shows a simple contradiction between the concepts o and i. However, as the RDs given above are only basic diagrams, they can be extended according to their construction principles. Thus, there is also a kind of compositional approach in Schopenhauer’s work. For example, one can imagine that a circle, such as that given in RD6, is separated not by one line but by two, making each compartment a part of the circle that excludes all others. An example of this can be seen in Fig. 5, alongside its corresponding opposition diagram, a so-called ‘strong JSB hexagon’ (Demey and Lemanski 2021).

Figure 5: Partition diagram and Logical Hexagon (Aggregatzustand = state of matter, fester = solid, flüßiger = liquid, elastischer = elastic)
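A minimal sketch of the opposition relations behind such a partition diagram, assuming the three compartments of Fig. 5 as an exhaustive universe (the encoding of propositions as truth regions is an assumption):

```python
# Hedged sketch: the three states of matter partition the circle, so the
# propositions "x is solid", "x is liquid", "x is elastic" are pairwise
# contraries, and their negations are the corresponding subcontraries.

universe = {"solid", "liquid", "elastic"}   # the partition's compartments

def contrary(p, q):
    """Never both true, but possibly both false."""
    return not (p & q) and (universe - p - q) != set()

def subcontrary(p, q):
    """Never both false, but possibly both true."""
    return contrary(universe - p, universe - q)

solid, liquid, elastic = {"solid"}, {"liquid"}, {"elastic"}

assert contrary(solid, liquid) and contrary(liquid, elastic) and contrary(solid, elastic)
assert subcontrary(universe - solid, universe - liquid)   # "not solid" / "not liquid"
```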

An example of a more complex Eulerian diagram of exclusive disjunctions used by Schopenhauer is illustrated in Fig. 6, which depicts Animalia, Vertebrata, Mammals, Birds, Reptiles, Pisces, Mollusca, ArtiCulata, and RaDiata. These terms are included as species in genera and are mutually exclusive. While the transformation into the form of oppositional geometry is found in Lemanski and Demey (2021), Fig. 6 expresses Schopenhauer’s judgments such as:

If something is A, it is either V or I.

If something is V, it is either M or B or R or P.

If something is A, but not V, it is either M or C or D.

Fig. 6: Schopenhauer’s Animalia-Diagram

Schopenhauer here notes that the transition between the logic of concepts, judgments, and inferences is fluid. The partition diagrams only show concepts or classes, but judgments can be read through their relation to each other, that is, in a combination of RD2 and RD3. However, as the relation of three concepts to each other can already be understood as inference, the class logic is already, in most cases, a logic of inferences. For example, the last judgment mentioned above could also be understood as enthymemic inference (BL 281):

Something is A and not V. (If something is A but not V, it is either M or C or D.) Thus, it is either M or C or D.
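The exclusive partitions of the Animalia diagram, and the inference drawn from them, can be sketched as follows. The dictionary encoding, and the label “Mo” for Mollusca (used here to avoid a clash with M for Mammals), are assumptions:

```python
# Hedged sketch of the nested exclusive partitions in Schopenhauer's
# Animalia diagram (Fig. 6). Labels follow the text, except "Mo" and "I",
# introduced here to keep the species distinct.

partitions = {
    "A": ["V", "I"],              # Animalia: Vertebrata or non-vertebrates
    "V": ["M", "B", "R", "P"],    # Vertebrata: Mammals, Birds, Reptiles, Pisces
    "I": ["Mo", "C", "D"],        # non-vertebrates: Mollusca, ArtiCulata, RaDiata
}

def alternatives(genus, excluded=()):
    """Leaf species left open for something in `genus` outside `excluded`."""
    result = []
    for species in partitions.get(genus, []):
        if species in excluded:
            continue
        result += alternatives(species, excluded) or [species]
    return result

# "If something is A, it is either V or I" (exhaustively, at leaf level):
assert set(alternatives("A")) == {"M", "B", "R", "P", "Mo", "C", "D"}
# The enthymeme: something is A and not V, hence Mollusca, ArtiCulata, or RaDiata.
assert alternatives("A", excluded={"V"}) == ["Mo", "C", "D"]
```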

Schopenhauer’s partition diagrams have been adopted and applied in mathematics, especially by Adolph Diesterweg (compare Lemanski 2022b).

iii. Conversion and Metalogic

In his doctrine of judgments, Schopenhauer also covers all forms of conversion and the laws of thought, for which he partly uses RDs but partly also an equality notation (=) inspired by 18th-century Wolffians. The notation for the conversio simpliciter given in Fig. 4.5 is a convenient example of the doctrine of conversion:

universal negative: No A = B. No B = A.

particular affirmative: Some A = B. Some B = A.  (BL, 293).

Following this example, Schopenhauer demonstrates all the rules of the traditional doctrine of conversion. The equality notation is astonishing as it comes close to a form of algebraic logic that is developed later by Drobisch and others (Heinemann 2020).
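The conversion rules can be tested against a set semantics for concept spheres. This sketch (with assumed sample spheres) confirms that E- and I-propositions convert simpliciter, while the A-proposition converts only per accidens:

```python
# Hedged sketch: traditional conversion rules checked with sets standing
# in for concept spheres. Sample extensions are assumptions.

def no_(a, b):   return not (a & b)          # universal negative:  No a = b
def some(a, b):  return bool(a & b)          # particular affirmative: Some a = b
def all_(a, b):  return a <= b and bool(a)   # universal affirmative: All a = b

# conversio simpliciter: E- and I-propositions convert without change of quantity.
A, B = {1, 2}, {3, 4}
assert no_(A, B) and no_(B, A)
C, D = {1, 2}, {2, 3}
assert some(C, D) == some(D, C)

# The A-proposition converts only per accidens (All -> Some), not simpliciter:
E, F = {1}, {1, 2}
assert all_(E, F) and not all_(F, E) and some(F, E)
```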

Furthermore, the first three laws of thought (BL, 262 et seqq.) correspond to the algebraic logic of the late 19th century, namely the:

(A) law of identity: A = A,

(B) law of contradiction: A · −A = 0,

(C) law of excluded middle: A aut = b, aut = non b.

(D) law of sufficient reason: divided into (1) the ground of becoming (Werdegrund), (2) the ground of cognition (Erkenntnisgrund), (3) the ground of being, and (4) the ground of action (Handlungsgrund).
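Read over a two-element Boolean algebra, as the later algebraists would, laws (A)–(C) can be verified mechanically; the rendering of the law of contradiction as a vanishing product is an interpretive assumption:

```python
# Hedged sketch: laws of thought (A)-(C) checked over the two-element
# Boolean algebra {0, 1}, with 1 - A standing for the negation -A.

for A in (0, 1):
    neg_A = 1 - A
    assert A == A                 # (A) identity: A = A
    assert A * neg_A == 0         # (B) contradiction: the product A·(-A) vanishes
    assert A == 1 or neg_A == 1   # (C) excluded middle: A is either b or non-b
```

Law (D), the principle of sufficient reason, is not algebraic in this sense and is therefore not checked here.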

Only the second class of the law of sufficient reason relates to logic. This ground of cognition (Erkenntnisgrund) is then divided into four further parts, which, together, form a complex truth theory. Schopenhauer distinguishes between (1) logical truth, (2) empirical truth, (3) metaphysical truth, and (4) metalogical truth. The last form is of particular interest (Béziau 2020). Metalogical truth is a reflection on the four classes of the principle of sufficient reason mentioned above. A judgment can be true if the content it expresses is in harmony with one or more of the listed laws of thought. Although some parts of modern logic have broken with these basic laws, Schopenhauer is the first logician to describe the discipline of ‘metalogic’ in a way similar to Nicolai A. Vasiliev, Jan Łukasiewicz, and Alfred Tarski.

iv. Analytic-Synthetic Distinction and the Metaphor of the Concept

Another peculiarity of Schopenhauer’s doctrine of judgments is the portrayal of analytic and synthetic judgments. In Kant research, the definition of analytic and synthetic judgments has been regarded as problematic and highly worthy of discussion at the latest since Willard Van Orman Quine. This is particularly because Kant, as Quine and some of his predecessors emphasized, used the unclear metaphors of “containment,” that is, “enthalten” (Critique of Pure Reason, Intr. IV), and of “actually being thought in something,” that is, “in etw. gedacht werden” (Prolegomena, §2b), to define what analytic and synthetic judgments are. In the section of the Berlin Lectures on cognition, Schopenhauer introduces the distinction between analytic and synthetic judgments as follows:

A distinction is made in judgment, more precisely, in the proposition, subject, and predicate, that is, between what something is said about, and what is said about it. Both concepts. Then the copula. Now the proposition is either mere subdivision (analysis) or addition (synthesis); which depends on whether the predicate was already thought of in the subject of the proposition, or is to be added only in consequence of the proposition. In the first case, the judgment is analytic, in the second synthetic.

All definitions are analytic judgments:

For example,

gold is yellow: analytic
gold is heavy: analytic
gold is ductile: analytic
gold is a chemically simple substance: synthetic (BL, 123)

Here, Schopenhauer initially adheres strictly to the expressions of ‘actually thinking through something’ (‘mitdenken’, that is, analytic) and ‘adding something in thought’ (‘hinzudenken’, that is, synthetic). However, he explains in detail that the distinction between the two forms of judgment is relative, as it often depends on the knowledge and experience of the person making the judgment. An expert will, for example, classify many judgments from his field of knowledge as analytic, while other people would consider them to be synthetic. This is because the expert knows more about the characteristics of a subject than someone who has never learned about these things. In this respect, Schopenhauer is an advocate of ontological relativism. However, in the sense of transcendental philosophy, he suggests that every area of knowledge must have analytic judgments that are also a priori. For example, according to Kant, judgments such as “All bodies are extended” are analytic.

Even more interesting than these explanations taken from the doctrine of cognition (BL, 122–127) is the fact that Schopenhauer takes up the theory of analytic and synthetic judgments again in the Logica Maior (BL, 270 et seqq.). Here, Schopenhauer explains what the expression of ‘actually thinking through something’ (‘mitdenken’), which he borrowed from Kant, means. ‘Actually thinking in something’ can be translated with the metaphor of ‘containment’, and these expressions are linguistic representations of logic diagrams or RDs. To understand this more precisely, one must once again refer to Schopenhauer’s doctrine of concepts (BL, 257 et seqq.). For Schopenhauer, there is no such thing as a ‘concept of the concept’. Rather, the concept itself is a metaphor that refers to containment. According to Schopenhauer, this is already evident in the etymology of the expression ‘concept’, which illustrates that something is being contained or grasped: horizein (Greek), concipere (Latin), begreifen (German). Concepts conceive of something linguistically, just as a hand grasps a stone. For this reason, the concept itself is not a concept, but a metaphor, and RDs are the only adequate means for representing the metaphor of the concept (Lemanski 2021b, chap. 2.2).

If one says that the concept ‘gold’ includes the concept ‘yellow’, one can also say that ‘gold’ is contained in ‘yellow’ (BL, 270 et seqq.). Both expressions are transfers from concrete representation into abstract representation, that is, from intuition into language. To explain this intuitive representation, one must use an RD2 (Fig. 3.2) such as is given in Fig. 7 (BL, 270):

c. Doctrine of Inferences

This section mainly deals with BL, 293–356. As one can see from the page references, the doctrine of inferences is the longest section of the Logica Maior in the Berlin Lectures. Herein, Schopenhauer (i) presents an original thesis for the foundation of logic and (ii) develops an archaic Aristotelian system of inferences, (iii) whose validity he sees as confirmed by the criterion of naturalness. In all three areas, logic diagrams or RDs—this time following mainly Euler’s intention—play a central role.

i. Foundations of Logic

Similar to the Cartesians, Schopenhauer claims that logical reasoning is innate in man by nature. Thus, the only purpose academic logic has is to make explicit what everyone implicitly masters. In this respect, the proof of inferential validity can only be a secondary topic in logic. In other words, logic is not primarily a doctrine of inference, but primarily a doctrine of judgment. Schopenhauer sums this up by saying that nobody who seriously intends to think correctly can draw invalid inferences in his own thinking without realizing it (BL, 344). For him, such seriously produced invalid inferences are a great rarity (in ‘monological thinking’), but false judgments are very common. Furthermore, learning logic does not secure against false judgments.

Schopenhauer, therefore, does not consider proving inferences to be the main task of logic; rather, logic should help one formulate judgments and correctly grasp conceptual relations. However, when it comes to proof, intuition plays an important role. Schopenhauer takes up an old skeptical argument in his doctrine of judgments and inference that problematizes the foundations of logic: (1) conclusions arrived at by deduction are only explicative, not ampliative, and (2) deductions cannot be justified by deductions. Thus, he says, no science can be thoroughly provable, any more than a building can hover in the air (BL, 527).

Schopenhauer demonstrates this problem by referring to traditional proof theories. In syllogistics, for example, non-perfect inferences are reduced to perfect ones, more precisely, the so-called modus Barbara and the modus Celarent. Yet, why are the modes Barbara and Celarent considered perfect? Aristotle, for example, justifies this with the dictum de omni et nullo, while both Kantians and skeptics, such as Schopenhauer’s logic teacher Schulze, justify the perfection of Barbara and Celarent as well as the validity of the dictum de omni et nullo with the principle nota notae est nota rei ipsius. However, Schopenhauer goes one step further and explains that all discursive principles fail as the foundations of science because an abstract representation (such as a principle, axiom, or grounding) cannot be the foundation for one of the faculties of abstract representation (logic, for example). If one, nevertheless, wants to claim such a foundation, one inevitably runs into a regressive, a dogmatic, or a circular argument (BL, 272).

For this reason, Schopenhauer withdraws a step in the foundation of logic and offers a new solution that he repeats later as the foundation of geometry: Abstract representations are grounded on concrete representations, as abstract representations are themselves “representations of representations” (see above, Sect. 2.a.ii). The concrete representation is a posteriori or a priori intuition and both forms can be represented by RDs or logic diagrams. The abstract representation of logic is thus justified by the concrete representation of intuition, and the structures of intuition correspond to the structures of logic. For Schopenhauer, this argument can be proven directly using spatial logic diagrams (see above, Sect. 2.b.ii).

The validity of an inference can, thus, be shown in concreto, while most abstract proofs illustrated using algebraic notations are not convincing. As Schopenhauer demonstrates in his chapters on mathematics, abstract-discursive proofs are not false or useless for certain purposes, but they cannot achieve what philosophers, logicians, and mathematicians aim to achieve when they ask about the foundations of rational thinking (compare Lemanski 2021b, chap. 2.3). This argument can also be understood as part of Schopenhauer’s reism or concretism (see above, Sect. 2.a.ii).

ii. Logical Aristotelianism and Stoicism

As described above, Schopenhauer’s focus is not on proving the validity of inferences, but on the question of which logical systems are simpler, more efficient, and, above all, more natural. Although he always uses medieval mnemonics, he explains that the scholastic system attributes only a name-giving, not a proof-giving, function to inferences. On the one hand, he is arguing against Galen and many representatives of Arabic logic when he claims that the fourth figure in syllogistics has no original function. On the other hand, he is also of the opinion that Kant overstepped the mark by criticizing all figures except the first one. The result of this detailed critique, which he carried out on all 19 valid modes and for all syllogistic figures, is proof of the validity of the archaic Aristotelian Organon. Therefore, Schopenhauer claims that Aristotle is right when he establishes three figures in syllogistics and that he is also right when it comes to establishing all general and special rules. The only innovation that Schopenhauer accepts in this respect is that logic diagrams show the abstract rules and differences between the three figures concretely and intuitively.

According to Schopenhauer, a syllogistic inference is the realization of the relationship between two concepts formerly understood through the relationship of a third concept to each of them (BL, 296). Following the traditional doctrine, Schopenhauer divides the three terms into mAjor, mInor, and mEdius. He presents the 19 valid syllogisms as follows (BL, 304–321):

1st Figure

Barbara

All E is A, all I is E, thus all I is A.

Celarent

No E is A, all I is E, thus no I is A.

Darii

All E is A, some I is E, thus some I is A.

Ferio

No E is A, some I is E, thus some I is not A.

2nd Figure

Cesare

No A is E, all I is E, thus no I is A.

Camestres

All A is E, no I is E, thus no I is A.

Festino

No A is E, some I is E, thus some I is not A.

Baroco

All A is E, some I is not E, thus some I is not A.

3rd Figure

Darapti

All E is A, all E is I, thus some I is A.

Felapton

No E is A, all E is I, thus some I is not A.

Disamis

Some E is A, all E is I, thus some I is A.

Datisi

All E is A, some E is I, thus some I is A.

Bocardo

Some E is not A, all E is I, thus some I is not A.

Ferison

No E is A, some E is I, thus some I is not A.


4th Figure ≈ 1st Figure

Fesapo

No A is E, all E is I, thus some I is not A.

Dimatis

Some A is E, all E is I, thus some I is A.

Calemes

All A is E, no E is I, thus no I is A.

Bamalip

All A is E, all E is I, thus some I is A.

Fresison

No A is E, some E is I, thus some I is not A.
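The validity of all 19 modes listed above can be checked exhaustively under a set-theoretic reading of the three terms. The sketch below is a modern reconstruction, not Schopenhauer’s diagrammatic method; following the traditional doctrine, it assumes existential import (non-empty term extensions), without which modes such as Darapti, Felapton, Fesapo, and Bamalip would fail.

```python
from itertools import combinations

def spheres(universe):
    """All non-empty subsets: traditional logic assumes existential import."""
    return [set(c) for r in range(1, len(universe) + 1)
            for c in combinations(universe, r)]

def all_in(x, y):   return x <= y            # "All X is Y"
def no(x, y):       return x.isdisjoint(y)   # "No X is Y"
def some(x, y):     return bool(x & y)       # "Some X is Y"
def some_not(x, y): return bool(x - y)       # "Some X is not Y"
def implies(p, q):  return (not p) or q

# The 19 modes as listed above (BL, 304-321), with E = medius, A = major,
# I = minor.
MODES = {
    # 1st figure
    "Barbara":   lambda E, A, I: implies(all_in(E, A) and all_in(I, E), all_in(I, A)),
    "Celarent":  lambda E, A, I: implies(no(E, A) and all_in(I, E), no(I, A)),
    "Darii":     lambda E, A, I: implies(all_in(E, A) and some(I, E), some(I, A)),
    "Ferio":     lambda E, A, I: implies(no(E, A) and some(I, E), some_not(I, A)),
    # 2nd figure
    "Cesare":    lambda E, A, I: implies(no(A, E) and all_in(I, E), no(I, A)),
    "Camestres": lambda E, A, I: implies(all_in(A, E) and no(I, E), no(I, A)),
    "Festino":   lambda E, A, I: implies(no(A, E) and some(I, E), some_not(I, A)),
    "Baroco":    lambda E, A, I: implies(all_in(A, E) and some_not(I, E), some_not(I, A)),
    # 3rd figure
    "Darapti":   lambda E, A, I: implies(all_in(E, A) and all_in(E, I), some(I, A)),
    "Felapton":  lambda E, A, I: implies(no(E, A) and all_in(E, I), some_not(I, A)),
    "Disamis":   lambda E, A, I: implies(some(E, A) and all_in(E, I), some(I, A)),
    "Datisi":    lambda E, A, I: implies(all_in(E, A) and some(E, I), some(I, A)),
    "Bocardo":   lambda E, A, I: implies(some_not(E, A) and all_in(E, I), some_not(I, A)),
    "Ferison":   lambda E, A, I: implies(no(E, A) and some(E, I), some_not(I, A)),
    # 4th figure
    "Fesapo":    lambda E, A, I: implies(no(A, E) and all_in(E, I), some_not(I, A)),
    "Dimatis":   lambda E, A, I: implies(some(A, E) and all_in(E, I), some(I, A)),
    "Calemes":   lambda E, A, I: implies(all_in(A, E) and no(E, I), no(I, A)),
    "Bamalip":   lambda E, A, I: implies(all_in(A, E) and all_in(E, I), some(I, A)),
    "Fresison":  lambda E, A, I: implies(no(A, E) and some(E, I), some_not(I, A)),
}

U = {0, 1, 2}
for name, mode in MODES.items():
    assert all(mode(E, A, I)
               for E in spheres(U) for A in spheres(U) for I in spheres(U)), name
print(f"all {len(MODES)} modes hold over every assignment of spheres")
```

The three-element universe serves only as a sanity check here, not as a completeness proof for arbitrary models.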

Remarkably, Schopenhauer transfers the method of dotted lines from Lambert’s line diagrams to his Euler-inspired RD3 (Moktefi 2020). These dotted lines, as in the case of Bocardo, are used to indicate the ambiguity of a judgment. Nevertheless, whether Schopenhauer applies this method consistently is a controversial issue (compare BL, 563 et seqq.).

In addition to Aristotelian syllogistics, Schopenhauer also discusses Stoic logic (BL, 333–339). However, Schopenhauer does not use diagrams in this discussion. He justifies this decision by saying that, here, one is dealing with already finished judgments rather than with concepts. Yet, this seems strange as, at this point in the text, Schopenhauer had already used diagrams in his discussion of the doctrine of judgment, which also represented inferences of Stoic logic. However, as the method was not yet well developed, it can be assumed that Schopenhauer failed to represent the entire Stoic logic with the help of RDs. Instead, in the chapter on Stoic logic, one finds a characterization of the modus ponendo ponens and the modus tollendo tollens (hypothetical inferences), as well as the modus ponendo tollens and the modus tollendo ponens (disjunctive inferences). In addition, Schopenhauer also focuses more intensively on dilemmas.
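The four Stoic moods named here can be verified by truth tables, a modern rendering that Schopenhauer does not use himself. One interpretive assumption: the disjunctive moods read “either…or” exclusively, as in the traditional doctrine; with inclusive disjunction, the modus ponendo tollens would fail.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

# Truth-table reconstruction of the four Stoic moods (BL, 333-339);
# "!=" renders the exclusive "either...or" of the disjunctive moods.
MOODS = {
    "ponendo ponens":   lambda p, q: implies(implies(p, q) and p, q),
    "tollendo tollens": lambda p, q: implies(implies(p, q) and not q, not p),
    "ponendo tollens":  lambda p, q: implies((p != q) and p, not q),
    "tollendo ponens":  lambda p, q: implies((p != q) and not p, q),
}
for name, mood in MOODS.items():
    assert all(mood(p, q) for p, q in product((False, True), repeat=2)), name
print("all four Stoic moods are truth-table valid")
```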

iii. Naturalness in Logic

One of the main topics in the doctrine of inferences is the naturalness of logic. For Schopenhauer, there are artificial logics, such as the mnemonics of scholastic logic or the mathematical demand for axiomatics, but there are also logics that are natural to certain degrees. Schopenhauer agrees with Kant that the first figure of Aristotelian syllogistics is the most natural one, “in that every thought can take its form” (BL, 302). Thus, the first figure is the “simplest and most essential rational operation” (ibid.), and most people unconsciously use one of the modes of the first figure for logical reasoning every day. In contrast to Kant, however, Schopenhauer does not conclude that all other figures are superfluous. For in order to make it clear that one wants to express a certain thought, one rightly falls back on the second and third figures.

To determine the naturalness of the first three figures, Schopenhauer examines the function of the inferences in everyday reasoning and, thus, asks what thought they express. Similar to Lambert, Schopenhauer states that we use the first figure to identify characteristics or decisions. We use the second figure if we want to make a difference explicit (BL, 309), while the third figure is used to express or prove a paradox, anomaly, or exception. Schopenhauer gives each of the three figures its own name according to the thought operation expressed with the figure: the first figure is the “Handhabe” (manipulator), the second the “Scheidewand” (septum), and the third the “Anzeiger” (indicator) (BL, 316). As it is natural for humans to make such thought operations explicit, the first three figures are also part of a natural logic. Schopenhauer also explains that each of these three figures has its own enthymemic form and that the function of the medius differs with each figure (BL, 329).

However, Schopenhauer argues intently against the fourth figure, which was introduced by Galen and then made public by Arabic logicians. It has no original function and is only the reversal of the first figure; that is to say, it does not indicate a decision itself, only evidence of a decision. Moreover, the fourth figure does not correspond to the natural grammatical structure in which people usually express themselves in daily life. It is more natural when speakers put the narrower term in the place of the subject and the broader one in the place of the predicate. Although a reversal from the first to the fourth figure is possible, this reversal is unnatural. For example, it is more natural to say “No Bashire is a Christian” than to say “No Christian is a Bashire” (BL, 322).

In the chapter on Stoic logic, the intense discussion of naturalness is lost, yet Schopenhauer points out here and elsewhere that there are certain forms of propositional logic that appear natural in the sciences and everyday language. Mathematicians, for example, tend to use the modus tollendo ponens in proof techniques, even though this technique is prone to error, as the tertium non datur does not apply universally (BL, 337, 512f.). As a result of such theses, Schopenhauer is often associated with intuitionism and the systems of natural deduction (compare Schueler et al. 2020; Koetsier 2005; Belle 2021).

d. Further Topics of Analytic

In addition to the areas mentioned thus far, the BL offer many other topics and arguments that should be of interest to many, not only researchers of the history and philosophy of logic. The major topics include, for example, a treatise on the Aristotelian rules, reasons, and principles of logic (BL, 323–331), a treatise on sorites (BL, 331–333), a treatise on modal logic (BL, 339–340), a further chapter on enthymemes (BL, 341–343), and a chapter on sophisms and false inferences (BL, 343–356).

In the following sections, Schopenhauer’s views on (i) the history and development of logic, (ii) the parallels between logic and mathematics, and the focus on (iii) hermeneutics are discussed. As the chapter on sophisms and so forth is also used in dialectics, it is presented in Sect. 2.e.

i. Schopenhauer’s History of Logic

A history of logic in the narrower sense cannot be found in Schopenhauer’s treatise on logic in general (BL, 356 et seqq.). However, Schopenhauer discusses the origin and development of Aristotelian logic in a longer passage on a question raised by logical algebra in the mid-18th century, and then prominently answered in the negative by Kant: Has there been any progress in logic since Aristotle?

Naturally, as an Aristotelian and Kantian, Schopenhauer answers this question in the negative but admits that there have been “additions and improvements” to logic. Schopenhauer argues that Aristotle wrote the first “scientific logic” but concedes that there were earlier logical systems, claiming that Aristotle united the attempts of his precursors into one scientific system. Schopenhauer also suggests that there may have been an early exchange between Indian and Greek logic.

The additions and improvements to Aristotelian logic concern a total of five points (Pluder 2020), some of which have already been mentioned above: (1) The discussion of the laws of thought; (2) the scholastic mnemonic technique; (3) the propositional logic; (4) Schopenhauer’s own treatise on the relation between intuition and concept; and (5) the fourth figure, introduced by Galen. Schopenhauer considers some of these additions to be improvements (1, 3, 4) and considers others to be deteriorations (2 and especially 5). It seems strange that Schopenhauer does not refer to the use of spatial logic diagrams once again (BL, 270).

ii. Logic and Mathematics

Another extensive chapter of the BL, which is closely related to logic, discusses mathematics. This is no surprise, as Schopenhauer spent a semester studying mathematics with Bernhard Friedrich Thibaut in Göttingen and systematically worked through the textbooks by Franz Ferdinand Schweins, among others (Lemanski 2022b). As already discussed above, one advantage of the BL is that Schopenhauer took W I as a basis, expanded parts of it considerably, and incorporated into it some essential topics from his supplementary works. Thus, before the treatise on mathematics, one finds a detailed presentation of the four roots of sufficient reason, which Schopenhauer covered in his dissertation.

Schopenhauer’s representation of mathematics concentrates primarily on geometry. His main thesis is that abstract-algebraic proofs are possible in geometry but that, as in logic, any attempt to prove their foundations leads to a circulus vitiosus, a dogma, or an infinite regress (see above, Sect. 2.c.i). Therefore, as in logic, Schopenhauer argues that abstract proofs should be dispensed with and that concrete-intuitive diagrams and figures should be regarded as the ultimate justification of proofs instead. Thus, he argues that feeling (Gefühl) is an important element, even possibly the basis, of proofs for geometry and logic (Follessa 2020). However, this feeling remains intersubjectively verifiable with the help of logic diagrams and geometric figures.

Schopenhauer discusses the main thesis of the text, in particular, in connection with the Euclidean system in which one finds both kinds of justification: discursive-abstract proofs, constructed with the help of axioms, postulates, and so forth, and concrete-intuitive proofs, constructed with the help of figures and diagrams. Similar to some historians of mathematics in the 20th century and some analytic philosophers in the 21st century, Schopenhauer believed that Euclid was seduced by rationalists into establishing an axiomatic-discursive system of geometry, although the validity of the propositions and problems was sufficiently justified by the possibility of concrete-intuitive representation (Béziau 1993).

Schopenhauer goes so far as to attribute Euclid’s axiomatic system to dialectic and persuasion. With his axiomatic system, Euclid could only show that something is like that (knowing-that), while the visual system can also show why something is like that (knowing-why). Schopenhauer demonstrates this in the BL with reference to Euclid’s Elements I 6, I 16, I 47, and VI 31. He develops his own picture proof for Pythagoras’s theorem (Bevan 2020), though he then corrects it over the years (Costanzo 2020). Given the probative power of the figures in geometry, there are clear parallels to the function of Schopenhauer diagrams in logic. Schopenhauer can, therefore, be regarded as an early representative of “diagrammatic proofs” and “visual reasoning” in mathematics.

Schopenhauer’s mathematics has been evaluated very differently in its two-hundred-year history of reception (Segala 2020, Lemanski 2021b, chap. 2.3). While Schopenhauer’s philosophy of geometry was received very positively until the middle of the 19th century, the Weierstrass School marks the beginning of a long period in which Schopenhauer’s approach was labeled a naive form of philosophy of mathematics. It was only with the advent of the so-called ‘proof without words’ movement and the rise of the so-called spatial or visual turn in the 1990s that Schopenhauer became interesting within the philosophy of mathematics once again (Costanzo 2020, Bevan 2020, Lemanski 2022b).

iii. Hermeneutics

The exploration and analysis of hermeneutics in Schopenhauer’s work are also closely related to logic. This has been the subject of intense and controversial discussion in Schopenhauer research. Overall, two positions can be identified: (1) Several researchers regard either Schopenhauer’s entire philosophy or some important parts of it as ‘hermeneutics’. (2) Some researchers, however, deny that Schopenhauer can be called a hermeneuticist at all.

(1) The form of hermeneutics that researchers see in Schopenhauer, however, diverges widely. For example, various researchers speak of “world hermeneutics”, “hermeneutics of existence”, “hermeneutics of factuality”, “positivist hermeneutics”, “hermeneutics of thought”, or “hermeneutics of knowledge” (Schubbe 2010, 2018, 2020; Shapshay 2020). What all these positions have in common is that they regard the activity of interpretation and deciphering as a central activity in Schopenhauer’s philosophy.

(2) Other researchers argue, however, that a hermeneutic position should not be ascribed to Schopenhauer, while some even go so far as to argue that he is ‘anti-hermeneutic’. The arguments of these researchers can be summarized as follows: (A1) Schopenhauer does not refer to authors of his time who are, today, called hermeneuticists. (A2) The term ‘hermeneutics’ does not actually fit philosophers of the early 19th century at all, as it was not fully developed until the 20th century. (A3) Schopenhauer is not received by modern hermeneutics.

Representatives of position (1) consider the arguments outlined in (2) to be insufficiently substantiated (ibid.). From a logical point of view, argument (A2) should be met with skepticism, as the term ‘hermeneutics’ can be traced back at least to the second book of Aristotle’s Organon. Schopenhauer takes up the theory of judgment contained in the Organon again in his Logica Maior (see above, Sect. 2.b) and, in addition, explains that judgment plays a central role not only in logic but also in his entire philosophy: Every insight is expressed in true judgments, namely, in conceptual relations that have a sufficient reason. Yet, guaranteeing the truth of judgments is more difficult than forming valid inferences from them (BL, 200, 360ff.).

e. Dialectic or Art of Persuasion

In addition to the analytic discussed thus far, there is also a very important chapter on (eristic) dialectic or persuasion in the BL, which can be seen as an addition to § 9 of W I and as a precursor of the famous fragment entitled Eristic Dialectics. The core chapter is BL, 363–366, but the chapters on paralogisms, fallacies, and sophisms, as well as some of the preliminary remarks, also relate to dialectic (BL, 343–363), as does quite a bit of the information on analytic, such as the RDs. As in Kant, for Schopenhauer analytic is the doctrine of being and truth, whereas dialectic is the doctrine of appearance and illusion. In analytic, a solitary thinker reflects on the valid relations between concepts or judgments; in dialectic, a proponent aims to persuade an opponent of something that is possible.

According to Schopenhauer, the information presented in the chapter on paralogisms, fallacies, and sophisms belongs to both analytic and dialectic. In the former, their invalidity is examined; in the latter, their deliberate use in disputation is examined. Schopenhauer presents six paralogisms such as homonymy and amphiboly, seven fallacies such as ignoratio elenchi and petitio principii, and seven sophisms such as cornutus and crocodilinus. In total, 20 invalid argument types are described, with examples of each, partly subdivided into subtypes.

In the core chapter on dialectics or the art of persuasion, Schopenhauer tries to reduce these invalid arguments to a single technique (Lemanski 2023). His main aim is, thus, a reductionist approach that does not even consider the linguistic subtleties of the dishonest argument but reveals the essence of the deliberate fallacy. To this end, he draws on the RDs from analytics and explains that any invalid argument that is intentionally made is based on a confusion of spheres or RDs.

In an argument, one succumbs to a disingenuous opponent when one does not consider the RDs thoroughly but only superficially. Then one may admit that two terms in a judgment indicate a community without noticing that this community is only a partial one. Instead of the actual RD5 relation between two spheres, one is led, for example, by inattention or more covertly by paralogisms, fallacies, and sophisms, to acknowledge an RD1 or, more often, an RD2. According to Schopenhauer, dialectics is based on this confusion, as almost all concepts share a partial semantic neighborhood with another concept. Thus, it can happen that one concedes more and more small-step judgments to the opponent and then suddenly arrives at a larger judgment, a conclusion, that one would not have originally accepted at all.

Schopenhauer gives several examples of this procedure from science and everyday life and also simulates this confusion of spheres by constructing fictional discussions about ethical arguments between philosophers. In doing so, Schopenhauer uses RDs several times to demonstrate which is the valid (analytic) and which is the feigned (dialectical) relation of the spheres. Then, he goes one step further. In order to demonstrate that one can start from a concept and argue just as convincingly for or against it, Schopenhauer designs large argument maps to indicate possible courses of conversation (Lemanski 2021b, Bhattarcharjee et al. 2022).

Fig. 8 shows the sphere of the concept of good (“Gut”) on the left, the sphere of the concept of bad (“Uebel”) on the right, and the concept of country life (“Landleben”) in the middle. Starting with the term in the middle, namely, ‘country life’, the diagram reflects the partial relationship of this term with the adjacent spheres. When one chooses an adjacent sphere, for example, the circle ‘natural’ (“naturgemäß”), the two spheres together form the small-step judgment ‘Country life is natural’. This predicate can then be combined with another adjacent sphere to form a new judgment. Moving through the circles in this way, if one at some point arrives at ‘good’, and the disputant has conceded all the small-step judgments en route, one can draw the overall conclusion that ‘country life is good’. However, as one can just as effectively argue for ‘country life is bad’ via other spheres, the argument map is a visualization of dialectical relations.

Schopenhauer also used such diagrams in the dialectic of W I, § 9, for example, the more famous “diagram of good and evil”, which has been interpreted as one of the first logic diagrams for -terms (Moktefi and Lemanski 2018), as a precursor of a diagrammatic fuzzy-logic (Tarrazo 2004), and as an argument map in which the RD5s are used as graphs (Bhattarcharjee et al. 2022). If one relates the dialectic of the BL to the other texts on dialectics, it can be said that this dialectic serves as a bridge between the short diagrammatic dialectic of the W I and the well-known fragment entitled Eristic Dialectics, in which the paralogisms, in particular, were elaborated.

Figure 8


3. Schopenhauer’s Logica Minor

Schopenhauer’s Berlin Lectures must be considered a Logica Maior due to the enormous size and complexity of their original subjects (especially in comparison to many other 19th-century writings). Nevertheless, one can also locate and collect a Logica Minor in Schopenhauer’s other writings. In the following, the most important treatises on analytic and dialectic from the other works of Schopenhauer are briefly presented. Even though the BL and the other writings have some literal similarities, the BL should remain the primary reference when assessing the various topics in the other writings.

a. Fourfold Root

The first edition of Schopenhauer’s dissertation, the Fourfold Root of the Principle of Sufficient Reason, was published in 1813 and a revised and extended edition was published in 1847. The second edition contains numerous additions that are not always regarded as improvements or helpful supplements. In the 1813 version of chapter 5, logic is addressed through the principle of sufficient reason of knowing. Schopenhauer follows a typical compositional approach in which inferences are considered compositions of judgments and judgments as compositions of concepts. The treatise in this chapter, however, is primarily concerned with the doctrine of concepts.

Although Schopenhauer points out that concepts have a sphere, there are no logic diagrams to illustrate this metaphor in the work. Schopenhauer deals mainly with the utility of concepts, the relationship between concept and intuition, and the doctrine of truth. The philosophy of mathematics and its relation to logic are discussed in chapters 3 and 8.

The discussion of the doctrine of truth is especially close to the text of the BL as Schopenhauer already distinguishes between logical, empirical, metaphysical, and metalogical truth. Although the expression “metalogica” is much older, this book uses the term ‘metalogic’ in the modern sense for the first time (Béziau 2020).

Furthermore, it can be argued that Schopenhauer presented the first complete treatise on the principle of sufficient reason in this book. While the other principles popularized by Leibniz and Wolff have found their way into today’s classical logic, that is, the principles of non-contradiction, identity, and the excluded middle, the principle of sufficient reason was considered non-formalizable and, therefore, not a basic principle of logic in the early 20th century. Newton da Costa, on the other hand, proposed a formalization that has made Schopenhauer’s laws of thought worthy of discussion again (Béziau 1992).

b. World as Will and Representation I (Chapter 9)

Chapter 9 (that is, § 9) of the W I takes up the terminology of the Fourfold Root again and extends several elements of it. Schopenhauer first develops a brief philosophy of language to clarify the relationship between intuition and concept. He then introduces analytic by explaining the metaphors used in the doctrine of concepts, that is, higher-lower (buildings of concepts) and wider-narrower (spheres of concepts). Schopenhauer keeps to the metaphor of the sphere and explains that Euler, Lambert, and Ploucquet had already represented this metaphor with the help of diagrams. He draws some of the diagrams discussed above in Sect. 2.a (RD3 is missing) and explains that these are the foundation for the entire doctrine of judgments and inferences. Here, too, Schopenhauer represents a merely compositional position: judgments are connections of concepts, while inferences are composed of judgments. However, in § 9, there is no concrete doctrine of judgment or inference. The principles of logic are also listed briefly in only one sentence.

Although W I makes the descriptive claim to represent all elements of the world, the logic presented here must be considered highly imperfect and incomplete. Schopenhauer explains that everyone, by nature, masters logical operations; thus, it is reserved for academic teaching alone to present logic explicitly and in detail, and this is what is done in the BL for an academic audience.

In the further course of § 9, Schopenhauer also discusses dialectics, which contains an argument map similar to the one illustrated above (see above, Sect. 2.e) but also lists some artifices (“Kunstgriffe”) known from later writings including the BL and Eristic Dialectic (ibid.). The philosophy of mathematics and its relation to logic are discussed in § 15 of the W I.

c. Eristic Dialectics

Of all the texts on Schopenhauer’s logic listed here, the manuscript produced in the early 1830s that he entitled Eristic Dialectic is the best known. It is usually presented separately from all other texts in editions that bear ambiguous titles such as The Art of (Always) Being Right or The Art of Winning an Argument. Schopenhauer himself titled the manuscript Eristic Dialectic. The term ‘eristics’ comes from the Greek ‘erizein’, meaning ‘to contest, to quarrel’, and is personified in Greek mythology by the goddess Eris. Although Schopenhauer also uses the above ambiguous expressions in the text (for example, 668, 675), these are primarily to be understood as translations of the Greek expression ‘eristiké téchne’.

Regardless of the context, the ambiguous titles suggest that Schopenhauer is here recommending that his readers use obscure techniques in order to assert themselves against other speakers. Even though there are text fragments that partially convey this normative impression, Schopenhauer’s goal is, however, of a preventive nature: he seeks to give readers the means to recognize and call out invalid but deliberately presented arguments and, thus, to defend themselves (VI, 676). Therefore, Schopenhauer is not encouraging people to violate the ethical rules of good argumentation (Lemanski 2022a); rather, he is offering an antidote to such violations (Chichi 2002, 165, 170, Hordecki 2018). However, this fragment is often interpreted normatively, and in the late 20th and early 21st centuries, it was often instrumentalized in training courses for salesmen, managers, lawyers, politicians, and so forth, as a guide to successful argumentation.

The manuscript consists of two main parts. In the first, Schopenhauer describes the relationship between analytics and dialectics (VI, 666), defines dialectics several times (Chichi 2002, 165), and outlines its history with particular reference to Aristotle (VI, 670–675). The second main part is divided into two subsections. The first subsection describes the “basis of all dialectics” and gives two basic modes (VI, 677 f.). The second subsection (VI, 678–695) then presents 38 artifices (“Kunstgriffe”), which are explained clearly with examples. These artifices, which Schopenhauer also called ‘stratagems’, can be divided into preerotematic (stratagems 1–6), erotematic (7–18), and posterotematic (19–38) stratagems (compare Chichi 2002, 177).

The manuscript is unfinished and, therefore, the fragment is also referred to by Schopenhauer as a “first attempt” (VI, 676 f.). According to modern research, both main parts are revisions of the Berlin Lectures, designed for independent publication: the first main part is an extension of BL 356–363, the second a revised version of BL 343–356. It can be assumed that Schopenhauer either wanted to add another chapter on the reduction of all stratagems to diagrams (as given in BL 363–366) or that he intended to dispense with the diagrams, as they would have presupposed knowledge of analytics. In any case, it can be assumed that Schopenhauer would have edited the fragment further before publishing it, as the manuscripts are not of the same standard as Schopenhauer’s printed works.

Despite the misuse of the fragment described above, researchers in several areas, for example in the fields of law, politics, pedagogy, ludics and artificial intelligence, are using the fragment productively (for example, Fouqueré et al. 2012, Lübbig 2020, Marciniak 2020, Hordecki 2021).

d. World as Will and Representation II (Chapters 9 and 10)

In the very first edition of W II in 1844, Schopenhauer extended the incomplete explanations of logic given in W I with his doctrines of judgment (chapter 9) and inference (chapter 10). He adopts some text passages and results of the BL, but only briefly hints at many of these topics, theses, and arguments. In comparison to the BL, chapters 9 and 10 of W II also appear to be an unsatisfactory approach to logic.

In his discussion of the doctrine of judgment, Schopenhauer pays particular attention to the function of the copula in addition to giving further explanations of the forms of judgments. In the doctrine of inference, he continues to advocate for Aristotelianism and argues against both Galen’s fourth figure and Kant’s reduction to the first figure. Furthermore, the text suggests an explanation for why Schopenhauer presents such an abbreviated representation of logic here. Schopenhauer explains in chapter 10 that RDs are a suitable technique to prove syllogisms although they are not appropriate for use in propositional logic. It seems as if Schopenhauer is going against some of the arguments of his former doctrine of diagrammatic reasoning (presented, for example, in Sect. 2.b.ii). Nevertheless, he presents this critique or skepticism almost reluctantly as an addition to W I. Although he does include some RDs, which mainly represent syllogistic inferences, in chapters 9 and 10, he also hints at a more advanced diagrammatic system based on “bars” and “hooks” several times.

However, these text passages, which point to a new diagrammatic system, remain mere hints whose meaning cannot yet be grasped. On the basis of these obscure passages, Kewe (1907) tried to reconstruct an alternative logic system that is supposed to resemble the structure of a voltaic pile, as Schopenhauer himself hinted at such a comparison at the end of chapter 10 of W II. However, Kewe’s proposal is a logically trivial, if diagrammatically very complex, interpretation that mainly highlights its disadvantages in comparison to the system of RDs.

It is more likely that in these passages Schopenhauer has in mind a diagrammatic technique published in Karl Christian Friedrich Krause’s Abriss des Systemes der Logik in the late 1820s. This interpretation of W II is more plausible since Schopenhauer was in personal contact with Krause for an extended period (Göcke 2020). However, future research must clarify whether this thesis is tenable. To date, unfortunately, no note has been identified among the manuscript remains that might illustrate the technique described in W II, chapter 10.

e. Parerga and Paralipomena II

Parerga and Paralipomena II, chapter 2 contains a treatise on “Logic and Dialectic”. Although this chapter was written in the 1850s, it is the worst treatise Schopenhauer published on logic. In just a few paragraphs, he attempts to cover topics such as truth, analytic and synthetic judgments, and proofs. The remaining paragraphs are extracts from or paraphrases of the manuscript on Eristic Dialectics or the BL. One can see from these passages that there was a clear break in Schopenhauer’s writings around the 1830s and that his late work tended to omit rational topics. Schopenhauer also explained that he was no longer interested in working on the fragment on Eristic Dialectics, as the subject showed him the wickedness of human beings and he no longer wanted to concern himself with it.

4. Research Topics

Research into Schopenhauer’s philosophy of language, logic, and mathematics is still in its infancy because, for far too long, scholars concentrated on other topics in Schopenhauer’s theory of representation, including his epistemology and, especially, his idealism. The importance of the second part of the theory of representation, namely, the theory of reason (language, knowledge, practical action), has been almost completely ignored. However, as language and logic are the media that give expression to Schopenhauer’s entire system, it can be said that one of the most important methodological and content-related parts of Schopenhauer’s complete oeuvre has, historically, been largely overlooked.

The following is a rough overview of research still to be done on Schopenhauer’s logic. It shows that these writings continue to offer interesting topics and theses. In particular, Schopenhauer’s use of logic diagrams is likely to meet with much interest in the course of intensive research into diagrammatic and visual reasoning. Nevertheless, many special problems and general questions remain unsolved. The most important general questions concern the following points:

  1. Do we have all of Schopenhauer’s writings on logic, or are there manuscripts that have not yet been identified? In particular, the fact that Schopenhauer uses diagrams that are not discussed in the text and discusses diagrams that are not illustrated in the text suggests that Schopenhauer knew more about logic diagrams than can be gleaned from his known books and manuscripts.
  2. How great is the influence of Schopenhauer’s logic on modern logic (especially the Vienna Circle, the school of Münster, the Lwów-Warsaw school, intuitionism, metalogic, and so forth)? Schopenhauer’s Berlin Lectures were first fully published in 1913, a period that saw the intensive reception of Schopenhauer’s teachings on logic in those schools. For example, numerous researchers have been discussing Schopenhauer’s influence on Wittgenstein for decades (compare Glock 1999). One can also observe an influence on modern logic in the works of Moritz Schlick, Béla Juhos, Edith Matzun, and L. E. J. Brouwer. However, these influences have, thus far, been largely ignored in research.
  3. What is Schopenhauer’s relationship to the pioneering logicians of his time (for example, Krause, Jakob Friedrich Fries, Carl Friedrich Bachmann, and so forth)? Previous sections have indicated that Schopenhauer’s logic may have been close to that of Krause. Bachmann, another remarkable logician of the early 19th century, was also in contact with Schopenhauer. The fact that Schopenhauer was personally influenced by Schulze’s logic is well documented. In addition, Schopenhauer knew various logic systems from the 18th and 19th centuries; however, many studies are needed to clarify these relationships.
  4. To what extent does Schopenhauer’s logic differ from the systems of his contemporaries? Many of Schopenhauer’s innovations and additions to logic have already been recognized. Yet the question remains: to what extent does Schopenhauer’s approach to visual reasoning correspond to the Zeitgeist? At first glance, it seems obvious, for example, that Schopenhauer strongly contradicted the Leibnizian and Hegelian schools, the Hegelian school especially, by separating logic and metaphysics from each other and emphasizing instead the kinship of logic and intuition.
  5. To what extent can Schopenhauer’s ideas about logic and logic diagrams be applied to contemporary fields of research? Schopenhauer did not design ‘a logic’ that would meet today’s standards without comment, but rather a stimulating philosophy of logic and ideas about visual reasoning. Schopenhauer questioned many principles that are widely accepted today. Moreover, he offers many diagrammatic and graphical ideas that could be developed in many modern directions. Schopenhauer’s approaches, which have been interpreted as contributions to fuzzy logic, term logic, natural logic, metalogic, ludics, graph theory, and so forth, also require further intensive research.
  6. How can Schopenhauer’s system (as presented, for example, in W I) be reconstructed using logic? This question is motivated by the fact that some logical techniques have already been successfully applied to Schopenhauer’s system. For example, Matsuda (2016) has offered a precise interpretation of Schopenhauer’s world as a cellular automaton based on the so-called Rule 30 elaborated by Stephen Wolfram. In Schopenhauer’s system, logic thus has a double function: as part of the world, the discipline called logic must be analyzed like any other part of the system; as an instrument or organon of expression and reason, however, it is itself the medium through which the world and everything in it are described. This raises the question of what an interpretation of Schopenhauer’s philosophical system using his logic diagrams would look like.
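The Rule 30 automaton mentioned in point 6 is simple to state: a cell’s next state is its left neighbor XOR (its own state OR its right neighbor). As an illustrative aside only (a minimal sketch of Wolfram’s rule, not a reconstruction of Matsuda’s model), it can be written in a few lines of Python:

```python
def rule30_step(cells):
    """Apply Wolfram's Rule 30 to one row of 0/1 cells (zero boundaries)."""
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[i - 1] if i > 0 else 0
        center = cells[i]
        right = cells[i + 1] if i < n - 1 else 0
        # Rule 30: new state = left XOR (center OR right)
        nxt.append(left ^ (center | right))
    return nxt

def run(width=31, steps=5):
    """Evolve a single live cell for a number of steps; return all rows."""
    row = [0] * width
    row[width // 2] = 1  # single live cell in the middle
    history = [row]
    for _ in range(steps):
        row = rule30_step(row)
        history.append(row)
    return history

if __name__ == "__main__":
    for row in run(width=15, steps=4):
        print("".join("#" if c else "." for c in row))
```

Starting from a single live cell, repeated application of `rule30_step` yields Rule 30’s characteristic chaotic triangular pattern.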

5. References and Further Readings

a. Schopenhauer’s Works

  • Schopenhauer, A.: Philosophische Vorlesungen, Vol. I. Ed. by F. Mockrauer. (= Sämtliche Werke. Ed. by P. Deussen, Vol. 9). München (1913). Cited as BL.
  • Schopenhauer, A.: The World as Will and Representation: Vol. I. Transl. by J. Norman, A. Welchman, C. Janaway. Cambridge (2014). Cited as W I.
  • Schopenhauer, A.: The World as Will and Representation: Vol. II. Transl. by J. Norman, A. Welchman, C. Janaway. Cambridge (2015). Cited as W II.
  • Schopenhauer, A.: Parerga and Paralipomena. Vol I. Translated by S. Roehr, C. Janaway. Cambridge (2014).
  • Schopenhauer, A.: Parerga and Paralipomena. Vol II. Translated by S. Roehr, C. Janaway. Cambridge (2014).
  • Schopenhauer, A.: Manuscript Remains: Early Manuscripts 1804–1818. Ed. by Arthur Hübscher; translated by E. F. J. Payne. Oxford et al. (1988).
  • Schopenhauer, A.: Manuscript Remains: Critical Debates (1809–1818). Ed. by Arthur Hübscher; translated by E. F. J. Payne. Oxford et al. (1989).
  • Schopenhauer, A.: Manuscript Remains: Berlin Manuscripts (1818–1830). Ed. by Arthur Hübscher; translated by E. F. J. Payne. Oxford et al. (1988).
  • Schopenhauer, A.: Manuscript Remains: The Manuscript Books of 1830–1852 and Last Manuscripts. Ed. by Arthur Hübscher; translated by E. F. J. Payne. Oxford et al. (1990).

b. Other Works

  • Baron, M. E. (1969) A Note on the Historical Development of Logic Diagrams: Leibniz, Euler and Venn. In Mathematical Gazette 53 (384), 113–125.
  • Bevan, M. (2020) Schopenhauer on Diagrammatic Proof. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 305–315.
  • Belle, van, M. (2021) Schopenbrouwer: De rehabilitatie van een miskend genie. Postbellum, Tilburg.
  • Béziau, J.-Y. (2020) Metalogic, Schopenhauer and Universal Logic. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 207–257.
  • Béziau, J.-Y. (1993) La critique Schopenhauerienne de l’usage de la logique en mathématiques. O que nos faz pensar 7, 81–88.
  • Béziau, J.-Y. (1992) O Princípio de Razão Suficiente e a lógica segundo Arthur Schopenhauer. In Évora, F. R. R. (Hg.): Século XIX. O Nascimento da Ciência Contemporânea. Campinas, 35–39.
  • Bhattarchajee, R., Lemanski, J. (2022) Combing Graphs and Eulerian Diagrams in Eristic. In: Giardino, V., Linker, S., Burns, R., Bellucci, F., Boucheix, J.-M., Viana, P. (eds.) Diagrammatic Representation and Inference. Diagrams 2022. Lecture Notes in Computer Science, vol. 13462. Springer, Cham, 97–113.
  • Birnbacher, D. (2018) Schopenhauer und die Tradition der Sprachkritik. In Schopenhauer-Jahrbuch 99, 37–56.
  • Chichi, G. M. (2002) Die Schopenhauersche Eristik. Ein Blick auf ihr Aristotelisches Erbe. In Schopenhauer-Jahrbuch 83, 163–183.
  • Coumet, E. (1977) Sur l’histoire des diagrammes logiques : figures géométriques. In Mathématiques et Sciences Humaines 60, 31–62.
  • Costanzo, J. (2020) Schopenhauer on Intuition and Proof in Mathematics. In Lemanski, J. (ed.) Language, Logic and Mathematics in Schopenhauer. Birkhäuser, Cham, 287–305.
  • Costanzo, Jason M. (2008) The Euclidean Mousetrap. Schopenhauer’s Criticism of the Synthetic Method in Geometry. In Journal of Idealistic Studies 38, 209–220.
  • D’Alfonso, M. V. (2018) Arthur Schopenhauer, Anmerkungen zu G. E. Schulzes Vorlesungen zur Logik (Göttingen 1811). In I Castelli di Yale Online 6(1), 191–246.
  • Demey, L. (2020) From Euler Diagrams in Schopenhauer to Aristotelian Diagrams in Logical Geometry. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 181–207.
  • Dobrzański, M. (2017) Begriff und Methode bei Arthur Schopenhauer. Königshausen & Neumann, Würzburg.
  • Dobrzański, M. (2020) Problems in Reconstructing Schopenhauer’s Theory of Meaning. With Reference to his Influence on Wittgenstein. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 25–57.
  • Dobrzański, M., Lemanski, J. (2020) Schopenhauer Diagrams for Conceptual Analysis. In: Pietarinen, A.-V. et al: Diagrammatic Representation and Inference. 11th International Conference, Diagrams 2020, Tallinn, Estonia, August 24–28, 2020, Proceedings. Springer, Cham, 281–288.
  • Dümig, S. (2016) Lebendiges Wort? Schopenhauers und Goethes Anschauungen von Sprache im Vergleich. In: D. Schubbe & S.R. Fauth (Hg.): Schopenhauer und Goethe. Biographische und philosophische Perspektiven. Meiner, Hamburg, 150–183
  • Dümig, S. (2020) The World as Will and I-Language. Schopenhauer’s Philosophy as Precursor of Cognitive Sciences. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 85–95.
  • Fischer, K. (1908) Schopenhauers Leben, Werke und Lehre. 3rd ed. Winters, Heidelberg.
  • Follesa, L. (2020) From Necessary Truths to Feelings: The Foundations of Mathematics in Leibniz and Schopenhauer. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 315–326.
  • Fouqueré, C., Quatrini, M. (2012). Ludics and Natural Language: First Approaches. In: Béchet, D., Dikovsky, A. (eds.) Logical Aspects of Computational Linguistics. LACL 2012. Lecture Notes in Computer Science, vol 7351. Springer, Berlin, Heidelberg, 21–44.
  • Glock, H.-J. (1999) Schopenhauer and Wittgenstein: Language as Representation and Will. In: Janaway, C. (ed.) The Cambridge Companion to Schopenhauer. Cambridge Univ. Press, Cambridge, 422–458.
  • Göcke, B. P. (2020) Karl Christian Friedrich Krause’s Influence on Schopenhauer’s Philosophy. In Wicks, R. L. (ed.) The Oxford Handbook of Schopenhauer. Oxford Univ. Press, New York.
  • Heinemann, A. -S. (2020) Schopenhauer and the Equational Form of Predication. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 165–181.
  • Hordecki, B. (2018). The strategic dimension of the eristic dialectic in the context of the general theory of confrontational acts and situations. In: Przegląd Strategiczny 11, 19–26.
  • Hordecki, B. (2021) “Dialektyka erystyczna jako sztuka unikania rozmówców nieadekwatnych”, Res Rhetorica 8(2), 18–129.
  • Jacquette, D. (2012) Schopenhauer’s Philosophy of Logic and Mathematics. In Vandenabeele, B. (ed.) A Companion to Schopenhauer. Wiley-Blackwell, Chichester, 43–59.
  • Janaway, C. (2014) Schopenhauer on Cognition. O. Hallich & M. Koßler (ed.): Arthur Schopenhauer: Die Welt als Wille und Vorstellung. Akademie, Berlin, 35–50.
  • Kewe, A. (1907) Schopenhauer als Logiker. Bach, Bonn.
  • Koetsier, Teun (2005) Arthur Schopenhauer and L. E. J. Brouwer. A Comparison. In: L. Bergmans & T. Koetsier (ed.): Mathematics and the Divine. A Historical Study. Elsevier, Amsterdam, 571–595.
  • Koßler, M. (2020) Language as an “Indispensable Tool and Organ” of Reason. Intuition, Concept and Word in Schopenhauer. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 15–25.
  • Lemanski, J. (2016) Schopenhauers Gebrauchstheorie der Bedeutung und das Kontextprinzip. Eine Parallele zu Wittgensteins Philosophischen Untersuchungen. In: Schopenhauer-Jahrbuch 97, 171–197.
  • Lemanski, J. (2017) ショーペンハウアーにおける意味の使用理論と文脈原理 : ヴィトゲンシュタイン ショーペンハウアー研究 = Schopenhauer-Studies 22, 150–190.
  • Lemanski, J. (2021) World and Logic. College Publications, London.
  • Lemanski, J. (2022a) Discourse Ethics and Eristic. In: Polish Journal of Aesthetics 62, 151–162.
  • Lemanski, J. (2022b) Schopenhauers Logikdiagramme in den Mathematiklehrbüchern Adolph Diesterwegs. In. Siegener Beiträge zur Geschichte und Philosophie der Mathematik 16 (2022), 97–127.
  • Lemanski, J. (2023) Logic Diagrams as Argument Maps in Eristic Dialectics. In: Argumentation, 1–21.
  • Lemanski J. and Dobrzanski, M. (2020) Reism, Concretism, and Schopenhauer Diagrams. In: Studia Humana 9, 104–119.
  • Lübbig Thomas (2020), Rhetorik für Plädoyer und forensischen Streit. Mit Schopenhauer im Gerichtssaal. Beck, München.
  • Matsuda, K. (2016) Spinoza’s Redundancy and Schopenhauer’s Concision. An Attempt to Compare Their Metaphysical Systems Using Diagrams. Schopenhauer-Jahrbuch 97, 117–131.
  • Marciniak, A. (2020) Wprowadzenie do erystyki dla pedagogów – Logos. Poprawność materialna argumentu. In: Studia z Teorii Wychowania 11:4, 59–85.
  • Menne, A. (2003) Arthur Schopenhauer. In: Hoerster, N. (ed.) Klassiker des philosophischen Denkens. Vol. 2. 7th ed. DTV, München, 194–230.
  • Moktefi, A. and Lemanski, J. (2018) Making Sense of Schopenhauer’s Diagram of Good and Evil. In: Chapman, P. et al. (eds.) Diagrammatic Representation and Inference. 10th international Conference, Diagrams 2018, Edinburgh, UK, June 18–22, 2018. Proceedings. Springer, Berlin et al., 721–724.
  • Moktefi, A. (2020) Schopenhauer’s Eulerian Diagrams. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 111–129.
  • Pedroso, M. P. O. M. (2016) Conhecimento enquanto Afirmação da Vontade de Vida. Um Estudo Acerca da Dialética Erística de Arthur Schopenhauer. Universidade de Brasília, Brasília 2016.
  • Pluder, V. (2020) Schopenhauer’s Logic in its Historical Context. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Basel, 129–143.
  • Regehly, T. (2018) Die Berliner Vorlesungen: Schopenhauer als Dozent. In Schubbe, D., Koßler, M. (ed.) Schopenhauer-Handbuch: Leben – Werk – Wirkung. 2nd ed. Metzler, Stuttgart, 169–179.
  • Saaty, T. L. (2014) The Three Laws of Thought, Plus One: The Law of Comparisons. Axioms 3:1, 46–49.
  • Salviano, J. (2004) O Novíssimo Organon: Lógica e Dialética em Schopenhauer. In: J. C. Salles (Ed.). Schopenhauer e o Idealismo Alemão. Salvador 99–113.
  • Schroeder, S. (2012) Schopenhauer’s Influence on Wittgenstein. In: Vandenabeele, B. (ed.) A Companion to Schopenhauer. Wiley-Blackwell, Chichester et al., 367–385.
  • Schubbe, D. (2010) Philosophie des Zwischen. Hermeneutik und Aporetik bei Schopenhauer. Königshausen & Neumann, Würzburg.
  • Schubbe, D. (2018) Philosophie de l’entre-deux. Herméneutique et aporétique chez Schopenhauer. Transl. by Marie-José Pernin. Presses Universitaires Nancy, Nancy.
  • Schubbe, D. and Lemanski, J. (2019) Problems and Interpretations of Schopenhauer’s World as Will and Representation. In: Voluntas – Revista Internacional de Filosofia 10(1), 199–210.
  • Schubbe, D. (2020) Schopenhauer als Hermeneutiker? Eine Replik auf Thomas Regehlys Kritik einer hermeneutischen Lesart Schopenhauers. In: Schopenhauer-Jahrbuch, 100, 139–147.
  • Schüler, H. M. & Lemanski, J. (2020) Arthur Schopenhauer on Naturalness in Logic. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 145–165.
  • Schulze, G. E. (1810) Grundsätze der Allgemeinen Logik. 2nd ed. Vandenhoeck und Ruprecht, Göttingen.
  • Schumann, G. (2020) A Comment on Lemanski’s “Concept Diagrams and the Context Principle”. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 73–85.
  • Segala, M. (2020) Schopenhauer and the Mathematical Intuition as the Foundation of Geometry. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 261–287.
  • Shapshay, S. (2020) The Enduring Kantian Presence in Schopenhauer’s Philosophy. In: R. L. Wicks (ed.) The Oxford Handbook of Schopenhauer. Oxford Univ. Press, Oxford, 110–126.
  • Tarrazo, M. (2004) Schopenhauer’s Prolegomenon to Fuzziness. In: Fuzzy Optimization and Decision Making 3, 227–254.
  • Weimer, W. (1995) Ist eine Deutung der Welt als Wille und Vorstellung heute noch möglich? Schopenhauer nach der Sprachanalytischen Philosophie. In: Schopenhauer-Jahrbuch 76, 11–53.
  • Weimer, W. (2018) Analytische Philosophie. In Schubbe, D., Koßler, M. (eds.) Schopenhauer-Handbuch. Leben – Werk – Wirkung. 2nd ed. Metzler, Stuttgart, 347–352.
  • Xhignesse, M. -A. (2020) Schopenhauer’s Perceptive Invective. In Lemanski, J. (ed.) Language, Logic, and Mathematics in Schopenhauer. Birkhäuser, Cham, 95–107.

 

Author Information

Jens Lemanski
Email: jenslemanski@gmail.com
University of Münster
Germany

The Value of Art

Philosophical discourse concerning the value of art is a discourse concerning what makes an artwork valuable qua its being an artwork. Whereas the concern of the critic is what makes the artwork a good artwork, the question for the aesthetician is why it is a good artwork. When we refer to a work’s value qua art, we mean those elements of it that contribute to or detract from that work’s value considered as an artwork. In this way, we aim to exclude those things that are valuable or useful about an artwork, such as a sculpture’s being a good doorstop, but that are not relevant for assessment in artistic terms. Philosophers of art, then, attempt to justify for the critic the categories or determinants with which they can make an appropriate and successful appraisal of an artwork.

What persons consider to be valuable about artworks has often tracked what they take artworks to be. In the early history of art, artworks were taken to be accurate representations of the world, or beautiful, skillful creations that may also have served religious or political functions. Towards the eighteenth century, in light of Baumgarten’s introduction of the term aesthetics, alongside Hume’s and Kant’s treatises, the artwork’s definition and value moved toward the domains of aesthetic experience and judgments of beauty. Autonomy became prized, and political or moral commentary was supposedly inimical to value qua art. Contemporary art has pushed back against these boundaries, with social, ethical, and political messages and criticism being drawn back into our artistic assessments.

Different artworks manifest different kinds of composite values. The philosopher of art’s task is to examine which values can appropriately be considered determinants of artistic value and, subsequently, what the value of art might be beyond these determinants. There is substantial disagreement about which determinants affect artistic value, and how they do so. Consequently, there is a vast catalogue of positions to which aestheticians subscribe, and the terminology can make it difficult to know who is talking about what. To help the reader navigate this terminology and discourse, the end of this article includes an alphabetized summary of those positions. The various positions are cashed out mainly in reference to visual art, with some treatment of literature. Although some positions are easily transferred to other forms of art, some are not.

Table of Contents

  1. The Nature of Artistic Value
    1. Aesthetic Value and Artistic Value
      1. Aesthetic Value
      2. The Relationship
    2. The Definition and Value of Art
  2. The (Im-)Moral Value of Art
    1. The Moral Value of Art
    2. The Immoral Value of Art
    3. The Directness Issue
  3. The Cognitive Value of Art
  4. The Political Value of Art
    1. Epistemic Progress
    2. The Pragmatic View
  5. Summary of Available Positions and Accounts
  6. References and Further Reading

1. The Nature of Artistic Value

From the outset it should be clear that, when discussing the value of art in philosophical terms, we are not talking about the money one is willing to exchange for purchase of an artwork. In fact, this points to a rather peculiar feature about the value of art insofar as it is not a kind of quantifiable value as is, say, monetary value. If a dealer were to ask what the value of an artwork is, we could give them a particular (albeit negotiable) sum, a quantity, something we can pick out. Philosophically, it does not look like the value of art operates in the same way. Rather, artistic value just appears to be how good or bad something is as art. So, for the dealer, Da Vinci’s Salvator Mundi (1500) might be more valuable than Manet’s Spring (1881) simply because it has attracted more money at auction. In the way we’re using artistic value, however, Spring could be a better artwork than Salvator Mundi for a variety of reasons, thus having greater artistic value. Of course, there may be (quite significant) correlation between monetary and artistic value – at least, one would hope there is.

Recent work by Louise Hanson (2013, 2017) on the nature of artistic value is informative here. To capture the idea that artistic value is not the same kind of value as other values, such as monetary, moral, or aesthetic value, Hanson draws an analogy with the university category mistake. Suppose your friend is giving you a tour of the University of Liverpool: she shows you the sports centre, the philosophy department, and the various libraries, and introduces you to faculty, students, the pro-vice-chancellors, and so on. Eventually, she finishes the tour and you ask, “OK, all those things are great, but where is the university?” In this case, you are making a category mistake; there is nothing over, above, or beyond the composite entities that your friend has shown you that constitutes the university (see Ryle, 1949/2009, p. 6 on category mistakes). The same thing is happening with artistic value, Hanson thinks. Artistic value is just a term we use to talk about something’s goodness or badness as art, and it is composed of (a number of) different determinant kinds of value, such as aesthetic, moral, cognitive, and political value.

In this way, artistic value is attributive goodness. It is just how good or bad something is in the category of artworks or, as philosophers like to say, qua art. Accordingly, in order for something to have artistic value, it must be an artwork; something that is not in the category of artworks cannot have artistic value if artistic value is the value (goodness) something has as art. Artistic value, then, is something all and only artworks have to the extent that they are good or bad artworks. Moreover, the assessment of artistic value constrains itself to the domain of artistic relevance: something might be good, but it does not necessarily follow that it is a good artwork. Conversely, something might be a good artwork, but not good simpliciter. That is, Nabokov’s Lolita might be a good artwork (it has high artistic value), but it is not good simpliciter. It might also make for a good coffee-mug coaster, and so is good as a coffee-mug coaster, but this has no bearing on its goodness as art. The reasons for assessing something as having high artistic value must be relevant to the category ‘artworks’, and so not all things valuable about an artwork are things that contribute to its artistic value.

a. Aesthetic Value and Artistic Value

i. Aesthetic Value

Given art’s intimate tie to the aesthetic, a good place to start the inquiry into the value of art appears to be aesthetic value. We are concerned in this subsection with the nature of aesthetic value and what it is as a kind of value, whereas in 1.a.ii we will examine the contentious question concerning the relationship between aesthetic and artistic value, such as whether these are one and the same value. In terms of the value of art, that question is the most important for our purposes. However, in order to answer it, we first need a grip on what aesthetic value actually is. So, what is aesthetic value? Many agree that this question actually involves two subsidiary questions: first, what makes aesthetic value aesthetic and, second, what makes aesthetic value a value? The former has been referred to as the demarcation or aesthetic question, the latter as the normative question, terminology that originates in Lopes (2018; see specifically pp. 41-43 for the proposal of the questions, and pp. 43-50 for a brief discussion of them) and has been adopted by subsequent work in philosophical aesthetics (e.g., Shelley, 2019 and Van der Berg, 2019 both provide assessments in terms of these questions). To be specific, the aesthetic question asks why some merits are distinctively aesthetic merits rather than some other kind of merit, whilst the normative question asks what makes that value reason-giving: how does it “lend weight to what an agent aesthetically should do?” (Lopes, 2018, p. 42).

One possible, and popular, answer is that aesthetic value is co-extensive with, or has its roots within, the aesthetic experience, a certain kind of pleasure derived from experiencing an aesthetic phenomenon. This is known as aesthetic hedonism. As Van der Berg suggests, the theory enjoys “a generous share of intuitive plausibility” (2019, p. 1). Given that we are likely to pursue pleasurable things, aesthetic hedonism provides a plausible answer to the normative question because we do value seeking out pleasurable experiences. What makes the pleasure aesthetic, however, is murkier territory. The aesthetic hedonist needs to provide an account of what makes the pleasure found in aesthetic experiences a distinctively aesthetic pleasure, rather than just pleasure we might find, say, when we finish writing our last graduate school paper.

What makes an aesthetic experience an aesthetic experience can be answered through two main routes: it is either the content of the experience, or the way something is experienced. Carroll (2015) refers to the former as content approaches, himself endorsing such an approach, and the latter as valuing approaches. The content approach suggests that what the experience is directed towards, the “one or a combination of features” that are aesthetic features, makes the experience aesthetic (Carroll, 2015, p. 172). As Carroll suggests, the view is relatively straightforward, and so obtains the benefit of parsimony. All experiences have content and so, insofar as it is an experience, an aesthetic experience has content. The best explanation for an aesthetic experience’s being an aesthetic experience should derive, therefore, from its content, that is, its aesthetic properties. The view also aligns with our intuitions. What we find valuable about aesthetic phenomena is how they look: their aesthetic properties as gestalts of formal and perceptible properties. These are the content of, and give rise to, the kind of experience we have in response to them. Aesthetic value, then, becomes wrapped up with the aesthetic experience which, in turn, is wrapped up with the formal, perceptual properties of the work: aesthetic properties. Couching the value in terms of aesthetic properties carries the advantage of explaining the aesthetic value of non-artistic, but nonetheless aesthetic, phenomena such as nature and mathematical or scientific formulae. We often refer to sunsets and landscapes as beautiful or dull. Likewise, we might attribute an attractive symmetry to a certain equation, or an enchanting simplicity to the proof of an otherwise complex theorem. As the content theory answers the aesthetic question by pointing to the experience’s content – aesthetic properties – scientific and mathematical formulae with aesthetic properties can invite aesthetic experiences.

One relatively significant objection here is that Carroll maintains a narrow conception of the aesthetic. That is, he bases aesthetic properties on the formal, perceptible properties that give rise to the content of the work. Goldman criticizes this and instead endorses a broad notion of the aesthetic that includes moral, political, and cognitive concerns. If our aesthetic experiences attend to these sorts of features, and their pleasurable nature is affected by them, then it looks like Carroll’s view is too narrow. Indeed, this kind of objection is leveled against those who think aesthetic value and artistic value are one and the same thing, as we shall see later. It looks like we value art for more than just formal and perceptible properties; we say artworks can teach us, that they can provide moral commentary, and so on. That being said, if one does not hold that aesthetic value and artistic value are identical, then the claim that aesthetic experiences are characterized by their content (formal and perceptible properties) looks convincing once one remembers that we have aesthetic experiences not just of artworks, but also of everyday phenomena, including nature. Aesthetic value, therefore, need not include moral and cognitive concerns if it is also ascribed to things that are not artworks.

Valuing approaches, by contrast, vary in what they are committed to, but broadly speaking suggest that aesthetic experiences have a particular character (one that does not necessarily rely on their content). This character is distinct from any other experience we might have, and so is unique to the aesthetic (thus answering the aesthetic question). They might be, for example, experiences ‘for their own sake’, or those that are emotionally rewarding on their own, without recourse to further utility, gain, or ends. The main task here is to account for how we get to an experience valued for its own sake, without necessarily referencing content, and in a way that can distinguish aesthetic pleasure from other, perhaps lesser, pleasures.

We can take a historical turn here: in his third Critique, Kant (1987) introduces the notion of disinterestedness to carve out the distinctive character of aesthetic experiences and judgements of beauty. Disinterestedness refers to the abstraction from ‘concepts’ in our judgement and experience of beautiful things. That is, when we view things aesthetically, we set aside all considerations of what the thing is, how it is supposed to function, whether it makes any moral, political, or social claims, our own personal emotional states, and so on. A judgement of aesthetic value must also demarcate itself from ‘mere agreeableness’, which is perhaps the kind of aforementioned pleasure we have in submitting our final graduate school paper. Kant thinks this unique pleasure arises from our state of disinterestedness leading to the ‘free play’ of the imagination and understanding (see Kant, 1987, §1-§22; in contemporary aesthetics, the notion of disinterestedness has had greater uptake than the claim of ‘free play’). Something’s aesthetic value, on this account, is tied to the value of the experience we have of it without any instrumental, utilitarian, moral, or overall ‘conceptual’ concerns.

The notion of disinterestedness has sparked a lively scholarship, not least because it appears to give rise to a contradiction in Kant’s own Critique. After suggesting that judgements of beauty (perhaps, judgements of aesthetic value in contemporary terms) employ no concepts, therefore subscribing to disinterestedness, he also suggests that a species of judgements of beauty, those concerning dependent beauty, is in fact made with reference to concepts (see Kant, 1987, §16). Additional scholarship has attempted to refine or recalibrate the notion of disinterestedness, specifically with regard to what the aesthetic attitude entails. For example, Stolnitz (1960, 1978) suggests that the aesthetic attitude, which allows us to make perceptual contact with something in order to retrieve an aesthetic experience, is encompassed by disinterested and sympathetic attention to the object for its own sake. Bullough (1912), likewise, invokes a kind of distancing to access the aesthetic experience: over-distance and you’ll be too removed to gain the pleasure, under-distance and you’ll be too involved. The aesthetic question is answered thus: what makes aesthetic value aesthetic is that it is derived from a pleasurable experience of something to which we have adopted the aesthetic attitude.

On these kinds of views, then, aesthetic value is co-extensive with the pleasurable, aesthetic experience gained from perceiving an artwork (or other phenomenon), in virtue of some particular mode of attending. However, the notion of the aesthetic attitude has received substantial criticism in general (Dickie, 1964, is the canonical instigator of such criticism). For example, it is questionable whether we really do alter our attention to things in a special way that goes beyond merely paying attention or not paying attention, in order to gather an aesthetic experience (Dickie, 1964). At least, it’s not something we’re aware of: I don’t enter an art gallery and, prior to looking at any artworks, undergo some attentional ritual that is regarded as “adopting the aesthetic attitude”. Additionally, aesthetic attitude theory appears to render almost anything a source of an aesthetic experience: if it’s down to me what I attend to in this peculiar way, then it’s down to me what can be the source of an aesthetic experience. If this is the case, and aesthetic value is proportionate to the quality of the aesthetic experience, then aesthetic value doesn’t appear all that unique; anything could give us an aesthetic experience. Moreover, the particular views of Stolnitz (1960, 1978) and Bullough (1912), and those derived from them, have been the subject of much discussion. For example, some think that Stolnitz’s (1960) notion of sympathetic disinterested attention is paradoxical. Being sympathetic to the object requires we have some idea of what the thing is and what it’s for, but disinterestedness defies this.

Most aestheticians are keen to carve out the aforementioned distinctions between a ‘broad’ and ‘narrow’ concept of the aesthetic. The narrow view limits the aesthetic to the detection of aesthetic properties: formal features of the work with some relationship to the perceptible properties (see, for example, Carroll, 2012). For example, the vivacity and vibrancy of Kandinsky’s Swinging (1925) are aesthetic properties arising from the formal and perceptible, perhaps ‘lower-level’, properties, which in turn invite an aesthetic experience, perhaps one of ‘movement’. The broad view captures within the aesthetic, say, moral features (see, for example, Gaut, 2007 and Goldman, 2013) as they arise from this form. For the formalist, or narrow aestheticist, aesthetic value refers to either aesthetic properties in themselves, or is a relational value referring to the experience we have of them. For the broad theorist, as moral, political, or cognitive content is brought about through, and directly impacts our response to the form of the work, they can interact and shape aesthetic value. In Goldman’s terms, cognitive and affective work, such as inference, theme detection, moral insight, and so on, are as much a part of the aesthetic experience as is the detection of aesthetic properties. Artworks “engage us on all levels” (Goldman, 2013, p. 331) and, in turn, their aesthetic value is affected as such.

In terms of the value of art, or artistic value, if we equate aesthetic value with artistic value, then artistic value too will be grounded in the aesthetic, formal features of the work, as shaped by one’s narrow or broad view. If we do not equate the two, then we can say that aesthetic value is one of many determinants of artistic value, bringing in other determinants such as cognitive value and moral value. These views come in nuanced forms, as we’ll now see.

ii. The Relationship

For some aestheticians, the issue with coming to an adequate and appropriate account of the nature of artistic value and aesthetic value is derivative of the core issue of defining the very concepts aesthetic and artistic (in section 1.b, we’ll look at the relationship between the definition of art and the nature of artistic value). The thought is that if we construct an appropriate definition of, and relationship between, art and the aesthetic, the remaining issues in aesthetics will gradually be illuminated.

The most succinct, yet still rigorous, assessment of the relationship between aesthetic and artistic value arises from an exchange between a paper by Lopes (2011) and a subsequent response from Hanson (2013). Lopes has attempted to show that any kind of non-aesthetic artistic value is a myth, whilst Hanson has attempted to show that a non-aesthetic, distinctively artistic value is a reality (which paves the way for her later account of artistic value, which we saw earlier in section 1). Lopes thinks we only have two options: to embrace the trivial theory, or to equate artistic value with aesthetic value. The trivial theory suggests that artworks can have many different values, but none are characteristically artistic. For example, an artwork might grant us some moral wisdom, which is something valuable about it. Other things, though, also grant us moral wisdom, so granting moral wisdom isn’t a characteristically artistic value, that is, a value of an artwork qua artwork. The trivial theory, therefore, is uninformative and doesn’t really tell us anything about the value of art. Lopes arrives at these two options after ploughing through definitions of artistic value that, so he thinks, fail to show that a particular value is a distinctively artistic value. Thus, the conclusion reached is that something is a value of art if and only if it is an aesthetic value of that thing within the kind of art form, genre, or other art kind.

As Hanson identifies, it is difficult to place Lopes’ position and what it exactly amounts to, but broadly there are three kinds of positions one might take to get rid of (non-aesthetic) artistic value: error theory, eliminativism, and aestheticism. On the surface these might appear to be the same position, but there are subtle differences between the three, hence the confusion in Lopes’ positioning. An error theory would claim that we are mistaken to talk about artistic value as there is no such thing as artistic value: “it appeals to a concept that does not exist” (Hanson, 2013, p. 494). Aestheticism – as Hanson is using the term – is a claim about aesthetic value and artistic value’s identity, as well as a denial of pluralism about artistic value. Pluralism would allow for many values to be contributory towards artistic, and aesthetic in this case, value; for example, moral and cognitive value might interact with aesthetic value. The aestheticist, however, thinks aesthetic value and artistic value are the same, and that only aesthetic value is a determinant of artistic value. The use of aesthetic value, in this discussion, pertains only to formal, perceptible properties, rather than a broad construal that draws in cognitive and affective components as identified in section 1.a.i. For the aestheticist, then, the value of an artwork is constituted by its aesthetic value and its aesthetic value alone. For the eliminativist, an identity relation is also placed between aesthetic value and artistic value as we are talking about the same thing. Talk of artistic value is redundant because it is just aesthetic value, rather than, as for the error theorist, talk of a non-existent concept. So, for the eliminativist, we have the concept of artistic value, but it’s the same thing as aesthetic value. This position is endorsed by Berys Gaut (2007): aesthetic value is constituted by a pluralism of many different values, and artistic value is aesthetic value.
The denial of pluralism, therefore, sets the aestheticist (for whom only the formal matters for aesthetic value) apart from the eliminativist (for whom only aesthetic value matters for artistic value, though other values may impact, or be drawn out through, form, i.e., aesthetic value). That being said, an eliminativist need not be committed to pluralism.

We have seen reasons for thinking our talk of artistic value is conceptually and/or metaphysically misleading insofar as artistic value is not a kind of value in the same way its determinant values are. Artistic value is something had by all and only artworks as a measure of how good (or bad) they are and, as such, “is just a handy linguistic construction that allows one to talk about the degree to which something is good art in a less cumbersome way than would otherwise be available” (Hanson, 2013, p. 500). In this way, artistic value is not the same kind of value as, but is indeed dependent upon, other kinds of value. With this reasoning in place, one can reject any position that places an identity claim between artistic value and aesthetic value because they are not kinds of value in the same way, and so cannot be identical. That being said, positions that acknowledge artistic value and aesthetic value’s distinct nature can still claim that aesthetic value is the sole determinant of artistic value. So, for example, aestheticists might say that, yes, artistic value and aesthetic value are distinct values both in kind and nature, artistic value has determinants, and the only determinant of artistic value is aesthetic value.

Yet the task for the aestheticist (denying pluralism), arguing that aesthetic value is the sole determinant of artistic value, is rather difficult. Despite our intuitive inclinations towards the formal and the beautiful being significant determinants of artistic value, few would be inclined to suggest that art cannot provide moral, political, or social criticism, bestow knowledge upon us or clarify truths we already hold. Hence, in order to maintain their line, the aestheticist would either need to argue that (i) these other values interact with aesthetic value and, derivatively, affect artistic value (a form of eliminativism), or (ii) these other values are not values of art qua art. Route (i) is endorsed by those, such as Gaut, who see an intimate tie between aesthetic and other forms of value. Consider, for example, our labelling moral behavior beautiful and immoral behavior ugly. Known as the moral beauty view (see section 2.a for this view in greater detail), this looks like a good candidate for the interaction of aesthetic and other forms of value. The issue is that we can speak of the aesthetic value of lots of things, but those things need not be art, and often are not art, which puts pressure on the identity claim. It is mysterious to claim aesthetic value is value qua art whilst attributing aesthetic value to things that are not artworks.

One might then suggest that avenue (ii) stands a better chance at survival. We might argue that Guernica has exceptional aesthetic value: Picasso’s use of geometric forms is dramatic and imposing. We might also suggest that Guernica’s intent, the condemnation of the bombing of Guernica, is a merit of the work. However, we can consider these values separately: Guernica has high artistic value owing to its formalism, and it is a valuable thing owing to its ethico-political commentary, but the latter of these does not contribute to its value qua art. It might be valuable as art owing to its formal qualities, and valuable as a piece of ethico-political commentary, but we should not consider the latter in our assessment of its artistic value (this view, known as moderate autonomism, is explored in section 2). That is, an artwork can have many different valuable features, but the only one that determines its artistic value, its value qua art, is its aesthetic value.

Yet this just doesn’t appear to be how we view artworks. Instead, form and content work together to bear the value of the work qua art. Although content, like Guernica’s ethico-political criticism of war, is wrapped up with, and brought about through, the form of the work, it is not just that form that we value about the artwork as an artwork. We value Guernica qua art in part due to this political criticism, which is drawn out through its jarring and unsettling composition, accentuating and confirming the critical attitude Picasso takes. Yet such an attitude and criticism is over and above simply the work’s being jarring and unsettling. More is needed beyond the form to access the content. If this political commentary, then, is something we find valuable about Guernica as an artwork (the denunciation of war, particularly the bombing of Guernica), and it is detachable from its aesthetic value (as jarring and unsettling), then our safest bet appears to be opting for Hanson’s line: artistic value is how good or bad something is as art, it is something had by all and only artworks, and it has a range of different determinant values, one of which is, perhaps to a substantial but not wholly encompassing extent, its aesthetic value. In order to reach a full and proper understanding of the value of art, then, we need to explore these determinants and, importantly, how they interact. In sections 2 to 5, we’ll look at some of the main values philosophers of art have thought to contribute to, or detract from, the value of an artwork.

Another issue for the aestheticist is the value we repeatedly attach to works that are deliberately anti-aesthetic. The predominant example here is conceptual art. Conceptual artworks do not bestow upon us aesthetic experiences, nor do they have any properties we would appropriately be willing to call aesthetic properties. Despite their anti-aesthetic nature, we willingly and consistently attribute artistic value to these works. If they lack any significant aesthetic value, as Stecker (2019) writes, but simultaneously possess rather substantial artistic value, then it doesn’t look like we can place an identity claim between aesthetic and artistic value. How can something with low aesthetic value have high artistic value if these two values are identical? In order to meet this objection, the aestheticist might wish to appeal to artistic-aesthetic holism or dependency. That is, anti-aesthetic artworks are valuable in a way that depends upon the concept of the aesthetic: it is only in virtue of the value placed on the aesthetic that anti-aesthetic art derives its reactionary value. This, however, places a burden on the aestheticist’s shoulders in trying to show how the absence of something (the aesthetic) gives rise to that something’s value (aesthetic value).

Perceptually indiscernible artworks pose a different, though closely related, issue for the aestheticist. The problem of conceptual art for the aestheticist is explaining how purportedly non-aesthetic art can have artistic value if artistic value is aesthetic value, and aesthetic value depends on aesthetic properties and/or experience. As Stecker (2019) identifies, the problem of indiscernible works is this: aesthetic value and artistic value cannot be one and the same thing if two perceptually indiscernible entities can have differing artistic value. Duchamp’s Fountain might be indiscernible from a ‘regular’ urinal. If aesthetic value is realized through perceptible features, then the regular urinal and Fountain have the same aesthetic value. However, it is highly unlikely that anyone would be willing to admit that Fountain and the regular urinal have precisely the same artistic value. Hence, we should be of the view that artistic value is something other than, though it may well include, aesthetic value. To be clear, the two distinct issues are these: (1) how does the aestheticist account for the aesthetic value of non-aesthetic or deliberately anti-aesthetic art and thus, given the identity claim, its artistic value? (2) How does the aestheticist account for two indiscernible things having (presumably) identical aesthetic value, based on formal features, but distinct artistic value?

It should be noted that some do hold that artworks can bear non-perceptual aesthetic properties. Shelley (2003) constructs such a case, arguing that it is a route scarcely travelled by aestheticists in the face of conceptual art. Instead, aestheticists tend to deny that conceptual artworks are art (an alternative is to deny that all artworks have aesthetic properties, but this is not a good move for the aestheticist!). Shelley expands aesthetic properties from the usual list – “grace, elegance, and beauty” – to include “daring, impudence, and wit” (2003, p. 373). This he does by drawing on a Sibley-inspired notion of aesthetic properties as properties that strike us, rather than being inferred, and by denying that properties which strike us in this way must depend on perceptible properties. As he writes, “ordinary urinals do not strike us with daring or wit, but Fountain, which for practical purposes is perceptually indistinguishable from them, does” (Shelley, 2003, p. 373). See Carroll (2004) for the same conclusion reached via an alternative argument.

Consider, then, a different case against the aestheticist: forgeries and fakes. It is no surprise that our valuing of something changes upon the discovery that it is a forgery, and this change is often, presumably, a diminishment. The (in-)famous case of van Meegeren creating fake Vermeers is commonplace in the literature. Upon the discovery that these ‘Vermeers’ were actually fakes produced by van Meegeren, their value suffered. Despite this, the aesthetic properties, or the aesthetic experience had, presumably stay the same, since the change does not occur at the level of the formal, perceptual properties of the work. Instead, it’s something else that changes; perhaps our valuing of it as, now, not an original, authentic work. Aestheticists might appeal to a change in the moral value or assessment of the work, but the best explanation for this kind of phenomenon appears to be that the aesthetic value, co-extensive with aesthetic experience or properties, remains the same, whereas the artistic value, which can include considerations such as originality, importance for art history, authenticity, and so on, changes. Indeed, it is precisely these problematic scenarios that lead Kulka (2005) to endorse what he terms aesthetic dualism: a divorce between aesthetic value and artistic value, where aesthetic value is gathered from the intrinsic properties of the work, and artistic value includes aesthetic value but also makes reference to extrinsic information such as originality and art-historical importance.

Nevertheless, the conceptual and empirical dependency of the artistic upon the aesthetic is a popular view. Frank Sibley (1959; see 2001 for a collected volume of his papers) proposed a priority of the aesthetic over the artistic: all that is artistic depends upon the concept of the aesthetic. Sibley thus endorses the claim that anti-aesthetic art, by its very nature, depends on the concept of the aesthetic in order to derive its value as art. Andy Hamilton (2006), too, endorses a link of conceptual necessity between the aesthetic and the artistic. What he calls the reciprocity thesis is a conceptual holism between the artistic and the aesthetic: we cannot understand, or even possess, the concept of the artistic without the aesthetic. His case is that it is unfathomable to conceive of a settling community that views a sunset but does not at the same time decorate its homes with ornaments and fanciful designs.

As we can see, many aestheticians appear to support with good reason the idea that aesthetic value and artistic value are not identical. However, we should not assume that the case is too one-sided and that proponents of the aesthetic-artistic value distinction do not have any burdens to meet. For example, in the remainder of this article we’ll look at some values philosophers of art take to be values of art, but the question is: how do we know these are values of artworks qua their being artworks, rather than values artworks have just adventitiously? How do we support the idea, for example, that an artwork’s teaching us something is a value of that work qua art, but an artwork’s covering up a crack in the wall is not a value of that work qua art? This is the main contention of Lopes’ argument against non-aesthetic artistic value: there is no non-trivial way of declaring that a value is an artistic value, that is, a value qua art.

b. The Definition and Value of Art

Before engaging such questions, we should examine the relationship between the definition of art and the value of art. As stated in the introduction, what we have taken artworks to be and what we value about them have often been considered in tandem. Rather than historically trace the definition of art and its correspondence with art’s value, we will focus here on some issues arising from the relationship between defining art and the value of art, in keeping with the article’s scope and purpose. First, a theory of art that picks out artworks based on what we deem to be valuable about them is called a value definition. More likely than not, such a definition will also be a functional theory of art, according to Davies’ (1990, 1991) delineation of functional and procedural theories of art. A functional theory defines artworks in terms of what they do, whereas a procedural theory defines artworks in terms of how they are brought about. Aesthetic theories, for example, are functional theories. The institutional theory, on the contrary, is a procedural theory. It is presumably not the case that we value artworks because they are those things picked out as candidates for appreciation by the art world (the institutional theory), but it might be the case that we value artworks because they are sources of aesthetic experiences (a version of the aesthetic theory).

Consequently, functional theories are often taken to have an advantage over procedural theories in terms of explanatory power. They tell us what an artwork is, alongside telling us why art matters to us. Indeed, it is often, then, a criticism of procedural theories that they do not go on to show us why and how we care about artworks. Although procedural theories might have an easier time encompassing more artworks under their definition (the institutional theory is praised for its extensional adequacy), they fail to meet an important desideratum of theories of art. One must be cautious, however, in approaching both the definition and value of art along the same track. If one takes what is valuable about an artwork to be the sole determinant of artistic value and holds that artworks are those things that have this value, then one runs into a conceptual conundrum. Such definitions exhibit what Hanson (2017) calls definition-evaluation parallelism. These theories are unable to accommodate the existence of bad art.

Hanson cites the theories of Bell and Collingwood as falling into this trap. For Bell, artworks have significant form, and this is the determinant of their artistic value. For Collingwood, art is expressive, and its expression is the determinant of its artistic value. The puzzle, however, is this: if artworks are valuable only because of their significant form, and are artworks because of their significant form, then all artworks are valuable. Something that doesn’t have significant form cannot be artistically valuable, nor can it be deemed art. As such, the existence of bad art becomes a contradiction, given that all artworks, insofar as they are artworks, possess the valuable feature. The same can be said of Collingwood’s expressive theory, substituting expression for significant form in this example. What it would take for an artwork to be bad, i.e., lacking the valuable thing about art, would also remove its artistic status. Hence, there can be no bad art.

Not all value definitions fall into the trap of definition-evaluation parallelism. It is possible, for example, to argue that all artworks have some value x, but that this value is not the sole determinant of the value of art. Instead, a multitude of values constitute the value of art; it’s just that x is also what makes artworks artworks. If they follow this trajectory, theories of art are able to meet the desideratum of being able to at once explain what artworks are and why we value them. As Hanson points out, it has been a mistake by previous aestheticians to think of the issue of bad art as “a knock-down objection to value definitions” (2017, p. 424). Instead, it’s a burden only for those value definitions that at the same time invoke definition-evaluation parallelism.

In addition, it is not the case that one needs to pick the explanatory side they deem more praiseworthy in cases of defining art procedurally or functionally, for one can commit to, as Abell (2012) does, a hybrid theory of art. A hybrid theory of art would be one that is both functional and procedural at the same time. The motivation for a hybrid (procedural and functional) theory is that it can potentially take on the extensional power of a procedural theory (encompassing within the category of artworks the right kind of thing) as well as the explanatory power of a functional theory (letting us know how and why we care about art).

2. The (Im-)Moral Value of Art

The previous discussions set up, and invite, consideration of what other forms of value we take to contribute to, or detract from, the value of an artwork qua art. Throughout the following discussion, the reader should ask which of two distinct claims a given position makes: that the value in question affects the value of the work as a work of art, or that we can assess the artwork in terms of that value even though it does not affect the work’s value qua art. The nature of this interaction is cashed out with great intricacy in the numerous positions espoused in debates over the (im-)moral value of art, and so it is to this value, as a good starting point, that we now turn.

a. The Moral Value of Art

The interaction between moral and aesthetic and/or artistic value has received extensive treatment in the literature, and with extensive treatment comes an extensive list of positions one might adopt. Another entry of the IEP also considers these positions: Ethical Criticism of Art. Nonetheless, the interaction is a considerable source of tension in philosophical aesthetics, and so I shall highlight and assess the key positions here. Roughly, the main positions are as follows. Radical Autonomists think that moral assessments of artworks are inappropriate in their entirety; that is, one should not engage in moral debate about, through, or from artworks. Moderate Autonomists think that artworks can be assessed in terms of their moral character, but that this bears no weight upon their value qua art, that is, their artistic value. Moralists think that a work’s moral value is a determinant of its artistic value. Radical Moralists think that the moral assessment of an artwork is the sole determinant of its artistic value. Ethicists think that, necessarily, a moral defect in a work is an aesthetic defect, and a moral merit an aesthetic merit. Immoralists think that moral defects, or immoral properties, can be valuable for an artwork qua art; that is, they can contribute positively to artistic value.

It should be clear from this brief exposition that the varying terminology renders the debate rather murky. Some, such as Gaut, argue about moral value’s encroachment on aesthetic value, whereas others make claims specifically about artistic value. Todd (2007), for example, identifies that a significant part of the tension over ethical interaction stems from conflating aesthetic and artistic value. In addition, across the literature we see talk of moral value, moral properties, the morality of the artist, moral defects, aesthetic merits, artistic merits, and so on. In fact, it has been pointed out that the debate regarding immoralism (the claim that moral defects can be aesthetic/artistic merits) is marred precisely by this lack of consensus and by the terminological mud slung throughout the debate: no one has declared precisely what a moral defect is, and upon whom or what it falls (McGregor, 2014). A moral defect might lie in the audience if they take up a flawed perspective, or in the work’s suggestion that that response be taken up, or in the display of immoral acts, and so on. In keeping with the focus of this article, I will consider the debate in terms of artistic value, where someone who holds that aesthetic value and artistic value are one and the same will be claiming that there is an interaction between (im-)moral properties and aesthetic value (as artistic value). That is, we will keep in line with the general agreement that what is at stake is the effect of (im-)moral properties on the value of artworks qua artworks. I will refer to moral and immoral aspects of the work in terms of properties and defects/merits.

Let’s start with the autonomist’s claim. Two strands of autonomism are prominent: radical and moderate. The former suggests that any and all moral assessment of an artwork is completely irrelevant to the artistic domain; it is a conceptual fallacy to suggest that morality and aesthetics interact in any substantive way. The artwork is a pure, formal phenomenon that exists in a realm divorced from concerns such as morality and politics. Clearly, however, this view has become outdated. It may have been convincing in the heyday of movements such as art for art’s sake; however, art has historically, and even more so in its contemporary forms, been wrapped up with moral and political commentary, serving to criticize specific events, movements, and agendas. The latter strand, moderate autonomism, may prove more palatable. This is the claim that the moral properties of a work have no interaction with its artistic value, but that artworks can still be assessed in light of morality. On this view, then, Riefenstahl’s Triumph of the Will is good art: it is aesthetically and artistically valuable. However, in contrast to the radical autonomist, we may wish to assess the artwork in terms of a moral system, in which case Triumph of the Will is (very) flawed, but this bears no weight on our assessment of the film-documentary as art. The only thing relevant to the artistic value of Triumph of the Will is its aesthetic value, and on this view it is a good artwork.

There are two significant attractions to this view. Firstly, as mentioned in preceding sections, the idea that the aesthetic qualities of artworks are those things for which we value artworks is intuitively appealing; we praise artworks for their beauty, their form, and their use of this form to wrap up their content. Fundamentally, the autonomist says, we value artworks for how they look, and this is the ‘common-sense’ view of how and why we value art. Secondly, the claim that moral merits and defects do not impinge upon artistic value is supported by the common-denominator argument, first proposed by Carroll. If a value is a value qua art, then it must feature as a relevant value in the assessment of all artworks. However, there are a multitude of artworks for which moral assessment is inappropriate and/or irrelevant. Abstract works, for example, for the most part do not invite moral criticism or commentary, and so assessing them as artworks in terms of such value is inappropriate. Moral assessment, then, is not a common denominator amongst all artworks and so is not appropriate for the assessment of an artwork qua art.

However, there are two interrelated and concerning issues for the autonomist. Firstly, the view is problematic in light of the fact that artworks are valued for many reasons beyond their form and aesthetic qualities; take the earlier examples of genres of art that proceed from an anti-aesthetic standpoint. Secondly, and more importantly, it is standard practice in art criticism to assess (some) artworks in terms of their moral and immoral claims, and this seems indubitably relevant to their assessment as artworks, or qua art. A critical review of Lolita as a work of artistic literature that made no reference to the immorality of Humbert Humbert, and to the relevance of this for its value as an artwork (rather than just its nature as an artwork), would simply have missed the point of Lolita. Similarly, Guernica may be an exceptional, revolutionary use of form, but its assessment as an artwork intuitively must involve its commentary on civil war and its repercussions for civilians. Likewise, it seems relevant to its assessment as art that Triumph of the Will was a propagandistic film endorsing the abhorrent, accentuated narrative of the Nazi party.

The moralist, who thinks an interaction exists between moral value and artistic value, is likely to use these latter examples as motivations for their own view. In these cases, it looks like the very form of the work is in some sense determined by the moral attitudes and values explored. As such, the moralist will claim that “moral presuppositions [can] play a structural role in the design of many artworks” (Carroll, 1996, p. 233). Hence, if we’re going to value artworks for their form and content, and in some cases this is dependent upon the moral claims, views, or theories employed in the work, then we need to accept that the moral value of a work is going to affect its value as an artwork.

Moralists are divided on whether their rule about moral properties (that moral merits can be aesthetic merits) is one of necessity, that is, whether moral merits always yield aesthetic merits. Moderate moralism, for example, holds that sometimes, but not always, moral properties can impinge upon, or contribute to, artistic value (see, for example, Carroll, 1996). In contrast, the ethicist takes the relationship between moral merits and aesthetic merits to be necessary: each and every instance of a moral merit in a work of art is an aesthetic merit. This position was made prominent by Gaut (2007), who, as we saw, also thinks that aesthetic value and artistic value are one and the same. A proper and appropriate formulation of ethicism is thus the following: moral merits are in every case aesthetic merits, and as such moral merits always contribute to the value of an artwork as art. A caveat here is that the moral merits must be core features of the artwork, rather than extraneous elements coinciding with the work. For example, the moral actions of Forrest in Forrest Gump may be aesthetically relevant, but moral claims made by the film studio in the DVD leaflet are not.

Clearly, this is a very strong claim and so requires significant motivation. Gaut bases his endorsement of ethicism upon three arguments: the merited-response argument, the moral beauty argument, and a cognitivist argument, the first and second of which I explore here. A dominant version of the merited-response argument runs as follows: artworks prescribe responses in their spectators, perceivers, or readers in virtue of their content, and their aesthetic success is determined, at least in part, by such responses to the work being merited. One way a response might be unmerited, at least in part, is if it is unethical. Since an unethical response is thereby unmerited, and an artwork’s success depends upon its prescribed responses being merited, ethical defects are aesthetic defects and ethical merits are aesthetic merits (Gaut, 2007; Sauchelli, 2012). In sum, whether a response is merited is directly tied to the moral character of the response the work prescribes. The second argument, the moral beauty view, holds that “moral virtues are beautiful, and moral vices are ugly” (Gaut, 2007, p. 115). From here, we can suggest that if a work has moral virtues—it has “ethically good attitudes”—then it has a kind of beauty. Beauty is, of course, canonically and paradigmatically an aesthetic value. Therefore, moral value contributes to aesthetic value. The argument, as Gaut himself suggests, is straightforward. To assess it requires an evaluation of the link between moral character and beauty, which falls beyond the scope of this article; readers should note that Gaut provides a powerful case for the relation (see Gaut, 2007, chapter six).

b. The Immoral Value of Art

There is something intuitively appealing about the claim that moral merits in artworks can be artistic merits and, as such, contribute to the value of art. The same, however, cannot be said of moral defects as meritorious contributions to the value of art. It seems odd to think that an artwork could be better in part, or wholly, because of the immoral properties it possesses. Immoralism, generally, is the position in aesthetics that holds that moral defects in a work of art can be artistic merits. Despite the instinctive resistance to such a claim, we need not look far afield to find examples of artworks that might fit this sort of bill. Consider, for example, Nabokov’s Lolita, Harvey’s Myra, and Todd Phillips’ Joker. The value of these works seems to be sourced from, or tied to, their inclusion of immoral properties, acts, or events. The issue is that we do not value immoral properties in general, or simpliciter, so why do they sometimes contribute to value qua art?

The cognitive immoralist (Kieran, 2003) suggests that we value immoral properties in artworks because they invite a cognitive and epistemic gain. That is, immoral properties are artistically virtuous insofar as they allow us to access, cement, or cultivate our moral understanding. Lolita’s immoral properties are valuable because they provide further scaffolding for our understanding of the immorality of pedophilia. By accessing the perspective of the pedophile, we garner a more complete understanding of why the actions are wrong. In this way, it has been argued that we have an epistemic duty to seek out these artworks for the resultant moral-cognitive gain, for a more comprehensive understanding of goodness and badness. Just as, Kieran argues, taking up the perspective of someone who subordinates another can help us understand why a bully bullies (namely, to gain pleasure from the subordination of others), so too can artworks offer us the perspectives of perpetrators and thereby improve our understanding. Importantly, however, this epistemic duty does not extend to the real world: it is the imaginative experience that indirectly and informatively entertains immorality, through the suspension of our natural beliefs and attitudes.

The robust immoralist, by contrast, focuses on the aesthetic and artistic achievements involved in a work’s ability to garner our appreciation of immoral characters (Eaton, 2012). Termed ‘rough heroes’, these characters undertake immoral adventures or acts in films, novels, TV shows, and so on, yet for some reason we empathize with them, we like them, we might even fancy them. For example, we might sympathize with a character who is at the same time a murderer. For the robust immoralist, it is a formidable artistic achievement to place us into this conflicted attitude and, hence, morally defective works can be artistically valuable just insofar as they excel in this artistic technique (Eaton, 2012).

The immoralist faces a similar issue to the moralist, however, insofar as they need to show that it is the immoral qualities qua immoral qualities that contribute to artistic value (Paris, 2019; this paper represents a considerable attack on immoralism). For example, some have argued that there is a two-step relation between immoral qualities and artistic value. Against the cognitive immoralist, they argue that it is the cognitive value that contributes to the artistic value, rather than the immoral qualities themselves. Similarly, we might argue that it is the aesthetic achievement prized by the robust immoralist, and hence aesthetic value, that contributes to the artistic value, rather than the immoral qualities themselves. Hence, immoral qualities qua immoral qualities do not contribute to artistic value. Moreover, on this criticism, replacing the immoral qualities with other qualities (perhaps moral ones) that give rise to the same sort of aesthetic and/or cognitive value would produce the same influence upon artistic value, and so, again, it is not the immoral qualities qua immoral qualities that do the work. An intriguing consequence of this kind of criticism is that it also cuts against theories on which moral properties contribute to artistic value: the artistic value is not located in the moral qualities qua moral qualities, since, presumably, they too are replaceable with properties that yield the same cognitive or aesthetic gain.

Relatedly, some have argued that these immoral qualities are deemed artistically valuable only insofar as they are accompanied by aesthetic and/or cognitive gains that mask or cover them up (Paris, 2019). That something else – the character’s retribution, epistemic gain, aesthetic success – must derive from these immoral properties suggests that they would not be valuable on their own. Since they require masking in terms of aesthetic or epistemic success, they are actually shown to be detrimental to artistic value: without the masking, we would not find the work artistically valuable. It is as though, says the critic of immoralism, the immoral qualities require covering up or redemption in order to succeed in the artistic domain, which puts them a far cry from being valuable qua themselves.

Lastly, and this is a criticism particular to cognitive immoralism, it is hard to find works whose relevant properties are genuinely immoral. If the reason we find immoral properties valuable is the ensuing cognitive, epistemic, or moral cultivation — for example, Lolita helps us to verify and scaffold our understanding of pedophilia as immoral — then, on balance, the properties might not turn out to be immoral at all. That is, the subsequent moral cultivation outweighs the immorality of the fictional wrongs, and so the properties of the artwork are not, all things considered, immoral: the benefits outweigh the costs. But if the artwork does not exhibit immoral properties, then there are no immoral properties from which we can argue artistic value arises.

c. The Directness Issue

What the discussions of moralism and immoralism show is that for a property, quality, or value to be legitimately considered a determinant of artistic value, it must affect the value of the work qua art. In several of the instances outlined, it does not look like the moral and/or immoral property or value affects the work’s value qua art; instead, it determines some other value that we take to be valuable qua art. For example, some properties (moral or immoral) affect aesthetic value, which in turn affects artistic value. It is, therefore, aesthetic value, not moral value, that influences artistic value. Or, some properties of artworks look to teach us things or cultivate our understanding; there is therefore a particular cognitive value about them, which has an effect on the artistic value.

The trouble that arises from this kind of thinking is deciding which values we are willing to take as fully and finally valuable qua art, and not merely as determinants of some other value. Perhaps this can be cast as a significant motivator for aestheticism, and indeed for Lopes’ claim about the myth of non-aesthetic artistic value. Aesthetic value appears to be the only value whose status as an artistic value commands universal consensus. In the moral/immoral interaction debate, it almost looks as if the burden always falls, in some unfair way, on the interactionist (who thinks that moral and aesthetic/artistic value interact) rather than the autonomist (who thinks they do not).

Such a concern has been legitimated by Hanson (2019), who suggests that two dogmas have pervaded the interaction debate: two conditions that an interactionist must meet, but that together are incompatible. Let’s refer to these as the directness problem and the qua problem. Roughly, the directness problem highlights that those engaging in the interaction debate have implicitly assumed that the only way an interactionist can show that moral/immoral properties bear on the value of art is if they influence some other value, which subsequently influences artistic value. Hence, if the interactionist shows that moral properties generate cognitive value, which in turn brings about artistic value, then they have proposed an indirect strategy. A direct strategy would be, say, the cognitivist’s case, where cognitive value directly bears on artistic value. The second condition the interactionist must meet is to show that it is the (im-)moral properties qua (im-)moral properties that bear on the value of art (the qua problem); that is, not some other, intermediary value. Clearly, however, it is logically impossible for something to affect something else qua itself indirectly. That is, one cannot conjure an interactionist theory that holds moral properties to be indirectly contributory to artistic value whilst at the same time maintaining that it is the moral properties qua moral properties that contribute to artistic value. One cannot, then, devise an interactionist theory that meets these simultaneous requirements, or dogmas.

What, then, is the resolution? In order not to beg the question against the interactionist, aestheticians need to refrain from implicitly advancing both the directness and the qua constraints simultaneously, and instead level only one, or neither. In her proposal, Hanson suggests we should take direct strategies seriously, with the caveat that this does not necessitate endorsing the qua constraint. Taking direct strategies seriously is legitimate because we allow direct strategies in other cases: consider aesthetic value, cognitive value, the value of originality, influence on subsequent art, and so on. All of these values are taken to influence artistic value directly, so why not moral value? Indeed, as Hanson suggests, we must admit that at least some values directly impact artistic value, lest we enter an infinite regress. That is, if some value contributes to artistic value only via some other value, then does the latter contribute to artistic value directly? If not, then another value needs to be affected, which subsequently affects artistic value, to which the same question can be posed, and so on ad infinitum. Clearly, there must be a break in this chain somewhere, such that some value or values contribute directly to artistic value. As Hanson suggests, we should begin to take direct strategies more seriously, and the prospects look a lot “rosier” when we do.

3. The Cognitive Value of Art

Cognitive immoralism rests decisively on the claim that artistic value can be upheld, indeed augmented, by the cognitive value of an artwork; that is, the claim that artworks can be valuable insofar as they engender some form of cognitive gain. Indeed, a familiar endorsement of art is that it has something to teach us. These claims would be endorsed by a cognitivist about the arts: art can teach us, and it is aesthetically or artistically valuable in doing so. When discussing the cognitive value of art, it is crucial to get at what exactly the claim of “teaching us” amounts to: what is being taught, and how are we being taught it? Rather usefully, Gibson (2008) has delineated the claims of different strands of artistic cognitivism. Gibson suggests that the cognitivist could argue for artworks granting us three kinds of cognitive gain: (i) propositional knowledge, (ii) experiential knowledge, or (iii) the improvement or clarification of knowledge we already possess. Other options are available to the cognitivist, such as a general increase in cognitive capacities, as we saw with cognitive immoralism. Another significant position is that art can train our empathic understanding. We will focus on Gibson’s assessment of cognitivism because of how instructive it is, before moving to Currie’s more recent analyses of cognitivism and, specifically, of the enhancement of empathy.

The cognitivist endorsing (i) would suggest that artworks can give us knowledge-that: knowledge that x is y, or that tomatoes are red. Artworks, for example, might serve as something akin to philosophical thought experiments, from which we can extrapolate some new truth. This strand of thought might argue that Guernica grants us the propositional knowledge that civil wars affect citizens as much as infantry, and so are morally bankrupt, as seen through the bombing of Guernica. By endorsing (ii), the cognitivist claims that we can access “a region of human experience that would otherwise remain unknown” (Gibson, 2008, p. 583). For example, one might claim that Guernica offers us some form of access to what civilians experienced during the bombing of Guernica. In endorsing (iii), the cognitivist – Gibson labels this the neo-cognitivist position – would claim that artworks neither teach us anything new nor grant us access to otherwise inaccessible scenarios, but instead confirm, clarify, or accentuate knowledge we already hold. For example, we all know that war has consequences for citizens and that bombings are bad, and Guernica can shed light on, or improve our knowledge of, these facts.

There are four kinds of issue that cognitivists must overcome, some of which target particular strands more directly than others. Gibson refers to these as the problem of unclaimed truths, missing tools of inquiry, the problem of fiction, and the nature of artistic creativity. The problem of unclaimed truths suggests that the truths found in artworks are borrowed from reality rather than revelatory of it. On this criticism, Guernica cannot grant us knowledge about the bombing of Guernica because, simply, the bombing of Guernica needed to take place in order for the artwork to borrow from that reality. The missing-tools-of-inquiry objection suggests that, in contrast to other cognitive pursuits, artworks do not show us how to reach the knowledge, nor do they justify their truths; they merely show them. Picasso’s Guernica, then, can say that bombing in civil wars leads to civilian deaths, which is an immoral circumstance, but it cannot tell us why. The problem of fiction argues that the truths artworks disclose, if they disclose any at all, are truths of the fictional world of the artwork rather than truths that come into contact with reality. Hence, Guernica can show us that bombing cities in civil wars is wrong in the fictional world of the painting rather than in our world of reality; the leap from fiction to reality is too large a leap to make. Relatedly, the nature-of-artistic-creativity objection argues that artists create artefacts that are meant for “moment[s] of emancipation from reality” (Gibson, 2008, p. 578), and so praising the artistic discipline requires distancing ourselves from reality. Artworks, then, should not be valued for their cognitive gain because it is precisely the purpose of art to detach us from reality rather than to impart knowledge about it.

This final criticism can be launched against nearly all strands of cognitivism: if artworks should not be valued for the cognitive gain they engender, and artistic value is the value of something qua art, then cognitive value is not an artistic value. The problems of unclaimed truths and of fiction strike at the propositional and experiential knowledge accounts of artistic cognitivism. If the propositional or experiential knowledge applies only to the fictional world and cannot be transferred to reality, then it is of little value. Likewise, if the proposition or experience is something borrowed from reality, then again no real teaching is going on, for the knowledge must already be in place for the artwork to borrow it. Moreover, the fictionality objection hits the experiential account particularly hard: to put it simply, nothing is as good as the real thing. Gibson gives the example of a fictional tale about love; going to a library and reading it is not going to give the same experiential access as, should I be so lucky, finding love in reality!

The idea that artworks can give us propositional knowledge has been met with equal criticism. Consider our claim that, in order for something to contribute to artistic value, it must be valuable qua the artwork; that is, it must be something about or in the artwork that is valuable. Against the cognitivist endorsing the propositional knowledge view, one might suggest that the cognitive gain is made subsequent to engagement with the artwork. Just as with philosophical thought experiments, the knowledge is not held within the fiction; it is derivative of the cognitive efforts of the beholder. Guernica, considered as a thought experiment, does not give us the knowledge qua itself as an artwork; rather, we extrapolate the moral propositions subsequent to our engagement. That is, Guernica does not say “the killing of innocent civilians is bad”, but instead gives us a pictorial representation of the bombing of Guernica, from which we must undergo some cognitive work to arrive at this claim. Hence, the cognitive gain is not found within the artwork itself, and so cannot be a value qua art.

It looks like the strongest weapon in the cognitivist arsenal is what Gibson calls neo-cognitivism: the view that artworks clarify, accentuate, enhance, or improve knowledge that we already hold. The cognitive value of art, then, lies not in its offering of discrete propositional knowledge, but in its amplificatory role in our cognitive lives; it offers, for example, a new way of getting at some truth. This is the kind of view many aestheticians hold. Diffey (1995) offers a view he calls moderate anti-cognitivism, a middle point between Stolnitz’s (1992) claim that there are no distinctive artistic truths and the cognitivist claim that new knowledge is gained through art. Diffey thinks, instead, that artworks can serve as ways of arriving at a new contemplation of states of affairs. Thomson-Jones (2005) likewise suggests that artworks can grant us access to new ways of looking at certain circumstances and/or states of affairs, particularly in the ethical and/or political domain. Indeed, Gibson, whose paper has been the principal source for this section, concurs that neo-cognitivism is the most promising way forward for the cognitivist.

In recent work, Currie (2020) has scrutinized the claims of the cognitivist in a variety of forms, from the thought-experiment theorist, to the empathy cultivator, to those who think we can gain propositional knowledge, particularly in the context of literary fiction. Currie suggests a move away from talk of knowledge acquisition in cognitivism to the more “guarded” term “learning” (2020, p. 217), arguing that the thought experiments contained within philosophical and scientific discourse offer an epistemic gain with which literary fiction cannot attain parity. He also casts significant doubt on the reliability of truths extracted from fiction, citing doubts about the expertise of authors, evidence of the disconnection between creativity and understanding, and the little support there is for “the practices and institutions of fiction” being bearers of truths (Currie, 2020, p. 198). Ultimately, Currie concludes that “essential features of literary fiction – narrative complexity and the centrality of literary style – seriously detract from any substantial or epistemically rewarding parallel” (Lamarque, 2021), and hence that pursuit of the claim that “’fiction teaches us stuff’ needs to be abandoned” (Currie, 2020, p. 218). Notwithstanding, Currie is careful to emphasize that he does not think literary fiction cannot grant us knowledge tout court. Rather, the point is that learning can take place through fiction, but that it is often marred by an increase in “ignorance, error, or blunted sensibility” (Currie, 2020, p. 216). Where learning does successfully take place, Currie does allow that such cognitive gain contributes to literary value.

With regard to empathic understanding and its improvement via literary fiction, Currie notes that empathy for fictional characters should not be treated on a par with empathy in ‘reality’. When empathizing with fictional characters, the success of such empathy depends upon our getting it right as the narrative tells of the characters (Currie, 2020, p. 201). A further distinction that should be drawn, Currie suggests, is between an increase in empathy, on the one hand, and an increase in the capacity to use that empathy discriminatively and positively, on the other (Currie, 2020, p. 204). Drawing on the empirical literature, Currie argues that evidence of gains in positive, discriminative empathic capacities is lacking, and so we should not be overly optimistic (Currie, 2020, pp. 207-209). Again, though, he does not exclude the possibility of positive empathic gain resulting from fiction. Some readers may improve, some may not; some works may produce a positive effect, some a negative one; one work could produce empathic gain for one individual and loss for another (Currie, 2020, pp. 215-216). Currie’s agenda regarding empathy cultivation through literature, therefore, is to warn against over-optimism, as it was with the more classical cognitivist claims above.

4. The Political Value of Art

a. Epistemic Progress

In light of the continued skepticism about what the cognitivist can and cannot claim, views on which art gives us experiential and/or propositional knowledge have decreased in popularity. However, in the context of contributions to political-epistemic progress, Simoniti (2021) has claimed that some art not only gives us propositional knowledge of the same standard as objective discourse (such as textbooks), but that art sometimes has an advantage over these other forms. Put simply, Simoniti thinks that artworks can target political discourse and engender the same kinds of knowledge as textbooks or news articles, without invoking special or peculiar art-specific knowledge (a now relatively unpopular view), while also being able to plug a gap that objective discourse leaves open.

This is because objective discourse must deal in generalizations: people, events, political parties, and so on, are categorized and essentialized such that a claim about the general is taken to hold of the whole group. Through art, by contrast, we come into contact with particularized, individual narratives and characters, following their stories or depictions. Artworks thereby remind us that the ‘ideal spectator’, abstracted away and taking an encompassing view of states of affairs, events, and groups, is not always the most beneficial standpoint. By allowing us to focus on individuals, artworks can become genuine contributions to epistemic progress by reducing over-confidence in our positions, recalibrating our critical capacities, and facilitating a neutral position (Simoniti, 2021, pp. 569-570).

Indeed, the view that artworks can serve as pieces of political criticism or commentary is not an unpopular one. Guernica, to which we have repeatedly referred, contains political content in its denunciation of war. Rivera’s Man at the Crossroads (1934) had political motivations in its content, its commissioning, and its subsequent destruction. Banksy, the enigmatic graffiti street artist, is renowned for the political undertones of their art. The Guerrilla Girls’ critical works on the Met Museum, repeatedly pointing out that women have entered the Met far more readily as nude subjects than as exhibited artists, are raw forms of political commentary. The core question for philosophers looking at the value of art is whether political value is a genuine determinant of artistic value.

Most aestheticians would be willing to say that art can serve as political commentary or criticism, but not that this represents a distinct value in and of itself. Rather, in similar fashion to Simoniti, the aesthetician is likely to claim that this kind of value is cognitive value. That is, artworks contribute to our knowledge and understanding of politically meritorious and demeritorious states of affairs, raising our consciousness and awareness of them and, hopefully, recalibrating our attitudes so as to realize the most sociopolitically beneficial states of affairs possible. This returns us, then, to the assessment above of the interaction between cognitive value and artistic value, including whether art can genuinely bestow knowledge upon us. Alternatively, we might consider some political aspects of artworks to have an effect upon their moral value, or indeed upon both the cognitive and moral value of the work.

One crucial import into this debate is the nature of aesthetic and/or artistic autonomy, which might helpfully be viewed as a recasting of the interaction debates considered above. This debate bears with particular force upon the political power of art. The idea concerns whether art can, and whether it should, provide criticism or commentary on political states of affairs. Although there is scholarship on the matter, we are not concerned here with political autonomy in terms of the restraint of the state from censoring or interfering with the production and dissemination of particular artworks. Rather, we are concerned with political autonomy in terms of whether art should be viewed politically. For example, a formalist (or a radical autonomist, as described above) will suggest that art should not be assessed in terms of its political content. If one thinks that artistic value can be determined to some extent by cognitive and moral factors, then one is likely to allow political criticism and commentary to feature in the assessment of the value of an artwork.

Yet the debate regarding the political autonomy of art can become much more entrenched. In this form, the debate concerns not whether artworks should be assessed in terms of their political content, but whether artists can or should involve political criticism or comment in the first place. The idea here is that the domain of art is supposed to be a realm of detachment from reality, not rendered ‘impure’ by external factors. Artworks are a source of disinterested pleasure, a way of escaping everyday life and the perils and anxieties we draw from it, appreciated solely for their form and the experiences that arise therefrom. According to this kind of autonomist, artworks should not involve themselves with, and therefore should not feature, any political content. The task of art is to creatively detach from reality, and serving political ends will only diminish that endeavor (for an edited collection on aesthetic and artistic autonomy, see Hulatt, 2013).

For some, such as W. E. B. Du Bois, the artwork and the political are inextricable: “all Art is propaganda and ever must be, despite the wailing of the purists” (Du Bois, 1926, 29). For Du Bois, art – especially at the time of his writing during the Harlem Renaissance – should be used for “furthering the cause of racial equality for African Americans” (Horton, 2013, p. 307), rather than being constructed to “pander to white audiences for the purposes of publication” (Horton, 2013, p. 308). The artist and their work cannot be severed from the ethical, political, and social environment within which they produce and operate, and the proposal of a detachment of the aesthetic and political is inimical to the cementing of extant progress in racial equality and rights (Horton, 2013). Art, then, should not be an autonomous avenue wherein politics is avoided and instead should be used as a political device.

b. The Pragmatic View

Some artworks make an explicit and direct contribution to political progress and the rectification of social issues and problems. These works are generally captured by the terms socially engaged art and relational aesthetics. Relational aesthetics (see Bourriaud, 2002, for the seminal work that introduces this term) tends to refer to those works that do not take on traditional artistic qualities, devices, practices, mediums, or techniques, but instead take as their form the interpersonal relations that they bring about. For example, Rirkrit Tiravanija has conducted a series of relational works in different galleries and exhibitions, constructing makeshift kitchens that serve Thai food to visitors and staff alike, fostering dialogue between them and establishing (or furthering) social bonds. Socially engaged works are executed in ways very similar to social work, engendering direct socially facilitative effects. Examples include Oda Projesi’s workshops and picnics for children in the community, Women on Waves, the Dorchester Housing Projects, and the works of the 2015 Turner Prize-winning collective Assemble. In each instance, the artist(s) make a direct contribution to the resolution or easing of some social issue. In this way, their goals are pragmatic, rooted in tangible, actualized progress, rather than in beauty or form as we often take artworks’ goals to be.

These works are (i) accepted as works of art, and (ii) valuable therein; as such, they have artistic value. Yet Simoniti suggests that they “worryingly disregard the confines of the artworld” (2018, p. 72) in that they do not employ traditional artistic or aesthetic forms or values. In fact, it is precisely in their nature to deviate, as we have seen, from traditional forms of artistic production and merit. As a consequence, Simoniti introduces an account of artistic value that can capture the social-work-esque achievements of these works, and capture them as valuable qua art. Called the pragmatic account of artistic value, it is intended to explain only the value of these works, and states that a value v, possessed by an artwork, is an artistic value if v is the positive political, cognitive, or ethical impact of the work (Simoniti, 2018, p. 76). That is, the value of these works, as artworks, is found in the positive, pragmatic contribution they make to sociopolitical progress. It should be emphasized that Simoniti does not think the pragmatic view should be extended to all forms of art when assessing their value. Instead, it is the sensible option to take only with regard to the artistic value of specific forms of art, such as socially engaged or relational art. Other forms of art, of course, can have their artistic value assessed in terms of the positions we have already explored.

There are some concerns one might wish to raise about these pragmatic works. First, one might question whether we should refer to them as artworks at all. If they make no attempt at any semblance of traditional artistic forms or values, then why call them artworks? Indeed, if they operate closer to the sphere of social work than to art, and have no traditionally artistic qualities, we might want to call them social-works rather than art-works. The relevance for our task (concerning artistic value) is that if artistic value is something’s value qua or as art, then it needs to be art to have it! Simoniti appeals to the institutional and historicist theories of art to meet this objection. A related concern regards the nature of artistic value as a comparative notion: it is just how good or bad an artwork is. If socially engaged works have a specific account of artistic value that applies only to them, then it does not look like we can legitimately compare them to more traditional artistic forms. Moreover, as this particular domain of value assessment perhaps aligns with social work more than with art, we might argue that the comparison should take place between socially engaged works and, well, social work, rather than artworks. If this is the case, then we might wonder whether the value assessment is really about the works qua art, or about some other domain, namely social work.

Finally, one might suggest that extant accounts of artistic value may indeed capture the artistic value of these works. Consider, for example, the moral beauty view of Gaut that we saw earlier. If we can observe an interaction between ethically meritorious character and action and aesthetic value, suggesting that the former are beautiful, then this could be applied to the ethically meritorious character and action of these relational works. Likewise, the functional beauty view, endorsed in particular by Parsons and Carlson (2008), suggests that aesthetic value can be attributed on the basis of something’s appearing suited to its intended function. For example, a flat tyre is aesthetically displeasing because it appears to inhibit the function of the car (Bueno, 2009, p. 47). Perhaps, we might claim, socially engaged works appear in ways that correspond to their intended function. These two brief sketches suggest that the introduction of a specialized notion of artistic value may not be needed.

5. Summary of Available Positions and Accounts

There is a wealth of available views regarding artistic value: its determinants, its relationship to the value of art, its relationship to aesthetic value, whether and how other values can affect it, and so on. Here, I want to provide a brief outline of the views discussed and the available positions and accounts. The purpose is to provide a brief, working statement of each view at hand. This is especially useful because multiple different views sometimes travel under the same heading term. The set is by no means exhaustive and will be updated as is seen fit.

Aestheticism – aesthetic value and artistic value are one and the same value, and only aesthetic value matters for determining artistic value (things like cognitive value, moral value, political value, don’t matter for an assessment qua art).

Anti-cognitivism – there is no such thing as a distinctively artistic truth, or a truth that only art can teach us (see Stolnitz, 1992).

Cognitivism – artworks have something to grant to us in terms of knowledge. This might be new propositional knowledge, experiential knowledge, specifically artistic knowledge, or the artwork may clarify or strengthen already-held truths.

Cognitive Immoralism – moral defects in an artwork can be artistically valuable insofar as they provide some cognitive value (for example, cultivation of our moral understanding).

Definition-Evaluation Parallelism – what makes an artwork an artwork is x, x is a value of art, and the value of art is determined by one sole value, x. Not all value definitions of art conform to definition-evaluation parallelism.

Eliminativism – aesthetic value and artistic value are one and the same thing, and as such talk of artistic value is redundant (things like cognitive value, moral value, and political value might matter for the eliminativist, if they commit to a broad notion of aesthetic value).

Error-theory about artistic value – aesthetic value is what we mean by value qua art; there is no such thing as a distinct artistic value, and we are in error when we talk about it.

Ethicism – moral merits are always aesthetic merits, and moral defects are always aesthetic defects.

Immoralism – moral defects in an artwork can be aesthetic/artistic merits.

Interactionist – (about moral value) someone who thinks that the moral value of an artwork interacts with that artwork’s aesthetic/artistic value.

Moderate Autonomism – aesthetic value is all that matters for artistic value, but artworks might be assessed with reference to the moral domain. However, the latter has no bearing on the artistic value of the work (its value qua art).

Moderate Moralism – in some cases, a work of art is susceptible to treatment in the moral domain, and this can affect its artistic value (its value qua art).

Neo-cognitivism – artworks can be cognitively valuable, and their artistic value augmented as a result, insofar as they can serve to clarify or improve knowledge we already possess.

Pluralism about artistic value – there are many determinants of artistic value, such as aesthetic value, cognitive value, moral value, and political value.

Pragmatic View of Artistic Value – artistic value, explicitly and solely for the set of socially engaged artworks, is the positive cognitive, ethical, or political effect they entail. This view should not be used to apply to other kinds of art, such as painting, sculpture, music, and so on (see Simoniti, 2018).

Radical Autonomism – aesthetic value is all that matters for artistic value, and any assessment of morality with regard to an artwork is inappropriate even if one does not think it bears weight on artistic value.

Radical Moralism – the artistic value of a work of art is determined by, or reducible to, its moral value.

Robust Immoralism – moral defects in an artwork give rise to artistic value insofar as a work achieves aesthetic success through aesthetic properties that arise because of them. For example, fictional murder may be valuable insofar as it invites excitement, vivacity, or mystery.

The Trivial Theory (of artistic value) – artworks have lots of different determinant values, none of which are specific to, or characteristic of, art.

Value Definitions of Art – what makes an artwork an artwork is x, and x is also a (or the) value of art. If x is the sole determinant of the value of art, then the value definition is an instance of definition-evaluation parallelism.

6. References and Further Reading

  • Abell, C. (2012) ‘Art: What it Is and Why it Matters’. Philosophy and Phenomenological Research. Vol. 85 (3) pp. 671-691
  • Bourriaud, N. (2002) Relational Aesthetics. Dijon: Les Presses du réel
  • Bueno, O. (2009) ‘Functional Beauty: Some Applications, Some Worries’. Philosophical Books. Vol. 50 (1) pp. 47-54
  • Bullough, E. (1912) ‘“Psychical Distance” as a Factor in Art and an Aesthetic Principle’. British Journal of Psychology. Vol. 5 (2), pp. 87-118
  • Carroll, N. (1996) ‘Moderate Moralism’. British Journal of Aesthetics. Vol. 36 (3), pp. 223-238
  • Carroll, N. (2004) ‘Non-Perceptual Aesthetic Properties: Comments for James Shelley’. British Journal of Aesthetics. Vol. 44 (4) pp. 413-423
  • Carroll, N. (2012) ‘Recent Approaches to Aesthetic Experience’. The Journal of Aesthetics and Art Criticism. Vol. 70 (2) pp. 165-177
  • Carroll, N. (2015) ‘Defending the Content Approach to Aesthetic Experience’. Metaphilosophy. Vol. 46 (2) pp. 171-188
  • Currie, G. (2020) Imagining and Knowing. Oxford: Oxford University Press
  • Davies, S. (1990) ‘Functional and Procedural Definitions of Art’. Journal of Aesthetic Education. Vol. 24 (2) pp. 99-106
  • Davies, S. (1991) Definitions of Art. Ithaca: Cornell University Press
  • Dickie, G. (1964) ‘The Myth of the Aesthetic Attitude’. American Philosophical Quarterly. Vol. 1 (1) pp. 56-65
  • Diffey, T. J. (1995) ‘What can we learn from art?’. Australasian Journal of Philosophy. Vol. 73 (2) pp. 204-211
  • Du Bois, W. E. B. (1926) ‘Criteria of Negro Art’. The Crisis. Vol. 32 pp. 290-297
  • Eaton, A. W. (2012) ‘Robust Immoralism’. The Journal of Aesthetics and Art Criticism. Vol. 70 (3) pp. 281-292
  • Gaut, B. (2007) Art, Emotion, Ethics. Oxford: Oxford University Press
  • Gibson, J. (2008) ‘Cognitivism and the Arts’. Philosophy Compass. Vol. 3 (4) pp. 573-589
  • Goldman, A. (2013) ‘The Broad View of Aesthetic Experience’. The Journal of Aesthetics and Art Criticism. Vol. 71 (4) pp. 323-333
  • Hamilton, A. (2006) ‘Indeterminacy and reciprocity: contrasts and connections between natural and artistic beauty’. Journal of Visual Art Practice. Vol. 5 (3) pp. 183-193
  • Hanson, L. (2013) ‘The Reality of (Non-Aesthetic) Artistic Value’. The Philosophical Quarterly. Vol. 63 (252) pp. 492-508
  • Hanson, L. (2017) ‘Artistic Value is Attributive Goodness’. The Journal of Aesthetics and Art Criticism. Vol. 75 (4) pp. 415-427
  • Hanson, L. (2019) ‘Two Dogmas of the Artistic-Ethical Interaction Debate’. Canadian Journal of Philosophy. Vol. 50 (2) pp. 209-222
  • Horton, R. (2013) ‘Criteria of Negro Art’. The Literature of Propaganda. Vol. 1 pp. 307-309
  • Hulatt, O. (ed.) (2013) Aesthetic and Artistic Autonomy. New York: Bloomsbury Academic
  • Kant, I. (1987) Critique of Judgement, translated by Werner Pluhar. Indianapolis: Hackett Publishing Company
  • Kieran, M. (2003) ‘Forbidden Knowledge: The Challenge of Immoralism’ in Bermúdez, J. L., Gardner, S. (eds.) (2003) Art and Morality. London: Routledge
  • Kulka, T. (2005) ‘Forgeries and Art Evaluation: An Argument for Dualism in Aesthetics’. The Journal of Aesthetic Education. Vol. 39 (3) pp. 58-70
  • Lopes, D. (2011) ‘The Myth of (Non-Aesthetic) Artistic Value’. The Philosophical Quarterly. Vol. 61 (244) pp. 518-536
  • Lopes, D. (2018) Being for Beauty. Oxford: Oxford University Press
  • Matravers, D. (2014) Introducing Philosophy of Art: in Eight Case Studies London: Routledge
  • McGregor, R. (2014) ‘A Critique of the Value Interaction Debate’. British Journal of Aesthetics. Vol. 54 (4) pp. 449-466
  • Parsons, G., Carlson, A. (2008) Functional Beauty. Oxford: Oxford University Press
  • Ryle, G. (1949/2009) The Concept of Mind: 60th Anniversary Edition Oxford: Routledge
  • Sauchelli, A. (2012) ‘Ethicism and Immoral Cognitivism: Gaut versus Kieran on Art and Morality’. The Journal of Aesthetic Education. Vol. 46 (3) pp. 107-118
  • Shelley, J. (2003) ‘The Problem of Non-Perceptual Art’. British Journal of Aesthetics. Vol. 43 (4) pp. 363-378
  • Shelley, J. (2019) ‘The Default Theory of Aesthetic Value’. British Journal of Aesthetics. Vol. 59 (1) pp. 1-12
  • Sibley, F. (1959) ‘Aesthetic Concepts’. Philosophical Review. Vol. 68 (4) pp. 421-450
  • Sibley, F. (2001) Approach to Aesthetics. Oxford: Oxford University Press
  • Simoniti, V. (2018) ‘Assessing Socially Engaged Art’. The Journal of Aesthetics and Art Criticism. Vol. 76 (1) pp. 71-82
  • Simoniti, V. (2021) ‘Art as Political Discourse’. British Journal of Aesthetics. Vol. 61 (4) pp. 559-574
  • Stecker, R. (2019) Intersections of Value: Art, Nature, and the Everyday. Oxford: Oxford University Press
  • Stolnitz, J. (1960) Aesthetics and Philosophy of Art Criticism. Boston: Houghton Mifflin
  • Stolnitz, J. (1978) ‘”The Aesthetic Attitude” in the Rise of Modern Aesthetics’. The Journal of Aesthetics and Art Criticism. Vol. 36 (4) pp. 409-422
  • Stolnitz, J. (1992) ‘On the Cognitive Triviality of Art’. British Journal of Aesthetics. Vol. 32 (3) pp. 191-200
  • Thomson-Jones, K. (2005) ‘Inseparable Insight: Reconciling Cognitivism and Formalism in Aesthetics’. The Journal of Aesthetics and Art Criticism. Vol. 63 (4) pp. 375-384
  • Van der Berg, S. (2019) ‘Aesthetic hedonism and its critics’. Philosophy Compass. Vol. 15 (1) e12645


Author Information

Harry Drummond
Email: harry.drummond@liverpool.ac.uk
University of Liverpool
U.K.

Charles Darwin (1809–1882)

Charles Darwin is primarily known as the architect of the theory of evolution by natural selection. With the publication of On the Origin of Species in 1859, he advanced a view of the development of life on earth that profoundly shaped nearly all biological and much philosophical thought which followed. A number of prior authors had proposed that species were not static and were capable of change over time, but Darwin was the first to argue that a wide variety of features of the biological world could be simultaneously explained if all organisms were descended from a single common ancestor and modified by a process of adaptation to environmental conditions that Darwin christened “natural selection.”

Although it would not be accurate to call Darwin himself a philosopher, as his training, his professional community, and his primary audience place him firmly in the fold of nineteenth-century naturalists, Darwin was deeply interested and well versed in philosophical works, which shaped his thought in a variety of ways. This foundation included (among others) the robust tradition of philosophy of science in Britain in the 1800s (including, for instance, J. S. Mill, William Whewell, and John F. W. Herschel), and German Romanticism (filtered importantly through Alexander von Humboldt). From these influences, Darwin would fashion a view of the living world focused on the continuity found between species in nature and a naturalistic explanation for the appearance of design and the adaptation of organismic characters to the world around them.

It is tempting to look for antecedents to nearly every topic present in contemporary philosophy of biology in the work of Darwin, and the extent to which Darwin anticipates a large number of issues that remain pertinent today is certainly remarkable. This article, however, focuses on Darwin’s historical context and the questions to which his writings were primarily dedicated.

Table of Contents

  1. Biography
  2. Darwin’s Philosophical Influences
    1. British Philosophy of Science
    2. German Romanticism
    3. Ethical and Moral Theory
  3. The Argument for Natural Selection
    1. Darwin’s Theory
    2. The Origin of Species
  4. Evolution, Humans, and Morality
    1. The Question of Human Evolution
    2. The Descent of Man
    3. Sexual Selection
  5. Design, Teleology, and Progress
    1. Design: The Darwin-Gray Correspondence
    2. Was Darwin a Teleologist?
    3. Is Natural Selection Progressive?
  6. The Reception of Darwin’s Work
    1. Scientific Reception
    2. Social and Religious Reception
    3. Darwin and Philosophy
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Charles Robert Darwin was born in Shropshire, England, on February 12, 1809. He came from a relatively illustrious and well-to-do background: his father, Robert Darwin (1766–1848), was a wealthy and successful physician, and his maternal grandfather, Josiah Wedgwood (1730–1795), founded the pottery and china works that still bear the family name. His paternal grandfather was Erasmus Darwin (1731–1802), a co-founder of the Lunar Society, a group that brought together elite natural philosophers from across the English Midlands, including the chemist Joseph Priestley and the engineers James Watt and Matthew Boulton. Erasmus Darwin’s natural-philosophical writings were widely known, especially Zoonomia (or “Laws of Life”), published between 1794 and 1796, and containing what we might today call some “proto-evolutionary” thought (Browne 1989).

Darwin had been expected to follow in his father’s footsteps and set out for the University of Edinburgh at the age of sixteen to study medicine. He was, anecdotally, so distressed by surgical demonstrations (in the years prior to anesthesia) that he quickly renounced any thoughts of becoming a doctor and turned his focus instead to the zoological lessons (and collecting expeditions) of Robert Edmond Grant, who would soon become his first real mentor. Darwin’s father, “very properly vehement against my turning into an idle sporting man, which then seemed my probable destination” (Autobiography, p. 56), sent him in 1828 to Cambridge, with the goal of becoming an Anglican parson. Cambridge, however, would put him in contact with John Stevens Henslow, an influential botanist who encouraged Darwin to begin studying geology.

His friendship with Henslow would trigger one of the pivotal experiences of Darwin’s life. The professor was offered a position as “ship’s naturalist” for the second voyage of the HMS Beagle, a vessel tasked with sailing around the world and preparing accurate charts of the coast of South America. Henslow, dissuaded by his wife from taking the position himself, offered it to Darwin. After convincing his father that there could, indeed, be a career waiting for him at the end of the trip, Darwin departed on December 27, 1831.

Darwin left England a barely credentialed, if promising, twenty-two-year-old student of zoology, botany, and geology. By the time the ship returned in 1836, Darwin had already become a well-known figure among British naturalists. This recognition came about for several reasons. First, it was a voyage of intellectual transformation. One of Darwin’s most significant scientific influences was Charles Lyell, whose three-volume Principles of Geology arrived by post over the course of the voyage, in the process dramatically reshaping the way in which Darwin would view the geological, fossil, zoological, and botanical data that he collected on the trip. Second, Darwin spent the entire voyage – much of that time in inland South America, while the ship made circuits surveying the coastline – collecting a wide variety of extremely interesting specimens and sending them back to London. Those collections, along with Darwin’s letters describing his geological observations, made him a popular man upon his return, and a number of fellow scientists (including the anatomist and fossil expert Richard Owen, later to be a staunch critic of Darwin’s, and the ornithologist John Gould) prepared, cataloged, and displayed these specimens, many of which were extensively discussed in Darwin’s absence.

It was also on this trip that Darwin made his famed visit to the islands of the Galapagos. It is certain that the classic presentation of the Galapagos trip as a sort of “eureka moment” for Darwin, in which he both originated and became convinced of the theory of natural selection in a single stroke by analyzing the beaks of the various species of finch found across several of the islands, is incorrect. (Notably, Darwin had mislabeled several of his collected finch and mockingbird specimens, and it was only after they were analyzed by the ornithologist Gould on his return and supplemented by several other samples collected by the ship’s captain FitzRoy, that he saw the connections between beak and mode of life that we now understand to be so crucial.) But the visit was nonetheless extremely important. For one thing, Darwin was struck by the fact that the organisms found in the Galapagos did not look like inhabitants of other tropical islands, but rather seemed most similar to those found in coastal regions of South America. Why, Darwin began to wonder, would a divine intelligence not create species better tailored to their island environment, rather than borrowing forms derived from the nearby continent? This argument from biogeography (inspired in part by Alexander von Humboldt, about whom more in the next section) was one Darwin always found persuasive, and it would later be included in the Origin.

Beginning with his return in late 1836, and continuing with a flurry of publications on the results of the Beagle voyage that would culminate in the appearance of the book that we now call Voyage of the Beagle (1839, then titled Journal of Researches into the Geology and Natural History of the Various Countries Visited by H.M.S. Beagle), Darwin would spend six fast-paced years moving through London’s scientific circles. This was a period of frantic overwork and rapidly progressing illness (the subject of extensive speculation ever since, with one recent hypothesis being an undiagnosed lactose intolerance). Darwin married his first cousin Emma Wedgwood in early 1839 (their kinship caused him constant worry over the health of his children), and the family escaped the pressures of London to settle at a country manor in Down, Kent, in 1842 (now renovated as a very attractive museum). Darwin would largely be a homebody from this point on; his poor health and deep attachment to his ten children kept him hearthside for much of the remainder of his career. The deaths of two of his children in infancy, and especially of a third, Annie, at the age of ten, were tragedies that weighed heavily upon him.

Before we turn to Darwin’s major scientific works, it is worth pausing to briefly discuss the extensive evidence revealing the development of Darwin’s thought. Luckily for those of us interested in studying the history of biology, he was a pack-rat. Darwin saved nearly every single letter he received and made pressed copies of those he wrote. He studiously preserved every notebook, piece of copy paper, or field note; we even have lists of the books that he read and when he read them, and some of his children’s drawings, if he later wrote down a brief jot of something on the back of them. As a result, we are able to chronicle the evolution (if you will) of his thinking nearly down to the day.

Thus, we know that over the London period – and particularly during two crucial years, 1837 and 1838 – Darwin would quickly become convinced that his accumulated zoological data offered unequivocal support for what he would call transformism: the idea that the species that exist today are modified descendants of species that once existed in the past but are now extinct. Across the top of his B notebook (started around July 1837), he wrote the word ZOONOMIA, in homage to his grandfather’s own transformist thought. The first “evolutionary tree” would soon follow. Around this time, he came to an understanding of natural selection as a mechanism for transformism, in essentially its modern form – since no organism is exempt from the struggle to survive and reproduce, any advantage, however slight, over its competitors will lead to more offspring in the long run, and hence the accumulation of advantageous change. With enough time, differences large enough to create the gulfs between species would arise.

In 1842, Darwin drafted a short version of this theory (now known as the Sketch) and expanded it to a much longer draft in 1844 (now known as the Essay), which he gave to his wife with instructions and an envelope of money to ensure that it would be published if Darwin died as a result of his persistent health problems. Somewhat inexplicably, he then set this work aside for around a decade, publishing a magisterial four-volume monograph on the classification of the barnacles. (So all-consuming was the pursuit that one of the Darwin children asked a friend where their father “did his barnacles.”) Hypotheses for the delay abound: aversion to conflict; fear of the religious implications of evolution; the impact of the wide ridicule of the rather slapdash anonymous “evolutionary” volume Vestiges of the Natural History of Creation, published in 1844; or simply a desire to immerse himself fully in the details of a taxonomic project prior to developing his own theoretical perspective.

In any event, he slowly began working on evolutionary ideas again over the mid-1850s (starting to draft a massive tome, likely in the end to have been multi-volume, now known as the Big Book or Natural Selection), until, on June 18, 1858, he received a draft of an article from fellow naturalist Alfred Russel Wallace. Darwin believed – whether or not this is true is another matter – that he had been entirely scooped on natural selection. Without Darwin’s direct involvement, Lyell and the botanist Joseph Dalton Hooker arranged a meeting of the Linnean Society at which extracts from Darwin’s Essay (along with an 1857 letter to the American botanist Asa Gray) and Wallace’s paper would be read, allowing Darwin to secure priority in the discovery of natural selection. Meanwhile, Darwin turned to the preparation of an “abstract” of the larger book, much lighter on citations and biological detail than he would have liked, and rushed it into print. On November 24, 1859, On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life was published. Its initial print run immediately sold out.

The book became a massive success, powered in no small part by the ability of natural selection to parsimoniously explain a staggering array of otherwise disunified biological facts (see section 6). Darwin was promoted by a variety of eloquent and influential defenders (such as Thomas Henry Huxley), and even a number of fellow naturalists who were otherwise skeptical (particularly about the theory’s relationship to religious belief) offered him public support.

Despite Darwin’s best efforts (see section 4) to exclude discussion of humans and human evolution from the Origin, both the scientific community and the general public were quick to see the striking impact that Darwin’s work would have on our conception of human origins. After publishing books on the fertilization of orchids, the morphology of climbing plants, and the variation of domesticated plants and animals, Darwin finally turned directly to the question of humans, publishing The Descent of Man in 1871. His efforts there to connect the mental capacities of animals with those of humans would be extended by The Expression of the Emotions in Man and Animals, published the following year, one of the first books to be illustrated with photographic plates. Further books on fertilization, flowers, movement in plants, and a final book on earthworms were Darwin’s remaining major scientific publications – all directed at offering small but important demonstrations of the power of natural selection in action, and the ability of gradual, continuous change to accumulate in significant ways.

Darwin died in April 1882, and is buried in Westminster Abbey, next to John Herschel and just across from Isaac Newton. As such an illustrious burial attests, his legacy as one of the leading scientists of the nineteenth century was immediately cemented, even if the theory of natural selection itself took several decades to meet with universal acceptance (see section 6). By the 1950s, biological theory as a whole had been remade in a largely Darwinian image, and in 1964, Theodosius Dobzhansky would famously write that “nothing makes sense in biology except in the light of evolution.” Darwin was even featured on one side of the British £10 note from 2000 to 2018.

2. Darwin’s Philosophical Influences

For all that Darwin was assuredly not a professional philosopher – as indicated above, his relatively scattered educational trajectory was not one that would have had him reading large numbers of philosophical texts – he was still quite well-read, and concepts from both British and broader European traditions can undeniably be detected in his work. Much debate surrounds the ways in which we should understand those influences, and how they might (or might not) have shaped the content of his later scientific works.

We can be certain that while Darwin studied at Cambridge, he would have received the standard sort of training for a young man interested in becoming a naturalist and an Anglican minister (see Sloan, in Hodge and Radick 2009). He would have studied the Bible, as well as some important works of philosophy (such as John Locke’s Essay). He wrote later in his autobiography about the extent to which reading the natural theology of William Paley had been formative for him—the young Darwin was a genuine admirer of Paley’s approach, and hints of Paley’s perspective on design in nature can be found in the respect with which Darwin would treat arguments concerning the difficulty of accounting for “perfectly” adapted characters like the eye of an eagle.

Darwin also began to engage with the two philosophical traditions that would, as many commentators have noted (see especially Richards and Ruse 2016), largely structure his perspective on the world: one British, consisting of the writings on science by authors like John Herschel, William Whewell, and John Stuart Mill, and one German, which, especially for the young Darwin, would focus on the Romanticism of Alexander von Humboldt.

a. British Philosophy of Science

The British tradition was born out of the professionalization and standardization of scientific practice. Whewell would coin the very term ‘scientist’ around this period, and he and others were engaged in an explicit attempt to clarify the nature of scientific theorizing and inference. Works doing exactly this were published in rapid succession just as Darwin was negotiating the demands of becoming a professional naturalist and fashioning his work for public consumption. Herschel’s Preliminary Discourse on the Study of Natural Philosophy was published in 1830 (Darwin read it the same year), Whewell’s massive History of the Inductive Sciences and Philosophy of the Inductive Sciences appeared in 1837 and 1840, respectively, and Mill’s System of Logic dates from 1843. The very concept of science itself, the ways in which scientific evidence ought to be collected and inferences drawn, and the kinds of character traits that should be possessed by the ideal scientist were all objects of extensive philosophical discourse.

For his part, Darwin was certainly aware of the works of these three authors, even those he had not read first-hand, and was further exposed to their ideas through their presence in a variety of contemporary scientific texts. Works like Charles Lyell’s Principles of Geology (1830–1833) were self-consciously structured to fulfill all the canons of quality science that had been laid down by the philosophers of the day, and so served as practical exemplars for the kind of theorizing that Darwin would later attempt to offer.

Without going too far afield into the incredibly rich subject of nineteenth-century British philosophy of science, a brief sketch of these views is nonetheless illuminating. In the early years of the 1800s, British science had been left with an uneasy mix of two competing philosophies of science. On the one hand, we find a strict kind of inductivism, often attributed to Francis Bacon, as hardened and codified by Isaac Newton. Scientists are to disinterestedly pursue the collection of the largest possible basis of empirical data and generalize from them only when a theoretical claim has received sufficient evidential support. Such was, the story went, the way in which Newton himself had induced the theory of universal gravitation on the basis of celestial and terrestrial motions, and such was the intent behind his famous injunction, “hypotheses non fingo” – I frame no hypotheses.

Such a philosophy of science, however, ran afoul of perhaps the most significant theoretical development of the early nineteenth century: the construction of the wave theory of light, along with Thomas Young and Augustin Fresnel’s impressive experimental confirmations of the various phenomena of interference. This posed a straightforward set of challenges for British philosophers of science to solve. Other than the famous “crucial experiments” in interference, there was little inductive evidence for the wave theory. What was the medium that transmitted light waves? It seemed to escape any efforts at empirical detection. More generally, was not the wave theory of light exactly the sort of hypothesis that Newton was warning us against? And if so, how could we account for its substantial success? How should the Baconian inductive method be related to a more speculative, deductive one?

Herschel, Whewell, and Mill differ in their approaches to this cluster of questions: Herschel’s emphasis on the role of the senses, Whewell’s invocation of Kantianism, and Mill’s use of more formal tools stand out as particularly notable. But at the most general level, all were trying, among numerous other goals, to find ways in which more expansive conceptions of scientific inference and argument could make room for a “legitimate” way to propose and then evaluate more speculative or theoretical claims in the sciences.

Of course, any theory addressing changes in species over geologic time will confront many of the same sorts of epistemic problems that the wave theory of light had. Darwin’s introduction of natural selection, as we will see below, both profited and suffered from this active discussion around questions of scientific methodology. On the one hand, the room that had been explicitly made for the proposition of more speculative theories allowed for the kind of argument that Darwin wanted to offer. But on the other hand, because so much focus had been aimed at these kinds of questions in recent years, Darwin’s theory was, in a sense, walking into a philosophical trap, with interlocutors primed to point out just how different his work was from the inductivist tradition. To take just one example, Darwin would complain in a letter to a friend that he thought that his critics were asking him for a standard of proof that they did not demand in the case of the wave theory. This conflict will be made explicit in the context of the Origin in the next section.

b. German Romanticism

The other philosophical tradition which substantially shaped Darwin’s thought was a German Romantic one, largely present in the figure of the naturalist, explorer, and philosopher Alexander von Humboldt (1769–1859). Darwin seems to have first read Humboldt in the years between the completion of his bachelor’s degree and his departure on the Beagle. Throughout his life, he often described his interactions with the natural world in deeply aesthetic, if not spiritual, terms, frequently linking such reflections back to Humboldt’s influence. A whole host of Darwin’s writings on the environments and landscapes he saw during his voyage, from the geology of St. Jago (now Santiago) Island in Cape Verde to the rainforests of Brazil, are couched in deeply Humboldtian language.

But this influence was not only a matter of honing Darwin’s aesthetic perception of the world, though this was surely part of Humboldt’s impact. Humboldt described the world in relational terms, focusing in particular on the reciprocal connections between botany, geology, and geography, a perspective that would be central in Darwin’s own work. Humboldt also had expounded a nearly universally “gradualist” picture of life – emphasizing the continuity between animals and humans, plants and animals, and even animate and inanimate objects. As we will see below, this kind of continuity was essential to Darwin’s picture of human beings’ place in the world.

In addition to the widely recognized influence of Humboldt, Darwin knew the works of Carl Gustav Carus, a painter and physiologist who had proposed theories of the unity of type (the sharing of an “archetype” among all organisms of a particular kind, reminiscent as well of the botanical work of Goethe). That archetype theory, in turn, was influentially elaborated by Richard Owen, with whom Darwin would work extensively on the evaluation and classification of some of his fossil specimens after his return on the Beagle. As noted above, Darwin was quite familiar with the work of Whewell, who integrated a very particular sort of neo-Kantianism into the context of an otherwise very British philosophy of science (on this point, see particularly Richards’s contribution to Richards and Ruse 2016).

Controversy exists in the literature over the relative importance of the British and German traditions to Darwin’s thought. The debate in the early twenty-first century is somewhat personified in the figures of Michael Ruse and Robert J. Richards, partisans of the British and German influences on Darwin’s work, respectively. On Ruse’s picture, the British philosophy-of-science context, supplemented by the two equally British cultural forces of horticulture and animal breeding (hallmarks of the agrarian, land-owner class) and the division of labor and a harsh struggle for existence (features of nineteenth-century British entrepreneurial capitalism), offers us the best explanation for Darwin’s intellectual foundations. Richards, of course, does not want to deny the obvious presence of these influences in Darwin’s thought. For him, what marks Darwin’s approach out as distinctive is his knowledge of and facility with German Romantic influences. In particular, Richards argues, they let us understand Darwin’s perennial fascination with anatomy and embryology, aspects that are key in this German tradition and the inclusion of which in Darwin’s work might otherwise remain confusing.

c. Ethical and Moral Theory

Darwin recognized throughout his career that his approach to the natural world would have an impact on our understanding of humans. His later works on the evolution of our emotional, social, and moral capacities, then, require us to consider his knowledge of and relation to the traditions of nineteenth-century ethics.

In 1839, Darwin read the work of Adam Smith, in particular his Theory of Moral Sentiments, which he had already known through Dugald Stewart’s biography of Smith. (It is less likely that he was familiar first-hand with any of Smith’s economic work; see Priest 2017.) Smith’s approach to the moral sentiments – that is, his grounding of our moral conduct in our sympathy and social feelings toward one another – would be reinforced by a work that was meaningful for Darwin’s theorizing but is little studied today: James Mackintosh’s Dissertation on the Progress of Ethical Philosophy, published in 1836. For Smith and Mackintosh both, while rational reflection could aid us in better judging a decision, what really inspires moral behavior or right action is the feeling of sympathy for others, itself a fundamental feature of human nature. From his very first reading of Smith, Darwin would begin to write in his notebooks that such an approach to morality would enable us to ground ethical behavior in an emotional capacity that could be compared with the capacities of animals – and which could have been the target of natural selection.

Finally, we have the influence of Thomas Malthus. Darwin read Malthus’s Essay on the Principle of Population (1798) on September 28, 1838, just as he was formulating the theory of natural selection for the first time. Exactly what Darwin took from Malthus, and, therefore, the extent to which the reading of Malthus should be seen as a pivotal moment in the development of Darwin’s thought, is a matter of extensive debate. We may be certain that Darwin took from the first chapter of Malthus’s work a straightforward yet important mathematical insight. Left entirely to its own devices, Malthus notes, the growth of a population is an exponential (geometric) phenomenon. By contrast, even with optimistic assumptions about the ability of humans to increase efficiency and yield in our production of food, it seems impossible that growth in the capacity to supply resources for a given population could proceed faster than a linear (arithmetic) increase.
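The arithmetic behind this insight can be sketched in a few lines of code. The specific numbers below – an initial population of 1 doubling each generation against a food supply starting at 100 and growing by a fixed increment of 100 – are invented for illustration, not drawn from Malthus:

```python
# A back-of-the-envelope rendering of Malthus's contrast (the specific
# numbers are invented for illustration): a population doubling each
# generation versus a food supply that grows by a fixed increment.
def generations_until_shortfall(pop=1.0, food=100.0, food_increment=100.0):
    """Count the generations before geometric growth outruns arithmetic growth."""
    generations = 0
    while pop <= food:
        pop *= 2                 # geometric (exponential) population growth
        food += food_increment   # arithmetic (linear) growth in subsistence
        generations += 1
    return generations

print(generations_until_shortfall())
```

However generous the starting surplus or the increment, the doubling series overtakes the linear one after a modest number of generations – which is exactly the inevitability Malthus stressed.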

This insight became, as Darwin endeavored to produce a more general theory of change in species, crucial to the conviction that competition in nature – what he would call the struggle for existence – is omnipresent. Every organism is locked in a constant battle to survive and reproduce, whether with other members of its species, other species, or even its environmental conditions (of drought or temperature, for instance). This struggle can help us to understand both what would cause a species to go extinct and why even the slightest heritable advantage could tilt the balance in favor of a newly arrived form.

Of course, Malthus’s book does not end after its first chapter. The reason that this inevitable overpopulation and hardship seems to be absent from much of the human condition, Malthus argues, is that (at least some) humans have been prudent enough to adopt other kinds of behaviors (like religious or social checks on marriage and reproduction) that prevent our population growth from proceeding at its unfettered, exponential pace. We must ensure, he argues, that efforts to improve the lives of the poor in fact actually do so, rather than producing the conditions for problematic overpopulation. A number of commentators, perhaps most famously Friedrich Engels, have seen in this broader “Malthusianism” the moral imprint of upper-class British society. Others, by contrast, have argued that Darwin’s context is more complex than this, and requires us to carefully unpack his relationship to the multi-faceted social and cultural landscape of nineteenth-century Britain as a whole (see Hodge 2009 and Radick, in Hodge and Radick 2009).

3. The Argument for Natural Selection

Famously, Darwin described the Origin as consisting of “one long argument” for his theory of evolution by natural selection. From the earliest days of its publication, commentators were quick to recognize that while this was assuredly true, it was not the kind of argument that was familiar in the scientific method of the day.

a. Darwin’s Theory

The first question to pose, then, concerns just what Darwin is arguing for in the Origin. Strikingly, he does not use any form of the term “evolution” until the very last word of the book; he instead has a penchant for calling his position “my view” or “my theory.” Contemporary scholars tend to reconstruct this theory in two parts. First, there is the idea of descent with modification. It was common knowledge (more than a century after the taxonomic work of Linnaeus, for example) that the species that exist today seem to show us a complex network of similarities, forming a tree, composed of groups within groups. Darwin’s proposal, then, is that this structure of similarity is evidence of a structure of ancestry – species appear similar to one another precisely because they share common ancestors, with more similar species having, in general, shared an ancestor more recently. Carrying this reasoning to its logical conclusion, then, leads Darwin to propose that life itself was “originally breathed into a few forms or into one” (Origin, p. 490).

The second argumentative goal of the Origin is to describe a mechanism for the production of the changes which have differentiated species from one another over the history of life: natural selection. As organisms constantly vary, and those variations are occasionally more or less advantageous in the struggle for existence, the possessors of advantageous variations will find themselves able to leave more offspring, producing lasting changes in their lineage, and leading in the long run to the adaptation and diversification of life.
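The compounding effect described here – a slight advantage leading, in the long run, to lasting change in a lineage – can be illustrated with a minimal modern calculation. This is a deterministic sketch using invented numbers (a 1% reproductive advantage, an initial 1% share of the population), not anything found in the Origin:

```python
# A deterministic sketch (invented numbers, modern notation) of selection
# acting on a slight advantage: two variants compete, one leaving 1% more
# offspring per generation than the other.
def variant_frequency(advantage=0.01, generations=500, p0=0.01):
    """Frequency of the advantaged variant after repeated rounds of selection."""
    p = p0  # initial share of the population carrying the advantage
    for _ in range(generations):
        mean_fitness = p * (1 + advantage) + (1 - p) * 1.0
        p = p * (1 + advantage) / mean_fitness  # its share of the next generation
    return p

print(round(variant_frequency(generations=500), 3))   # majority within 500 generations
print(round(variant_frequency(generations=1500), 3))  # approaching fixation
```

Starting from one percent of the population, a variant leaving just one percent more offspring per generation comes to dominate within a few hundred generations; the advantage compounds like interest, which is why Darwin could insist that even the slightest edge matters given enough time.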

Before turning to the argument itself, it is worth offering some context: what were the understandings of the distribution and diversity of life that were current in the scientific community of the day? Two issues here are particularly representative. First, the question of ‘species.’ What exactly was the concept of species to which Darwin was responding? As John Wilkins (2009) has argued, perhaps the most common anecdotal view – that prior to Darwin, everyone believed that species were immutable categories handed down by God – is simply not supported by the historical evidence. A variety of complex notions of species were in play in Darwin’s day, and the difficulty of interpretation here is compounded by the fact that Darwin’s own notion of species is far from clear in his works (there is debate, for example, concerning whether Darwin believed species categories were merely an epistemic convenience or an objective fact about the natural world). In short, Darwin was not as radical on this score as he is sometimes made out to be, in part because there was less theoretical consensus around the question of species than we often believe.

Second, there is the question of ‘gradualism.’ As we have seen, Darwin was heavily influenced by the geologist Charles Lyell, whose Principles of Geology argued for a gradualist picture of geological change (see Herbert 2005 on Darwin’s connections and contributions to geology). Rather than a history of “catastrophes” (Rudwick 1997), where major upheavals are taken to have shaped the geological features we see around us, Lyell argued for the contrary, “uniformitarian” view, on which the same geological causes that we see in action today (like erosion, earthquakes, tidal forces, and volcanic activity), extended over a much longer history of the Earth, could produce all of today’s observed phenomena. Lyell, however, had no interest in evolution. For him, species needed a different causal story: “centers of creation,” where the divine creative power was in the process of building new species, would counterbalance extinctions caused by steady change in the distribution of environmental and climatic conditions across the globe. It is easy to see, however, how Darwin’s own view of evolution by the gradual accumulation of favorable variations could fit naturally into a Lyellian picture of geological and environmental change. Darwin is, in many ways, a product of his time.

b. The Origin of Species

The Origin begins, then, with an analogy between artificial selection – as practiced by agricultural breeders, horticulturalists, or, Darwin’s favorite example, keepers of “fancy” pigeons – and natural selection. Consider for a moment how exactly artificial selection produces new varieties. We have an idea in mind for a new variation that would be aesthetically pleasing or practically useful. Well-trained observers watch for offspring that are born with characteristics that tend in this direction, and those organisms are then bred or crossed. The process repeats and – especially in the nineteenth century, when much work was ongoing to standardize or regularize commercially viable agricultural stocks – modifications can be realized in short order. Of course, this kind of breeding requires the active intervention of an intellect to select the organisms involved, and to plan for the “target” in mind. But this need not be the case. The goal could easily be removed; Darwin has us imagine cases where a simple inclination to keep one’s “best” animals safe during storms or other periods of danger could similarly create selective breeding of this sort, though now with an “unconscious” goal. Furthermore, Darwin will argue, the “selector” can also be done away with.

The next step in the analogy, then, is to demonstrate how such selection could be happening in the natural world. Organisms in nature do seem to vary just as our domestic plants and animals do, he argues – appearances to the contrary are likely just consequences of the fact that the kind of extreme attention to variation in characteristics that an animal breeder gives to their products is absent for wild populations. In just the same way that a breeder will ruthlessly cull any organisms that do not present desirable characters, organisms in the natural world are locked in a brutal struggle for existence. Far more organisms are born than can possibly survive, leading to a kind of Malthusian competition among conspecific organisms, and, in a variety of situations, struggles against the environment itself (heat, cold, drought, and so on) are also severe. Thus, all of the ingredients are there for the analogy to go through: the generation of variation, the relevance of that variation for survival, and the possibility for this process of selection to create adaptation and diversification.

Natural selection, then, because it can work not only on the kinds of visible characters that are of concern to the horticulturalist or animal breeder, but also on the internal construction of organisms, and because it selects for general criteria of success, not limited human goals, will be able to produce adaptations entirely beyond the reach of artificial selection. The result, Darwin writes, “is as immeasurably superior to man’s feeble efforts, as the works of Nature are to those of Art” (Origin, p. 61).

How exactly should we understand this analogy? What kind of evidential or logical support does Darwin think it brings to the process of natural selection? Analogical arguments were increasingly popular throughout the nineteenth century. In part, this may be traced back to Aristotelian and other Greek uses of analogy, which would have been familiar to Darwin and his peers. The role of analogy in the formulation of causal explanations in science had also been emphasized by authors like Herschel and Mill, who argued that one step in proposing a novel causal explanation was the demonstration of an analogy between its mode of action and other kinds of causes we already know to be present in nature.

Darwin then turns to a discussion of an array of objections that he knew would already have occurred to his contemporary readers. For instance: if species arose through gradual transitions, why are they now sharply distinguished from one another? Darwin answers that specialization and the division of labor would produce increased opportunities for success and would thus tend to drive intermediate forms to extinction. How could natural selection possibly have created organs like the eye of an eagle, whose extreme level of perfection had indicated to authors like Paley the signature of design? With enough time, Darwin responds, and provided that the intervening steps along the way were still useful to the organisms that possessed them, even such organs could be produced by a gradual process of selection. Darwin also considers the appearance of instincts, with the aim of demonstrating that natural selection could influence mental processes, and the supposed infertility of hybrids, which could be seen as a problem for the accumulation of variation by crossing.

Next comes a discussion of the imperfection of the geological record. The relative rarity, Darwin argues, of the conditions required for fossilization, along with our incomplete knowledge of the fossils that are present even in well-explored regions like Europe and North America, explains our ignorance of the complete set of transitional forms connecting ancestral species with the organisms alive today. This, then, serves as a segue to a collection of diverse, positive arguments for evolution by natural selection at the end of the volume, often likened to a Whewell-inspired “consilience of inductions” (a demonstration that a number of independent phenomena, not considered when the theory was first proposed, all serve as evidence for it). A number of facts about the distribution of fossils make more sense on an evolutionary picture, Darwin argues. Extinction is given a natural explanation as an outcome of competition, and the relations between extinct groups seem to follow the same kinds of patterns that natural selection successfully predicts to exist among living species.

This final “consilience” portion of the book continues by discussing geographical distribution. Rather than appearing as though they were specifically created for their environments, Darwin notes, the flora and fauna of tropical islands are closely affiliated with the species living on the nearest major continent. This indicates that normal means of dispersal (floating, being carried by birds, and so on), along with steady evolution by natural selection, offers a solid explanation for these distributional facts. Similarly, the Linnaean, tree-like structure of larger groups containing smaller groups which relates all extant species can be explained by common ancestry followed by selective divergence, rather than simply being taken to be a brute fact about the natural world. Brief discussions of morphology, embryology, and rudimentary organs close this section, followed by a summary conclusion.

Darwin’s argument for evolution by natural selection is thus a unique one. It combines a number of relatively different ingredients: an analogy with artificial selection, several direct rebuttals of potential counterarguments, and novel evolutionary explanations for a variety of phenomena that are taken to be improvements on the consensus at that time. The ways in which these arguments relate to one another and to the evidential base for natural selection are sometimes made explicit, but sometimes left as exercises for the reader. Darwin’s critics saw in this unorthodox structure an avenue for attack (about which more in section 6).

The character of Darwin’s argument has thus remained an interpretive challenge for philosophers of science. One can recognize in the elements from which the argument is constructed the influence of particular approaches to scientific reasoning – for instance, Herschel’s understanding of the vera causa tradition, Comte’s positivism, or Whewell’s development of the consilience of inductions. These clues can help us to construct an understanding of Darwin’s strategy as being in dialogue with the contemporary philosophy of his day. How to spell this out in the details, however, is relatively challenging, especially because Darwin was himself no philosopher, and it can thus be difficult to determine to what extent he was really engaging with the details of any one philosopher’s work.

In a different vein, we can also use the Origin as a test case for a variety of contemporary pictures of scientific theory change. To take just one example, Darwin seems at times to offer an explicit argument in support of the epistemic virtues embodied by his theory. In particular, he directly considers the likely fertility of an evolutionary approach, arguing that future biological research in an evolutionary vein will be able to tackle a whole host of new problems that are inaccessible on a picture of special creation.

Similarly, evolutionary theory can serve as a test case for our understanding of scientific explanation in the context of the historical sciences. Darwin’s argument relies crucially upon the ability to generalize from a local, short-term explanation (of, for instance, the creation of a new kind of pigeon by the accumulation of variations in a particular direction) to a long-term explanation of a broad trend in the history of life (like the evolution of flight). Darwin’s twin reliance on this sense of “deep time” and on explanations that offer not a description of a specific causal pathway (one that Darwin could not possibly have known in the mid-nineteenth century) but a narrative establishing the plausibility of an evolutionary account of a phenomenon has since been recognized to be at the heart of a variety of scientific fields (Currie 2018).

4. Evolution, Humans, and Morality

Throughout the Origin, Darwin assiduously avoids discussion of the impact of evolutionary theory on humans. In a brief aside near the end of the conclusion, he writes only that “light will be thrown on the origin of man and his history” (Origin, p. 488). Of course, no reader could fail to notice that an evolutionary account of all other organisms, along with a unified mechanism for evolution across the tree of life, implies a new account of human origins as well. Caricatures depicting Darwin as a monkey greeted the theory immediately upon its publication, and Darwin – whose notebooks and correspondence show us that he had always believed that human evolution was one of the most pressing questions for his theory to consider, even if it was absent from the Origin – finally tackled the question head-on when he published the two-volume Descent of Man, and Selection in Relation to Sex in 1871.

a. The Question of Human Evolution

It is important to see what Darwin’s explanatory goals were in writing the Descent. In the intervening years since publishing the Origin (which was, at this point, already in its fifth edition, and had been substantially revised as he engaged with various critics), Darwin had remained convinced that his account of evolution and selection was largely correct. He had published further volumes on variation in domesticated products and the fertilization of orchids, which he took to secure even further his case for the presence of sufficient variation in nature for natural selection to produce adaptations. What, then, was left to describe with respect to human beings? What made human beings special?

It should be emphasized that humans were no exception to Darwin’s gradualist, continuous picture of life on earth. There is no drastic difference in kind – even with respect to emotions, communication, intellect, or morality – that, on his view, separates human beings from the other animals. The Descent is not, therefore, in the business of advancing an argument for some special distinguishing feature in human nature.

On the contrary, it is this very gradualism that Darwin believes requires a defense. Opposition to his argument for continuity between humans and the other animals came from at least two directions. On the one hand, religious objections were relatively strong. Any picture of continuity between humans and animals would, for many theologians, have to take the human soul into account. Constructing an account of this supposedly distinctive feature of human beings which could be incorporated into a narrative of human evolution was certainly possible – many authors did precisely this (see Livingstone 2014) – but would require significant work (see more on religious responses to Darwin in section 6.b).

On the other hand, and more problematic from Darwin’s perspective, was scientific opposition, perhaps best represented by Alfred Russel Wallace, who argued that the development of human mental capacity had given us the ability to exempt ourselves from natural selection’s impact on our anatomy entirely (on the Darwin-Wallace connection, see Costa 2014). This special place for human reason did not sit well with Darwin, who thought that natural selection would act no differently in the human case. (Wallace would go on to become a spiritualist, a bridge too far for Darwin; the men rarely communicated afterward.)

Further, as has been extensively, if provocatively, maintained by Desmond and Moore (2009), Darwin recognized the moral stakes of the question. The debate over the origins of human races was raging during this period, dividing those who believed that all human beings were members of a single species (monogenists) and those who argued that human races were in fact different species (polygenists). Darwin came from an abolitionist, anti-slavery family (his wife’s grandfather, the founder of the Wedgwood pottery works, famously produced a series of “Am I Not a Man and a Brother?” cameos, which became an icon of the British and American anti-slavery movements). He had seen first-hand the impact of slavery in South America during the Beagle voyage and was horrified. Desmond and Moore’s broader argument, that Darwin’s entire approach to evolution (in particular, his emphasis on common ancestry) was molded by these experiences, has received harsh criticism. But the more limited claim that Darwin was motivated at least to some extent by the ethical significance of an evolutionary account of human beings is inarguable.

b. The Descent of Man

The Descent therefore begins with a demonstration of the similarity between the physical and mental characteristics of humans and other animals. Darwin notes the many physical homologies (that is, parts that derive from the same part in a common ancestor) between humans and animals – including a number of features of adults, our processes of embryological development, and the presence of rudimentary organs that seem to be useful for other, non-human modes of life. When Darwin turns to the intellect, he notes that, of course, even when we compare “one of the lowest savages” to “one of the higher apes,” there is an “enormous” difference in mental capacity (Descent, p. 1:34). Nonetheless, he contends once again that there is no difference in kind between humans and animals. Whatever mental capabilities we consider (such as instincts, emotions, learning, tool use, or aesthetics), we are able to find some sort of analogy in animals. The mixture of love, fear, and reverence that a dog shows for his master, Darwin speculates, might be analogous with humans’ belief in God (Descent, p. 1:68). As regards the emotions in particular, Darwin would return to this subject a year later in his work The Expression of the Emotions in Man and Animals, a full treatise concerning emotional displays in animals and their similarities with those in humans.

Of course, demonstrating that it is possible for these faculties to be connected by analogy with those in animals is not the same thing as demonstrating how such faculties might have evolved for the first time in human ancestors who lacked them. That is Darwin’s next goal, and it merits consideration in some detail.

For Darwin, the evolution of higher intellectual capacities is intimately connected with the evolution of social life and the moral sense (Descent, pp. 1:70–74). We begin with the “social instincts,” which primarily consist of sympathy and reciprocal altruism (providing aid to fellow organisms in the hope of receiving the same in the future). These would do a tolerably good job of knitting together a sort of pre-society, though obviously they would extend only to the members of one’s own “tribe” or “group.” Social instincts, in turn, would give rise to a feeling of self-satisfaction or dissatisfaction with one’s behavior, insofar as it aligned or failed to align with those feelings of sympathy. The addition of communication or language to the mix allows for social consensus to develop, along with the clear expression of public opinion. All these influences, then, could be intensified as they became habits, giving our ancestors an increasingly intuitive feeling for the conformity of their behavior with these emerging social norms.

In short, what we have just described is the evolution of a moral sense. From a basic kind of instinctive sympathy, we move all the way to a habitual, linguistically encoded sense of praise or blame, an instinctive sentiment that one’s actions should or should not have been done, a feeling for right and wrong. Darwin hastens to add that this evolutionary story does not prescribe the content of any such morality. That content will emerge from the conditions of life of the group or tribe in which this process unfolds, in response to whatever encourages or discourages the survival and success of that group. Carried to the extreme, Darwin writes that if people “were reared under precisely the same conditions as hive-bees, there can hardly be a doubt that our unmarried females would, like the worker-bees, think it a sacred duty to kill their brothers, and mothers would strive to kill their fertile daughters; and no one would think of interfering” (Descent, p. 1:73).

There is thus no derivation here of any particular principle of normative ethics – rather, Darwin wants to tell us a story on which it is possible, consistent with evolution, for human beings to have cobbled together a moral sense out of the kinds of ingredients which natural selection can easily afford us. He does argue, however, that there is no reason for us not to steadily expand the scope of our moral reasoning. As early civilizations are built, tribes become cities, which in turn become nations, and with them comes an incentive to extend our moral sympathy to people whom we do not know and have not met. “This point being once reached,” Darwin writes, “there is only an artificial barrier to prevent his sympathies extending to the men of all nations and races” (Descent, pp. 1:100–101).

We still, however, have not considered the precise evolutionary mechanism which could drive the development of such a moral sense. Humans are, Darwin argues, assuredly subject to natural selection. We know that humans vary, sometimes quite significantly, and experience in many cases (especially in the history of our evolution, as we are relatively frail and defenseless) the same kinds of struggles for existence that other animals do. There can be little doubt, then, that some of our features have been formed by natural selection. But the case is less obvious when we turn to mental capacities and the moral sense. In some situations, there will be clear advantages to survival and reproduction acquired by the advancement of some particular mental capacity – for instance, the ability to produce a device for obtaining food or performing well in battle.

The moral sense, however, offers a more complicated case. Darwin recognizes what is sometimes called the problem of biological altruism – that is, it seems likely that selfish individuals who freeload on the courage, bravery, and sacrifice of others will be more successful and leave behind more offspring than those with a more highly developed moral sense. If this is true, how can natural selection have favored the development of altruistic behavior? The correct interpretation of Darwin’s thinking here is a matter of fierce debate in the literature. Darwin’s explanation seems to invoke natural selection operating at the level of groups or tribes. “When two tribes of primeval man, living in the same country, came into competition,” he writes, “if the one tribe included (other circumstances being equal) a greater number of courageous, sympathetic, and faithful members, who were always ready to warn each other of danger, to aid and defend each other, this tribe would without doubt succeed best and conquer the other” (Descent, p. 1:162). This appears to refer to natural selection not in terms of individual organisms competing to leave more offspring, but in terms of groups competing to produce more future groups, a process known as group selection. On the group-selection reading, then, what matters is that the moral sense emerges in a social context. While individually, a selfish member of a group might profit, a selfish tribe will be defeated in the long run by a selfless one, and thus tribes with moral senses will tend to proliferate.

Michael Ruse has, however, argued extensively for a tempering of this intuitive reading. Given that in nearly every other context in which Darwin discusses selection, he focuses on the individual level (even in cases like social insects or packs of wolves, where a group-level reading might be attractive), we should be cautious in ascribing a purely group-level explanation here. Among other considerations, the humans (or hominids) who formed such tribes would likely have been related to one another, and hence a sort of “kin selection” could be at play – the process by which an organism promotes an “extended” version of its own success by helping out related organisms, which provides an individual-level explanation for apparently group-level phenomena.

c. Sexual Selection

Notably, the material described so far has covered only around half of the first volume of the Descent. At this point, Darwin embarks on an examination of sexual selection – across the tree of life, from insects, to birds, to other mammals – that takes up the remaining volume and a half. He does so in order to respond to a unique problem that human beings pose. There is wide diversity in human morphology; different human races and populations look quite different. That said, this diversity seems not to arise as a result of the direct impact of the environment (as similar-looking humans have lived for long periods in radically different environments). It also seems not to be the sort of thing that can be explained by natural selection: there is nothing apparently adaptive about the different appearances of different human groups. How, then, could these differences have evolved?

Darwin answers this question by appealing to sexual selection (see Richards 2017). In just the same way that organisms must compete with others for survival, they must also compete when attracting and retaining mates. If the “standards of beauty” of a given species were to favor some particular characteristic for mating, this could produce change that was non-selective, or which even ran counter to natural selection. The classic example here is the tail of the peacock: even if the tail imposes a penalty in terms of the peacock’s ability to escape from predators, if having an elaborate tail is the only way in which to attract mates and hence to have offspring, the “selection” performed by peahens will become a vital part of their evolutionary story. A variety of non-selective differences in humans, then, could be described in terms of socially developed aesthetic preferences.

This explanation, too, has been the target of extensive debate. It is unclear whether sexual selection is a process that is genuinely distinct from natural selection – after all, if natural selection is intended to include aptitude for survival and reproduction, then it seems as though sexual selection is only a subset of natural selection. Further, the vast majority of Darwin’s examples of sexual selection in action involve traditional, nineteenth-century gender roles, with an emphasis on violent, aggressive males who compete for coy, choosy females. Can the theory be freed of these now outmoded assumptions, or should explanations that invoke sexual selection instead be discarded in favor of novel approaches that take more seriously the insights of contemporary theories of gender and sexuality (see, for instance, Roughgarden 2004)?

5. Design, Teleology, and Progress

Pre-Darwinian concepts of the character of life on earth shared a number of what we might call broad-scale or structural commitments. Features like the design of organismic traits, the use of teleological explanations, or an overarching sense of progress stood out as needing explanation in any biological theory. Many of these would be challenged by an evolutionary view. Darwin was aware of such implications of his work, though they are often addressed only partially or haphazardly in his most widely read books.

a. Design: The Darwin-Gray Correspondence

One aspect of selective explanations has posed a challenge for generations of students of evolutionary theory. The production of variations, as Darwin himself emphasized, is a random process. While he held out hope that we would someday come to understand some of the causal sequences in greater detail (as we indeed now do), in the aggregate it is “mere chance” that “might cause one variety to differ in some character from its parents” (Origin, p. 111). On the other hand, natural selection is a highly non-random process, which generates features that seem to us to be highly refined products of design.

Darwin, of course, recognized this tension, and discussed it at some length – only he did not do so, in general, in the context of his published works. It is his correspondence with the American botanist Asa Gray which casts the most light on Darwin’s thought on the matter (for an insightful recounting of the details, see Lennox 2010). Gray was what we might today call a committed “theistic evolutionist” – he believed that Darwin’s theory might be largely right in the details but hoped to preserve a role for a master plan, a divinely inspired design lying behind the agency of natural selection (which would on this view have been instituted by God as a secondary cause). Just as, many theists since Newton had argued, God might have instituted the law of gravity as a way to govern a harmonious balance in the cosmos, Gray wondered if Darwin might have discovered the way in which the pre-ordained, harmonious balance in the living world was governed.

However, this would require a place for the “guidance” of design to enter, and Gray thought that variation was where it might happen. If, rather than being purely random, variations were guided, directed toward certain future benefits or a grand design, we might be able to preserve divine influence over the evolutionary process. Such a view is entirely consistent with what Darwin had written in the Origin. He often spoke of natural selection in precisely the “secondary cause” sense noted above (and selected two quotes for the Origin’s frontispiece that supported precisely this interpretation), and he stated clearly that what he really meant in calling variation “random” was that we were entirely ignorant of its causes. Could not this open a space for divinely directed evolution?

Darwin was not sure. His primary response to Gray’s questioning was confusion. He wrote to Gray that “in truth I am myself quite conscious that my mind is in a simple muddle about ‘designed laws’ & ‘undesigned consequences.’ — Does not Kant say that there are several subjects on which directly opposite conclusions can be proved true?!” (Darwin to Gray, July 1860, in Lennox 2010, p. 464). Darwin’s natural-historical observations seem to show him that nature is a disorderly, violent, dangerous place, not exactly one compatible with the kind of design that his British Anglican upbringing had led him to expect.

Another source is worthy of note. In his 1868 The Variation of Animals and Plants under Domestication, Darwin asks us to consider the example of a pile of stones that has accumulated at the base of a cliff. Even though we might call them “accidental,” the precise shapes of the stones in the pile are the result of a series of geological facts and physical laws. Now imagine that someone builds a building from the stones in the pile, without reshaping them further. Should we infer that the stones were there for the sake of the building thus erected? Darwin thinks not. “An omniscient Creator,” he writes, “must have foreseen every consequence which results from the laws imposed by Him. But can it be reasonably maintained that the Creator intentionally ordered, if we use the words in any ordinary sense, that certain fragments of rock should assume certain shapes so that the builder might erect his edifice?” (Variation, p. 2:431). Variation, Darwin claims, should be understood in much the same way. There is no sense, divine or otherwise, in which the laws generating variation are put in place for the sake of some single character in some particular organism. In this sense, evolution is a chancy (and hence undesigned) process for Darwin.

b. Was Darwin a Teleologist?

A related question concerns the role of teleological explanation in a Darwinian world. Darwin is often given credit (for example, by Engels) for having eliminated the last vestiges of teleology from nature. A teleological account of hearts, for instance, takes as a given that hearts are there in order to pump blood, and derives from this fact explanations of their features, their function and dysfunction, and so on. (See the discussion of final causes in the entry on Aristotle’s biology.) From the perspective of nineteenth-century, post-Newtonian science, however, such a teleological explanation seems to run contrary to the direction of causation. How could the fact that a heart would go on to pump blood in the future explain facts about its development now or its evolution in the past? Any such explanation would have to appeal either to a divine design (which Darwin doubted), or to some kind of vitalist force or idealist structure preexisting in the world.

A truly “Darwinian” replacement for such teleology, it is argued, reduces any apparent appeals to “ends” or “final causes” to structures of efficient causation, phrased perhaps in terms of the selective advantage that would be conferred by the feature at issue, or a physical or chemical process that might maintain the given feature over time. The presence of these structures of efficient causation could then be explained by describing their evolutionary histories. In this way, situations that might have seemed to call for teleological explanation are made intelligible without any appeal to final causes.

This does seem to be the position on teleology that was staked out by Darwin’s intellectual descendants in mid-twentieth century biology (such as Ernst Mayr). But is this Darwin’s view? It is not clear. A compelling line of argumentation (pursued by philosophers like James Lennox and David Depew) notes the presence of a suspiciously teleological sort of explanation that runs throughout Darwin’s work. For Darwin, natural selection causes adaptations. But the fact that an adaptation is adaptive also often forms part of an explanation for its eventual spread in the population. There is thus a sense in which adaptations come to exist precisely because they have the effect of improving the survival and reproduction of the organisms that bear them. This is unmistakably a teleological explanation – just as we explained hearts by their effect of pumping blood, here we are explaining adaptations by the effects they have on future survival and reproduction.

There are thus two questions to be disentangled here, neither of which has a consensus response in the contemporary literature. First, did Darwin actually advocate for this kind of explanation, or are these merely turns of phrase that he had inherited from his teachers in natural history and to which we should give little actual weight? Put differently, did Darwin banish teleology from biology or demonstrate once and for all the way in which teleology could be made compatible with an otherwise mechanistic understanding of the living world? Second, does contemporary biology give us reasons to reject these kinds of explanations today, or should we rehabilitate a revised notion of teleology in the evolutionary context (for the latter perspective, see, for instance, Walsh 2016)?

c. Is Natural Selection Progressive?

The observation of “progress” across the history of life is a reasonably intuitive one: by comparison to life’s first billion years, which exclusively featured single-celled, water-dwelling organisms, we are now surrounded by a bewildering diversity of living forms. This assessment is echoed in the history of philosophy by way of the scala naturae, the “great chain of being” containing all living things, ordered by complexity (with humans, or perhaps angels, at the top of the scale).

This view is difficult to reconcile with an evolutionary perspective. In short, the problem is that evolution does not proceed in a single direction. The bacteria of today have been evolving to solve certain kinds of environmental problems for just as long, and with just as much success, as human beings and our ancestors have been evolving to solve a very different set of environmental challenges. Any “progress” in evolution will thus be progress in a certain, unusual sense of “complexity.” In the context of contemporary biology, however, it is widely recognized that any one such ordering for all of life is extremely difficult to support. A number of different general definitions of “complexity” have been proposed, and none meets with universal acceptance.

Darwin acknowledged this problem himself. Sometimes he rejected the idea of progress in general. “It is absurd,” he wrote in a notebook in 1837 (B 74), “to talk of one animal being higher than another.” “Never speak of higher and lower,” he wrote as a marginal note in his copy of Robert Chambers’s extremely progressivist Vestiges of the Natural History of Creation. Other times, he was more nuanced. As he had written at the beginning of notebook B, among his earliest evolutionary thoughts: “Each species changes. [D]oes it progress? […] [T]he simplest cannot help – becoming more complicated; & if we look to first origin there must be progress.” When life first begins, there is an essentially necessary increase in complexity (a point emphasized in the contemporary context by authors like Stephen Jay Gould and Daniel McShea), as no organism can be “less complex” than some minimal threshold required to sustain life. Is this “progress”? Perhaps, but only of a very limited sort.

These quotes paint a picture of Darwin as a fairly revolutionary thinker about progress. Progress in general cannot be interpreted in an evolutionary frame; we must restrict ourselves to thinking about evolutionary complexity; this complexity would have been essentially guaranteed to increase in the early years of life on earth. Adaptation refines organismic characteristics within particular environments, but not with respect to any kind of objective, global, or transcendental standard. If this were all Darwin had said, he could be interpreted essentially as consistent with today’s philosophical reflections on the question of progress.

But this is clearly not the whole story. Darwin also seemed to think that this restricted notion of progress as increase in complexity and relative adaptation was related to, if not even equivalent to, progress in the classical sense – and that such progress was in fact guaranteed by natural selection. “And as natural selection works solely by and for the good of each being,” he wrote near the end of the Origin, “all corporeal and mental endowments will tend to progress toward perfection” (Origin, p. 489). The best way to interpret this trend within Darwin’s writing is also a matter of some debate. We might think that Darwin is here doing his best to extract from natural selection some semblance (even if relativized to the local contexts of adaptation to a given environment) of the notion of progress that was so culturally important in Victorian Britain. Or, we might argue, with Robert Richards, that natural selection has thus retained a sort of moral, progressive force for Darwin, a force that might have been borrowed from the ideas of progress present within the German Romantic tradition.

6. The Reception of Darwin’s Work

Darwin’s work was almost immediately recognized as heralding a massive shift in the biological sciences. He quickly developed a group of colleagues who worked to elaborate and defend his theory in the British and American scientific establishment of the day. He also, perhaps unsurprisingly, developed a host of critics. First, let us consider Darwin’s scientific detractors.

a. Scientific Reception

Two facts about the Origin were frequent targets of early scientific critique. First, despite being a work on the origin of species, Darwin never clearly defines what he means by ‘species.’ Second, and more problematically, Darwin attempts to treat the generation and distribution of variations as a black box. One of the goals of the analogy between artificial and natural selection (and Darwin’s later writing of the Variation) is to argue that variation is simply a brute fact about the natural world: whenever a potential adaptation could allow an organism to advantageously respond to a given selective pressure or environmental change, Darwin is confident that the relevant variations could at least potentially arise within the population at issue.

However, as a number of his critics noted (including, for instance, J. S. Mill), it seems to be this process of the generation of variation that is really responsible for the origin of species. If the variation that selection requires is not available, then evolutionary change simply will not occur. It is thus impossible, these critics argued, to have an account of evolution without a corresponding explanation of the generation of variations – or, at the very least, any such account would be incapable of demonstrating that any particular adaptation could actually have been produced by natural selection.

Another vein of scientific criticism concerned Darwin’s evidence base. The classic inductivism that was part and parcel of much of nineteenth-century British philosophy of science (see section 2.a) seems not to be satisfied by Darwin’s arguments. Darwin could not point to specific examples of evolution in the wild. He could not describe a detailed historical sequence of transitional forms connecting an ancestral species with a living species. He believed that he could tell portions of those stories, which he took to be sufficient, but this did not satisfy some critics. And he could not describe the discrete series of environmental changes or selection pressures that led to some particular evolutionary trajectory. Of course, these sorts of evidence are available to us today in a variety of cases, but that was of no help in 1859. Darwin was thus accused (for instance, in a scathing review of the Origin by the geologist Adam Sedgwick) of having inverted the proper order of explanation and having therefore proposed a theory without sufficient empirical evidence.

These scientific appraisals led to a period that has been called (not uncontroversially) the “eclipse of Darwinism” (a term coined by Julian Huxley in the mid-twentieth century; see Bowler 1992). It is notable that almost all of them are related to natural selection, not to the question of common ancestry. The vast majority of the scientific establishment quickly came to recognize that Darwin’s arguments for common ancestry and homology were extremely strong. There was thus a span of several decades during which Darwin’s “tree of life” was widely accepted, while his mechanism for that tree’s generation and diversification was not, even by scientific authorities as prestigious as Darwin’s famed defender Thomas Henry Huxley or the early geneticist T. H. Morgan. A host of alternative mechanisms were proposed, from neo-Lamarckian proposals of an inherent drive to improvement, to saltationist theories that proposed that variation proceeded not by gradual steps, but by large jumps between different forms. It was only with the integration of Mendelian genetics and the theory of evolution in the “Modern Synthesis” (developed in the 1920s and 1930s) that this controversy was finally laid to rest (see, for instance, Provine 1971).

b. Social and Religious Reception

The religious response to Darwin’s work is a complex subject, and was shaped by theological disputes of the day, local traditions of interaction (or lack thereof) with science, and questions of personal character and persuasion (see Livingstone 2014). Some religious authors were readily able to develop a version of natural selection that integrated human evolution into their picture of the world, making space for enough divine influence to allow for the special creation of humans, or at least for human souls. Others raised precisely the same kinds of objections to Darwin’s philosophy of science that we saw above, as they, too, had learned a sort of Baconian image of scientific methodology which they believed Darwin violated. But acceptance or rejection of Darwin’s theory was by no means entirely determined by religious affiliation. A number of figures in the Church of England at the time (an institution that was in the middle of its own crisis of modernization and liberalization) were themselves already quite willing to consider Darwin’s theory, or were even supporters, while a number of Darwin’s harshest critics were no friends to religion (Livingstone 2009).

Simplistic stories about the relationship between evolution and religious belief are thus very likely to be incorrect. The same is true for another classic presentation of religious opposition to Darwin, which is often used to reduce the entire spectrum of nuanced discussion to two interlocutors at a single event: the debate between Bishop Samuel (“Soapy Sam”) Wilberforce and Thomas Henry Huxley, held at the Oxford University Museum on the 30th of June, 1860. Wilberforce famously asked Huxley whether it was through his grandfather’s or grandmother’s side that he had descended from monkeys. As the classic story goes, Huxley calmly laid out the tenets of Darwin’s theory in response, clearly demonstrated the misunderstandings upon which Wilberforce’s question rested, and replied that while he was not ashamed to have descended from monkeys, he would be “ashamed to be connected with a man [Wilberforce] who used his great gifts to obscure the truth.” Huxley retired to thunderous applause, having carried the day.

The only trouble with this account is that it is almost certainly false. There are very few first-hand accounts of what actually took place that day, and many that exist are likely biased toward one side or the other. Huxley’s reputation had much to gain from his position as a staunch defender of science against the Church, and thus a sort of mythologized version of events spread in the decades that followed the exchange. A number of attendees, however, noted rather blandly that, other than the monkey retort (which he almost certainly did say), Huxley’s remarks were unconvincing and likely interested only those already committed Darwinians (Livingstone 2009).

The Scopes Trial, another oft-cited “watershed” moment in the relationship between evolutionary theory and the general public, is also more complex than it might first appear. As Adam Shapiro (2013) has persuasively argued, the Scopes Trial was about far more than simple religious opposition to evolutionary theory (though this was certainly an ingredient). Biology had become part of a larger discussion of educational reform and the textbook system, making any hasty conclusions about the relationship between science and religion in this case difficult to support.

In summary, then, caution should be the order of the day whenever we attempt to analyze the relationship between religion and evolutionary theory. Religious institutions, from Darwin’s day to our own, are subject to a wide array of internal and external pressures, and their responses to science are not often made on the basis of a single, clear decision about the theological or scientific merits of some particular theory. This is especially true in Darwin’s case. Darwin’s theory quickly became part of larger social and cultural debates, whether these were about science and education (as in the United States), or, as was true globally, about broader ideological issues such as secularism, scientific or methodological naturalism, and the nature of the power and authority that scientists should wield in contemporary society.

There are few studies concerning the reception of Darwin by the public at large. Perhaps the most incisive remains that by the linguist Alvar Ellegård (1958), though his work concerns only the popular press in Britain during the first thirteen years after the publication of the Origin. The reaction he documents is largely what one might have expected: the work itself was mostly ignored until its implications for human evolution and theology became more widely known. At that point, natural selection was, for the most part, either neglected or rejected, and public reactions were in general shaped by preexisting social structures and intellectual or cultural affiliations.

c. Darwin and Philosophy

Philosophers were quick to realize that Darwin’s work could bear on a whole host of philosophical concerns. Particularly quick to respond were Friedrich Nietzsche and William James, both of whom were incorporating evolutionary insights or critiques into their works very shortly after 1859. The number of philosophical questions potentially impacted by an evolutionary approach is far too large to describe here and would quickly become an inventory of contemporary philosophy. A few notable examples will have to suffice (for more, see Smith 2017).

Biological species had, since Aristotle, been regularly taken to be paradigmatic exemplars of essences or natural kinds. Darwin’s demonstration that their properties have been in constant flux throughout the history of life thus serves as an occasion to reexamine our very notions of natural kind and essence, a task that has been taken up by a number of metaphysicians and philosophers of biology. When applied to human beings, this mistrust of essentialism poses questions for the concept of human nature. The same is true for final causes and teleological explanations (see section 5.b), where evolutionarily inspired accounts of function have been used to rethink teleological explanations across philosophy and science.

More broadly, the recognition that human beings are themselves evolved creatures can be interpreted as a call to take much more seriously the biological bases of human cognition and experience in the world. Whether this takes the form of a fully-fledged “neurophilosophy” (to borrow the coinage of Patricia Churchland) or simply the acknowledgement that theories of perception, cognition, rationality, epistemology, ethics, and beyond must be consistent with our evolved origins, it is perhaps here that Darwin’s impact on philosophy could be the most significant.

7. References and Further Reading

a. Primary Sources

  • Nearly all of Darwin’s works, including his published books, articles, and notebooks, are freely available at Darwin Online: <http://darwin-online.org.uk>
  • Darwin’s correspondence is edited, published, and also digitized and made freely available by a project at the University of Cambridge: <https://www.darwinproject.ac.uk/>
  • Darwin, Charles. 1859. On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. 1st ed. London: John Murray.
    • The first edition of Darwin’s Origin is now that most commonly read by scholars, as it presents Darwin’s argument most clearly, without his extensive responses to later critics.
  • Darwin, Charles. 1862. On the Various Contrivances by Which British and Foreign Orchids Are Fertilised by Insects. London: John Murray.
    • The work on orchids offers insight into Darwin’s thought on coadaptation and the role of chance in evolution.
  • Darwin, Charles. 1868. The Variation of Animals and Plants Under Domestication. 1st ed. London: John Murray.
    • A two-volume work concerning the appearance and distribution of variations in domestic products.
  • Darwin, Charles. 1871. The Descent of Man, and Selection in Relation to Sex. 1st ed. London: John Murray.
    • Two-volume treatise on the evolution of humans, intelligence, morality, and sexual selection.
  • Darwin, Charles. 1872. The Expression of the Emotions in Man and Animals. London: John Murray.
    • An argument for continuity in emotional capacity between humans and the higher animals.
  • Barlow, Nora, ed. 1958. The Autobiography of Charles Darwin, 1809–1882. London: Collins.
    • Darwin’s autobiography, while occasionally of dubious historical merit, remains an important source for our understanding of his personal life.

b. Secondary Sources

  • Bowler, Peter J. 1992. The Eclipse of Darwinism: Anti-Darwinian Evolution Theories in the Decades around 1900. Baltimore, MD: Johns Hopkins University Press.
    • Explores the various debates surrounding natural selection and variation in the period from around Darwin’s death until the development of the early Modern Synthesis in the 1920s.
  • Browne, Janet. 1989. “Botany for Gentlemen: Erasmus Darwin and ‘The Loves of the Plants.’” Isis 80: 593–621.
    • A presentation of the literary and social context of Darwin’s grandfather Erasmus’s poetic work on taxonomy and botany.
  • Browne, Janet. 1995. Charles Darwin: Voyaging, vol. 1. New York: Alfred A. Knopf.
  • Browne, Janet. 2002. Charles Darwin: The Power of Place, vol. 2. New York: Alfred A. Knopf.
    • The most detailed and highest quality general biography of Darwin, across two volumes loaded with references to published and archival materials.
  • Costa, James T. 2014. Wallace, Darwin, and the Origin of Species. Cambridge, MA: Harvard University Press.
    • A careful discussion of the long relationship between Wallace and Darwin, ranging from the early proposal of natural selection to Wallace’s later defenses of natural and sexual selection, and forays into spiritualism.
  • Currie, Adrian. 2018. Rock, Bone, and Ruin: An Optimist’s Guide to the Historical Sciences. Cambridge, MA: The MIT Press.
    • An exploration of the conceptual issues posed by scientific explanation in the “historical sciences” (such as evolution, geology, and archaeology), from a contemporary perspective.
  • Desmond, Adrian, and James Moore. 2009. Darwin’s Sacred Cause: How a Hatred of Slavery Shaped Darwin’s Views on Human Evolution. Houghton Mifflin Harcourt.
    • Provocative biography of Darwin arguing that his development of evolution (in particular, his reliance on common ancestry) was motivated by his anti-slavery attitude and his exposure to the slave trade during the Beagle voyage.
  • Ellegård, Alvar. 1958. Darwin and the General Reader: The Reception of Darwin’s Theory of Evolution in the British Periodical Press, 1859–1872. Chicago: University of Chicago Press.
    • A wide-ranging study of the impact of Darwin’s works in the popular press of his day.
  • Herbert, Sandra. 2005. Charles Darwin, Geologist. Ithaca, NY: Cornell University Press.
    • Thorough presentation of Darwin’s work as a geologist, extremely important to his early career and to his development of the theory of natural selection.
  • Hodge, M. J. S. 2009. “Capitalist Contexts for Darwinian Theory: Land, Finance, Industry and Empire.” Journal of the History of Biology 42 (3): 399–416. https://doi.org/10.1007/s10739-009-9187-y.
    • An incisive discussion of the relationship between Darwin’s thought and the varying economic and social paradigms of nineteenth-century Britain.
  • Hodge, M. J. S., and Gregory Radick, eds. 2009. The Cambridge Companion to Darwin. 2nd ed. Cambridge: Cambridge University Press.
    • A broad, well written, and accessible collection of articles exploring Darwin’s impact across philosophy and science.
  • Lennox, James G. 2010. “The Darwin/Gray Correspondence 1857–1869: An Intelligent Discussion about Chance and Design.” Perspectives on Science 18 (4): 456–79.
    • Masterful survey of the correspondence between Charles Darwin and Asa Gray, a key source for Darwin’s thoughts about the relationship between evolution and design.
  • Livingstone, David N. 2014. Dealing with Darwin: Place, Politics, and Rhetoric in Religious Engagements with Evolution. Baltimore, MD: Johns Hopkins University Press.
    • A discussion of the wide diversity of ways in which Darwin’s religious and theological contemporaries responded to his work, with a focus on the importance of place and local tradition to those responses.
  • Livingstone, David N. 2009. “Myth 17: That Huxley Defeated Wilberforce in Their Debate over Evolution and Religion.” In Numbers, Ronald L., ed., Galileo Goes to Jail: And Other Myths about Science and Religion, pp. 152–160. Cambridge, MA: Harvard University Press.
    • A brief and extremely clear reconstruction of our best historical knowledge surrounding the Huxley/Wilberforce “debate.”
  • Manier, Edward. 1978. The Young Darwin and His Cultural Circle. Dordrecht: D. Riedel Publishing Company.
    • While somewhat dated now, this book still remains a rich resource for the context surrounding Darwin’s intellectual development.
  • Priest, Greg. 2017. “Charles Darwin’s Theory of Moral Sentiments: What Darwin’s Ethics Really Owes to Adam Smith.” Journal of the History of Ideas 78 (4): 571–93.
    • Explores the relationship between Adam Smith’s ethics and Darwin’s, arguing that Darwin did not derive any significant insights from Smith’s economic work.
  • Provine, William B. 1971. The Origins of Theoretical Population Genetics. Princeton, NJ: Princeton University Press.
    • Classic recounting of the historical and philosophical moves in the development of the Modern Synthesis, ranging from Darwin to the works of R. A. Fisher and Sewall Wright.
  • Richards, Evelleen. 2017. Darwin and the Making of Sexual Selection. Chicago: University of Chicago Press.
    • A carefully constructed history of Darwin’s development of sexual selection as it was presented in The Descent of Man, presented with careful and detailed reference to the theory’s social and cultural context.
  • Richards, Robert J., and Michael Ruse. 2016. Debating Darwin. Chicago: University of Chicago Press.
    • A volume constructed as a debate between Richards and Ruse, both excellent scholars of Darwin’s work and diametrically opposed on a variety of topics, from his intellectual influences to the nature of natural selection.
  • Roughgarden, Joan. 2004. Evolution’s Rainbow: Diversity, Gender, and Sexuality in Nature and People. Berkeley, CA: University of California Press.
    • A rethinking of Darwin’s theory of sexual selection for the contemporary context, with an emphasis on the reconstruction of biological explanations in the light of contemporary discussions of gender and sexuality.
  • Rudwick, M. J. S. 1997. Georges Cuvier, Fossil Bones, and Geological Catastrophes. Chicago: University of Chicago Press.
    • Describes the conflict between “uniformitarian” and “catastrophist” positions concerning the geological record in the years just prior to Darwin.
  • Ruse, Michael, and Robert J. Richards, eds. 2009. The Cambridge Companion to the “Origin of Species.” Cambridge: Cambridge University Press.
    • An excellent entry point into some of the more detailed questions surrounding the structure and content of Darwin’s Origin.
  • Shapiro, Adam R. 2013. Trying Biology: The Scopes Trial, Textbooks, and the Antievolution Movement in American Schools. Chicago: University of Chicago Press.
    • Insightful retelling of the place of the Scopes Trial in the American response to evolutionary theory, emphasizing a host of other, non-scientific drivers of anti-evolutionary sentiment.
  • Smith, David Livingstone, ed. 2017. How Biology Shapes Philosophy: New Foundations for Naturalism. Cambridge: Cambridge University Press.
    • This edited volume brings together a variety of perspectives on the ways in which biological insight has influenced and might continue to shape contemporary philosophical discussions.
  • Walsh, Denis M. 2016. Organisms, Agency, and Evolution. Cambridge: Cambridge University Press.
    • Develops a non-standard view of evolution on which teleology and organismic agency are given prominence over neo-Darwinian natural selection and population genetics.
  • Wilkins, John S. 2009. Species: A History of the Idea. Berkeley: University of California Press.
    • A discussion of the history of the concept of species, useful for understanding Darwin’s place with respect to other theorists of his day.


Author Information

Charles H. Pence
Email: charles@charlespence.net
Université Catholique de Louvain
Belgium

Frequently Asked Questions about Time

This supplement provides background information about many of the topics discussed in both the main Time article and its companion article What Else Science Requires of Time. It is not meant to be read straight through; its sections may be consulted in any order.

Table of Contents

  1. What Are Durations, Instants, Moments, and Points of Time?
  2. What Is an Event?
  3. What Is a Reference Frame?
    1. Why Do Cartesian Coordinates Fail?
  4. What Is an Inertial Frame?
  5. What Is Spacetime?
  6. What Is a Spacetime Diagram?
  7. What Are Time’s Metric and Spacetime’s Interval?
  8. How Does Proper Time Differ from Standard Time and Coordinate Time?
  9. Is Time the Fourth Dimension?
  10. Is There More Than One Kind of Physical Time?
  11. How Is Time Relative to the Observer?
  12. What Is the Relativity of Simultaneity?
  13. What Is the Conventionality of Simultaneity?
  14. What are the Absolute Past and the Absolute Elsewhere?
  15. What Is Time Dilation?
  16. How Does Gravity Affect Time?
  17. What Happens to Time near a Black Hole?
  18. What Is the Solution to the Twins Paradox?
  19. What Is the Solution to Zeno’s Paradoxes?
  20. How Are Coordinates Assigned to Time?
  21. How Do Dates Get Assigned to Actual Events?
  22. What Is Essential to Being a Clock?
  23. What Does It Mean for a Clock to Be Accurate?
  24. What Is Our Standard Clock or Master Clock?
    1. How Does an Atomic Clock Work?
    2. How Do We Find and Report the Standard Time?
  25. Why Are Some Standard Clocks Better than Others?
  26. What Is a Field?

1. What Are Durations, Instants, Moments, and Points of Time?

A duration is a measure of elapsed time. It is a number with a unit such as seconds or hours. “4” is not a duration, but “4 seconds” is. The term interval in the phrase spacetime interval refers to a different kind of interval. The second is the agreed-upon standard unit for the measurement of duration in the S.I. system (the International System of Units, that is, Le Système International d’Unités). How to carefully define the term second is discussed later in this supplement.

In informal conversation, an instant or moment is a very short duration. In physics, however, an instant is even shorter. It is instantaneous; it has zero duration. This is perhaps what the poet T.S. Eliot was thinking of when he said, “History is a pattern of timeless moments.”

There is another sense of the words instant and moment which means, not a very short duration, but rather a time, as when we say it happened at that instant or at that moment. Midnight could be such a moment. In this sense, a moment is normally considered to be a special three-dimensional object, namely a ‘snapshot’ of the universe at a single instant in time. This is the sense of the word moment meant by a determinist who says the state of the universe at one moment determines the state of the universe at another moment.

It is assumed in physics (except in some proposed theories of quantum gravity) that any interval of time is a linear continuum of the points of time that compose it, but it is an interesting philosophical question to ask how physicists know time is a continuum. Nobody could ever measure time that finely, even indirectly.  Points of time cannot be detected. That is, there is no physically possible way to measure that the time is exactly noon even if it is true that the time is noon. Noon is 12 to an infinite number of decimal places, and no measuring apparatus is infinitely precise, and no measurement fails to have a margin of error. But given what we know about points, we should not be trying to detect points of anything. Belief in the existence of points of time is justified holistically by appealing to how they contribute to scientific success, that is, to how the points give our science extra power to explain, describe, predict, and enrich our understanding. In order to justify belief in the existence of points, we need confidence that our science would lose too many of these virtues without the points. Without points, we could not use calculus to describe change in nature.

Consider what a point in time really is. Any interval of time is a real-world model of a segment of the real numbers in their normal order. So, each instant corresponds to just one real number and vice versa. In other words, time is a line-like structure on sets of point events. Just as the real numbers are an actually infinite set of decimal numbers that can be linearly ordered by the less-than-or-equal relation, so time is an actually infinite set of instants or instantaneous moments that can be linearly ordered by the happens-before-or-at-the-same-time-as relation in a single reference frame. An instant or moment can be thought of as a set of point-events that are simultaneous in a single reference frame.

Although McTaggart disagrees, all physicists would claim that a moment is not able to change because change is something that is detectable only by comparing different moments.

There is a deep philosophical dispute about whether points of time actually exist, just as there is a similar dispute about whether spatial points actually exist. The dispute began when Plato said, “[T]his queer thing, the instant, …occupies no time at all….” (Plato 1961, p. 156d). Some philosophers wish to disallow point-events and point-times. They want to make do with intervals, and want an instant always to have a positive duration. The philosopher Michael Dummett, in (Dummett 2000), said time is not made of point-times but rather is a composition of overlapping intervals, that is, non-zero durations. Dummett required the endpoints of those intervals to be the initiation and termination of actual physical processes. This idea of treating time without instants develops a 1936 proposal of Bertrand Russell and Alfred North Whitehead. The central philosophical issue about Dummett’s treatment of time is whether its adoption would negatively affect other areas of mathematics and science. It is likely that it would. For the history of the dispute between advocates of point-times and advocates of intervals, see (Øhrstrøm and Hasle 1995).

Even if time is made of points, it does not follow that matter is. It sometimes can be a useful approximation to say an electron or a quark is a point particle, but it remains an approximation. They are vibrations of quantized fields.

2. What Is an Event?

In the manifest image, the universe is more fundamentally made of objects than events. In the scientific image, the universe is more fundamentally made of events than objects.

But the term event has at least two senses, which can be called sense 1 and sense 2. In ordinary discourse, one uses sense 1, in which an event is a happening lasting some duration during which some object changes its properties. For example, this morning’s event of buttering the toast is the toast’s changing from having the property of being unbuttered this morning to having the property of being buttered later this morning.

In sense 1, the philosopher Jaegwon Kim suggested that an event should be defined as an object’s having a property at a time. So, two events are the same if they are both events of the same object having the same property at the same time. This suggestion captures much of our informal concept of event, but with Kim’s suggestion it is difficult to make sense of the remark, “The vacation could have started an hour earlier.” On Kim’s analysis, the vacation event could not have started earlier because, if it did, it would be a different event. A possible-worlds analysis of events might be the way to solve this problem of change.

Physicists do sometimes use the term event this way, but they also use it differently—in what we here call sense 2—when they say events are point-events or regions of point-events often with no reference to any other properties of those events, such as their having the property of being buttered toast at that time. The simplest point-event is a location in spacetime with zero volume and zero duration. Hopefully, when the term event occurs, the context is there to help disambiguate. For instance, when an eternalist says our universe is a block of events, the person means the universe is the set of all point-events with their actual properties.

To a non-quantum physicist, any physical object is just a series of its point-events and the values of all their intrinsic properties. For example, the process of a ball’s falling down is a continuous, infinite series of point-events along the path in spacetime of the ball.  One of those events would be this particular point piece of the ball being at a specific spatial location at some specific time. The reason for the qualification about “non-quantum” is discussed at the end of this section.

The physicists’ notion of point-event in real, physical space (rather than in mathematical space) is metaphysically unacceptable to some philosophers, in part because it deviates so much from the way the word event is used in ordinary language and in our manifest image. That is, sense 2 deviates too much from sense 1. For other philosophers, it is unacceptable because of its size, its infinitesimal size. In 1936, in order to avoid point-events altogether in physical space, Bertrand Russell and A. N. Whitehead developed a theory of time that is based on the assumption that all events in spacetime have a finite, non-zero duration. They believed this definition of an event is closer to our common sense beliefs, which it is. Unfortunately, they had to assume that any finite part of an event is also an event, and this assumption indirectly appeals to the concept of the infinitesimal and so is no closer to common sense than the physicist’s assumption that all events are composed of point-events.

McTaggart argued early in the twentieth century that events change. For example, he said the event of Queen Anne’s death is changing because it is receding ever farther into the past as time goes on. Many other philosophers (those of the so-called B-camp) believe it is improper to consider an event to be something that can change, and that the error is in not using the word change properly. This is still an open question in philosophy, but physicists use the term event as the B-theorists do, namely as something that does not change.

In non-quantum physics, specifying the state of a physical system at a time involves specifying the masses, positions and velocities of each of the system’s particles at that time. Not so in quantum mechanics. The simultaneous precise position and velocity of a particle—the key ingredients of a classical particle event—do not exist according to quantum physics. The more precise the position is, the less precise is the velocity, and vice versa.

More than half the physicists in the first quarter of the 21st century believed that a theory of quantum gravity will require (1) quantizing time, (2) having time or spacetime be emergent from a more fundamental entity, or (3) having only a finite maximum number of events that can occur in a finite volume. Current relativity theory and quantum theory allow an infinite number.

The ontology of quantum physics is very different from that of non-quantum physics. The main Time article intentionally overlooks this. But, says the physicist Sean Carroll, “at the deepest level, events are not a useful concept,” and one should focus on the wave function.

For more discussion of what an event is, see the article on Events.

3. What Is a Reference Frame?

A reference frame is a standard point of view or a perspective chosen by someone to display quantitative measurements about places of interest in a space and the phenomena that take place there. It is not an objective feature of nature. To be suited for its quantitative purpose, a reference frame needs to include a coordinate system. This is a system of assigning numerical locations or ordered sets of numerical locations to points of the space. If the space is physical spacetime, then each point needs to be assigned four numbers, three for its location in space, and one for its location in time. These numbers are called “coordinates.” For every coordinate system, every point-event in spacetime has three spatial coordinate numbers and one time coordinate number.

Choosing a coordinate system requires selecting some point to be called the system’s “origin” and selecting the appropriate number of coordinate axes that orient the frame in the space. You need at least as many axes as there are dimensions to the space. To add a coordinate system to a reference frame for a space is to add an arrangement of reference lines (such as curves parallel to the axes) to the space so that all points of space have unique names. It is often assumed that an observer is located at the origin, but this is not required. The notion of a reference frame is modern; Newton did not know about reference frames.

The name of a point in a two-dimensional space is an ordered set of two numbers (the coordinates). If a Cartesian coordinate system is assigned to the space, then a point’s coordinate is its signed distance projected along each axis from the origin point. The origin is customarily named (0,0). For a four-dimensional space, a point is named with a set of four numbers. A coordinate system for n-dimensional space is a mapping from each point to an ordered set of its n coordinate numbers. The best names of points use sets of real numbers because real numbers enable us to use the techniques of calculus and because their use makes it easy to satisfy the helpful convention that nearby points have nearby coordinates.

When we speak of the distance between two points, we implicitly mean the distance along the shortest path between them because there are an infinite number of paths one could take. If a space has a coordinate system, then it has an infinite number of them because there is an unlimited number of choices for an origin, or an orientation of the axes, or the scale.

There are many choices for kinds of reference frames, although the Cartesian coordinate system is the most popular. Its coordinate axes are mutually perpendicular. The equation of the circle of radius one centered on the origin of a Cartesian coordinate system is x² + y² = 1. This same circle has a very different equation if a polar coordinate system is used instead.
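The point can be sketched in a few lines of Python (an illustrative example, not from the source): in polar coordinates the same circle of radius one is described simply by r = 1, and converting any of its points back to Cartesian coordinates recovers the equation x² + y² = 1.

```python
import math

def polar_to_cartesian(r, theta):
    """Convert a point's polar coordinates (r, theta) to Cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)

# The circle of radius one is r = 1 in polar coordinates; every
# converted point satisfies the Cartesian equation x**2 + y**2 = 1.
for k in range(8):
    x, y = polar_to_cartesian(1.0, k * math.pi / 4)
    assert math.isclose(x**2 + y**2, 1.0)
```

The same geometric object thus has very different algebraic descriptions in different coordinate systems, which is one reason coordinates are regarded as representational conveniences rather than features of the space itself.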

Reference frames can be created for physical space, or for time, or for both, or for things having nothing to do with real space and time. One might create a two-dimensional (2-D) Cartesian coordinate system for displaying the salaries of a company’s sales persons vs. their names. Even if the space represented by the coordinate system is real physical space, its coordinates are never physically real. You can add two numbers but not two points. From this fact it can be concluded that not all the mathematical structures in the coordinate system are also reflected in what the system represents. These extraneous mathematical structures are called mathematical artifacts.

Below is a picture of a reference frame spanning a space that contains a solid ball. More specifically, there is a 3-dimensional Euclidean space that uses a Cartesian coordinate system with three mutually perpendicular axes fixed to a 3-dimensional (3-D) solid ball that could represent the Earth:

[Figure: a Cartesian coordinate frame fixed to a solid ball; image from the kRPC documentation]

The origin of the coordinate system is at the center of the ball, and the coordinate system is oriented by specifying that the y-axis be a line going through the north pole and the south pole. Two of the three coordinate axes intersect the blue equator at specified places. The red line represents a typical longitude. The three coordinates of any point in this space form an ordered triple (x,y,z) consisting of the point’s x, y, and z coordinates. There are points on the Earth, inside the Earth, and outside the Earth. For 3-D space, the individual coordinates normally would be real numbers. For example, we might say a point of interest deep inside the ball (the Earth) has the three coordinates (4.1,π,0), where it is assumed all three numbers have the same units, such as meters. It is customary in a three-dimensional space to label the three axes with the letters x, y, and z, and for (4.1,π,0) to mean that 4.1 meters is the x-coordinate of the point, π meters is the y-coordinate of the same point, and 0 meters is the z-coordinate of the point. The center of the Earth in this diagram is located at the origin of the coordinate system; the origin of a frame has the coordinates (0,0,0). Mathematical physicists frequently suppress talk of the units and speak of π being the y-coordinate, although strictly speaking the y-coordinate is π meters. The x-axis is all the points (x,0,0); the y-axis is all the points (0,y,0); the z-axis is all the points (0,0,z), for all possible values of x, y, and z.

In a coordinate system, the axes need not be mutually perpendicular, but in order to be a Cartesian coordinate system, the axes must be mutually perpendicular, and the coordinates of a point must be the values along the axes of the perpendicular projections of the point onto the axes. All Euclidean spaces can have Cartesian coordinate systems. If the space were just the surface of the sphere above, not including its inside or outside, then this two-dimensional space could not have a two-dimensional Cartesian coordinate system, because all the axes could not lie within the space. The 2D surface could have a 3D Cartesian coordinate system, though, and that is the coordinate system used in our diagram above. A more useful coordinate system might be a 3D spherical coordinate system. Space and time in the theory of special relativity are traditionally represented by a frame with four independent, real coordinates (t,x,y,z), but this is just one of many possible representations of space and time, though it is often a very useful one.

Changing from one reference frame to another does not change any phenomenon in the real world being described with the reference frame; it merely changes the perspective on the phenomena. If an object has certain coordinates in one reference frame, it usually has different coordinates in a different reference frame, and this is why coordinates are not physically real—they are not frame-free. Durations are not frame-free. Neither are positions, directions, and speeds. An object’s speed is different in different reference frames, with one exception. The upper limit on the speed of any object in space is c, the speed of light in a vacuum. This claim is not relative to a reference frame. This speed c is the upper limit on the speed of transmission from any cause to its effect. This c is the c in the equation E = mc2. It is the speed of any particle with zero rest mass, and it is the speed of all particles at the Big Bang before the Higgs field turned on and slowed down many kinds of particles. The notion of speed of travel through spacetime rather than space is usually considered by physicists not to be sensible. Whether the notion of speed through time is sensible is a controversial topic in the philosophy of physics. See the main Time article’s section “The Passage or Flow of Time” for who takes what kind of position on this issue.
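To make the frame-dependence of coordinates concrete, here is a minimal sketch assuming a simple Galilean change of frame; the function name and the sample numbers are hypothetical, not from this article:

```python
# A frame S' moving at constant velocity v along the x-axis relative to
# frame S relabels each event (t, x) with new coordinates (t, x - v*t).
def to_moving_frame(t, x, v):
    """Galilean transformation: same event, different coordinates."""
    return (t, x - v * t)

# Two particles collide in frame S at t = 2 s, x = 10 m.
event_a = to_moving_frame(2.0, 10.0, v=3.0)  # the collision seen from S'
event_b = to_moving_frame(2.0, 10.0, v=3.0)

print(event_a)             # (2.0, 4.0): different coordinates in S'
print(event_a == event_b)  # True: the collision itself is frame-free
```

The coordinates change with the frame, but the phenomenon, two particles occupying the same point-event, is preserved in every frame.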

The word reference is often dropped from the phrase reference frame, and the terms frame and coordinate system are often used interchangeably. A frame for the physical space in which an object has zero velocity is called the object’s rest frame or proper frame.

When choosing to place a frame upon a space, there are infinitely many legitimate choices. Choosing a frame carefully can make a situation much easier to describe. For example, suppose we are interested in events that occur along a highway. We might orient the z-axis by saying it points up away from the center of Earth, while the x-axis points along the highway, and the y-axis is perpendicular to the other two axes and points across the highway. If events are to be described, then a fourth axis for time would be needed, but its units would be temporal units and not spatial units. It usually is most helpful to make the time axis be perpendicular to the three spatial axes, and to require successive seconds along the axis to be the same duration as seconds of the standard clock. By applying a coordinate system to spacetime, a point of spacetime is specified uniquely by its four independent coordinate numbers, three spatial coordinates and one time coordinate. The word independent implies that knowing one coordinate of a point gives no information about the point’s other coordinates.

Coordinate systems of reference frames have to obey rules to be useful in science. No accepted theory of physics allows a time axis to be shaped like a figure eight. Frames need to honor the laws if they are to be perspectives on real events. If a particle collides with another particle in one reference frame allowed by relativity theory, then the two particles must collide in all allowed reference frames. Relativity theory does not allow reference frames in which a photon, a particle of light, is at rest. Quantum mechanics does. A frame with a time axis in which your shooting a gun is simultaneous with your bullet hitting a distant target is not allowed by relativity theory.

How is the time axis oriented in the world? This is done by choosing t = 0 to be the time when a specific event occurs, such as the Big Bang or the birth of Jesus. A second along the t-axis usually is required to be congruent to a second of our civilization’s standard clock, at least for clocks not moving with respect to that standard clock.

A space with a topology defined on it, one that is locally Euclidean and can have any number of dimensions, is called a manifold. Newtonian mechanics, special relativity, general relativity, and quantum theory all require the set of all events (in the sense of possible space-time locations) to form a four-dimensional manifold. Informally, what it means to be four-dimensional is that each point cannot be specified with fewer than four independent numbers. Formally, the definition of dimension is somewhat complicated.

Treating time as a special dimension is called spatializing time, and doing this is what makes time precisely describable mathematically in a way that treating time only as becoming does not. It is a major reason why mathematical physics can be mathematical.

One needs to be careful not to confuse the features of time with the features of the mathematics used to describe time. Einstein admitted [see (Einstein 1982) p. 67] that even he often made this mistake of failing to distinguish the representation from the object represented, and it added years to the time it took him to create his general theory of relativity.

Times are not numbers, but time coordinates are. When a time-translation occurs with a magnitude of t0, this implies the instant I at coordinate t is now associated with another instant I’ at coordinate t’, and this equality holds: t’ = t + t0. If the laws of physics are time-translation symmetric, which is the normal assumption, then the laws of mathematical physics are invariant under the group of transformations of the time coordinate t expressed by t’ = t + t0, where t0 is an arbitrarily chosen constant real number.
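One consequence of this symmetry can be checked directly: relabeling every instant by t’ = t + t0 leaves every duration unchanged. A minimal sketch (the variable names are ours, for illustration):

```python
def translate(t, t0):
    """Relabel the instant with coordinate t as t + t0."""
    return t + t0

t_a, t_b, t0 = 3.0, 7.5, 100.0
before = abs(t_b - t_a)
after = abs(translate(t_b, t0) - translate(t_a, t0))
print(before == after)  # True: durations do not care where t = 0 is placed
```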

a. Why Do Cartesian Coordinates Fail?

The Cartesian coordinate system can handle all sorts of curved paths and curved objects, but it fails whenever the space itself curves.  What we just called “the space” could be real physical space or an abstract mathematical space or spacetime or just time.

A reference frame fixed to the surface of the Earth cannot have a Cartesian coordinate system covering all the surface because the surface curves. Spaces with a curved geometry require curvilinear coordinate systems in which the axes curve as seen from a higher dimensional Euclidean space in which the lower-dimensional space is embedded. Any Euclidean space can have a Cartesian coordinate system.

If the physical world were two-dimensional and curved like the surface of a sphere, then a two-dimensional Cartesian coordinate system for that space must fail to give coordinates to most places in the world. To give all the points of the 2D world their own Cartesian coordinates, one would need a 3D Cartesian system, and each point in the world would be assigned three coordinates, not merely two. For the same reason, if we want an arbitrary point in our real, curving 4D-spacetime to have only four coordinates and not five, then the coordinate system must be curvilinear and not Cartesian.  But what if we are stubborn and say we want to stick with the Cartesian coordinate system and we don’t care that we have to bring in an extra dimension and give our points of spacetime five coordinates instead of four? In that case we cannot trust the coordinate system’s standard metric to give correct answers.

Let’s see why this is so. Although the coordinate system can be chosen arbitrarily for any space or spacetime, different choices usually require different metrics. Suppose the universe is two-dimensional and shaped like the surface of a sphere when seen from a higher dimension. The 2D sphere has no inside or outside; the extra dimension is merely for our visualization purposes. Then when we use the 3D system’s metric, based on the 3D version of the Pythagorean Theorem, to measure the spatial distance between two points in the space, say, the North Pole and a point on the equator, the value produced is too low. The correct value is higher because the path must run along a longitude and stay confined to the surface. The 3D Cartesian metric says the shortest line between the North Pole and a point on the equator cuts through the Earth and so escapes the universe, which indicates the Cartesian metric cannot be correct. The correct metric would compute distance within the space along a geodesic line (a great circle in this case, such as a longitude) that is confined to the sphere’s surface.
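This undershooting is easy to check numerically. The sketch below assumes an idealized spherical Earth of radius 6371 km (a made-up setting for illustration) and compares the chord computed by the 3D Cartesian metric with the great-circle distance confined to the surface:

```python
import math

R = 6371.0  # assumed radius in km

# Origin at the sphere's center; the North Pole and one equator point.
north_pole = (0.0, 0.0, R)
equator_pt = (R, 0.0, 0.0)

# 3D Pythagorean distance: the chord that cuts through the sphere.
chord = math.dist(north_pole, equator_pt)  # R * sqrt(2), about 9010 km

# Distance confined to the surface: a quarter of a great circle.
along_surface = R * math.pi / 2            # about 10,008 km

print(chord < along_surface)  # True: the Cartesian metric gives too low a value
```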

The orbit of the Earth around the Sun is curved in 3D space, but “straight” in 4D spacetime. The scare quotes are present because the orbit is straight only in the sense that a geodesic is straight. A geodesic path between two points of spacetime is a path of extremal spacetime interval between the points.

One could cover a curved 4D-spacetime with a special Cartesian-like coordinate system by breaking up the spacetime into infinitesimal regions, giving each region its own Cartesian coordinate system, and then stitching the coordinate systems all together where they meet their neighbors. The stitching produces what is customarily called an atlas. Each point would have its own four unique coordinates, but when the flat Cartesian metric is used to compute intervals, lengths, and durations from the coordinate numbers of the atlas, the values will be incorrect.

Instead of considering a universe that is the surface of a sphere, consider a universe that is the surface of a cylinder. This 2D universe is curved when visualized from a 3D Euclidean space in which the cylinder is embedded. Surprisingly, it is not intrinsically curved at all. The measures of the three angles of any triangle sum to 180 degrees. Circumferences of its circles always equal pi times their diameters. We say that, unlike the sphere, the surface of a cylinder is extrinsically curved but intrinsically flat.

For a more sophisticated treatment of reference frames and coordinates, see Coordinate Systems. For an introduction to the notion of curvature of space, see chapter 42 in The Feynman Lectures on Physics by Richard Feynman.

4. What Is an Inertial Frame?

Galileo first had the idea that motion is relative. If you are inside a boat with no windows and are floating on a calm sea, you cannot tell whether the boat is moving. Even if it is moving, you won’t detect this by seeing a dropped ball curve as it falls or by feeling a push on yourself or seeing all the flies near you being pushed to the back of the room. He believed steady motion is motion relative to other objects, and there is no such thing as simply motion relative to nothing, or motion relative to fixed, absolute space.

Newton disagreed. He believed in absolute motion that is not relative to any other object. Newton would say an inertial frame is a reference frame moving at constant velocity relative to absolute space. Einstein objected to absolute space and said an inertial frame is a reference frame in which Newton’s first law of motion holds. Newton’s first law says an isolated object, that is, an object affected by no total extrinsic force, has a constant velocity over time. It does not accelerate. In any inertial frame, any two separate objects that are moving in parallel and coasting along with no outside forces on them will remain moving in parallel forever. Einstein described his special theory of relativity in 1905 by saying it requires the laws of physics to have the same form in any inertial frame of reference.

Computations and descriptions are usually simpler when one can choose a frame that is nearly inertial. Unfortunately, there are no inertial frames for the real world. This is because Newton’s first law is not strictly true, and there is no absolute space in Newton’s sense. However, there are sometimes good approximations.

Newton’s first law can be thought of as providing a definition of the concept of zero total external force; an object has zero total external force if it is moving with constant velocity. In the real world, no objects behave this way; they cannot be isolated from the force of gravity. Gravity cannot be turned off, and so Newton’s first law fails and there are no inertial frames. But the first law does hold approximately, that is, well enough for various purposes in many situations. It holds in any infinitesimal region. In larger regions, if spacetime curvature can be ignored for a certain phenomenon of interest, then one can find an inertial frame for the phenomenon. A Cartesian coordinate system fixed to Earth usually will serve adequately as an inertial frame for describing cars on a race track or describing the flight of a tennis ball, but not for describing a rocket’s flight from Paris to Mars. A coordinate frame for space that is fixed on the distant stars and is used by physicists only to describe phenomena far from any of those stars, and far from planets, and far from other massive objects, is very nearly an inertial frame in that region. Given that some frame is inertial, any frame that rotates or otherwise accelerates relative to this first frame is non-inertial.

Newton’s theory requires a flat, Euclidean geometry for space and for spacetime. Special relativity requires a flat Euclidean geometry for space but a flat, non-Euclidean geometry for spacetime. General relativity allows all these but also allows curvature for both space and spacetime. Think of “flat” as requiring axes to be straight lines. If we demand that our reference frame’s coordinate system span all of spacetime, then a flat frame does not exist for the real world. The existence of gravity requires there to be curvature of space around any object that has mass, thereby making a flat frame fail to span some of the space near the object.

The geometry of a space exists independently of whatever coordinate system is used to describe it, so one has to take care to distinguish what is a real feature of the geometry from what is merely an artifact of the mathematics used to characterize the geometry.

5. What Is Spacetime?

Spacetime is a certain combination of space and time. It is the set of locations of events, or it can be considered to be a field where all events are located.

There are actual spacetimes and imaginary spacetimes. Our real four-dimensional spacetime has a single time dimension and three space dimensions. But there are imaginary spacetimes with twenty-seven dimensions. There is a three-dimensional  spacetime composed of two spatial dimensions and a time dimension. In one of these spacetimes, points in space indicate the latitude and longitude in Canada for the sale of a company’s widget, and points along the time dimension indicate the date of the sale of the widget. In any spacetime, real or imaginary, the coordinates are the names of locations in space and time; so they are mathematical artifacts.

In 1907-8, Hermann Minkowski was the first person to say that real spacetime is fundamental and that space and time are just aspects of spacetime. And he was the first to say different reference frames will divide spacetime differently but correctly into their time part and space part.

Later, Einstein discovered that real spacetime is dynamic and not static. That is, its structure, such as its geometry, changes over time as the distribution of matter-energy changes. In special relativity and in Newton’s theory, spacetime is not dynamic; it stays the same regardless of what matter and energy are doing.

Spacetime can be curved. Focusing just on space, the exact overall, cosmic curvature of our space is unknown, but there is good empirical evidence, acquired in the 1990s, that the overall, cosmic curvature of space is approximately zero but is evolving toward a positive value.

In general relativity, spacetime is assumed to be a fundamental feature of reality. It is very interesting to investigate whether this assumption is true. There have been serious attempts to construct theories of physics in which spacetime is not fundamental but instead emerges from something more fundamental, such as quantum fields, but none of these attempts has been supported by empirical observations or experiments that could show the new theories to be superior to the presently accepted ones. So, it is still safe to say in the first quarter of the twenty-first century that the concept of spacetime is ontologically fundamental.

The metaphysical question of whether spacetime is a substantial object or merely a relationship among events, or neither, is considered in the discussion of the relational theory of time in the main Time article. For some other philosophical questions about what spacetime is, see What is a Field?

The force of gravity is manifested over time as the curvature of spacetime itself. Einstein was the first person to appreciate this. According to the physicist George Musser:

Gravity is not a force that propagates through space but a feature of spacetime itself. When you throw a ball high into the air, it arcs back to the ground because Earth distorts the spacetime around it, so that the paths of the ball and the ground intersect again.

6. What Is a Spacetime Diagram?

A spacetime diagram is a graphical representation of the coordinates of events in spacetime. Think of the diagram as a picture of a reference frame. In classical spacetime diagrams, one designated coordinate axis is for time. The other axes are for space. A Minkowski spacetime diagram is a special kind of spacetime graph, one that represents phenomena that obey the laws of special relativity. A Minkowski diagram allows no curvature of spacetime itself, although objects themselves can have curving sides and curving paths in space.

The following diagram is an example of a three-dimensional Minkowski spacetime diagram containing two spatial dimensions (with straight lines for the two axes) and a time dimension (with a vertical line for the time axis). The space part of this spacetime frame constitutes your rest frame; it’s the frame in which you have zero velocity. Two cones emerge upward and downward from the point-event of you, the zero-volume observer being here now at the origin of the reference frame of your spacetime diagram. These cones are your future and past light cones. The cones are composed of green paths of possible unimpeded light rays emerging from the observer or converging into the observer. The light cone at a point of space exists even if there is no actual light there.

A 3D Minkowski diagram

Attribution: Stib at en.wikipedia, CC BY-SA 3.0, Link

By convention, in a Minkowski spacetime diagram, a Cartesian (rectangular) coordinate system is used, the time axis is shown vertically, and one or two of the spatial dimensions are suppressed (that is, not included).

If the Minkowski diagram has only one spatial dimension, then a flash of light in a vacuum has a perfectly straight-line representation, but it has a cone-shaped representation if the Minkowski diagram has two spatial dimensions, and it is a sphere if there are three spatial dimensions. Because light travels at such a high speed, it is common to choose the units along the axes so that the path of a light ray is a 45 degree angle and the value of c is 1 light year per year, with light years being the units along any space axis and years being the units along the time axis. Or the value of c could have been chosen to be one light nanosecond per nanosecond. The careful choice of units for the axes in the diagram is important in order to prevent the light cones’ appearing too flat to be informative.

Below is an example of a Minkowski diagram having only one space dimension, so every future light cone has the shape of the letter “V.”

This Minkowski diagram represents a spatially-point-sized Albert Einstein standing still midway between two special places, places where there is an instantaneous flash of light at time t = 0 in coordinate time. At t = 0, Einstein cannot yet see the flashes because they are too far away for the light to reach him yet. The directed arrows represent the path of the four light rays from the flashes. In a Minkowski diagram, a physical point-object of zero volume is not represented as occupying a single point but as occupying a line containing all the spacetime points at which it exists. That line is called the  worldline of the object. All worldlines representing real objects are continuous paths in spacetime. Accelerating objects have curved paths in spacetime.

Events on the same horizontal line of the Minkowski diagram are simultaneous in the reference frame. The more tilted an object’s worldline is away from the vertical, the faster the object is moving. Given the units chosen for the above diagram, no worldline can tilt away from the vertical by more than 45 degrees, or else that object would be moving faster than c, the cosmic speed limit according to special relativity.
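With those units (c = 1), a worldline’s tilt away from the vertical time axis satisfies tan(tilt) = v/c, so the 45-degree limit can be checked with a small, hypothetical helper:

```python
import math

def tilt_from_vertical(v_over_c):
    """Angle (in degrees) between a worldline and the vertical time axis."""
    return math.degrees(math.atan(v_over_c))

print(tilt_from_vertical(0.5))  # about 26.6 degrees: slower than light
print(tilt_from_vertical(1.0))  # 45 degrees: the edge of the light cone
```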

In the above diagram, Einstein’s worldline is straight, indicating no total external force is acting on him. If an object’s worldline meets another object’s worldline, then the two objects collide.

The set of all possible photon histories or light-speed worldlines going through a specific point-event defines the two light cones of that event, namely its past light cone and its future light cone. The future cone or forward cone is called a cone because, if the spacetime diagram were to have two space dimensions, then light emitted from a flash would spread out in the two spatial dimensions in a circle of ever-growing diameter, producing a cone shape over time. In a diagram for three-dimensional space, the light’s wavefront is an expanding sphere and not an expanding cone, but sometimes physicists still will informally speak of its cone.

Every point of spacetime has its own pair of light cones, but the light cone has to do with the structure of spacetime, not its contents, so the light cone of a point exists even if there is no light there.

Whether a member of a pair of events could have had a causal impact upon the other event is an objective feature of the universe and is not relative to a reference frame. A pair of events inside the same light cone are said to be causally-connectible because they could have affected each other by a signal going from one to the other at no faster than the speed of light, assuming there were no obstacles that would interfere. For two causally-connectible events, the relation between the two events is said to be timelike. If you were once located in spacetime at, let’s say, (x1,y1,z1,t1), then for the rest of your life you cannot affect or participate in any event that occurs outside of the forward light cone whose apex is at (x1,y1,z1,t1). Light cones are an especially helpful tool because different observers in different rest frames should agree on the light cones of any event, despite their disagreeing on what is simultaneous with what and the duration between two events. So, the light-cone structure of spacetime is objectively real.

Not all spacetimes can be given Minkowski diagrams, but any spacetime satisfying Einstein’s Special Theory of Relativity can. Einstein’s Special Theory applies to gravitation, but it falsely assumes that physical processes, such as gravitational processes, have no effect on the structure of spacetime. When attention needs to be given to the real effect of these processes on the structure of spacetime, that is, when general relativity needs to be used, then Minkowski diagrams become inappropriate for spacetime. General relativity assumes that the geometry of spacetime is locally Minkowskian, but not globally Minkowskian. That is, spacetime is locally flat in the sense that in any infinitesimally-sized region one always finds spacetime to be 4D Minkowskian (which is 3D Euclidean for space but not 4D Euclidean for spacetime). When we say spacetime is curved and not flat, we mean it deviates from 4D Minkowskian geometry.

7. What Are Time’s Metric and Spacetime’s Interval?

The metric of a space contains geometric information about the space. It tells the curvature at points, and it tells the distance between any two points along a curve containing the two points. Here, the term “distance in time” refers to duration. The introduction below discusses distance and duration, but it usually ignores curvature. If you change to a different coordinate system, generally you must change the metric. In that sense, the metric is not objective.

In simple situations in a Euclidean space with a Cartesian coordinate system, the metric is a procedure that says that, in order to find the duration, subtract the event’s starting time from its ending time. More specifically, this metric for time says that, in order to compute the duration between point-event a that occurs at time t(a) and point-event b that occurs at time t(b), then one should compute |t(b) – t(a)|, the absolute value of their difference. This is the standard way to compute durations when curvature of spacetime is not involved. When it is involved, such as in general relativity, we need a more exotic metric, and the computations can be extremely complicated.

The metric for spacetime implies the metric for time. The spacetime metric tells the spacetime interval between two point-events. The spacetime interval has both space aspects and time aspects. Two events in the life of a photon have a zero spacetime interval. The interval is the measure of the spacetime separation between two point-events along a specific spacetime path. Let’s delve into this issue a little more deeply.

In what follows, note the multiple senses of the word space. A mathematical space is not a physical space. A physicist often represents time as a one-dimensional space, space as a three-dimensional space, and spacetime as a four-dimensional space. More generally, a metric for any sort of space is an equation that says how to compute the distance (or something distance-like, as we shall soon see) between any two points in that space along a curve in the space, given the location coordinates of the two points. Note the coordinate dependence. For ordinary Euclidean space, the metric is just the three-dimensional version of the Pythagorean Theorem. For a Euclidean four-dimensional space, the metric is just the four-dimensional version of the Pythagorean Theorem. However, for four-dimensional spacetime, the metric is exotic, as we shall see.

In a one-dimensional Euclidean space along a straight line from point location x to a point location y, the metric says the distance d between the two points is |y – x|. It is assumed both locations use the same units.

The duration t(a,b) between an event a that occurs at time t(a) and an event b that occurs at time t(b) is given by the metric equation:

t(a,b) = |t(b) – t(a)|.

This is the standardly-accepted way to compute durations when curvature is not involved. Philosophers have asked whether one could just as well have used half that absolute value, or the square root of the absolute value. More generally, is one definition of the metric the correct one or just the more useful one? That is, philosophers are interested in the underlying issue of whether the choice of a metric is natural in the sense of being objective or whether its choice is a matter of convention.

Let’s bring in more dimensions. In a two-dimensional plane satisfying Euclidean geometry, the formula for the metric is:

d2 = (x2 – x1)2 + (y2 – y1)2.

It defines what is meant by the distance d between an arbitrary point with the Cartesian coordinates (x1, y1) and another point with the Cartesian coordinates (x2, y2), assuming all the units are the same, such as meters. The x numbers are values in the x dimension, that is, parallel to the x-axis, and the y numbers are values in the y dimension. The above equation is essentially the Pythagorean Theorem of plane geometry. Here is a visual representation of this for the two points: [Figure: a right triangle whose horizontal leg is x2 – x1, whose vertical leg is y2 – y1, and whose hypotenuse is the distance d between the two points.]

If you imagine this graph is showing you what a crow would see flying above a square grid of streets, then the metric equation d2 = (x2 – x1)2 + (y2 – y1)2 gives you the distance d as the crow flies. But if your goal is a metric that gives the distance only for taxicabs that are restricted to travel vertically or horizontally, then a taxicab metric would compute the taxi’s distance this way:

|x2 – x1| + |y2 – y1|.

So, a space can have more than one metric, and we choose the metric depending on the character of the space and what our purpose is.
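The two metrics just mentioned can be sketched in a few lines (the function names are ours, for illustration only):

```python
import math

def euclidean(p, q):
    """As-the-crow-flies distance: the 2D Pythagorean Theorem."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def taxicab(p, q):
    """Distance for a taxi restricted to vertical and horizontal travel."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q))  # 5.0
print(taxicab(p, q))    # 7.0
```

Both functions are legitimate metrics on the same plane; which one is appropriate depends on the purpose at hand.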

Usually for a physical space there is a best or intended or conventionally-assumed metric. If all we want is the shortest distance between two points in a two-dimensional Euclidean space, the conventional metric is:

d2 = (x2 – x1)2 + (y2 – y1)2

But if we are interested in distances along an arbitrary path rather than just the shortest path, then the above metric is correct only infinitesimally, and a more sophisticated metric is required by using the tools of calculus. In this case, the above metric is re-expressed as a difference equation using the delta operator symbol Δ to produce:

(Δs)2 = (Δx)2+ (Δy)2

where Δs is the spatial distance between the two points and Δx = x2 – x1 and Δy = y2 – y1. The delta symbol Δ is not a number but rather is an operator on two numbers that produces their difference. If the differences are extremely small, infinitesimally small, then they are called differentials instead of differences, and then Δs becomes ds, and Δx becomes dx, and Δy becomes dy, and we have entered the realm of differential calculus. The letter d in a differential stands for an infinitesimally small delta operation, and it is not like the number d in the diagram above.
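The idea behind the difference equation can be illustrated by summing many small Δs segments along a curved path. The toy example below (not from this article) approximates the length of a quarter circle of radius 1, which should approach π/2:

```python
import math

n = 100_000          # number of small straight segments
length = 0.0
prev = (1.0, 0.0)    # start of the quarter circle, at angle 0
for i in range(1, n + 1):
    theta = (math.pi / 2) * i / n
    pt = (math.cos(theta), math.sin(theta))
    length += math.dist(prev, pt)  # each segment contributes sqrt(Δx² + Δy²)
    prev = pt

print(length)  # approaches pi/2, about 1.5707963
```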

Let’s generalize this idea from 2D-space to 4D-spacetime. The metric we are now looking for is about the interval between two arbitrary point-events, not the distance between them. Although there is neither a duration between New York City and Paris, nor a spatial distance between noon today and midnight later, nevertheless there is a spacetime interval between New York City at noon and Paris at midnight.

Unlike temporal durations and spatial distances, intervals are objective in the sense that the spacetime interval is not relative to a reference frame or coordinate system. All observers measure the same value for an interval, assuming they measure it correctly. The value of an interval between two point events does not change if the reference frame changes. Alternatively, acceptable reference frames are those that preserve the intervals between points.
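This invariance can be spot-checked with the standard Lorentz boost formulas of special relativity (in units where c = 1; the event coordinates and speed below are made up for illustration):

```python
import math

def boost(t, x, v):
    """Standard Lorentz boost along the x-axis with speed v (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

def interval_sq(t, x):
    """Squared spacetime interval between the origin event and (t, x)."""
    return t * t - x * x

t, x = 2.0, 1.0
t2, x2 = boost(t, x, v=0.6)
print(interval_sq(t, x))    # 3.0 in the original frame
print(interval_sq(t2, x2))  # 3.0 in the boosted frame as well
```

The coordinates (t, x) and (t2, x2) differ, yet the interval they determine is the same in both frames.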

Any space’s metric says how to compute the value of the separation s between any two points in that space. In special relativity, the four-dimensional abstract space that represents spacetime is indeed special. Its 3-D spatial part is Euclidean, and its 1-D temporal part is Euclidean, but the 4D space itself is not Euclidean, and its metric is exotic. It is said to be Minkowskian, and it is given a Lorentzian coordinate system. Its metric is defined between two infinitesimally close points of spacetime to be:

ds2 = c2dt2 – dx2

where ds is an infinitesimal interval (or a so-called differential displacement of the spacetime coordinates) between two nearby point-events in the spacetime; c is the speed of light; the differential dt is the infinitesimal duration between the two time coordinates of the two events; and dx is the infinitesimal spatial distance between the two events. Notice the negative sign. If it were a plus sign, then the metric would be Euclidean.

Because there are three dimensions of space in a four-dimensional spacetime, say dimensions 1, 2, and 3, the differential spatial distance dx is defined to be:

dx² = dx1² + dx2² + dx3²

This equation is obtained in Cartesian coordinates by using the Pythagorean Theorem for three-dimensional space. The differential dx1 is the displacement along dimension 1 of the three dimensions. Similarly for dimensions 2 and 3. This is the spatial distance between two point-events, not the interval between them. That is, ds is not usually identical to dx.
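The two formulas above can be combined into a short numerical sketch. The following Python snippet is illustrative only: the function name is invented, it uses natural units in which c = 1, and it exploits the fact that in the flat spacetime of special relativity the same formula holds for finite separations along straight worldlines, not just infinitesimal ones.

```python
def interval_squared(dt, dx1, dx2, dx3, c=1.0):
    """Squared spacetime interval s^2 = c^2 dt^2 - (dx1^2 + dx2^2 + dx3^2)
    between two point-events, following the Minkowski metric above."""
    spatial_sq = dx1**2 + dx2**2 + dx3**2   # Pythagorean theorem in 3D
    return (c * dt)**2 - spatial_sq

# Two point-events separated by 5 units of time and 3 units of space (c = 1):
s2 = interval_squared(5.0, 3.0, 0.0, 0.0)   # 25 - 9 = 16
```

Notice that the minus sign in the metric is what makes the result differ from the Euclidean answer, which would be 25 + 9.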

With these differential equations, the techniques of calculus can then be applied to find the interval between any two point-events even if they are not nearby in spacetime, so long as we have information about the worldline connecting them, that is, the path in spacetime, such as its equation in the coordinate system.

In special relativity, the interval between two events that occur at the same place, such as the place where the clock is sitting, is very simple. Since dx = 0, the interval is:

t(a,b) = |t(b) – t(a)|.

This is the absolute value of the difference between the real-valued time coordinates, assuming all times are specified in the same units, say, seconds, and assuming no positive spatial distances are involved. We began the discussion of this section by using that metric.

Now let us generalize this notion in order to find out how to use a clock for events that do not occur at the same place. The infinitesimal proper time dτ, rather than the differential coordinate-time dt, is the duration shown by a clock carried along the infinitesimal spacetime interval ds. It is defined in any spacetime obeying special relativity to be:

2= ds2/c2.

In general, dτ ≠ dt. They are equal only if the two point-events have the same spatial location so that dx = 0.
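For a clock moving at constant speed v, dx = v dt, so the definition above gives dτ = dt·√(1 − v²/c²). A minimal numeric sketch in natural units with c = 1; the helper name is invented for illustration:

```python
import math

def proper_time(coordinate_time, v, c=1.0):
    """Proper time elapsed on a clock moving at constant speed v, from
    dtau^2 = ds^2/c^2 = dt^2 - dx^2/c^2 with dx = v*dt."""
    return coordinate_time * math.sqrt(1.0 - (v / c)**2)

# A clock moving at 0.6c logs 8 units of proper time while coordinate
# time logs 10; for a stationary clock (v = 0) the two agree exactly.
tau = proper_time(10.0, 0.6)
```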

Because spacetime “distances” (intervals) can be negative, and because the spacetime interval between two different events can be zero even when the events are far apart in spatial distance (but reachable by a light ray if intervening material were not an obstacle), the term interval here is not what is normally meant by the term distance.

There are three kinds of spacetime intervals: timelike, spacelike, and null. In spacetime, if two events are in principle connectable by a signal moving from one event to the other at less than light speed, the interval between the two events is called timelike. There could be no reference frame in which the two occur at the same time. The interval is spacelike if there is no reference frame in which the two events occur at the same place, so they must occur at different places and be some spatial distance apart—thus the choice of the word spacelike. Two events connectable by a signal moving exactly at light speed are separated by a null interval, an interval of magnitude zero.

Here is an equivalent way of describing the three kinds of spacetime intervals. If one of the two events occurs at the origin or apex of a light cone, and the other event is within either the forward light cone or backward light cone, then the two events have a timelike interval. If the other event is outside the light cones, then the two events have a spacelike interval [and are in each other’s so-called absolute elsewhere]. If the two events lie directly on the same light cone, then their interval is null or zero.
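The three-way classification can be read directly off the sign of the squared interval. A sketch with invented names, again in natural units with c = 1:

```python
def interval_type(dt, dx, c=1.0, tol=1e-12):
    """Classify the separation of two events with time difference dt and
    spatial distance dx, using the sign of s^2 = c^2 dt^2 - dx^2."""
    s2 = (c * dt)**2 - dx**2
    if abs(s2) <= tol:
        return "null"        # connectable only by a light-speed signal
    return "timelike" if s2 > 0 else "spacelike"

# interval_type(2, 1) -> "timelike": a slower-than-light signal suffices.
# interval_type(1, 2) -> "spacelike": not even light can connect them.
# interval_type(1, 1) -> "null": exactly a light ray's separation.
```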

The spacetime interval between any two events in a human being’s life must be a timelike interval. No human being can do anything to affect an event outside their future light cone. Such is the human condition according to relativity theory.

The information in the more complicated metric for general relativity enables a computation of the curvature at any point. This more complicated metric is the Riemannian metric tensor field. This is what you know when you know the metric of spacetime.

A space’s metric provides a complete description of the local properties of the space, regardless of whether the space is a physical space or a mathematical space representing spacetime. By contrast, the space’s topology provides a complete description of the global properties of the space, such as whether it wraps around on itself the way a cylinder does and a plane does not; those two spaces are locally the same but globally different.

The metric for special relativity is complicated enough, but the metric for general relativity is very complicated.

The discussion of the metric continues in the discussion of time coordinates. For a helpful and more detailed presentation of the spacetime interval and the spacetime metric, see chapter 4 of (Maudlin 2012) and especially the chapter “Geometry” in The Biggest Ideas in the Universe: Space, Time, and Motion by Sean Carroll.

8. How Does Proper Time Differ from Standard Time and Coordinate Time?

Proper time is personal, and standard time is public. Standard time is the proper time reported by the standard clock of our conventionally-chosen standard coordinate system. Every properly functioning clock measures its own proper time, the time along its own worldline, no matter how the clock is moving or what forces are acting upon it. Loosely speaking, standard time is the time shown on a designated clock in Paris, France that reports the time in Greenwich, England that we agree to be the correct time. The Greenwich Observatory is assumed to be stationary in the standard coordinate system. The faster your clock moves compared to the standard clock, or the greater the gravitational force on it compared to the standard clock, the more your clock readings will deviate from standard time, as would be very clear if the two clocks were ever to meet. This effect is called time dilation. Under normal circumstances, in which you move slowly compared to the speed of light and do not experience unusual gravitational forces, there is no difference between your proper time and your civilization’s standard time.

Think of any object’s proper time as the time that would be shown on an ideal, small, massless, correct clock that always travels with the object, has no physical effect upon the object, and keeps ticking even if the object is ever frozen. Your cell phone is an exception: although it has its own proper time, what it reports instead is the proper time of our standard clock adjusted by an hour for each time zone between it and the cell phone. People on Earth in the same time zone normally do not notice that they have different proper times from each other because the time dilation effect is so small for the kind of life they lead.

Your proper time and my proper time might be different, but both are correct. That is one of the most surprising implications of the theory of relativity. The claim that two different clocks can be correct would be called an inconsistency in Newtonian physics, but the problem is that Newtonian physics is inconsistent with how time really works.

Coordinate time is the time of an event as shown along the axes of some chosen coordinate system. Coordinate systems are not real objects, and they can differ in their scales and origins and the orientations of their axes.

The proper time interval between two events (on a worldline) is the amount of time that elapses according to an ideal clock that is transported between the two events. But there are many paths for the transportation, just as there are many roads between Paris and Berlin. Consider two point-events. Your own proper time between them is the duration between the two events as measured along the worldline of your clock that is transported between the two events. Because there are so many physically possible ways to do the clock transporting, for example at slow speed or at high speed, and near a large mass or far from it, there are many different proper time intervals for the same two events.

Here is a way to maximize the difference between proper time and standard time. If you and your clock pass through the event horizon of a black hole and fall toward the hole’s center, you will not notice anything unusual about your proper time, but external observers using Earth’s standard time will measure that you took an extremely long time to pass through the horizon.

The actual process by which coordinate time is computed from the proper times of real clocks and the process by which a distant clock is synchronized with a local clock are very complicated, though some of the philosophically most interesting issues here—regarding the relativity of simultaneity and the conventionality of simultaneity—are discussed below.

Authors and speakers who use the word time often do not specify whether they mean proper time or standard time or coordinate time. They assume the context is sufficient to tell us what they mean.

9. Is Time the Fourth Dimension?

Yes and no; it depends on what is meant by the question. It is correct to say time is a dimension but not a spatial dimension. Time is the fourth dimension of 4D spacetime, but time is not the fourth dimension of physical space because that space has only three dimensions. In 4D spacetime, the time dimension is special and differs in a fundamental way from the other three dimensions.

Mathematicians have a broader notion of the term space than the average person. In their sense, a space need not contain any geographical locations nor any times, and it can have any number of dimensions, even an infinite number. Such a space might be two-dimensional and contain points represented by the ordered pairs in which a pair’s first member is the name of a voter in London and its second member is the average monthly income of that voter. Not paying attention to these two meanings of the term space is the source of much of the confusion about whether time is the fourth dimension.

Newton treated space as three dimensional and treated time as a separate one-dimensional space. He could have used Minkowski’s 1908 idea, if he had thought of it, namely the idea of treating spacetime as four-dimensional.

The mathematical space used by mathematical physicists to represent physical spacetime that obeys the laws of relativity is four-dimensional; and in that mathematical space, the space of places is a 3D sub-space, and time is another sub-space, a 1D one. The mathematician Hermann Minkowski was the first person to construct such a 4D mathematical space for spacetime, although in 1895 H. G. Wells treated time informally as the fourth dimension in his novel The Time Machine.

In 1908, Minkowski remarked that “Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.” Many people mistakenly took this to mean that time is partly space, and vice versa. The philosopher C. D. Broad countered that the discovery of spacetime did not break down the distinction between time and space but only their independence or isolation.

The reason why time is not partly space is that, within a single frame, time is always distinct from space. Another way of saying this is to say time always is a distinguished dimension of spacetime, not an arbitrary dimension. What being distinguished amounts to, speaking informally, is that when you set up a rectangular coordinate system on a spacetime with an origin at, say, some important event, you may point the x-axis east or north or up or any of an infinity of other directions, but you may not point it forward in time—you may do that only with the t-axis, the time axis.

For any coordinate system on spacetime, mathematicians of the early twentieth century believed it was necessary to assign a point-event at least four independent numbers in order to account for the four-dimensionality of spacetime. Actually this appeal to the 19th-century definition of dimensionality, which is due to Bernhard Riemann, is not quite adequate, because mathematicians have subsequently discovered how to assign each point on the plane to a point on the line without any two points on the plane being assigned to the same point on the line. The idea comes from the work of Georg Cantor. Because of this one-to-one correspondence between the plane’s points and the line’s points, the points on a plane could be specified with just one number instead of two. If so, then the line and the plane must have the same dimension according to the Riemann definition of dimension. To avoid this result, and to keep the plane a 2D object, the notion of the dimensionality of a space has been given a new, but rather complex, definition.

10. Is There More Than One Kind of Physical Time?

Dinnertime is a kind of event but not a kind of time. Are there kinds of time? Although every reference frame or coordinate system on physical spacetime does have its own coordinate time, our question is intended in another sense. At present, physicists measure time electromagnetically. They define a standard atomic clock using a periodic oscillation of the electromagnetic radiation emitted in a special atomic process, and then they synchronize clocks that are far from the standard clock. In doing this, are physicists measuring “electromagnetic time” but not also other kinds of physical time?

In the 1930s, the physicists Arthur Milne and Paul Dirac worried about this question. Independently, they suggested there may be very many time scales. For example, there could be the time of atomic processes and perhaps another time of gravitation and large-scale physical processes. Perfectly-working clocks for the two processes might drift out of synchrony after being initially synchronized without there being a reasonable explanation for why they do not stay in synchrony. It would be a mystery. Ditto for clocks based on the pendulum, on superconducting resonators, and on other physical principles. Just imagine the difficulty for physicists if they had to work with electromagnetic time, gravitational time, proton time, neutrino time, and so forth. Current physics, however, has found no reason to assume there is more than one kind of time for physical processes.

In 1967, physicists replaced the astronomical standard with the atomic standard because the deviation between known atomic and gravitational periodic processes, such as the Earth’s rotations and revolutions, could be explained better by assuming that the atomic processes were the most regular of these phenomena. But this is not a cause for worry about two times drifting apart. Physicists still have no reason to believe a gravitational periodic process that is not affected by friction or impacts or other forces would ever mysteriously drift out of synchrony with an atomic process, yet this is the possibility that worried Milne and Dirac.

11. How Is Time Relative to the Observer?

The rate at which a clock ticks is relative to the observer. Given one process, a first observer’s clock can measure one value for its duration, but a second clock can measure a different value if it is moving or being affected differently by gravity. Yet, says Einstein, both measurements can be correct. That is what it means to say time is relative to the observer. This relativity is quite a shock to our manifest image of time. According to Newton’s physics, in principle there is no reason why observers cannot agree on what time it is now, or how long an event lasts, or when some distant event occurred, so the notion of an observer is not as important there as it is in modern physics.

The term “observer” in physics has multiple meanings. The observer is normally distinct from the observation itself. Informally, an observer is a conscious being who can report an observation and who has a certain orientation to what is observed, such as being next to the measured event or being light years away. An observation is the result of the action of observing. It establishes the values of one or more variables as in “It was noon on my spaceship’s clock when the asteroid impact was seen, so because of the travel time of light I compute that the impact occurred at 11:00.” An observer ideally causes no unnecessary perturbations in what is observed. If so, the observation is called objective.

In physics, the term “observer” is used in this informal way. Call it sense (1). In a second sense (2), in relativity theory an observer might be an entire reference frame, and an observation is a value measured locally, perhaps by a human spectator or perhaps by a machine. Think of an observer as being an omniscient reference frame.

In sense (1), an ordinary human observer cannot directly or indirectly observe any event that is not in their backward light cone. There is also a sense (3), the observer of quantum theory, but that sense is not developed here.

Consider what is involved in being an omniscient reference frame. Information about any desired variable is reported from a point-sized spectator at each spacetime location. The point-spectator who does the observing and measuring has no effect upon what is observed and measured. All spectators are at rest in the same, single, assumed reference frame. A spectator is always accompanied by an ideal, point-sized, massless, perfectly functioning clock that is synchronized with the clocks of other spectators at all other points of spacetime. The observer has all the tools needed for reporting values of variables such as voltage or the presence or absence of grape jelly.

12. What Is the Relativity of Simultaneity?

The relativity of simultaneity is the feature of spacetime in which observers using different reference frames disagree on which events are simultaneous. Simultaneity is relative to the chosen reference frame. A large percentage of both physicists and philosophers of time suggest that this implies simultaneity is not objectively real, and they conclude also that the present is not objectively real, the present being all the events that are simultaneous with being here now.

Why is there disagreement about what is simultaneous with what? The disagreement arises only for events that occur spatially far from each other: observers moving relative to one another assign such events different time coordinates.

In our ordinary lives, we can neglect all this because we are interested in nearby events. If two events occur near us, we can just look and see whether they occurred simultaneously. But suppose we are on a spaceship circling Mars when a time signal is received saying it is noon in Greenwich, England. Did the event of the sending and the event of the receiving occur simultaneously? No. Suppose light takes twenty minutes to travel from the Earth to the spaceship. If we want to use this time signal to synchronize our clock with the Earth clock, then instead of setting our spaceship clock to noon, we should set it to twenty minutes after noon.
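The bookkeeping involved can be sketched in a few lines of Python. This is illustrative only: the function name is invented, times are counted in minutes since midnight, the twenty-minute travel time is just an example figure, and the one-way travel time is assumed to be known from the distance and an assumed one-way light speed.

```python
def synchronized_setting(stamped_time_min, travel_time_min):
    """Local clock setting, in minutes since midnight, on receiving a time
    signal: the sender's stamped reading plus the signal's travel time,
    since that much time has passed at the sender while the signal flew."""
    return stamped_time_min + travel_time_min

# A "noon" (720 min) signal that took 20 minutes to arrive means the
# sender's clock now reads 740 minutes, that is, 12:20.
setting = synchronized_setting(720, 20)
```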

This scenario conveys the essence of properly synchronizing distant clocks with our nearby clock. There are some assumptions that are ignored for now, namely that we can determine that the spaceship was relatively stationary with respect to Earth and was not in a different gravitational potential field from that of the Earth clock.

The diagram below illustrates the relativity of simultaneity for the so-called midway method of synchronization. There are two light flashes. Did they occur simultaneously?

[Minkowski diagram: Einstein at rest in the frame of the thick black axes; Lorentz moving toward the source of flash 2]

The Minkowski diagram represents Einstein sitting still in the reference frame indicated by the coordinate system with the thick black axes. Lorentz is traveling rapidly away from him and toward the source of flash 2. Because Lorentz’s worldline is a straight line, we can tell that he is moving at a constant speed. The two flashes of light arrive simultaneously at their midpoint according to Einstein but not according to Lorentz. Lorentz sees flash 2 before flash 1. That is, the event A of Lorentz seeing flash 2 occurs before event C of Lorentz seeing flash 1. So, Einstein will readily say the flashes are simultaneous, but Lorentz will have to do some computing to figure out that the flashes are simultaneous in the Einstein frame, because they are not simultaneous to him in a reference frame in which he is at rest. However, if we had chosen a different reference frame from the one above, one in which Lorentz is not moving but Einstein is, then it would be correct to say flash 2 occurs before flash 1. So, whether the flashes are or are not simultaneous depends on which reference frame is used in making the judgment. It is all relative.

There is a related philosophical issue involved with assumptions being made in, say, claiming that Einstein was initially midway between the two flashes. Can the midway determination be made independently of adopting a convention about whether the speed of light is independent of its direction of travel? This is the issue of whether there is a ‘conventionality’ of simultaneity.

13. What Is the Conventionality of Simultaneity?

The relativity of simultaneity is philosophically less controversial than the conventionality of simultaneity. To appreciate the difference, consider what is involved in making a determination regarding simultaneity. The central problem is that you cannot directly check what time it is simultaneously on your own clock and on a distant clock, so you can measure the speed of light only for a roundtrip, not a one-way trip.

Given two events that happen essentially at the same place, physicists assume they can tell by direct observation whether the events happened simultaneously. If they cannot detect that one of them happened first, then they say they happened simultaneously, and they assign the events the same time coordinate in the reference frame. The determination of simultaneity is much more difficult if the two events happen very far apart, such as the claim that the two flashes of light reaching Einstein in the scenario of the previous section began at the same time. One way to measure (operationally define) simultaneity at a distance is the midway method: say that two events are simultaneous in the reference frame in which we are stationary if unobstructed light signals caused by the two events reach us simultaneously when we are midway between the two places where they occurred. This is the operational definition of simultaneity used by Einstein in his theory of special relativity.

This midway method has a significant presumption: that the light beams coming from opposite directions travel at the same speed. Is this a fact or just a convenient convention to adopt? Einstein and the philosophers of time Hans Reichenbach and Adolf Grünbaum have called this a reasonable convention because any attempt to experimentally confirm the equality of speeds, they believed, presupposes that we already know how to determine simultaneity at a distance.

Hilary Putnam, Michael Friedman, and Graham Nerlich object to calling it a convention, on the grounds that to make any other assumption about light’s speed would unnecessarily complicate our description of nature, and we often decide what nature is like on the basis of which description of nature is simplest.

To understand the dispute from another perspective, notice that the midway method above is not the only way to define simultaneity. Consider a second method, the mirror reflection method. Select an Earth-based frame of reference, and send a flash of light from Earth to Mars where it hits a mirror and is reflected back to its source. The flash occurred at 12:00 according to a correct Earth clock, let’s say, and its reflection arrived back on Earth 20 minutes later. The light traveled the same empty, undisturbed path coming and going. At what time did the light flash hit the mirror? The answer involves the conventionality of simultaneity. All physicists agree one should say the reflection event occurred at 12:10 because they assume it took ten minutes going to Mars, and ten minutes coming back. The difficult philosophical question is whether this way of calculating the ten minutes is really just a convention. Einstein pointed out that there would be no inconsistency in our saying that the flash hit the mirror at 12:17, provided we live with the awkward consequence that light was relatively slow getting to the mirror, but then traveled back to Earth at a faster speed.

Suppose we want to synchronize a Mars clock with our clock on Earth using the reflection method. Let’s draw a Minkowski diagram of the situation and consider just one spatial dimension in which we are at location A on Earth next to the standard clock used for the time axis of the reference frame. The distant clock on Mars that we want to synchronize with Earth time is at location B. See the diagram.

[Minkowski diagram: synchronizing the distant B-clock on Mars with the A-clock on Earth by the reflection method]

The fact that the worldline of the B-clock is parallel to the time axis shows that the two clocks are assumed to be relatively stationary. (If they are not, and we know their relative speed, we might be able to correct for this.) We send light signals from Earth in order to synchronize the two clocks. Send a light signal from A at time t1 to B, where it is reflected back to us at A, arriving at time t3. So, the total travel time for the light signal is t3 – t1, as judged by the Earth-based frame of reference. Then the reading tr on the distant clock at the time of the reflection event should be set to t2, where:

t2 = t1 + (1/2)(t3 – t1).

If tr = t2, then the two spatially separated clocks are supposedly synchronized.

Einstein noticed that the use of the fraction 1/2, rather than the use of some other fraction, implicitly assumes that the light speed to and from B is the same. He said this assumption is a convention, the so-called conventionality of simultaneity, and is not something we could check to see whether it is correct. Only with the fraction 1/2 are the travel speeds the same going and coming back.
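Einstein’s point can be made concrete with the mirror-reflection numbers from above: a flash at 12:00, a return at 12:20, and Mars taken for illustration to be ten light-minutes away, with c = 1 light-minute per minute. The sketch below is mine; the ε parameter that generalizes the fraction 1/2 follows Reichenbach’s well-known usage.

```python
def reflection_time(t1, t3, eps=0.5):
    """Time assigned to the distant reflection event, t2 = t1 + eps*(t3 - t1).
    eps = 1/2 encodes the convention that light's outbound and return
    speeds are equal."""
    return t1 + eps * (t3 - t1)

def one_way_speeds(t1, t3, distance, eps=0.5):
    """Implied outbound and return light speeds for a given choice of eps."""
    t2 = reflection_time(t1, t3, eps)
    return distance / (t2 - t1), distance / (t3 - t2)

# With eps = 1/2, both speeds come out equal. Einstein's 12:17 alternative
# (eps = 17/20) makes light slow going out and fast coming back, yet the
# round-trip time, the only thing measured, is unchanged.
out, back = one_way_speeds(0.0, 20.0, 10.0, eps=17/20)
```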

Suppose we try to check whether the two light speeds really are the same. We would send a light signal from A to B, and see if the travel time was the same as when we sent it from B to A. But to trust these durations we would already need to have synchronized the clocks at A and B. But that synchronization process will presuppose some value for the fraction, said Einstein.

Not all philosophers of science agree with Einstein that the choice of (1/2) is a convention, nor with those philosophers such as Putnam who say the messiness of any other choice shows that the choice of 1/2 must be correct. Everyone does agree, though, that any other choice than 1/2 would make for messy physics.

Some researchers suggest that there is a way to check on the light speeds and not simply presume they are the same. Create two duplicate, correct clocks at A. Transport one of the clocks to B at an infinitesimal speed. Going this slow, the clock will arrive at B without having its own time reports deviate from that of the A-clock. That is, the two clocks will be synchronized even though they are distant from each other. Now the two clocks can be used to find the time when a light signal left A and the time when it arrived at B, and similarly for a return trip. The difference of the two time reports on the A and B clocks can be used to compute the light speed in each direction, given the distance of separation. This speed can be compared with the speed computed with the midway method. The experiment has never been performed, but the recommenders are sure that the speeds to and from will turn out to be identical, so they are sure that the (1/2) is correct and not a convention.
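The slow-transport idea can be checked numerically. For a clock carried at constant speed v over distance d, special relativity predicts it lags a stationary clock by (d/v)(1 − √(1 − v²/c²)), which shrinks toward zero as v does. A sketch in natural units, with an invented function name:

```python
import math

def transport_lag(distance, v, c=1.0):
    """Accumulated lag of a clock carried at constant speed v over the given
    distance: the travel time (d/v) times the time-dilation deficit."""
    travel_time = distance / v
    return travel_time * (1.0 - math.sqrt(1.0 - (v / c)**2))

# For small v the lag is approximately d*v/(2c^2), so halving the transport
# speed roughly halves the lag; in the limit of infinitesimally slow
# transport, the two clocks stay synchronized.
```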

Sean Carroll has yet another position on the issue. He says “The right strategy is to give up on the idea of comparing clocks that are far away from each other” (Carroll 2022, 150).

For additional discussion of this controversial issue of the conventionality of simultaneity, see (Callender 2017, p. 51) and pp. 179-184 of The Blackwell Guide to the Philosophy of Science, edited by Peter Machamer and Michael Silberstein, Blackwell Publishers, Inc., 2002.

14. What are the Absolute Past and the Absolute Elsewhere?

What does it mean to say the human condition is one in which you never will be able to affect an event outside your forward light cone? Here is a visual representation of the human condition according to the special theory of relativity, whose spacetime can always be represented by a Minkowski diagram of the following sort:

[Minkowski diagram: light cones dividing spacetime into the absolute past (green events), the absolute future, and the absolute elsewhere]

The absolutely past events (the green events in the diagram above) are the events in or on the backward light cone of your present event, your here-and-now. The backward light cone of event Q is the imaginary cone-shaped surface of spacetime points formed by the paths of all light rays reaching Q from the past.

The events in your absolute past zone or region are those that could have directly or indirectly affected you, the observer, at the present moment, assuming there were no intervening obstacles. The events in your absolute future zone are those that you could directly or indirectly affect.

An event’s being in another event’s absolute past is a feature of spacetime itself because the event is in the other event’s past in all possible reference frames. This feature is frame-independent. For any event in your absolute past, every observer in the universe (who is not making an error) will agree the event happened in your past. Not so for events that are in your past but not in your absolute past. Past events not in your absolute past are in what Eddington called your absolute elsewhere. The absolute elsewhere is the region of spacetime containing events that are not causally connectible to your here-and-now. Your absolute elsewhere is the region of spacetime that is neither in nor on either your forward or backward light cone. No event here and now can affect any event in your absolute elsewhere, and no event in your absolute elsewhere can affect you here and now.

If you look through a telescope you can see a galaxy that is a million light-years away, and you see it as it was a million years ago. But you cannot see what it looks like now because the present version of that galaxy is outside your light cone, and is in your absolute elsewhere.

A single point’s absolute elsewhere, absolute future, and absolute past form a partition of all spacetime into three disjoint regions. If point-event A is in point-event B’s absolute elsewhere, the two events are said to be spacelike related. If the two are in each other’s forward or backward light cones, they are said to be time-like related or to be causally connectible. We can affect or be affected by events that are time-like related to us. The order of occurrence of a space-like event (before or after or simultaneous with your here-and-now) depends on the chosen frame of reference, but the order of occurrence of a time-like event and our here-and-now does not. Another way to make the point is to say that, when choosing a reference frame, we have a free choice about the time order of two events that are space-like related, but we have no freedom when it comes to two events that are time-like related, because the causal order determines their time order. Because the time order of space-like related events is a matter of choice, the absolute elsewhere is also called the extended present. There is no fact of the matter about whether a point in your absolute elsewhere is in your present, your past, or your future. It is simply a conventional choice of reference frame that fixes which events in your absolute elsewhere are present events.

For any two events in spacetime, they are time-like, space-like, or light-like separated, and this is an objective feature of the pair that cannot change with a change in the reference frame. This is another implication of the fact that the light-cone structure of spacetime is real and objective, unlike features such as durations and lengths.

The past light cone looks like a cone in small regions in a spacetime diagram with one dimension of time and two of space. However, the past light cone is not cone-shaped in a large cosmological region, but rather has a pear-shape because all very ancient light lines must have come from the infinitesimal volume at the Big Bang.

15. What Is Time Dilation?

Time dilation occurs when two synchronized clocks get out of synchrony due either to their relative motion or to their being in regions of different gravitational field strengths. An observer always notices that it is the other person’s clock that is behaving oddly, never their own. When two observers are in relative motion, each sees the other person’s clock slowing down relative to their own. It is as if the other person’s time is stretched or dilated. There is philosophical controversy about whether the dilation is literally a change in time itself or only a change in how durations are measured using someone else’s clock as opposed to one’s own.
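The amount of dilation for relative motion is governed by the standard Lorentz factor γ = 1/√(1 − v²/c²). A minimal Python sketch (function names ours):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def gamma(v):
    """Lorentz factor for relative speed v. A moving clock is measured
    to tick slower by a factor of 1/gamma."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def dilated_interval(proper_seconds, v):
    """Duration you measure for a process whose own (proper) duration is
    proper_seconds when it moves past you at speed v."""
    return gamma(v) * proper_seconds

# At about 86.6% of light speed, gamma is about 2: a clock ticking once
# per second aboard the ship is measured from outside to tick once per
# roughly two seconds.
g = gamma(0.866 * C)
```

Note that γ is 1 at rest and grows without bound as v approaches c, which is why the effect is negligible at everyday speeds.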

The specific amount of time dilation depends on the clocks’ relative speed. Time dilation due to speed occurs even when one clock circles the other at a constant distance; although the circling clock is neither approaching nor receding, it still has a speed relative to the other clock, so it still runs slow relative to that clock.

The sister of time dilation is space contraction. The length of an object changes in different reference frames to compensate for time dilation so that the speed of light c in a vacuum is constant in every frame. The object’s length measured perpendicular to the direction of motion is not affected by the motion, but the length measured along the direction of motion is. If you are doing the measuring, then a moving stick is shorter along its direction of motion relative to you. The length changes not because of forces, but rather because space itself contracts. What a shock this is to our manifest image! No one notices that the space around themselves is contracting, only that the space somewhere else seems to be affected.
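Length contraction is the mirror image of the dilation formula: a stick of rest length L₀ moving at speed v measures L₀·√(1 − v²/c²) along its direction of motion. A short illustrative sketch in Python:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def contracted_length(rest_length, v):
    """Length measured for a stick of the given rest length moving at
    speed v along the direction in which the length is measured.
    Lengths perpendicular to the motion are unchanged."""
    return rest_length * math.sqrt(1.0 - (v / C) ** 2)

# A meter stick passing you at 80% of light speed measures 0.6 m long.
L = contracted_length(1.0, 0.8 * C)
```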

Here is a picture of the visual distortion of moving objects due to space contraction:

rolling wheel
Image: Corvin Zahn, Institute of Physics, Universität Hildesheim,
Space Time Travel (http://www.spacetimetravel.org/)

The picture shows the same wheel in three ways: (green) rotating in place at just below the speed of light; (blue) moving left to right at just below the speed of light; and (red) remaining still.

To give some idea of the quantitative effect of time dilation:

Among particles in cosmic rays we find protons…that move so fast that their velocities differ infinitesimally from the speed of light: the difference occurs only in the twentieth (sic!) non-zero decimal after the decimal point. Time for them flows more slowly than for us by a factor of ten billion. If, by our clock, such a proton takes a hundred thousand years to cross our stellar system—the Galaxy—then by ‘its own clock’ the proton needs only five minutes to cover the same distance (Novikov 1998, p. 59).
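Novikov’s numbers check out with simple arithmetic: dividing the Earth-frame crossing time by the dilation factor gives the proton’s own elapsed time. A quick Python check:

```python
# Checking Novikov's figures: with a time-dilation factor of ten
# billion, 100,000 Earth-years of travel shrinks to about five minutes
# of the proton's own (proper) time.
dilation_factor = 1e10
galaxy_crossing_years = 1e5
minutes_per_year = 365.25 * 24 * 60  # about 525,960

proper_time_minutes = galaxy_crossing_years / dilation_factor * minutes_per_year
# proper_time_minutes comes out to about 5.3
```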

16. How Does Gravity Affect Time?

According to the general theory of relativity, gravitational differences affect time by dilating it—in the sense that observers in a less intense gravitational field find that clocks in a more intense gravitational field run slow relative to their own clocks. It is as if the time of the clock in the intense gravitational field is stretched out and not ticking fast enough. People in ground-floor apartments outlive their twins in penthouses, all other things being equal. Light from basement flashlights is shifted toward the red end of the visible spectrum compared to light from flashlights in attics. All these phenomena are effects of gravitational time dilation.
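For heights small compared to Earth’s radius, the fractional rate difference between two clocks is approximately gΔh/c², the standard weak-field estimate. A rough Python sketch of how small the penthouse effect is (the 100 m and 80-year figures are illustrative, not from the text):

```python
# Weak-field estimate of gravitational time dilation near Earth's
# surface: a clock raised by height h runs faster by roughly the
# fraction g*h/c^2 (valid for h much smaller than Earth's radius).
G_SURFACE = 9.81           # Earth's surface gravity, m/s^2
C = 299_792_458.0          # speed of light, m/s

def fractional_rate_gain(height_m):
    return G_SURFACE * height_m / C ** 2

# Over an 80-year life, a penthouse 100 m above a ground-floor
# apartment gains only a few tens of microseconds.
seconds_in_80_years = 80 * 365.25 * 24 * 3600
gain = fractional_rate_gain(100.0) * seconds_in_80_years
```

This shows why the effect, though real, is utterly unnoticeable in everyday life and required atomic clocks to detect.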

Spacetime in the presence of gravity is curved, according to general relativity. So, time is curved, too. When time curves, clocks do not bend in space as if in a Salvador Dali painting. Instead they undergo gravitational time dilation.

Information from the Global Positioning System (GPS) of satellites orbiting Earth is used by your cell phone to tell you whether you should turn right at the next intersection. The GPS is basically a group of flying clocks that broadcast the time. The curvature of spacetime near Earth is significant enough that gravitational time dilation must be accounted for in these clocks. Gravitational time dilation makes the satellite clocks run faster than Earth’s standard surface clocks by about 45 microseconds per day, while the time dilation due to satellite speed makes them run slower by about 7 microseconds per day, for a net gain of roughly 38 microseconds per day. Therefore, these GPS satellites are launched with the ticking rates of their clocks adjusted so that, once in orbit, they stay synchronized with Earth’s standard time, and they are periodically readjusted. The less error in the atomic clock the better the GPS, and that is one reason physicists keep trying to build better clocks. (In 2018, gravitational time dilation was measured in Boulder, Colorado, U.S.A. so carefully that it detected the difference in ticking of two initially synchronized atomic clocks that differed in height by only a centimeter.)
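The standard GPS figures can be recovered from textbook formulas. The sketch below uses approximate published values for Earth’s mass and the GPS orbital radius, and the usual first-order approximations for the two effects; it is a back-of-the-envelope check, not an engineering calculation:

```python
import math

G = 6.674e-11               # gravitational constant, m^3 kg^-1 s^-2
M = 5.972e24                # Earth's mass, kg (approximate)
C = 299_792_458.0           # speed of light, m/s
R_EARTH = 6.371e6           # Earth's mean radius, m (approximate)
R_ORBIT = 2.6561e7          # GPS orbital radius, m (approximate)
SECONDS_PER_DAY = 86400.0

# Speed of a circular orbit: v = sqrt(G*M/r), about 3.9 km/s for GPS.
v = math.sqrt(G * M / R_ORBIT)

# Special-relativistic slowing, fraction ~ v^2/(2c^2): about 7 us/day.
velocity_loss = v ** 2 / (2 * C ** 2) * SECONDS_PER_DAY

# Gravitational speed-up, fraction ~ (potential difference)/c^2:
# about 45 us/day.
gravity_gain = G * M * (1 / R_EARTH - 1 / R_ORBIT) / C ** 2 * SECONDS_PER_DAY

# Net effect: satellite clocks gain roughly 38 microseconds per day.
net_gain = gravity_gain - velocity_loss
```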

When a metaphysician asks the question, “What is gravity?” there are three legitimate, but very different, answers. Gravity is (1) a force, (2) intrinsic curvature of spacetime, and (3) exchanges of virtual particles. All three answers have their uses. When speaking of spilling milk or designing a rocket to visit the moon, the first answer is most appropriate to use. In the context of general relativity, the second answer is most appropriate.

In the context of a future theory of quantum gravity that incorporates gravity into quantum mechanics and the standard model of particle physics, the third answer is expected to be best. At this more fundamental level, forces are features of field activity. Gravity particles called gravitons are fluctuations within the gravitational field, and what is happening with the spilled milk is that pairs of virtual entangled particles bubble up out of the relevant fields. Normally one member of the pair has normal positive momentum, and the other member has negative momentum. Those particles with negative momentum are exchanged between the milk and the Earth and floor, thereby causing the milk to be attracted to the floor, in analogy to how, when someone throws a boomerang beyond you, it can hit you on its way back and push you closer to the thrower.

17. What Happens to Time near a Black Hole?

Once thought by Einstein to be too strange to actually exist, black holes have subsequently been recognized to be real phenomena existing throughout the universe. Princeton physicist Richard Gott described a black hole as a hotel in which you can check in but cannot check out—referring to the fact that a black hole is a region of extremely warped spacetime such that almost nothing that falls in can get back out, even light. Black holes are called “holes” not because they are empty but because so many things fall in. A typical black hole is produced by the death of a star of a certain minimal mass whose nuclear fuel has been used up. Within one second it cools and is crushed by its own gravity to a tiny volume. A healthy star is an explosion. A black hole is an implosion.

Our Milky Way contains about 100 million black holes. The center of nearly every galaxy has one, although they exist in other places, too. The best evidence for black holes came in 2015 with the direct detection of gravitational waves of the kind predicted to be produced by the collision of two black holes.

The infinitesimal center of a black hole is often called its singularity, but strictly speaking the center and the singularity are different. According to the theory of relativity, the spatial center is a crushed object of infinite spatial curvature; earlier, that object was responsible for creating the black hole by collapsing. The singularity is the end of the proper time of any object that plunges into the hole. Nevertheless it is common even for experts to casually use the two terms interchangeably.

Physicists are suspicious that relativity theory is mistaken in implying that the crushing results in an infinitesimal point of infinite mass density and infinite curvature at the black hole’s center. Quantum theory suggests that the point will be small but not infinitesimal, and the curvature there will be very high but not infinite.

Here is a processed photograph of a black hole surrounded by its colorful accretion disk that is radiating electromagnetic radiation (mostly high-energy x-rays) due to particles outside the hole crashing into each other:

picture of black hole
The M87 black hole image produced by the European Southern Observatory

The colors in the picture are artifacts added by a computer because the real light (when shifted from x-ray frequencies to optical frequencies) would be white and because humans can detect differences among colors better than differences in the brightness of white light. A black hole can spin, but even if it is not spinning, its surrounding accretion disk will surely be spinning. The accretion disk is not spherical, but is pizza-shaped.

Think of the event horizon as a two-dimensional spherical envelope. To plunge across the event horizon is to cross a point of no return. Even light generated inside cannot get back out. So, black holes are relatively dark compared to stars. However, because the accretion disk outside the horizon can eject particles and shine as a quasar, some supermassive black holes are the most luminous objects in the universe.

The event horizon is a two-dimensional fluid-like surface separating the inside from the outside of the black hole. If you were unlucky enough to fall through the event horizon, you could see out, but you could not send a signal out, nor could you yourself escape even if your spaceship had an extremely powerful thrust. The space around you increasingly collapses, so you would be squeezed on your way to the center—a process called “spaghettification” from the word spaghetti. Relativity theory implies you’d be crushed to an actual point at the singularity, but that feature of relativity theory conflicts with quantum theory and so has few advocates. Despite being crushed, you would continue to affect the world outside the black hole via your contribution to its gravity.

According to relativity theory, if you were in a spaceship approaching a black hole and getting near its event horizon, then your time warp would become very significant as judged by clocks back on Earth. The warp (the slowing of your clock relative to clocks back on Earth) would be more severe the longer you stayed in the vicinity and also the closer you got to the event horizon. Even if your spaceship accelerated rapidly toward the hole, viewers from outside would see your spaceship progressively slow its speed during its approach to the horizon. Reports sent back toward Earth of the readings of your spaceship’s clock would become dimmer and lower in frequency (due to gravitational red shift), and these reports would show that your clock’s ticking was slowing down (dilating) compared to Earth clocks.

Any macroscopic object can become a black hole if sufficiently compressed. An object made of anti-matter can become a black hole. If you bang two particles together fast enough, they will produce a black hole, and the black hole will begin pulling in nearby particles. Luckily, even our best particle colliders in Earth’s laboratories are not powerful enough to do this. Only the more massive stars will become black holes when their fuel runs out and they stop radiating. Our Sun is not quite massive enough. If an electron were a point particle, then its mass would be packed into zero volume, making it dense enough to quickly become a black hole, and the absence of this phenomenon is one reason to believe electrons are not point particles. Black holes can be very small, but it is generally believed that their minimum mass is the Planck mass.

The black hole pictured above sits at the center of the galaxy M87, not in our Milky Way. It has a mass of about 6.5 billion of our Suns, so it is too big to have originated from the collapse of a single star; it probably has eaten many stars. There is another, smaller black hole at the center of the Milky Way, which also probably grew by feeding on neighboring stars and other nearby material. Almost all galaxies have a black hole at their center, but black holes also exist elsewhere. These black holes are not powerful enough to suck in all the stars around them, just as our Sun will never suck in all the planets of our solar system. All known black holes have some spin, but no black hole can spin so fast as to violate Einstein’s speed limit.

A black hole that is spinning is not quite a sphere. If it spins very rapidly, then it is flattened at its poles. A black hole’s accretion disk also spins, and because of this the Doppler effect shown in the picture above requires the redness at the top to be less bright than at the bottom of the picture. The picture has been altered to remove the blurriness that would otherwise be present due to the refraction from the plasma and dust between the Earth and the black hole. This plasma close to the black hole has a temperature of hundreds of billions of degrees.

The matter orbiting the black hole is a diffuse gas of electrons and protons. …The black hole pulls that matter from the atmospheres of stars orbiting it. Not that it pulls very much. Sagittarius A* is on a starvation diet—less than 1 percent of the stuff captured by the black hole’s gravity ever makes it to the event horizon. (Seth Fletcher. Scientific American, September 2022 p. 53.)

It is sometimes said that relativity theory implies an infalling spaceship suffers an infinite time dilation at the event horizon and so does not fall through the horizon in a finite time. This is not quite true because experts now realize the gravitational field produced by the spaceship itself acts on the black hole. This implies that, as the spaceship gets very, very close to the event horizon, an atom’s width away, the time dilation does radically increase, but the event horizon slightly expands enough to swallow the spaceship in a finite time—a trivially short time as judged from the spaceship, but a very long time as judged from Earth. This occurrence of slight expansion is one sign that the event horizon is fluidlike.

By applying quantum theory to black holes, Stephen Hawking discovered that every black hole radiates some energy at its horizon and will eventually evaporate. All known black holes take longer than the age of the universe to evaporate. For example, black holes with a mass a few times larger than our Sun take about 10^64 years to completely evaporate. To appreciate how long a black hole lives, remember that the Big Bang occurred less than twenty billion years ago (2 x 10^10 years ago). Every black hole absorbs the cosmic background radiation, so a black hole will not even start evaporating and losing mass-energy until the cosmic background radiation cools below the temperature of the black hole. Quantum theory suggests black holes get warmer as they shrink. They get smaller by absorbing particles at their event horizon that carry negative energy. When a black hole becomes the size of a bacterium, its outgoing radiation becomes white-colored, producing a white “black” hole. At the very last instant of its life, it evaporates as it explodes in a flash of extremely hot, high-energy particles.
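The enormous timescales come from the semiclassical evaporation formula t = 5120·π·G²·M³/(ħc⁴), which assumes emission of photons only. More detailed treatments that include other particle species shift the answer by a few orders of magnitude, which is why quoted figures for stellar-mass black holes vary; the simple formula gives roughly 2 × 10^67 years for one solar mass. A Python sketch:

```python
import math

# Semiclassical evaporation time, t = 5120*pi*G^2*M^3 / (hbar*c^4).
# This photons-only estimate is rough; published figures differ by a
# few orders of magnitude depending on the assumed emitted species.
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34      # reduced Planck constant, J s
C = 299_792_458.0      # speed of light, m/s
M_SUN = 1.989e30       # solar mass, kg
SECONDS_PER_YEAR = 3.156e7

def evaporation_years(mass_kg):
    seconds = 5120 * math.pi * G ** 2 * mass_kg ** 3 / (HBAR * C ** 4)
    return seconds / SECONDS_PER_YEAR

# For one solar mass this is about 2e67 years, vastly longer than the
# roughly 1.4e10 years that have elapsed since the Big Bang.
t = evaporation_years(M_SUN)
```

The M³ scaling means a black hole twice as massive lives eight times as long.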

Nearly all physical objects tend to get warmer when you shine a light on them. Think of your ice cream cone in the sunshine. A black hole is an exception. It gets colder.

Black holes produce startling visual effects. A light ray can circle outside a black hole once or many times depending upon its angle of incidence to the event horizon. A light ray grazing a black hole can leave at any angle, so a person viewing a black hole from outside can see multiple copies of the rest of the universe at various angles. See http://www.spacetimetravel.org/reiseziel/reiseziel1.html for some of these visual effects.

Every spherical black hole has the odd geometric feature that its diameter is very much larger than its circumference, very unlike a sphere in Euclidean geometry.

Some popularizers have said that the roles of time and space are reversed within a black hole, but this is not correct. Instead it is coordinates that reverse their roles. Given a coordinate system whose origin is outside a black hole, its timelike coordinates become spacelike coordinates inside the horizon. If you were to fall into a black hole, your clock would not begin measuring distance. See (Carroll 2022c, pp. 251-255) for more explanation of this role reversal.

History

In 1783, John Michell proposed that there might be a star so massive and compact that the velocity required to escape its gravitational pull would be so great that not even Newton’s particles of light could escape. He called such hypothetical objects “dark stars.” Einstein invented the general theory of relativity in 1915, and the next year the German physicist Karl Schwarzschild discovered that Einstein’s equations imply that if a non-rotating, spherical star were massive enough and its radius (now called the Schwarzschild radius) were somehow small enough, then it would undergo an unstoppable collapse. Moreover, the gravitational force from the object would be so strong that not even light could escape the inward pull of gravity. In 1935, Arthur Eddington commented upon this discovery that relativity theory allowed a star to collapse to become a black hole:

I think there should be a law of nature to stop a star behaving in this absurd way.

Because of Eddington’s prestige, other physicists (with the notable exception of Subrahmanyan Chandrasekhar) agreed. Then in 1939, J. Robert Oppenheimer and his student Hartland Snyder first seriously suggested that stars would in fact collapse into black holes, and they first clearly described the two defining features of a black hole—that “The star thus tends to close itself off from any communication with a distant observer; only its gravitational field persists.” The term “black hole” was first explicitly mentioned by physicist Robert Dicke some time in the early 1960s when he made the casual comparison to a notorious dungeon of the same name in India, the Black Hole of Calcutta. The term was first published in the American magazine Science News Letter in 1964. John Wheeler subsequently promoted use of the term, following a suggestion from one of his students.
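The Schwarzschild radius mentioned above is given by the standard formula r_s = 2GM/c²: the radius to which a mass M must be compressed for light to be unable to escape. A short Python sketch using approximate values for the constants:

```python
# Schwarzschild radius r_s = 2*G*M/c^2, the radius below which a mass M
# must be compressed for even light to be trapped.
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m/s
M_SUN = 1.989e30       # solar mass, kg
M_EARTH = 5.972e24     # Earth's mass, kg

def schwarzschild_radius_m(mass_kg):
    return 2 * G * mass_kg / C ** 2

# The Sun's Schwarzschild radius is about 3 km; Earth's is about 9 mm.
r_sun = schwarzschild_radius_m(M_SUN)
r_earth = schwarzschild_radius_m(M_EARTH)
```

This makes vivid why only the most compressed objects become black holes: the Sun would have to shrink from about 700,000 km in radius to about 3 km.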

18. What Is the Solution to the Twins Paradox?

The paradox is an argument that uses the theory of relativity to produce an apparent contradiction. Before giving that argument, let’s set up a typical situation that can be used to display the paradox. Consider two twins at rest on Earth with their correct clocks synchronized. One twin climbs into a spaceship and flies far away at a high, constant speed, then stops, reverses course, and flies back at the same speed. An application of the equations of special relativity theory shows that the twin on the spaceship will return and be younger than the Earth-based twin. Their clocks disagree about the elapsed time of the trip. Now that the situation has been set up, notice that relativity theory implies that either twin could say they are the stationary twin. Isn’t that an implication of relativity theory?

The paradoxical argument is that either twin could regard the other as the traveler and thus as the one whose time dilates. If the spaceship were considered to be stationary, then would not relativity theory imply that the Earth-based twin could race off (while attached to the Earth) and return to be the younger of the two twins? If so, then when the twins reunite, each is younger than the other. That result is paradoxical.

Herbert Dingle was the President of London’s Royal Astronomical Society in the early 1950s. He famously argued in the 1960s that this twins paradox reveals an inconsistency in special relativity. Almost all philosophers and scientists disagree with Dingle and say the twin paradox is not a true paradox, in the sense of revealing an inconsistency within relativity theory, but is merely a complex puzzle that can be adequately solved within relativity theory.

The twins paradox is not an actual paradox, just an interesting puzzle that has a solution. The solution is that the two situations are not sufficiently similar; because of this, for reasons to be explained in a moment, the twin who stays home on Earth maximizes his or her own time (that is, proper time) and so is always the older twin when the two reunite. This solution to the paradox involves spacetime geometry, and it has nothing to do with an improper choice of the reference frame, nor with acceleration, even though one twin does accelerate in the situation as it was introduced above. The solution has to do with the fact that some paths in spacetime must take more proper time to complete than other paths do. As Maudlin puts it, “the issue is how long the world-lines are, not how bent.”

Here is how to understand the paradox. Consider the spacetime diagram below.

twin paradox

The principal suggestion for solving the paradox is to note that there must be a difference in the time taken by the twins because their behaviors are different, as shown by the number and spacing of nodes along their two worldlines above. The nodes represent ticks of their clocks. Notice how the space traveler’s time is stretched or dilated compared to the coordinate time, which is also the time of the stay-at-home twin. The coordinate time, that is, the time shown by clocks fixed in space in the coordinate system, is the same for both travelers. Their personal times are not the same. The traveler’s personal time is less than that of the twin who stays home.

For simplicity, we are giving the twin in the spaceship an instantaneous initial acceleration and ignoring the enormous gravitational forces this would produce, and we are ignoring the fact that the Earth is not really stationary but moves slowly through space during the trip.

The key idea for resolving the paradox is not that one twin accelerates and the other does not (though that does happen), even though this claim is very popular in the philosophy and physics literature. It’s that, during the trip, the traveling twin experiences less time but more space. That fact is shown by how their worldlines in spacetime differ. Relativity theory requires that, for two paths that begin and end at the same point, the longer the path in spacetime (and thus the longer the worldline in the spacetime diagram), the shorter the elapsed proper time along that path. That difference is why the spacing of nodes is so different for the two travelers. This is counterintuitive, because our intuitions falsely suggest that longer paths take more time even if they are spacetime paths. And nobody’s clock is speeding up or slowing down relative to its rate a bit earlier.

A free-falling clock ticks faster and more often than any other accurate clock used to measure the duration between a pair of events, and this holds for the pair of events at which the twins leave each other and reunite. This is illustrated graphically by the fact that the longer worldline in the graph represents a greater distance in space but a shorter duration along that worldline. The number of dots in the line is a correct measure of the time taken by the traveler. The spacing of the dots represents the durations between ticks of a personal clock along that worldline. If the spaceship approached the speed of light, that twin would cover an enormous amount of space before the reunion, but that twin’s clock would hardly have ticked at all before the reunion event.

To repeat this solution in other words, the diagram shows how sitting still on Earth is a way of maximizing the trip time, and it shows how flying near light speed in a spaceship away from Earth and then back again is a way of minimizing the time for the trip, even though if you paid attention only to the shape of the worldlines in the diagram and not to the dot spacing within them you might mistakenly think just the reverse. This odd feature of the geometry is one reason why Minkowski geometry is different from Euclidean geometry. So, the conclusion of the analysis of the paradox is that its reasoning makes the mistake of supposing that the situation of the two twins can properly be considered to be essentially the same.
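For constant-speed legs, the proper time along a worldline is just the coordinate time multiplied by √(1 − v²/c²), so the asymmetry can be computed directly. A minimal Python sketch with illustrative numbers (10 Earth-years per leg at 0.8c, chosen for round arithmetic):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def proper_time(coordinate_time, v):
    """Proper time elapsing on a clock moving at constant speed v while
    coordinate_time elapses in the stay-at-home twin's frame."""
    return coordinate_time * math.sqrt(1.0 - (v / C) ** 2)

# Outbound and return legs at 0.8c, each taking 10 years of Earth time:
# the home twin ages 20 years, the traveler only 12.
earth_years = 10.0 + 10.0
traveler_years = proper_time(10.0, 0.8 * C) + proper_time(10.0, 0.8 * C)
```

The computation illustrates the point in the text: the worldline that covers more space between the same two reunion events accumulates less proper time.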

Richard Feynman famously, but mistakenly, argued in 1975 that acceleration is the key to the paradox. As (Maudlin 2012) explains, the acceleration that occurs in the paths of the example above is not essential to the paradox because the paradox could be expressed in a spacetime obeying special relativity in which neither twin accelerates yet the twin in the spaceship always returns younger. The paradox can be described using a situation in which spacetime is compactified in the spacelike direction with no intrinsic spacetime curvature, only extrinsic curvature. To explain that remark, imagine this situation: All of Minkowski spacetime is like a very thin, flat cardboard sheet. It is “intrinsically flat.” Then roll it into a cylinder, like the tube you have after using the last paper towel on the roll. Do not stretch, tear, or otherwise deform the sheet. Let the time axis be parallel to the tube length, and let the one-dimensional space axis be a circular cross-section of the tube. The tube spacetime is still flat intrinsically, as required by special relativity, even though now it is curved extrinsically (which is allowed by special relativity). The traveling twin’s spaceship circles the universe at constant velocity, so its spacetime path is a spiral. The stay-at-home twin sits still, so its spacetime path is a straight line along the tube. The two paths start together, separate, and eventually meet (many times). During the time between separation and the first reunion, the spaceship twin travels in a spiral as viewed from a higher dimensional Euclidean space in which the tube is embedded. That twin experiences more space but less time than the stationary twin. Neither twin accelerates. There need be no Earth nor any mass nearby for either twin. Yet the spaceship twin who circles the universe comes back younger because of the spacetime geometry involved, in particular because the twin travels farther in space and less far in time than the stay-at-home twin.

For more discussion of the paradox, see (Maudlin 2012), pp. 77-83, and, for the travel on the cylinder, see pp. 157-8.

19. What Is the Solution to Zeno’s Paradoxes?

See the article “Zeno’s Paradoxes” in this encyclopedia.

20. How Are Coordinates Assigned to Time?

A single point of time is not a number, but it has a number when a coordinate system is applied to time. When coordinate systems are assigned to spaces, coordinates are assigned to points. The space can be physical space or mathematical space. The coordinates hopefully are assigned in a way that allows a helpful metric to be defined for computing the distances between any pair of point-places or, in the case of time, the duration between any pair of point-times. Points, including times, cannot be added, subtracted, or squared, but their coordinates can be. Coordinates applied to the space are not physically real; they are tools used by the analyst or physicist, and they are invented, not discovered. The coordinate system gives each instant a unique name.

Technically, the question, “How do time coordinates get assigned to points in spacetime?” presupposes knowing how we coordinatize the four-dimensional manifold that we call spacetime. The manifold is a collection of points (technically, it is a topological space) which behaves as a Euclidean space in neighborhoods around any point. The focus in this section is on its time coordinates.

There is very good reason for believing that time is one-dimensional, and so, given any three different point events, one of them will happen between the other two. This feature is reflected in the fact that, when real-number time coordinates are assigned to three point events, one of the three coordinates is between the other two.

Every event on the world-line of the standard clock is assigned a t-coordinate by that special clock, and the clock can also be used to provide measures of the duration between two point events that occur along the coordinate line. For example, if some event e along the time-line of the master clock occurs at the spatial location of the clock while the master clock shows, say, t = 4 seconds, then the time coordinate of the event e is declared to be 4 seconds. That is, t(e) = 4. We assume that e occurs spatially at an infinitesimal distance from the master clock, and that we have no difficulty in telling when this situation occurs. So, even though determinations of distant simultaneity are somewhat difficult to compute, determinations of local simultaneity in the coordinate system are not. In this way, every event along the master clock’s time-line is assigned a time of occurrence in the coordinate system.
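The assignment just described can be mimicked in a toy Python model: the master clock hands each event on its world-line a real-number t-coordinate, durations are coordinate differences, and the one-dimensionality of time shows up as betweenness among any three coordinates. The event e with t(e) = 4 comes from the text; the other event names are our illustrative inventions:

```python
# Toy model of coordinate assignment along the master clock's world-line.
# The event "e" with t-coordinate 4 is from the text; "launch" and
# "splashdown" are hypothetical events added for illustration.
t_coordinate = {"launch": 0.0, "e": 4.0, "splashdown": 9.5}  # seconds

def duration(event_a, event_b):
    """Duration between two events on the clock's world-line is just
    the absolute difference of their t-coordinates."""
    return abs(t_coordinate[event_b] - t_coordinate[event_a])

def between(a, b, c):
    """Because time is one-dimensional, one of any three distinct
    instants lies between the other two; test whether b is that one."""
    tb = t_coordinate[b]
    return min(t_coordinate[a], t_coordinate[c]) < tb < max(t_coordinate[a], t_coordinate[c])
```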

In order to extend the t-coordinate to events that do not occur where the standard clock is located, we can imagine having a stationary, calibrated, and synchronized clock at every other point in the space part of spacetime at t = 0, and we can imagine using those clocks to tell the time along their worldlines. In practice we do not have so many accurate clocks, so the details of assigning times to these events are fairly complicated and are not discussed here. The main philosophical issue is whether simultaneity may be defined for events anywhere in the universe. The sub-issues involve the relativity of simultaneity and the conventionality of simultaneity. Both issues are discussed in other sections of this supplement.

Isaac Newton conceived of points of space and time as absolute in the sense that they retained their identity over time. Modern physicists do not have that conception of points; points are identified relative to events, for example, the halfway point in space between this object and that object, and ten seconds after that point-event.

In the late 16th century, the Italian mathematician Rafael Bombelli interpreted real numbers as lengths on a line and interpreted addition, subtraction, multiplication, and division as “movements” along the line. His work eventually led to our assigning real numbers to instants. Subsequently, physicists have found no reason to use complex numbers or other exotic numbers for this purpose, although some physicists believe that the future theory of quantum gravity might show that discrete numbers such as integers will suffice and the exotically structured real numbers will no longer be required.

To assign numbers to instants (the numbers being the time coordinates or dates), we use a system of clocks and some calculations, and the procedure is rather complicated the deeper one probes. For some of the details, the reader is referred to (Maudlin 2012), pp. 87-105. On pp. 88-89, Maudlin says:

Every event on the world-line of the master clock will be assigned a t-coordinate by the clock. Extending the t-coordinate to events off the trajectory of the master clock requires making use of…a collection of co-moving clocks. Intuitively, two clocks are co-moving if they are both on inertial trajectories and are neither approaching each other nor receding from each other. …An observer situated at the master clock can identify a co-moving inertial clock by radar ranging. That is, the observer sends out light rays from the master clock and then notes how long it takes (according to the master clock) for the light rays to be reflected off the target clock and return. …If the target clock is co-moving, the round-trip time for the light will always be the same. …[W]e must calibrate and synchronize the co-moving clocks.

The master clock is the standard clock. Co-moving inertial clocks do not generally exist according to general relativity, so the issue of how to assign time coordinates is complicated in the real world. Here are a few more interesting comments about the assignment.

The main point of having a time coordinate is to get agreement from others about which time values to use for which events, namely which time coordinates to use. Relativity theory implies every person and even every object has its own proper time, which is the time of the clock accompanying it. Unfortunately, these personal clocks do not usually stay in synchrony with other well-functioning clocks, although Isaac Newton falsely believed they do. According to relativity theory, if you were to synchronize two perfectly-performing clocks and give one of them a speed relative to the other, then the two clocks’ readings must differ (as would be obvious if they reunited), so once you have moved a clock away from the standard clock you can no longer trust it to report the correct coordinate time at its new location.

The process of assigning time coordinates assumes that the structure of the set of instantaneous events is the same as, or is embeddable within, the structure of our time numbers. Showing that this is so is called solving the representation problem for our theory of time measurement. The problem has been solved. This article does not go into detail on how to solve this problem, but the main idea is that the assignment of coordinates should reflect the structure of the space of instantaneous times, namely its geometrical structure, which includes its topological structure, diffeomorphic structure, affine structure, and metrical structure. It turns out that the geometrical structure of our time numbers is well represented by the structure of the real numbers.

The features that a space has without its points being assigned any coordinates whatsoever are its topological features, its differential structure, and its affine structure. The topological features include its dimensionality, whether it goes on forever or has a boundary, and how many points there are. The mathematician will be a bit more precise and say the topological structure tells us which subsets of points form the open sets, the sets that have no boundaries within them. The affine structure is about which lines are straight and which are curved. The diffeomorphic structure distinguishes smooth curves from bent ones, those lacking a derivative at some point.

If the space has a certain geometry, then the procedure of assigning numbers to time must reflect this geometry. For example, if event A occurs before event B, then the time coordinate of event A, namely t(A), must be less than t(B). If event B occurs after event A but before event C, then we should assign coordinates so that t(A) < t(B) < t(C).

Consider a space as a class of fundamental entities: points. The class of points has “structure” imposed upon it, constituting it as a geometry—say the full structure of space as described by Euclidean geometry. [By assigning coordinates] we associate another class of entities with the class of points, for example a class of ordered n-tuples of real numbers [for an n-dimensional space], and by means of this “mapping” associate structural features of the space described by the geometry with structural features generated by the relations that may hold among the new class of entities—say functional relations among the reals. We can then study the geometry by studying, instead, the structure of the new associated system [of coordinates]. (Sklar 1976, p. 28)

But we always have to worry that there is structure among the numbers that is not among the entities numbered. Such structures are “mathematical artifacts.”

The goal in assigning coordinates to a space is to create a reference system, that is, a reference frame plus a coordinate system (or, as some of the literature has it, a frame that includes a coordinate system; usage is ambiguous on this point). For 4D spacetime obeying special relativity with its Lorentzian geometry, a Lorentzian coordinate system is a grid of smooth timelike and spacelike curves on the spacetime that assigns to each point three space-coordinate numbers and one time-coordinate number. No two distinct points of the spacetime can have the same set of four coordinate numbers. Technically, being continuous is a weaker requirement than being smooth, but the difference is not of concern here.

As we get more global, we have to make adjustments. Consider two coordinate systems in adjacent regions. We make sure that the ‘edges’ of the two coordinate systems match up, in the sense that each point near the intersection of the two coordinate systems gets a unique set of four coordinates and that nearby points get nearby coordinate numbers. The result is an atlas on spacetime. Inertial frames can have global coordinate systems, but in general we have to use atlases for other frames. If we are working with general relativity, where spacetime can curve and we cannot assume inertial frames, then the best we can do without atlases is to assign a coordinate system to a small region of spacetime where the laws of special relativity hold to a good approximation. General relativity requires special relativity to hold locally, that is, in any infinitesimal region, and thus requires space to be Euclidean locally. That means that locally the 3-d space is correctly described by 3-d Euclidean solid geometry. Adding time is a complication: spacetime is not Euclidean in relativity theory. Infinitesimally, it is Minkowskian.

Anywhere in the atlas, we demand that nearby events get nearby coordinates. When this feature holds everywhere, the coordinate assignment is said to “obey the continuity requirement.” We satisfy this requirement by using real numbers as time coordinates.

The metric of spacetime in general relativity is not global but varies from place to place due to the presence of matter and gravitation, and it varies over time as the spatial distribution of matter and energy changes. So spacetime cannot be given its coordinate numbers without our knowing the distribution of matter and energy. That is the principal reason why the assignment of time coordinates to times is so complicated.

To approach the question of the assignment of coordinates to spacetime points more philosophically, consider this challenging remark:

Minkowski, Einstein, and Weyl invite us to take a microscope and look, as it were, for little featureless grains of sand, which, closely packed, make up space-time. But Leibniz and Mach suggest that if we want to get a true idea of what a point of space-time is like we should look outward at the universe, not inward into some supposed amorphous treacle called the space-time manifold. The complete notion of a point of space-time in fact consists of the appearance of the entire universe as seen from that point. Copernicus did not convince people that the Earth was moving by getting them to examine the Earth but rather the heavens. Similarly, the reality of different points of space-time rests ultimately on the existence of different (coherently related) viewpoints of the universe as a whole. Modern theoretical physics will have us believe the points of space are uniform and featureless; in reality, they are incredibly varied, as varied as the universe itself.
—From “Relational Concepts of Space and Time” by Julian B. Barbour, The British Journal for the Philosophy of Science, Vol. 33, No. 3 (Sep., 1982), p. 265.

For a sophisticated and philosophically-oriented approach to assigning time coordinates to times, see Philosophy of Physics: Space and Time by Tim Maudlin, pp. 24-34.

21. How Do Dates Get Assigned to Actual Events?

The following discussion presupposes the discussion in the previous section.

Our purpose in choosing a coordinate system or atlas is to express time-order relationships (Did this event occur between those two or before them or after them?) and magnitude-duration relationships (How long after A did B occur?) and date-time relationships (When did event A itself occur?). The date of a (point) event is the time coordinate number of the spacetime coordinate of the event. We expect all these assignments of dates to events to satisfy the requirement that event A happens before event B iff t(A) < t(B), where t(A) is the time coordinate of A, namely its date. The assignments of dates to events also must satisfy the demands of our physical theories, and in this case we face serious problems involving inconsistency if a geologist gives one date for the birth of Earth, an astronomer gives a different date, and a theologian gives yet another date.

Ideally for any reference frame, we would like to partition the set of all actual events into simultaneity equivalence classes by some reliable method. All events in one equivalence class happen at the same time in the frame, and every event is in some class or other.

This cannot be done, but it is interesting to know how close we can come to doing it and how we would go about doing it. We would like to be able to say what event near our spaceship circling Mars (or the supergiant star Betelgeuse) is happening now (at the same time as our now where we are located). More generally, how do we determine whether a nearby event and a very distant event occurred simultaneously? Here we face the problem of the relativity of simultaneity and the problem of the conventionality of simultaneity.

How do we calibrate and synchronize our own clock with the standard clock? Let’s design a coordinate system for time. Suppose we have already assigned a date of zero to the event that we choose to be at the origin of our coordinate system. To assign dates (that is, time coordinates) to other events, we must have access to information from the standard clock, our master clock, and be able to use this information to declare correctly that the time intervals between any two consecutive ticks of our own clock are the same. The second is our conventional unit of time measurement, and it is defined to be the duration required for a specific number of ticks of the standard clock.

We then hope to synchronize other clocks with the standard clock so the clocks show equal readings at the same time. We cannot do this. What are the obstacles? The time or date at which a point-event occurs is the number reading on the clock at rest there. If there is no clock there, the assignment process is more complicated. One could transport a synchronized clock to that place, but any motion of the clock or any influence of a gravitational field during the transport will need to be compensated for. If the place is across the galaxy, then any transport is out of the question, and other means must be used.

We want to use clocks to assign a time coordinate even to very distant events, not just to events in the immediate vicinity of the clock. As has been emphasized several times throughout this article, the major difficulty is that two nearby synchronized clocks, namely clocks that have been calibrated and set to show the same time when they are next to each other, will not in general stay synchronized if one is transported somewhere else. If they undergo the same motions and gravitational influences, and thus have the same worldline or timeline, then they will stay synchronized; otherwise, they will not. There is no privileged transportation process that we can appeal to. Einstein offered a solution to this problem.

He suggested the following method. Assume in principle that we have stationary, ideal clocks located anywhere and we have timekeepers there who keep records and adjust clocks. Assume there is an ideal clock infinitesimally near the spaceship. Being stationary in the coordinate system implies it co-moves with respect to the master clock back in Greenwich. We need to establish that the two clocks remain the same distance apart, so how could we determine that they are stationary? We determine that, each time we send a light signal from Greenwich and bounce it off the distant clock, the roundtrip travel time remains constant. That procedure also can be used to synchronize the two clocks, or at least it can in a world that obeys special relativity, provided we know how far away the distant clock is. For example, suppose the spaceship is known to be a distance d away from Greenwich, and the roundtrip travel time is, say, 2t seconds. When someone at the spaceship receives a signal from Greenwich saying it is noon, the person at the spaceship sets their clock to t seconds after noon. This is an ideal method of establishing simultaneity for distant events.
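Einstein’s procedure can be sketched in a few lines of code. This is only an illustration, with hypothetical names and numbers; the key assumption is the conventional one that the one-way travel time of light equals half the round-trip time.

```python
C = 299_792_458  # speed of light in meters per second (exact, by definition)

def einstein_sync(distance_m, master_emission_time_s):
    """When a light signal emitted at master_emission_time_s (seconds past
    midnight on the master clock) arrives at a co-moving clock a known
    distance away, that clock should be set to the emission time plus the
    one-way travel time."""
    one_way_s = distance_m / C  # assumes the one-way speed of light is c
    return master_emission_time_s + one_way_s

# A spaceship one light-second away receives a "noon" signal (noon = 43,200 s);
# it sets its clock to one second after noon.
setting = einstein_sync(299_792_458, 43_200.0)
```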

This method has some hidden assumptions that have not been mentioned. For more about this and about how to assign dates to distant events, see the discussions of the relativity of simultaneity and the conventionality of simultaneity.

As a practical matter, dates are assigned to events in a wide variety of ways. The date of the birth of the Sun is assigned very differently from dates assigned to two successive crests of a light wave in a laboratory laser. For example, there are lasers whose successive crests of visible light waves pass by a given location in the laboratory every 10⁻¹⁵ seconds. This short time is not measured with a stopwatch. It is computed from measurements of the light’s wavelength. We rely on electromagnetic theory for the equation connecting the periodic time of the wave to its wavelength and speed. Dates for other kinds of events, such as the birth of Mohammad or the origin of the Sun, are computed from historical records rather than directly measured with a clock.
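The computation just mentioned is a one-liner: the period of a light wave is its wavelength divided by its speed. A sketch, using an illustrative wavelength of 500 nanometers (green light); the function name is ours:

```python
C = 299_792_458  # speed of light in vacuum, m/s

def period_from_wavelength(wavelength_m):
    # electromagnetic theory: period T = wavelength / speed
    return wavelength_m / C

T = period_from_wavelength(500e-9)  # 500 nm, a typical visible wavelength
# T comes out on the order of 10^-15 seconds (roughly 1.7 femtoseconds)
```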

22. What Is Essential to Being a Clock?

We use a clock to tell what time it is, and which of two events happened first, and how long an event lasts. In order to do this, the clock needs at least two sub-systems, (1) ticking and (2) the counting of those ticks. The goal in building the ticking sub-system is to have a tick rate that is stable. That means it is regular in the sense of not drifting very much over time. The tick rate in clocks that use cyclic processes is called the “frequency,” and it is measured in cycles per second. The counting sub-system counts the ticks in order to measure how much time has elapsed between two events of interest, and to calculate what time it is, and to display the result.

All other things being equal, the higher the frequency of our best clocks the better. Earth rotations are slow. Pendulums are better. With a quartz clock (used in all our computers and cellphones), a piece of quartz crystal is stimulated with a voltage in order to cause it to vibrate at its characteristic frequency, usually 32,768 cycles per second. So, when 32,768 ticks occur, the quartz clock advances its count of seconds by one second. Our civilization’s standard atomic clock ticks at a frequency of 9,192,631,770 ticks per second.
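The division described above can be made explicit. A minimal sketch (the function name is ours): the clock’s counting sub-system divides the accumulated tick count by the tick rate to get elapsed time.

```python
QUARTZ_HZ = 32_768           # typical quartz-crystal tick rate, ticks per second
CESIUM_HZ = 9_192_631_770    # cesium-133 standard tick rate, ticks per second

def elapsed_seconds(tick_count, frequency_hz):
    # the counting sub-system: elapsed time = ticks / tick rate
    return tick_count / frequency_hz

elapsed_seconds(32_768, QUARTZ_HZ)             # 1.0 second of quartz ticks
elapsed_seconds(2 * 9_192_631_770, CESIUM_HZ)  # 2.0 seconds of cesium ticks
```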

The philosopher Tim Maudlin says:

An ideal clock is some observable physical device by means of which numbers can be assigned to events on the device’s world-line, such that the ratios of differences in the numbers are proportional to the ratios of interval lengths of segments of the world-line that have those events as endpoints.

So, for example, if an ideal clock somehow assigns the numbers 4, 6, and 10 to events p, q, and r on its world-line, then the ratio of the length of the segment pq to the segment qr is 1:2, and so on. (Maudlin 2012, 108).

An object’s world-line is its trajectory through spacetime.
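Maudlin’s condition can be illustrated with a toy computation. The function below is only a sketch: given an ideal clock’s readings at events along its world-line, it returns the ratios of the lengths of the segments between consecutive events.

```python
def segment_ratios(readings):
    """For an ideal clock, differences in readings are proportional to the
    interval lengths of the world-line segments between the read events, so
    ratios of consecutive differences give ratios of segment lengths."""
    diffs = [b - a for a, b in zip(readings, readings[1:])]
    return [d / diffs[0] for d in diffs]  # normalized to the first segment

segment_ratios([4, 6, 10])  # readings at p, q, r -> segments pq : qr = 1 : 2
```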

A clock’s ticking needs to be a regular process but not necessarily a repeatable process. There are two very different ways to achieve a clock’s regular ticking. The most important way is by repetition, namely by cyclic behavior. The goal is that any one cycle lasts just as long as any other cycle. This implies the durations between any two successive ticks are congruent. This point is sometimes expressed by saying the clock’s frequency should be constant.

A second way for a clock to contain a regular process or stable ticking is very different, and it does not require there to be any cycles or repeatable process. A burning candle can be the heart of a clock in which duration is directly correlated with, and measured by, how short the candle has become since the burning began. Two ideal candles will regularly burn down the same distance over the same duration. There will be a regular rate of burning, but no cyclic, repeatable burning because, once some part of the candle has burned, it no longer exists to be burned again. This candle timer is analogous to the behavior of sub-atomic ‘clocks’ based on radioactive decay that are used for carbon dating of trees and mammoths.

A daily calendar alone is not a clock unless it is connected to a regular process. It could be part of a clock in which daily progress along the calendar is measured by a process that regularly takes a day per cycle, such as the process of sunrise followed by sunset. A pendulum alone is not a clock because it has no counting mechanism. Your circadian rhythm is often called your biological clock, because it produces a regular cycle of waking and sleeping, but it is not a complete clock because there is no counting of the completed cycles. A stopwatch is not a clock. It is designed to display only the duration between when it is turned on and turned off. But it could easily be converted into a clock by adding a counting and reporting mechanism. The same goes for radioactive decay, which measures the time interval between now and when a fossilized organism last absorbed Earth’s air.

Here are some examples of cyclical processes that are useful for clocks: the swings of a pendulum, repeated sunrises, cycles of a shadow on a sundial, revolutions of the Earth around the Sun, bouncing mechanical springs, and vibrations of a quartz crystal. Regularity of the repetitive process is essential because we want a second today to be equal to a second tomorrow, although as a practical matter we have to accept some margin of error or frequency drift. Note that all these repetitive processes for clocks are absolute physical quantities in the sense that they do not depend upon assigning any coordinate system, nor are they dependent on any process occurring in a living being, including any thought.

The larger enterprise of practical time-keeping for our civilization requires that clock readings be available at locations of interest, including onboard our spaceships and inside submarines. This availability can be accomplished in various ways. A standard clock sitting in a room in Paris is a practical standard only if either its times can be broadcast quickly to the desired distant location, or the clock can be copied and calibrated so that the copies stay adequately synchronized even though they are transported to different places. If the copies cannot always stay sufficiently synchronized (calibrated) with the standard clock back in Paris, then we need to know how we can compensate for this deviation from synchrony.

The count of a clock’s ticks is normally converted and displayed in seconds or in some other unit of time such as minutes, nanoseconds, hours, or years. This counting of ticks can be difficult. Our civilization’s standard clock, adopted in 1967, ticks 9,192,631,770 times per second. Nobody sat down for a second and counted this number. An indirect procedure is required.

It is an arbitrary convention that we design clocks to count up to higher numbers rather than down to lower numbers. It is also a convention that we re-set our clock by one hour as we move across a time-zone on the Earth’s surface so that the sun is nearly overhead at noon in each zone. In order to prevent noon from ever occurring when the sun is setting, we also occasionally add leap seconds. However, it is no convention that the duration from instantaneous event A to instantaneous event B plus the duration from B to instantaneous event C is equal to the duration from A to C. It is one of the objective characteristics of time, and failure for this to work out numerically for your clock is a sure sign your clock is faulty.
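The objective additivity of duration just described suggests a simple consistency check on a clock’s readings, sketched here (the function name and tolerance are ours):

```python
def durations_additive(d_ab, d_bc, d_ac, tol=1e-9):
    # duration(A,B) + duration(B,C) must equal duration(A,C);
    # a persistent numerical failure signals a faulty clock, not a quirk of time
    return abs((d_ab + d_bc) - d_ac) <= tol

durations_additive(2.0, 3.0, 5.0)  # consistent readings
durations_additive(2.0, 3.0, 5.4)  # something is wrong with this clock
```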

A clock’s ticking needs to be a practically irreversible process. Any clock must use entropy increase in quantifying time. Some entropy must be created to ensure that the clock ticks forward and does not suffer a fluctuation that causes an occasional tick backward. The more entropy produced the less likely such an unwanted fluctuation will occur.

In addition to our clocks being regular and precise, we also desire our clocks to be accurate. What that means and implies is discussed in the next section.

23. What Does It Mean for a Clock to Be Accurate?

A group of clock readings is very precise if the readings are very close to each other, even if they all are wildly inaccurate, as when they all report that it is 12:38 when actually it is noon.

A clock is accurate if it reports the same time as the standard clock. A properly working clock correctly measures the interval along its own trajectory in spacetime, its so-called proper time. The interval in spacetime is the spatio-temporal length of its trajectory, so a clock is analogous to an odometer for spacetime. Just as a car’s odometer can give a different reading for the distance between two locations if the car takes a different route between them, so also a properly working clock can give different measures of the duration between two events if the clock takes different spacetime trajectories between them. That is why it is easiest to keep two clocks in synchrony if they are sitting next to each other, and that is why it is easiest to get an accurate measure of the time between two events if they occur at the same place.

Because clocks are intended to be used to measure events external to themselves, a goal in clock building is to ensure there is no difficulty in telling which clock tick is simultaneous with which external event. For most nearby situations and nearby clocks and everyday purposes, the sound made by the ticking helps us make this determination. We hear the tick just as we hear or see the brief event occur that we wish to “time.” Humans actually react faster to what they hear than what they see. Trusting what we see or hear presupposes that we can ignore the difference in time between when a sound reaches our ears and when it is consciously recognized in our brain, and it presupposes that we can safely ignore the difference between the speed of sound and the speed of light.

If a clock is synchronized with the standard clock, works properly, and has the same trajectory in spacetime as the standard clock, then it will remain accurate (that is, stay in synchrony) with the standard clock. According to the general theory of relativity, if a clock takes a different trajectory from the standard clock, then its readings will deviate from those of the standard clock, and when the second clock is brought back to be adjacent to the standard clock, the two will give different readings of what time it is. That is, if your well-functioning clock were at rest adjacent to the standard clock and the two were synchronized, they would stay synchronized. But if your clock moved away from the standard clock and took some different path through space, the two would not give the same readings when they were reunited, even though both continued to be correct clocks. This complicates the question of whether a clock that is distant from the standard clock is telling us standard time. To appreciate the complication, ask yourself: When our standard clock shows noon today, what event within a spaceship on Mars occurs simultaneously? Or ask: How do you “set” the correct time on the Mars clock?

The best that a designated clock can do while obeying the laws of general relativity is to accurately measure its own proper time. Time dilation will affect the readings of all other clocks and make them drift out of synchrony with the designated clock. It is up to external observers to keep track of these deviations and account for them for the purpose at hand.

There is an underlying philosophical problem and a psychological problem. If we assign a coordinate system to spacetime, and somehow operationally define what it is for a clock at one place to be in synch with a clock at another place, then we can define distant simultaneity in that coordinate system. However, whether spatially separated events are simultaneous is a coordinate-dependent artifact. Even when people understand this philosophical point, which arises because of the truth of relativity theory, they still seem unable to resist the temptation to require a correct answer to the question “What event on a spaceship circling Mars is simultaneous with noon today here on Earth?” and unable to appreciate that this notion of simultaneity is a convention that exists simply for human convenience.

The quartz clock in your cellphone drifts and loses about a second every day or two, so it frequently needs to be “reset” (that is, restored to synchrony with our society’s standard clock).

Our best atomic clocks need to be reset by one second every 100 million years.

Suppose we ask the question, “Can the time shown on a properly functioning standard clock ever be inaccurate?” The answer is “no” if the target is synchrony with the current standard clock, as the conventionalists believe, but “yes” if there is another target. Objectivists can propose at least three other distinct targets: (1) synchrony with absolute time (as Isaac Newton proposed in the 17th century), (2) synchrony with the best possible clock, and (3) synchrony with the best-known clock. We do not have a way of knowing whether our current standard clock is close to target 1 or target 2. But if the best-known clock is known not yet to have been chosen to be the standard clock, then the current standard clock can be inaccurate in sense 3 and perhaps it is time to call an international convention to discuss adopting a new time standard.

Practically, a reading of ‘the’ standard clock is a report of the average value of the many conventionally-designated standard clocks, hundreds of them distributed around the globe. Any one of these clocks could fail to stay in sync with the average, and when this happens it is re-calibrated, that is, re-set to the average reading. This re-setting occurs about once a month to restore accuracy.

There is a physical limit to the shortest duration measurable by a given clock because no clock can measure events whose duration is shorter than the time it takes a signal to travel between the components of that clock, the components in the part that generates the regular ticks. This theoretical limit places a lower limit on the margin of error of any measurement of time made with that clock.

Every physical motion of every clock is subject to disturbances. So, we want to minimize the disturbance, and we want our clock to be adjustable in case it drifts out of synchrony a bit. To achieve this goal, it helps to keep the clock isolated from environmental influences such as heat, dust, unusual electromagnetic fields, physical blows (such as dropping the clock), immersion in liquids, and differences in gravitational force. And it helps to be able to predict how much a specific influence affects the drift out of synchrony so that there can be an adjustment for this influence.

Sailors use clocks to discover the longitude of where they are in the ocean. Finding a sufficiently accurate clock was how 18th and 19th century sailors eventually were able to locate themselves when they could not see land. At sea at night, the numerical angle of the North Star above the horizon is their latitude. Without a clock, they had no way to determine their longitude except by dead reckoning, which is very error-prone. A pendulum clock does not work well when the sea is not smooth. If they had an accurate mechanical clock with them that was not affected by choppy seas, they could use it to find their longitude. First, before setting sail they would synchronize it with the standard clock at zero degrees longitude. Out on the ocean or on some island, this clock would tell them the time back at zero degrees longitude. Then at sea on a particular day, the sailors could wait until the Sun was at its highest point and know the local time is 12 noon. If at that moment their clock read 1500 (that is, 3:00 P.M.), then they would know local noon lags 3 hours behind the time at zero degrees longitude. Because Earth turns on its axis 360 degrees of longitude every day and 15 degrees every hour, the sailors could compute that they were 3 x 15 degrees west of zero degrees, namely at 45 degrees west longitude. Knowing both their latitude and longitude, they could use a map to locate themselves. The first reasonably reliable mechanical clock that could be used for measuring longitude at sea was built by the British clockmaker John Harrison in the mid-18th century; his famous H4 chronometer was accurate to about one second a month. When mariners adopted similarly accurate mechanical clocks, the number of ships per year that crashed into rocks plummeted.
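The sailors’ arithmetic can be written out explicitly. A sketch, assuming the carried clock keeps the time at zero degrees longitude and the observation is made at local solar noon; the function name is ours:

```python
def longitude_from_clock(clock_reading_h, local_solar_time_h=12.0):
    """Earth rotates 15 degrees of longitude per hour, so each hour by which
    local solar noon lags the zero-longitude clock puts the ship 15 degrees
    farther west. Returns degrees of longitude (positive = west)."""
    hours_behind = clock_reading_h - local_solar_time_h
    return hours_behind * 15.0

longitude_from_clock(15.0)  # clock reads 1500 at local noon -> 45.0 degrees west
```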

24. What Is Our Standard Clock or Master Clock?

Our civilization’s standard clock or master clock is the clock that other clocks are synchronized with. It reports ‘the correct time.’ In the 2020s, this is a designated cesium atomic clock in Paris, France. Your cell phone synchronizes its internal clock with this standard clock about once a week.

More specifically, the standard clock reports the proper time for a particular observatory in Greenwich, England, which sits at zero degrees longitude, even though the report is created in a laboratory near Paris. The report is the result of a computation from reports supplied by a network of many atomic clocks situated around the world.

a. How Does an Atomic Clock Work?

We begin with a one-paragraph answer to this question, then follow this with a much more detailed answer and explanation.

There are many kinds of atomic clock, but the one adopted worldwide in 1967 for Coordinated Universal Time relies on the very regular behavior of the cesium-133 atom. What is regular is the frequency of the microwave radiation needed to achieve resonance when the cesium is irradiated with it. Resonance occurs when the atom’s single outer electron is stimulated by a particular microwave frequency to transition from the lower to the higher of the two hyperfine energy levels of its ground state and then fall back down again while emitting radiation of the same microwave frequency. The oscillation or “waving” of this radiation is the ticking of the clock. Counting those ticks tells us the time.

Pendulum clocks work by counting swings of the pendulum. Quartz clocks work by counting the shakes of a small piece of quartz crystal set in motion when voltage is applied to it. Astronomical clocks count rotations of the Earth or revolutions around the Sun. Atomic clocks work by producing a wave process such as a microwave, and counting the number of those waves that pass by a single point in space within the clock. No radioactivity is involved in an atomic clock.

The key idea for all objects that deserve to be called “clocks” is that they can be relied upon to produce nearly the same, fixed number of ticks per second. Call that number n. So, for every occurrence of n oscillations, the clock reports that a second has passed. For every 60n oscillations, it reports a minute has passed. For every 60(60n) oscillations it reports an hour, and so forth. The frequency (or, equivalently, the number of oscillations per second) is the clock’s rate of ticking. If the frequency does not drift very much, it is called a “stable” frequency. The more stable the better. All the above clocks work as clocks because they produce relatively stable frequencies compared to those of the rest of the universe’s processes, such as a tulip waving in the wind or a president’s heartbeat.
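The bookkeeping in this paragraph, converting a raw oscillation count into seconds, minutes, and hours, can be sketched directly (the function name is ours):

```python
def display_time(oscillations, n_per_second):
    # for every n oscillations report a second; every 60n a minute;
    # every 60(60n) an hour, and so forth
    total_seconds = oscillations // n_per_second
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

display_time(3_661 * 32_768, 32_768)  # a quartz count worth 3,661 seconds
```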

The advantage of using an atomic clock that relies on a specific isotope is that (1) for any isotope, all its atoms behave exactly alike, unlike any two quartz crystals or any two rotations of the Earth, (2) the atomic clock’s ticking is very regular compared to any non-atomic clock, (3) it ticks at a very fast rate (high frequency) so it is useful for measurements of events having a very brief duration, (4) the clock can easily be copied and constructed elsewhere, (5) the clock is not easily perturbed, and (6) there is no deep mystery about why it is a better master clock than other clocks.

An atomic clock’s stable frequency is very easy to detect because the isotope “fluoresces” or “shines” or “resonates” in a characteristic, easily-detectable narrow band of frequencies. That is, its frequency distribution has a very, very narrow central peak that clearly differs from the peaks of radiation that can be produced by electron transitions between all other energy levels in the same atom. It is these transitions that produce the shining or resonating.

In 1879, James Clerk Maxwell was the first person to suggest using the frequency of atomic radiation as a kind of invariant natural pendulum. This remark showed great foresight, and it was made before the rest of the physics community had yet accepted the existence of atoms. Vibrations in atomic radiation are the most stable periodic events that scientists in the 21st century have been able to use for clock building.

A cesium atomic clock was adopted in 1967 as the world’s standard clock, and it remains the standard in the 2020s. At the convention, physicists agreed that when 9,192,631,770 cycles of microwave radiation in the clock’s special, characteristic process are counted, then the atomic clock should report that a duration of one atomic second has occurred.

What is this mysterious “special, characteristic process” in cesium clocks that is so stable? This question is answered assuming every cesium atom behaves according to the Bohr model of atoms. The model is easy to visualize, but it provides a less accurate description than does a description in terms of quantum theory. However, quantum theory is more difficult to understand, so it is avoided here.

Every atom of a single isotope behaves just like any other, unlike two manufactured pendulums or even two rotations of the Earth. It is not that every atom of an isotope is in the same position or has the same energy or the same velocity, but rather that, apart from those properties, they are all alike.

An atom’s electrons normally stay in orbit and don’t fly away, nor do they crash into the nucleus. Electrons stay in their orbits until perturbed, and each orbit has a characteristic energy level, a specific value of its energy for any electron in that orbit. When stimulated by incoming electromagnetic radiation, such as from a laser, the electrons can absorb the incoming radiation and transition to higher, more energetic orbits. Which orbit the electron moves to depends on the energy of the incoming radiation that it absorbs. Higher orbits are more distant from the nucleus. Also, an electron orbiting in a higher, more energetic orbit is said to be excited because it might emit some radiation spontaneously and transition into one of the lower orbits. There are an infinite number of energy levels and orbits, but they do not differ continuously. They differ by discrete steps. The various energies that can be absorbed and emitted are unique to each isotope of each element. Examining the various frequencies of the emitted radiation of an object gives sufficient information to identify which isotope and element is present. Ditto for the signature of the absorption frequencies. Famously, finding the signature for helium in sunlight was the first evidence that there was helium in the Sun.

A cesium atom’s outer electron shell contains only a single electron, and it is this lone outer electron that responds to the incoming microwave radiation. To take advantage of this feature in a cesium atomic clock, an outer electron in its lowest-energy orbit around the cesium-133 nucleus is targeted by some incoming microwave radiation from the atomic clock’s laser. Doing so makes the electron transition to a higher energy orbit around the cesium nucleus, thus putting the electron into an “excited” state. Properly choosing the frequency of the incoming radiation that hits the target cesium (called successfully “tuning” the laser) can control which orbit the electron transitions to. Tuning the laser is a matter of controlling the laser’s frequency with a feedback loop that keeps it generating the desired, stable frequency. Initially, the cesium is heated to produce a vapor or gas, then the cesium atoms are cooled as a group to reduce their kinetic energy, and then they are magnetically filtered to select only the atoms whose outer electrons are in the lowest possible energy state.

Our Bohr model supposes, following a suggestion from Einstein, that any electromagnetic wave such as a light wave or a microwave or a radio wave can just as well be considered to be composed of small, discrete particle-like objects called photons. The photon’s energy is directly correlated with the wave’s frequency—higher energy photons correspond to higher frequency waves. If a photon of exactly the right energy from the laser arrives and hits a cesium atom’s electron, the electron can totally absorb the photon by taking all its energy and making the electron transition up to a higher energy level. Energy is conserved during absorption and emission.

Later, the electron in a higher, excited state might spontaneously fall back down to one of the various lower energy levels, while emitting a photon of some specific frequency. The value of that frequency is determined by the energy difference between the two energy levels of the transition. If it is still in an excited state, the (or an) electron might spontaneously fall again to an even lower energy level, and perhaps cascade all the way down to the lowest possible energy level. There are an infinite number of energy levels of any atom, so potentially there are an infinite number of frequencies of photons that can be absorbed and an infinite number of frequencies of photons that can be emitted in the transitions. There are an infinite number, but not just any number, because the frequencies or energies differ in small, discrete steps from each other.
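The bookkeeping behind absorption and emission is the relation E = h·f between a photon’s energy and its frequency, where h is Planck’s constant. A minimal sketch (the function names are illustrative):

```python
# Bohr-model bookkeeping: a photon's energy E and frequency f are tied
# together by Planck's constant via E = h*f, and the frequency of an
# emitted photon comes from the energy gap of the transition, f = dE/h.

H = 6.62607015e-34        # Planck's constant in joule-seconds (exact by definition)
F_CESIUM = 9_192_631_770  # cesium-133 clock transition frequency, Hz

def photon_energy(f: float) -> float:
    """Energy in joules of a photon of frequency f (in Hz)."""
    return H * f

def emitted_frequency(delta_e: float) -> float:
    """Frequency in Hz of the photon emitted across an energy gap delta_e (in joules)."""
    return delta_e / H

gap = photon_energy(F_CESIUM)  # ~6.09e-24 J: the tiny energy of one microwave photon
print(emitted_frequency(gap))  # recovers the clock frequency (up to float rounding)
```

The small size of the cesium gap, around 6×10⁻²⁴ joules, is why microwave radiation, rather than visible light, drives this particular transition.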

If the electron in a specific energy level were hit with a sufficiently energetic incoming photon, the electron would fly away from the atom altogether, leaving the atom ionized.

For any atom of any isotope of any element with its outer electron in its lowest ground state, there is a characteristic, unique energy value for that state. There is also a characteristic minimum energy an incoming photon must have in order to knock the outer electron up to the very next higher level and no higher, and this same energy or frequency is emitted when that higher-level electron spontaneously transitions back to the lowest level. This ground-state behavior of transitioning to the next higher level and back down again is the key behavior of an atom that is exploited in the operation of an atomic clock.

In a cesium atomic clock using the isotope 133Cs, its cesium gas is cooled and manipulated so that nearly all its atoms are in their unexcited, lowest ground state. This manipulation uses the fact that atoms in the two different states have different magnetic properties so they can be separated magnetically. Then the laser’s frequency is tuned until the laser is able to knock the outer electrons from their ground state up to the next higher energy state (but no higher) so that the excited electrons then transition back down spontaneously to the ground level and produce radiation of exactly the same frequency as that of the laser. That is, the target cesium shines or fluoresces with the same frequency it was bombarded with. When this easily-detectable fluorescence occurs, the counting can begin, and the clock can measure elapsed time.

While the definition of a second has stayed the same since 1967, the technology of atomic clocks has not. Scientists in the 2020s can make an atomic clock so precise that it would take 30 billion years to drift by a single second. The cesium atomic clock of 1967 drifted quite a bit more. That is why the world’s 1967 time standard using cesium atomic clocks is likely to be revised in the 21st century.

For more details on how an atomic clock works, see (Gibbs, 2002).

b. How Do We Find and Report the Standard Time?

If we are next to the standard clock, we can find the standard time by looking at its display of the time. Almost all countries use a standard time report that is called Coordinated Universal Time. Other names for it are UTC and Zulu Time. It once was named Greenwich Mean Time (GMT). Some countries prefer their own name.

How we find out what time it is when we are not next to the standard clock is quite complicated. First, ignoring the problems of time dilation and the relativity of simultaneity raised by Einstein’s theory of relativity that are discussed above, let’s consider the details of how standard time is reported around the world. The international standard time that gets reported is called U.T.C. time, for the initials of the French name for Coordinated Universal Time. The report of U.T.C. time is based on computations and revisions made from the time reports of the Atomic Time (A.T.) of many cesium clocks around the Earth.

U.T.C. time is, by agreement, the time at zero degrees longitude. This longitude is an imaginary great circle that runs through the North Pole and South Pole and a certain astronomical observatory in Greenwich, England, although the report itself is produced near Paris, France. This U.T.C. time is used by the Internet and by the aviation industry throughout the world.

U.T.C. time is produced from T.A.I. time by adding or subtracting an appropriate integral number of leap seconds. T.A.I. time is computed, in turn, from a variety of reports of A.T. time (Atomic Time), the time of our standard, conventionally-designated cesium-based atomic clocks. All A.T. times are reported in units called S.I. seconds.

An S.I. second (that is, a Système International second or a second of Le Système International d’Unités) is defined to be the numerical measure of the time it takes for motionless (relative to the Greenwich observatory), designated, master cesium atomic clocks to emit exactly 9,192,631,770 cycles of radiation. The number “9,192,631,770” was chosen rather than some other number by vote at an international convention for the purpose of making the new second be as close as scientists could come to the duration of what was called a “second” back in 1957 when the initial measurements were made on cesium-133 using the best solar-based clocks available then.

The T.A.I. scale from which U.T.C. time is computed is the average of the reports of A.T. time from about 200 designated cesium atomic clocks that are distributed around the world in about fifty selected laboratories, all reporting to Paris. One of those laboratories is the National Institute of Standards and Technology (NIST) in Boulder, Colorado, U.S.A. The calculated average of the 200 reports is the T.A.I. time, the abbreviation of the French phrase for International Atomic Time. The International Bureau of Weights and Measures (BIPM) near Paris performs the averaging about once a month. If a designated laboratory in the T.A.I. system sent in its clock’s reading for a certain specified event that occurred in the previous month, then in the present month the BIPM calculates the average of all 200 reported clock readings and sends the laboratory a notice of how far its report deviated from that average, so the laboratory can reset its clock, that is, adjust its atomic clock in the hope of better agreement with next month’s average. Time physicists follow the collective lead of their designated clocks because there is nothing better to follow.

A.T. time, T.A.I. time, and U.T.C. time are not kinds of physical time but rather are kinds of reports of physical time.

In the 17th century, Christiaan Huygens recommended dividing a solar day into 24 hours per day and 60 minutes per hour and 60 seconds per minute, making a second be 1/86,400 of a solar day. This is called Universal Time 1 or UT1. Subsequently, the second was redefined by saying there are 31,556,925.9747 seconds in the tropical year 1900. At the 13th General Conference on Weights and Measures in 1967, the definition of a second was changed again to a specific number of periods of radiation produced by a standard cesium atomic clock (actually, the average of the 200 standard atomic clocks). This second is the so-called standard second or the S.I. second. It is defined to be the duration of 9,192,631,770 periods (cycles, oscillations, vibrations) of a certain kind of microwave radiation absorbed in the standard cesium atomic clock. More specifically, the second is defined to be the duration of exactly 9,192,631,770 periods of the microwave radiation required to produce the maximum fluorescence of a small gas cloud of cesium-133 atoms as the single outer-shell electron in these atoms transitions between two specific energy levels of the atom. This is the internationally agreed-upon unit for atomic time in the T.A.I. system. In 1967 the atomic clocks were accurate to one second every 300 years. The accuracy of atomic clocks has subsequently gotten much better.
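The two pre-atomic definitions just quoted can be checked against each other: 86,400 seconds per mean solar day and 31,556,925.9747 seconds in the tropical year 1900 should agree on the length of the year in days.

```python
# Consistency check on the two pre-atomic definitions of the second:
# dividing the seconds in the tropical year 1900 by the seconds in a
# mean solar day should yield the familiar length of the year in days.

SECONDS_PER_DAY = 86_400                        # 24 * 60 * 60, Huygens's division
SECONDS_PER_TROPICAL_YEAR_1900 = 31_556_925.9747  # the later redefinition

days_per_year = SECONDS_PER_TROPICAL_YEAR_1900 / SECONDS_PER_DAY
print(round(days_per_year, 4))  # ≈ 365.2422, the length of the tropical year in days
```

The two definitions were deliberately made consistent; the atomic second of 1967 was in turn chosen to match the ephemeris second as closely as measurement allowed.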

All metrologists expect there to be an eventual change in the standard clock by appeal to higher frequency clocks, namely optical clocks that tick much faster. The higher ticking rate is important for many reasons, one of which is that the more precise the clock that is used, the better physicists can test the time-translation invariance of the fundamental laws of physics, such as checking whether the supposed constants of nature do in fact stay constant over time.

Leap years (with their leap days) are needed as adjustments to the standard clock’s count in order to account for the fact that the number of the Earth’s rotations per Earth revolution does not stay constant from year to year. The Earth is spinning slower every day, but not uniformly. Without an adjustment, the time called “midnight” eventually would drift into the daylight. Leap years are added about every four years. Leap seconds are added more frequently for another reason. The Earth’s period changes irregularly due to earthquakes and hurricanes and other phenomena. This effect on the period is not practically predictable, so, when the irregularity occurs, a leap second is introduced as needed once the standard atomic clock gets ahead of the old astronomical clock (Greenwich Mean Time) by more than 0.9 seconds. The keepers of standard time simply turn their clock back by stopping their atomic clock for one second. But because it is so difficult for other timing devices to make frequent adjustments when a leap second is created (27 times between 1972 and 2024), scientists have agreed to stop using leap seconds by 2035. It is likely there will be a shift to adding leap minutes, but less frequently, such as only every fifty years or so.

The meter depends on the second, so time measurement is more basic than space measurement. It does not follow, though, that time itself is more basic than space. In 1983, scientists agreed that the meter is how far light travels in 1/299,792,458 seconds in a vacuum. This is for three reasons: (i) light propagation is very stable or regular; its speed is either constant, or when not constant we know how to compensate for the influence of the medium; (ii) a light wave’s frequency can be made extremely stable; and (iii) distance cannot be measured more accurately in other ways.

The number 299,792,458 was chosen so that the new meter is very nearly the same distance as the old meter that was once defined to be the distance between two specific marks on a platinum bar kept in the Paris Observatory.
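The 1983 definition of the meter can be written out as a tiny calculation (the function name is illustrative): fixing the travel time fixes the distance, and thereby fixes the speed of light exactly.

```python
# The 1983 definition in code form: the meter is the distance light
# travels in a vacuum in 1/299,792,458 of a second, which makes the
# speed of light exact by definition rather than measured.

C = 299_792_458  # speed of light in vacuum, meters per second, exact by definition

def light_distance(seconds: float) -> float:
    """Distance in meters that light travels in a vacuum in the given time."""
    return C * seconds

print(light_distance(1 / C))  # ≈ 1.0 meter, by construction
print(light_distance(1.0))    # 299792458.0 meters covered in one second
```

This is why, as the article notes below, no direct measurement can any longer "check" the speed of light: a deviant result indicts the experimental conditions, not the defined value.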

Time can be measured more accurately and precisely than distance, voltage, temperature, mass, or anything else.

So why bother to improve atomic clocks? The duration of the second can already be measured to 14 or 15 decimal places, a precision 1,000 times that of any other fundamental unit. One reason to do better is that the second is increasingly the fundamental unit. Three of the six other basic units—the meter, lumen and ampere—are defined in terms of the second. (Gibbs, 2002)

One philosophical implication of the standard definition of the second and of the meter is that they fix the speed of light in a vacuum in all inertial frames. The speed is exactly 299,792,458 meters per second. There can no longer be any direct measurement to check whether that is how fast light really moves; it is defined to be moving that fast. Any measurement that produced a different value for the speed of light is presumed to have an error. The error would be in accounting for the influence of gravitation and acceleration, or in its assumption that the light was moving in a vacuum. This initial presumption of where the error lies comes from a deep reliance by scientists on Einstein’s theory of relativity. However, if it were eventually decided by the community of scientists that the speed of light should not have been fixed as it was, then the scientists would call for a new world convention to re-define the second or the meter.

25. Why Are Some Standard Clocks Better than Others?

Other clocks ideally are calibrated by being synchronized to “the” standard clock, our master clock. It is normally assumed that the standard clock is the most reliable and regular clock. Physicists have chosen the currently-accepted standard clock for two reasons: (1) they believe it will tick very regularly in the sense that all periods between adjacent ticks are sufficiently congruent—they have the same duration; and (2) there is no better choice of a standard clock. Choosing a standard clock that is based on the beats of a president’s heart would be a poor choice because clocks everywhere would suddenly and mysteriously get out of synchrony with the standard clock (the heartbeats) when the president goes jogging.

So, some choices of standard clock are better than others. Some philosophers of time believe one choice is better than another because the best choice is closest to a clock that tells what time it really is. Most philosophers of time argue that there is no access to what time it really is except by first having selected the standard clock.

Let’s consider the various goals we want to achieve in choosing one standard clock rather than another. One goal is to choose a clock with a precise tick rate that does not drift very much. That is, we want a clock that has a very regular period—so the durations between ticks are congruent. On many occasions throughout history, scientists have detected that their currently-chosen standard clock seemed to be drifting. In about 1700, scientists discovered that the time from one day to the next, as determined by the duration between sunrises, varied throughout the year. They did not notice any variation in the duration of a year, so they began to rely on the duration of the year rather than of the day.

As more was learned about astronomy, the definition of the second was changed. In the 19th century and before the 1950s, the standard clock was defined astronomically in terms of the mean rotation of the Earth upon its axis (solar time), with the second defined to be 1/86,400 of the mean solar day, which is the average throughout the year of the rotational period of the Earth with respect to the Sun. For a short period in the 1950s and 1960s, the standard clock was instead defined in terms of the revolution of the Earth about the Sun (ephemeris time), with the second defined to be 1/31,556,925.9747 of the tropical year 1900. But all these clocks were soon discovered to drift too much.

To solve these drift problems, physicists chose a certain kind of atomic clock as the standard, and they said it reported atomic time. All atomic clocks measure time in terms of the natural resonant frequencies of electromagnetic radiation absorbed and emitted from the electrons within certain atoms. The accurate dates of adoption of these standard clocks are omitted in this section because different international organizations adopted different standards in different years. The U.S.A.’s National Institute of Standards and Technology’s F-1 atomic fountain clock is so accurate that it drifts by less than one second every 30 million years. We know there is this drift because it is implied by the laws of physics, not because we have a better clock that measures this drift.

Atomic clocks use the frequency of a specific atomic transition as an extremely stable time standard. While the second is currently defined by caesium-based clocks that operate at microwave frequencies, physicists have built much more accurate clocks that are based on light. These optical clocks tick at much higher frequencies than microwave clocks and can keep time that is accurate to about one part in 10^18, which is about 100 times better than the best caesium clocks.
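The figures in this comparison are easy to convert between. A clock’s quality is quoted as a fractional frequency error (such as one part in 10^18), and that fraction times an elapsed span gives the accumulated drift. A rough illustrative calculation:

```python
# Rough arithmetic relating a clock's fractional frequency error to its
# accumulated drift in seconds over a given span of years.

SECONDS_PER_YEAR = 365.2422 * 86_400  # ≈ 3.156e7 seconds

def drift_seconds(fractional_error: float, years: float) -> float:
    """Accumulated drift, in seconds, for a clock off by the given fraction."""
    return fractional_error * years * SECONDS_PER_YEAR

# An optical clock good to about 1 part in 10**18:
print(drift_seconds(1e-18, 1e9))  # ≈ 0.03 seconds of drift over a billion years

# Conversely, losing one second in 30 million years is a fractional error of:
print(1 / (30e6 * SECONDS_PER_YEAR))  # ≈ 1e-15, the cesium-fountain regime
```

The factor of roughly a thousand between 10^-15 and 10^-18 is the gap the optical-clock program aims to exploit.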

“The international metrology community aims to replace the microwave time standard with an optical clock, but first must choose from one of several clock designs being developed worldwide”—Hamish Johnston, Physics World, 26 March 2021.

Optical atomic clocks resonate at light frequencies rather than microwave frequencies, and this is why they tick about 100,000 times faster than the microwave atomic clocks.

To achieve the goal of restricting drift, and thus stabilizing the clock, any clock chosen to become the standard clock should be maximally isolated from outside effects. A practical goal in selecting a standard clock is to find a clock that can be well insulated from environmental disturbances such as comets impacting the Earth, earthquakes, stray electric fields, heavy trucks driving on nearby bumpy roads, the presence of dust and rust within the clock, extraneous heat, variation in gravitational force, and adulteration of the gas with other stray elements. The clock can be shielded from external electrical fields, for example, by enclosing it in a metal box called a Faraday Cage.

If not insulation, then compensation. If there is some theoretically predictable effect of an environmental influence upon the standard clock, then the clock can be regularly adjusted to compensate for this effect. For example, thanks to knowing the general theory of relativity, we know how to adjust for the difference in gravitational force between being at sea level and being a meter above sea level. Commenting on the insulation problem, Nobel Prize winner Frank Wilczek said that the basic laws of the universe are local, so:

“Thankfully, you don’t have to worry about the distant universe, what happened in the past, or what will happen in the future…and it is philosophically important to notice that it is unnecessary to take into account what people, or hypothetical superhuman beings, are thinking. Our experience with delicate, ultra-precise experiments puts severe pressure on the idea that minds can act directly on matter, through will. There’s an excellent opportunity here for magicians to cast spells, for someone with extrasensory powers to show their stuff, or for an ambitious experimenter to earn everlasting glory by demonstrating the power of prayer or wishful thinking. Even very small effects could be detected, but nobody has ever done this successfully.”—Fundamentals: Ten Keys to Reality.

Consider the insulation problem we would have if we were to replace the atomic clock as our standard clock and use instead the mean yearly motion of the Earth around the Sun. Can we compensate for all the relevant disturbing effects on the motion of the Earth around the Sun? Not easily. One problem is that the Earth’s rate of spin varies in a practically unpredictable manner. Physicists believe that the relevant factors affecting the spin (mainly friction caused by tides rubbing on continental shelves, but also shifts in winds, comet bombardment, earthquakes, and convection in Earth’s molten core) are affecting the Earth’s rotational speed and its period of revolution around the Sun, so they affect the behavior of the solar clock, but not the atomic clock.

As noted, our civilization’s earlier-chosen standard clock once depended on the Earth’s rotations and revolutions, but this Earth-Sun clock is now known to have lost more than three hours in the last 2,000 years. Leap years and leap seconds are added or subtracted occasionally to the standard atomic clock in order to keep our atomic-based calendar in synchrony with the rotations and revolutions of the Earth. We do this because we want to keep atomic-noons occurring on astronomical-noons and ultimately because we want to prevent Northern hemisphere winters from occurring in some future July. These changes do not affect the duration of a second, but they do affect the duration of a year because not all years last the same number of seconds. In this way, we compensate for the Earth-Sun clocks falling out of synchrony with our standard atomic clock.

Another desirable feature of a standard clock is that reproductions of it stay in synchrony with each other when environmental conditions are the same. Otherwise, we may be limited to relying on a specifically-located standard clock that cannot be trusted elsewhere and that can be broken, vandalized or stolen.

The principal goal in selecting a standard clock is to reduce mystery in physics. The point is to find a clock process that, if adopted as our standard, makes the resulting system of physical laws simpler and more useful, and allows us to explain phenomena that otherwise would be mysterious. Choosing an atomic clock as standard is much better for this purpose than choosing the periodic revolution of the Earth about the Sun. Suppose scientists had retained the Earth-Sun astronomical clock as the standard clock and had said that by definition the Earth does not slow down in any rotation or revolution. Then, when a comet collides with the Earth, tempting the scientists to say the Earth’s period of rotation and revolution changed, they would be forced not to say this but instead to alter, among many other things, their atomic theory, and to say the frequency of light emitted from cesium atoms mysteriously increases all over the universe when comets collide with the Earth. By switching to the cesium atomic standard, these alterations are unnecessary, and the mystery vanishes.

To make this point a little more simply, suppose the President’s heartbeats were chosen as our standard clock and so the count of heartbeats always showed the correct time. It would become a mystery why pendulums (and cesium radiation in atomic clocks) changed their frequency whenever the President went jogging, and scientists would have to postulate some new causal influence that joggers have on pendulums and atomic clocks across the globe.

To achieve the goal of choosing a standard clock that maximally reduces mystery, we want the clock’s readings to be consistent with the accepted laws of motion, in the following sense. Newton’s first law of motion says that a body in motion should continue to cover the same distance during the same time interval unless acted upon by an external force. If we used our standard clock to run a series of tests of the time intervals as a body coasted along a carefully measured path, and we found that the law was violated and we could not account for this mysterious violation by finding external forces to blame and we were sure that there was no problem otherwise with Newton’s law or with the measurement of the length of the path, then the problem would be with the clock. Leonhard Euler (1707-1783) was the first person to suggest this consistency requirement on our choice of a standard clock. A similar argument holds today, but using the laws of motion from Einstein’s general theory of relativity, one of the two fundamental theories of physics.

When we want to know how long a basketball game lasts, why do we subtract the start time from the end time? The answer is that we accept a metric for duration in which we subtract the two time numbers. Why do we not choose another metric and, let’s say, subtract the square root of the start time from the square root of the end time? This question is implicitly asking whether our choice of metric can be incorrect or merely inconvenient.

When we choose a standard clock, we are choosing a metric. By agreeing to read the clock so that a duration from 3:00 to 5:00 is 5-3 hours, and so 2 hours, we are making a choice about how to compare two durations in order to decide whether they are equal, that is, congruent. We suppose the duration from 3:00 to 5:00 as shown by yesterday’s reading of the standard clock was the same as the duration from 3:00 to 5:00 on the readings from two days ago and will be the same for today’s readings and tomorrow’s readings.

Philosophers of time continue to dispute the extent to which the choice of metric is conventional rather than objective in the sense of being forced on us by nature. The objectivist says the choice is forced and that the success of the standard atomic clock over the standard solar clock shows that we were more accurate in our choice of the standard clock. An objectivist says it is just as forced on us as our choosing to say the Earth is round rather than flat. It would be ridiculous to insist the Earth is flat. Taking the conventional side on this issue, Adolf Grünbaum argued that time is metrically amorphous. It has no intrinsic metric. Instead, we choose the metric we do in order only to achieve the goals of reducing mystery in science, but satisfying those goals is no sign of being correct.

The conventionalist, as opposed to the objectivist, would say that if we were to require by convention that the instant at which Jesus was born and the instant at which Abraham Lincoln was assassinated are to be only 24 seconds apart, whereas the duration between Lincoln’s assassination and his burial is to be 24 billion seconds, then we could not be mistaken. It is up to us as a civilization to say what is correct when we first create our conventions about measuring duration. We can consistently assign any numerical time coordinates we wish, subject only to the condition that the assignment properly reflects the betweenness relations of the events that occur at those instants. That is, if event J (birth of Jesus) occurs before event L (Lincoln’s assassination) and this, in turn, occurs before event B (burial of Lincoln), then the time assigned to J must be numerically less than the time assigned to L, and both must be less than the time assigned to B so that t(J) < t(L) < t(B). A simple requirement. Yes, but the implication is that this relationship among J, L, and B must hold for all events simultaneous with J, for all events simultaneous with L, and so forth.
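The conventionalist’s constraint can be stated in code: a relabeling of time coordinates is admissible so long as it preserves the ordering of events, which any strictly increasing function does. A small illustrative check (the event values are arbitrary):

```python
# The betweenness constraint in code: a reassignment of time coordinates
# is admissible only if it preserves the order of events, i.e. whenever
# t(J) < t(L) held before the relabeling, it still holds afterward.

def preserves_order(times: list, relabel) -> bool:
    """True if the relabeling function keeps every consecutive pair in order."""
    new = [relabel(t) for t in times]
    return all((a < b) == (x < y)
               for (a, b), (x, y) in zip(zip(times, times[1:]),
                                         zip(new, new[1:])))

events = [0.0, 1865.0, 1865.1]  # J, L, B on some arbitrary coordinate scale
print(preserves_order(events, lambda t: t ** 3 + 5))  # True: strictly increasing
print(preserves_order(events, lambda t: -t))          # False: order is reversed
```

So even the bizarre 24-seconds/24-billion-seconds convention passes this test; what rules it out, as the next paragraph explains, are further features of nature, not the betweenness constraint itself.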

It is other features of nature that lead us to reject the above convention about 24 seconds and 24 billion seconds. What features? There are many periodic processes in nature that have a special relationship to each other; their periods are very nearly constant multiples of each other, and this constant stays the same over a long time. For example, the period of the rotation of the Earth is a fairly constant multiple of the period of the revolution of the Earth around the Sun, and both these periods are a constant multiple of the periods of a swinging pendulum and of vibrations of quartz crystals. The class of these periodic processes is very large, so the world will be easier to describe if we choose our standard clock from one of these periodic processes. A good convention for what is regular will make it easier for scientists to find simple laws of nature and to explain what causes other events to be irregular. It is the search for regularity and simplicity and removal of mystery that leads us to adopt the conventions we do for the numerical time coordinate assignments and thus leads us to choose the standard clock we do choose. Objectivists disagree and say this search for regularity and simplicity and removal of mystery is all fine, but it is directing us toward the correct metric, not simply the useful metric.

For additional discussion of some of the points made in this section, including the issue of how to distinguish an accurate clock from an inaccurate one, see chapter 8 of (Carnap 1966).

26. What Is a Field?

It is helpful to imagine a (cosmic) field at a time to be analogous to a colored fluid filling all space at a time. A blue field at a single time might vary from light blue to dark blue in various regions. Fields can change, so the shades of blue might vary over time at any place. An air density field is the air filling a room, with sound waves in the room being oscillations of this field due to changing air density in different places at different times.

In any field theory with the property called “locality,” the propagation of basic particles from one place to another is due to the fact that any change in a field’s value induces infinitesimally-nearby changes. Think of points in the field as interacting only with their nearest neighbors, which in turn interact with their own neighbors, and so forth. So, field theory with locality has the advantage that, if you want to know what will happen next at a place, you do not have to consider the influence of everything everywhere in the universe but only the field values at the place of interest and the rates of change of those values. Computing the effect of a change can be much simpler this way.
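One standard way to make locality concrete is a discretized field whose update rule consults only nearest neighbors. The sketch below is a minimal illustration (a finite-difference step for a one-dimensional wave equation, with invented values), not a piece of real physics code:

```python
# A discrete model of a local field: the next value at each interior point
# depends only on that point, its previous value, and its nearest neighbors.
def step(prev, curr, c=0.5):
    """Advance the field one time step; c is a dimensionless wave speed."""
    nxt = curr[:]  # endpoints held fixed
    for i in range(1, len(curr) - 1):
        # Only curr[i-1], curr[i], curr[i+1] (plus the previous value at i)
        # enter the update: this is what "locality" means here.
        nxt[i] = 2 * curr[i] - prev[i] + c**2 * (curr[i-1] - 2 * curr[i] + curr[i+1])
    return nxt

field = [0.0] * 20
field[10] = 1.0                # a localized bump in the field
later = step(field[:], field)  # after one step, only the neighbors of the bump change
```

After a single step the disturbance has reached only the immediate neighbors of the bump; distant points are untouched, exactly as the locality picture in the text describes.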

In Newton’s mechanics, two distant objects act on each other directly and instantaneously. In contemporary mechanics, the two distant objects act on each other only indirectly via the field between them. However, Newton’s theory of gravity without fields is sometimes practical to use because gravitational forces get weaker with distance, and the gravitational influence of all the distant particles can be ignored for practical purposes.

The universe at a time is approximately a system of particles in spacetime, but, more fundamentally, the best guess of physicists is that it is a system of co-existing quantized fields acting on the vacuum. We know this is so for all non-gravitational phenomena. In the early years of using the concept of fields, the fields were considered something added to systems of particles, but the modern viewpoint (influenced by quantum mechanics) is that particles themselves are only local vibrations or excitations of fields; the particles are the vibrations that are fairly stable in the sense of persisting (for the particle’s lifetime) and not occupying a large spatial region as the field itself does.

The classical concept of there being a particle at a point does not quite hold in quantum theory. The key ontological idea is that the particles supervene on the fields. Particles are epiphenomena. Also, quantum fields do not change their energies continuously in the way classical fields do. A quantum field is able to change its energy only in discrete jumps.

The concept of a field originated with Pierre-Simon Laplace (1749-1827) in about 1800. He suggested Newton’s theory of gravity could be treated as being a field theory. In Laplace’s field theory of gravity, the notion of action at a distance was eliminated. Newton would have been happy with this idea of a field because he always doubted that gravity worked by one particle acting directly on another distant particle instantaneously. In a letter to Richard Bentley, he said:

It is inconceivable that inanimate brute matter should, without the intervention of something else which is not material, operate upon and affect other matter, and have an effect upon it, without mutual contact.

But Newton still would have been unhappy with Laplace’s field theory. In Laplace’s theory, the force of gravity in a direction is proportional to the rate of change of the gravitational field in that direction.  In Laplace’s version of the theory of gravity, any change in a gravitational force must be propagated instantaneously throughout all space. Newton wished to avoid instantaneous actions.

Instantaneous actions were removed from electromagnetic fields by Maxwell in the 1860s when he created his theory of electromagnetism as a field theory. Changes in electromagnetic forces were propagated, not instantaneously, but at the speed c of light. Instantaneous actions were eventually removed from gravitational theory in Einstein’s general theory of relativity of 1915. It was Einstein who first claimed that spacetime is the field associated with gravity. According to Einstein,

As the Earth moves, the direction of its gravitational pull does not change instantly throughout the universe. Rather, it changes right where the Earth is located, and then the field at that point tugs on the field nearby, which tugs on the field a little farther away, and so on in a wave moving outward at the speed of light. (Carroll 2019, p. 249)

Gravitational force, according to Einstein’s theory, is not really a force in the usual sense of the term, but is the curvature of spacetime.

Depending upon the field, a field’s value at a point in space might be a simple number (as in the Higgs field), or a vector (as in the classical electromagnetic field), or a tensor (as in Einstein’s gravitational potential field), or even a matrix. Fields obey laws, and these laws usually are systems of partial differential equations that hold at each point.

With the rise of quantum field theory, instead of a particle being treated as a definite-size object within spacetime, it is treated as a localized disturbance of the field itself, a little “hill” or deviation from its average value nearby. For example, an electron is a localized disturbance in the electron field, and so is the anti-electron; a photon is a localized disturbance in the electromagnetic field. The disturbance is a fuzzy bundle of quantized energy occupying a region of space bigger than a single point. Here is an analogy. Think of a quantum field as a farmer’s field. A particle is a little hill in the field. These hills can be stationary or moving. The hills can pass by each other or pass through other hills or bounce off them, depending on the kinds of hills. Moving hills carry information and energy from one place to another. New energy inputted into the field can increase the size of the hill, but only in discrete sizes. Any hill has a next bigger possible size (or energy).

So, the manifest image of a particle cannot easily be reconciled with the quantum mechanical image of a particle. Although fields, not particles, are ontologically basic, it does not follow from this that particles are not real. They are just odd in not having a well-defined diameter, and not being able to change their sizes gradually. Although an electron does have a greater probability of being detected at some places than at others, in any single detection at a single time the electron is detected only at a point, not a region. The electron is a disturbance that spreads throughout space, although the high-amplitude parts are in a small region. Despite having no sharp boundary, the electron is physically basic in the sense that it has no sub-structure. The proton is not basic because it is made of quarks and gluons. Particles with no sub-structure are called elementary particles.

Relativity theory’s biggest ontological impact is that whether a particle is present depends on the observer. An accelerating observer might observe (that is, detect) particles being present in a specific region while a non-accelerating observer can see no particles there. For a single region of spacetime, there can be particles in the region in one reference frame and no particles in that region for another frame, yet both frames are correct descriptions of reality!

One unusual feature of quantum mechanics (quantum theory without relativity theory and without the Standard Model of Particle Physics) is the Heisenberg Uncertainty Principle. It implies that any object, such as an electron, has complementary features. For example, it has values for its position and for the rate of change of its position, but the values are complementary in the sense that the more precisely one value is measured the less precisely the other value can be measured. Fields are objects, too, and so the Heisenberg Uncertainty Principle applies also to fields. Fields have complementary features. The more certain you are of the value of a field at one location in space, the less certain you can be of its rate of change at that location. Thus the word “uncertainty” in the name Heisenberg Uncertainty Principle.

There are many basic quantum fields that exist together. There are four basic matter fields, two of which are the electron field and the quark field. There are five basic force-carrying fields, such as the electromagnetic field and the Higgs field. Many physicists believe there are more, as yet unknown, fields, such as a dark matter field, a dark energy field, and a quantum-gravity field.

Fields often interact with other fields. The electron has the property of having an electric charge. What this means in quantum field theory is that the property of having a certain electric charge is a short description of how the electron field interacts with the electromagnetic field. The electromagnetic field interacts with the electron field whenever an energetic photon transitions into an electron and a positron (that is, an anti-electron). What it is for an electron to have a mass is that the electron field interacts with the Higgs field. Physicists presuppose that two fields can interact with each other only when they are at the same point. If this presupposition were not true, our world would be a very spooky place.

According to quantum field theory, once one of these basic fields comes into existence it does not disappear; the field exists everywhere from then on. Magnets create magnetic fields, but if you were to remove all the magnets, there would still be a magnetic field, although it would be at its minimum strength. Sources of fields are not essential for the existence of fields.

Because of the Heisenberg Uncertainty Principle, even when a field’s value is the lowest possible (called the vacuum state or unexcited state) in a region, there is always a non-zero probability that its value will spontaneously deviate from that value in the region. The most common way this happens is via virtual-pair production. This occurs when a particle and its anti-particle spontaneously come into existence in the region, then rapidly annihilate each other in a small burst of energy. You can think of space in its smallest regions as being a churning sea, a sea of pairs of these particles and their anti-particles that are continually coming into existence and then rapidly being annihilated. These virtual particles are compact quantum vacuum fluctuations. So, even if all the universe’s fields were to be at their lowest state, empty space always would have some activity and energy. This energy of the vacuum state is inaccessible to us; we can never use it to do work. Nevertheless, the energy of these virtual particles does contribute to the energy density of so-called “empty space.” This story or description of virtual particles is helpful but can be misleading when it is interpreted as suggesting that something is created from nothing in violation of energy conservation. However, it is correct to draw the conclusion from the story that the empty space of physics is not the metaphysician’s nothingness. So, there is no region of empty space where there could be empty time or changeless time in the sense meant by a Leibnizian relationist.

Because all these fields are quantum fields, their disturbances or excitations can occur only in quantized chunks: energy can be added to a field only in integer multiples of a smallest unit, added on top of the field’s zero-point energy, which is the lowest possible positive energy. It is these chunks, called “quanta,” that make the theory a quantum theory.
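On the textbook harmonic-oscillator picture of a single field mode of frequency ν, the allowed energies form a ladder E_n = (n + ½)hν: the bottom rung is the zero-point energy, and each step up adds exactly one quantum hν. A minimal illustration (the frequency chosen is arbitrary):

```python
# Energy ladder for a single field mode of frequency nu, on the standard
# harmonic-oscillator picture: E_n = (n + 1/2) * h * nu. The lowest rung
# (n = 0) is the zero-point energy; each excitation adds one quantum h*nu.
H = 6.62607015e-34  # Planck constant, in joule-seconds

def mode_energy(n, nu):
    """Energy of the mode when n quanta are present."""
    return (n + 0.5) * H * nu

nu = 5.0e14           # an optical-range frequency, chosen only for illustration
quantum = H * nu      # the smallest possible energy increment for this mode
gap = mode_energy(3, nu) - mode_energy(2, nu)
# gap equals one quantum: the ladder has no fractional steps.
```

The vacuum (n = 0) still carries energy hν/2, which is why, as the previous section noted, even "empty" space has a non-zero energy.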

Although fields that exist cannot go out of existence, they can wake up from their slumbers and turn on. Soon after the Big Bang, the Higgs field had a value of zero everywhere; as the universe cooled and its temperature fell below a certain critical value, the field spontaneously acquired a non-zero value. From then on, any particle that interacted with the Higgs field acquired a mass. Before that, all particles were massless. The more strongly a particle interacts with the Higgs field, the heavier it is. The photon does not interact at all with the Higgs field, which is why it is massless.

What is the relationship between spacetime and all these fields? Are the fields in space or, as Einstein once said, are they properties of space, or is there a different relationship? Some physicists believe the gravitational field resides within spacetime. Proponents of string theory, for example, believe all particles are made of strings and these strings move within a pre-existing spacetime. Other physicists who are proponents of the theory of loop quantum gravity do away with gravitons in favor of one-dimensional loops whose collective behavior is gravitation; so it is a mistake, they say, to think of the gravitational field as existing within space or within spacetime.

Many physicists believe that the universe is not composed of many fields; it is composed of a single field, the quantum field, which has a character such that it appears as if it is composed of various different fields. This one field is the vacuum, and all particles are really just fluctuations in the vacuum.

There is also serious speculation that fields are not the ontologically basic entities; information is basic.

For an elementary introduction to quantum fields, see the video https://www.youtube.com/watch?v=X5rAGfjPSWE.

Back to the main “Time” article for references.

 

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University Sacramento
U. S. A.

Time

Time is what clocks are used to measure. Information about time tells the durations of events, and when they occur, and which events happen before which others, so time plays a very significant role in the universe’s structure, including the structure of our personal lives. But carefully describing time’s properties has led to many unresolved issues, both philosophical and scientific.

Consider this issue upon which philosophers are deeply divided: What sort of ontological differences are there among the present, the past and the future? There are three competing philosophical theories. Presentism implies that necessarily only present objects and present events are real, and we conscious beings can recognize this in the special vividness of our present experiences compared to our relatively dim memories of past experiences and dim expectations of future experiences. So, the dinosaurs have slipped out of reality even though our current ideas of them have not. However, the growing-past theory implies the past and present are both real, but the future is not, because the future is indeterminate or merely potential. Dinosaurs are real, but our future death is not. The third theory, eternalism, is that there are no objective ontological differences among present, past, and future because the differences are merely subjective, such as depending upon whose present we are talking about.

In no particular order, here is a list of other issues about time that are discussed in this article:

•Whether there was a moment without an earlier one.
•Whether time exists when nothing is changing.
•What kinds of time travel are possible.
•Whether time has an arrow, but space does not.
•How time is represented in the mind.
•Whether time itself passes or flows.
•How to distinguish an accurate clock from an inaccurate one.
•Whether what happens in the present is the same for everyone.
•Which features of our ordinary sense of the word time are, or should be, captured by the concept of time in physics.
•Whether contingent sentences about the future have truth-values now.
•Whether tensed facts or tenseless facts are ontologically fundamental.
•The proper formalism or logic for capturing the special role that time plays in reasoning.
•Whether an instant can have a zero duration and also a very next instant.
•What neural mechanisms account for our experience of time.
•Whether time is objective or subjective.
•Whether there is a timeless substratum from which time emerges.
•Whether time is an illusion or merely a mathematical construct.
•Which specific aspects of time are conventional.
•How to settle the disputes between proponents of McTaggart’s A-theory and B-theory of time.

This article does not explore how time is treated within different cultures and languages, nor how persons can more efficiently manage their time, nor what entities are timeless.

Table of Contents

  1. Introduction
  2. Physical Time, Biological Time, and Psychological Time
  3. What is Time?
  4. Why There Is Time Instead of No Time
  5. The Scientific Image of Time
  6. Time and Change (Relationism vs. Substantivalism)
    1. History of the Debate from Aristotle to Kant
    2. History of the Debate after Kant
  7. Is There a Beginning or End to Time?
    1. The Beginning
    2. The End
    3. Historical Answers
  8. Emergence of Time
  9. Convention
  10. Arguments That Time Is Not Real
  11. Time Travel
    1. To the Future
    2. To the Past
  12. McTaggart’s A-Theory and B-Theory
  13. The Passage or Flow of Time
  14. The Past, Present, and Future
    1. Presentism, the Growing-Past, Eternalism, and the Block-Universe
    2. The Present
    3. Persistence, Four-Dimensionalism, and Temporal Parts
    4. Truth-Values of Tensed Sentences
    5. Essentially-Tensed Facts
  15. The Arrow of Time
  16. Temporal Logic
  17. Time, Mind, and Experience
  18. Supplements
    1. Frequently Asked Questions about Time
    2. What Else Science Requires of Time
    3. Special Relativity: Proper Times, Coordinate Systems, and Lorentz Transformations (by Andrew Holster)
  19. References and Further Reading

1. Introduction

Philosophers of time want to build a robust and defensible philosophical theory of time, one that resolves all the issues on the list of philosophical issues mentioned in the opening summary, or at least they want to provide a mutually consistent set of proposed answers to them that is supported by the majority of experts on the issues.

In doing this, one philosophical goal is to properly analyze the complicated relationship between the commonsense image of time and the scientific image of time. This is the relationship between beliefs about time held by ordinary speakers of our language and beliefs about time as understood through the lens of contemporary science, particularly physics. Our fundamental scientific theories—the theory of relativity and quantum theory—are unintuitive. When describing time, the commonsense image is expressed with non-technical terms such as now, flow, and past and not with technical scientific terms such as continuum, reference frame, and quantum entanglement. The scientific image uses underlying mechanisms such as atoms, fields, and other structures that are not observable to us without scientific instruments plus reliance upon implicit, theoretical assumptions. The Greek philosopher Anaxagoras showed foresight when he said, “Phenomena are a sight of the unseen.” What he might say today is that, “There is so much more to the world than we were evolved to see” (Frank Wilczek).

The manifest image or folk image is the understanding of the world as it appears to us using common sense untutored by advances in contemporary science. It does not qualify as a theory in the technical sense of that term but is more an assortment of tacit beliefs. The concept is vague, and there is no good reason to believe that there is a single shared folk concept. Maybe different cultures have different concepts of time. Despite the variability here, a reasonable way to make the concept a little more precise is to say it contains all these beliefs about time [some of which are declared to be false according to the scientific image]: (1) The world was not created five minutes ago. (2) We all experience time via experiencing change. (3) The future must be different from the past.  (4) Time exists in all places. (5) You can change the direction you are going in space but not in time. (6) Every event has a duration, and, like length of an object and distance between places, duration is never negative.  (7) Every event occurs at some time or other. (8) The past is fixed, but the future is open. (9) A nearby present event cannot directly influence a distant present event. (10) Time has an intrinsic arrow. (11) Time has nothing to do with space. (12) Given any two events, they have some objective order such as one happening before the other, or else their being simultaneous. (13) Time passes; it flows like a river, and we directly experience this flow. (14) There is a present that is objective, that every living person shares, and that divides everyone’s past from their future. (15) Time is independent of the presence or absence of physical objects and what they are doing.

Only items 1 through 7 of the 15 have clearly survived the impact of modern science. Item 8 may or may not survive depending upon whether total information is conserved over time and whether the universe is determined. Item 9 fails because of quantum entanglement. Item 12 fails because of the relativity of simultaneity in the theory of relativity. Item 15 fails because of relativistic time dilation; you could become twice as old as your twin brother if your brother goes on a high-speed adventure. Also, the scientific image has taken some of the everyday terms of the manifest image and given them more precise definitions.
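The twin example follows from the standard special-relativistic time-dilation factor γ = 1/√(1 − v²/c²). A minimal illustration (speeds are given as fractions of c; the numbers are chosen only to show that γ = 2 at roughly 0.866c):

```python
import math

# Special-relativistic time-dilation factor: gamma = 1 / sqrt(1 - v^2/c^2).
# At v = (sqrt(3)/2) * c, about 0.866c, gamma = 2: for every year that passes
# on the traveling twin's clock, two years pass for the twin who stays home.
def gamma(v_over_c):
    """Dilation factor for a speed expressed as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v_over_c**2)

print(round(gamma(math.sqrt(3) / 2), 6))  # 2.0
print(round(gamma(0.1), 6))               # barely above 1: everyday speeds hardly matter
```

The second line of output shows why the failure of item 15 went unnoticed before the twentieth century: at ordinary speeds the dilation factor differs from 1 only in the third decimal place or beyond.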

The scientific image and the manifest image are not images of different worlds. They are images of the same reality. Both images have evolved over the years. The evolution has often been abrupt; think of the abrupt impact of the Copernican Revolution and the Darwinian Revolution. Regarding time, the most significant impact on its scientific image was the acceptance of the theory of relativity. Almost all the possible implications about time from quantum theory are still in dispute among physicists. See (Callender 2017) for a detailed description and discussion of the controversies between the manifest image and the scientific image.

A popular methodology used by some metaphysicians is to start with a feature of the manifest image and then change it only if there are good reasons to do so. Unfortunately, there is no consensus among philosophers of time about what counts as a good reason, although there is much more consensus among physicists. Does conflict with relativity theory count as a good reason? Yes, say physicists, but Husserl’s classic 1936 work on phenomenology, The Crisis of European Sciences and Transcendental Phenomenology, criticized the scientific image because of its acceptance of so many of the implications of relativity theory, and in this spirit A. N. Prior said that the theory of relativity is for this reason not about real time.

Ever since the downfall of the Logical Positivists’ program of requiring all meaningful, non-tautological statements to be reducible to commonsense statements about what is given in our sense experiences (via seeing, hearing, feeling, and so forth), few philosophers of science would advocate any reduction or direct translation of statements expressed in the manifest image to statements expressed in the scientific image, or vice versa, but the proper relationship between the two is an open question.

With the rise of the importance of scientific realism in both metaphysics and the philosophy of science in the latter part of the twentieth century, many philosophers would summarize the relationship between the two images by saying our direct experience of reality is real but overrated. They suggest that defenders of the manifest image have been creative, but ultimately they have wasted their time in trying to revise and improve the manifest image to lessen its conflict with the scientific image. Regarding these attempts in support of the manifest image, the philosopher of physics Craig Callender made this sharp criticism:

These models of time are typically sophisticated products and shouldn’t be confused with manifest time. Instead they are models that adorn the time of physics with all manner of fancy temporal dress: primitive flows, tensed presents, transient presents, ersatz presents, Meinongian times, existent presents, priority presents, thick and skipping presents, moving spotlights, becoming, and at least half a dozen different types of branching! What unites this otherwise motley class is that each model has features that allegedly vindicate core aspects of manifest time. However, these tricked out times have not met with much success (Callender 2017, p. 29).

In some very loose and coarse-grained sense, manifest time might be called an illusion without any harm done. However, for many of its aspects, it’s a bit like calling our impression of a shape an illusion, and that seems wrong (Callender 2017, p. 310).

Some issues listed in the opening summary are intimately related to others, so it is reasonable to expect a resolution of one to have deep implications for another. For example, there is an important subset of related philosophical issues about time that cause many philosophers of time to divide into two broad camps, the A-camp and the B-camp, because the camps are on the opposite sides of so many controversial issues about time.

The next two paragraphs summarize the claims of the two camps. Later parts of this article provide more introduction to the philosophical controversy between the A and B camps, and they explain the technical terms that are about to be used. Briefly, the two camps can be distinguished by saying the members of the A-camp believe McTaggart’s A-theory is the fundamental way to understand time; and they accept a majority of the following claims: past events are always changing as they move farther into the past; this change is the only genuine, fundamental kind of change; the present or “now” is objectively real; so is time’s passage or flow; ontologically we should accept either presentism or the growing-past theory because the present is somehow metaphysically privileged compared to the future; predictions are not true or false at the time they are uttered; tensed facts are ontologically fundamental, not untensed facts; the ontologically fundamental objects are 3-dimensional, not 4-dimensional; and at least some A-predicates are not semantically reducible to B-predicates without significant loss of meaning. The word “fundamental” in these discussions is used either in the sense of “not derivable” or “not emergent.” It does not mean “most important.”

Members of the B-camp reject all or at least most of the claims of the A-camp. They believe McTaggart’s B-theory is the fundamental way to understand time; and they accept a majority of the following claims: events never undergo genuine change; the present or now is not objectively real and neither is time’s flow; ontologically we should accept eternalism and the block-universe theory; predictions are true or false at the time they are uttered; untensed facts are more fundamental than tensed facts; the fundamental objects are 4-dimensional, not 3-dimensional; and A-predicates are reducible to B-predicates, or at least the truth conditions of sentences using A-predicates can be adequately explained in terms of the truth conditions of sentences using only B-predicates. Many B-theorists claim that they do not deny the reality of the human experiences that A-theorists appeal to, but rather they believe those experiences can be best explained from the perspective of the B-theory.

To what extent is time understood? This is a difficult question, not simply because the word understood is notoriously vague. There have been a great many advances in understanding time over the last two thousand years, especially over the last 125 years, as this article explains, so we can definitively say time is better understood than it was—clear evidence that philosophy makes progress. Nevertheless, too many questions remain whose answers the experts have not agreed upon for us to say that time is understood. Can we at least say only the relatively less important questions are left unanswered? No, not even that. So, this is the state of understanding time at the end of the first quarter of the twenty-first century. It is certainly less than a reader might wish to have. Still, it is remarkable how much we do know about time that we once did not; and it is remarkable that we can be so clear about what it is that we do not know; and there is no good argument for why this still sought-after knowledge is beyond the reach of the human mind.

2. Physical Time, Biological Time, and Psychological Time

Physical time is public time, the time that clocks are designed to measure. Biological time is indicated by regular, periodic biological processes, and by signs of aging. The ticks of a human being’s biological clock are produced by heartbeats, the rhythm of breathing, cycles of sleeping and waking, and periodic menstruation, although there is no conscious counting of the cycles as in an ordinary clock. Biological time is not another kind of time, but rather is best understood as the body’s recording of physical time, in the sense that biological time is physical time measured with a biological process.

Psychological time is private time; it is also called subjective time and phenomenological time. Our psychological time can change its rate, compared to physical time, depending on whether we are bored or instead intensively involved. Although the point has been disputed in the philosophical literature, and it still is, the position advocated by most philosophers is that psychological time is best understood not as a kind of time but rather as awareness of physical time. Psychological time is what people usually are thinking of when they ask whether time is just a construct of the mind. But not always, such as when a philosopher asks whether the spacetime of Einstein’s theory of relativity is only a projection of our brain’s neural processes.

There is no experimental evidence that the character of physical time is affected in any way by the presence or absence of mental awareness, or by the presence or absence of any biological phenomenon. For that reason, physical time is often called objective time and scientific time. The scientific image of time is the product of science’s attempt to understand physical time.

When a physicist defines speed to be distance traveled divided by the duration of the travel (or, more accurately, the rate of change of position with respect to time), the term time in that definition refers to physical time. Physical time is more helpful than psychological time for helping us understand our shared experiences in the world; but psychological time is vitally important for understanding many mental experiences, as is biological time for understanding biological phenomena.

Psychological time and biological time are explored in more detail in Section 17.

3. What is Time?

It may not be what it seems. “Time is what keeps everything from happening all at once,” said a comedian. Clocks can tell you what time it is, but they cannot tell you what time is.  “Time is succession,” Henri Bergson declared, but that remark is frustratingly vague.

There is disagreement among philosophers of time as to what metaphysical structure is essential to time. Two competing recommendations, among others, are that time is a one-dimensional structure of ordered instants satisfying McTaggart’s A-series, or satisfying his B-series. Think of an instant as a snapshot of the universe at a time.  More will be said about these two series later in this article.

Maybe we can decide what time is by considering what our world would be like if it did not contain time. But how would we proceed? We cannot turn off time and look at the result. Imagining the world without time is not likely to be a reliable guide, especially if you consider how much you could learn about flying saucers by imagining a world without them.

Information about time tells us the durations of events, when they occur, and which events happen before which others, so any definition of time or theory of time should allow for this. Time seems to be necessary for grounding causation, persistence, and change. Can time be specified more precisely? Is it helpful to distinguish what time is from what it does? Should we be aiming to say time is what meets certain necessary and sufficient criteria, or should we aim for a sophisticated, detailed, philosophical theory about time, or should we say time is whatever plays this or that functional role such as accounting for our temporal phenomenology? Baron and Miller have argued that, if a demon plays the functional role of providing us with our temporal phenomenology, then we would not agree that time is a demon, so more constraints need to be placed on any functionalist account of time.

Some say time is whatever satisfies the requirements on the time variable in the fundamental equations of physics. In reaction to this last claim, opponents usually complain of scientism. Other researchers say time is what best satisfies our many intuitions about time in our manifest image, or it is whatever functions to ground our temporal phenomenology. Their opponents usually complain here of overemphasis on subjective features of time and of insensitivity to scientific advances.

Sometimes, when we ask what time is, we are asking for the meaning of the noun “time.” It is the most frequently used noun in the English language. A first step in that direction might be to clarify the difference between its meaning and its reference. The term time has several meanings. It can mean the duration between events, as when we say the trip from home to the supermarket took too much time because of all the traffic. It can mean, instead, the temporal location of an event, as when we say he arrived at the time they specified. It also can mean the temporal structure of the universe, as when we speak of investigating time rather than space. This article uses the word in all these senses.

Ordinary Language philosophers have carefully studied talk about time. This is what Ludwig Wittgenstein called the language game of discourse about time. Wittgenstein said in 1953, “For a large class of cases—though not for all—in which we employ the word ‘meaning’ it can be defined this way: the meaning of a word is its use in the language.” Perhaps an examination of all the uses of the word time would lead us to the meaning of the word. Someone, following the lead of Wittgenstein, might also say we would then be able to dissolve rather than answer most of our philosophical questions about time. That methodology of dissolving a problem was promoted by Wittgenstein in response to many other philosophical questions.

However, most philosophers of time in the twenty-first century are not interested in dissolving the problems about time nor in precisely defining the word time. They are interested in what time’s important characteristics are and in resolving philosophical disputes about time that do not seem to turn on what the word means. When Newton discovered that both the fall of an apple and the circular orbit of the Moon were caused by gravity, this was not a discovery about the meaning of the word gravity, but rather about what gravity is. Do we not want some advances like this for time?

To emphasize this idea, notice that a metaphysician who asks, “What is a ghost?” already knows the meaning in ordinary language of the word ghost, and does not usually want a precise definition of ghost but rather wants to know what ghosts are and where to find them and how to find them; and they want a more-detailed theory of ghosts. This theory ideally would provide the following things: a consistent characterization of the most important features of ghosts, a claim regarding whether they do or do not exist and how they might be reliably detected if they do exist, what principles or laws describe their behavior, how they typically act, and what they are composed of. This article takes a similar approach to the question, “What is time?” The goal is to discover the best concept of time to use in understanding the world and to develop a philosophical theory of time that addresses what science has discovered about time plus what should be said about the many philosophical issues that practicing scientists usually do not concern themselves with.

The exploration ahead adopts a realist perspective on these scientific theories. That is, it interprets them usually to mean what they say, even in their highly theoretical aspects, while appreciating that there are such things as mathematical artifacts. The perspective does not take a fictionalist perspective on the scientific theories, nor treat them as merely useful instruments, nor treat them operationally. It assumes that, in building a scientific theory, the goal is to achieve truth even though most theories achieve this goal only approximately, but what makes them approximately true is not their corresponding to some mysterious entity called approximate truth. Many of these assumptions have occasionally been challenged in the philosophical literature, and if one of the challenges is correct, then some of what is said below will require reinterpretation or rephrasing.

Everyone agrees that time has something to do with change and that clocks are designed to measure time. This article’s supplement of “Frequently Asked Questions” discusses what a clock is and what it is for a clock to be accurate as opposed to precise and why we trust some clocks more than others. Saying physical time is what clocks measure, which is how this article began, is not as trivial as it might seem since it is a deep truth about our physical universe that it is capable of having clocks. We are lucky to live in a universe with so many different kinds of regular, periodic processes that humans can use for clocks. However, the claim that time is what clocks measure has competing metaphysical interpretations. Some philosophers of physics claim that there is nothing more to time than whatever numbers are displayed on our clocks. The vast majority of philosophers of physics disagree. They say time is more than those numbers; it is what we intend to measure with those numbers. In the anti-realist spirit of those who do say there is nothing more to time than whatever numbers are displayed by our clocks, the distinguished mathematician, physicist, and philosopher of science Henri Poincaré said in 1912 “The properties of time are…merely those of our clocks just as the properties of space are merely those of the measuring instruments.”

What then is time really? Let’s consider how this question has been answered in different ways throughout the centuries. Here we are interested in very short answers that give what the proponent considers to be the key idea about time.

Aristotle proposed what has come to be called the relational theory of time when he said, “there is no time apart from change….” (Physics, chapter 11). Centuries later, Isaac Newton disagreed with this remark. Aristotle emphasized, though, that the word time is not simply another word for change. He said, “that time is not change [itself]” because a change “may be faster or slower, but not time…” (Physics, chapter 10). For example, a leaf can fall faster or slower, but time itself cannot be faster or slower. Aristotle claimed that “time is the measure of change” (Physics, chapter 12) of things, but he never said space is the measure of anything.

René Descartes answered the question, “What is time?” by claiming that a material body has the property of spatial extension but no inherent capacity for temporal endurance and that God by his continual action sustains (or re-creates) the body at each successive instant. Time is a kind of sustenance or re-creation (“Third Meditation” in Meditations on First Philosophy, published in 1641). Descartes’ worry is analogous to that of Buddhist logicians who say, “Something must explain how the separate elements of the process of becoming are holding together to produce the illusion of a stable material world.” The Buddhist answer was causality. Descartes would have answered that it is God’s actions.

Gottfried Leibniz, also a relationist as was Aristotle, said time is a series of moments, and each moment is a set of co-existing events in a network of relations of earlier-than and later-than. Isaac Newton, a contemporary of Leibniz, claimed instead that time is independent of events and said time is absolute in the sense that “true…time, in and of itself and of its own nature, without reference to anything external, flows uniformly…” (1687). This difference reflects the fact that Newton thought of space as a thing, while Leibniz disagreed and said it is not a thing but only a relationship among the other things.

Both Newton and Leibniz presumed that time is the same for all of us in the sense that how long an event lasts is the same for everyone, no matter what they are doing. This presumption would eventually be challenged by Albert Einstein in the 20th century.

In the 18th century, Immanuel Kant made some very influential remarks that suggested he believed time and space themselves are forms that the mind projects upon the external things-in-themselves. In the twenty-first century, this is now believed to be a misinterpretation of Kant’s intentions, even though he did say things that would lead to this false interpretation. What he actually believed was that our “representations of space and time and not space and time themselves” have this character. So, Kant’s remarks that time is “the form of inner sense” and that time “is an a priori condition of all appearance whatsoever” are probably best understood as suggesting that we have no direct perception of time but only have the ability to experience individual things and events within time. The “we” here is human beings; he left open the possibility that the minds of non-humans perceive differently than we do. Also, he left open the possibility that the world-in-itself, that is, the world as it is independently of being perceived, may or may not be temporal.

Ever since Newton’s theory of mechanics in the 17th century, time has been taken to be a theoretical entity, a theory-laden entity, in the sense that we can tell much about time’s key features by looking at the role it plays in our confirmed, fundamental theories. One of those is the theory of relativity. According to relativity theory, time is not fundamental, but is a necessary feature of spacetime, which itself is ontologically fundamental. Spacetime is all the actual events in the past, present, and future. In 1908, Hermann Minkowski argued that the proper way to understand relativity theory is to say time is really a designated, non-spatial dimension of spacetime, and time has no existence independent of space. Einstein agreed. The most philosophically interesting feature of the relationship between time and space, according to relativity theory, is that the more you have of one the less you have of the other.

In the early 20th century, the philosophers Alfred North Whitehead and Martin Heidegger said time is essentially the form of becoming. This is an idea that excited a great many philosophers, but not many scientists, because the remark seems to give ontological priority to the manifest image of time over the scientific image.

In the 21st century, Stephen Wolfram claimed any physical process is a natural computation; time is the inexorable progress of executing these computations; and the computations follow rules that other scientists call laws of nature. He is convinced that Einstein made a mistake in unifying space with time in his theory of relativity.

This ends our short list of historical answers to the question, “What is time?” Each of the persons mentioned was talking about time, and no one was changing the subject. They simply had different conceptions of the same thing. Other conceptions of what time is are discussed in later sections of this article.

Whatever time is, one should consider whether time has causal powers. The musician Hector Berlioz said, “Time is a great teacher, but unfortunately it kills all its pupils.” Everyone knows not to take this joke literally because, when you are asleep and then your alarm clock rings at 7:00, it is not the time itself that wakes you. Nevertheless, there are more serious reasons to believe that time has causal powers. “Spacetime tells matter how to move,” said Princeton physicist John Wheeler. The point Wheeler was making is that, in the general theory of relativity, space and time are dynamic actors, not a passive stage where events occur.

Developing this point that relativity theory tells us about time, the philosopher of physics Tim Maudlin expressed a broadly held twenty-first-century consensus that:

Space and time are theoretical entities. We cannot directly observe them; we observe the behavior of material entities and postulate space-time structure as part of the physical explanation of that behavior. The space-time structure appears in the framing of the fundamental laws, and the nature of that structure is, in a broad sense of the term, geometrical (“Relativity and Space-Time Geometry” 2022).

There is no consensus among scholars regarding what time really is. This uncertainty includes disagreement on whether, at the most fundamental level, time is fundamental or emergent. A large minority of experts in physics believe:

For shorter and shorter durations, time becomes less applicable to reality.

In the 20th century, most experts believed time to be a continuum like the mathematical line because both the theory of relativity and quantum theory assume this. If time is a continuum, then measurements of any duration have a continuous range of possible values at any scale that is only limited by the sensitivity of the measurement instrument. By the first quarter of the 21st century, many experts came to doubt whether the continuity does, will, or should hold in the long-sought theory of quantum gravity that hopefully will reconcile various inconsistencies between relativity theory and quantum theory. Most, but not all, experts believe that for shorter and shorter durations below the Planck time scale of 10⁻⁴⁴ seconds, the notions of time and spacetime become less applicable to reality. That is why so many experts predict that a future theory of quantum gravity will show us that, “the whole idea of time is just an approximation.” This claim about time is somewhat analogous to the claim that temperature is just an approximate feature of reality in the sense that, under the assumption that the fundamental ontological building blocks of nature are quantum fields, temperature is not reducible to features of the underlying quantum fields but is a more and more useful feature for describing their behavior as the scale increases.
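The Planck time mentioned above is defined from three fundamental constants. A minimal Python sketch, using approximate CODATA values for the constants (the function name is illustrative, not from any source):

```python
import math

# Approximate CODATA values for the fundamental constants.
HBAR = 1.054571817e-34  # reduced Planck constant, in J*s
G = 6.67430e-11         # Newton's gravitational constant, in m^3 kg^-1 s^-2
C = 2.99792458e8        # speed of light in vacuum, in m/s

def planck_time():
    """Planck time t_P = sqrt(hbar * G / c^5), in seconds."""
    return math.sqrt(HBAR * G / C**5)
```

Evaluating this gives roughly 5.4 × 10⁻⁴⁴ seconds, the scale below which, on the view reported above, the notion of time may cease to apply.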

4. Why There Is Time Instead of No Time

The fundamental theories of physical science—namely the general theory of relativity and quantum theory—imply that time exists. They imply it exists at least relative to a selected reference frame. However, those theories have nothing to say about why time exists. But curious people want to know why.

Among physicists and philosophers of physics, there is no agreed-upon answer to why our universe contains time instead of no time, why it contains dynamical physical laws describing change over time, whether today’s physical laws will hold tomorrow, why the universe contains the fundamental laws that it does contain, or why there is a universe instead of no universe, although there have been interesting speculations on all these issues. For instance, perhaps there is something rather than nothing because throughout all time there has always been something and there is no good reason to believe it should or could transition to nothing or could have transitioned into being from nothing. The cosmologist Lawrence Krauss remarked that “Quantum mechanics blurs the distinction between something and nothing” because the vacuum according to quantum mechanics always contains fields and particles even at the lowest possible energy level.

Here is one theological explanation for why time exists: God wanted the world to be that way. Here is an anthropic explanation. If time were not to exist, we would not now be asking why it does. Here is an intriguing non-theological and non-anthropic explanation. When steam cools, eventually it suddenly undergoes a phase transition into liquid water. Many cosmologists suspect that the universe should contain laws implying that, as the universe cools, a phase transition occurs during which four-dimensional space emerges from infinite-dimensional space; then, after more cooling, another phase transition occurs during which one of the four dimensions of primeval space collapses to become a time dimension. The previous sentence is a bit misleading because of its grammar which might suggest that something was happening before time began, but that is a problem with the English language, not with the suggestion about the origin of time.

There is a multiverse answer to our question, “Why does time exist?” The reason why our universe exists with time instead of no time is that nearly every kind of universe exists throughout the inflationary multiverse; there are universes with time and universes without time. Like all universes in the multiverse, our particular universe came into existence by means of a random selection process without a conscious selector, a process in which every physically possible universe is overwhelmingly likely to arise as an actual universe, in analogy to how continually re-shuffling a deck of cards makes it overwhelmingly likely that any specific ordering of the cards will eventually appear. Opponents complain that this multiverse explanation is shallow. To again use the metaphor of a card game, they wish to know why their poker opponent had four aces in that last hand, and they are not satisfied with the shallow explanation that four aces are inevitable with enough deals or that it is just a random result. Nevertheless, perhaps there is no better explanation.

5. The Scientific Image of Time

Time has been studied for 2,500 years, but only in the early twentieth century did time become one of the principal topics in professional journals of physics, and soon after in the journals of philosophy of science. The primary reason for this was the creation of the theory of relativity.

Any scientific theory can have its own implications about the nature of time, and time has been treated differently in different scientific theories over the centuries. When this article speaks of the scientific image of time or what science requires of time it means time of the latest, accepted theories that are fundamental in physics and so do not depend upon other theories. For example, Einstein’s theory of relativity is fundamental, but Newton’s theory of mechanics is not, nor is his theory of gravitation or Maxwell’s theory of electromagnetism. Newton’s concept of time is useful only for applications where the speed is slow, where there are no extreme changes of gravitational forces, and where durations are very large compared to the Planck time because, under these conditions, Newton’s theory agrees with Einstein’s. For example, Newton’s two theories are all that were needed to specify the trajectory of the first spaceship that landed safely on the Moon.

When scientists use the concept of time in their theories, they adopt positions that metaphysicians call metaphysical. They suppose there is a mind-independent universe in which we all live and to which their fundamental theories apply. Physical scientists tend to be what metaphysicians call empiricists. They also usually are physicalists, and they would agree with the spirit of W.V.O. Quine’s remark that, “Nothing happens in the world … without some redistribution of microphysical states.” This physicalist position can be re-expressed as the thesis that all the facts about any subject matter such as geophysics or farming are fixed by the totality of microphysical facts about the universe. Philosophers sometimes express this claim by saying all facts supervene on microphysical facts. Philosophers and some scientists are especially interested in whether the human mind might be a special counterexample to this physicalist claim. So far, however, no scientific experiments or observations have shown clearly that the answer to the metaphysical question, “Does mind supervene upon matter?” is negative. Nor do scientific observations ever seem to need us to control for what the observer is thinking.

In the manifest image, the universe is fundamentally made of objects rather than events. In the scientific image, the universe is fundamentally made of events rather than objects. Physicists use the term “event” in two ways, and usually only the context suggests which sense is intended. In sense 1, something happens at a place for a certain amount of time. In sense 2, an event is simply a point location in space and time. Sense 2 is what Albert Einstein had in mind when he said the world of events forms a four-dimensional continuum in which time and space are not completely separate entities. In either of these two senses, it is assumed in fundamental scientific theories that longer events are composed of shorter events which in turn are composed of instantaneous events. The presumption of there being instantaneous events is controversial. That presupposition upset Alfred North Whitehead who said: “There is no nature apart from transition, and there is no transition apart from temporal duration. This is why an instant of time, conceived as a primary simple fact, is nonsense” (Whitehead 1938, p. 207).

Frames of reference are perspectives on the space or the spacetime we are interested in. A coordinate system is what the analyst places on a reference frame to help specify locations quantitatively. A coordinate system placed on a reference frame of spacetime normally assigns numbers as names of temporal point-locations (called point-times) and spatial locations (point-places). The best numbers to assign are real numbers (a.k.a. decimals), in order to allow for the applicability of calculus. A duration of only a billionth of a second still contains a great many point-times, a nondenumerable infinity of them. Relativity theory implies there are an infinite number of legitimate, different reference frames and coordinate systems. No one of them is distinguished or absolute in Isaac Newton’s sense of specifying what time it “really” is, and where you “really” are, independently of all other objects and events. Coordinate systems are not objective features of the world. They vary with human choices about the location of their origins, their scales, the orientation of their coordinate axes, and whether the coordinate system specifies locations by things other than axes, such as the angle between two axes. In relativity theory, reference frames are often called “observers,” but there is no requirement that conscious beings be involved.

The fundamental theories of physics are the general theory of relativity and quantum theory including the standard model of particle physics—but not the big bang theory. The collection of the fundamental theories is often called the Core Theory. The Core Theory is discussed in more detail in a companion article. For scientists it provides our civilization’s best idea of what is fundamentally real.

The theory of relativity is well understood philosophically, but quantum theory is not, although the mathematical implications of these theories are well understood by mathematicians and physicists. These theories are not, of course, merely informed guesses. Each is a confirmed set of precise, teleology-free laws. The theories have survived a great many tests and observations, so the scientific community trusts their implications in cases in which they do not conflict with each other, and the theories have many implications about the nature of time. One is that time is like space in some ways but not others.

Here is the scientific image of time as a numbered list of its most significant implications about time, with emphasis upon relativity theory and not quantum theory. The impact of quantum theory on our understanding of time is discussed here in a Supplement.

(1) When you look at a distant object, you see it as it was, not as it is.

Because seeing requires light and because the speed of light is not infinite and because it takes time for the brain to process information that it receives from the eyes, the information you obtain by looking at an object is information about how it was, not how it is. The more distant the object, the more outdated is the information.
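The light-travel part of this delay is simple arithmetic: distance divided by the speed of light. A minimal sketch, ignoring the brain's processing time (the function name and the standard astronomical-unit distance are illustrative assumptions):

```python
def light_delay_seconds(distance_m, c=2.99792458e8):
    """Time in seconds for light to cross the given distance in meters."""
    return distance_m / c

# The Sun is about one astronomical unit from Earth:
AU = 1.495978707e11  # meters
# light_delay_seconds(AU) is about 499 seconds, i.e. roughly 8.3 minutes,
# so when you look at the Sun you see it as it was over eight minutes ago.
```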

(2) The duration of the past is at least 13.8 billion years.

The big bang theory is well confirmed (though not as well as relativity theory), and it requires the past of the observable universe to extend back at least 13.8 billion years to when an explosion of space occurred, the so-called “big bang.” This number is found primarily from imagining the current expansion of the observable universe to be reversed in time, and noting that the galaxies were very close together about 13.8 billion years ago. It is assumed that gravity is the only significant phenomenon affecting this calculation. Because it is unknown whether anything happened before the big bang, it is better to think of the big bang, not as the beginning of time, but as the beginning of what we understand about our distant past. A large majority of cosmologists believe the big bang’s expansion is an expansion of space but not of spacetime and thus not of time. By the way, when cosmologists speak of space expanding, this remark is about increasing distances among galaxies. The distance from New York City to London does not expand.

(3) Time is one-dimensional, like a line.

The scientist Joseph Priestley in 1765 first suggested that time is like a one-dimensional line. The idea quickly caught on, and now time is represented as one-dimensional in all the fundamental theories of physics. Two-dimensional time has been studied by mathematical physicists, but no theories implying that time has more than one dimension in our actual universe have acquired a significant number of supporters. Such theories are difficult to make consistent with what else we know, and there is no motivation for doing so. Because of this one-dimensionality, time is represented in a coordinate system with a time line rather than a time area, and its geometry is simpler than that of space.

(4) Time connects all events.

Given any two events that ever have existed or ever will, either one happens before the other or else they are simultaneous. No exceptions, assuming there is no ambiguity about which reference frame is being used, and assuming we are not comparing events across universes of the multiverse.

(5) Time travel is possible.

You can travel to the future—to a time in your lifetime when you can meet your great-great-grandchildren. Your travelling to someone else’s future has been experimentally well-confirmed many times. Travelling to your own future, though, does not make sense because you are always in your own present. There is no consensus among scientists regarding whether you might someday be able to travel into your own past.

(6) Time is relative.

According to relativity theory, the amount of time an event lasts (the event’s duration) is relative to someone’s choice of a reference frame or coordinate system or vantage point. How long you slept last night is very different depending on whether it is measured by a clock next to you or by a clock in a spaceship circling the solar system at close to the speed of light. If no reference frame has been pre-selected, then it is a violation of relativity theory to say one of those two durations is correct and the other is incorrect. Newton would have said both durations cannot be correct, but regarding this feature of Newton’s classical physics, Einstein and Infeld said, “In classical physics it was always assumed that clocks in motion and at rest have the same rhythm…[but] if the relativity theory is valid, then we must sacrifice this assumption. It is difficult to get rid of deep-rooted prejudices, but there is no other way.” This point about the relativity of time is often expressed informally with the somewhat inaccurate remark that time passes at different rates for different observers.

Because duration is relative, the conclusion is drawn that:

(7) Time is not an objectively real feature of the universe.

According to relativity theory, space-time is objectively real and fundamental, but the main reason for believing time is not objectively real is that it is not independent of space. A second reason is that you can change the duration of anything that happened in the past just by changing your reference frame (coordinate system). Scientists assume that what is objectively real must not be dependent upon someone’s choice of reference frame. To some philosophers this implication casts doubt upon either the theory of relativity itself or the importance that scientists ascribe to frame-independence.

(8) Simultaneity is relative. Two observers who move relative to each other will, in general, disagree about which events occur simultaneously.

According to relativity theory, if the two observers move toward or away from each other or experience different gravitational forces, then many pairs of events will be simultaneous for one observer and not simultaneous for the other observer. Relativity theory implies there is no uniquely correct answer to the question, for some distant place, “What is happening now at that place?” The answer depends on what observer is answering the question, namely what reference frame is being assumed. Nevertheless, at least in cosmology there is an obvious, standard reference frame that every cosmologist chooses, the frame in which the galaxies have the least mutual motion. For ordinary discussions about events on Earth, a reference frame is customarily used in which the Earth is not moving. And since we all move at slow speeds relative to each other and don’t experience very different gravitational forces and don’t consider very distant phenomena, we can agree for practical purposes on Earth about what is simultaneous with what.

(9) Within a single reference frame, coordinate time “fixes” (i) when each event occurs, (ii) what any event’s duration is, (iii) what other events occur simultaneously with it, and (iv) the time-order of any two events.

Coordinate time is time measured along the time dimension in a chosen coordinate system.

(10) Speeding clocks run slower. 

According to relativity theory, a speeding clock always runs slower compared to a stationary clock. The speeding clock’s ticking is said to be “stretched” or “dilated” compared to that of the stationary clock. (An assumption here is that correct clocks are always protected from damage.) Time dilation can be pictured by representing the ticking of two clocks that are initially synchronized, after which they move rapidly away from each other, then rapidly back toward each other. This dilation works for all processes, not just clocks.
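In special relativity, the amount of slowing is governed by the Lorentz factor. A minimal Python sketch (the function names and the 90%-of-light-speed example are illustrative choices, not from the article):

```python
import math

def lorentz_factor(v, c=2.99792458e8):
    """gamma = 1 / sqrt(1 - v^2/c^2); a clock moving at speed v
    ticks gamma times slower as judged from the stationary frame."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def dilated_interval(proper_seconds, v, c=2.99792458e8):
    """Duration the stationary observer assigns to an interval that the
    moving clock itself records as proper_seconds."""
    return proper_seconds * lorentz_factor(v, c)
```

At 90% of light speed, gamma is about 2.29, so one second ticked off by the speeding clock corresponds to about 2.29 seconds for the stationary observer.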

(11) Time slows when the gravitational force increases. 

A clock that closely approaches the horizon of a black hole almost stops (relative to clocks back on Earth). But if you were near the clock you’d notice nothing odd about the clock’s rate. This is an implication of the general theory of relativity. Initially synchronized clocks will get out of synch if they are affected differently by gravity. The greater the gravitational force, the slower the ticking. This holds for all processes, not just the ticking of clocks. You will live longer on the first floor of your apartment building than on the tenth floor, where the gravitational force on you is slightly less. This is a second kind of time dilation. We do not normally notice this time dilation because we spend our lives in a gravitational field of about the same strength wherever we go during our life. But your sea-level clock lags behind a clock at the top of Mount Everest by about 30 microseconds every year. The clock in a satellite orbiting Earth disagrees with the standard clock back on Earth by slowing down due to its speed while speeding up due to its being less affected by Earth’s gravity. These two time dilation effects cancel out when the satellite is about 2,000 miles above Earth.
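The Everest figure can be checked with the standard weak-field approximation, in which the fractional rate difference between two clocks separated by height h near Earth's surface is roughly g·h/c². A back-of-the-envelope sketch (the function name, the rounded height, and the surface value of g are illustrative assumptions):

```python
def height_dilation_seconds_per_year(height_m, g=9.81, c=2.99792458e8):
    """Approximate annual lag of a lower clock behind one that is
    height_m higher, using the weak-field formula g*h/c^2."""
    seconds_per_year = 365.25 * 24 * 3600  # about 3.156e7 seconds
    return (g * height_m / c**2) * seconds_per_year
```

For Everest's roughly 8,849 meters, this gives about 30 microseconds per year, matching the figure quoted above.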

(12) If you synchronize two clocks at the same place, then move one of them differently from the other, when they arrive back together, they are likely to report different times. 

This is the result of time dilation. The only reason that there is such a thing as THE correct time is that we accept the convention of trusting reports from just one clock, our standard clock or master clock. By convention, our standard clock reports what time it is at the Greenwich Observatory in Greenwich, England.

(13) You have enough time left in your life to visit the far side of the galaxy and return.

One philosophically interesting implication of time dilation in relativity theory is that in your lifetime, without using cryogenics, you have enough time to visit the far side of our Milky Way galaxy 100,000 light years away from Earth and then return to report on your adventure to your descendants many generations from now. As your spaceship approaches the speed of light, you can cross the galaxy in hardly any time at all, even though someone using the coordinate time of the standard Earth-based clock must judge that it took you over 100,000 years to cross the galaxy one-way. Both time judgments would be correct. The faster you move, the more time you have to visit new places. You cannot reach the cosmic speed limit of traveling at light speed, but the closer you get to that speed, the closer you get to experiencing no time at all (as measured by stationary clocks).
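
The traveler’s elapsed time shrinks by the Lorentz factor. Here is a small Python sketch, assuming a constant cruising speed and ignoring the acceleration and deceleration phases; the Lorentz factor of 2000 is an illustrative choice:

```python
import math

def proper_time_years(distance_ly, beta):
    """Traveler's elapsed time in years for a one-way trip of distance_ly
    light-years at constant speed beta = v/c, ignoring acceleration phases."""
    coordinate_time = distance_ly / beta          # years, Earth-frame judgment
    return coordinate_time * math.sqrt(1.0 - beta**2)

# Speed chosen so the Lorentz factor is 2000:
beta = math.sqrt(1.0 - 1.0 / 2000**2)
tau = proper_time_years(100_000, beta)
print(round(tau, 1))  # 50.0 traveler-years for a trip Earth judges to take over 100,000 years
```

Both numbers are correct in their own frames, which is the point of the paragraph above.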

(14) Time can warp when spacetime curves.

When time curves, clocks do not bend in space as if in a Salvador Dali painting. Instead, they undergo gravitational time dilation. According to general relativity, gravity is the curvature of four-dimensional spacetime even though there is no fifth dimension for it to curve into. This 4D curvature of spacetime is observed by detecting time dilation and space contraction. Choosing to say “three-dimensional space curves” expands the ordinary meaning of the word “curve,” and saying “spacetime curves” expands it even more, because the word “curve” normally indicates a change of one- or two-dimensional spatial direction, as when a hiking path curves to the right or the shape of an apple is curved and not flat. A two-dimensional sphere is positively curved everywhere on its surface, which is obvious to viewers in the three-dimensional world in which the sphere resides, but much less obvious to inhabitants confined to the two-dimensional surface.

Special relativity uses the word “curve” only in its ordinary sense. It allows three-dimensional physical objects to be curved or bent and to move in curved paths in space, and it allows this curvature to change over time, but it does not allow curvature of either space or spacetime. According to general relativity though, they both can curve. Gauss, Lobachevsky and Bolyai first suggested (independently of each other) that physical space could curve, and Einstein first suggested that spacetime could curve. This is actual curvature of the real spacetime we live within, not just a mathematical convenience for representing reality.

Does time curve? Yes, but it is more common for physicists to say time “warps.” Spacetime can curve, stretch, and ripple. It stretches when the universe expands in volume, and it ripples due to gravitational waves created by, say, two colliding black holes.

(15) All the fundamental laws are invariant under time-translation.

This means the fundamental laws of nature do not depend on what time it is, and they do not change as time goes by. Your health might change as time goes by, but the basic laws underlying your health that held last year are the same as those that hold today. This translation symmetry property of time is called its homogeneity. It expresses the equivalence of all instants. This feature of time can be expressed using the language of coordinate systems by saying that replacing the time variable t everywhere in a fundamental law by t + 4 does not change what processes are allowed by the law. The choice of 4 is arbitrary; any other real number would serve as well. Requiring the laws of physics to be time-translation symmetric was proposed by Isaac Newton. Physicists do not know a priori that laws must have this symmetry, but the assumption fits all the known evidence so far.
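
The symmetry can be illustrated numerically: take any solution of a time-independent law, replay it 4 time-units later, and it still satisfies the law. A sketch in Python using free fall under constant gravity as the law; the specific trajectory is an arbitrary illustrative choice:

```python
# Numerical sketch of time-translation symmetry: for a time-independent law
# (here x'' = -g), a solution shifted in time by 4 units is still a solution.
g = 9.8

def x(t):
    """One solution of x'' = -g: initial position 5, initial velocity 2."""
    return 5.0 + 2.0 * t - 0.5 * g * t**2

def second_derivative(f, t, h=1e-4):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

shifted = lambda t: x(t + 4)   # the same motion, replayed 4 time-units later

for t in (0.0, 1.0, 2.5):
    assert abs(second_derivative(x, t) + g) < 1e-3        # original obeys the law
    assert abs(second_derivative(shifted, t) + g) < 1e-3  # shifted copy obeys it too
print("both trajectories obey x'' = -g")
```

The law never mentions t by itself, only differences of times, so shifting the origin of the time axis changes nothing physical.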

One reason the principle of time-translation symmetry is not analytically true is that a remarkable theorem by Emmy Noether in 1915 established that time-translation symmetry implies the principle of conservation of energy, but that principle is considered to be empirical and not analytic.

(16) The fundamental physical laws are invariant under time-reversal.

This point about time-reversal symmetry can be expressed informally by saying that if you make a documentary film and show it in reverse, what you see may look very surprising or even impossible, but actually nothing shown violates a fundamental physical law. It may violate the second law of thermodynamics, but that law is not fundamental. It is derivable.

If the fundamental laws are time-reversal symmetric, this raises the interesting question of why all physical processes are seen by us to go in only one direction in time spontaneously, as if time has an intrinsic arrow. Eggs break but are never seen to un-break unless a human intervenes to put the pieces back together. Bullets explode but never un-explode. Heat flows spontaneously from hot to cold, never the other way. This issue is examined in the later section on the arrow of time.

For more about special relativity, see Special Relativity: Proper Times, Coordinate Systems, and Lorentz Transformations.

6. Time and Change (Relationism vs. Substantivalism)

Does physical time necessarily depend on change existing, or vice versa? Philosophers have been sharply divided on these issues, and any careful treatment of them requires clarifying the relevant terms being used. Even the apparent truism that change involves time is false if the terms are used improperly.

Let’s focus on whether time necessarily involves change. If it does, then what sort of change is required? For example, would time exist in a universe that does change but does not change in a regular enough manner to have a clock? Those who answer “yes” are quick to point out that there is a difference between not being able to measure some entity and that entity not existing. Those who answer “no” have sometimes said that if an entity cannot be measured, then the very concept of it is meaningless; not that it must be meaningless, as a Logical Positivist would declare, but only that it is as a matter of fact meaningless. The latter position is defended by Max Tegmark in (Tegmark 2017).

Classical relationists claim that time necessarily involves change, and classical substantivalists say it does not. Substantivalism (also called substantialism) implies that both space and time exist always and everywhere regardless of what else exists or changes. They say space and time provide a large, invisible, inert container within which matter exists and moves independently of the container. The container provides an absolute rest frame, and motion relative to that frame is real motion, not merely relative motion. Relationism (also called relationalism) implies space and time are not like this. It implies there is no container, so, if you take away matter’s motions, you take away time, and if you also take away the matter itself, you take away space.

Substantivalism is the thesis that space and time exist always and everywhere independently of physical material and its events.

Relationism is the thesis that space is only a set of relationships among existing physical material, and time is a set of relationships among the events of that physical material.

Relationism is inconsistent with substantivalism. Substantivalism implies there can be empty time, time without the existence of physical events. Relationism does not allow empty time. It is committed to the claim that time requires material change. That is, necessarily, if time exists, then change exists.

Everyone agrees that clocks do not function without change and that time cannot be measured without there being changes, but the present issue is whether time exists without changes. Can we solve this issue by testing? Could we, for instance, turn off all changes and then look to see whether time still exists? No, the issue has to be approached indirectly.

Relationists and substantivalists can agree that, perhaps as a matter of fact, change is pervasive and so is time. Their disagreement is whether time exists even if, perhaps contrary to fact, nothing is changing. This question of whether time requires change is not the question of whether change requires time, nor is it the question of whether time is fundamental.

To make progress, more clarity is needed regarding the word change. The meaning of the word is philosophically controversial. It is used here in the sense of ordinary change—an object changing its ordinary properties over time. For example, a leaf changes its location if it falls from a branch and lands on the ground. This ordinary change of location is very different from the following three extraordinary kinds of change. (1) The leaf changes by being no longer admired by Donald. (2) The leaf changes by moving farther into the past. (3) The leaf changes across space from being green at its base to brown at its tip, all at one time. So, a reader needs always to be alert about whether the word change means ordinary change or one of the extraordinary kinds of change.

There is a fourth kind of change that also is extraordinary. Consider what the word properties means when we say an object changes its properties over time. When referring to ordinary change of properties, the word properties is intended to exclude what Nelson Goodman called grue-like properties. Let us define an object to be grue if and only if, during the time that it exists, it is green before the beginning of the year 1888 but is blue thereafter. With this definition, we can conclude that the world’s chlorophyll underwent a change from grue to non-grue in 1888. We naturally would react to drawing this conclusion by saying that this change in chlorophyll is very odd, not an ordinary change in the chlorophyll, surely nothing that would be helpful to the science of biology.

Classical substantival theories are also called absolute theories. The term absolute here implies existing without dependence on anything except perhaps God. The relationist, on the other hand, believes time’s existence depends upon material events.

Many centuries ago, the manifest image of time was relationist, but due to the influence of Isaac Newton upon the teaching of science in subsequent centuries and then this impact upon the average person who is not a scientist, the manifest image has become substantivalist.

a. History of the Debate from Aristotle to Kant

Mario Bunge encapsulated the history of relationism vs. substantivalism this way:

The idea that time is the pace of events [namely, is relational] was adumbrated by Plato [51], discussed by Aristotle [1], sung by Lucretius [43], worked out by Augustine [2] and reinvented by Leibniz [41], Mach [45], and a few others. Unfortunately none of these relationists proposed a theory (= hypothetico-deductive system) of time. Consequently the idea that time is “a measure of motion” (Aristotle) and “an order of successions” (Leibniz) remained nearly as half-baked and metaphorical as its rival, the absolutist view that time “of itself, and from its own nature, flows equably without relation to anything external” (Newton).

Let’s unpack some of the more important points in this history.

Aristotle had said, “neither does time exist without change” (Physics, Book IV, chapter 11, page 218b). This claim about time is often called Aristotle’s Principle. In this sense he was Leibniz’s predecessor, although Leibniz’s relationism contains not only Aristotle’s negative element that there is no changeless time but also a positive element that describes what time is. In opposition to Aristotle on this topic, Democritus spoke of there being an existing space within which matter’s atoms move, implying space is substance-like rather than relational. So, the ancient Greek atomists were predecessors of Newton on this topic.

The battle lines between substantivalism and relationism were drawn more clearly in the early 18th century when Leibniz argued for relationism and Newton argued against it. Leibniz claimed that space is a network of objects. It is nothing but the “order of co-existing things,” so without objects there is no space. “I hold space to be something merely relative, as time is; …I hold it to be an order of coexistences, as time is an order of successions.” Leibniz would say time is abstracted from changes of things, namely events, with the paradigm kind of change being motion. Expressed more technically, we can say Leibniz’s relational world is one in which spatial relationships are ontologically prior to space itself, and relationships among changes (or events) are ontologically prior to time itself. This position of Leibniz’s can be summarized as his saying time is a relational order of successions of events. This is the positive element in Leibniz’s relationism. The typical succession-relationships Leibniz is talking about here are that this event happens two minutes before that event, and these two other events are simultaneous. If asked what a specific time is, a modern Leibnizian would be apt to say a time is a set of simultaneous events.

Opposing Leibniz, Isaac Barrow and his student Isaac Newton returned to a Democritus-like view of space as existing independently of material things; and they similarly accepted a substantival theory of time, with time existing independently of all motions and other events. Newton’s actual equations of motion and his law of gravity are consistent with both relationism and substantivalism, although this point was not clear at the time to either Leibniz or Newton.

In 1670 in his Lectiones Geometricae, the English physicist Isaac Barrow rejected any necessary linkage between time and change. He said, “Whether things run or stand still, whether we sleep or wake, time flows in its even tenor.” Barrow also said time existed even before God created the matter in the universe. Newton agreed. In Newton’s unpublished manuscript De gravitatione, written while he was composing Principia, he said, “we cannot think that space does not exist just as we cannot think there is no duration” (Newton 1962, p. 26). This suggests that he believed time exists necessarily, and this idea may have influenced Kant’s position that time is an a priori condition of all appearance whatsoever.

Newton believed time is not a primary substance, but is like a primary substance in not being dependent on anything except God. For Newton, God chose some instant of pre-existing time at which to create the physical world. From these initial conditions, including the forces acting on the material objects, the timeless scientific laws took over and guided the material objects, with God intervening only occasionally to perform miracles. If it were not for God’s intervention, the future would be a logical consequence of the present.

Leibniz objected. He was suspicious of Newton’s substantival time because it is undetectable, which, he supposed, made the concept incoherent. Leibniz argued that time should not be understood as an entity existing independently of actual, detectable events. He complained that Newton had under-emphasized the fact that time necessarily involves an ordering of events, the “successive order of things,” such as one event happening two seconds after another or four weeks before another. This is why time needs events, so to speak. Leibniz added that this overall order is time.

It is clear that Leibniz and Newton had very different answers to the question, “Given some event, what does it mean to say it occurs at a specific time?” Newton would say events occur at some absolute time that is independent of what other events do or do not occur, but Leibniz would say we can properly speak only about the event occurring before or after or simultaneous with some other events, and that is what it means to occur at a specific time. Leibniz and Newton had a similar disagreement about space. Newton believed objects had absolute locations that need no reference to other objects’ locations, but Leibniz believed objects can be located only via spatial relations to other material objects—by an object being located above or below or three feet from another object.

One of Leibniz’s criticisms of Newton’s theory is that it violates Leibniz’s Law of the Identity of Indiscernibles: If two things or situations cannot be discerned by their different properties, then they are really identical; they are just one and not two. Newton’s absolute theory violates this law, Leibniz said, because it implies that if God had shifted the entire world some distance east and its history some minutes earlier, yet changed no properties of the objects nor relationships among the objects, then this would have been a different world—what metaphysicians call an ontologically distinct state of affairs. Leibniz claimed there would be no discernible difference in the two, so there would be just one world here, not two, and so Newton’s theory of absolute space and time is faulty. This argument is called “Leibniz’s shift argument.”

Regarding the shift argument, Newton suggested that, although Leibniz’s a priori Principle of the Identity of Indiscernibles is correct, God is able to discern differences in absolute time or space that mere mortals cannot.

Leibniz offered another criticism. Newton’s theory violates Leibniz’s a priori Principle of Sufficient Reason: that there is a sufficient reason why any aspect of the universe is the way it is and not some other way. Leibniz complained that, since everything happens for a reason, if God shifted the world in time or space but made no other changes, then He surely would have no reason to do so.

Newton responded that Leibniz is correct to accept the Principle of Sufficient Reason but is incorrect to suppose there is a sufficient reason knowable to humans. God might have had His own reason for creating the universe at a given absolute place and time even though mere mortals cannot comprehend His reason.

Newton later admitted to friends that his two-part theological response to Leibniz was weak. Historians of philosophy generally agree that if Newton had said no more, he would have lost the debate.

Newton, through correspondence from his friend Clarke to Leibniz, did criticize Leibniz by saying, “the order of things succeeding each other in time is not time itself, for they may succeed each other faster or slower in the same order of succession but not in the same time.” Leibniz probably should have paid more attention to just what this remark might imply. However, Newton soon found another clever and clearer argument, one that had a much greater impact at the time. He suggested a thought experiment in which a bucket’s handle is tied to a rope hanging down from a tree branch. Partially fill the bucket with water, grasp the bucket, and, without spilling any water, rotate it many times until the rope is twisted. Do not let go of the bucket. When everything quiets down, the water surface is flat and there is no relative motion between the bucket and its water. That is situation 1. Now let go of the bucket, and let it spin until there is once again no relative motion between the bucket and its water. At this time, the bucket is spinning, and there is a concave curvature of the water surface. That is situation 2.

How can a relational theory explain the difference in the shape of the water’s surface in the two situations? It cannot, said Newton. Here is his argument. If we ignore our hands, the rope, the tree, and the rest of the universe, says Newton, each situation is simply a bucket with still water; the situations appear to differ only in the shape of their water surface. A relationist such as Leibniz cannot account for the change in shape. Newton said that even though Leibniz’s theory could not be used to explain the difference in shape, his own theory could. He said that when the bucket is not spinning, there is no motion relative to space itself, that is, to absolute space; but, when it is spinning, there is motion relative to space itself, and so space itself must be exerting a force on the water to make the concave shape. This force pushing away from the center of the bucket is called centrifugal force, and its presence is a way to detect absolute space.

Because Leibniz had no counter to this thought experiment, for over two centuries Newton’s absolute theory of space and time was generally accepted by European scientists and philosophers, with the notable exceptions of Locke in England and d’Alembert in France.

One hundred years later, Kant entered the arena on the side of Newton. Consider two nearly identical gloves except that one is right-handed and the other is left-handed. In a world containing only a right-hand glove, said Kant, Leibniz’s theory could not account for its handedness because all the internal relationships among parts of the glove would be the same as in a world containing only a left-hand glove. However, intuitively we all know that there is a real difference between a right and a left glove, so this difference can only be due to the glove’s relationship to space itself. But if there is a space itself, then the absolute or substantival theory of space is better than the relational theory. This indirectly suggests that the absolute theory of time is better, too.

Newton’s theory of time was dominant in the 18th and 19th centuries, even though Christiaan Huygens (in the 17th century) and George Berkeley (in the 18th century) had argued in favor of Leibniz. See (Huggett 1999) and (Arthur 2014) for a clear and more detailed discussion of the opposing positions of Leibniz and Newton on the nature of time.

b. History of the Debate after Kant

Leibniz’s criticisms of Newton’s substantivalism are clear enough, but the positive element of Leibniz’s relationism is vague. It lacked specifics: Leibniz assumed uncritically that his method of abstracting duration from change is unique, yet he never defended this uniqueness assumption. That is, what exactly is it about the relationship of objects and their events that produces time and not something else? Nor did Leibniz address the issue of how to define the duration between two arbitrarily chosen events. In the twentieth century, Einstein argued successfully that duration is not unique, but is relative. Appreciating his argument has affected the debates about substantivalism and relationism.

Newton and subsequent substantivalists hoped to find a new substance for defining absolute motion without having to appeal to the existence and location of ordinary material objects. In the late 19th century, the substantivalists discovered a candidate for absolute space. It was James Clerk Maxwell’s luminiferous aether, the medium that waves when there is a light wave. Maxwell had discovered that light is an electromagnetic wave. Since all then-known waves required a medium to wave, all physicists and philosophers of science at the time believed Maxwell when he said the aether was needed as a medium for the propagation of electromagnetic waves and also when he said that it definitely did exist even if it had never been directly detected. Yet this was Maxwell’s intuition speaking; his own equations did not require a medium for the propagation.

In the nineteenth century, physicists assumed the Earth was rushing through the aether, thereby creating a continual aether wind. Late in the century, the physicist A. A. Michelson and his chemist colleague Edward Morley set out to experimentally detect the wind, and thus the aether. Their interferometer experiment was very sensitive, but somehow it failed to detect an aether even though it was at the time the most sensitive experiment in the history of physics. Some physicists, including Michelson himself, believed the problem was that he needed a better experimental apparatus. Other physicists believed that the aether was somehow corrupting the apparatus. Most others, however, believed the physicist George Stokes, who had said the Earth drags the aether with it, so the Earth’s nearby aether is moving in concert with the Earth itself. If so, this would make the aether undetectable by the Michelson-Morley experimental apparatus, as long as the apparatus was used on Earth and not in outer space. No significant physicist said there was no aether to be detected.

However, these ad hoc rescues of the aether hypothesis did not last long. In 1893, the physicist-philosopher Ernst Mach, who had such a powerful influence on Albert Einstein, offered an original argument that attacked Newton’s bucket argument, promoted relationism, and did not assume the existence of absolute space (the aether) or absolute time. Absolute time, said Mach, “is an idle metaphysical conception.” Mach claimed Newton’s error was in not considering the presence or absence of stars or, more specifically, not considering the combined gravitational influence of all the matter in the universe beyond the bucket. That is what was curving the water surface in the bucket when the water was spinning.

To explore Mach’s argument, consider a ballet dancer who pirouettes in otherwise empty space. Would her arms have to splay out from her body in this thought experiment? And if we were to spin Newton’s bucket of water in otherwise empty space, would the presence of absolute space eventually cause the surface of the water to become concave? Leibniz would answer “no.” Newton would answer “yes.” Mach would say the question makes no sense because the very notion of spin must be spin relative to some object. Mach would add that, if the distant stars were retained, then there would be spin relative to them, and he would change his answers to “yes.” Newton believed the presence or absence of the distant stars is irrelevant to the situations with a spinning ballet dancer and a spinning bucket of water. Unfortunately, Mach did not provide any detailed specification of how the distant stars exert their influence on Newton’s bucket or a dancer, nor did he suggest an experiment to test his answer, and nearly all physicists and philosophers of physics were unconvinced by his reasoning. So, the prevailing orthodoxy was that Newton’s substantivalism is correct. However, a young physicist named Albert Einstein was very intrigued by Mach’s remarks. He at first thought Mach was correct, and even wrote him a letter saying so, but he eventually rejected Mach’s position and took an original, relationist position on the issue.

In 1905, he proposed his special theory of relativity that does not require the existence of either Newton’s absolute space or Maxwell’s aether. Ten years later he added a description of gravity and produced his general theory of relativity, which had the same implication. The theory was immediately understood by the leading physicists, and, when experimentally confirmed, it caused the physics and philosophy communities to abandon classical substantivalism. The tide quickly turned against what Newton had said in his Principia, namely that “Time exists in and of itself and flows equably without reference to anything external.” Influenced by relativity theory, the philosopher Bertrand Russell became an articulate promoter of relationism in the early twentieth century.

Waxing philosophical in The New York Times newspaper in 1919, Einstein declared his general relativity theory to be a victory for relationism:

Till now it was believed that time and space existed by themselves, even if there was nothing—no Sun, no Earth, no stars—while now we know that time and space are not the vessel for the Universe, but could not exist at all if there were no contents, namely, no Sun, no Earth, and other celestial bodies.

Those remarks show Einstein believed in relationism at this time. However, in his Nobel Prize acceptance speech three years later in 1922, Einstein backtracked on this and took a more substantivalist position on the classical debate between relationists and substantivalists by saying time and space could continue to exist without the Sun, Earth, and other celestial bodies. He claimed that, although relativity theory does rule out Maxwell’s aether and Newton’s absolute space, it does not rule out some other underlying substance that is pervasive. All that is required is that, if such a substance exists, then it must obey the principles of the theory of relativity. Soon he was saying this substance is spacetime itself, a field whose intrinsic curvature is what we call gravitational force. With this position, he is a non-Newtonian, non-Maxwellian substantivalist. Rejecting classical substantivalism, Einstein said that spacetime, “does not claim an existence of its own, but only as a structural quality of the [gravitational] field.”

This pro-substantivalist position was subsequently strengthened by the 1998 experimental discovery of dark energy, which eventually was interpreted as indicating that space itself has inertia and is expanding. Because spacetime itself can curve, can expand in volume (as the universe grows), and can ripple (when gravitational waves pass by), it has the properties one commonly associates with a medium, and the pro-substantivalist position became the most popular position in the 21st century. Nevertheless, there are interesting challenges, and the issue is open.

Quantum theory provides another reason to accept substantivalism. The assumption of Leibniz and Newton that fundamentally there are particles in space and time buffeted about by forces was rejected with the rise of quantum mechanics in the twentieth century. It became clear that fields are better candidates for the fundamental entities of the universe. Physicists influenced by logical positivism once worried that perhaps Einstein’s gravitational field, and all other fields, are merely computational devices without independent reality. However, ever since the demise of logical positivism and the development and confirmation of quantum electrodynamics in the late twentieth century, fields have been considered to be real by both physicists and philosophers. Because quantum theory implies a field does not go away even if the field’s values reach a minimum everywhere, the gravitational field is considered to be substance-like, but it is a substance that changes with the distribution of matter-energy throughout the universe, so it is very unlike Newton’s absolute space or Maxwell’s aether. The philosophers John Earman and John Norton have called this position (of promoting the substance-like character of the gravitational field) manifold substantivalism. In response, the philosopher of physics Tim Maudlin said: “The question is: Why should any serious substantivalist settle on manifold substantivalism? What would recommend that view? Prima facie it seems like a peculiar position to hold” because the manifold has no spatiotemporal structure (Maudlin 1988, p. 87).

Since the late twentieth century, philosophers have continued to create new arguments for and against substantivalism, so the issue is still open. Nevertheless, many other scientists and philosophers have suggested that the rise of quantum theory has so changed the concepts in the Newton-Leibniz debate that the old issue cannot be settled either way.

For additional discussion of substantivalism and relationism, see (Dainton 2010, chapter 21).

7. Is There a Beginning or End to Time?

This section surveys some of the principal, well-informed speculations about the beginning and end of time. The emphasis should be on "speculations" because there are hundreds of competing ideas about the beginning and end of the universe and of time, and none of the ideas are necessary to explain any actual observations. Also, almost all of them are flexible enough that they could be made to accommodate any new data that needed an explanation. We may never know the answers to these questions, despite being better informed about them than our predecessors were. One cautionary note is that researchers sometimes speak of time existing before the beginning of the universe, so perhaps what they mean by the word "universe" is not as comprehensive as what others mean. Also, researchers sometimes speak of the creation of a universe from the physicists' quantum vacuum and call this creation ex nihilo, but a quantum vacuum is not nothing.

a. The Beginning

Many persons have argued that the way to show there must have been a first event is to show that time has a finite past.  But this is a mistake. The universe can have a finite past but no first event. This point is illustrated with the positive real numbers. All positive real numbers before five (that is, less than five and greater than zero) have predecessors, but there is no first number in this series. For any positive real number in the series, there is a smaller one without there being a smallest one.
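The point can be made concrete with a small computational sketch (the specific numbers are illustrative only): pick any candidate "first moment" in the open interval (0, 5); halving it always yields a strictly earlier moment that is still inside the interval, so no moment is first.

```python
# Sketch: a finite past, modeled as the open interval (0, 5), with no first moment.
# For any moment x in (0, 5), x/2 is a strictly earlier moment still inside (0, 5).
def earlier_moment(x):
    """Return a moment strictly earlier than x, still inside (0, 5)."""
    assert 0 < x < 5
    return x / 2

moment = 4.9
history = [moment]
for _ in range(20):              # we can step backward as many times as we like
    moment = earlier_moment(moment)
    history.append(moment)

assert all(0 < m < 5 for m in history)   # every moment lies inside the interval
assert all(later > earlier for later, earlier in zip(history, history[1:]))
```

The loop never fails: the interval has finite length, yet each moment has an earlier predecessor inside it.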

Many theologians are confident that there was a beginning to time, but there is no agreement among cosmologists that there ever was a beginning.

Relativity theory and quantum theory both allow time to be infinite in the future and the past. Thus any restrictions on time's extent must come from other sources. Regarding the beginning of time, some cosmologists believe the universe began with a big bang 13.8 billion years ago. This is the t = 0 of cosmic time used by professional cosmologists. The main controversy is whether t = 0 is really the beginning. The fact that you cannot imagine there being no time before the big bang does not imply there actually is such an earlier time, although this style of argument might have been acceptable to the ancient Greek philosophers. The mathematical physicist Stephen Hawking once famously quipped that asking what happened before the big bang is like asking what is north of the north pole. He later retracted that remark and said it is an open question whether there was a time before the big bang, but he slightly favored a yes answer.

Even if there were a time before the big bang began, the question would remain as to whether the extent of this prior time is finite or infinite, and there is no consensus on that question either.

The big bounce theory of cosmology says the small, expanding volume of the universe 13.8 billion years ago was the effect of a prior multi-billion-year compression that, when the universe became small enough, stopped its compression and began a rapid expansion that we have been calling the big bang. Perhaps there have been repetitions of compression followed by expansion, and perhaps these cycles have been occurring forever and will continue occurring forever. This is the theory of a cyclic universe.

Cosmologist J. Richard Gott speculated that time began in an unusual process in which the universe came from itself by a process of backward causation as allowed by the theory of general relativity. At the beginning of the universe there was a closed time-like loop that lasted for 10^-44 seconds during which the universe caused its own existence. Past time is not eternal according to this theory. The loop was a beginning of time without a first event. See (Gott 2002) for an informal presentation of the idea.

b. The End

The cosmologists' favorite scenario for the universe's destiny implies that all stars burn out, all black holes eventually evaporate, all mass is gone, and the remaining particles of radiation get ever farther from each other, with no end to the dilution and cooling while the ripples of space-time become weaker. Long before then, there will be a time when the last thought occurs. This scenario, called the big chill, the big freeze, and also the heat death, depends upon assuming the total energy of the universe is not zero, which is a controversial assumption.

Here is a summary of some serious, competing suggestions by twenty-first-century cosmologists about our universe’s future, beginning with the most popular one:

  • Big Chill—Heat Death (Expansion of space at an ever-increasing rate.) A potentially infinite future.
  • Big Crunch (The universe is closed; eventually the expansion stops somehow; and the universe begins contracting to a final compressed state as if the big bang is now running in reverse.) A finite future.
  • Big Bounce. (Eternal pattern of cycles of expansion, then compression, then expansion, then compression, and so forth. One version implies there are repeated returns to a microscopic volume with each being followed by a new big bang). An infinite future.
  • Cycles without Crunches (While the universe expands, the observable part of the universe can oscillate between expansions and contractions with a big bounce separating a contraction from the next expansion.) An infinite future.
  • Big Rip (Dark energy runs wild. The expansion rate driven by dark energy is not fixed by a Cosmological Constant but instead is variable, and its value increases exponentially toward infinity. As this happens, every complex system that interacts gravitationally is eventually pushed apart—first groups of galaxies, then galaxies, later the planets, then all the molecules, and within about 188,000,000 years even the fabric of space itself.) A finite future.
  • Big Snap (The fabric of space suddenly reveals a lethal granular nature when stretched too much, and it "snaps" like an overly stretched rubber band breaking.) A finite future.
  • Death Bubble (Due to some high energy event such as the creation of a tiny black hole with a size never created before, our metastable Higgs field suddenly changes its value from the current false vacuum value to the more stable true vacuum value. The energy of the vacuum decay that this collapse creates appears as a 3D bubble with no inside that expands at nearly the speed of light while destroying everything in its path that has structure. Not expected to occur until 10^100 years from now, but possibly could occur tomorrow.) A finite future.
  • Mirror Universe. (Before the big bang, time runs in reverse. Both the big bang's before-region and after-region emerge from a tiny situation at cosmic time t = 0 in which the apexes of their two light cones meet. The two regions are almost mirror images of each other.) There are versions with a finite future and finite past and with an infinite future and infinite past.

The Big Crunch was the most popular theory among cosmologists until the 1960s. In this theory, the universe would continue its present expansion for about three billion more years until the inward pull due to the mutual gravitation among all the universe’s matter-energy overcame the expansion, thereby causing a subsequent seven billion years of contraction until everything becomes compressed together into a black hole.

See (Mack 2020) and (Hossenfelder 2022, chapter two) for a presentation by two cosmologists of the many competing theories about the beginning and the end of time and of the universe.

c. Historical Answers

There has been much speculation over the centuries about the extent of the past and the future, although almost all remarks have contained serious ambiguities. For example, regarding the end of time, is this (a) the end of humanity, or (b) the end of life, or (c) the end of the universe that was created by God, but not counting God, or (d) the end of all natural and supernatural change? Intimately related to these questions are two others: (1) Is it being assumed that time exists without change, and (2) what is meant by the term change? With these cautions in mind, here is a brief summary of speculations throughout the centuries about whether time has a beginning or an end.

Regarding the beginning of time, the Roman atomist Lucretius in about 50 B.C.E. said in his poem De Rerum Natura:

For surely the atoms did not hold council, assigning order to each, flexing their keen minds with questions of place and motion and who goes where.

But shuffled and jumbled in many ways, in the course of endless time they are buffeted, driven along chancing upon all motions, combinations.

At last they fall into such an arrangement as would create this universe.

The implication is that time has always existed, but that an organized universe began a finite time ago with a random fluctuation.

Plato and Aristotle, both of whom were opponents of the atomists, agreed with them that the past is infinite or eternal. Aristotle offered two reasons. Time had no beginning because, for any time, we always can imagine an earlier time. In addition, time had no beginning because everything in the world has a prior, efficient cause. In the fifth century, Augustine disagreed with Aristotle and said time itself came into existence by an act of God a finite time ago, but God, himself, does not exist in time. This is a cryptic answer because it is not based on a well-justified and detailed theory of who God is, how He caused the big bang, and how He can exist but not be in time. It is also difficult to understand St. Augustine’s remark that “time itself was made by God.” On the other hand, for a person of faith, belief in their God is usually stronger than belief in any scientific hypothesis, or in any desire for scientific justification of their remark about God, or in the importance of satisfying any philosopher’s demand for clarification.

Agreeing with Augustine against Aristotle, Martin Luther estimated the universe to have begun in 4,000 B.C.E. Then Johannes Kepler estimated that it began in 4,004 B.C.E. In the early seventeenth century, the Calvinist James Ussher calculated from the Bible that the world began in 4,004 B.C.E. on Friday, October 28.

Advances in geology eventually refuted the low estimates that the universe was created in about 4,000 B.C.E.

In about 1700, Isaac Newton claimed future time is infinite and that, although God created the material world some finite time ago, there was an infinite period of past time before that, as Lucretius had also claimed.

Twenty-first century astronomers say the universe is at least as old as the big bang, which began about 13.8 billion years ago.

For more discussion of the issue of the extent of time, see the companion section Infinite Time.

8. Emergence of Time

To ask whether time emerges is to ask where it comes from, not how it changes over time. Is physical time emergent, or is it instead a fundamental feature of nature? That is, is it basic, elementary, not derivative, or does it emerge at a higher level of description from more basic timeless features? Does spacetime emerge as well? Experts are not sure of the answers, although a slight majority favor spacetime not being emergent, and nearly all members of this majority favor the position that time emerges from spacetime. Among those who do think spacetime is emergent, the most favored candidate for what it emerges from is the quantum wave function, and in particular quantum entanglement. (Entanglement is a matter of degree.) Those who favor spacetime being emergent from an underlying non-spatiotemporal substrate usually say its dimensionality is emergent as well.

The word emerge has been used in different ways in the literature of philosophy. Some persons define emergence as the whole being greater than the sum of its parts. There are better, less vague definitions. The word “emerge” in this article is intended to indicate the appearance of an objective or mind-independent feature of nature, not a psychological feature or a feature of our knowledge. When we ask whether time emerges, the notion of being emergent does not imply being inexplicable, and it does not imply that there is a process occurring over time in which something appears that was not there before the process began such as an oak tree emerging from an acorn. Still, being emergent is less strong than being reducible. The philosopher Daniel Dennett helpfully recommends treating an emergent entity as a pattern that has an explanatory and predictive role in the theory positing the entity, but it is a pattern at a higher level. Emergence is about the level (a.k.a. scale or order or detail) of the description of phenomena. Information is lost as one moves to higher levels, but the move to a higher level can reveal real patterns and promote understanding of nature that would never be noticed by focusing only on the fundamental level. As Sean Carroll explains it:

To say that something is emergent is to say that it’s part of an approximate description of reality that is valid at a certain (usually macroscopic) level, and is to be contrasted with “fundamental” things, which are part of an exact description at the microscopic level….Fundamental versus emergent is one distinction, and real versus not-real is a completely separate one (Carroll 2019, p. 235).

Believing time will be coarse-grained or emergent in a future, successful theory of quantum gravity, theoretical astrophysicist Carroll says, “Time is just an approximation….” Carlo Rovelli agrees:

Space-time is…an approximation. In the elementary grammar of the world, there is neither space nor time—only processes that transform physical quantities from one to another…. At the most fundamental level that we currently know of,…there is little that resembles time as we experience it. There is no special variable "time," there is no difference between past and future, there is no spacetime (Rovelli 2018, p. 195).

Let's pause here and say a bit more about emergence. Heat emerges from molecular motion even though no molecule is hot. What makes heat emergent in the sense that time might be emergent is that there can be no change in the heat without a corresponding change in the underlying molecular motion. For another example, we properly and usefully speak of hunger causing a person to visit the supermarket without bothering to consider how such talk cashes out in terms of constituent particles. In that sense, a person is just an approximation. The point of saying a new concept emerges at a higher level is not simply that lower-level information is lost in using higher-level concepts. Instead, the point is to make use of higher-level patterns that are useful in creating explanations and that could not easily be appreciated by using only lower-level concepts. The aim is to find especially useful patterns at the higher level that improve our describing, explaining, and understanding of nature. The point is not to reduce sentences about persons to sentences about particles.

Any claim that time emerges should say whether this emergence is weak or strong (in the sense defined by the philosopher Mark Bedau). Weak emergence is about new features supervening upon more basic features but not existing at that more basic level. (A supervenes on B if changes in A require there to be changes in B. For example, temperature supervenes on molecular motion because the temperature of an object cannot change without there being changes in the object's molecular motions, but at the basic level, no molecule has a temperature.) As a practical matter, it is rare that a higher-level concept is explicitly derived from a lower-level concept even when it can be in principle. Strong emergence denies the supervenience and emphasizes the independence of the emergent concept from a lower level. Physicists favor weak emergence over strong emergence for their topics of interest, but the notion of strong emergence is perhaps more applicable when we say the behavior of a nation emerges from the behavior of its citizens.

An important philosophical issue is to decide which level is the fundamental one. Being fundamental is relative to the speaker’s purpose. Biologists and physicists have different purposes. To a biologist, the hunger causing you to visit the supermarket emerges from the fundamental level of cellular activity. But to a physicist, the level of cellular activity is not fundamental but rather emerges from the more fundamental level of elementary particle activity which in turn emerges from fluctuations in elementary quantum fields.

Does time emerge from spacetime? Special relativity definitely implies it does. But in speculations about a theory of quantum gravity, the issue is being re-examined. Some physicists speculate that early in the big bang period there were an infinite number of dimensions of space. As the universe expanded and cooled, these eventually collapsed into four dimensions of space and none of time. Then this collapsed so that one of the space dimensions disappeared as the time dimension emerged, leaving our current four-dimensional spacetime. (This description seems to imply that there was time before time began, but that is a problem with the English language and not with what is intended by the description.) Other physicists speculate, instead, that time is fundamental, but spacetime is what emerges. In 2004, after winning the Nobel Prize in physics, David Gross expressed that viewpoint. While speaking about string theory, which is his favored theory for reconciling the inconsistency between quantum theory and the general theory of relativity, he said:

Everyone in string theory is convinced…that spacetime is doomed. But we don’t know what it’s replaced by. We have an enormous amount of evidence that space is doomed. We even have examples, mathematically well-defined examples, where space is an emergent concept…. But in my opinion the tough problem that has not yet been faced up to at all is, “How do we imagine a dynamical theory of physics in which time is emergent?” …All the examples we have do not have an emergent time. They have emergent space but not time. It is very hard for me to imagine a formulation of physics without time as a primary concept because physics is typically thought of as predicting the future given the past. We have unitary time evolution. How could we have a theory of physics where we start with something in which time is never mentioned?

By doomed, Gross means not fundamental but only emergent. Perhaps half of all physicists working in the field of quantum gravity in the first quarter of the 21st century suspect that resolving the inconsistency between quantum theory and gravitational theory will require forcing both spacetime and time to emerge from some more basic timeless substrate at or below the scale of the Planck length of about 10^-35 meters and the Planck time of about 10^-43 seconds. The physicist Stephen Wolfram believes the atoms of time have a duration of only 10^-100 seconds. This is the time the universe needs to update itself to the next state.

According to Wolfram, time is the progression of the universe's computations. All physical change is a computation, he believes. He envisions the fundamental entities in the universe as a finite collection of atoms of space (nodes) that have connections (directed arrows) to others and to collections of others; the mathematical structure here is called a spatial hypergraph. The totality of space atoms is perhaps 10^400, and time is the progressive rewriting of the hypergraph about every 10^-100 seconds. The rewriting occurs by applying the same rule throughout the hypergraph. The rule, call it rule R, says that wherever within the hypergraph a particular simple pattern of nodes and connections occurs, that pattern is updated to another specified pattern. So there is quite a bit of parallel processing going on throughout the universe as it rewrites itself everywhere every 10^-100 seconds. Unfortunately, Wolfram does not yet know rule R, but he is confident there is such a rule to be found, that it will be simple, and that it will lead to a universe that approximately has three-plus-one-dimensional spacetime, approximately obeys general relativity and quantum theory, and appears to be continuous when not viewed too finely. So time, space, matter, fields, and all of science's laws emerge from patterns in the underlying hypergraph. In Wolfram's theory, space and time are very different from each other, unlike in relativity theory.
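Wolfram's actual rule R is unknown, but the flavor of hypergraph rewriting can be sketched with a toy rule of the kind used in his project's expository material, {{x, y}} → {{x, y}, {y, z}} with z a freshly created node. Everything below is an illustrative assumption, not the real model:

```python
# Toy hypergraph rewriting: every edge present at the start of a step is kept,
# and a fresh node is attached to its second endpoint (a parallel update,
# applied to all matches of the pattern at once).
def rewrite_step(edges, next_node):
    new_edges = []
    for x, y in edges:
        new_edges.append((x, y))           # the matched pattern {{x, y}} is kept
        new_edges.append((y, next_node))   # ...and extended with a fresh node z
        next_node += 1
    return new_edges, next_node

edges, fresh = [(0, 1)], 2                 # start from a single directed edge
for _ in range(5):
    edges, fresh = rewrite_step(edges, fresh)

print(len(edges))                          # the edge count doubles each step: 32
```

Each pass rewrites every matched pattern simultaneously, which is the sense in which, on this picture, the universe "updates itself" everywhere at once, one tick of time per rewrite.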

The physicist Carlo Rovelli, a proponent of loop quantum gravity rather than string theory, has a suggestion for what the fundamental level is from which time emerges. It is a configuration of loops. He speculated: “At the fundamental level, the world is a collection of events not ordered in time” (Rovelli 2018a, p. 155). Rovelli is re-imagining the relationship between time and change. Nevertheless, at the macroscopic level, he would say time does exist even though it is not a fundamental feature of reality. In string theory, the strings that compose all elementary particles exist within a background of spacetime. But within loop quantum gravity this is not the case. Instead, spacetime emerges from a configuration of loops, analogous to the way a vest of chainmail emerges from a properly connected set of tiny circular chain links.

Eliminativism is the theory in ontology that says emergent entities are unreal. On this view, if time is emergent, it is not real. Similarly, if pain is emergent, it is not real—and so no person has really felt a pain. The theory is also called strong emergentism. The opposite and more popular position in ontology is anti-eliminativism or weak emergence. It implies that emergent entities are real despite being emergent. The English physicist Julian Barbour is an eliminativist and strong emergentist about time. He said the "universe is static. Nothing happens; there is being but no becoming. The flow of time and motion are illusions" (Barbour 2009, p. 1). He argued that, although there does exist objectively an infinity of individual, instantaneous moments, nevertheless there is no objective happens-before ordering of them, no objective time order. There is just a vast, jumbled heap of moments. Each moment is an instantaneous configuration (relative to one reference frame) of all the objects in space. Like a photograph, a moment or configuration contains information about change, but it, itself, does not change. If the universe is as Barbour describes, then space (the relative spatial relationships within a configuration) is ontologically fundamental and a continuum, but time is neither. In this way, time is removed from the foundations of physics and emerges as some general measure of the differences among the existing spatial configurations. For more on Barbour's position, see (Smolin 2013, pp. 84-88).

Sean Carroll has a different, original idea about time. He is not an eliminativist, but is a weak emergentist who claims in (Carroll 2019) that time and everything else in the universe emerges from the universe’s wave function in a “gravitized quantum theory.” The only fundamental entity in the universe is the wave function. Everything else that is real emerges from the wave function that obeys Schrödinger’s equation. This position gives a physical interpretation of the wave function. Carroll says neither time, space, nor even spacetime is fundamental. These features emerge from the quantum wave function. So, spacetime is merely an approximation to reality.

Stephen Wolfram suggested time emerges from the progressive rewriting of the universe’s hypergraph.

Another suggestion is that whether time is emergent may not have a unique answer. Perhaps time is relative to a characterization of nature. That is, perhaps there are alternative, but empirically adequate theoretical characterizations of nature, yet time is fundamental in one characterization but emergent in another. This idea is influenced by Quine’s ontological relativity.

For more description of the different, detailed speculations on whether time is among the fundamental constituents of reality, see (Merali 2013) and (Rovelli 2018b).

9. Convention

Time has both conventional and non-conventional features. The clearest way to specify the conventional elements in a theory would be by axiomatizing it, but there is no such precise theory of time.

The duration of the second is a conventional feature involving time. Our society could have chosen it to be longer or shorter. It is a convention that there are sixty seconds in a minute rather than sixty-six, and that no week fails to contain a Tuesday.

Here is a non-conventional feature. In a single reference frame, if event 1 happens before event 2, and event 2 happens before event 3, then event 1 also happens before event 3. No exceptions. This transitivity of the happens-before relation in any single reference frame is a general feature of time, not a convention. However, it is a contingent feature, not an essential feature. It is believed because no one has ever seen evidence that transitivity is violated, and there are no reputable theories implying that there should be such evidence.

The issue here is conventional vs. factual, not conventional vs. foolish or impractical. Although the term convention is somewhat vague, conventions are up to us to freely adopt and are not objective features of the external world that we are forced to accept if we seek the truth. Conventions are inventions or artificial features as opposed to being natural or mandatory or factual. It is a convention that the English word green means green, but it is not a convention that the color of normal, healthy leaves is green. Conventions need not be arbitrary; they can be useful or have other pragmatic virtues. Nevertheless, if a feature is conventional, then there must in some sense be reasonable alternative conventions that could have been adopted. Also, conventions can be explicit or implicit. For one last caution, conventions can become recognized as having been facts. The assumption that matter is composed of atoms was a useful convention in late nineteenth century physics; but, after Einstein's explanation of Brownian motion in terms of atoms, the convention was generally recognized by physicists as having been a fact all along.

Time in physics is measured with real numbers (decimal numbers) rather than imaginary numbers (such as the square root of negative one). Does this reveal a deep feature of time? No, it is simply a convention.

It is a useful convention that, in order to keep future midnights from occurring during the daylight, clocks are reset by one hour as one moves across a time zone on the Earth's surface—and that is also why leap days and leap seconds are used. The minor adjustments with leap seconds are required because the Earth's rotations and revolutions are not exactly regular. For political and social reasons, time zones do not always have longitudes for boundaries. For similar reasons, some geographical regions use daylight saving time instead of standard time.

Consider the ordinary way a clock is used to measure how long a nearby event lasts. We adopt the following metric, or method: Take the time at which the event ends, say 5:00, and subtract the time at which it starts, say the previous 3:00. The metric procedure says to take the absolute value of the difference between the two numbers and get the answer of two hours. Is the use of this method merely a convention, or in some objective sense is it the only way that a clock could and should be used? That is, is there an objective metric, or is time metrically amorphous? Philosophers of physics do not agree on this. Perhaps the duration between instants x and y could be:

|log(y/x)|

instead of the ordinary:

|y – x|.

A virtue of both metrics is that duration cannot be negative. Note that the log metric is just as additive as the standard metric: for any three point events x, y, and z with positive time coordinates, if t(x) < t(y) < t(z), then the duration from x to y plus the duration from y to z equals the duration from x to z, because log(t(y)/t(x)) + log(t(z)/t(y)) = log(t(z)/t(x)). Where the log metric differs is that it is undefined when a time coordinate is zero or negative, and it assigns different durations to intervals of equal coordinate length: the interval from 1 to 2 has duration log 2, while the interval from 2 to 3 has duration only log 1.5. The philosophical issue is whether anything other than convenience settles which metric's verdicts about congruent intervals are correct.
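A quick numerical check of the two candidate metrics (the coordinate values are arbitrary) shows that both are additive for ordered positive instants, while they disagree about which intervals are congruent:

```python
import math

def standard_duration(t1, t2):
    """The ordinary metric: the absolute difference of the time coordinates."""
    return abs(t2 - t1)

def log_duration(t1, t2):
    """The alternative metric; defined only for positive time coordinates."""
    return abs(math.log(t2 / t1))

# Additivity: duration(x, y) + duration(y, z) == duration(x, z) for x < y < z.
assert math.isclose(standard_duration(1, 2) + standard_duration(2, 4),
                    standard_duration(1, 4))
assert math.isclose(log_duration(1, 2) + log_duration(2, 4),
                    log_duration(1, 4))

# Congruence: the intervals [1, 2] and [2, 3] are equally long by the standard
# metric, but not by the log metric (log 2 versus log 1.5).
assert standard_duration(1, 2) == standard_duration(2, 3)
assert not math.isclose(log_duration(1, 2), log_duration(2, 3))
```

So the two metrics cannot be told apart by additivity alone; they differ over which pairs of intervals count as having the same duration.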

Our civilization designs a clock to count up to higher numbers rather than down to lower numbers as time elapses. Is that a convention? Yes. In fact, when Westerners talk about past centuries, they agree to use both A.D. and B.C.E. A clock measuring B.C.E. periods would count toward lower numbers. The clock on today’s wall always counts up, but that is merely because it is agreed we are in the A.D. era, so there is no need for a clock to count in B.C.E. time. The choice of the origin of the time coordinate is conventional, too. It might be a Muhammad event or a Jesus event or a Temple event or the big bang event.

It is an interesting fact and not a convention that our universe is even capable of having a standard clock that measures both electromagnetic events and gravitational events and that electromagnetic time stays in synchrony with gravitational time.

It is a fact and not a convention that our universe contains a wide variety of phenomena that are sufficiently regular in their ticking to serve as clocks. They are sufficiently regular because they tick in adequate synchrony with the standard clock. The word adequate here means successful for the purposes we have for using a clock.

Physicists regularly assume they may use the concept of a point of continuous time. They might say some event happened the square root of three seconds after another event. Physicists usually uncritically accept a point of time as being real-valued, but philosophers of physics disagree with each other about whether this is merely a useful convention. Is time's being a continuum in, say, quantum mechanics a fact or just a convention that should be eliminated in a better treatment of time? Experts disagree, although most favor the latter position.

Our society’s standard clock tells everyone what time it really is. Can our standard clock be inaccurate? “Yes,” say the objectivists about the standard clock. “No,” say the conventionalists who claim the standard clock is accurate only by convention; if it acts strangely, then all other clocks must act equally strangely in order to stay in synchrony with the standard clock. For an example of strangeness, suppose our standard clock used the periodic rotations of the Earth relative to the background stars. In that case, if a comet struck Earth and affected the rotational speed of the Earth (as judged by, say, a pendulum clock), then we would be forced to say the rotation speed of the Earth did not really change but rather the other periodic clock-like phenomena such as swinging pendulums and quartz crystal oscillations all changed in unison because of the comet strike. The comet “broke” those clocks. That would be a strange conclusion to draw, and in fact, for just this reason, 21st century physicists have rejected any standard clock that is based on Earth rotations and have chosen a newer standard clock that is based on atomic phenomena. Atomic phenomena are unaffected by comet strikes.

A good choice of a standard clock makes the application of physics much simpler. A closely related philosophical question about the choice of the standard clock is whether, when we change our standard clock, we are merely adopting constitutive conventions for our convenience, or in some objective sense we can be making a choice that is closer to being correct. For more on this point, see this article’s Frequently Asked Questions.

The special theory of relativity is widely believed to imply that the notion of now or the present is conventional in the following sense. Here is a two-dimensional Minkowski diagram of space and time displaying this feature:

[Minkowski diagram showing the light cones and the absolute elsewhere]

The light cone of your future is the region above the gray area; the past light cone is the region below the gray area. The diagonal straight lines are the worldlines of light rays reaching and leaving you here now. The gray areas of this block universe represent all the events (in sense 1 of the term “event”) that could be classified either way, as your future events or as your past events; and all this depends upon someone’s choice of what line within the gray area will be the line of your present. The gray areas represent all the events that could neither cause, nor be caused by, your being here now. More technical ways of saying this are that the gray area is all events that are space-like for you, or that are in your absolute elsewhere, or that constitute your extended present. Two events are time-like separated from each other if they could possibly have affected each other. If a pair of events is time-like separated, then they cannot also be space-like separated. Light cones are not frame-relative; they are absolute and objective. Also, this structure of space-time holds not just for you; every point-event has its own unique pair of light cones.

The gray region of space-like events is called the extended present because, if you were defining an x-axis of this diagram in order to represent your present events, then you would have great latitude of choice. You could place the line that is the spatial axis anywhere in the gray area; but, in order to avoid ambiguity, once it is chosen it stays there for all uses of the coordinate system; it cannot change its angle. For example, suppose two point-events represented as a and b in the diagram both occur in the Andromeda Galaxy. That galaxy is 2,000,000 light-years away from you, assuming you are now on Earth. Even if event b were to occur a million years after a, you (or whoever is in charge of setting up the axes of the coordinate system you are using) are free to choose either event as happening now in that galaxy, and you also are free to choose any intermediate event there. But you are not free to choose an event in a white area because that would violate relativity theory’s requirements about causality. One implication of this argument is that relativity theory implies there is no fact of the matter as to what is happening at present in the Andromeda Galaxy. What is happening there now is frame-relative.
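The classification of separations described above can be sketched numerically. The following is a minimal illustration in units where the speed of light c = 1; the function name and sample events are our own, not drawn from any standard library.

```python
# Classify the separation between two point-events in 1+1-dimensional
# Minkowski spacetime, using units in which the speed of light c = 1.
def separation(dt, dx):
    """Return 'time-like', 'space-like', or 'light-like' for two events
    separated by dt in time and dx in space (c = 1)."""
    interval = dx**2 - dt**2  # spacetime interval with signature (-, +)
    if interval < 0:
        return "time-like"    # one event could have affected the other
    if interval > 0:
        return "space-like"   # each is in the other's absolute elsewhere
    return "light-like"       # connectible only by a light ray

# Two years apart in time but one light-year apart in space: causally connectible.
print(separation(2, 1))            # time-like
# Andromeda-style case: one year apart in time, 2,000,000 light-years apart in space.
print(separation(1, 2_000_000))    # space-like
```

Because the interval’s sign is the same in every reference frame, this classification, like the light cones themselves, is absolute rather than frame-relative.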

The above discussion about time-order is often expressed more succinctly by physicists by saying the time-order of space-like events is conventional and not absolute. For more on this issue, see the discussion of the relativity of simultaneity.

Well, perhaps this point should be made more cautiously by saying that special relativity implies the relativity of simultaneity for non-local events. Some philosophers believe there is a fact of the matter, a unique present, even if special relativity does not recognize the fact.

How do we know the speed of light is the same in all directions? Is this a fact, or is it a convention? This is a controversial issue in the philosophy of physics. Einstein claimed it was a convention, but the philosophers B. Ellis and P. Bowman in 1967, and D. Malament in 1977, gave different reasons why Einstein is mistaken. For an introduction to this dispute, see the Frequently Asked Questions.

Additional conventional and non-conventional features of time are discussed in the supplement What Else Science Requires of Time.

10. Arguments That Time Is Not Real

We can see a clock, but we cannot see time, so how do we know whether time is real—that it exists? Someone might think that time is real because it is what clocks are designed to measure, and because there certainly are clocks. The trouble with this reasoning is that it is analogous to saying that unicorns are real because unicorn hunters intend to find unicorns, and because there certainly are unicorn hunters.

Early twentieth-century philosophers of science argued that spacetime is real because it is a necessary ingredient in Einstein’s general theory of relativity, which is well-confirmed. The theory is needed to correctly describe observations. But if, as most physicists say, to be real is to be frame-independent, then time is not real. This insight into the nature of time was first promoted by Hermann Minkowski soon after his student Albert Einstein created the special theory of relativity. Because energy, distance, and mass are also different in different reference frames, they, too, are not real. The requirement that to be real is to be frame-independent is not a logical truth, nor a result of observation; it is a plausible metaphysical assumption that so far has the support of almost all physicists and most philosophers of physics.

Let’s consider some other arguments against the reality of time that have appeared in the philosophical literature. The logical positivist Rudolf Carnap said, “The external questions of the reality of physical space and physical time are pseudo-questions” (“Empiricism, Semantics, and Ontology,” 1950). He meant these two questions are meaningless because there is no way to empirically verify their answers one way or the other. Subsequent philosophers have generally disagreed with Carnap and have taken these metaphysical questions seriously.

Here are other reasons for the unreality of time. Time is unreal because (i) it is emergent, or (ii) it is subjective, or (iii) it is merely conventional (only a mathematical construct), or (iv) it is defined inconsistently, or (v) its scientific image deviates too much from its commonsense image. The five are explored below, in order.

i. Because Time is Emergent

Time emerges from spacetime in relativity theory, as Minkowski first showed. This implies time is not fundamental; but some philosophers of time go further and say it also implies time is not real. Similarly, Arthur Eddington and Peter van Inwagen have argued that tables and chairs are not real because they emerge from arrangements of elementary particles, and it is only these particles and their arrangements that are real. The analogous position in the philosophy of mind is called eliminative materialism. It implies that because the physical facts fix all the facts and because future science will show that common mental states such as beliefs and hopes have no essential role in a successful explanation of mental and physical phenomena, there are no such entities as beliefs and hopes. You really do not believe anything nor hope for anything.

Suppose time does emerge from spacetime, or events, or the quantum gravitational field, or something else. Does this imply time is not real? Most scientists and philosophers of time will answer “no” for the following reasons. Scientists once were surprised to learn that heat emerges from the motion of molecules and that a molecule itself has no heat. Would it not have been a mistake to conclude from this that heat is unreal and nothing is warm? And when it became clear that baseballs are basically a collection of atoms, and so baseballs can be said to emerge from atoms, would it not have been a mistake to say this implies baseballs no longer exist? After all, baseballs are real patterns of fundamental objects and events. Also, the concept of time is already known to be so extremely useful at the scales of quarks and molecules and mountains and galaxies that it is real at least at all those scales. The compatibility of time’s not existing below the Planck scale with its existing above that scale is somewhat analogous to the compatibility of free will’s not existing at the scale of molecular activity with its existing at the scale of human behavior.

ii. Because Time is Subjective

Psychological time is clearly subjective, but the focus now is on physical time. Any organism’s sense of time is subjective, but is the time that is sensed also subjective? Well, first what does subjective mean? This is a notoriously controversial term in philosophy. Here it means that a phenomenon is subjective if it is a mind-dependent phenomenon, something that depends upon being represented by a mind. A secondary quality such as being red is a subjective quality, but being capable of reflecting the light of a certain wavelength is not subjective. The same point can be made by asking whether time comes just from us or instead is wholly out there in the external world independent of us. Throughout history, philosophers of time have disagreed on the answer. Without minds, nothing in the world would be surprising or beautiful or interesting. Can we add that nothing would be in time? If so, time is not objective, and so is not objectively real.

Aristotle envisioned time to be a counting of motions (Physics, IV.ch11.219b2), but he also asked the question of whether the existence of time requires the existence of mind. He does not answer his own question because he says it depends on whether time is the conscious numbering of movement or instead is just the capability of movements to be numbered were consciousness to exist.

St. Augustine clearly adopted a subjectivist position regarding time, and said time is nothing in reality but exists only in the mind’s apprehension of that reality.

Several variants of idealism have implied that time is not real. Kant’s idealism implies that objective time, the time of things-in-themselves, if there even are such things, is unknowable, and so is in that sense unreal. The post-Kantian German idealists (Fichte, Schelling, Hegel) argued that the problem is not that time is unknowable but that all reality is based wholly upon minds, so objective time is unreal. It cannot be a feature of, or part of, reality.

Here are some comments against the above arguments and for the reality of objective time. First, notice that a clock can tick in synchrony with other clocks even when no one is paying attention to the clocks. Second, notice how useful the concept of time is in making such good sense of our evidence involving change, persistence, and succession of events. Consider succession. This is the order of events in time. If judgments of time order were subjective in the way judgments of being interesting vs. not-interesting are subjective, then it would be too miraculous that everyone can so easily agree on the temporal ordering of so many pairs of events: birth before death, the acorn sprouts before the oak tree appears, houses are built before they are painted. W. V. O. Quine might add that the character of the objective world with all its patterns is a theoretical entity in a grand inference to the best explanation of the data of our experiences, and the result of this inference tells us that the world is an entity containing an objective time, a time that gets detected by us mentally as psychological time and gets detected by our clocks as physical time.

iii. Because Time is Merely Conventional or Only a Mathematical Construct

One might argue that time is not real because the concept of time is just a mathematical artifact in our fundamental theories of mathematical physics. It is merely playing an auxiliary mathematical role. Similarly, the infinite curvature of space at the center of a black hole is generally considered to be merely an artifact of the mathematics used by the general theory of relativity but not to exist in reality.

Or one might argue as follows. Philosophers generally agree that humans invented the concept of time, but some philosophers argue that time itself is invented. It was created as a useful convention, like when we decided to use certain coin-shaped metal objects as money. Money is culturally real but not objectively real because it would disappear if human culture were to disappear, even if the coin-shaped objects were not to disappear. Money and gold both exist, but money’s existence depends upon social relations and conventions that gold’s existence does not depend upon. Is time’s existence more like money than gold in that regard?

Although it would be inconvenient to do so, our society could eliminate money and return to barter transactions. Analogously, Callender asks us to consider the question, “Who Needs Time Anyway?”

Time is a way to describe the pace of motion or change, such as the speed of a light wave, how fast a heart beats, or how frequently a planet spins…but these processes could be related directly to one another without making reference to time. Earth: 108,000 beats per rotation. Light: 240,000 kilometers per beat. Thus, some physicists argue that time is a common currency, making the world easier to describe but having no independent existence (Callender 2010, p. 63).
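Callender’s quoted figures can in fact be composed without ever mentioning time. The arithmetic below is our own illustration of his point, using only the numbers in the quotation (heartbeats per Earth rotation and kilometers of light travel per heartbeat).

```python
# Relate two processes directly to one another, with no time variable,
# using the figures quoted from Callender (2010).
beats_per_rotation = 108_000   # heartbeats per rotation of the Earth
km_per_beat = 240_000          # kilometers light travels per heartbeat

# Eliminating the heartbeat as the "common currency" yields a direct
# light-to-Earth relation: kilometers of light travel per Earth rotation.
km_per_rotation = km_per_beat * beats_per_rotation
print(f"Light travels {km_per_rotation:.3e} km per Earth rotation")
```

The result, about 2.6 × 10^10 km, matches what one would get the conventional way (roughly 86,400 seconds per rotation times 300,000 km/s), which is Callender’s point: the time variable is an eliminable intermediary.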

In 1905, the French physicist Henri Poincaré argued that time is not a feature of reality to be discovered, but rather is something we have invented for our convenience. He said possible empirical tests cannot determine very much about time, so he recommended the convention of adopting whatever concept of time that makes for the simplest laws of physics. Nevertheless, he said, time is otherwise wholly conventional, not objective.

There are two primary reasons to believe time is not merely conventional: First, there are so many one-way processes in nature. For example, mixing cold milk into hot, black coffee produces lukewarm, brown coffee, but agitations of lukewarm, brown coffee have never turned it back into hot black coffee with cool milk. The process goes only one way in time.

Second, our universe has so many periodic processes whose periods are constant multiples of each other over time. That is, their periods keep the same constant ratio to each other. For example, the frequency of rotation of the Earth around its axis relative to the “fixed” stars is a constant multiple of the frequency of swings of a fixed-length pendulum, which in turn is a constant multiple of the half-life of a specific radioactive uranium isotope, which in turn is a constant multiple of the frequency of a vibrating quartz crystal, which in turn is a constant multiple of the frequency of a light beam emitted from a specific kind of atomic process used in an atomic clock. The relationships do not change as time goes by—at least not much and not for a long time, and when there is deviation we know how to predict it and compensate for it. The existence of these sorts of constant time relationships—which cannot be changed by convention—makes our system of physical laws much simpler than it otherwise would be, and it makes us more confident that there is some convention-free, natural kind of entity that we are referring to with the time-variable in those physical laws—despite the fact that time is very abstract and not something we can see, taste, or touch.

iv. Because Time is Defined Inconsistently

Bothered by the contradictions they claimed to find in our concept of time, Parmenides, Zeno, Spinoza, Hegel, and McTaggart said time is not real.

Plato’s classical interpretation of Zeno’s paradoxes is that they demonstrate the unreality of any motion or any other change. If the existence of time requires the existence of change, then Zeno’s paradoxes would also overturn the Greek common-sense belief that time exists.

The early 20th-century English philosopher J.M.E. McTaggart believed he had a convincing argument that a single event acquires the properties of being a future event, a present event, and also a past event, and that since these are contrary properties, our concept of time is inconsistent. It follows for McTaggart that time is not real.

The early 20th-century absolute-idealist philosopher F.H. Bradley claimed, “Time, like space, has most evidently proved not to be real, but a contradictory appearance…. The problem of change defies solution.”

Regarding the inconsistencies in our concept of time that Zeno, McTaggart, Bradley, and others claim to have revealed, most philosophers of time say that there is no inconsistency, and that the complaints can be handled by clarification or by revising the relevant concepts. For example, Zeno’s paradoxes were solved by requiring time to be a linear continuum like a segment of the real number line. This solution was very fruitful and not ad hoc.  It would be unfair to call it a change of subject.

v. Because Scientific Time is Too Unlike Ordinary Time

If you believe that for time to exist it needs to have certain features of the commonsense image of time, but you believe that science implies time does not have those features, you might be tempted to conclude that science has really discovered that time does not exist. In the mid 20th century the logician Kurt Gödel argued for the unreality of time as described by contemporary physical science because the equations of the general theory of relativity allow for physically possible universes in which all events precede themselves. People can “travel into any region of the past, present, and future and back again” (Gödel, 1959, pp. 560-1). It should not even be possible for time to be circular or symmetric like this, Gödel believed, so he concluded that, if we suppose time is the time described by relativity theory, then time is not real.

Regarding the claim that commonsense time is not treated fairly by the science of time, there is no consensus about which particular features of commonsense time cannot be rejected, although not all of them can be rejected or else we would be changing the subject and not talking about time. But science has not required us to reject our belief that some events happen in time before other events, nor has science required us to reject our belief that some events last for a while. Gödel’s complaint about relativity theory’s allowing for circular time has been treated by the majority of physicists and philosophers of time by saying he should accept that time might possibly be circular even though as a contingent matter it is not circular in our universe, and he needs to revise his intuitions about what is essential to the concept.

vi. Conclusion

The general consensus among physicists and the majority position among philosophers of science is that spacetime is real and that time is real within any specified reference frame. There, the word time refers to a real, existing entity because it is so helpful for explaining, understanding, and predicting so many phenomena, and there are no alternative, better ways of doing this. It is still an open question among physicists and philosophers as to whether time exists below the Planck scale, the so-called ultramicroscopic scale. The presumption in the remainder of this article is that time definitely does exist at scales above the Planck scale, that the concept is objective rather than subjective, that it is not primarily conventional or a mathematical artifact [except that any particular splitting of spacetime into a time part and a space part is conventional], that any inconsistency in time’s description or definition is merely apparent (or is not essential and can be eliminated), and that time is real regardless of whether it is emergent.

11. Time Travel

Would you like to travel to the future and read about the history of your great-grandchildren? You can do it. Nothing in principle is stopping you except some financial difficulties and a better-engineered spaceship that can survive occasional collisions with rocks and dust in space. Would you like to travel, instead, to the past? You may have regrets and wish to make some changes. Unfortunately, travel to your own past is not as easy as travel to someone else’s future.

Travel in time was seriously discussed in Western philosophy only after 1949 when the logician Kurt Gödel published a solution to the equations of the general theory of relativity that allows travel to the past. He discovered that relativity theory implies some exotic distributions of matter and energy will curve spacetime enough to form loops along which, as you continue to travel forward in your own proper time, you arrive back to past events. There is no requirement that anything moves faster than the speed of light.

The term time travel has now become a technical term. It means travel in physical time, not psychological time. You do not time travel if you merely dream of living in the past, although neuroscientists commonly do call this “mental time travel.” You do not time travel for five minutes simply by being alive for five minutes. You do not time travel by crossing a time zone, nor do you travel in time by having your body frozen, then thawed later, even if this does extend your lifetime.

Time travel to the future presupposes the metaphysical theory of eternalism because, if you travel to the future, there must be a future that you travel to. Presentism and the growing-past theory deny the existence of this future.

In 1976, the Princeton University metaphysician David Lewis offered this technical definition of time travel:

In any case of physical time travel, the traveler’s journey as judged by a correct clock attached to the traveler takes a different amount of time than the journey does as judged by a correct clock of someone who does not take the journey.

The implication from this definition is that time travel occurs when correct clocks get out of synchronization. If you are the traveler, your personal time (technically called your proper time) is shown on the clock that travels with you. A person not taking the journey is said to be measuring external time. This external time could be their proper time, or it could be the proper time of our civilization’s standard clock.

Lewis’s definition is widely accepted, although it has been criticized occasionally in the philosophical literature. The definition has no implications about whether, if you travel forward in external time to the year 2376 or backward to 1776, you can suddenly pop into existence then as opposed to having traveled continuously during the intervening years. Continuity is required by scientific theory, but discontinuous travel is more popular in fictional books and films.

Here is a diagram of a traveler’s sudden appearance back in external time showing her death before her birth.

[Diagram of a discontinuous time traveler’s worldline: her death occurs before her birth in external time]

External time, such as our civilization’s standard time, is represented in the diagram with equally-spaced point-times t1, t2, t3, t4, t5 and with t1 < t2 < t3 < t4 < t5. External time increases to the right. Perhaps each of these times differs from the next by ten years. The traveler’s personal time or proper time is represented with the capital letter “T”, and it begins when she is born at T1, where the world at T1 is also the world at t4. Later she steps into a time machine at T2, when the external time of her friend is t5. The machine abruptly transports her back to the way the world was at t1, but her personal time is still T2. T1 < T2; in her personal time, T1 is ten years earlier than T2. Later she dies at T3, when the external time is t2. Thus, according to external time, she dies before she is born. No such thing happened in her personal time because T1 < T2 < T3. This scenario is a case of backward causation, in which events causally influence earlier events, but it is not a case of time reversing nor of personal time going forward while events are experienced in reverse. It is a case of discontinuous time travel.
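The two orderings in the diagram can be written out explicitly. The numerical values below are illustrative assumptions consistent with the diagram’s description (each external time differs from the next by ten years); the variable names are our own.

```python
# External times (in years), equally spaced and increasing to the right:
t1, t2, t3, t4, t5 = 0, 10, 20, 30, 40

# Key moments in the traveler's life, as pairs (proper time T, external time t):
birth  = (0,  t4)   # born at T1, when the world is at t4
depart = (10, t5)   # steps into the time machine at T2, external time t5
death  = (20, t2)   # dies at T3, in a situation where external time is t2

# Her personal (proper) time always increases: T1 < T2 < T3.
assert birth[0] < depart[0] < death[0]
# Yet judged by external time, she dies before she is born:
assert death[1] < birth[1]
```

The two assertions together capture why the scenario is coherent: the apparent contradiction of dying before being born arises only when the two distinct time orderings are conflated.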

Suppose you are the time traveler’s friend mentioned above, and you meet her at t1. Your clock shows t1, but hers shows t5. So, meeting someone whose clock runs ahead of yours could be a sign that you have just met a time traveler from the future. But before leaping to this conclusion, you should first rule out mundane explanations, such as that her clock is not working well, or she is lying, and so forth.

a. To the Future

Time travel to the future does occur very frequently, and it has been observed and measured by scientists. Time travel to the past is much more controversial, and experts disagree with each other about whether it is even physically possible. Relativity theory implies there are two different kinds of time travel to the future: (1) two clocks getting out of synchrony due to their moving relative to each other, and (2) two clocks getting out of synchrony due to their encountering different gravitational forces.

When you travel to the future, you eventually arrive at some future event having taken less time on your clock than the non-travelers do on their clocks. To them, you zipped through time; to you, they were sluggish. It is all relative. You might travel to the future in the sense that you participate in an event ten years in their future, having taken only two years according to your own clock. That would be an eight-year leap forward in time. You can be continuously observed from Earth’s telescopes during your voyage to that event. However, the astronomers on Earth would notice that you turned the pages in your monthly calendar very slowly. The rate of ticking of your clock would differ from that of their clock during the flight. Reversing your velocity and traveling back to the place you began the trip will not undo this effect.
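The eight-year leap just described can be computed from the special-relativistic time-dilation relation, proper time = external time × √(1 − (v/c)²). The sketch below solves for the cruising speed required; it assumes a constant speed throughout and ignores acceleration phases, so it is an idealization rather than a flight plan, and the function name is our own.

```python
import math

def required_speed(proper_years, external_years):
    """Fraction of the speed of light at which a trip lasting external_years
    on the stay-at-home clocks lasts only proper_years on the traveler's
    clock, assuming constant speed: tau = t * sqrt(1 - (v/c)**2)."""
    return math.sqrt(1 - (proper_years / external_years) ** 2)

# Age only two years while ten years pass on Earth:
v = required_speed(2, 10)
print(f"v = {v:.4f} c")   # about 0.98 c
```

Solving the same equation for smaller leaps shows why everyday travel produces no noticeable effect: at ordinary speeds the factor √(1 − (v/c)²) differs from 1 only in roughly the fifteenth decimal place.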

If you do travel to the future, that is, their future,  then you never get biologically younger; you simply age more slowly than those who do not travel with you. So, it is not possible to travel into the future and learn about your own death.

Any motion produces time travel to the future, relative to the clocks of those who do not move. That is why you can legitimately advertise any bicycle as being a time machine. The faster you go the sooner you get to the part of the future you desire but the more easily the dust and other particles in space will slice through your body during the trip.

The second kind of future time travel is due, not to a speed difference between two clocks, but to a difference in the strength of the gravitational field on two clocks. This is called gravitational time dilation, and it is most noticeable near a source of extreme gravitation such as a black hole. If you were to leave Earth and orbit near a black hole, your friends back on Earth might view you continuously through their telescopes and, if so, would see you live in slow motion. When you returned, your clock would show that less time had expired on your clock than on their clock that remained on Earth. Similarly, in a tall building the lower-floor clocks tick more slowly than upper-floor clocks because the lower floor is in a stronger gravitational field, all other things being equal. There is no theoretical limit to how slow a clock can tick when it undergoes time dilation, but it would never tick in reverse.
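Gravitational time dilation outside a spherical mass is governed by the Schwarzschild factor dτ/dt = √(1 − 2GM/(rc²)). The sketch below evaluates it for a clock on Earth’s surface and for a clock hovering near a black hole; the function name and the ten-solar-mass example are our own, and the formula applies only outside the event horizon (r > 2GM/c²).

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8     # speed of light, m/s

def dilation_factor(mass_kg, r_m):
    """Ticking rate of a clock at radius r from a mass, relative to a clock
    far away (Schwarzschild approximation, valid only above the horizon)."""
    return math.sqrt(1 - 2 * G * mass_kg / (r_m * c**2))

# A clock on Earth's surface ticks imperceptibly slower than a distant clock:
earth = dilation_factor(5.972e24, 6.371e6)

# A clock hovering at twice the Schwarzschild radius of a ten-solar-mass
# black hole ticks noticeably slower:
M_bh = 10 * 1.989e30
r_s = 2 * G * M_bh / c**2          # Schwarzschild radius
near_hole = dilation_factor(M_bh, 2 * r_s)

print(f"Earth surface:   {earth:.12f}")   # extremely close to 1
print(f"Near black hole: {near_hole:.3f}")  # sqrt(1/2), about 0.707
```

At r = 2r_s the factor is exactly √(1/2) regardless of the mass, which illustrates the claim in the text that there is no theoretical lower limit on the ticking rate: as r approaches the Schwarzschild radius the factor approaches zero, yet it never becomes negative, so the clock never ticks in reverse.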

Travelers to the future can participate in that future, not just view it. They can influence the future and affect it.

One philosophical controversy is whether travelers to the future also can change the future. According to David Lewis (Lewis 1976, 150), this is impossible. If it changed, then it was not really the future after all. He argued that no action changes the future, regardless of whether time travel is involved.

Suppose you were to encounter a man today who says that yesterday he lived next door to Isaac Newton in England in the year 1700, but now he has traveled to the future and met you. According to the theory of relativity, it is physically possible that he did this. Yet it is an extraordinary claim since you undoubtedly believe that sufficiently fast spaceships or access to extraordinarily high gravitational fields were not available to anyone in 1700. And it is unlikely that history books failed to mention this if it did occur. Epistemology tells us that extraordinary claims require extraordinarily good evidence, so the burden of proof is on the strange man to produce that evidence, such as a good explanation of how the secret of building spaceships was discovered but kept from the public in 1700 and from later historians. You also would like to be shown that his body today contains traces of the kind of atmosphere that existed back in 1700; that atmosphere is slightly different chemically from ours today. If he cannot or will not produce the evidence, then it is much more likely that he is deluded or is simply lying. Giving him a lie detector test will not be very helpful; you want to know what is true, not merely what he believes to be true.

b. To the Past

Philosophers ask, “What do we mean by travel to the past?” You do not travel to the past when you remember your youth nor when you imagine living in an earlier century. A telescope is a window into our past. If we use it to look out into space to some region R a million light-years away, we are seeing R as it was a million years ago. However, this is looking at the past, not being in the past.

The general theory of relativity seems to imply that travel to your past is physically possible, although experts on relativity theory are not in agreement on this. If it is possible, then either past travel exists now or else it might be created anew by manipulation of matter and energy to create the required curvature of spacetime. There are no known examples of past time travel, and it seems to be extremely difficult to create the required, bizarre curvature.

So, when two people come up to you, and one says, “I’m from your past,” and the second says, “I’m from your future,” you know that the second person is more apt to be lying. This is because time travel to one’s own past is so much more difficult than to someone’s future. Philosophers of physics are not in agreement about how much more difficult. Some, such as Tim Maudlin, say it is so difficult that it is not physically possible, but others say it is only technologically difficult. There is agreement, though, that time travel to the past has not yet been observed.

Science must obey logic, which implies that in a single world there is a consistent story of what has happened and will happen, despite the fact that novels about time travel frequently describe traveling back to remake the past and thereby to produce a new version of reality that is inconsistent with the earlier version.

At present, we are existing in the past of future people (assuming humanity does have a future), but we are not time-traveling into their past. What optimists about past time travel hope is that it is possible to travel into our own past here on Earth. This is impossible according to Newton’s physics and impossible according to Einstein’s special theory of relativity, but it may be possible according to Einstein’s general theory of relativity, although experts are not in agreement on this point despite much study of the issue. The equations are simply too complicated, even for experts. The experts also are sure that the present equations will need revision to remove the current inconsistency between the theories of relativity and quantum mechanics. Many of these experts (for example, Frank Wilczek) suggest that travel to the past is not allowed in any physically possible universe, and the closest one can come to time travel to the past is to travel to a new branch of the universe’s quantum wave function, which implies, for some experts, traveling to a parallel universe. All the experts agree that, even if the equations do allow some possible universes to contain travel to one’s own past via the creation of a time machine, they do not allow travel to a time before the creation of the first time machine.

Shortly after Einstein published his general theory of relativity, the physicist Hermann Weyl predicted that the theory allows time travel to the past. However, his claim was not studied carefully until Kurt Gödel’s work on relativity in 1949. Gödel claimed his work demonstrated time travel must exist in a certain universe having a non-zero total angular momentum. Gödel was able to convince Einstein of this, but experts on relativity are not in agreement on whether Einstein should have been convinced. Opponents of Gödel say he discovered a mathematical curiosity, not a physical possibility. Still others say that, even if relativity does allow travel to the past, the theory should be revised to prevent this. Other opponents of the possibility of time travel to the past hope that an ad hoc restriction is not needed and instead that relativity theory will be understood more clearly so it can be seen that it does rule out past time travel. Still other opponents of time travel to the past hope an as yet unknown physical law will be discovered that rules out travel to the past. However, defenders of time travel say we should bite the bullet, accept that relativity allows time travel in some kinds of universes that have special curvatures, and trust the implications of relativity, but just assume that so far we probably do not live in a universe with the kind of curvature that is required.

Here is a pessimistic remark about time travel from J.J.C. Smart in The Journal of Philosophy in 1963:

Suppose it is agreed that I did not exist a hundred years ago. It is a contradiction to suppose that I can make a machine that will take me to a hundred years ago. Quite clearly no time machine can make it be that I both did and did not exist a hundred years ago.

Smart’s critics accuse him of the fallacy of begging the question. They wonder why he should demand that it be agreed that “I did not exist a hundred years ago.”

If general relativity does allow a universe that contains time travel to the past, this universe must contain a very special distribution of matter-energy. For an illustration of the simplest universe allowing backward time travel (in a one-dimensional space) and not being obviously inconsistent with general relativity, imagine a Minkowski two-dimensional spacetime diagram written on a square sheet of paper, with the one space dimension represented as going left and right on the page. Each point on the page represents a possible two-dimensional event. The time dimension points up and down the page, at right angles to the space dimension. The origin is at the center of the page. Now curve (bend) the page into a horizontal cylinder, parallel to the space axis so that the future meets the past. In the universe illustrated by that graph, any stationary object that persists long enough arrives into its past and becomes its earlier self. Its worldline is (topologically equivalent to) a circle; more technically, it is a closed time-like curve that is a circle. A closed curve has no end points. This cylindrical universe allows an event to occur both earlier and later than itself, so its time is not asymmetric. The curvature of this universe is what mathematicians call extrinsic curvature. There is no intrinsic curvature, however. Informally expressed, extrinsic curvature is curvature detectable only from a higher dimension, but intrinsic curvature can be detected by a being who lives within the space, say by noticing a failure of the Pythagorean Theorem somewhere. When the flat, square sheet is rolled into a tube, the intrinsic geometry does not change; only the extrinsic geometry changes.
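One way to make the rolled-up page concrete is to identify each time coordinate t with t + T, where T is the cylinder’s circumference in the time direction. Below is a toy sketch of this identification; the value T = 10 and the coordinate labels are illustrative assumptions, not anything dictated by relativity theory:

```python
T = 10.0  # assumed temporal circumference of the cylinder universe

def cylinder_time(t):
    """Map an unbounded time coordinate onto the closed time-like circle
    by identifying t with t + T (the rolled-up sheet of paper)."""
    return t % T

# A stationary object present at coordinate time 4 that persists until
# coordinate time 13 finds itself at cylinder-time 3, a point on the
# circle labeled earlier than its starting point: the worldline has
# closed on itself and the object has arrived in its own past.
print(cylinder_time(4.0))   # 4.0
print(cylinder_time(13.0))  # 3.0
```

Note that nothing about the modular identification changes local geometry, which is the informal content of the claim that the cylinder has only extrinsic, not intrinsic, curvature.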

Regardless of how space is curved and what sort of time travel occurs, if any past time travel does occur, the traveler apparently is never able to erase facts or otherwise change the past. That is the point of saying, “whatever happened, happened.” But that metaphysical position has been challenged. It assumes there is only one past and that whatever was the case will always have been the case. These assumptions, though widely accepted, occasionally have been challenged in the philosophical literature. They were challenged in the 11th century by Peter Damian who said God could change the past.

Assuming Damian is mistaken, if you do go back, you would already have been back there. For this reason, if you go back in time and try to kill your grandfather by shooting him before he conceived a child, you will fail no matter how hard you try. You will fail because you have failed. But nothing prevents your trying to kill him.

The impossibility of killing your grandfather seems to raise a problem about free will. If you are free to shoot and kill people before you step into a time machine, then presumably you are free to shoot and kill people after you step out. So, is there a paradox because you both can and cannot shoot and kill your grandfather?

Assuming you cannot shoot and kill your grandfather because you did not, many philosophers argue that in this situation you do not really have freedom in the libertarian sense of that term. To resolve this puzzle, the metaphysician David Lewis said you can in one sense kill your grandfather but cannot in another sense. You can, relative to a set of facts that does not include the fact that your grandfather survived to have children. You cannot, relative to a set of facts that does include this fact. However, Lewis said there is no sense in which you both can and cannot. So, the meaning of the word can is sensitive to context. The metaphysician Donald C. Williams disagreed, and argued that we always need to make our can-statement relative to all the available facts. Lewis is saying you can and can’t, but in different senses, and you can but won’t. Williams is saying simply that you can’t, so you won’t.

If you step into a time machine that projects you into the past, then you can expect to step out into a new place because time travel normally involves motion. There is an ongoing philosophical dispute about whether, in a real closed time-like curve, a person would travel to exactly an earlier event or, instead, only to a nearby event. One suggested reason for restricting the time-like curve to only nearby events is that, if one went back to the same event, one would bump into oneself, and this would happen over and over again, and there would be too many copies of oneself existing in the same place. Opponents of this suggestion say there would be no repetition.

If it is logically inconsistent to build a new time machine to travel back to a time before the first time machine was invented, then there is no hope of creating the first time machine in order to visit the time of the dinosaurs.

In 1988 in an influential physics journal, Kip Thorne and colleagues described the first example of how to build a time machine in a world that has never had one: “[I]f the laws of physics permit traversable wormholes, then they probably also permit such a wormhole to be transformed into a “time machine….” (Morris 1988, p. 1446).

A wormhole is a second route between two places; perhaps it is a shortcut tunnel to a faraway place. Just as two clocks get out of synchrony if one moves relative to the other, a clock near a rapidly moving mouth of a wormhole could get out of synch with a clock at the other, stationary mouth. In principle a person could plunge into one hole and come out at an earlier time. Wormholes were first conceived by Einstein and Rosen, and later were named wormholes by John Wheeler.
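The clock desynchronization just mentioned is ordinary special-relativistic time dilation: a clock moving at speed v records proper time τ = t√(1 − v²/c²) while a stationary clock records t. Here is a minimal sketch; the 80%-of-light-speed figure is an arbitrary illustration:

```python
import math

C = 299_792_458.0  # speed of light in meters per second

def moving_clock_time(t_stationary, v):
    """Proper time elapsed on a clock moving at speed v while a
    stationary clock records t_stationary (special relativity)."""
    return t_stationary * math.sqrt(1.0 - (v / C) ** 2)

# While the stationary wormhole mouth's clock records 10 years, a mouth
# that has been moved at 80% of light speed records only about 6; this
# mismatch between the two mouths is what the Thorne-style construction
# exploits to turn a traversable wormhole into a time machine.
print(round(moving_clock_time(10.0, 0.8 * C), 6))  # 6.0
```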

Experts opposed to traversable wormholes have less of a problem with the existence of wormholes than with their being traversable. Although Thorne himself believes that traversable wormholes probably do not exist naturally, he also believes they might in principle be created by a more advanced civilization. However, Thorne also believes the short tunnel or “throat” between the two mouths of the wormhole probably would quickly collapse before anything of macroscopic size could use the wormhole to travel back in time. There has been speculation by physicists that an advanced civilization could manipulate negative gravitational energy with its positive pressure to keep the hole from collapsing long enough to create the universe’s first non-microscopic time machine.

It is a very interesting philosophical project to decide whether wormhole time travel, or any other time travel to the past, produces paradoxes of identity. For example, can a person travel back and be born again?

To solve the paradoxes of personal identity due to time travel’s inconsistency with commonly held assumptions about personal identity, many philosophers recommend rejecting the endurance theory which implies a person exists wholly at a single instant, for all the instants of their life. They recommend accepting the perdurance theory in which a person exists as a four-dimensional entity extending in time from birth to death. The person is their spacetime “worm.” Worms of this sort can live partly in wormholes and become closed time-like curves in spacetime.

Let us elaborate on this radical scenario. A closed time-like curve has implications for causality. The curve would be a causal loop. Causal loops lead to backward causation in which an effect can occur before its cause. Causal loops occur when there is a continuous sequence of events e1, e2, e3, … in which each member is a cause of its numerical successor and, in addition, for some integer n, en causes e1. The philosopher Milič Čapek has cautioned that with a causal loop, “we would be clearly on the brink of magic.” Other philosophers of time are more willing to accept the possibility of causal loops, strange though they would be. These loops would be a fountain of youth: when you go around the loop, you travel back to a time when you were younger, or perhaps even to your birth. Accepting this possibility, Arthur C. Clarke is noted for saying, “Any sufficiently advanced technology is indistinguishable from magic.”

Most time travel stories in literature involve contradictions, either logical contradictions or inconsistency with accepted laws of physics. The most famous one that appears not to is Robert Heinlein’s story “All You Zombies.” It shows how someone could be both their father and mother, provided relativity theory does allow backward time travel.

For a detailed review of the philosophical literature on backward time travel and the resulting paradoxes of causality and of personal identity, see (Wasserman, 2018, ch. 5) and (Fisher, 2015).

Inspired by an idea from John Wheeler, Richard Feynman suggested that a way to interpret the theory of quantum electrodynamics about interactions dominated by electromagnetic and weak forces is that an antimatter particle is really a matter particle traveling backward in time. For example, the positively charged positron moving forward in time is really a negatively charged electron moving backward in time.

Feynman U.S. postage stamp
US Postal Museum

This phenomenon is pictured in the two diagrams on the left of the above postage stamp, where the positron e+ is moving downward or backward in time. Feynman diagrams picture a short sequence of elementary interactions among particles. Feynman speculated that the reason why every electron has exactly the same properties as any other, unlike identical brooms manufactured in a broom factory, is that there is only one electron in the universe and it exists simultaneously at a great many places, thanks to backward time travel.

All empirical searches attempting to observe a particle moving backward in time have failed. So, the majority of physicists in the 21st century see no need to accept backward time travel despite Feynman’s successful representations of quantum electrodynamics. See (Muller 2016a, p. 246, 296-7) and (Arntzenius & Greaves 2009) for critical comment on this. Nevertheless, some well-respected physicists such as Neil Turok do accept Feynman-style backward time travel. The philosopher Huw Price added that the Feynman “zigzag is not there in standard QM, so if we put it in, we are accepting that QM is incomplete. (The zigzag needs hidden variables, in other words)” in order to determine when to zig and when to zag. At the heart of this dispute about whether to believe antimatter is regular matter traveling backward in time, physicists are very cautious because they realize that the more extraordinary the claim, the more extraordinarily good the evidence should be before accepting the claim.

Here are a variety of very brief philosophical arguments against travel to the past:

  1. If travel to the past were possible, you could go back in time and kill your grandfather, but then you would not be born and so could not go back in time and kill your grandfather. That is a logical contradiction. So, travel to the past is impossible.
  2. Like the future, the past is also not real, so time travel to the past or the future is not real either.
  3. Time travel is impossible because, if it were possible, we should have seen many time travelers by now, but nobody has ever encountered any time travelers.
  4. If past time travel were possible, then you could be in two different bodies at the same time, which is metaphysically impossible.
  5. If you were to go back to the past, then you would have been fated to go back because you already did, and this rules out your freedom to go back or not. Yet you do have this freedom, so travel to the past is impossible.
  6. If past time travel were possible, then you could die before you were born, which is biologically impossible.
  7. If you were presently to go back in time, then your present events would cause past events, which violates our concept of causality.
  8. If travel to the past were possible, then when time travelers go back and attempt to change history, they must always fail in their attempts to change anything, and it will appear to anyone watching them at the time as if Nature is conspiring against them. Since investigators have never witnessed this apparent conspiracy of Nature, there probably cannot be time travel.
  9. Travel to the past is impossible because it allows the gaining of information for free. Here is a possible scenario. You in the 22nd century buy a copy of Darwin’s book The Origin of Species, which was published in 1859. You enter a time machine with it, go back to 1855 and give the book to Darwin himself. He could have used your copy in order to write his manuscript which he sent off to the publisher. If so, who first came up with the knowledge about evolution? You got the knowledge from Darwin, but Darwin got the knowledge from you. This is free information. Because this scenario contradicts what we know about where knowledge comes from, past-directed time travel is not really possible.
  10. Travel to the past allows you to return to have intercourse with one of your parents, causing your birth. You would have the same fingerprints as one of your parents, which is biologically impossible.
  11. If past time travel is possible, then it should be possible for a rocket ship to carry a time machine capable of launching a probe (perhaps a smaller rocket) into its recent past which might eventually reunite with the mother ship. The mother ship is programmed to launch the probe at a certain time unless a safety switch is on at that time. Suppose the safety switch is programmed to be turned on if and only if the return or impending arrival of the probe is detected by a sensing device on the mother ship. Does the probe get launched? It seems to be launched if and only if it is not launched.
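Item 11’s probe scenario can be checked mechanically: it demands a truth value for “the probe is launched” that equals its own negation, and a brute-force search confirms that no such value exists. Here is a small sketch of that consistency check; the boolean encoding is our own illustration, not anything from the physics literature:

```python
def consistent_values(constraint):
    """Truth values v for 'the probe is launched' with constraint(v) == v,
    i.e. the self-consistent histories of the mother ship."""
    return [v for v in (True, False) if constraint(v) == v]

# Item 11: launched iff NOT launched (the safety switch is on exactly
# when the returning probe is detected). No consistent history exists.
print(consistent_values(lambda launched: not launched))  # []

# A benign causal loop ("launched iff launched") does have consistent
# histories, which is why not every closed causal chain is paradoxical.
print(consistent_values(lambda launched: launched))  # [True, False]
```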

These complaints about travel to the past are a mixture of arguments that past-directed time travel is not logically possible, not metaphysically possible, not physically possible, not technologically possible, not biologically possible, and not probable.

Counters to all of these arguments have been suggested by advocates of time travel. One response to the Grandfather Paradox of item 1 says you would kill your grandfather but then be in an alternative universe to the actual one where you did not kill him. This response is not liked by most proponents of time travel; they say traveling to an alternative universe is not what they mean by time travel.

Item 2 is the argument from presentism.

One response to item 3, the Enrico Fermi Paradox, is that perhaps we have seen no time travelers because we live in a boring era of little interest to time travelers. A better response is that perhaps the first time machine has never been built, and it is known that a time machine cannot be used to go back to a time before the first time machine exists (or closed time-like curve exists).

Item 9, the paradox of free information, has gotten considerable attention in the philosophical literature. In 1976, David Lewis said this:

But where did the information come from in the first place? Why did the whole affair happen? There is simply no answer. The parts of the loop are explicable, the whole of it is not. Strange! But not impossible, and not too different from inexplicabilities we are already inured to. Almost everyone agrees that God, or the Big Bang, or the entire infinite past of the Universe, or the decay of a tritium atom, is uncaused and inexplicable. Then if these are possible, why not also the inexplicable causal loops that arise in time travel?

Einstein and Rosen suggested that the laws of general relativity might allow traversable, macroscopic wormholes. A wormhole is a tunnel connecting two distant regions of space, and a traversable wormhole allows travel from one hole to the other. The tunnel would be a shortcut between two distant galaxies, and it is analogous to the path taken by a worm that has eaten its way through an apple to the opposite side rather than taking the longer path along the apple’s surface. That is why John Wheeler coined the name “wormhole.” Think of a wormhole as two spheres connected by a tunnel.

The hole is highly curved spacetime, and from the outside it looks like a sphere in 3D-space. It is not quite a black hole, so it has no event horizon. There is no consensus among theoretical physicists whether general relativity permits the existence of a wormhole. Assuming it does, and assuming one of the spheres could be controlled and forced to move very fast back and forth, then with two connected spheres situated in separate galaxies, a particle or person could enter one at some time, then exit the other at an earlier time, having traveled, say, just a few meters through the tunnel. Because of this implication for time, some physicists argue that if these traversable wormholes are allowed by general relativity, then the theory needs to be revised to disallow them.

For more discussion of time travel, see the encyclopedia article “Time Travel.” For some arguments in the philosophy literature against the possibility of a person travelling back to a time at which the person previously existed, see (Horwich 1975), (Grey 1999), and (Sider 2001).

12. McTaggart’s A-Theory and B-Theory

In 1908, the English philosopher J.M.E. McTaggart proposed two ways of linearly ordering all events in time. The two ways are generated by different ordering principles, yet they yield the same linear order. Here is how he re-states his kernel idea:

For the sake of brevity, I shall give the name of the A series to that series of positions which runs from the far past through the near past to the present, and then from the present through the near future to the far future, or conversely. The series of positions which runs from earlier to later, or conversely, I shall call the B series. (McTaggart 1927, 10)

When McTaggart uses the word series, he means what mathematicians call a sequence, but the literature in philosophy often follows McTaggart on this point. Below is a graphic representation of McTaggart’s ordering, in which point event c happens later than point events a and b:


[Figure: McTaggart’s ordering of point events a, b, and c along the time line, with c later than a and b]

McTaggart is making several assumptions here. First, he does not believe time is real, so his remark that the A-series and B-series mark out positions in time is only on the assumption that time is real, despite what he, himself, believes. Another assumption is that longer-lasting events are composed of their point events. Also, there are a great many other events that are located within the series at event a’s location, namely all the other events that are simultaneous with event a.

Using the standard time diagram with time increasing to the right along a horizontal line, event a in McTaggart’s B-series (see picture above) is ordered to the left of event b because a happens before b. But when ordering the same two events into McTaggart’s A-series, event a is ordered to the left of b for a different reason—because event a is more in the past than event b, or, equivalently, has more pastness than b. The A-series locates each event relative to the present; the B-series is created with no attention paid to the present, but only to what occurs before what.

Suppose that event c occurs in our present and after events a and b. Although the philosophical literature is not in agreement, it is usually said that the information that c occurs in the present is not contained within either the A-series or the B-series itself, but is used to create the A-series. That information of c‘s being in the present tells us to place c to the right of b because all present events are without pastness; they are not in the past. Someone constructing the B-series places event c to the right of b for a different reason, just that c happens after b.
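The claim that the two constructions yield the same order can be mimicked in code: the B-series sorts events by the earlier-than relation alone, while the A-series sorts by decreasing degree of pastness relative to the present; the resulting orders coincide. Below is a sketch with hypothetical event labels and coordinate times of our own choosing:

```python
# Hypothetical point events with coordinate times; c occurs in the present.
times = {"a": 1, "b": 2, "c": 5}
now = 5  # the present moment, used only by the A-series construction

# B-series: order by the earlier-than relation, no reference to the present.
b_series = sorted(times, key=lambda e: times[e])

# A-series: order by decreasing pastness, where pastness(e) = now - times[e]
# (present events such as c have zero pastness).
a_series = sorted(times, key=lambda e: -(now - times[e]))

print(b_series)              # ['a', 'b', 'c']
print(a_series)              # ['a', 'b', 'c']
print(b_series == a_series)  # True
```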

McTaggart himself believed the A-series is paradoxical, but he also believed the A-properties (such as being past and being two weeks past) are essential to our concept of time. So, for this reason, he believed our current concept of time is paradoxical and incoherent. This reasoning is called McTaggart’s Paradox.

McTaggart is not an especially clear writer, so his remarks can be interpreted in different ways, and the reader needs to work hard to make sense of them. Consider McTaggart’s Paradox as it applies to one specific event, say the event in which:

Socrates speaks to Plato for the first time.

This speaking to Plato is in the past, at least it is in our past, though not in the past of Egyptian King Tut during his lifetime, so the speaking is past in our present. Nevertheless, back in our past, there is a time when the event is present. From this, McTaggart concludes both that the event is past and that the event is present, from which he declares that the A-series is contradictory and so paradoxical. If that reasoning is correct (and it has been challenged by many), and if the A-series is essential to time, then time itself must be unreal. This piece of reasoning is commonly called “McTaggart’s Paradox.”

When discussing the A-theory and the B-theory, metaphysicians often speak of:

  • A-series and B-series
  • A-theorist and B-theorist
  • A-facts and B-facts
  • A-terms and B-terms
  • A-properties and B-properties
  • A-predicates and B-predicates
  • A-propositions and B-propositions
  • A-sentences and B-sentences
  • A-camp and B-camp.

Here are some examples of using this terminology. Unlike the A-series terms, the B-series terms are relational terms because a B-term refers to a property that relates a pair of events. Some of these properties are: is earlier than, happens twenty-three minutes after, and is simultaneous with. An A-theory term, on the other hand, refers to a property of a single event, not a relation between a pair of events. Some of these properties are: is in the near future, happened twenty-three minutes ago, and is present. The B-theory terms represent distinctively B-properties; the A-theory terms represent distinctively A-properties.

The B-fact that event a occurs before event b will always be a fact, but the A-fact that event a occurred about an hour ago will not be a fact for long. B-theorists do not like facts going in and out of existence, but this is acceptable to A-theorists. Similarly, if we turn from fact-talk to statement-talk, the A-statement that event a occurred about an hour ago, if true, will soon become false. B-facts are eternal. For example, the statement “The snowfall occurred an hour before this act of utterance” will, if true, be true at all times, provided the indexical phrase the snowfall does not change its reference.

The A-theory usually implies A-facts are the truthmakers of true A-statements and so A-facts are ontologically fundamental; the B-theorist, at least a B-theorist who believes in the existence of facts, appeals instead to B-facts. According to a classical B-theory, when the A-theorist correctly says, “It began snowing an hour ago,” what really makes it true is not that the snowing has an hour of pastness (so the fact is tensed) but that the event of uttering the sentence occurs an hour after the event of it beginning to snow. Notice that occurs an hour after is a B-term that is supposed to be logically tenseless and to be analogous to the mathematical term numerically less than even though when expressed in English it must use the present tense of the verb to occur.

When you like an event, say yesterday’s snowfall, then change your mind and dislike the event, what sort of change of the event is that? Well, this change in attitude is not a change that is intrinsic to the event itself. It is extrinsic. When your attitude changes, the snowfall itself undergoes no intrinsic change, only a change in its relationship to you. (A-theorists and B-theorists do not disagree about this.) This illustrates what is meant by intrinsic when A-theorists promote the intrinsic properties of an event, such as the snowfall having the intrinsic property of being in the past. B-theorists analyze the snowfall event differently, saying that more fundamentally the event is not in the past but is in the past relative to us. “Being in the past,” they say, is not intrinsic but rather is relational.

Members of the A-camp and B-camp recognize that ordinary speakers are not careful in their use of A and B terminology; but, when the terminology is used carefully, each believes their camp’s terminology can best explain ordinary speech involving time and also the terminology of the other camp.

Many A-theorists promote becoming. The term means a change in the A-series position of an event, such as a change in its degree of pastness. The B-theorist philosopher Adolf Grünbaum believes becoming is mind-dependent, and he points to the following initial quotation from J. J. C. Smart in opposition to the A-theory:

“If past, present, and future were real properties of events [i.e., properties possessed by physical events independently of being perceived], then it would require [non-trivial] explanation that an event which becomes present [i.e., qualifies as occurring now] in 1965 becomes present [now] at that date and not at some other (and this would have to be an explanation over and above the explanation of why an event of this sort occurred in 1965)” (says Smart). It would, of course, be a complete trivialization of the thesis of the mind-independence of becoming to reply that by definition an event occurring at a certain clock time t has the unanalyzable attribute of nowness at time t (Grünbaum 1971, p. 218).

Grünbaum is implying that it is appropriate to ask, regarding the event of a house falling down in 1965, “Why now instead of some other date?” He believes that it would be an appropriate explanation to appeal to mind-independent soil conditions and weather patterns, but that it would be trivial and inadequate to say instead that the event occurs now because by definition it had at that time the unanalyzable attribute of nowness. And, more generally, says Grünbaum, temporal becoming has no appropriate place within physical theory.

Beginning with Bertrand Russell in 1903, many B-theorists have argued that there are no irreducible one-place A-qualities (such as the monadic property of being past) because the qualities can all be reduced to, and adequately explained in terms of, two-place B-relations. The A-theorist disagrees. For example, the claim that it is after midnight might be explained, says the B-theorist, by saying midnight occurs before the time of this assertion. Before is a two-place relationship, a binary relation. The A-theorist claims this is a faulty explanation.

Is the A-theory or is the B-theory the correct theory of reality? This is a philosophically controversial issue. To clarify the issue, let us re-state the two theories. The A-theory has two especially central theses, each of which is contrary to the B-theory:

(1) Time is fundamentally constituted by an A-series in which any event’s being in the past (or in the present or in the future or twenty-three seconds in the past) is an intrinsic, objective, monadic property of the event itself.

(2) Events change.

In 1908, McTaggart described the special way that events change:

Take any event—the death of Queen Anne, for example—and consider what change can take place in its characteristics. That it is a death, that it is the death of Anne Stuart, that it has such causes, that it has such effects—every characteristic of this sort never changes…. But in one respect it does change. It began by being a future event. It became every moment an event in the nearer future. At last it was present. Then it became past, and will always remain so, though every moment it becomes further and further past.

This special change is usually called second-order change or McTaggartian change. For McTaggart, second-order change is the only genuine change, whereas a B-theorist such as Russell says this is not genuine change. Genuine change is intrinsic change, he would say. Just as there is no intrinsic change in a house due to your walking farther away from it, so there is no intrinsic change in an event as it supposedly “moves” farther into the past.

In response to Russell, McTaggart said:

No, Russell, no. What you identify as “change” isn’t change at all. The “B-series world” you think is the real world is…a world without becoming, a world in which nothing happens.

A world with becoming is a world in which events change and time flows. “It is difficult to see how we could construct the A series given only the B series, whereas given the former we can readily construct the latter,” says G.J. Whitrow in defense of the A theory.

The B-theory conflicts with two central theses of the A-theory. According to the B-theory,

(1′) Time is fundamentally constituted by a B-series, and the temporal properties of being in the past (or in the present or in the future) are fundamentally relational, not monadic.

(2′) Events do not change.

To re-examine this dispute, because there is much misunderstanding about what is being disputed, let us ask again what B-theorists mean by calling temporal properties relational. They mean that an event’s property of occurring twenty-three minutes in the past, say, is a relation between the event and us, the subject, the speaker. When analyzed, it will be seen to make reference to our own perspective on the world. Queen Anne’s death has the property of occurring in the past because it occurs in our past. It is not in Aristotle’s past or King Tut’s. So, the labels “past,” “present,” and “future” are all about us and are not intrinsic properties of events. That is why there is no objective distinction among past, present, and future, say the proponents of the B-theory. For similar reasons the B-theorist says the property of being two days in the past is not an authentic property because it is a second-order property. The property of being two days in our past, however, is a genuine property, says the B-theorist.

Their point about A-properties being relational when properly analyzed is also made this way. The A-theory terminology about space uses the terms here, there, far, and near. These terms are essentially about the speaker, says the B-theorist. “Here” for you is not “here” for me. World War II is past for you but not for Aristotle.

The B-theorist also argues that the A-theory violates the theory of relativity because that theory implies an event can be present for one person but not for another person who is moving relative to the first person. So, being present is relative and not an intrinsic quality of the event. Being present is relative to a reference frame. And for this reason, there are as many different B-series as there are legitimate reference frames. The typical proponent of the A-series cannot accept this.

A-theorists are aware of these criticisms, and there are many counterarguments. Some influential A-theorists are A. N. Prior, E. J. Lowe, and Quentin Smith. Some influential B-theorists are Bertrand Russell, W. V. O. Quine, D. H. Mellor, and Nathan Oaklander. The A-theory is closely related to the commonsense image of time, and the B-theory is more closely related to the scientific image. Proponents of each theory shoulder a certain burden—explaining not just why the opponent’s theory is incorrect but also why it seems to be correct to the opponent.

The philosophical literature on the controversy between the A and B theories is vast. During a famous confrontation in 1922 with the philosopher and A-theorist Henri Bergson, Einstein defended his own B-theory of time and said “the time of the philosophers” is an illusion. This is an overstatement by Einstein. He meant to attack only the time of the A-theorists.

Martin Heidegger said he wrote Being and Time in 1927 as a response to the conflict between the A-theory and the B-theory.

Other than the thesis that the present is metaphysically privileged, the other principal thesis of the A-theory that distinguishes it from the B-theory is that time flows. Let us turn to this feature of the A-theory.

13. The Passage or Flow of Time

Many philosophers claim that time passes or flows. This characteristic of time has also been called a flux, a transiency of the present, a moving now, and becoming. “All is flux,” said Heraclitus. The philosopher G.J. Whitrow claimed “the passage of time…is the very essence of the concept.” Advocates of this controversial philosophical position often point out that the present keeps vanishing. And they might offer a simile and say present events seem to flow into the past, like a boat that drifts past us on the riverbank and then recedes farther and farther downstream from us. In the converse sense, the simile is that we ourselves flow into the future and leave past events ever farther behind us. Philosophers disagree with each other about how to explain the ground of these ideas. Philosopher X will say time passes or flows, but not in the sense used by philosopher Y, while philosopher Z will disagree with both of them.

There are various entangled issues regarding flow. (i) Is the flow an objective feature of physical events that exists independently of our awareness of them? (ii) What is actually flowing? (iii) What does it mean for time to flow? (iv) Are there different kinds of flow? (v) If time flows, do we experience the flow directly or indirectly? (vi) What is its rate of flow, and can the rate change? (vii) If time does not flow, then why do so many people believe it does?

There are two primary philosophical positions about time’s flow: (A) The flow is objectively real. (B) The flow is not objectively real; it is merely subjective. This B-theory is called the static theory, mostly by its opponents, because of the negative connotation of the word “static.” The A-theory is called the dynamic theory because it implies time is constantly in flux. The A-theory implies that this fact of passage obtains independently of us; it is not subjective. The letters A and B are intended to suggest an alliance with McTaggart’s A-theory and B-theory. One A-theorist describes the situation this way:

The sensation we are (perhaps wrongly) tempted to describe as the sensation of temporal motion is veridical: it somehow puts us in touch with an aspect of reality that is unrepresented in Russell’s theory of time [the original B-theory]. (van Inwagen 2015, 81)

Some B-theorists complain that the concept of passage is incoherent, or it does not apply to the real world because this would require too many revisions to the scientific worldview of time. Other B-theorists say time flows but only subjectively and that B-theory concepts can explain why we believe in the flow. One explanation that is proposed is that the flow is due to our internal comparison of our predictions of what will happen with our memories of what recently happened, and this comparison needs to be continually updated.

One B-theorist charge is that the notion of flow is the product of a faulty metaphor. They say time exists, things change, and so we say time passes, but time itself does not change. It does not change by flowing or passing or elapsing or undergoing any motion. The present does not objectively flow because the present is not an objective feature of the world. We all experience this flow, but only in the sense that we all frequently misinterpret our experience. It is not that the sentences, “The present keeps vanishing” and “Time flows” are false; they are just not objective truths.

Here is another prong of a common B-theory attack on the notion of flow. The death of Queen Anne is an event that an A-theorist says is continually changing from past to farther into the past, but this change is no more of an objectively real change intrinsic to her death than saying her death changed from being approved of by Mr. Smith to being disapproved of by him. This extrinsic change in approval is not intrinsic to her death and so does not count as an objectively real change in her death.

One point J.J.C. Smart offered against the A-theory of flow was to ask about the rate at which time flows. It would be a rate of one second per second. But that is silly, he claimed. One second divided by one second is the number one, a unit-less number, and so not an allowable rate. And what would it be like for the rate to be two seconds per second? asks Huw Price, who adds that, “We might just as well say that the ratio of the circumference of a circle to its diameter flows at pi seconds per second!” (Price 1996, p. 13).

Other philosophers of time, such as John Norton and Tim Maudlin, argue that the rate of one second per second is acceptable, despite these criticisms. Paul Churchland countered that the rate is meaningful but trivial, for what other rate could it be?

To understand the concept of flow or passage used by the A-theory, it can be helpful to appreciate that it is not any of the following four concepts of flow:

(i) According to the theory of relativity, two synchronized clocks must lose their synchrony if the two are moving relative to each other. This loss is sometimes described as the time of the two clocks flowing differently.

(ii) Change is continuous rather than discrete, and continuous time is flowing time, unlike discrete time.

(iii) Time exists and things endure, and that is what it is for time to flow.

(iv) Time has an arrow, it flows in a preferred direction, the future direction.

There surely is some objective feature of our brains that causes us to believe there is a flow of time which we are experiencing. B-theorists say perhaps the belief is due not to time’s actually flowing but rather to the objective fact that we have different perceptions at different times and that anticipations of experiences always happen before memories of those experiences.

A-theorists who believe in flow have produced many dynamic theories that are closer to common sense on this topic. Here are six.

(1) The passage or flow is a matter of events changing from being future, to being present, to being past. Events change in their degree of futureness and degree of pastness. This kind of change is often called McTaggart’s second-order change to distinguish it from more ordinary, first-order change that occurs when, say, a falling leaf changes its altitude over time.

(2) A second type of dynamic theory implies time’s flow is the coming into existence of new facts, the actualization of new states of affairs. Reality grows by the addition of more facts. There need be no commitment to events changing intrinsically.

(3) A third dynamic theory implies that the flow is a matter of events changing from being indeterminate to becoming determinate in the present. Because time’s flow is believed to be due to events becoming determinate, these dynamic theorists speak of time’s flow as becoming.

(4) A fourth dynamic theory says, “The progression of time can be understood by assuming that the Hubble expansion takes place in 4 dimensions rather than in 3. The flow of time consists of the continuous creation of new moments, new nows, that accompany the creation of new space…. Unlike the picture drawn in the classic Minkowski spacetime diagram, the future does not yet exist; we are not moving into the future, but the future is being constantly created.” (Muller 2016b).

(5) A fifth dynamic theory suggests the flow is (or is reflected in) the change over time of truth-values of declarative sentences. For example, suppose the sentence, “It is now raining,” was true during the rain yesterday but has changed to false because it is sunny today. That is an indication that time flowed from yesterday to today, and these sorts of truth-value changes are at the root of the flow.

In response to this linguistic turn of theory (5), critics of the dynamic theory suggest that the temporal indexical sentence, “It is now raining,” has no truth-value because the reference of the word now is unspecified. If the sentence cannot have a truth-value, it cannot change its truth-value. However, the sentence is related to a sentence that does have a truth-value, namely the associated complete sentence or eternal sentence, the sentence with its temporal indexical replaced by some date expression that refers to a specific time, and with the other indexicals replaced by names of whatever they refer to. Typical indexicals are the words: then, now, I, this, here, them. Supposing it is now midnight here on April 1, 2020, and the speaker is in San Francisco, California, then the indexical sentence, “It is now raining,” is intimately associated with the more complete or context-explicit sentence, “It is raining at midnight on April 1, 2020, in San Francisco, California.” Only these latter, non-indexical, non-context-dependent, so-called complete sentences have truth-values, and these truth-values do not change with time, so they do not underlie any flow of time, according to the critic of the fifth dynamic theory.

(6) A sixth dynamic theory adds to the block-universe a traveling present. The present is somehow metaphysically privileged, and there is a moving property of being now that spotlights a new slice of the present events of the block at every new, present moment. A slice is a set of events all of which are simultaneous in the block. So, a slice of events can temporarily possess a monadic property of being now, and then lose it as a newer slice becomes spotlighted. This theory is called the moving spotlight theory. It follows that there are illuminated moments and unilluminated moments that are, respectively, real and unreal. Here is how Hermann Weyl described the spotlight theory as subjective rather than objective:

The objective world simply is, it does not happen. Only to the gaze of my consciousness crawling along the lifeline of my body, does a section of the world come to life as a fleeting image in space which continuously changes in time.

Huw Price offers a short overview of various arguments against the passage of time in (Price 1996, pp. 12-16). Tim Maudlin responds to these arguments in (Maudlin 2002).

14. The Past, Present, and Future

a. Presentism, the Growing-Past, Eternalism, and the Block-Universe

Have dinosaurs slipped out of existence? More generally, the question is whether the past is part of reality. How about the future? Philosophers are divided on this ontological question of the reality of the past, present, and future. There are three leading theories, and there is controversy over the exact wording of each, and over whether the true theory is metaphysically necessary or just contingently true. Unlike competing scientific theories, the three do not differ in their observational consequences.

(1) According to the ontological theory called presentism, only present objects exist. Stated another way: if something is real, then it exists now. The past and the future are not real, so either the past tense sentence, “Dinosaurs existed” is false, or else it is true but its truth is grounded only in some present facts. A similar analysis is required for statements in the future tense. Perhaps they can be analyzed in terms of present anticipations. With that accomplished, then all the events can be linearly ordered as if the past ones occur before the present ones and the present ones occur before the future ones, when actually they do not because all real events occur in the present. Heraclitus, Duns Scotus, Thomas Hobbes, Arthur Schopenhauer, A.N. Prior, and Lee Smolin are presentists. In the 17th century, Hobbes wrote, “The present only has a being in nature; things past have a being in the memory only, but things to come have no being at all, the future being but a fiction of the mind….” In 1969, Prior said of the present and the real:

They are one and the same concept, and the present simply is the real considered in relation to two particular species of unreality, namely the past and the future.

(2) Advocates of a growing-past agree with the presentists that the present is special ontologically, but they argue that, in addition to the present, the past is also real and is growing bigger all the time. C.D. Broad, George Ellis, Richard Jeffrey, and Michael Tooley have defended the growing-past theory. William James famously remarked that the future is so unreal that even God cannot anticipate it. It is not clear whether Aristotle accepted the growing-past theory or accepted a form of presentism; see Hilary Putnam (1967, p. 244) for commentary on this issue. The growing-past theory is also called by other names: the growing-present theory, now-and-then-ism, the becoming theory, and possibilism. Members of McTaggart’s A-camp are divided on whether to accept presentism or, instead, the growing-past theory, but they agree on rejecting eternalism.

(3) Advocates of eternalism say there are no objective ontological differences among the past, present, and future, just as there is no objective ontological difference between here and there. The differences among past, present, and future are subjective, depending upon whose experience is being implicitly referred to—yours or, say, Hitler’s or Aristotle’s. An eternalist will say Adolf Hitler’s rise to power in Germany is not simply in the past, as the first two theories imply; instead, it is in the past for you, but in the future for Aristotle, and it is equally real for both of you. The past, the present, and the future exist conjointly but not simultaneously. The eternalist is committed to saying all events in spacetime are equally real; the events of the present are not ontologically privileged. The eternalist often describes the theory with a block of all events, often depicted in a Minkowski diagram. All moments of the block are equally real, but only at different times. The entire block of events exists, but not wholly at one time. For the eternalist or block theorist, there are epistemological but no ontological differences among the past, present, and future. All the events are real, but we can know much more about past events than future ones. So, eternalism radically conflicts with the manifest image, the commonsense image of time.

Eternalism is often called a static theory. The label “static” was once supposed to be derogatory and to indicate that the theory could not successfully deal with change, but these days the term has lost much of its negative connotations just as the initially derogatory term “big bang” in cosmology has lost its negative connotations.

Eternalism is the only one of the three metaphysical theories that permits time travel, so it is understandable that time travel was not seriously discussed in philosophy until the twentieth century when presentism began to be challenged. Bertrand Russell, J.J.C. Smart, W.V.O. Quine, Adolf Grünbaum, and David Lewis have endorsed eternalism. Eternalism is less frequently called the tapestry theory of time. It is very difficult to speak correctly about eternalism using natural language because all natural languages are infused with presumptions about presentism. Assuming eternalism is correct, it follows that correct talk about personal identity is especially difficult for this reason.

Here is how one philosopher of physics briefly defends eternalism:

I believe that the past is real: there are facts about what happened in the past that are independent of the present state of the world and independent of my knowledge or beliefs about the past. I similarly believe that there is (i.e., will be) a single unique future. I know what it would be to believe that the past is unreal (i.e., nothing ever happened, everything was just created ex nihilo) and to believe that the future is unreal (i.e., all will end, I will not exist tomorrow, I have no future). I do not believe these things, and would act very differently if I did. Insofar as belief in the reality of the past and the future constitutes a belief in a ‘block universe’, I believe in a block universe. But I also believe that time passes, and see no contradiction or tension between these views (Maudlin 2002, pp. 259-260).

A and B theorists agree that it is correct to say, “The past does not exist” and to say, “Future events do not exist” if the verbs are being used in their tensed form, but argue that there should be no implications here for ontology because this is merely an interesting feature of how some languages such as English use tensed verbs. Languages need not use tenses at all, and, according to the B-theorists, a B-analysis of tense-talk can be provided when languages do use tenses.

Hermann Minkowski is the father of the block universe concept. The block theory employing this concept implies reality is correctly representable as a four-dimensional block of point-events in spacetime in some reference frame. Minkowski treated the block as a manifold of point-events upon which was placed a four-dimensional rectangular coordinate system. In the block, any two non-simultaneous events are ordered by the happens-before-or-is-simultaneous-with relation.

For a graphic presentation of the block, see a four-dimensional Minkowski diagram in a supplement to this article. If time has an infinite future or infinite past, then the block is infinite in those directions in time. If space has an infinite extent, then the block is infinitely large along the spatial dimensions. If it were learned that space is nine-dimensional rather than three-dimensional, then block theorists would promote a ten-dimensional block rather than a four-dimensional block.

To get a sense of why the block is philosophically controversial, note that in his book The Future, the Oxford philosopher John Lucas said,

The block universe gives a deeply inadequate view of time. It fails to account for the passage of time, the pre-eminence of the present, the directedness of time, and the difference between the future and the past.

G. J. Whitrow complains that “the theory of the block universe…implies that past (and future) events co-exist with those that are present.” This is a contradiction, he believes. Whitrow’s point can be made metaphorically this way: The mistake of the B-theorist is to envision the future as unfolding, as if it has been waiting in the wings for its cue to appear on the present stage—which is absurd.

Motion in the real world is, of course, dynamic, but its historical record, such as its worldline within the block, is static. That is, any motion’s mathematical representation is static in the sense of being timeless. The block theory has been accused by A-theorists of spatializing time and geometricizing time, which arguably it does. The philosophical debate is whether this is a mistake. Some B-theorists complain that, by the very act of labeling the view static, the A-theorist mistakenly implies that there is a time dimension in which the block should be changing but is not. The block describes change but does not itself change, say B-theorists. The A-theorist’s complaint, according to the B-theorist, is like complaining that a printed musical score is faulty because it is static, while real music is vibrant. It is like complaining that a video file does not represent anything happening because the file just sits there statically.

For the eternalist and block-theorist, the block that is created using one reference frame is no more distinguished than the block that is created using another frame allowed by the laws of science. Any chosen reference frame will have its own definite past, present, and future. The majority of physicists accept this block theory, which could be called the mild block theory. Metaphysicians also argue over whether reality itself is a static block, rather than just being representable as a static block. These metaphysicians are promoting a strong block theory. Some theorists complain that this strong block theory is confusing the representation with what is represented. See (Smolin 2013, pp. 25-36) for an elaboration of the point.

Some proponents of the growing-past theory have adopted a growing-block theory. They say the block is ever-growing, and the present is the leading edge between reality and the unreal future. Some philosophers express that point by saying the present is the edge of all becoming. The advocates of the growing-block usually agree with the eternalists that what makes the sentence, “Dinosaurs once existed,” be true is that there is a past region of the block in which dinosaurs do exist.

All three ontologies (namely, presentism, the growing-past, and eternalism) imply that, at the present moment, we only ever experience a part of the present and that we do not have direct access to the past or the future. They all agree that nothing exists now that is not present, and all three need to explain how and why there is an important difference between never existing (such as Santa Claus) and not existing now (such as Abraham Lincoln). Members of all three camps will understand an ordinary speaker who says, “There will be a storm tomorrow so it’s good that we fixed the roof last week,” but they will provide different treatments of this remark at a metaphysical level.

Most eternalists accept the B-theory of time. Presentists and advocates of the growing-past tend to accept the A-theory of time. Let us take a closer look at presentism.

One of the major issues for presentism is how to ground true propositions about the past. What makes it true that U.S. President Abraham Lincoln was assassinated in 1865? Speaking technically, we are asking what are the truthmakers of the true sentences and the falsemakers of the false sentences. Many presentists say past-tensed truths lack truthmakers in the past but are nevertheless true because their truthmakers are in the present. They say what makes a tensed proposition true are only features of the present way that things are, perhaps traces of the past in pages of present books and in our memories. The eternalist disagrees. When someone says truly that Abraham Lincoln was assassinated, the eternalist and the growing-past theorist believe this is to say something true of a real Abraham Lincoln who is not present. The block theorist and the growing-block theorist might add that Lincoln is real but far away from us along the time dimension just as the Moon is real but far away from us along a spatial dimension. “Why not treat these distant realities in the same manner?” they ask.

A related issue for the presentist is how to account for causation, for how April showers bring May flowers. Presentists believe in processes, but can they account for the process of a cause producing an effect without both the cause and the effect being real at different times?

Presentism and the growing-past theory need to account for the Theory of Relativity’s treatment of the present, or else criticize the theory. Relativity implies there is no common global present, but only different presents for each of us. Relativity theory allows event a to be simultaneous with event b in one reference frame, while allowing b to be simultaneous with event c in some other reference frame, even though a and c are not simultaneous in either frame. Nevertheless, if a is real, then is c not also real? But neither presentism nor the growing-past theory can allow c to be real. This argument against presentism and the growing-past theory presupposes the transitivity of co-existence.
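The relativity of simultaneity behind this argument can be illustrated with a quick calculation. The sketch below is a hypothetical illustration, not part of the article: units are chosen so that c = 1, and the event coordinates are invented. It applies the Lorentz time transformation t′ = γ(t − vx/c²) to two events that are simultaneous in one frame but not in another:

```python
import math

def lorentz_t(t, x, v, c=1.0):
    """Time coordinate of event (t, x) as judged from a frame moving at speed v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c ** 2)

# Two events simultaneous in frame S (illustrative coordinates, c = 1):
a = (0.0, 0.0)  # (t, x)
b = (0.0, 1.0)

v = 0.5  # speed of frame S' relative to S
ta_prime = lorentz_t(*a, v)
tb_prime = lorentz_t(*b, v)

print(ta_prime)  # 0.0
print(tb_prime)  # about -0.577: in S', event b occurs before event a
```

In S the two events share the time coordinate t = 0, yet in S′ they do not; repeating the construction with a third event c, simultaneous with b in S′ but not with a in either frame, yields the failure of transitivity the argument exploits.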

Despite this criticism, (Stein 1991) says presentism can be retained by rejecting transitivity and saying what is present and thus real is different depending on your spacetime location. The implication is that, for event a, the only events that are real are those with a zero spacetime interval from a. Many of Stein’s opponents, including his fellow presentists, do not like this implication.
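What a “zero spacetime interval” amounts to can be shown with the standard interval formula s² = c²Δt² − Δx². The following sketch is illustrative only (units with c = 1, invented event coordinates): on the proposal just described, only events light-like separated from a, that is, with s² = 0, would count as real relative to a:

```python
def interval_sq(e1, e2, c=1.0):
    """Squared spacetime interval c^2*dt^2 - dx^2 between events given as (t, x)."""
    dt = e2[0] - e1[0]
    dx = e2[1] - e1[1]
    return (c * dt) ** 2 - dx ** 2

a = (0.0, 0.0)
light_like = (1.0, 1.0)  # on a's light cone: zero interval
space_like = (0.0, 1.0)  # simultaneous with a in this frame, but space-like separated

print(interval_sq(a, light_like))  # 0.0
print(interval_sq(a, space_like))  # -1.0
```

Unlike simultaneity, the interval is invariant across reference frames, which is why defining what is real in terms of the interval, rather than in terms of frame-relative simultaneity, avoids the relativity-based objection.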

Eternalists very often adopt the block-universe theory. This implies our universe is the set of all the point-events with their actual properties. The block is representable with a Minkowski diagram in the regions where spacetime does not curve and where nature obeys the laws of special relativity.

The presentist and the advocate of the growing-past theory usually will unite in opposition to eternalism for these five reasons: (i) The present is so much more vivid than the future. (ii) Eternalism misses the special open and changeable character of the future. In the classical block-universe theory promoted by most eternalists, there is only one future, so this implies the future exists already; but that denies our ability to affect the future, and it is known that we do have this ability. (iii) A present event moves in the sense that it is no longer present a moment later, having lost its property of presentness, but eternalism disallows this movement. (iv) Future events do not exist and so do not stand in relationships of before and after, but eternalism accepts these relationships. (v) Future-tensed statements that are contingent, such as “There will be a sea battle tomorrow,” do not have existing truthmakers and so are neither true nor false, yet most eternalists mistakenly believe all these statements do have truth values now.

Defenders of eternalism and the block-universe offer a variety of responses to these criticisms. For instance, regarding (i), they are likely to say the vividness of here does not imply the unreality of there, so why should the vividness of now imply the unreality of then? Regarding (ii) and the openness of the future, the block theory allows a closed future and the absence of libertarian free will, but it does not require this. Eventually, there will be one future, regardless of whether that future is now open or closed, and that is what constitutes the future portion of the block that has not happened yet.

“Do we all not fear impending doom?” an eternalist might ask. But according to presentism and the growing-block theory, why should we have this fear if the future doom is known not to exist, as these two kinds of theorists evidently believe? The best philosophy of time will not make our different attitudes toward future danger and past danger be so mysterious, says the eternalist. In 1981, J.J.C. Smart, a proponent of the block-universe, asked us to:

conceive of a soldier in the twenty-first century…cold, miserable and suffering from dysentery, and being told that some twentieth-century philosophers and non-philosophers had held that the future was unreal. He might have some choice things to say.

All observation is of the past. If you look at the North Star, you see it as it was, not as it is, because its light takes about 434 years to reach your eyes. The North Star might have burned out several years ago. If so, then you are seeing something that does not exist, according to the presentist. That is puzzling. Eternalism with the block theory provides a way out of the puzzle: you are seeing an existing time-slice of the 4D block that is the North Star.

Determinism for a system is the thesis that specifying the state of the system fixes how the system evolves over time, either forward in time or backward in time. Determinism implies nothing that occurs is purely random. Here is a commonly offered defense of the block-universe theory against the charge that it demands determinism:

The block universe is not necessarily a deterministic one. …Strictly speaking, to say that the occurrence of a relatively later event is determined vis à vis a set of relatively earlier events, is only to say that there is a functional connection or physical law linking the properties of the later event to those of the earlier events. …Now in the block universe we may have partial or even total indeterminacy—there may be no functional connection between earlier and later events (McCall 1966, p. 271).

One defense of the block theory against Bergson’s charge that it inappropriately spatializes time is to point out that when we graph the color of eggs sold against the dollar amount of the sales, no one complains that we are inappropriately spatializing egg color. The issues of spatialization and determinism reflect a great philosophical divide between those who say the geometrical features of spacetime provide an explanation of physical phenomena or instead only a representation or codification of those phenomena.

Challenging the claim that the block universe theory must improperly spatialize time, but appreciating the point made by Bergson that users of the block universe can make the mistake of spatializing time, the pragmatist and physicist Lee Smolin says,

By succumbing to the temptation to conflate the representation with the reality and [to] identify the graph of the records of the motion with the motion itself, these scientists have taken a big step toward the expulsion of time from our conception of nature.

The confusion worsens when we represent time as an axis on a graph…This can be called spatializing time.

And the mathematical conjunction of the representations of space and time, with each having its own axis, can be called spacetime. The pragmatist will insist that this spacetime is not the real world. It’s entirely a human invention, just another representation…. If we confuse spacetime with reality, we are committing a fallacy, which can be called the fallacy of the spatialization of time. It is a consequence of forgetting the distinction between recording motion in time and time itself.

Once you commit this fallacy, you’re free to fantasize about the universe being timeless, and even being nothing but mathematics. But, the pragmatist says, timelessness and mathematics are properties of representations of records of motion—and only that.

For a survey of defenses for presentism and the growing-past theories, see (Putnam 1967), (Saunders 2002), (Markosian 2003), (Savitt 2008), and (Miller 2013, pp. 354-356).

b. The Present

The present is what we are referring to when we use the word “now.” The temporal word now changes its reference every instant but not its meaning. Obviously there is a present, many people say, because it is so different from the past. No, said Einstein: “People like us, who believe in physics, know that the distinction between past, present and future is only a stubbornly persistent illusion.” There is considerable controversy among philosophers of time about whether Einstein is correct, whether he was confusing being real with being objective, what is meant by the words “illusion” and “present,” and whether the present is objectively real. The majority position among physicists is that the present is a mind-dependent feature of events because it has to do with how reality is experienced rather than with how reality is. (Everyone in the dispute agrees that it can make an important difference to your life whether it is presently noon or presently midnight.)

The A-theorists favor the claim that the present is objectively real; the B-theorists say it is subjective because everyone and everything has its own personal time so there can be no fact of the matter as to which of their presents is the real present. They say the problem is especially evident with what is happening now over there as opposed to now right here. Relativity theory implies what is happening now is relative to a chosen reference frame. It is always different for two people moving toward or away from each other.

Let us consider some arguments in favor of the objectivity of the present, the reality of now. One is that the now is so much more vivid to everyone than all other times. Past and future events are dim by comparison. Proponents of an objective present say that if scientific laws do not recognize this vividness and the objectivity of the present, then there is a defect within science. Einstein considered this argument and rejected it.

One counter to Einstein is that there is so much agreement among people about what is happening now and what is not. Is that not a sign that the now is objective, not subjective? This agreement is reflected within our natural languages where we find evidence that a belief in the now is ingrained in our language. It is unlikely that it would be so ingrained if it were not correct to believe it.

What have B-theorists said in response? Well, regarding vividness, we cannot now step outside our present experience and compare its vividness with the experience of past presents and future presents. Yet that is what needs to be done for a fair comparison. Instead, when we speak of the “vividness” of our present experience of, say, a leaf falling in front of us, all we can do is compare our present experience of the leaf with our dim memories of leaves falling, and with even dimmer expectations of leaves yet to fall. So, the comparison is unfair; the vividness of future events should be assessed, says the critic, by measuring those future events when they happen and not merely by measuring present expectations of those events before they happen.

In another attempt to undermine the vividness argument, the B-theorist points out that there are empirical studies by cognitive psychologists and neuroscientists showing that our judgment about what is vividly happening now is plastic and can be affected by our expectations and by what other experiences we are having at the time. For example, we see and hear a woman speaking to us from across the room; then we construct an artificial now, in which hearing her speak and seeing her speak happen at the same time. But they do not really happen at the same time. We play a little trick on ourselves. The acoustic engineer assures us we are mistaken because the sound traveled much slower than the light. Proponents of the manifest image of time do not take travel time into account; they mistakenly suppose there is a common global present, as if what is happening at present were everything that could in principle show up in a still photograph taken with light arriving at infinite speed.

When you speak on the phone with someone two hundred miles away, the conversation is normal because the two of you seem to share a common now. But that normalcy is only apparent because the phone signal travels the two hundred miles so quickly. During a phone conversation with someone much farther away, say on the Moon, you would notice a strange 1.3 second time lag because the Moon is 1.3 light seconds away from Earth. Suppose you were to look at your correct clock on Earth and notice it is midnight. What time would it be on the Moon, according to your clock? This is not a good question. A more sensible question is, “What events on the Moon are simultaneous with midnight on Earth, according to my clock?” You cannot look and see immediately. You will have to wait at least 1.3 seconds because it takes any signal that long to reach you from the Moon. If an asteroid were to strike the Moon, and you were to see the explosion through your Earth telescope at 1.3 seconds after midnight, then you could compute later that the asteroid must have struck the Moon at midnight. If you want to know what is presently happening on the other side of the Milky Way, you will have a much longer wait. So, the moral is that the collection of events comprising your present is something you have to compute; you cannot directly perceive those events at once.
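The computation just described can be sketched in a few lines; the function name and the rounded Earth-Moon distance are illustrative assumptions, used only to show the arithmetic of subtracting light travel time.

```python
# Sketch of "computing your present": an observer on Earth sees an
# asteroid strike the Moon through a telescope and must subtract the
# light travel time to recover when the strike actually happened.

C_KM_PER_S = 299_792.458        # speed of light in km/s
MOON_DISTANCE_KM = 384_400.0    # mean Earth-Moon distance (approximate)

def emission_time(observation_time_s, distance_km):
    """Time an event occurred, given when its light arrived here."""
    return observation_time_s - distance_km / C_KM_PER_S

delay = MOON_DISTANCE_KM / C_KM_PER_S
print(f"One-way light delay: {delay:.2f} s")      # about 1.28 s, roughly 1.3

# Flash seen at 1.3 s after midnight (taking t = 0 at midnight):
strike = emission_time(1.3, MOON_DISTANCE_KM)
print(f"Strike occurred at t = {strike:+.2f} s")  # essentially midnight
```

On these figures the strike computes back to within a few hundredths of a second of midnight, which is the point of the example: the membership of your present is inferred, not perceived.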

To continue advancing a pro-B-theory argument against an objective present, notice the difference in time between your clock, which is stationary on Earth, and the clock of a pilot in a spaceship flying by you at high speed. Assume the spaceship flies very close to you, that the two clocks are synchronized and working perfectly, and that both show the time to be midnight at the flyby. According to the special theory of relativity, the collection of events across the universe that you eventually compute and say occurs now at midnight must be very different from the collection of events that the spaceship traveler computes and says occurs at midnight. You and the person on the spaceship probably will not notice much of a difference for an event at the end of your street or even for an event on another continent, but you will begin to notice the difference for an event on the Moon and even more so for an event somewhere across the Milky Way or, worse yet, for an event in the Andromeda galaxy.

When two people disagree about what events are present events because the two are in motion relative to each other, the direction of the motion makes a significant difference. If the spaceship is flying toward Andromeda and away from you, then the spaceship’s now (what it judges to be a present event) would include events on Andromeda that occurred thousands of years before you were born. If the spaceship is flying away from Andromeda, the spaceship’s now would include events on Andromeda that occur thousands of years in your future. Also, the difference in nows is more extreme the faster the spaceship’s speed as it flies by you. The implication, says the B-theorist, is that there are a great many different nows and nobody’s now is the only correct one.

To make a similar point in the language of mathematical physics, something appropriately called a now would be an equivalence class of events that occur at the same time. But because Einstein showed that time is relative to a reference frame, there are different nows for different reference frames, so the notion of now is not frame-independent and thus is not objective, contra the philosophical position of the A-theorist.
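The frame-dependence described in the paragraphs above follows from the relativity-of-simultaneity formula dt = v·d/c². Here is a minimal sketch; the flyby speed of one-thousandth of light speed and the round-number distances are assumptions chosen only to reproduce the scale of the disagreement.

```python
# For a ship passing you at speed v, events it counts as "now" at
# distance d along its direction of motion are offset from the events
# you count as "now" by dt = v * d / c**2 (relativity of simultaneity).

C = 299_792_458.0                       # speed of light, m/s
YEAR_S = 365.25 * 24 * 3600             # seconds per year
LIGHT_YEAR = C * YEAR_S                 # meters

def simultaneity_offset_s(v, d):
    """Shift in seconds between the two frames' 'now' at distance d."""
    return v * d / C**2

v = 0.001 * C                           # assumed flyby speed: c/1000
cases = [("end of the street (100 m)", 100.0),
         ("the Moon", 384_400_000.0),
         ("across the Milky Way (~100,000 ly)", 1e5 * LIGHT_YEAR),
         ("Andromeda (~2.5 million ly)", 2.5e6 * LIGHT_YEAR)]
for label, d in cases:
    dt = simultaneity_offset_s(v, d)
    print(f"{label}: offset = {dt:.3g} s = {dt / YEAR_S:.3g} years")
```

At this assumed speed the disagreement is a fraction of a nanosecond for an event down the street, about a millisecond for the Moon, and 2,500 years for Andromeda, matching the “thousands of years” scale claimed in the text.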

When the B-theorist says there is no fact of the matter about whether a distant explosion has happened, the A-theorist will usually disagree and say, regardless of your limitations on what knowledge you have, the explosion has occurred now or it hasn’t occurred now. But to the B-theorist to say this is to merely insist on the manifest image of time despite the implications of relativity theory.

But is there not a privileged reference frame in astronomy? Yes, it is the frame in which cosmic time is measured. This is the unique frame used when astronomers say the big bang began 13.8 billion years ago, or that the universe turned transparent 380,000 years after that. The frame is described in more detail in the companion article’s analysis of the big bang. This reference frame has an origin at which the average motion of all the universe’s galaxies is zero. Clocks fixed in this frame are sitting still while the universe expands around them; and at the frame’s origin, the universe is approximately isotropic at the cosmic scale, so the universe looks like it has the same average temperature in all directions. The origin is traveling about 350 km/s toward the constellation of Pisces and away from Leo. This is a privileged reference frame for astronomical convenience, and it is unique in the universe, but there is little reason to suppose this frame is what is sought by the A-theorist who believes in an objective present, nor by Isaac Newton who believed in absolute rest, nor by Maxwell who believed in his nineteenth-century aether.

Opponents of an objective present frequently point out that none of the fundamental laws of physics picks out a present moment. Scientists frequently do apply some law of science while assigning, say, t0 to be the temporal coordinate of the present moment; then they go on to calculate this or that. This insertion of the fact that some value of the time variable t is the present time is an initial condition of the situation to which the law is being applied, and is not part of the law itself. The basic laws themselves treat all times equally. If science’s laws do not need the present, then it is not real, say the B-theorists. The counterargument is that it is the mistake of scientism to suppose that if something is not in our current theories, then it must not be real. France is real, but it is not mentioned in any scientific law.

In any discussion about whether the now is objective, one needs to remember that the term objective has different senses. There is objective in the sense of not being relative to the reference frame, and there is objective in the sense of not being mind-dependent, and there is objective in the sense of not being anthropocentric. Proponents of the B-theory say the now is not objective in any of these senses.

There is considerable debate in the philosophical literature about whether the present moments are so special that the laws should somehow recognize them. It is pointed out that even Einstein said, “There is something essential about the Now which is just outside the realm of science.” In 1925, the influential philosopher of science Hans Reichenbach criticized the block theory’s treatment of the present:

In the condition of the world, a cross-section called the present is distinguished; the ‘now’ has objective significance. Even when no human being is alive any longer, there is a ‘now’….

This claim has met stiff resistance. For example, in 1915, Bertrand Russell objected to giving the present any special ontological standing:

In a world in which there was no experience, there would be no past, present, or future, but there might well be earlier and later (Russell 1915, p. 212).

Rudolf Carnap added that a belief in the present is a matter for psychology, not physics.

The B-camp says belief in a global now is a product of our falsely supposing that everything we see is happening now, when actually we are not factoring in the finite speed of light and sound. Proponents of the non-objectivity of the present frequently claim that a proper analysis of time talk should treat the phrases the present and now as indexical terms which refer to the time at which the phrases are uttered by the speaker, and so their relativity to us speakers shows the essential subjectivity of the present. A-theorists do not accept these criticisms.

There are interesting issues about the now in the philosophy of religion. For one example, Norman Kretzmann has argued that if God is omniscient, then He knows what time it is, and to know this, says Kretzmann, God must always be changing because God’s knowledge keeps changing. Therefore, there is an incompatibility between God’s being omniscient and God’s being immutable.

Disagreement about the now is an ongoing feature of debate in the philosophy of time, and there are many subtle moves made by advocates on each side of the issue. (Baron 2018) provides a broad overview of the debate about whether relativistic physics disallows an objective present. For an extended defense of the claim that the now is not subjective and that there is temporal becoming, see (Arthur 2019).

c. Persistence, Four-Dimensionalism, and Temporal Parts

Eternalism differs from four-dimensionalism. Eternalism is the thesis that the present, past, and future are equally real, whereas four-dimensionalism says the ontologically basic objects are four-dimensional events and that the ordinary objects referred to in everyday discourse are three-dimensional slices of 4-d spacetime. However, most four-dimensionalists do accept eternalism. Almost all eternalists and four-dimensionalists accept McTaggart’s B-theory of time.

Four-dimensionalism does not imply that time is a spatial dimension. When a four-dimensionalist represents time relative to a reference frame in a four-dimensional diagram, say, a Minkowski diagram, time is a special one of the four dimensions of this mathematical space, not an arbitrary one. Using this representation technique does not imply that a four-dimensionalist is committed to the claim that real, physical space itself is four-dimensional, but only that spacetime is.

Four-dimensionalists take a stand on the philosophical issue of endurance vs. perdurance. Some objects last longer than others, so we say they persist longer. But there is no philosophical consensus about how to understand persistence. Objects are traditionally said to persist by enduring over some time interval. At any time during the interval the whole of the object exists. Not so for perduring objects. Perduring objects are said, instead, to persist by perduring. They do not exist wholly at a single instant but rather exist over a stretch of time. These objects do not pass through time; they do not endure; instead, they extend through time. A football game does not wholly exist at one instant; it extends over an interval of time. The issue is whether we can or should say the same for electrons and people. Technically expressed, the controversial issue is whether or not persisting things are (or are best treated as) divisible into temporal parts.

The perduring object persists by being the sum or fusion of a series of its temporal parts (also called its temporal stages). Instantaneous temporal parts are called temporal slices and time slices. For example, a forty-year-old man might be treated as being a four-dimensional perduring object consisting of his three temporal stages that we call his childhood, his middle age, and his future old age. But his right arm is also a temporal part that has perdured for forty years.

Although the concept of temporal parts is more likely to be used by a four-dimensionalist, here is a definition of the concept from Judith Jarvis Thomson in terms of three-dimensional objects:

Let object O exist at least from time t0 to time t3. A temporal part P of O is an object that begins to exist at some time t1, where t1 ≥ t0, and goes out of existence at some time t2 ≤ t3, and takes up some portion of the space that O takes up for all the time that P exists.
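As a toy illustration of how permissive this definition is, one can model objects as (start, end, spatial-region) triples and check the conditions directly; the set-based model of space, held constant over an object's lifetime, is a simplifying assumption made only for this sketch.

```python
# Toy model of Thomson's definition: an object is a (start, end, region)
# triple, where region is the set of spatial cells it occupies (held
# constant over its lifetime in this simplified model).

def is_temporal_part(P, O):
    """P is a temporal part of O: P's lifetime lies within O's, and P
    occupies some portion of O's space whenever P exists."""
    p_start, p_end, p_region = P
    o_start, o_end, o_region = O
    return (o_start <= p_start and p_end <= o_end
            and len(p_region) > 0 and p_region <= o_region)

man = (0, 40, {"head", "torso", "right_arm", "left_arm"})
childhood = (0, 12, {"head", "torso", "right_arm", "left_arm"})
right_arm = (0, 40, {"right_arm"})

print(is_temporal_part(childhood, man))   # True
print(is_temporal_part(right_arm, man))   # True: the whole arm qualifies too
```

Note that the forty-year-old right arm counts as a temporal part under this definition, which matches the observation above that a spatially partial, long-lived part also qualifies.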

Four-dimensionalists, by contrast, think of physical objects as regions of spacetime and as having temporal parts that extend along all four dimensions of the object. A more detailed presentation of these temporal parts should say whether four-dimensional objects have their spatiotemporal parts essentially.

David Lewis offers the following, fairly well-accepted definitions of perdurance and endurance:

Something perdures iff it persists by having different temporal parts, or stages, at different times, though no one part of it is wholly present at more than one time; whereas it endures iff it persists by being wholly present at more than one time.

The term iff stands for “if and only if.” Given a sequence of temporal parts, how do we know whether they compose a single perduring object? One answer, given by Hans Reichenbach, Ted Sider, and others, is that they compose a single object if the sequence falls under a causal law so that temporal parts of the perduring object cause other temporal parts of the object. Philosophers of time with a distaste for the concept of causality oppose this answer.

According to David Lewis in On the Plurality of Worlds, the primary argument for perdurantism is that it has an easier time solving what he calls the problem of temporary intrinsics, of which the Heraclitus Paradox is one example. The Heraclitus Paradox, first posed by the ancient Greek philosopher Heraclitus, is the problem of explaining why we cannot step into the same river twice, given that the water is different the second time. The mereological essentialist agrees with Heraclitus, but our common sense says Heraclitus is mistaken because people often step into the same river twice. Who is really making the mistake?

The advocate of endurance has trouble showing that Heraclitus is mistaken, says Lewis. We do not step into two different rivers, do we? They are the same river. Yet the river has two different intrinsic properties, namely being two different collections of water; but, by Leibniz’s Law of the Indiscernibility of Identicals, identical objects cannot have different properties. So, the advocate of endurance has trouble escaping the Heraclitus Paradox. So does the mereological essentialist.

A 4-dimensionalist who advocates perdurance says the proper metaphysical analysis of the Heraclitus Paradox is that we can step into the same river twice by stepping into two different temporal parts of the same 4-dimensional river. Similarly, we cannot see a football game at a moment; we can see only a momentary temporal part of the 4D game.

For more examination of the issue with detailed arguments for and against perdurance and endurance, see (Wasserman, 2018), (Carroll and Markosian 2010, pp. 173-7), and especially the article “Persistence in Time” in this encyclopedia.

d. Truth-Values of Tensed Sentences

The above disputes about presentism, the growing-past theory, and the block theory have taken a linguistic turn by focusing upon a related question about language: “Are predictions true or false at the time they are uttered?” Those who believe in the block-universe (and thus in the determinate reality of the future) will answer “Yes,” while a “No” will be given by presentists and advocates of the growing-past.

The issue is whether contingent sentences uttered now about future events are true or false now rather than true or false only in the future at the time the predicted event is supposed to occur. For example, suppose someone says, “Tomorrow the admiral will start a sea battle.” And suppose that the next day the admiral does order a sneak attack on the enemy ships which starts a sea battle. The eternalist says that, if this is so, then the sentence token about the sea battle was true yesterday at the time it was uttered. Truth is eternal or fixed, eternalists say, and the predicate is true is a timeless predicate, not one that merely means is true now. The sentence spoken now has a truthmaker within the block at a future time, even though the event has not yet happened and so the speaker has no access to that truthmaker. These B-theory philosophers point favorably to the ancient Greek philosopher Chrysippus, who was convinced that a contingent sentence about the future is simply true or false, even if we do not know which.

Many other philosophers, usually in McTaggart’s A-camp, agree with Aristotle’s suggestion that the sentence about the future sea battle is not true (or false) until the battle occurs (or does not). Predictions fall into the truth-value gap. This position that contingent sentences have no classical truth-values when uttered is called the doctrine of the open future and also the Aristotelian position because many researchers throughout history have taken Aristotle to have been holding the position in chapter 9 of his On Interpretation—although today it is not so clear that Aristotle himself held the position.

One principal motive for adopting the Aristotelian position arises from the belief that, if sentences about future human actions are now true, then humans are determined to perform those actions, and so humans have no free will. To defend free will, we must deny truth-values to predictions.

This Aristotelian argument against predictions being true or false has been discussed as much as any in the history of philosophy, and it faces a series of challenges. First, if there really is no free will, or if free will is compatible with determinism, then the motivation to deny truth-values to predictions is undermined.

Second, according to many compatibilists, though not all, your choices do affect the world, as the libertarians believe they must; but, if it is true now that you will perform an action in the future, it does not follow that you will not perform it freely, nor that you would not have been free to do otherwise had your intentions been different, but only that you will not do otherwise. For more on this point about modal logic, see the discussion of it in Foreknowledge and Free Will.

A third challenge, from Quine and others, claims the Aristotelian position wreaks havoc with the logical system we use to reason and argue with predictions. For example, here is a deductively valid argument, presumably:

If there will be a sea battle tomorrow, then we should wake up the admiral.

There will be a sea battle tomorrow.

So, we should wake up the admiral.

Without both premises in this argument having truth-values, that is, being true or false, we cannot properly assess the argument using the usual standards of deductive validity because this standard is about the relationships among truth-values of the component sentences—that a valid argument cannot possibly have true premises and a false conclusion. Unfortunately, the Aristotelian position says that some of these component sentences are neither true nor false. So, logic does not apply. Surely, then, the Aristotelian position is implausible.
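The complaint can be made concrete with a small truth-table search, a sketch that assumes the standard classical definition of validity and represents the Aristotelian position with a single extra “gap” value:

```python
# Classically, an argument is valid iff no assignment of truth-values
# makes all premises true and the conclusion false. Checking modus
# ponens ("if P then Q; P; therefore Q") over the two classical values
# confirms validity; adding an Aristotelian "gap" value produces rows
# that the classical definition simply does not cover.
from itertools import product

def implies(p, q):
    """Classical material conditional, defined only for True/False."""
    return (not p) or q

# Classical check: search for a counterexample row.
classical_counterexamples = [
    (p, q) for p, q in product([True, False], repeat=2)
    if implies(p, q) and p and not q     # premises true, conclusion false
]
print(classical_counterexamples)          # [] -> modus ponens is valid

GAP = "gap"                               # Aristotelian truth-value gap
rows_without_classical_values = [
    (p, q) for p, q in product([True, False, GAP], repeat=2)
    if GAP in (p, q)
]
print(len(rows_without_classical_values))  # 5
```

The classical search over the four two-valued rows finds no counterexample, so modus ponens is valid; but five of the nine three-valued rows contain a gap, and about those rows the classical definition of validity says nothing, which is exactly the havoc Quine alleges.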

In reaction to this third challenge, proponents of the Aristotelian argument say that if Quine would embrace tensed propositions and expand his classical logic to a tense logic, he could avoid those difficulties in assessing the validity of arguments that involve sentences having future tense.

Quine has claimed that the analysts of our talk involving time should in principle be able to eliminate the temporal indexical words such as now and tomorrow because their removal is needed for fixed truth and falsity of our sentences [fixed in the sense of being eternal or complete sentences whose truth-values are not relative to the situation and time of utterance because the indexicals and indicator words have been replaced by expressions for specific times, places and names, and whose verbs are treated as timeless and tenseless], and having fixed truth-values is crucial for the logical system used to clarify science. “To formulate logical laws in such a way as not to depend thus upon the assumption of fixed truth and falsity would be decidedly awkward and complicated, and wholly unrewarding,” says Quine. For a criticism of Quine’s treatment of indexicals, see (Slater 2012, p. 72).

Philosophers are divided on all these issues.

e. Essentially-Tensed Facts

Using a tensed verb is a grammatical way of locating an event in time. All the world’s cultures have a conception of time, but only half the world’s languages use tenses. English has tenses, but the Chinese, Burmese, and Malay languages do not. The English language distinguishes “Her death has happened” from “Her death will happen.” However, English also expresses time in other ways: with the adverbial phrases now and twenty-three days ago, with the adjective phrases new and ancient, and with the prepositions until and since.

Philosophers have asked what we are basically committed to when we use tense to locate an event in time. There are two principal answers: tenses are objective, and tenses are subjective. The two answers have given rise to two competing camps of philosophers of time.

The first answer is that tenses represent objective features of reality that are not captured by the B-theory, nor by eternalism, nor by the block-universe approach. This philosophical theory is said to “take tense seriously” and is called the tensed theory of time. The theory claims that, when we learn the truth-values of certain tensed sentences, we obtain knowledge which tenseless sentences do not and cannot provide, for example, that such and such a time is the present time. Tenses are almost the same as what is represented by positions in McTaggart‘s A-series, so the theory that takes tense seriously is commonly called the A-theory of tense, and its advocates are called tensers.

A second, contrary answer to the question of the significance of tenses is that they are merely subjective. Tensed terms have an indexical feature which is specific to the subject doing the speaking, but this feature has no ontological significance. Saying the event happened rather than is happening indicates that the subject or speaker said this after the event happened rather than before or during the event. Tenses are about speakers, not about some other important ontological characteristic of time in the world. This theory is the B-theory of tense, and its advocates are called detensers. The detenser W.V.O. Quine expressed the position this way:

Our ordinary language shows a tiresome bias in its treatment of time. Relations of date are exalted grammatically…. This bias is of itself an inelegance, or breach of theoretical simplicity. Moreover, the form that it takes—that of requiring that every verb form show a tense—is peculiarly productive of needless complications, since it demands lipservice to time even when time is farthest from our thoughts. Hence in fashioning canonical notations it is usual to drop tense distinctions (Word and Object §36).

The philosophical disagreement about tenses is not so much about tenses in the grammatical sense, but rather about the significance of the distinctions of past, present, and future which those tenses are used to mark.

The controversy is often presented as a dispute about whether tensed facts exist, with advocates of the tenseless theory objecting to tensed facts and advocates of the tensed theory promoting them as essential. The primary function of tensed facts is to make tensed sentences true, to be their truthmakers.

The B-theorist says tensed facts are not needed to account for why tensed sentences get the truth values they do.

Consider the tensed sentence, “Queen Anne of Great Britain died.” The A-theorist says the truthmaker is simply the tensed fact that the death has pastness. The B-theorist gives a more complicated answer by saying the truthmaker is the fact that the time of Queen Anne’s death is-less-than the time of uttering the above sentence. Notice that the B-answer does not use any words in the past tense. According to the classical B-theorist, the use of tense (and more importantly, any appeal to tensed facts) is an extraneous and eliminable feature of our language at the fundamental level, as are all other uses of the terminology of the A-series (except in trivial instances such as “The A-series is constructed using A-facts”).

This B-theory analysis is challenged by the tenser’s A-theory on the grounds that it can succeed only for utterances or readings or inscriptions, but the A-theorist points out that a proposition can be true even if never uttered, never read, and never inscribed.

There are other challenges to the B-theory. Roderick Chisholm and A.N. Prior claim that the word “is” in the sentence “It is now midnight” is essentially present-tensed because there is no adequate translation using only tenseless verbs. Trying to give a B-style analysis of it, such as, “There is a time t such that t = midnight,” is to miss the essential reference to the present in the original sentence because the original sentence is not always true, but the sentence “There is a time t such that t = midnight” is always true. So, the tenseless analysis fails. There is no escape from this criticism by adding “and t is now” because this last indexical phrase needs its own analysis, and we are starting a vicious regress. John Perry famously explored this argument in his 1979 article, “The Problem of the Essential Indexical.”

Prior, in (Prior 1959), supported the tensed A-theory by arguing that after experiencing a painful event,

one says, e.g., “Thank goodness that’s over,” and [this]…says something which it is impossible that any use of a tenseless copula with a date should convey. It certainly doesn’t mean the same as, e.g., “Thank goodness the date of the conclusion of that thing is Friday, June 15, 1954,” even if it be said then. (Nor, for that matter, does it mean “Thank goodness the conclusion of that thing is contemporaneous with this utterance.” Why should anyone thank goodness for that?).

Prior’s criticism of the B-theory involves the reasonableness of our saying of some painful, past event, “Thank goodness that is over.” The B-theorist cannot explain this reasonableness, he says, because no B-theorist should thank goodness that the end of their pain happens before their present utterance of “Thank goodness that is over,” since that B-fact or B-relationship is timeless; it has always held and always will. The only way then to make sense of our saying “Thank goodness that is over” is to assume we are thankful for the A-fact that the pain event is in the past, that is, we are thankful for its pastness. But if so, then the A-theory is correct and the B-theory is incorrect.

One B-theorist response is simply to disagree with Prior that it is improper for a B-theorist to thank goodness that the end of their pain happens before their present utterance, even though this is an eternal B-fact. Still another response from the B-theorist comes from the 4-dimensionalist who says that as 4-dimensional beings it is proper for us to care more about our later time-slices than our earlier time-slices. If so, then it is reasonable to thank goodness that the time slice at the end of the pain occurs before the time slice in which we are saying, “Thank goodness that is over.” Admittedly this is caring about an eternal B-fact. So, Prior’s premise [that the only way to make sense of our saying “Thank goodness that is over” is to assume we are thankful for the A-fact that the pain event has pastness] is a faulty premise, and Prior’s argument for the A-theory is unsuccessful.

D.H. Mellor and J.J.C. Smart, both proponents of the B-theory, agree that tensed talk is important, and can be true, and even be essential for understanding how we think and speak; but Mellor and Smart claim that tensed talk is not essential for describing extra-linguistic reality and that the extra-linguistic reality does not contain tensed facts corresponding to true, tensed talk. These two philosophers, and many other philosophers who “do not take tense seriously,” advocate a newer tenseless B-theory by saying the truth conditions of any tensed, declarative sentence can be explained without tensed facts even if Chisholm and Prior and other A-theorists are correct that some tensed sentences in English cannot be adequately translated into tenseless ones.

The truth conditions of a sentence are the conditions which must be satisfied in the world in order for the sentence to be true. The sentence “Snow is white” is true on the condition that snow is white. More particularly, it is true if whatever is referred to by the term ‘snow’ satisfies the predicate ‘is white’. Regarding if-then sentences, the conditions under which the sentence “If it is snowing, then it is cold” are true are that it is not both true that it is snowing and false that it is cold. Other analyses are offered for the truth conditions of sentences that are more complex grammatically. Alfred Tarski has provided these analyses in his semantic theory of truth.

Mellor and Smart agree that truth conditions can adequately express the meaning of tensed sentences, or at least all that is important about the meaning when it comes to describing objective reality. This is a philosophically controversial point, but Mellor and Smart accept it, and they argue that therefore there is really no need for tensed facts and tensed properties. The untranslatability of some tensed sentences merely shows a fault with ordinary language’s ability to characterize objective, tenseless reality. If the B-theory, in accounting for the truth conditions of an A-sentence, fails to account for the full meaning of the A-sentence, then this is because of a fault with the A-sentence, not the B-theory.

Let us make the same point in other words. According to the newer B-theory of Mellor and Smart, if I am speaking to you and say, “It is now midnight,” then this sentence admittedly cannot be translated into tenseless terminology without some loss of meaning, but the truth conditions can be explained fully with tenseless terminology. The truth conditions of “It is now midnight” are that my utterance occurs (in the tenseless sense of occurs) at very nearly the same time as your hearing the utterance, which in turn is the same time as when our standard clock declares the time to be midnight in our reference frame. In brief, it is true just in case it is uttered at midnight. Notice that no tensed facts are appealed to in this explanation of the truth conditions.
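The token-reflexive truth condition just stated can be rendered as a toy model; the data structure and the clock representation are illustrative assumptions, not a formalism found in Mellor or Smart:

```python
# Tenseless, token-reflexive truth conditions for the tensed sentence
# "It is now midnight": an utterance token is true just in case the
# standard clock reads midnight at the time of that very utterance.
from dataclasses import dataclass

@dataclass
class Utterance:
    sentence: str
    clock_time: str   # standard-clock reading at the moment of utterance

def token_is_true(u):
    """Truth condition stated without tensed facts: the token is true
    iff it is uttered when the clock reads midnight ('00:00')."""
    return u.sentence == "It is now midnight" and u.clock_time == "00:00"

print(token_is_true(Utterance("It is now midnight", "00:00")))  # True
print(token_is_true(Utterance("It is now midnight", "12:00")))  # False
```

No tensed fact appears anywhere in the condition; the only ingredients are the utterance token, the clock reading, and a tenseless relation between them, which is the new B-theorist's point.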

Similarly, an advocate of the new tenseless theory will say it is not the pastness of the painful event that explains why I say, “Thank goodness that’s over” after exiting the dentist’s chair. I say it because I believe that the time of the occurrence of that utterance is greater than the time of the occurrence of the painful event, and because I am glad about this; and even though it was true even last month that the one time occurred before the other, I am happy to learn this. Of course, I would be even gladder if there were no pain at any time. I may not be consciously thinking about the time of the utterance when I make it; nevertheless, that time is what helps explain what I am glad about. Being thankful for the pastness of the painful event provides a simpler explanation, actually a simplistic explanation, but not a better explanation.

In addition, it is claimed by Mellor and other new B-theorists that tenseless sentences can be used to explain the logical relations between tensed sentences; they can be used to explain why one tensed sentence implies another, is inconsistent with yet another, and so forth. According to this new theory of tenseless time, once it is established that the truth conditions of tensed sentences can be explained without utilizing tensed facts, then Ockham’s Razor is applied. If we can do without essentially-tensed facts, then we should say essentially-tensed facts do not exist.

To summarize, tensed facts were presumed by the A-theory to be needed to be the truthmakers for the truth of tensed talk; but proponents of the new B-theory claim their analysis shows that ordinary tenseless facts are adequate. The B-theory concludes that we should “not take tense seriously” in the sense of requiring tensed facts to account for the truth and falsity of sentences involving tenses because tensed facts are not actually needed.

Proponents of the tensed theory of time do not agree with this conclusion. They will insist there are irreducible A-properties and that what I am glad about when a painful event is over is that the event is earlier than now, that is, has pastness. Quentin Smith says, more generally, that the “new tenseless theory of time is faced with insurmountable problems, and that it ought to be abandoned in favor of the tensed theory.”

The advocate of the A-theory E.J. Lowe opposed the B-theory because it conflicts so much with the commonsense image of time:

I consider it to be a distinct merit of the tensed view of time that it delivers this verdict, for it surely coincides with the verdict of common sense (Lowe, 1998, p. 104).

Lowe argued that no genuine event can satisfy a tenseless predicate, and no truth can be made true by B-theory truth conditions because all statements of truth conditions are tensed.

So, the philosophical debate continues over whether tensed concepts have semantical priority over untensed concepts, and whether tensed facts have ontological priority over untensed facts.

15. The Arrow of Time

If you are shown an ordinary movie and also shown the same movie running in reverse, you have no trouble telling which is which because it is so easy for you to detect the one in which time’s arrow is pointing improperly. Clues could be that an omelet turns back into unbroken eggs and everyone walks backwards. Philosophers of physics want to know the origin and nature of this arrow. There is considerable disagreement about what it is, what counts as an illustration of it, how to explain it, and even how to define the term. The main two camps disagree about whether (1) there is an intrinsic arrow of time itself that is perhaps due to its flow or to more events becoming real, or (2) there is only an extrinsic arrow due to so many of nature’s processes spontaneously going in only one direction. Those in the intrinsic camp often accuse those in the other camp of scientism; those in the extrinsic camp often accuse those in the other camp of subjectivism and an over-emphasis on the phenomenology of temporal awareness.

Arthur Eddington first used the term “time’s arrow” in 1927. The presence of the arrow implies, among other things, that tomorrow always will be different from today in many ways: people grow older rather than younger; metal naturally rusts but does not un-rust; eggs break but never un-break. Ideally, we can explain all this rather than simply assume it. To do this, there must be some assumption somewhere that is time-asymmetric, that prefers one direction in time to the other. In the search for that assumption, some recommendations are to: (a) find a significant fundamental law of physics that requires one-way behavior in time, or (b) assume a special feature at the origin of time that directs time to start out going in only one direction and keep going that way, or (c) assume arrow-ness or directedness is an intrinsic feature of time itself.

There is no hope the time-asymmetry will show up in a fundamental law. Although the universe is filled with one-way processes, these are all macroprocesses, not micro-physical ones. At the most fundamental, micro-physical level, nearly all the laws of physics reveal no requirement that any process must go one way in time rather than the reverse. The exceptions involve rarely occurring weak interactions that all experts agree have nothing to do with time’s arrow.

Many experts in the extrinsic camp suggest that the presence of time’s arrow is basically a statistical issue involving increased disorder (the technical term is “entropy increase”) plus a special low-entropy configuration of nature early in the cosmic big bang, with the target of the arrow being thermodynamic equilibrium in the very distant future when the universe’s average temperature approaches absolute zero. These experts point to the second law of thermodynamics as the statistical law that gives a quantitative description of entropy increase. Experts in the intrinsic camp disagree with this kind of explanation of the arrow. They say the one-way character of time is not fundamentally a statistical issue involving processes but rather is intimately tied to the passage of time itself, to its intrinsic and uninterrupted flow.

There are a wide variety of special kinds of processes with their own mini-arrows. The human mind can know the past more easily than the future (the knowledge arrow). Heat flows from hot to cold (the thermodynamic arrow). The cosmos expands and does not shrink (the cosmological arrow). Light rays expand away from a light bulb rather than converge into it (the electromagnetic arrow). These mini-arrows are deep and interesting asymmetries of nature, and philosophers and physicists would like to know how the mini-arrows are related to each other.

Some philosophers have even asked whether there could be distant regions of space and time where time’s arrow runs in reverse compared to our arrow. If so, would adults there naturally walk backwards on the way to their infancy while they remember the future?

For more discussion of time’s arrow, see (Carroll 2010).

16. Temporal Logic

Temporal logic is the representation of reasoning about time and temporal information by using the methods of symbolic logic in order to formalize which statements imply which others. For example, in McTaggart’s B-series, the most important relation is the happens-before relation on events. Logicians have asked what sort of principles this relation must obey in order to properly account for our reasoning about time.

Here is one suggestion. Consider this informally valid reasoning:

Alice’s arrival at the train station happens before Bob’s. Therefore, Bob’s arrival at the station does not happen before Alice’s.

Let us translate this into classical predicate logic using a domain of instantaneous events, where the individual constant ‘a’ denotes Alice’s arrival at the train station, and ‘b’ denotes Bob’s arrival at the train station. Let the two-place or two-argument relation ‘Bxy’ be interpreted as x happens before y—the key relation of McTaggart’s B-series. The direct translation of the above informal argument produces one premise with one conclusion:

Bab
——-
~Bba

The symbol ‘~’ is the negation operator; some logicians prefer to use the symbol ‘¬’ and others prefer to use ‘–’. Unfortunately, our simple formal argument is invalid. To make it valid, we can add a semantic principle about the happens-before relation, namely the premise that the B relation is asymmetric. That is, we can add this additional premise to the argument:

∀x∀y[Bxy → ~Byx]

The symbol ‘∀x’ is the universal quantifier on the variable ‘x’. Some logicians prefer to use ‘(x)’ for the universal quantifier. The symbol ‘→’ is the conditional operator or if-then operator; some logicians prefer to use the symbol ‘⊃’ instead.

In other informally valid reasoning, we discover a need to make even more assumptions about the happens-before relation. For example, suppose Alice arrives at the train station before Bob, and suppose Bob arrives there before Carol. Is it valid reasoning to infer that Alice arrives before Carol? Yes, but if we translate directly into classical predicate logic we get this invalid argument:

Bab
Bbc
——
Bac

To make this argument valid, we can add the premise that says the happens-before relation is transitive, that is:

∀x∀y∀z [(Bxy & Byz) → Bxz]

The symbol ‘&’ represents the conjunction operation. Some logicians prefer to use either the symbol ‘·’ or ‘∧’ for conjunction. The transitivity of B is a principle we may want to add to our temporal logic.
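The effect of adding these semantic premises can be illustrated with a small finite model; the sketch below, whose domain of events and relation B are invented for illustration, represents B as a set of ordered pairs and checks that asymmetry and transitivity hold, so that Bab licenses ~Bba, and Bab together with Bbc licenses Bac.

```python
# A toy finite model of the happens-before relation B,
# represented as a set of ordered (earlier, later) pairs.
events = {"a", "b", "c"}                  # Alice's, Bob's, Carol's arrivals
B = {("a", "b"), ("b", "c"), ("a", "c")}  # the transitive closure

def asymmetric(rel):
    """forall x,y: Bxy -> ~Byx"""
    return all((y, x) not in rel for (x, y) in rel)

def transitive(rel, dom):
    """forall x,y,z: (Bxy & Byz) -> Bxz"""
    return all((x, z) in rel
               for x in dom for y in dom for z in dom
               if (x, y) in rel and (y, z) in rel)

assert asymmetric(B)       # so Bab licenses ~Bba
assert transitive(B, events)  # so Bab and Bbc license Bac
```

On this model-checking approach, an added premise such as asymmetry is not a logical truth but a constraint we verify (or stipulate) for the intended interpretation of B.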

What other constraints should be placed on the B relation (when it is to be interpreted as the happens-before relation)? Here are some of the many suggestions:

  • ∀x∀y{Bxy → [t(x) < t(y)]}. If x happens before y, then the time coordinate of x is less than the time coordinate of y. ‘t’ is a one-argument function symbol.
  • ∀x~Bxx. An event cannot happen before itself.
  • ∀x∀y{[t(x) ≠ t(y)] → [Bxy v Byx]}. Any two non-simultaneous events are connected by the B relation. That is, there are no temporally unrelated pairs of events. (In 1781 in his Critique of Pure Reason, Kant says this is an a priori necessary requirement about time.)
  • ∀x∃yBxy. Time is infinite in the future.
  • ∀x∀y(Bxy → ∃z(Bxz & Bzy)). B is dense in the sense that there is a third point event between any pair of non-simultaneous point events. This prevents quantized time.
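Several of these constraints can be checked mechanically on a finite model in which each event carries a time coordinate. Density and the infinity of the future cannot hold in any finite model, so the sketch below, with invented events and coordinates, checks only the first three suggestions.

```python
# Toy model: each event has a time coordinate t(x); define Bxy as
# t(x) < t(y) and check the finite-domain constraints listed above.
t = {"e1": 0.0, "e2": 1.5, "e3": 3.0}   # hypothetical coordinates
dom = set(t)
B = {(x, y) for x in dom for y in dom if t[x] < t[y]}

# forall x,y: Bxy -> t(x) < t(y)   (holds here by construction)
assert all(t[x] < t[y] for (x, y) in B)

# forall x: ~Bxx   (no event happens before itself)
assert all((x, x) not in B for x in dom)

# forall x,y: t(x) != t(y) -> (Bxy v Byx)   (comparability)
assert all((x, y) in B or (y, x) in B
           for x in dom for y in dom if t[x] != t[y])
```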

To incorporate the ideas of the theory of relativity, we might want to make the happens-before relation be a three-place relation instead of a two-place relation by having it relate two events plus a reference frame.

When we formalized these principles of reasoning about the happens-before relation by translating them into predicate logic, we said we were creating temporal logic. However, strictly speaking, a temporal logic is just a theory of temporal sentences expressed in a formal logic. Calling it a logic, as is commonly done, is a bit of an exaggeration; it is analogous to calling the formalization of Peano’s axioms of arithmetic the development of number logic. Our axioms about B are not axioms of predicate logic, but only of a theory that uses predicate logic and that presumes the logic is interpreted on a domain of instantaneous events, and that presumes B is not open to re-interpretation as are the other predicate letters of predicate logic, but is always to be interpreted as happens-before.

The more classical approach to temporal logic, however, does not add premises to arguments formalized in classical predicate logic as we have just been doing. The classical approach is via tense logic, a formalism that adds tense operators on propositions of propositional logic or predicate logic. A. N. Prior was the pioneer in the late 1950s. Michael Dummett and E. J. Lemmon also made major, early contributions to tense logic. Prior created this new logic to describe our reasoning involving time phrases such as now, happens before, twenty-three minutes afterward, at all times, and sometimes. He hoped that a precise, formal treatment of these concepts could lead to the resolution of some of the controversial philosophical issues about time.

Prior begins with an important assumption: that a proposition such as “Custer dies in Montana” can be true at one time and false at another time. That assumption is challenged by some philosophers, such as W.V.O. Quine, who recommended avoiding the use of this sort of proposition. He recommended that temporal logics use only sentences that are timelessly true or timelessly false.

Prior’s main original idea was to appreciate that time concepts are similar in structure to modal concepts such as it is possible that and it is necessary that. He adapted modal propositional logic for his tense logic by re-interpreting its propositional operators. Or we can say he added four new propositional operators. Here they are with examples of their intended interpretations using an arbitrary present-tensed proposition p.

Pp “It has at some time been the case that p”
Fp “It will at some time be the case that p”
Hp “It has always been the case that p”
Gp “It will always be the case that p”

‘Pp’ might be interpreted also as at some past time it was the case that, or it once was the case that, or it once was that, all these being equivalent English phrases for the purposes of applying tense logic to English. None of the tense operators are truth-functional.

One standard system of tense logic is a variant of the S4.3 system of modal logic. In this formal tense logic, if p represents the present-tensed proposition “Custer dies in Montana,” then Pp represents “It has at some time been the case that Custer dies in Montana” which is equivalent in English to simply “Custer died in Montana.” So, we properly call ‘P‘ the past-tense operator. It represents a phrase that attaches to a sentence and produces another that is in the past tense.

Metaphysicians who are presentists are especially interested in this tense logic because, if presentists can make do with the variable p ranging only over present-tensed propositions, then this logic, with an appropriate semantics, may show how to eliminate any ontological commitment to the past (and future) while preserving the truth of past tense propositions that appear in biology books such as “There were dinosaurs” and “There was a time when the Earth did not exist.”

Prior added to the axioms of classical propositional logic the axiom:

P(p v q) ↔ (Pp v Pq).

The axiom says that for any two propositions p and q, at some past time it was the case that p or q if and only if either at some past time it was the case that p or at some past time (perhaps a different past time) it was the case that q.

If p is the proposition “Custer dies in Montana” and q is “Sitting Bull dies in Montana,” then:

P(p v q) ↔ (Pp v Pq)

says:

Custer or Sitting Bull died in Montana if and only if either Custer died in Montana or Sitting Bull died in Montana.

The S4.3 system’s key axiom is the following equivalence. For all propositions p and q,

(Pp & Pq) ↔ [P(p & q) v P(p & Pq) v P(q & Pp)].

This axiom, when interpreted in tense logic, captures part of our ordinary conception of time as a linear succession of states of the world.

Another axiom of tense logic relates the two operators for the past. If H is the operator It has always been the case that, then a new axiom might be:

Pp ↔ ~H~p.

This axiom of tense logic is analogous to the modal logic axiom that p is possible if and only if it is not necessary that not-p.

A tense logic will need additional axioms in order to express q has been true for the past two weeks. Prior and others have suggested a wide variety of additional axioms for tense logic. It is controversial whether to add axioms that express the topology of time, for example that it comes to an end or does not come to an end or that time is like a straight line instead of a circle; the reason usually given is that this is an empirical matter, not a matter for logic to settle.

Regarding a semantics for tense logic, Prior had the idea that the truth or falsehood of a tensed proposition could be expressed in terms of truth-at-a-time. For example, the proposition Pp (it was once the case that p) is true-at-a-time t if and only if p is true-at-a-time earlier than t. This suggestion has led to extensive development of the formal semantics for tense logic.
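Prior’s truth-at-a-time clause can be sketched over a finite linear timeline, with a proposition modeled as the set of times at which it is true. The valuation below is invented for illustration, and the finite timeline is only a stand-in for the intended (possibly infinite) ordering of times.

```python
# Truth-at-a-time semantics for Prior's tense operators over a
# finite linear timeline 0..4. A proposition is modeled as the
# set of times at which it is true.
TIMES = range(5)
p = {2}        # hypothetical: p is true only at time 2
q = {0, 4}     # hypothetical: q is true at times 0 and 4

def P(prop, t): return any(s in prop for s in TIMES if s < t)  # once was
def F(prop, t): return any(s in prop for s in TIMES if s > t)  # will be
def H(prop, t): return all(s in prop for s in TIMES if s < t)  # always was
def G(prop, t): return all(s in prop for s in TIMES if s > t)  # always will be

assert P(p, 3) and not P(p, 1)   # p was once true at t=3, but not yet at t=1
# Prior's axiom P(p v q) <-> (Pp v Pq): disjunction is union of truth-sets.
assert all(P(p | q, t) == (P(p, t) or P(q, t)) for t in TIMES)
# The duality Pp <-> ~H~p, analogous to possibly-p <-> not-necessarily-not-p.
not_p = set(TIMES) - p
assert all(P(p, t) == (not H(not_p, t)) for t in TIMES)
```

Because each operator looks at truth at other times, the model also makes vivid why none of the tense operators are truth-functional: the value of Pp at t is not determined by the value of p at t alone.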

Prior himself did not take a stand on which formal logic and formal semantics are correct for dealing with temporal expressions.

The concept of being in the past is usually treated by metaphysicians as a predicate that assigns properties to events, for example, “The event of Queen Anne’s dying has the property of being in the past”; but, in the tense logic just presented, the concept is treated as an operator P upon propositions, “It has at some time in the past been the case that Queen Anne is dying,” and this difference in treatment is objectionable to some metaphysicians.

The other major approach to temporal logic does not use a tense logic. Instead, it formalizes temporal reasoning within a first-order logic without modal-like tense operators. One method for developing ideas about temporal logic is the method of temporal arguments which adds an additional temporal argument to any predicate involving time in order to indicate how its satisfaction depends on time. Instead of translating the x is resting predicate as Px, where P is a one-argument predicate, it could be translated into temporal predicate logic as the two-argument predicate Rxt, and this would be interpreted as saying x is resting at time t. P has been changed to a two-argument predicate R by adding a place for a temporal argument. The time variable t is treated as a new sort of variable requiring new axioms to more carefully specify what can be assumed about the nature of time.

Occasionally the method of temporal arguments uses a special constant symbol, say n, to denote now, the present time. This helps with the translation of common temporal sentences. For example, let the individual constant s denote Socrates, and let Rst be interpreted as “Socrates is resting at t.” The false sentence that Socrates has always been resting would be expressed in this first-order temporal logic as:

∀t(Ltn → Rst)

Here L is the two-argument predicate for numerically less than that mathematicians usually write as <. And we see the usefulness of having the symbol n.
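The temporal-arguments translation can likewise be evaluated in a small model. Here ‘n’, ‘L’, and ‘R’ follow the text, while the particular times and resting facts are invented so that the sentence comes out false, as the text says it is.

```python
# Method of temporal arguments: evaluate forall t (Ltn -> Rst),
# "Socrates has always been resting," in a toy model.
TIMES = [0, 1, 2, 3]
n = 3                                    # 'n' denotes the present time
resting = {("socrates", 1), ("socrates", 2), ("socrates", 3)}

def L(t1, t2): return t1 < t2            # Ltn: t is earlier than n
def R(x, t): return (x, t) in resting    # Rxt: x is resting at t

always_rested = all(R("socrates", t) for t in TIMES if L(t, n))
print(always_rested)   # False: Socrates was not resting at time 0
```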

If tense logic is developed using a Kripke semantics of possible worlds, then it is common to alter the accessibility relation between any two possible worlds by relativizing it to a time. The point is to show that some old possibilities are no longer possible. For example, a world in which Hillary Clinton becomes the first female U.S. president in 2016 was possible relative to the actual world of 2015, but not relative to the actual world of 2017. There are other complexities. Within a single world, if we are talking about a domain of people containing, say, Socrates, then we want the domain to vary with time since we want Socrates to exist at some times but not at others. Another complexity is that in any world, what event is simultaneous with what other event should be relativized to a reference frame, as is required by Einstein’s theory of relativity.

Some temporal logics have a semantics that allows sentences to lack both classical truth-values. The first person to give a clear presentation of the implications of treating declarative sentences as being neither true nor false was the Polish logician Jan Lukasiewicz in 1920. To carry out Aristotle’s suggestion that future contingent sentences do not yet have truth-values, he developed a three-valued symbolic logic, with each grammatical declarative sentence having just one of the three truth-values True, or False, or Indeterminate [T, F, or I]. Contingent sentences about the future, such as, “There will be a sea battle tomorrow,” are assigned an I value in order to indicate the indeterminacy of the future. Truth tables for the connectives of propositional logic are redefined to maintain logical consistency and to maximally preserve our intuitions about truth and falsehood. See (Haack 1974) for more details about this application of three-valued logic.
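Łukasiewicz’s redefined connectives can be sketched numerically by mapping T, I, F to 1, ½, 0; under his tables, negation is 1 − v, conjunction is min, disjunction is max, and the conditional is min(1, 1 − a + b). The numerical encoding is a common presentation, not Łukasiewicz’s own notation.

```python
# Lukasiewicz three-valued logic with truth-values T=1, I=0.5, F=0.
T, I, F = 1.0, 0.5, 0.0

def neg(a): return 1 - a                  # negation
def conj(a, b): return min(a, b)          # conjunction
def disj(a, b): return max(a, b)          # disjunction
def impl(a, b): return min(1, 1 - a + b)  # Lukasiewicz conditional

sea_battle = I   # "There will be a sea battle tomorrow": indeterminate
# Excluded middle fails for future contingents:
print(disj(sea_battle, neg(sea_battle)))  # 0.5, not 1
# But "if a sea battle, then a sea battle" remains fully true:
print(impl(sea_battle, sea_battle))       # 1.0
```

Restricted to the values 1 and 0, these definitions agree with the classical truth tables, which is one sense in which the redefinition “maximally preserves our intuitions about truth and falsehood.”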

For an introduction to temporal logics and their formal semantics, see (Øhrstrøm and Hasle 1995).

17. Time, Mind, and Experience

The principal philosophical issue about time and mind is to specify how time is represented in the mind; and the principal scientific issue in cognitive neuroscience is to uncover the neurological basis of our sense of time.

Our experience reveals time to us in many ways: (1) We notice some objects changing over time and some other objects persisting unchanged. (2) We detect some events succeeding one another. (3) We notice that some similar events have different durations. (4) We seem to automatically classify events as present, past, or future, and we treat those events differently depending upon how they are classified. For example, we worry more about future pain than past pain.

Neuroscientists and cognitive scientists know that these ways of experiencing time exist, but not why they exist. Humans do not need to consciously learn these skills any more than they need to learn how to be conscious. It’s just something that grows or is acquired naturally. It’s something that appears due to a human being’s innate biological nature coupled with the prerequisites of a normal human environment—such as an adequate air supply, warmth, food, and water. A tulip could be given the same prerequisites, but it would never develop anything like our time consciousness. But neuroscientists do not yet understand the details of how our pre-set genetic program produces time consciousness, although there is agreement that the genes themselves are not conscious in any way.

A minority of philosophers, the panpsychists, would disagree with these neurophysiologists and say genes have proto-mental properties and proto-consciousness and even proto-consciousness of time. Critics remark sarcastically that our genes must also have the proto-ability to pay our taxes on time. The philosopher Colin McGinn, who is not a panpsychist, has some sympathies with the panpsychist position. He says genes:

contain information which is such that if we were to know it we would know the solution to the mind-body problem. In a certain sense, then, the genes are the greatest of philosophers, the repositories of valuable pieces of philosophical information. (McGinn 1999, p. 227)

No time cell or master clock has been discovered so far in the human body, despite much searching, so many neuroscientists have come to believe there are no such things to be found. Instead, the neurological basis of our time sense probably has to do with coordinated changes in a network of neurons (and glial cells, especially astrocytes) that somehow encodes time information. Our brain cells, the neurons, are firing all at once, but they are organized somehow to produce a single conscious story in perceived, linear time. Although the details are not well understood by neuroscientists, there is continual progress. One obstacle is complexity. The human central nervous system is the most complicated known structure in the universe.

Cognitive neuroscientists want to know the neural mechanisms that account for our awareness of change, for our ability to anticipate the future, for our sense of time’s flow, for our ability to place remembered events into the correct time order (temporal succession), for our construction of a specious present, for our understanding of tenses, for our ability to notice and often accurately estimate durations, and for our ability to keep track of durations across many different time scales, such as milliseconds for some events and years for others.

It surely is the case that our body is capable of detecting very different durations even if we are not conscious of doing so. When we notice that the sound came from our left, not right, we do this by unconsciously detecting the very slight extra time it takes the sound to reach our right ear, which is only an extra 0.0005 seconds after reaching our left ear. The unconscious way we detect this difference in time must be very different from the way we detect differences in years. Also, our neurological and psychological “clocks” very probably do not work by our counting ticks and tocks as do the clocks we build in order to measure physical time.
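That 0.0005-second figure is roughly the travel time of sound across the width of a head. A back-of-the-envelope check, assuming a typical ear separation of about 0.17 m and a speed of sound in air of about 343 m/s:

```python
# Rough interaural time difference for a sound arriving from one side.
ear_separation_m = 0.17     # assumed typical distance between the ears
speed_of_sound_mps = 343.0  # speed of sound in air at room temperature

delay_s = ear_separation_m / speed_of_sound_mps
print(round(delay_s, 4))    # ~0.0005 seconds
```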

We are consciously aware of time passing by noticing changes either outside or inside our body. For example, we notice a leaf fall from a tree as it acquires a new location. If we close our eyes, we still can encounter time just by imagining a leaf falling. But scientists and philosophers want more details. How is this conscious encounter with time accomplished, and how does it differ from our unconscious awareness of time?

With the notable exception of Husserl, most philosophers say our ability to imagine other times is a necessary ingredient in our having any consciousness at all. Some say our consciousness is a device that stores information about the past in order to predict the future. Although some researchers believe consciousness is a hard problem to understand, some others have said, “Consciousness seems easy to me: it’s merely the thoughts we can remember.” We remember old perceptions, and we make use of our ability to imagine other times when we experience a difference between our present perceptions and our present memories of past perceptions. Somehow the difference between the two gets interpreted by us as evidence that the world we are experiencing is changing through time. John Locke said our train of ideas produces our idea that events succeed each other in time, but he offered no details on how this train does the producing. Surely memory is key. Memories need to be organized into the proper temporal order, in analogy to how a deck of cards, each card bearing a different integer, can be sorted into numerical order. There is a neurological basis to the mental process of time-stamping memories so they are not just a jumble when recalled or retrieved into consciousness. Dogs successfully time-stamp their memories when they remember where they hid their bone and also when they plan for the short-term future by standing at the door to encourage their owner to open it. The human ability to organize memories far surpasses that of any other conscious being. We can decide to do next week what we planned last month because of what happened last year. This is a key part of what makes Homo sapiens sapient.

As emphasized, a major neurological problem is to explain the origin and character of our temporal experiences. How does the brain take the input from all of its sense organs and produce true beliefs about the world’s temporal relationships? Philosophers and cognitive scientists continue to investigate this, but so far there is no consensus on either how we experience temporal phenomena or how we are conscious that we do. However, there is a growing consensus that consciousness itself is an emergent property of a central nervous system, and that dualism between mental properties and physical properties is not a fruitful supposition. The vast majority of neuroscientists are physicalists who treat brains as if they are just wet machines, and they believe consciousness does not transcend scientific understanding.

Neuroscientists agree that the brain takes a pro-active role in building a mental scenario of the external 3+1-dimensional world. As one piece of suggestive evidence, notice that if you look at yourself in the mirror and glance at your left eyeball, then glance at your right eyeball, and then glance back to the left, you can never see your own eyes move. Your brain always constructs a continuous story of non-moving eyes. However, a video camera taking pictures of your face easily records your eyeballs’ movements, proving that your brain has taken an active role in “doctoring” the scenario.

Researchers believe that at all times our mind is testing hypotheses regarding what is taking place beyond our brain. The brain continually receives visual, auditory, tactile, and other sensory signals arriving at different times from an event, then must produce a hypothesis about what the signals might mean. Do those signals mean there probably is a tiger rushing at us? The brain also continuously revises hypotheses and produces new ones in an attempt to have a coherent story about what is out there, what is happening before what, and what is causing what. Being good at unconsciously producing, testing, and revising these hypotheses has survival value.

Psychological time’s rate of passage is a fascinating phenomenon to study. The most obvious feature is that psychological time often gets out of sync with physical time. At the end of our viewing an engrossing television program, we often think, “Where did the time go? It sped by.” When we are hungry in the afternoon and have to wait until the end of the workday before we can have dinner, we think, “Why is everything taking so long?” When we are feeling pain and we look at a clock, the clock seems to be ticking slower than normal.

An interesting feature of the rate of passage of psychological time reveals itself when we compare the experiences of younger people to older people. When we are younger, we lay down richer memories because everything is new. When we are older, the memories we lay down are much less rich because we have “seen it all before.” That is why older people report that a decade goes by so much more quickly than it did when they were younger.

Do things seem to move more slowly when we are terrified? “Yes,” most people would say. “No,” says neuroscientist David Eagleman, “it’s a retrospective trick of memory.” The terrifying event does seem to you to move more slowly when you think about it later, but not at the time it is occurring. Because memories of the terrifying event are “laid down so much more densely,” Eagleman says, it seems to you, upon your remembering, that your terrifying event lasted longer than it really did.

The human being inherited most or perhaps all of its biological clocks from its ancestor species. Although the cerebral cortex is usually considered to be the base for our conscious experience, it is surprising that rats can distinguish a five-second interval from a forty-second interval even with their cerebral cortex removed. So, a rat’s means of sensing time is probably distributed throughout many places in its brain. Perhaps the human being’s time sense is similarly distributed. However, surely the fact that we know that we know about time is specific to our cerebral cortex. A rat does not know that it knows. It has competence without comprehension. A cerebral cortex apparently is required for this comprehension. Very probably no other primate has an appreciation of time as sophisticated as that of a normal human being.

We humans are very good at detecting the duration of silences. We need this ability to tell the difference between the spoken sentence, “He gave her cat-food,” and “He gave her cat food.” The hyphen is the linguistic tool for indicating that the duration between the spoken words “cat” and “food” is shorter than usual. This is a favorite example of the neuroscientist Dean Buonomano.

Philosophers and cognitive neuroscientists want to know whether we have direct experience only of an instantaneous present event or instead we have direct experience only of the specious present, a present event that lasts a short stretch of physical time. Informally, the issue is said to be whether the present is thin or thick. Plato, Aristotle, Thomas Reid, and Alexius Meinong believed in a thin present. Shadworth Hodgson, Mary Calkins and William James believed in a thick present. The latter position is now the more favored one by experts in the fields of neuroscience and philosophy of mind.

If it is thick, then how thick? Does the present last longer than the blink of an eye? Among those accepting the notion of a specious present, a good estimate of its duration is approximately eighty milliseconds for human beings, although neuroscientists do not yet know why it is not two milliseconds or two seconds.

Another issue is about overlapping specious presents. We do seem to have a unified stream of consciousness, but how do our individual specious presents overlap to produce this unity?

When you open your eyes, can you see what is happening now? In 1630, René Descartes would have said yes, but nearly all philosophers in the twenty-first century say no. You see the North Star as it was over 300 years ago, not as it is now. Also, light arriving at your eye from an external object contains information about its color, motion, and form. The three kinds of signals arrive simultaneously, but it takes your brain different times to process that information. Color information is processed more quickly than motion information, which in turn is processed more quickly than form information. Only after the light has taken its time to arrive at your eye, and then you have processed all the information, can you construct a correct story that perhaps says, “A white golf ball is flying toward my head.”

So, we all live in the past, in the sense that our belief about what is happening occurs later than the event itself did according to a clock. Our brain takes about eighty milliseconds to reconstruct a story of what is happening from the information coming in through our different sense organs. Because of its long neck, a giraffe’s specious present might last considerably longer. However, the lag cannot grow too much longer than this, or else the story is so outdated that the organism risks becoming a predator’s lunch. Therefore, evolution has probably fine-tuned the duration of each kind of organism’s specious present.
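To get a feel for the size of this lag, here is a minimal back-of-the-envelope sketch. The eighty-millisecond figure comes from the paragraph above; the 70 m/s golf-ball speed is an assumed illustrative value, not a figure from the article.

```python
# Rough illustration: how far does an object travel during the brain's
# ~80 ms reconstruction lag? The lag figure is from the text; the golf
# ball speed (roughly a hard professional drive) is an assumed value.
PROCESSING_LAG_S = 0.080  # seconds

def distance_moved_m(speed_m_per_s: float, lag_s: float = PROCESSING_LAG_S) -> float:
    """Distance (in meters) an object covers while the percept is being built."""
    return speed_m_per_s * lag_s

# By the time you perceive the golf ball, it is already several meters
# beyond where your percept places it.
print(f"{distance_moved_m(70.0):.1f} m")
```

On these assumptions, the ball moves about 5.6 meters between the event and your percept of it, which is why the lag matters for fast-moving threats.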

In the early days of television broadcasting, engineers worried about the problem of keeping audio and video signals synchronized. Then they accidentally discovered that they had about a tenth-of-a-second of “wiggle room.” As long as the signals arrive within this period, viewers’ brains automatically re-synchronize the signals; outside that tenth-of-a-second period, it suddenly looks like a badly dubbed movie. (Eagleman, 2009)

Watch a bouncing basketball. The light from the bounce arrives at our eyes before the sound reaches our ears; then the brain builds a story in which the sight and sound of the bounce happen simultaneously. This subjective synchronizing of visual and audio signals works so long as the ball is less than about 100 feet away. Any farther, and we begin to notice that the sound arrives later.
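The 100-foot figure can be checked with simple arithmetic: light’s travel time is negligible at these distances, so the lag is essentially the sound’s travel time. A quick sketch (the 343 m/s speed of sound in room-temperature air is an assumed standard value):

```python
# Audio lag behind the visual at various distances. Light's travel time
# is negligible here, so the lag is just the sound's travel time.
SPEED_OF_SOUND_M_S = 343.0  # m/s in air near 20 C (assumed standard value)
METERS_PER_FOOT = 0.3048

def audio_lag_ms(distance_feet: float) -> float:
    """Milliseconds by which the sound trails the sight of an event."""
    return distance_feet * METERS_PER_FOOT / SPEED_OF_SOUND_M_S * 1000.0

for feet in (25, 50, 100, 200):
    print(f"{feet:>4} ft: {audio_lag_ms(feet):6.1f} ms")
```

At 100 feet the lag is about 89 milliseconds, near the edge of the tenth-of-a-second re-synchronization window mentioned in the television example; at 200 feet it is well outside that window, so the mismatch becomes noticeable.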

Some Eastern philosophies promote living in the present and dimming one’s awareness of the past and the future. Unfortunately, people who “live in the moment” tend to lead more dangerous and shorter lives. The cognitive scientist Lera Boroditsky says a crack addict is the best example of a person who lives in the moment.

Philosophers of time and psychologists who study time are interested in both how a person’s temporal experiences are affected by deficiencies in their imagination and their memory and how different interventions into a healthy person’s brain might affect that person’s temporal experience.

Some of neuroscientist David Eagleman’s experiments have shown clearly that under certain circumstances a person can be deceived into believing event A occurred before event B, when in fact the two occurred in the reverse order according to clock time. For more on these topics, see (Eagleman, 2011).

The time dilation effect in psychology occurs when an event involving an object coming toward you lasts longer in psychological time than an event of the same clock duration in which the object is stationary. Among repeated events lasting the same amount of clock time, an event presenting a brighter object will seem to last longer, and likewise for one presenting a louder sound.

Suppose you live otherwise normally within a mine and are temporarily closed off from communicating with the world above. For a long while you can keep track, using memory alone, of how long you have been inside the mine, but eventually you will lose track of the correct clock time. What determines how long that while is, and how is it affected by the subject matter? And why are some persons better estimators than others?

Do we directly experience the present? This is controversial, and it is not the same question as whether at present we are having an experience. Those who answer “yes” tend to accept McTaggart’s A-theory of time. But notice how different such direct experience would have to be from our other direct experiences. We directly experience green color but can directly experience other colors; we directly experience high-pitched notes but can directly experience other notes. Can we say we directly experience the present time but can directly experience other times? Definitely not. So, the direct experience of the present either is non-existent, or it is a strange sort of direct experience. Nevertheless, we probably do have some mental symbol for nowness in our mind that correlates with our having the concept of the present, but it does not follow from this that we directly experience the present any more than our having a concept of love implies that we directly experience love. For an argument that we do not experience the present, see chapter 9 of (Callender 2017).

If all organisms were to die, there would be events after those deaths. The stars would continue to shine, but would any of these star events be in the future? This is a philosophically controversial question because advocates of McTaggart’s A-theory will answer “yes,” whereas advocates of McTaggart’s B-theory will answer “no” and add “Whose future?”

The issue of whether time itself is subjective, a mind-dependent phenomenon such as a secondary quality, is explored elsewhere in this article.

According to René Descartes’ dualistic philosophy of mind, the mind is not in space, but it is in time. The current article accepts the more popular philosophy of mind that rejects dualism and claims that our mind is in both space and time due to the functioning of our brain. It takes no position, though, on the controversial issue of whether the process of conscious human understanding is a computation.

Neuroscientists and psychologists have investigated whether they can speed up our minds relative to a duration of physical time. If so, we might become mentally more productive, get more high-quality decision making done per fixed amount of physical time, and learn more per minute. Several avenues have been explored: using cocaine, amphetamines, and other drugs; undergoing extreme experiences such as jumping backwards off a ledge into a net; and trying different forms of meditation. These avenues definitely affect the ease with which pulses of neurotransmitters pass from one neuron to a neighboring neuron, and so they affect our psychological time, but so far none has succeeded in improving productivity.

For our final issue about time and mind, do we humans have an a priori awareness of time that can be used to give mathematics a firm foundation? In the early twentieth century, the mathematician and philosopher L.E.J. Brouwer believed so. Many mathematicians and philosophers at that time suspected that mathematics was not as certain as they had hoped, and they worried that contradictions might be uncovered within it. Their suspicions were increased by the discovery of Russell’s Paradox and by the introduction into set theory of the controversial non-constructive axiom of choice. In response, Brouwer attempted to place mathematics on what he believed to be a firmer epistemological foundation by arguing that mathematical concepts are admissible only if they can be constructed from an ideal mathematician’s vivid, a priori awareness of time, what in Kantian terminology is called an intuition of inner time. Time, said Kant in his Critique of Pure Reason in 1781, is a structuring principle of all possible experience. As such, time is not objective; it is not a feature of things-in-themselves, but rather a feature of the phenomenal world.

Brouwer supported Kant’s claim that arithmetic is the pure form of temporal intuition. Brouwer tried to show how to construct higher-level mathematical concepts (for example, the mathematical line) from lower-level temporal intuitions; but unfortunately, his program required both rejecting Aristotle’s law of excluded middle in logic and rejecting some important theorems of classical mathematics, such as the theorem that every real number has a decimal expansion and the theorem that there is an actual infinity, as opposed to a potential infinity, of points between any two points on the mathematical line. Unwilling to accept those departures from classical mathematics, most other mathematicians and philosophers instead rejected Brouwer’s idea of an intimate connection between mathematics and time.

For interesting video presentations about psychological time, see (Carroll 2012) and (Eagleman 2011). For the role of time in phenomenology, see the article “Phenomenology and Time-Consciousness.” According to the phenomenologist Edmund Husserl, “One cannot discover the least thing about objective time through phenomenological analysis” (Husserl, 1991, p. 6).

Consider the mind of an extraterrestrial. Is it likely that our civilization and an extraterrestrial’s civilization have the same concept of physical time? This is unknown. Could an extraterrestrial arrive here on Earth with no concept of time? Probably not. How about arriving with a very different concept of time from ours? Perhaps, but how different? Stephen Hawking’s colleague James Hartle tried to answer this question by speculating that we and the extraterrestrial will at least, “share concepts of past, present and future, and the idea of a flow of time.”

18. Supplements

a. Frequently Asked Questions

b. What Else Science Requires of Time

c. Proper Times, Coordinate Systems, and Lorentz Transformations (by Andrew Holster)

19. References and Further Reading

  • Arntzenius, Frank and H. Greaves. 2009. “Time Reversal in Classical Electromagnetism,” The British Journal for the Philosophy of Science, vol. 60 (3), pp. 557-584.
    • Challenges Feynman’s claim that anti-particles are nothing but particles propagating backwards in time.
  • Arthur, Richard T. 2014. Leibniz. Polity Press. Cambridge, U.K.
    • Comprehensive monograph on all things Leibniz, with a detailed examination of his views on time.
  • Arthur, Richard T. W. 2019. The Reality of Time Flow: Local Becoming in Physics, Springer.
    • Challenges the claim that the now is subjective in modern physics.
  • Azzouni, Jody. 2015. “Nominalism, the Nonexistence of Mathematical Objects,” in Mathematics, Substance and Surmise, edited by E. Davis and P.J. Davis, pp. 133-145.
    • Argues that mathematical objects referred to by mathematical physics do not exist despite Quine’s argument that they do exist. Azzouni also claims that a corporation does not exist.
  • Barbour, Julian. 1999. The End of Time, Weidenfeld and Nicolson, London, and Oxford University Press, New York.
    • A popular presentation of Barbour’s theory which implies that if we could see the universe as it is, we should see that it is static. It is static, he says, because his way of quantizing general relativity, namely quantum geometrodynamics with its Wheeler-DeWitt equation, implies a time-independent quantum state for the universe as a whole. Time is emergent and not fundamental. He then offers an exotic explanation of how time emerges and why time seems to us to exist.
  • Barbour, Julian. 2009. The Nature of Time, arXiv:0903.3489.
    • An application of Barbour’s ideas of strong emergentism to classical physics.
  • Baron, Sam. 2018. “Time, Physics, and Philosophy: It’s All Relative,” Philosophy Compass, Volume 13, Issue 1, January.
    • Reviews the conflict between the special theory of relativity and the dynamic theories of time.
  • Baron, S. and K. Miller. 2015. “Our Concept of Time,” in Philosophy and Psychology of Time, edited by B. Mölder, V. Arstila, and P. Ohrstrom. Springer, pp. 29-52.
    • Explores the issue of whether time is a functionalist concept.
  • Bunge, Mario. 1968. “Physical Time: The Objective and Relational Theory,” Philosophy of Science, vol. 35, no. 4, pp. 355-388.
    • Examines the dispute between relationism and substantivalism, sometimes acerbically.
  • Butterfield, Jeremy. 1984. “Seeing the Present,” Mind, 93, pp. 161-76.
    • Defends the B-camp position on the subjectivity of the present; and argues against a global present.
  • Callender, Craig, and Ralph Edney. 2001. Introducing Time, Totem Books, USA.
    • A cartoon-style book covering most of the topics in this encyclopedia article in a more elementary way. Each page is two-thirds graphics and one-third text.
  • Callender, Craig and Carl Hoefer. 2002. “Philosophy of Space-Time Physics” in The Blackwell Guide to the Philosophy of Science, ed. by Peter Machamer and Michael Silberstein, Blackwell Publishers, pp. 173-98.
    • Discusses whether it is a fact or a convention that in a reference frame the speed of light going one direction is the same as the speed coming back.
  • Callender, Craig. 2010. “Is Time an Illusion?” Scientific American, June, pp. 58-65.
    • Explains how the belief that time is fundamental may be an illusion.
  • Callender, Craig. 2017. What Makes Time Special? Oxford University Press.
    • A comprehensive monograph on the relationship between the manifest image of time and its scientific image. The book makes a case for how, if information gathering and utilizing systems like us are immersed in an environment with the physical laws that do hold, then we will create the manifest image of time that we do. Not written at an introductory level.
  • Carnap, Rudolf. 1966. Philosophical Foundations of Physics: An Introduction to the Philosophy of Science. Basic Books, Inc. New York.
    • Chapter 8 “Time” is devoted to the issue of how to distinguish an accurate clock from an inaccurate one.
  • Carroll, John W. and Ned Markosian. 2010. An Introduction to Metaphysics. Cambridge University Press.
    • This introductory, undergraduate metaphysics textbook contains an excellent chapter introducing the metaphysical issues involving time, beginning with the McTaggart controversy.
  • Carroll, Sean. 2010. From Eternity to Here: The Quest for the Ultimate Theory of Time, Dutton/Penguin Group, New York.
    • Part Three “Entropy and Time’s Arrow” provides a very clear explanation of the details of the problems involved with time’s arrow. For an interesting answer to the question of what happens in an interaction between our part of the universe and a part in which the arrow of time goes in reverse, see endnote 137 for p. 164.
  • Carroll, Sean. 2011. “Ten Things Everyone Should Know About Time,” Discover Magazine, Cosmic Variance.
    • Contains the quotation about how the mind reconstructs its story of what is happening “now.”
  • Carroll, Sean. 2012. Mysteries of Modern Physics: Time. The Teaching Company, The Great Courses: Chantilly, Virginia.
    • A series of popular lectures about time by a renowned physicist with an interest in philosophical issues. Emphasizes the arrow of time.
  • Carroll, Sean. 2016. The Big Picture. Dutton/Penguin Random House. New York.
    • A physicist surveys the cosmos’ past and future, including the evolution of life.
  • Carroll, Sean. 2022. The Biggest Ideas in the Universe: Space, Time, and Motion. Dutton/Penguin Random House.
    • A sophisticated survey of what modern physics implies about space, time, and motion, especially relativity theory and especially not quantum theory, with some emphasis on the philosophical issues. Introduces the relevant equations, but is aimed at a general audience and not physicists.
  • Carroll, Sean. 2019. Something Deeply Hidden: Quantum Worlds and the Emergence of Spacetime, Dutton/Penguin Random House.
    • Pages 287-289 explain how time emerges in a quantum universe governed by the Wheeler-DeWitt equation, a timeless version of the Schrödinger equation. The chapter “Breathing in Empty Space” explains why the limits of time (whether it is infinite or finite) depend on the total amount of energy in the universe. His Mindscape podcast of August 13, 2018, “Why Is There Something Rather than Nothing?” discusses this topic in its final twenty minutes. His answer is that this may not be a sensible question to ask.
  • Crowther, Karen. 2019. “When Do We Stop Digging? Conditions on a Fundamental Theory of Physics,” in What is ‘Fundamental’?, edited by Anthony Aguirre, Brendan Foster, and Zeeya Merali, Springer International Publishing.
    • An exploration of what physicists do mean and should mean when they say a particular theory of physics is final or fundamental rather than merely more fundamental. She warns, “a theory formally being predictive to all high-energy scales, and thus apparently being the lowest brick in the tower [of theories] (or, at least, one of the bricks at the lowest level of the tower), is no guarantee that it is in fact a fundamental theory. …Yet, it is one constraint on a fundamental theory.” When we arrive at a fundamental theory, “the question shifts from ‘What if there’s something beyond?’ to ‘Why should we think there is something beyond?’ That is, the burden of justification is transferred.”
  • Damasio, Antonio R. 2006. “Remembering When,” Scientific American: Special Edition: A Matter of Time, vol. 287, no. 3, 2002; reprinted in Katzenstein, pp.34-41.
    • A look at the brain structures involved in how our mind organizes our experiences into the proper temporal order. Includes a discussion of Benjamin Libet’s claim to have discovered in the 1970s that the brain events involved in initiating our free choice occur about a third of a second before we are aware of our making the choice. This claim has radical implications for the philosophical issue of free will.
  • Dainton, Barry. 2010. Time and Space, Second Edition, McGill-Queens University Press: Ithaca.
    • An easy-to-read, but technically correct, book. This is probably the best single book to read for someone desiring to understand in more depth the issues presented in this encyclopedia article.
  • Davies, Paul. 1995. About Time: Einstein’s Unfinished Revolution, Simon & Schuster.
    • An easy-to-read survey of the impact of the theory of relativity and other scientific advances on our understanding of time.
  • Davies, Paul. 2002. How to Build a Time Machine, Viking Penguin.
    • A popular exposition of the details behind the possibilities of time travel.
  • Deutsch, David and Michael Lockwood. 1994. “The Quantum Physics of Time Travel,” Scientific American, pp. 68-74. March.
    • An investigation of the puzzle of getting information for free by traveling in time.
  • Deutsch, David. 2013. “The Philosophy of Constructor Theory,” Synthese, Volume 190, Issue 18.
    • Challenges Laplace’s Paradigm that physics should be done by predicting what will happen from initial conditions and laws of motion. http://dx.doi.org/10.1007/s11229-013-0279-z.
  • Dowden, Bradley. 2009. The Metaphysics of Time: A Dialogue, Rowman & Littlefield Publishers, Inc.
    • An undergraduate textbook in dialogue form that covers many of the topics discussed in this encyclopedia article. Easy reading for newcomers to the philosophy of time.
  • Dummett, Michael. 2000. “Is Time a Continuum of Instants?,” Philosophy, Cambridge University Press, pp. 497-515.
    • A constructivist model of time that challenges the idea that time is composed of durationless instants.
  • Eagleman, David. 2009. “Brain Time,” in What’s Next? Dispatches on the Future of Science, Max Brockman, ed., Penguin Random House.
    • A neuroscientist discusses the plasticity of time perception or temporal distortion.
  • Eagleman, David. 2011. “David Eagleman on CHOICE,” Oct. 4, https://www.youtube.com/watch?v=MkANniH8XZE.
    • Commentary on research on subjective time.
  • Einstein, Albert. 1982. “Autobiographical Notes.” In P. A. Schilpp, ed. Albert Einstein: Philosopher-Scientist, vol. 1. LaSalle, Il. Open Court Publishing Company.
    • Describes his early confusion between the structure of the real number line and the structure of time itself.
  • Earman, John. 1972. “Implications of Causal Propagation Outside the Null-Cone,” Australasian Journal of Philosophy, 50, pp. 222-37.
    • Describes his rocket paradox that challenges time travel to the past.
  • Fisher, A. R. J. 2015. “David Lewis, Donald C. Williams, and the History of Metaphysics in the Twentieth Century.” Journal of the American Philosophical Association, volume 1, issue 1, Spring.
    • Discusses the disagreements between Lewis and Williams, who both are four-dimensionalists, about the nature of time travel.
  • Gödel, Kurt. 1959. “A Remark about the Relationship between Relativity Theory and Idealistic Philosophy,” in P. A. Schilpp, ed., Albert Einstein: Philosopher-Scientist, Harper & Row, New York.
    • Discussion of solutions to Einstein’s equations that allow closed causal chains, that is, traveling to your past.
  • Gott, J. Richard. 2002. Time Travel in Einstein’s Universe: The Physical Possibilities of Travel Through Time.
    • Presents an original theory of the origin of the universe involving backward causation and time travel.
  • Grant, Andrew. 2015. “Time’s Arrow,” Science News, July 25, pp. 15-18.
    • Popular description of why our early universe was so orderly even though nature should always have preferred the disorderly.
  • Greene, Brian. 2011. The Hidden Reality: Parallel Universes and the Deep Laws of the Universe, Vintage Books, New York.
    • Describes nine versions of the Multiverse Theory, including the Ultimate multiverse theory described by the philosopher Robert Nozick.
  • Grey, W. 1999. “Troubles with Time Travel,” Philosophy 74: pp. 55-70.
    • Examines arguments against time travel.
  • Grünbaum, Adolf. 1950-51. “Relativity and the Atomicity of Becoming,” Review of Metaphysics, pp. 143-186.
    • An attack on the notion of time’s flow, and a defense of the treatment of time and space as being continua. Difficult reading.
  • Grünbaum, Adolf. 1971. “The Meaning of Time,” in Basic Issues in the Philosophy of Time, Eugene Freeman and Wilfrid Sellars, eds. LaSalle, pp. 195-228.
    • An analysis of the meaning of the term time in both the manifest image and scientific image, and a defense of the B-theory of time. Difficult reading.
  • Guth, Alan. 2014. “Infinite Phase Space and the Two-Headed Arrow of Time,” FQXi conference in Vieques, Puerto Rico. https://www.youtube.com/watch?v=AmamlnbDX9I.
    • Guth argues that an arrow of time could evolve naturally even though it had no special initial conditions on entropy, provided the universe has an infinite available phase space that the universe could spread out into. If so, its maximum possible entropy is infinite, and any other state in which the universe begins will have relatively low entropy.
  • Haack, Susan. 1974. Deviant Logic, Cambridge University Press.
    • Chapter 4 contains a clear account of Aristotle’s argument (in section 14d of the present article) for truth-value gaps, and its development in Lukasiewicz’s three-valued logic.
  • Hawking, Stephen. 1992. “The Chronology Protection Hypothesis,” Physical Review. D 46, p. 603.
    • Reasons for the impossibility of time travel.
  • Hawking, Stephen. 1996. A Brief History of Time, Updated and Expanded Tenth Anniversary Edition, Bantam Books.
    • A leading theoretical physicist provides introductory chapters on space and time, black holes, the origin and fate of the universe, the arrow of time, and time travel. Hawking suggests that perhaps our universe originally had four space dimensions and no time dimension, and time came into existence when one of the space dimensions evolved into a time dimension. He called this special space dimension “imaginary time.”
  • Horwich, Paul. 1975. “On Some Alleged Paradoxes of Time Travel,” Journal of Philosophy, 72: pp.432-44.
    • Examines some of the famous arguments against past time travel.
  • Horwich, Paul. 1987. Asymmetries in Time, The MIT Press.
    • A monograph that relates the central problems of time to other problems in metaphysics, philosophy of science, philosophy of language and philosophy of action. Horwich argues that time itself has no arrow.
  • Hossenfelder, Sabine. 2022. Existential Physics: A Scientist’s Guide to Life’s Biggest Questions, Viking/Penguin Random House LLC.
    • A theoretical physicist who specializes in the foundations of physics examines the debate between Leibniz and Newton on relational vs. absolute (substantival) time. Her Chapter Two on theories about the beginning and end of the universe is especially deep, revealing, and easy to understand.
  • Huggett, Nick. 1999. Space from Zeno to Einstein, MIT Press.
    • Clear discussion of the debate between Leibniz and Newton on relational vs. absolute (substantival) time.
  • Husserl, Edmund. 1991. On the Phenomenology of the Consciousness of Internal Time. Translated by J. B. Brough. Originally published 1893-1917. Dordrecht: Kluwer Academic Publishers.
    • The father of phenomenology discusses internal time consciousness.
  • Katzenstein, Larry. 2006. ed. Scientific American Special Edition: A Matter of Time, vol. 16, no. 1.
    • A collection of Scientific American articles about time.
  • Kirk, G.S. and Raven, J.E. 1957. The Presocratic Philosophers. New York: Cambridge University Press.
  • Krauss, Lawrence M. and Glenn D. Starkman, 2002. “The Fate of Life in the Universe,” Scientific American Special Edition: The Once and Future Cosmos, Dec. pp. 50-57.
    • Discusses the future of intelligent life and how it might adapt to and survive the expansion of the universe.
  • Krauss, Lawrence M. 2012. A Universe from Nothing. Atria Paperback, New York.
    • Discusses on p. 170 why we live in a universe with time rather than with no time. The issue is pursued further in the afterward to the paperback edition that is not included within the hardback edition. Krauss’ position on why there is something rather than nothing was challenged by the philosopher David Albert in his March 23, 2012 review of Krauss’ hardback book in The New York Times newspaper.
  • Kretzmann, Norman. 1966. “Omniscience and Immutability,” The Journal of Philosophy, July, pp. 409-421.
    • Raises the question: If God knows what time it is, does this demonstrate that God is not immutable?
  • Lasky, Ronald C. 2006. “Time and the Twin Paradox,” in Katzenstein, pp. 21-23.
    • A short analysis of the twin paradox, with helpful graphs showing how each twin would view his or her own clock plus the other twin’s clock.
  • Le Poidevin, Robin and Murray MacBeath, 1993. The Philosophy of Time, Oxford University Press.
    • A collection of twelve influential articles on the passage of time, subjective facts, the reality of the future, the unreality of time, time without change, causal theories of time, time travel, causation, empty time, topology, possible worlds, tense and modality, direction and possibility, and thought experiments about time. Difficult reading for undergraduates.
  • Le Poidevin, Robin. 2003. Travels in Four Dimensions: The Enigmas of Space and Time, Oxford University Press.
    • A philosophical introduction to conceptual questions involving space and time. Suitable for use as an undergraduate textbook without presupposing any other course in philosophy. There is a de-emphasis on teaching the scientific theories, and an emphasis on elementary introductions to the relationship of time to change, the implications that different structures for time have for our understanding of causation, difficulties with Zeno’s Paradoxes, whether time passes, the nature of the present, and why time has an arrow.
  • Lewis, David K. 1976. “The Paradoxes of Time Travel,” American Philosophical Quarterly, 13: pp. 145-52.
    • A classic argument against changing the past. Lewis assumes the B-theory of time.
  • Lockwood, Michael. 2005. The Labyrinth of Time: Introducing the Universe, Oxford University Press.
    • A philosopher of physics presents the implications of contemporary physics for our understanding of time. Chapter 15, “Schrödinger’s Time-Traveler,” presents the Oxford physicist David Deutsch’s quantum analysis of time travel.
  • Lowe, E. J. 1998. The Possibility of Metaphysics: Substance, Identity and Time, Oxford University Press.
    • This British metaphysician defends the A-theory’s tensed view of time in chapter 4, based on an ontology of substances rather than events.
  • Mack, Katie. 2020. The End of Everything (Astrophysically Speaking). Scribner, New York.
    • Exploration of alternative ways the universe might end.
  • Markosian, Ned. 2003. “A Defense of Presentism,” in Zimmerman, Dean (ed.), Oxford Studies in Metaphysics, Vol. 1, Oxford University Press.
  • Maudlin, Tim. 1988. “The Essence of Space-Time.” Proceedings of the Biennial Meeting of the Philosophy of Science Association, Volume Two: Symposia and Invited Papers (1988), pp. 82-91.
    • Maudlin discusses the hole argument, manifold substantivalism and metrical essentialism.
  • Maudlin, Tim. 2002. “Remarks on the Passing of Time,” Proceedings of the Aristotelian Society, New Series, Vol. 102, Oxford University Press, pp. 259-274. https://www.jstor.org/stable/4545373.
    • Defends eternalism, the block universe, and the passage of time.
  • Maudlin, Tim. 2007. The Metaphysics Within Physics, Oxford University Press.
    • Chapter 4, “On the Passing of Time,” defends the dynamic theory of time’s flow, and he argues that the passage of time is objective.
  • Maudlin, Tim. 2012. Philosophy of Physics: Space and Time, Princeton University Press.
    • An advanced introduction to the conceptual foundations of spacetime theory.
  • McCall, Storrs. 1966. “II. Temporal Flux,” American Philosophical Quarterly, October.
    • An analysis of the block universe, the flow of time, and the difference between past and future.
  • McGinn, Colin. 1999. The Mysterious Flame: Conscious Minds in a Material World. Basic Books.
    • Claims that the mind-body problem always will be a mystery for your mind but not for your genes.
  • McTaggart, J. M. E. 1927. The Nature of Existence, Cambridge University Press.
    • Chapter 33 restates more clearly the arguments that McTaggart presented in 1908 for his A series and B series and how they should be understood to show that time is unreal. Difficult reading. “McTaggart’s Paradox” names his argument that there is an inconsistency in a single event’s having only one of the properties of being past, present, or future while also having all three of these properties. The chapter is renamed “The Unreality of Time,” and is reprinted on pp. 23-59 of (Le Poidevin and MacBeath 1993).
  • Mellor, D. H. 1998. Real Time II, International Library of Philosophy.
    • This monograph presents a subjective theory of tenses. Mellor argues that the truth conditions of any tensed sentence can be explained without tensed facts.
  • Merali, Zeeya. 2013. “Theoretical Physics: The Origins of Space and Time,” Nature, 28 August, vol. 500, pp. 516-519.
    • Describes six theories that compete for providing an explanation of the basic substratum from which space and time emerge.
  • Miller, Kristie. 2013. “Presentism, Eternalism, and the Growing Block,” in A Companion to the Philosophy of Time. Ed. by Heather Dyke and Adrian Bardon, John Wiley & Sons, Inc., pp. 345-364.
    • Compares the pros and cons of competing ontologies of time.
  • Morris, Michael S., Kip S. Thorne and Ulvi Yurtsever. 1988. “Wormholes, Time Machines, and the Weak Energy Condition,” Physical Review Letters, vol. 61, no. 13, 26 September.
    • The first description of how to build a time machine using a wormhole.
  • Moskowitz, Clara. 2021. “In Every Bit of Nothing There is Something,” Scientific American, February.
    • Describes how the Heisenberg Uncertainty Principle requires there to be continual creation and annihilation of virtual particles. This process is likely to be the cause of dark energy and the accelerating expansion of space.
  • Mozersky, M. Joshua. 2013. “The B-Theory in the Twentieth Century,” in A Companion to the Philosophy of Time. Ed. by Heather Dyke and Adrian Bardon, John Wiley & Sons, Inc., pp. 167-182.
    • A detailed evaluation and defense of the B-Theory.
  • Muller, Richard A. 2016a. NOW: The Physics of Time. W. W. Norton & Company, New York.
    • An informal presentation of the nature of time by an experimental physicist at the University of California, Berkeley. Chapter 15 argues that the correlation between the arrow of time and the increase of entropy is not a causal connection. Chapter 16 discusses the competing arrows of time. Muller favors space expansion as the cause of time’s arrow, with entropy not being involved. And he recommends a big bang theory in which both space and time expand, not simply space. Because space and time are so intimately linked, he says, the expansion of space is propelling time forward, and this explains the “flow” of time. However, “the new nows [are] created at the end of time, rather than uniformly throughout time.” (p. 8)
  • Muller, Richard. 2016b. “Now and the Flow of Time,” arXiv, https://arxiv.org/pdf/1606.07975.pdf.
    • Argues that the flow of time consists of the continual creation of new moments, new nows, that accompany the creation of new space.”
  • Nadis, Steve. 2013. “Starting Point,” Discover, September, pp. 36-41.
    • Non-technical discussion of the argument by cosmologist Alexander Vilenkin that the past of the multiverse must be finite (there was a first bubble) but its future must be infinite (always more bubbles).
  • Norton, John. 2010. “Time Really Passes,” Humana.Mente: Journal of Philosophical Studies, 13 April.
    • Argues that, “We don’t find passage in our present theories and we would like to preserve the vanity that our physical theories of time have captured all the important facts of time. So we protect our vanity by the stratagem of dismissing passage as an illusion.”
  • Novikov, Igor. 1998. The River of Time, Cambridge University Press.
    • Chapter 14 gives a very clear and elementary description of how to build a time machine using a wormhole.
  • Oaklander, L. Nathan. 2008. The Ontology of Time. Routledge.
    • An authoritative collection of articles on all the major issues. Written for an audience of professional researchers.
  • Øhrstrøm, P. and P. F. V. Hasle. 1995. Temporal Logic: from Ancient Ideas to Artificial Intelligence. Kluwer Academic Publishers.
    • An elementary introduction to the logic of temporal reasoning.
  • Penrose, Roger. 2004. The Road to Reality: A Complete Guide to the Laws of the Universe. Alfred A. Knopf.
    • A mathematical physicist discusses cosmology, general relativity, and the second law of thermodynamics, but not at an introductory level.
  • Perry, John. 1979. “The Problem of the Essential Indexical,” Noûs, 13 (1), pp. 3-21.
    • Argues that indexicals are essential to what we want to say in natural language; they cannot all be explicated by, reduced to, or eliminated in favor of B-theory discourse.
  • Pinker, Steven. 2007. The Stuff of Thought: Language as a Window into Human Nature, Penguin Group.
    • Chapter 4 discusses how the conceptions of space and time are expressed in language in a way very different from that described by either Kant or Newton. Page 189 says that time in only half the world’s languages is the ordering of events expressed in the form of grammatical tenses. Chinese has no tenses, in the sense of verb conjugations, but of course, it expresses all sorts of concepts about time in other ways.
  • Plato. Parmenides. 1961. Trans. by F. Macdonald Cornford in The Collected Dialogues of Plato, ed. E. Hamilton and H. Cairns. Princeton, NJ: Princeton University Press.
    • Plato discusses time.
  • Pöppel, Ernst. 1988. Mindworks: Time and Conscious Experience. San Diego: Harcourt Brace Jovanovich.
    • A neuroscientist explores our experience of time.
  • Price, Huw. 1996. Time’s Arrow & Archimedes’ Point: New Directions for the Physics of Time. Oxford University Press.
    • Price believes the future can affect the past, the notion of direction of the flow cannot be established as an objective notion, and philosophers of physics need to adopt an Archimedean point of view outside of time in order to discuss time in an unbiased manner.
  • Prior, A.N. 1959. “Thank Goodness That’s Over,” Philosophy, 34 .
    • Argues that a tenseless or B-theory of time fails to account for our feeling of relief that painful past events are in the past rather than in the present.
  • Prior, A.N. 1967.Past, Present and Future, Oxford University Press.
    • Pioneering work in temporal logic, the symbolic logic of time, that permits propositions to be true at one time and false at another.
  • Prior, A.N. 1969. “Critical Notices: Richard Gale, The Language of Time,” Mind, 78, no. 311, 453-460.
    • Contains his attack on the attempt to define time in terms of causation.
  • Prior, A.N. 1970. “The Notion of the Present,” Studium Generale, volume 23, pp. 245-8.
    • A brief defense of presentism, the view that the past and the future are not real.
  • Putnam, Hilary. 1967. “Time and Physical Geometry,” The Journal of Philosophy, 64, pp. 240-246.
    • Comments on whether Aristotle is a presentist. Putnam believes that the manifest image of time is refuted by relativity theory.
  • Quine, W.V.O. 1981. Theories and Things. Cambridge, MA: Harvard University Press.
    • Quine argues for physicalism in metaphysics and naturalism in epistemology.
  • Rovelli, Carlo. 2017. Reality is Not What It Seems: The Journey to Quantum Gravity. Riverhead Books, New York.
    • An informal presentation of time in the theory of loop quantum gravity. Loop theory focuses on gravity; string theory is a theory of gravity plus all the forces and matter.
  • Rovelli, Carlo. 2018. The Order of Time. Riverhead Books, New York.
    • An informal discussion of the nature of time by a theoretical physicist. The book was originally published in Italian in 2017. Page 70 contains the graph of the absolute elsewhere that was the model for the one in this article.
  • Rovelli, Carlo. 2018. “Episode 2: Carlo Rovelli on Quantum Mechanics, Spacetime, and Reality” in Sean Carroll’s Mindscape Podcast at www.youtube.com/watch?v=3ZoeZ4Ozhb8. July 10.
    • Rovelli and Carroll discuss loop quantum gravity vs. string theory, and whether time is fundamental or emergent.
  • Russell, Bertrand. 1915. “On the Experience of Time,” Monist, 25, pp. 212-233.
    • The classical tenseless theory.
  • Russell, Bertrand. 1929. Our Knowledge of the External World. W. W. Norton and Co., New York, pp. 123-128.
    • Russell develops his formal theory of time that presupposes the relational theory of time.
  • Saunders, Simon. 2002. “How Relativity Contradicts Presentism,” in Time, Reality & Experience edited by Craig Callender, Cambridge University Press, pp. 277-292.
    • Reviews the arguments for and against the claim that, since the present in the theory of relativity is relative to reference frame, presentism must be incorrect.
  • Savitt, Steven F. 1995. Time’s Arrows Today: Recent Physical and Philosophical Work on the Direction of Time. Cambridge University Press.
    • A survey of research in this area, presupposing sophisticated knowledge of mathematics and physics.
  • Savitt, Steven F. “Being and Becoming in Modern Physics.” In E. N. Zala (ed.). The Stanford Encyclopedia of Philosophy.
    • In surveying being and becoming, it suggests how the presentist and grow-past ontologies might respond to criticisms that appeal to relativity theory.
  • Sciama, Dennis. 1986. “Time ‘Paradoxes’ in Relativity,” in The Nature of Time edited by Raymond Flood and Michael Lockwood, Basil Blackwell, pp. 6-21.
    • A clear account of the twin paradox.
  • Shoemaker, Sydney. 1969. “Time without Change,” Journal of Philosophy, 66, pp. 363-381.
    • A thought experiment designed to show us circumstances in which the existence of changeless periods in the universe could be detected.
  • Sider, Ted. 2000. “The Stage View and Temporary Intrinsics,” The Philosophical Review, 106 (2), pp. 197-231.
    • Examines the problem of temporary intrinsics and the pros and cons of four-dimensionalism.
  • Sider, Ted. 2001. Four-Dimensionalism: An Ontology of Persistence. Oxford University Press, New York.
    • Defends the ontological primacy of four-dimensional events over three-dimensional objects. He freely adopts causation as a means of explaining how a sequence of temporal parts composes a single perduring object. This feature of the causal theory of time originated with Hans Reichenbach.
  • Sklar, Lawrence. Space. 1976. Time, and Spacetime, University of California Press.
    • Chapter III, Section E discusses general relativity and the problem of substantival spacetime, where Sklar argues that Einstein’s theory does not support Mach’s views against Newton’s interpretations of his bucket experiment; that is, Mach’s argument against substantivalism fails.
  • Slater, Hartley. 2012. “Logic is Not Mathematical,” Polish Journal of Philosophy, Spring, pp. 69-86.
    • Discusses, among other things, why modern symbolic logic fails to give a proper treatment of indexicality.
  • Smith, Quentin. 1994. “Problems with the New Tenseless Theories of Time,” pp. 38-56 in Oaklander, L. Nathan and Smith, Quentin (eds.), The New Theory of Time, New Haven: Yale University Press.
    • Challenges the new B-theory of time promoted by Mellor and Smith.
  • Smolin, Lee. 2013. Time Reborn. Houghton, Mifflin, Harcourt Publishing Company, New York.
    • An extended argument by a leading theoretical physicist for why time is real. Smolin is a presentist. He believes the general theory of relativity is mistaken about the relativity of simultaneity; he believes every black hole is the seed of a new universe; and he believes nothing exists outside of time.
  • Sorabji, Richard. 1988. Matter, Space, & Motion: Theories in Antiquity and Their Sequel. Cornell University Press.
    • Chapter 10 discusses ancient and contemporary accounts of circular time.
  • Steinhardt, Paul J. 2011. “The Inflation Debate: Is the theory at the Heart of Modern Cosmology Deeply Flawed?” Scientific American, April, pp. 36-43.
    • Argues that the big bang Theory with inflation is incorrect and that we need a cyclic cosmology with an eternal series of big bangs and big crunches but with no inflation. The inflation theory of quantum cosmology implies the primeval creation of a very large universe in a very short time.
  • Tallant, Jonathan. 2013. “Time,” Analysis, Vol. 73, pp. 369-379.
    • Examines these issues: How do presentists ground true propositions about the past? How does time pass? How do we experience time’s passing?
  • Tegmark, Max. 2017. “Max Tegmark and the Nature of Time,” Closer to Truth, https://www.youtube.com/watch?v=rXJBbreLspA, July 10.
    • Speculates on the multiverse and why branching time is needed for a theory of quantum gravity.
  • Unruh, William. 1999. “Is Time Quantized? In Other Words, Is There a Fundamental Unit of Time That Could Not Be Divided into a Briefer Unit?” Scientific American, October 21. https://www.scientificamerican.com/article/is-time-quantized-in-othe/
    • Discusses whether time has the same structure as a mathematical continuum.
  • Van Fraassen, Bas C. 1985. An Introduction to the Philosophy of Time and Space, Columbia University Press.
    • An advanced undergraduate textbook by an important philosopher of science.
  • Van Inwagen, Peter. 2015. Metaphysics, Fourth Edition. Westview Press.
    • An introduction to metaphysics by a distinguished proponent of the A-theory of time.
  • Veneziano, Gabriele. 2006. “The Myth of the Beginning of Time,” Scientific American, May 2004, pp. 54-65, reprinted in Katzenstein, pp. 72-81.
    • An account of string theory’s impact on our understanding of time’s origin. Veneziano hypothesizes that our big bang was not the origin of time but simply the outcome of a preexisting state.
  • Wallace, David. 2021. Philosophy of Physics: A Very Short Introduction. Oxford University Press.
    • An excellent introduction to the philosophical issues within physics and how different philosophers approach them.
  • Wasserman, Ryan. 2018. Paradoxes of Time Travel, Oxford University Press.
    • A detailed review of much of the philosophical literature about time travel. The book contains many simple, helpful diagrams.
  • Whitehead, A. N. 1938. Modes of Thought. Cambridge University Press.
    • Here Whitehead describes his “process philosophy” that emphasizes the philosophy of becoming rather than of being, for instance, traveling the road rather than the road traveled.
  • Whitrow, G. J. 1980. The Natural Philosophy of Time, Second Edition, Clarendon Press.
    • A broad survey of the topic of time and its role in physics, biology, and psychology. Pitched at a higher level than the Davies books.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

Kripke’s Wittgenstein

Saul Kripke, in his celebrated book Wittgenstein on Rules and Private Language (1982), offers a novel reading of Ludwig Wittgenstein’s main remarks in his later works, especially in Philosophical Investigations (1953) and, to some extent, in Remarks on the Foundations of Mathematics (1956). Kripke presents Wittgenstein as proposing a skeptical argument against a certain conception of meaning and linguistic understanding, as well as a skeptical solution to such a problem. Many philosophers have called this interpretation of Wittgenstein Kripke’s Wittgenstein or Kripkenstein because, as Kripke himself emphasizes, it is “Wittgenstein’s argument as it struck Kripke, as it presented a problem for him” (Kripke 1982, 5) and “probably many of my formulations and re-castings of the argument are done in a way Wittgenstein would not himself approve” (Kripke 1982, 5). Such an interpretation has been the subject of extensive discussion since its publication, generating a vast literature on the topic of meaning skepticism in general and Wittgenstein’s later view in particular.

According to the skeptical argument that Kripke extracts from Wittgenstein’s later remarks on meaning and rule-following, there is no fact about a speaker’s behavioral, mental or social life that can metaphysically determine, or constitute, what she means by her words and also fix a determinate connection between those meanings and the correctness of her use of these words. Such a skeptical conclusion has a disastrous consequence for the classical realist view of meaning: if we insist on the idea that meaning is essentially a factual matter, we face the bizarre conclusion that there is thereby “no such thing as meaning anything by any word” (Kripke 1982, 55).

According to the skeptical solution that Kripke attributes to Wittgenstein, such a radical conclusion is intolerable because we certainly do very often mean certain things by our words. The skeptical solution begins by rejecting the view that results in such a paradoxical conclusion, that is, the classical realist conception of meaning. It then offers a new picture of the practice of meaning-attribution, according to which we can legitimately assert that a speaker means something specific by her words if we, as members of a speech-community, can observe, in enough cases, that her use agrees with ours. We can judge, for instance, that she means by “green” what we mean by this word, namely, green, if we observe that her use of “green” agrees with our way of using it. Attributing meanings to others’ words, therefore, brings in the notion of a speech-community, whose members are uniform in their responses. As a result, there can be no private language.

This article begins by introducing Kripke’s Wittgenstein’s skeptical problem presented in Chapter 2 of Kripke’s book. It then explicates Kripke’s Wittgenstein’s skeptical solution to the skeptical problem, which is offered in Chapter 3 of the book. The article ends by reviewing some of the most important responses to the skeptical problem and the skeptical solution.

Table of Contents

  1. Kripke’s Wittgenstein: The Skeptical Challenge
    1. Meaning and Rule-Following
    2. The Skeptical Challenge: The Constitution Demand
    3. The Skeptical Challenge: The Normativity Demand
  2. Kripke’s Wittgenstein: The Skeptical Argument
    1. The Skeptic’s Strategy
    2. Reductionist Facts: The Dispositional View
      1. The Finitude Problem
      2. Systematic Errors
      3. The Normative Feature of Meaning
    3. Non-Reductionist Facts: Meaning as a Primitive State
    4. The Skeptical Conclusions and Classical Realism
  3. Kripke’s Wittgenstein: The Skeptical Solution
    1. Truth-Conditions vs. Assertibility Conditions
    2. The Private Language Argument
  4. Responses and Criticisms
  5. References and Further Reading
    1. References
    2. Further Reading

1. Kripke’s Wittgenstein: The Skeptical Challenge

Wittgenstein famously introduces a paradox in section 201 of the Philosophical Investigations, a paradox that Kripke takes to be the central problem of Wittgenstein’s book:

This was our paradox: no course of action could be determined by a rule, because every course of action can be made out to accord with the rule. The answer was: if everything can be made out to accord with the rule, then it can also be made out to conflict with it. And so there would be neither accord nor conflict here. (Wittgenstein 1953, §201)

Kripke’s book is formed around this paradox, investigating how Wittgenstein arrives at it and how he attempts to defuse it.

The main figure in Chapter 2 of Kripke’s book is a skeptic, Kripke’s Wittgenstein’s skeptic, who offers, on behalf of Wittgenstein, a skeptical argument against a certain sort of explanation of our commonsense notion of meaning. The argument is designed to ultimately lead to the Wittgensteinian paradox. According to this commonsense conception of meaning, we do not just randomly use words; rather, we are almost always confident that we meant something specific by them in the past and that it is because of that meaning that our current and future uses of them are regarded as correct. The sort of explanation of this commonsense conception that the skeptic aims to undermine is called “classical realism” (Kripke 1982, 73) or the “realistic or representational picture of language” (Kripke 1982, 85). According to this explanation, there are facts as to what a speaker meant by her words in the past that determine the correct way of using them in the future. The skeptical argument aims to subvert this explanation by revealing that it leads to the Wittgensteinian paradox. In the next section, this commonsense notion of meaning is outlined.

a. Meaning and Rule-Following

In Chapter 2, Kripke draws our attention to the ordinary way in which we talk about the notion of meaning something by an expression. Since this commonsense notion of meaning appeals to the notion of rule-following, Kripke initially describes the matter by using an arithmetic example, in which the notion of a rule has its clearest appearance, though the problem can be put in terms of the notion of meaning too (as well as that of intention and concept) and potentially applied to all sorts of linguistic expressions.

The commonsense notion of meaning points to a simple insight: in our everyday life, we “do not simply make an unjustified leap in the dark” (Kripke 1982, 10). Rather, we use our words in a certain way unhesitatingly and almost automatically and the reason why we do so seems to have its roots in the following two important aspects of the practice of meaning something by a word: (1) we meant something specific by our words in the past and (2) those past meanings determine the correct way of using these words now and in the future. Putting the matter in terms of rules, the point is that, for every word that we use, we grasped, and have been following, a specific rule governing the use of this word and such a rule determines how the word ought to be applied in the future. Consider the word “plus” or the plus sign “+”.  According to the commonsense notion of meaning, our use of this word is determined by a rule, the addition rule, that we have learnt and that tells us how to add two numbers. The addition rule is general in that it indicates how to add and produce the sum of any two numbers, no matter how large they are. The correct answer to any addition problem is already determined by that specific rule.

Since we have learnt or grasped the addition rule in the past and have been following it since then, we are now confident that we ought to respond with “22” to the arithmetic query “12 + 10 =?” and that this answer is the only correct answer to this question. Moreover, although we have applied the addition rule only to a limited number of cases in the past, we are prepared to give the correct answer to any addition query we may be asked in the future. This is, as the skeptic emphasizes, the “whole point of the notion that in learning to add I grasp a rule” (Kripke 1982, 8). This conception of rules can be extended to other expressions of our language. For instance, we can say that there is a rule governing the use of the word “green”: it tells us that “green” applies to certain (that is, green) things only. If we are following this rule, applying “green” to a blue object is incorrect. Again, we have used “green” only in a limited number of cases in the past, but the rule determines how to apply this word on all future occasions.

Having presented such a general picture of meaning and rule-following, the skeptic raises two fundamental questions: (1) what makes it the case that we really meant this rather than that by a word in the past or that we have been following this rather than that rule all along? (2) how can such past meanings and rules be said to determine the correct use of words in all future cases? Each question makes a demand. We can call the first the Constitution Demand and the second the Normativity Demand. Each demand is introduced below, and it is shown how they cooperate to establish a deep skepticism about meaning and rule-following.

b. The Skeptical Challenge: The Constitution Demand

Kripke’s Wittgenstein’s skeptic makes a simple claim by asking the following questions: What if by our words we actually meant something different from what we think we did in the past? What if we have really been following a rule different from the one we think we did, or never really followed the same rule at all? After all, we have applied the addition rule only to a limited number of cases in the past. Imagine, for example, that every number we have ever dealt with in an addition is smaller than 57. As we are certain that we have always been following the addition rule, or meant plus by “plus”, we are confident that “125” is the correct answer to the new addition query “68 + 57 =?”. If the skeptic insists that this answer is wrong, all we can do is check our answer over again, and the correct answer seems to be nothing but “125”.

The skeptic, however, makes a bizarre claim: the correct answer is “5” not “125”, and this is so not because 125 is not the sum of 57 and 68, but because we have not been following the addition rule at all. At first sight, such a claim seems unacceptable, but the skeptic invites us to assume the following possible scenario. Suppose that there is another rule called quaddition: the quaddition function is symbolized by “⊕” and the rule is defined as follows:

x ⊕ y = x + y, if x, y < 57
      = 5 otherwise. (Kripke 1982, 9)

Perhaps, just imagine, we have been following this rule rather than the addition rule and taken “+” to denote the quaddition rather than the addition function. Maybe, as the scenario goes, we have meant quus rather than plus when using “plus” in the past.
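The skeptic’s point can be made vivid with a short sketch. The following Python fragment (the function names and the exhaustive comparison are illustrative devices, not Kripke’s own) implements Kripke’s piecewise definition and shows that no finite record of past computations with numbers below 57 could distinguish the two candidate rules:

```python
def plus(x, y):
    """The addition rule the speaker believes she has been following."""
    return x + y

def quus(x, y):
    """Kripke's quaddition: agrees with addition when both
    arguments are below 57, and yields 5 in every other case."""
    return x + y if x < 57 and y < 57 else 5

# On every case the speaker has actually computed (all arguments
# below 57), the two rules are indistinguishable:
assert all(plus(x, y) == quus(x, y)
           for x in range(57) for y in range(57))

# They diverge only on the skeptic's new query:
assert plus(68, 57) == 125
assert quus(68, 57) == 5
```

The sketch’s moral is modest: both functions return identical values on every case the speaker has actually computed, so her finite past usage alone underdetermines which rule she was following.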

According to the skeptic, the answers we have given so far have all been the quum rather than the sum of numbers. When we were asked to add 57 to 68, we got confused and thought that the correct answer was “125”, probably because, for numbers smaller than 57, the sum and the quum always coincide. The correct answer, however, is “5”; we mistakenly thought that we had been following the addition rule, while the rule we actually follow is quaddition. The skeptic’s fundamental question is: “Who is to say that this is not the function I previously meant by ‘+’?” (Kripke 1982, 9).

The skeptic agrees that his claim is radical, “wild it indubitably is, no doubt it is false”, but observes that “if it is false, there must be some fact about my past usage that can be cited to refute it” (Kripke 1982, 9). What is required is some fact about ourselves, about what we did in the past, what has gone on in our minds, and the like, that can do two things: (1) satisfy the Constitution Demand, that is, constitute the fact that in the past we meant plus by “plus”, and not anything else like quus, and (2) meet the Normativity Demand, that is, determine the correct way of using the word “plus” now and in the future (see Kripke 1982, 11).

The Constitution Demand reveals the metaphysical nature of the skeptic’s skeptical challenge. First, the answer “125” to “68 + 57 =?” is correct in two senses: in the arithmetical sense and in the meta-linguistic sense. It is correct in the arithmetical sense because 125 is the sum, and not, for instance, the product of 68 and 57: 125 is the result that we get after following the procedure of adding 68 to 57. Our answer is also correct in the meta-linguistic sense because, given that we mean plus by “plus” or intend “+” to denote the addition function, “125” is the only correct answer to “68 + 57 =?”. Of course, if we intend “+” to denote the quaddition function, “125” would be wrong. These two senses of correctness are distinct, for the addition function, independently of what we think of it, uniquely determines the sum of any two numbers. However, what function we intend “+” to denote is a meta-linguistic matter, that is, a matter of what we mean by our words and symbols.

The above distinction clarifies why the skeptic’s worry is not whether our computation of the sum of 68 and 57 is accurate or whether our memory works properly. Nor is his concern whether, in the case of using “green”, the objects we describe as being green are indeed green. He does not aim to raise an epistemological problem about how we know, or can be sure, that “125” is the correct answer to “57 + 68 =?”. His worry is metaphysical: is there any fact as to what we really meant by “plus”, “+”, “green”, “table” and so forth, in the past? If the skeptic successfully shows that there is no such fact, the question as to whether we accurately remember that meaning or rule would be beside the point: there is simply nothing determinate to remember. The skeptic’s claim is not that because we may forget what “plus” means or because we may make mistakes in calculating the sum of two big numbers, we can never be sure that our answer is correct. Of course, we make mistakes: we may neglect things; we may unintentionally apply “green” to a blue object, and so forth. From the fact that we make occasional mistakes it does not follow that there is thereby no fact as to what we mean by our words. On the contrary, it seems that it is because of the fact that we mean plus by “plus” that answering with “5” to “57 + 68 =?” is considered to be wrong. The same considerations apply to the case of memory failures: we may, for example, forget to carry a digit when calculating the sum of two large numbers. Memory failures and failures of attention do not cast doubt on the fact that we mean addition by “+”. The skeptic takes it for granted that we fully recall what we did in the past, that our memory works perfectly fine, that our eyes function normally and that we can accurately compute the sum of numbers. None of this matters, because he does not deny that, if we can show that plus is what we meant by “plus” in the past, “125” is the correct answer to “57 + 68 =?”. In the same vein, however, if he can show that quus is what we meant by “plus”, “5” is the correct answer.

The skeptic’s Constitution Demand asks us to cite some fact about ourselves that can constitute the fact that by “plus” we meant plus rather than quus in the past. He does not care about what sort of fact this is: “there are no limitations, in particular, no behaviorist limitations, on the facts that may be cited to answer the sceptic” (Kripke 1982, 14). Moreover, if the skeptic succeeds in arguing that there is no fact as to what we meant by our words in the past, he has at the same time shown that there is no fact determining what we mean by our words now or in the future. As he puts it, “if there can be no fact about which particular function I meant in the past, there can be none in the present either” (Kripke 1982, 13). However, he cannot make such a claim at the outset: if the skeptic undermined our certainty about what our words mean in the present, he could not even begin conversing with anyone, nor formulate his skeptical claims in any language.

c. The Skeptical Challenge: The Normativity Demand

The second aspect of the skeptical challenge is that any fact that we may cite to defuse it must also “show how I am justified in giving the answer ‘125’ to ‘68 + 57’” (Kripke 1982, 11). The Constitution and Normativity Demands are put by the skeptic as two separate but related requirements. The second presupposes the first: if we fail to show that the speaker has meant something specific by her words, it would be absurd to say that those meanings determined how she ought to use the words in the future. It is better to see these two demands as two aspects of the skeptical problem. The connection between them is so deep that it would be hard to sharply distinguish between them as two entirely different demands: if there is no normative constraint on our use of words, we would not be able to justifiably talk about them having any specific meaning at all. If there is no such thing as correct vs. incorrect application of a word, the notion of a word meaning something specific would just vanish. The skeptic’s main point in distinguishing between these demands is to emphasize that telling a story about how meanings are constituted may still fail to offer a convincing story about the normative aspect of meaning. That is, even if we can introduce a fact that is somehow capable of explaining what the speaker meant by her word in the past, this by itself would not suffice to rule out the skeptical problem because any such fact must also justify the fact that the speaker uses her words in the way she does. In other words, it must be explained that our confidence in thinking that “125” is the correct answer to “68 + 57 =?” is based exactly on that fact, and not on anything else. (For a different reading of such a relation between meaning and correct use, see Ginsborg (2011; 2021). See also Miller (2019) for further discussion.)

Moreover, the skeptic uses each demand to offer a different argument against the different sorts of facts that may be introduced to resist the skeptical problem. As regards the Normativity Demand, the argument is based on the requirement that such facts must determine the correctness conditions for the application of words, and that they must do so for a potentially indefinite number of cases in which the words may have an application. This requirement is spelled out by the skeptic’s famous claim that any candidate for a fact that is supposed to constitute what we meant by our words in the past must be normative, not descriptive: it must tell us how we ought to or should use the words, not simply describe how we did, do or will use them. This is also known as the Normativity of Meaning Thesis. The normativity of meaning (and content) is now a self-standing topic. (For some of the main works on this thesis, see Boghossian (1989; 2003; 2008), Coates (1986), Gibbard (1994; 2013), Ginsborg (2018; 2021), Glock (2019), Glüer and Wikforss (2009), Hattiangadi (2006; 2007; 2010), Horwich (1995), Kusch (2006), McGinn (1984), Railton (2006), Whiting (2007; 2013), Wright (1984), and Zalabardo (1997).)

One of the clearest characterizations of the Normativity Demand has been given by Paul Boghossian:

Suppose the expression ‘green’ means green. It follows immediately that the expression ‘green’ applies correctly only to these things (the green ones) and not to those (the non-greens). … My use of it is correct in application to certain objects and not in application to others. (Boghossian 1989, 513)

This definition is neutral with respect to the transtemporal aspect of the relation between meaning and use, contrary to McGinn’s reading of this relation. For McGinn, an account of the normativity of meaning requires an explanation of two things: “(a) an account of what it is to mean something at a given time and (b) an account of what it is to mean the same thing at two different times – since (Kripkean) normativeness is a matter of meaning now what one meant earlier” (McGinn 1984, 174). Kripke’s Wittgenstein’s skeptic, however, seems to view the notion of normativity as a transtemporal notion of a different sort: the normativity of meaning concerns the relation between past meanings and future uses. In this sense, what we meant by our words in the past already determined how we ought to use them in the future.

Yet the matter is more complicated than that. As we saw, the skeptic did not start by questioning the correctness of our current use of words. He asked whether some current use of a word accords with what we think we meant by it in the past: if it does, it is correct. This, however, seems merely a tactical move: the skeptic’s ultimate goal is to undermine the claim that we mean anything by our words now, in the past, or in the future and thus to rule out the idea that our past, current, or future uses of words can be regarded as correct (or incorrect) at all. If so, it is better to think of the skeptic’s conception of the normativity relation as not necessarily temporal. For him, the claim is simply that meaning determines the conditions of correct use. Nonetheless, for the reasons mentioned above, the skeptic often prefers to put the matter in a temporal way: “The relation of meaning and intention to future action is normative, not descriptive” (Kripke 1982, 37). The question then is whether our current use of a word accords with what we meant by it in the past. That a word ought to be applied in a specific way now in order to be applied in accordance with what we meant by it in the past is said to be the normative consequence of the fact that we meant a specific thing by it in the past.

All this, however, is a familiar thesis: what we decided to do in the past often has consequences for what we ought to do in the future. For instance, if you believe or accept the claim that telling lies is wrong, it has consequences for how you ought to act in the future: you should not lie. The skeptic has a similar claim in mind with regard to the notion of meaning: we cannot attach a clear meaning to “table” as used by a speaker if she uses it without any constraint whatsoever, that is, if she applies it to tables now, to chairs a minute later, and then to apples, lions, the sky, and so forth, without there being any regularity and coherence in her use of it. In such cases, it is not clear that we can justifiably say that she means this rather than that by “table” at all. The skeptic’s real question is whether there is any fact about the speaker that constitutes the fact that the speaker means table by “table” in such a way as to determine the correct use of the word in the future. If we are to be able to tell that she means table by “table”, we should also be able to say that her use of “table” is correct now and that it is so because of her meaning table by “table”, and not anything else. The reason is that the relation between meaning and use is prescriptive, not descriptive: if you mean plus rather than quus by “plus”, you ought to answer “68 + 57 =?” with “125”. The normative feature of meaning was already present in the skeptic’s characterization of the commonsense notion of meaning: with each new case of using a word, we are confident as to how we should use it because we are confident as to what we meant by it in the past.

The last step that the skeptic must take in order to complete his argument is to argue that no fact about the speaker can satisfy the two aspects of the notion of meaning, that is, the Constitution Demand and the Normativity Demand. It is not possible to introduce his arguments against each candidate fact in detail here, since in chapter 2 of Kripke’s book, the skeptic examines at least ten candidates for such a fact and argues against each in detail. In what follows, the skeptic’s general strategy in rejecting them is described by focusing on two examples.

2. Kripke’s Wittgenstein: The Skeptical Argument

The skeptic considers a variety of suggestions for facts that someone might cite to meet the skeptic’s challenge, that is, to show that we really mean plus, and not quus or anything else, by “plus”. In particular, the skeptic discusses ten candidate sorts of facts, including: (1) facts about previous external behavior of the speaker; (2) facts concerning the instructions the speaker may have in mind when, for instance, she adds two numbers; (3) some mathematical laws that seemingly work only if “+” denotes the addition rule; (4) the speaker’s possession of a certain mental image in mind when, for instance, she applies “green” to a new object; (5) facts about the speaker’s dispositions to respond in certain ways on specific occasions; (6) facts about the functioning of some machines, such as calculators, as embodying our intentions to add numbers; (7) facts about words having Fregean, objective senses; (8) the fact that meaning plus by “plus” is the simplest hypothesis about what we mean by “plus” and is thus capable of constituting the fact that “plus” means plus; (9) the fact that meaning plus is an irreducible mental state of the speaker with its own unique quale or phenomenal character; and (10) the view that meaning facts are primitive, sui generis.

In order to see how the skeptic argues against each such fact, it is helpful to classify them as falling under two general categories: reductionist facts and non-reductionist facts. The skeptic’s claim will be that neither the reductionist nor the non-reductionist facts can constitute the fact that the speaker means one thing rather than another by her words. The first eight candidate facts mentioned above belong to the reductionist camp: they are facts about different aspects of the speaker’s life, mental and physical. Here, the opponent’s claim is that such facts are capable of successfully constituting the fact that the speaker means plus by “plus”. The two last suggestions are from the non-reductionist camp, attempting to view the fact that the speaker means one thing rather than another by a word as a self-standing fact, not reducible to any other fact about the speaker’s behavioral or mental life. The skeptic’s strategy is to argue that both reductionist and non-reductionist facts fail to meet the Constitution and the Normativity Demands.

a. The Skeptic’s Strategy

In the case of the reductionist facts, the skeptic’s strategy is to show that any aspect of the speaker’s physical or mental life that may be claimed to be capable of constituting a determinate meaning fact or rule can be interpreted in a non-standard way, that is, in such a way that it can equally well be treated as constituting a different possible meaning fact or rule. Any attempt to dodge such deviant interpretations, however, faces a highly problematic dilemma: either we appeal to some other aspect of the speaker’s life in order to eliminate the possibility of deviant interpretations and thereby fix the desired meaning or rule, in which case we are trapped in a vicious regress, or we stop at some point and claim that this aspect, whatever it is, cannot be interpreted non-standardly anymore and is somehow immune to the regress problem, in which case meaning becomes entirely indeterminate or totally mysterious. For the skeptic’s question is now “why is it that such a fact or rule cannot be interpreted in a different way?”, and since the whole point of the skeptical argument is to show that there is no answer to this question, it seems that we cannot really answer it unless we already have a solution to the skeptical problem.
If we do not, the only options available seem to be the following: (1) we concede that there is no answer to this question, but then the indeterminacy of meaning and the Wittgensteinian paradox await us, because we have embraced the claim that there is nothing on the basis of which we can determine whether our use of a word accords with a rule or not; our use is then both correct and incorrect at the same time; or (2) we ignore this question, but we have then made the ordinary notion of meaning and rules entirely mysterious: we have appealed to a “superlative” fact or rule, which is allegedly capable of constituting the fact that the speaker means plus by “plus” but which is, in some mysterious way, immune to the skeptical problem.

In the case of the non-reductionist responses, the skeptic’s strategy is a bit different: his focus is on showing that we cannot make the nature of such primitive meaning facts intelligible, so that not only would they become mysterious, but we also have to deal with very serious epistemological problems about our first-personal epistemic access to their general content.

The next section further explains these problems by considering some examples.

b. Reductionist Facts: The Dispositional View

The most serious reductionist responses to Kripke’s Wittgenstein’s skeptic are the following: (1) the claim that facts about what the speaker meant by her words in the past are constituted by the speaker’s dispositions to respond in a certain way on specific occasions—this is the response from the dispositional view or dispositionalism; (2) the suggestion that there are some instructions in the mind of the speaker, some mental images, samples, ideas, and the like and that facts about having them constitute the fact that the speaker means, for instance, green by “green”.

According to the dispositional view, what a speaker means by her word can be extracted or read off from the responses she is disposed to produce. As the skeptic characterizes it:

To mean addition by ‘+’ is to be disposed, when asked for any sum ‘x + y’, to give the sum of x and y as the answer (in particular, to say ‘125’ when queried about ‘68 + 57’); to mean quus is to be disposed when queried about any arguments, to respond with their quum (in particular to answer ‘5’ when queried about ‘68 + 57’). (Kripke 1982, 22-23)

What dispositions are and what characteristics they have is a self-standing topic. It is helpful, however, to consider a typical example. A glass is said to have the property of being fragile: it shatters if struck by a rock. A glass, in other words, is disposed to shatter when hit by a rock or dropped. However, it is one thing to possess a disposition, another to manifest it. For instance, although a glass is disposed to shatter, and glasses do shatter very often around us, a particular glass may never actually shatter, or may decay before it has any chance to manifest this disposition. Since the objects that are said to have such-and-such dispositions may never manifest them, we usually characterize their dispositional properties, or ascribe such dispositions to them, in a counterfactual way:

Glasses are disposed to shatter under certain conditions if and only if glasses would shatter if those conditions held.

These certain, normal, optimal, ideal, or standard conditions, as they are sometimes called, are supposed to exclude the conditions under which glasses may fail to manifest their disposition to shatter. There are various problems with how such conditions can be properly specified, which are not our concern here. (On dispositions, see Armstrong (1997), Bird (1998), Carnap (1928), Goodman (1973), Lewis (1997), Mellor (2000), Mumford (1998), Prior (1985), and Sellars (1958).)

Humans too can be said to possess different dispositions, which manifest themselves under certain circumstances. For instance, a child observes her parents pointing to a certain thing and saying “table”; they encourage the child to mouth “table” in the presence of that thing; the child tries to do so and when she is successful, she is rewarded; if she says “table” in the presence of a chair, she is corrected; and the process continues. The child is gradually conditioned to say “table” in the presence of the table. She then generalizes it: in the presence of a new table, she utters “table”. She is now said to be disposed to respond with “table” in the presence of tables, with “green” in the presence of green things, with the sum of two numbers when asked “x + y =?”, and so forth. Call these the “dispositional facts”. According to the dispositional view, such facts are capable of constituting what the speaker means by her words, or as the skeptic prefers to put it, from these dispositions we are supposed to “read off” what the speaker means by her words. For instance, if the speaker is disposed to apply “green” to green objects only, we can read off from such responses that she means green by “green”. Similarly, if she is disposed to apply “green” to green objects until a certain time t (for example, until the year 2100) and to blue objects after time t, we must conclude that she means something else, for instance, grue by “green” (Goodman 1973). Now, as the speaker is disposed to respond with “125” to “68 + 57 =?”, the dispositionalists’ claim is that the speaker means plus by “plus”.
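The underdetermination the skeptic exploits here can be illustrated with a small programmatic sketch. The year 2100 cutoff follows Goodman’s example above; the particular list of observations is invented purely for illustration. The point is that any finite record of the speaker’s applications of “green” is compatible with both the green-hypothesis and the grue-hypothesis about what she means:

```python
T = 2100  # the switchover year in the grue hypothesis (Goodman's example)

def green_hypothesis(color: str, year: int) -> bool:
    # "green" applies to green things at all times
    return color == "green"

def grue_hypothesis(color: str, year: int) -> bool:
    # "green" applies to green things before T and to blue things thereafter
    return color == "green" if year < T else color == "blue"

# A finite (invented) record of the speaker's uses of "green",
# all of which occur before the year 2100.
observed = [("green", 1995), ("green", 2024), ("red", 2050)]

# Both hypotheses agree on every observed case ...
assert all(green_hypothesis(c, y) == grue_hypothesis(c, y)
           for c, y in observed)

# ... and diverge only on cases lying outside the speaker's actual uses.
assert green_hypothesis("green", 2101) != grue_hypothesis("green", 2101)
```

Nothing in the finite record singles out one hypothesis over the other; that, in miniature, is why the skeptic denies that a determinate meaning can simply be “read off” from the responses the speaker has so far been disposed to produce.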

The skeptic makes three objections. The first is that facts about dispositions cannot determine what the speaker means by “plus”; this is to say that the dispositional view fails to meet the Constitution Demand. The problem that the skeptic puts forward in this case is sometimes called the “Finitude Problem” or “Finiteness Problem” (Blackburn (1984a), Boghossian (1989), Ginsborg (2011), Horwich (1990), Soames (1997), and Wright (1984)). The other two objections concern the dispositional view’s ability to accommodate the normative aspect of meaning: the dispositional view cannot account for systematic errors as “errors”, and dispositional facts are descriptive in nature, while meaning facts are supposed to be normative. These problems are, however, related, as the next three sections make clear.

i. The Finitude Problem

According to the skeptic, facts about the speaker’s dispositions to respond in specific ways on certain occasions fail to constitute the fact that the speaker means plus by “plus” because “not only my actual performance, but also the totality of my dispositions, is finite” (Kripke 1982, 26). During our lifetime, we can produce only a limited number of responses. The skeptic now introduces a brand-new skeptical hypothesis: perhaps, the plus sign “+” stands for a function that we can call skaddition. It can be symbolized by “*” and defined as follows (see Kripke 1982, 30):

x * y = x + y, if x and y are small enough for us to have any disposition to add them in our lifetime;

x * y = 5, otherwise.
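The skeptic’s definition can be rendered as a small computational sketch. The cutoff constant below is a hypothetical stand-in for “numbers too large for us ever to have been disposed to add”; Kripke fixes no particular value, so the number chosen here is purely illustrative:

```python
LIMIT = 1000  # hypothetical bound on the pairs we were ever disposed to add

def addition(x: int, y: int) -> int:
    return x + y

def skaddition(x: int, y: int) -> int:
    # Agrees with addition whenever both arguments are "small enough",
    # and returns 5 otherwise.
    if x < LIMIT and y < LIMIT:
        return x + y
    return 5

# On every query within the speaker's finite dispositions, the two
# functions are indistinguishable ...
assert all(addition(x, y) == skaddition(x, y)
           for x in range(100) for y in range(100))

# ... yet they diverge on queries the speaker never had any disposition
# to answer, so the finite evidence cannot decide between them.
assert addition(LIMIT, 57) == 1057
assert skaddition(LIMIT, 57) == 5
```

The totality of the speaker’s actual and possible responses falls below the cutoff, so that same totality is compatible with “+” denoting either function; this is the finitude problem in miniature.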

There are at least two possible meaning facts now, or two different rules, which are compatible with the totality of the responses a speaker can produce in her life: one possible fact is that she means addition by “+” and the other is that she means skaddition by “+”. The skeptic’s claim is that even if the speaker actually responds with the sum of all the numbers that she is asked to add in her lifetime, we still cannot read off from such responses that she really means plus by “plus”, for even the totality of her dispositions to respond to “x + y =?” is compatible with both “+” meaning addition and “+” meaning skaddition. The dispositional view fails to show that the fact that the speaker means addition, and not skaddition, by “+” can be constituted by facts about the speaker’s dispositions to respond with the sum of numbers. Therefore, the general strategy of the skeptic in this case is to uncover that no matter how the speaker actually responds, such responses can be interpreted differently, that is, in such a way that they remain compatible with different possible meaning facts or rules.

The skeptic anticipates an obvious objection from the dispositionalists, according to which the way the skeptic has characterized the dispositional view is too naive. A more sophisticated version of this view could avoid the finitude problem by including provisos like “under optimal conditions”. Their claim is that, under such conditions, “I surely will respond with the sum of any two numbers when queried” (Kripke 1982, 27). The main difficulty, however, is to specify these ideal, optimal or standard conditions in a non-question-begging way. For the skeptic, there are two general ways in which these conditions can be specified: (1) by using non-semantical and non-intentional terms, that is, in a purely naturalistic way, and (2) by using semantical and intentional terms. Both fail to bypass the skeptical problem, as the skeptic argues.

According to the skeptic, attempts at the first option lead to entirely indeterminate conjectures because we need to include conditions like “if my brain had been stuffed with sufficient extra matter”, “if it were given enough capacity to perform very large additions”, “if my life (in a healthy state) were prolonged enough”, and the like (see Kripke 1982, 27). Under such conditions, the dispositionalist may claim, I would respond with the sum of any two numbers, no matter how large they are. According to the skeptic, however, “we have no idea what the results of such experiments would be. They might lead me to go insane, even to behave according to a quus-like rule. The outcome really is obviously indeterminate” (Kripke 1982, 27). It is completely unknown to us how such a person would be disposed to respond in a possible world in which she possesses such peculiar, beyond-imagination abilities.

In order to avoid such a problem, the dispositionalists may go for the second option and claim:

If I somehow were to be given the means to carry out my intentions with respect to numbers that presently are too long for me to add (or to grasp), and if I were to carry out these intentions, then if queried about ‘m + n’ for some big m and n, I would respond with their sum (and not with their quum). (Kripke 1982, 28)

The skeptic’s objection, however, is that this characterization of the optimal conditions is hopeless because it begs the question against the skeptic’s main challenge: what determines the intention of the speaker to use “+” in one way rather than another? The dispositional view presupposes, in its optimal conditions, that the speaker has a determinate intention toward what she wants to do with the numbers. Obviously, if I mean plus by “plus” or intend “+” to denote the addition function, I will be disposed to give their sum. But the problem is to determine what I mean by “plus” or what intention I have with regard to the use of “+”. This means that the dispositional view fails to meet the Constitution Demand.

ii. Systematic Errors

The dispositional account fails to accommodate the simple fact that we might be disposed to make systematic mistakes. Suppose that the speaker, for any reason, is disposed to respond slightly differently to certain arithmetic queries: she responds to “6 + 5 =?” with “10”, to “6 + 6 =?” with “11”, to “6 + 7 =?” with “12”, and so on. According to the skeptic, the dispositionalists cannot claim that the speaker means plus by “+” but simply makes mistakes, unless they beg the question against the skeptic. For, on their view, “the function someone means is to be read off from his dispositions” (Kripke 1982, 29). The dispositional account aims to show that because the speaker is disposed to respond with the sum of numbers, we can conclude that she follows the addition rule. But, in the above example, the speaker’s responses do not accord with the addition function; therefore, we cannot read off from these responses that she means plus by “plus”. Dispositionalists cannot claim that the speaker intends to give the sum of numbers but makes mistakes. Rather, all that they can say is that the speaker does not mean plus by “plus”. Otherwise, they beg the question against the skeptic by presupposing what the speaker means by “plus” in advance. This is related to the third problem with the dispositional view.

iii. The Normative Feature of Meaning

According to the skeptic, not only does the dispositional view fail to meet the Constitution Demand, but it also fails to meet the Normativity Demand. As shown in the previous section, the dispositional view fails to accommodate the fact that a speaker might make systematic mistakes. The skeptic’s more general claim is that even if the dispositional view can somehow find a way to dodge the finitude problem, it still fails to accommodate the normative feature of meaning because the dispositional facts are descriptive in nature, not normative or prescriptive. As the skeptic puts it:

A dispositional account misconceives the sceptic’s problem – to find a past fact that justifies my present response. As a candidate for a ‘fact’ that determines what I mean, it fails to satisfy the basic condition on such a candidate, […], that it should tell me what I ought to do in each new instance. (Kripke 1982, 24)

When queried about “68 + 57 =?”, we are confident that the correct answer to this query is “125” because we are confident that we mean plus by “plus”. Meaning facts are normative, in that what we meant by “plus” in the past already determined how we ought to respond in the future. Nonetheless, facts about the speaker’s dispositions are descriptive: they do not say that because the speaker has been disposed to respond in this way, she should or ought to respond in that way in the future. They just describe how the speaker has used, uses or will use the word. Therefore, “this is not the proper account of the relation, which is normative, not descriptive” (Kripke 1982, 37): if you meant green by “green” in the past, you ought to apply it to this green object now. The dispositionalist cannot make such a claim, but must rather wait to see whether the speaker is or would be disposed to apply “green” to this green object.

The skeptic’s main objection against the dispositional view is that the speaker’s consistent responses cannot be counted as correct or as the responses the speaker ought to produce. If the responses that the speaker is disposed to produce cannot be viewed as correct, we cannot talk about their being in accordance with a determinate rule or a specific meaning: with no normative constraint on use, there can be no talk of meaning. According to Kripke, this is the skeptic’s chief objection to the dispositional view: “Ultimately, almost all objections to the dispositional account boil down to this one” (Kripke 1982, 24). (For defenses of the dispositional view against the skeptic see, for instance, Coates (1986), Blackburn (1984a), Horwich (1990; 1995; 2012; 2019), Ginsborg (2011; 2021), and Warren (2018).)

The skeptic’s strategy to reject the reductionist responses, such as the dispositional view, can thus be generally stated as follows: it does not matter how the speaker responds because, in whatever way she responds, it can be made compatible with her following different rules. Her answering with “125” to “68 + 57 =?” can be interpreted in such a way as to remain compatible with her following the skaddition rule. We then face a very problematic dilemma.

Suppose that one offers the following solution: each time that the speaker applies the addition rule, she has some other instruction or rule in mind, such as the “counting rule”; by appealing to this latter rule, we can then respond to the skeptic by claiming that “suppose we wish to add x and y. Take a huge bunch of marbles. First count out x marbles in one heap. Then count out y marbles in another. Put the two heaps together and count out the number of marbles in the union thus formed. The result is x + y” (Kripke 1982, 15). The skeptic’s response is obvious and based on the fact that a rule (the addition rule) is determined in terms of another rule (the counting rule). The skeptic can claim that, perhaps, by “count” the speaker always meant quount, not count; he then goes on to offer his non-standard, compatible-with-the-quus-scenario interpretation of “count” (see Kripke 1982, 16). The vicious regress of interpretations reappears, that of rules interpreting rules. At some point, we must stop and say that this rule cannot be interpreted in any other, non-standard way. The skeptic then asks: what is it about this special, “superlative” rule that prevents it from being interpreted in different ways? The skeptical challenge can be applied to this rule, unless we answer the skeptic’s question. But answering that very question is the whole point of the skeptical problem. Any attempt to escape the regress without answering the skeptic’s question, on the other hand, only makes such an alleged superlative rule mysterious.

c. Non-Reductionist Facts: Meaning as a Primitive State

The skeptic rejects a specific version of non-reductionism, according to which the fact that the speaker means plus by “plus” is primitive, irreducible to any other fact about the speaker’s behavioral or mental life. Whenever I use a word, I just directly know what I mean by it; nothing else about me is supposed to constitute this fact. The skeptic himself thinks that “such a move may in a sense be irrefutable” (Kripke 1982, 51). Nevertheless, he describes this suggestion as “desperate” (Kripke 1982, 51) and makes two objections to it: (1) it leaves the nature of such a primitive state completely mysterious, since this state supposedly possesses a general content that is present in an indefinite number of cases in which we may use the word, but our minds or brains do not have the capacity to consider each such case of use explicitly in advance; (2) it has to propose that we somehow have a direct, first-personal epistemic access to the general content of such a state, which is not known via introspection, but which seems to be, in a queer way, always available to us. The skeptic’s objections have also been called the “argument from queerness” (see Boghossian (1989; 1990) and Wright (1984)).

According to the skeptic, the non-reductionist response “leaves the nature of this postulated primitive state – the primitive state of ‘meaning addition by “plus”’ – completely mysterious” (Kripke 1982, 51). It is mysterious because it is supposed to be a finite state, embedded in the speaker’s finite mind or brain, whose capacity is limited, but it is also supposed to possess a general content that covers a potentially infinite number of cases in which the word may be used and that is always available to the speaker and tells her what the correct way of using the word is in every possible case:

Such a state does not consist in my explicitly thinking of each case of the addition table, nor even of my encoding each separate case in the brain: we lack the capacity for that. Yet (as Wittgenstein states in the Philosophical Investigations, §195) ‘in a queer way’ each such case already is ‘in some sense present’. (Kripke 1982, 52).

It is very hard, according to the skeptic, to make sense of the nature of such states that are finite but have such a general content.

Moreover, it is not clear how to explain our direct and non-inferential epistemic access to the content of these states. The primitive state of meaning plus by “plus” determines the correct use of the word in indefinitely (or even infinitely) many cases. Yet, as the skeptic says, “we supposedly are aware of it with some fair degree of certainty whenever it occurs” (Kripke 1982, 51). We directly and non-inferentially know how to use “plus” in each possible case of using it. As Wright characterizes the argument from queerness, “how can there be a state which each of us knows about, in his own case at least, non-inferentially and yet which is infinitely fecund, possessing specific directive content for no end of distinct situations?” (Wright 1984, 775). The skeptic’s claim is that there is no plausible answer to this question.

The skeptic’s skeptical argument is now complete: any reductionist or non-reductionist response to his skeptical problem is shown to be a failure. Granted that, it remains to see to what conclusions the skeptic has been leading us all along.

d. The Skeptical Conclusions and Classical Realism

George Wilson (1994; 1998) has usefully distinguished between two different conclusions that the skeptical argument establishes: (1) the Basic Skeptical Conclusion and (2) the Radical Skeptical Conclusion. The Basic Skeptical Conclusion is the outcome of the skeptic’s detailed arguments against the aforementioned candidate facts. After arguing that all of them fail to determine what the speaker means by her words, the skeptic claims that “there can be no fact as to what I mean by ‘plus’, or any other word at any time” (Kripke 1982, 21). In order to see why the argument has a further radical conclusion, we must consider why the skeptic thinks that his argument’s target is “classical realism” (Kripke 1982, 73, 85).

According to the broad realist treatment of meaning, there are facts as to what a (declarative) sentence means or what a speaker means by it. For Kripke, the early Wittgenstein in the Tractatus (1922) supports a similar view of meaning, according to which:

A declarative sentence gets its meaning by virtue of its truth conditions, by virtue of its correspondence to facts that must obtain if it is true. For example, ‘the cat is on the mat’ is understood by those speakers who realize that it is true if and only if a certain cat is on a certain mat; it is false otherwise. (Kripke 1982, 72)

We can tell the same story about the sentences by which we ascribe meaning to our own and others’ utterances, such as “Jones means plus by ‘plus’”. According to the realist, this sentence has a truth-condition: it is true if and only if Jones really means plus by “plus”, or if the fact that Jones means plus by “plus” obtains. It is a fact that Jones means plus, and not anything else, by “plus”, and depending on the sort of realist view that one holds (such as naturalist reductionist, non-naturalist, non-reductionist, and so forth), such meaning facts are either primitive or, in one way or another, constituted by some other fact about the speaker. Such a realist conception of meaning provides an explanation of why we mean what we do by our words. The skeptical argument rejects the existence of any such fact, as stated in its Basic Skeptical Conclusion.

If we support such a realist view of meaning, the skeptical argument has a very radical outcome because the combination of the Basic Skeptical Conclusion and the classical realist conception of meaning amounts to the Radical Skeptical Conclusion, according to which “there can be no such thing as meaning anything by any word” (Kripke 1982, 21). For Kripke, this conclusion captures the paradox that Wittgenstein presents in section 201 of the Philosophical Investigations. Any use you make of a word is both correct and incorrect at the same time because it is compatible with different meanings and there is no fact determining what meaning the speaker has in mind. The notion of meaning simply vanishes, together with that of correctness of use. The classical realist explanation of meaning, therefore, leads to the Wittgensteinian paradox. Kripke, however, believes that his Wittgenstein has a “solution” to this problem, though its aim is not to rescue classical realism.

3. Kripke’s Wittgenstein: The Skeptical Solution

The Radical Skeptical Conclusion seems to be obviously wrong for at least two reasons. For one thing, we do very often mean specific things by our words. For another, the Radical Skeptical Conclusion is “incredible and self-defeating” (Kripke 1982, 71) because if it were true, the skeptical conclusions themselves would not have any meaning. According to Kripke, his Wittgenstein does not “wish to leave us with his problem, but to solve it: the sceptical conclusion is insane and intolerable” (Kripke 1982, 60). Kripke’s Wittgenstein agrees with his skeptic that there is no fact about what we mean by our words and thus accepts the Basic Skeptical Conclusion: he thinks that the classical realist explanation of meaning is deeply problematic. Nonetheless, he rejects the Radical Skeptical Conclusion as unacceptable. Although there is no fact as to what someone means by her words, we do not need to accept the conclusion that there is thereby no such thing as meaning and understanding at all. What we need to do instead is to throw away the view that resulted in such a paradox, that is, the classical realist conception of meaning. Such a view was a misunderstanding of our ordinary notion of meaning.

Kripke distinguishes between two general sorts of solutions to the skeptical problem: straight solutions and skeptical solutions. A straight solution aims to show that the skeptic is wrong or unjustified in his claims (see Kripke 1982, 66). The suggested facts previously mentioned can be seen as various attempts to offer a straight solution. The skeptic argues that they are all hopeless as they lead to the paradox. A skeptical solution, however, starts by accepting the negative point of the skeptic’s argument, that is, that there is no fact as to what someone means by her words. The skeptical solution is built on the idea that “our ordinary practice or belief is justified because—contrary appearances notwithstanding—it need not require the justification the sceptic has shown to be untenable” (Kripke 1982, 67).

a. Truth-Conditions vs. Assertibility Conditions

Consider the sentences by which we attribute meaning to others and ourselves, that is, meaning-ascribing sentences, such as “Jones means plus by “plus”” or “I mean plus by “plus””. The classical realist conception of the meaning of such sentences is truth-conditional: the sentence “Jones means plus by “plus”” is true if and only if Jones means plus by “plus” (that is, if and only if the fact that Jones means plus by “plus” obtains) and thus its meaning is that Jones means plus by “plus”. Similarly, the sentence “I mean plus by “plus”” is true if and only if I do mean plus by “plus” (that is, if and only if the fact that I mean plus by “plus” obtains) and thus means that I do mean plus by “plus”. (The focus here is on third-personal attributions of meaning such as “Jones means plus by “plus””; similar considerations apply to the case of self-attributions.) The skeptic argues that there is no such fact obtaining which makes these sentences true. The skeptical solution abandons the classical realist truth-conditional treatment of meaning. (See Boghossian (1989), Horwich (1990), McDowell (1992), Peacocke (1984), Soames (1998), and Wilson (1994; 1998) for the claim that Wittgenstein’s aim has not been to rule out the notion of truth-conditions, but the classical realist conception of it.)

Alternatively, as Kripke puts it:

[His] Wittgenstein replaces the question “What must be the case for [a] sentence to be true?” by two others: first, “Under what conditions may this form of words be appropriately asserted (or denied)?”; second, given an answer to the first question, “What is the role, and the utility, in our lives of our practice of asserting (or denying) the form of words under these conditions?” (Kripke 1982, 73)

Once we give up on the classical realist view of meaning, all we need to do is to take a careful look at our ordinary practice of asserting meaning-ascribing sentences under certain conditions. Kripke’s Wittgenstein calls these conditions Assertibility Conditions or Justification Conditions (Kripke 1982, 74). In their most general sense, the assertibility conditions tell us under what conditions we are justified in asserting something specific by using a sentence. When our concern is to attribute meaning to ourselves and others, these conditions tell us when we can justifiably assert that Jones means plus by “plus” or that I follow the addition rule. We already know that we cannot say that we are justified in asserting that Jones means plus by “plus” because the fact that he means plus obtains. Nor can we do the same in our own case: there is no fact about any of us constituting the fact that we mean this rather than that by our words.

Having agreed with the skeptic that there is no fact about meaning, it seems to Kripke’s Wittgenstein that all that we are left with is our feeling of confidence, blind inclinations, mere dispositions or natural propensities to respond or to use words in one way rather than another: it seems that “I apply the rule blindly” (Kripke 1982, 17). The assertibility conditions specify the conditions under which the subject is inclined, or feels confident, to apply her words in such and such a way: “the ‘assertibility conditions’ that license an individual to say that, on a given occasion, he ought to follow his rule this way rather than that, are, ultimately, that he does what he is inclined to do” (Kripke 1982, 88). This, however, does not imply that there is thereby no such thing as meaning one thing rather than another by some words. The evidence justifying us in asserting or judging that Jones means green by “green” is our observation of Jones’s linguistic behavior, that is, his use of the word under certain publicly observable circumstances. We can justifiably assert that Jones means green by “green” if we can observe, in enough cases, that he uses this word as we do or would do, or more generally, as others in his speech-community are inclined to do. This is the only justification there is, and the only justification we need, to assert that he means green by “green”. We can also tell a story about why such a practice has the shape it has and why we are participating in it at all, without appealing to any classical realist (or other) explanation of such practices: participating in them has endless benefits for us. Consider an example from Kripke:

Suppose I go to the grocer with a slip marked ‘five red apples’, and he hands over apples, reciting by heart the numerals up to five and handing over an apple as each numeral is intoned. It is under circumstances such as these that we are licensed to make utterances using numerals. (Kripke 1982, 75-76)

We can assert that the grocer and the customer both mean five by “five”, red by “red”, and apple by “apple” if they agree in the way they are inclined to apply these terms. Our lives depend on our participation and success in such practices. If the customer responds with some bizarre answers, others, including the grocer, start losing their justification for asserting that he really means five by “five”: the only justification there is for making such assertions starts vanishing.

Note again that such agreed-on dispositions, blind inclinations or natural propensities to respond in certain ways, contrary to the dispositional account of meaning, are not supposed to form a fact that can constitute some meaning fact, such as the fact that the grocer means apple, and not anything else, by “apple”. The sort of responses we naturally agree to produce and the impact they have on our lives give rise to our “form of life”. The members of our speech-community agree to use “plus” and other words in specific ways: they are uniform in their responses. We live a plus-like form of life (see Kripke 1982, 96). However, there is and can be no (realist or otherwise) explanation of why we agree to respond as we do. Any attempt to cite some fact constituting such agreements leads to the emergence of the Wittgensteinian paradox. For this reason, it would be nothing but a brute empirical fact, a primitive aspect of our form of life, that we all agree as we do (see Kripke 1982, 91).

b. The Private Language Argument

Once we accept such an alternative picture of meaning, we realize that one of its consequences is the impossibility of a private language. Kripke’s Wittgenstein emphasizes that “if one person is considered in isolation, the notion of a rule as guiding the person who adopts it can have no substantive content” (Kripke 1982, 89). The skeptical solution cannot admit the possibility of a private language, that is, a language that someone invents and only she can understand, independently of the shared practices of a speech-community. This comes from the nature of the assertibility conditions: “It turns out that […] these conditions […] involve reference to a community. They are inapplicable to a single person considered in isolation. Thus, as we have said, Wittgenstein rejects ‘private language’” (Kripke 1982, 79).

Consider the case of a Robinson Crusoe who has been in isolation since birth on an island. Crusoe is inclined to apply his words in certain ways. He is confident, for instance, that when he applies “green” to an object, his use is correct, that he means green or in any case something determinate by this word. Facing a new object, he thinks he ought to apply “green” to this object too. As there is no one else with whose use or responses he can contrast his, all there is to assure him that his use is correct is himself and his confidence. To Crusoe, thus, whatever seems right is right, in which case no genuine notion of error, mistake, or disagreement can emerge: if he feels confident that “green” applies to a blue object, this is correct. The assertibility conditions in this case would be along these lines:

“Green” applies to this object if and only if Crusoe thinks or feels confident that “green” applies to the object.

This is the reason why Wittgenstein famously stated that “in the present case I have no criterion of correctness. One would like to say: whatever is going to seem right to me is right. And that only means that here we can’t talk about ‘right’” (Wittgenstein 1953, §258). In order for certain applications of “green” to be incorrect, there must be certain correct ways of applying it. For a solitary person, however, “there are no circumstances under which we can say that, even if he inclines to say ‘125’, he should have said ‘5’, or vice versa” (Kripke 1982, 88). The correct answer is simply “the answer that strikes him as natural and inevitable” (Kripke 1982, 88). Crusoe’s use is wrong only when he feels it is wrong.

Nonetheless, if Crusoe is a member of a speech-community, a new element enters the picture: although Crusoe may simply feel confident that applying “green” to this (blue) object is correct, others in his speech-community disagree. The assertibility conditions for how “green” applies turn into the following condition:

“Green” applies to this object if and only if others are inclined to apply “green” to that object, or if others feel confident that “green” applies to it.

As Kripke’s Wittgenstein puts it, “others will then have justification conditions for attributing correct or incorrect rule-following to the subject, and these will not be simply that the subject’s own authority is unconditionally to be accepted” (Kripke 1982, 89). This is the reason why Kripke thinks that Wittgenstein’s argument against the possibility of private language (known as the private language argument) is not an independent argument. Nor is it the main concern of Wittgenstein in the Investigations. Rather, it is the consequence of Wittgenstein’s new way of looking at our linguistic practices, according to which speaking and understanding a language is a sort of activity. As Wittgenstein famously puts it, “to understand a sentence means to understand a language. To understand a language means to be master of a technique” (Wittgenstein 1953, §199). If so, then “to obey a rule, to make a report, to give an order, to play a game of chess, are customs (uses, institutions)” (Wittgenstein 1953, §199). There is an extensive literature on the implications of the private language argument as well as Kripke’s reading of it (see for instance, Baker and Hacker (1984), Bar-On (1992), Blackburn (1984a), Davies (1988), Hanfling (1984), Hoffman (1985), Kusch (2006), Malcolm (1986), McDowell (1984; 1989), McGinn (1984), Williams (1991), and Wright (1984; 1991)).

4. Responses and Criticisms

Since the publication of Kripke’s book, almost every aspect of his interpretation of Wittgenstein has been carefully examined. The responses can be put in three main categories: those focusing on the correctness of Kripke’s interpretation of Wittgenstein, those discussing the plausibility of the skeptical argument and solution, and those attempting to offer an alternative solution to the skeptical problem. Many interesting and significant issues, which were first highlighted by Kripke in his book, have since turned into self-standing topics, such as the normativity of meaning, the dispositional view of meaning, and the community conception of language. In what follows, it will only be possible to touch on some of the most famous responses to Kripke’s Wittgenstein. These responses mainly debate individualist vs. communitarian readings of Wittgenstein and reductionist factualist vs. non-reductionist factualist interpretations of his remarks.

In their 1984 book, Scepticism, Rules and Language, Baker and Hacker defend an individualistic reading of Wittgenstein’s view of the notion of a practice and thereby reject Kripke’s suggested communitarian interpretation. For them, not only does Kripke misrepresent Wittgenstein, but the skeptical argument and the skeptical solution are both wrong. They believe that Wittgenstein never aimed to reject a philosophical view and defend another. Thus, they find it entirely unacceptable to agree with Kripke that Wittgenstein “who throughout his life found philosophical scepticism nonsensical […] should actually make a sceptical problem the pivotal point of his work. It would be even more surprising to find him accepting the sceptic’s premises […] rather than showing that they are ‘rubbish’” (Baker and Hacker 1984, 5). According to Baker and Hacker, the skeptical argument cannot even be treated as a plausible sort of skepticism; it rather leads to pure nihilism: “Why his argument is wrong may be worth investigating (as with any paradox), but that it is wrong is indubitable. It is not a sceptical problem but an absurdity” (Baker and Hacker 1984, 5). For, as they see it, a legitimate skepticism about a subject matter involves only epistemological rather than metaphysical doubts. An epistemological skeptic would claim that we do mean specific things by our words (as we normally do) but, for some reason, we can never be certain what that meaning is. For Kripke’s Wittgenstein’s skeptic, however, there is no fact about meaning at all and this leads to a paradox, which results in the conclusion that there is no such thing as meaning anything by any word. But “this is not scepticism at all, it is conceptual nihilism, and, unlike classical scepticism, it is manifestly self-refuting” (Baker and Hacker 1984, 5).

According to the way Baker and Hacker read Wittgenstein, the paradox mentioned in section 201 of the Investigations is intended by Wittgenstein to reveal a misunderstanding, not something that we should live with, and “this is shown by the fact that no interpretation, i.e. no rule for the application of a rule, can satisfy us, can definitively fix, by itself, what counts as accord. For each interpretation generates the same problem” (Baker and Hacker 1984, 13). Our understanding of words has nothing to do with the task of fixing a mediating interpretation because the result of such an attempt is a regress of interpretations. For Wittgenstein, understanding is nothing but that which manifests itself in our use of words, in our actions, in the technique of using language. Thus, Wittgenstein cannot be taken to be offering a skeptical solution either.

Moreover, for Baker and Hacker, the community view that Kripke attributes to Wittgenstein, as Wittgenstein’s alternative view, must be thrown away. For if it is the notion of a practice that Wittgenstein thinks of as fundamental, we can find no compelling reason to conclude that Crusoe cannot come up with a practice, in the sense of acquiring a technique to use his words and symbols. After all, it is enough that such an understanding manifests itself in Crusoe’s practices. According to Baker and Hacker, to participate in a practice is not just to act but to repeat an action over time with regularity. If so, then “nothing in this discussion involves any commitment to a multiplicity of agents. All the emphasis is on the regularity, the multiple occasions, of action” (Baker and Hacker 1984, 20).

Blackburn also defends an individualistic reading of Wittgenstein. For him, there is no metaphysical difference between the case of Crusoe and the case of a community. For whatever is available to Kripke’s Wittgenstein to avoid the skeptical problem in the case of a community of speakers is equally available to an anti-communitarian defending the case of Crusoe as a case of genuine rule-following. For instance, consider the problem with the finiteness of dispositions. If the objection is that the totality of the dispositions of an individual, because of being finite, fails to determine what the individual means by her words, the totality of the dispositions of a community too is finite and thus fails to determine what they mean by their words. This means that the community can also be seen as following the skaddition rule: the agreement in their similar responses would remain compatible with both scenarios, that is, their following the addition rule and their following the skaddition rule.

On the other hand, according to Blackburn, if the claim is merely that it is only within a community of speakers that a practice can emerge, we are misreading Wittgenstein. The claim that a practice emerges only within a community may mean different things. It might for instance mean that to Crusoe, whatever seems right is right, so that a community is inevitably required to draw a distinction between what is right and what only seems right. As Blackburn points out, however, the case of an individual and that of a community do not differ metaphysically with respect to this issue because the same problem arises in the case of a community: whatever seems right to the community is right. Alternatively, the claim may mean that it is only because of the interactions between the members of a community that the notion of a practice can be given a legitimate meaning. Blackburn’s objection is that we have no argument against the possibility that Crusoe can interact with himself and thus form a practice: we can imagine that Crusoe interacts with his past self, with the symbols, signs and the like that he used in the past. There is no reason to assume that because his responses are not like ours, Crusoe’s practice is not a practice. The point is that if he is part of no community, there simply is no requirement that he responds as any others do. Consequently, it is implausible to claim that we can see ourselves as rule-followers only within a speech-community, for why is it that Crusoe cannot see himself as a rule-follower?

For Blackburn, the negative point that Wittgenstein makes is that we must not think of the connection between use of words and understanding them as mediated by something, such as some interpretation, mental image, idea, and so forth, because doing so leads to the regress of interpretations: the search for some other medium making the previous one fixed would go on forever. This is a misunderstanding of our practices. Wittgenstein’s positive insight is that “our rules are anchored in practice […] That is, dignifying each other as rule-following is essentially connected with seeing each other as successfully using techniques or practices” (Blackburn 1984a, 296). But such a notion of a practice is not necessarily hinged on a community: “we must not fall into the common trap of simply equating practice with public practice, if the notion is to give us the heartland of meaning” (Blackburn 1984b, 85). Blackburn, thus, defends an individualist view of rule-following against the communitarian view that Kripke’s Wittgenstein offers in his skeptical solution.

Colin McGinn, in his well-known book Wittgenstein on Meaning (1984), also defends an individualist reading of Wittgenstein. Some of his objections are similar to those made by Blackburn and by Baker and Hacker: Kripke neglects Wittgenstein’s positive remark, offered in the second part of section 201 of the Investigations, that the paradox is the result of a misunderstanding that must be removed. For McGinn, this forms a reductio of the conception of meaning that treats the notion of interpretation as essential to the possibility of understanding a language (McGinn 1984, 68). Wittgenstein’s aim has been to remove a misconception of this notion, according to which understanding is a kind of mental process, such as that of translating or interpreting words. Kripke is thus unjustified in his claim that Wittgenstein offers a skeptical problem and then a skeptical solution to such a problem. For McGinn, Wittgenstein has never been hostile to notions like “facts” and “truth-conditions” as they are ordinarily used; his target has rather been to unveil a misunderstanding of them, one that builds on the notion of interpretation. This means that McGinn supports a factualist reading of Wittgenstein against the non-factualist view that Kripke seems to attribute to him. This factualist view takes the notion of a practice, or the ability to use words in certain ways, to form a fact as to what someone means by her words: “At any rate, if we want to talk in terms of facts it seems that Wittgenstein does suggest that understanding consists in a fact, the fact of having an ability to use signs” (McGinn 1984, 71). (For some of the well-known factualist readings of Wittgenstein, and the skeptical solution, see, for instance, Byrne (1996), Davies (1998), Soames (1997; 1998), Stroud (1996), and Wilson (1994; 1998). See also Boghossian (1989; 1990), Kusch (2006) and Miller (2010) for further discussions.)

Moreover, for McGinn, the notion of a practice or a custom does not involve the notion of a community. Thus, he agrees with Blackburn and with Baker and Hacker on this point. It is true that Wittgenstein embraces the idea of multiplicity, but this has nothing to do with the multiplicity of subjects, but rather with a multiplicity of instances of rule-following: a word cannot be said to have a meaning if it is used just once; meaning emerges as the result of using words repeatedly over time in a certain way. He also sees the skeptic’s objections to non-reductionism as misplaced. For him, if we treat meaning as an irreducible state of the speaker, we may have a difficult time coming up with a theory that can explain how we directly know the general content of such states. But “lack of a theory of a phenomenon is not in itself a good reason to doubt the existence of it” (McGinn 1984, 161). (For a well-known criticism of McGinn’s view, see Wright (1989).)

On the other hand, McDowell and Peacocke have defended a communitarian reading of Wittgenstein. According to Peacocke, Wittgenstein’s considerations on rule-following reveal that following a rule is a practice, which is essentially communal: “what it is for a person to be following a rule, even individually, cannot ultimately be explained without reference to some community” (Peacocke 1981, 72). We need some public criteria in order to be able to draw the distinction between what seems right to the individual and what is right independently of what merely seems to her to be so, and to assess whether she follows a rule correctly; these criteria would emerge only if the individual can be considered as a member of a speech-community. For Peacocke, Wittgenstein has shown that the individualistic accounts of rule-following are based on a misunderstanding of what is fundamental to the existence of our ordinary linguistic practices.

According to McDowell, Kripke has misinterpreted Wittgenstein’s central point in his remarks on the paradox presented especially in section 201 of the Investigations. Wittgenstein’s chief remark is offered in the second part of the same section, where he says: “It can be seen that there is a misunderstanding here […] What this shews is that there is a way of grasping a rule which is not an interpretation, but which is exhibited in what we call ‘obeying the rule’ and ‘going against it’ in actual cases” (Wittgenstein 1953, §201). If Wittgenstein views the paradox as the result of a misunderstanding, we cannot claim that he is sympathetic to any skeptic. According to McDowell, for Wittgenstein, the paradox comes not from adopting a realist picture of meaning but from a misconception of our linguistic practices, according to which meaning and understanding are mediated by some interpretation. When we face the question as to what constitutes such an understanding, “we tend to be enticed into looking for a fact that would constitute my having put an appropriate interpretation on what I was told and shown when I was instructed in [for instance] arithmetic” (McDowell 1984, 331). Such a conception of a fact determining an intermediate interpretation is a misunderstanding. For, as Wittgenstein famously said, “any interpretation still hangs in the air along with what it interprets, and cannot give it any support” (Wittgenstein 1953, §198).

For McDowell, if we miss this fundamental point, we then face a devastating dilemma: (1) we try to find facts that fix an interpretation, which obviously leads to the regress of interpretations; but then, (2) in order to escape such a regress, we may be tempted to read Wittgenstein as claiming that to understand is to possess an interpretation but “an interpretation that cannot be interpreted” (McDowell 1984, 332). The latter attempt, however, dramatically fails to dodge the regress of interpretations: it rather pushes us toward an even worse difficulty, that is, that there is a superlative rule which is, in a mysterious way, not susceptible to the problem of the regress of interpretations. For McDowell, “one of Wittgenstein’s main concerns is clearly to cast doubt on this mythology” (McDowell 1984, 332). Understanding has nothing to do with mediating interpretations at all.

McDowell is also against the skeptical solution, which begins by accepting the (basic) skeptical conclusion of the skeptical argument: the whole point of Wittgenstein’s discussion of the paradox in the second part of section 201 has been to warn us against the paradox and to show that the dilemma in question is not compulsory. The paradox emerges as the result of a misunderstood treatment of meaning and understanding, according to which understanding involves interpretation. If so, there is then no need for a skeptical solution at all. For McDowell, once we fully appreciate Wittgenstein’s point about the paradox, we can see that there really is nothing wrong with our ordinary talk of communal facts, that is, facts as to what we mean by our words in a speech-community: “I simply act as I have been trained to. […] The training in question is initiation into a custom. If it were not that […] our picture would not contain the materials to entitle us to speak of following (going by) a sign-post” (McDowell 1984, 339). To understand a language is to master the technique of using this language, that is, to acquire a practical ability. This, however, does not imply admitting a purely behaviorist view of language and thereby emptying the notion of meaning of its normative feature. McDowell’s Wittgenstein treats acting in a certain way in a community “as acting within a communal custom” (McDowell 1984, 352), which is a rule-governed activity.

As we saw, Blackburn, McGinn, and Baker and Hacker defend an individualist reading of Wittgenstein’s remarks on rule-following, while Peacocke and McDowell support a communitarian one. Boghossian (1989) and Goldfarb (1985) also raise serious doubts about whether the skeptical solution can successfully make the notion of a community central to the existence of the practice of meaning something by a word. For them, the assertibility conditions are either essentially descriptive, rather than normative (Goldfarb 1985, 482-485), or they are capable of being characterized in an individualistic way, in which no mention of others’ shared practices is made at all (Boghossian 1989, 521-522). Nonetheless, defending an individualist view of meaning is one thing, advocating a factualist view of it is another: there are individualist factualist views (such as McGinn’s), as well as communitarian factualist views (such as McDowell’s). Moreover, the factualist views may themselves be reductionist (such as Horwich’s) or non-reductionist (such as Wright’s).

For instance, although Wright has offered various criticisms of Kripke’s Wittgenstein’s view, he thinks that the proper solution to the skeptical problem is a particular version of non-reductionist factualism. Like McGinn, Wright finds the skeptic’s argument from queerness against non-reductionism unconvincing (Wright 1984, 775ff.). Nonetheless, contrary to McGinn, he believes that we need to solve the epistemological problems that come with such a view. According to Wright, the generality of the content of our semantical and intentional states or, as he calls it, their “indefinite fecundity”, is not mysterious at all: it is simply part of the ordinary notion of meaning and intention that these states possess such a general content. Wright gives an example to clarify his point: “suppose I intend, for example, to prosecute at the earliest possible date anyone who trespasses on my land” (Wright 1984, 776). The content of such an intention is general: it does not constrain my action to a specific time, occasion, or person, so that “there can indeed be no end of distinct responses, in distinct situations, which I must make if I remember this intention, continue to wish to fulfil it, and correctly apprehend the prevailing circumstances” (Wright 1984, 776). If so, the main problem with non-reductionism is to account for the problem of self-knowledge, that is, to offer an account of why and how it is that we, as first-persons, non-inferentially and directly know the general content of our meaning states on each occasion of use. For one thing, it is part of our ordinary notion of meaning and intention that “a subject has, in general, authoritative and non-inferential access to the content of his own intentions, and that this content may be open-ended and general, may relate to all situations of a certain kind” (Wright 1984, 776). For another, however, Wright believes that we must, and can, account for such a phenomenon. 
He attempts to put forward an account of how we, as first persons, know what we mean and intend in a way that differs from the way others, as third persons, know such meanings and intentions. His account is called the “Judgement-Dependent” account of meaning and intention, which Wright develops in several of his writings. Unpacking this account involves much technicality that goes beyond the scope of this article. (See especially Wright (1992; 2001) for his account. For a different response-dependent response to Kripke’s Wittgenstein, which also defends non-reductionism, see Pettit (1990).) Wright’s main point is that the fact that the non-reductionist response must deal with the problem of self-knowledge forms no decisive argument against its plausibility. On this point, Boghossian is on board with Wright: in order to reject the non-reductionist response what the skeptic needs to do is to provide “a proof that no satisfactory epistemology was ultimately to be had” (Boghossian 1989, 542). The skeptic, however, has no such argument to offer. For Wright, this means that if we explain these features of meaning, non-reductionism “is available to confront Kripke’s sceptic, and that, so far as I can see, the Sceptical Argument is powerless against it” (Wright 1984, 776). (For more on Wright’s criticisms of Kripke, see Wright (1986; 1992, appendix to chapter 3; 2001, part II). For the main defenses of non-reductionism against Kripke’s Wittgenstein, see also Hanfling (1985), Pettit (1990), and Stroud (1996).)

Paul Horwich, on the contrary, defends a communitarian version of reductionist factualism, or more accurately a communitarian version of the dispositional view, against the skeptic. His main aim is to show that “Wittgenstein’s equation of meaning with ‘use’ (construed non-semantically) is the taken-to-be-obvious centrepiece of his view of the matter, […] [contrary] to Kripke’s interpretation [that] the centrepiece is his criticism of that equation!” (Horwich 2012, 146). For Horwich, facts about the speaker’s environment, or more particularly facts about his linguistic community, are important and must be carefully accounted for in any account of meaning. His community-based dispositional view goes against the individualistic theory, according to which “what a person means is determined solely by the dispositions of that person” (Horwich 1990, 111). The community-based version of this view holds instead that “individuals are said to mean by a word whatever that word means in the linguistic community they belong to”. Horwich calls this view the Community-Use Theory. According to Horwich, there are (naturalistic) facts with normative consequences, that is, facts about how a speaker is naturally disposed to respond as a member of a speech-community. If we accept what Horwich calls uncontroversial universal principles, that is, principles of the form “Human beings should be treated with respect”, “one should believe the truth”, and the like, we can then see that such principles are capable of entailing conditionals that have certain factual claims as their antecedents and certain normative claims as their consequents. Such conditionals would have the following form: “If Jones is a human being, then he ought to be treated with respect” or “If it is true that 68 + 57 = 125, then one ought to believe it” (see Horwich 1990, 112). All we then need are certain agreed-on principles that can tell us what the normative outcomes of non-normative situations are.
Since we can have non-semantical, dispositional facts as the antecedents of these conditionals, it would be a mistake to think that factual claims, such as those made by the naturalistic dispositional view of meaning, cannot have normative consequences. For Horwich, therefore, the communal version of the dispositional view can accommodate the normative feature of meaning: factual claims about what a speaker means, whose truth depends on certain facts about the speaker’s dispositions being in agreement with those of the members of the speech-community, can have normative outcomes. Horwich also engages in detailed discussions of Wittgenstein’s view of the deflationary theory of truth, different aspects of the normativity of meaning thesis, and the notion of communal dispositions. (For a different sort of reductionist dispositional view, which treats the dispositional facts as irreducibly normative, see Ginsborg (2011; 2018; 2021). See also Maddy (2014) and Marie McGinn (2010) for certain naturalist responses to Kripke’s Wittgenstein.)

Further salient reactions to Kripke’s Wittgenstein, such as those made by Chomsky (1986), Goldfarb (1985), Kusch (2006), Pettit (1992), and Soames (1997), are too technical to be properly unpacked in this article. Reference to some further key works on the topic can be found in the Further Reading section.

5. References and Further Reading

a. References

  • Armstrong, David. 1997. A World of States of Affairs. Cambridge: Cambridge University Press.
  • Baker, Gordon P. and Hacker, P. M. S. 1984. Scepticism, Rules and Language. Oxford: Basil Blackwell.
  • Bar-On, Dorit. 1992. “On the Possibility of a Solitary Language”. Nous 26(1): 27–45.
  • Bird, Alexander. 1998. “Dispositions and Antidotes”. The Philosophical Quarterly 48: 227–234.
  • Blackburn, Simon. 1984a. “The Individual Strikes Back.” Synthese 58: 281–302.
  • Blackburn, Simon. 1984b. Spreading the Word. Oxford: Oxford University Press.
  • Boghossian, Paul. 1989. “The Rule-Following Considerations”. Mind 98: 507–549.
  • Boghossian, Paul. 1990. “The Status of Content”. The Philosophical Review 99(2): 157–184.
  • Boghossian, Paul. 2003. “The Normativity of Content”. Philosophical Issues 13: 31–45.
  • Boghossian, Paul. 2008. “Epistemic Rules”. The Journal of Philosophy 105(9): 472–500.
  • Byrne, Alex. 1996. “On Misinterpreting Kripke’s Wittgenstein”. Philosophy and Phenomenological Research 56(2): 339–343.
  • Carnap, Rudolf. 1928. The Logical Structure of the World. Berkeley: University of California Press.
  • Chomsky, Noam. 1986. Knowledge of Language: Its Nature, Origin and Use. New York: Praeger.
  • Coates, Paul. 1986. “Kripke’s Sceptical Paradox: Normativeness and Meaning”. Mind 95(377): 77–80.
  • Davies, David. 1998. “How Sceptical is Kripke’s ‘Sceptical Solution’?”. Philosophia 26: 119–140.
  • Davies, Stephen. 1988. “Kripke, Crusoe and Wittgenstein”. Australasian Journal of Philosophy 66(1): 52–66.
  • Gibbard, Allan. 1994. “Meaning and Normativity”. Philosophical Issues 5: 95–115.
  • Gibbard, Allan. 2013. Meaning and Normativity. Oxford: Oxford University Press.
  • Ginsborg, Hannah. 2011. “Primitive Normativity and Scepticism about Rules”. The Journal of Philosophy 108(5): 227–254.
  • Ginsborg, Hannah. 2018. “Normativity and Concepts”. In The Oxford Handbook of Reasons and Normativity, edited by Daniel Star, 989–1014. Oxford: Oxford University Press.
  • Ginsborg, Hannah. 2021. “Going On as One Ought: Kripke and Wittgenstein on the Normativity of Meaning”. Mind & Language: 1–17.
  • Glock, Hans-Johann. 2019. “The Normativity of Meaning Revisited”. In The Normative Animal?, edited by Neil Roughley and Kurt Bayertz, 295–318. Oxford: Oxford University Press.
  • Glüer, Kathrin and Wikforss, Åsa. 2009. “Against Content Normativity”. Mind 118(469): 31–70.
  • Goldfarb, Warren. 1985. “Kripke on Wittgenstein on Rules”. The Journal of Philosophy 82(9): 471–488.
  • Goodman, Nelson. 1973. Fact, Fiction and Forecast. Indianapolis: Bobbs-Merill.
  • Hanfling, Oswald. 1984. “What Does the Private Language Argument Prove?”. The Philosophical Quarterly 34(137): 468–481.
  • Hattiangadi, Anandi. 2006. “Is Meaning Normative?”. Mind and Language 21(2): 220–240.
  • Hattiangadi, Anandi. 2007. Oughts and Thoughts: Rule-Following and the Normativity of Content. Oxford: Oxford University Press.
  • Hattiangadi, Anandi. 2010. “Semantic Normativity in Context”. In New Waves in Philosophy of Language, edited by Sarah Sawyer, 87–107. London: Palgrave Macmillan.
  • Hoffman, Paul. 1985. “Kripke on Private Language”. Philosophical Studies 47: 23–28.
  • Horwich, Paul. 1990. “Wittgenstein and Kripke on the Nature of Meaning”. Mind and Language 5(2): 105–121.
  • Horwich, Paul. 1995. “Meaning, Use and Truth”. Mind 104(414): 355–368.
  • Horwich, Paul. 2012. Wittgenstein’s Metaphilosophy. Oxford: Oxford University Press.
  • Horwich, Paul. 2019. “Wittgenstein (and his Followers) on Meaning and Normativity”. Disputatio 8(9): 1–25.
  • Kripke, Saul. 1982. Wittgenstein on Rules and Private Language. Cambridge, MA.: Harvard University Press.
  • Kusch, Martin. 2006. A Sceptical Guide to Meaning and Rules: Defending Kripke’s Wittgenstein. Chesham: Acumen.
  • Lewis, David. 1997. “Finkish Dispositions”. The Philosophical Quarterly 47: 143–158.
  • Maddy, Penelope. 2014. The Logical Must: Wittgenstein on Logic. Oxford: Oxford University Press.
  • Malcolm, Norman. 1986. Nothing is Hidden. Oxford: Basil Blackwell.
  • McDowell, John. 1984. “Wittgenstein on Following a Rule”. Synthese 58: 325–363.
  • McDowell, John. 1989. “One Strand in the Private Language Argument”. Grazer Philosophische Studien 33(1): 285–303.
  • McDowell, John. 1991. “Intentionality and Inferiority in Wittgenstein”. In Meaning Scepticism, edited by Klaus Puhl, 148–169. Berlin: De Gruyter.
  • McDowell, John. 1992. “Meaning and Intentionality in Wittgenstein’s Later Philosophy”. Midwest Studies in Philosophy 17(1): 40–52.
  • McGinn, Colin. 1984. Wittgenstein on Meaning. Oxford: Basil Blackwell.
  • McGinn, Marie. 2010. “Wittgenstein and Naturalism”. In Naturalism and Normativity, edited by Mario De Caro and David Macarthur, 322–351. New York: Columbia University Press.
  • Mellor, David Hugh. 2000. “The Semantics and Ontology of Dispositions”. Mind 109: 757–780.
  • Miller, Alexander. 2010. “Kripke’s Wittgenstein, Factualism and Meaning”. In The Later Wittgenstein on Language, edited by Daniel Whiting, 213–230. Basingstoke: Palgrave Macmillan.
  • Miller, Alexander. 2019. “Rule-Following, Meaning, and Primitive Normativity”. Mind 128(511): 735–760.
  • Mumford, Stephen. 1998. Dispositions. Oxford: Oxford University Press.
  • Peacocke, Christopher. 1981. “Rule-Following: The Nature of Wittgenstein’s Arguments”. In Wittgenstein: To Follow a Rule, edited by Steven Holtzman and Christopher Leich, 72–95. NY: Routledge.
  • Peacocke, Christopher. 1984. “Review of Wittgenstein on Rules and Private Language by Saul A. Kripke”. The Philosophical Review 93(2): 263–271.
  • Pettit, Philip. 1990. “The Reality of Rule-Following”. Mind 99(393): 1–21.
  • Prior, Elizabeth. 1985. Dispositions. Aberdeen: Aberdeen University Press.
  • Railton, Peter. 2006. “Normative Guidance”. In Oxford Studies in Metaethics: Volume 1, edited by Russ Shafer-Landau, 3–34. Oxford: Clarendon Press.
  • Sellars, Wilfrid. 1958. “Counterfactuals, Dispositions and the Causal Modalities”. Minnesota Studies in the Philosophy of Science 2: 225–308.
  • Soames, Scott. 1997. “Scepticism about Meaning, Indeterminacy, Normativity, and the Rule-Following Paradox”. Canadian Journal of Philosophy 27: 211–249.
  • Soames, Scott. 1998. “Facts, Truth Conditions, and the Skeptical Solution to the Rule-Following Paradox”. Nous 32(12): 313–348.
  • Stroud, Barry. 1996. “Mind, Meaning, and Practice”. In The Cambridge Companion to Wittgenstein, edited by Hans Sluga and David G. Stern, 296–319. Cambridge: Cambridge University Press.
  • Warren, Jared. 2020. “Killing Kripkenstein’s Monster”. Nous 54(2): 257–289.
  • Wedgwood, Ralph. 2006. “The Meaning of ‘Ought’”. In Oxford Studies in Metaethics: Volume 1, edited by Russ Shafer-Landau, 127–160. Oxford: Clarendon Press.
  • Wedgwood, Ralph. 2007. The Nature of Normativity. Oxford: Oxford University Press.
  • Whiting, Daniel. 2007. “The Normativity of Meaning Defended”. Analysis 67(294): 133–140.
  • Whiting, Daniel. 2013. “What Is the Normativity of Meaning?”. Inquiry 59(3): 219–238.
  • Williams, Meredith. 1991. “Blind Obedience: Rules, Community and the Individual”. In Meaning Scepticism, edited by Klaus Puhl, 93–125. Berlin: De Gruyter.
  • Wilson, George. 1994. “Kripke on Wittgenstein and Normativity”. Midwest Studies in Philosophy 19(1): 366–390.
  • Wilson, George. 1998. “Semantic Realism and Kripke’s Wittgenstein”. Philosophy and Phenomenological Research 58(1): 99–122.
  • Wittgenstein, Ludwig. 1922. Tractatus Logico-Philosophicus. Translated by C. K. Ogden. London: Kegan Paul.
  • Wittgenstein, Ludwig. 1953. Philosophical Investigations. Translated by G. E. M. Anscombe. Oxford: Basil Blackwell.
  • Wittgenstein, Ludwig. 1956. Remarks on the Foundations of Mathematics. Translated by G. E. M. Anscombe. Edited by G. H. von Wright, R. Rhees, and G. E. M. Anscombe. Oxford: Basil Blackwell.
  • Wright, Crispin. 1984. “Kripke’s Account of the Argument Against Private Language”. The Journal of Philosophy 81(12): 759–778.
  • Wright, Crispin. 1986. “Rule-Following, Meaning and Constructivism”. In Meaning and Interpretation, edited by Charles Travis, 271–297. Oxford: Blackwell.
  • Wright, Crispin. 1989. “Critical Study of Colin McGinn’s Wittgenstein on Meaning”. Mind 98(390): 289–305.
  • Wright, Crispin. 1991. “Wittgenstein’s Later Philosophy of Mind: Sensation, Privacy and Intention”. In Meaning Scepticism, edited by Klaus Puhl, 126–147. Berlin: De Gruyter.
  • Wright, Crispin. 1992. Truth and Objectivity. Cambridge, MA: Harvard University Press.
  • Wright, Crispin. 2001. Rails to Infinity: Essays on Themes from Wittgenstein’s Philosophical Investigations. Cambridge, MA: Harvard University Press.
  • Zalabardo, Jose. 1997. “Kripke’s Normativity Argument”. Canadian Journal of Philosophy 27(4): 467–488.

b. Further Reading

  • Bloor, David. 1997. Wittgenstein, Rules and Institutions. New York: Routledge.
  • Cavell, Stanley. 1990. Conditions Handsome and Unhandsome. Chicago: University of Chicago Press.
  • Cavell, Stanley. 2005. Philosophy the Day After Tomorrow. Cambridge, MA: Belknap Press of Harvard University Press.
  • Cavell, Stanley. 2006. “The Wittgensteinian Event”. In Reading Cavell, edited by Alice Crary and Sanford Shieh, 8–25. NY: Routledge.
  • Coates, Paul. 1997. “Meaning, Mistake, and Miscalculation”. Minds and Machines 7(2): 171–197.
  • Davidson, Donald. 1992. “The Second Person”. Midwest Studies in Philosophy 17: 255–267.
  • Davidson, Donald. 1994. “The Social Aspect of Language”. In The Philosophy of Michael Dummett, edited by B. McGuinness, 1–16. Dordrecht: Kluwer.
  • Diamond, Cora. 1989. “Rules: Looking in the Right Place”. In Wittgenstein: Attention to Particulars, edited by D. Z. Phillips and Peter Winch, 12–34. Hampshire: Basingstoke.
  • Ebbs, Gary. 1997. Rule-Following and Realism. Cambridge, MA: Harvard University Press.
  • Forbes, Graeme R. 1984. “Scepticism and Semantic Knowledge”. Proceedings of the Aristotelian Society 84: 223–237.
  • Hacking, Ian. 1993. “On Kripke’s and Goodman’s Uses of ‘Grue’”. Philosophy 68(265): 269–295.
  • Hanfling, Oswald. 1985. “Was Wittgenstein a Skeptic?”. Philosophical Investigations 8: 1–16.
  • Katz, Jerrold J. 1990. The Metaphysics of Meaning. Cambridge, MA: MIT Press.
  • Maddy, Penelope. 1986. “Mathematical Alchemy”. The British Journal for the Philosophy of Science 37(3): 279–314.
  • McGinn, Marie. 1997. The Routledge Guidebook to Wittgenstein’s Philosophical Investigations. New York: Routledge.
  • Miller, Alexander. 2020. “What Is the Sceptical Solution?”. Journal for the History of Analytical Philosophy 8 (2): 1–22.
  • Millikan, Ruth Garrett. 1990. “Truth Rules, Hoverflies, and the Kripke-Wittgenstein Paradox”. The Philosophical Review 99(3): 323–353.
  • Peacocke, Christopher. 1992. A Study of Concepts. Cambridge, MA: MIT Press.
  • Searle, John R. 2002. Consciousness and Language. Cambridge: Cambridge University Press.
  • Smart, J. J. C. 1992. “Wittgenstein, Following a Rule, and Scientific Psychology”. In The Scientific Enterprise, edited by Edna Ullmann-Margalit, 123–138. Berlin: Springer.
  • Stern, David. 1995. Wittgenstein on Mind and Language. Oxford: Oxford University Press.
  • Stern, David. 2004. Wittgenstein’s Philosophical Investigations: An Introduction. Cambridge: Cambridge University Press.
  • Tait, William W. 1986. “Wittgenstein and the ‘Skeptical Paradoxes’”. Journal of Philosophy 83(9): 475–488.
  • Wilson, George. 2006. “Rule-Following, Meaning and Normativity”. In The Oxford Handbook of Philosophy of Language, edited by Ernest Lepore and Barry C. Smith, 1–18. Oxford: Oxford University Press.
  • Wilson, George. 2011. “On the Skepticism about Rule-Following in Kripke’s Version of Wittgenstein”. In Saul Kripke, edited by Alan Berger, 253–289. Cambridge: Cambridge University Press.


Author Information

Ali Hossein Khani
Email: hosseinkhani@irip.ac.ir
Iranian Institute of Philosophy (IRIP)
Iran

Epistemic Conditions of Moral Responsibility

What conditions on a person’s knowledge must be satisfied in order for them to be morally responsible for something they have done? The first two decades of the twenty-first century saw a surge of interest in this question. Must an agent, for example, be aware that their conduct is all-things-considered wrong to be blameworthy for it? Or could something weaker than this epistemic state suffice, such as having a mere belief in the act’s wrong-making features, or having the mere capacity for awareness of these features? Notice that these questions are not reducible to the question of whether moral responsibility for something requires free will or control over it. Initially, then, it is worth treating the epistemic condition (otherwise known as the “cognitive” or “knowledge” condition) on moral responsibility as distinct from the control condition. As we shall see, however, some make it part of the control condition.

This article introduces the epistemic conditions of moral responsibility. It starts by clarifying the parameters of the topic and then identifies the two most significant debates in the epistemic condition literature: (1) the debate on whether blameworthiness for wrongdoing requires awareness of wrongdoing, and (2) the debate on whether responsibility for the consequences of our behaviour requires foreseeing those consequences. The bulk of the rest of the article is devoted to an overview of each debate, and it closes with a consideration of future directions for research on the epistemic condition—especially concerning moral praiseworthiness, collective responsibility, and criminal liability.

Table of Contents

  1. The Epistemic Conditions: The Topic
  2. The Epistemic Conditions of Culpable Misconduct
    1. Basic & Control-Based Views
      1. Strong Internalism (aka “Volitionism”)
      2. Weak Internalism
      3. Basic and Control-Based Externalism
    2. Capacitarian Views
      1. Capacitarian Externalism
      2. Capacitarian Internalism?
    3. Quality-of-Will Views
      1. Moral Quality-of-Will Theories
      2. Epistemic Vice Theories
    4. Hybrid and Pluralist Views
  3. The Epistemic Conditions of Derivative Responsibility
    1. Foresight and Foreseeability Views
    2. No-Foreseeability Views
  4. Future Areas for Research
  5. References and Further Reading

1. The Epistemic Conditions: The Topic

The epistemic conditions of moral responsibility specify an epistemic property (or set of properties) that an agent must possess in order to be morally responsible for an act, attitude, trait, or event. “Epistemic” is understood loosely to mean “cognitive” or “intellectual.” The sense of “responsibility” here is, of course, to be distinguished from the sense of responsibility as a baseline moral capacity (being a “morally responsible agent”), as a virtue (“she is a very responsible child”), or as a role or obligation (having “the responsibility” to do something). The relevant sense of responsibility is the one involved in being held responsible for something, implying accountability, or eligibility for praise or blame, for that thing. Moreover, nearly every theorist of the epistemic condition takes the “backward-looking” perspective on accountability, according to which praise or blame is fitting only in response to something about the agent or something they have done in the past, rather than fitting for the purposes of bringing about good consequences (as on “forward-looking” views).

The topic of the epistemic condition actually has a rather large scope. For anything X that we can be held responsible for—whether X is an act, omission, mental state, character trait, event, or state of affairs—we might be concerned with the epistemic conditions of responsibility in general, for X, or the epistemic conditions of praiseworthiness or blameworthiness in particular, for X. Moreover, we might be concerned with different degrees of responsibility (blameworthiness, etc.) and different modes of responsibility for X. For modes of responsibility, direct/original/non-derivative responsibility for X is obtained when all the conditions on responsibility are fulfilled at the time of X, whereas derivative/indirect responsibility for X is obtained when one or more conditions are not fulfilled at the time of X but are fulfilled at some suitable prior time. When responsibility is derivative, we talk of “tracing” responsibility back to that prior time. Finally, we might even be interested in more than one concept of responsibility for X (Watson 1996).

Concerning the epistemic condition itself, relevant epistemic states in the agent could include beliefs, credences, or capacities to have those beliefs or credences. With respect to X, the content of these epistemic states could include:

  • that one is doing or causing or possesses (etc.) X;
  • that X has a certain moral significance (for example, “is wrong”) or has features that make it morally significant (for example, harms others);
  • that X has an alternative Y;
  • that X could cause some consequence Y;
  • that W is how to perform X; and
  • any combinations of the above.

There is also an important distinction between occurrent and dispositional beliefs/credences. Occurrent beliefs are consciously thought, considered, or felt, whereas dispositional beliefs are not occurrent but are disposed to be occurrent under certain conditions. Finally, often the concepts of knowledge, awareness, foresight, and ignorance are used in the literature to refer to relevant epistemic states. While the traditional view is that ignorance is the lack of knowledge and that awareness is knowledge (or justified true belief), recent theorists of the epistemic condition take true belief to be necessary and sufficient for awareness, and they identify ignorance as the lack of true belief, the opposite of awareness (Peels 2010; Rosen 2008; Rudy-Hiller 2017). Partly for this reason, and for the reason that there is a plausible argument for thinking that the lack of knowledge (even justified true belief) that an act is wrong is no excuse for performing wrongdoing if one still believes that it is wrong (Rosen 2008), positions in the literature tend not to be couched in terms of knowledge. Like awareness, foresight (of consequences) tends to be analysed in terms of true belief as well (Zimmerman 1986).

It is clear, then, how wide the topic of the epistemic condition could be. But given the typical focus in responsibility studies on blame, rather than praise, and on actions/omissions and their consequences, it is unsurprising that the current debate has focused on blameworthiness for actions/omissions and their consequences. Moreover, given the conceptual links between culpable conduct (that is, conduct for which one is blameworthy) and wrongful conduct, or conduct that is bad in some other way (for example, the “suberogatory”; McKenna 2012, 182–83), the focus has largely been on whether awareness of our conduct’s wrongfulness (or badness) is required to be blameworthy for performing it (Section 2). Partly because some views in this debate invoke the notion of blameworthiness for the consequences of our conduct, there is also an interrelated literature on whether an epistemic condition must be satisfied to be culpable for the bad consequences of our conduct, and if so, what kind (Section 3).

The focus on whether awareness of wrongdoing is necessary for blameworthiness has also been spurred on by interest in the revisionary implications of a view known as “volitionism” or “strong internalism” (see Strong Internalism (aka Volitionism) below). The revisionary implications in question are that we should revise most of our ordinary judgments and practices of blame. There are also views on the epistemic condition for derivative responsibility (in particular, Foresight and Foreseeability Views) that have similar sorts of revisionary implications, which have been brought to the attention of philosophers in the debate on derivative responsibility (cf. Vargas 2005). Not surprisingly, many of the positions in these debates have been offered as attempts to avoid these revisionary implications and vindicate our ordinary judgments and practices of blame. In recent times, though, discussion of the relative merits of these non- or semi-revisionary views has come to take centre stage, and the literature will undoubtedly continue to move away from the question of how to respond to revisionism (see Section 4, Future Areas for Research).

2. The Epistemic Conditions of Culpable Misconduct

What are the epistemic conditions on blameworthiness for wrongful (or bad) conduct? A useful initial way to carve up the literature on this question is to divide views into culpability internalist and culpability externalist kinds. This is, of course, to use terminology familiar to theorists of rationality, motivation, knowledge, and epistemic justification. But internalist/externalist terminology is not without some precedent in the literature on the epistemic condition (Husak 2016; Wieland 2017; Cloos 2018), even though the distinction is not often clearly defined. Let us define culpability internalism as follows:

Culpability internalism

An agent is non-derivatively (directly, or originally) blameworthy for some conduct X only if, at the time of X, the agent possesses a belief/credence concerning X’s badness or X’s bad-making features (or a higher-order belief/credence about the need to have the capacity to form such a belief/credence).

(The qualification in parentheses becomes relevant when we discuss Capacitarian Views below.) Culpability externalism is then the denial of culpability internalism. To use George Sher’s (2009) pithy phrase, culpability externalism affirms the possibility of “responsibility without awareness.” The difference between culpability internalist and externalist views is best not defined in terms of awareness, though, since there are intuitively internalist views which regard acting contrary to one’s mistaken belief in wrongdoing as blameworthy (Haji 1997). Thus, if a position demands belief in wrongdoing for the wrongdoing to be non-derivatively culpable, then the position is a form of culpability internalism. If, by contrast, a position demands only the capacity to believe that one’s conduct is wrong for it to be non-derivatively culpable, then the position counts as externalist.

The distinction between internalist and externalist theories of the epistemic condition, while useful, is very broad-brush. Fortunately, we can group views more informatively along the lines of what they take to support an internalist or externalist condition, for there are at least four different types of views about the underlying grounds for an epistemic condition: (1) basic views, (2) control-based views, (3) capacitarian views, and (4) quality-of-will views of the epistemic condition for culpable misconduct. Basic views hold that an epistemic condition is basic—that is, not based on any other condition for blameworthiness. Control-based views hold that an epistemic condition is based (partly) on the control or freedom condition for blameworthiness. Capacitarian views hold that an epistemic condition is based (partly) on a capacity-for-awareness condition of blameworthiness. And quality-of-will views hold that an epistemic condition is based (partly) on a quality-of-will condition for blameworthiness. This more informative taxonomy will be used to structure the overview of the debate on the epistemic condition for culpable misconduct.

a. Basic & Control-Based Views

Basic and control-based views tend to be treated as one family in the literature, as distinguished from the rest, and so the two will be treated together in the following sub-section.

According to basic views, an epistemic condition is a basic condition of culpability for misconduct. That is, it is not based even partly on any other condition for blameworthiness. There may be a control or quality-of-will condition for culpable misconduct, but such a condition is entirely independent of the epistemic condition; or there may be no other condition for culpable misconduct than an epistemic condition. Michael Zimmerman (1997), for example, identifies awareness as a “root requirement” of responsibility. And according to Alexander Guerrero (2007), a meat-eater is blameworthy simply if they eat meat while knowing that they do not know whether the source of the meat has “significant moral status.” Nothing else is required. Usually, the support for basic views is a mere appeal to intuition; Guerrero (2007), however, appeals to the way his principle is supported by theories of right and wrong.

According to control-based views, an internalist/externalist epistemic condition is based (partly) upon the control condition for blameworthiness (“partly,” in order to accommodate views on which the epistemic condition is not entirely a subset of the control condition). Typically, the epistemic condition is internalist. The idea may be that a belief in the moral significance of the act is part of having the right sort of control at the time of the act—for example, “enhanced control” (Zimmerman 1986), the ability to do the right thing for the right reasons (Husak 2016; Nelkin and Rickless 2017), or the rational capacity to meet a reasonable expectation to act differently (Levy 2009; Robichaud 2014; cf. Rosen 2003).

Basic and control-based theorists are almost always internalists, and a distinction is usually drawn within basic and control-based internalism between a strong internalist view known as “volitionism” and weaker forms of basic or control-based internalism. Plausibly, though, there are basic and control-based theorists who are externalists about the epistemic condition—even though theorists of this kind tend not to be actively involved in the debate on the epistemic condition. This section will discuss, in turn, strong internalism, weak internalism, and then the possibility of basic and control-based externalism.

i. Strong Internalism (aka “Volitionism”)

Several philosophers (Levy 2009, 2011; Rosen 2003, 2004, 2007; Zimmerman 1997) defend the “strong internalist” (Cloos 2018) thesis—which also goes by the name of “volitionism” (Robichaud 2014)—that blameworthiness for misconduct is, or is traceable to, blameworthiness for an act done in the occurrent belief that the act is (all-things-considered) wrong. Whether the belief must be true, and so the act objectively wrong, is debated. Since akrasia is (often) defined as acting contrary to such an overall moral or all-things-considered judgment, strong internalism is often described as requiring “akrasia” for blameworthiness (Rosen 2004; Levy 2009). And it is often described as requiring “clear-eyed” akrasia in particular (FitzPatrick 2008), because it requires that one act contrary to this belief while it is occurrent.

Why accept strong internalism? The key reasons are that (a) someone is blameworthy for an act only if it is either an instance of clear-eyed akrasia, or done in or from culpable ignorance; and (b) ignorance is culpable only if culpability for the ignorance is itself traceable to an instance of clear-eyed akrasia. “Ignorance” here means the lack of an occurrent true belief in the wrongfulness of the act.

In support of (a), everyone in the debate agrees that clear-eyed akratic wrongdoing is blameworthy (perhaps even the paradigm case of blameworthiness). Deliberately cheating on your partner while consciously knowing that it is wrong to do so is obviously blameworthy, provided that the non-epistemic conditions on blameworthiness are met. But when the agent acts in or from ignorance of wrongdoing (when the wrongdoing is “unwitting”; Smith 1983), strong internalists appeal to the intuition that they can still be blameworthy for wrongdoing but only through blameworthiness for their ignorance. Thus, the pilot who does not know that she is wrongfully initiating take-off without disengaging the gust lock is still blameworthy if she is blameworthy for failing to know that the gust lock is still engaged. This is a case of “factual ignorance,” where the agent’s ignorance of wrongdoing is owing exclusively to ignorance of some non-normative fact. But strong internalists argue, more controversially, that the same principle applies “by parity” (Rosen 2003) to “moral ignorance,” where one’s ignorance of wrongdoing is owing to ignorance of moral truth. (Indeed, some strong internalists [Rosen 2003] argue that the principle applies even to all-things-considered normative ignorance or ignorance of the way that morality trumps self-interest under the circumstances). Thus, the Battalion 101 policemen who executed Jewish women and children in the horrific Józefów Massacre (1942) would still have been blameworthy for the massacre if they did not know that they were doing wrong but were blameworthy for being ignorant of their wrongdoing. However, strong internalists appeal to more than a mere intuition to bolster the claim that when the act is unwitting, it is culpable only if it is done in or from culpable factual or moral ignorance. They cite considerations of control in support of (a). 
When the agent is ignorant, the agent no longer has the relevant abilities (for example, Levy’s [2009] “rational capacity”) to avoid wrongdoing or to act deliberately (Zimmerman 1997, 421-22); it would no longer be reasonable to expect them to act differently (Rosen 2003; Levy 2009), and it would be inappropriate to react to the wrongdoer with the blaming emotions. But if the ignorance is culpable in the first place (as we shall see, due to the presence of these abilities at an earlier time), then lacking these abilities is no legitimate bar to blame.

Intuitions of blameworthiness and control-based considerations are also adduced in support of the claim (b), that ignorance is culpable only if culpability for the ignorance is itself traceable to an instance of clear-eyed akrasia. But a couple of further points are needed in support of (b). The first is that ignorance cannot be directly blameworthy (like akratic conduct), because the thesis of doxastic voluntarism is false: we do not have direct voluntary control over our belief-states. Thus, at best, ignorance can only be indirectly culpable through indirect control over it, which involves having direct control over prior acts that can (foreseeably) cause the formation or retention of such a state. (Strong internalists take a foresight or foreseeability view of responsibility for consequences; see 3. The Epistemic Condition for Derivative Responsibility.) Ignorance-causing/-sustaining acts are, of course, known as “benighting acts,” after Holly Smith (1983). And everyone agrees with Smith that benighting acts must be culpable for the ignorance to be culpable. So strong internalists argue that ignorance is culpable only if culpability for ignorance is traceable to culpability for a benighting act. Not just any benighting act will do, however: the distinctive claim of strong internalism (and what goes beyond Smith’s work) is that culpable benighting acts must themselves either be occurrently akratic or culpably unwitting. Why not, after all, think that the principles already on the table regarding culpability for wrongdoing apply equally well to culpability for benighting acts (Rosen 2004, 303)? Furthermore, since unwitting acts are never directly culpable, strong internalists envision the possibility of yet further tracing when the benighting acts are unwitting, through an indefinitely long “regress” or “chain of culpability” (Zimmerman 2017), whose origin must lie in a case of clear-eyed akrasia.
The result is the strong internalist’s “regress argument” (Wieland 2017).

Herein lies the strong internalist’s much-discussed revisionism about blameworthiness ascriptions. The strong internalist regress must end only with a case of clear-eyed akrasia, but how easy is that to find? Zimmerman and Levy argue that clear-eyed akratic benighting acts are extremely rare, or at least rarer than many think (Levy 2011, 110-32; Zimmerman 1997, 425-6). How often are we in a position to take a precaution against ignorance but decide, contrary to our all-things-considered moral judgment, to forgo that precaution (and thereby commit a culpable benighting act)? The answer appears to be “not often.” Levy (2011, 121-2) appeals to compelling empirical work which supports this answer. In contrast to Zimmerman and Levy, Gideon Rosen (2004) argues less that the regress makes culpability rare than that the regress recommends skepticism about moral responsibility. Any instance of akrasia, he argues, is extremely difficult to ascertain, and so blameworthiness is difficult to ascertain. Why is akrasia difficult to ascertain? Rosen cites “the opacity of the mind—of other minds and even of one’s own mind” (2004, 208). Indeed, clear-eyed akrasia may be hard to notice even when we can see into someone’s mind because:

it is not readily distinguishable from an impostor: ordinary weakness of will. The akratic agent judges that A is the thing to do, and then does something else, retaining his original judgment undiminished. The ordinary moral weakling, by contrast, may initially judge that A is the thing to do, but when the time comes to act, loses confidence in this judgment and ultimately persuades himself (or finds himself persuaded) that the preferred alternative is at least as reasonable. (2004, 309)

This problem is then compounded when we have to look into the past to identify an episode of clear-eyed akrasia; and it is probably harder still to find such an episode when it is a case of benighting akrasia. Strong internalists therefore argue that we should revise most of our ordinary practices and judgments of blame.

ii. Weak Internalism

One reaction to strong internalism and its culpability revisionism is to argue that the same—basic, and control-based—grounds to which strong internalists appeal to support their view in fact support an easier-to-satisfy form of culpability internalism. Call this form “weak internalism,” for the fact that its epistemic requirements are weaker than strong internalist requirements. A number of different views fall under weak internalism.

One is the dispositional belief-in-wrongdoing view, according to which wrongdoing done in or from a merely non-occurrent (dispositional) belief in wrongdoing can still be originally blameworthy (Haji 1997; Peels 2011; cf. Husak 2016, ch. 4). In support of this view, Haji appeals to the intuition that:

Tara may be blameworthy for quaffing her third gin-and-tonic even though, at the time, she does not have the occurrent belief that getting inebriated is wrong [but has a dispositional belief that getting inebriated is wrong]. (1997, 531)

Indeed, it is perfectly consistent for the dispositional belief theorist to assert that Tara nonetheless knows full well that she shouldn’t, even if the circumstances prevent her from having this thought explicitly. But there may be good theoretical reasons to require occurrent belief.

On the widely accepted principle that one is non-derivatively blameworthy for an action only if it would have been reasonable to expect the agent to avoid the action, Levy argues that

we can only reasonably be expected to do what we can do by an explicit reasoning procedure, a procedure we choose to engage in, and when we engage in explicit reasoning we cannot deliberately guide our behavior by reasons of which we are unaware, precisely because we are unaware of them. (2009, 736, n. 16)

If Tara does not have the occurrent thought that it is wrong to have another gin, then how can she engage in an explicit reasoning procedure with the upshot of avoiding wrongdoing? But this, Levy would argue, is required for her to be subject to a reasonable expectation to avoid having another gin and hence to be blameworthy for having it. Dispositional belief theorists might, however, try to resist Levy’s argument here on the grounds that Tara is subject to a reasonable expectation to avoid wrongdoing, despite her dispositional belief in wrongdoing. Perhaps the fact that her belief in wrongdoing would ordinarily be occurrent under the circumstances is sufficient to ground a reasonable expectation to avoid wrongdoing (but see Capacitarian Internalism? below). Or perhaps she has some other kind of occurrent awareness which grounds the reasonable expectation to act differently (cf. “the phenomenology of deliberative alertness”; Yates 2021, 189-90). In the end, though, the dispositional belief theorist could dig their heels in with the reply that accepting Levy’s argument requires far too drastic a revision to our commonsense ascriptions and practices of blame for his conclusion to be acceptable (Robichaud 2014, 149-151). (It is worth noting that Zimmerman himself seems to allow for an exception to his general requirement of occurrent belief in cases of “deliberate wrongdoing in a routine or habitual… manner” [1997, 422; cf. Zimmerman 2017].)

Another set of weak internalist responses challenges the strong internalist’s requirement of belief in wrongdoing, where it is the content of the belief that is in question. Focusing especially on direct culpability for benighting conduct (see also Nelkin and Rickless 2017), Philip Robichaud (2014) has argued that a wrongdoer can be blameworthy even though they have only “sufficient, non-decisive motivating reasons” to act differently. Robichaud defines these reasons as “strong enough” to make it (internally) rational to avoid wrongdoing, but not strong enough to decisively support the avoidance of wrongdoing (2014, 142). To take his example, although we do not believe that we have an obligation (or that we morally ought) to check the functionality of our brake lights every time we go to drive, we may believe that “it would be good” (2014, 143) to check them. “It would be good,” or alternatively “it would be safe” or “I haven’t checked them in a while” (not his examples), would then function as non-decisive motivating reasons to check them and not to ignore them, in contrast to the strong internalist decisive reasons of “it would be wrong not to,” “I overall ought to,” or “I have an obligation to” check them. Suppose, then, that your brake lights were to fail, causing a fatal accident. Robichaud argues that you could be originally blameworthy for the accident, even though you only had these non-decisive reasons. In support of his account, Robichaud appeals to the aforementioned reasonable expectations condition of blameworthiness, and argues, against Levy (2009), that it would be reasonable to expect you to check the brake lights despite having only non-decisive reasons to do so. This is because, he contends, you would still have the rational capacity to check your brake lights under these conditions.

Levy (2016) has responded that acting for non-decisive reasons is too “chancy” to count as making the act one that it would be reasonable to expect you to perform; that is, decisive reasons are required. The reason is that:

when it is genuinely the case that an agent has sufficient but not decisive reasons to choose from two or more conflicting options, chancy factors [such as ‘trivial aspects of the environment or of the agent herself’] will play a decisive role in how she chooses. (2016, 5)

But it is not clear that this should move Robichaud. On some accounts of control—for example, leeway incompatibilist accounts (see Free Will)—cases in which one is torn between conflicting motivating reasons to do different things are often regarded as paradigm cases of responsibility-relevant control. Such a conflicted state might provide room for the exercise of agent-causal power on agent-causal accounts such as Roderick Chisholm’s (1976), and so it would not follow from a conflict between non-decisive reasons that “chancy factors” cause the choice. It may be wondered, however, whether Robichaud needs to help himself to a controversial libertarian account of control to defend his appeal to non-decisive motivating reasons.

Another form of weak internalism that challenges the content of the strong internalist akrasia requirement is Alexander Guerrero’s (2007) moral risk view (cf. also Husak 2016, ch. 3). Guerrero responds to Gideon Rosen’s strong internalism by defending the principle, “Don’t Know Don’t Kill” (DKDK):

[if] someone knows that she doesn’t know whether a living organism has significant moral status or not, it is morally blameworthy for her to kill that organism or to have it killed, unless she believes that there is something of substantial moral significance compelling her to do so. (2007, 78-9)

Thus, DKDK entails that the Battalion 101 shooters would still have been blameworthy if they were merely uncertain whether Jewish women and children have “significant moral status,” and they lacked the belief that something compelled them to perform the executions. Guerrero argues, then, that a kind of moral recklessness can be grounds for original blameworthiness, alongside cases of clear-eyed akrasia. Indeed, Guerrero believes that forms of moral recklessness other than violating DKDK can be grounds for original blameworthiness too (cf. “Don’t Know Don’t Invade”; 2007, 94); however, he confines his attention to the defense of DKDK. Still, one might be tempted to generalise (and simplify) the view to the following: someone is directly blameworthy for an act only if they believe that the act is wrong or that the act risks wrongdoing (Husak 2016, ch. 3).

Guerrero has already been identified as a basic internalist, and that is because he does not appeal to considerations of control to support DKDK. Rather, he appeals directly to intuitions of culpability, especially in cases of meat-eating under moral uncertainty, but also to theories of right action which would look favourably upon DKDK. Notably, he takes DKDK to be supported by recent theories of what to do under moral uncertainty which (rationally or morally) prescribe taking the least morally risky option. Nevertheless, one could certainly cite control-based considerations to support a moral risk view—for instance, the consideration that moral uncertainty provides a non-decisive motivating reason to avoid wrongdoing.

More critically, if the moral risk view does appeal to a non-decisive motivating reason to avoid wrongdoing, its defender would of course have to deal with Levy’s (aforementioned) luck-based objection to Robichaud’s view. There may also be the problem, from Robichaud’s perspective, that the view is still too restrictive in appealing only to akrasia or moral recklessness as bases for blameworthiness: for Robichaud, believing that checking the brake lights “would be good” can be epistemically sufficient for blameworthiness. On the other side, the strong internalist could object that there are no cases of moral recklessness without akrasia.

One final version of weak internalism can be found in the work of Carolina Sartorio (2017). According to Sartorio, non-derivative blameworthiness requires awareness of the moral significance of one’s behaviour. Moreover,

being aware of the moral significance of our behavior—could be satisfied in different ways in different circumstances. In circumstances where we act wrongly, it could be satisfied by the awareness that we were acting wrongly, or by the awareness that one ought to have behaved differently. In circumstances where we don’t act wrongly, and perhaps are aware that we don’t act wrongly, it could be satisfied simply by virtue of recognizing that we are acting from morally reproachable reasons. (2017, 20)

The way that Sartorio spells out awareness of moral significance here and throughout the paper seems to indicate that she requires, for blameworthiness, awareness of moral significance conceived as such. To use language from the literature, she appears to demand “de dicto” awareness of moral significance (a term derived from “de dicto concern” about morality; Arpaly 2002). An alternative—weaker—view would have it that mere de re awareness of moral significance could be epistemically sufficient for blameworthiness, where de re awareness of moral significance would simply be awareness of features of the act that, as a matter of fact, give the act its moral significance, whether or not there is awareness of its moral significance as such.

But now internalists might wonder whether de dicto awareness of moral significance is really required for blameworthiness. Quality-of-will theorists deny this requirement (see below). But recall Robichaud’s view that non-decisive motivating reasons suffice, where such a reason could be “I haven’t checked the brake lights in a while” (not his example). This would be a mere de re moral belief. But now suppose that you had this belief while lacking the de dicto moral belief that “therefore, checking the brake lights is now morally right, obligatory, or good.” Even so, it seems that having this de re belief could be sufficient epistemic grounds for you to be blameworthy for causing an accident.

Whether or not Sartorio has a successful response to this objection, however, it is worth noting that she tries to account for an intuition of blameworthiness in a certain range of cases that have not been given enough attention in the epistemic condition literature. These are so-called “Nelkin-variants” of Frankfurt-style (1969) cases. Suppose that Jones shoots Smith even though he could not have done otherwise; a mad neuroscientist would have intervened if Jones faltered. According to Frankfurt and many of his followers (including Sartorio), Jones can still be blameworthy if he chooses to shoot Smith for reasons of his own. Now a “Nelkin-variant” of this Frankfurt-style case (named after cases raised in Dana Nelkin’s earlier work—cited in Sartorio 2017) would be one in which Jones becomes aware of the fact that a mad neuroscientist will intervene if Jones falters in his attempt to shoot Smith, and thereby comes to believe that he has no alternative to shooting Smith. Jones becomes aware of the neuroscientist’s intentions “at some point during the process” (2017, 8) resulting in the shot, but in a way that (allegedly) leaves Jones unaffected, preserving his acting on the basis of his own reasons. On Sartorio’s view, Jones may still be blameworthy for shooting Smith if he “makes the choice completely on his own, on the basis of his own reasons (morally reproachable reasons, such as a desire for revenge), in exactly the same way he would have made it if he hadn’t been aware of the neuroscientist’s presence” (2017, 19). He would only need awareness of acting on those morally reproachable reasons. The upshot, for Sartorio, is that belief in alternatives is not an epistemic requirement on culpable conduct.

Plausibly, however, most of the views that we have discussed so far (especially those due to Levy, Rosen, Robichaud, and Guerrero) assume such a requirement, and so we might wonder whether a plausible defense of it is available to them. Perhaps they could question whether it is really possible (as Sartorio contends) for Jones to become aware of the neuroscientist’s presence and not let that affect his own assessment of his reasons to shoot Smith or of his alternatives. Perhaps he still has the (micro) alternative of shooting Smith not from his own reasons but by giving in to the neuroscientist’s manipulation. Thus, maybe awareness of this alternative is needed for Jones to be blameworthy.

We have canvassed a range of different weak basic and control-based internalist responses to strong internalism, but it is of course possible to combine elements of each. Robichaud (2014), for example, couples his appeal to non-decisive motivating reasons with an appeal to mere dispositional belief. This would further enable such views to account for the commonness of culpability. More recently, Thomas Yates (2021) has provided a sustained defense of weak control-based internalism which incorporates distinctive elements of each of the above views with his requirement, on direct culpability, that the wrongdoer has outweighing motivating reasons to avoid wrongdoing that are based upon the normative reasons to avoid wrongdoing.

iii. Basic and Control-Based Externalism

It would be premature to shift away from basic and control-based views without briefly discussing a sub-variety of these views that appears in ethics and the philosophy of action but that does not feature actively in the literature on the epistemic condition. This is the sub-variety of basic and control-based views that are externalist about culpability, on which culpability internalism is false but on basic or control-based grounds. Consider, for example, a view on which freedom or control over wrongdoing is necessary and sufficient for it to be culpable, but where the relevant control does not include a belief/credence according to which one’s conduct is bad. (Such a view might still, of course, involve awareness of what one is doing, and of alternatives, but it would not count as internalist unless this awareness entailed having a belief/credence in the badness or bad-making features of one’s conduct.) Those who tend to run together “free action” and “action for which one is morally responsible” might endorse such a view. Roderick Chisholm, for instance, states that a “responsible act is an act such that, at the time at which the agent undertook to perform it, he had it within his power not to perform the act” (1976, 86). Michael Boylan (2021) also ties responsibility and freedom tightly together, contending that judgments of right or wrong “assign praise or blame” (2021, 4-5). Indeed, for Boylan, ethics concerns only those actions that originate from “the free choice to do otherwise”—the same freedom that grounds moral responsibility for one’s actions. Later in the book, Boylan argues that cases of factually ignorant wrongdoing involve breaches of a prior duty (of “authenticity”) to “engage in all reasonable steps to properly justify a belief” (2021, 33)—no doubt, to justify it with respect to the “common body of knowledge” (2021, 34).
Thus, as long as Boylan thinks that freely breaching a duty is culpable and need not involve awareness of that duty (or of the reasons for its application in the circumstances), such a view would then count as externalist. As on weak internalist tracing views such as Robichaud’s (2014), culpability for unwitting wrongdoing would not need to be traced back to culpability for clear-eyed akrasia. Nevertheless, culpability for the benighting act would be even easier to satisfy than on weak internalist views. (See Epistemic Vice Theories for a similar form of culpability externalism.)

While basic and control-based externalists may have the advantage of explaining more of our commonsense intuitions of blameworthiness than internalist views, many internalists would argue that basic and control-based externalists give us far too many false positive verdicts of blameworthiness. Consider, for example, that such views, if wedded to a simple conception of the ability to do otherwise, could easily pronounce the young, the elderly, the mentally impaired, the morally incompetent, and the morally ignorant (for example, cult members) blameworthy for their conduct, even though we might find it natural to excuse these wrongdoers. Proponents of such views must also find a way to successfully rebut internalist arguments to the effect that control-based considerations justify internalist requirements on culpable misconduct (see the debate between Levy and Robichaud above). Indeed, most control-based theorists of the epistemic condition think that there is more to culpability than wrongdoing, or wrongdoing plus the ability to do otherwise.

b. Capacitarian Views

Another broad family of views on the epistemic condition for culpable misconduct goes by the name of “capacitarian” views (Clarke 2014, 2017; Murray 2017; Rudy-Hiller 2017 [who coined the term]; and Sher 2009). Their basic idea is that having the unexercised capacity for awareness of the act’s bad-making features, without actual awareness of them, can be grounds for direct blameworthiness. Thus, if a pilot initiates take-off despite failing to notice the engaged gust lock, the idea is that the pilot could still be directly blameworthy for doing so (and for thereby risking the lives of all the passengers on board) if the pilot could have been aware—that is, had the unexercised capacity to be aware—of the engaged gust lock. More conditions are added, but that is the core idea.

Some capacitarians are interested in giving a capacitarian account of control (Clarke 2017; Murray 2017; Rudy-Hiller 2017), and so it could be argued that they advocate a type of control-based account. However, some capacitarians (for example, Sher 2009, 94) deny that they are giving an account in terms of control. Moreover, the control-based views above tend to have the distinctive features that (i) culpable conduct is due to the volitional exercise of one’s capacities, in contrast to the capacitarian’s unanimous appeal to unexercised capacities (but see Nelkin & Rickless 2017); and (ii) the capacities that are emphasised as needed are capacities to act or omit rather than capacities for awareness.

Capacitarian views are externalist—or at least capacitarianism “proper” is externalist. But there seems to be the possibility of “a capacitarian” (Rudy-Hiller 2019, 726) view which nevertheless requires a certain kind of awareness of moral significance, albeit not a first-order awareness of the bad-making features of one’s conduct. Capacitarianism proper will first be discussed before the possibility of “capacitarian internalism.”

i. Capacitarian Externalism

Capacitarianism proper is externalist: it holds that original blameworthiness for misconduct requires either awareness or the capacity for awareness of that conduct’s bad-making features. (The capacity for awareness of these features also does not depend on possessing actual beliefs or credences in one’s conduct’s bad-making features.) The view is disjunctive, because capacitarians allow blameworthiness in cases of acting in awareness of the bad-making features as well. Capacitarians demand the satisfaction of other conditions related to the exercise of the capacity, too. Fernando Rudy-Hiller (2017, 405-6), whose view is representative, holds that when the agent is ignorant of some (non-moral) fact, they are blameworthy for their unwitting conduct (and their ignorance) only if they should and could have been aware of that fact, where being able to be aware of this fact involves not only capacities to be aware of it but also the (fair) opportunity to exercise those capacities. To illustrate, the three essential elements of a capacitarian view are (a) that the pilot must have the unexercised capacity to notice the engaged gust lock, (b) that the pilot must have the (fair) opportunity to (exercise the capacity to) notice the engaged gust lock, and (c) that the pilot should notice the engaged gust lock.

One significant advantage of capacitarianism is that it can accommodate folk intuitions of blameworthiness for so-called “unwitting omissions” (Clarke 2014)—cases of failing to do something you ought to do while lacking awareness of that failure. The case of the pilot failing to disengage the gust lock before taking off is one such example. (Indeed, the unwitting omissions that capacitarians typically have in mind are factually unwitting, although there may be reason for capacitarians to extend their accounts to cover cases of morally unwitting omissions too.) But another intuition that capacitarians account for is the intuition that culpability for unwitting omissions (or a subset of them) does not trace back to culpability for a benighting act. Now a tracing strategy could probably be employed to explain the pilot’s culpability in the airplane crash case (grounding culpability in the earlier failure to run through the pre-flight checklist); and indeed, tracing critics of capacitarianism have argued that many of the proposed “non-tracing” cases can be given a plausible tracing analysis (see Nelkin & Rickless’ [2017] discussion of cases given by Sher and Clarke). But let us try to consider an uncontroversial non-tracing case. Suppose that “a house burns down because someone forgot to turn off a stove” (Clarke 2017, 63), but where the culprit—call him Frank—has never before forgotten to turn it off, and where it never occurred to him, this time or ever, to be more vigilant about turning it off after using it. Even so, many of us report intuitions of blameworthiness. It might, after all, seem fair for the landlord or a family member to blame Frank (morally) for the house fire, especially after learning that he forgot to turn off the stove. And yet Frank was not aware of leaving the stove on at all, let alone aware of its being wrong to do so. Thus, it looks like internalist views are in trouble.
But capacitarians can account for the intuition of culpability by appealing to Frank’s capacity to notice the stove, opportunity to exercise this capacity, and obligation to notice the stove.

While all capacitarians endorse this thesis about direct blameworthiness, some—for example, Rudy-Hiller (2017, 417)—also require that the ignorance is culpable for the unwitting conduct to be culpable, but others deny this. Clarke (2014, 173-4) argues that the ignorance need only be faulty for the unwitting conduct to be directly culpable, while tracing would be required to explain culpability for the ignorance. But Rudy-Hiller does not think that a culpable ignorance requirement entails that culpability for unwitting conduct is derivative of culpability for the ignorance. Rather, he thinks that both the ignorance and the unwitting conduct are under “direct” capacitarian control (apparently accepting a kind of doxastic voluntarism).

Capacitarians generally agree on which kinds of cognitive processes or faculties constitute cognitive capacities; however, they disagree on how exactly to characterise them. Some also try to unify them under one “mother” capacity—for instance, vigilance (Murray 2017). On which kinds of faculties constitute cognitive capacities, Clarke has a useful passage cataloguing the relevant capacities:

Some are capacities to do things that are in a plain sense active: to turn one’s attention to, or maintain attention on, some matter; to raise a question in one’s mind or pursue such a question; to make a decision about whether to do this or that. These are, in fact, abilities to act. Others, though capacities to do things, aren’t capacities whose exercise consists in intentional action. These include capacities to remember, to think of relevant considerations, to notice features of one’s situation and appreciate their normative significance, to think at appropriate times to do things that need doing. (2017, 68)

Most capacitarians allow both kinds of capacities; however, some do not allow the first class of capacities, those that consist in abilities to act. For example, Sher argues that “if we did construe the cognitive capacities as ones that their possessors can choose to exercise, then we would have ushered [an internalist control-based view] out the front door only to see it reenter through the back” (2009, 114). It is not clear, however, that allowing such abilities to act would involve smuggling such a view back in, for capacitarians need not hold that as soon as we enter any domain of agency or choice, let alone the domain of exercising cognitive capacities, internalist conditions need to be met.

Capacitarians face the challenge of answering what it takes to have a relevant capacity for awareness. Clarke and Rudy-Hiller take a view on which the agent has the relevant capacity if, on similar occasions in the past, they have become aware of the relevant bad-making features. By contrast, Sher adopts a counterfactual analysis of capacities, according to which someone has the relevant capacity if she would have been aware of the relevant facts in a range of other similar circumstances (2009, 114). Whichever way we might spell out the relevant capacity, there are some unique challenges that need to be met. For both the past-occurrences and counterfactual views, we might ask what (past or counterfactual) circumstances count as “sufficiently similar.” And concerning the past-occurrences view, we might be concerned with cases in which the agent has lost their capacity for awareness since they were last relevantly aware (Sher 2009, 109).

For capacitarians, having the capacity for awareness means nothing without a (fair) opportunity for it to manifest. Rudy-Hiller, for instance, requires that there are no “situational factors that decisively interfere with the deployment of the relevant abilities” (2017, 408). Frank would be excused for failing to turn off the stove if he collapsed with a heart attack while cooking (although it is dubious that failing to turn off the stove would still count as wrong in this case). Clarke says something similar, although he allows that it is enough that situational factors “sometimes mask… the manifestation of psychological capacities without diminishing or eliminating them” (Clarke 2017, 68). Imagine, instead, that Frank merely fell ill for the next couple of hours and had to lie down. In such cases, Clarke argues, it would “not be reasonable to expect [him] to remember or think to do certain things that [he] has a capacity to remember or think to do” (2017, 68).

The last key requirement, according to the capacitarian, is that the agent should have been aware of the relevant considerations at the time of their action or omission. Why is such a condition indispensable? Well, just as internalist tracers argue that blameworthiness for an unwitting act requires the performance of a benighting act that falls below a standard that would have been reasonably expected of the agent, so capacitarians contend that blameworthiness requires that the agent fell short of a certain “cognitive standard” (Clarke 2014) of awareness that would have been reasonably expected of them. If, for example, Frank fell ill while cooking, it seems false that Frank ought to have remembered that the stove was on, for such a standard seems too harsh or demanding. Capacitarians disagree, however, on whether this standard is set by an obligation (Rudy-Hiller 2017, 415; Murray 2017, 513) or merely a norm (Clarke 2014, 167) of awareness.

A number of objections to capacitarianism, in addition to the problems for giving an adequate account of the capacity for awareness, have been raised in the literature. One objection is that the appeal to capacities fails to capture anything that is morally relevant for attributions of moral responsibility. Sher (2009), for instance, argues that the fact that wrongdoing originated from the wrongdoer is sufficient for the wrongdoer’s culpability, never mind whether they had control, freedom, or ill will (see Quality-of-Will Views below). Sher’s story is complicated: it appeals to the way that we react, as blamers, to the whole person when we blame them, to all their psychological capacities and not only to their vices. But A. Smith (2010) has argued that attributability via origination threatens to collapse attributions of moral responsibility into attributions of causal responsibility. Indeed, the problem seems particularly pressing for accounts such as Sher’s, which deny the control condition of blameworthiness, since those who appeal to control at least try to appeal to a widely accepted basis for responsibility attributions. Thus, a good deal seems to ride on a successful defense of the notion of capacitarian control.

A second objection is the reasonable expectations objection raised by Levy (2017) (cf. also Rudy-Hiller 2019). As we have seen, capacitarians appeal to the way that their conditions ground a reasonable expectation to avoid unwitting misconduct. Levy, however, argues that capacitarian conditions fail to ground such a reasonable expectation, because expecting someone to avoid wrongdoing through the exercise or activation of a capacity for awareness is expecting someone to avoid wrongdoing “by chance or by some kind of glitch in their agency” (2017, 255). The problem is especially pressing when one considers those capacities that are not, as Clarke describes them, “capacities to act,” and so it might be in the capacitarian’s interests to restrict the relevant capacities to those that require “effort to appropriately exercise” (Murray 2017, 516). Past-occurrent capacitarians could also reply (as they have done) that:

if an agent has demonstrated in the past that she has a certain capacity and there is no obvious impediment to its manifestation in the present circumstances, then it is reasonable to expect her to exercise it here and now. (Rudy-Hiller 2019, 734)

Even so, Levy’s point is that agents would need awareness of the fact that, for example, their mind is wandering in order to have the right sort of control over their capacities, but (1) this is not required by capacitarians (at least of the externalist variety; see below) and (2) this awareness itself is not under the agent’s voluntary control (2017, 255). Rudy-Hiller (2019; see Capacitarian Internalism?) has also argued that there are cases in which the present circumstances are sufficiently different from previous circumstances (in which the agent demonstrated the relevant capacity for awareness), such that the agent in the present circumstances lacks awareness of the risk of not being aware of the relevant facts, and therefore lacks awareness of the need to “exert more vigilance in the particular circumstances she [is] in” (2019, 735). In these cases, he argues, it is not reasonable to expect the agent to avoid wrongdoing.

Of course, the capacitarian could deny the widely accepted reasonable expectations conditions of blameworthiness. But this would seem to come at the high price of exacerbating the first problem (above) of how to avoid collapsing moral responsibility into causal responsibility. William FitzPatrick (2017, 33) also argues that rejecting this condition fails to account for the way that reasonable expectations are grounded in moral desert, an indispensable aspect of blameworthiness on his view.

ii. Capacitarian Internalism?

Another response to the reasonable expectations objection to capacitarianism proper is to amend the view so as to include an awareness condition after all. This is Rudy-Hiller’s revised (2019) view. According to this view, the core elements of capacitarianism are left intact and constitute part of what he calls “cognitive control,” but the other part involves an awareness-of-risk condition, that is, awareness of the risk of “cognitive failure” (for example, awareness of the risk of not noticing that the stove is still on), and a know-how condition, involving awareness of how to avoid that cognitive failure in the circumstances. Rudy-Hiller argues that these conditions need to be added because, without having been in similar circumstances in the past, agents are “in the dark regarding the risks associated with allocating cognitive resources in certain ways and therefore… in the dark regarding the need to exercise that capacity” (2019, 731). Indeed, Rudy-Hiller would argue that these agents are “entitled to rely on the good functioning of [their] cognitive capacities without having to put in special effort to shore them up” (emphasis added; 2019, 732). Thus, it turns out that many unwitting wrongdoers are blameless in the end, because they fail to satisfy the awareness-of-risk and know-how conditions. Imagine that Frank’s partner announces halfway through his meal preparation that her friends are coming over, and that they are gluten-free, and so he must now change his cooking plans to accommodate them. He has never had to do this. Suppose then that they arrive and he keeps himself occupied by being a good host. Unfortunately, this means that he is no longer mentally present enough to remember to turn the stove off, and a kitchen fire breaks out partway through the evening. In this case, Rudy-Hiller would say that Frank is blameless, because he is not aware of the risk of failing to notice that the stove is still on.

Such a view seems to count as an internalist view, not only in the spirit of its appeal to awareness, but in the contents of the awareness itself. While it does not involve awareness of the badness or bad-making features of the wrongful omission, it does involve a kind of higher-order awareness of the need to have the capacity for awareness of those features (whatever they may be). (This then explains the parenthetical disjunct in the definition of culpability internalism above.) That being said, one could argue that failing to exercise enough vigilance is itself a wrongful mental omission which explains the subsequent omission to turn the stove off. If so, then awareness of the risk of failing to exercise enough vigilance in the circumstances satisfies the ordinary internalist requirement of possessing a “belief/credence in the bad-making features of one’s conduct.”

Rudy-Hiller’s capacitarian internalist view certainly has much to be said for it, and it has yet to receive significant criticism. However, it is unlikely to move those who wish to accommodate a strong intuition of culpability even in these special cases of “slips.” Rudy-Hiller sacrifices this advantage for the benefit of preserving the reasonable expectations and control conditions on responsibility. We might also wonder to what extent Rudy-Hiller’s capacitarian internalism is not a closet tracing view (a variation on the control-based internalist views from the last section), if it can intelligibly be argued that the omission to exert enough vigilance in the circumstances is a separate “benighting” mental omission that gives rise to the subsequent “unwitting” omission. These would, after all, be cases in which “the temporal gap between it and the unwitting [omission] is infinitesimal” (H. Smith 1983, 547).

c. Quality-of-Will Views

Another set of views on the epistemic condition for culpable misconduct approaches the topic from an entirely different perspective. According to these so-called “quality-of-will” views (which are also known as “attributionist” views, even though this term has been used for some capacitarians), blameworthiness for misconduct requires that a bad quality of will was on display in that misconduct, or in prior (benighting) misconduct. Moreover, the question of the epistemic condition for blameworthiness is to be answered by inquiring into the epistemic condition for the display of ill will. Thus, what licenses culpability ascriptions is not primarily control, as on control-based views, nor capacities, as on capacitarianism, but a bad will.

The basic idea of quality-of-will theories is simple and intuitive: the Battalion 101 shooters are blameworthy for their participation in the Józefów Massacre because they displayed an egregious disregard for the lives of Jewish women and children. The pilot who takes off without disengaging the gust lock acts carelessly and recklessly.

The main varieties of quality-of-will views are moral quality-of-will views and epistemic vice theories. Moral quality-of-will views appeal to morally reproachable qualities of the will (such as disregard for what’s morally significant). Epistemic vice theories are regarded in this article as quality-of-will views because they ground culpability for unwitting wrongdoing ultimately in the expression of a bad epistemic quality of will—for example, the epistemically vicious traits or attitudes of carelessness, inattentiveness, or arrogance. As we will see, moral quality-of-will views fall on either side of the culpability internalism/externalism debate, but epistemic vice theories are externalist.

i. Moral Quality-of-Will Theories

Moral quality-of-will theories appeal to morally reproachable qualities of the will. Accordingly, the “display of ill will” has been analysed in terms of the act’s expressing or being caused by an inadequate care for what’s morally significant (Arpaly 2002; Harman 2011), indifference towards others’ needs or interests (Talbert 2013, 2017; McKenna 2012), objectionable evaluative attitudes (A. Smith 2005), and reprehensible desires (H. Smith 1983, 2011).

These theorists are united in their view that one can be directly blameworthy for wrongdoing, even if it is done in the absence of a belief in wrongdoing or a de dicto belief in the moral significance of the act (against, for example, Sartorio). Even if the Battalion 101 shooters did not know that it was wrong to murder Jewish women and children, they are directly blameworthy for doing so, because they displayed an objectionable disregard for the moral status (humanity, etc.) of their victims. For some quality-of-will theorists (Talbert 2013, 234), this holds even if the shooters’ moral ignorance was blameless (or epistemically justified), given widespread cultural acceptance of the inferior status of Jews in Nazi Germany. However, others (Harman 2011, 461-2) would still require that their moral ignorance was blameworthy, even if culpability for their ignorance did not explain culpability for their unwitting wrongdoing. Nevertheless, quality-of-will theorists tend to make it easier than control-based theories do for attitudes or states such as ignorance to be culpable, for these states tend to be regarded as directly, rather than indirectly, culpable, and under the same conditions as actions are culpable—namely, when they display ill will (consider, for example, prejudiced or misogynistic beliefs about women; Arpaly 2002, 104). Indeed, these theorists typically do not promote tracing explanations, because, like their “real self” forebears (Watson 1996), they hold that the relevant responsibility relation between agent and object (act, belief, etc.) is an atemporal or structural relation between the agent’s quality of will and the object of responsibility assessment. Not surprisingly, then, moral quality-of-will theorists tend not to focus on benighting conduct. But they could easily extend their views to cover benighting conduct in the way that epistemic vice theorists do below, or by appealing to the notion of motivated ignorance.

Moral quality-of-will theorists are divided on the culpability internalism/externalism debate. Matthew Talbert (2013) and Elizabeth Harman (2011, 460) are internalists, because they argue that caring inadequately for what is morally significant requires awareness of what is morally significant. Hence, they require only de re moral awareness, or awareness of the bad-making features of one’s conduct. Talbert has probably produced the most sustained defense of this idea. Suppose that walking on plants turns out to be wrong because it causes them to suffer, and you are ignorant of plant suffering (Levy’s [2005] example). Talbert argues that ignorance of plant suffering would excuse you from blame because walking on plants would not express “a judgment with which we disagree about the significance of the needs and interests of those [plants] affected by the action” (2013, 244). However, if you were to become aware that plants suffer, then you would no longer be excused for walking on plants, even if you believed that it was permissible to continue walking on them. This is because you would now express a judgment concerning plant suffering that we disagree with: the judgment that plant suffering does not matter, or should not be regarded like the suffering of other living things.

Some moral quality-of-will theorists, by contrast, do not require awareness of misconduct’s bad-making features for it to be culpable. Most prominently, Angela Smith (2005) has argued that, among other things, unwitting omissions—such as her case of omitting to send your friend a card on her birthday because you have forgotten it is her birthday—are directly culpable, because these omissions and their accompanying ignorance express objectionable evaluative attitudes (for example, the judgment that a friend’s birthday is unimportant). Critical to her argument for the culpability of unwitting omissions is her appeal to the concept of responsibility as answerability—as being open to “demands for reasons or justifications” (2005, 259)—a property which seems applicable to you in the case of forgetting your friend’s birthday. Since these kinds of cases involve the lack of any belief or credence in the bad-making features of one’s omissions (for example, the facts that today is your friend’s birthday and that it would be inconsiderate not to send her a card), the view counts as externalist.

Quality-of-will externalists like Smith therefore resemble capacitarians in that both are concerned with unwitting omissions, and both argue against tracing strategies for explaining culpability for unwitting misconduct. Nevertheless, an important difference between these views is that quality-of-will externalists require displays of ill will for blameworthiness. To the extent that, in the above house fire case, Frank has never in his cooking shown an objectionable orientation towards his home, his family, or the house’s owner, we might think that on this occasion, when he forgets to turn the stove off, Frank does not display any ill will. If so, then even quality-of-will externalists would excuse him for not turning off the stove. We have seen, though, that Frank could easily fulfil the capacitarian’s conditions, and so this is a type of case in which the verdicts of quality-of-will theorists and capacitarians could easily diverge. Admittedly, Smith seems to take it that normal cases of unwitting omissions count as cases involving objectionable attitudes, and so there may not be much of a difference in practice between the verdicts of Smith and capacitarians. But certainly, the contrast between capacitarian views and quality-of-will internalist views is significant. While Talbert (2017) appears to concede to Smith that some cases of factually unwitting omissions are culpable, he argues that “garden-variety” cases of unwitting omissions—including the one about forgetting your friend’s birthday—are not obvious cases of culpability, because “quite often, we probably shouldn’t have much confidence that another person’s forgetfulness or his failure to notice something conveys much morally relevant information about what he values” (2017, 30). Capacitarians and quality-of-will externalists have intuitions of culpability in such cases, Talbert thinks (2017, 31ff.), because humans have a bad tendency (according to studies in psychology) to attribute ill will to other humans (even to non-humans) when ill will is absent, especially when we see the harmful results of their behaviour.

Three important objections have been raised against moral quality-of-will theories in the literature. The first is one that we have already seen raised against capacitarians: quality-of-will theorists cannot account for the reasonable expectations conditions of blameworthiness (FitzPatrick 2017, 33-4). Consider, for example, that it might not have been reasonable to expect the Battalion 101 shooters to avoid participating in the massacre of Jewish women and children if they were entirely oblivious to the fact that it was wrong, yet the quality-of-will theorist delivers the verdict that they are blameworthy. But this kind of case might instead reveal that there is a problem with the reasonable expectations conditions of blameworthiness, and this is how Talbert (2013) defends his quality-of-will theory.

A second objection to quality-of-will theories is that they collapse the “bad” and the “blameworthy” (Levy 2005)—once again, an objection similar to one raised against capacitarianism. Smith, after all, identifies the “precondition for legitimate moral assessment” (Smith 2005, 240) with the precondition for legitimate responsibility assessment—that is, she identifies “moral criticism” with moral blame. But mere negative moral assessments of a person given their behaviour—that is, judgments of their being vicious, having an objectionable attitude, or lacking sufficient care for others—seem to be crucially different from, and need not imply, judgments of moral responsibility or blameworthiness for the behaviour in question. Perhaps we think that people need the right kind of control over whether they display their ill will in order to be morally responsible for their behaviour (Levy 2005). Not according to A. Smith (2005): she is happy to accept the consequence that she collapses the bad and the blameworthy. But another quality-of-will response is to accept that this is a problem and try to explain the difference.

According to Holly Smith, we can “appropriately think worse of a person” who expresses a single or “isolated” quality of will that is objectionable, but we cannot blame her unless she reveals “enough of her moral personality” (2011, 144). Consider her key example (2011, 133-4). Clara strongly dislikes Bonnie but has always managed to rein in “nasty” comments about her hair in order to keep a good reputation (among other reasons). One day, however, “Clara’s psychology teacher hypnotises Clara,” the outcome of which is that Clara no longer cares about her reputation (etc.). In consequence, Clara launches a “cutting attack on Bonnie’s appearance.” Now, what is important is that the attack manifests ill will (her desire to “wound” Bonnie). But H. Smith’s intuition is that Clara is not blameworthy. After all, the desires for maintaining her good reputation (etc.) that would normally inhibit her are not operative. Thus (apart from akrasia [H. Smith 2011, 145]), blameworthiness requires the display of a sufficient portion of the agent’s will, not just one part of it (for example, a single bad desire). Whether this sufficiently distinguishes eligibility for moral criticism from eligibility for moral blame is not clear, however. There are also concerns in the literature about the ability of quality-of-will theorists to account for intuitions of blamelessness arising from other “manipulation” cases.

A third objection to moral quality-of-will theories is simply that ill will is not necessary for blameworthiness, and the aforementioned capacitarian non-tracing cases are usually trotted out in this context. So a great deal hinges on what we are to make of that debate.

ii. Epistemic Vice Theories

Another subvariety of quality-of-will theories consists of James Montmarquet’s (1999) and William FitzPatrick’s (2008, 2017) epistemic vice theories. Interestingly, both theorists agree with those control-based internalists who argue that moral and factual ignorance excuses wrongdoing, but they contend that culpability for that wrongdoing traces back to culpability for the ignorance, which, they argue, is grounded in exercises of epistemic vice. On FitzPatrick’s (2008) view, the epistemic vices are possessed as character traits, whereas Montmarquet (1999) seems only to envision a momentary vicious attitude or motive (viz., insufficient “care” in belief-formation).

Consider Zimmerman’s case of Perry who, upon arriving at the scene of a car crash involving a trapped individual, Doris, and a burning car, “rushes in and quickly drags Doris free from the wreck, thinking that at any moment both he and she might get caught in the explosion” (1997, 410). Alas, Perry paralyses Doris in the act of dragging her free. In defense of the appeal to epistemic vices, Montmarquet (1999, 842) attaches significance to Zimmerman’s admission that the natural thing to say about this case is that Perry is culpable for unwittingly paralysing Doris and that this is due to Perry’s “carelessness,” “inconsiderateness,” or “inattentiveness” in failing to “entertain the possibility of doing more harm than good by means of a precipitate rescue” (Zimmerman 1997, 416). For Montmarquet, this is indeed what we should say. In fact, Montmarquet would argue that in this moment, Perry has “direct (albeit incomplete)” control (1999, 844) over his beliefs, and that the way he exercises that control is epistemically vicious, for it fails to exhibit enough “care” in belief-formation. (It is not, however, essential for epistemic vice theories to appeal to direct control over one’s beliefs. FitzPatrick (2008) denies doxastic voluntarism.) At any rate, grounding Perry’s culpability in his lack of care in belief-formation is externalist, because contrary to Zimmerman and other control-based internalists, Montmarquet and FitzPatrick would not require for Perry’s culpable ignorance that Perry was aware of his failure to be open-minded to “the possibility of doing more harm than good.”

The root idea… is that a certain quality of openness to truth- and value-related considerations is expected of persons and that this expectation is fundamental, at least in the following regard. The expectation is not derivative of or dependent upon one’s (at the moment in question) judging such openness as appropriate (good, required, etc.)—just the opposite: it would include a requirement that one be open to the need to be open, and if one is not open to this, one may be blameworthy precisely for that failure. (Montmarquet 1999, 845)

It is clear in this passage that Montmarquet employs the reasonable expectations conditions of blameworthiness (well before they became a key focus of the debate in the late 2000s), and he evidently tries to account for how they are met by his epistemic vice theory. FitzPatrick (2008, 2017) also takes up this project, but he argues in response to Levy’s (2009) strong internalist requirement for reasonable expectations that, if it is not reasonable to expect someone to avoid acting from their epistemic vices, then culpability traces even further back: to culpability for those vices and for those vicious character-forming acts that it would have been reasonable to expect the agent to avoid in the first place (FitzPatrick 2017). It is not clear that this solves the issue from the strong internalist’s perspective, however, for the internalist would still require that the character-forming choices were themselves seen as wrong. It seems, then, that it is in the best interests of the epistemic vice theorist to resort to Montmarquet’s appeal to the fundamentality of exercises of epistemic vice with or without awareness of doing so (and with or without Montmarquet’s appeal to direct doxastic control).

The debate between epistemic vice theorists and other defenders of the reasonable expectations condition then becomes whether the epistemic vice theorist can ground a reasonable expectation without an internalist requirement. But clearly, it is open to these theorists to dispense with this requirement—as their cousins in the moral quality-of-will camp have done (see above).

But epistemic vice theorists have their own challenges, too. Why, for example, should benighting conduct be treated any differently from ordinary (non-benighting) conduct, as far as culpability ascriptions are concerned? It is difficult to see what it is about being the kind of act or omission that causes ignorance that makes it eligible for a different culpability assessment than any other kind of act or omission. Perhaps an epistemic vice theory is best employed in conjunction with a moral quality-of-will theory of culpability for non-benighting conduct, which does away with tracing.

d. Hybrid and Pluralist Views

We have nearly canvassed the full range of positions that are currently defended on the epistemic condition for culpable misconduct. What we have left are those positions that mix some of the above views in different ways. There are two ways that this can be done: (1) defend a hybrid theory, which combines one or more of the above views in a single theory of blameworthiness; or (2) defend pluralism, which divides blameworthiness into different kinds, and then assigns different epistemic conditions to each.

As an example of a hybrid theory, FitzPatrick (2008) combines his epistemic vice theory with a kind of capacitarian requirement: the agent must have the capacity and the social opportunity to become aware of and avoid acting from epistemic vice. More recently, Christopher Cloos (2018, 211-2) argues that culpability for wrongdoing is secured either directly, under quality-of-will internalist conditions, or indirectly (when there is culpable factual ignorance) under weak internalist or epistemic vice theoretic conditions. Taking an all-inclusive approach like Cloos’s clearly has the advantage of accounting for as many of our ordinary intuitions of blameworthiness as possible, but it also inherits some of the distinctive problems of the views it combines. It must also face the charge of ad hocness: is there some motivation for a hybrid theory other than its ability to account for intuitions about individual cases relevant to the epistemic condition? Is there, for instance, a plausible background theory about responsibility or blame that gives rise to such a hybrid?

By contrast, Elinor Mason (2019) and Michael Zimmerman (2017) offer pluralist accounts of the epistemic condition. Mason holds that there are three “ways to be blameworthy.” One form requires the satisfaction of strong internalist conditions; another demands only the satisfaction of quality-of-will conditions; and the third is generated voluntarily by taking responsibility for one’s conduct (bringing along epistemic conditions of a different kind). Zimmerman (2017) defends a similar sort of pluralism, submitting that in his earlier (1997) work, he intended only to give a strong internalist account of one form of blameworthiness, the one that is supposedly the basis for punishment. Like hybrid views, pluralist views inherit some of the problems of the monist views discussed above, but they also face the challenge of explaining why different forms of blameworthiness are needed to account for the relevant considerations. Given that simplicity should be preferred over complexity, it seems that the debate would need to be intractable enough to warrant splitting blameworthiness into multiple forms, but it is not clear that this is so. How, for instance, should Mason and Zimmerman reply to the control-based criticism of quality-of-will views that they do not specify sufficient conditions for blameworthiness, but only for some form of closely related negative attributability that is often confused with blameworthiness (Levy 2005)? Another challenge for pluralist views is justifying the exclusion of those monist analyses above (that is, capacitarianism, for Mason and Zimmerman) that do not constitute an analysis of one of the ways to be blameworthy.

3. The Epistemic Conditions of Derivative Responsibility

Alongside the debate on the epistemic condition for culpable misconduct, an interrelated debate has taken place on the epistemic condition for derivative responsibility—that is, responsibility (especially blameworthiness) for the consequences of our conduct. Why the two debates are interrelated should now be clear: in the debate over culpable misconduct, culpability for unwitting omissions is often traced back to culpability for prior conduct, and these tracing strategies nearly always make essential reference to culpability for ignorance as itself a consequence of prior (benighting) conduct. But we have also seen how derivative responsibility for character (epistemic vices) might be part of the story. Thus, many of the philosophers whose views have already been discussed address the question of the epistemic condition for derivative responsibility in the context of the above debate (see below). But as we shall see, a number of philosophers treat this question as one worth thinking about in its own right, or else address it in the context of another debate in responsibility studies (for example, on doxastic responsibility: Nottelmann 2007; Peels 2017). There are also many views which affirm the idea of derivative responsibility but which leave out a discussion of its epistemic condition, and so it is not clear what they would say about it.

a. Foresight and Foreseeability Views

Views on the epistemic condition for derivative responsibility divide into those we might call foresight views, foreseeability views, and no-foreseeability views. Foresight views have the strongest epistemic condition in their claim that foreseen consequences are the only consequences of our conduct for which we are responsible (see, for example, Boylan 2021, 5; H. Smith 1983; Nelkin and Rickless 2017; Zimmerman 1986, 1997). By contrast, foreseeability views claim that unforeseen but (reasonably) foreseeable consequences can also be consequences for which we are responsible (Fischer and Tognazzini 2009; Murray 2017; Rosen 2004, 2008; Rudy-Hiller 2017; Sartorio 2017; Vargas 2005). Before we discuss the debate between these views, it would be worth introducing various disagreements about the nature and content of the foresight that one must have or be able to have.

On both foresight and foreseeability views, the foresight is nearly always analysed in terms of belief concerning the relevant consequence of one’s conduct (see especially Zimmerman 1986, 206; cf. Nottelmann’s [2007, 190-3] criticism). Sometimes there is also an appeal to reasonable foresight (see, for example, Nelkin and Rickless 2017; cf. “reasonable foreseeability,” Vargas 2005). Moreover, some theorists analyse foresight in terms of occurrent belief (Zimmerman 1986), while others argue that dispositional belief suffices (for example, Fischer and Tognazzini 2009). Intuitively, if the pilot decided to skip running through every item on the pre-flight checklist but did not consciously foresee that doing so could lead to a catastrophic airplane crash, she could still be blameworthy for these consequences even if she merely dispositionally believed that these were the risks of rushing the pre-flight check (that is, if she would have cited these as reasons not to rush the check if asked). But plausibly this debate hangs on whether a successful defense of the requirement of occurrent belief can be found for directly culpable misconduct (see above).

There are also a number of disagreements surrounding the content of the relevant foresight. One disagreement concerns whether an increased likelihood of the consequence of one’s conduct must be foreseen/foreseeable. Zimmerman (1986, 206) includes no such condition, requiring merely the belief that there is at least “some probability” that the consequence will occur. But it is much more common to require foresight/foreseeability of an increased risk or likelihood of the consequence (Nottelmann 2007, 191ff.; Nelkin and Rickless 2017, 120; Peels 2017, 177). Intuitively, foreseeing some probability but no increase in the risk of a bad consequence would not give one a reason to take a precaution against it.

Another issue is subject to greater debate: must the specific consequence be foreseen/foreseeable, or does it suffice that the general type of consequence (“consequence type”) is foreseen/foreseeable? Some (Zimmerman 1986; Vargas 2005) think that there must be foreseeability of the specific/token consequence. In contrast, others (Fischer and Tognazzini 2009; King 2017; Nelkin and Rickless 2017; Nottelmann 2007) think that foreseeability of the consequence type can suffice. The latter view is perhaps more intuitive. Suppose that a teacher comes up with the wrong answer to a highly important question raised by a student after failing to prepare for class despite recognizing the need to be well-prepared. To be responsible for giving the wrong answer, it seems that the teacher need not have foreseen the specific question to which she gave the wrong answer, nor even have foreseen responding wrongly to a student’s question. She need only have foreseen the risk of misguiding the students or asserting falsehoods in class as a consequence of not preparing. A consequence-type view would also more easily accommodate intuitions of derivative culpability for morally unwitting wrongdoing: if the Battalion 101 shooters had the opportunity to question Nazi ideology at some point in their lives prior to the massacre while believing that failing to question this ideology could lead to harming the Jews, then they could well have been indirectly blameworthy for their participation in the massacre. How, then, can defenders of the requirement of foreseen/foreseeable token consequences respond to the intuitive sufficiency of consequence-type foresight/foreseeability? Perhaps there are problems with specifying how broad a “type” the token consequence can fall under. Would foresight of a consequence as general as “causing something bad” suffice?

The final disagreement concerning the content of the required foresight/foreseeability is disagreement about how the foresight/foreseeability of the consequence’s moral significance or morally significant features is to be spelled out. After all, foresight of the consequence’s morally significant status or features is surely required (cf. Vargas 2005; Fischer and Tognazzini 2009; even though it is sometimes left out of analyses—see, for example, Nelkin and Rickless 2017). Suppose, for example, that the pilot foresaw the risk of an airplane crash from failing to run through the pre-flight checklist but did not believe that this was wrong or bad, nor even that it risked being bad. Or suppose that the pilot was crucially factually ignorant, believing mistakenly but fully that she had been told to intentionally crash the plane for a film stunt. Drawing on various of the intuitions generated in reflection on the epistemic condition for culpable misconduct (above), she is surely blameless for the crash under one or more of these conditions, unless she was blameworthy for her ignorance, or she displayed ill will despite her factual ignorance, or she had the capacity to be aware that she was not on a film set.

What moral significance or morally significant features, in particular, must be foreseen/foreseeable? Plausibly the answer should be informed by one’s account of the epistemic condition for directly culpable misconduct. Thus, strong internalists and others (for example, Sartorio) who require de dicto awareness of moral significance might be tempted to require, for culpability, that the consequence is believed to be morally bad or wrong. Weak internalists such as Robichaud might only require foresight/foreseeability of reasons against the consequence. And quality-of-will theorists and capacitarians might only require foresight/foreseeability of the consequence’s bad-making features.

At last, we come to the debate between foresight and foreseeability views. Why demand a more restrictive foresight condition for derivative responsibility? Intuitively it seems that (reasonable) foreseeability could suffice. Suppose that the teacher failed to even foresee misleading her students as a consequence of not preparing for her class, but that this consequence was (at least reasonably) foreseeable for her. Even so, it seems that she could be blameworthy for misleading her students. At the very least, that is the type of view that quality-of-will externalists and capacitarians would be drawn to (cf. Rudy-Hiller 2017). Consider, after all, that she seems to meet capacitarian conditions with respect to the consequence of misleading her students: she seems to have the capacity and the opportunity to foresee, and failing to foresee falls short of a cognitive standard that applies to her (no doubt qua teacher). In fact, capacitarian conditions seem to provide a plausible analysis of the nature of foreseeability (compare Zimmerman’s [1986] discussion of an alternative analysis in terms of what the “reasonable person” would foresee, as used in the legal definition of negligence). Quality-of-will externalists might also appeal to the way that her failure to foresee misleading her students, despite its being reasonably foreseeable for her, reveals an objectionable indifference to their success.

But the fact that a foreseeability view is at home with externalism about directly culpable misconduct might give us a clue as to how the foresight view could plausibly be defended against it, despite being more restrictive and perhaps less intuitive: the best justification for the foresight view seems to come from internalism about directly culpable misconduct. Interestingly, however, some internalists (Rosen 2008; Fischer and Tognazzini 2009), who argue that blameless ignorance excuses the wrongdoing done from it, defend a foreseeability view. But they do not tend to give an argument for this combination of internalism about direct culpability with a foreseeability view about indirect culpability. And, in fact, Daniel Miller (2017) has recently produced an ingenious argument for the inconsistency of this combination of commitments:

The argument begins from the premise that it is possible for an agent to be blameless for failing to foresee what was foreseeable for him. The second premise is the principle that an agent is blameworthy for acting from ignorance only if he is blameworthy for that ignorance. If blameless ignorance excuses agents for actions, though, then it also excuses agents for action consequences (the third premise). But, given the first premise, foreseeability versions of the tracing strategy contradict this: they imply that an agent can be blameworthy for some consequence even if he was blamelessly ignorant of it. (Miller 2017, 1567)

So it looks like Rosen and Fischer and Tognazzini owe Miller a reply. Perhaps they might do best to question premise one. If they cannot respond to this charge of inconsistency, however, they must revise one of their commitments.

b. No-Foreseeability Views

Foresight and foreseeability views are not the only views on the epistemic condition for derivative responsibility. No-foreseeability views (we might call them) hold that we can be responsible for the consequences of our conduct even if they are entirely (or at least reasonably) unforeseeable at the time of that conduct, provided that the consequences are appropriately (for example, “non-deviantly”) caused by it, or reflect the agent’s ill will, or what have you. Basic and control-based externalists and quality-of-will externalists could therefore be attracted to such a view. In fact, Rik Peels (2017) appears to defend a kind of no-foreseeability view of derivative responsibility for beliefs. On his view, we are responsible for those beliefs that we have merely influenced through our actions, where influence of a belief that p consists simply in the “ability to believe otherwise”—or there being some “action or series of actions A that [the agent] S could have performed such that if S had performed A, S would not have believed that p” (2017, 143). But this view seems to propose far too weak a condition of derivative responsibility for beliefs. A corresponding account of derivative responsibility for events would entail that, for example, if the pilot’s airplane crash could have been prevented had the pilot run through the pre-flight checklist but the crash caused the airplane company to go into liquidation, then the pilot would be responsible for this consequence, even if the pilot had no way of foreseeing it (especially given her justified belief that the company was on firm financial footing). And it does not seem that beliefs as action consequences are relevantly different from events. From another point of view, quality-of-will externalists might try to justify a no-foreseeability view by arguing that there are cases in which the consequences of one’s conduct reflect ill will even though those consequences were not (reasonably) foreseeable.
But even if the pilot displayed recklessness towards other people’s lives by rushing through the pre-flight checklist (in the case where the pilot does not believe she is doing a film stunt), it does not seem that she is morally responsible for throwing the company into liquidation, for this consequence does not seem to reflect ill will. But perhaps the quality-of-will externalist could try to argue that there are some unforeseeable consequences of the airplane crash that do reflect the pilot’s recklessness.

These are the challenges facing a no-foreseeability view of derivative responsibility. But a reason to take the view seriously is found in Manuel Vargas’ (2005) well-discussed dilemma for foresight and (reasonable) foreseeability views (which in many ways parallels the revisionist dilemma posed by strong internalists about culpable misconduct). According to Vargas’ dilemma, there are many cases in which the consequences of our behavior (for example, as youth) on our character and later choices are not foreseeable at the time of that behavior, and yet we are intuitively to blame for those consequences. Commonly discussed is his case of “Jeff the Jerk,” in which Jeff, a high-school kid, endeavors to become more like the “jerks” who have “success” with their female classmates. He successfully becomes a jerk, but this means that later in life he is “rude and inconsiderate about the feelings of others” as he lays off his employees (2005, 271). Vargas argues that it is natural and common to think that we are culpable for these sorts of consequences of our earlier behavior, even though they are not reasonably foreseeable. But foresight and reasonable foreseeability views must regard these character traits and choices as something for which we lack responsibility. Thus, we have a dilemma: either we accept a reasonable foreseeability or foresight view and its culpability-revisionist implications, or we reject those views in order to vindicate our ordinary pre-theoretical intuitions.

But are foreseeability and foresight views stuck on the horns of this dilemma? In favour of a reasonable foreseeability view, Fischer and Tognazzini (2009) reply that Vargas’ cases are either cases in which the consequences in question are intuitively non-culpable, or they are culpable but there is a way for reasonable foreseeability views to account for their culpability. Concerning Jeff the Jerk, for instance, Fischer and Tognazzini argue that he is blameworthy for the way that he lays off his employees, since a relevant consequence type was foreseeable for Jeff: the consequence that he would “[treat] some people poorly at some point in the future as a result of his jerky character” (2009, 537). So it is not clear that Vargas’ dilemma for foresight and foreseeability views can successfully be used to defend no-foreseeability views, or at least used against consequence-type reasonable foreseeability views.

4. Future Areas for Research

The epistemic condition of moral responsibility is thus a ripe field of philosophical research. While there is much more room for future contributions on the epistemic condition for culpable misconduct and for derivative responsibility, there are at least three other areas for future research on the epistemic condition about which comparatively less has been written.

One of these areas is the epistemic condition for moral praiseworthiness, to which there are only a few extant contributions. Nomy Arpaly (2002) defends the view that cases of “inverse akrasia,” or of doing something right while believing that it is wrong, can in fact be morally praiseworthy, given appropriate care about the act’s right-making features. Paulina Sliwa (2017) disagrees, holding that one must be aware of the rightness of an act to be praiseworthy for it. But even if we grant with Sliwa that a belief in wrongdoing undermines praiseworthiness, must there be awareness of the act’s rightness? What about a view modeled on a kind of weak internalism about culpability? Or perhaps there are reasons to embrace an asymmetry between the epistemic condition for praiseworthiness and the epistemic condition for blameworthiness.

Another area for future research is the epistemic condition for collective responsibility. As yet, there is not much work on this subject, but there are interesting questions to be asked on what the satisfaction of the above epistemic conditions on individual responsibility would look like at the collective level (supposing that such epistemic conditions ought to be satisfied for collective responsibility), and whether any unique epistemic conditions must be satisfied. If we took a “collectivist” approach to collective responsibility, according to which groups or corporations themselves can be morally responsible for collective actions and their consequences (whatever we say about the responsibility of individual members), we might wonder whether and under what conditions groups can themselves know or believe things, or whether this is even required for them to be morally responsible. Alternatively, if we took a more “individualistic” approach to collective responsibility, according to which only individual members of groups can be held responsible for collective actions and their consequences, it would seem that ordinary epistemic conditions apply concerning responsibility for their direct contribution to the collective action, but that further epistemic conditions need to be satisfied for them to be held responsible for collective actions and their consequences. On Seumas Miller’s (2006, 177) individualist approach, for instance, individual members are morally responsible for a collective action and any consequences of it only if they have a true belief that by acting in a certain way, “they will jointly realize an end which each of them has.”

A final area for future research concerns the significance of the epistemic condition for criminal liability. In one of the first book-length studies of this kind, Douglas Husak (2016), a weak control-based internalist, argues that ignorance that an act is, or might be, morally wrong should ideally excuse offenders from criminal punishment. Such a view, if implemented, would force significant revisions to current (Anglo-American/common law) legal systems. Of course, it is already true in such systems that to determine whether a criminal offence has actually taken place—that is, to determine whether the accused performed the actus reus (that is, the act) with the mens rea (that is, the mental state) of criminal intent, knowledge, recklessness, or negligence—the satisfaction of certain epistemic conditions concerning awareness/ignorance of (non-moral/non-legal) facts must be proven beyond reasonable doubt. These conditions are part of the mens rea components of offenses. If your unattended child is harmed and you are ignorant of the risk of harm, but a “reasonable person” would have recognized that risk, then you are criminally negligent (for example, guilty of negligent homicide or endangerment). You are criminally reckless, by contrast, if you cause harm while recognizing the risk of harm; and you have criminal intent or knowledge if you cause harm knowing that the act would cause harm. Your sentence would also likely be heftier if you were found guilty of one of these latter forms of liability than if you were found guilty of mere negligence (matching the common but not uncontroversial assumption that akratic wrongdoing is more culpable than unwitting wrongdoing). Some existing offenses do also include awareness of the act’s illegality or wrongfulness in their mens rea components.
(And one might think of the existing “insanity defense” in this context, for how it allows offenders to avoid conviction on the grounds that they cannot “distinguish right from wrong.” But in responsibility terms, this would be to appeal to a lack of a baseline moral capacity for responsibility, rather than to appeal directly to ignorance of the act’s wrongfulness.) However, in Husak’s mind, we need to look beyond the way that actual jurisdictions impose criminal liability. If, guided by the “presumptive conformity” of law to morality, we were to consistently apply the correct—in Husak’s view, weak internalist—epistemic conditions of moral blameworthiness to criminal liability in the ideally just legal system (that is, without consideration of real-world problems concerning its applicability), then not only might we have to remove negligence as a form of criminal liability (for it is after all a form of ignorance of fact), but, argues Husak, we would have to “treat mistakes of fact and law [or morality] symmetrically by replicating the same normative structure in each context” (2016, 161). That is to say, the just legal system would impose criminal liability and punishment only on those offenders who are intentional, knowledgeable, or reckless (and probably not merely negligent) with respect to the underlying morality of the offence—in particular, with respect to whether it is “contrary to the balance of moral reasons and is wrong” (2016, 161).
In practice, the just legal system would then either explicitly or implicitly build a requirement of awareness of (the risk of) wrongdoing into the mens rea element of the definition of the offense (for example, “murder” would be “knowingly killing someone while knowing the wrongfulness of doing so”), or (less symmetrically) such a system would leave the definitions of offences untouched and provide a unique “mistake of law/morality” defense (alongside other defenses, such as the insanity defense) for a not-guilty plea (see Husak’s discussion in: 2016, 262ff).

Husak’s revisionary application of the epistemic condition to criminal liability raises a number of questions. One issue that many will have with his straightforward application of culpability internalism to criminal liability is that the ideally just legal system would not punish “zealous terrorists who are unaware of wrongdoing” (2016, 265)—a rather counterintuitive consequence of the view! In this connection, we might ask whether it is true that a just legal system would make criminal liability depend on (at least one form of) moral blameworthiness, and thus on the satisfaction of its epistemic condition. Suppose that it would not. Would criminal liability still be structurally analogous to moral blameworthiness (cf. Rosen 2003, 80-81), such that a parallel epistemic condition applies? If it were to make criminal liability depend on moral blameworthiness or a structural analogue, would the just legal system make criminal liability depend on the most plausible view of the epistemic condition (for example, in Husak’s view, a weak control-based internalism), or rather would it make criminal liability depend on the most widely accepted view of moral blameworthiness, or perhaps whatever view accords most with common-sense intuitions of blameworthiness? Or should criminal liability have nothing to do with moral blameworthiness (but be concerned exclusively with, say, mere wrongdoing, deterrence, or rehabilitation)? These are all important questions for future inquiries into the epistemic condition.

5. References and Further Reading

  • Arpaly, Nomy. Unprincipled Virtue. Oxford: Oxford University Press, 2002.
  • Boylan, Michael. Basic Ethics, 3rd ed. New York: Routledge, 2021.
  • Chisholm, Roderick. Person and Object: A Metaphysical Study. London: George Allen & Unwin Ltd, 1976.
  • Clarke, Randolph. “Blameworthiness and Unwitting Omissions.” In The Ethics and Law of Omissions, edited by Dana Kay Nelkin and Samuel C. Rickless. Oxford: Oxford University Press, 2017.
  • Clarke, Randolph. “Negligent Action and Unwitting Omission.” In Omissions: Agency, Metaphysics, and Responsibility. Oxford: Oxford University Press, 2014.
  • Cloos, Christopher Michael. Responsibility Beyond Belief: The Epistemic Condition on Moral Responsibility: a doctoral dissertation accepted by the University of California, Santa Barbara, September 2018. Available at: https://escholarship.org/uc/item/1hr314cs.
  • Fischer, John Martin, and Neal A. Tognazzini. “The Truth about Tracing.” Noûs 43, no. 3 (2009): 531-556.
  • FitzPatrick, William. “Moral Responsibility and Normative Ignorance: Answering a New Skeptical Challenge.” Ethics 118, no. 4 (2008): 589–613.
  • FitzPatrick, William. “Unwitting Wrongdoing, Reasonable Expectations, and Blameworthiness.” In Responsibility: The Epistemic Condition, edited by Philip Robichaud and Jan Willem Wieland, 29–46. Oxford: Oxford University Press, 2017.
  • Frankfurt, Harry G. “Alternate Possibilities and Moral Responsibility.” The Journal of Philosophy 66, no. 23 (1969): 829–39.
  • Guerrero, Alexander. “Don’t Know, Don’t Kill: Moral Ignorance, Culpability, and Caution.” Philosophical Studies 136, no. 1 (2007): 59–97.
  • Haji, Ishtiyaque. “An Epistemic Dimension of Blameworthiness.” Philosophy and Phenomenological Research 57, no. 3 (1997): 523–44.
  • Harman, Elizabeth. “Does Moral Ignorance Exculpate?” Ratio 24, no. 4 (2011): 443–68.
  • Husak, Douglas. Ignorance of Law: A Philosophical Inquiry. Oxford: Oxford University Press, 2016.
  • Levy, Neil. “Culpable Ignorance and Moral Responsibility: A Reply to FitzPatrick.” Ethics 119, no. 4 (2009): 729–41.
  • Levy, Neil. “Culpable Ignorance: A Reply to Robichaud.” Journal of Philosophical Research 41 (2016): 263–71.
  • Levy, Neil. “The Good, the Bad and the Blameworthy.” Journal of Ethics and Social Philosophy 1, no. 2 (2005): 1–16.
  • Levy, Neil. Hard Luck: How Luck Undermines Free Will and Moral Responsibility. Oxford: Oxford University Press, 2011.
  • Levy, Neil. “Methodological Conservatism and the Epistemic Condition.” In Responsibility: The Epistemic Condition, edited by Philip Robichaud and Jan Willem Wieland, 252–65. Oxford: Oxford University Press, 2017.
  • Mason, Elinor. Ways to Be Blameworthy: Rightness, Wrongness, and Responsibility. Oxford: Oxford University Press, 2019.
  • McKenna, Michael. Conversation and Responsibility. Oxford: Oxford University Press, 2012.
  • Miller, Daniel. “Reasonable Foreseeability and Blameless Ignorance.” Philosophical Studies 174, no. 6 (2017): 1561-1581.
  • Miller, Seumas. “Collective Moral Responsibility: An Individualist Account.” Midwest Studies in Philosophy 15 (2006): 176-193.
  • Montmarquet, James. “Zimmerman on Culpable Ignorance.” Ethics 109, no. 4 (1999): 842–45.
  • Murray, Samuel. “Responsibility and Vigilance.” Philosophical Studies 174, no. 2 (2017): 507–27.
  • Nelkin, Dana Kay, and Samuel C. Rickless. “Moral Responsibility for Unwitting Omissions.” In The Ethics and Law of Omissions, edited by Dana Kay Nelkin, and Samuel C. Rickless, 106-130. New York: Oxford University Press, 2017.
  • Nottelmann, Nikolaj. Blameworthy Belief: A Study in Epistemic Deontologism. Dordrecht: Springer Netherlands, 2007.
  • Peels, Rik. Responsible Belief: A Theory in Ethics and Epistemology. Oxford: Oxford University Press, 2017.
  • Peels, Rik. “Tracing Culpable Ignorance.” Logos & Episteme 2, no. 4 (2011): 575–82.
  • Robichaud, Philip. “On Culpable Ignorance and Akrasia.” Ethics 125, no. 1 (2014): 137–51.
  • Rosen, Gideon. “Culpability and Ignorance.” Proceedings of the Aristotelian Society 103 (2003): 61–84.
  • Rosen, Gideon. “Kleinbart the Oblivious and Other Tales of Ignorance and Responsibility.” The Journal of Philosophy 105, no. 10 (2008): 591–610.
  • Rosen, Gideon. “Skepticism about Moral Responsibility.” Philosophical Perspectives 18 (2004): 295–313.
  • Rudy-Hiller, Fernando. “A Capacitarian Account of Culpable Ignorance.” Pacific Philosophical Quarterly 98 (2017): 398–426.
  • Rudy-Hiller, Fernando. “Give People a Break: Slips and Moral Responsibility.” Philosophical Quarterly 69, no. 277 (2019): 721-740.
  • Sartorio, Carolina. “Ignorance, Alternative Possibilities, and the Epistemic Conditions for Responsibility.” In Perspectives on Ignorance from Moral and Social Philosophy, edited by Rik Peels, 15–29. New York: Routledge, 2017.
  • Sher, George. Who Knew?: Responsibility Without Awareness. Oxford: Oxford University Press, 2009.
  • Sliwa, Paulina. “On Knowing What’s Right and Being Responsible For It.” In Responsibility: The Epistemic Condition, edited by Philip Robichaud and Jan Willem Wieland, 127-145. Oxford: Oxford University Press, 2017.
  • Smith, Angela. “Responsibility for Attitudes: Activity and Passivity in Mental Life.” Ethics 115, no. 2 (2005): 236–71.
  • Smith, Angela. “Review of George Sher’s Who Knew? Responsibility without Awareness.” Social Theory and Practice 36, no. 3 (2010): 515–524.
  • Smith, Holly. “Culpable Ignorance.” The Philosophical Review 92, no. 4 (1983): 543–71.
  • Smith, Holly. “Non-Tracing Cases of Culpable Ignorance.” Criminal Law and Philosophy 5, no. 2 (2011): 115–46.
  • Talbert, Matthew. “Omission and Attribution Error.” In The Ethics and Law of Omissions, edited by Dana Nelkin and Samuel C. Rickless, 17–35. Oxford: Oxford University Press, 2017.
  • Talbert, Matthew. “Unwitting Wrongdoers and the Role of Moral Disagreement in Blame.” In Oxford Studies in Agency and Responsibility Volume 1, edited by David Shoemaker. Oxford: Oxford University Press, 2013.
  • Vargas, Manuel. “The Trouble with Tracing.” Midwest Studies in Philosophy 29 (2005): 269–91.
  • Watson, Gary. “Two Faces of Responsibility.” Philosophical Topics 24, no. 2 (1996): 227–48.
  • Wieland, Jan W. “Introduction: The Epistemic Condition.” In Responsibility: The Epistemic Condition, edited by Philip Robichaud and Jan Willem Wieland, 1–45. Oxford: Oxford University Press, 2017.
  • Yates, Thomas A. Moral Responsibility and Motivating Reasons: On the Epistemic Condition for Moral Blameworthiness: a doctoral dissertation accepted by the University of Auckland, February 5, 2021. Available at: https://researchspace.auckland.ac.nz/handle/2292/54410.
  • Zimmerman, Michael J. “Ignorance as a Moral Excuse.” In Perspectives on Ignorance from Moral and Social Philosophy, edited by Rik Peels, 77-94. New York, US: Routledge, 2017.
  • Zimmerman, Michael J. “Moral Responsibility and Ignorance.” Ethics 107 (1997): 410–26.
  • Zimmerman, Michael J. “Negligence and Moral Responsibility.” Noûs 20, no. 2 (1986): 199–218.


Author Information

Tom Yates
Email: tyatesnz@gmail.com
Massey University
New Zealand

Deductive and Inductive Arguments

In philosophy, an argument consists of a set of statements called premises that serve as grounds for affirming another statement called the conclusion. Philosophers typically distinguish arguments in natural languages (such as English) into two fundamentally different types: deductive and inductive. Each type of argument is said to have characteristics that categorically distinguish it from the other type. The two types of argument are also said to be subject to differing evaluative standards. Pointing to paradigmatic examples of each type of argument helps to clarify their key differences. The distinction between the two types of argument may hardly seem worthy of philosophical reflection, as evidenced by the fact that their differences are usually presented as straightforward, such as in many introductory philosophy textbooks. Nonetheless, the question of how best to distinguish deductive from inductive arguments, and indeed whether there is a coherent categorical distinction between them at all, turns out to be considerably more problematic than commonly recognized. This article identifies and discusses a range of different proposals for marking categorical differences between deductive and inductive arguments while highlighting the problems and limitations attending each. Consideration is also given to the ways in which one might do without a distinction between two types of argument by focusing instead solely on the application of evaluative standards to arguments.

Table of Contents

  1. Introduction
  2. Psychological Approaches
  3. Behavioral Approaches
  4. Arguments that “Purport”
  5. Evidential Completeness
  6. Logical Necessity vs. Probability
  7. The Question of Validity
  8. Formalization and Logical Rules to the Rescue?
  9. Other Even Less Promising Proposals
  10. An Evaluative Approach
  11. References and Further Reading

1. Introduction

In philosophy, an argument consists of a set of statements called premises that serve as grounds for affirming another statement called the conclusion. Philosophers typically distinguish arguments in natural languages (such as English) into two fundamentally different kinds: deductive and inductive. (Matters become more complicated when considering arguments in formal systems of logic as well as in the many forms of non-classical logic. Readers are invited to consult the articles on Logic in this encyclopedia to explore some of these more advanced topics.) In the philosophical literature, each type of argument is said to have characteristics that categorically distinguish it from the other type.

Deductive arguments are sometimes illustrated by providing an example in which an argument’s premises logically entail its conclusion. For example:

Socrates is a man.
All men are mortal.
Therefore, Socrates is mortal.

Assuming the truth of the two premises, it seems that it simply must be the case that Socrates is mortal. According to this view, then, this would be a deductive argument.
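The logical guarantee at work here can be made explicit in standard first-order notation (an illustrative textbook rendering, not the article’s own; the predicate names are ours):

```latex
% The Socrates syllogism rendered in first-order logic, with s standing
% for Socrates:
%   Premise 1: Socrates is a man.
%   Premise 2: All men are mortal.
%   Conclusion: Socrates is mortal.
\mathrm{Man}(s),\quad
\forall x\,\bigl(\mathrm{Man}(x) \to \mathrm{Mortal}(x)\bigr)
\;\vdash\;
\mathrm{Mortal}(s)
% The conclusion follows by universal instantiation (taking x to be s)
% and modus ponens; no interpretation makes both premises true and the
% conclusion false.
```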

By contrast, inductive arguments are said to be those that make their conclusions merely probable. They might be illustrated by an example like the following:

Most Greeks eat olives.
Socrates is a Greek.
Therefore, Socrates eats olives.

Assuming the truth of those premises, it is likely that Socrates eats olives, but that is not guaranteed. According to this view, this argument is inductive.

This way of viewing arguments has a long history in philosophy. An explicit distinction between two fundamentally distinct argument types goes back to Aristotle (384-322 B.C.E.) who, in his works on logic (later dubbed “The Organon”, meaning “the instrument”) distinguished syllogistic reasoning (sullogismos) from “reasoning from particulars to universals” (epagôgê). Centuries later, induction was famously advertised by Francis Bacon (1561-1626) in his New Organon (1620) as the royal road to knowledge, while Rationalist mathematician-philosophers, such as René Descartes (1596-1650) in his Discourse on the Method (1637), favored deductive methods of inquiry. Albert Einstein (1879-1955) discussed the distinction in the context of science in his essay, “Induction and Deduction in Physics” (1919). Much contemporary professional philosophy, especially in the Analytic tradition, focuses on presenting and critiquing deductive and inductive arguments while considering objections and responses to them. It is therefore safe to say that a distinction between deductive and inductive arguments is fundamental to argument analysis in philosophy.

Although a distinction between deductive and inductive arguments is deeply woven into philosophy, and indeed into everyday life, many people probably first encounter an explicit distinction between these two kinds of argument in a pedagogical context. For example, students taking an elementary logic, critical thinking, or introductory philosophy course might be introduced to the distinction between each type of argument and be taught that each has its own standards of evaluation. Deductive arguments may be said to be valid or invalid, and sound or unsound. A valid deductive argument is one whose logical structure or form is such that if the premises are true, the conclusion must be true. A sound argument is a valid argument with true premises. Inductive arguments, by contrast, are said to be strong or weak, and, although terminology varies, they may also be considered cogent or not cogent. A strong inductive argument is said to be one whose premises render the conclusion likely. A cogent argument is a strong argument with true premises. All arguments are made better by having true premises, of course, but the differences between deductive and inductive arguments are said to concern structure, independently of whether an argument's premises are in fact true, which is a semantic matter.
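The structural character of validity can be illustrated computationally. The following sketch, offered only as an illustration, models the classic Socrates syllogism with Python sets; the variable names and the membership data are hypothetical and not part of the article.

```python
# Illustrative sketch: modeling "All men are mortal; Socrates is a man;
# therefore, Socrates is mortal" with sets. Names and data are hypothetical.

men = {"Socrates", "Plato"}
mortals = {"Socrates", "Plato", "Fido"}  # every man, plus at least one non-man

premise_1 = men <= mortals       # "All men are mortal" (subset relation)
premise_2 = "Socrates" in men    # "Socrates is a man"
conclusion = "Socrates" in mortals

# Validity is structural: for ANY choice of sets, if both premises hold,
# the conclusion must hold, because subset inclusion preserves membership.
assert not (premise_1 and premise_2) or conclusion

# Soundness adds truth: a sound argument is a valid argument whose premises
# are actually true. Both premises hold of this toy model, so the
# conclusion is established within it.
print(premise_1 and premise_2 and conclusion)  # prints True
```

The point the sketch makes is the one in the paragraph above: the `assert` expresses validity (a matter of form), while the final line additionally depends on the premises being true of the chosen model (soundness).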

The distinction between deductive and inductive arguments is considered important because, among other things, it is crucial during argument analysis to apply the right evaluative standards to any argument one is considering. Indeed, it is not uncommon to be told that in order to assess any argument, three steps are necessary. First, one is to determine whether the argument being considered is a deductive argument or an inductive one. Second, one is to then determine whether the argument is valid or invalid. Finally, one is to determine whether the argument is sound or unsound (Teays 1996).

All of this would seem to be amongst the least controversial topics in philosophy. Controversies abound in metaphysics, epistemology, and ethics (such as those exhibited in the contexts of Ancient and Environmental Ethics, just to name a couple). By contrast, the basic distinctions between deductive and inductive arguments seem more solid, more secure; in short, more settled than those other topics. Accordingly, one might expect an encyclopedic article on deductive and inductive arguments to simply report the consensus view and to clearly explain and illustrate the distinction for readers not already familiar with it. However, the situation is made more difficult by three facts.

First, there appear to be other forms of argument that do not fit neatly into the classification of deductive or inductive arguments. Govier (1987) calls the view that there are only two kinds of argument (that is, deductive and inductive) “the positivist theory of argument”. She believes that it naturally fits into, and finds justification within, a positivist epistemology, according to which knowledge must be either a priori (stemming from logic or mathematics, deploying deductive arguments) or a posteriori (stemming from the empirical sciences, using inductive arguments). She points out that arguments as most people actually encounter them assume such a wide variety of forms that the “positivist theory of argument” fails to account for a great many of them.

Second, it can be difficult to distinguish arguments in ordinary, everyday discourse as clearly either deductive or inductive. The supposedly sharp distinction tends to blur in many cases, calling into question whether the binary nature of the deductive-inductive distinction is correct.

Third (this point being the main focus of this article), a perusal of elementary logic and critical thinking texts, as well as other presentations aimed at non-specialist readers, demonstrates that there is in fact no consensus about how to draw the supposedly straightforward deductive-inductive argument distinction, at least within the context of introducing the distinction to newcomers. Indeed, proposals vary from locating the distinction within subjective, psychological states of arguers to objective features of the arguments themselves, with other proposals landing somewhere in-between.

Remarkably, not only do proposals vary greatly, but the fact that they do so at all, and that they generate different and indeed incompatible conceptions of the deductive-inductive argument distinction, also seems to go largely unremarked upon by those advancing such proposals. Many authors confidently explain the distinction between deductive and inductive arguments without the slightest indication that there are other apparently incompatible ways of making such a distinction. Moreover, there appears to be little scholarly discussion concerning whether the alleged distinction even makes sense in the first place. That there is a coherent, unproblematic distinction between deductive and inductive arguments, and that the distinction neatly assigns arguments to one or the other of the two non-overlapping kinds, is an assumption that usually goes unnoticed and unchallenged. Even a text with the title Philosophy of Logics (Haack 1978) makes no mention of this fundamental philosophical problem.

A notable exception has already been mentioned in Govier (1987), who explicitly critiques what she calls “the hallowed old distinction between inductive and deductive arguments.” However, her insightful discussion turns out to be the exception that proves the rule. Her critique appears not to have awoken philosophers from their dogmatic slumbers concerning the aforementioned issues of the deductive-inductive argument classification. Moreover, her discussion, while perceptive, does not engage the issue with the level of sustained attention that it deserves, presumably because her primary concerns lay elsewhere. In short, the problem of distinguishing between deductive and inductive arguments seems not to have registered strongly amongst philosophers. A consequence is that the distinction is often presented as if it were entirely unproblematic. Whereas any number of other issues are subjected to penetrating philosophical analysis, this fundamental issue typically traipses past unnoticed.

Accordingly, this article surveys, discusses, and assesses a range of common (and other not-so-common) proposals for distinguishing between deductive and inductive arguments, ranging from psychological approaches that locate the distinction within the subjective mental states of arguers, to approaches that locate the distinction within objective features of arguments themselves. It aims first to provide a sense of the remarkable diversity of views on this topic, and hence of the significant, albeit typically unrecognized, disagreements concerning this issue. Along the way, it is pointed out that none of the proposed distinctions populating the relevant literature is entirely without problems, especially when set alongside other philosophical views that many philosophers would be inclined to accept, although some of the problems facing the proposed distinctions may be judged more serious than others.

In light of these difficulties, a fundamentally different approach is then sketched: rather than treating a categorical deductive-inductive argument distinction as entirely unproblematic (as a great many authors do), these problems are made explicit so that emphasis can be placed on the need to develop evaluative procedures for assessing arguments without identifying them as strictly “deductive” or “inductive.” This evaluative approach to argument analysis respects the fundamental rationale for distinguishing deductive from inductive arguments in the first place, namely as a tool for helping one to decide whether the conclusion of any argument deserves assent. Such an approach bypasses the problems associated with categorical approaches that attempt to draw a sharp distinction between deductive and inductive arguments. Ultimately, the deductive-inductive argument distinction should be dispensed with entirely, a conclusion that will no doubt strike some as counterintuitive, but one that can be made plausible by attending to the arguments that follow.

First, a word on strategy. Each of the proposals considered below will be presented from the outset in its most plausible form in order to see why it might seem attractive, at least initially. The consequences of accepting each proposal are then delineated, consequences that might well give one pause in thinking that the deductive-inductive argument distinction in question is satisfactory.

2. Psychological Approaches

Perhaps the most popular approach to distinguishing between deductive and inductive arguments is to take a subjective psychological state of the agent advancing a given argument to be the crucial factor. For example, one might be informed that whereas a deductive argument is intended to provide logically conclusive support for its conclusion, an inductive argument is intended to provide only probable, but not conclusive, support (Barry 1992; Vaughn 2010; Harrell 2016; and many others). Some accounts of this sort could hardly be more explicit that such psychological factors alone are decisive. From this perspective, then, it may be said that the difference between deductive and inductive arguments does not lie in the words used within the arguments, but rather in the intentions of the arguer. That is to say, the difference between each type of argument comes from the relationship the arguer takes there to be between the premises and the conclusion. If the arguer believes that the truth of the premises definitely establishes the truth of the conclusion, then the argument is deductive. If the arguer believes that the truth of the premises provides only good reasons to believe the conclusion is probably true, then the argument is inductive. According to this psychological account, the distinction between deductive and inductive arguments is determined exclusively by the intentions and/or beliefs of the person advancing an argument.

This psychological approach entails some interesting, albeit often unacknowledged, consequences. Because the difference between deductive and inductive arguments is said to be determined entirely by what an arguer intends or believes about any given argument, it follows that what is ostensibly the very same argument may be equally both deductive and inductive.

An example may help to illustrate this point. If person A believes that the premise in the argument “Dom Pérignon is a champagne; so, it is made in France” definitely establishes its conclusion (perhaps on the grounds that “champagne” is a type of sparkling wine produced only in the Champagne wine region of France), then according to the psychological approach being considered, this would be a deductive argument. However, if person B believes that the premise of the foregoing argument provides only good reasons to believe that the conclusion is true (perhaps because they think of “champagne” as merely any sort of fizzy wine), then the argument in question is also an inductive argument. Therefore, it is entirely possible on this psychological view for the same argument to be both a deductive and an inductive argument. It is a deductive argument because of what person A believes. It is also an inductive argument because of what person B believes. Indeed, this consequence need not involve different individuals at all. This result follows even if the same individual maintains different beliefs and/or intentions with respect to the argument’s strength at different times.

The belief-relativity inherent in this psychological approach is not by itself an objection, much less a decisive one. Olson (1975) explicitly advances such an account, and frankly embraces its intention- or belief-relative consequences. Perhaps the fundamental nature of arguments is relative to individuals’ intentions or beliefs, and thus the same argument can be both deductive and inductive. However, this psychological approach does place logical constraints on what else one can coherently claim. For example, one cannot coherently maintain that, given the way the terms ‘deductive argument’ and ‘inductive argument’ are understood here, an argument is always one or the other and never both. If this psychological account of the deductive-inductive argument distinction is accepted, then the latter claim is necessarily false.

Of course, there is a way to reconcile the psychological approach considered here with the claim that an argument is either deductive or inductive, but never both. One could opt to individuate arguments on the basis of individuals’ specific intentions or beliefs about them. In this more sophisticated approach, what counts as a specific argument would depend on the intentions or beliefs regarding it. So, for example, if person A believes that “Dom Pérignon is a champagne; so, it is made in France” definitely establishes the truth of its conclusion, while person B believes that “Dom Pérignon is a champagne; so, it is made in France” provides only good reasons for thinking that its conclusion is true, then there isn’t just one argument here after all. Rather, according to this more sophisticated account, there are two distinct arguments here that just happen to be formulated using precisely the same words. According to this view, the belief that there is just one argument here would be naïve. Hence, it could still be the case that any argument is deductive or inductive, but never both. Arguments just need to be multiplied as needed.

However, this more sophisticated strategy engenders some interesting consequences of its own. Since intentions and beliefs can vary in clarity, intensity, and certainty, any ostensible singular argument may turn out to represent as many distinct arguments as there are persons considering a given inference. So, for example, what might initially have seemed like a single argument (say, St. Anselm of Canterbury‘s famous ontological argument for the existence of God) might turn out in this view to be any number of different arguments because different thinkers may harbor different degrees of intention or belief about how well the argument’s premises support its conclusion.

On a similar note, the same ostensible single argument may turn out to be any number of arguments if the same individual entertains different intentions or beliefs (or different degrees of intention or belief) at different times concerning how well its premises support its conclusion, as when one reflects upon an argument for some time. Again, this is not necessarily an objection to this psychological approach, much less a decisive one. A proponent of this psychological approach could simply bite the bullet and concede that what at first appeared to be a single argument may in fact be many.

Be that as it may, there are yet other logical consequences of adopting such a psychological account of the deductive-inductive argument distinction that, taken together with the foregoing considerations, may raise doubts about whether such an account could be the best way to capture the relevant distinction. Because intentions and beliefs are not publicly accessible, and indeed may not always be perfectly transparent even to oneself, confident differentiation of deductive and inductive arguments may be hard or even impossible in many, or even in all, cases. For example, in cases where one does not or cannot know what the arguer’s intentions or beliefs are (or were), it is necessarily impossible to identify which type of argument it is, assuming, again, that it must be either one type or the other. If the first step in evaluating an argument is determining which type of argument it is, one cannot even begin.

In response, one might be advised to look for the use of indicator words or phrases as clues to discerning an arguer’s intentions or beliefs. The use of words like “necessarily,” or “it follows that,” or “therefore it must be the case that” could be taken to indicate that the arguer intends the argument to definitely establish its conclusion, and therefore, according to the psychological proposal being considered, one might judge it to be a deductive argument. Alternatively, the use of words like “probably,” “it is reasonable to conclude,” or “it is likely” could be interpreted to indicate that the arguer intends only to make the argument’s conclusion probable. One might judge it to be an inductive argument on that basis.

However, while indicator words or phrases may suggest specific interpretations, they need to be viewed in context, and are far from infallible guides. At best, they are indirect clues as to what any arguer might believe or intend. Someone may say one thing, but intend or believe something else. This need not involve intentional lying. Intentions and beliefs are often opaque, even to the person whose intentions and beliefs they are. Moreover, they are of limited help in providing an unambiguous solution in many cases. Consider the following example:

Most Major League Baseball outfielders consistently have batting averages over .250. Since Ken Singleton played centerfield for the Orioles for three consecutive years, he must have been batting over .250 when he was traded.

If one takes seriously the “must have” clause in the last sentence, it might be concluded that the proponent of this argument intended to provide a deductive argument and thus, according to the psychological approach, it is a deductive argument. If one is not willing to ascribe that intention to the argument’s author, it might be concluded that he meant to advance an inductive argument. In some cases, it simply cannot be known. To offer another example, consider this argument:

It has rained every day so far this month.
If it has rained every day so far this month, then probably it will rain today.
Therefore, probably it will rain today.

The word “probably” appears twice, suggesting that this may be an inductive argument. Yet, many would agree that the argument’s conclusion is “definitely established” by its premises. Consequently, while being on the lookout for the appearance of certain indicator words is a commendable policy for dealing fairly with the arguments one encounters, it does not provide a perfectly reliable criterion for categorically distinguishing deductive and inductive arguments.

This consequence might be viewed as merely an inconvenient limitation on human knowledge, lamentably another instance of which there already are a great many. However, there is a deeper worry associated with a psychological approach than has been considered thus far. Recall that a common psychological approach distinguishes deductive and inductive arguments in terms of the intentions or beliefs of the arguer with respect to any given argument being considered. If the arguer intends or believes the argument to be one that definitely establishes its conclusion, then it is a deductive argument. If the arguer intends or believes the argument to be one that merely makes its conclusion probable, then it is an inductive argument. But what if the person putting forth the argument intends or believes neither of those things?

Philosophy instructors routinely share arguments with their students without any firm beliefs regarding whether they definitely establish their conclusions or whether they instead merely make their conclusions probable. Likewise, they may not have any intentions with respect to the arguments in question other than merely the intention to share them with their students. For example, if an argument is put forth merely as an illustration, or rhetorically to show how someone might argue for an interesting thesis, with the person sharing the argument not embracing any intentions or beliefs about what it does show, then on the psychological approach, the argument is neither a deductive nor an inductive argument. This runs counter to the view that every argument must be one or the other.

Nor can it be said that such an argument must be deductive or inductive for someone else, since there is no guarantee that anyone has any beliefs or intentions regarding the argument. In that case, if the set of sentences in question still qualifies as an argument, what sort of argument is it? It would seem to exist in a kind of logical limbo or no man’s land: neither deductive nor inductive. Nor is there any reason to suppose that it is some other type of argument, unless it is not really an argument at all, since no one intends or believes anything about how well it establishes its conclusion. One is then faced with the peculiar situation in which someone believes that a set of sentences is an argument, and yet, according to the psychological view, it cannot be an argument, because no one has any intentions for it to establish its conclusion, nor any beliefs about how well it does so. It could nonetheless still become a deductive or an inductive argument should someone come to embrace it with greater, or with lesser, conviction, respectively. On this view, arguments could continually flicker into and out of existence.

These considerations do not show that a purely psychological criterion for distinguishing deductive and inductive arguments must be wrong, as that would require adopting some other presumably more correct standard for making the deductive-inductive argument distinction, which would then beg the question against any psychological approach. Logically speaking, nothing prevents one from accepting all the foregoing consequences, no matter how strange and inelegant they may be. However, there are other troubling consequences of adopting a psychological approach to consider.

Suppose that it is said that an argument is deductive if the person advancing it believes that it definitely establishes its conclusion. According to this account, if the person advancing an argument believes that it definitely establishes its conclusion, then it is definitively deductive. If, however, everyone else who considers the argument thinks that it makes its conclusion merely probable at best, then the person advancing the argument is completely right and everyone else is necessarily wrong.

For example, consider the following argument: “It has rained nearly every day so far this month. So, it will for sure rain tomorrow as well.” If the person advancing this argument believes that the premise definitely establishes its conclusion, then according to such a psychological view, it is necessarily a deductive argument, despite the fact that it would appear to most others to at best make its conclusion merely probable. Or, to take an even more striking example, consider Dr. Samuel Johnson’s famous attempted refutation of Bishop George Berkeley‘s immaterialism (roughly, the view that there are no material things, but only ideas and minds) by forcefully kicking a stone and proclaiming “I refute it thus!” If Dr. Johnson sincerely believed that by his action he had logically refuted Berkeley’s immaterialism, then his stone-kicking declaration would be a deductive argument.

Likewise, some arguments that look like paradigm cases of deductive arguments will have to be re-classified, on this view, as inductive arguments if the authors of such arguments believe that the premises provide merely good reasons to accept the conclusions as true. For example, someone might give the following argument:

All men are mortal.
Socrates is a man.
Therefore, Socrates is mortal.

This is the classic example of a deductive argument included in many logic texts. However, if someone advancing this argument believes that the conclusion is merely probable given the premises, then it would, according to this psychological proposal, necessarily be an inductive argument, and not merely be believed to be so, given that it meets a sufficient condition for being inductive.

A variation on this psychological approach focuses not on intentions and beliefs, but rather on doubts. According to this alternative view, a deductive argument is one such that, if one accepts the truth of the premises, one cannot doubt the truth of the conclusion. By contrast, an inductive argument is one such that, if one accepts the truth of the premises, one can doubt the truth of the conclusion. This view is sometimes expressed by saying that deductive arguments establish their conclusions “beyond a reasonable doubt” (Teays 1996). Deductive arguments, in this view, may be said to be psychologically compelling in a way that inductive arguments are not. Good deductive arguments compel assent, but even quite good inductive arguments do not.

However, a moment’s reflection demonstrates that this approach entails many of the same awkward consequences as do the other psychological criteria previously discussed. What people are capable of doubting is as variable as what they might intend or believe, making this doubt-centered view subject to the same sorts of agent-relative implications facing any intention-or-belief approach.

One might try to circumvent these difficulties by saying that a deductive argument should be understood as one that establishes its conclusion beyond a reasonable doubt. In other words, given the truth of the premises, one should not doubt the truth of the conclusion. Likewise, one might say that an inductive argument is one such that, given the truth of the premises, one should be permitted to doubt the truth of the conclusion. However, this tactic would be to change the subject from the question of what categorically distinguishes deductive and inductive arguments to that of the grounds for deciding whether an argument is a good one – a worthwhile question to ask, to be sure, but a different question than the one being considered here.

Again, in the absence of some independently established distinction between deductive and inductive arguments, these consequences alone cannot refute any psychological account. Collectively, however, they raise questions about whether this way of distinguishing deductive and inductive arguments should be accepted, given that such consequences are hard to reconcile with other common beliefs about arguments, say, about how individuals can be mistaken about what sort of argument they are advancing. Luckily, there are other approaches. However, upon closer analysis these other approaches fare no better than the various psychological approaches thus far considered.

3. Behavioral Approaches

Psychological approaches are, broadly speaking, cognitive. They concern individuals’ mental states, specifically their intentions, beliefs, and/or doubts. Given the necessarily private character of mental states (assuming that brain scans, so far at least, provide only indirect evidence of individuals’ mental states), it may be impossible to know what an individual’s intentions or beliefs really are, or what they are or are not capable of doubting. Hence, it may be impossible given any one psychological approach to know whether any given argument one is considering is a deductive or an inductive one. That and other consequences of that approach seem less than ideal. Can such consequences be avoided?

The problem of knowing others’ minds is not new. A movement in psychology that flourished in the mid-20th century, some of whose tenets are still evident within 21st century psychological science, was intended to circumvent problems associated with the essentially private nature of mental states in order to put psychology on a properly scientific footing. According to Behaviorism, one can set aside speculations about individuals’ inaccessible mental states to focus instead on individuals’ publicly observable behaviors. According to certain behaviorists, any purported psychological state can be re-described as a set of behaviors. For example, a belief such as “It will rain today” might be cashed out along the lines of an individual’s behavior of putting on wet-weather gear or carrying an umbrella, behaviors that are empirically accessible insofar as they are available for objective observation. In this way, it was hoped, one can bypass unknowable mental states entirely.

Setting aside the question of whether Behaviorism is viable as a general approach to the mind, a focus on behavior rather than on subjective psychological states in order to distinguish deductive and inductive arguments promises to circumvent the epistemic problems facing a cognitive approach. According to one such proposal, a deductive argument is one whose premises are claimed to support the conclusion such that it would be impossible for the premises to be true and for the conclusion to be false. An inductive argument is one whose premises are claimed to provide only some less-than-conclusive grounds for accepting the conclusion (Copi 1978; Hurley and Watson 2018). A variation on this approach says that deductive arguments are ones in which the conclusion is presented as following from the premises with necessity, whereas inductive arguments are ones in which the conclusion is presented as following from the premises only with some probability (Engel 1994). Notice that, unlike intending or believing, “claiming” and “presenting” are expressible as observable behaviors.

This behavioral approach thus promises to circumvent the epistemic problems facing psychological approaches. What someone explicitly claims an argument shows can usually, or at least often, be determined rather unproblematically. For example, if someone declares “The following argument is a deductive argument, that is, an argument whose premises definitely establish its conclusion,” then, according to the behavioral approach being considered here, that declaration would suffice to judge the argument in question to be a deductive argument. Likewise, if someone insists “The following argument is an inductive argument, that is, an argument such that if its premises are true, the conclusion is, at best, probably true as well,” that declaration would suffice to conclude that the argument is inductive. Consequently, some of the problems associated with psychological proposals fall by the wayside. Initially, therefore, this approach looks promising.

The most obvious problem with this approach is that few arguments come equipped with a statement explicitly declaring what sort of argument it is thought to be. As Govier (1987) sardonically notes, “Few arguers are so considerate as to give us a clear indication as to whether they are claiming absolute conclusiveness in the technical sense in which logicians understand it.” This leaves plenty of room for interpretation and speculation concerning the vast majority of arguments, thereby negating the chief hoped-for advantage of focusing on behaviors rather than on psychological states.

Alas, other problems loom as well. Having already considered some of the troubling agent-relative consequences of adopting a purely psychological account, it will be easy to anticipate that behavioral approaches, while avoiding some of the psychological approach’s epistemic problems, nonetheless will inherit many of the latter’s agent-relativistic problems in virtually identical form.

First, what is ostensibly the very same argument (that is, consisting of the same sequence of words) in this view may be both a deductive and an inductive argument when advanced by individuals making different claims about what the argument purports to show, regardless of how unreasonable those claims appear to be on other grounds. For example, the following argument (a paradigmatic instance of the modus ponens argument form) would be a deductive argument if person A claims that, or otherwise behaves as if, the premises definitely establish the conclusion:

If P, then Q.
P.
Therefore, Q.

(The capital letters exhibited in this argument are to be understood as variables that can be replaced with declarative sentences, statements, or propositions, namely, items that are true or false. The investigation of logical forms that involve whole sentences is called Propositional Logic.)

However, by the same token, the foregoing argument equally would be an inductive argument if person B claims (even insincerely so, since psychological factors are by definition irrelevant under this view) that its premises provide only less than conclusive support for its conclusion.

Likewise, the following argument would be an inductive argument if person A claims that its premise provides less than conclusive support for its conclusion:

A random sample of voters in Los Angeles County supports a new leash law for pet turtles; so, the law will probably pass by a very wide margin.

However, it would also be a deductive argument if person B claims that its premise definitely establishes the truth of its conclusion. On a behavioral approach, then, whether an argument is deductive or inductive is entirely relative to individuals’ claims about it, or to some other behavior. Indeed, this need not involve different individuals at all. An argument would be both a deductive and an inductive argument if the same individual makes contrary claims about it, say, at different times.

If one finds these consequences irksome, one could opt to individuate arguments on the basis of claims about them. So, two individuals might each claim that “Dom Pérignon is a champagne; so, it is made in France.” But if person A claims that the premise of this argument definitely establishes its conclusion, whereas person B claims that the premise merely makes its conclusion probable, there isn’t just one argument about Dom Pérignon being considered, but two: one deductive, the other inductive, each one corresponding to one of the two different claims. There is no need to rehearse the by-now familiar worries concerning these issues, given that these issues are nearly identical to the various ones discussed with regard to the aforementioned psychological approaches.

A proponent of any sort of behavioral approach might bite the bullet and accept all of the foregoing consequences. Since no alternative unproblematic account of the deduction-induction distinction has been presented thus far, such consequences cannot show that a behavioral approach is simply wrong. Likewise, the relativism inherent in this approach is not by itself an objection. Perhaps the distinction between deductive and inductive arguments is relative to the claims made about them. However, this approach is incompatible with the common belief that an argument is either deductive or inductive, but never both. This latter belief would have to be jettisoned if a behavioral view were to be adopted.

4. Arguments that “Purport”

Both the psychological and behavioral approaches take some aspect of an agent (various mental states or behaviors, respectively) to be the decisive factor distinguishing deductive from inductive arguments. An alternative to these approaches would be to take some feature of the arguments themselves to be the crucial consideration instead. One such proposal states that if an argument purports to definitely establish its conclusion, it is a deductive argument, whereas if an argument purports only to provide good reasons in support of its conclusion, it is an inductive argument (Black 1967). Another way to express this view involves saying that an argument that aims at being logically valid is deductive, whereas an argument that aims merely at making its conclusion probable is an inductive argument (White 1989; Perry and Bratman 1999; Harrell 2016). The primary attraction of these “purporting” or “aiming” approaches is that they promise to sidestep the thorny problems with the psychological and behavioral approaches detailed above by focusing on a feature of arguments themselves rather than on the persons advancing them. However, they generate some puzzles of their own that are worth considering.

The puzzles at issue all concern the notion of an argument “purporting” (or “aiming”) to do something. One might argue that “purporting” is something that only intentional agents can do, either directly or indirectly. Skyrms (1975) makes this criticism with regard to arguments that are said to intend a conclusion with a certain degree of support. Someone, being the intentional agent they are, may purport to be telling the truth, or may purport to have more formal authority than they really possess, to give a couple of examples. The products of such intentional agents (sentences, behaviors, and the like) may be said to purport to do something, but what they purport still depends on what some intentional agent purports. Consequently, this “purporting” approach may collapse into a psychological or behavioral approach.

Suppose, however, that one takes arguments themselves to be the sorts of things that can purport to support their conclusions either conclusively or with strong probability. How does one distinguish the former type of argument from the latter, especially in cases in which it is not clear what the argument itself purports to show? Recall the example used previously: “Dom Pérignon is a champagne; so, it is made in France.” How strongly does this argument purport to support its conclusion? As already seen, this argument could be interpreted as purporting to show that the conclusion is logically entailed by the premise, since, by definition, “champagne” is a type of sparkling wine produced only in France. On the other hand, the argument could also be interpreted as purporting to show only that Dom Pérignon is probably made in France, since so much wine is produced in France. How does one know what an argument really purports?

One might attempt to answer this question by suggesting that an argument’s purport is conveyed by certain indicator words. Words like “necessarily” may signal that the conclusion purportedly follows logically from the premises, whereas words like “probably” may signal that the conclusion is purportedly only made probable by the premises. However, consider the following argument: “The economy will probably improve this year; so, necessarily, the economy will improve this year.” The word “probably” could be taken to indicate that this purports to be an inductive argument, while the word “necessarily” could be taken to signal that it purports to be a deductive argument. So, which is it? One cannot tell from the indicator words alone. Granted, this is a very strange argument, but that is the point: setting aside questions of its validity or soundness, highlighting indicator words does not make clear what the argument purports. Indicator words, then, may not always be a helpful guide. To make matters worse, specifying from the outset what an argument purports to show introduces an element of interpretation that is at odds with the main selling point of this approach in the first place, namely, that distinguishing deductive and inductive arguments depends solely on objective features of arguments themselves, rather than on agents’ intentions or interpretations.

5. Evidential Completeness

Another proposal for distinguishing deductive from inductive arguments with reference to features of arguments themselves focuses on evidential completeness. One might be told, for example, that an inductive argument is one that can be affected by acquiring new premises (evidence), whereas a deductive argument cannot be. Or, one might be told that whereas the premises in a deductive argument “stand alone” to sufficiently support its conclusion, all inductive arguments have “missing pieces of evidence” (Teays 1996). This evidential completeness approach is distinct from the psychological approaches considered above, given that an argument could be affected (that is, strengthened or weakened) by acquiring new premises regardless of anyone’s intentions or beliefs about the argument under consideration. It is also distinct from the behavioral views discussed above, given that an argument could be affected by acquiring new premises without anyone claiming or presenting anything about it. Finally, it is distinct from the “purporting” view, too, since whether an argument can be affected by acquiring additional premises has no evident connection with what an argument purports to show.

How well does such an evidential completeness approach work to categorically distinguish deductive and inductive arguments? Once again, examination of an example may help to shed light on some of the implications of this approach. Consider the following argument:

All men are mortal.
Therefore, Socrates is mortal.

On the evidential completeness approach, this cannot be a deductive argument because it can be affected by adding a new premise, namely “Socrates is a man.” The addition of this premise makes the argument valid, a characteristic that only deductive arguments can boast. On the other hand, were one to acquire the premise “Socrates is a god,” this also would greatly affect the argument, specifically by weakening it. At least in this case, adding a premise makes a difference. Without the inclusion of the “Socrates is a man” premise, it would be considered an inductive argument. With the “Socrates is a man” premise, the argument is deductive. As such, then, the evidential completeness approach looks promising.

However, it is worth noticing that to say that a deductive argument is one that cannot be affected (that is, it cannot be strengthened or weakened) by acquiring additional evidence or premises, whereas an inductive argument is one that can be affected by additional evidence or premises, is to already begin with an evaluation of the argument in question, only then to proceed to categorize it as deductive or inductive. “Strengthening” and “weakening” are evaluative assessments. This is to say that, with the evidential completeness approach being considered here, the categorization follows rather than precedes argument analysis and evaluation. This is precisely the opposite of the traditional claim that categorizing an argument as deductive or inductive must precede its analysis and evaluation. If categorization follows rather than precedes evaluation, one might wonder what actual work the categorization is doing. Be that as it may, perhaps in addition to such concerns, there is something to be said with regard to the idea that deductive and inductive arguments may differ in the way that their premises relate to their conclusions. That is an idea that deserves to be examined more closely.

6. Logical Necessity vs. Probability

Govier (1987) observes that “Most logic texts state that deductive arguments are those that ‘involve the claim’ that the truth of the premises renders the falsity of the conclusion impossible, whereas inductive arguments ‘involve’ the lesser claim that the truth of the premises renders the falsity of the conclusion unlikely, or improbable.” Setting aside the “involve the claim” clause (which Govier rightly puts in scare quotes), what is significant about this observation is how deductive and inductive arguments are said to differ in the way in which their premises are related to their conclusions.

Anyone acquainted with introductory logic texts will find quite familiar many of the following characterizations, one of them being the idea of “necessity.” For example, McInerny (2012) states that “a deductive argument is one whose conclusion always follows necessarily from the premises.” An inductive argument, by contrast, is one whose conclusion is merely made probable by the premises. Stated differently, “A deductive argument is one that would be justified by claiming that if the premises are true, they necessarily establish the truth of the conclusion” (Churchill 1987). Similarly, “deductive arguments … are arguments whose premises, if true, guarantee the truth of the conclusion” (Bowell and Kemp 2015). Or, one may be informed that in a valid deductive argument, anyone who accepts the premises is logically bound to accept the conclusion, whereas inductive arguments are never such that one is logically bound to accept the conclusion, even if one entirely accepts the premises (Solomon 1993). Furthermore, one might be told that a valid deductive argument is one in which it is impossible for the conclusion to be false given its true premises, whereas that is possible for an inductive argument.

Neidorf (1967) says that in a valid deductive argument, the conclusion certainly follows from the premises, whereas in an inductive argument, it probably does. Likewise, Salmon (1963) explains that in a deductive argument, if all the premises are true, the conclusion must be true, whereas in an inductive argument, if all the premises are true, the conclusion is only probably true. In a later edition of the same work, he says that “We may summarize by saying that the inductive argument expands upon the content of the premises by sacrificing necessity, whereas the deductive argument achieves necessity by sacrificing any expansion of content” (Salmon 1984).

Another popular approach along the same lines is to say that “the conclusion of a deductively valid argument is already ‘contained’ in the premises,” whereas inductive arguments have conclusions that “go beyond what is contained in their premises” (Hausman, Boardman, and Howard 2021). Likewise, one might be informed that “In a deductive argument, the … conclusion makes explicit a bit of information already implicit in the premises … Deductive inference involves the rearranging of information.” By contrast, “The conclusion of an inductive argument ‘goes beyond’ the premises” (Churchill 1986). A similar idea is expressed by saying that whereas deductive arguments are “demonstrative,” inductive arguments “outrun” their premises (Rescher 1976). The image one is left with in such presentations is that in deductive arguments, the conclusion is “hidden in” the premises, waiting there to be “squeezed” out of them, whereas the conclusion of an inductive argument has to be supplied from some other source. In other words, deductive arguments, in this view, are explicative, whereas inductive arguments are ampliative. These are all interesting suggestions, but their import may not yet be clear. Such import must now be made explicit.

7. The Question of Validity

Readers may have noticed in the foregoing discussion of such “necessitarian” characterizations of deductive and inductive arguments that whereas some authors identify deductive arguments as those whose premises necessitate their conclusions, others are careful to limit that characterization to valid deductive arguments. After all, it is only in valid deductive arguments that the conclusion follows with logical necessity from the premises. A different way to put it is that only in valid deductive arguments is the truth of the conclusion guaranteed by the truth of the premises; or, to use yet another characterization, only in valid deductive arguments do those who accept the premises find themselves logically bound to accept the conclusion. One could say that it is impossible for the conclusion to be false given that the premises are true, or that the conclusion is already contained in the premises (that is, the premises are necessarily truth-preserving). Thus, strictly speaking, these various necessitarian proposals apply only to a distinction between valid deductive arguments and inductive arguments.

Some authors appear to embrace such a conclusion. McIntyre (2019) writes the following:

Deductive arguments are and always will be valid because the truth of the premises is sufficient to guarantee the truth of the conclusion; if the premises are true, the conclusion will be also. This is to say that the truth of the conclusion cannot contain any information that is not already contained in the premises.

By contrast, he mentions that “With inductive arguments, the conclusion contains information that goes beyond what is contained in the premises.” Such a stance might well be thought to be no problem at all. After all, if an argument is valid, it is necessarily deductive; if it isn’t valid, then it is necessarily inductive. The notion of validity, therefore, appears to neatly sort arguments into either of the two categorically different argument types – deductive or inductive. Validity, then, may be the answer to the problems thus far mentioned.

There is, however, a cost to this tidy solution. Many philosophers want to say not only that all valid arguments are deductive, but also that not all deductive arguments are valid, and that whether a deductive argument is valid or invalid depends on its logical form. In other words, they want to leave open the possibility of there being invalid deductive arguments. The psychological approaches already considered do leave open this possibility, since they distinguish deductive and inductive arguments in relation to an arguer’s intentions and beliefs, rather than in relation to features of arguments themselves. Notice, however, that on the necessitarian proposals now being considered, there can be no invalid deductive arguments. “Deduction,” in this account, turns out to be a success term. There are no bad deductive arguments, at least so far as logical form is concerned (soundness being an entirely different matter). Consequently, if one adopts one of these necessitarian accounts, claims like the following must be judged to be simply incoherent: “A bad, or invalid, deductive argument is one whose form or structure is such that instances of it do, on occasion, proceed from true premises to a false conclusion” (Bergmann, Moor, and Nelson 1998). If deductive arguments are identical with valid arguments, then an “invalid deductive argument” is simply impossible: there cannot be any such type of argument. Salmon (1984) makes this point explicit, and even embraces it. Remarkably, he also extends automatic success to all bona fide inductive arguments, telling readers that “strictly speaking, there are no incorrect deductive or inductive arguments; there are valid deductions, correct inductions, and assorted fallacious arguments.” Essentially, therefore, one has a taxonomy of good and bad arguments.

Pointing out these consequences does not show that the necessitarian approach is wrong, however. One might simply accept that all deductive arguments are valid, and that all inductive arguments are strong, because “to be valid” and “to be strong” are just what it means to be a deductive or an inductive argument, respectively. One must then classify bad arguments as neither deductive nor inductive. An even more radical alternative would be to deny that bad “arguments” are arguments at all.

Still, to see why one might find these consequences problematic, consider the following argument:

If P, then Q.
Q.
Therefore, P.

This argument form is known as “affirming the consequent.” It is identified in introductory logic texts as a logical fallacy. In colloquial terms, someone may refer to a widely-accepted but false belief as a “fallacy.” In logic, however, a fallacy is not a mistaken belief. Rather, it is a mistaken form of inference. Arguments can fail as such in at least two distinct ways: their premises can be false (or unclear, incoherent, and so on), and the connection between the premises and conclusion can be defective. In logic, a fallacy is a failure of the latter sort. Introductory logic texts usually classify fallacies as either “formal” or “informal.” An ad hominem (Latin for “against the person”) attack is a classic informal fallacy. By contrast, “affirming the consequent,” such as the example above, is classified as a formal fallacy.

How are these considerations relevant to the deductive-inductive argument distinction under consideration? On the proposal being considered, the argument above in which “affirming the consequent” is exhibited cannot be a deductive argument, indeed not even a bad one, since it is manifestly invalid, given that all deductive arguments are necessarily valid. Rather, since the premises do not necessitate the conclusion, it must be an inductive argument. This is the case unless one follows Salmon (1984) in saying that it is neither deductive nor inductive but, being an instance of affirming the consequent, it is simply fallacious.

Perhaps it is easy to accept such a consequence. Necessitarian proposals are not out of consideration yet, however. Part of the appeal of such proposals is that they seem to provide philosophers with an understanding of how premises and conclusions are related to one another in valid deductive arguments. Is this a useful proposal after all?

Consider the idea that in a valid deductive argument, the conclusion is already contained in the premises. What might this mean? Certainly, all the words that appear in the conclusion of a valid argument need not appear in its premises. Rather, what is supposed to be contained in the premises of a valid argument is the claim expressed in its conclusion. This is the case given that in a valid argument the premises logically entail the conclusion. So, it can certainly be said that the claim expressed in the conclusion of a valid argument is already contained in the premises of the argument, since the premises entail the conclusion. Has there thus been any progress made in understanding validity?

To answer that question, consider the following six arguments, all of which are logically valid:

P. Therefore, P.

P. Therefore, either Q or not Q.

P and not P. Therefore, Q.

P. Therefore, P or Q.

P. Therefore, if Q then Q.

P. Therefore, if not P, then Q.

In any of these cases (except the first), is it at all obvious how the conclusion is contained in the premise? Insofar as the locution “contained in” is supposed to convey an understanding of validity, such accounts fall short of such an explicative ambition. This calls into question the aptness of the “contained in” metaphor for explaining the relationship between premises and conclusions regarding valid arguments.

8. Formalization and Logical Rules to the Rescue?

In the previous section, it was assumed that some arguments can be determined to be logically valid simply in virtue of their abstract form. After all, the “P”s and “Q”s in the foregoing arguments are just variables or placeholders. It is the logical form of those arguments that determines whether they are valid or invalid. Rendering arguments in symbolic form helps to reveal their logical structure. Might not this insight provide a clue as to how one might categorically distinguish deductive and inductive arguments? Perhaps it is an argument’s capacity or incapacity for being rendered in symbolic form that distinguishes an argument as deductive or inductive, respectively.

To assess this idea, consider the following argument:

If today is Tuesday, we’ll be having tacos for lunch.
Today is Tuesday.
So, we’ll be having tacos for lunch.

This argument is an instance of the valid argument form modus ponens, which can be expressed symbolically as:

P → Q.
P.
∴ Q.

Any argument having this formal structure is a valid deductive argument and can automatically be seen as such. Significantly, according to the proposal that deductive but not inductive arguments can be rendered in symbolic form, a deductive argument need not instantiate a valid argument form. Recall the fallacious argument form known as “affirming the consequent”:

If P, then Q.
Q.
Therefore, P.

It, too, can be rendered in purely symbolic notation:

P → Q.
Q.
∴ P.

Consequently, this approach would permit one to say that deductive arguments may be valid or invalid, just as some philosophers would wish. It might be thought, on the other hand, that inductive arguments do not lend themselves to this sort of formalization. They are just too polymorphic to be represented in purely formal notation.
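The mechanical character of such formal checks can be made vivid with a short sketch. The following code is an illustration not drawn from the sources discussed here (the function name and representation are invented): it tests a two-variable propositional form for validity by brute-force enumeration of truth values, confirming that modus ponens is valid while affirming the consequent is not.

```python
from itertools import product

def valid(premises, conclusion):
    """A propositional form is valid iff no assignment of truth values
    makes every premise true while the conclusion is false."""
    return all(conclusion(p, q)
               for p, q in product([True, False], repeat=2)
               if all(prem(p, q) for prem in premises))

# Material conditional: "if a then b" is false only when a is true and b is false.
implies = lambda a, b: (not a) or b

# Modus ponens: P -> Q, P, therefore Q
print(valid([implies, lambda p, q: p], lambda p, q: q))   # True

# Affirming the consequent: P -> Q, Q, therefore P
print(valid([implies, lambda p, q: q], lambda p, q: p))   # False
```

Of course, such a check presupposes that an argument has already been rendered in propositional form; it evaluates the form, but it does not itself decide whether the argument is deductive or inductive.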

Note, however, that the success of this proposal depends on all inductive arguments being incapable of being represented formally. Unfortunately for this proposal, however, all arguments, both deductive and inductive, are capable of being rendered in formal notation. For example, consider the following argument:

We usually have tacos for lunch on Tuesdays.
Today is Tuesday.
So, we’re probably having tacos for lunch.

In other words, given that today is Tuesday, there is a better than even chance that tacos will be had for lunch. This might be rendered formally as:

P(A/B) > 0.5

It must be emphasized that the point here is not that this is the only or even the best way to render the argument in question in symbolic form. Rather, the point is that inductive arguments, no less than deductive arguments, can be rendered symbolically, or, at the very least, the burden of proof rests on deniers of this claim. But, if so, then it seems that the capacity for symbolic formalization cannot categorically distinguish deductive from inductive arguments.
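To illustrate how such a probabilistic rendering might be cashed out, one could estimate the conditional probability from a frequency record and test the better-than-even threshold. The data in this sketch are invented purely for illustration:

```python
# Toy data (invented for illustration): lunches on seven past Tuesdays.
past_tuesdays = ["tacos", "tacos", "pizza", "tacos", "tacos", "soup", "tacos"]

# Estimate P(tacos | Tuesday) as a relative frequency.
p_tacos_given_tuesday = past_tuesdays.count("tacos") / len(past_tuesdays)

# The rendering P(A/B) > 0.5: the conclusion counts as "probable"
# just in case the conditional probability exceeds one half.
print(round(p_tacos_given_tuesday, 3))   # 0.714
print(p_tacos_given_tuesday > 0.5)       # True
```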

Another approach would be to say that whereas deductive arguments involve reasoning from one statement to another by means of logical rules, inductive arguments defy such rigid characterization (Solomon 1993). In this view, identifying a logical rule governing an argument would be sufficient to show that the argument is deductive. Failure to identify such a rule governing an argument, however, would not be sufficient to demonstrate that the argument is not deductive, since logical rules may nonetheless be operative but remain unrecognized.

The “reasoning” clause in this proposal is also worth reflecting upon. Reasoning is something that some rational agents do on some occasions. Strictly speaking, arguments, consisting of sentences lacking cognition, do not reason (recall that earlier a similar point was considered regarding the idea of arguments purporting something). Consequently, the “reasoning” clause is ambiguous, since it may mean either that: (a) there is a logical rule that governs (that is, justifies, warrants, or the like) the inference from the premise to the conclusion; or (b) some cognitional agent either explicitly or implicitly uses a logical rule to reason from one statement (or a set of statements) to another.

If the former, more generous interpretation is assumed, it is easy to see how this suggestion might work with respect to deductive arguments. Consider the following argument:

If today is Tuesday, then the taco truck is here.
The taco truck is not here.
Therefore, today is not Tuesday.

This argument instantiates the logical rule modus tollens:

If P, then Q. (P → Q)
Not Q. (~Q)
Therefore, not P. (∴ ~P)

Perhaps all deductive arguments explicitly or implicitly rely upon logical rules. However, for this proposal to categorically distinguish deductive from inductive arguments, it must be the case both that all deductive arguments embody logical rules, and that no inductive arguments do.

Is this true? It is not entirely clear. A good case can be made that all valid deductive arguments embody logical rules (such as modus ponens or modus tollens). However, if one wants to include some invalid arguments within the set of all deductive arguments, then it is hard to see what logical rules could underwrite invalid argument types such as affirming the consequent or denying the antecedent. It would seem bizarre to say that in inferring “P” from “If P, then Q” and “Q” one relied upon the logical rule “affirming the consequent.” That is not a logical rule. It is a classic logical fallacy.

Likewise, consider the following argument that many would consider to be an inductive argument:

Nearly all individuals polled in a random sample of registered voters contacted one week before the upcoming election indicated that they would vote to re-elect Senator Blowhard. Therefore, Senator Blowhard will be re-elected.

There may be any number of rules implicit in the foregoing inference. For example, the rule implicit in this argument might be something like this:

Random sampling of a relevant population’s voting preferences one week before an election provides good grounds for predicting that election’s results.

This is no doubt some sort of rule, even if it does not explicitly follow the more clear-cut logical rules thus far mentioned. Is the above the right sort of rule, however? Perhaps deductive arguments are those that involve reasoning from one statement to another by means of deductive rules. One could then stipulate what those deductive logical rules are, such that they exclude rules like the one implicit in the ostensibly inductive argument above. This would resolve the problem of distinguishing between deductive and inductive arguments, but at the cost of circularity (that is, by committing a logical fallacy).

If one objected that the inductive rule suggested above is not a formal rule, then a formal version of the rule could be devised. However, if that is right, then the current proposal stating that deductive arguments, but not inductive ones, involve reasoning from one statement to another by means of logical rules is false. Inductive arguments rely, or at least can rely, upon logical rules as well.

9. Other Even Less Promising Proposals

A perusal of introductory logic texts turns up a hodgepodge of other proposals for categorically distinguishing deductive and inductive arguments that, upon closer inspection, seem even less promising than the proposals surveyed thus far. One example will have to suffice.

Kreeft (2005) says that whereas deductive arguments begin with a “general” or “universal” premise and move to a less general conclusion, inductive arguments begin with “particular”, “specific”, or “individual” premises and move to a more general conclusion.

In light of this proposal, consider again the following argument:

All men are mortal.
Socrates is a man.
Therefore, Socrates is mortal.

As mentioned already, this argument is the classic example used in introductory logic texts to illustrate a deductive argument. It moves from a general (or universal) premise (exhibited by the phrase “all men”) to a specific (or particular) conclusion (exhibited by referring to “Socrates”). By contrast, consider the following argument:

Each spider so far examined has had eight legs.
Therefore, all spiders have eight legs.

This argument moves from specific instances (demarcated by the phrase “each spider so far examined”) to a general conclusion (as seen by the phrase “all spiders”). Therefore, on this proposal, this argument would be inductive.

So far, so good. However, this approach seems much too crude for drawing a categorical distinction between deductive and inductive arguments. Consider the following argument:

All As are Bs.
All Bs are Cs.
Therefore, all As are Cs.

On this account, this would be neither deductive nor inductive, since it involves only universal statements. Likewise, consider the following as well:

Each spider so far examined has had eight legs.
Therefore, likewise, the next spider examined will have eight legs.

According to Kreeft’s proposal, this would be neither a deductive nor an inductive argument, since it moves from a number of particulars to yet another particular. What kind of argument, then, is it? Despite the ancient pedigree of Kreeft’s proposal (he ultimately draws upon both Platonic and Aristotelian texts), and the fact that one still finds it in some introductory logic texts, it faces such prima facie plausible counterexamples that it is hard to see how it could be an acceptable, much less the best, view for categorically distinguishing between deductive and inductive arguments.

10. An Evaluative Approach

There have been many attempts to distinguish deductive from inductive arguments. Some approaches focus on the psychological states (such as the intentions, beliefs, or doubts) of those advancing an argument. Others focus on the objective behaviors of arguers, attending to what individuals claim about, or how they present, an argument. Still others focus on features of arguments themselves, such as what an argument purports, its evidential completeness, its capacity for formalization, or the nature of the logical bond between its premises and conclusion. All of these proposals face problems of one sort or another. The fact that there are so many radically different views about what distinguishes deductive from inductive arguments is itself noteworthy. This fact might not be evident from the account given in any specific text, but it emerges clearly when a range of different proposals and approaches is examined, as has been done in this article. The diversity of views on this issue has so far attracted remarkably little attention. Some authors (such as Moore and Parker 2004) acknowledge that the best way of distinguishing deductive from inductive arguments is “controversial.” Yet there seems to be little actual controversy about it. Instead, matters persist in a state of largely unacknowledged chaos.

Rather than leave matters in this state of confusion, one final approach must be considered. Instead of proposing yet another account of how deductive and inductive arguments differ, this proposal dispenses entirely with the categorical approach of the proposals canvassed above.

Without necessarily acknowledging the difficulties explored above or citing them as a rationale for taking a fundamentally different approach, some authors nonetheless decline to define “deductive” and “inductive” (or more generally “non-deductive”) arguments at all, and instead adopt an evaluative approach that focuses on deductive and inductive standards for evaluating arguments (see Skyrms 1975; Bergmann, Moor, and Nelson 1998). When presented with any argument, one can ask: “Does the argument prove its conclusion, or does it only render it probable, or does it do neither?” One can then proceed to evaluate the argument by first asking whether the argument is valid, that is, whether the truth of the conclusion is entailed by the truth of the premises. If the answer to this initial question is affirmative, one can then proceed to determine whether the argument is sound by assessing the actual truth of the premises. If the argument is determined to be sound, then its conclusion is ceteris paribus worth believing. If the argument is determined to be invalid, one can then proceed to ask whether the truth of the premises would make the conclusion probable. If it would, one can judge the argument to be strong. If one then determines or judges that the argument’s premises are probably true, the argument can be declared cogent. Otherwise, it ought to be declared not-cogent (or the like). In this latter case, one ought not to believe the argument’s conclusion on the strength of its premises.
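The evaluative procedure just described is, in effect, a small decision tree. The following minimal sketch in Python illustrates its flow; each Boolean parameter stands in for a substantive judgment the evaluator must make about the argument, and the function name and verdict labels are merely illustrative, not standard logical terminology:

```python
def evaluate_argument(is_valid, premises_true, is_strong, premises_probable):
    """Evaluate an argument by deductive standards first, then inductive
    standards, without ever classifying the argument itself as
    'deductive' or 'inductive'."""
    if is_valid:
        # The truth of the premises entails the truth of the conclusion.
        return "sound" if premises_true else "valid but unsound"
    if is_strong:
        # Invalid, but the truth of the premises would render the
        # conclusion probable.
        return "cogent" if premises_probable else "strong but not cogent"
    # Neither valid nor strong: the premises give no satisfactory
    # grounds for the conclusion.
    return "neither valid nor strong"

# The classic Socrates syllogism: valid, with (let us grant) true premises.
print(evaluate_argument(True, True, False, False))   # sound
# The spider generalization: invalid but strong, with probable premises.
print(evaluate_argument(False, False, True, True))   # cogent
```

Only in the “sound” and “cogent” cases does the procedure license believing the conclusion on the strength of the premises; in every other case it does not, and at no step is a deductive/inductive classification consulted.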

What is noteworthy about this procedure is that at no point was it necessary to determine whether any argument is “deductive,” “inductive,” or more generally “non-deductive.” Such classificatory concepts played no role in executing the steps of argument evaluation. Yet the whole point of examining an argument in the first place is nevertheless achieved with this approach. That is, the effort to determine whether an argument provides satisfactory grounds for accepting its conclusion is carried out successfully. In order to discover what one can learn from an argument, the argument must be treated as charitably as possible. By first evaluating an argument in terms of validity and soundness, and, if necessary, then in terms of strength and cogency, one gives each argument its best shot at establishing its conclusion, either with certainty or at least with a degree of probability. One will then be in a better position to determine whether the argument’s conclusion should be believed on the basis of its premises.

This is of course not meant to minimize the difficulties associated with evaluating arguments. Evaluating arguments can be quite difficult. However, insisting that one first determine whether an argument is “deductive” or “inductive” before proceeding to evaluate it inserts a completely unnecessary step into the process of evaluation, one that does no useful work of its own. Moreover, a focus on argument evaluation rather than on argument classification promises to avoid the various problems associated with the categorical approaches discussed in this article. There is no need to speculate about the possibly unknowable intentions, beliefs, or doubts of someone advancing an argument. There is no need to guess at what an argument purports to show, or to ponder whether it can be formalized or represented by logical rules, in order to determine whether one ought to believe the argument’s conclusion on the basis of its premises. In short, one does not need a categorical distinction between deductive and inductive arguments at all in order to carry out argument evaluation successfully.

This article is an attempt to practice what it preaches. Although there is much discussion in this article about deductive and inductive arguments, and a great deal of argumentation, there was no need to set out a categorical distinction between deductive and inductive arguments in order to critically evaluate a range of claims, positions, and arguments about the purported distinction between the two types of argument. Hence, although such a distinction is central to the way in which argumentation is often presented, it is unclear what actual work it does for argument evaluation, and thus whether it must be retained. Perhaps it is time to give the deductive-inductive argument distinction its walking papers.

11. References and Further Reading

  • Aristotle. The Basic Works of Aristotle. New York: Random House, 1941.
  • Bacon, Francis. Francis Bacon: The Major Works. Oxford: Oxford University Press, 2002.
  • Barry, Vincent E. The Critical Edge: Critical Thinking for Reading and Writing. Orlando, FL: Holt, Rinehart and Winston, Inc., 1992.
  • Bergmann, Merrie, James Moor and Jack Nelson. The Logic Book. 3rd ed. New York: McGraw-Hill, 1998.
  • Black, Max. “Induction.” The Encyclopedia of Philosophy. Ed. Paul Edwards. Vol. 4. New York: Macmillan Publishing Co., Inc. & The Free Press, 1967. 169-181.
  • Bowell, Tracy and Gary Kemp. Critical Thinking: A Concise Guide. 4th ed. London: Routledge, 2015.
  • Churchill, Robert Paul. Becoming Logical: An Introduction to Logic. New York: St. Martin’s Press, 1986.
  • Copi, Irving. Introduction to Logic. 5th ed. New York: Macmillan, 1978.
  • Descartes, René. A Discourse on the Method. Oxford: Oxford University Press, 2006.
  • Einstein, Albert. “Induction and Deduction in Physics.” Einstein, Albert. The Collected Papers of Albert Einstein: The Berlin Years: Writings, 1918-1921. Trans. Alfred Engel. Vol. 7. Princeton: Princeton University Press, 2002. 108-109. <https://einsteinpapers.press.princeton.edu/vol7-trans/124>.
  • Engel, S. Morris. With Good Reason: An Introduction to Informal Fallacies. 5th ed. New York: St. Martin’s Press, 1994.
  • Govier, Trudy. Problems in Argument Analysis and Evaluation. Updated Edition. Windsor: Windsor Studies in Argumentation, 1987.
  • Haack, Susan. Philosophy of Logics. Cambridge: Cambridge University Press, 1978.
  • Harrell, Maralee. What is the Argument? An Introduction to Philosophical Argument and Analysis. Cambridge: The MIT Press, 2016.
  • Hausman, Alan, Frank Boardman, and Howard Kahane. Logic and Philosophy: A Modern Introduction. 13th ed. Indianapolis: Hackett Publishing, 2021.
  • Hurley, Patrick J. and Lori Watson. A Concise Introduction to Logic. 13th ed. Belmont: Cengage Learning, 2018.
  • Kreeft, Peter. Socratic Logic: A Logic Text Using Socratic Method, Platonic Questions, and Aristotelian Principles. 2nd ed. South Bend: St. Augustine’s Press, 2005.
  • McInerny, D. Q. An Introduction to Foundational Logic. Elmhurst Township: The Priestly Fraternity of St. Peter, 2012.
  • McIntyre, Lee. The Scientific Attitude: Defending Science from Denial, Fraud, and Pseudoscience. Cambridge: The MIT Press, 2019.
  • Moore, Brooke Noel and Richard Parker. Critical Thinking. 7th ed. New York: McGraw-Hill, 2004.
  • Neidorf, Robert. Deductive Forms: An Elementary Logic. New York: Harper and Row, 1967.
  • Olson, Robert G. Meaning and Argument. New York: Harcourt, Brace, and World, 1975.
  • Perry, John and Michael Bratman. Introduction to Philosophy: Classical and Contemporary Readings. 3rd ed. New York: Oxford University Press, 1999.
  • Rescher, Nicholas. Plausible Reasoning. Assen: Van Gorcum, 1976.
  • Salmon, Wesley. Logic. Englewood Cliffs: Prentice Hall, 1963.
  • Salmon, Wesley. Logic. 3rd ed. Englewood Cliffs: Prentice Hall, 1984.
  • Skyrms, Brian. Choice and Chance. 2nd ed. Encino: Dickenson, 1975.
  • Solomon, Robert C. Introducing Philosophy: A Text with Integrated Readings. 5th ed. Fort Worth: Harcourt Brace Jovanovich, 1993.
  • Teays, Wanda. Second Thoughts: Critical Thinking from a Multicultural Perspective. Mountain View: Mayfield Publishing Company, 1996.
  • Vaughn, Lewis. The Power of Critical Thinking: Effective Reasoning about Ordinary and Extraordinary Claims. 3rd ed. New York: Oxford University Press, 2010.
  • White, James E. Introduction to Philosophy. St. Paul: West Publishing Company, 1989.

Author Information:
Timothy Shanahan
Email: timothy.shanahan@lmu.edu
Loyola Marymount University
U. S. A.

Virtue Epistemology

Virtue epistemology is a collection of recent approaches to epistemology that give epistemic or intellectual virtue concepts an important and fundamental role. Virtue epistemologists can be divided into two groups, each accepting a different conception of what an intellectual virtue is.

Virtue reliabilists conceive of intellectual virtues as stable, reliable and truth-conducive cognitive faculties or powers and cite vision, introspection, memory, and the like as paradigm cases of intellectual virtue. These virtue epistemologists tend to focus on formulating virtue-based accounts of knowledge or justification. Virtue reliabilist accounts of knowledge and justification are versions of epistemological externalism. Consequently, whatever their strengths as versions of externalism, virtue reliabilist views are likely to prove unsatisfying to anyone with considerable internalist sympathies.

Virtue responsibilists conceive of intellectual virtues as good intellectual character traits, like attentiveness, fair-mindedness, open-mindedness, intellectual tenacity, and courage, traits that might be viewed as the traits of a responsible knower or inquirer. While some virtue responsibilists have also attempted to give virtue-based accounts of knowledge or justification, others have pursued less traditional projects, focusing on such issues as the nature and value of virtuous intellectual character as such, the relation between intellectual virtue and epistemic responsibility, and the relevance of intellectual virtue to the social and cross-temporal aspects of the intellectual life.

There is however a sense in which the very distinction between virtue reliabilism and virtue responsibilism is sketchier than it initially appears. Indeed, the most plausible version of virtue reliabilism will incorporate many of the character traits of interest to virtue responsibilists into its repertoire of virtues and in doing so will go significant lengths toward bridging the gap between virtue reliabilism and virtue responsibilism.

Table of Contents

  1. Introduction to Virtue Epistemology
  2. Virtue Reliabilism
    1. Key Figures
    2. Prospects for Virtue Reliabilism
  3. Virtue Responsibilism
    1. Key Figures
    2. Prospects for Virtue Responsibilism
  4. The Reliabilist/Responsibilist Divide
  5. References and Further Reading

1. Introduction to Virtue Epistemology

Virtue epistemology is a collection of recent approaches to epistemology that give epistemic or intellectual virtue concepts an important and fundamental role.

The advent of virtue epistemology was at least partly inspired by a fairly recent renewal of interest in virtue concepts among moral philosophers (see, for example, Crisp and Slote 1997). Noting this influence from ethics, Ernest Sosa introduced the notion of an intellectual virtue into contemporary epistemological discussion in a 1980 paper, “The Raft and the Pyramid.” Sosa argued in this paper that an appeal to intellectual virtue could resolve the conflict between foundationalists and coherentists over the structure of epistemic justification. Since the publication of Sosa’s paper, several epistemologists have turned to intellectual virtue concepts to address a wide range of issues, from the Gettier problem to the internalism/externalism debate to skepticism.

There are substantial and complicated differences between the various virtue epistemological views; as a result, relatively little can be said by way of generalization about the central tenets of virtue epistemology. These differences are attributable mainly to two competing conceptions of the nature of an intellectual virtue. Sosa and certain other virtue epistemologists tend to define an intellectual virtue as roughly any stable and reliable or truth-conducive property of a person. They cite as paradigm instances of intellectual virtue certain cognitive faculties or powers like vision, memory, and introspection, since such faculties ordinarily are especially helpful for getting to the truth. Epistemologists with this conception of intellectual virtue have mainly been concerned with constructing virtue-based analyses of knowledge and/or justification. Several have argued, for instance, that knowledge should be understood roughly as true belief arising from an exercise of intellectual virtue. Because of their close resemblance to standard reliabilist epistemologies, these views are referred to as instances of “virtue reliabilism.”

A second group of virtue epistemologists conceives of intellectual virtues, not as cognitive faculties or abilities like memory and vision, but rather as good intellectual character traits, traits like inquisitiveness, fair-mindedness, open-mindedness, intellectual carefulness, thoroughness, and tenacity. These character-based versions of virtue epistemology are referred to as instances of “virtue responsibilism,” since the traits they regard as intellectual virtues might also be viewed as the traits of a responsible knower or inquirer. Some virtue responsibilists have adopted an approach similar to that of virtue reliabilists by giving virtue concepts a crucial role in an analysis of knowledge or justification. Linda Zagzebski, for instance, claims that knowledge is belief arising from what she calls “acts of intellectual virtue” (1996). Other virtue responsibilists like Lorraine Code (1987) have eschewed more traditional epistemological problems. Code argues that epistemology should be oriented on the notion of epistemic responsibility and that epistemic responsibility is the chief intellectual virtue; however, she makes no attempt to offer a definition of knowledge or justification based on these concepts. Her view instead gives priority to topics like the value of virtuous cognitive character as such, the social and moral dimensions of the intellectual life, and the role of agency in inquiry.

Virtue reliabilists and virtue responsibilists alike have claimed to have the more accurate view of intellectual virtue and hence of the general form that a virtue-based epistemology should take. And both have appealed to Aristotle, one of the first philosophers to employ the notion of an intellectual virtue, in support of their claims. Some virtue responsibilists (for example, Zagzebski 1996) have argued that the character traits of interest to them are the intellectual counterpart to what Aristotle and other moral philosophers have regarded as the moral virtues and that these traits are therefore properly regarded as intellectual virtues. In response, virtue reliabilists have pointed out that, whatever his conception of moral virtue, Aristotle apparently conceived of intellectual virtues more as truth-conducive cognitive powers or faculties than as good intellectual character traits. They have claimed furthermore that these powers, but not the responsibilist’s character traits, have an important role to play in an analysis of knowledge, and that consequently, the former are more reasonably regarded as intellectual virtues (Greco 2000).

It would be a mistake, however, to view either group of virtue epistemologists as necessarily having a weightier claim than the other to the concept of an intellectual virtue, for both are concerned with traits that are genuine and important intellectual excellences and therefore can reasonably be regarded as intellectual virtues. Virtue reliabilists are interested in cognitive qualities that are an effective means to epistemic values like truth and understanding. The traits of interest to virtue responsibilists are also a means to these values, since a person who is, say, reflective, fair-minded, perseverant, intellectually careful, and thorough ordinarily is more likely than one who lacks these qualities to believe what is true, to achieve an understanding of complex phenomena, and so forth. Moreover, these qualities are “personal excellences” in the sense that one is also a better person (albeit in a distinctively intellectual rather than straightforwardly moral way) as a result of possessing them, that is, as a result of being reflective, fair-minded, intellectually courageous, tenacious, and so forth. The latter is not true of cognitive faculties or abilities like vision or memory. These traits, while contributing importantly to one’s overall intellectual well-being, do not make their possessor a better person in any relevant sense. This is entirely consistent, however, with the more general point that virtue responsibilists and virtue reliabilists alike are concerned with genuine and important intellectual excellences both sets of which can reasonably be regarded as intellectual virtues. Virtue reliabilists are concerned with traits that are a critical means to intellectual well-being or “flourishing” and virtue responsibilists with traits that are both a means to and are partly constitutive of intellectual flourishing.

A firmer grasp of the field of virtue epistemology can be achieved by considering, for each branch of virtue epistemology, how some of its main proponents have conceived of the nature of an intellectual virtue and how they have employed virtue concepts in their theories. It will also be helpful to consider the apparent prospects of each kind of virtue epistemology.

2. Virtue Reliabilism

a. Key Figures

Since introducing the notion of an intellectual virtue to contemporary epistemology, Sosa has had more to say than any other virtue epistemologist about the intellectual virtues conceived as reliable cognitive faculties or abilities. Sosa characterizes an intellectual virtue, very generally, as “a quality bound to help maximize one’s surplus of truth over error” (1991: 225). Recognizing that any given quality is likely to be helpful for reaching the truth only with respect to a limited field of propositions and only when operating in a certain environment and under certain conditions, Sosa also offers the following more refined characterization: “One has an intellectual virtue or faculty relative to an environment E if and only if one has an inner nature I in virtue of which one would mostly attain the truth and avoid error in a certain field of propositions F, when in certain conditions C” (284). Sosa identifies reason, perception, introspection, and memory as among the qualities that most obviously satisfy these conditions.

Sosa’s initial appeal to intellectual virtue in “The Raft and the Pyramid” is aimed specifically at resolving the foundationalist/coherentist dispute over the structure of epistemic justification. (Sosa has since attempted to show that virtue concepts are useful for addressing other epistemological problems as well; the focus here, however, will be limited to his seminal discussion in the “The Raft and the Pyramid.”) According to Sosa, traditional formulations of both foundationalism and coherentism have fatal defects. The main problem with coherentism, he argues, is that it fails to give adequate epistemic weight to experience. The coherentist claims roughly that a belief is justified just in case it coheres with the rest of what one believes. But it is possible for a belief to satisfy this condition and yet be disconnected from or even to conflict with one’s experience. In such cases, the belief in question intuitively is unjustified, thereby indicating the inadequacy of the coherentist’s criterion for justification (1991: 184-85). Sosa also sees standard foundationalist accounts of justification as seriously flawed. The foundationalist holds that the justification of non-basic beliefs derives from that of basic or foundational beliefs and that the latter are justified on the basis of things like sensory experience, memory, and rational insight. According to Sosa, an adequate version of foundationalism must explain the apparent unity of the various foundationalist principles that connect the ultimate sources of justification with the beliefs they justify. But traditional versions of foundationalism, Sosa claims, seem utterly incapable of providing such an explanation, especially when the possibility of creatures with radically different perceptual or cognitive mechanisms than our own (and hence of radically different epistemic principles) is taken into account (187-89).

Sosa briefly sketches a model of epistemic justification that he says would provide the required kind of explanation. This model depicts justification as “stratified”: it attaches primary justification to intellectual virtues like sensory experience and memory and secondary justification to beliefs produced by these virtues. A belief is justified, according to the model, just in case it has its source in an intellectual virtue (189). Sosa’s proposed view of justification is, in effect, an externalist version of foundationalism, since a belief can have its source in an intellectual virtue and hence be justified without this fact’s being internally or subjectively accessible to the person who holds it. This model provides an explanation of the unity of foundationalist epistemic principles by incorporating the foundationalist sources of epistemic justification under the concept of an intellectual virtue and offering a unified account of why beliefs grounded in intellectual virtue are justified (namely, because they are likely to be true). If Sosa’s criticisms of traditional coherentist and foundationalist views together with his own positive proposal are plausible, virtue reliabilism apparently has the resources to deal effectively with one of the more challenging and longstanding problems in contemporary epistemology.

John Greco also gives the intellectual virtues conceived as reliable cognitive faculties or abilities a central epistemological role. Greco characterizes intellectual virtues generally as “broad cognitive abilities or powers” that are helpful for reaching the truth. He claims, more specifically, that intellectual virtues are “innate faculties or acquired habits that enable a person to arrive at truth and avoid error in some relevant field.” These include things like “perception, reliable memory, and various kinds of good reasoning” (2002: 287).

Greco offers an account of knowledge according to which one knows a given proposition just in case one believes the truth regarding that proposition because one believes out of an intellectual virtue (311). Greco breaks this definition down as follows. It requires, first, that one be subjectively justified in believing the relevant claim. According to Greco, one is subjectively justified in believing a given proposition just in case this belief is produced by dispositions that one manifests when one is motivated to believe what is true. Greco stipulates that an exercise of intellectual virtue entails the manifestation of such dispositions. Second, Greco’s definition of knowledge requires that one’s belief be objectively justified. This means that one’s belief must be produced by one or more of one’s intellectual virtues. Third, Greco’s definition requires that one believe the truth regarding the claim in question because one believes the claim out of one or more of one’s intellectual virtues. In other words, one’s being objectively justified must be a necessary and salient part of the explanation for why one believes the truth.

Greco discusses several alleged virtues of his account of knowledge. One of these is the reply it offers to the skeptic. According to one variety of skepticism, we do not and cannot have any non-question-begging reasons for thinking that any of our beliefs about the external world are true, for any such reasons inevitably depend for their force on some of the very beliefs in question (305-06). Greco replies by claiming that the skeptic’s reasoning presupposes a mistaken view of the relation between knowledge and epistemic grounds or reasons. The skeptic assumes that to know a given claim, one must be in possession of grounds or reasons which, via some inductive, deductive, or other logical or quasi-logical principle, provide one with a cogent reason for thinking that the claim is true or likely to be true. If Greco’s account of knowledge is correct, this mischaracterizes the conditions for knowledge. Greco’s account requires merely that an agent’s grounds be reliable, or rather, that an agent herself be reliable on account of a disposition to believe on reliable grounds. It follows that as long as a disposition to form beliefs about the external world on the basis of sensory experience of that world is reliable, knowledge of the external world is possible for a person who possesses this disposition. But since an agent can be so disposed and yet lack grounds for her belief that satisfy the skeptic’s more stringent demands, Greco can conclude that knowledge does not require the satisfaction of these demands (307).

b. Prospects for Virtue Reliabilism

The foregoing indicates some of the ways that virtue reliabilist accounts of knowledge and justification may, if headed in the right general direction, provide helpful ways of addressing some of the more challenging problems in epistemology. It remains, however, that one is likely to find these views plausible only to the extent that one is already convinced of a certain, not wholly uncontroversial position that undergirds and partly motivates them.

Virtue reliabilist accounts of knowledge and justification are versions of epistemological externalism: they deny that the factors grounding one’s justification must be cognitively accessible from one’s first-person or internal perspective. Consequently, whatever their strengths as versions of externalism, virtue reliabilist views are likely to prove unsatisfying to anyone with considerable internalist sympathies. Consider, for example, a version of internalism according to which one is justified in believing a given claim just in case one has an adequate reason for thinking that the claim is true. It is not difficult to see why, if this account of justification were correct, the virtue reliabilist views considered above would be less promising than they might initially appear.

Sosa, for instance, attempts to resolve the conflict between foundationalism and coherentism by offering an externalist version of foundationalism. But traditionally, the coherentist/foundationalist debate has been an in-house debate among internalists. Coherentists and foundationalists alike have generally agreed that to be justified in believing a given claim is to have a good reason for thinking that the claim is true. The disagreement has been over the logical structure of such a reason, with coherentists claiming that the structure should be characterized in terms of doxastic coherence relations and foundationalists that it should be characterized mainly in terms of relations between foundational beliefs and the beliefs they support. Sosa rejects this shared assumption. He claims that justification consists in a belief’s having its source in an intellectual virtue. But a belief can have its source in an intellectual virtue without one’s being aware of it and hence without one’s having any reason at all for thinking that the belief is true. Therefore, Sosa’s response to the coherentism/foundationalism debate is likely to strike traditional coherentists and foundationalists as seriously problematic.

(It is worth noting in passing that in later work [for example, 1991], Sosa claims that the kind of justification just described is sufficient, when combined with the other elements of knowledge, merely for “animal knowledge” and not for “reflective” or “human knowledge.” The latter requires the possession of an “epistemic perspective” on any known proposition. While Sosa is not entirely clear on the matter, this apparently requires the satisfaction of something like either traditional coherentist or traditional foundationalist conditions for justification [see, for example, BonJour 1995].)

An internalist is likely to have a similar reaction to Greco’s response to the skeptic. Greco argues against skepticism about the external world by claiming that if a disposition to reason from the appearance of an external world to the existence of that world is in fact reliable then knowledge of the external world is possible for a person who possesses such a disposition. But this view allows for knowledge of the external world in certain cases where a person lacks any cogent or even merely non-question-begging reasons for thinking that the external world exists. As a result, Greco’s more lenient requirements for knowledge are likely to seem to internalists more like a capitulation to rather than a victory over skepticism.

Of course, these considerations do not by themselves show virtue reliabilism to be implausible, as the internalist viewpoint in question is itself a matter of some controversy. Indeed, Sosa and Greco alike have argued vigorously against internalism and have lobbied for externalism as the only way out of the skeptical bog. But the debate between internalists and externalists remains a live one and the foregoing indicates that the promise of virtue reliabilism hangs in a deep and important way on the outcome of this debate.

3. Virtue Responsibilism

a. Key Figures

Virtue responsibilism contrasts with virtue reliabilism in at least two important ways. First, virtue responsibilists think of intellectual virtues, not as cognitive faculties like introspection and memory, but rather as traits of character like attentiveness, intellectual courage, carefulness, and thoroughness. Second, while virtue reliabilists tend to focus on the task of providing a virtue-based account of knowledge or justification, several virtue responsibilists have seen fit to pursue different and fairly untraditional epistemological projects.

One of the first contemporary philosophers to discuss the epistemological role of the intellectual virtues conceived as character traits is Lorraine Code (1987). Code claims that epistemologists should pay considerably more attention to the personal, active, and social dimensions of the cognitive life and she attempts to motivate and outline an approach to epistemology that does just this. The central focus of her approach is the notion of epistemic responsibility, as an epistemically responsible person is especially likely to succeed in the areas of the cognitive life that Code says deserve priority. Epistemic responsibility, she claims, is the chief intellectual virtue and the virtue “from which other virtues radiate” (44). Some of these other virtues are open-mindedness, intellectual openness, honesty, and integrity. Since Code maintains that epistemic responsibility should be the focus of epistemology and thinks of epistemic responsibility in terms of virtuous intellectual character, she views the intellectual virtues as deserving an important and fundamental role in epistemology.

Code claims that intellectual virtue is fundamentally “a matter of orientation toward the world, toward one’s knowledge-seeking self, and toward other such selves as part of the world” (20). This orientation is partly constituted by what she calls “normative realism”: “[I]t is helpful to think of intellectual goodness as having a realist orientation. It is only those who, in their knowing, strive to do justice to the object – to the world they want to know as well as possible – who can aspire to intellectual virtue … Intellectually virtuous persons value knowing and understanding how things really are” (59). To be intellectually virtuous on Code’s view is thus to regard reality as genuinely intellectually penetrable; it is to regard ourselves and others as having the ability to know and understand the world as it really is. It is also to view such knowledge as an important good, as worth having and pursuing.

Code also claims that the structure of the intellectual virtues and their role in the intellectual life are such that an adequate conception of these things is unlikely to be achieved via the standard methodologies of contemporary epistemology. She claims that an accurate and illuminating account of the intellectual virtues and their cognitive significance must draw on the resources of fiction (201) and often must be content with accurate generalizations rather than airtight technical definitions (254).

Because of its uniqueness on points of both content and method, Code’s suggested approach to epistemology is relatively unconcerned with traditional epistemological problems. But she sees this as an advantage. She believes that the scope of traditional epistemology is too narrow and that it overemphasizes the importance of analyzing abstract doxastic properties (for example, knowledge and justification) (253-54). Her view focuses alternatively on cognitive character in its own right, the role of choice in intellectual flourishing, the relation between moral and epistemic normativity, and the social and communal dimensions of the intellectual life. The result, she claims, is a richer and more “human” approach to epistemology.

A second contemporary philosopher to give considerable attention to the intellectual virtues understood as character traits is James Montmarquet. Montmarquet’s interest in these traits arises from a prior concern with moral responsibility (1993). He thinks that to make sense of certain instances of moral responsibility, an appeal must be made to a virtue-based conception of doxastic responsibility.

According to Montmarquet, the chief intellectual virtue is epistemic conscientiousness, which he characterizes as a desire to achieve the proper ends of the intellectual life, especially the desire for truth and the avoidance of error (21). Montmarquet’s “epistemic conscientiousness” bears a close resemblance to Code’s “epistemic responsibility.” But Montmarquet is quick to point out that a desire for truth is not sufficient for being fully intellectually virtuous and indeed is compatible with the possession of vices like intellectual dogmatism or fanaticism. He therefore supplements his account with three additional kinds of virtues that regulate this desire. The first are virtues of impartiality, which include “an openness to the ideas of others, the willingness to exchange ideas with and learn from them, the lack of jealousy and personal bias directed at their ideas, and the lively sense of one’s own fallibility” (23). A second set of virtues are those of intellectual sobriety: “These are the virtues of the sober-minded inquirer, as opposed to the ‘enthusiast’ who is disposed, out of sheer love of truth, discovery, and the excitement of new and unfamiliar ideas, to embrace what is not really warranted, even relative to the limits of his own evidence.” Finally, there are virtues of intellectual courage, which include “the willingness to conceive and examine alternatives to popularly held beliefs, perseverance in the face of opposition from others (until one is convinced that one is mistaken), and the determination required to see such a project through to completion” (23).

Montmarquet argues that the status of these traits as virtues cannot adequately be explained on account of their actual reliability or truth-conduciveness. He claims, first, that if we were to learn that, say, owing to the work of a Cartesian demon, the traits we presently regard as intellectual virtues actually lead us away from the truth and the traits we regard as intellectual vices lead us to the truth, we would not immediately revise our judgments about the worth or virtue of those epistemic agents we have known to possess the traits in question (for example, we would not then regard someone like Galileo as intellectually vicious) (20). Second, he points out that many of those we would regard as more or less equally intellectually virtuous (for example, Aristotle, Ptolemy, Galileo, Newton, and Einstein) were not equally successful at reaching the truth (21).

Montmarquet goes on to argue that the traits we presently regard as intellectual virtues merit this status because they are qualities that a truth-desiring person would want to have (30). The desire for truth therefore plays an important and basic normative role in Montmarquet’s account of intellectual virtue. The value or worth of this desire explains why the traits that emerge from it should be regarded as intellectual virtues.

Unlike Code, Montmarquet does not call for a reorientation of epistemology on the intellectual virtues. His concern is considerably narrower. He is interested mainly in cases in which an agent performs a morally wrong action which from her own point of view is morally justified. In some such cases, the person in question intuitively is morally responsible for her action. But this is possible, Montmarquet argues, only if we can hold the person responsible for the beliefs that permitted the action. He concludes that moral responsibility is sometimes grounded in doxastic responsibility.

Montmarquet appeals to the concept of an intellectual virtue when further clarifying the relevant sense of doxastic responsibility. He claims that in cases of the sort in question, a person can escape moral blame only if the beliefs that license her action are attributable to an exercise of intellectual virtue. Beliefs that satisfy this condition count as epistemically justified in a certain subjective sense (99). Thus, on Montmarquet’s view, the intellectual virtues are central to an account of doxastic responsibility which in turn is importantly related to the notion of moral responsibility.

Linda Zagzebski’s treatment of the intellectual virtues in her book Virtues of the Mind (1996) is one of the most thoroughly and systematically developed in the literature. Zagzebski is unquestionably a virtue responsibilist, as she clearly thinks of intellectual virtues as traits of character. That said, her view bears a notable resemblance to several virtue reliabilist views because its main component is a virtue-based account of knowledge.

Zagzebski begins this account with a detailed and systematic treatment of the structure of a virtue. She says that a virtue, whether moral or intellectual, is “a deep and enduring acquired excellence of a person” (137). She also claims that all virtues have two main components: a motivation component and a success component. Accordingly, to possess an intellectual virtue, a person must be motivated by and reliably successful at achieving certain intellectual ends. These ends are of two sorts (1999: 106). The first are ultimate or final intellectual ends like truth and understanding. Zagzebski’s account thus resembles both Code’s and Montmarquet’s, since she also views the intellectual virtues as arising fundamentally from a motivation or desire to achieve certain intellectual goods. The second set of ends consists of proximate or immediate ends that differ from virtue to virtue. The immediate end of intellectual courage, for instance, is to persist in a belief or inquiry in the face of pressure to give it up, while the immediate end of open-mindedness is to genuinely consider the merits of others’ views, even when they conflict with one’s own. Thus, on Zagzebski’s view, an intellectually courageous person, for instance, is motivated to persist in certain beliefs or inquiries out of a desire for truth and is reliably successful at doing so.

Zagzebski claims that knowledge is belief arising from “acts of intellectual virtue.” An “act of intellectual virtue” is an act that “gets everything right”: it involves having an intellectually virtuous motive, doing what an intellectually virtuous person would do in the situation, and reaching the truth as a result (1996: 270-71). One performs an act of fair-mindedness, for example, just in case one exhibits the motivational state characteristic of this virtue, does what a fair-minded person would do in the situation, and reaches the truth as a result. Knowledge is acquired when one forms a belief out of one or more acts of this sort.

As this characterization indicates, the justification or warrant condition on Zagzebski’s analysis of knowledge entails the truth condition, since part of what it is to perform an act of intellectual virtue is to reach the truth or to form a true belief, and to do so through certain virtuous motives and acts. This explains why Zagzebski characterizes knowledge simply as belief – rather than true belief – arising from acts of intellectual virtue.

Zagzebski claims that this tight connection between the warrant and truth conditions for knowledge makes her analysis immune to Gettier counterexamples (1996: 296-98). She characterizes Gettier cases as situations in which the connection between the warrant condition and truth condition for knowledge is severed by a stroke of bad luck and subsequently restored by a stroke of good luck. Suppose that during the middle of the day I look at the highly reliable clock in my office and find that it reads five minutes past 12. I form the belief that it is five past 12, and this belief is true. Unknown to me, however, the clock unexpectedly stopped exactly 12 hours prior, at 12:05 AM. My belief in this case is true, but only as a result of good luck. And this stroke of good luck cancels out an antecedent stroke of bad luck consisting in the fact that my ordinarily reliable clock has malfunctioned without my knowing it. While my belief is apparently both true and justified, it is not an instance of knowledge.

Zagzebski’s account of knowledge generates the intuitively correct conclusion in this and similar cases. My belief about the time, for instance, fails to satisfy her conditions for knowledge because what explains my reaching the truth is not any virtuous motive or activity on my part, but rather a stroke of good luck. Thus, by making it a necessary condition for knowledge that a person reach the truth through or because of virtuous motives and actions, Zagzebski apparently is able to rule out cases in which a person gets to the truth in the fortuitous manner characteristic of Gettier cases.

b. Prospects for Virtue Responsibilism

Virtue responsibilist views clearly are a diverse lot. This complicates any account of the apparent prospects of virtue responsibilism, since these prospects are likely to vary from one virtue responsibilist view to another. It does seem fairly clear, however, that as analyses of knowledge or justification, virtue responsibilism faces a formidable difficulty. Any such analysis presumably will make something like an exercise of intellectual virtue a necessary condition either for knowledge or for justification. The problem with such a requirement is that knowledge and justification often are acquired in a more or less passive way, that is, in a way that makes few if any demands on the character of the cognitive agent in question. Suppose, for example, that I am working in my study late at night and the electricity suddenly shuts off, causing all the lights in the room to go out. I will immediately know that the lighting in the room has changed. Yet in acquiring this knowledge, it is extremely unlikely that I exercise any virtuous intellectual character traits; rather, my belief is likely to be produced primarily, if not entirely, by the routine operation of my faculty of vision. Given this and related possibilities, an exercise of intellectual virtue cannot be a necessary condition for knowledge or justification.

This point has obvious implications for a view like Zagzebski’s. In the case just noted, I do not exhibit any virtuous intellectual motives. Moreover, while I may not act differently than an intellectually virtuous person would in the circumstances, neither can I be said to act in a way that is characteristic of intellectual virtue. Finally, I get to the truth in this case, not as a result of virtuous motives or actions, but rather as a result of the more or less automatic operation of one of my cognitive faculties. Thus, on several points, my belief fails to satisfy Zagzebski’s requirements for knowledge.

This suggests that any remaining hope for virtue responsibilism must lie with views that do not attempt to offer a virtue-based analysis of knowledge or justification. But such views, which include the views of Code and Montmarquet, also face a serious and rather general challenge. Virtue epistemologists claim that virtue concepts deserve an important and fundamental role in epistemology. But once it is acknowledged that these concepts should not play a central role in an analysis of knowledge or justification, it becomes difficult to see how the virtue responsibilist’s claim about the epistemological importance of the intellectual virtues can be defended, for it is at best unclear whether there are any other traditional epistemological issues or questions that a consideration of intellectual virtue is likely to shed much light on. It is unclear, for instance, how reflection on the intellectual virtues as understood by virtue responsibilists could shed any significant light on questions about the possible limits or sources of knowledge.

Any viable version of virtue responsibilism must, then, do two things. First, it must show that there is a unified set of substantive philosophical issues and questions to be pursued in connection with the intellectual virtues and their role in the intellectual life. In the absence of such issues and questions, the philosophical significance of the intellectual virtues and the overall plausibility of virtue responsibilism itself remain questionable. Second, if these issues and questions are to form the basis of an alternative approach to epistemology, they must be the proper subject matter of epistemology itself, rather than of ethics or some other related discipline.

The views of Code and Montmarquet appear to falter with respect to either one or the other of these two conditions. Code, for instance, provides a convincing case for the claim that the possession of virtuous intellectual character is crucial to intellectual flourishing, especially when the more personal and social dimensions of intellectual flourishing are taken into account. But she fails to identify anything like a unified set of substantive philosophical issues and questions that might be pursued in connection with these traits. Nor is it obvious from her discussion what such questions and issues might be. This leaves the impression that while Code has identified an important insight about the value of the intellectual virtues, this insight does not have significant theoretical implications and therefore cannot successfully motivate anything like an alternative approach to epistemology.

Montmarquet, on the other hand, does identify several interesting philosophical questions related to intellectual virtue, for example, questions about the connection between moral and doxastic responsibility, the role of intellectual character in the kind of doxastic responsibility relevant to moral responsibility, and doxastic voluntarism as it relates to issues of moral and doxastic responsibility. The problem with Montmarquet’s view as a version of virtue responsibilism, however, is that the questions he identifies seem like the proper subject matter of ethics rather than epistemology. While he does offer a virtue-based conception of epistemic justification, he is quick to point out that this conception is not of the sort that typically interests epistemologists, but rather is aimed at illuminating one aspect of moral responsibility (1993: 104). Indeed, taken as an account of epistemic justification in any of the usual senses, Montmarquet’s view is obviously problematic, since it is possible to be justified in any of these senses without satisfying Montmarquet’s conditions, that is, without exercising any virtuous intellectual character traits. (This again is due to the fact that knowledge and justification are sometimes acquired in a more or less passive way.) Montmarquet’s view therefore apparently fails to satisfy the second of the two conditions noted above.

Jonathan Kvanvig (1992) offers a treatment of the intellectual virtues and their role in the intellectual life that comes closer than that of either Code or Montmarquet to showing that there are substantive questions concerning these traits that might reasonably be pursued by an epistemologist. Kvanvig maintains that the intellectual virtues should be the focus of epistemological inquiry but that this is impossible given the Cartesian structure and orientation of traditional epistemology. He therefore commends a radically different epistemological perspective, one that places fundamental importance on the social and cross-temporal dimensions of the cognitive life and gives a backseat to questions about the nature and limits of knowledge and justification.

While the majority of Kvanvig’s discussion is devoted to showing that the traditional framework of epistemology leaves little room for considerations of intellectual virtue (and hence that this framework should be abandoned), he does go some way toward sketching a theoretical program motivated by his proposed alternative perspective that allegedly would give the intellectual virtues a central role. One of the main themes of this program concerns how, over the course of a life, “one progresses down the path toward cognitive ideality.” Understanding this progression, Kvanvig claims, would require addressing issues related to “social patterns of mimicry and imitation,” cognitive exemplars, and “the importance of training and practice in learning how to search for the truth” (172). Another crucial issue on Kvanvig’s view concerns “accounting for the superiority from an epistemological point of view of certain communities and the bodies of knowledge they generate.” This might involve asking, for instance, “what makes physics better off than, say, astrology; or what makes scientific books, articles, addresses, or lectures somehow more respectable from an epistemological point of view than books, articles, addresses or lectures regarding astrology” (176). Kvanvig maintains that answers to these and related questions will give a crucial role to the intellectual virtues, as he, like Code, thinks that the success of a cognitive agent in the more social and diachronic dimensions of the cognitive life depends crucially on the extent to which the agent embodies these virtues (183).

Kvanvig’s discussion along these lines is suggestive and may indeed point in the direction of a plausible and innovative version of virtue responsibilism. But without seeing the issues and questions he touches on developed and addressed in considerably more detail, it is difficult to tell whether they really could support a genuine alternative approach to epistemology and whether the intellectual virtues would really be the main focus of such an approach. It follows that the viability of virtue responsibilism remains at least to some extent an open question. But if virtue responsibilism is viable, this apparently must be on account of approaches that are in the same general vein as Kvanvig’s, that is, approaches that attempt to stake out an area of inquiry regarding the nature and cognitive significance of the intellectual virtues that is at once philosophically substantial as well as the proper subject matter of epistemology.

4. The Reliabilist/Responsibilist Divide

Virtue reliabilists and virtue responsibilists appear to be advocating two fundamentally different and perhaps opposing kinds of epistemology. The former view certain cognitive faculties or powers as central to epistemology and the latter certain traits of intellectual character. The two approaches also sometimes differ about the proper aims or goals of epistemology: virtue reliabilists tend to uphold the importance of traditional epistemological projects like the analysis of knowledge, while some virtue responsibilists give priority to new and different epistemological concerns. The impression of a deep difference between virtue reliabilism and virtue responsibilism is reinforced by at least two additional considerations. First, by defining the notion of intellectual virtue in terms of intellectual character, virtue responsibilists seem to rule out ex hypothesi any significant role in their theories for the cognitive abilities that interest the virtue reliabilist. Second, some supporters of virtue reliabilism have claimed outright that the character traits of interest to the virtue responsibilist have little bearing on the questions that are most central to a virtue reliabilist epistemology (Goldman 1992: 162).

But the divide between virtue reliabilism and virtue responsibilism is not entirely what it seems. Minimally, the two approaches are not always incompatible. A virtue reliabilist, for instance, can hold that relative to questions concerning the nature of knowledge and justification, a faculty-based approach is most promising, while still maintaining that there are interesting and substantive epistemological questions (even if not of the traditional variety) to be pursued in connection with the character traits that interest the virtue responsibilist (see, for example, Greco 2002).

More importantly, there is a sense in which the very distinction between virtue reliabilism and virtue responsibilism is considerably sketchier than it initially appears. Virtue reliabilists conceive of intellectual virtues, broadly, as stable and reliable cognitive qualities. In developing their views, they go on to focus more or less exclusively on cognitive faculties or powers like introspection, vision, reason, and the like. To a certain extent, this approach is quite reasonable. After all, the virtue reliabilist is fundamentally concerned with those traits that explain one’s ability to get to the truth in a reliable way, and in many cases, all that is required for reaching the truth is the proper functioning of one’s cognitive faculties. For example, to reach the truth about the appearance of one’s immediate surroundings, one need only have good vision. Or to reach the truth about whether one is in pain, one need only be able to introspect. Therefore, as long as virtue reliabilists limit their attention to instances of knowledge like these, a more or less exclusive focus on cognitive faculties and related abilities seems warranted.

But reaching the truth often requires much more than the proper operation of one’s cognitive faculties. Indeed, reaching the truth about things that matter most to human beings—for example, matters of history, science, philosophy, religion, and morality—would seem frequently to depend more, or at least more saliently, on rather different qualities, many of which are excellences of intellectual character. An important scientific discovery, for example, is rarely explainable primarily in terms of a scientist’s good memory, excellent eyesight, or proficiency at drawing valid logical inferences. While these things may play a role in such an explanation, this role is likely to be secondary to the role played by other qualities, for instance, the scientist’s creativity, ingenuity, intellectual adaptability, thoroughness, persistence, courage, and so forth. And many of these are the very traits of interest to the virtue responsibilist.

It appears that since virtue reliabilists are principally interested in those traits that play a critical or salient role in helping a person reach the truth, they cannot reasonably neglect matters of intellectual character. They too should be concerned with better understanding the nature and intellectual significance of the character traits that interest the virtue responsibilist. Indeed, the most plausible version of virtue reliabilism will incorporate many of these traits into its repertoire of virtues and in doing so will go significant lengths toward bridging the gap between virtue reliabilism and virtue responsibilism.

5. References and Further Reading

  • Aristotle. 1985. Nicomachean Ethics, trans. Terence Irwin (Indianapolis: Hackett).
  • Axtell, Guy. 1997. “Recent Work in Virtue Epistemology,” American Philosophical Quarterly 34: 1-27.
  • Axtell, Guy, ed. 2000. Knowledge, Belief, and Character (Lanham, MD: Rowman & Littlefield).
  • BonJour, Laurence. 1995. “Sosa on Knowledge, Justification, and ‘Aptness’,” Philosophical Studies 78: 207-220. Reprinted in Axtell (2000).
  • Code, Lorraine. 1987. Epistemic Responsibility (Hanover, NH: University Press of New England).
  • Crisp, Roger and Michael Slote, eds. 1997. Virtue Ethics (Oxford: Oxford UP).
  • DePaul, Michael and Linda Zagzebski. 2003. Intellectual Virtue: Perspectives from Ethics and Epistemology (Oxford: Oxford UP).
  • Fairweather, Abrol and Linda Zagzebski. 2001. Virtue Epistemology: Essays on Epistemic Virtue and Responsibility (New York: Oxford UP).
  • Goldman, Alvin. 1992. “Epistemic Folkways and Scientific Epistemology,” Liaisons: Philosophy Meets the Cognitive and Social Sciences (Cambridge, MA: MIT Press).
  • Greco, John. 1992. “Virtue Epistemology,” A Companion to Epistemology, eds. Jonathan Dancy and Ernest Sosa (Oxford: Blackwell).
  • Greco, John. 1993. “Virtues and Vices of Virtue Epistemology,” Canadian Journal of Philosophy 23: 413-32.
  • Greco, John. 1999. “Agent Reliabilism,” Philosophical Perspectives 13, Epistemology, ed. James Tomberlin (Atascadero, CA: Ridgeview).
  • Greco, John. 2000. “Two Kinds of Intellectual Virtue,” Philosophy and Phenomenological Research 60: 179-84.
  • Greco, John. 2002. “Virtues in Epistemology,” Oxford Handbook of Epistemology, ed. Paul Moser (New York: Oxford UP).
  • Hookway, Christopher. 1994. “Cognitive Virtues and Epistemic Evaluations,” International Journal of Philosophical Studies 2: 211-27.
  • Kvanvig, Jonathan. 1992. The Intellectual Virtues and the Life of the Mind (Savage, MD: Rowman & Littlefield).
  • Montmarquet, James. 1992. “Epistemic Virtue,” A Companion to Epistemology, eds. Jonathan Dancy and Ernest Sosa (Oxford: Blackwell).
  • Montmarquet, James. 1993. Epistemic Virtue and Doxastic Responsibility (Lanham, MD: Rowman & Littlefield).
  • Plantinga, Alvin. 1993. Warrant and Proper Function (New York: Oxford UP).
  • Sosa, Ernest. 1980. “The Raft and the Pyramid: Coherence versus Foundations in the Theory of Knowledge,” Midwest Studies in Philosophy V: 3-25. Reprinted in Sosa (1991).
  • Sosa, Ernest. 1991. Knowledge in Perspective (Cambridge: Cambridge UP).
  • Steup, Matthias. 2001. Knowledge, Truth, and Duty (Oxford: Oxford UP).
  • Zagzebski, Linda. 1996. Virtues of the Mind (Cambridge: Cambridge UP).
  • Zagzebski, Linda. 1998. “Virtue Epistemology,” Encyclopedia of Philosophy, ed. Edward Craig (London: Routledge).
  • Zagzebski, Linda. 1999. “What Is Knowledge?” The Blackwell Guide to Epistemology, eds. John Greco and Ernest Sosa (Oxford: Blackwell).
  • Zagzebski, Linda. 2000. “From Reliabilism to Virtue Epistemology,” Axtell (2000).

 

Author Information

Jason S. Baehr
Email: Jbaehr@lmu.edu
Loyola Marymount University
U. S. A.

Paradox of Hedonism

Varieties of hedonism have been criticized from ancient to modern times. Along the way, philosophers have also considered the paradox of hedonism, a common objection to hedonism even when critics do not give the objection that specific name. According to the paradox of hedonism, the pursuit of pleasure is self-defeating. This article examines this objection. Several ambiguities surround the use of this paradox, so first, a condensed conceptual history of the paradox of hedonism is presented. Second, it is explained that prudential hedonism is the best target of the paradox, which is made clear by considering different hedonistic theories and meanings of the word hedonism. Third, it is claimed that the overly conscious pursuit of pleasure, rather than other definitions that emerge from the literature, best captures the kind of pursuit that might generate paradoxical effects. Fourth, the implications for prudential hedonism are discussed. Fifth, different explanations of the paradox that can be traced in the literature are analysed, and the incompetence account is identified as the most plausible. Finally, it is concluded that no version of the paradox provides a convincing objection against prudential hedonism.

Table of Contents

  1. Condensed Conceptual History
  2. Paradoxes of Hedonism
  3. Isolating the Paradox of Hedonism
  4. Defining the Paradox
    1. Definition: Conscious Pursuit of Pleasure
    2. Self-Defeatingness Objection: Conscious Pursuit of Pleasure
  5. Explanations of the Paradox
    1. Logical Paradoxes
    2. Incompetence Account
    3. Self-Defeatingness Objection: Incompetence
  6. Concluding Remarks
  7. References and Further Reading

1. Condensed Conceptual History

“I can’t get no satisfaction. ‘Cause I try, and I try, and I try, and I try.” (The Rolling Stones)

These lyrics evoke the so-called paradox of hedonism: the pursuit of pleasure leads to reduced pleasure. The worry the paradox generates for hedonistic theories is that they appear to be self-defeating. That is, if we pursue the goal such a theory recommends, we are less likely to achieve it. For example, Crisp states, “one will gain more enjoyment by trying to do something other than to enjoy oneself.” Veenhoven attests that this paradox strikes at the heart of hedonism: if hedonism does not bring pleasure in the end, he argues, the true hedonist should repudiate the theory. Eggleston adds that the paradox of hedonism seems to be an issue for hedonistic ethical theories, such as utilitarianism (for objections, see experience machine).

“The paradox of hedonism,” “the paradox of happiness,” “the pleasure paradox,” “the hedonistic paradox,” and so forth are a family of names given to the same paradox and are usually used interchangeably. From here on, I refer only to the paradox of hedonism, and I understand happiness as hedonists do: as interchangeable with pleasure.

Non-hedonistic accounts of happiness do not consider it a state of mind. Aristotle, for example, considers eudaimonia, sometimes translated as happiness, as an activity in accordance with virtue exercised over a lifetime and in the presence of sufficient external goods.

The word “hedonism” derives from the Ancient Greek word for “pleasure.” Psychological hedonism holds that pleasure and pain are the only things that motivate us; in other words, we are motivated only by conscious or unconscious desires to experience pleasure or avoid pain. Ethical hedonism holds that only pleasure has value and only pain has disvalue.

The relation between the philosophical and non-philosophical uses of the word hedonism also needs explaining. The word hedonism is used differently in ordinary language than among philosophers. For a non-philosopher, the stereotypical hedonist is epitomized by the slogan “sex, drugs, and rock ‘n’ roll.” To “the folk,” a hedonist is a person who pursues pleasure shortsightedly, selfishly, or indecently, without regard for her long-term pleasure, the pleasure of others, or socially appropriate conduct. Psychologists, too, sometimes use the word hedonism in this folk sense.

That said, even within philosophy, the word hedonism can cause confusion. For instance, some consider hedonistic utilitarianism egoistic; some identify pleasure as necessarily non-social and purely physical. However, hedonism corresponds to a set of theories that attribute the primary role to pleasure. Ethical hedonism is the theory that identifies pleasure as the only ultimate value, not as an instrumental value or as one ultimate value among several. The attainment of good through so-called “base or disgusting pleasures” such as sex, drugs, and sadism; the indifference to long-term consequences, such as rejecting delayed gratification; and the disregard for others’ pleasure, such as taking pleasure at another’s expense, are features attached to folk hedonism but are not necessarily part of philosophical hedonism.

Prudential hedonism—the theory identified as the target of the paradox—is a kind of ethical hedonism concerning well-being. It is the claim that pleasure is the only ultimate good for an individual’s life and pain the only ultimate bad. That is, the best life is the one with the most net pleasure, and the worst is the one with the most net pain. Net pleasure (or “pleasure minus pain”) is the result of a calculation in which dolors (units of pain) are subtracted from hedons (units of pleasure). Like ethical hedonism, prudential hedonism makes no claim about how pleasure should be pursued. In prudential hedonism, pleasure can be pursued in disparate ways, such as sensory gratification and ascetic spiritual practice; all strategies are good as long as they are successful. Prudential hedonism is also silent about the time span (immediate vs. future pleasure). Nor does prudential hedonism advise that pleasure should be pursued anti-socially. In short, prudential hedonism is not committed to any claims concerning pleasure’s source, temporal location, or whether pleasure can be generated by social behaviors.

According to Herman, the paradox of hedonism can be found in Aristotle. Aristotle claimed that pleasure represents the outcome of an activity, asking and answering the following question: why is no one never-endingly pleased? Aristotle replied that human beings are unable to perform an activity perpetually. Therefore, pleasure cannot be perpetual because it derives from activity. So, on closer inspection, Aristotle’s argument does not seem to be the forerunner of the paradox. This argument does not tackle the issue of whether the pursuit of pleasure is self-defeating. Rather, Aristotle’s reflection concerns what causes pleasure, namely activity, and the impossibility of perpetual pleasure.

Later, Butler elaborates an argument against psychological egoism, especially its hedonistic version, which can be considered the harbinger of the paradox, if not its first complete instantiation. Butler’s argument, called “Butler’s stone,” has been widely interpreted as refuting psychological hedonism. The claim is that the experience of pleasure upon the satisfaction of a desire presupposes a desire for something that is not pleasure itself. That is, it presupposes that people sometimes experience pleasure that can be generated only by the satisfaction of a non-hedonistic desire. Therefore, psychological hedonism—the view that all desires are hedonistic—is false.

Austin attributes the first formulation of the paradox to J. S. Mill. After experiencing major depression in his early twenties, Mill states that happiness is not attainable directly and that happy people have their attention directed at something other than happiness. Later, Sidgwick coined the phrase “paradox of hedonism” while discussing egoistic hedonism. This form of ethical hedonism equates the moral good with the pleasure of the individual, so for Sidgwick, the overly conscious pursuit of pleasure is self-defeating because it promotes pleasure-seeking in a way that results in diminished pleasure.

2. Paradoxes of Hedonism

However, it is questionable to consider the paradox of folk hedonism a paradox, even in the sense of empirical irony. To be empirically ironic, the paradox should involve the psychological truth of a seemingly absurd claim. Common sense holds that certain ways of pursuing pleasure, such as committing crimes to finance heroin addiction, are ineffective. Since common sense holds that folk hedonism does not lead to happiness, this “paradox” lacks the counter-intuitiveness required to be labeled as such. Furthermore, if we consider that the focus of folk hedonism is short-term gains, it is not paradoxical. For example, suppose Suzy consumes cocaine during a party; this means she has achieved her aim. Suzy may encounter future displeasure, perhaps from addiction, but as a folk hedonist, Suzy does not have to care about her future self. So, neither common folk nor folk hedonists should be surprised that folk hedonism is a bad strategy for maximizing pleasure over a lifetime.

Psychological hedonism is the view that conscious or unconscious intrinsic desires are exclusively oriented towards pleasure. Individuals hold a particular desire because they believe that satisfying it will bring them pleasure. For example, Jane desires to do gardening because she believes that gardening will increase her pleasure. The paradox of psychological hedonism consists in the claim that, given the way our motivational system functions, we get less pleasure than we would if our motivational system worked differently, specifically if it allowed things other than pleasure to motivate us. On the one hand, if psychological hedonism is a true description of our motivational system, the paradox would have no prescriptive value because it advises us to do something impossible, at least until it becomes possible to alter our motivational system. The paradox of psychological hedonism can be seen as advice to stop being human. On the other hand, if psychological hedonism is not a true description of our motivational system, then we do not need to worry about the paradox at all. Considering this, it appears that this version of the paradox of hedonism is not particularly useful.

It seems that the above-explained ways of understanding the paradox do not capture the core idea. The paradox of folk hedonism is not counter-intuitive enough to be a paradox. For short-term gains, folk hedonism does not seem to backfire. However, any wisdom that resides in the paradox of folk hedonism collapses into the incompetence account analyzed below. Furthermore, the paradox of psychological hedonism is a descriptive claim that does not generate any useful advice.

The paradox of prudential hedonism best captures the heart of the expression “the paradox of hedonism.” (1) It is prescriptive. That is, if you do x, the result will be y—which is bad. (2) It is counter-intuitive. That is, if you try to maximize your life’s net pleasure, you end up with less. The apparent absurdity of this claim is a necessary condition for a paradox. For instance, imagine telling a musician that aiming to produce beautiful music will end in unpleasant noise. Or consider advising a student not to study hard because aiming for good grades will be counter-productive. These ways of talking are nonsensical. Common sense tells us that if you aim at something, you will be more likely to get it.

(1) and (2) also apply to the paradox of egoistic hedonism. Consider the similarities and differences between these theories. Both egoistic hedonism and prudential hedonism are normative theories holding that one should pursue pleasure. Yet, prudential hedonism is a theory of well-being or self-interested rationality, while egoistic hedonism is a theory of morality. According to prudential hedonism, it is rational in terms of self-interest to pursue pleasure. In contrast, according to egoistic hedonism, it is a moral obligation to pursue pleasure. Given that, it becomes apparent why prudential hedonism is the best candidate for the most refined version of the paradox of hedonism. The paradox, in fact, questions whether hedonism is rational, not whether it is moral. In other words, the claim of the paradox concerns the imprudence of pursuing pleasure, not its moral blameworthiness. For these reasons, this article focuses on the paradox of prudential hedonism.

3. Isolating the Paradox of Hedonism

This article is restricted to the common understanding of the paradox, which refers to the pursuit of pleasure and does not cast light on the avoidance of displeasure. The points being made may not apply to both, and further research is required to understand to what extent, if any, these processes overlap. For example, it might be claimed that happiness is a mirage. Such a claim would not imply that minimizing suffering is unrealizable too. A pessimist such as Schopenhauer advised avoiding suffering instead of pursuing happiness: according to him, if you keep your expectations low, you will have the most bearable life. Therefore, further research is needed to understand to what extent the reflection on the paradox of hedonism applies to the paradox of negative hedonism—the claim that the avoidance of displeasure is self-defeating. This distinction might have an important implication for prudential hedonism. If the pursuit of pleasure is paradoxical but the avoidance of displeasure is not, prudential hedonism is safe from the objection of self-defeatingness. Prudential hedonists would simply have to pursue the good life by minimizing displeasure rather than by maximizing pleasure.

Since affects can alter decision-making, we should exclude this mechanism from the most refined version of the paradox of hedonism. The opposite is the relevant mechanism: decisions altering affects. The paradox is usually thought to concern the relationship between pursuing pleasure and getting it, not the relation between being pleased and its continuation. A related popular belief is the claim that happiness necessarily collapses into boredom. However, this cultural belief seems questionable. Certainly, some pleasures can lead to temporary satiation and loss of interest, but failing to practice these pleasures in rotation is a case of incompetence in the pursuit of pleasure. This phenomenon does not imply that pleasant states necessarily impair themselves. Relatedly, Timmermann’s “new paradox of hedonism” is based on the claim that “there can be cases in which we reject pleasure because there is too much of it.” Timmermann denies that his paradox descends from temporary satiation. However, Feldman shows that Timmermann’s new paradox of hedonism is nothing new and is based on a conflation of ethical hedonism with psychological hedonism. The psychological mechanism by which we reject pleasure may threaten the claim that our motivation is directed only at pleasure, but it does not affect the claim that pleasure is good. Timmermann’s new paradox of hedonism is not a problem for prudential hedonism.

Another clarification restricts the paradox of hedonism to the mechanism that concerns decision-making and expected pleasure. In other words, the possible cases where prudential hedonism defeats itself momentarily are not included in the most refined understanding of the paradox. According to the paradox of hedonism, the agent’s decision to maximize pleasure does not optimize it in the long term. A different mechanism involves decision-making and immediately experienced pleasure or pain. Since empirical evidence supports the view that decision-making involves immediate pleasure and pain, we should consider the paradox to refer only to the paradoxical effects concerning expected utility.

Following Moore, the paradox of hedonism is distinct from weakness of will—when a subject acts freely and intentionally but contrary to their better judgment. Consider the following example: Imagine that after years of studying philosophy, Bill concludes that prudential hedonism is true. Meanwhile, he cannot implement any change directed at his neurotic personality. Bill is an unhappy prudential hedonist exhibiting weakness of will. Indeed, empirical evidence suggests that when we imagine what will make us happier, we fail to be consistent with the plans that rationally follow from it. For example, people who know that flow activities facilitate happiness end up over-practicing passive leisure and underutilizing the active-leisure activities that could elicit periods of flow. Nevertheless, considering that the paradox of hedonism is the pursuit of pleasure resulting in less pleasure, cases of weakness of will are not included in the refined version of the paradox because the pursuit of pleasure is missing. Cases of weakness of will do not represent prudential hedonism’s paradoxical effects unless the belief in the truth of prudential hedonism somehow disposes people to weakness of will more than other beliefs do. Thus, the refined version of the paradox of hedonism excludes: the paradox of negative hedonism, pleasure impairing its own continuation, momentary self-defeatingness, and weakness of will.

4. Defining the Paradox

The direct pursuit of pleasure is frequently used to express the paradox of hedonism, but how is it different from the indirect pursuit of pleasure? Imagine taking an opioid. The opiates travel through the bloodstream into the brain and attach to particular proteins, the mu-opioid receptors located on the surfaces of opiate-sensitive neurons. The binding of these chemicals to the mu-opioid receptors starts the biochemical brain processes that make subjects experience feelings of pleasure. Taking an opioid seems to be the most direct way to pursue pleasure, but notice that several steps are still required, for instance, owning enough money and acquiring and taking the drug. Consequently, our pursuit of pleasure is always indirect in the sense that various actions mediate it. Thus, it seems that we cannot substantially regulate our hedonic experience at will.

However, even if the fully direct pursuit of pleasure is impossible, it is still possible for the pursuit of pleasure to be more or less direct. Imagine the directness of the pursuit as a spectrum where the action of consuming a psychoactive substance stands on the far right and less controversial activities such as going to a party stand on the left. The activities on the left also include a wide range of more or less direct paths to the goal of pleasure. For example, diving into a pool on a hot day seems to be a shortcut to pleasure compared to the challenges of studying hard and eventually securing a fulfilling job. The issue seems to lie in how long one has to wait for pleasure. Incorporating this more plausible spectrum of directness into the paradox, we get the claim that the direct pursuit of pleasure results in less pleasure. However, this formulation seems empirically questionable. Unless one endorses some form of asceticism, it does not seem that pleasure simply depends on always choosing the long and hard route. Sometimes, as for pool-owners on a very hot day, the highly direct pursuit seems to produce more net pleasure in addition to immediate pleasure. So, it seems that not all forms of the direct pursuit of pleasure uniformly generate paradoxical effects.

The formulation of the paradox as a consequence of holding pleasure as the only intrinsically valuable end seems descriptively inadequate. This expression corresponds to broader definitions of prudential hedonism. By definition, every prudential hedonist considers pleasure the ultimate goal, the intrinsic good, the sole ultimately valuable end, and so forth. According to this interpretation, the belief in the truth of prudential hedonism is itself the mental state that generates paradoxical effects. However, it seems more useful to hold that a mental state that descends from the belief in the truth of prudential hedonism (for example, the conscious pursuit of pleasure) is what determines the paradox. In other words, the expressions at stake do not seem descriptive because the paradoxical mental state is not a philosophical belief but another mental state or behavior that the philosophical belief might be determining.

As recognized by Dietz, the definition of the paradox that emerges from holding pleasure as the only intrinsic desire configures the paradox of hedonism as a symptom of a paradox of desire-satisfaction: if we only desire desire-satisfaction, we are stuck. In Dietz’s view, this paradox threatens all theories of well-being that value the satisfaction of a subject’s desires, primarily desire-satisfactionism, which is one of the main rivals of hedonism as a theory of prudence. That said, this article is silent about the plausibility of the paradox of desire-satisfaction. Nevertheless, the paradox of desire-satisfaction needs a further step to affect prudential hedonism, namely rational desire—the view that there is a rational connection between our evaluative beliefs and desires. Contrary to rational desire, Blake writes that being a hedonist does not commit one to considering pleasure the only desire. Even if rational desire is true, this mechanism concerns ideal agents. We seem to consider things good without desiring them or desire things without considering them good.

What of the intentional pursuit of pleasure? Kant’s use of the adverb “purposely” seems to be a synonym of “intentionally.” Notice that philosophers distinguish between prior intention and intention in action, corresponding to action-initiation and action-control. Given that, the conscious pursuit of pleasure, analyzed below, appears more precise by pointing only to the paradoxical mechanism of action control.

a. Definition: Conscious Pursuit of Pleasure

All things considered, the conscious pursuit of pleasure seems to be the most appropriate definition. The conscious pursuit of pleasure can be understood as the pursuit that holds pleasure in the mind’s eye. Pleasure is kept in mind by the agent as her regulative objective. This is a case of “indirect self-defeatingness,” when the counter-productive effects of a theory are caused by conscious efforts to comply with it. In several passages, Sidgwick advances this interpretation, as when he writes: “Happiness is likely to be better attained if the extent to which we set ourselves consciously to aim at it be carefully restricted.” Which share of our conscious awareness should the pursuit of pleasure occupy? How often should we perform a conscious recollection of the goal of pleasure? Perhaps the wisdom underlying the paradox of hedonism can be found in answers to these questions: the paradox should be regarded as advice against focusing too much on hedonic maximization.

The strategy of never being conscious of the goal of pleasure also seems irrational, especially when considering normative theories of instrumental rationality. The calculation of the best means to any given end is assumed to be more effective than leaving the end to chance. It does not seem wise never to think of the outcome we aim for. Sometimes, we need to remember why we are acting, even in the broad sense of directing or sustaining our attention. To never aim at happiness and yet still achieve it is a case of serendipity. It is possible, of course, to find x when looking for y, such as finding pleasure while pursuing a life of moral or intellectual perfection. Still, if you enter a supermarket to buy peanuts, looking for toothpaste does not seem to be the most rational strategy; if you are merely taking a walk, however, you might happen upon peanuts.

Mill goes further by trying to identify why pursuing pleasure too consciously may be ineffective. He claims that allowing pleasure to occupy our internal discourse brings about an excessive critical scrutiny of pleasures. Similarly, Sidgwick seems to have identified one paradoxical mechanism of a too-conscious pursuit of pleasure when warning about the risks of pleasure’s meta-awareness. In the first two decades of the 21st century, much empirical research on the paradoxical effects of pursuing pleasure has supported the idea that monitoring one’s hedonic experience can negatively interfere with one’s hedonic state (Zerwas and Ford).

Concerning empirical evidence on the conscious pursuit of pleasure, Schooler and colleagues instructed participants to up-regulate pleasure while listening to a hedonically ambiguous melody, while the control group was only required to listen to the melody. Subsequent experimental studies by Mauss and colleagues employed a similar methodology by having participants watch a happy film clip. This research has investigated the effects of consciously attempting to up-regulate pleasure during a neutral or pleasant experience. Importantly, these studies support the claim that the overly conscious pursuit of pleasure is paradoxical. In fact, the inductions of the experiments—subjects were asked to up-regulate their hedonic experience—caused the participants to pursue pleasure consciously and to fail. Given the points above, it seems that the most sensible definition of the paradox of hedonism consists in the claim that the overly conscious pursuit of pleasure is self-defeating. According to Wild, hedonism’s paradox constitutes advice to maximize pleasure by temporarily forgetting about it. It is self-defeating to fix attention on pleasure too often.

Concerning the paradoxical conscious pursuit of pleasure, the strategy reported by Arnold, which aims at devising it as a logical argument, does not seem successful. The argument is supposed to work this way: the pleasure kept in view (so that it can be sought) must be an idea; an idea is no longer a feeling, and the intellectual nature of ideas prevents them from being pleasurable. However, as claimed by Arnold, one of the fallacies of this argument lies in a false conception of the function of logical constructions: a hedonist aims at pleasant states, not at the idea of such states. The idea of pleasure is just a signpost, a concept that is supposed to lead to pleasure-producing choices. If keeping the signpost of pleasure in mind impairs one’s ability to experience pleasure, this is an empirical claim rather than a logical necessity. Therefore, the excessively conscious pursuit seems best understood as an empirical rather than a logical paradox because the attempt to make it a logical paradox fails. Following Singer, the paradox of hedonism does not seem to be a paradox in the sense of a logical contradiction; instead, it seems to represent a psychological incongruity or empirical irony about the process of pleasure-seeking.

b. Self-Defeatingness Objection: Conscious Pursuit of Pleasure

The version of the paradox identified gives no reason to think that prudential hedonism is theoretically weakened by it. As Eggleston claims, the paradox of hedonism might result in being an interesting psychological mechanism with no philosophical implications. Mill seems to support this conclusion when he starts his exposition of the paradox by saying that he is not questioning the prudential primacy of pleasure.

In fact, a theory that (1) considers pleasure to be the only intrinsic prudential good is not necessarily doomed to be internally inconsistent just because it (2) acknowledges that we should forget about pleasure at some points. (1) is a claim of theoretical reason, the kind of reason concerned with the truth of propositions; (2) is a claim of practical reason, which concerns the value of actions. The former addresses beliefs, and the latter addresses intentions. Since prudential hedonism advises the maximization of pleasure, it also advises that the agent instrumentally shape the pursuit in whatever way is most effective. As Sidgwick claims, the paradox of hedonism does not seem to cause any practical problem once the possibility of it has been acknowledged. As advanced by Sheldon, pleasure can be a by-product of states that require us not to pursue pleasure overly consciously. So, pleasure may be the reason to sometimes forget about pleasure. These recommendations on how to avoid the paradox make this version of the paradox of hedonism a contingent practical problem for prudential hedonism, one that can be avoided. To sum up, considering the best definition of the paradox, the argument based on the paradox does not constitute a valid objection to prudential hedonism.

5. Explanations of the Paradox

Based on Butler’s reflections, Dietz discusses an older explanation of the paradox of hedonism that considers the paradox to be generated by pleasure itself and its relation to the satisfaction of desire. This explanation, with Dietz’s spin on it (the evidentialist account), is supposed to represent a logical paradox. The evidentialist account relies on a desire-belief condition for pleasure and on evidentialism. The desire-belief condition claims that pleasure requires the subject to believe she is getting what she wants. This account is based on Heathwood’s view that pleasure consists in having an intrinsic desire for some state of affairs and believing that this state of affairs is the case. Evidentialism is an epistemological theory according to which a rational agent holds beliefs only if they are justified by the evidence. This theory is supposed to dictate the rules for the formation of the belief about whether the desire is satisfied.

According to this account, prudential hedonism is self-defeating if the subject is epistemically rational and not deceived. Unlike the incompetence account, which arises from our irrationality and lack of self-knowledge, the evidentialist account arises for ideal agents. According to Dietz, if we suppose that I will experience pleasure only if I believe in my own pleasure and that I am going to be rational and well-informed, there will be no way for me to find independent support for this belief; thus, I will not be able to form such a belief, and I will never experience pleasure. In other words, as an evidentialist, I will only believe what I have good evidence to believe. To be pleased, I have to believe I am pleased. But, to believe I am pleased, I need good evidence that I am pleased. Unfortunately, the only evidence of my pleasure is the belief that I am pleased. So, no pleasure beliefs ever get off the ground because the evidence is tightly circular and therefore not compelling. The underlying reasoning of the evidentialist account has the same structure as Cave’s placebo paradox. Cave imagines a sick person who receives a placebo. This person will regain his health only if he believes that he will regain his health. Similarly, a hedonist, following this account, will be pleased only if he believes that he is or will be pleased. But if the sick person is rational, he will only form the belief that he will regain his health if he has solid evidence that this is the case. Likewise, if a hedonist holds that his pleasure itself is the only thing in which he will take pleasure, the belief that he will experience pleasure is not independently supported, and if he is rational, he cannot form this belief.

a. Logical Paradoxes

Butler’s account, based on the view that pleasure consists in the satisfaction of non-hedonistic desires (desires for anything but pleasure), is implausible. For instance, we can take delight in pleasure itself and not only in the gratification of non-hedonistic desires. The concepts of meta-emotions (emotions about emotions) and meta-moods (moods about moods) have been adopted and explored by researchers within both philosophy and psychology. It is possible to feel, for example, content about being relaxed, hopeful about being relieved, and grateful about being euphoric (positive-positive secondary emotions). These are counter-examples to Butler’s account because they involve feeling good about feeling good, precisely what is supposed to be impossible on Butler’s view. Thus, Butler’s theory of pleasure seems implausible.

Dietz’s evidentialist account, for its part, is weakened by the fact that it concerns ideal agents: given that human beings are not the ideal agents it presupposes, it has scant practical utility. The evidentialist account also assumes a questionable theory of pleasure (see Katz for problems in desire-based theories of pleasure). For example, pleasant surprises constitute prima facie counter-examples to holding desire-satisfaction necessary for pleasure. Also, solid neuroscientific evidence refutes the reduction of pleasure to desire.

To summarize, Butler’s and the evidentialist’s accounts do not seem reliable explanations of the paradox of hedonism because they are built on implausible theories of pleasure.

b. Incompetence Account

A closer inspection reveals that the special goods account collapses into the incompetence account: the belief that by atomistically pursuing our pleasure we will maximize it is mistaken, since pursuing pleasure atomistically seems a fallacious strategy in terms of self-interest. Accordingly, if only individuals were well-informed about what leads to pleasure, they would cultivate special goods as means to pleasure.

Having rejected Butler’s and the evidentialist’s accounts and reduced the special goods account to the incompetence account, we are left with the incompetence account as a candidate explanation of the paradox. The incompetence account claims that we are so prone to making mistakes in pursuing pleasure that by not aiming at it we are more successful in securing it. Following Haybron, much empirical evidence has been amassed on the ways in which humans are likely to make errors in pursuing their interests, including happiness. We possess compelling empirical evidence confirming that individuals are systematically unskillful at forecasting what will bring them pleasure. Individuals seem to suffer from several cognitive biases that undermine their capacity to elaborate accurate predictions about what will please them. This inability to make accurate predictions about the affective impact of future events might be problematic for prudential hedonism, especially for what Sidgwick calls the “empirical-reflective method.” The empirical-reflective method consists of: (1) forecasting the affects resulting from different lines of conduct; (2) evaluating, considering probabilities, which affects are preferable; (3) undertaking the matching line of conduct. As Sidgwick already recognized, imagining future pleasures and pains, sub (1), is an unreliable operation, so our confidence in the empirical-reflective method should be restricted.

Kant seems to have explained the paradox of hedonism similarly. Incidentally, for him, morality must always be given normative priority over happiness: the moral person acts to obey the moral law irrespective of what might be prudentially good. Kant claims we do not have an accurate idea of what will make us happy. According to him, pursuing wealth can generate troubles and anxiety, pursuing knowledge can bring about a sense of tragedy, pursuing health can highlight the pains of ill health in advanced age, and so forth.

Kant’s understanding of the paradox seems to rely on the incompetence account and especially on the failures of affective forecasting. Many life-defining choices are based on affective forecasts. Should you get married? Have children? Pursue a career as an academic or as a financier? These important decisions are heavily influenced by forecasts about how the different scenarios will make you feel.

Consequently, the aforementioned line of empirical research shows that, in pursuing pleasure, we are not rational agents: we make mistakes, and we can fail miserably. Perhaps this is not surprising. Who has not at some time chosen a job, holiday, partner, etc., only to find out that the choice did not bring nearly as much pleasure as we had expected?

To summarize, in this section, we explored affective forecasting failures as examples of our ineptitude in pursuing pleasure. Given this evidence of human incompetence in the pursuit of pleasure, it seems we lack the skills and knowledge required to effectively grasp and sustain this elusive feeling. This weakness in our psychology seems a plausible cause of the paradox, a case Parfit labels “direct self-defeatingness,” when the counter-productive effects of a theory are caused by compliance with it. It is a pity that we are so inept in our pursuit of pleasure that pursuing it destines us to fail, and perhaps fail so catastrophically that we might find ourselves less pleased than when we started.

c. Self-Defeatingness Objection: Incompetence

Having identified a plausible causal relation underpinning the paradox above, whether the incompetence account represents a theoretical issue for prudential hedonism is explored here. Recall that according to the argument based on the paradox of hedonism, prudential hedonism is a self-defeating theory.

Parfit elaborates on self-interest theory (the name under which he includes several theories of well-being) and the problem of self-defeatingness. For Parfit, a self-defeating theory “fails even in its own terms. And thus condemns itself.” The incompetence account, however, corresponds to a peculiar category of self-defeatingness, one that Parfit considers unproblematic. In setting the boundaries of his study, he excludes cases where the paradoxical effects are caused by the agent’s own mistakes. For Parfit, incompetence is not a legitimate objection to a theory because the fault lies not in the theory but in the agent.

Once again, as in the “overly conscious” definition of the paradox, the incompetence account can be seen as a practical problem that does not affect prudential hedonism as a theory. The possible practical self-defeatingness of prudential hedonism does not disprove any of prudential hedonism’s claims. Our incompetence in pursuing pleasure does not affect the validity of a theory that holds pleasure as the ultimate prudential good. If the paradox of hedonism emerges merely because of some contingent mechanisms in our psychology, prudential hedonists have no reason to reject the theory.

6. Concluding Remarks

This article analyzed the paradox of hedonism, which is the objection that prudential hedonism is self-defeating. First, the most plausible definition of the paradox was identified: the overly conscious pursuit of pleasure is the behavior that might produce paradoxical effects in a hedonistic prudential agent. This constitutes a plausible case of prudential hedonism’s indirect self-defeatingness, in which the conscious effort to comply with the theory defeats its aims. Second, the explanations of the different versions of the paradox identifiable in the literature were assessed. The incompetence account emerged as a plausible causal mechanism behind the paradox of hedonism. This is a case of prudential hedonism’s direct self-defeatingness, in which acting in accordance with the theory defeats its aims. However, both versions of the paradox turn out to be contingent on psychological mechanisms. The practical problems that were identified, overly conscious and incompetent pursuits of pleasure, do not theoretically affect the plausibility of prudential hedonism, which concerns prudential value and not practical rationality. Moreover, both seem avoidable. In practice, prudential hedonism does not seem to imply a necessarily self-defeating pursuit.

7. References and Further Reading

  • Aristotle. (1975). Nicomachean ethics, In H. Rackham (Transl.), Aristotle in 23 Volumes, Vol 19. Harvard University Press.
  • Arnold, F. (1906). The so-called hedonist paradox. International Journal of Ethics, 16(2), 228–234.
  • Austin, L. (2009). John Stuart Mill, the Autobiography, and the paradox of happiness. World Picture, 3, http://www.worldpicturejournal.com/WP_3/Austin.html
  • Besser, L. L. (2021). The philosophy of happiness: An interdisciplinary introduction. Routledge.
  • Blackburn, S. (2016). Hedonism, paradox of. In The Oxford dictionary of philosophy. Oxford University Press.
  • Blake, R. M. (1926). Why not hedonism? A protest. The International Journal of Ethics, 37(1), 1–18.
  • Butler, J. (1991). Fifteen sermons preached at the Rolls Chapel. In D. D. Raphael (Ed.), British Moralists, 1650–1800, 374–435. Hackett.
  • Cave, P. (2001). Too self-fulfilling. Analysis, 61(270), 141–146.
  • Crisp, R. (2001). Well-being. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Crisp, R. (2006). Hedonism reconsidered. Philosophy and Phenomenological Research, 73(3), 619–645.
  • Dietz, A. (2019). Explaining the paradox of hedonism. Australasian Journal of Philosophy, 97(3), 497–510.
  • Dietz, A. (2021). How to use the paradox of hedonism. Journal of Moral Philosophy, 18(4), 387–411.
  • Eggleston, B. (2013). Paradox of happiness. International Encyclopedia of Ethics, 3794–3799. Wiley-Blackwell.
  • Feldman, F. (2006). Timmermann’s new paradox of hedonism: Neither new nor paradoxical. Analysis, 66(1), 76–82.
  • Haybron, D. M. (2011). Happiness. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Heathwood, C. (2006). Desire satisfactionism and hedonism. Philosophical Studies, 128(3), 539–63.
  • Herman, A. L. (1980). Ah, but there is a paradox of desire in Buddhism—A reply to Wayne Alt. Philosophy East and West, 30(4), 529–532.
  • Hewitt, S. (2010). What do our intuitions about the experience machine really tell us about hedonism? Philosophical Studies, 151(3), 331–349.
  • Kant, I. (1996). Practical Philosophy. M. J. Gregor (Ed.), Cambridge University Press.
  • Martin, M. W. (2008). Paradoxes of happiness. Journal of Happiness Studies, 9(2), 171–184.
  • Mauss, I. B., et al. (2011). Can seeking happiness make people unhappy? Paradoxical effects of valuing happiness. Emotion, 11(4), 807–815.
  • Mill, J. S. (1924). Autobiography. Columbia University Press.
  • Moore, A. (2004). Hedonism. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Parfit, D. (1986). Reasons and Persons. Oxford University Press.
  • Schooler, J. W., et al. (2003). The pursuit and assessment of happiness can be self-defeating. In J. C. I. Brocas (Ed.), The Psychology of Economic Decisions, 41–70. Oxford University Press.
  • Sheldon, W. H. (1950). The absolute truth of hedonism. The Journal of Philosophy, 47(10), 285–304.
  • Sidgwick, H. (1907). The methods of ethics. Macmillan & Co. https://www.gutenberg.org/files/46743/46743-h/46743-h.htm
  • Silverstein, M. (2000). In defense of happiness. Social Theory and Practice, 26(2), 279–300.
  • Singer, P. (2011). Practical Ethics. Cambridge University Press.
  • Stocker, M. (1976). The schizophrenia of modern ethical theories. The Journal of Philosophy, 73(14), 453–466.
  • Timmermann, J. (2005). Too much of a good thing? Another paradox of hedonism. Analysis, 65(2), 144–146.
  • Veenhoven, R. (2003). Hedonism and happiness. Journal of Happiness Studies, 4(4), 437–457.
  • Wild, J. (1927). The resurrection of hedonism. International Journal of Ethics, 38(1), 11–26.
  • Zerwas, F. K. and Ford, B. Q. (2021). The paradox of pursuing happiness. Current Opinion in Behavioral Sciences, 39, 106–112.

Author Information

Lorenzo Buscicchi
Email: lorenzobuscicchi@hotmail.it
The University of Waikato
New Zealand

Certainty

The following article provides an overview of the philosophical debate surrounding certainty. It does so in light of distinctions that can be drawn between objective, psychological, and epistemic certainty. Certainty is a valuable cognitive standing, one that is often seen as an ideal. It is indeed natural to evaluate lesser cognitive standings, in particular beliefs and opinions, in light of one’s intuitions regarding what is certain. Providing an account of what certainty is has, however, proven extremely difficult, partly because certainty comes in varieties that may easily be conflated, and partly because of looming skeptical challenges.

Is certainty possible in the domain of the contingent? Or is it restricted, as Plato and Aristotle thought, to the realm of essential truths? The answer to this question depends heavily on whether or not a distinction can be drawn between the notion of objective certainty and the notion of epistemic certainty. How are we to characterize the epistemic position of a subject for whom a particular proposition is certain? Intuitively, if a proposition is epistemically certain for a subject, that subject is entitled to be psychologically certain of that proposition. Yet, as philosophers such as Unger have outlined, depending on how psychological certainty is conceived of, skeptical implications loom: it is not clear that a subject can ever be entitled to be psychologically certain of a proposition. Generally, it has proven challenging to articulate a notion of epistemic certainty that preserves core intuitions both regarding what one is entitled to think and regarding what characterizes, psychologically, the attitude of certainty.

Table of Contents

  1. Varieties of Certainty
    1. Objective, Epistemic and Psychological Certainty
    2. Certainty and Knowledge
  2. Psychological Certainty
    1. Certainty and Belief
    2. A Feeling of Conviction
    3. The Operational Model
  3. Epistemic Certainty
    1. The Problem of Epistemic Certainty
    2. Skeptical Theories of Epistemic Certainty
      1. Radical Infallibilism
      2. Invariantist Maximalism
      3. Classical Infallibilism
      4. A Worry for Skeptical Theories of Certainty
    3. Moderate Theories of Epistemic Certainty
      1. Moderate Infallibilism
      2. Fallibilism
      3. Epistemic Immunity and Incorrigibility
    4. Weak Theories of Epistemic Certainty
      1. The Relativity of Certainty
      2. Contextualism
      3. Pragmatic Encroachment
  4. Connections to Other Topics in Epistemology
  5. References and Further Readings

1. Varieties of Certainty

a. Objective, Epistemic and Psychological Certainty

As a property, certainty can be attributed to a proposition or a subject. When attributed to a proposition, certainty can be understood metaphysically (objectively) or relative to a subject’s epistemic position (Moore 1959, DeRose 1998). Objective certainties consist of propositions that are necessarily true. The relevant types of necessities are logical, metaphysical and physical. For instance, the proposition “It rains in Paris”, even if true, cannot be regarded as objectively certain. This is because it is possible that it does not rain in Paris. On the other hand, the proposition “All bachelors are unmarried” can be regarded as objectively certain, for it is logically impossible that this proposition be false.

Epistemic certainties are propositions that are certain relative to the epistemic position of a subject. The notion of epistemic certainty ought to be distinguished from that of psychological certainty, which denotes a property attributed to a subject relative to a given proposition (Moore 1959, Unger 1975: 62sq, Klein 1981: 177sq, 1998, Audi 2003: 224sq, Stanley 2008, DeRose 2009, Littlejohn 2011, Reed 2008, Petersen 2019, Beddor 2020a, 2020b, Vollet 2020). Consider the statement “It is certain for Peter that John is not sick”. This statement is ambiguous, as “for Peter” could refer to Peter’s epistemic position, for example, the evidence Peter possesses. If Peter states “It is certain that John is not sick, because the doctor told me so”, he can be understood as stating that “John is not sick” is certain given his epistemic position, which comprises what the doctor told him. But “for Peter” can also denote an attitude adopted by Peter toward the proposition “John is not sick”. The attitude at issue is the type of attitude that falls under the concept of psychological certainty.

Epistemic certainty and epistemic uncertainty are often expressed by the use of modals such as “may” or “impossible”, understood in an epistemic sense (see Moore 1959, DeRose 1998: 69, Littlejohn 2011, Petersen 2019, Beddor 2020b: sect. 5). To express epistemic certainty, one can say, for instance, “It is impossible that John is sick, because the doctor said he wasn’t”. Likewise, to express epistemic uncertainty, one can say “John may be sick, as his temperature is high”. Used in such a way, these modals describe the epistemic position of a subject relative to a proposition toward which she may or may not adopt an attitude of certainty.

Even if it is intuitively correct that the epistemic position of a subject for whom some proposition is certain is a favorable one, it is an open question whether a proposition’s being epistemically certain for a subject entails that that proposition is true. Depending on how one conceives of the epistemic position relative to which a proposition is epistemically certain for a subject, epistemic certainty may not turn out to be factive (DeRose 1998, Hawthorne 2004: 28, Huemer 2007, von Fintel and Gillies 2007, Littlejohn 2011, Petersen 2019, Beddor 2020b, Vollet 2020).

Psychological certainty, for its part, is generally regarded as being non-factive. For example, John can be psychologically certain that it is raining in Paris even if it is not raining in Paris. In addition, psychological certainty does not require that a subject be in a favorable epistemic position. It is possible for John to have no reason to believe that it is raining in Paris and yet be psychologically certain that it is raining in Paris.

Despite being conceptually distinct, the notions of objective, epistemic and psychological certainty are significantly related. From Antiquity to the end of the Middle Ages, the idea that true science (epistémè) – that is, epistemic certainty – could only pertain to necessary truths whose object was either intelligible forms or essences seemed to be widely accepted. In The Republic, books V, VI and VII, Plato endorses the view that sensible reality is merely an imperfect and mutable copy of an ideal, perfect and immutable realm of existence. As a result, sensible reality can only be the object of uncertain opinions (doxa). For his part, Aristotle defines epistemic certainty, or “scientific knowledge,” as the syllogistic demonstration of essential truths. It is through such syllogisms that one can comprehend what belongs essentially to objects of knowledge (Aristotle, Organon IV. Posterior Analytics, I, 2, Metaphysics VII.2, 1027a20). It is during the Scientific Revolution that the idea of a science of the contingent emerged, and with it the possibility of distinguishing epistemic certainty from objective certainty.

In addition, epistemic certainty has an important normative relationship to psychological certainty (Klein 1998). For instance, Descartes states that one should not adopt an attitude of certainty toward propositions that are not entirely certain and indubitable (Descartes, Meditations on First Philosophy § 2). Similarly, Locke’s evidentialist principle prescribes that a subject should proportion her opinion to the evidence she possesses (Locke, Essay, IV, 15 §5). Indeed, it seems that if a proposition is not epistemically certain for a subject, that subject is not justified in being certain that that proposition is true (Ayer 1956: 40, Unger 1975).

b. Certainty and Knowledge

According to several philosophers, the notions of psychological and epistemic certainty are closely connected to the notion of knowledge. One could regard the propositional attitude involved in knowing something to be the case as the attitude of psychological certainty, and therefore take epistemic certainty to be a condition on knowledge (Ayer 1956, Moore 1959, Unger 1975, Klein 1981). Such a view would explain why concessive knowledge attributions such as “I know that I have hands, but I might be a handless brain in a vat” appear to be inconsistent (Lewis 1996).

However, there are reasons to draw a more substantial distinction between certainty and knowledge. First, as epistemic certainty is intuitively very demanding, taking it as a condition on knowledge could easily lead to the conclusion that ordinary knowledge is beyond one’s reach (Unger 1975). Second, there seem to be cases where some knowledge is attributed to a particular subject without that subject being described as psychologically certain (Stanley 2008, McGlynn 2014, Petersen 2019, Beddor 2020b, Vollet 2020). For instance, consider the statements “I know that p for certain” or “I know that p with certainty”. These statements are not redundant, and express something stronger than “I know that p” (Malcolm 1952, Beddor 2020b, Vollet 2020; Descartes, for his part, distinguishes cognitio and scientia: see Pasnau 2013).

In addition, concessive knowledge attributions can be explained by other means. According to some philosophers, a pragmatic implicature in tension with the attribution of knowledge is communicated whenever an epistemic possibility is explicitly considered during a conversation: for instance, the implicature that this possibility is relevant when it comes to determining whether the subject knows that p (Rysiew 2001, Fantl and McGrath 2009, Dougherty and Rysiew 2009, 2011; for difficulties raised by this type of explanation, see Dodd 2010). Other philosophers explain the apparent inconsistency of concessive knowledge attributions by claiming that epistemic certainty is the norm of assertion (Stanley 2008, Petersen 2019, Beddor 2020b, Vollet 2020).

Whether or not knowledge can be conceived of in terms of certainty, because of the close connection between these notions and the centrality of questions pertaining to knowledge in epistemology, the philosophical discussion has been primarily focused on the notions of psychological and epistemic certainty. Therefore, the primary aim of this entry is to present the different ways in which these notions have been understood by philosophers. Note that the question of the relationship between epistemic and objective certainty will nevertheless be addressed in relation to infallibilist conceptions of epistemic certainty.

2. Psychological Certainty

a. Certainty and Belief

There is an intuitive connection between psychological certainty and belief. Whenever a subject is certain of p, she takes p to be true, which is characteristic of the attitude of belief. Yet, one can believe that p without being certain of p. One can reasonably state, “I believe that it will rain tomorrow, but I am not certain that it will” (Hawthorne, Rothschild and Spectre 2016, Rothschild 2020). Thus, being certain of p does not (merely) amount to believing that p (for versions of the claim that believing that p involves being certain that p, see de Finetti 1990, Roorda 1995, Wedgwood 2012, Clarke 2013, Greco 2015, Dodd 2017 and Kauss 2020).

One way to conceive of the relationship between psychological certainty and belief is to introduce a “graded” notion of belief. Traditionally, philosophical discussion has relied on a binary notion of belief: either a subject believes a proposition, or she does not. But a subject can also be said to have a certain degree of belief in a proposition, and given a graded notion of belief, psychological certainty can plausibly be conceived of as the maximum degree of belief one could have in a proposition. For its part, the attitude of belief, or outright belief, is conceivable as a high, yet non-maximal, degree of belief in a proposition (Foley 1992, Ganson 2008, Weatherson 2005, Sturgeon 2008: 154–160 and Leitgeb 2013, 2014, 2017).

Such a conception of psychological certainty raises, however, three important questions. First, what does it mean for a subject to have a particular degree of belief in a proposition? Second, how should degrees of belief be quantified? Third, what does it take for a subject to have the maximum degree of belief in a proposition?

b. A Feeling of Conviction

A first possibility is to consider that a subject’s degree of belief in p consists of an internally discernable feeling of conviction toward p’s truth (for a discussion of certainty as a metacognitive or epistemic feeling, see Dokic 2012, 2014; Vazard 2019; Vollet 2022). For example, consider the propositions “2+2=4” and “It will rain tomorrow.” Presumably, there is a discernable difference in one’s feeling of conviction toward the truth of each proposition. One’s conviction toward the truth of “2+2=4” is stronger than one’s conviction toward “It will rain tomorrow” and, given the view under examination, one’s degree of belief in “2+2=4” is thereby higher than one’s degree of belief in “It will rain tomorrow”. This is the case even if one believes that both propositions are true. Such a conception seems to prevail among philosophers such as Descartes (see also Locke’s Essay and Hume’s Enquiry). If a subject’s degree of belief in p consists of a certain feeling of conviction toward p’s truth, then psychological certainty is naturally conceived of as a maximally strong feeling of conviction toward the truth of a proposition.

Such a conception of psychological certainty is problematic, however. First, many propositions of which a particular subject is certain at a given time may not be associated with a discernable feeling of conviction. One might be absolutely certain that 2+2=4 at t without having any particular feeling toward that proposition, for example, if one is not entertaining the thought that 2+2=4 at t. Second, it is not clear that the type of feeling that is supposed to illuminate the notion of psychological certainty has the required structure; in particular, it is not clear that such a feeling has upper and lower bounds. In light of these complications, it might be preferable, in order to explicate the notion of psychological certainty, to exploit the intuitive connection between that notion and the notion of doubt.

It is intuitively correct that if a subject is absolutely convinced that a proposition is true, her attitude toward that proposition leaves no room for doubt. The proposition, for that subject, is indubitable. For instance, one may have that degree of conviction toward the proposition “I think, therefore I am” because one finds it difficult—perhaps even impossible—to doubt it (see Ayer 1956: 44-45 for discussion). Note, however, that this characterization of psychological certainty remains ambiguous: it can be understood either synchronically or diachronically (Reed 2008), assuming that one distinguishes the degree of belief a subject has in a proposition from the stability of this degree of belief (see Levi 1983: 165, Gärdenfors and Makinson 1988: 87, Skyrms 1980: 11f., Leitgeb 2013: 1348, 1359, Moon 2017 and Kauss 2020). From a synchronic perspective, it follows from this characterization that if a subject is certain that p at t, that subject has absolutely no doubt regarding p’s truth at t. However, it does not follow that the subject who is certain that p at t could not doubt that p at a later time t’. In contrast, from a diachronic perspective, it follows from this characterization that if a subject is certain that p at t, then p is indubitable in the sense that any doubt concerning p’s truth is excluded for that subject, both at t and at any later time t’.

Understood diachronically, the characterization of psychological certainty under examination is thus stronger and possibly better suited to explicate this type of attitude. Indeed, according to the synchronic reading, a very easily revisable belief could qualify as a psychological certainty. Yet, it seems that psychological certainty consists of something stronger than this: if a subject is absolutely convinced that p is true, one expects her attitude toward p to be stable.

Several ways of understanding what it takes for a proposition to be indubitable for a subject have been put forward. According to Peirce, the notion of doubt is not related to the mere contemplation of the possibility of a proposition being false (Peirce 1877 and 2011). Instead, doubt is characterized as a “mental irritation” resulting from the acquisition of unforeseen information, which provides motivation to investigate a proposition’s truth further. According to this pragmatic conception of doubt, the exclusion of doubt regarding p’s truth is the result of a rational inquiry into the question of p’s truth and does not simply consist of a psychological impossibility of contemplating the possibility of p being false (Vazard 2019).

In contrast, Malcolm and Unger adopt a Cartesian conception of doubt (Malcolm 1963: 67-68, Unger 1975: 30-31, 105 sq). For them, doubt regarding p’s truth is excluded for a subject at a time t whenever her attitude toward p at t is such that no information or experience would lead her to change her mind regarding p. When a subject is certain that p and doubt regarding p’s truth is excluded for her, her attitude toward p is such that, for her, any consideration in favor of the conclusion that p is false is misleading. This means, for instance, that if a subject is certain that there is an ink bottle in front of her at t, her attitude toward the proposition “There is an ink bottle in front of me” is such that, at t, if she felt or had a visual experience as of her hand passing through that ink bottle, this would not provide reason for her to think that there is no ink bottle in front of her. Rather, it would provide reason for the subject to think that such sensory experiences are illusory.

For other philosophers, if a subject is certain that p, no data or experience could lead her to doubt that p is true without putting the totality of her beliefs into question, including beliefs concerning the data or experience prompting her to doubt that p is true in the first place (Miller 1978). That is, if a subject is certain that p, then any reason to doubt that p is true would constitute, for her, a reason to doubt absolutely anything, including the very existence of that reason.

This characterization of psychological certainty manages to capture the idea that this attitude differs from the attitude of belief in that it is not revisable. Note, however, that this characterization does not entail that no psychological event could alter one’s attitude of certainty. Even if any doubt regarding p’s truth is excluded for a subject at t, circumstances could lead her to doubt p’s truth. In addition, this characterization does not exclude the possibility of the subject acquiring, at a later time t’, evidence supporting the conclusion that p is false, and of losing, as a result, her conviction regarding p’s truth.

Additionally, the exclusion of doubt regarding p’s truth is not only related to the attitude adopted by a subject who is certain of p toward what counts as a reason to think that p is false. As noted by Unger, a subject’s absolute conviction that p is true manifests itself in the subject’s readiness to use p as a premise of practical or theoretical reasoning without hesitation (Unger 1975: 116). Of course, this aspect of psychological certainty is not in conflict with the characterization of this attitude outlined above. As a matter of fact, if any doubt regarding p’s truth is excluded for a subject, it is plausible that she is ready to use p as a premise of reasoning without hesitation.

While the characterization of psychological certainty just outlined captures central features of this attitude, it also faces certain difficulties. Given such a characterization, one could be led to think, following Unger, that psychological certainty is a fundamentally dogmatic attitude which should not be adopted (Unger 1975). Yet, philosophers such as Dicker, Carrier, Douven and Olders reject the idea that psychological certainty consists of a dogmatic attitude (Dicker 1974: 166, 168, Carrier 1983, Douven and Olders 2008: 248), and philosophers such as Miller argue that psychological certainty is in fact compatible with a feeling of doubt (Miller 1978: 48, 53-54).

c. The Operational Model

As noted in the previous section, explicating the notion of psychological certainty in terms of an internally discernable feeling of conviction raises serious problems. This has led several philosophers to favor an operationalist or functionalist approach to psychological certainty. According to De Finetti, an operational definition of a subject’s degree of belief can be given in terms of her betting behavior (De Finetti 1937 and 1990). More precisely, a subject’s degree of belief in p can be conceived of as the odds that the subject regards as fair for a bet on p’s truth that would be rewarded with one monetary unit if p were true. For instance, suppose one is offered a bet on whether or not the proposition “Berlin is the capital of the Federal Republic of Germany” is true. Suppose, in addition, that if one were right concerning that proposition, one would be rewarded $1. If one is ready to pay $.80 to bet on the truth of that proposition, then, given De Finetti’s model, one’s degree of belief in the proposition “Berlin is the capital of the Federal Republic of Germany” can be represented as a function which assigns the value .8 to that proposition.
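The arithmetic behind this operational definition can be made explicit with a minimal sketch in Python. The function and variable names below are illustrative inventions, not De Finetti’s own notation; the sketch simply computes the price an agent with a given degree of belief should regard as fair for a bet.

```python
def fair_price(credence: float, payout: float = 1.0) -> float:
    """Price an agent regards as fair for a bet that pays `payout`
    if the proposition is true and nothing otherwise.

    On De Finetti's operational model, the agent's degree of belief
    (credence) is the fraction of the payout she is willing to stake,
    so the fair price is credence * payout.
    """
    if not 0.0 <= credence <= 1.0:
        raise ValueError("a degree of belief must lie in [0, 1]")
    return credence * payout

# The article's example: a subject ready to pay $.80 for a $1 bet on
# "Berlin is the capital of the Federal Republic of Germany" is
# represented as having degree of belief .8 in that proposition.
price = fair_price(0.8)  # $0.80
```

Read in the other direction, the same equation is what lets the theorist recover a subject’s degree of belief from her observed betting behavior: the price she accepts, divided by the payout, is her credence.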

Ramsey generalizes the relationship between a subject’s expectations—her degrees of belief regarding the truth of a set of propositions—and her behavior to any type of preference (Ramsey 1929). According to him, whenever a subject determines whether she prefers to do A or B, she relies on her degrees of belief in the propositions representing the states of affairs on which the possible results of each option depend. Thus, a subject’s expectations allow her to determine the value she can expect from each option and to rationally determine whether or not she prefers to do A or B.

Several representation theorems have been formulated to show that if a subject’s preferences satisfy a set of intuitively acceptable constraints, they can be represented by a probability function, corresponding to the subject’s expectations, and a utility function such that, taken together, the preferred option is always the one that maximizes expected utility (Ramsey 1926, Savage 1954, Jeffrey 1965, Joyce 1999). This formalization of the relationship between rational expectations and rational preferences is central to both Bayesian Epistemology and Bayesian Decision Theory, as it lays the groundwork for an analysis of epistemic rationality on the assumption that rational expectations can be represented as a probability distribution over a given set of propositions.
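The expected-utility calculation at the heart of these theorems can be sketched concretely. The following Python fragment is a hedged illustration of the standard model, not a quotation of Ramsey or Savage; the umbrella scenario, the numbers, and the function name are invented for the example.

```python
def expected_utility(credences, utilities):
    """Expected utility of an option: the sum, over the possible
    states of the world, of the agent's degree of belief that each
    state obtains times the utility of the option's outcome in it."""
    if abs(sum(credences) - 1.0) > 1e-9:
        raise ValueError("degrees of belief over the states must sum to 1")
    return sum(p * u for p, u in zip(credences, utilities))

# Invented example: credence .3 that it will rain, .7 that it won't.
# Option A (take an umbrella): utility 5 if rain, 3 if dry.
# Option B (leave it home):    utility 0 if rain, 4 if dry.
eu_a = expected_utility([0.3, 0.7], [5, 3])  # approximately 3.6
eu_b = expected_utility([0.3, 0.7], [0, 4])  # approximately 2.8
# On the Bayesian model, a rational agent prefers A, since A
# maximizes expected utility given her credences.
```

The representation theorems run in the opposite direction from this sketch: rather than computing preferences from given credences and utilities, they show that suitably coherent preferences suffice to recover a unique probability function and a utility function (up to positive linear transformation) that rationalize them.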

The connection between a subject’s degrees of belief and her preferences suggests that psychological certainty can be conceived of as a subject’s propensity to act in a certain way. If one relies on betting behavior to explicate degrees of belief, psychological certainty could be conceived of as a subject’s readiness to accept any bet on p’s truth, as long as the bet at issue can result in potential gains. Such a conception would be similar to the one presented in the previous section, which views psychological certainty as an attitude toward p characterized by the exclusion of doubt regarding p’s truth. If doubt regarding p’s truth is excluded for a subject, what reason could that subject have to refuse a bet on p’s truth that could result in potential gains? If any doubt regarding p’s truth is excluded for her, then nothing could lead her to doubt that p is true; not even the stakes involved in a particular bet concerning p’s truth.

But, of course, we are not certain of many propositions in that sense. If the stakes of a bet we are offered concerning the truth of a proposition we regard as certain are high, we hesitate. Additionally, it seems that we are right not to be certain of many propositions in that sense, for the evidence we normally have access to does not warrant adopting such an attitude. This is not perceived, however, as being fundamentally problematic by the proponents of Bayesian conceptions of epistemic and practical rationality, as such conceptions purport to model epistemic and practical rationality in a context of generalized uncertainty. If such conceptions fundamentally aim at showing how it can be reasonable for a subject to think or act in a certain way in a context of generalized uncertainty, the fact that, given these conceptions, there is almost nothing of which we are certain and can reasonably be certain can hardly count as a drawback. This is true as long as one is ready to concede that this construal of psychological certainty is not necessarily equivalent to our ordinary concept of psychological certainty (Jeffrey 1970: 161).

3. Epistemic Certainty

a. The Problem of Epistemic Certainty

According to the Lockean principle, which requires that one proportion one’s degree of belief to the evidence, a subject is justified in being psychologically certain of a proposition if, and only if, this proposition is epistemically certain. This principle is widely accepted, as it explains why statements such as “It is certain that p, but I’m not certain that p” sound incoherent (Stanley 2008, Beddor 2020b, Vollet 2020). The main questions, then, are these: are there epistemically certain propositions, and if so, what grounds their epistemic certainty?

To tackle this question, let us consider the following propositions:

(1)   It will rain in Paris in exactly one month at 5 p.m.

(2)   The lottery ticket I’ve just bought is a losing ticket (Context: It only has a chance of one in a million to win).

(3)   My car is currently parked in the parking lot (Context: I parked my car in the parking lot five minutes ago.)

(4)   The world was not created five minutes ago.

(5)   It hurts. (Context: I just dropped a hammer on my foot)

(6)   All bachelors are unmarried men.

As previously mentioned, epistemic certainty is relative to the epistemic position of a particular subject. Considering the six propositions above, the question at hand is therefore whether or not a subject’s epistemic position can be such that these propositions are certain for her. One can reasonably doubt that a subject can be in an epistemic position such that (1) is certain for her. Considering the evidence a subject normally has access to, it seems that (1) can be, at best, highly probable, but not certain. Likewise, (2) appears to constitute a typical example of an uncertain proposition, the sort that tends to illustrate the difference between certainty and high probability.

On the other hand, it is intuitive to think that propositions such as (3), (4), (5) or (6) can be epistemically certain. In a typical situation, if a subject comes to doubt that her car is still parked in the spot she left it five minutes ago, that the world was not created five minutes ago, that it hurts when she drops a hammer on her foot, or that all bachelors are unmarried, one would presumably consider this doubt as ill-founded and absurd. Yet, one might think it is possible that one’s car was stolen two minutes ago, or that the world was created five minutes ago. In fact, it is unclear whether the evidence one possesses allows one to dismiss such scenarios. What about propositions such as (5) and (6)? Some philosophers suggest that it is reasonable to doubt the truth of such propositions in cases where one is offered a bet with extremely high stakes, for example a bet in which, if the proposition is false, one’s family is tortured to death. In such cases, it seems reasonable to decline the bet. Now, given the Lockean principle mentioned above, this may be interpreted as providing good evidence to think that even these kinds of propositions are not actually certain.

These considerations show how problematic the notion of epistemic certainty can be. We easily admit that, given the evidence normally available to a subject, propositions such as (1) and (2) are uncertain. In contrast, we are inclined to think that propositions such as (3), (4), (5) and (6) are, or can be, epistemically certain. Yet, minimal reflection suffices to put this inclination into question. This has been highlighted by Hume: when one does not pay specific enough attention, one considers many propositions to be certain (Hume Treatise III, 1. 1. 1.). However, minimal philosophical reflection suffices to shake this conviction, which reappears as soon as one returns to daily life. The question of the nature, possibility and extension of epistemic certainty is in fact nothing other than the problem of skepticism, which lies at the heart of important debates in epistemology (Firth 1967).

The challenge, then, consists in articulating and defending a criterion for epistemic certainty, while also explaining the problematic cases which arise from this criterion. Considering propositions (1)-(6), three families of theories of epistemic certainty can be distinguished.

In the following list, the term “skeptical” is used with respect to epistemic certainty, rather than knowledge:

Skeptical:

- Radically skeptical: none of the considered propositions are, or can be, certain.

- Strongly skeptical: only propositions such as (6) are, or can be, certain.

- Moderately skeptical: only propositions such as (5) or (6) are, or can be, certain.

Moderate:

- Strong moderate: only propositions such as (4), (5) and (6) are, or can be, certain.

- Weak moderate: only propositions such as (3), (4), (5) and (6) are, or can be, certain.

Weak:

- Propositions such as (2), (3), (4), (5) and (6) are, or can be, certain.

In the remaining sections, the focus is on the theories listed above, with radically skeptical theories considered as opponents that these theories are designed to respond to.

b. Skeptical Theories of Epistemic Certainty

i. Radical Infallibilism

One way of explaining epistemic certainty appeals to infallibility. In general, one is infallible with regard to p if and only if it is impossible for one to be mistaken about p’s truth (Audi 2003: 301 sq):

Certainty-RI: p is certain for S if, and only if, S can believe that p, and it is absolutely impossible for S to believe that p and to be mistaken about p’s truth.

According to this definition, at least two kinds of propositions can be certain. First, there are necessarily true propositions such as (6). Indeed, if a proposition is necessarily true, it is impossible for a subject to believe that this proposition is true and, at the same time, to be mistaken about that proposition’s truth. Second, there are propositions that are true in virtue of being believed to be so by a subject. For example, a subject cannot believe the propositions “I have a belief” or “I think, therefore I am” to be true and, at the same time, be mistaken concerning their truth. This is because these propositions are true in virtue of being believed to be so by the subject.

As should be clear, this conception of epistemic certainty excludes propositions such as (1), (2), (3), (4) and (5). Indeed, these propositions are contingent, and their truth is independent of whether or not they are believed to be true by a subject (Ayer 1956: 19). Certainty-RI is therefore a strongly skeptical conception of epistemic certainty. Given that conception, very few informative propositions can be certain, or even known, if one maintains that knowledge requires epistemic certainty.

A major difficulty raised by this conception of epistemic certainty is, however, that it entails that any logically or metaphysically necessary proposition is epistemically certain. For instance, a mathematical conjecture such as Goldbach’s, if it is true, is necessarily true. As a result, according to this conception of epistemic certainty, any mathematical conjecture, if it is true, is epistemically certain. Yet, it seems clear that one can have reasons to doubt the truth of a mathematical conjecture (Plantinga 1993: ch. 8). For example, a recognized expert might wrongly assert that a given conjecture is false. A related worry is the well-known problem of logical omniscience: since we are not logically omniscient, it is implausible to consider that every logical or metaphysical truth is certain for us (Hintikka 1962, Stalnaker 1991). In order for a logical or metaphysical necessity to be epistemically certain for a subject, it seems that the subject should at least grasp the nature of that necessity.

One crucial aspect that this conception of epistemic certainty fails to capture is related to the absence of good reasons to doubt. Intuitively, what makes a logical or metaphysical truth epistemically certain is not the fact that it is necessarily true, but that we have very strong reasons to regard it as necessarily true.

ii. Invariantist Maximalism

The above considerations suggest that epistemic certainty should rather be explicated in terms of the absence of good reasons to doubt:

Certainty-IND: p is certain for S if and only if p is epistemically indubitable for S. That is, if and only if it is impossible for S to have a good reason to doubt that p.

According to a first version of Certainty-IND, epistemic certainty depends on a subject having the highest possible degree of justification for believing a proposition (Russell 1948, Firth 1967: 8-12). If a subject’s justification is absolutely maximal with respect to p, no proposition q can be more justified than p. It follows that if p is certain, no proposition can provide a good reason to doubt that p, as any consideration q speaking against the truth of p would have a lower degree of justification. Let us label this conception “Invariantist Maximalism.”

Certainty-IM: p is epistemically certain for S if and only if there is no proposition q which can be more justified, for S, than p.

Invariantist Maximalism relies on the thought that the term “certain” is an absolute term which applies in light of an invariant and maximal criterion: if p is certain for S, nothing can be more certain than p, no matter the context of epistemic appraisal (Unger 1975. For criticisms, see Barnes 1973, Cargile 1972, Klein 1981, Stanley 2008, and Vollet 2020). An advantage of this view is that it does not entail that all necessary truths are epistemically certain. Even if one assumes that if a proposition is maximally justified for a subject, the subject is then infallible with respect to it, infallibility, on its own, is not sufficient for epistemic certainty. (Fantl and McGrath 2009: ch. 1, Firth 1967: 9). For example, someone cannot incorrectly believe that water is H2O, though one’s justification for believing that water is H2O need not be maximal.

Nonetheless, Invariantist Maximalism easily leads to radical skepticism. A first reason for this is that one might think it is impossible to identify a maximal threshold of justification. Indeed, it always seems possible to raise the degree of justification one has for believing that a given proposition is true, either by acquiring new evidence from a different source, or by acquiring higher-order justification (Brown 2011; 2018, Fantl 2003). Taking this into account, the Invariantist Maximalist conception of epistemic certainty predicts that no proposition is epistemically certain.

Furthermore, this approach leads one to classify as epistemically uncertain any proposition less justified than propositions such as “I exist.” This is the case even for propositions that can be logically deduced from such propositions. For it is plausible that the degree of justification one has for believing propositions logically stronger than “I exist” is slightly lower than the degree of justification one has for believing “I exist” itself.

One way of avoiding these skeptical consequences is to restrict the set of propositions that constitute the comparison class, that is, the set of propositions to which p is compared in determining its epistemic certainty (see Firth 1967: 12 for a presentation of various possibilities). However, the difficulty is to propose a criterion for restricting this set that is neither too strong nor too weak. For instance, Chisholm proposes to restrict the comparison class to the propositions that a subject, at a given time t, can reasonably believe (Chisholm 1976: 27):

Certainty-Chisholm 1: p is certain for S at t if and only if

(i) Accepting p is more reasonable for S at t than withholding p, and

(ii) there is no q such that accepting q is more reasonable for S at t than accepting p.

Yet, this criterion seems too weak. If no proposition is justified to a high degree, some propositions with a very low degree of justification could be said to be epistemically certain given Certainty-Chisholm 1 (Reed 2008).

In light of this, consider the stronger criterion proposed by Chisholm (1989: 12):

Certainty-Chisholm 2: p is epistemically certain for S if and only if, for any proposition q:

(i) believing p is more justified for S than withholding judgement concerning q, and

(ii) believing p is at least as justified for S as is believing q.

Even if this criterion is stronger, the problem is that there are many propositions for which one has absolutely no evidence, and concerning which one is accordingly justified in suspending judgement. For instance, it does not seem more reasonable to believe the proposition “I think, therefore I am” than it is to withhold judgement concerning the proposition “There are an even number of planets.” As a result, according to Certainty-Chisholm 2, “I think, therefore I am” is not epistemically certain (Reed 2008).

iii. Classical Infallibilism

According to another version of Certainty-IND, epistemic certainty does not require having a maximal justification for believing a proposition (in contrast to Certainty-IM) or being infallible regarding a proposition’s truth in the sense of Certainty-RI. Instead, epistemic certainty requires that one’s justification be infallible. To say that the justification S has for p is infallible is to say that it is impossible, logically or metaphysically, for S to have this justification while p is false. In addition, this requirement is traditionally understood along internalist lines, in such a way that whenever a subject possesses an infallible justification for p, she is in a position, because of the access she has to the justifiers, to rule out by herself the possibility that p is false. Thus, consider the following formulation of the classical infallibilist conception of epistemic certainty:

Certainty-CI: p is certain for S if and only if S has (internal) access to a justification for p which implies p.

According to this conception, epistemic certainty requires an infallible guarantee that is (reflexively) accessible to the subject (Dutant 2015). For instance, Descartes maintains that clear and distinct ideas are guaranteed to be true, and they are therefore epistemically indubitable (Meditations II). Russell states that through introspection, one can directly access one’s sense data, and that thereby, one’s introspective beliefs are guaranteed to be true (Russell 1912).

This kind of approach can avoid the problem of propositions that are necessarily true, but epistemically uncertain. This is the case if one maintains that one consideration can justify another only if what makes the first consideration true also (in part) makes the other consideration true (Fumerton 2005: sec. 2.2). Alternatively, one can think of an infallible guarantee as a ground or method of belief formation which cannot produce false beliefs (Dutant 2016). Note that this conception can allow propositions of type (5) to be certain if it is assumed, for example, that what justifies my belief that “My foot hurts” is the fact that my foot itself hurts, which is accessible via introspection.

The Bayesian approach, on a standard interpretation, can be considered as a formalized version of such a conception. One of its main tenets is that a subject’s expectations regarding a set of propositions can be, if rational, represented as a probability distribution over this set. In other words, a subject’s expectations relative to the truth of a set of propositions can be, if rational, represented as a function assigning to each of these propositions a numerical value which satisfies the definition of a probability function given by Kolmogorov (Kolmogorov 1956). If a subject’s rational expectations regarding a set of propositions are represented as such, the expectations a subject should have, given the evidence she possesses, can be represented in terms of conditional probability, which Bayes’ theorem allows one to compute. Epistemic certainty is thus conceived of as the maximal probability (probability 1) that a proposition can have, given the available evidence:

Certainty-Prob: p is epistemically certain for S if and only if Pr(p|E) = 1

This conception of certainty can be viewed as a version of classical infallibilism if it is assumed that no false proposition can be evidence, that evidence is always accessible as such to the subject, and that whatever constitutes the evidence possessed by a subject is itself epistemically certain. This is a strongly skeptical conception if, following orthodox Bayesians like Jeffrey, one thinks that only logically necessary propositions should receive maximal probability (Jeffrey 2004). If, on the other hand, some contingent propositions, in particular about our own mental states, can be considered as evidence in this sense, then this conception can be regarded as moderately skeptical. The main advantage of this kind of approach is that it offers a way of accounting for practical and epistemic rationality in the absence of epistemic certainty (and knowledge) of a great number of propositions.
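Certainty-Prob can be illustrated with a finite toy model (my own illustration, with an assumed uniform prior over four possible worlds): Pr(p|E) = 1 exactly when the evidence entails the proposition, that is, when every world compatible with the evidence is a p-world.

```python
from fractions import Fraction

worlds = ['w1', 'w2', 'w3', 'w4']
prior = {w: Fraction(1, 4) for w in worlds}  # assumed uniform prior

def pr(prop):
    """Prior probability of a proposition, modeled as a set of worlds."""
    return sum(prior[w] for w in prop)

def pr_given(prop, evidence):
    """Conditional probability Pr(prop | evidence)."""
    return pr(prop & evidence) / pr(evidence)

E = {'w1', 'w2'}        # the evidence
p = {'w1', 'w2', 'w3'}  # entailed by E: every E-world is a p-world
q = {'w1', 'w3'}        # not entailed by E

print(pr_given(p, E))   # 1: epistemically certain given E
print(pr_given(q, E))   # 1/2: merely probable given E
```

On this picture, only propositions entailed by the evidence reach probability 1, which is why the scope of epistemic certainty depends so heavily on what is allowed to count as evidence.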

Still, we may think we can have reasons to doubt the evidence we have and what our evidence does or does not entail (Lasonen-Aarnio 2020). For example, one may have reason to doubt the infallible character of clear and distinct perceptions, or doubt what qualifies as clear and distinct (Descartes Meditations I, Ayer 1956: 42-44). Furthermore, the standard Bayesian conception has it that it is rational to take a bet on any logical truth, no matter the stakes or the odds. According to this framework, it would be irrational (probabilistically incoherent) to assign a non-maximal probability to a logical truth. Yet, it seems that it is sometimes irrational to take such a bet (Fantl and McGrath 2009: ch. 1, Hawthorne 2004). As noted above, it is plausible to think that some logical truths can be epistemically uncertain.

iv. A Worry for Skeptical Theories of Certainty

Whether or not a satisfactory skeptical account of certainty can be offered, such an account is in tension with our intuitive judgements and with the ordinary way in which we use the word ‘certain’ and epistemic modals (Huemer 2007, Beddor 2020b, Vollet 2020). Suppose that it is epistemically certain that p if and only if it is epistemically impossible that not-p (DeRose 1998). According to Invariantist Maximalism, for example, if I lost my wallet this morning, my wife tells me, “It might be in the restaurant,” and I answer, “No, that’s impossible, I didn’t go to the restaurant,” then my wife says something which is, strictly speaking, true, and I say something which is, strictly speaking, false. Indeed, I do not satisfy the absolutely maximal criterion of justification with respect to the proposition “My wallet is not in the restaurant.” Similarly, if my evidence does not logically exclude this possibility, then the probability of that proposition on my evidence is lower than 1. Even more surprisingly, we should admit that my wife would, strictly speaking, say something true were she to say, “Your wallet might be on the Moon” (Huemer 2007).

A pragmatic explanation of the (in)appropriateness of these utterances might be advanced. One might say that, in the above context, it is (practically) appropriate to ignore the epistemic possibilities in question (see the treatment of concessive knowledge attributions above). Still, explaining the intuitive (in)appropriateness of an utterance does not amount to accounting for its intuitive truth value. Another option is to rely on the distinction between epistemic and moral or practical certainty (Descartes Principles of Philosophy, IV, § 205, Wedgwood 2012, Locke 2015). The latter can be understood as a degree of epistemic justification sufficiently high to treat the proposition as certain in one’s actions and practical decisions. One may suggest that the ordinary way in which people use ‘certain’ and the associated epistemic modals, as well as our intuitions, primarily tracks certainty as understood in the latter sense (Kattsoff 1965: 264).

The Bayesian approach has the advantage of providing a general framework in which practical and epistemic rationality in a context of generalized uncertainty can be modeled in a precise way. The claim that a subject’s expectations, if rational, can be represented as a probability distribution makes it possible to formulate conditionalization rules describing exactly how a subject should adapt her expectations to the evidence in the absence of certainties. That said, either the concept of certainty that this approach uses is a technical one, in that it does not correspond to our ordinary concept of certainty, or this approach must provide an explanation of the way in which we ordinarily use this concept and the associated epistemic modals.
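The conditionalization rule alluded to here can be stated compactly. In its standard (strict) form, upon learning evidence E with certainty, the subject’s new credence in any proposition p is her old credence in p conditional on E, defined whenever the old probability of E is positive:

```latex
% Strict conditionalization
\Pr{}_{\mathrm{new}}(p) \;=\; \Pr{}_{\mathrm{old}}(p \mid E)
  \;=\; \frac{\Pr{}_{\mathrm{old}}(p \wedge E)}{\Pr{}_{\mathrm{old}}(E)},
\qquad \text{provided } \Pr{}_{\mathrm{old}}(E) > 0.
```

On this rule, whatever is learned as evidence receives probability 1 and thereafter behaves as a certainty, which is why the epistemic status of the evidence itself matters so much to the Bayesian treatment of certainty.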

c. Moderate Theories of Epistemic Certainty

i. Moderate Infallibilism

It is often thought that requiring an infallible justification leads to skepticism about certainty (and knowledge). However, a non-skeptical and infallibilist account can be offered if one rejects the (internalist) accessibility requirement of Certainty-CI (Dutant 2016). For example, one could say that infallibility should not be cashed out in terms of the evidence possessed by a subject but instead in terms of a modal relation between the belief and the truth of its propositional content, such as the sensitivity of a belief or its safety (Dretske 1971, Nozick 1981, Williamson 2000). A reliabilist could also maintain that epistemic certainty requires that the belief-forming processes be maximally reliable (in the circumstances in which the belief is formed). Such approaches are infallibilist in the sense that they state it is impossible for a belief to be false if the guarantee (sensitivity, safety or maximal reliability) required for a proposition to be certain holds.

Another option consists in maintaining that infallibility depends on the subject’s evidence, while also opting for a more generous view of evidence. One may hold that propositions such as (4) can be certain if one thinks that the set of evidence a subject possesses can include any propositions about the external world (see Brown 2018 ch. 3 for further discussion). This option will typically exclude propositions such as (2), whose epistemic probability, although high, is not maximal. It comes in stronger and weaker versions, depending on whether propositions such as (3) can receive a maximal probability on the evidence.

Williamson defends a weak version of moderate infallibilism about knowledge, according to which one can know a proposition such as (3) (Williamson 2000). In addition to a safety condition allowing for a margin of error — in which a subject knows that p only if p is true in relevantly close situations — Williamson proposes that the evidence of a subject is only constituted by what she knows (see also McDowell 1982). If epistemic certainty is evidential probability 1, it follows that:

Certainty-Prob/K: p is epistemically certain for S if and only if Pr(p|K) = 1, where K stands for the set of propositions known by S.

This view is part of a broader “knowledge-first” research program in which Williamson assumes that (the concept of) knowledge is primitive. If this approach is correct, it can provide a reductive analysis of epistemic certainty in terms of knowledge by considering that all and only known propositions (or all and only propositions one is in a position to know) are epistemic certainties. This fits well with the traditional view of epistemic modals, according to which “It is impossible that p” (and, therefore, “It is certain that not-p”) is true if and only if p is incompatible with what the subject (or a potentially contextually determined relevant group of subjects) knows, or is in a position to know (DeRose 1991).

However, one can subscribe to a moderately infallibilist view of certainty without subscribing to the claim that knowledge entails certainty. As a matter of fact, the very idea that knowledge is moderately infallible is controversial. According to a widespread view, often called “logical fallibilism about knowledge,” S can know that p even if S’s evidence does not entail that p, or even if Pr(p|E) is less than 1 (Cohen 1988, Rysiew 2001, Reed 2002, Brown 2011, 2018, Fantl and McGrath 2009: ch. 1). In some versions, this kind of fallibilism concedes that knowing that p requires having entailing evidence for p, but rejects that all evidence must be maximally probable (Dougherty 2011: 140). According to this fallibilist view of knowledge, propositions such as (2) can typically be known. However, proponents of this view generally deny that propositions such as (2), and even propositions such as (3), (4), (5) or (6), can be certain. Therefore, logical fallibilism about knowledge is compatible with a moderately infallibilist conception of epistemic certainty, on which epistemic certainty requires that a subject’s evidence entail that p, or that p’s epistemic probability be maximal (Reed 2008, Dougherty 2011). In brief, logical fallibilism about knowledge states that p can be known even if p is not epistemically certain, even in the sense of the moderate infallibilist view of certainty.

Whether or not one endorses logical fallibilism about knowledge, a moderate infallibilist view of certainty which relies on a generous account of evidence accepts that p is epistemically certain for S if and only if S’s evidence rules out all the possibilities in which p is false, where a possibility is “ruled out” by evidence when it is logically incompatible with it (or with the fact that S has this evidence: Lewis 1996). According to this approach, the certainty-conferring evidence must have probability 1 and be true; otherwise, the entailed proposition could be false and receive a probability lower than 1 (Brown 2018: 28). If one endorses logical fallibilism about knowledge, then epistemic certainties are a subset of what the subject knows, or is in a position to know (Littlejohn 2011, Petersen 2019, Beddor 2020b, Vollet 2020).

A general concern for this type of approach is that it can seem circular or insufficient. Indeed, the propositions (evidence) that grant epistemic certainty must themselves be epistemically certain. If not, what can be deduced from them will not be certain. Therefore, according to such approaches, one must assume that there are primitive epistemic certainties, that is, propositions whose prior probability is 1 (Russell 1948, Van Cleve 1977). However, the question remains: in virtue of what do these propositions have such a high probability? In addition, as logical truths are logically entailed by any set of evidence, this approach fails to account for the fact that logical truths can be epistemically uncertain.

ii. Fallibilism

According to other philosophers, while epistemic certainty depends on the evidence a subject has, that evidence does not need to entail that p in order for p to be certain. To express that claim in terms of epistemic probability: the probability of p being true conditional on the evidence possessed by a subject does not need to be maximal for p to be epistemically certain for that subject (see Reed 2002: 145-146 for further discussion).

According to Moore, if S knows that p, then it is epistemically impossible for S that p is false (Moore 1959). In other words, p is epistemically certain for S. However, in Moore’s view, one can know that p based on evidence that does not entail p’s truth. For instance, a subject can know that she has hands based on a visual experience of her hands so that it is epistemically impossible for her, given that experience, that she does not have hands. Yet, it is logically possible for the subject to undergo that visual experience without having hands. The logical possibility of undergoing that experience without having hands is simply conceived of by Moore as being compatible with the epistemic certainty regarding the fact that she has hands (see also DeRose 1991, Stanley 2005b). Thus, Moore offers a fallibilist conception of certainty based on a fallibilist conception of knowledge.

According to Moore’s conception of epistemic certainty, propositions such as (3) or (4) can be certain provided that their negation is incompatible with what a subject knows and propositions such as (5) can be uncertain if their negation is compatible with what a subject knows. In addition, this framework opens up the possibility of a weak conception of epistemic certainty, according to which propositions such as (2) can be epistemically certain.

In Moore’s approach, epistemic certainty is identified with knowledge. Nevertheless, Moore himself acknowledges that one may want to draw a distinction between knowledge and certainty (see also Firth 1967: 10, Miller 1978: 46n3, Lehrer, 1974, Stanley 2008). As previously noted, it is common to draw such a distinction by endorsing a logical fallibilist conception of knowledge, while also maintaining an infallibilist (either moderate or skeptical) conception of certainty. But is it possible to endorse, with Moore, a fallibilist conception of epistemic certainty and still draw a distinction between knowledge and epistemic certainty?

That is possible if one endorses another version of fallibilism with respect to knowledge, known as epistemic fallibilism. According to epistemic fallibilism, S can know that p even if S cannot rule out every possibility of p being false, where a possibility of p being false is “ruled out” whenever it is logically incompatible with what S knows to be true (Dretske 1981: 371). Endorsing such a conception of knowledge involves rejecting epistemic closure: it involves accepting that S can know that p, and that p entails q, without thereby knowing (or being in a position to know) whether q is true. For example, S can know that she has hands, and that her having hands entails that she is not a handless brain in a vat, without thereby being in a position to know whether she is a handless brain in a vat (Nozick 1981, Dretske 1970).

Thus, even if one endorses a logical fallibilist conception of epistemic certainty, one can maintain that, in contrast to knowledge, certainty requires epistemic infallibility in this sense. That is, epistemic certainty requires having a justification for p such that every possibility of p being false is ruled out. This is illustrated by the fallibilist conception of epistemic certainty in terms of immunity presented below. Given this approach, it is possible to claim that propositions such as (2) can be known, while also claiming that they cannot be epistemically certain. Though a subject can know that her lottery ticket is a losing one, it is not certain for her that it is a losing ticket. Indeed, the justification that the subject has for her belief does not rule out the possibility that her ticket is a winning one.

Another way to draw a distinction between knowledge and epistemic certainty which is compatible with a fallibilist conception of epistemic certainty is to argue that certainty involves, in addition to knowing a particular proposition to be true, having a specific epistemic perspective on that knowledge. For instance, Descartes acknowledges that an atheist mathematician can possess a cognitio of mathematical truths, but claims that she could not possess a scientia of that domain (Descartes Meditations on First Philosophy, second answer to objections). The reason is that an atheist mathematician does not recognize the divine guarantee that anything which is conceived of clearly and distinctly is thereby true. Thus, whenever she conceives mathematical truths clearly and distinctly, she cannot know that she knows. Accordingly, her knowledge of mathematical truths remains uncertain (Carrier 1993). Likewise, Williamson endorses the idea that a form of epistemic uncertainty can result from a lack of second-order knowledge (Williamson 2005; 2009, Rysiew 2007: 636-37, 657–58, n. 13, Turri 2010).

iii. Epistemic Immunity and Incorrigibility

According to moderate conceptions of epistemic certainty, propositions such as (3) or (4) can be certain for a subject. This is because either these propositions can be deduced from other propositions which are themselves certain, or the justification one has for them is, although fallible, sufficient for certainty. Yet, if certainty depends neither on complete infallibility nor on maximal justification, what does it depend on, precisely? What makes propositions such as (3) and (4), or the propositions from which they can be deduced, certain for a subject?

The plausibility of strong conceptions of epistemic certainty results, at least partly, from the fact that they attribute a type of epistemic immunity to propositions which are certain. As previously noted, if a proposition p is maximally justified for a subject, then that proposition is immune to refutation. This is because there is no proposition q, such that q is incompatible with p, which can defeat or diminish p’s justification. A proposition also seems to be immune to refutation so long as the evidence one has for that proposition is itself infallible. One aim of moderate conceptions of epistemic certainty is therefore to offer a conception of epistemic certainty that attributes a form of epistemic immunity to propositions which are considered certain, without making this immunity dependent on complete infallibility or maximal justification.

Incorrigibility is a type of epistemic immunity that depends neither on complete infallibility nor on maximal justification. The justification one has for p is incorrigible if it constitutes an ultimate authority on the question as to whether p, and it cannot be defeated (Ayer 1963: 70-73, Firth 1967: 21, Alston 1992, Reed 2002: 144). Propositions concerning one’s mental states – such as one’s intentions, feelings, thoughts and immediate experiences – are typical examples of propositions for which one can have an incorrigible justification (Malcolm 1963: 77-86, Ayer 1963: 68-73, Firth 1967: 21). For instance, if a subject sincerely asserts that she undergoes an experience as of something red, this assertion seems to provide an incorrigible justification for the proposition “That subject undergoes an experience as of something red” (see Armstrong 1963 for a critical discussion).

However, the fact that the justification one has for such a proposition is incorrigible does not entail that the proposition thus justified is true (Ayer 1963: 71-73), for incorrigibility does not require infallibility (Firth 1967: 25). Some philosophers suggest that, with respect to our own mental states, the incorrigibility of the justification we have for a proposition such as “I am in pain” depends on the fact that we are infallible regarding our mental states (Malcolm 1963: 85). Instead, however, one can suppose that incorrigibility results from enjoying a privileged, yet fallible, access to our mental states. For example, if I consider two lines of similar length, I can wonder which line appears to be the longer. If I can doubt this particular appearance, then I can be wrong about it; I can falsely believe that the second line appears to be longer than the first (Ayer 1956: 65). Additionally, incorrigible justification does not require maximal justification. In a world where no one could be wrong about their own mental states, sincerely asserted propositions concerning one’s mental states would be even better justified than they are in the actual world. This suggests that, in the actual world, the incorrigible justification one has for such propositions is not maximal.

According to this approach, propositions about the external world such as (4) can be considered epistemically certain. This is because it seems unreasonable to question the truth of propositions such as “The world was not created five minutes ago”, “The external world exists” or “I have hands”. According to hinge theorists following Wittgenstein, the position of such propositions within one’s conceptual scheme is what makes them immune or incorrigible, and thereby certain (Wittgenstein 1969, Coliva 2015, Pritchard 2016). The truth of these propositions must be assumed in order for one to be able to operate with the notions of doubt and justification at all. Outside of very specific contexts, such as those involving someone who has lost her hands in an accident, attempting to justify or doubt the truth of “I have hands” is simply nonsensical. In Wittgenstein’s view, this incorrigibility is what distinguishes knowable propositions from certain propositions: the former can be objects of doubt and justification, while the latter cannot.

Another approach proposes that propositions about the external world can be certain even if all their logical consequences have not been verified, so long as their justification is sufficiently immune (contra C. I. Lewis 1946: 80, 180). One specification of sufficient immunity relies on the concept of ideal irrefutability. A proposition p is ideally irrefutable at a time t for a subject S if and only if there is no conceivable event such that, if at t S were justified in believing that this event will occur, S would also be justified in believing that p is false at t (Firth 1967: 16). In other words, a proposition p is certain if and only if one is justified in believing that p, and no future test which one is justified in believing will occur (or which one could imagine) would provide a justification for believing that p is false (Malcolm 1963: 68).

For example, suppose I see that there is an ink bottle here. This is compatible with the possibility of my having the future sensation of my hand passing through the ink bottle. I may even be justified in believing that I will undergo such a sensation (suppose, for example, that a reliable prediction has been made). Yet, there is a sense in which the present sensation of there being an ink bottle here justifies treating, at the moment of my seeing, any future sensation indicating that there is not an ink bottle here as misleading. After all, for any such future sensation, there exists a possible explanation that is compatible with the claim “There is an ink bottle here”. For example, a future sensation of my hand passing through the ink bottle might be explained as a hallucination. If I am, at the moment of my seeing, justified in believing that there actually is an ink bottle here, it seems that, at that very moment, I am also justified in believing that any future sensation is explainable in a way that is compatible with the claim “There is an ink bottle here”. If so, at the moment of my seeing, the proposition that there is an ink bottle here can be said to be ideally irrefutable and, according to the view of epistemic certainty under examination, epistemically certain for me.

Note that this does not mean that the propositions which are epistemically certain for S in the present will remain epistemically certain for S in the future. If, in the future, S has the sensation that her hand passes through the ink bottle and her visual experience of the ink bottle changes, the proposition “There is, or was, an ink bottle here” can become epistemically uncertain for S (Klein 1981: 91). What matters here is that, at the moment of S’s seeing that there is an ink bottle, S is justified in believing that any future sensation can be explained in a way that is compatible with the claim “There is an ink bottle here”.

In comparison, Miller defends a weaker account of immunity and certainty. According to him, the justification possessed by S makes p certain only if there can be no other proposition q which is justified enough to show that S should not believe p, in spite of S’s current reasons to believe p (Miller 1978). In other words, p is certain for S if it is always permissible for S to believe p in light of S’s current and future evidence. On this view, it does not matter whether new evidence would make the belief that not-p permissible, or whether the hypothesis that not-p would constitute the best available explanation of the new evidence. For example, suppose that a scientist and everyone around me say that there have never been cars, and that I am just waking from a dream caused by drugs. If I add this experience to my memories of cars, it still seems permissible for me to believe that there are cars, and to doubt the testimony of the people telling me the contrary, even if I can find no good explanation for my new experience.

According to a stronger characterization of immunity, certainty requires that the proposition p be ideally immune to any decrease in justification. In this sense, it is not clear that the proposition “There is an ink bottle here” is certain. For it seems that if I were justified in believing that my hand will pass through the supposed ink bottle, my justification for believing that there is an ink bottle would diminish (Firth 1967: 17, Miller 1978, contra Malcolm 1963: 93).

A slightly different approach proposed by Klein also requires immunity against a decrease in justification (Klein 1981; 1992). According to him, p is absolutely certain for S if and only if (a) p is warranted for S, (b) S is warranted in denying every proposition q such that, if q were added to S’s beliefs, the warrant for p would be reduced (subjective immunity), and (c) there is no true proposition d such that, if d were added to S’s true beliefs, the warrant for p would be reduced (objective immunity). The satisfaction of condition (a) does not entail that p is true, for the set of justified beliefs can include false beliefs, but condition (c) can be satisfied only if p is true.

This approach proposes that epistemic certainty requires immunity against all attacks in the actual world, but not immunity against all attacks in all possible worlds (in particular, in the worlds in which the considered proposition is false). The fact that it is not certain (for S) that a proposition is certain, or the fact that this proposition is not certain in all possible worlds, does not make this proposition uncertain in the actual world (Klein 1981: 181-189).

One can apprehend the distinction between certainty in the actual world and certainty in all possible worlds with the notion of relative certainty. When speaking of relative certainty, one may want to characterize a degree of justification more or less close to absolute certainty in the actual world, which implies some uncertainty for the proposition in the actual world. But one may also want to designate a degree of justification more or less close to absolute certainty in all possible worlds. If, in this second sense, relative certainty implies uncertainty for the proposition in some possible worlds, it does not imply uncertainty for the proposition in the actual world (Klein 1981: 189).

Most theories of certainty based on the notion of epistemic immunity are strong moderate theories. They take p to be epistemically certain for S only if, for any contrary proposition q (one which implies not-p or decreases the justification for p), S is permitted to deny q. Hence, these theories rule out the certainty of propositions such as (3), for it is not difficult to imagine a situation in which the police call you to say that your car has been stolen. In such a situation, it does not seem that you are permitted to deny that your car has been stolen.

However, a question remains regarding the kind of certainty this approach assigns to propositions such as (4). In virtue of what would S be allowed to deny a contrary proposition q, for example, the proposition that the world was created five minutes ago and we were born two minutes ago with false memories? If it is in virtue of the fact that p is certain, it appears that immunity is a logical consequence of epistemic certainty, rather than its grounds (Klein 1981: 30, Reed 2008). If it is in virtue of the fact that p occupies a specific place in our conceptual or linguistic scheme (Wittgenstein 1969), or that one cannot imagine or conceive of a possible refutation or invalidation for p, it is not clear that the certainty attached to propositions such as (4) is epistemic, rather than merely psychological.

d. Weak Theories of Epistemic Certainty

i. The Relativity of Certainty

According to weak theories of certainty, propositions such as (2) can be certain. As outlined in the previous section, one may understand the notion of certainty in relation to a class of propositions justified or true in the actual world, or in relation to a class of propositions justified or true in all possible worlds. Yet, there are many other ways of relativizing the notion of certainty (Firth 1967: 10-12). One example is Chisholm’s view, mentioned previously (Chisholm 1976; 1989). Malcolm, for his part, suggests that there are various kinds of justification associated with different kinds of propositions, which can give rise to various criteria for certainty (Malcolm 1963). For instance, to see in full light that a given object is a plate seems to provide one with a maximally strong justification for believing that the object is a plate. That is because, according to Malcolm, no one “has any conception of what would be a better proof that it is a plate” (Malcolm 1963: 92). Still, as Firth (1967: 19-20) notes, that depends on the criterion used to define “better”. According to a Cartesian criterion, we would have a better justification in a world in which vision is infallible and our senses are never misleading. Although it is possible to defend a weak invariantist account of epistemic certainty, the fact that various criteria of epistemic certainty can be conceived of may suggest that these criteria are, in fact, shifty.

ii. Contextualism

A first way of elaborating the thought that the standards of certainty are shifty consists in suggesting that ascriptions of certainty are, with respect to their truth-conditions, context-sensitive (Lewis 1979: 353-354, Stanley 2008, Petersen 2019, Beddor 2020a, b, Vollet 2020). A theory of this kind has notably been defended regarding ascriptions of knowledge. On this view, the epistemic standards that a subject S must satisfy with respect to a proposition p for a statement such as “S knows that p” to be true depend on the conversational context (Cohen 1988, Lewis 1996, DeRose 2009). Some relevant features of the context are the salience of various error possibilities, as well as the possibilities that one must take into account given the stakes attached to being wrong about p. Just as the question of whether S is tall cannot be answered without (implicitly) invoking a reference class relative to which a standard is fixed (for instance, tall “for a basketball player” or “for a child”), the question of whether a proposition is certain might not be answerable independently of the context in which the word “certain” is used. There could be contexts in which the statement “It is certain that my lottery ticket is a losing ticket” is true – for example, a context in which we are discussing whether to make an important financial investment – and contexts in which that statement is false even though the evidential situation remains the same – for example, a context in which we are discussing the fact that at least one lottery ticket will be the winning one.

However, adopting a contextualist view of certainty does not suffice to vindicate a weak theory of certainty. For example, Beddor proposes that the ascription “It is (epistemically) certain for S that p” is true if and only if p is true in all the contextually relevant worlds compatible with S’s epistemic situation, where the space of contextually relevant worlds includes all worlds close to the actual world. Under the assumption that there is always a nearby world in which one’s ticket wins, (2) cannot, given such a view, qualify as certain (see also Lewis’s rule of resemblance, 1996: 557).

iii. Pragmatic Encroachment

Some authors claim that the epistemic standards a subject must satisfy in order to know a proposition are partially determined by whether it is rational for this subject to act on the proposition’s truth given her overall practical situation (Stanley 2005a, Hawthorne 2004, Fantl and McGrath 2009). One may suggest that this “pragmatic encroachment” also – or instead – concerns epistemic certainty. For example, Stanley argues for the existence of pragmatic encroachment on knowledge and maintains that knowledge determines epistemic certainties, that is, the epistemic possibilities relative to which a proposition can be considered epistemically certain (Stanley 2005a). Fantl and McGrath, for their part, defend the existence of pragmatic encroachment on knowledge-level justification but reject the claim that knowledge-level justification determines epistemic certainties (Fantl and McGrath 2009). A third option would be to reject pragmatic encroachment on knowledge, as well as the idea that knowledge determines epistemic certainties, while allowing pragmatic encroachment on epistemic certainties.

The conceptions according to which the criteria of epistemic certainty shift with the conversational context or with the practical cost of error are compatible with a weak conception of epistemic certainty. Indeed, they can easily grant that there are contexts in which one says something true when one says of a proposition like (2) that it is certain for S.

4. Connections to Other Topics in Epistemology

The notion of certainty is connected to various epistemological debates. In particular, it is connected to philosophical issues concerning norms of assertions, actions, beliefs and credences. It also concerns central questions regarding the nature of evidence, evidential probability, and the current debate regarding epistemic modals (Beddor 2020b).

For example, some philosophers distinguish knowledge and certainty and propose to deal with concessive knowledge attributions by embracing the view that certainty is the epistemic norm of assertion (Stanley 2008, Petersen 2019, Beddor 2020b, Vollet 2020). A prominent argument for such a certainty norm of assertion comes from the infelicity of Moorean assertions involving certainty, such as “p, but it’s not / I’m not certain that p”. In a similar vein, some philosophers defend a certainty norm of action and practical reasoning (Beddor 2020a, Vollet 2020). This is in part because such a norm can easily handle some of the counterexamples raised against competing knowledge norms (for such counterexamples, see Brown 2008, Reed 2010 and Roeber 2018; for an overview of knowledge norms, see Benton 2014).

With respect to the nature of evidence and evidential probability, Beddor argues that we should analyze evidence in terms of epistemic certainty (Beddor 2020b). Such a view is supported by the oddity of the utterance “It is certain that smoking causes cancer, but the evidence leaves open the possibility that smoking does not cause cancer”, which suggests that if p is epistemically certain, then p is entailed by the available evidence. In addition, the oddity of the utterance “The medical evidence entails that smoking causes cancer, but it isn’t certain that smoking causes cancer” suggests that p is entailed by the available evidence only if p is epistemically certain.

Thus, given the relations it bears to other important philosophical notions, it is clear that certainty is central to epistemological theorizing. The difficulty of providing a fully satisfactory analysis of this notion might then suggest that certainty should, in fact, be treated as primitive.

5. References and Further Readings

  • Alston, W. (1992). Incorrigibility. In Dancy, Jonathan & Sosa, Ernest (Eds.), A Companion to Epistemology. Wiley-Blackwell.
  • Aristotle (1984). The Complete Works of Aristotle, Volumes I and II, ed. and tr. J. Barnes, Princeton: Princeton University Press.
  • Armstrong, D. M. (1963). Is Introspective Knowledge Incorrigible? Philosophical Review 72 (4): 417.
  • Armstrong, D. M. (1981). The Nature of Mind and Other Essays. Ithaca: Cornell University Press.
  • Audi, R. (2003). Epistemology: A Contemporary Introduction to the Theory of Knowledge. Routledge.
  • Ayer, A.J. (1956). The Problem of Knowledge. London: Penguin.
  • Ayer, A. J. (1963). The Concept of a Person and Other Essays. New York: St. Martin’s Press.
  • Barnes, G. W. (1973). Unger’s Defense of Skepticism. Philosophical Studies 24 (2): 119-124.
  • Beddor, B. (2020a). Certainty in Action. Philosophical Quarterly 70 (281): 711-737.
  • Beddor, B. (2020b). New Work for Certainty. Philosophers’ Imprint 20 (8).
  • Benton, M. A. (2014). Knowledge Norms. Internet Encyclopedia of Philosophy.
  • Brown, J. (2008). Subject‐Sensitive Invariantism and the Knowledge Norm for Practical Reasoning. Noûs 42 (2):167-189.
  • Brown, J. (2011). Fallibilism and the Knowledge Norm for Assertion and Practical Reasoning. In Brown, J. & Cappelen, H. (Eds.), Assertion: New Philosophical Essays. Oxford University Press.
  • Brown, J. (2018). Fallibilism: Evidence and Knowledge. Oxford University Press.
  • Cargile, J. (1972). In Reply to A Defense of Skepticism. Philosophical Review 81 (2): 229-236.
  • Carnap, R. (1947). Meaning and Necessity. University of Chicago Press.
  • Carrier, L. S. (1983). Skepticism Disarmed. Canadian Journal of Philosophy. 13 (1): 107-114.
  • Carrier, L. S. (1993). How to Define a Nonskeptical Fallibilism. Philosophia 22 (3-4): 361-372.
  • Chisholm, R. (1976). Person and Object. La Salle, IL: Open Court.
  • Chisholm, R. (1989). Theory of Knowledge. 3rd. ed. Englewood Cliffs. NJ: Prentice-Hall.
  • Clarke, R. (2013). Belief Is Credence One (in Context). Philosophers’ Imprint 13:1-18.
  • Cohen, S. (1988). How to Be a Fallibilist. Philosophical Perspectives 2: 91-123.
  • Coliva, A. (2015). Extended Rationality: A Hinge Epistemology. Palgrave-Macmillan.
  • DeRose, K. (1991). Epistemic Possibilities. Philosophical Review 100 (4): 581-605.
  • DeRose, K. (1998). Simple ‘might’s, indicative possibilities and the open future. Philosophical Quarterly 48 (190): 67-82.
  • DeRose, K. (2009). The Case for Contextualism: Knowledge, Skepticism, and Context, Vol. 1. Oxford University Press.
  • Descartes, R. (1999). Rules for the Direction of the Natural Intelligence: A Bilingual Edition of the Cartesian Treatise on Method, ed. and tr. George Heffernan. Amsterdam: Editions Rodopi.
  • Descartes, R. (2008). Meditations on First Philosophy: With Selections from the Objections and Replies, trans. Michael Moriarty. Oxford: Oxford University Press.
  • Dicker, G. (1974). Certainty without Dogmatism: a Reply to Unger’s ‘An Argument for Skepticism’. Philosophic Exchange 5 (1): 161-170.
  • Dodd, D. (2010). Confusion about concessive knowledge attributions. Synthese 172 (3): 381 – 396.
  • Dodd, D. (2017). Belief and certainty. Synthese 194 (11): 4597-4621.
  • Dokic, J. (2012). Seeds of self-knowledge: noetic feelings and metacognition. Foundations of metacognition 6: 302–321.
  • Dokic, J. (2014). Feelings of (un)certainty and margins for error. Philosophical Inquiries 2(1): 123–144.
  • Dokic, J. & Engel, P. (2001). Frank Ramsey: Truth and Success. London: Routledge.
  • Dougherty, T. & Rysiew, P. (2009). Fallibilism, Epistemic Possibility, and Concessive Knowledge Attributions. Philosophy and Phenomenological Research 78 (1):123-132.
  • Dougherty, T. & Rysiew, P. (2011). Clarity about concessive knowledge attributions: reply to Dodd. Synthese 181 (3): 395-403.
  • Dougherty, T. (2011). Fallibilism. In Duncan Pritchard & Sven Bernecker (eds.), The Routledge Companion to Epistemology. Routledge.
  • Douven, I. & Olders, D. (2008). Unger’s Argument for Skepticism Revisited. Theoria 74 (3): 239-250.
  • Dretske, F. (1970). Epistemic Operators. Journal of Philosophy 67: 1007-1023.
  • Dretske, F. (1971). Conclusive reasons. Australasian Journal of Philosophy 49 (1):1-22.
  • Dretske, F. (1981). The Pragmatic Dimension of Knowledge. Philosophical Studies 40: 363-378.
  • Dutant, J. (2015). The legend of the justified true belief analysis. Philosophical Perspectives 29 (1): 95-145.
  • Dutant, J. (2016). How to be an Infallibilist. Philosophical Issues 26 (1): 148-171.
  • Fantl, J. (2003). Modest Infinitism. Canadian Journal of Philosophy 33 (4): 537- 562.
  • Fantl, J. & McGrath, M. (2009). Knowledge in an Uncertain World. Oxford University Press.
  • de Finetti, B. (1937). La Prévision: Ses Lois Logiques, Ses Sources Subjectives. Annales de l’Institut Henri Poincaré 7: 1–68.
  • de Finetti, B. (1990). Theory of Probability (Volume I). New York: John Wiley.
  • Firth, R. (1967). The Anatomy of Certainty. Philosophical Review 76: 3-27.
  • Foley, R. (1992). Working Without a Net: A Study of Egocentric Epistemology. New York: Oxford University Press.
  • Fumerton, R. (2005). Theories of justification. In Paul K. Moser (Ed.), The Oxford Handbook of Epistemology. Oxford University Press: 204–233.
  • Ganson, D. (2008). Evidentialism and pragmatic constraints on outright belief. Philosophical Studies 139 (3): 441- 458.
  • Gärdenfors, P. and D. Makinson (1988). Revisions of Knowledge Systems Using Epistemic Entrenchment. In Theoretical Aspects of Reasoning About Knowledge, Moshe Verde (Ed.) (Morgan Kaufmann): 83–95.
  • Greco, D. (2015). How I learned to stop worrying and love probability 1. Philosophical Perspectives 29 (1): 179-201.
  • Hawthorne, J. (2004). Knowledge and Lotteries. Oxford University Press.
  • Hawthorne, J., Rothschild, D. & Spectre, L. (2016). Belief is weak. Philosophical Studies 173 (5): 1393-1404.
  • Hintikka, J. (1962). Knowledge and Belief: An Introduction to the Logic of the Two Notions. V. Hendriks and J. Symons (Eds.). London: College Publications.
  • Huemer, M. (2007). Epistemic Possibility. Synthese 156 (1): 119-142.
  • Hume, D. (1975). A Treatise of Human Nature. ed. by L. A. Selby-Bigge, 2nd ed. rev. by P. H. Nidditch. Oxford: Clarendon Press.
  • Hume, D. (1993). An Enquiry Concerning Human Understanding. ed. Eric Steinberg. Indianapolis: Hackett Publishing Co.
  • Jeffrey, R. (1965). The Logic of Decision. New York: McGraw-Hill.
  • Jeffrey, R. (1970). Dracula meets Wolfman: Acceptance vs. Partial Belief’. In Induction, Acceptance, and Rational Belief. Marshall Swain (Ed.) Dordrecht: D. Reidel Publishing Company: 157-85.
  • Jeffrey, R. (2004). Subjective Probability. The Real Thing. Cambridge: Cambridge University Press.
  • Joyce, J. M. (1999). The Foundations of Causal Decision Theory. New York: Cambridge University Press.
  • Kattsoff, L. O. (1965). Malcolm on knowledge and certainty. Philosophy and Phenomenological Research 26 (2): 263-267.
  • Kauss, D. (2020). Credence as doxastic tendency. Synthese 197 (10): 4495-4518.
  • Klein, P. (1981). Certainty: A Refutation of Scepticism. Minneapolis: University of Minnesota Press.
  • Klein, P. (1992). Certainty. In J. Dancy and E. Sosa (Eds.), A Companion to Epistemology. Oxford: Blackwell: 61-4.
  • Kolmogorov, A. N. (1956). Foundations of the Theory of Probability. New York: Chelsea Publishing Company.
  • Lasonen-Aarnio, M. (2020). Enkrasia or evidentialism? Learning to love mismatch. Philosophical Studies 177 (3): 597-632.
  • Lehrer, K. (1974). Knowledge. Oxford: Clarendon Press.
  • Leitgeb, H. (2013). Reducing belief simpliciter to degrees of belief. Annals of Pure and Applied Logic 164 (12): 1338-1389.
  • Leitgeb, H. (2014). The Stability Theory of Belief. Philosophical Review 123 (2): 131-171.
  • Leitgeb, H. (2017). The Stability of Belief: How Rational Belief Coheres with Probability. Oxford University Press
  • Levi, I. (1983). Truth, fallibility and the growth of knowledge. In R. S. Cohen & M. W. Wartofsky (Eds.), Boston studies in the philosophy of science (Vol. 31, pp. 153–174). Dordrecht: Springer
  • Lewis, C.I. (1929). Mind and the World Order. New York: Dover.
  • Lewis, C. I. (1946). An Analysis of Knowledge and Valuation. Open Court.
  • Lewis, D. (1979). Scorekeeping in a Language Game. Journal of Philosophical Logic 8 (1): 339-359.
  • Lewis, D. (1996). Elusive Knowledge. Australasian Journal of Philosophy 74 (4): 549-567.
  • Littlejohn, C. (2011). Concessive Knowledge Attributions and Fallibilism. Philosophy and Phenomenological Research 83 (3): 603-619.
  • Locke, D. (2015). Practical Certainty. Philosophy and Phenomenological Research 90 (1): 72-95.
  • Locke, J. (1975). An Essay Concerning Human Understanding, Peter H. Nidditch (Ed.), Oxford: Clarendon Press.
  • Malcolm, N. (1952). Knowledge and belief. Mind 61 (242): 178-189.
  • Malcolm, N. (1963). Knowledge and Certainty. Englewood Cliffs, NJ: Prentice-Hall.
  • McDowell, J. H. (1982). Criteria, Defeasibility, and Knowledge. Proceedings of the British Academy, 68: 455–479.
  • Miller, R. W. (1978). Absolute certainty. Mind 87 (345): 46-65.
  • Moore G.E. (1959). Certainty. In Philosophical Papers. London: George Allen & Unwin, 227-251.
  • Nozick, R. (1981). Philosophical Explanations. Cambridge: Cambridge University Press.
  • Pasnau, R. (2013). Epistemology Idealized. Mind 122 (488): 987-1021.
  • Peirce, C. (1877/2011). The Fixation of Belief. In R. Talisse & S. Aikin (Eds.). The Pragmatism Reader: From Peirce Through the Present. Princeton University Press: 37-49.
  • Petersen, E. (2019). A case for a certainty norm of assertion. Synthese 196 (11): 4691-4710.
  • Plantinga, A. (1993). Warrant and Proper Function. Oxford University Press.
  • Plato (1997). Republic. In J. M. Cooper (Ed.). Plato: Complete Works. Indianapolis: Hackett.
  • Pritchard, D. (2008). Certainty and Scepticism. Philosophical Issues 18 (1): 58-67.
  • Pritchard, D. (2016). Epistemic Angst. Radical Scepticism and the Groundlessness of Our Believing, Princeton University Press.
  • Ramsey, F. P. (1926). Truth and Probability. In R. B. Braithwaite (Ed.). Foundations of Mathematics and Other Logical Essays. London: Kegan, Paul, Trench, Trubner & Co., New York: Harcourt, Brace and Company: 156–198.
  • Reed, B. (2002). How to Think about Fallibilism. Philosophical Studies 107: 143-57.
  • Reed, B. (2008). Certainty. Stanford Encyclopedia of Philosophy.
  • Reed, B. (2010). A defense of stable invariantism. Noûs 44 (2): 224-244.
  • Roeber, B. (2018). The Pragmatic Encroachment Debate. Noûs 52 (1): 171-195.
  • Roorda, J. (1997). Fallibilism, Ambivalence, and Belief. Journal of Philosophy 94 (3): 126.
  • Rothschild, D. (2020). What it takes to believe. Philosophical Studies 177 (5): 1345-1362.
  • Russell, B. (1912). The Problems of Philosophy. London: Williams & Norgate.
  • Russell, B. (1948). Human Knowledge: Its Scope and Limits. New York: Simon and Schuster.
  • Rysiew, P. (2001). The Context-sensitivity of Knowledge Attributions. Noûs 35 (4): 477–514.
  • Rysiew, P. (2007). Speaking of Knowledge. Noûs 41: 627–62.
  • Savage, L. J. (1954). The Foundations of Statistics. New York: John Wiley.
  • Skyrms, B. (1980). Causal Necessity: A Pragmatic Investigation of the Necessity of Laws. Yale University Press.
  • Stalnaker, R. (1991). The problem of logical omniscience, I. Synthese 89 (3): 425–440.
  • Stanley, J. (2005a). Knowledge and Practical Interests. Oxford University Press.
  • Stanley, J. (2005b). Fallibilism and concessive knowledge attributions. Analysis 65 (2): 126-131.
  • Stanley, J. (2008). Knowledge and Certainty. Philosophical Issues 18 (1): 35-57.
  • Sturgeon, S. (2008). Reason and the grain of belief. Noûs 42 (1): 139–165.
  • Turri, J. (2010). Prompting Challenges. Analysis 70 (3): 456-462.
  • Unger, P. (1975). Ignorance: A Case for Scepticism. Oxford: Clarendon Press.
  • Van Cleve, J. (1977). Probability and Certainty: A Reexamination of the Lewis-Reichenbach Debate. Philosophical Studies 32: 323-34.
  • Vazard, J. (2019). Reasonable doubt as affective experience: Obsessive–compulsive disorder, epistemic anxiety and the feeling of uncertainty. Synthese https://doi.org/10.1007/s11229-019-02497-y
  • Vollet, J.-H. (2020). Certainty and Assertion. Dialectica, 74 (3).
  • Vollet, J.-H. (2022). Epistemic Excuses and the Feeling of Certainty, Analysis.
  • von Fintel, K. and A. Gillies (2007). An Opinionated Guide to Epistemic Modality. In T. Gendler and J. Hawthorne (ed.), Oxford Studies in Epistemology, Volume 2. New York: Oxford University Press.
  • Weatherson, B. (2005). Can we do without pragmatic encroachment. Philosophical Perspectives 19 (1): 417–443.
  • Wedgwood, R. (2012). Outright Belief. Dialectica 66 (3): 309–329.
  • Williamson, T. (2000). Knowledge and Its Limits. Oxford University Press.
  • Williamson, T. (2009). Reply to Mark Kaplan. In Pritchard, D. and Greenough, P. (ed.) Williamson on Knowledge. Oxford: Oxford University Press .
  • Wittgenstein, L. (1969). On Certainty. G.E.M. Anscombe & G.H. von Wright (Eds.). New York: Harper & Row.

 

Author Information

Miloud Belkoniene
Email: miloud@belkoniene.org
University of Glasgow
United Kingdom

and

Jacques-Henri Vollet
Email: jacquesvollet@yahoo.fr
University Paris-Est Créteil
France

Bodily Awareness

Most of us agree that we are conscious, and we can be consciously aware of public things such as mountains, tables, food, and so forth; we can also be consciously aware of our own psychological states and episodes such as emotions, thoughts, perceptions, and so forth. Each of us can be aware of our body via vision, hearing, smell, and so on. We can also be aware of our own body “from the inside,” via proprioception, kinaesthesis, the sense of balance, and interoception. When you are reading this article, in addition to your visual experiences of many words, you might feel that your legs are crossed, that one of your hands is moving toward a coffee mug, and that you are a bit hungry, without ever seeing or hearing your limbs and your stomach. We all have these experiences. The situation becomes peculiar, intriguing, and surprising if we reflect upon it a bit more: the body and its parts are objective, public things, which is why in principle everyone else can perceive our bodies. But the body and its parts also have a subjective dimension, which is why many believe that in principle only one’s own self can be aware of one’s own body “from the inside.” Consciousness, or awareness, of one’s own body can therefore generate many interesting and substantive philosophical and empirical questions, owing to these objective-subjective dual aspects, as is seen below. The beginning of section 1 introduces the structure of this article and presents some caveats. Encountering these early on can be daunting, but they appear there because this is a complicated area of study.

Table of Contents

  1. Varieties of Bodily Awareness
    a. Touch
    b. Proprioception, Kinaesthesis, and the Vestibular Sense
    c. Thermal Sensation, Pain, and Interoception
    d. Bodily Feelings
    e. Bodily Representations: Body Image, Body Schema, and Peripersonal Space
  2. Contemporary Issues
    a. Is There a Tactile Field?
    b. Does Bodily Immunity to Error Through Misidentification Hold?
    c. How Do Body Ownership and Mental Ownership Relate?
    d. Must Bodily Awareness Be Bodily Self-Awareness?
    e. What Does Body Blindness, Actual or Imagined, Show?
  3. Phenomenological Insights: The Body as a Subjective Object and an Objective Subject
    a. Two Notions of the Body
    b. Non-Perceptual Bodily Awareness
  4. Conclusion
  5. References and Further Reading

1. Varieties of Bodily Awareness

Bodily awareness, or bodily consciousness, covers a wide range of experiences. It is closely related to, though crucially different from, bodily representation (1.e) and bodily self-awareness (2.d). Another related notion is bodily self-knowledge, which includes immunity to error through misidentification (2.b). What follows covers broad territory, and it is unrealistic to claim comprehensiveness. It is divided as follows: section 1 discusses varieties of bodily awareness, without committing to the view that this represents the classification of bodily awareness (Armstrong, 1962): different researchers would carve things up in slightly different ways, but the most important elements are covered here. Section 2 surveys several contemporary issues in analytic philosophy and the cognitive sciences. Note that the divide between sections 1 and 2 is somewhat artificial: in introducing varieties of bodily awareness, we will of course discuss theoretical issues and questions in those areas; otherwise, it would become mere reportage. However, this divide is not entirely arbitrary: while section 1 is primarily about different varieties of bodily awareness, section 2 is explicitly question-oriented. The two sections are mutually complementary, not repetitive. Section 3 discusses some insights from the phenomenological tradition, with a specific focus on the lived body as a subjective object and an objective subject. The divide between sections 2 and 3 can also be seen as somewhat artificial: it would be perfectly sensible to spread these or even more phenomenological insights along the way in sections 1 and 2. That is not the strategy here because, in practice, these traditions mostly work in parallel and seek to communicate when opportunities arise. It is conceptually cleaner to proceed in a way that separates them first.
Also, the phenomenological insights covered below seem especially suitable for the larger issues in section 3, so we will save them mostly for that section, with the proviso that many ideas in section 3 rely on various elements in the previous sections, and that considerations from the analytic tradition will creep back toward the end. Note that the discussions of section 3 are highly selective; after all, this article is mostly written from the analytic point of view. Many phenomenologists have studied the body and bodily awareness intensively, but for the flow of the narrative and the scope of the article, not all of them can be included below. Notable names that we will not discuss include Aron Gurwitsch (1964), Michel Henry (1965), Dorothée Legrand (2007a, 2007b), and Dan Zahavi (2021). Section 4 concludes and summarises.

a. Touch

What is touch? This question is surprisingly difficult to answer if what we are looking for is a precise definition. Examples are easy to give: we (and other animals) touch things when we make contact with them with our hands, feet, or other parts of the body. Things quickly become murkier when we consider specific conditions; for example, is skin necessary for touch? Many animals do not have skin, at least under common understandings of what skin is, but they can touch things and have tactile experiences, at least according to most. Even humans seem to be able to touch things with their lips, tongues, and eyes, thereby having tactile experiences, even though these are not covered by skin. Some would even claim that when one’s stomach is in contact with food, one can sometimes feel tactile sensations, though see the discussions of interoception below (1.c). So even if we only focus on examples, it is difficult to differentiate touch from non-touch. Moreover, many touches or tactile experiences seem to involve indirect contact: for example, your hands can touch your shoulders even through clothes or gloves, and one’s hands can receive tactile feedback when using crutches to walk. Exactly how to conceive of the relation between touch and contact is controversial.

What about definitions then? This question often appears under the heading of “individuating the senses” (for example, Macpherson, 2011): what are the individuation conditions of, say, vision, audition, olfaction, gustation, touch, and perhaps other senses? Aristotle in De Anima proposed the “proper object account”: colours are only for vision, sounds are only for audition, smells are only for olfaction, tastes are only for gustation, and so on. But what about touch? There does not seem to be any proper object for it. With touch we can take in information about objects’ sizes and shapes, but these can also be taken in by sight, or perhaps even by audition: we seem to be able to hear (to some extent) whether rolling rocks are huge or small, or roughly what the shape of a room is (for example, Plumbley, 2013). Some have argued that pressure is the proper object of touch (Vignemont and Massin, 2015), though the controversy has not been settled. Researchers have proposed many other candidate criteria, including the representational criterion, the phenomenal character criterion, the proximal stimulus criterion, the sense-organ criterion, and so on. Each has its strengths and weaknesses. Still, there are difficult questions to answer, such as: are ventral and dorsal vision separate senses? How about orthonasal and retronasal olfaction (Wilson, 2021)? Do neutral touch, thermoception, and nociception form a unitary sense (Fulkerson, 2013)? To acknowledge touch as one element of bodily awareness, though, one does not need to resolve these difficult questions first.

Setting aside the above controversies, a basic distinction within touch is between haptic/active and passive touch. While in daily life creatures often actively explore objects in the environment, they also experience passive touch all the time; consider the contact between your body and the chair you sit on, or the clothing that covers different parts of your body. This distinction is closely related to, though not perfectly mapped onto, the distinction between kinaesthesis and proprioception (see the next subsection). In experimental work, laboratories tend to specialise in either haptic or passive touch, focusing on their temporal and/or spatial profiles. For example, in the famous cutaneous rabbit illusion (a.k.a. cutaneous saltation), where participants feel a tactile illusion induced by tapping multiple separate regions of the skin (often on a forearm) in rapid succession (Geldard and Sherrick, 1972), participants are asked not to move their body; the same is true of the perhaps even more famous rubber hand illusion, in which the feeling that a rubber hand belongs to one’s body is generated by stroking a visible rubber hand synchronously with the participant’s own hidden hand (Ehrsson, Spence, and Passingham, 2004; also see a related four-hand illusion in Chen, Huang, Lee, and Liang, 2018, where each participant has the illusory experience of owning four hands). Varieties of tactile and bodily illusions are important entry points for researchers to probe the distinctive properties of touch. Vignemont (2018) offers an excellent list of bodily illusions with informative descriptions (pp. 207–211).

An important approach to studying touch is to look into cases in which subjects have no sight, whether congenitally or otherwise (Morash, Pensky, Alfaro, and McKerracher, 2012). This also includes experimental conditions in which participants are blindfolded or situated in a dark room. This is a useful method because crossmodal or multisensory interactions can greatly influence tactile experiences; blocking the influence of vision (and other senses) therefore helps ensure that what is being studied is touch itself. This is one reason why Molyneux’s question is so theoretically relevant and intriguing (Locke 1693/1979; Cheng, 2020; Ferretti and Glenney, 2020). Molyneux’s question supposes that it is possible to restore the vision of those who are born completely blind. It then asks whether subjects who obtain this new visual capability can immediately tell which shapes are which, solely by vision. The answer depends on how we think of the structural similarities between sight and touch, how amodal spatial representation works in transforming spatial representations in different modalities, and so on. The same consideration about blocking crossmodal effects applies to audition: in experiments on touch, participants are often asked to put on earplugs or headphones playing white noise. The relations between sight, touch, and multimodality have been important in the literature, but this goes beyond the scope of this article.

Touch is a form of perception, and in many philosophical and empirical studies of touch, researchers focus primarily on its “cold” aspect; that is, people sometimes talk as if touch were primarily about gathering information about the immediate environment and one’s own body. But touch also has a “hot” aspect, which is often called “affective touch.” This cold/hot distinction is also applicable to other sense modalities, and even to cognition. While “cold” perceptions or cognitions are often said to be receptive and descriptive, “hot” perceptions and cognitions are by contrast evaluative and motivational. Affective perceptions involve conscious experiences, emotions, and evaluative judgments. Another way to pick out this “hot” aspect is to label these perceptions as “valenced.” Focusing on touch, it is notable that tactile experiences often, if not always, have felt pleasant or unpleasant phenomenal characters. Phenomenologically speaking, these valences might feel as if they are integral to tactile experiences themselves, though physiologically, specialised afferent nerve channels, “CT afferents,” might be distinctively responsible for pleasantness (McGlone, Wessberg, and Olausson, 2014). Affective perceptions, touch included, seem to be essential to varieties of social relations and aesthetic experiences, and this makes them a much wider topic of study in philosophy, psychology, and beyond (Nanay, 2016; Korsmeyer, 2020).

Touch carries information both about the external world and about the body itself (Katz, 1925/1989). It is related to other forms of bodily awareness, such as proprioception and kinaesthesis, thermal sensation and pain, interoception, and so on. These will be discussed in some detail in the following subsections. For other philosophical discussions concerning touch, for example, varieties of tangible qualities, the nature of pleasant touch, and the relation between touch and action, see for example Fulkerson (2015/2020).

b. Proprioception, Kinaesthesis, and the Vestibular Sense

The term “proprioception” can be traced back at least to Sherrington (1906): “In muscular receptivity, we see the body itself acting as a stimulus to its own receptors – the proprioceptors.” This definition has been refined many times in the past century, and the term has at least a broad and a narrow meaning. Broadly construed, the term is interchangeable with “kinaesthesis,” and the two jointly refer to the sense through which subjects perceive or sense the position and movement of their own bodies (Tuthill and Azim, 2018). Narrowly construed, “proprioception” refers to the perception, or at least the sensing, of the positions of our body parts, while “kinaesthesis” refers to the perception, or at least the sensing, of the movements of our body parts. The reservation here concerning perception is that some would think perception is necessarily exteroceptive and can be about multiple objects, while some might regard proprioception and kinaesthesis as interoceptive and as being about only one specific object (note that Sherrington himself clearly distinguishes proprioception from interoception; for more on interoception and related issues, see also 1.c and 3.b). With this narrower usage, one can see that proprioception and kinaesthesis can sometimes be dissociated, though they often occur together: when we sit or stand without any obvious movement, we still feel where our limbs are and how they stretch, and so forth, so this can be a case of having proprioception without kinaesthesis. In other cases, where someone moves around or uses their hands to grab things, they at the same time feel the positions and movements of their body parts.

Proprioception and kinaesthesis raise some distinctive philosophical issues (for example, Fridland, 2011); specifically, some have argued that surprisingly, one can proprioceive someone else’s movements in some sense (Montero, 2006); it is also explored as an aesthetic sense (Schrenk, 2014) and an affective sense (Cole and Montero, 2007). In considering deafferented subjects, who lack proprioceptive awareness of much of their bodies (or “body blind”; see 2.e), some have considered the role of proprioceptive awareness in our self-conscious unity as practical subjects (Howe, 2018). Relatedly, it has been argued that the possibility of bodily action is provided by multimodal body representations for action (Wong, 2017a). Also based on deafferented patients, some have argued that proprioception is necessary for body schema plasticity (Cardinali, Brozzoli, Luauté, Roy, and Farnè, 2016). Moreover, some have argued that proprioception is our direct, immediate knowledge of the body (Hamilton, 2005). It has also been identified as a crucial element in many other senses (O’Dea, 2011). And there is much more. To put it bluntly, proprioception is almost everywhere in our conscious life, though this might not be obvious before being pointed out. It is worth noting that the above contributions are from both philosophers and empirical researchers, and sometimes it is hard to figure out whether a specific work is by philosophers or scientists.

The vestibular sense or system in the inner ear is often introduced alongside proprioception and kinaesthesis as a bodily sense; it is our sense of balance, including sensations of body rotation, gravitation, acceleration, and movement. The system includes two structures of the bony labyrinth of the inner ear – the vestibule and the semicircular canals. When it goes wrong, we feel dizziness or vertigo. The basic functions of the vestibular system include stabilising postures and gazes and providing the gravitational or geocentric frame of reference (Berthoz, 1991). It is multisensory in the sense that it is often or even always implicated in other sense perceptions. Whether it has “proprietary phenomenology,” that is, phenomenology specific to it, is a matter of dispute (Wong, 2017b). It has been less visible in philosophical contexts, but in recent years it has entered the purview of philosophy. What are the distinctive features of the vestibular sense or system? Here are some potential candidates: vestibular afferents are constantly active even when we are motionless; there is “no overt, readily recognizable, localizable, conscious sensation from [the vestibular] organs” (Day and Fitzpatrick, 2005, p. R583); it enables an absolute frame of reference for self-motion, particularly absolute head motion in a head-centered frame of reference; and vestibular information and processing in the central nervous system is highly multisensory (Wong, 2017b). However, it can be argued that some of these characteristics are shared with other senses. For example, the first point might be applicable to proprioception, and the fourth point might be applicable to some cases of touch. Still, even if these four points are not exclusive to the vestibular sense, they are at least important characteristics of it. One major philosophical import of the vestibular sense lies in the ways in which it relates self, body, and world.
More specifically, the vestibular system plays crucial roles “in agentive self-location…, in anchoring the self to its body…, and in orienting the subject to the world… balance is being-in-my-body-in-the-world” (ibid., pp. 319–320, 328). Note that self-location is often but not always bound up with body-location: in the case of out-of-body experience (Lenggenhager, Tadi, Metzinger, and Blanke, 2007), for example, the two are dissociated. It has also been proposed that there should be a three-way distinction here: in addition to self-location and body-location, there is also “1PP-location”: “the sense of where my first-person perspective is located in space” (Huang, Lee, Chen, and Liang, 2017).

c. Thermal Sensation, Pain, and Interoception

Another crucial factor in bodily awareness is thermal sensation or thermoception, which is necessarily implicated in every tactile experience: people often do not notice the thermal aspect of touch, but it can become salient when, for example, the coffee is too hot, or the bathing water is too cold. Thermal sensations also occur in cases without touch: people feel environmental temperatures without touch (exteroceptive), and they feel body temperature in body parts that have no contact with things (interoceptive; for more on the exteroceptive and interoceptive characters of thermal perception, see Cheng, 2020). Thermal illusions are also ways of probing the nature of bodily awareness (for example, thermal referral, Cataldo, Ferrè, di Pellegrino, and Haggard, 2016; the thermal grill, Fardo, Finnerup, and Haggard, 2018). Connecting back to the discussion of individuating the senses, there is a question concerning how many senses there are within the somatosensory system. More specifically, are touch, thermal sensation, and nociception (see below) different senses? Should they be grouped as one sense modality? Or perhaps this question has no proper theoretical answer (Ratcliffe, 2012)? Besides, there are questions specific to thermal perception. For example, what do experiences of heat and cold represent, if they represent anything at all? Do they represent states or processes of things? Gray (2013) argues that experiences of heat and cold do not represent states of things; they represent processes instead. More specifically, he develops the “heat exchange model of heat perception,” according to which experiences of heat and cold represent “the opposite processes of thermal energy being transmitted to and from the body, respectively” (p. 131). Relating this back to general considerations in philosophy of mind and metaphysics should help us understand what is at stake: some have argued that the senses do not have intentional content, that is, they do not represent (Travis, 2014).
Many philosophers demur and hold the “content view” of experience instead (Siegel, 2010). Within the content view, the major variant is that sensory experiences represent objects such as tables, chairs, mountains, and rivers; they also represent states of things, such as how crowded a room is, or the temperatures of things with which people are in contact. Gray’s view is that experiences of heat and cold do represent, but what they represent are not states but a certain kind of process (for more on the ontological differences between events, processes, and states, see Steward, 1997). This view is controversial, to be sure, but it opens up a new theoretical possibility that should be considered seriously. Philosophical discussions of thermal perception or the thermal sense have been quite limited so far, and there might be more potential in this area.

Pain is often regarded as having a similar status to thermal perception, that is, subjective and (at least often) interoceptive, though pain seems to have drawn more attention, at least in philosophy (for example, the toy example of pain and C-fibre firing). In the empirical literature, “pain” tends to occur alongside another term, “nociception,” but they are strictly speaking different: “Pain is a product of higher brain center processing, whereas nociception can occur in the absence of pain” (National Research Council, 2009). This is not to deny that they have large physiological overlaps, but since we do not aim to cover physiology, readers are encouraged to look for relevant resources elsewhere. Pain sometimes appears in the context of touch, for example, under specific circumstances where touch is multisensory (Fulkerson, 2015/2020); it also occurs in the context of thermal pain. But pain also has its own distinctive philosophical issues: do pains represent at all? Are painful qualities exhausted by representational properties (for example, Lycan, 1987)? Do pains have physical locations (for example, Bain, 2007)? How should we explain clinical cases such as pain asymbolia, that is, the syndrome in which subjects can feel pain but are not motivated to remove it (Berthier, Starkstein, and Leiguarda, 1988)? Is pain a natural kind (Corns, 2020)? Amongst these significant questions, arguably the most central question concerning the nature of pain is epitomised by the so-called “paradox of pain”: according to the folk conception of pain, it is both mental and bodily (Hill, 2005, 2017; Aydede, 2013; Reuter, Phillips, and Sytsma, 2014; Reuter, 2017; Borg, Harrison, Stazicker, and Salomons, 2020). On the one hand, pains seem to allow privileged access for the subject in question and admit no appearance/reality distinction (Kripke, 1980; Searle, 1992); on the other hand, pains seem to be bodily states, processes, or activities, just as bodily damage is.
In addition to these two opposing views, there is also the “polyeidic view,” according to which our concept of pain is polyeidic or multi-dimensional, “containing a number of different strands or elements (with the bodily/mental dimension being just one strand among others)” (Borg, Harrison, Stazicker, and Salomons, 2020, pp. 30–31). Moreover, there is also the “polysemy view,” according to which pain terms are polysemous, referring to both mental and bodily states (Liu, 2021). Without going into the details, three observations are on offer. Firstly, some have argued that the above discussions tend to be conducted in English, but other languages might reflect different conceptions of pain (Liu and Klein, 2020). Secondly, it is easy to run two debates together, one about the nature or metaphysics of pain, and the other about folk notions or concepts of pain. Thirdly, it can sometimes seem that the above debate is at least partially about consciousness in general, not about pain specifically. For example, when people disagree about whether one can draw the distinction between appearance and reality for pain, the disagreement seems actually to be about consciousness, whether painful experience or otherwise.

Apart from the above controversies, there is a relatively new category that has not been widely recognised in the literature as an independent sense, though the experience itself is familiar enough: as Lin, Hung, Han, Chen, Lee, Sun, and Chen (2018) point out, “acid or soreness sensation is a characteristic sensory phenotype of various acute and chronic pain syndromes” (p. 1). The question is whether such sensations should be classified under nociception or singled out as a distinct sense, “sngception.” What is sngception exactly? In a certain variant of Chinese, acid pain is called “sng” (「痠」), “meaning a combination of soreness and pain, and is much more commonly reported than ‘pain’ among patients with chronic pain, especially for those with musculoskeletal pain” (ibid., 2018, p. 5). The authors introduced this term “specifically to describe the response of the somatosensory nervous system to sense tissue acidosis or the activation of acid-sensitive afferent neurons” (ibid., p. 6). The authors’ reason for distinguishing it from other elements of bodily awareness is primarily physiological, and as indicated above we will not go into those biological details. As far as individuating the senses is concerned, physiology is an important consideration, but it is far from decisive (Macpherson, 2011). Whether sngception should really be distinguished from pain and nociception is an open empirical question.

Interoception is to be contrasted with exteroception: to put it crudely, the contrast concerns whether the senses in question are directed toward the outside or the inside of the body. One major difficulty is how to draw the inner/outer boundary, since not every part of our body is covered by skin, but there seems to be an intuitive sense in which we want to classify specific senses as exteroceptive or interoceptive. For example, the classical five senses – vision, audition, olfaction, gustation, and touch – are exteroceptive, while proprioception, kinaesthesis, feelings of heartbeats and the gut, and so forth, are interoceptive. A more technical definition is this: “Interoception is the body-to-brain axis of signals originating from the internal body and visceral organs (such as gastrointestinal, respiratory, hormonal, and circulatory systems)” (Tsakiris and de Preester, 2019, p. v; some use “visceroception” to refer to the sensing of visceral organs). But in the very same piece, in the next two sentences, the authors say that it “refers to the sensing of the state of the inner body and its homeostatic needs, to the ever-fluctuating state of the body beneath its sensory (exteroceptive) and musculoskeletal sheath” (ibid., p. v). These two definitions or characterisations are not identical, which shows that interoception is a rich territory covering a lot of ground. More classically, and also from Sherrington (1906), interoception “is based on cardiovascular, respiratory, gastrointestinal, and urogenital systems, [which] provides information about the physiological condition of the body in order to maintain optimal homeostasis” (Vignemont, 2020, p. 83).
Defining interoception has proven to be extremely difficult: in the literature, there have been the sole-object definition (“interoception consists of information that is exclusively about one’s body”), the insider definition (“interoception consists of information about what is internal to the body and not about what is at its surface”), and the regulator definition (“interoception consists of information that plays a role in internal regulation and monitoring”). Each of these has some initial plausibility, but all face difficult challenges and potential counterexamples (Vignemont, 2018a).

Interoception provides many good examples of the philosophical relevance of bodily awareness: can interoception provide priors for Bayesian predictive models? How does interoception shape the perception of time? In what way and to what extent is the brain-gut axis mental? What is the relation between interoception and emotion? And there are many more, especially if we consider how interoception interacts with other elements of bodily awareness, and with exteroceptive senses such as vision, audition, and olfaction (Tsakiris and de Preester, 2019). In the past two decades, interoception has been thought to be connected with (bodily or otherwise) self-awareness, as in the proto-self (Damasio, 1999), the sentient self (Craig, 2003), the embodied self (Seth, 2013), and the material me (Tsakiris, 2017). However, Vignemont (2018a) argues that interoceptive feelings by themselves cannot distinguish self from non-self; rather, they provide an affective background for bodily sensations (more on “feelings” in the next subsection).

d. Bodily Feelings

It is important to note that there is a group of bodily experiences recognisably different from all of the above. According to Vignemont (2020), they differ specifically in that these feelings are relatively permanent features of bodily awareness. In the literature, the following three are the most prominent:

The feeling of bodily presence: The body in the world.

The feeling of bodily capacities: The body in action.

The feeling of bodily ownership: The body and the self.

The notion of “presence” here is derived from the sensorimotor approach, primarily in the case of vision (for example, Noë, 2004): when one sees a tree in front of her, for example, her sensorimotor skills or knowledge enable her to have a sense of the visual presence of the sides and the back of the tree. Quite independently of the plausibility of the sensorimotor approach itself, that understanding of presence can be appropriated to characterise bodily experiences. For example, when one feels a tickle in her left wrist, she feels not only that specific spot but also the nearby areas of skin, muscles, and joints. There is a sense in which the body is there (presence, rather than absence), though not all parts of it are in the foreground of our awareness. This feeling of presence can sometimes be replaced by a feeling of absence, for example, in the case of depersonalisation (more on this in 3.d), which is sometimes classified as a sensory problem.

Bodily capacities include feelings of what one is able and unable to do with one’s own body. In the literature this is sometimes called the “sense of agency,” but that normally refers to “the awareness of oneself as the cause of a particular action” (Vignemont, 2020, p. 85; emphasis added). By “bodily capacities,” we mean here something more permanent, that is, long-term capacities to do various things with one’s body. Where do these capacities come from? They might derive from monitoring our past performances, and hence involve certain metacognitive abilities, which themselves need to be studied. This sense of bodily capacities can sometimes be replaced by a feeling of bodily incapacity, for example, in the case of hysterical conversion (roughly, wrongly assuming that parts of one’s body are paralysed).

Bodily or body ownership is probably the most discussed phenomenon in this area, so it will also be covered in 2.b and 2.c below. In most cases, “one does not normally experience a body that immediately responds to one’s intentions; one normally experiences one’s own body” (Vignemont, 2020, p. 86). Bermúdez (2011/2018) argues that this kind of body ownership involves only judgments, not feelings, but this remains controversial. This sense of ownership can sometimes be replaced by a feeling of bodily disownership, for example, in the case of somatoparaphrenia (more on this in 3.b). It has been argued that bodily ownership crucially involves affective consciousness (Vignemont, 2018b).

In a slightly different context, Matthew Ratcliffe (2005, 2008, 2016) has developed a sophisticated theory of existential feelings, which are both bodily and affective. Feelings of this kind shape one’s space of possible actions. They are pre-structuring backgrounds of all human experiences, and they are themselves parts of experiences as well. Ratcliffe argues that these bodily existential feelings are different from emotions and moods; they are sui generis. How this kind of feeling relates to other mental phenomena, such as thoughts and self-consciousness, remains to be seen (Kreuch, 2019). For our purposes, the most relevant question might be: in what way and to what extent do these existential feelings overlap with the three kinds of bodily feelings Vignemont identifies?

e. Bodily Representations: Body Image, Body Schema, and Peripersonal Space

Bodily awareness is closely related to bodily representations, just as awareness or consciousness in general is closely related to representations. The three notions introduced in this subsection are often understood in terms of mental representations, though they do not have to be (for anti-representational alternatives specific to some of these notions, see for example Gallagher, 2008). However they are understood, there is a consensus that they play significant roles in understanding bodily awareness. Let’s begin with body image, which, very generally speaking, refers to the subject’s mental representation of his or her own body’s configuration. In philosophy, Brian O’Shaughnessy’s works (1980, 1995) brought it into focus. He posits a long-term body image that sustains the spatial structure of our bodily awareness. This is a rather static notion: the spatial structure can be quite relevant to possible actions, but the notion does not mention actions explicitly. Body schema, by contrast, is defined as consisting in sensory-motor capacities and actions. It is worth noting that the discussions we are familiar with today are already quite different from the original discussions in the early 20th century, notably in Head and Holmes (1911). For example, they did not mention action at all, and they distinguished two types of body schema, one that keeps track of postural changes and one that represents the surface of the body. Also note that in other disciplines a broader notion of body image is sometimes invoked to refer to one’s thoughts and feelings about the attractiveness of one’s own body, but in philosophy we tend to stick to its narrower meanings. Note also that there is a group of questions concerning whether bodily awareness requires action (Briscoe, 2014; see also 3.b) and whether action requires bodily awareness (O’Shaughnessy, 2000; Wong, 2015), which we do not review here.

This pair of notions can seem intuitively clear, but when researchers make claims about them, things get complicated and controversial. For example, O’Shaughnessy (1989) holds that our body image consists in a collection of information from our bodily senses, such as proprioception, but this seems to miss the important fact that blind subjects tend to have less accurate representations of the sizes of their own bodies (Kinsbourne and Lempert, 1980), which shows that sight also plays a crucial role in our body image. Gallagher (1986) once stated that “[t]he body image is a conscious image or representation, owned, but abstract and disintegrated, and appears to be something in-itself, differentiated from its environment” (p. 541). This obviously goes beyond what many would want to mean by “body image.” The same goes for body schema. For example, Gallagher (2008) holds that body schema is in effect a sensorimotor function, which is not itself a mental representation. Moreover, both Head and Holmes (1911) and Gallagher (1986) regard body schema as unconscious, but it can be argued that under certain circumstances it can be brought into consciousness, at least in principle. Tracing the history of how these terms have been used over the past century is itself an interesting and useful project (Ataria, Tanaka, and Gallagher, 2021), but since this article is not primarily a historical one, we will stick to the key idea that while body image is one’s mental representation of the spatial structure of one’s own body, body schema is a corresponding representation that explicitly incorporates elements related to the possibility of action. For potential double dissociations, see Paillard (1999). There is a long history of using the two terms interchangeably, but nowadays it is advisable not to do so.
Vignemont (2011/2020) offers a very useful list of potential differences that researchers have invoked to distinguish between body schema and body image, and she also points out that different taxonomies even “sometimes lead to opposite interpretations of the very same bodily disorders” (section 3.2). The situation is thorny and disappointing, and there seems to be no easy way out. To give one example, Bermúdez (1995) critically evaluates O’Shaughnessy’s views and arguments (1980, 1989, 1995) for the view that “it is a necessary condition of being capable of intentional action that one should have an immediate sensation-based awareness of one’s body” (Bermúdez, 1995, p. 382). Here he follows O’Shaughnessy’s conception of body image, but since the issue concerns intentional action, considerations about (some notions of) body schema might be relevant; exactly how the discussion would go remains unclear. A related distinction, between A-location and B-location, is proposed by Bermúdez (2011/2018): while “[t]he A-location of a bodily event is fixed relative to an abstract map of the body,” “the B-location of a bodily event does take into account how the body is disposed” (pp. 177-178). In introducing this distinction, the author does not mention body image or body schema.

What about peripersonal space (PPS)? This notion was coined only in the early 1980s (Rizzolatti, Scandolara, Matelli, and Gentilucci, 1981), and it has since gone through many conceptual refinements and empirical investigations. A recent definition goes like this: it is “the space surrounding the body where we can reach or be reached by external entities, including objects or other individuals” (Rabellino, Frewen, McKinnon, and Lanius, 2020). Note that a definition of this kind would not be accepted by those who clearly differentiate peripersonal space from reaching space; for example, “Human-environment interactions normally occur in the physical milieu and thus by medium of the body and within the space immediately adjacent to and surrounding the body, the peripersonal space (PPS)” (Serino, et al., 2018). Peripersonal space has been regarded as an index of multisensory body-environment interactions in real, virtual, and mixed realities. Some recent studies support the idea that PPS should be understood as a set of different graded fields that are affected by many factors other than stimulus proximity (Bufacchi and Iannetti, 2018). A basic distinction between appetitive and defensive PPS has been drawn (Vignemont and Iannetti, 2015), but further experimental and conceptual work is called for to substantiate this and other potential distinctions. A further question is in what ways body image, body schema, and peripersonal space relate to one another (for example, Merleau-Ponty on projection and the intentional arc, 1945/2013).

It has been argued that awareness of peripersonal space facilitates a sense of being here (Vignemont, 2021). This is different from the bodily presence discussed in 1.d, as the presence in question now is hereness, that is, self-location, which is different from bodily location (1.b). One implication of this view is that “depersonalized patients fail to process their environment as being peripersonal” (ibid., p. 192). Peripersonal awareness gives a specific sense of presence, which is not given by other forms of awareness such as interoception and proprioception. This also relates bodily awareness to traditional philosophical discussions of indexicals (Perry, 1990, 2001).

Similar complications concerning representation can be found in this area too. For example, Vignemont, Serino, Wong, and Farnè give their introductory piece the subtitle “a special way of representing space,” but whether, and in what sense, PPS is indeed representational is debatable. Another critical point concerns in what way and to what extent issues surrounding PPS are philosophically significant, given that so much work in this area is empirical or experimental. This is indeed a difficult question, and similar worries can be raised for other interdisciplinary discussions in philosophy and the cognitive sciences. Without going into the theoretical disagreements concerning the a priori/armchair/a posteriori, here is a selective list of the relevant issues: What are the relations between egocentric space, allocentric space, and peripersonal space? How does peripersonal space help us understand self-location, body ownership, and bodily self-awareness? How does attention affect our experiences of peripersonal space? Is peripersonal space a set of contact-related action fields? How does peripersonal space contribute to our sense of bodily presence (see various chapters in Vignemont, Serino, Wong, and Farnè, 2021)? No matter what the verdict is, it is hard to deny the relevance of peripersonal space to philosophical issues concerning bodily awareness in general, which will hopefully become clearer in the following sections.

Now, is bodily awareness a unified sense modality? Given how diverse its elements are, the answer is probably negative; though as Vignemont (2018) points out, what these diverse elements “have in common is that they seem to all guarantee that bodily judgments are immune to error through misidentification relative to the first-person” (see 2.b). She then goes on to elaborate several puzzles about body ownership, exploring varieties of bodily experiences and body representations and proposing a positive solution to those puzzles: the “bodyguard hypothesis,” which has it that “only the protective body map can ground the sense of bodily ownership” (Vignemont, 2018b, p. 167). However, “bodily awareness” can also be construed narrowly: for example, when Martin (1992) and others argue that spatial touch depends on bodily awareness (see 2.a below), they intend a narrower meaning of the term, including proprioception and kinaesthesis only. So, one can still sensibly ask: is proprioception a sense modality in its own right? Is the vestibular sense? These are open questions for future research. The next section concerns some contemporary issues about aspects of bodily awareness. Some might hold that before asking whether bodily awareness is a unified sense modality, we should first decide whether the various experiences described above are perceptual. Others might deny this, on the grounds that the senses do not have to be exteroceptive, and therefore perceptual. More positively, one can ask whether proprioception itself is a natural kind, without committing to its being perceptual.

2. Contemporary Issues

Section 1 showed that bodily awareness has many varieties. In considering them, we have also seen many questions arise along the way, for example, how to individuate the senses within bodily awareness, how to draw the distinction between interoception and exteroception, and so forth. Many of these philosophical questions deserve further consideration given the complexities involved; this section discusses some of them.

a. Is There a Tactile Field?

This question would not make sense unless situated within a wider context. Consider the visual field: in daily life, we know that when we close one eye, our visual field is roughly cut in half. In clinical contexts, we are sometimes told that due to strokes or other conditions, our visual fields shrink, and as a result we bump into things more often and need to readjust. When we say blindsight patients have a “blind field,” we are already presupposing the existence of visual fields. In psychology, we can measure the boundaries of our visual fields, and there are of course individual differences. Different philosophical theories of perception might attribute different metaphysical natures to visual fields. For example, an often-quoted passage states that a visual field is the “spatial array of visual sensations available to observation in introspectionist psychological experiments” (Smythies, 1996, p. 369). This obviously commits to something, namely visual sensations, that is not acknowledged by many researchers in this area, though it can be regarded as the standard understanding of the sensationalist tradition (for example, Peacocke, 1983). A naïve realist might prefer to characterise visual fields as constituted by external objects and phenomena themselves (Martin, 1992). A representationalist would presumably invoke mental representations to characterise visual fields. On this occasion we do not need to settle the metaphysics of visual fields; suffice it to say that visual fields seem to be indispensable, or at least quite important, for spatial vision. So, our question is: what about other spatial senses? Do they also rely on the relevant sensory fields? Specifically, does spatial touch rely on a tactile field, or tactile fields?

Questions concerning tactile fields arise explicitly in the context of P. F. Strawson’s essay on descriptive metaphysics:

Evidently the visual field is necessarily extended at any moment… The case of touch is less obvious: it is not, for example, clear what one would mean by a “tactual field” (Strawson, 1959, p. 65, emphasis added)

Strawson’s challenge here is moderate in the sense that he only invites those who believe in tactile fields to say more about what they mean by the term. More challenging moves can be found in Brian O’Shaughnessy, M. G. F. Martin, and Matthew Soteriou. O’Shaughnessy writes, “There is in touch no analogue of the visual field of visual sensations” (1989, p. 38, emphasis added). This is more challenging because, unlike Strawson, O’Shaughnessy asserts that there is no analogue. Notice that when he writes this, he holds a rather specific view of vision, one that involves visual sensations or visual sense-data. Soteriou makes a more specific claim: “The structural feature of normal visual experience that accounts for the existence of its spatial sensory field is lacking in the form of bodily awareness involved when one feels a located bodily sensation” (2013, p. 120, emphasis added).

Countering the above line of thought, Patrick Haggard and colleagues (2008, 2011) have attempted to test empirically the hypothesis that tactile fields exist and that they sustain tactile pattern perception. Following earlier works, Fardo, Beck, Cheng, and Haggard (2018) argue that “integration of continuous sensory inputs across several tactile RFs [receptive fields] provides an intrinsic mechanism for spatial perception” (p. 236). For a more detailed summary of this series of works, see Cheng (2019), where it is also noted that in the cases of thermal perception and nociception there seems to be no such field (Mancini, Stainitz, Steckelmacher, Iannetti, and Haggard, 2015). Further characteristics of tactile fields include, for example, that we can perceive the space between multiple stimuli (Mac Cumhaill, 2017; also compare Evans, 1980, on the simultaneous spatial concept). For touch, the sensory array has a distinctive spatial organisation due to the arrangement of receptive fields on the receptor surface of the skin.

Recently, discussions of tactile fields have gone beyond the above contexts. For example, in comparing shape representations in sight and touch, E. J. Green (2020) discusses various responses to Molyneux’s question and classifies the tactile field proposal under what he calls the “structural correspondence view,” which he argues against (also see Cheng, 2020). In investigating the spatial content of painful sensations, Błażej Skrzypulec (2021) argues that cutaneous pains “do not have field-like content, as they do not present distance relations between painful sensations” (p. 1). In what sense there are tactile fields seems to be a theoretically fruitful question, and further studies are needed to explore the ramifications in this area. Similar work has been done on other sensory modalities, such as olfaction (Aasen, 2019) and audition.

b. Does Bodily Immunity to Error Through Misidentification Hold?

“Immunity to error through misidentification relative to the first-person” (IEM) is a putative phenomenon identified by Sydney Shoemaker (1968), who also attributes it to Wittgenstein (1958; also see Salje, 2017). “Error through misidentification” is a specific kind of error; let’s illustrate IEM with an example. When I say “I see a canary,” if I am sincere, I can still be wrong about what I see, or even about whether I really have visual experiences at that time. But it seems that I cannot be wrong about the first person: it is me who thinks and judges that I see a canary, and there is no reasonable doubt about it. Shoemaker regards this as a logical truth, though a further complication is that Shoemaker himself draws a distinction between de facto IEM and logical IEM, which concerns the scope of the IEM claim. If we regard IEM as a logical thesis, then we are after the broader, logical claim.

Now, setting aside whether IEM in general is true, note that it concerns the self-ascription of mental states. Independently, one might reasonably wonder whether IEM applies to the self-ascription of bodily states (that is, physical bodily properties, including body size, weight, posture, and so forth). Again, let’s illustrate this with an example. Suppose I come to judge that I am doing a power pose. If I formed this judgement via vision, it is possible (though unlikely) for me to be wrong about who is doing the pose, as I might confuse someone else’s arms with my own. By contrast, if I form this judgement by proprioception, I might be wrong about the pose itself, but I cannot be wrong about who is doing it, or so it is sometimes argued (Evans, 1982). But things are not so simple; Vignemont (2018) usefully distinguishes between bodily immunity from the inside and from the outside. In what follows we briefly discuss their contents and potential problems respectively.

Bodily immunity from the inside is in a way more standard: bodily senses seem to guarantee IEM in this sense because they provide privileged informational access to one’s own body. This is not to say that bodily senses do not provide information about other things – touch of course brings in information about other objects – but in retrieving information about those other things, the privileged bodily information is always implicated. As Vignemont states, “Proprioceptive experiences suffice to justify bodily self-ascriptions such that no intermediary process of self-identification is required” (2018, p. 51). Details aside, this thesis faces at least two kinds of putative counterexamples: in false negative errors, “one does not self-ascribe properties that are instantiated by one’s own body” (ibid., p. 51, emphasis added), while in false positive errors, “one self-ascribes properties that are instantiated by another’s body” (ibid., p. 51, emphasis added).

For false negative errors, the clinical condition somatoparaphrenia provides a salient example (Bottini, Bisiach, Sterzi, and Vallar, 2002). In a famous case, patient FB can feel tactile experiences when her hand is touched, but she judges that it is her niece’s hand being touched. That is to say, she has trouble with body ownership with respect to her left hand: she does not self-ascribe properties, in this case being touched, that are instantiated by her own left hand. Whether this kind of case really constitutes a counterexample to bodily IEM is a matter of dispute. For example, Vignemont (2018) has argued that somatoparaphrenia is actually irrelevant to bodily IEM, because “The bodily IEM thesis claims that if the judgement derives from the right ground, then it is immune to error” (pp. 52-53, emphasis added). However, one might worry that this move makes bodily IEM too weak. After all, philosophical theses like this tend to lose their significance when they are not universal claims. To this Vignemont might reply that “immunity to error” is a quite strong claim, so even if it needs to be qualified as above, it remains a significant thesis. For comparison, consider the parallel claim that if a perception derives from the right ground, then it is immune to error. This seems to be false, because even when perceptions have right or good grounds, they can still be subject to error.

For false positive errors, an obvious candidate is the rubber hand illusion (RHI). In such a case, participants (to some extent) identify rubber hands as their own hands. That is to say, they self-ascribe properties, in this case being their hands, that are not instantiated by their bodies. There are, to be sure, many controversies concerning the interpretation of this illusion and whether it really constitutes a counterexample here. As Vignemont points out, “it is questionable whether participants actually self-attribute the rubber hand… they feel as if the rubber hand were part of their own body, but they do not believe it” (2018, p. 53, emphasis added). Arguably, those subjects do not make mistakes here; they rightly believe that those rubber hands are not their own hands. Another potential false positive case is the embodied hand illusion, which is sometimes also found in somatoparaphrenia. Some, though not all, somatoparaphrenia patients will also self-attribute another person’s hand, either spontaneously or when induced via the RHI paradigm (Bolognini, Ronchi, Casati, Fortis, and Vallar, 2014; note that a similar condition can be found in those who do not have somatoparaphrenia). Basically, when the embodied hand moves and the subjects see it, they might report feeling their own hands moving. These are all tricky examples, and many individual differences are involved. What is crucial, methodologically, is to recognise that these are actual-world clinical or experimental examples, rather than thought experiments. With such actual examples, we need to look into the details of the different cases and be careful about making sweeping claims.

What about bodily immunity from the outside? This putative phenomenon is less well known, but the cases for it might be familiar. It is less well known because we tend to think that information from outside the body is fallible and so cannot be immune to error. But consider this passage from J. J. Gibson:

[A]ll the perceptual systems are propriosensitive as well as exterosensitive, for they all provide information in their various ways about the observer’s activities… Information that is specific to the self is picked up as such, no matter what sensory nerve is delivering impulses to the brain. (1979, p. 115, emphasis added)

By “propriosensitive” Gibson means “information about one’s body.” This part of Gibson’s idea, self-specifying information, is less well known than affordance, but it is actually integral to his view, under the label of “visual kinesthesis”: due “to self-specific invariants in the optic flow of visual information (for example, rapid expansion of the entire optic array), we can see whether we are moving, even though we do not directly see our body moving” (Vignemont, 2018, p. 58). Relatedly, Evans (1982) and Cassam (1995) have argued that self-locating judgements enjoy the same status if they represent the immediate environment within an egocentric frame of reference, because this frame carries self-specifying information concerning the location of the perceiver. As Evans puts it, when I am standing in front of a tree, I cannot sincerely entertain the doubt: “someone is standing in front of a tree, but is it I?” (1982, p. 222). Again, these ideas trace back clearly to Wittgenstein. What has been introduced so far concerns visual experiences of the environment grounding bodily IEM; Vignemont (2018) also discusses the possibility of visual experiences of the body grounding bodily IEM (pp. 58-61). Note that self-specificity is weaker than self-reference, as the former does not imply awareness of one’s body as one’s own (Vignemont, 2018a).

Relatively independently of IEM, philosophers also disagree about how to model body ownership. The questions include: it seems to make an experiential difference whether one is aware of one’s body as one’s own or not, but how do we account for this difference in consciousness or phenomenology? What are the grounds of the sense of body ownership? Is there a distinct feeling of myness (Bermúdez, 2011/2018; Alsmith, 2015; Guillot, 2017; Chadha, 2018)? Different answers have been proposed, including the deflationary account, the cognitive account, the agentive account, and the affective account. The deflationary account has it that the sense of body ownership can be reduced to the spatiality of bodily sensations and judgements of ownership about one’s own body (Martin, 1992, 1995; Bermúdez, 2011). One potential problem is that one seems to be able to become aware of the boundaries of one’s own body without being aware of those boundaries qua one’s own (Dokic, 2003; Serrahima, forthcoming). Another potential problem is that bodily sensations might be dissociable from the sense of body ownership: patients with disownership syndromes remain able to have at least some bodily experiences; whether this decisively refutes the deflationary account remains to be determined (Moro, Massimiliano, and Salvatore, 2004; Bradley, 2021). The cognitive account has it that “one experiences something as one’s own only if one thinks of something as one’s own” (Alsmith, 2015, p. 881). Whether this account is successful depends on how we account for the apparent cognitive impenetrability of the sense of body ownership: if there are cases of body ownership or disownership that cannot be altered by thinking or other propositional attitudes, it will be difficult for this account to explain what is really going on.
The agentive account has it that body ownership bears a constitutive connection to body schema (Vignemont, 2007), agentive feelings (Baier and Karnath, 2008), or agentive abilities (Peacocke, 2017). One major potential problem is that, for example, participants in the rubber hand illusion might feel that the rubber hand is their own without feeling that they can act with that very rubber hand. Finally, the affective account has it that there is a specific affective phenomenological quality over and above the sensory phenomenological qualities of bodily awareness (Vignemont, 2018b). As we mentioned in discussing affective touch, this kind of quality is valenced or valued, and in this specific case the quality signifies the unique value of the body for the self. This kind of affective quality is key to survival. One concern is that it might be unclear whether it is affective phenomenology that explains body ownership, or the other way around. Another concern is that evolutionary explanations always risk being just-so stories. These are all very substantive issues that we will not go into here, but the general shape of this rich terrain should be clear enough.

c. How Do Body Ownership and Mental Ownership Relate?

Above we have seen that somatoparaphrenia and other conditions have been regarded as test cases for body ownership. Relatedly, they might pose a parallel problem for mental or experiential ownership (Lane, 2012). Recall the case of patient FB: when she judges that her left hand belongs to her niece, she is confused about body ownership, as a left hand is a body part. By contrast, when she judges that the relevant tactile sensations belong to her niece as well, she is confused about mental ownership, as a tactile sensation is a mental state or episode. This corresponds to Evans’ distinction between mental self-ascription and bodily self-ascription (1982, pp. 220-235), which also brings us back to the original formulation of IEM in Shoemaker.

How do somatoparaphrenia cases like patient FB’s threaten IEM with regard to mental ownership? Since FB gets the who wrong in mental self-ascription, her case does look like a counterexample to IEM. Consider some original formulations:

(1) To ask “are you sure it’s you who have pains?” would be nonsensical. (Wittgenstein, 1958, p. 67)

(2) [T]here is no room for the thought “Someone is hungry all right, but is it me?” (Shoemaker, 1996, p. 211)

“Nonsensical” in (1) and “no room” in (2) both refer to the “immunity” part of IEM. “Are you sure it’s you who have pains?” in (1) and “Someone is hungry all right, but is it me?” in (2) refer to the “error through misidentification” part. For Wittgenstein, the question in (1) looks like a query in response to the subject’s spontaneous report of his sensational state, “it is me who is in pain.” Here Wittgenstein argues that when a subject sincerely reports that she is in pain, it is nonsensical to question whether the subject is wrong about who the subject is. In the case of FB, she did not spontaneously report that she was experiencing a certain sensation; moreover, she reported that the sensation belongs to someone else. This makes no contact with what Wittgenstein has in mind. However, this is not true of Shoemaker’s case. The question “Someone is hungry all right, but is it me?” admits of two kinds of cases. First, the subject is not hungry, but she suspects she is the subject of that experience. Second, the subject is truly hungry, but she suspects she is not the subject of that experience. FB fits the second case, so proponents of IEM will have a hard time reconciling this second case with the case of FB. What about the first case? Since by hypothesis the subject is not hungry in the first place, FB’s case would be irrelevant. So, if we read Shoemaker’s question in the first sense, it would be easier for proponents of IEM to accommodate empirical cases like FB’s.

How, then, do body ownership and mental ownership relate? There seems to be no straightforward answer. Consider bodily IEM and the original IEM: as discussed above, Vignemont argues that the bodily IEM thesis claims that if the judgement derives from the right ground, then it is immune to error; presumably this strategy, if acceptable, can be applied to the original IEM as well. Indeed, Shoemaker once states that “if I have my usual access to my hunger, there is no room for the thought ‘Someone is hungry alright, but is it me?’” (Shoemaker, 1996, p. 210, emphasis added). So, in this sense, bodily IEM and the original IEM can be coped with in the same way. This does not show, to be sure, that there are no crucial differences between them. What about mental ownership and body ownership in general, independently of IEM, bodily or not? They seem to go together very often: on the one hand, in normal cases one correctly recognises one’s own limbs as one’s own, and correctly recognises one’s own sensations as one’s own; on the other hand, FB and some other somatoparaphrenia patients wrongly recognise their own limbs as others’, and arguably also wrongly recognise their own sensations as others’ (this is debatable, to be sure). Is it possible to double dissociate them? For correct body ownership and wrong mental ownership, the claim “I feel your pain” might be a possible case, since here one gets the body right but the sensation wrong when it comes to the who question: when you sympathise with someone else’s pain so that you feel pain too, it is your pain then. What about wrong body ownership and correct mental ownership? These are all open empirical questions that need to be further explored.

d. Must Bodily Awareness Be Bodily Self-Awareness?

On the face of it, this question might seem to make no good sense: “Of course it must; bodily awareness is a matter of becoming aware of one’s own body through proprioception, kinaesthesis, pain, and so forth; one is aware of one’s own body from the inside, as it were (see O’Shaughnessy, 1980, on this ‘highly unusual relation’). How can it fail to be bodily self-awareness?” Indeed, in the empirical literature, researchers do not normally distinguish between them (for example, Blanke, 2012). In philosophy, however, “bodily self-awareness” sometimes refers to something more specific, for example, awareness of this body as mine, or of this bodily self qua subject (Cassam, 1997; also see Salje, 2019; Longuenesse, 2006, 2021), or perhaps awareness of oneself as a bodily presence in the world (McDowell, 1996; or “existential feelings,” various writings by Ratcliffe, and Vignemont, 2020, p. 83, as discussed in 1.d). This is not to accuse scientists of committing a conceptual confusion; it is just that philosophers are sometimes concerned with questions that have no clear empirical bearing, at least for the time being. Below we briefly review this stricter usage of “bodily self-awareness” and the philosophical implications in its vicinity.

In Self and World (1997), Cassam seeks to identify necessary conditions for self-consciousness. One line he takes is called the “objectivity argument,” which has it that objective experience requires “awareness of oneself, qua subject of experience, as a physical object” (p. 28; “the materialist conception”; also see his 2019). For our current purpose, that is, distinguishing bodily awareness and bodily self-awareness, we only need to get clear about what “qua subject” means. One can be aware of oneself, or one’s own body, not qua subject, but just qua (say) an animal, or even a thing. To illustrate this, consider the case in which you see yourself in a mirror from a rather strange angle (“from the outside”), and without realising that it is you. In that case, there is no self-awareness under this stricter meaning. To apply this to the body case, proprioception might automatically tell the subject the locations of the limbs, but without a proper sense of mineness, to put it vaguely, it does not automatically follow that those bodily awarenesses are also cases of bodily self-awareness. Note that Cassam sometimes defends different though related theses on different occasions; for example, he also defends the “bodily awareness thesis” that “awareness of one’s own body is a necessary condition for the acquisition and possession of concepts of primary qualities such as force and shape” (2002, p. 315).

This point can be further illustrated by the “Missing Self” problem, explained below. Here is how Joel Smith formulates the target:

[I]n bodily awareness, one is not simply aware of one’s body as one’s body, but one is aware of one’s body as oneself. That is, when I attend to the object of bodily awareness I am presented not just with my body, but with my “bodily self.” [the bodily-self thesis] (2006, p. 49)

Smith’s argument against this view is based on two claims about imagination, which he defends in turn. To retain our focus, here we will assume that those two claims are cogent. They are as follows:

(1) “[T]o imagine sensorily a ψ is to imagine experiencing a ψ” (Martin, 2002, p. 404; the “dependency thesis”).

(2) “When we engage in imagining being someone else, we do not imagine anything about ourselves” (Smith, 2006, p. 56).

With these two claims about imagination, Smith launches his argument as follows:

The argument…begins with the observation that I can imagine being Napoleon feeling a bodily sensation such as a pain in the left foot. According to [1], when I imagine a pain I imagine experiencing a pain. It follows from this that the content of perceptual awareness will be “mirrored” by the content of sensory imagination…Now, [given 2], then imagining being Napoleon having a pain in the left foot will not contain me as an object. The only person in the content of this imagining is Napoleon…Thus, when I simply imagine a pain, but without specifying whose pain, the imagined experience is not first personal. (2006, p. 57, emphasis added)

What should we say about this argument? For Smith, the bodily-self thesis requires getting the who right. Therefore, imagining being other people is relevant. But it is unclear whether getting the who right is crucial for Cassam (1997), for example. Suppose that I am engaging in the kind of imagination Smith has in mind. In that scenario, according to his view, I am not part of the imagination; Napoleon is. Smith believes that this is sufficient for rejecting the bodily-self thesis, but it hardly places any pressure on Cassam’s view. All we need here is that in having a certain kind of bodily awareness, this awareness is not only about the body but also about the mind that is associated with that body. Whether that mind is Tony’s or Napoleon’s is not at issue here. Perhaps I get the subject wrong. Perhaps, as Smith has it, in the imagination the subject is Napoleon, not Tony. Still, all we need is that bodily awareness is about not only the body but the minded body. If so, even if Smith’s argument is sound, the Cassam picture is not one of his targets, since on that picture one does not need to get right who the subject is.

Another way to see the current point is to consider an analogous point concerning the first-person pronoun: the reference of “I” is token reflexive in Reichenbach’s sense (1947): any token of “I” refers to whoever produces that expression. When I produce a sentence containing “I”, it refers to me. Whether I correctly identify myself as Tony Cheng or misidentify myself as Napoleon is irrelevant. Likewise, in the case of bodily awareness, the subject is aware of him- or herself as the person who is experiencing the bodily experience in question. Whether the subject can correctly identify who he or she is – Napoleon or not – is irrelevant. The reason might be that what is unique about the first person is its token reflexivity. The identity of the subject, though important, is always an additional question. It is interesting to compare Bernard Williams’ thought experiments concerning torture and personal identity (1970; see also Williams 2006): when I am tortured and want to escape from the situation, what is crucial is that I am being tortured and I want to escape. Whether I am Tony Cheng or Napoleon is a further, and less important, question. One outcome of this view is that one then has no absolute authority about what one is imagining. This might not be a theoretical cost, as the general trend in contemporary epistemology has it that all sorts of first-person authority are less robust than philosophers once thought.

The moral is that no matter who I am, who I will be, what I will remember, or what I can imagine, as long as what is going to be tortured is me, then I have every reason to fear. In his later writings, Smith is more sympathetic to the bodily-self thesis. For example, he writes: “if bodily sensations are given as located properties of one’s lived body and, further, bodily sensations are presented as properties of oneself, then bodily awareness is an awareness of one’s body as oneself” (Smith 2016, p. 157). So, must bodily awareness be bodily self-awareness? Philosophers still seem to disagree, and it is to date unclear how the dispute can be resolved directly with the help of empirical work.

e. What Does Body Blindness, Actual or Imagined, Show?

Partial body blindness, or proprioceptive blindness, has been found in the actual world: the subject has no touch or proprioception below the neck but is still able to see the world roughly in the way we do, and can experience temperatures, pain, and muscle fatigue (Cole, 1991). Such rare subjects need to make use of information from other modalities, mostly vision, to coordinate their actions. What this shows is that proprioception and touch play extremely important roles in our daily lives. Although it is still possible to maintain minimal functioning, it is extremely laborious to perform bodily actions without appropriate bodily awareness.

What about the imagined case? Consider this thought experiment: “[E]ven someone who lacks the form of bodily awareness required for tactile perception can still see the surrounding world as a world of physical objects” (Aquila, 1979, p. 277). This is a suggestion of disembodiment: most people agree that having bodily awareness is very important for navigating the world, but in this imagined case, call it “total body blindness,” the subject seems to be able to have basic cognition without any bodily awareness. This seems to contradict Cassam’s claim that objective experience requires bodily self-awareness (1997). More specifically, Aquila argues that if I am body blind, “I experience no bodily sensations, or at least none which I am able to identify in connection with some particular body I perceive, and I perceive no body at all which I would identify as my own” (Aquila, 1979, p. 277). If this thought experiment is coherent, it suggests a limit to the importance of bodily awareness: true, bodily awareness is crucial for cognition and action, but it is not a necessary condition.

It is worth considering a real-life example that might pose a similar threat. Depersonalisation Disorder, or DPD, denotes a specific feeling of being unreal. In the newest edition of the DSM (Diagnostic and Statistical Manual of Mental Disorders, fifth edition), it has been renamed as Depersonalisation/Derealisation Disorder, or DDD. In what follows we use “depersonalisation” for this syndrome not only because it is handier, but also because depersonalisation and derealisation are closely connected but different phenomena: while depersonalisation is directed toward oneself or at least one’s own body, derealisation is directed toward the outside world. Here we will only discuss depersonalisation.

The first reported case of depersonalisation was presented by the otolaryngologist M. Krishaber under the label “cerebro-cardial neuropathy” in 1873. The term “depersonalisation” itself was coined later, in 1880, by the Swiss researcher Henri Amiel in response to his own experiences. Like other mental disorders and illnesses, depersonalisation has a long and convoluted history of evaluation and diagnosis, and its exact causes have not been entirely disentangled. What is crucial for our current purposes is the feeling involved in the phenomenon: patients with this disorder might feel themselves to be “a floating mind in space with blunted, blurry thoughts” (Bradshaw, 2016). To be sure, there are individual differences amongst patients, just as there are amongst so-called “healthy subjects.” Still, this description seems to fit many such patients, and more importantly, even if it does not fit many, the very fact that some patients feel like this is sufficient to generate worries here. Here is why: presumably most, if not all, of these patients retain the capacity for object cognition and perception. They still (perceptually) understand that objects are solid, shaped, and sized, and that things can persist even when occluded, for example. But they seem to lack the kind of bodily awareness in question: the very description of a “floating mind in space” signifies this feeling of disembodiment. If this is right, these patients are real-life counterexamples to the Cassamian thesis: they have the capacity for basic perception/cognition while lacking awareness of themselves as physical objects.

So, what does body blindness, actual or imagined, show? In the actual case, one sees in detail how a lack of robust bodily awareness can put daily life into trouble; in the imagined case, where the subject does not even have bodily awareness above the neck, the subject still seems able to have basic awareness of the world. There can be worries here, to be sure. For example, someone with zero bodily awareness would have no muscle feedback around the eyes, which would impair visual capacities to a substantial extent. Still, it would presumably not render the subject blind. So, again, bodily awareness is very important, but perhaps not, strictly speaking, necessary for basic cognition or objective experience. For more on embodiment, the self, and self-awareness, see Cassam (2011).

3. Phenomenological Insights: The Body as a Subjective Object and an Objective Subject

From the above discussions, we have seen that the body seems to be both subjective and objective, in some sense. What should we make of this? Or, more fundamentally, how is that even possible? Consider the possibility that in bodily awareness one is aware of one’s body as a subjective object and an objective subject: the bodily self can be aware of itself as an object, but it is not just another object in the world. It is a subjective object (Husserl, 1989), that is, the object that sustains one’s subjective states and episodes. It is also an objective subject, that is, the subject that is situated in an objective world. There seems to be no inherent incompatibility within this distinction between object and subject; they are not mutually exclusive. To channel Joel Smith: “in bodily awareness my body is given as lived – as embodied subjectivity – but it is also co-presented as a thing – as the one thing I constantly see” (2016, p. 159). By contrast, this line is at odds with Sartre’s idea that one’s body is either “a thing among other things, or it is that by which things are revealed to me. But it cannot be both at the same time” (1943/2003, p. 304).

a. Two Notions of the Body

If bodily self-awareness can be about a subjective object and an objective subject, this comes close to Merleau-Ponty’s notion of the subject-object (Merleau-Ponty, 1945/2013). However, in both his and Husserl’s works, and in the phenomenological tradition more broadly, the general consensus is that we are never aware of ourselves as physical objects. In order to incorporate their insights without committing to this latter point, we need to look into some details of their views. For Husserl, the Body (Leib) is the “animated flesh of an animal or human being,” that is, a bodily self, while a mere body (Körper) is simply “inanimate physical matter” (1913/1998, p. xiv). The Body presents itself as “a bearer of sensations” (ibid., p. 168). A similar distinction emerges in Merleau-Ponty’s work between the phenomenal/lived body and the objective body made of muscles, bones, and nerves (1945/2013). There is a debate over whether this should be interpreted as a distinction between different entities or between different perspectives on the same entity (Baldwin, 1988). As with Kant’s transcendental idealism, the two-world/entity view is in general more difficult to defend, so for our purposes we will assume the less contentious two-perspective view. The idea, then, is that the human body can be viewed in at least two ways: as phenomenal, and as objective. From the first-person point of view, the body presents itself to us only as phenomenal, not objective. For a detailed comparison of Husserl and Merleau-Ponty in this regard, see Carman (1999).

For Merleau-Ponty, “[t]he body is not one more among external objects” (1945/2013, p. 92). One can only be aware of oneself as the phenomenal self in one’s pre-reflective awareness. As Vignemont explains,

[T]he lived body is not an object that can be perceived from various perspectives, left aside or localized in objective space. More fundamentally, the lived body cannot be an object at all because it is what makes our awareness of objects possible…The objectified body could then no longer anchor the way we perceive the world…The lived body is understood in terms of its practical engagement with the world…[Merleau-Ponty] illustrates his view with a series of dissociations between the lived body and the objective body. For instance, the patient Schneider was unable to scratch his leg where he was stung. (2011/2020, pp. 17-18, emphasis added)

Another gloss is that the lived body is “the location of bodily sensation” (Smith 2016, p. 148, original emphasis; Merleau-Ponty, “sensible sentient,” 1968, p. 137). Compare Cassam’s characterisation of the physical or material body as the “bearer of primary qualities” (2002, p. 331). Now, pathological cases like that of Schneider, who exemplified a “dissociation of the act of pointing from reactions of taking or grasping” (Merleau-Ponty, 1945/2013, pp. 103-4), show only that the phenomenal/lived body is not the same as the objective body. They do not show that one cannot be aware of oneself as an objective body. For Merleau-Ponty, one’s body is “a being of two leaves, from one side a thing among things and otherwise what sees and touches them” (Merleau-Ponty, 1968, p. 137). The human body has a “double belongingness to the order of the ‘object’ and to the order of the ‘subject’” (ibid., p. 137). Our notion of the subjective object and objective subject, then, is intended to capture, or at least echo, Merleau-Ponty’s “Subject-Object” and Husserl’s intriguing idea that the human body is “simultaneously a spatial externality and a subjective internality” (1913/1998).

This phenomenological approach has an analytic ally in the “sensorimotor approach” (for example, see Noë, 2004; for details, see Vignemont, 2011/2020). Its major rival is the “representational approach,” which has it that “in order to account for bodily awareness one needs to appeal to mental representations of the body” (ibid., p. 12). To rehearse, reasons for postulating these representations include: 1) explaining disturbances of bodily awareness such as phantom limbs; 2) accounting for the spatial organisation of bodily awareness (O’Shaughnessy 1980, 1995); and 3) understanding the ability to move one’s own body. Even if one accepts the representational approach, it has proven extremely difficult, ever since the classic work by Head and Holmes (1911), to decide how many and what kinds of body representations we should postulate (1.e). The reason for discussing the representational approach here is that, while crucially different from the sensorimotor approach, it can raise a similar objection in its own terms: what one is aware of is one’s body schema or body image, but not one’s objective body. Within the phenomenological tradition, there is a branch called “neurophenomenology,” which is “aimed at bridging the explanatory gap between first-person subjective experience and neurophysiological third-person data, through an embodied and enactive approach to the biology of consciousness” (Khachouf, Poletti, and Pagnoni, 2013). What these neurophenomenologists would say about the current case is not immediately clear.

This formulation of the problem might have some initial plausibility. Consider the case of phantom limb, in which the patient feels pain in a limb that has been amputated. The representational explanation says that the patient represents the pain in his body schema/image, which still retains the amputated limb. This shows, so the thought goes, that one is aware of only one’s own body schema/image. A similar line of thought can be found in Thomas Reid’s work, for instance when he argues that bodily awareness, such as sensations, is the result of purely subjective states or episodes (1863/1983, Ch. 5; see Martin, 1993 for discussion). If this is correct, then that kind of awareness can only be about something subjective, for example, the represented body, as opposed to the objective or physical body.

This inference might be too hasty. Assuming representationalism in this domain, it is sensible to hold the kind of explanation of phantom limbs described above. However, the right thing to say might be that one is aware of one’s objective body through one’s body schema/image: they function as modes of presentation of the body. Why is this the right thing to say? One reason is that one can be aware of one’s own body objectively; if the representational approach needs to fit in somewhere, the only sensible place is in the modes of presentation.

In sum, the force behind the phenomenological and representational considerations should be fully acknowledged, but the right thing to say seems to be this: what one is aware of is the physical body, but one is not aware of it simply as a mere body or just yet another physical object. Rather, as explained above, one is aware of one’s own body as a subjective object, that is, the object that sustains one’s subjective states and episodes. The bodily self is aware of itself as a subjective object, and as an object in the weighty sense, that is, something that can persist without being perceived (Strawson, 1959). Here we can echo Martin’s view of sensation: “Sensations are not… purely subjective events internal to the mind; they are experiences of one’s body, itself a part of the objective world” (1993, p. 209).

b. Non-Perceptual Bodily Awareness

These discussions from the phenomenological perspective also interact with the analytic tradition on the topic of whether awareness of one’s own body is perceptual (Mandrigin and Thompson, 2015). According to Vignemont (2018b), “bodily presence” refers to the idea that “one’s body is perceived as being one object among others” (Vignemont, 2018b, p. 44). This is, to be sure, a matter of controversy, given different criteria or conceptions of perception. McGinn (1996) holds that “bodily sensations do not have an intentional object in the way perceptual experiences do” (p. 8); one potential reason can be found in the Merleau-Pontian view above, namely that “the body is not one more among external objects” (1945/2013, p. 92), and being an external object seems to be a necessary condition for being an object of perception. This also echoes Wittgenstein’s distinction between the self as subject and the self as object, where the former cannot be an object of perception by oneself. However, this might not preclude the latter use of the self, as an object, from being an object of perception by oneself (Dokic, 2003). Another potential reason comes from the analytic tradition: Shoemaker (1996) has argued that one necessary condition of perception is that its way of gaining information must make room for identification and reidentification of the perceived objects, but bodily awareness seems to gain information from only one object, that is, one’s own body. Martin (1995) has argued that this sole-object point does not preclude bodily awareness from being perceptual; Schwenkler (2013) instead argues that bodily awareness conveys information about multiple objects, since those pieces of information are about different parts of one’s own body.

This is a huge topic that deserves further investigation; as Vignemont (2011/2020) points out, the perceptual model of bodily awareness has faced challenges from many directions. In addition to the above considerations, some have argued that the distinctive spatiality of bodily awareness precludes it from being perceptual: its spatiality violates basic spatial rules of the so-called “external world” (for example, O’Shaughnessy, 1980); some go further and argue that bodily awareness itself is not intrinsically spatial (Noordhof, 2002). As the once famous local sign theory has it (Lotze, 1888), “each sensible nerve gives rise to its own characteristic sensation that is specific to the body part that is stimulated but spatial ascription does not derive from the spatial content of bodily sensations themselves” (Vignemont, 2011/2020). This goes against the tactile field view introduced in 2.a, which holds that some intrinsic spatiality of touch is sustained by tactile fields in skin-space, that is, a flattened receptor surface or sheet (derma-topic; Cheng, 2019). Skin-space is to be contrasted with body-space, understood in terms of torsos, limbs, joints, and their connections (somato-topic), and with external space, including peripersonal space, understood in terms of coordinates in an egocentric representation that update when the body parts move (spatio-topic). Relatedly, A. D. Smith (2002) has argued that bodily sensations are mere sensations and therefore non-perceptual, on the grounds that they do not meet his criteria for being objective. Other challenges to the perceptual model of bodily awareness include the so-called “knowledge without observation” (Anscombe, 1962), enactivist perspectives (Noë, 2004), and other action-based theories of perception (for example, Evans, 1982; Briscoe, 2014).

Now we are in a much better position to see why bodily awareness is philosophically important and intriguing. There can be many answers to this, as we have seen throughout, but one major reason is that philosophy seeks to understand the convoluted relations between the subjective and the objective, and the body is the medium through which the subjective can reach the objective. This can be called a bodily quest for objectivity: the body, as the seat of the subjective, is also itself objective, and is one’s way towards the rest of the objective world. It is worth emphasising that the body is itself part of the so-called “external world”; it is itself a denizen of the objective, that is, the mind-independent. One might think that the body is external to the mind, though this spatial metaphor is not uncontroversial: others might hold that the body is internal to the mind in the sense that the body is represented by the mind. Above we have touched on many aspects of the body and bodily awareness, and thereby seen how we can make progress on difficult philosophical issues in this area.

4. Conclusion

Bodily awareness is an extremely rich area of research that defies comprehensive introduction. Even if we doubled the word count here, some territories would remain uncovered. Above we have surveyed varieties of bodily awareness, including touch, proprioception, kinaesthesis, the vestibular sense, thermal sensation, pain, interoception, a relatively new category called “sngception,” and bodily feelings. We have also discussed some contemporary issues involving tactile fields, bodily IEM and IEM in general, mental ownership, bodily self-awareness, and body blindness. Finally, going beyond the Anglo-Saxon tradition, we have selectively discussed insights from the phenomenological tradition, notably on the possibility of being aware of one’s bodily self as a subjective object and an objective subject, and on whether bodily awareness is perceptual. Together these topics cover a huge amount of ground under the general heading of “bodily awareness.” It would be an exaggeration to say that bodily awareness has become a heated area in the early twenty-first century, but it is safe and accurate to say that it has been undergoing a resurgence in the first quarter of the century, as this article shows. The continuing importance of topics in this area seems assured, and there is much to follow up on in this rich area of research.

5. References and Further Reading

  • Aasen, S. (2019). Spatial aspects of olfactory experience. Canadian Journal of Philosophy, 49(8), 1041-1061.
  • Alsmith, A. J. T. (2015). Mental activity and the sense of ownership. Review of Philosophy and Psychology, 6(4), 881-896.
  • Alsmith, A. J. T. (forthcoming). Bodily self-consciousness. London: Routledge.
  • Anscombe, G. E. M. (1962). On sensation of position. Analysis, 22(3), 55-58.
  • Aquila, R. (1979). Personal identity and Kant’s “refutation of idealism.” Kant Studien, 70, 257-278.
  • Aristotle (1987). De Anima (On the soul). London: Penguin Classics.
  • Armstrong, D. M. (1962). Bodily sensations. London: Routledge.
  • Ataria, Y., Tanaka, S., & Gallagher, S. (Eds.) (2021). Body schema and body image: New directions. Oxford: Oxford University Press.
  • Aydede, M. (2013). Pain. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Baier, B., & Karnath, H-O. (2008). Tight link between our sense of limb ownership and self-awareness of actions. Stroke, 39(2), 486-488.
  • Bain, D. (2007). The location of pains. Philosophical Papers, 36(2), 171-205.
  • Baldwin, T. (1988). Phenomenology, solipsism, and egocentric thought. Aristotelian Society Supplementary Volume, 62(1), 27-60.
  • Bermúdez, J. L. (1995). Transcendental arguments and psychology: The example of O’Shaughnessy on intentional action. Metaphilosophy, 26(4), 379-401.
  • Bermúdez, J. L. (2011/2018). Bodily awareness and self-consciousness. In The bodily self: Selected essays. Cambridge, MA: MIT Press.
  • Berthier, M., Starkstein, S., & Leiguarda, R. (1988). Asymbolia for pain: A sensory-limbic disconnection syndrome. Annals of Neurology, 24(1), 41-49.
  • Berthoz, A. (1991). Reference frames for the perception and control of movement. In J. Paillard (Ed.), Brain and space. Oxford: Oxford University Press.
  • Blanke, O. (2012). Multisensory brain mechanisms of bodily self-consciousness. Nature Review Neuroscience, 13, 556-571.
  • Bolognini, N., Ronchi, R., Casati, C., Fortis, P., & Vallar, G. (2014). Multisensory remission of somatoparaphrenic delusion: My hand is back! Neurology: Clinical Practice, 4(3), 216-225.
  • Borg, E., Harrison, R., Stazicker, J., & Salomons, T. (2020). Is the folk concept of pain polyeidic? Mind and Language, 35, 29-47.
  • Bottini, G., Bisiach, E., Sterzi, R., & Vallar, G. (2002). Feeling touches in someone else’s hand. Neuroreport, 13(2), 249-252.
  • Bradley, A. (2021). The feeling of bodily ownership. Philosophy and Phenomenological Research, 102(2), 359-379.
  • Bradshaw, M. (2016). A return to self: Depersonalization and how to overcome it. Seattle, WA: Amazon Services International.
  • Briscoe, R. (2014). Spatial content and motoric significance. AVANT: The Journal of the Philosophical-Interdisciplinary Vanguard, 5(2), 199-217.
  • Bufacchi, R. J., & Iannetti, G. D. (2018). An action field theory of peripersonal space. Trends in Cognitive Sciences, 22(12), 1076-1090.
  • Cardinali, L., Brozzoli, C., Luauté, J., Roy, A. C., & Farnè, A. (2016). Proprioception is necessary for body schema plasticity: Evidence from a deafferented patient. Frontiers in Human Neuroscience, 10, 272.
  • Carman, T. (1999). The body in Husserl and Merleau-Ponty. Philosophical Topics, 27(2), 205-226.
  • Cassam, Q. (1995). Introspection and bodily self-ascription. In J. L. Bermúdez, A. J. Marcel, and N. M. Eilan (Eds.), The body and the self. Cambridge, MA: MIT Press.
  • Cassam, Q. (1997). Self and world. Oxford: Oxford University Press.
  • Cassam, Q. (2002). Representing bodies. Ratio, 15(4), 315-334.
  • Cassam, Q. (2011). The embodied self. In S. Gallagher (Ed.), The Oxford handbook of the self. Oxford: Oxford University Press.
  • Cassam, Q. (2019). Consciousness of oneself as subject. Philosophy and Phenomenological Research, 98(3), 736-741.
  • Cataldo, A., Ferrè, E. R., di Pellegrino, G., & Haggard, P. (2016). Thermal referral: Evidence for a thermoceptive uniformity illusion without touch. Scientific Reports, 6, 35286.
  • Chadha, M. (2018). No-self and the phenomenology of ownership. Australasian Journal of Philosophy, 96(1), 114-27.
  • Chen, W. Y., Huang, H. C., Lee, Y. T., & Liang, C. (2018). Body ownership and four-hand illusion. Scientific Reports, 8, 2153.
  • Cheng, T. (2019). On the very idea of a tactile field. In Cheng, T., Deroy, O., and Spence, C. (Eds.), Spatial senses: Philosophy of perception in an age of science. London: Routledge.
  • Cheng, T. (2020). Molyneux’s question and somatosensory spaces. In Ferretti, G., and Glenney, B. (Eds.), Molyneux’s question and the history of philosophy. London: Routledge.
  • Cheng, T., & Cataldo, A. (2022). Touch and other somatosensory senses. In Brigard, F. D. and Sinnott-Armstrong, W. (Eds.), Neuroscience and philosophy. Cambridge, MA: MIT Press.
  • Cole, J. (1991). Pride and a daily marathon. Cambridge, MA: MIT Press.
  • Cole, J., & Montero, B. (2007). Affective proprioception. Jenus Head, 9(2), 299-317.
  • Corns, J. (2020). The complex reality of pain. New York: Routledge.
  • Craig, A. D. (2003). Interoception: The sense of the physiological condition of the body. Current Opinion in Neurobiology, 13(4), 500-505.
  • Damasio, A. (1999). The feeling of what happens: Body and emotion in the making of consciousness. London: William Heinemann.
  • Day, B. L., & Fitzpatrick, R. C. (2005). Virtual head rotation reveals a process of route reconstruction from human vestibular signals. Journal of Physiology, 567(Pt 2), 591-597.
  • Dokic, J. (2003). The sense of ownership: An analogy between sensation and action. In J. Roessler and N. Eilan (Eds.), Agency and self-awareness: Issues in philosophy and psychology. Oxford: Oxford University Press.
  • Ehrsson, H. H., Spence, C., & Passingham, R. E. (2004). That’s my hand! Activity in premotor cortex reflects feeling of ownership of a limb. Science, 305(5685), 875-877.
  • Evans, G. (1980). Things without the mind. In Z. V. Straaten (Ed.), Philosophical subjects. Oxford: Oxford University Press.
  • Evans, G. (1982). The varieties of reference. Oxford: Oxford University Press.
  • Fardo, F., Beck, B., Cheng, T., & Haggard, P. (2018). A mechanism for spatial perception on human skin. Cognition, 178, 236-243.
  • Fardo, F., Finnerup, N. B., & Haggard, P. (2018). Organization of the thermal grill illusion by spinal segments, Annals of Neurology, 84(3), 463-472.
  • Ferretti, G., & Glenney, B. (Eds.), Molyneux’s question and the history of philosophy. London: Routledge.
  • Fridland, E. (2011). The case for proprioception. Phenomenology and Cognitive Sciences, 10(4), 521-540.
  • Fulkerson, M. (2013). The first sense: A philosophical study of human touch. Cambridge, MA: MIT Press.
  • Fulkerson, M. (2015/2020). Touch. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Gallagher, S. (1986). Body image and body schema: A conceptual clarification. Journal of Mind and Behavior, 7(4), 541-554.
  • Gallagher, S. (2008). Are minimal representations still representations? International Journal of Philosophical Studies, 16(3), 351-369.
  • Geldard, F. A., & Sherrick, C. E. (1972). The cutaneous “rabbit”: A perceptual illusion. Science, 178(4057), 178-179.
  • Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
  • Gray, R. (2013). What do our experiences of heat and cold represent? Philosophical Studies, 166(S1), 131-151.
  • Green, E. J. (2020). Representing shape in sight and touch. Mind and Language, online first.
  • Guillot, M. (2017). I, me, mine: On a confusion concerning the subjective character of experience. Review of Philosophy and Psychology, 8(1), 23-53.
  • Gurwitsch, A. (1964). The field of consciousness. Pittsburgh, PA: Duquesne University Press.
  • Haggard, P., & Giovagnoli, G. (2011). Spatial patterns in tactile perception: Is there a tactile field? Acta Psychologica, 137(1), 65-75.
  • Hamilton, A. (2005). Proprioception as basic knowledge of the body. In van Woudenberg, R., Roeser, S., Rood, R. (2005). Basic belief and basic knowledge. Ontos-Verlag.
  • Head, H., & Holmes, G. (1911). Sensory disturbances from cerebral lesions. Brain, 34(2-3), 102-254.
  • Henry, M. (1965/1975). Philosophy and phenomenology of the body. (G. Etzkorn, trans.). The Hague: Nijhoff.
  • Hill, C. (2005). Ow! The paradox of pain. In M. Aydede (Ed.), Pain: New essays on the nature of pain and the methodology of its study. Cambridge, MA: MIT Press.
  • Hill, C. (2017). Fault lines in familiar concepts of pain. In J. Corns (Ed.), The Routledge handbook of philosophy of pain. New York: Routledge.
  • Howe, K. A. (2018). Proprioceptive awareness and practical unity. Theorema: International Journal of Philosophy, 37(3), 65-81.
  • Huang, H. C., Lee, Y. T., Chen, W. Y., & Liang, C. (2017). The sense of 1PP-location contributes to shaping the perceived self-location together with the sense of body-location. Frontiers in Psychology, 8, 370.
  • Husserl, E. (1913/1998). Ideas pertaining to a pure phenomenology and to a phenomenological philosophy – first book: general introduction to a pure phenomenology. (F. Kersten, trans.). Dordrecht: Kluwer Academic Publishers.
  • Husserl, E. (1989). Ideas pertaining to a pure phenomenology and to a phenomenological philosophy – Second book: studies in the phenomenology of constitution. (R. Rojcewicz and A. Schuwer, trans.). Dordrecht: Kluwer Academic Publishers.
  • Katz, D. (1925/1989). The world of touch. Krueger, L. E. (trans.) Hillsdale, NJ: Erlbaum.
  • Khachouf, O. T., Poletti, S., & Pagnoni, G. (2013). The embodied transcendental: A Kantian perspective on neurophenomenology. Frontiers in Human Neuroscience, 7, 611.
  • Kinsbourne, M., & Lempert, H. (1980). Human figure representation by blind children. The Journal of General Psychology, 102(1), 33-37.
  • Korsmeyer, C. (2020). Things: In touch with the past. New York: Oxford University Press.
  • Kreuch, G. (2019). Self-feeling: Can self-consciousness be understood as a feeling? Springer.
  • Kripke, S. (1980). Naming and necessity. Cambridge, MA: Harvard University Press.
  • Lane, T. (2012). Toward an explanatory framework for mental ownership. Phenomenology and Cognitive Sciences, 11(2), 251-286.
  • Legrand, D. (2007a). Pre-reflective self-consciousness: On being bodily in the world. Janus Head, 9(2), 493-519.
  • Legrand, D. (2007b). Subjectivity and the body: Introducing basic forms of self-consciousness. Consciousness and Cognition, 16(3), 577-582.
  • Lenggenhager, B., Tadi, T., Metzinger, T., & Blanke, O. (2007). Video ergo sum: Manipulating bodily self-consciousness. Science, 317(5841), 1096-1099.
  • Lin, J. H., Hung, C. H., Han, D. S., Chen, S. T., Lee, C. H., Sun, W. Z., & Chen, C. C. (2018). Sensing acidosis: Nociception or sngception? Journal of Biomedical Science, 25, 85.
  • Liu, M. (2021). The polysemy view of pain. Mind and Language, Online first.
  • Liu, M., & Klein, C. (2020). Analysis, 80(2), 262-272.
  • Locke, J. (1693/1979). Letter to William Molynoux, 28 March. In de Beer, E. S. (Ed.), The correspondence of John Locke (vol. 9). Oxford: Clarendon Press.
  • Longuenesse, B. (2006). Self-consciousness and consciousness of one’s own body: Variations on a Kantian theme. Philosophical Topics, 34(1/2), 283-309.
  • Longuenesse, B. (2021). Revisiting Quassim Cassam’s Self and world. Analytic Philosophy, 62(1), 70-83.
  • Lotze, H. (1888). Logic, in three books: Of thought, of investigation, and of knowledge. Oxford: Clarendon Press.
  • Lycan, W. G. (1987). Consciousness. Cambridge, MA: MIT Press.
  • Mac Cumhaill, C. (2017). The tactile ground, immersion, and the “space between.” Southern Journal of Philosophy, 55(1), 5-31.
  • Macpherson, F. (2011). (Ed.) The senses: Classical and contemporary philosophical perspectives. Oxford: Oxford University Press.
  • Mancini, F., Stainitz, H., Steckelmacher, J., Iannetti,G. D., & Haggard, P. (2015). Poor judgment of distance between nociceptive stimuli. Cognition, 143, 41-47.
  • Mandrigin, A., & Thompson, E. (2015). Own-body perception. In M Matthen (Ed.), Oxford handbook of the philosophy of perception. Oxford: Oxford University Press.
  • Martin, M. G. F. (1992). Sight and touch. In Crane, T. (Ed.). The contents of experience: Essays on perception. New York: Cambridge University Press.
  • Martin, M. G. F. (1993). Sensory modalities and spatial properties. In N. Eilan, R. McCarty, and B. Brewer (Eds.), Spatial representation: Problems in philosophy and psychology. Oxford: Basil Blackwell.
  • Martin, M. G. F. (1995). Bodily awareness: A sense of ownership. In J. L. Bermúdez, A. Marcel, and N. Eilan (Eds.), The body and the self. Cambridge, MA: MIT Press.
  • Martin, M. G. F. (2002). The transparency of experience. Mind and Language, 17(4), 376-425.
  • McDowell, J. (1996). Mind and world. Cambridge, MA: Harvard University Press.
  • McGinn, C. (1996). The character of mind: An introduction to the philosophy of mind. Oxford: Oxford University Press.
  • McGlone, F., Wessberg, J., & Olausson, H. (2014). Discriminative and affective touch: Sensing and feeling. Neuron, 82(4), 737-755.
  • Merleau-Ponty, M. (1945/2013). Phenomenology of perception. (D. A. Landes, trans.) London: Routledge.
  • Merleau-Ponty, M. (1968). The visible and the invisible. (A. Lingis, trans.). Evanston: Northwestern University Press.
  • Montero, B. (2006). Proprioceiving someone else’s movement. Philosophical Explorations: An International Journal for the Philosophy of Mind and Action, 9(2), 149-161.
  • Morash, V., Pensky, A. E. C., Alfaro, A. U., & McKerracher, A. (2012). A review of haptic spatial abilities in the blind. Spatial Cognition and Computation, 12(2-3), 83-95.
  • Moro, V., Massimiliano, Z., & Salvatore, M. A. (2004). Changes in spatial position of hands modify tactile extinction but not disownership of contralesional hand in two right brain-damaged patients. Neurocase, 10(6), 437-443.
  • Nanay, B. (2016). Aesthetics as philosophy of perception. Oxford: Oxford University Press.
  • National Research Council (US) Committee on Recognition and Alleviation of Pain in Laboratory Animals. (2009). Recognition and alleviation of pain in laboratory animals. Washington, DC: National Academies Press.
  • Noë, A. (2004). Action in perception. Cambridge, MA: MIT Press.
  • Noordhof, P. (2002). In pain. Analysis, 61(2), 95-97.
  • O’Dea, J. (2011). A proprioceptive account of the sense modalities. In Macpherson, F. (Ed.), The senses: Classic and contemporary philosophical perspectives. Oxford: Oxford University Press.
  • O’Shaughnessy, B. (1980). The will, vol. 1. Cambridge: Cambridge University Press.
  • O’Shaughnessy, B. (1989). The sense of touch. Australasian Journal of Philosophy, 67(1), 37-58.
  • O’Shaughnessy, B. (1995). Proprioception and the body image. In Bermúdez, B., Marcel, A., & Eilan, N. (Eds.), The body and the self. Cambridge, MA: MIT Press.
  • O’Shaughnessy, B. (2000). Consciousness and the world. Oxford: Oxford University Press.
  • Paillard, J. (1999). Body schema and body image: A double dissociation in deafferented patients. In Gantchev, G. N., Mori, S., and Massion, J. (Eds.), Motor control today and tomorrow. Sofia: Professor Marius Drinov Academic Publishing House.
  • Peacocke, C. (1983). Sense and content: Experience, thought, and their relations. Oxford: Oxford University Press.
  • Peacocke, C. (2017). Philosophical reflections on the first person, the body, and agency. The subject’s matter: Self-consciousness and the body. Cambridge, MA: MIT Press.
  • Penfield, W., & Rasmussen, T. (1950). The cerebral cortex of man: A clinical study of localization of function. New York: Macmillan.
  • Perry, J. (1990). Self-location. Logos, 11, 17-31.
  • Perry, J. (2001). Reference and reflexivity. Stanford: CSLI Publications.
  • Plumbley, M. D. (2013). Hearing the shape of a room. Proceedings of the National Academy of Sciences of the United States of America, 201309932.
  • Rabellino, D., Frewen, P. A., McKinnon, M. C., & Lanius, R. A. (2020). Peripersonal space and bodily self-consciousness: Implications for psychological trauma-related disorders. Frontiers in Neuroscience, 14, 586605.
  • Ratcliffe, M. (2005). The feeling of being. Journal of Consciousness Studies, 12(8-10), 43-60.
  • Ratcliffe, M. (2008). Feelings of being: Phenomenology, psychiatry and the sense of reality. Oxford: Oxford University Press.
  • Ratcliffe, M. (2012). What is touch? Australasian Journal of Philosophy, 90(3), 413-432.
  • Ratcliffe, M. (2016). Existential feeling and narrative. In Muller, O. and Breyer, T. (Eds.), Funktionen des Lebendigen. Berlin: De Gruyter.
  • Reichenbach, H. (1947). Elements of symbolic logic. New York: Free Press.
  • Reid, T. (1863/1983). Inquiry and essays. Indiana: Hackett Publishing Company.
  • Reuter, K. (2017). The developmental challenge of the paradox of pain. Erkenntnis, 82, 265-283.
  • Reuter, K., Phillips, D., & Sytsma, J. (2014). In J. Sytsma (Ed.), Advances in experimental philosophy of mind. London: Bloomsbury Academic.
  • Richardson, L. (2010). Seeing empty space. European Journal of Philosophy, 18(2), 227-243.
  • Rizzolatti, G., Scandolara, C., Matelli, M., & Gentilucci, M. (1981). Afferent properties of periarcuate neurons in macaque monkeys. I. Somatosensory responses. Behavioural Brain Research, 2, 125-146.
  • Rosenthal, D. M. (2010). Consciousness, the self and bodily location. Analysis, 70(2), 270-276.
  • Salje, L. (2017). Crossed wires about crossed wires: Somatosensation and immunity to error through misidentification. Dialectica, 71(1), 35-56.
  • Salje, L. (2019). The inside-out binding problem. In Cheng, T., Deroy, O., and Spence, C. (Eds.), Spatial senses: Philosophy of perception in an age of science. London: Routledge.
  • Sartre, J-P. (1943/2003). Being and nothingness: An essay on phenomenological ontology. (H. E. Barnes, trans). Oxford: Routledge.
  • Searle, J. (1992). The rediscovery of the mind. Cambridge, MA: MIT Press.
  • Seth, A. (2013). Interoceptive inference, emotion, and the embodied self. Trends in Cognitive Sciences, 17(11), 565-573.
  • Schrenk, M. (2014). Is proprioceptive art possible? In Priest, G. and Young, D. (Eds.), Philosophy and the Martial Arts. New York: Routledge.
  • Schwenkler, J. (2013). The objects of bodily awareness. Philosophical Studies, 162(2), 465-472.
  • Serino, A., Giovagnoli, G., Vignemont, de. V., & Haggard, P. (2008). Acta Psychologica,
  • Serino, A., Noel, J-P., Mange, R., Canzoneri, E., Pellencin, E., Ruiz, J. B., Bernasconi, F., Blanke, O., & Herbelin, B. (2018). Peripersonal space: An index of multisensory body-environment interactions in real, virtual, and mixed realities. Frontiers in ICT, 4, 31.
  • Serrahima, C. (forthcoming). The bounded body: On the sense of bodily ownership and the experience of space. In Garcia-Carpintero, M. and Guillot, M. (Eds)., The sense of mineness. Oxford: Oxford University Press.
  • Shoemaker, S. (1968). Self-reference and self-awareness. The journal of Philosophy, 65(19), 555-567.
  • Shoemaker, S. (1996). The first-person perspective and other essays. Cambridge: Cambridge University Press.
  • Sherrington, C. S. (1906). (Ed.) The integrative action of the nervous system. Cambridge: Cambridge University Press.
  • Siegel, S. (2010). The contents of visual experience. Oxford: Oxford University Press.
  • Skrzypulec, B. (2021). Spatial content of painful sensations. Mind and Language, online first.
  • Smith, A. D. (2002). The problem of perception. Cambridge, MA: Harvard University Press.
  • Smith, J. (2006). Bodily awareness, imagination and the self. European Journal of Philosophy, 14(1), 49-68.
  • Smith, J. (2016). Experiencing phenomenology: An introduction. New York: Routledge.
  • Smythies, J. (1996). A note on the concept of the visual field in neurology, psychology, and visual neuroscience. Perception, 25(3), 369-371.
  • Soteriou, M. (2013). The mind’s construction: The ontology of mind and mental action. Oxford: Oxford University Press.
  • Steward, H. (1997). The ontology of mind: Events, processes, and states. Oxford: Clarendon Press.
  • Strawson, P. F. (1959). Individuals: An essay in descriptive metaphysics. London: Routledge.
  • Travis, C. (2004). The silence of the senses. Mind, 113(449), 57-94.
  • Tsakiris, M. (2017). The material me: Unifying the exteroceptive and interoceptive sides of the bodily self. In F. D. Vignemont and A. J. T. Alsmith (Eds.), The subject’s matter: Self-consciousness and the body. Cambridge, MA: MIT Press.
  • Tsakiris, M., & de Preester, H. (2019). The interoceptive mind: From homeostasis to awareness. Oxford: Oxford University Press.
  • Tuthill, J. C., & Azim, E. (2018). Proprioception. Current Biology, 28(5), R194-R203.
  • Vignemont, F. (2007). Habeas Corpus: The sense of ownership of one’s own body. Mind and Language, 22(4), 427-449.
  • Vignemont, F. (2011/2020). Bodily awareness. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy.
  • Vignemont, F. (2018a). Was Descartes right after all? An affective background for bodily awareness. In M. Tsakiris and H. de Preester (Eds.), The interoceptive mind: From homeostasis to awareness. Oxford: Oxford University Press.
  • Vignemont, F. (2018b). Mind the body: An exploration of bodily self-awareness. Oxford: Oxford University Press.
  • Vignemont, F. (2020). Bodily feelings: Presence, agency, and ownership. In U. Kriegel (Ed.), The Oxford handbook of the philosophy of consciousness. Oxford: Oxford University Press.
  • Vignemont, F. (2021). Feeling the world as being here. In F. de Vignemont, A. Serino, H. Y. Wong, and A. Farnè (Eds.), The world at our fingertips: A multidisciplinary exploration of peripersonal space. Oxford: Oxford University Press.
  • Vignemont, F. (forthcoming). Bodily awareness. Cambridge: Cambridge University Press.
  • Vignemont, F., & Alsmith, A. (2017) (Ed.) The subject’s matter: Self-consciousness and the body. Cambridge, MA: MIT Press.
  • Vignemont, F., & Iannetti, G. D. (2015). How many peripersonal spaces? Neuropsychologia, 70, 327-334.
  • Vignemont, F., & Massin, O. (2015). Touch. In Matthen, M. (Ed.) The Oxford Handbook of Philosophy of Perception. Oxford: Oxford University Press.
  • Vignemont, F., Serino, A., Wong, H. Y., & Farnè, A. (2021). (Eds.) The world at our fingertips: A multidisciplinary exploration of peripersonal space. Oxford: Oxford University Press.
  • Williams, B. (1970). The self and the future. The Philosophical Review, 79(2), 161-187.
  • Williams, B. (2006). Ethics and the limits of philosophy. London: Routledge.
  • Wilson, K. (2021). Individuating the senses of “smell”: Orthonasal versus retronasal olfaction. Synthese, 199, 4217-4242.
  • Wittgenstein, L. (1958). The blue and brown books. Oxford: Blackwell.
  • Wong, H. Y. (2015). On the significance of bodily awareness for bodily action. The Philosophical Quarterly, 65(261), 790-812.
  • Wong, H. Y. (2017a). On proprioception in action: Multimodality versus deafferentation. Mind and Language, 32(3), 259-282.
  • Wong, H. Y. (2017b). In and out of balance. In de Vignemont, F. and Alsmith, A. (Eds.), The subject’s matter: Self-consciousness and the body. Cambridge, MA: MIT Press.
  • Zahavi, D. (2021). Embodied subjectivity and objectifying self-consciousness: Cassam and phenomenology. Analytic Philosophy, 62, 97-105.

 

Author Information

Tony Cheng
Email: h.cheng.12@alumni.ucl.ac.uk
National Chengchi University
Taipei

George Orwell (1903—1950)

Eric Arthur Blair, better known by his pen name George Orwell, was a British essayist, journalist, and novelist. Orwell is most famous for his dystopian works of fiction, Animal Farm and Nineteen Eighty-Four, but many of his essays and other books have remained popular as well. His body of work provides one of the twentieth century’s most trenchant and widely recognized critiques of totalitarianism.

Orwell did not receive academic training in philosophy, but his writing repeatedly focuses on philosophical topics and questions in political philosophy, epistemology, philosophy of language, ethics, and aesthetics. Some of Orwell’s most notable philosophical contributions include his discussions of nationalism, totalitarianism, socialism, propaganda, language, class status, work, poverty, imperialism, truth, history, and literature.

Orwell’s writings map onto his intellectual journey. His earlier writings focus on poverty, work, and money, among other themes. Orwell examines poverty and work not only from an economic perspective, but also socially, politically, and existentially, and he rejects moralistic and individualistic accounts of poverty in favor of systemic explanations. In so doing, he provides the groundwork for his later championing of socialism.

Orwell’s experiences in the 1930s, including reporting on the living conditions of the poor and working class in Northern England as well as fighting as a volunteer soldier in the Spanish Civil War, further crystallized his political and philosophical outlook. This led him to write in 1946 that, “Every line of serious work I have written since 1936 has been, directly or indirectly, against totalitarianism and for democratic Socialism” (“Why I Write”).

For Orwell, totalitarianism is a political order focused on power and control. Much of Orwell’s effectiveness in writing against totalitarianism stems from his recognition of its epistemic and linguistic dimensions. This is exemplified by the claim of Winston Smith, the protagonist of Nineteen Eighty-Four: “Freedom is the freedom to say that two plus two makes four. If that is granted, all else follows.” Here Orwell uses, as he often does, a particular claim to convey a broader message. Freedom (a political state) rests on the ability to retain the true belief that two plus two makes four (an epistemic state) and the ability to communicate that truth to others (via a linguistic act).

Orwell also argues that political power is dependent upon thought and language. This is why the totalitarian, who seeks complete power, requires control over thought and language. In this way, Orwell’s writing can be viewed as philosophically ahead of its time for the way it brings together political philosophy, epistemology, and philosophy of language.

Table of Contents

  1. Biography
  2. Political Philosophy
    1. Poverty, Money, and Work
    2. Imperialism and Oppression
    3. Socialism
    4. Totalitarianism
    5. Nationalism
  3. Epistemology and Philosophy of Mind
    1. Truth, Belief, Evidence, and Reliability
    2. Ignorance and Experience
    3. Embodied Cognition
    4. Memory and History
  4. Philosophy of Language
    1. Language and Thought
    2. Propaganda
  5. Philosophy of Art and Literature
    1. Value of Art and Literature
    2. Literature and Politics
  6. Orwell’s Relationship to Academic Philosophy
  7. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Eric Arthur Blair was born on June 25, 1903 in India. His English father worked as a member of the British specialized services in colonial India, where he oversaw local opium production for export to China. When Blair was less than a year old, his mother, of English and French descent, returned to England with him and his older sister. He saw relatively little of his father until he was eight years old.

Blair described his family as part of England’s “lower-upper-middle class.” Blair had a high degree of class consciousness, which became a common theme in his work and a central concern in his facetiously titled autobiographical essay, “Such, Such Were the Joys,” about his time at the English preparatory school St. Cyprian’s, which he attended from ages eight to thirteen on a merit-based scholarship. From ages thirteen to eighteen, Blair attended the prestigious English public school Eton, also on a merit-based scholarship.

After graduating from Eton, where he had not been a particularly successful student, Blair decided to follow in his father’s footsteps and join the specialized services of the British Empire rather than pursue higher education. He was stationed in Burma (now Myanmar), where his mother had been raised. He spent five unhappy years with the Imperial Police in Burma (1922-1927) before leaving the position to return to England in hopes of becoming a writer.

Partly out of need and partly out of desire, Blair spent several years living in or near poverty both in Paris and London. His experiences formed the basis for his first book, Down and Out in Paris and London, which was published in 1933. Blair published the book under the pen name George Orwell, which became the moniker he would use for his published writings for the rest of his life.

Orwell’s writing was often inspired by personal experience. He used his experiences working for imperial Britain in Burma as the foundation for his second book, Burmese Days, first published in 1934, and his frequently anthologized essays, “A Hanging” and “Shooting an Elephant,” first published in 1931 and 1936 respectively.

He drew on his experiences as a hop picker and schoolteacher in his third novel, A Clergyman’s Daughter, first published in 1935. His next novel, Keep the Aspidistra Flying, published in 1936, featured a leading character who had given up a middle-class job for the subsistence pay of a bookseller and the chance to try to make it as a writer. At the end of the novel, the protagonist gets married and returns to his old middle-class job. Orwell wrote this book while he himself was working as a bookseller and would soon be married.

The years 1936-1937 included several major events for Orwell, which would influence his writing for the rest of his life. Orwell’s publisher, the socialist Victor Gollancz, suggested that Orwell spend time in the industrial north of England in order to gather experience about the conditions there for journalistic writing. Orwell did so during the winter of 1936. Those experiences formed the foundation for his 1937 book, The Road to Wigan Pier. The first half of Wigan Pier reported on the poor working conditions and poverty that Orwell witnessed. The second half focused on the need for socialism and the reasons why Orwell thought the British left intelligentsia had failed in convincing the poor and working class of the need for socialism. Gollancz published Wigan Pier as part of his Left Book Club, which provided Wigan Pier with a larger platform and better sales than any of his previous books.

In June 1936, Orwell married Eileen O’Shaughnessy, an Oxford graduate with a degree in English who had worked various jobs, including those of teacher and secretary. Shortly thereafter, Orwell became a volunteer soldier fighting on behalf of the left-leaning Spanish Republicans against Francisco Franco and the Nationalist right in the Spanish Civil War. His wife joined him in Spain later. Orwell’s experiences in Spain further entrenched his shift towards overtly political writing. He experienced first-hand the infighting among the various left-wing factions opposed to Franco. He also witnessed the control that the Soviet Communists sought to exercise over both the war and, perhaps more importantly, the narratives told about it.

Orwell fought with the POUM (Partido Obrero de Unificación Marxista) militia that was later maligned by Soviet propaganda. The Soviets leveled a range of accusations against the militia, including that its members were Trotskyists and spies for the other side. As a result, Spain became an unsafe place for him and Eileen. They escaped Spain by train to France in the summer of 1937. Orwell later wrote about his experiences in the Spanish Civil War in Homage to Catalonia, published in 1938.

Just as Wigan Pier had signaled the shift to an abiding focus on politics and political ideas in Orwell’s writing, Homage to Catalonia signaled the shift to an abiding focus on epistemology and language in his work. Orwell’s time in Spain helped him understand how language shapes beliefs and how beliefs, in turn, shape the contours of power. Thus, Homage to Catalonia does not mark a mere epistemic and linguistic turn in Orwell’s thinking. It also marks a significant development in his views about the complex relationship between language, thought, and power.

Orwell’s experiences in Spain also further cemented his anti-Communism and his role as a critic of the left operating within the left. After a period of ill health upon returning from Spain, owing to his weak lungs and the throat wound he had suffered when shot in battle, Orwell took on a grueling pace of literary production, publishing Coming Up for Air in 1939, Inside the Whale and Other Essays in 1940, and his lengthy essay on British Socialism, “The Lion and the Unicorn: Socialism and the English Genius,” in 1941, as well as many other essays and reviews.

Orwell would have liked to serve in the military during the Second World War, but his ill health prevented him from doing so. Instead, from 1941 to 1943 he worked for the British Broadcasting Corporation (BBC). His job was meant, in theory, to aid Britain’s war efforts: Orwell was tasked with creating and delivering radio content to listeners on the Indian subcontinent in hopes of generating support for Britain and the Allied Powers. There were, however, relatively few listeners, and Orwell came to consider the job a waste of his time. Nevertheless, his experiences of bureaucracy and censorship at the BBC would later serve as one of the inspirations for the “Ministry of Truth,” which plays a prominent role in the plot of Nineteen Eighty-Four (Shelden 1991, 380-381).

Orwell’s final years were a series of highs and lows. After leaving the BBC, Orwell was hired as the literary editor at the democratic socialist magazine, the Tribune. As part of his duties, he wrote a regular column titled “As I Please.” He and Eileen, who herself was working for the BBC, adopted a baby boy named Richard in 1944. Shortly before they adopted Richard, Orwell had finished work on what was to be his breakthrough work, Animal Farm. Orwell originally had trouble finding someone to publish Animal Farm due to its anti-Communist message and publishers’ desires not to undermine Britain’s war effort, given that the United Kingdom was allied with the USSR against Nazi Germany at the time. The book was eventually published in August 1945, a few months after Eileen had died unexpectedly during an operation at age thirty-nine.

Animal Farm was a commercial success in both the United States and the United Kingdom. This gave Orwell both wealth and literary fame. Orwell moved with his sister Avril and Richard to the Scottish island of Jura, where Orwell hoped to be able to write with less interruption and to provide a good environment in which to raise Richard. During this time, living without electricity on the North Atlantic coast, Orwell’s health continued to decline. He was eventually diagnosed with tuberculosis.

Orwell pressed ahead on completing what was to be his last book, Nineteen Eighty-Four. In the words of one of Orwell’s biographers, Michael Shelden, Nineteen Eighty-Four is a book in which “Almost every aspect of Orwell’s life is in some way represented.” Published in 1949, Nineteen Eighty-Four was in many ways the culmination of Orwell’s life’s work: it dealt with all the major themes of his writing—poverty, social class, war, totalitarianism, nationalism, censorship, truth, history, propaganda, language, and literature, among others.

Orwell died less than a year after the publication of Nineteen Eighty-Four. Shortly before his death, he had married Sonia Brownell, who had worked for the literary magazine Horizon. Brownell, who later went by Sonia Brownell Orwell, became one of Orwell’s literary executors. Her efforts to promote her late husband’s work included establishing the George Orwell Archive at University College London and co-editing, with Ian Angus, a four-volume collection of Orwell’s essays, journalism, and letters, first published in 1968. The publication of this collection further increased interest in Orwell and his work, which has yet to abate more than seventy years after his death.

2. Political Philosophy

Orwell’s claim that “Every line of serious work I have written since 1936 has been, directly or indirectly, against totalitarianism and for democratic Socialism” divides his work into two periods: pre-1936 and 1936-and-after.

Orwell’s first period (pre-1936) focuses on two sets of interrelated themes: (1) poverty, money, work, and social status, and (2) imperialism and its ethical costs. His second period (1936-and-after) is characterized by his strong views on politics and his focus on the interconnections between language, thought, and power.

a. Poverty, Money, and Work

Orwell frequently wrote about poverty. It is a central topic in his books Down and Out and Wigan Pier and many of his essays, including “The Spike” and “How the Poor Die.” In writing about poverty, Orwell does not adopt an objective “view from nowhere”: rather, he writes as a member of the middle class to readers in the upper and middle classes. In doing so, he seeks to correct common misconceptions about poverty held by those in the upper and middle classes. These correctives deal with both the phenomenology of poverty and its causes.

His overall picture of poverty is less dramatic but more benumbing than his audience might initially imagine: one’s spirit is not crushed by poverty but rather withers away underneath it.

Orwell’s phenomenology of poverty is exemplified in the following passage from Down and Out:

It is altogether curious, your first contact with poverty. You have thought so much about poverty it is the thing you have feared all your life, the thing you knew would happen to you sooner or later; and it is all so utterly and prosaically different. You thought it would be quite simple; it is extraordinarily complicated. You thought it would be terrible; it is merely squalid and boring. It is the peculiar lowness of poverty that you discover first; the shifts that it puts you to, the complicated meanness, the crust-wiping (Down and Out, 16-17).

This account tracks Orwell’s own experiences by assuming the perspective of one who encounters poverty later in life, rather than the perspective of someone born into poverty. At least for those who “come down” into poverty, Orwell identifies a silver lining: the fear of poverty in a hierarchical capitalist society is perhaps worse than poverty itself. Once you realize that you can survive poverty (something Orwell seemed to think most middle-class people in England who fall into poverty could do), there is “a feeling of relief, almost of pleasure, at knowing yourself at last genuinely down and out” (Down and Out, 20-21). This silver lining, however, seems to be limited to those who enter poverty after having received an education. Orwell concludes that those who have always been down and out are the ones who deserve pity because such a person “faces poverty with a blank, resourceless mind” (Down and Out, 180). This latter statement invokes controversial assumptions in the philosophy of mind and is indicative of the ways in which Orwell was never able to overcome certain class biases from his own education. Orwell’s views on the working class and the poor have been critiqued by some scholars, including Raymond Williams (1971) and Beatrix Campbell (1984).

Much of Orwell’s discussion about poverty is aimed at humanizing poor people and at rooting out misconceptions about them. Orwell saw no inherent difference of character between rich and poor. It was their circumstances that differed, not their moral goodness. He identifies the English as having “a strong sense of the sinfulness of poverty” (Down and Out, 202). Through personal narratives, Orwell seeks to undermine this sense, concluding instead that “The mass of the rich and the poor are differentiated by their incomes and nothing else, and the average millionaire is only the average dishwasher dressed in a new suit” (Down and Out, 120). Orwell blames poverty instead on systemic factors, which the rich have the ability to change. Thus, if Orwell were to pass blame for the existence of poverty, it is not the poor on whom he would pass it.

If poverty is erroneously associated with vice, Orwell notes that money is also erroneously associated with virtue. This theme is taken up most directly in his 1936 novel, Keep the Aspidistra Flying, which highlights the central role that money plays in English life through the failures of the novel’s protagonist to live a fulfilling life that does not revolve around money. Orwell is careful to note that the significance of money is not merely economic, but also social. In Wigan Pier, Orwell notes that English class stratification is a “money-stratification” but that it is also a “shadowy caste-system” that “is not entirely explicable in terms of money” (122). Thus, both money and culture seem to play a role in Orwell’s account of class stratification in England.

Orwell’s view on the social significance of money helped shape his views about socialism. For example, in “The Lion and the Unicorn,” Orwell argued in favor of a socialist society in which income disparities were limited on the grounds that a “man with £3 a week and a man with £1500 a year can feel themselves fellow creatures, which the Duke of Westminster and the sleepers on the Embankment benches cannot.”

Orwell was attuned to various ways in which money impacts work and vice versa. For example, in Keep the Aspidistra Flying, the protagonist, Gordon Comstock, leaves his job in order to have time to write, only to discover that the discomforts of living on very little money have drained him of the motivation and ability to write. This is in keeping with Orwell’s view that creative work, such as making art or writing stories, requires a certain level of financial comfort. Orwell expresses this view in Wigan Pier, writing that, “You can’t command the spirit of hope in which anything has got to be created, with that dull evil cloud of unemployment hanging over you” (82).

Orwell sees this inability to do creative or other meaningful work as itself one of the harmful consequences of poverty. This is because Orwell views engaging in satisfying work as a meaningful part of human experience. He argues that human beings need work and seek it out (Wigan Pier, 197) and even goes so far as to claim that being cut off from the chance to work is being cut off from the chance of living (Wigan Pier, 198). For Orwell, work is a way in which we can meaningfully engage both our bodies and our minds; it is valuable when it contributes to human flourishing.

But this does not mean that Orwell thinks all work has such value. Orwell is often critical of various social circumstances that require people to engage in work that they find degrading, menial, or boring. He shows particular distaste for working conditions that combine undesirability with inefficiency or exploitation, such as the conditions of low-level staff in Paris restaurants and coal miners in Northern England. Orwell recognizes that workers tolerate such conditions out of necessity and desperation, even though such working conditions often rob the workers of many aspects of a flourishing human life.

b. Imperialism and Oppression

By the time he left Burma at age 24, Orwell had come to strongly oppose imperialism. His anti-imperialist works include his novel Burmese Days, his essays “Shooting an Elephant” and “A Hanging,” and chapter 9 of Wigan Pier, in which he wrote that, by the time he left his position with the Imperial Police in Burma, “I hated the imperialism I was serving with a bitterness which I probably cannot make clear” (Wigan Pier, 143).

In keeping with Orwell’s tendency to write from experience, Orwell focused mostly on the damage that he saw imperialism causing the imperialist oppressor rather than the oppressed. One might critique Orwell for failing to better account for the damage imperialism causes the oppressed, but one might also credit Orwell for discussing the evils of imperialism in a manner that might make its costs seem real to his audience, which, at least initially, consisted mostly of beneficiaries of British imperialism.

In writing about the experience of imperialist oppression from the perspective of the oppressor, Orwell often returns to several themes.

The first is the role of experience. Orwell argues that one can only really come to hate imperialism by being a part of imperialism (Wigan Pier, 144). One can doubt this is true, while still granting Orwell the emotional force of the point that experiencing imperialism firsthand can give one a particularly vivid understanding of imperialism’s “tyrannical injustice,” because one is, as Orwell put it, “part of the actual machinery of despotism” (Wigan Pier, 145).

Playing such a role in the machinery of despotism connects to a second theme in Orwell’s writing on imperialism: the guilt and moral damage caused by being an imperialist oppressor. In Wigan Pier, for example, Orwell writes the following about his state of mind after working for five years for the British Imperial Police in Burma:

I was conscious of an immense weight of guilt that I had got to expiate. I suppose that sounds exaggerated; but if you do for five years a job that you thoroughly disapprove of, you will probably feel the same. I had reduced everything to a simple theory that the oppressed are always right and the oppressors always wrong: a mistaken theory, but the natural result of being one of the oppressors yourself (Wigan Pier, 148).

A third theme in Orwell’s writing about imperialism is about ways in which imperialist oppressors—despite having economic and political power over the oppressed—themselves become controlled, in some sense, by those whom they oppress. For example, in “Shooting an Elephant” Orwell presents himself as having shot an elephant that got loose in a Burmese village merely in order to satisfy the local people’s expectations, even though he doubted shooting the elephant was necessary. Orwell writes of the experience that “I perceived in this moment that when the white man turns tyrant it is his own freedom that he destroys…For it is the condition of his rule that he shall spend his life trying to impress the ‘natives’ and so in every crisis he has got to do what the ‘natives’ expect of him.”

Thus, on Orwell’s account, no one is free under conditions of imperialist oppression—neither the oppressors nor the oppressed. The oppressed experience what Orwell calls in Wigan Pier “double oppression”: imperial power not only commits substantive injustices against them, but does so at the hands of unwanted foreign invaders (Wigan Pier, 147). Oppressors, on the other hand, feel the need to conform to their role as oppressors despite their guilt, shame, and desire to do otherwise (feelings Orwell seemed to think were nearly universal among the British imperialists of his day).

Notably, some of Orwell’s earliest articulations of how pressures to socially conform can lead to suppression of freedom of speech occur in the context of his discussions of the lack of freedom experienced by imperialist oppressors. For example, in “Shooting an Elephant,” Orwell wrote that he “had to think out [his] problems in the utter silence that is imposed on every Englishman in the East.” And in Wigan Pier, he wrote that for British imperialists in India there was “no freedom of speech” and that “merely to be overheard making a seditious remark may damage [one’s] career” (144).

c. Socialism

From the mid-1930s until the end of his life, Orwell advocated for socialism. In doing so, he sought to defend socialism against mischaracterization. Thus, to understand Orwell’s views on socialism, one must understand both what Orwell thought socialism was and what he thought it was not.

Orwell offers his most succinct definition of socialism in Wigan Pier as meaning “justice and liberty.” The sense of justice he had in mind included not only economic justice, but also social and political justice. Inclusion of the word “liberty” in his definition of socialism helps explain why elsewhere Orwell specifies that he is a democratic socialist. For Orwell, democratic socialism is a political order that provides social and economic equality while also preserving robust personal freedom. Orwell was particularly concerned to preserve what we might call the intellectual freedoms: freedom of thought, freedom of expression, and freedom of the press.

Orwell’s most detailed account of socialism, at least as he envisioned it for Great Britain, is included in his essay “The Lion and the Unicorn.” Orwell notes that socialism is usually defined as “common ownership of the means of production” (Part II, Section I), but he takes this definition to be insufficient. For Orwell, socialism also requires political democracy, the removal of hereditary privilege in the United Kingdom’s House of Lords, and limits on income inequality (Part II, Section I).

For Orwell, one of the great benefits of socialism seems to be the removal of class-based prejudice. Orwell saw this as necessary for the creation of fellow feeling between people within a society. Given his experiences within socially stratified early twentieth century English culture, Orwell saw the importance of removing both economic and social inequality in achieving a just and free society.

This is reflected in specific proposals that Orwell suggested England adopt going into World War II. (In “The Lion and the Unicorn,” Orwell typically refers to England or Britain, rather than the United Kingdom as a whole. This is true of much of Orwell’s work.) These proposals included:

I. Nationalization of land, mines, railways, banks and major industries.
II. Limitation of incomes, on such a scale that the highest tax-free income in Britain does not exceed the lowest by more than ten to one.
III. Reform of the educational system along democratic lines.
IV. Immediate Dominion status for India, with power to secede when the war is over.
V. Formation of an Imperial General Council, in which the colored peoples are to be represented.
VI. Declaration of formal alliance with China, Abyssinia and all other victims of the Fascist powers. (Part III, Section II)

Orwell viewed these as steps that would turn England into a “socialist democracy.”

In the latter half of Wigan Pier, Orwell argues that many people are turned off by socialism because they associate it with things that are not inherent to socialism. Orwell contends that socialism does not require the promotion of mechanical progress, nor does it require the rejection of parochialism or patriotism. Orwell also views socialism as distinct from both Marxism and Communism, viewing the latter as a form of totalitarianism that at best puts on a socialist façade.

Orwell contrasts socialism with capitalism, which he defines in “The Lion and the Unicorn” as “an economic system in which land, factories, mines and transport are owned privately and operated solely for profit.” Orwell’s primary reason for opposing capitalism is his contention that capitalism “does not work” (Part II, Section I). Orwell offers some theoretical reasons to think capitalism does not work (for example, “It is a system in which all the forces are pulling in opposite directions and the interests of the individual are as often as not totally opposed to those of the State” (Part II, Section I)). But the core of Orwell’s argument against capitalism is grounded in claims about experience. In particular, he argues that capitalism left Britain ill-prepared for World War II and led to unjust social inequality.

d. Totalitarianism

Orwell conceives of totalitarianism as a political order focused on absolute power and control. The totalitarian attitude is exemplified by the antagonist, O’Brien, in Nineteen Eighty-Four. The fictional O’Brien is a powerful government official who uses torture and manipulation to gain power over the thoughts and actions of the protagonist, Winston Smith, a low-ranking official working in the propaganda-producing “Ministry of Truth.” Significantly, O’Brien treats his desire for power as an end in itself. O’Brien represents power for power’s sake.

Orwell recognized that because totalitarianism seeks complete power and total control, it is incompatible with the rule of law—that is, that totalitarianism is incompatible with stable laws that apply to everyone, including political leaders themselves. In “The Lion and the Unicorn,” Orwell writes of “[t]he totalitarian idea that there is no such thing as law, there is only power.” While law limits a ruler’s power, totalitarianism seeks to obliterate the limits of law through the uninhibited exercise of power. Thus, the fair and consistent application of law is incompatible with the kind of complete centralized power and control that is the final aim of totalitarianism.

Orwell sees totalitarianism as a distinctly modern phenomenon. For Orwell, Soviet Communism, Italian Fascism, and German Nazism were the first political orders seeking to be truly totalitarian. In “Literature and Totalitarianism,” Orwell describes the way in which totalitarianism differs from previous forms of tyranny and orthodoxy as follows:

The peculiarity of the totalitarian state is that though it controls thought, it doesn’t fix it. It sets up unquestionable dogmas, and it alters them from day to day. It needs the dogmas, because it needs absolute obedience from its subjects, but it can’t avoid the changes, which are dictated by the needs of power politics (“Literature and Totalitarianism”).

In pursuing complete power, totalitarianism seeks to bend reality to its will. This requires treating political power as prior to objective truth.

But Orwell denies that truth and reality can bend in the ways that the totalitarian wants them to. Objective truth itself cannot be obliterated by the totalitarian (although perhaps the belief in objective truth can be). It is for this reason that Orwell writes in “Looking Back on the Spanish War” that “However much you deny the truth, the truth goes on existing, as it were, behind your back, and you consequently can’t violate it in ways that impair military efficiency.” Orwell considers this to be one of the two “safeguards” against totalitarianism. The other safeguard is “the liberal tradition,” by which Orwell means something like classical liberalism and its protection of individual liberty.

Orwell understood that totalitarianism could be found on the political right and left. For Orwell, both Nazism and Communism were totalitarian (see, for example, “Raffles and Miss Blandish”). What united both the Soviet Communist and the German Nazi under the banner of totalitarianism was a pursuit of complete power and the ideological conformity that such power requires. Orwell recognized that such power required extensive capacity for surveillance, which explains why means of surveillance such as the “telescreen” and the “Thought Police” play a large role in the plot of Nineteen Eighty-Four. (For a discussion of Orwell as an early figure in the ethics of surveillance, see the article on surveillance ethics.)

e. Nationalism

One of Orwell’s more often cited contributions to political thought is his development of the concept of nationalism. In “Notes on Nationalism,” Orwell describes nationalism as “the habit of identifying oneself with a single nation or other unit, placing it beyond good and evil and recognizing no other duty than that of advancing its interests.” In “The Sporting Spirit,” Orwell adds that nationalism is “the lunatic modern habit of identifying oneself with large power units and seeing everything in terms of competitive prestige.”

In both these descriptions Orwell describes nationalism as a “habit.” Elsewhere, he refers to nationalism more specifically as a “habit of mind.” This habit of mind has at least two core features for Orwell—namely, (1) rooting one’s identity in group membership rather than in individuality, and (2) prioritizing advancement of the group one identifies with above all other goals. It is worth examining each of these features in more detail.

For Orwell, nationalism requires subordination of individual identity to group identity, where the group one identifies with is a “large power unit.” Importantly, for Orwell this large power unit need not be a nation. Orwell considered nationalism to be prevalent in movements as varied as “Communism, political Catholicism, Zionism, Antisemitism, Trotskyism and Pacifism” (“Notes on Nationalism”). What is required is that the large power unit be something that individuals can adopt as the center of their identity. This can happen via positive attachment (that is, by identifying with a group), but it can also happen via negative rejection (that is, by identifying as against a group). This is how, for example, Orwell’s list of movements with nationalistic tendencies could include both Zionism and Antisemitism.

But making group membership the center of one’s identity is not on its own sufficient for nationalism as Orwell understood it. Nationalists make advancement of their group their top priority. For this reason, Orwell states that nationalism “is inseparable from the desire for power” (“Notes on Nationalism”). The nationalist stance is aggressive. It seeks to overtake all else. Orwell contrasts the aggressive posture taken by nationalism with a merely defensive posture that he refers to as patriotism. For Orwell, patriotism is “devotion to a particular place and a particular way of life, which one believes to be the best in the world but has no wish to force on other people” (“Notes on Nationalism”). He sees patriotism as laudable but sees nationalism as dangerous and harmful.

In “Notes on Nationalism,” Orwell writes that the “nationalist is one who thinks solely, or mainly, in terms of competitive prestige.” As a result, the nationalist “may use his mental energy either in boosting or in denigrating—but at any rate his thoughts always turn on victories, defeats, triumphs and humiliations.” In this way, Orwell’s analysis of nationalism can be seen as a forerunner of much of the contemporary discussion about political tribalism and negative partisanship, which occurs when one’s partisan identity is primarily driven by dislike of one’s outgroup rather than support for one’s ingroup (Abramowitz and Webster).

It is worth noting that Orwell takes his own definition of nationalism to be somewhat stipulative. Orwell started with a concept that he felt needed to be discussed and decided that nationalism was the best name for this concept. Thus, his discussions of nationalism (and patriotism) should not be considered conceptual analysis: rather, these discussions are more akin to what is now often called conceptual ethics or conceptual engineering.

3. Epistemology and Philosophy of Mind

Just as 1936-37 marked a shift toward the overtly political in Orwell’s writing, so too those years marked a shift toward the overtly epistemic. Orwell was acutely aware of how powerful entities, such as governments and the wealthy, were able to influence people’s beliefs. Witnessing both the dishonesty and success of propaganda about the Spanish Civil War, Orwell worried that these entities had become so successful at controlling others’ beliefs that “The very concept of objective truth [was] fading out of the world” (“Looking Back on the Spanish War”). Orwell’s desire to defend truth, alongside his worries that truth could not be successfully defended in a completely totalitarian society, culminated in the frequent epistemological ruminations of Winston Smith, the fictional protagonist in Nineteen Eighty-Four.

a. Truth, Belief, Evidence, and Reliability

Orwell’s writing routinely employs many common epistemic terms from philosophy, including “truth,” “belief,” “knowledge,” “facts,” “evidence,” “testimony,” “reliability,” and “fallibility,” among others, yet he also seems to have taken for granted that his audience would understand these terms without defining them. Thus, one must look at how Orwell uses these terms in context in order to figure out what he meant by them.

To start with the basics, Orwell distinguishes between belief and truth and rejects the view that group consensus makes something true. For example, in his essay on Rudyard Kipling, Orwell writes “I am not saying that that is a true belief, merely that it is a belief which all modern men do actually hold.” Such a statement assumes that truth is a property that can be applied to beliefs, that truth is not grounded on acceptance by a group, and that someone’s believing something does not make it true.

On the contrary, Orwell seems to think that truth is, in an important way, mind-independent. For example, he writes that, “However much you deny the truth, the truth goes on existing, as it were, behind your back, and you consequently can’t violate it in ways that impair military efficiency” (“Looking Back on the Spanish War”). For Orwell, truth is derived from the way the world is. Because the world is a certain way, actions based on beliefs that fail to accord with reality tend to fail. This is why rejecting objective truth wholesale would, for instance, “impair military efficiency.” You can claim there are enough rations and munitions for your soldiers, but if, in fact, there are not, you will suffer military setbacks. Orwell recognizes this as a pragmatic reason to pursue objective truth.

Orwell does not talk about justification for beliefs as academic philosophers might. However, he frequently appeals to quintessential sources of epistemic justification—such as evidence and reliability—as indicators of a belief’s worthiness of acceptance and its likelihood of being true. For example, Orwell suggests that if one wonders whether one harbors antisemitic attitudes, one should “start his investigation in the one place where he could get hold of some reliable evidence—that is, in his own mind” (“Antisemitism in Britain”). Regardless of what one thinks of Orwell’s strategy for detecting antisemitism, this passage shows Orwell’s assumption that, at least some of the time, we can obtain reliable evidence via introspection.

Orwell’s writings on the Spanish Civil War provide a particularly rich set of texts from which to learn about the conditions under which Orwell thinks we can obtain reliable evidence. This is because Orwell was seeking to help readers (and perhaps also himself) separate truth from lies about what happened during that war. In so doing, Orwell offers an epistemology of testimony. For example, he writes:

Nor is there much doubt about the long tale of Fascist outrages during the last ten years in Europe. The volume of testimony is enormous, and a respectable proportion of it comes from the German press and radio. These things really happened, that is the thing to keep one’s eye on (“Looking Back on the Spanish War”).

Here, Orwell appeals to both the volume and the source of testimony as reason to have little doubt that fascist outrages had been occurring in Europe. Orwell also sometimes combines talk of evidence via testimony with other sources of evidence—like first-hand experience—writing, for example, “I have had accounts of the Spanish jails from a number of separate sources, and they agree with one another too well to be disbelieved; besides I had a few glimpses into one Spanish jail myself” (Homage to Catalonia, 179).

While recognizing the epistemic challenges posed by propaganda and self-interest, Orwell was no skeptic about knowledge. He was comfortable attributing knowledge to agents and referring to states of affairs as facts, writing, for example: “These facts, known to many journalists on the spot, went almost unmentioned to the British press” (“The Prevention of Literature”). Orwell was less sanguine about our ability to know with certainty, writing, for example, “[It] is difficult to be certain about anything except what you have seen with your own eyes, and consciously or unconsciously everyone writes as a partisan” (Homage to Catalonia, 195). This provides reason to think that Orwell was a fallibilist about knowledge—that is, someone who thinks you can know a proposition even while lacking certainty about the truth of that proposition. (For example, a fallibilist might claim to know she has hands but still deny that she is certain that she has hands.)

Orwell saw democratic socialism as responsive to political and economic facts, whereas he saw totalitarianism as seeking to bend the facts to its will. Thus, Orwell’s promotion of objective truth is closely tied to his promotion of socialism over totalitarianism. This led Orwell to confess that he was frightened by “the feeling that the very concept of objective truth is fading out of the world.” For Orwell, acknowledging objective truth requires acknowledging reality and the limitations reality places on us. Reality says that 2 + 2 = 4 and not that 2 + 2 = 5.

In this way, Orwell uses the protagonist of Nineteen Eighty-Four, Winston Smith, to express his views on the relationship between truth and freedom. An essential part of freedom for Orwell is the ability to think and to speak the truth. Orwell was especially prescient in identifying hindrances to the recognition of truth and the freedom that comes with it. These threats include nationalism, propaganda, and technology that can be used for constant surveillance.

b. Ignorance and Experience

Writing was a tool Orwell used to try to dispel his readers’ ignorance. He was a prolific writer who wrote many books, book reviews, newspaper editorials, magazine articles, radio broadcasts, and letters during a relatively short career. In his writing, he sought to disabuse the rich of ignorant notions about the poor; he sought to correct mistaken beliefs about the Spanish Civil War that had been fueled by fascist propaganda; and he sought to counteract inaccurate portrayals of democratic socialism and its relationship to Soviet Communism.

Orwell’s own initial ignorance on these matters had been dispelled by life experience. As a result, he viewed experience as capable of overcoming ignorance, and he seemed to believe that testimony about experience could do the same for those who received it. Thus, Orwell sought to testify to his own experiences in ways that might correct the inaccurate perceptions of readers who lacked firsthand acquaintance with the matters he wrote about.

As discussed earlier, Orwell believed that middle- and upper-class people in Britain were largely ignorant about the character and circumstances of those living in poverty, and that what such people imagined poverty to be like was often inaccurate. Concerning his claim that the rich and poor do not have different natures or different moral character, Orwell writes that “Everyone who has mixed on equal terms with the poor knows this quite well. But the trouble is that intelligent, cultivated people, the very people who might be expected to have liberal opinions, never do mix with the poor” (Down and Out, 120).

Orwell made similar points about many other people and circumstances. He argued that the job of low-level kitchen staff in French restaurants that appeared easy from the outside was actually “astonishingly hard” (Down and Out, 62), that actually watching coal miners work could cause an English observer to doubt their own status as “a superior person” (Wigan Pier, 35), and that working in a bookshop was a good way to disabuse oneself of the fantasy that working in a bookshop was a paradise (see “Bookshop Memories”).

There is an important metacommentary that is hard to overlook concerning Orwell’s realization that experience is often necessary to correct ignorance. During his lifetime, Orwell amassed an eclectic set of experiences that helped him to better understand the perspective of those in a variety of professions and social classes. This allowed him to empathize with the plight of a wide variety of white men. However, try as he might, Orwell could never experience what it was like to be a woman, a person of color, or a queer-identified person in any of these circumstances. Feminist critics have rightly called attention to the misogyny and racism that are common in Orwell’s work (see, for example, Beddoe 1984, Campbell 1984, and Patai 1984). Orwell’s writings were also often homophobic (see, for example, Keep the Aspidistra Flying, chapter 1; Taylor 2003, 245). In addition, critics have pointed to antisemitism and anti-Catholicism in his writing (see, for example, Brennan 2017). Thus, Orwell’s insights about the epistemic power of experience also help explain significant flaws in his corpus, due to the limits of his own experience and imagination, or perhaps more simply due to his own prejudices.

c. Embodied Cognition

Orwell’s writing is highly consonant with philosophical work emphasizing that human cognition is embodied. For Orwell, unlike Descartes, we are not first and foremost a thinking thing. Rather, Orwell writes that “A human being is primarily a bag for putting food into; the other functions and faculties may be more godlike, but in point of time they come afterwards” (Wigan Pier, 91-92).

The influence of external circumstances and physical conditions on human cognition plays a significant role in all of Orwell’s nonfiction books as well as in Animal Farm and Nineteen Eighty-Four. In Homage to Catalonia, Orwell relays how, due to insufficient sleep as a soldier in the Spanish Republican Army, “One grew very stupid” (43). In Down and Out, Orwell emphasized how the physical conditions of a poor diet make it so that you can “interest yourself in nothing” so that you become “only a belly with a few accessory organs” (18-19). And in Wigan Pier, Orwell argues that even the “best intellect” cannot stand up against the “debilitating effect of unemployment” (81). This, he suggests, is why it is hard for the unemployed to do things like write books. They have the time, but according to Orwell, writing books requires peace of mind in addition to time. And Orwell believed that the living conditions for most unemployed people in early twentieth century England did not afford such peace of mind.

Orwell’s emphasis on embodied cognition is another way in which he recognizes the tight connection between the political and the epistemic. In Animal Farm, for example, the animals are initially pushed toward their rebellion against the farmer after they are left unfed, and their hunger drives them to action. And Napoleon, the aptly named pig who eventually gains dictatorial control over the farm, keeps the other animals overworked and underfed as a way of making them more pliable and controllable. Similarly, in Nineteen Eighty-Four, while food is rationed, gin is in abundance for party members. And the physical conditions of deprivation and torture are used to break the protagonist Winston Smith’s will to the point that his thoughts become completely malleable. Epistemic control over citizens’ minds gives the Party power over the physical conditions of the citizenry, with control over the physical conditions of the citizenry in turn helping cement the Party’s epistemic control over citizens.

d. Memory and History

Orwell treats memory as a deeply flawed yet invaluable faculty, because it is often the best or only way to obtain many truths about the past. The following passage is paradigmatic of his position: “Treacherous though memory is, it seems to me the chief means we have of discovering how a child’s mind works. Only by resurrecting our own memories can we realize how incredibly distorted is the child’s vision of the world” (“Such, Such Were the Joys”).

In his essay “My Country Right or Left,” Orwell expresses wariness about the unreliability of memories, yet he also seems optimistic about our ability to separate genuine memories from false interpolations with concentration and reflection. Orwell argued that over time British recollection of World War I had become distorted by nostalgia and post hoc narratives. He encouraged his readers to “Disentangle your real memories from later accretions,” which suggests he thinks such disentangling is at least possible. This is reinforced by his later claim that he was able to “honestly sort out my memories and disregard what I have learned since” about World War I (“My Country Right or Left”).

As these passages foreshadow, Orwell sees both the power and limitation of memory as politically significant. Accurate memories can refute falsehoods and lies, including falsehoods and lies about history. But memories are also susceptible to corruption, and cognitive biases may allow our memories to be corrupted in predictable and useful ways by those with political power. Orwell worried that totalitarian governments were pushing a thoroughgoing skepticism about the ability to write “true history.” At the same time, Orwell also noted that these same totalitarian governments used propaganda to try to promote their own accounts of history—accounts which often were wildly discordant with the facts (see, for example, “Looking Back at the Spanish War,” Section IV).

The complex relationship between truth, memory, and history in a totalitarian regime is a central theme in Nineteen Eighty-Four. One of the protagonist’s primary ways of resisting the patent lies told by the Party was clinging to memories that contradicted the Party’s false claims about the past. The primary antagonist, O’Brien, sought to eliminate Winston’s trust in his own memories by convincing him to give up on the notion of objective truth completely. Like many of the key themes in Nineteen Eighty-Four, Orwell discussed the relationship between truth, memory, and history under totalitarianism elsewhere. Notable examples include his essays “Looking Back on the Spanish War,” “Notes on Nationalism,” and “The Prevention of Literature.”

4. Philosophy of Language

Orwell had wide-ranging interests in language. These interests spanned the simple “joy of mere words” to the political desire to use language “to push the world in a certain direction” (“Why I Write”). Orwell studied how language could both obscure and clarify, and he sought to identify the political significance language had as a result.

a. Language and Thought

For Orwell, language and thought significantly influence one another: our thoughts are shaped by our language, and our language is in turn shaped by our thoughts.

“Politics and the English Language” contains Orwell’s most explicit writing about this relationship. In the essay, Orwell focuses primarily on language’s detrimental effects on thought and vice versa, writing, for example, that the English language “becomes ugly and inaccurate because our thoughts are foolish, but the slovenliness of our language makes it easier for us to have foolish thoughts” and that “If thought corrupts language, language can also corrupt thought.” But despite this focus on detrimental effects, Orwell’s purpose in “Politics and the English Language” is ultimately positive. His “point is that the process [of corruption] is reversible.” Orwell believed the bad habits of thought and writing he observed could “be avoided if one is willing to take the necessary trouble.” Thus, the essay functions, in part, as a guide for doing just that.

This relationship between thought and language is part of a larger three-part relationship Orwell identified between language, thought, and politics (thus why the article is entitled “Politics and the English Language”). Just as thought and language mutually influence one another, so too do thought and politics. Thus, through the medium of thought, politics and language influence one another too. Orwell argues that if one writes well, “One can think more clearly,” and in turn that “To think clearly is a necessary first step toward political regeneration.” This makes good writing a political task. Therefore, Orwell concludes that for those in English-speaking political communities, “The fight against bad English is not frivolous and is not the exclusive concern of professional writers.” An analogous principle holds for those living in political communities that use other languages. For example, based on his theory about the bi-directional influence that language, thought, and politics have upon one another, Orwell wrote that he expected “that the German, Russian and Italian languages have all deteriorated in the last ten or fifteen years, as a result of dictatorship.” (“Politics and the English Language” was published shortly after the end of World War II.)

Orwell’s desire to avoid bad writing is not the desire to defend “standard English” or rigid rules of grammar. Rather, Orwell’s chief goal is for language users to aspire “to let the meaning choose the word, and not the other way about.” Communicating clearly and precisely requires conscious thought and intention. Writing in a way that preserves one’s meaning takes work. Simply selecting the words, metaphors, and phrases that come most easily to mind can obscure our meaning from others and from ourselves. Orwell describes a speaker who is taken over so completely by stock phrases, stale metaphors, and an orthodox party line as someone who:

Has gone some distance toward turning himself into a machine. The appropriate noises are coming out of his larynx, but his brain is not involved, as it would be if he were choosing his words for himself. If the speech he is making is one that he is accustomed to make over and over again, he may be almost unconscious of what he is saying.

Orwell explores this idea in Nineteen Eighty-Four with the concept of “duckspeak”: speech that consists of repeating orthodox platitudes without thought, as if the speaker were merely quacking like a duck.

b. Propaganda

Like many terms that were important to him, Orwell never defines what he means by “propaganda,” and it is not clear that he always used the term consistently. Still, Orwell was an insightful commentator on how propaganda functioned and why understanding it mattered.

Orwell often used the term “propaganda” pejoratively. But this does not mean that Orwell thought propaganda was always negative. Orwell wrote that “All art is propaganda,” while denying that all propaganda was art (“Charles Dickens”). He held that the primary aim of propaganda is “to influence contemporary opinion” (“Notes on Nationalism”). Thus, Orwell’s sparsest conception of propaganda seems to be messaging aimed at influencing opinion. Such messages need not be communicated only with words. For example, Orwell frequently pointed out the propagandistic properties of posters, which likely inspired his prose about the posters of Big Brother in Nineteen Eighty-Four. This sparse conception of propaganda does not include conditions that other accounts may include, such as that the messaging must be in some sense misleading or that the attempt to influence must be in some sense manipulative (compare with Stanley 2016).

Orwell found much of the propaganda of his age troubling because of the deleterious effects he believed propaganda was having on individuals and society. Propaganda functions to control narratives and, more broadly, thought. Orwell observed that sometimes this was done by manipulating the effect language was apt to have on audiences.

He noted that dictators like Hitler and Stalin committed callous murders, but never referred to these killings as such, preferring instead to use terms like “liquidation,” “elimination,” or “some other soothing phrase” (“Inside the Whale”). But at other times, he noted that propaganda consisted of outright lies. In lines reminiscent of the world he would create in Nineteen Eighty-Four, Orwell described the situation he observed as follows: “Much of the propagandist writing of our time amounts to plain forgery. Material facts are suppressed, dates altered, quotations removed from their context and doctored so as to change their meaning” (“Notes on Nationalism”). Orwell also noted that poorly done propaganda could not only fail but could also backfire and repel the intended audience. He was often particularly hard on his allies on the political left for propaganda that he thought most working-class people found off-putting.

5. Philosophy of Art and Literature

Orwell viewed aesthetic value as distinct from other forms of value, such as moral and economic. He most often discussed aesthetic value while discussing literature, which he considered a category of art. Importantly, Orwell did not think that the only way to assess literature was on its aesthetic merits. He thought literature (along with other kinds of art and writing) could be assessed morally and politically as well. This is perhaps unsurprising given his desire “to make political writing into an art” (“Why I Write”).

a. Value of Art and Literature

That Orwell views aesthetic value as distinct from moral value is clear. Orwell wrote in an essay on Salvador Dali that one “ought to be able to hold in one’s head simultaneously the two facts that Dali is a good draughtsman and a disgusting human being” (“Benefit of Clergy”). What is less clear is what Orwell considers the grounds for aesthetic value. Orwell appears to have been of two minds about this. At times, Orwell seemed to view aesthetic values as objective but ineffable. At other times, he seemed to view aesthetic value as grounded subjectively on the taste of individuals.

For example, Orwell writes that his own age was one “in which the average human being in the highly civilized countries is aesthetically inferior to the lowest savage” (“Poetry and the Microphone”). This suggests some culturally neutral perspective from which aesthetic refinement can be assessed. In fact, Orwell seems to think that one’s cultural milieu can enhance or corrupt one’s aesthetic sensitivity, writing that the “ugliness” of his society had “spiritual and economic causes,” and that “Aesthetic judgements, especially literary judgements, are often corrupted in the same way as political ones” (“Poetry and the Microphone”; “Notes on Nationalism”). Orwell even held that some people “have no aesthetic feelings whatever,” a condition to which he thought the English were particularly susceptible (“The Lion and the Unicorn”). On the other hand, Orwell also wrote that “Ultimately there is no test of literary merit except survival, which is itself an index to majority opinion” (“Lear, Tolstoy, and the Fool”). This suggests that perhaps aesthetic value bottoms out in intersubjectivity.

There are ways of softening this tension, however, by noting the different ways in which Orwell thinks literary merit can be assessed. For example, Orwell writes the following:

Supposing that there is such a thing as good or bad art, then the goodness or badness must reside in the work of art itself—not independently of the observer, indeed, but independently of the mood of the observer. In one sense, therefore, it cannot be true that a poem is good on Monday and bad on Tuesday. But if one judges the poem by the appreciation it arouses, then it can certainly be true, because appreciation, or enjoyment, is a subjective condition which cannot be commanded (“Politics vs. Literature”).

This suggests literary merit can be assessed either in terms of artistic merit or in terms of subjective appreciation and that these two forms of assessment need not generate matching results.

This solution, however, still leaves the question of what justifies artistic merit unanswered. Perhaps the best answer available comes in Orwell’s essay on Charles Dickens. There, Orwell concluded that “As a rule, an aesthetic preference is either something inexplicable or it is so corrupted by non-aesthetic motives as to make one wonder whether the whole of literary criticism is not a huge network of humbug.” Here, Orwell posits two potential sources of aesthetic preference: one of which is humbug and one of which is inexplicable. This suggests that Orwell may favor a view of aesthetic value that is ultimately ineffable. But even if the grounding of aesthetic merit is inexplicable, Orwell seems to think we can still judge art on aesthetic, as well as moral and political, grounds.

b. Literature and Politics

Orwell believed that there was “no such thing as genuinely non-political literature” (“The Prevention of Literature”). This is because Orwell thought that all literature sent a political message, even if the message was as simple as reinforcing the status quo. This is part of what Orwell means when he says that all art is propaganda. For Orwell, all literature—like all art—seeks to influence contemporary opinion. For this reason, all literature is political.

Because all literature is political, Orwell thought that a work of literature’s political perspective often influenced the level of merit a reader assigned to it. More specifically, people tend to think well of literature that agrees with their political outlook and think poorly of literature that disagrees with it. Orwell defended this position by pointing out “the extreme difficulty of seeing any literary merit in a book that seriously damages your deepest beliefs” (“Inside the Whale”).

But just as literature could influence politics through its message, so too politics could and did influence literature. Orwell argued that all fiction is “censored in the interests of the ruling class” (“Boys’ Weeklies”). For Orwell, this was troubling under any circumstances, but especially so when the state exhibited totalitarian tendencies. Orwell thought that the writing of literature became impossible in a genuinely totalitarian state. This was because in a totalitarian regime there is no intellectual freedom and there is no stable set of shared facts. As a result, Orwell held that “The destruction of intellectual liberty cripples the journalist, the sociological writer, the historian, the novelist, the critic, and the poet, in that order” (“The Prevention of Literature”).

Thus, Orwell’s views on the mutual connections between politics, thought, and language extend to art—especially written art. These things affect literature so thoroughly that certain political orders make writing literature impossible. But literature, in turn, has the power to affect these core aspects of human life.

6. Orwell’s Relationship to Academic Philosophy

Orwell’s relationship to academic philosophy has never been a simple matter. Orwell admired Bertrand Russell, yet he wrote in response to a difficulty he encountered reading one of Russell’s books that it was “the sort of thing that makes me feel that philosophy should be forbidden by law” (Barry 2021). Orwell considered A. J. Ayer a “great friend,” yet Ayer said that Orwell “wasn’t interested in academic philosophy in the very least” and believed that Orwell thought academic philosophy was “rather a waste of time” (Barry 2022; Wadhams 2017, 205). And Orwell referred to Jean-Paul Sartre as “a bag of wind” to whom he was going to give “a good [metaphorical] boot” (Tyrrell 1996).

Some have concluded that Orwell was uninterested in or incapable of doing rigorous philosophical work. Bernard Crick, one of Orwell’s biographers who was himself a philosopher and political theorist, concluded that Orwell would “have been incapable of writing a contemporary philosophical monograph, scarcely of understanding one,” observing that “Orwell chose to write in the form of a novel, not in the form of a philosophical tractatus” (Crick 1980, xxvii). This is probably all true. But this does not mean that Orwell’s work was not influenced by academic philosophy. It was. This also does not mean that Orwell’s work is not valuable for academic philosophers. It is.

Aside from critical comments about Marx, Orwell tended not to reference philosophers by name in his work (compare with Tyrrell 1996). As such, it can be hard to determine the extent to which he was familiar with or was influenced by such thinkers. Crick concludes that Orwell was “innocent of reading either J.S. Mill or Karl Popper,” yet seemed independently to reach some similar conclusions (Crick 1980, 351). But while there is little evidence of Orwell’s knowledge of the history of philosophy, there is plenty of evidence of his familiarity with at least some philosophical work written during his own lifetime. Orwell reviewed books by both Sartre and Russell (Tyrrell 1996, Barry 2021), and Orwell’s library at the time of his death included several of Russell’s books (Barry 2021). By examining Orwell’s knowledge of, interactions with, and writing about Russell, Peter Brian Barry has offered compelling arguments that Russell influenced Orwell’s views on moral psychology, metaethics, and metaphysics (Barry 2021; Barry 2022). And as others have noted, there is a clear sense in which Orwell’s writing deals with philosophical themes and seeks to work through philosophical ideas (Tyrrell 1996; Dwan 2010, 2018; Quintana 2018, 2020; Satta 2021a, 2021c).

These claims can be made consistent by distinguishing being an academic philosopher and being a philosophical thinker in some other sense. Barry puts the point well, noting that Orwell’s lack of interest in “academic philosophy” is “consistent with Orwell being greatly interested in normative public philosophy, including social and political philosophy.” David Dwan makes a similar point, preferring to call Orwell a “political thinker” rather than a “political philosopher” and arguing that we “can map the challenges he [Orwell] presents for political philosophy without ascribing to him a rigour to which he never aspired” (Dwan 2018, 4).

Philosophers working in political philosophy, philosophy of language, epistemology, ethics, and metaphysics, among other fields, have used and discussed Orwell’s writing. Richard Rorty, for example, devoted a chapter to Orwell in his 1989 book Contingency, Irony, and Solidarity, where he claimed that Orwell’s “description of our political situation—of the dangers and options at hand—remains as useful as any we possess” (Rorty 1989, 170). For Rorty, part of Orwell’s value was that he “sensitized his [readers] to a set of excuses for cruelty,” which helped reshape our political understanding (Rorty 1989, 171). Rorty also saw Orwell’s work as helping show readers that totalitarian figures like 1984’s O’Brien were possible (Rorty 1989, 175-176).

But perhaps the chief value Rorty saw in Orwell’s work was the way in which it showed the deep human value in having the ability to say what you believe and the “ability to talk to other people about what seems true to you” (Rorty 1989, 176). That is to say, Rorty recognized the value that Orwell placed on intellectual freedom. That said, Rorty here seeks to remake Orwell in his own image by suggesting that Orwell cares merely about intellectual freedom and not about truth. Rorty argues that, for Orwell, “It does not matter whether ‘two plus two is four’ is true” and that Orwell’s “question about ‘the possibility of truth’ is a red herring” (Rorty 1989, 176, 182). Rorty’s claims that Orwell was not interested in truth have not been widely adopted. In fact, his position has prompted philosophical defense of the much more plausible view that Orwell cared about truth and considered truth to be, in some sense, real and objective (see, for example, van Inwagen 2008; Dwan 2018, 160-163; compare Conant 2020).

In philosophy of language, Derek Ball has identified Orwell as someone who recognized that “A particular metasemantic fact might have certain social and political consequences” (Ball 2021, 45). Ball also notes that on one plausible reading, Orwell seems to accept both linguistic determinism—“the claim that one’s language influences or determines what one believes, in such a way that speakers of different languages will tend to possess different (and potentially incompatible) beliefs precisely because they speak different languages”—and linguistic relativism—“the claim that one’s language influences or determines what concepts one possesses, and hence what thoughts one is capable of entertaining, in such a way that speakers of different languages frequently possess entirely different conceptual repertoires precisely because they speak different languages” (Ball 2021, 47).

Ball’s points are useful ways to frame some of Orwell’s key philosophical commitments about the interrelationship between language, thought, and politics. Ball’s observations accord with Judith Shklar’s claim that the plot of 1984 “is not really just about totalitarianism but rather about the practical implications of the notion that language structures all our knowledge of the phenomenal world” (Shklar 1984). Similarly, in his work on manipulative speech, Justin D’Ambrosio has noted the significance of Orwell’s writing for politically relevant philosophy of language (D’Ambrosio unpublished manuscript). These kinds of observations about Orwell’s views may become increasingly significant in academic philosophy, given the current development of political philosophy of language as an area of study (see, for example, Khoo and Sterken 2021).

Philosophers have also noted the value of Orwell’s work for epistemology. Martin Tyrrell argues that much of Orwell’s “later and better writing amounts to an attempt at working out the political consequences of what are essentially philosophical questions,” citing specifically epistemological questions like “When and what should we doubt?” and “When and what should we believe?” (Tyrrell 1996). Simon Blackburn has noted the significance of Orwell’s worries about truth for political epistemology, concluding that “The answer to Orwell’s worry [about the possibility of truth] is not to give up inquiry, but to conduct it with even more care, diligence, and imagination” (Blackburn 2021, 70). Mark Satta has documented Orwell’s recognition of the epistemic point that our physical circumstances as embodied beings influence our thoughts and beliefs (Satta 2021a).

As noted earlier, Orwell treats moral value as a domain distinct from other types of value, such as the aesthetic. Academic philosophers have studied and productively used Orwell’s views in the field of ethics. Barry argues that Orwell’s moral views are a form of threshold deontology, on which certain moral norms (such as telling the truth) must be followed, except on occasions where not following such norms is necessary to prevent horrendous results. Barry also argues that Orwell’s moral norms come from Orwell’s humanist account of moral goodness, which grounds moral goodness in what is good for human beings. This account of Orwell’s ethical commitments accords with Dwan’s view that, while Orwell engaged in broad criticism of moral consequentialism, there were limits to Orwell’s rejection of consequentialism, such as Orwell’s acceptance that some killing is necessary in war (Dwan 2018, 17-19).

Philosophers have also employed Orwell’s writing at the intersection of ethics and political philosophy. For example, Martha Nussbaum identifies the ethical and political importance given to emotions in 1984. She examines how Winston Smith looks back longingly at a world which contained free expression of emotions like love, compassion, pity, and fellow feeling, while O’Brien seeks to establish a world in which the dominant (perhaps only) emotions are fear, rage, triumph, and self-abasement (Nussbaum 2005). Oriol Quintana has identified the importance of human recognition in Orwell’s corpus and has used this in an account of the ethics of solidarity (Quintana 2018). Quintana has also argued that there are parallels between the work of George Orwell and the French philosopher Simone Weil, especially the importance they both attached to “rootedness”—that is, “a feeling of belonging in the world,” in contrast to asceticism or detachment (Quintana 2020, 105). Felicia Nimue Ackerman has emphasized the ways in which 1984 is a novel about a love affair, which addresses questions about the nature of human agency and human relationships under extreme political circumstances (Ackerman 2019). David Dwan examines Orwell’s understanding of and frequent appeals to several important moral and political terms including “equality,” “liberty,” and “justice” (Dwan 2012, 2018). Dwan holds that Orwell is “a great political educator, but less for the solutions he proffered than for the problems he embodied and the questions he allows us to ask” (Dwan 2018, 2).

Thus, although he was never a professional philosopher or member of the academy, Orwell has much to offer those interested in philosophy. An increasing number of philosophers seem to have recognized this in recent years. Although limited by his time and his prejudices, Orwell was an insightful critic of totalitarianism and many other ways in which political power can be abused. Part of his insight was the interrelationship between our political lives and other aspects of our individual and collective experiences, such as what we believe, how we communicate, and what we value. Both Orwell’s fiction and his essays provide much that is worthy of reflection for those interested in such aspects of human experience and political life.

7. References and Further Reading

a. Primary Sources

  • Down and Out in Paris and London. New York: Harcourt Publishing Company, 1933/1961.
  • Burmese Days. Boston: Mariner Books, 1934/1974.
  • “Shooting an Elephant.” New Writing, 1936. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/shooting-an-elephant/.
  • Keep the Aspidistra Flying. New York: Harcourt Publishing Company, 1936/1956.
  • The Road to Wigan Pier. New York: Harcourt Publishing Company, 1937/1958.
  • Homage to Catalonia. Boston: Mariner Books, 1938/1952.
  • “My Country Right or Left.” Folios of New Writing, 1940. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/my-country-right-or-left/.
  • “Inside the Whale.” Published in Inside the Whale and Other Essays. London: Victor Gollancz Ltd., 1940. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/inside-the-whale/.
  • “Boys’ Weeklies.” Published in Inside the Whale and Other Essays. London: Victor Gollancz Ltd., 1940. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/boys-weeklies/.
  • “Charles Dickens.” Published in Inside the Whale and Other Essays. London: Victor Gollancz Ltd., 1940. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/charles-dickens/.
  • “Rudyard Kipling.” Horizon, 1941. https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/rudyard-kipling/.


Author Information

Mark Satta
Email: mark.satta@wayne.edu
Wayne State University
U. S. A.

Epistemic Value

Epistemic value is a kind of value which attaches to cognitive successes such as true beliefs, justified beliefs, knowledge, and understanding. These kinds of cognitive success do often have practical value: true beliefs about local geography help us get to work on time; knowledge of mechanics allows us to build vehicles; understanding of general annual weather patterns helps us to plant our fields at the right time of year to ensure a good harvest. By contrast, false beliefs can and do lead us astray both in trivial and in colossally important ways.

It is fairly uncontroversial that we tend to care about having various cognitive or epistemic goods, at least for their practical value, and perhaps also for their own sakes as cognitive successes. But this uncontroversial point raises a number of important questions. For example, it is natural to wonder whether there really are all these different kinds of things (true beliefs, knowledge, and so on) which have distinct value from an epistemic point of view, or whether the value of some of them is reducible to, or depends on, the value of others.

It is also natural to think that knowledge is more valuable than mere true belief, but it has proven difficult to explain where the extra value of knowledge comes from. Similarly, it is natural to think that understanding is more valuable than any other epistemic state which falls short of understanding, such as true belief or knowledge. But there is disagreement about what makes understanding the highest epistemic value, or what makes it distinctly valuable, or even whether it is distinctly valuable.

Indeed, it is no easy task saying just what makes something an epistemic value in the first place. Perhaps epistemic values just exist on their own, independent of other kinds of value? Or perhaps cognitive goods are valuable because we care about having them for their own sakes? Or perhaps they are valuable because they help us to achieve other things which we care about for their own sakes?

Furthermore, if we accept that there are things that are epistemically valuable, then we might be tempted to accept a kind of instrumental (or consequentialist, or teleological) conception of epistemic rationality or justification, according to which a belief is epistemically rational just in case it appropriately promotes the achievement of an epistemic goal, or it complies with rules which tend to produce overall epistemically valuable belief-systems. If this idea is correct, then we need to know which epistemic values to include in the formulation of the epistemic goal, where the epistemic goal is an epistemically valuable goal in light of which we evaluate beliefs as epistemically rational or irrational.

Table of Contents

  1. Claims about Value
    1. Instrumental and Final Value
    2. Subjective and Objective Value
    3. Pro Tanto and All-Things-Considered Value
  2. The Value Problem
    1. The Primary Value Problem
      1. Knowledge as Mere True Belief
      2. Stability
      3. Virtues
      4. Reliabilism
      5. Contingent Features of Knowledge
      6. Derivative Non-Instrumental Value
    2. The Secondary Value Problem
      1. No Extra Value
      2. Virtues
      3. Knowledge and Factive Mental States
      4. Internalism and the Basing Relation
    3. The Tertiary Value Problem
  3. Truth and other Epistemic Values
    1. Truth Gets Us What We Want
    2. What People Ought to Care About
    3. Proper Functions
    4. Assuming an Epistemic Value/Critical Domains of Evaluation
    5. Anti-Realism
    6. Why the Focus on Truth?
  4. Understanding
    1. Understanding: Propositions and Domains; Subjective and Objective
    2. The Location of the Special Value of Understanding
    3. The Value of Understanding
    4. Alternatives to the Natural Picture of the Value of Understanding
  5. Instrumentalism and Epistemic Goals
    1. The Epistemic Goal as a Subset of the Epistemic Values
    2. Common Formulations of the Epistemic Goal
    3. Differences between the Goals: Interest and Importance
    4. Differences between the Goals: Synchronic and Diachronic Formulations
  6. Conclusion
  7. References and Further Reading

1. Claims about Value

Philosophers working on questions of value typically draw a number of distinctions which are good to keep in mind when we’re thinking about particular kinds of value claims. We’ll look at three particularly useful distinctions here before getting into the debates about epistemic value.

a. Instrumental and Final Value

The first important distinction to keep in mind is between instrumental and final value. An object (or state, property, and so forth) is instrumentally valuable if and only if it brings about something else that is valuable. An object is finally valuable if and only if it’s valuable for its own sake.

For example, it’s valuable to have a hidden pile of cash in your mattress: when you have a pile of cash readily accessible, you have the means to acquire things which are valuable, such as clothing, food, and so on. And, depending on the kind of person you are, it might give you peace of mind to sleep on a pile of cash. But piles of cash are not valuable for their own sake—money is obviously only good for what it can get you. So money is only instrumentally valuable.

By contrast, being healthy is something we typically think of as finally valuable. Although being healthy is instrumentally good because it enables us to do other valuable things, we also care about being healthy just because it’s good to be healthy, whether or not our state of health allows us to achieve other goods.

The existence of instrumental value depends on and derives from the existence of final value. But it’s possible for final value to exist without any instrumental value. There are possible worlds where there simply are no causal relations at all, for example. In some worlds like that, there could exist some final value (for instance, there could be sentient beings who feel great pleasure), but nothing would ever count as a means for bringing about anything else, and there would be no instrumental value. In the actual world, though, it’s pretty clear that there is both instrumental and final value.

b. Subjective and Objective Value

The second distinction is between subjective and objective value. Subjective value is a matter of the satisfaction of people’s desires (or the fulfillment of their plans, intentions, and so forth). Objective value is a kind of value which doesn’t depend on what people desire, care about, plan to do, and so forth. (To say that an object or event O is subjectively valuable for a subject S is not to say anything about why S thinks that O is valuable; O can be subjectively valuable in virtue of S’s desiring to bring O about, even if the reason S desires to bring O about is precisely because S thinks that O is objectively valuable. In a case like that, if O really is objectively valuable, then it is both objectively and subjectively valuable; but if S is mistaken, and O is not objectively valuable, then O is only subjectively valuable.)

Some philosophers think that there is really only subjective value (and correspondingly, subjective reasons, obligations, and so on); others think that there is only objective value, and that there is value in achieving one’s actual desires only when the desires are themselves objectively good. Still other philosophers allow both kinds of value. Many of the views which we’ll see below can be articulated in terms of either subjective or objective value, and when a view is committed to allowing only one type of value, the context will generally make it clear whether it’s subjective or objective. So, to keep things simple, claims about value will not be qualified as subjective or objective in what follows.

c. Pro Tanto and All-Things-Considered Value

Suppose that God declares that it is maximally valuable, always and everywhere, to feed the hungry. Assuming that God is omniscient and doesn’t lie, it follows that it really is maximally valuable, always and everywhere, to feed the hungry. So there’s nothing that could ever outweigh the value of feeding the hungry. This would be an indefeasible kind of value: a kind of value that cannot be defeated by any contrary values or considerations.

Most value, however, is defeasible: it can be defeated, either by being overridden by contrary value-considerations, or else by being undermined. For an example of undermining: it’s instrumentally valuable to have a policy of getting an annual physical exam done, because that’s the kind of thing that normally helps catch medical issues before they become serious. But suppose that Sylvia becomes invulnerable to medically diagnosable illnesses. In this case, nothing medically valuable comes about as a result of Sylvia’s policy of getting her physical done. The instrumental medical value which that policy would have enjoyed is undermined by the fact that annual physicals no longer contribute to keeping Sylvia in good health.

By contrast, imagine that Roger goes to the emergency room for a dislocated shoulder. The doctors fix his shoulder, but while sitting in the waiting room, Roger inhales droplets from another patient’s sneeze, and he contracts meningitis as a result, which ends up causing him brain damage. In this case, there is some medical value which resulted from Roger’s visit to the emergency room: his shoulder was fixed. But because brain damage is more disvaluable than a fixed shoulder is valuable, the value of having a fixed shoulder is outweighed, or overridden, by the disvalue of having brain damage. So all things considered, Roger’s visit to the emergency room is disvaluable. But at least there is still something positive to be said for it.

In cases where some value V1 of an object O (or action, event, and so forth) is overridden by some contrary value V2, but where V1 still at least counts in favour of O’s being valuable, we can say that V1 is a pro tanto kind of value (that is, value “so far as it goes” or “to that extent”). So the value of Roger’s fixed shoulder is pro tanto: it counts in favour of the value of his visit to the emergency room, even though it is outweighed by the disvalue of his resulting brain damage. The disvalue of getting brain damage is also pro tanto: there can be contrary values which would outweigh it, though in Roger’s case, the disvalue of the brain damage is the stronger of the competing value-considerations. So we can say that, all things considered, Roger’s visit to the emergency room is disvaluable.

2. The Value Problem

a. The Primary Value Problem

Knowledge and true belief both tend to be things we want to have, but all else being equal, we tend to prefer to have knowledge over mere true belief. The “Primary Value Problem” is the problem of explaining why that should be the case. Many epistemologists think that we should take it as a criterion of adequacy for theories of knowledge that they be able to explain the fact that we prefer knowledge to mere true belief, or at least that they be consistent with a good explanation of why that should be the case.

To illustrate: suppose that Steve believes that the Yankees are a good baseball team, because he thinks that their pinstriped uniforms are so sharp-looking. Steve’s belief is true – the Yankees always field a good team – but he holds his belief for such a terrible reason that we are very reluctant to think of it as an item of knowledge.

Cases like that one motivate the view that knowledge consists of more than just true belief. In order to count as knowledge, a belief has to be well justified in some suitable sense, and it should also meet a suitable Gettier-avoidance condition (see the article on Gettier Problems). But not only do beliefs like Steve’s motivate the view that knowledge consists of more than mere true belief: they also motivate the view that knowledge is better to have than true belief. For suppose that Yolanda knows the Yankees’ stats, and on that basis she believes that the Yankees are a good team. It seems that Yolanda’s belief counts as an item of knowledge. And if we compare Steve and Yolanda, it seems that Yolanda is doing better than Steve; we’d prefer to be in Yolanda’s epistemic position rather than in Steve’s. This seems to indicate that we prefer knowledge over mere true belief.

The challenge of the Primary Value Problem is to explain why that should be the case. Why should we care about whether we have knowledge instead of mere true belief? After all, as is often pointed out, true beliefs seem to bring us the very same practical benefits as knowledge. (Steve would do just as well as Yolanda betting on the Yankees, for example.) Socrates makes this point in the Meno, arguing that if someone wants to get to Larisa, and he has a true belief but not knowledge about which road to take, then he will get to Larisa just as surely as if he had knowledge of which road to take. In response to Socrates’s argument, Meno is moved to wonder why anyone should care about having knowledge instead of mere true belief. (Hence, the Primary Value Problem is sometimes called the Meno Problem.)

So in short, the problem is that mere true beliefs seem to be just as likely as knowledge to guide us well in our actions. But we still seem to have the persistent intuition that there is something better about any given item of knowledge than the corresponding item of mere true belief. The challenge is to explain this intuition. Strategies for addressing this problem can either try to show that knowledge really is always more valuable than corresponding items of mere true belief, or else they can allow that knowledge is sometimes (or even always) no more valuable than mere true belief. If we adopt the latter kind of response to the problem, it is incumbent on us to explain why we should have the intuition that knowledge is more valuable than mere true belief, in cases where it turns out that knowledge isn’t in fact more valuable. Following Pritchard (2008; 2009), we can call strategies of the first kind vindicating, and we can call strategies of the second kind revisionary.

There isn’t a received view among epistemologists about how we ought to respond to the Primary Value Problem. What follows is an explanation of some important proposals from the literature, and a discussion of their challenges and prospects.

i. Knowledge as Mere True Belief

A very straightforward way to respond to the problem is to deny one of the intuitions on which the problem depends: the intuition that knowledge is distinct from true belief. Meno toys with this idea in the Meno, though Socrates disabuses him of the idea. (Somewhat more recently, Sartwell (1991; 1992) has defended this approach to knowledge.) If knowledge is identical with true belief, then we can simply reject the value problem as resting on a mistaken view of knowledge. If knowledge is true belief, then there’s no discrepancy in value to explain.

The view that knowledge is just true belief is almost universally rejected, however. Cases where subjects have true beliefs but lack knowledge are so easy to construct and so intuitively obvious that identifying knowledge with true belief represents an extreme departure from how most epistemologists and laypeople think about knowledge. Consider once again Steve’s belief that the Yankees are a good baseball team, which he holds because he thinks their pinstriped uniforms are so sharp. It seems like an abuse of language to call Steve’s belief an item of knowledge. At the very least, we should be hesitant to accept such an extreme view until we’ve exhausted all other theoretical options.

It could still be the case that knowledge is no more valuable than mere true belief, even though knowledge is not identical with true belief. But, as we’ve seen, there is a widespread and resilient intuition that knowledge is more valuable than mere true belief (recall, for instance, that we tend to think that Yolanda’s epistemic state is better than Steve’s). If knowledge were identical with true belief, then we would have to take that intuition to be mistaken; but, since we can see that knowledge is more than mere true belief, we can continue looking for an acceptable account which would explain why knowledge is more valuable than mere true belief.

ii. Stability

Most attempts to explain why knowledge is more valuable than mere true belief proceed by identifying some condition which must be added to true belief in order to yield knowledge, and then explaining why that further condition is valuable. Socrates’s own view, at least as presented in the Meno, is that knowledge is true opinion plus an account of why the opinion is true (where the account of why it is true is itself already present in the soul, and it must only be recalled from memory). So, Socrates proposes, a known true belief will be more stable than a mere true belief, because having an account of why a belief is true helps to keep us from losing it. If you don’t have an account of why a proposition is true, you might easily forget it, or abandon your belief in it when you come across some reason for doubting it. But if you do have an account of why a proposition is true, you likely have a greater chance of remembering it, and if you come across some reason for doubting it, you’ll have a reason available to you for continuing to believe it.

A worry for this solution is that it seems to be entirely possible for a subject S to have some entirely unsupported beliefs, which do not count as knowledge, but where S clings to these beliefs dogmatically, even in the face of good counterevidence. S’s belief in a case like this can be just as stable as many items of knowledge – indeed, dogmatically held beliefs can even be more stable than knowledge. For if you know that p, then presumably your belief is a response to some sort of good reason for believing that p. But if your belief is a response to good reasons, then you’d likely be inclined to revise your belief that p, if you were to come across some good evidence for thinking that p is false, or for thinking that you didn’t have any good reason for believing that p in the first place. On the other hand, if p is something you cling to dogmatically (contrary evidence be damned), then you’ll likely retain p even when you get good reason for doubting it. So, even though having stable true beliefs is no doubt a good thing, knowledge isn’t always more stable than mere true belief, and an appeal to stability does not seem to give us an adequate explanation of the extra value of knowledge over mere true belief.

One way to defend the stability response to the value problem is to hold that knowledge is more stable than mere true beliefs, but only for people whose cognitive faculties are in good working order, and to deny that the cognitive faculties of people who cling dogmatically to evidentially unsupported beliefs are in good working order (Williamson 2000). This solution invites the objection, however, that our cognitive faculties are not all geared to the production of true beliefs. Some cognitive faculties are geared towards ensuring our survival, and the outputs of these latter faculties might be held very firmly even if they are not well supported by evidence. For example, there could be subjects with cognitive mechanisms which take as input sudden sounds and generate as output the belief that there’s a predator nearby. Mechanisms like these might very well generate a strong conviction that there’s a predator nearby. Such mechanisms would likely yield many more false positive predator-identifications than they would yield correct identifications, but their poor true-to-false output-ratio doesn’t prevent mechanisms of this kind from having a very high survival value, as long as they do correctly identify predators when they are present. So it’s not really clear that knowledge is more stable than mere true beliefs, even for mere true beliefs which have been produced by cognitive systems which are in good working order, because it’s possible for beliefs to be evidentially unsupported, and very stable, and produced by properly functioning cognitive faculties, all at the same time. (See Kvanvig 2003, ch. 1, for a critical discussion of Williamson’s appeal to stability.)

iii. Virtues

Virtue epistemologists are, roughly, those who think that knowledge is true belief which is the product of intellectual virtues. (See the article on Virtue Epistemology.) Virtue epistemology seems to provide a plausible solution to the Primary (and, as we’ll see, to the Secondary) Value Problem.

According to a prominent strand of virtue epistemology, knowledge is true belief for which we give the subject credit (Greco 2003), or true belief which is a cognitive success because of the subject’s exercise of her relevant cognitive ability (Greco 2008; Sosa 2007). For example (to adapt Sosa’s analogy): an archer, in firing at a target, might shoot well or poorly. If she shoots poorly but hits the target anyway (say, she takes aim very poorly but sneezes at the moment of firing, and luckily happens to hit the target), her shot doesn’t display skill, and her hitting the target doesn’t reflect well on her. If she shoots well, on the other hand, then she might hit the target or miss the target. If she shoots well and misses the target, we will still credit her with having made a good shot, because her shot manifests skill. If she shoots well and hits the target, then we will credit her success to her having made a good shot – unless there were intervening factors which made it the case that the shot hit the mark just as a matter of luck. For example: if a trickster moves the target while the arrow is in mid-flight, but a sudden gust of wind moves the arrow to the target’s new location, then in spite of the fact that the archer makes a good shot, and she hits the target, she doesn’t hit the target because she made a good shot. She was just lucky, even though she was skillful. But when strange factors don’t intervene, and the archer hits the target because she made a good shot, we give her credit for having hit the target, since we think that performances which succeed because they are competent are the best kind of performances. And, similarly, when it comes to belief-formation, we give people credit for getting things right as a result of the exercise of their intellectual virtues: we think it’s an achievement to get things right as the result of one’s cognitive competence, and so we tend to think that there’s a sense in which people who get things right because of their intellectual competence deserve credit for getting things right.

According to another strand of virtue epistemology (Zagzebski 2003), we should not think of knowledge as true belief which meets some further condition. Rather, we should think of knowledge as a state which a subject can be in, which involves having the propositional attitude of belief, but which also includes the motivations for which the subject has the belief. Virtuous motivations might include things like diligence, integrity, and a love of truth. And, just as we think that, in ethics, virtuous motives make actions better (saving a drowning child because you don’t want children to suffer and die is better than saving a drowning child because you don’t want to have to give testimony to the police, for example), we should also think that the state of believing because of a virtuous motive is better than believing for some other reason.

Some concerns have been raised for both strands of virtue epistemology, however. Briefly, a worry for the Sosa/Greco type of virtue epistemology is that (as we’ll see in section 3) knowledge might not after all in general be an achievement – it might be something we can come by in a relatively easy or even lazy fashion. A worry for Zagzebski’s type of virtue epistemology is that there seem to be possible cases where subjects can acquire knowledge even though they lack virtuous intellectual motives. Indeed, it seems possible to acquire knowledge even if one has only the darkest of motives: if a torturer is motivated by the desire to waterboard people until they go insane, for example, he can thereby gain knowledge of how long it takes to break a person by waterboarding.

iv. Reliabilism

The Primary Value Problem is sometimes thought to be especially bad for reliabilists about knowledge. Reliabilism in its simplest form is the view that beliefs are justified if and only if they’re produced by reliable processes, and they count as knowledge if and only if they’re true, and produced by reliable processes, and they’re not Gettiered. (See, for example, Goldman and Olsson (2009, p. 22), as well as the article on Reliabilism.) The apparent trouble for reliabilism is that reliability only seems to be valuable as a means to truth – so, in any given case where we have a true belief, it’s not clear that the reliability of the process which produced the belief is able to add anything to the value that the belief already has in virtue of being true. The value which true beliefs have in virtue of being true completely “swamps” the value of the reliability of their source, if reliability is only valuable as a means to truth. (Hence the Primary Value Problem for reliabilism has often been called the “swamping problem.”)

To illustrate (Zagzebski 2003): the value of a cup of coffee seems to be a matter of how good the coffee tastes. And we value reliable coffeemakers because we value good cups of coffee. But when it comes to the value of any particular cup of coffee, its value is just a matter of how good it tastes; whether the coffee was produced by a reliable coffeemaker doesn’t add to or detract from the value of the cup of coffee. Similarly, we value true beliefs, and we value reliable belief-forming processes because we care about getting true beliefs. So we have reason to prefer reliable processes over unreliable ones. But whether a particular belief was reliably or unreliably produced doesn’t seem to add to or detract from the value of the belief itself.

Responses have been offered on behalf of reliabilism. Brogaard (2006) points out that critics of reliabilism seem to have been presupposing a Moorean conception of value, according to which the value of an object (or state, condition, and so forth) is entirely a function of the internal properties of the object. (The value of the cup of coffee is determined entirely by its internal properties, not by the reliability of its production, or by the fineness of a particular morning when you enjoy your coffee.) But this seems to be a mistaken view about value in general. External features can add value to objects. We value a genuine Picasso painting more than a flawless counterfeit, for example. If that’s correct, then extra value can be conferred on an object, if it has a valuable source, and perhaps the value of reliable processes can transfer to the beliefs which they produce.

Goldman and Olsson (2009) offer two further responses on behalf of reliabilism. Their first response is that we can hold that true belief is always valuable, and that reliability is only valuable as a means to true belief, but that it is still more valuable to have knowledge (understood as reliabilists understand knowledge, that is, as reliably-produced and unGettiered true belief) than a mere true belief. For if S knows that p in circumstances C, then S has formed the belief that p through some reliable process in C. So S has some reliable process available to her, and it generated a belief in C. This makes it more likely that S will have a reliable process available to her in future similar circumstances, than it would be if S had an unreliably produced true belief in C. So, when we’re thinking about how valuable it is to be in circumstances C, it seems to be better for S to be in C if S has knowledge in C than if she has mere true belief in C, because having knowledge in C makes it likelier that she’ll get more true beliefs in future similar circumstances.

This response, Goldman and Olsson think, accounts for the extra value which knowledge has in many cases. But there will still be cases where S’s having knowledge in C doesn’t make it likelier that she’ll get more true beliefs in the future. For example, C might be a unique set of circumstances which is unlikely to come up again. Or S might be employing a reliable process which is available to her in C, but which is likely to become unavailable to her very soon. Or S might be on her deathbed. So this response isn’t a completely validating solution to the value problem, and it’s incumbent on Goldman and Olsson to explain why we tend to think that knowledge is more valuable than mere true belief even in those cases where it is not.

So Goldman and Olsson offer a second response to the Primary Value Problem: when it comes to our intuitions about the value of knowledge, they argue, it’s plausible that these intuitions began long ago with the recognition that true belief is always valuable in some sense to have, and that knowledge is usually valuable because it involves both true belief and the probability of getting more true beliefs; and then, over time, we have come to simply think that knowledge is valuable, even in cases when having knowledge doesn’t make it more probable that the subject will get more true beliefs in the future. (For some critical discussions and defenses of Goldman and Olsson’s treatment of the value problem, see Horvath (2009); Kvanvig (2010); and Olsson (2009; 2011)).

v. Contingent Features of Knowledge

An approach similar to Goldman and Olsson’s is to consider the values of contingent features of knowledge, rather than the value of its necessary and/or sufficient conditions. Although we might think that the natural way to account for the value of some state or condition S1, which is composed of other states or conditions S2-Sn, is in terms of the values of S2-Sn, perhaps S1 can be valuable in virtue of some other conditions which typically (but not always) accompany S1, or in terms of some valuable result which S1 is typically (but not always) able to get us. For example: it’s normal to think that air travel is valuable, because it typically enables people to cover great distances safely and quickly. Sometimes airplanes are diverted, and slow travellers down, and sometimes airplanes crash. But even so, we might continue to think, air travel is typically a valuable thing, because in ordinary cases, it gets us something good.

Similarly, we might think that knowledge is valuable because we need to rely on the information which people give us in order to accomplish just about anything in this life, and being able to identify people as having knowledge means being able to rely on them as informants. And we also might think that there’s value in being able to track whether our own beliefs are held on the basis of good reasons, and we typically have good reasons available to us for believing p when we know that p. We are not always in a position to identify when other people have knowledge, and if externalists about knowledge are right, then we don’t always have good reasons available to us when we have knowledge ourselves. Nevertheless, we can typically identify people as knowers, and we can typically identify good reasons for the things we know. These things are valuable, so they make typical cases of knowledge valuable, too. (See Craig (1990) for an account of the value of knowledge in terms of the characteristic function of knowledge-attribution. Jones (1997) further develops the view.)

Like Goldman and Olsson’s responses, this strategy for responding to the value problem doesn’t give us an account of why knowledge is always more valuable than mere true belief. For those who think that knowledge is always preferable to mere true belief, and who therefore seek a validating solution to the Primary Value Problem, this strategy will not be satisfactory. But for those who are willing to accept a somewhat revisionist response, according to which knowledge is only usually or characteristically preferable to mere true belief, this strategy seems promising.

vi. Derivative Non-Instrumental Value

Sylvan (2018) proposes the following principle as a way to explain the extra value that justification adds to true belief:

(The Extended Hurka Principle) When V is a non-instrumental value from the point of view of domain D, fitting ways of valuing V in D and their manifestations have some derivative non-instrumental value in D.

For instance, in the aesthetic domain, beauty is fundamentally valuable; but it’s also derivatively good to value or respect beauty, and it’s bad to disvalue or disrespect beauty. In the moral domain, beneficence is good; and it’s derivatively good to value or respect beneficence, and it’s bad to value or respect maleficence. And in the epistemic domain, true belief is good; but it’s also derivatively good to value or respect truth (by having justified beliefs), and it’s bad to disvalue or disrespect truth (by having unjustified beliefs).

In these domains, the derivatively valuable properties are not valuable because they promote or generate more of what is fundamentally valuable; rather, they are valuable because it’s just a good thing to manifest respect for what is fundamentally valuable. Still, Sylvan argues that the value of justification in the epistemic domain depends on and derives from the epistemic value of truth, because if truth were not epistemically valuable, then neither would respecting the truth be epistemically valuable.

A possible worry for this approach is that although respecting a fundamentally valuable thing might be good, it’s not clear that it adds domain-relative value to the thing itself (Bondy 2022). For instance, an artist passionately manifesting her love of beauty as she creates a sculpture does not necessarily make the sculpture itself better. The same might go for belief: perhaps the fact that a believer manifests respect for the truth in holding a belief does not necessarily make the belief itself any better.

b. The Secondary Value Problem

Suppose you’ve applied for a new position in your company, but your boss tells you that your co-worker Jones is going to get the job. Frustrated, you glance over at Jones, and see that he has ten coins on his desk, and you then watch him put the coins in his pocket. So you form the belief that the person who will get the job has at least ten coins in his or her pocket (call this belief “B”). But it turns out that your boss was just toying with you; he just wanted to see how you would react to bad news. He’s going to give you the job. And it turns out that you also have at least ten coins in your pocket.

So, you have a justified true belief, B, which has been Gettiered. In cases like this, once you’ve found out that you were Gettiered, it’s natural to react with annoyance or intellectual embarrassment: even though you got things right (about the coins, though not about who would get the job), and even though you had good reason to think you had things right, you were just lucky in getting things right.

If this is correct – if we do tend to prefer to have knowledge over Gettiered justified true beliefs – then this suggests that there’s a second value problem to be addressed. We seem to prefer having knowledge over having any proper subset of the parts of knowledge. But why should that be the case? What value is added to justified true beliefs, when they meet a suitable anti-Gettier condition?

i. No Extra Value

An initial response is to deny that knowledge is more valuable than mere justified true belief. If we’ve got true beliefs, and good reasons for them, we might be Gettiered, if for some reason it turns out that we’re just lucky in having true beliefs. When we inquire into whether p, we want to get to the truth regarding p, and we want to do so in a rationally defensible way. If it turns out that we get to the truth in a rationally defensible way, but strange factors of the case undermine our claim to knowing the truth about p, perhaps it just doesn’t matter that we don’t have knowledge.

Few epistemologists have defended this view, however (though Kaplan (1985) is an exception). We do, after all, find it irritating to discover that we’ve been Gettiered; and when we are considering corresponding cases of knowledge and of Gettiered justified true belief, we tend to think that the subject who has knowledge is better off than the subject who is Gettiered. We might be mistaken; there might be nothing better in knowledge than in mere justified true belief. But the presumption seems to be that knowledge is more valuable, and we should try to explain why that is so. Skepticism about the extra value of knowledge over mere justified true belief might be acceptable if we fail to find an adequate explanation, but we shouldn’t accept skepticism before searching for a good explanation.

ii. Virtues

We saw above that some virtue epistemologists think of knowledge in terms of the achievement of true beliefs as a result of the exercise of cognitive skills or virtues. And we do generally seem to value success that results from our efforts and skills (that is, we value success that’s been achieved rather than stumbled into; see, for example, Sosa (2003; 2007) and Pritchard (2009)). So, because we have a cognitive aim of getting to the truth, and we can achieve that aim either as a result of luck or as a result of our skillful cognitive performance, it seems that the value of achieving our aims as a result of a skillful performance can help explain why knowledge is more valuable than mere true belief.

That line of thought works just as well as a response to the Secondary Value Problem as to the Primary Value Problem. For in a Gettier case, the subject has a justified true belief, but it’s just as a result of luck that she arrived at a true belief rather than a false one. By contrast, when a subject arrives at a true belief because she has exercised a cognitive virtue, it’s plausible to think that it’s not just lucky that she’s arrived at a true belief; she gets credit for succeeding in the aim of getting to the truth as a result of her skillful performance. So cases of knowledge do, but Gettier cases do not, exemplify the value of succeeding in achieving our aims as a result of a skillful performance.

iii. Knowledge and Factive Mental States

“Knowledge-first epistemology” (beginning with Williamson 2000) is the approach to epistemology that does not attempt to analyze knowledge in terms of other more basic concepts; rather, it takes knowledge to be fundamental, and it analyzes other concepts in terms of knowledge. Knowledge-first epistemologists still want to say informative things about what knowledge is, but they don’t accept the traditional idea that knowledge can be analyzed in terms of informative necessary and sufficient conditions.

Williamson argues that knowledge is the most general factive mental state. At least some mental states have propositional contents (the belief that p has the content p; the desire that p has the content p; and so on). Factive mental states are mental states which you can only be in when their contents are true. Belief isn’t a factive mental state, because you can believe p even if p is false. By contrast, knowledge is a factive mental state, because you can only know that p if p is true. Other factive mental states include seeing that (for example, you can only see that the sun is up if the sun really is up) and remembering that. Knowledge is the most general factive mental state, for Williamson, because any time you are in a factive mental state with the content that p, you must know that p. If you see that it’s raining outside, then you know that it’s raining outside. Otherwise – say, if you have a mere true belief that it’s raining, or if your true belief that it’s raining is justified but Gettiered – you only seem to see that it’s raining outside.

If Williamson is right, and knowledge really is the most general factive mental state, then it is easy enough to explain the value of knowledge over mere justified true belief. We care, for one thing, about having true beliefs, and we dislike being duped. We would especially dislike it if we found out that we were victims of widespread deception. (Imagine your outrage and intellectual embarrassment, for example, if you were to discover that you were living in your own version of The Truman Show!) But not only that: we also care about being in the mental states we think we’re in (we care about really remembering what we think we remember, for example), and we would certainly dislike being duped about our own mental states, including when we take ourselves to be in factive mental states. So if having a justified true belief that p which is Gettiered prevents us from being in the factive mental states we think we’re in, but having knowledge enables us to be in these factive mental states, then it seems that we should care about having knowledge.

iv. Internalism and the Basing Relation

Finally, internalists about knowledge have an interesting response to offer to the Secondary Value Problem. Internalism about knowledge is the view that a necessary condition on S’s knowing that p is that S must have good reasons available for believing that p (where this is usually taken to mean that S must be able to become aware of those reasons, just by reflecting on what reasons she has). Internalists will normally hold that you have to have good reasons available to you, and you have to hold your belief on the basis of those reasons, in order to have knowledge.

Brogaard (2006) argues that the fact that beliefs must be held on the basis of good reasons gives the internalist her answer to the Secondary Value Problem. Roughly, the idea is that, if you hold the belief that p on the basis of a reason q, then you must believe (at least dispositionally) that in your current circumstances, q is a reliable indicator of p’s truth. So you have a first-order belief, p, and you have a reason for believing p, which is q, and you have a second-order belief, r, to the effect that q is a reliable indicator of p’s truth. And when your belief that p counts as knowledge, your reason q must in fact be a reliable indicator of p’s truth in your current circumstances – which means that your second-order belief r is true. So, assuming that the extra-belief requirement for basing beliefs on reasons is correct, it follows that when you have knowledge, you also have a correct picture of how things stand more broadly speaking.

When you are in a Gettier situation, by contrast, there is some feature of the situation which makes it the case that your reason q is not a reliable indicator of the truth of p. That means that your second-order belief r is false. So, even though you’ve got a true first-order belief, you have an incorrect picture of how things stand more broadly speaking. Assuming that it’s better to have a correct picture of how things stand, including a correct picture of what reasons are reliable indicators of the truth of our beliefs, knowledge understood in an internalist sense is more valuable than Gettiered justified true belief.

c. The Tertiary Value Problem

Pritchard (2007; 2010) suggests that there’s a third value problem to address (cf. also Zagzebski 2003). We often think of knowledge as distinctively valuable – that it’s a valuable kind of thing to have, and that its value isn’t the same kind of value as (for example) the value of true belief. If that’s correct, then simply identifying a kind of value which true beliefs have, and showing that knowledge has that same kind of value but to a greater degree, does not yield a satisfactory solution to this value problem.

By analogy, think of two distinct kinds of value: moral and financial. Suppose that both murders and mediocre investments are typically financially disvaluable, and suppose that murders are typically more financially disvaluable than mediocre investments. Even if we understand the greater financial disvalue of murders over the financial disvalue of mediocre investments, if we do not also understand that murders are disvaluable in a distinctively moral sense, then we will fail to grasp something fundamental about the disvalue of murder.

If knowledge is valuable in a way that is distinct from the way that true beliefs are valuable, then the kind of solution to the Primary Value Problem offered by Goldman and Olsson which we saw above isn’t satisfactory, because the extra value they identify is just the extra value of having more true beliefs. By contrast, as Pritchard suggests, if knowledge represents a cognitive achievement, in the way that virtue theorists often suggest, then because we do seem to think of achievements as being valuable just insofar as they are achievements (we value the overcoming of obstacles, and we value success which is attributable to a subject’s exercise of her skills or abilities), it follows that thinking of knowledge as an achievement provides a way to solve the Tertiary Value Problem. (Though, as we’ll see in section 3, Pritchard doesn’t think that knowledge in general represents an achievement.)

However, it’s not entirely clear that the Tertiary Value Problem is a real problem which needs to be addressed. (Haddock (2010) explicitly denies it, and Carter, Jarvis, and Rubin (2013) also register a certain skepticism before going on to argue that if there is a Tertiary Value Problem, it’s easy to solve.) Certainly most epistemologists who have attempted to solve the value problem have not worried about whether the extra value they were identifying in knowledge was different in kind from the value of mere true belief, or of mere justified true belief. Perhaps it is fair to say that it would be an interesting result if knowledge turned out to have a distinctive kind of value; maybe that would even be a mark in favour of an epistemological theory which had that result. But the consensus seems to be that, if we can identify extra value in knowledge, then that is enough to solve the value problem, even if the extra value is just a greater degree of the same kind of value which we find in the proper parts of knowledge such as true belief.

3. Truth and other Epistemic Values

We have been considering ways to try to explain why knowledge is more valuable than its proper parts. More generally, though, we might wonder what sorts of things are epistemically valuable, and just what makes something an epistemic value in the first place.

A natural way to proceed is simply to identify some state which epistemologists have traditionally been interested in, or which seems like it could or should be important for a flourishing cognitive life – such as the states of having knowledge, true belief, justification, wisdom, empirically adequate theories, and so on – and try to give some reason for thinking that it’s valuable to be in such a state.

Epistemologists who work on epistemic value usually want to explain either why true beliefs are valuable, or why knowledge is valuable, or both. Some also seek to explain the value of other states, such as understanding, and some seek to show that true beliefs and knowledge are not always as valuable as we might think.

Sustained arguments for the value of knowledge are easy to come by; the foregoing discussion of the Value Problem was a short survey of such arguments. Sustained arguments for the value of true belief, on the other hand, are not quite so plentiful. But it is especially important that we be able to show that true belief is valuable, if we are going to allow true belief to play a central role in epistemological theories. It is, after all, very easy to come up with apparently trivial true propositions, which no one is or ever will be interested in. Truths about how many grains of sand there are on some random beach, for example, seem to be entirely uninteresting. Piller suggests that “the string of letters we get, when we combine the third letters of the first ten passenger’s family names who fly on FR2462 to Bydgoszcz no more than seventeen weeks after their birthday with untied shoe laces” is an uninteresting truth, which no one would care about (2009, p.415). (Though see Treanor (2014) for an objection to arguments that proceed by comparing what appear to be more and less interesting truths.)  What is perhaps even worse, it is easy to construct cases where having a true belief is positively disvaluable. For example, if someone tells you how a movie will end before you see it, you will probably not enjoy the movie very much when you do get around to seeing it (Kelly 2003). Now, maybe these apparently trivial or disvaluable truths are after all at least a little bit valuable, in an epistemic sense – but on the face of them, these truths don’t seem valuable, so the claim that they are valuable needs to be argued for. We’ll see some such arguments shortly.

Keep in mind that although epistemologists often talk about the value of having true beliefs, this is usually taken to be short for the value of having true beliefs and avoiding false beliefs (though see Pritchard 2014 and Hutchinson 2021, who think that truth itself is what is valuable). These two aspects of what is usually referred to as a truth-goal are clearly related, but they are distinct, and sometimes they can pull in opposite directions. If we’re not careful, for example, an extreme desire to avoid false beliefs can lead us to adopt some form of skepticism, abandoning all or nearly all of our beliefs. But in giving up all of our beliefs, we do not only avoid having false beliefs; we also lose all of the true beliefs we would have had. When the goals of truth-achievement and error-avoidance pull in opposite directions, we need to weigh the importance of getting true beliefs against the importance of avoiding false ones, and decide how much epistemic risk we’re willing to take on in our body of beliefs (cf. James 1949, Riggs 2003).

Still, because the twin goals of achieving true beliefs and avoiding errors are so closely related, and because they are so often counted as a single truth-goal, we can continue to refer to them collectively as a truth-goal. We just need to be careful to keep the twin aspects of the goal in mind.

a. Truth Gets Us What We Want

One argument for thinking that true beliefs are valuable is that without true beliefs, we cannot succeed in any of our projects. Since even the most unambitious of us care about succeeding in a great many things (even making breakfast is a kind of success, which requires a great many true beliefs), we should all think that it’s important to have true beliefs, at least when it comes to subjects that we care about.

An objection to this argument for the value of true beliefs is that, as we’ve already seen, there are many true propositions which seem not to be worth caring about, and some which can be positively harmful. So although true beliefs are good when they can get us things we want, that is not always the case. So this argument doesn’t establish that we should always care about the truth.

A response to this worry is that we will all be faced with new situations in the future, and we will need to have a broad range of true beliefs, and as few false beliefs mixed in with the true ones as we can, in order to have a greater chance of succeeding when such situations come up (Foley 1993, ch.1). So it’s a good idea to try to get as many true beliefs as we can. This line of argument gives us a reason to think that it’s always at least pro tanto valuable to have true beliefs (that is, there’s always something positive to be said for true beliefs, even if that pro tanto value can sometimes be overridden by other considerations).

This is a naturalistically acceptable kind of value for true beliefs to enjoy. Although it doesn’t ground the value of true beliefs in the fact that people always desire to have true beliefs, it does ground their value in their instrumental usefulness for getting us other things which we do in fact desire. The main drawback for this approach, however, is that when someone positively desires not to have a given true belief – say, because it will cause him pain, or prevent him from having an enjoyable experience at the movies – it doesn’t seem like his desires can make it at all valuable for him to have the true belief in question. The idea here was to try to ground the value of truths in their instrumental usefulness, in the way that they are good for getting us what we want. But if there are true beliefs which we know will not be useful in that way (indeed, if there are true beliefs which we know will be harmful to us), then those beliefs don’t seem to have anything to be said in favour of them – which is to say that they aren’t even pro tanto valuable.

Whether we think that this is a serious problem will depend on whether we think that the claim that true beliefs are valuable entails that true beliefs must always have at least pro tanto value. Sometimes epistemologists (for example White 2007) explicitly claim that true beliefs are not always valuable in any real sense, since we just don’t always care about having them; but, just as money is valuable even though it isn’t something that we always care about having, so too true beliefs are still valuable, in a hypothetical sense: when we do want to have true beliefs, or when true beliefs are necessary for getting us what we want, they are valuable. So we can always say that they have value; it’s just that the kind of value in question is only hypothetical in nature. (One might worry, however, that “hypothetical” seems to be only a fancy way to say “not real.”)

b. What People Ought to Care About

A similar way to motivate the claim that true beliefs are valuable is to say that there are some things that we morally ought to care about, and we need to have true beliefs in order to achieve those things (Zagzebski 2003; 2009). For example, I ought to care about whether my choices as a consumer contribute to painful and degrading living and working conditions for people who produce what I am consuming. (I do care about that, but even if I did not, surely, I ought to care about it.) But in order to buy responsibly, and avoid supporting corporations that abuse their workers, I need to have true beliefs about the practices of various corporations.

So, since there are things we should care about, and since we need true beliefs to successfully deal with things which we should care about, it follows that we should care about having true beliefs.

This line of argument is unavailable to anyone who wants to avoid positing the existence of objective values which exist independently of what people actually desire or care about, and it doesn’t generate any value for true beliefs which aren’t relevant to things we ought to care about. But if there are things which we ought to care about, then it seems correct to say that at least in many cases, true beliefs are valuable, or worth caring about.

Lynch (2004) gives a related argument for the objective value of truth. Although he doesn’t ground the value of true beliefs in things that we morally ought to care about, his central argument is that it’s important to care about the truth for its own sake, because caring for the truth for its own sake is part of what it is to have intellectual integrity, and intellectual integrity is an essential part of a healthy, flourishing life. (He also argues that a concern for the truth for its own sake is essential for a healthy democracy.)

c. Proper Functions

Some epistemologists (for example Plantinga 1993; Bergmann 2006; Graham 2011) invoke the proper functions of our cognitive systems in order to argue for (or to explain) the value of truth, and to explain the connection between truth and justification or warrant. Proper functions are usually given a selected-effects gloss, following Millikan (1984). The basic idea is that an organ or a trait (T), which produces an effect (E), has the production of effects of type E as its proper function just in case the ancestors of T also produced effects of type E, and the fact that they produced effects of type E is part of a correct explanation of why the Ts (or the organisms which have Ts) survived and exist today. For example, hearts have the proper function of pumping blood because hearts were selected for their ability to pump blood – the fact that our ancestors had hearts that pumped blood is part of a correct explanation of why they survived, reproduced, and why we exist today and have hearts that pump blood.

Similarly, the idea goes, we have cognitive systems which have been selected for producing true beliefs. And if that’s right, then our cognitive systems have the proper function of producing true beliefs, which seems to mean that there is always at least some value in having true beliefs.

It’s not clear whether selected-effect functions are in fact normative, however (in the sense of being able by themselves to generate reasons or value). Millikan, at least, thought that proper functions are normative. Others have disagreed (for example Godfrey-Smith 1998). Whether we can accept this line of argument for the value of true beliefs will depend on whether we think that selected-effects functions are capable of generating value by themselves, or whether they only generate value when taken in a broader context which includes reference to the desires and the wellbeing of agents.

A further potential worry with the proper-function explanation of the value of true beliefs is that there seem to be cognitive mechanisms which have been selected for, and which systematically produce, false beliefs. (See Hazlett (2013), for example, who considers cognitive biases such as the self-enhancement bias at considerable length.) Plantinga (1993) suggests that we should distinguish truth-directed cognitive mechanisms from others, and say that it’s only the proper functioning of well-designed, truth-conducive mechanisms that yield warranted beliefs. But if this response works, it’s only because there’s some way to explain why truth is valuable, other than saying that our cognitive mechanisms have been selected for producing true beliefs; otherwise there would be no reason to suggest that it’s only the truth-directed mechanisms that are relevant to warranted and epistemically valuable beliefs.

d. Assuming an Epistemic Value/Critical Domains of Evaluation

Many epistemologists don’t think that we need to argue that truth is a valuable thing to have (for example BonJour 1985, Alston 1985; 2005, Sosa 2007). All we need to do is to assume that there is a standpoint which we take when we are doing epistemology, or when we’re thinking about our cognitive lives, and stipulate that the goal of achieving true beliefs and avoiding errors is definitive of that standpoint. We can simply assume that truth is a real and fundamental epistemic value, and proceed from there.

Proponents of this approach still sometimes argue for the claim that achieving the truth and avoiding error is the fundamental epistemic value. But when they do, their strategy is to assume that there must be some distinctively epistemic value which is fundamental (that is, which orients our theories of justification and knowledge, and which explains why we value other things from an epistemic standpoint), and then to argue that achieving true beliefs does a better job as a fundamental epistemic value than other candidate values do.

The strategy here isn’t to argue that true beliefs are always valuable, all things considered. The strategy is to argue only that true belief is of fundamental value insofar as we are concerned with evaluating beliefs (or belief-forming processes, practices, institutions, and so forth) from an epistemic point of view. True beliefs are indeed sometimes bad to have, all things considered (as when you know how a movie will end), and not everyone always cares about having true beliefs. But enough of us care about having true beliefs in a broad enough range of cases that a critical domain of evaluation has arisen, which takes true belief as its fundamental value.

In support of this picture of epistemology and epistemic value, Sosa (2007) compares epistemology to the critical domain of evaluation which centers on good coffee. That domain takes the production and consumption of good cups of coffee as its fundamental value, and it has a set of evaluative practices in light of that goal. Many people take that goal seriously, and we have enormous institutional structures in place which exist entirely for the purpose of achieving the goal of producing good cups of coffee. But there are people who detest coffee, and perhaps coffee isn’t really valuable at all. (Perhaps…) But even so, enough people take the goal of producing good coffee to be valuable that we have generated a critical domain of evaluation centering on the value of producing good coffee, and even people who don’t care about coffee can still recognize good coffee, and they can engage in the practices which go with taking good coffee as a fundamental value of a critical domain. And for Sosa, the value of true belief is to epistemology as the value of good cups of coffee is to the domain of coffee production and evaluation.

One might worry, however, that this sort of move cannot accommodate the apparently non-optional nature of epistemic evaluation. It’s possible to opt out of the practice of making evaluations of products and processes in terms of the way that they promote the goal of producing tasty cups of coffee, but our epistemic practices don’t seem to be optional in that way. Even if I were to forswear any kind of commitment to the importance of having epistemically justified beliefs, for example, you could appropriately level criticism at me if my beliefs were to go out of sync with my evidence.

e. Anti-Realism

An important minority approach to epistemic value and epistemic normativity is a kind of anti-realism, or conventionalism. The idea is that there is no sense in which true beliefs are really valuable, nor is there a sense in which we ought to try to have true beliefs, except insofar as we (as individuals, or as a community) desire to have true beliefs, or we are willing to endorse the value of having true beliefs.

One reason for being anti-realist about epistemic value is that you might be dissatisfied with all of the available attempts to come up with a convincing argument for thinking that truth (or anything else) is something which we ought to value. Hazlett (2013) argues against the “eudaimonic ideal” of true belief, which is the idea that even though true beliefs can be bad for us in exceptional circumstances, still, as a rule, true beliefs systematically promote human flourishing better than false beliefs do. One of Hazlett’s main objections to this idea is that there are types of cases where true beliefs are systematically worse for us than false beliefs. For example, people who have an accurate sense of what other people think of them tend to be more depressed than people who have an inflated sense of what others think of them. When it comes to beliefs about what others think about us, then, true beliefs are systematically worse for our wellbeing than corresponding false beliefs would be.

Because Hazlett thinks that the problems facing a realist account of epistemic value and epistemic norms are too serious, he adopts a form of conventionalism, according to which epistemic norms are like club rules. Just as a club might adopt the rule that they will not eat peas with spoons, so too, we humans have adopted epistemic rules such as the rule that we should believe only what the evidence supports. The justification for this rule isn’t that it’s valuable in any real sense to believe what the evidence supports; rather, the justification is just that the rule of believing in accord with the evidence is in fact a rule that we have adopted. (A worry for this approach, however, is that epistemic rules seem to be non-optional in a way that club rules are not. Clubs can change their rules by taking a vote, for example, whereas it doesn’t seem as though epistemic agents can do any such thing.)

f. Why the Focus on Truth?

We’ve been looking at some of the main approaches to the question of whether and why true beliefs are epistemically valuable. For a wide range of epistemologists, true beliefs play a fundamental role in their theories, so it’s important to try to see why we should think that truth is valuable. But, given that we tend to value knowledge more than we value true belief, one might wonder why true belief is so often taken to be a fundamental value in the epistemic domain. Indeed, not only do many of us think that knowledge is more valuable than mere true belief; we also think that there are a number of other things which should also count as valuable from the epistemic point of view: understanding, justification, simplicity, empirical adequacy of theories, and many other things too, seem to be important kinds of cognitive successes. These seem like prime candidates for counting as epistemically valuable – so why do they so often play a much smaller role in epistemological theories than true belief does?

There are three main reasons why truth is often invoked as a fundamental epistemic value, and why these other things are often relegated to secondary roles. The first reason is that, as we saw in section 2(a), true beliefs do at least often seem to enable us to accomplish our goals and achieve what we want. And they typically enable us to do so whether or not they count as knowledge, or even whether or not they’re justified, or whether they represent relatively simple hypotheses. This seems like a reason to care about having true beliefs, which doesn’t depend on taking any other epistemic states to be valuable.

The second reason is that, if we take true belief to be the fundamental epistemic value, we will still be able to explain why we should think of many other things aside from true beliefs as epistemically valuable. If justified beliefs tend to be true, for example, and having true beliefs is the fundamental epistemic value, then justification will surely also be valuable, as a means to getting true beliefs (this is suggested in a widely cited passage in BonJour (1985, pp.7-8)). Similarly, we might be able to explain the epistemic value of simplicity in terms of the value of truth, because the relative simplicity of a hypothesis can be evidence that the hypothesis is more likely than other competing hypotheses to be true. On one common way of thinking about simplicity, a hypothesis H1 is simpler than another hypothesis H2 if H1 posits fewer theoretical entities. Understanding simplicity in that way, it’s plausible to think that simpler hypotheses are likelier to be true, because there are fewer ways for them to go wrong (there are fewer entities for them to be mistaken about).
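The arithmetic behind this simplicity argument can be made concrete with a toy calculation. The numbers and the independence assumption below are purely hypothetical, chosen only to illustrate the “fewer ways to go wrong” idea:

```python
# Toy illustration (assumptions, not from the text): suppose each
# theoretical entity a hypothesis posits has, independently, a fixed
# probability p of being correctly characterized. A hypothesis is true
# only if all of its posited entities are correct, so positing fewer
# entities leaves fewer ways to go wrong.

p = 0.9  # assumed per-entity probability of being correct (hypothetical)

prob_h1 = p ** 2  # H1 posits two theoretical entities
prob_h2 = p ** 3  # H2 posits three theoretical entities

print(prob_h1 > prob_h2)  # prints True: the simpler hypothesis is likelier
```

On this (strong) independence assumption, the simpler hypothesis always comes out at least as probable; the philosophical question is whether real theoretical posits behave anything like this.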

By contrast, it is not so straightforward to try to explain the value of truth in terms of other candidate epistemic values, such as simplicity or knowledge. If knowledge were the fundamental (as opposed to the highest, or one of the highest) epistemic value, so that the value of true beliefs would have to be dependent on the value of knowledge, then it seems that it would be difficult to explain why unjustified true beliefs should be more valuable than unjustified false beliefs, which they seem to be.

And the third reason why other candidate epistemic values are not often invoked in setting out epistemic theories is that, even if there are epistemically valuable things which do not get all of their epistemic value from their connection with true belief, there is a particular theoretical role which many epistemologists want the central epistemic goal or value to play, and it can only play that role if it’s understood in terms of achieving true beliefs and avoiding false ones (David 2001; cf. Goldman 1979). Briefly, the role in question is that of providing a way to explain our epistemic notions, including especially the notions of knowledge and epistemic rationality, in non-epistemic terms. Since truth is not itself an epistemic term, it can play this role. But other things which seem to be epistemically valuable, like knowledge and rationality, cannot play this role, because they are themselves epistemic terms. We will come back to the relation between the analysis of epistemic rationality and the formulation of the epistemic goal in the final section of this article.

Still, “veritism,” or “truth-value-monism”—the view that truth, or true belief, is the sole or the fundamental epistemic value—has come in for heavy criticism in recent years. Pluralists argue that there are multiple states or properties that have independent epistemic value (for example, DePaul 2001; Kvanvig 2005; Brogaard 2009; Madison 2017); some argue that truth is not particularly valuable, or not particularly epistemically valuable (for example, Feldman 2000; Wrenn 2017); and as we saw above, some epistemologists argue that knowledge is what is primarily valuable, and that the attempt to explain the value of knowledge in terms of the value of truth is misguided (for example, Littlejohn 2018; Aschliman 2020). For defenses of veritism from some of its challenges, see (Ahlstrom-Vij 2013; Pritchard 2014; 2021).

4. Understanding

There is growing support among epistemologists for the idea that understanding is the highest epistemic value, more valuable even than knowledge. There are various ways of fleshing out this view, depending on what kind of understanding we have in mind, and depending on whether we want to remain truth-monists about what’s fundamentally epistemically valuable or not.

a. Understanding: Propositions and Domains; Subjective and Objective

If you are a trained mechanic, then you understand how automobiles work. This is an understanding of a domain, or of a kind of object. To have an understanding of a domain, you need to have a significant body of beliefs about that domain, which fits together in a coherent way, and which involves beliefs about what would explain why things happen as they do in that domain. When you have such a body of beliefs, we can say that you have a subjective understanding of the domain (Grimm 2012). When, in addition, your beliefs about the domain are mostly correct, we can say that you have an objective understanding of the domain.

In addition to understanding a domain, you might also understand that p – you might understand that some proposition is true. There are several varieties of propositional understanding: there is simply understanding that p; there is understanding why p, which involves understanding that p because q; there is understanding when p, which involves understanding that p happens at time t, and understanding why p happens at time t; and so on, for other wh- terms, such as who and where. In what follows, we’ll talk in general in terms of propositional understanding, or understanding that p, to cover all these cases.

Understanding that p entails having at least some understanding of a domain. To borrow an example of Pritchard’s (2009): imagine that you come home to find your house burnt to the ground. You ask the fire chief what caused the fire, and he tells you that it was faulty wiring. Now you know why your house burnt to the ground (you know that it burnt down because of the faulty wiring), and you also understand why your house burnt to the ground (you know that the house burnt down because of faulty wiring, and you have some understanding of the kinds of things that tend to start fires, such as sparks, or overheating, both of which can be caused by faulty wiring.) You understand why the house burnt down, in other words, only because you have some understanding of how fires are caused.

As Kvanvig (2003) notes, it’s plausible that you only genuinely understand that p if you have a mostly correct (that is, an objective) understanding of the relevant domain. For suppose that you have a broad and coherent body of beliefs about celestial motion, but which centrally involves the belief that the earth is at the center of the universe. Because your body of beliefs involves mistaken elements at its core, we would normally say that you misunderstand celestial motions, and you misunderstand why (for example) we can observe the sun rising every day. In a case like this, where you misunderstand why p (for example why the sun comes up), we can say that you have a subjective propositional understanding: your belief that the sun comes up every day because the earth is at the center of the universe, and the celestial bodies all rotate around it, can be coherent with a broader body of justified beliefs, and it can provide explanations of celestial motions. But because your understanding of the domain of celestial motion involves false beliefs at its core, you have an incorrect understanding of the domain, and your explanatory propositional understanding, as a result, is also a misunderstanding.

By contrast, when your body of beliefs about a domain is largely correct, and your understanding of the domain leads you to believe that p is true because q is true, we can say that you have an objective understanding of why p is true. In what follows, except where otherwise specified, “understanding” refers to objective propositional understanding.

b. The Location of the Special Value of Understanding

It seems natural to think that understanding that p involves knowing that p, plus something extra, where the extra bit is something like having a roughly correct understanding of some relevant domain to do with p: you understand that p when (and only when) you know that p, and your belief that p fits into a broader, coherent, explanatory body of beliefs, where this body of beliefs is largely correct. So the natural place to look for the special epistemic value of understanding is in the value of this broader body of beliefs.

Some authors (Kvanvig 2003; Hills 2009; Pritchard 2009) have argued that propositional understanding does not require the corresponding propositional knowledge: S can understand that p, they argue, even if S doesn’t know that p. The main reason for this view is that understanding seems to be compatible with a certain kind of luck, environmental luck, which is incompatible with knowledge. For example, think again of the case where you ask the fire chief the cause of the fire, but now imagine that there are many pretend fire chiefs all walking around the area in uniform, and it’s just a matter of luck that you asked the real fire chief. In this case, it seems fairly clear that you lack knowledge of the cause of the fire, since you could so easily have asked a fake fire chief, and formed a false belief as a result. But, the argument goes, you do gain understanding of the cause of the fire from the fire chief. After all, you have gained a true belief about what caused the fire, and your belief is justified, and it fits in with your broader understanding of the domain of fire-causing. What we have here is a case of a justified true belief, where that belief fits in with your understanding of the relevant domain, but where you have been Gettiered, so you lack knowledge.

So, it’s controversial whether understanding that p really presupposes knowing that p. But when it comes to the value of understanding, we can set this question aside. For even if there are cases of propositional understanding without the corresponding propositional knowledge, still, most cases of propositional understanding involve the corresponding propositional knowledge, and in those cases, the special value of understanding will lie in what is added to the propositional knowledge to yield understanding. In cases where there is Gettierizing environmental luck, so that S has a Gettierized justified true belief which fits in with her understanding of the relevant domain, the special value of understanding will lie in what is added to justified true belief. In other words, whether or not propositional understanding presupposes the corresponding propositional knowledge, the special value of propositional understanding will be located in the subject’s understanding of the relevant domain.

c. The Value of Understanding

There are a few plausible accounts of why understanding should be thought of as distinctively epistemically valuable, and perhaps even as the highest epistemic value. One suggestion, which would be friendly to truth-monists about epistemic value, is that we can consistently hold both that truth is the fundamental epistemic value and that understanding is the highest epistemic value. Because understanding that p typically involves both knowing that p and having a broader body of beliefs, where this body of beliefs is coherent and largely correct, it follows from the fundamental value of true beliefs that in any case where S understands that p, S’s cognitive state involves greater epistemic value than if S were merely to truly believe that p, because S has many other true beliefs too. On this picture, understanding doesn’t have a distinctive kind of value, but it does have a greater quantity of value than true belief, or even than knowledge. But, for a truth-monist about epistemic value, this is just the result that should be desired – otherwise, the view would no longer be monistic.

An alternative suggestion, which does not rely on truth-monism about epistemic value, is that the value of having a broad body of beliefs which provide an explanation for phenomena is to be explained by the fact that whether you have such a body of beliefs is transparent to you: you can always tell whether you have understanding (Zagzebski 2001). And surely, if it’s always transparent to you whether you understand something, that is a source of extra epistemic value for understanding on top of the value of having true belief or even knowledge, since we can’t in general tell whether we are in those states.

The problem with this suggestion, though, as Grimm (2006; 2012) points out, is that we cannot always tell whether we have understanding. It often happens that we think we understand something, when in fact we gravely misunderstand it. It might be the case that we can always tell whether we have a subjective understanding – we might always be able to tell whether we have a coherent, explanatory body of beliefs – but we are not in general in a position to be able to tell whether our beliefs are largely correct. The subjective kind of understanding doesn’t entail the objective kind. Still, it is worth noting that there seems to be a kind of value in being aware of the coherence and explanatory power of one’s beliefs on a given topic, even if it’s never transparent whether one’s beliefs are largely correct. (See Kvanvig 2003 for more on the value of internal awareness and of having coherent bodies of beliefs.)

A third suggestion about the value of understanding, which is also not committed to truth-monism, is that having understanding can plausibly be thought of as a kind of success which is properly attributable to one’s exercise of a relevant ability, or in other words, an achievement. As we saw above, a number of virtue epistemologists think that we can explain the distinctive value of knowledge by reference to the fact that knowledge is a cognitive achievement. But others (notably, Lackey 2006 and 2009) have denied that subjects in general deserve credit for their true belief in cases of knowledge. Cases of testimonial knowledge are popular counterexamples to the view that knowledge is in general an achievement: when S learns some fact about local geography from a random bystander, for example, S can gain knowledge, but if anyone deserves credit for S’s true belief, it seems to be the bystander. So, if that’s right, then it’s not after all always much of an achievement to gain knowledge.

Pritchard (2009) also argues that knowledge is not in general an achievement, but he claims that understanding is. For when S gains an understanding that p, it seems that S must bring to bear significant cognitive resources, unlike when S only gains knowledge that p. Suppose, for example, that S asks bystander B where the nearest tourist information booth is, and B tells him. Now let’s compare S’s and B’s cognitive states. S has gained knowledge of where the nearest information booth is, but S doesn’t have an understanding of the location of the nearest information booth, since S lacks knowledge of the relevant domain (that is, local geography). B, on the other hand, both knows and understands the location of the nearest booth. And B’s understanding of the local geography, and her consequent understanding of the location of the nearest booth, involves an allocation of significant cognitive resources. (Anyone who has had to quickly memorize the local geography of a new city will appreciate how much cognitive work goes into having a satisfactory understanding of this kind of domain.)

d. Alternatives to the Natural Picture of the Value of Understanding

If understanding that p requires both knowing that p (or having a justified true belief that p) and having a broader body of beliefs which is coherent, explanatory, and largely correct, then it’s plausible to think that the special value of understanding is in the value of having such a body of beliefs. But it’s possible to resist this view of the value of understanding in a number of ways. One way to resist it would be to deny that understanding is ever any different from knowing. Reductivists about understanding think that it’s not possible to have knowledge without having understanding, or understanding without knowledge.  Sliwa (2015) argues, for example, that when S knows that p, S must understand that p at least to some extent. S has a better understanding that p when S has a better understanding of the relevant domain, in the form of knowledge of more related propositions, but S knows that p if and only if S has some understanding that p.

For reductivists about understanding, there can obviously be no value in understanding beyond the value of having knowledge. There are better and worse understandings, but any genuine (objective) understanding involves at least some knowledge, and better understanding just involves more knowledge. If that’s right, then we don’t need to say that understanding has more value than knowledge.

A second way to resist the approach to the value of understanding presented in the previous section is to resist the claim that understanding requires that one’s beliefs about a domain must be mostly correct. Elgin (2007; 2009), for example, points out that in the historical progression of science, there have been stages at which scientific understanding, while useful and epistemically good, centrally involved false beliefs about the relevant domains. Perhaps even more importantly, scientists regularly employ abstract or idealized models, which are known to be strictly false – but they use these models to gain a good understanding of the domain or phenomenon in question. And the resulting understanding is better, rather than worse, because of the use of these models, which are strictly speaking false. So the elimination of all falsehoods from our theories is not even desirable, on Elgin’s view. (In the language of subjective and objective understanding, we might say that Elgin thinks that subjective understanding can be every bit as good to have as objective understanding. We need to keep in mind, though, that Elgin would reject the view that subjective understandings which centrally involve false beliefs are necessarily misunderstandings.)

5. Instrumentalism and Epistemic Goals

The final topic we need to look at now is the relation between epistemic values and the concept of epistemic rationality or justification. According to one prominent way of analyzing epistemic rationality, the instrumental conception of epistemic rationality, beliefs are epistemically rational when and just to the extent that they appropriately promote the achievement of a distinctively epistemic goal. This approach can measure the epistemic rationality of individual beliefs by how well they themselves do with respect to the epistemic goal (for example, Foley 1987); or it can measure the rationality of whole belief-systems by how accurate they are, according to some appropriate formal rule that scores bodies of beliefs in light of the epistemic goal (for example, Joyce 1998).
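One commonly discussed accuracy measure of the kind Joyce appeals to is the Brier score, which rates a body of credences by its squared distance from the actual truth values. The sketch below is a minimal illustration with made-up numbers, not Joyce’s own formulation:

```python
def brier_score(credences, truths):
    """Mean squared distance between a body of credences (degrees of
    belief in [0, 1]) and truth values (1 for a true proposition, 0 for
    a false one). Lower scores indicate a more accurate belief system."""
    assert len(credences) == len(truths)
    return sum((c - t) ** 2 for c, t in zip(credences, truths)) / len(credences)

# Illustrative (hypothetical) credences over three propositions, the
# first two true and the third false. An agent confident in the truths
# and doubtful of the falsehood scores better (lower) than an agent
# with roughly the reverse credences.
accurate = brier_score([0.9, 0.8, 0.1], [1, 1, 0])    # = 0.02
inaccurate = brier_score([0.2, 0.3, 0.9], [1, 1, 0])  # ≈ 0.647
print(accurate < inaccurate)  # prints True
```

On this way of measuring accuracy, a whole belief-system can be scored against the goal of believing truths and disbelieving falsehoods, which is the kind of formal rule the instrumentalist approach described above requires.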

The instrumental conception has been endorsed by many epistemologists over the past several decades (for example BonJour 1985; Alston 1985, 2005; Foley 1987, 1993, 2008), though a number of important criticisms of it have emerged in recent years (for example Kelly 2003; Littlejohn 2012; Hazlett 2013). For instrumentalists, the projects of getting the right account of the epistemic goal and the right account of epistemic rationality constrain each other. Whether or not we want to accept instrumentalism in the end, it’s important to see the way that instrumentalists think of the relation of epistemic goals and epistemic rationality.

a. The Epistemic Goal as a Subset of the Epistemic Values

The first thing to note about the instrumentalist’s notion of an epistemic goal is that it has to do with what is valuable from an epistemic or cognitive point of view. But instrumentalists typically are not concerned to identify a set of goals which is exhaustive of what is epistemically valuable. Rather, they are concerned with identifying an epistemically valuable goal which is capable of generating a plausible, informative, and non-circular account of epistemic rationality in instrumental terms, and it’s clear that not all things that seem to be epistemically valuable can be included in an epistemic goal which is going to play that role. David (2001) points out that if we take knowledge or rationality (or, we might also add here, understanding) to be part of the epistemic goal, then the instrumental account of epistemic rationality becomes circular. This is most obvious with rationality: rationality is no doubt something we think is epistemically valuable, but if we include rationality in the formulation of the epistemic goal, and we analyze epistemic rationality in terms of achieving the epistemic goal, then we’ve analyzed epistemic rationality as the appropriate promotion of the goal of getting epistemically rational beliefs – an unhelpfully circular analysis, at best. And, if knowledge and understanding presuppose rationality, we also cannot include knowledge or understanding in the formulation of the epistemic goal.

This is one important reason why many epistemologists have taken the epistemic goal to be about achieving true beliefs and avoiding false ones. That seems to be a goal which is valuable from an epistemic point of view, and it stands a good chance at grounding a non-circular analysis of epistemic rationality.

David in fact goes a step further, and claims that because true belief is the only thing that is epistemically valuable that is capable of grounding an informative and non-circular analysis of epistemic rationality, truth is the only thing that’s really valuable from an epistemic point of view; knowledge, he thinks, is an extra-epistemic value. But it’s possible for pluralists about epistemic value to appreciate David’s point that only some things that are epistemically valuable (such as having true beliefs) are suitable for being taken up in the instrumentalist’s formulation of the epistemic goal. In other words, pluralism about epistemic values is consistent with monism about the epistemic goal.

b. Common Formulations of the Epistemic Goal

Now, there are two further important constraints on how to formulate the epistemic goal. First, it must be plausible to take as a goal – that is, as something we do in fact care about, or at least something that seems to be worth caring about even if people don’t in fact care about it. We might express this constraint by saying that the epistemic goal must be at least pro tanto valuable in either a subjective or an objective sense. And second, the goal should enable us to categorize clear cases of epistemically rational and irrational beliefs correctly. We can close this discussion of epistemic values and goals by considering three oft-invoked formulations of the epistemic goal, and noting the important differences between them. According to these formulations, the epistemic goal is:

(1) “to amass a large body of beliefs with a favorable truth-falsity ratio” (Alston 1985, p.59);

(2) “maximizing true beliefs and minimizing false beliefs about matters of interest and importance” (Alston 2005, p.32); and

(3) “now to believe those propositions that are true and now not to believe those propositions that are false” (Foley 1987, p.8).

Each of these formulations of the epistemic goal emphasizes the achievement of true beliefs and the avoidance of false ones. But there are two important dimensions along which they diverge.

c. Differences between the Goals: Interest and Importance

The first difference is with respect to whether the epistemic goal includes all propositions (or, perhaps, all propositions which a person could conceivably grasp), or whether it includes only propositions about matters of interest or importance. Formulation (2) includes an “interest and importance” clause, whereas (1) and (3) do not. The reason for including a reference to interest and importance is that it makes the epistemic goal much more plausible to take as a goal which is pro tanto valuable. For, as we have seen, there are countless examples of apparently utterly trivial or even harmful true propositions, which one might think are not worth caring about having. This seems like a reason to restrict the epistemic goal to having true beliefs and avoiding false ones about matters of interest and importance: we want to have true beliefs, but only when it is interesting or important to us to have them.

The drawback of an interest and importance clause in the epistemic goal is that it seems to prevent the instrumental approach from providing a fully general account of epistemic rationality. For it seems possible to have epistemically rational or irrational beliefs about utterly trivial or even harmful propositions. Suppose I were to come across excellent evidence about the number of times the letter “y” appears in the seventeenth space on all lines in the first three and the last three sections of this article. Even though that strikes me as an utterly trivial truth, which I don’t care about believing, I might still come to believe what my evidence supports regarding it. And if I do, then it’s plausible to think that my belief will count as epistemically rational, because it’s based on good evidence. If it is not part of the epistemic goal that we should achieve true beliefs about even trivial or harmful matters, then it doesn’t seem like instrumentalists have the tools to account for our judgments of epistemic rationality or irrationality in such cases. This seems to give us a reason to make the epistemic goal include all true propositions, or at least all true propositions which people can conceivably grasp. (Such a view might be supported by appeal to the arguments for the general value of truth which we saw above, in section 2.)

d. Differences between the Goals: Synchronic and Diachronic Formulations

The second difference between the three formulations of the epistemic goal is regarding whether the goal is synchronic or diachronic. Formulation (3) is synchronic: it is about now having true beliefs and avoiding false ones. (Or, if we are considering a subject S’s beliefs at a time t other than the present, the goal is to believe true propositions and not believe false ones, at t.) Formulations (1) and (2) are neutral on that question.

A reason for accepting a diachronic formulation of the epistemic goal is that it is, after all, plausible to think that we do care about having true beliefs and avoiding false beliefs over the long run. Having true beliefs now is a fine thing, but having true beliefs now and still having them ten minutes from now is surely better. A second reason for adopting a diachronic formulation of the goal, offered by Vahid (2003), is to block Maitzen’s (1995) argument that instrumentalists who think that the epistemic goal is about having true beliefs cannot say that there are justified false beliefs, or unjustified true beliefs. Briefly, Maitzen argues that false beliefs can never, and true beliefs can never fail to, promote the achievement of the goal of getting true beliefs and avoiding false ones. Vahid replies that if the epistemic goal is about having true beliefs over the long run, then false beliefs can count as justified, in virtue of their truth-conducive causal histories.

The reason why instrumentalists like Foley formulate the epistemic goal instead in synchronic terms is to avoid the counterintuitive result that the epistemic status of a subject’s beliefs at t can depend on what happens after t. For example: imagine that you have very strong evidence at time t for thinking that you are a terrible student, but you are extremely confident in yourself anyway, and you hold the belief at t that you are a good student. At t+1, you consider whether to continue your studies or to drop out of school. Because of your belief about your abilities as a student, you decide to continue with your studies. And in continuing your studies, you go on to become a better student, and you learn all sorts of new things.

In this case, your belief at t that you are a good student does promote the achievement of a large body of beliefs with a favorable truth-falsity ratio over the long run. But by hypothesis, your belief is held contrary to very strong evidence at time t. The intuitive verdict in such cases seems to be that your belief at t that you are a good student is epistemically irrational. So, since the belief promotes the achievement of a diachronic epistemic goal, but not a synchronic one, we should make the epistemic goal synchronic. Or, if we want to maintain that the epistemic goal is diachronic, we can do so, as long as we are willing to accept the cost of adopting a partly revisionary view about what’s epistemically rational to believe in some cases where beliefs are held contrary to good available evidence.

6. Conclusion

We’ve gone through some of the central problems to do with epistemic value here. We’ve looked at attempts to explain why and in what sense knowledge is more valuable than any of its proper parts, and we’ve seen attempts to explain the special epistemic value of understanding. We’ve also looked at some attempts to argue for the fundamental epistemic value of true belief, and the role that the goal of achieving true beliefs and avoiding false ones plays when epistemologists give instrumentalist accounts of the nature of epistemic justification or rationality. Many of these are fundamental and important topics for epistemologists to address, both because they are intrinsically interesting, and also because of the implications that our accounts of knowledge and justification have for philosophy and inquiry more generally (for example, implications for norms of assertion, for norms of practical deliberation, and for our conception of ourselves as inquirers, to name just a few).

7. References and Further Reading

  • Ahlstrom-Vij, Kristoffer (2013). In Defense of Veritistic Value Monism. Pacific Philosophical Quarterly. 94: 1, 19-40.
  • Ahlstrom-Vij, Kristoffer, and Jeffrey Dunn (2018). Epistemic Consequentialism. Oxford: Oxford University Press.
    • This useful volume contains essays that develop, criticize, and defend consequentialist (instrumentalist) accounts of epistemic norms. Much of the volume concerns formal approaches to scoring beliefs and belief-systems in light of the epistemic goal of achieving true beliefs and avoiding false beliefs.
  • Alston, William (1985). Concepts of Epistemic Justification. The Monist. 68. Reprinted in his Epistemic Justification: Essays in the Theory of Knowledge. Ithaca, NY: Cornell University Press, 1989.
    • Discusses concepts of epistemic justification. Espouses an instrumentalist account of epistemic evaluation.
  • Alston, William (2005). Beyond Justification: Dimensions of Epistemic Evaluation. Ithaca, NY: Cornell University Press.
    • Abandons the concept of epistemic justification as too simplistic; embraces the pluralist idea that there are many valuable ways to evaluate beliefs. Continues to endorse the instrumentalist approach to epistemic evaluations.
  • Aschliman, Lance (2020). Is True Belief Really a Fundamental Value? Episteme. 17: 1, 88-104.
  • Bergmann, Michael (2006). Justification without Awareness. Oxford: Oxford University Press.
  • Bondy, Patrick (2018). Epistemic Rationality and Epistemic Normativity. Routledge.
    • Considers three strategies for explaining the normativity of epistemic reasons; criticizes instrumentalism about the nature of epistemic reasons and rationality; defends instrumentalism about the normativity of epistemic reasons.
  • Bondy, Patrick (2022). Avoiding Epistemology’s Swamping Problem: Instrumental Normativity without Instrumental Value. Southwest Philosophy Review.
    • Argues that the normativity of epistemic reasons is instrumental. Also raises worries for Sylvan’s (2018) derivative but non-instrumental approach to the epistemic value of justification and knowledge.
  • BonJour, Laurence (1985). The Structure of Empirical Knowledge. Cambridge, Mass: Harvard University Press.
    • Develops a coherentist internalist account of justification and knowledge. Gives a widely-cited explanation of the connection between epistemic justification and the epistemic goal.
  • Brogaard, Berit (2006). Can Virtue Reliabilism Explain the Value of Knowledge? Canadian Journal of Philosophy. 36: 3, 335-354.
    • Defends generic reliabilism from the Primary Value Problem; proposes an internalist response to the Secondary Value Problem.
  • Brogaard, Berit (2008). The Trivial Argument for Epistemic Value Pluralism, or, How I Learned to Stop Caring About Truth. In: Adrian Haddock, Alan Millar, and Duncan Pritchard, eds. Epistemic Value. Oxford: Oxford University Press. 284-308.
  • Carter, J. Adam, Benjamin Jarvis, and Katherine Rubin (2013). Knowledge: Value on the Cheap. Australasian Journal of Philosophy. 91: 2, 249-263.
    • Presents the promising proposal that because knowledge is a continuing state rather than something that is achieved and then set aside, there are easy solutions to the Primary, Secondary, and even Tertiary Value Problems for knowledge.
  • Craig, Edward (1990). Knowledge and the State of Nature. Oxford: Oxford University Press.
  • David, Marian (2001). Truth as the Epistemic Goal. In Matthias Steup, ed., Knowledge, Truth, and Duty: Essays on Epistemic Justification, Responsibility, and Virtue. New York and Oxford: Oxford University Press. 151-169.
    • A thorough discussion of how instrumentalists about epistemic rationality or justification ought to formulate the epistemic goal.
  • David, Marian (2005). Truth as the Primary Epistemic Goal: A Working Hypothesis. In Matthias Steup and Ernest Sosa, eds. Contemporary Debates in Epistemology. Malden, MA: Blackwell. 296-312.
  • DePaul, Michael (2001). Value Monism in Epistemology. In: Matthias Steup, ed. Knowledge, Truth, and Duty: Essays on Epistemic Justification, Responsibility, and Virtue. Oxford: Oxford University Press. 170-183.
  • Dogramaci, Sinan (2012). Reverse Engineering Epistemic Evaluations. Philosophy and Phenomenological Research. 84: 3, 513-530.
    • Accepts the widely-endorsed thought that justification or rationality are only instrumentally valuable for getting us true beliefs. The paper inquires into what function our epistemic practices could serve, in cases where what’s rational to believe is false, or what’s irrational to believe is true.
  • Elgin, Catherine (2007). Understanding and the Facts. Philosophical Studies. 132, 33-42.
  • Elgin, Catherine (2009). Is Understanding Factive? In A. Haddock, A. Millar, and D. Pritchard, eds. Epistemic Value. Oxford: Oxford University Press. 322-330.
  • Feldman, Richard (2000). The Ethics of Belief. Philosophy and Phenomenological Research. 60: 3, 667-695.
  • Field, Hartry (2001). Truth and the Absence of Fact. Oxford: Oxford University Press.
    • Among other things, argues that there are no objectively correct epistemic goals which can ground objective judgments of epistemic reasonableness.
  • Foley, Richard (1987). The Theory of Epistemic Rationality. Cambridge, Mass: Harvard University Press.
    • A very thorough development of an instrumentalist and egocentric account of epistemic rationality.
  • Foley, Richard (1993). Working Without a Net: A Study of Egocentric Rationality. New York and Oxford: Oxford University Press.
    • Develops and defends the instrumental approach to rationality generally and to epistemic rationality in particular.
  • Foley, Richard (2008). An Epistemology that Matters. In P. Weithman, ed. Liberal Faith: Essays in Honor of Philip Quinn. Notre Dame, Indiana: University of Notre Dame Press. 43-55.
    • Clear and succinct statement of Foley’s instrumentalism.
  • Godfrey-Smith, Peter (1998). Complexity and the Function of Mind in Nature. Cambridge: Cambridge University Press.
  • Goldman, Alvin (1979). What Is Justification? In George Pappas, ed. Justification and Knowledge. Dordrecht: D. Reidel Publishing Company, 1-23.
  • Goldman, Alvin (1999). Knowledge in a Social World. Oxford: Oxford University Press.
    • Adopts a veritist approach to epistemic value; describes and evaluates a number of key social institutions and practices in light of the truth-goal.
  • Goldman, Alvin and Olsson, Erik (2009). Reliabilism and the Value of Knowledge. In A. Haddock, A. Millar, and D. Pritchard, eds. Epistemic Value. Oxford: Oxford University Press. 19-41.
    • Presents two reliabilist responses to the Primary Value Problem.
  • Graham, Peter (2011). Epistemic Entitlement. Noûs. 46: 3, 449-482.
  • Greco, John (2003). Knowledge as Credit for True Belief. In Michael DePaul and Linda Zagzebski, eds. Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford: Oxford University Press. 111-134.
    • Sets out the view that attributions of knowledge are attributions of praiseworthiness, when a subject gets credit for getting to the truth as a result of the exercise of intellectual virtues. Discusses praise, blame, and the pragmatics of causal explanations.
  • Greco, John (2008). Knowledge and Success from Ability. Philosophical Studies. 142, 17-26.
    • Elaboration of ideas in Greco (2003).
  • Grimm, Stephen (2006). Is Understanding a Species of Knowledge? British Journal for the Philosophy of Science. 57, 515–35.
  • Grimm, Stephen (2012). The Value of Understanding. Philosophy Compass. 7: 2, 103-117.
    • Good survey article of work on the value of understanding up to 2012.
  • Haddock, Adrian (2010). Part III: Knowledge and Action. In Duncan Pritchard, Allan Millar, and Adrian Haddock, The Nature and Value of Knowledge: Three Investigations. Oxford: Oxford University Press.
  • Hazlett, Allan (2013). A Luxury of the Understanding: On the Value of True Belief. Oxford: Oxford University Press.
    • An extended discussion of whether true belief is valuable. Presents a conventionalist account of epistemic normativity.
  • Hills, Alison (2009). Moral Testimony and Moral Epistemology. Ethics. 120: 1, 94-127.
  • Horvath, Joachim (2009). Why the Conditional Probability Solution to the Swamping Problem Fails. Grazer Philosophische Studien. 79: 1, 115-120.
  • Hutchinson, Jim (2021). Why Can’t What Is True Be Valuable? Synthese. 198, 6935-6954.
  • Hyman, John (2010). The Road to Larissa. Ratio. 23: 4, 393-414.
    • Contains detailed explanatory and critical discussion of the Primary and Secondary Value Problems, and Plato’s and Williamson’s stability solutions. Proposes that knowledge is the ability to be guided by the facts; and that knowledge is expressed when we guide ourselves by the facts—when we “do things for reasons that are facts” (p.411); and mere true belief is insufficient for this kind of guidance.
  • James, William (1949). The Will to Believe. In his Essays in Pragmatism. New York: Hafner. pp. 88-109. Originally published in 1896.
  • Jones, Ward (1997). Why Do We Value Knowledge? American Philosophical Quarterly. 34: 4, 423-439.
    • Argues that reliabilists and other instrumentalists cannot handle the Primary Value Problem. Proposes that we solve the problem by appealing to the value of contingent features of knowledge.
  • Joyce, James (1998). A Nonpragmatic Vindication of Probabilism. Philosophy of Science. 65: 4, 575-603.
    • Assumes an epistemic goal of truth or accuracy; shows that credal systems that conform to the axioms of probability do better than systems that violate those axioms.
  • Kaplan, Mark (1985). It’s Not What You Know that Counts. The Journal of Philosophy. 82: 7, 350-363.
    • Denies that knowledge is any more important than justified true belief.
  • Kelly, Thomas (2003). Epistemic Rationality as Instrumental Rationality: A Critique. Philosophy and Phenomenological Research. 66: 3, 612-640.
    • Criticizes the instrumental conception of epistemic rationality, largely on the grounds that beliefs can be epistemically rational or irrational in cases where there is no epistemic goal which the subject desires to achieve.
  • Kornblith, Hilary (2002). Knowledge and its Place in Nature. Oxford: Clarendon Press of Oxford University Press.
    • Develops the idea that knowledge is a natural kind which ought to be studied empirically rather than through conceptual analysis. Grounds epistemic norms, including the truth-goal, in the fact that we desire anything at all.
  • Kvanvig, Jonathan (2003). The Value of Knowledge and the Pursuit of Understanding. Cambridge: Cambridge University Press.
    • Considers and rejects various arguments for the value of knowledge. Argues that understanding rather than knowledge is the primary epistemic value.
  • Kvanvig, Jonathan (2005). Truth is Not the Primary Epistemic Goal. In Matthias Steup and Ernest Sosa, eds. Contemporary Debates in Epistemology. Malden, MA: Blackwell. 285-296.
    • Criticizes epistemic value monism.
  • Kvanvig, Jonathan (2010). The Swamping Problem Redux: Pith and Gist. In Adrian Haddock, Alan Millar, and Duncan Pritchard, eds. Social Epistemology. Oxford: Oxford University Press. 89-112.
  • Lackey, Jennifer (2007). Why We Don’t Deserve Credit for Everything We Know. Synthese. 158: 3, 345-361.
  • Lackey, Jennifer (2009). Knowledge and Credit. Philosophical Studies. 142: 1, 27-42.
    • Lackey argues against the virtue-theoretic idea that when S knows that p, S’s getting a true belief is always creditable to S.
  • Littlejohn, Clayton (2012). Justification and the Truth-Connection. Cambridge: Cambridge University Press.
    • Contains an extended discussion of internalism and externalism, and argues against the instrumental conception of epistemic justification. Also argues that there are no false justified beliefs.
  • Littlejohn, Clayton (2018). The Right in the Good: A Defense of Teleological Non-Consequentialism. In: Kristoffer Ahlstrom-Vij and Jeffrey Dunn, eds. Epistemic Consequentialism. Oxford: Oxford University Press. 23-47.
  • Lynch, Michael (2004). True to Life: Why Truth Matters. Cambridge, Mass: MIT Press.
    • Argues for the objective value of true beliefs.
  • Lynch, Michael (2009). Truth, Value and Epistemic Expressivism. Philosophy and Phenomenological Research. 79: 1, 76-97.
    • Argues against expressivism and anti-realism about the value of true beliefs.
  • Madison, B.J.C. (2017). Epistemic Value and the New Evil Demon. Pacific Philosophical Quarterly. 98: 1, 89-107.
    • Argues that justification is valuable for its own sake, not just as a means to truth.
  • Maitzen, Stephen (1995). Our Errant Epistemic Aim. Philosophy and Phenomenological Research. 55: 4, 869-876.
    • Argues that if we take the epistemic goal to be achieving true beliefs and avoiding false ones, then all and only true beliefs will count as justified. Suggests that we need to adopt a different formulation of the goal.
  • Millikan, Ruth (1984). Language, Thought, and other Biological Categories. Cambridge, Mass.: MIT Press.
    • Develops and applies the selected-effect view of the proper functions of organs and traits.
  • Olsson, Erik (2009). In Defense of the Conditional Reliability Solution to the Swamping Problem. Grazer Philosophische Studien. 79: 1, 93-114.
  • Olsson, Erik (2011). Reply to Kvanvig on the Swamping Problem. Social Epistemology. 25: 2, 173-182.
  • Piller, Christian (2009). Valuing Knowledge: A Deontological Approach. Ethical Theory and Moral Practice. 12, 413-428.
  • Plantinga, Alvin (1993). Warrant and Proper Function. New York: Oxford University Press.
    • Develops a proper function analysis of knowledge.
  • Plato. Meno. Trans. G. M. A. Grube. In Plato, Complete Works. J. M. Cooper and D. S. Hutchinson, eds. Indianapolis and Cambridge: Hackett, 1997. 870-897.
  • Pritchard, Duncan. (2007). Recent Work on Epistemic Value. American Philosophical Quarterly. 44: 2, 85-110.
    • Survey article on problems of epistemic value. Distinguishes Primary, Secondary, and Tertiary value problems.
  • Pritchard, Duncan. (2008). Knowing the Answer, Understanding, and Epistemic Value. Grazer Philosophische Studien. 77, 325–39.
  • Pritchard, Duncan. (2009). Knowledge, Understanding, and Epistemic Value. Epistemology (Royal Institute of Philosophy Lectures). Ed. Anthony O’Hear. New York: Cambridge University Press. 19–43.
  • Pritchard, Duncan. (2010) Part I: Knowledge and Understanding. In Duncan Pritchard, Allan Millar, and Adrian Haddock, The Nature and Value of Knowledge: Three Investigations. Oxford: Oxford University Press.
  • Pritchard, Duncan. (2014). Truth as the Fundamental Epistemic Good. In: Jonathan Matheson and Rico Vitz, eds. The Ethics of Belief: Individual and Social. Oxford: Oxford University Press. 112-129.
  • Pritchard, Duncan. (2021). Intellectual Virtues and the Epistemic Value of Truth. Synthese. 198, 5515-5528.
  • Riggs, Wayne (2002). Reliability and the Value of Knowledge. Philosophy and Phenomenological Research. 64, 79-96.
  • Riggs, Wayne (2003). Understanding Virtue and the Virtue of Understanding. In Michael DePaul & Linda Zagzebski, eds. Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford University Press.
  • Riggs, Wayne (2008). Epistemic Risk and Relativism. Acta Analytica. 23: 1, 1-8.
  • Sartwell, Crispin (1991). Knowledge is Merely True Belief. American Philosophical Quarterly. 28: 2, 157-165.
  • Sartwell, Crispin (1992). Why Knowledge is Merely True Belief. The Journal of Philosophy. 89: 4, 167-180.
    • These two articles by Sartwell are the only places in contemporary epistemology where the view that knowledge is just true belief is seriously defended.
  • Sliwa, Paulina (2015). Understanding and Knowing. Proceedings of the Aristotelian Society. 115: 1, 57-74.
    • Defends the reductivist thesis that the various types of understanding (understanding a domain, understanding that p, understanding a person, and so on) are no different from the corresponding types of knowing.
  • Sosa, Ernest (2003). The Place of Truth in Epistemology. In Michael DePaul and Linda Zagzebski, eds. Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford: Clarendon Press; New York: Oxford University Press.
  • Sosa, Ernest (2007). A Virtue Epistemology: Apt Belief and Reflective Knowledge, Volume 1. Oxford: Clarendon Press; New York: Oxford University Press.
    • Sets out a virtue-theoretic analysis of knowledge. Distinguishes animal knowledge from reflective knowledge. Responds to dream-skepticism. Argues that true belief is the fundamental epistemic value.
  • Sylvan, Kurt (2018). Veritism Unswamped. Mind. 127: 506, 381-435.
    • Proposes that justification is non-instrumentally, but still derivatively, valuable.
  • Treanor, Nick (2014). Trivial Truths and the Aim of Inquiry. Philosophy and Phenomenological Research. 89: 3, 552-559.
    • Argues against an argument for the popular claim that some truths are more interesting than others. Points out that the standard comparisons between what are apparently more and less interesting true sentences are unfair, because the sentences might not involve or express the same number of true propositions.
  • Vahid, Hamid (2003). Truth and the Aim of Epistemic Justification. Teorema. 22: 3, 83-91.
    • Discusses justification and the epistemic goal. Proposes that accepting a diachronic formulation of the epistemic goal solves the problem raised by Stephen Maitzen (1995).
  • Weiner, Matthew (2009). Practical Reasoning and the Concept of Knowledge. In A. Haddock, A. Millar, and D. Pritchard, eds. Epistemic Value. Oxford: Oxford University Press. 163-182.
    • Argues that knowledge is valuable in the same way as a Swiss Army Knife is valuable. A Swiss Army Knife contains many different blades which are useful in different situations; they’re not always all valuable to have, but it’s valuable to have them all collected in one easy-to-carry package. Similarly, the concept of knowledge has a number of parts which are useful in different situations; they’re not always all valuable in all cases, but it’s useful to have them collected together in one easy-to-use concept.
  • Williamson, Timothy (2000). Knowledge and its Limits. Oxford: Oxford University Press.
    • Among many other things, Williamson sets out and defends knowledge-first epistemology, adopts a stability-based solution to the Primary Value Problem, and suggests that his view of knowledge as the most general factive mental state solves the Secondary Value Problem.
  • White, R. (2007). Epistemic Subjectivism. Episteme: A Journal of Social Epistemology. 4: 1, 115-129.
  • Whiting, Daniel (2012). Epistemic Value and Achievement. Ratio. 25, 216-230.
    • Argues against the view that the value of epistemic states in general should be thought of in terms of achievement (or success because of ability). Also argues against Pritchard’s achievement-account of the value of understanding in particular.
  • Wrenn, Chase (2017). True Belief Is Not (Very) Intrinsically Valuable. Pacific Philosophical Quarterly. 98, 108-128.
  • Zagzebski, Linda (2001). Recovering Understanding. In Knowledge, Truth, and Duty: Essays on Epistemic Justification, Responsibility, and Virtue. Ed. Matthias Steup. New York: Oxford University Press, 2001. 235–56.
  • Zagzebski, Linda (2003). The Search for the Source of Epistemic Good. Metaphilosophy. 34, 12-28.
    • Gives a virtue-theoretic explanation of knowledge and the value of knowledge. Claims that it is morally important to have true beliefs, when we are performing morally important actions. Claims that knowledge is motivated by a love of the truth, and explains the value of knowledge in terms of that love and the value of that love.
  • Zagzebski, Linda (2009). On Epistemology. Belmont, CA: Wadsworth.
    • Accessible introduction to contemporary epistemology and to Zagzebski’s preferred views in epistemology. Useful for students and professional philosophers.


Author Information

Patrick Bondy
Email: patrbondy@gmail.com
Cornell University
U. S. A.

The Problem of Induction

This article discusses the problem of induction, tracing its conceptual and historical development from Hume to Reichenbach. Given the prominence of induction in everyday life as well as in science, we should be able to tell whether inductive inference amounts to sound reasoning or not, or at least we should be able to identify the circumstances under which it ought to be trusted. In other words, we should be able to say what, if anything, justifies induction: are beliefs based on induction trustworthy? The problem(s) of induction, in their most general setting, reflect our difficulty in providing the required justifications.

Philosophical folklore has it that David Hume identified a severe problem with induction, namely, that its justification is either circular or question-begging. As C. D. Broad put it, Hume found a “skeleton” in the cupboard of inductive logic. What is interesting is that (a) induction and its problems were thoroughly debated before Hume; (b) Hume rarely spoke of induction; and (c) before the twentieth century, almost no one took it that Hume had a “problem” with induction, a.k.a. inductive scepticism.

This article tells the story of the problem(s) of induction, focusing on the conceptual connections and differences among the accounts offered by Hume and all the major philosophers that dealt with induction until Hans Reichenbach. Hence, after Hume, there is a discussion of what Kant thought Hume’s problem was. It moves on to the empiricist-vs-rationalist controversy over induction as it was instantiated by the views of J. S. Mill and W. Whewell in the nineteenth century.  It then casts light on important aspects of the probabilistic approaches to induction, which have their roots in Pierre Laplace’s work on probability and which dominated most of the twentieth century. Finally, there is an examination of important non-probabilistic treatments of the problem of induction, such as Peter Strawson’s view that the “problem” rests on a conceptual misunderstanding, Max Black’s self-supporting justification of induction, Karl Popper’s “anathema” of induction, and Nelson Goodman’s new riddle of induction.

Table of Contents

  1. Reasoning
      1. Two Kinds of Reasoning
      1. Deductive Reasoning
      2. Inductive Reasoning
    2. The Skeleton in the Cupboard of Induction
    3. Two Problems?
  2. What was Hume’s Problem?
    1. “Rules by which to judge causes and effects”
    2. The Status of the Principle of Uniformity of Nature
    3. Taking a Closer Look at Causal Inference
    4. Causal Inference is Non-Demonstrative
    5. Against Natural Necessity
      1. Malebranche on Necessity
      2. Leibniz on Induction
    6. Can Powers Help?
    7. Where Does the Idea of Necessity Come From?
  3. Kant on Hume’s Problem
    1. Hume’s Problem for Kant
    2. Kant on Induction
  4. Empiricist vs Rationalist Conceptions of Induction (After Hume and Kant)
    1. Empiricist Approaches
      1. John Stuart Mill: “The Problem of Induction”
      2. Mill on Enumerative Induction
      3. Mill’s Methods
      4. Alexander Bain: The “Sole Guarantee” of the Inference from a Fact Known to a Fact Unknown
    2. Rationalist Approaches
      1. William Whewell on “Collecting General Truths from Particular Observed Facts”
        1. A Short Digression: Francis Bacon
        2. Back to Whewell
      2. Induction as Conception
    3. The Whewell-Mill Controversy
      1. On Kepler’s Laws
      2. On the Role of Mind in Inductive Inferences
    4. Early Appeals to Probability: From Laplace to Russell via Venn
      1. Venn: Induction vs Probability
      2. Laplace: A Probabilistic Rule of Induction
      3. Russell’s Principle of Induction
  5. Non-Probabilistic Approaches
    1. Induction and the Meaning of Rationality
    2. Can Induction Support Itself?
      1. Premise-Circularity vs Rule-Circularity
      2. Counter-Induction?
    3. Popper Against Induction
    4. Goodman and the New Riddle of Induction
  6. Reichenbach on Induction
    1. Statistical Frequencies and the Rule of Induction
    2. The Pragmatic Justification
    3. Reichenbach’s Views Criticized
  7. Appendix
  8. References and Further Reading

1. Reasoning

a. Two Kinds of Reasoning

Reasoning in general is the process by which one draws conclusions from a set of premises. Reasoning is guided by rules of inference, that is, rules which entitle the reasoner to draw the conclusion, given the premises. There are, broadly speaking, two kinds of rules of inference and hence two kinds of reasoning: Deductive (or demonstrative) and Inductive (or non-demonstrative).

i. Deductive Reasoning

An inference is deductive when the rule it follows is logically valid. A logically valid argument is one whose premises are inconsistent with the negation of its conclusion: if the premises are true, the conclusion has to be true. Deductive arguments can be valid without being sound; a sound argument is a deductively valid argument whose premises are actually true. For example, the argument {All human beings are mortal; Madonna is a human being; therefore, Madonna is mortal} is valid, but whether it is sound depends on whether its premises are true: if at least one of them fails to be true, the argument is unsound. So soundness implies validity, whereas validity does not imply soundness. Logically valid rules of inference include modus ponens (if p then q; p; therefore, q), modus tollens (if p then q; not-q; therefore, not-p), the hypothetical and the disjunctive syllogism, and categorical syllogisms.

The essential property of a valid deductive argument is known as truth-transmission. This captures the fact that in a valid argument the truth of the premises is “transferred” to the conclusion: if the premises are true, the conclusion has to be true. Yet this feature comes at a price: deductive arguments are not content-increasing. The information contained in the conclusion is already present—albeit in an implicit form—in the premises. Thus, deductive reasoning is non-ampliative, or explicative, as the American philosopher Charles Peirce put it. The non-ampliative character of deductive reasoning has an important consequence regarding its function in language and thought: deductive reasoning merely unpacks the information content of the premises. In mathematics, for instance, the axioms of a theory contain all the information that proofs unravel into the theorems.

ii. Inductive Reasoning

Not all reasoning is deductive, however, for the simple reason that the truth of the premises of a deductive argument cannot, as a rule, be established deductively. As John Stuart Mill put it, “Truth can only be successfully pursued by drawing inferences from experience,” and these inferences are non-deductive. Following Mill, let us call Induction (with capital I) the mode of reasoning which moves from “particulars to generals,” or equivalently, the rule of inference in which “The conclusion is more general than the largest of the premises.” A typical example is enumerative Induction: if one has observed n As being B and no As being not-B, and if the evidence is sufficiently numerous and varied, then one should infer that “All As are B.”

Inductive arguments are logically invalid: the truth of the premises is consistent with the falsity of the conclusion. Thus, the rules of inductive inference are not truth-preserving, precisely because they are ampliative: the content of the conclusion of the argument exceeds (and hence amplifies) the content of its premises. A typical case is:

All observed individuals who have the property A also have the property B;

therefore, All individuals who have the property A also have the property B.

It is perfectly consistent with the premise that all observed individuals who have the property A also have the property B that there are some As (among the unobserved individuals) which are not B.

And yet, the logical invalidity of Induction is not, per se, reason for indictment. The conclusion of an ampliative argument is adopted on the basis that the premises offer some reason to accept it as true. The idea here is that the premises inductively support the conclusion, even if they do not prove it to be true. This is the outcome of the fact that Induction by enumeration is ampliative. It is exactly this feature of inductive inference that makes it useful for empirical sciences, where next-instance predictions or general laws are inferred on the basis of a finite number of observational or experimental facts.

b. The Skeleton in the Cupboard of Induction

Induction has a problem associated with it. In a nutshell, it is motivated by the following question: on what grounds is one justified in believing that the conclusion of an inductive inference is true, given the truth of its premises? The skeptical challenge to Induction is that any attempt to justify Induction, either by the lights of reason alone or by reason aided by (past) experience, will be circular and question-begging.

In fact, the problem concerns ampliative reasoning in general. Since the conclusion Q of an ampliative argument can be false, even though all of its premises are true, the following question arises: what makes it the case that the ampliative reasoning conveys whatever epistemic warrant the premises might have to the intended conclusion Q, rather than to its negation not-Q? The defender of ampliative reasoning will typically reply that Induction relies on some substantive and contingent assumptions (for example, that the world has a natural-kind structure, that the world is governed by universal regularities, or that the course of nature will remain uniform), assumptions which, if true, back up Induction in all cases. But the sceptic will retort that these very assumptions can only be established as true by means of ampliative reasoning. Arguing in a circle, the sceptic notes, is inevitable, and this simply means, she concludes, that the alleged defense carries no rational compulsion with it.

It is typically, but not quite rightly, accepted that the Problem of Induction was noted for the first time by David Hume in his A Treatise of Human Nature (1739). (For an account of Induction and its problem(s) before Hume, see Psillos 2015.) In section 2, this article discusses Hume’s version of the Problem of Induction (and his solution to it) in detail. For the time being, it is important to note that Hume’s Problem of Induction as it appears in standard textbooks, and in particular the thought that Induction needs a special justification, was formed as a distinct philosophical problem only in the twentieth century. It was expressed by C. D. Broad in an address delivered in 1926 at Cambridge on the occasion of Francis Bacon’s tercentenary. There, Broad raised the following question: “Did Bacon provide any logical justification for the principles and methods which he elicited and which scientists assume and use?” His reply is illuminating: “He did not, and he never saw that it was necessary to do so. There is a skeleton in the cupboard of Inductive Logic, which Bacon never suspected and Hume first exposed to view.” (1952: 142-3) This skeleton is the Problem of Induction. Another Cambridge philosopher, J. M. Keynes, explains in his A Treatise on Probability why Hume’s criticism of Induction never became prominent in the eighteenth and nineteenth centuries:

Between Bacon and Mill came Hume (…) Hume showed, not that inductive methods were false, but that their validity had never been established and that all possible lines of proof seemed equally unpromising. The full force of Hume’s attack and the nature of the difficulties which it brought to light were never appreciated by Mill, and he makes no adequate attempt to deal with them. Hume’s statement of the case against induction has never been improved upon; and the successive attempts of philosophers, led by Kant, to discover a transcendental solution have prevented them from meeting the hostile arguments on their own ground and from finding a solution along lines which might, conceivably, have satisfied Hume himself. (1921: 312-313)

c. Two Problems?

Indeed, hardly anyone mentions Hume’s name in relation to the Problem of Induction before the Cambridge Apostles, with the exception of John Venn (see section 4.4). Bertrand Russell, in his famous book The Problems of Philosophy in 1912, devoted a whole chapter to Induction (interestingly, without making any reference to Hume). There, he took it that there should be a distinction between two different issues, and hence two different types of justification that one may provide for Induction, a distinction “without which we should soon become involved in hopeless confusions.” (1912: 34) The first issue is a fact about human and animal lives, namely, that expectations about the future course of events or about hitherto unobserved objects are formed on the basis of (and are caused by) past uniformities. In this case, “The frequent repetition of some uniform succession or coexistence has been a cause of our expecting the same succession or coexistence on the next occasion” (ibid.). Thus, the justification (better put, exculpation) would be of the following sort: since, as a matter of fact, the mind works in such and such a way, we expect the conclusion of induction to be true. The second issue is about the justification of the inferences that underlie the transition from past regularities (or the hitherto observed pattern among objects) to a generalization (that is, to their extension to the future or to the hitherto unobserved). This second issue, Russell thought, revolves around the problem of whether there is “any reasonable ground for giving weight” to such expectations of uniformity after “the question of their validity has been raised.” (1912: 35) Hence, the Problem of Induction is a problem that arises upon reflection on a practice, namely, the practice of forming expectations about the future on the basis of whatever has happened in the past; or, in other words, the practice of learning from experience.

Later on, Karl Popper distinguished between the psychological problem of Induction, which can be formulated in terms of the following question: “How is it that nevertheless all reasonable people expect and believe that instances of which they have had no experience will conform to those of which they have had experience?” (Popper 1974: 1018), and the logical problem of Induction, which is expressed in the question: “Are we rationally justified in reasoning from repeated instances of which we have had experience to instances of which we have had no experience?” (ibid.)

To show the difference between the two types of problems, Popper (1974: 1019) referred to an example from Russell (1948): consider a person who, out of mental habit, does not follow the rules of inductive inference. If the only justification of the rule is based on how the mind works, we cannot explain why that person’s way of thinking is irrational. The only thing we can tell is that the person does not follow the way that most people think. Can we do better than that? Can we solve the logical problem of induction? And, more importantly, is there a logical problem to solve?

A fairly recent, but very typical, formulation from Gerhard Schurz clarifies the logical problem of Induction. The Problem of Induction is that:

There is no epistemic justification [of induction], meaning a system of arguments showing that inductive methods are useful or the right means for the purpose of acquiring true and avoiding false beliefs. […] Hume did not only say that we cannot prove that induction is successful or reliable; he argued that induction is not capable of any rational justification whatsoever. (2019: 7)

2. What was Hume’s Problem?

a. “Rules by which to judge causes and effects”

Suppose that you started to read Hume’s A Treatise of Human Nature from section XV of part III of book I, titled Rules by which to judge causes and effects. You read:

[…] There are no objects, which by the mere survey, without consulting experience, we can determine to be the causes of any other; and no objects, which we can certainly determine in the same manner not to be the causes. Any thing may produce any thing. Where objects are not contrary, nothing hinders them from having that constant conjunction, on which the relation of cause and effect totally depends. (1739: 173)

Fair enough, you may think. Hume claims that only experience can teach us what causes what, and without any reference to (prior) experience anything can be said to cause anything else to happen—meaning, no causal connections can be found with the lights of reason only. Reason imposes no constraints on what constant conjunctions among non-contrary (mutually exclusive) objects or properties there are in nature. Then, you read on: “Since therefore ’tis possible for all objects to become causes or effects to each other, it may be proper to fix some general rules, by which we may know when they really are so.”

Fair enough again, you may think. If only experience can teach us what constant conjunctions of objects there are in the world, then we had better have some ways to find out which among the possible constant conjunctions (possible if only Reason were in operation) are actual. And Hume does indeed go ahead to give eight rules, the first six of which are:

    1. The cause and effect must be contiguous in space and time;
    2. The cause must be prior to the effect;
    3. There must be a constant union between the cause and effect;
    4. The same cause always produces the same effect, and the same effect never arises but from the same cause;
    5. When several different objects produce the same effect, it must be by means of some quality, which is common amongst them;
    6. If two resembling objects produce different effects, then the difference in the effects must proceed from something in which the causes differ.

It is not the aim of this article to discuss these rules. Suffice it to say that they are hardly controversial. Rules 1 and 2 state that causes are spatio-temporally contiguous with, and temporally prior to, their effects. Rule 3 states that cause and effect form a regular succession. Rule 4, perhaps the most controversial, states a fundamental principle about causation (which encapsulates the principle of uniformity of nature), a principle which Mill defended too. Rules 5 and 6 are early versions of the methods of agreement and difference, which became central features of Mill’s epistemology of causation. Hume readily acknowledges that the application of these rules is not easy, since most natural phenomena are complex. But all this is very natural and is nowhere related to any Problem of Induction, apart from the issue of how to distinguish between good and bad inductive inferences.

There is something even more surprising in Hume’s Treatise. He notes:

’Tis certain, that not only in philosophy, but even in common life, we may attain the knowledge of a particular cause merely by one experiment, provided it be made with judgement, and after a careful removal of all foreign and superfluous circumstances. Now as after one experiment of this kind, the mind, upon the appearance of the cause or the effect, can draw an inference concerning the existence of its correlative; and as a habit can never be acquir’d merely by one instance; it may be thought that belief cannot in this case be esteem’d the effect of custom. (1739: 104-5)

Hume certainly allows that a single experiment may be enough for causal knowledge (which is always general), provided, as he says, the experiment is “made with judgement, and after a careful removal of all foreign and superfluous circumstances.” Now, strictly speaking, it makes no sense to say that in a single experiment “all foreign and superfluous circumstances” can be removed. A single experiment is a one-off act: it includes all the factors it actually does. To remove or change some factors (circumstances) is to change the experiment, or to perform a different, but related, one. So, what Hume has in mind when he says that we can draw causal conclusions from single experiments is that we have to perform a certain type of experiment a few times, each time removing or changing a certain factor, in order to see whether the effect is present (or absent) under the changed circumstances. In the end, it will be a single experiment that will reveal the cause. But this revelation will depend on having performed the type of experiment a few times, each under changed circumstances. Indeed, this thought is captured by Hume’s Rule 5 above. This rule urges the experimenter to remove the “foreign and superfluous circumstances” in a certain type of experiment by removing a factor each time it is performed until the common factor in all of them is revealed.

But Hume’s main concern in the quotation above is to resist the claim that generalizing on the basis of a single experiment is a special non-inductive procedure. He goes on to explain that even though in a certain case we may have to rely on a single experiment to form a general belief, we in fact rely on a principle for which we have “millions” of experiments in support: “That like objects, plac’d in like circumstance, will always produce like effects.” (1739: 105) So, when general causal conclusions are drawn from single experiments, this activity is “comprehended under [this higher-order] principle,” which is clearly a version of the Principle of Uniformity of Nature. This higher-order principle “bestows an evidence and firmness on any opinion, to which it can be apply’d.” (1739: 105)

Note that section XV of part III of book I reveals hardly any sign of inductive skepticism on Hume’s part. Instead, it offers methods for judging the circumstances under which Induction is legitimate.

b. The Status of the Principle of Uniformity of Nature

So, wherein lies Hume’s skepticism about Induction? Note, for a start, what he adds to what he has already said. This higher-order principle (the principle of uniformity of nature) is “habitual”; that is, it is the product of habit or custom and not of Reason. The status of this principle is the real issue that Hume is concerned with.

Hume rarely uses the term “induction,” but when he does use it, it is quite clear that he has in mind something like generalization on the basis of observing cases or instances. But on one occasion, in his Enquiry Concerning the Principles of Morals, he says something more:

There has been a controversy started of late, much better worth examination, concerning the general foundation of MORALS; whether they be derived from REASON, or from SENTIMENT; whether we attain the knowledge of them by a chain of argument and induction, or by an immediate feeling and finer internal sense. (1751: 170)

It seems that Hume contrasts “induction” to argument (demonstration); hence he seems to take it to be an inferential process based on experience.

With this in mind, let us discuss Hume’s “problem of induction.” In the Treatise, Hume aims to discover the locus of the idea of necessary connection, which is taken to be part of the idea of causation. One of the central questions he raises is this: “Why we conclude, that such particular causes must necessarily have such particular effects; and what is the nature of that inference we draw from the one to the other, and of the belief we repose in it?” (1739: 78)

When it comes to the inference from cause to effect, Hume’s approach is captivatingly simple. We have memory of past co-occurrences of types of events C and E, where Cs and Es have been directly perceived, or remembered to have been perceived. This co-occurrence is “a regular order of contiguity and succession” among tokens of C and tokens of E. (1739: 87) So, when in a fresh instance we perceive or remember a C, we “infer the existence” of an E. Although in all past instances of co-occurrence, both Cs and Es “have been perceiv’d by the senses and are remember’d,” in the fresh instance, E is not yet perceived, but its idea is nonetheless “supply’d in conformity to our past experience” (ibid.). He then adds: “Without any further ceremony, we call the one [C] cause and the other [E] effect, and infer the existence of the one from that of the other” (ibid.). What is important in this process of causal inference is that it reveals “a new relation betwixt cause and effect,” a relation that is different from contiguity, succession and necessary connection, namely, constant conjunction. It is this “CONSTANT CONJUNCTION” (1739: 87) that is involved in our “pronouncing” a sequence of events to be causal. Hume says that contiguity and succession “are not sufficient to make us pronounce any two objects to be cause and effect, unless we perceive, that these two relations are preserv’d in several instances” (ibid.). The “new relation” (constant conjunction) is a relation among sequences of events. Its content is captured by the claim: “Like objects have always been plac’d in like relations of contiguity and succession.” (1739: 88)

Does that mean that Hume identifies the sought-after necessary connection with the constant conjunction? By no means! The observation of a constant conjunction generates no new impression in the objects perceived. Hume points out that the mere multiplication of sequences of tokens of C being followed by tokens of E adds no new impressions to those we have had from observing a single sequence. Observing, for instance, a single collision of two billiard balls, we have impressions of the two balls, of their collision, and of their flying apart. These are exactly the impressions we have no matter how many times we repeat the collision of the balls. The impressions we had from the single sequence did not include any impression that would correspond to the idea of necessary connection. But since the observation of the multiple instances generates no new impressions in the objects perceived, it cannot possibly add a new impression which might correspond to the idea of necessary connection. As Hume puts it:

From the mere repetition of any past impression, even to infinity, there never will arise any new original idea, such as that of necessary connexion; and the number of impressions has in this case no more effect than if we confin’d ourselves to one only. (1739: 88)

The reason why constant conjunction is important (even though it cannot directly account for the idea of necessary connection by means of an impression) is that it is the source of the inference we make from causes to effects. Looking more carefully at this inference might cast some new light on what exactly is involved when we call a sequence of events causal. As he put it: “Perhaps ‘twill appear in the end, that the necessary connexion depends on the inference, instead of the inference’s depending on the necessary connexion.” (1739: 88)

c. Taking a Closer Look at Causal Inference

The inference of which Hume wants to unravel the “nature” is this: “After the discovery of the constant conjunction of any objects, we always draw an inference from one object to another.” (1739: 88) This, it should be noted, is what might be called an inductive inference. To paraphrase what Hume says, its form is:

(I)

(CC): A has been constantly conjoined with B (that is, all As so far have been followed by Bs)

(FI): a is A (a fresh instance of A)

Therefore, a is B (the fresh instance of A will be followed by a fresh instance of B).

Hume’s target is all those philosophers who think that this kind of inference is (or should be) demonstrative. In particular, his target is all those who think that the fresh instance of A must necessarily be followed by a fresh instance of B. Recall his question cited above: “Why we conclude, that such particular causes must necessarily have such particular effects.”

What, he asks, determines us to draw inference (I)? If it were Reason that determined us, then this would have to be a demonstrative inference: the conclusion would have to follow necessarily from the premises. But then an extra premise would be necessary, namely, “Instances, of which we have had no experience, must resemble those, of which we have had experience, and that the course of nature continues always uniformly the same” (ibid.).

Let us call this the Principle of Uniformity of Nature (PUN). If indeed this principle were added as an extra premise to (I), then the new inference:

(PUN-I)

(CC): A has been constantly conjoined with B (that is, all As so far have been followed by Bs)

(FI): a is A (a fresh instance of A)

(PUN): The course of nature continues always uniformly the same.

Therefore, a is B (the fresh instance of A will be followed by a fresh instance of B).

would be demonstrative and the conclusion would necessarily follow from the premises. Arguably, then, the logical necessity by means of which the conclusion follows from the premises would mirror the natural necessity by means of which causes bring about their effects (a thought already prevalent in Aristotle). But Hume’s point is that for (PUN-I) to be a sound argument, PUN needs to be provably true. There are two options here.

The first is that PUN is itself proved by a demonstrative argument. But this, Hume notes, is impossible since “We can at least conceive a change in the course of nature; which sufficiently proves that such a change is not absolutely impossible.” (1739: 89) What does the work here is Hume’s separability principle, namely, that if we can conceive A without conceiving B, then A and B are distinct and separate entities and one cannot be inferred from the other. Hence, since one can conceive the idea of past constant conjunction without having to conceive the idea of the past constant conjunction being extended into the future, these two ideas are distinct from each other. So, PUN cannot be demonstrated a priori by pure Reason. It is not a conceptual truth, nor a principle of Reason.

The other option is that PUN is proved by recourse to experience. But, Hume notes, any attempt to base the Principle of Uniformity of Nature on experience would be circular. From the observation of past uniformities in nature, it cannot be inferred that nature is uniform, unless it is assumed what was supposed to be proved, namely, that nature is uniform; that is, that there is “a resemblance betwixt those objects, of which we have had experience [i.e. past uniformities in nature] and those, of which we have had none [i.e. future uniformities in nature].” (1739: 90) In his first Enquiry, Hume is even more straightforward: “To endeavour, therefore the proof of this last supposition [that the future will be conformable to the past] by probable arguments, or arguments regarding existence, must evidently be going in a circle, and taking that for granted, which is the very point in question.” (1748: 35-6) As he explains in his Treatise, “The same principle cannot be both the cause and effect of another.” (1739: 89-90) PUN would be the “cause” (read: “premise”) of the “presumption of resemblance” between the past and the future, but it would also be the “effect” (read: “conclusion”) of that very presumption.

d. Causal Inference is Non-Demonstrative

What then is Hume’s claim? It is that (PUN-I) cannot be a demonstrative argument. Neither Reason alone, nor Reason “aided by experience,” can justify PUN, which is necessary for (PUN-I) to be demonstrative. Hence, causal inference—that is, (I) above—is genuinely non-demonstrative.

Hume summed up this point as follows:

Thus not only our reason fails us in the discovery of the ultimate connexion of causes and effects, but even after experience has inform’d us of their constant conjunction, ‘tis impossible for us to satisfy ourselves by our reason, why we shou’d extend that experience beyond those particular instances, which have fallen under our observation. We suppose, but are never able to prove, that there must be a resemblance betwixt those objects, of which we have had experience, and those which lie beyond the reach of our discovery. (1739: 91-92)

Note well Hume’s point: “We suppose but we are never able to prove” the uniformity of nature. Indeed, Hume goes on to add that there is causal inference in the form of (I), but it is not (cannot be) governed by Reason, but “by certain principles, which associate together the ideas of these objects, and unite them in the imagination” (1739: 92). These principles are general psychological principles of resemblance, contiguity and causation by means of which the mind works. Hume is adamant that the “supposition” of PUN “is deriv’d entirely from habit, by which we are determin’d to expect for the future the same train of objects, to which we have been accustom’d.” (1739: 134)

Hume showed that (I) is genuinely non-demonstrative. In summing up his view, he says:

According to the hypothesis above explain’d [his own theory] all kinds of reasoning from causes or effects are founded on two particulars, viz. the constant conjunction of any two objects in all past experience, and the resemblance of a present object to any one of them. (1739: 142)

In effect, Hume says that (I) supposes (but does not explicitly use) a principle of resemblance (PUN).

It is a nice question to wonder in what sense Hume’s approach is skeptical. For Hume does not deny that the mind engages in inductive inferences; he denies that these inferences are governed by Reason. To see the sense in which this is a skeptical position, let us think of someone who would reply to Hume by saying that there is more to Reason’s performances than demonstrative arguments. The thought could be that there is a sense in which Reason governs non-demonstrative inference, according to which the premises of a non-demonstrative argument give us good reasons to rationally accept the conclusion. Argument (I) above is indeed genuinely non-demonstrative, but there is still a way to show that it offers reasons to accept the conclusion. Suppose, for instance, that one argued as follows:

(R-I)

(CC): A has been constantly conjoined with B (that is, all As so far have been followed by Bs)

(FI): a is A (a fresh instance of A)

(R): CC and FI are reasons to believe that a is B

Therefore, (probably) a is B (the fresh instance of A will be followed by a fresh instance of B).

Following Stroud (1977: 59-65), it can be argued that Hume’s reaction to this would be that principle (R) cannot be a good reason for the conclusion. Not because (R) is not a deductively sufficient reason, but because any defense of (R) would be question-begging in the sense noted above. To say, as (R) in effect does, that a past constant conjunction between As and Bs is reason enough to make the belief in their future constant conjunction reasonable is just to assume what needs to be defended by further reason and argument.

Be that as it may, Hume’s so-called inductive skepticism is a corollary of his attempt to show that the idea of necessary connection cannot stem from the supposed necessity that governs causal inference. For, whichever way you look at it, talk of necessity in causal inference is unfounded.

e. Against Natural Necessity

In the Abstract, Hume considers a billiard-ball collision which is “as perfect an instance of the relation of cause and effect as any which we know, either by sensation or reflection” (1740: 649) and suggests we examine it. He concludes that experience dictates three features of the cause-effect relation: contiguity in time and place; priority of the cause in time; constant conjunction of the cause and the effect; and nothing further. However, as we have already seen, Hume did admit that, over and above these three features, causation involves the necessary connection of the cause and the effect.

The view that causation implies necessary connections between distinct existences had been the dominant one ever since Aristotle put it forward. It was tied to the idea that things possess causal powers, where power is “a principle of change in something else or in itself qua something else.” Principles are causes, hence powers are causes. Powers are posited for explanatory reasons—they are meant to explain activity in nature: change and motion. Action requires agency. For X to act on Y, X must have the (active) power to bring a change to Y, and Y must have the (passive) power to be changed (in the appropriate way) by X. Powers have modal force: they ground facts about necessity and possibility. Powers necessitate their effects: when a (natural) power acts (at some time and in the required way) and there is “contact” with the relevant passive power, the effect necessarily (that is, inevitably) follows. Here is Aristotle’s example: “And that that which can be hot must be made hot, provided the heating agent is there, i.e. comes near.” (324b8) (1985: 530)

i. Malebranche on Necessity

Before Hume, Father Nicolas Malebranche had emphatically rejected as “pure chimera” the idea that things have natural powers in virtue of which they necessarily behave the way they do. When someone says that, for instance, the fire burns by its nature, they do not know what they mean. For him, the very notion of such a “force,” “power,” or “efficacy,” was completely inconceivable: “Whatever effort I make in order to understand it, I cannot find in me any idea representing to me what might be the force or the power they attribute to creatures.” (1674-5: 658) Moreover, he challenged the view that there are necessary connections between worldly existences (either finite minds or bodies) based on the claim that the human mind can only perceive the existence of a necessary connection between God’s Will and his willed actions. In a famous passage in his La Recherche de la Vérité, he noted:

A true cause as I understand it is one such that the mind perceives a necessary connection between it and its effect. Now the mind perceives a necessary connection between the will of an infinite being and its effect. Therefore, it is only God who is the true cause and who truly has the power to move bodies. (1674-5: 450)

Drawing a distinction between real causes and natural causes (or occasions), he claimed that natural causes are merely the occasions on which God causes something to happen, typically by general volitions which are the laws of nature. Malebranche and, following him, a number of radical thinkers argued that a coherent Cartesianism should adopt occasionalism, namely, the view that (a) bodies lack motor force and (b) God acts on nature via general laws. Since, according to Cartesianism, a body’s nature is exhausted by its extension, Malebranche argued, bodies cannot have the power to move anything, and hence to cause anything to happen. He added, however, that precisely because causality involves a necessary connection between the cause and the effect, and since no such necessary connection is perceived in cases of alleged worldly causality (where, for instance, it is said that a billiard ball causes another one to move), there is no worldly causality: all there is in the world is regular sequences of events, which, strictly speaking, are not causal. Hume, as is well known, was very much influenced by Malebranche, to such an extent that Hume’s own approach can be described as Occasionalism minus God.

ii. Leibniz on Induction

But by the time of Hume’s Treatise, causal powers and necessary connections had been resuscitated by Leibniz. He distinguished between two kinds of necessity. Some principles are necessary because opposing them implies a contradiction. This is what he called “logical, metaphysical or geometrical” necessity. In Theodicy he associated this kind of necessity with the “‘Eternal Verities’, which are altogether necessary, so that the opposite implies contradiction.” But both in Theodicy and the New Essays on Human Understanding (composed at roughly the same time), he spoke of truths which are “only necessary by a physical necessity.” (1896: 588) These are not absolutely necessary in that they can be denied without contradiction. And yet they are necessary because, ultimately, they are based on the wisdom of God. In Theodicy Leibniz says that we learn these principles either a posteriori, based on experience, or “by reason and a priori, that is, by considerations of the fitness of things which have caused their choice.” (1710: 74) In the New Essays he states that these principles are known by Induction, and hence that physical necessity is “founded upon induction from that which is customary in nature, or upon natural laws which, so to speak, are of divine institution.” (1896: 588) Physical necessity constitutes the “order in Nature” and “lies in the rules of motion and in some other general laws which it pleased God to lay down for things when he gave them being.” (1710: 74) So, denying these principles entails that nature is disorderly (and hence unknowable).

Leibniz does discuss Induction in various places in his corpus. In his letter to Queen Sophie Charlotte of Prussia, On what is Independent of Sense and Matter in 1702, he talks of “simple induction,” and claims that it can never assure us of the “perfect generality” of a truth arrived at by it. He notes: “Geometers have always held that what is proved by induction or by example in geometry or in arithmetic is never perfectly proved.” (1989: 190) To be sure, in this particular context, he wants to make the point that mathematical truths are truths of reason, known either a priori or by means of demonstration. But his point about induction is perfectly general. The “senses and induction,” as he says, “can never teach us truths that are fully universal, nor what is absolutely necessary, but only what is, and what is found in particular examples.” (1989: 191) Since, however, Leibniz does not doubt that “We know universal and necessary truth in the sciences,” there must be a way of knowing them which is non-empirical. They are known by “an inborn light within us;” we have “derived these truths, in part, from what is within us” (ibid.).

In his New Essays, he allows that “Propositions of fact can also become general,” by means of “induction or observation.” For instance, he says, we can find out by Induction that “All mercury is evaporated by the action of fire.” But Induction, he thought, can never deliver more than “a multitude of similar facts.” In the mercury case, the generality achieved is never perfect, the reason being that “We can’t see its necessity.”  For Leibniz, only Reason can come to know that a truth is necessary: “Whatever number of particular experiences we may have of a universal truth, we could not be assured of it forever by induction without knowing its necessity through the reason.” (1896: 81)

For Leibniz, Induction, therefore, suffers from an endemic “imperfection.” But what exactly is the problem? In an early unpublished piece (Preface to an Edition of Nizolius, 1670), Leibniz offers perhaps his most systematic treatment of the problem of the imperfection of Induction.

The problem: Induction is essentially incomplete.

(1) Perfectly universal propositions can never be established on this basis [through collecting individuals or by induction] because “You are never certain in induction that all individuals have been considered.” (1989a: 129)

(2) Since, then, “No true universality is possible, it will always remain possible that countless other cases which you have not examined are different” (ibid.).

However, the following objection may be put forward: from the fact that entity A with nature N has regularly caused B in the past, we infer (with moral certainty) that universally entity A with nature N causes B. As Leibniz put it:

“Do we not say universally that fire, that is, a certain luminous, fluid, subtle body, usually flares up and burns when wood is kindled, even if no one has examined all such fires, because we have found it to be so in those cases we have examined?” (op.cit.)

“We infer from them, and believe with moral certainty, that all fires of this kind burn and will burn you if you put your hand to them.” (op.cit.)

Leibniz endorses this objection, and hence he does not aim to discredit Induction. Rather, he aims to ground it properly by asking: what is the basis for true universality? What is the basis for blocking the possibility of exceptions?

Leibniz’s reply is that the ground for true universality is the (truly universal) principle that nature is uniform. But this principle cannot itself depend on Induction, because that would lead to an infinite regress, and moral certainty would not be possible.

Induction yields at best moral (and not perfect) certainty. But this moral certainty:

Is not based on induction alone and cannot be wrested from it by main force but only by the addition or support of the following universal propositions, which do not depend on induction but on a universal idea or definition of terms:

(1) if the cause is the same or similar in all cases, the effect will be the same or similar in all;

(2) the existence of a thing which is not sensed is not assumed; and, finally,

(3) whatever is not assumed, is to be disregarded in practice until it is proved.

From these principles arises the practical or moral certainty of the proposition that all such fire burns…. (op.cit.)

So here is how we would reason “inductively” according to Leibniz.

(L)

Fires have so far burned.

Hence, (with moral certainty) “All fire burns.”

This inference rests on “the addition or support” of the universal proposition (1): “If the cause is the same or similar in all cases, the effect will be the same or similar in all.” In making this inference, we do not assume anything about fires we have not yet seen or touched (hence, we do not beg the question concerning unseen fires); instead, we prove something about unseen fires, namely, that they too burn.

Note Leibniz’s reference to the “addition or support” of proposition (1), which amounts to a Uniformity Principle. We may think of (L) as an elliptical demonstrative argument which requires the addition of (1), or we can think of it as a genuine inductive argument, “supported” by a Uniformity principle. In either case, the resulting generalization is naturally necessary, and hence truly universal, though the supporting uniformity principle is not metaphysically necessary. The resulting generalization (“All fire burns”) is known by “practical or moral certainty,” which rests on the three principles supplied by Reason.

It is noteworthy that Leibniz is probably the first to note explicitly that any attempt to justify the required principles by means of Induction would lead to an infinite regress, since if these principles were to be arrived at by Induction, further principles would be required for their derivation, “and so on to infinity, and moral certainty would never be attained.” (1989a: 130) So, these principles are regress-stoppers, and for them to play this role they cannot be inductively justified.

Let us be clear on Leibniz’s “problem of induction”: Induction is required for learning from experience, but experience cannot establish the universal necessity of a principle, which requires the uniform course of nature. If Induction is to be possible, it must be based on principles which are not founded on experience. It is Reason that supplies the missing rationale for Induction by providing the principles that are required for the “connection of the phenomena.” (1896: 422) Natural necessity is precisely this “connection of the phenomena” that Reason supplies and makes Induction possible.

Though Induction (suitably aided by principles of reason) can and does lead to moral certainty about matters of fact, only demonstrative knowledge is knowledge proper. And this, Leibniz concludes, can only be based on reason and the Principle of Non-Contradiction. But this is precisely the problem. For if this is the standard of knowledge, then even the basic principles by means of which induction can yield moral certainty cannot be licensed by the Principle of Non-Contradiction. So, the space is open for an argument to the effect that they are not, properly speaking, principles of reason.

f. Can Powers Help?

It is no accident, then, that Hume takes pains to show that the Principle of Uniformity of Nature is not a principle of Reason. What is even more interesting is that Hume makes an extra effort to block an attempt to give the Principle of Uniformity of Nature a certain metaphysical foundation, based on the claim that so-called physically necessary truths are made true by the causal powers of things. Here is how this metaphysical grounding would go: a certain object A has the power to produce an object B. If this were the case, then the necessity of causal claims would be a consequence of a power-based ontology, according to which “The power necessarily implies the effect.” (1739: 90) Hume even allowed that the positing of powers might be based on experience in the following sense: after having seen A and B being constantly conjoined, we conclude that A has the power to produce B. Either way, the relevant inference would run thus:

(P-I)

(CC): A has been constantly conjoined with B (that is, all As so far have been followed by Bs)

(P):  A has the power to produce B

(FI): a is A (a fresh instance of A)

Therefore, a is B (the fresh instance of A will be followed by a fresh instance of B).

Here is how Hume put it: “The past production implies a power: The power implies a new production: And the new production is what we infer from the power and the past production.” (1739: 90) If this argument were to work, PUN would be grounded in the metaphysical structure of the world, and, more particularly, in powers and their productive relations with their effects. Hume’s strategy against this argument is to show that even if powers were granted (which Hume in fact denies), (P-I) would be impotent as a demonstrative argument, since it would require proving that powers are future-oriented (namely, that a power which has manifested itself in a certain manner in the past will continue to manifest itself in the same way in the future), and this is a claim that neither reason alone nor reason aided by experience can prove.

g. Where Does the Idea of Necessity Come From?

Hume then denies necessity in the workings of nature. He criticizes Induction insofar as it is taken to be related to PUN, that is, insofar as it is meant to yield (naturally) necessary truths, based on Reason and past experiences. Here is how he summed it up:

That it is not reasoning which engages us to suppose the past resembling the future, and to expect similar effects from causes, which are, to appearance, similar. This is the proposition which I intended to enforce in the present section. (1748: 39)

Instead of being products of Reason, “All inferences from experience, therefore, are effects of custom.” (1748: 43)

For Hume, causality, as it is in the world, is regular succession of event-types: one thing invariably following another. His famous first definition of causality runs as follows:

We may define a CAUSE to be “An object precedent and contiguous to another, and where all the objects resembling the former are plac’d in like relations of precedency and contiguity to those objects, that resemble the latter.” (1739: 170)

And yet, Hume agrees that not only do we have the idea of necessary connection, but also that it is part of the concept of causation. As noted already, it would be wrong to think that Hume identified the necessary connection with the constant conjunction. After all, the observation of a constant conjunction generates no new impression in the objects perceived. What it does do, however, is cause a certain feeling of determination in the mind. After a point, the mind no longer treats the repeated, resembling sequences of tokens of C being followed by tokens of E as independent—the more it perceives, the more determined it is to expect that they will occur again in the future. This determination of the mind is the source of the idea of necessity and power: “The necessity of the power lies in the determination of the mind…” Hence, the alleged natural necessity is something that exists only in the mind, not in nature!

Instead of ascribing the idea of necessity to a feature of the natural world, Hume took it to arise from within the human mind when it is conditioned by the observation of a regularity in nature to form an expectation of the effect when the cause is present. Indeed, Hume offered a second definition of causality: “A CAUSE is an object precedent and contiguous to another, and so united with it, that the idea of the one determines the mind to form the idea of the other, and the impression of the one to form a more lively idea of the other.” (1739: 170) Hume thought that he had unpacked the “essence of necessity”: it “is something that exists in the mind, not in the objects.” (1739: 165) He claimed that the supposed objective necessity in nature is spread by the mind onto the world. Hume can be seen as offering an objective theory of causality in the world (since causation amounts to regular succession), accompanied, however, by a mind-dependent view of necessity.

3. Kant on Hume’s Problem

Kant, rather bravely, acknowledged in the Prolegomena that “The remembrance of David Hume was the very thing that many years ago first interrupted my dogmatic slumber and gave a completely different direction to my researches in the field of speculative philosophy.” (1783: 10) In fact, his magnum opus, the Critique of Pure Reason, was “the elaboration of the Humean problem in its greatest possible amplification.”

a. Hume’s Problem for Kant

But what was Hume’s problem for Kant? It was not inductive skepticism and the like. Rather, it was the origin and justification of necessary connections among distinct and separate existences. Hume, Kant noted, “indisputably proved” that Reason cannot be the foundation of the judgment that “Because something is, something else necessarily must be.” (B 288) But that is exactly what the concept of causation says. Hence, the very idea of causal connections, far from being introduced a priori, is the “bastard” of imagination and experience which, ultimately, disguises mere associations and habits as objective necessities.

Kant took it upon himself to show that the idea of necessary connections is a synthetic a priori principle and hence that it has “an inner truth independent of all experience.” Synthetic a priori truths are not conceptual truths of reason; rather, they are substantive claims which are necessary and are presupposed for the very possibility of experience. Kant tried to demonstrate that the principle of causality, namely, “Everything that happens, that is, begins to be, presupposes something upon which it follows by rule” (A 189), is a precondition for the very possibility of objective experience.

He took the principle of causality to be a requirement for the mind to make sense of the temporal irreversibility in certain sequences of impressions. So, whereas we can have the sequence of impressions that correspond to the sides of a house in any order we please, the sequence of impressions that correspond to a ship going downstream cannot be reversed: it exhibits a certain temporal order (or direction). This temporal order by which certain impressions appear can be taken to constitute an objective happening only if the later event is taken to be necessarily determined by the earlier one (that is, to follow by rule from its cause). For Kant, objective events are not “given”: they are constituted by the organizing activity of the mind and, in particular, by the imposition of the principle of causality on the phenomena. Consequently, the principle of causality is, for Kant, a synthetic a priori principle.

b. Kant on Induction

What about Induction then? Kant distinguished between two kinds of universality when it comes to judgements (propositions): strict and comparative. Comparative universal propositions are those that derive from experience and are made general by Induction. An inductively arrived at proposition is liable to exceptions; it comes with the proviso, as Kant put it: “As far as we have yet perceived, there is no exception to this or that rule.” (B 4) Strictly universal propositions are thought of without being liable to any exceptions. Hence, they are not derived from experience or by induction. Rather, as Kant put it, they are “valid absolutely a priori.” That is an objective distinction, Kant thought, which we discover rather than invent. Strictly universal propositions are essentially so. For Kant, strict universality and necessity go together, since experience can teach us how things are but not that they could not be otherwise. Hence, strictly universal propositions are necessary propositions, while comparatively universal propositions are contingent. Necessity and strict universality are then the marks of a priority, whereas comparative universality and contingency are the marks of empirical-inductive knowledge. Naturally, Kant is not a sceptic about inductive knowledge; yet he wants to demarcate it properly from a priori knowledge: “[Rules] cannot acquire anything more through induction than comparative universality, i.e., widespread usefulness.” (A92/B124) It follows that the concept of cause “must be grounded completely a priori in the understanding,” precisely because experience can only show a regular succession of events A and B, and never that event B must follow from A. As Kant put it: “To the synthesis of cause and effect there [the rule] attaches a dignity that can never be expressed empirically, namely, that the effect does not merely come along with the cause, but is posited through it and follows from it.” (A91/B124)

Not only is there no Problem of Induction in Kant; he also discussed Induction in his various lectures on Logic. In the so-called Blomberg Logic (dating back to the early 1770s) he noted of Induction that it is indispensable (“We cannot do without it”) and that it yields knowledge (were we to abolish it, “Along with it most of our cognitions would have to be abolished at the same time”), despite the fact that it is non-demonstrative. Induction is a kind of inference where “We infer from the particular to the universal.” (1992: 232) It is based on the following rule: “What belongs to as many things as I have ever cognized must also belong to all things that are of this species and genus.” Natural kinds have properties shared by all of their members; hence, if a property P has been found to be shared by all examined members of kind K, then the property P belongs to all members of K.

Now, a principle like this is fallible, as Kant knew very well. Not all properties of an individual are shared by all of its fellow kind members; only those that are constitutive of the kind. But what are they? It was partly to highlight this problem that Kant drew the distinction between “empirical universality” (what in the Critique he called “comparative universality”) and “rational” or “strict” universality, in which a property is attributed to all things of a kind without the possibility of exception. For instance, the judgement “All matter is extended” is rationally universal whereas the judgement “All matter has weight” is empirically universal. All and only empirically universal propositions are formed by Induction; hence they are uncertain. And yet, as already noted, Induction is indispensable, since “Without universal rules we cannot draw a universal inference.” (1992: 409) In other words, if our empirical knowledge is to be extended beyond the past and the seen, we must rely on Induction (and analogy). They are “inseparable from our cognitions, and yet errors for the most part arise from them.” Induction is a fallible “crutch” to human understanding.

Later on, this “crutch” was elevated to the “reflective power of judgement.” In his third Critique (Critique of Judgement) Kant focused on the power of judgement, where judgement is a cognitive faculty, namely, that of subsuming the particular under the universal. The power of judgement is reflective, as opposed to determining, when the particular is known and the universal (the rule, the law, the principle) is sought. Hence, the reflective power of judgement denotes the inductive use of judgement, that is, looking for laws or general principles under which the particulars can be subsumed. These laws will never be known with certainty; they are empirical laws. But, as Kant admits, they can be tolerated in empirical natural science. Uncertainty in pure natural science, as well as in metaphysics, of course cannot be tolerated. Hence, knowledge proper must be grounded in the apodictic certainty of synthetic a priori principles, such as the causal maxim. Induction can only be a crutch for human reason and understanding, but, given that we (are bound to) learn from experience, it is an indispensable crutch.

4. Empiricist vs Rationalist Conceptions of Induction (After Hume and Kant)

a. Empiricist Approaches

i. John Stuart Mill: “The Problem of Induction”

It might be ironic that John Stuart Mill was the first to speak of “the problem of Induction.” (1879: 228) But by this he meant the problem of distinguishing between good and bad inductions. In particular, he thought that there are cases in which a single instance might be enough for “a complete induction,” whereas in other cases, “Myriads of concurring instances, without a single exception known or presumed, go such a very little way towards establishing an universal proposition.” Solving this problem, Mill suggested, amounts to solving the Problem of Induction.

Mill took Induction to be both a method of generating generalizations and a method of proving they are true. In his System of Logic, first published in 1843, he defined Induction as “The operation of discovering and proving general propositions” (1879: 208). As a nominalist, he thought that “generals”—what many of his predecessors had thought of as universals—are collections of particulars “definite in kind but indefinite in number.” So, Induction is the operation of discovering and proving relations among (members of) kinds—where kinds are taken to be characterized by relations of resemblance “in certain assignable respects” among their members. The basic form of Induction, then, is by enumeration: “This and that A are B, therefore every A is B.” The key point behind enumerative Induction is that it cannot be paraphrased as a conjunction of instances. It yields “really general propositions,” namely, propositions such that the predicate is affirmed or denied of “an unlimited number of individuals.” Mill was ready to add that this unlimited number of individuals includes actual and possible instances of a generalization, “existing or capable of existing.” This suggests that inductive generalizations have modal or counterfactual force: if All As are B, then if a were an A it would be a B.

It is then important for Mill to show how Induction acquires this modal force. His answer is tied to his attempt to distinguish between good and bad inductions and connects good inductions with establishing (and latching onto) laws of nature. But there is a prior question to be dealt with, namely, what is the “warrant” for Induction? (1879: 223) Mill makes no reference to Hume when he raises this issue. But he does take it that the root of the problem of the warrant for Induction is the status of the Principle of Uniformity of Nature. This is a principle according to which “The universe, so far as known to us, is so constituted, that whatever is true in any one case, is true in all cases of a certain description; the only difficulty is, to find what description.” (1879: 223)

This, he claims, is “a fundamental principle, or general axiom, of Induction” (1879: 224) and yet, it is itself an empirical principle (a generalization itself based on Induction): “This great generalization is itself founded on prior generalizations.” If this principle were established and true, it could appear as a major premise in all inductions; hence all inductions would turn into deductions. But how can it be established? For Mill there is no other route to it than experience: “I regard it as itself [the Principle of Uniformity of Nature] a generalization from experience.” (1879: 225) Mill claims that the Principle of Uniformity of Nature emerges as a second-order induction over successful first-order inductions, the successes of which support each other and the general principle.

There may be different ways to unpack this claim, but it seems that the most congenial to Mill’s own overall strategy is to note that past successes of inductions offer compelling reasons to believe that there is uniformity in nature. In a lengthy footnote (1879: 407) in which he aimed to tackle the standard objection attributed to Reid and Stewart that experience gives us knowledge only of the past and the present but never of the future, he stressed: “Though we have had no experience of what is future, we have had abundant experience of what was future.” Differently put, there is accumulated future-oriented evidence for uniformity in nature. Induction is not a “leap in the dark.”

In another lengthy footnote, this time in his An Examination of Sir William Hamilton’s Philosophy (1865: 537), he favored a kind of reflective equilibrium justification of PUN. After expressing his dismay at the constant reminder that “The uniformity of the course of nature cannot be itself an induction, since every inductive reasoning assumes it, and the premise must have been known before the conclusion,” he stressed that those who are moved by this argument have missed the point of the continuous “giving and taking, in respect of certainty” between PUN and “all the narrower truths of experience”—that is, all first-order inductions. This “reciprocity” mutually enhances the certainty of PUN and the certainty of first-order inductions. In other words, first-order inductions support PUN, but having been supported by them, PUN, in its turn, “raises the proof of them to a higher level.”

ii. Mill on Enumerative Induction

Recall that in formulating the Principle of Uniformity of Nature, Mill takes it to be a principle about the “constitution” of the universe, being such that it contains regularities: “Whatever is true in any one case, is true in all cases of a certain description.” But he meaningfully adds: “The only difficulty is, to find what description,” which should be taken to imply that the task of inductive logic is to find the regularities there are in the universe and that this task is not as obvious as it may sound, since finding the kinds (that is, the description of collections of individuals) that fall under certain regularities is far from trivial and may require extra methods. Indeed, though Mill thinks that enumerative induction is indispensable as a form of reasoning (since true universality in space and time can be had only through it, if one starts from experience, as Mill recommends), he also thinks that various observed patterns in nature may not be as uniform as a simple operation of enumerative induction would imply.

To Europeans, not many years ago, the proposition, All swans are white, appeared an equally unequivocal instance of uniformity in the course of nature. Further experience has proved (…) that they were mistaken; but they had to wait fifty centuries for this experience. During that long time, mankind believed in a uniformity of the course of nature where no such uniformity really existed. (1879: 226)

The “true theory of induction” should aim to find the laws of nature. As Mill says:

Every well-grounded inductive generalization is either a law of nature, or a result of laws of nature, capable, if those laws are known, of being predicted from them. And the problem of Inductive Logic may be summed up in two questions: how to ascertain the laws of nature; and how, after having ascertained them, to follow them into their results. (1879: 231)

The first question—much more significant in itself—requires the introduction of new methods of Induction, namely, methods of elimination. Here is the rationale behind these methods:

Before we can be at liberty to conclude that something is universally true because we have never known an instance to the contrary, we must have reason to believe that if there were in nature any instances to the contrary, we should have known of them. (1879: 227)

Note the counterfactual claim behind Mill’s assertion: enumerative Induction on its own (though ultimately indispensable) cannot yield the modal force required for empirical generalizations that can be deemed laws of nature. What is required are methods which would show how, were there exceptions, they could be (or would have been) found. Given these methods, Induction acquires modal force: in a good induction—that is, in an induction such that if there were negative instances, they would have been found—the conclusion is not just “All As are B”; implicit in it is the further claim: if there were an extra A, it would be B.

iii. Mill’s Methods

These methods are Mill’s famous methods of agreement and difference, which he presents as methods of Induction (1879: 284).

Suppose that we know of a factor C, and we want to find out its effect. We vary the factors we conjoin with C and examine what the effects are in each case. Suppose that, in a certain experiment, we conjoin C with A and B, and what follows is abe. Then, in a new experiment, we conjoin C, not with A and B, but with D and F, and what follows is dfe. Both experiments agree only on the factor C and on the effect e. Hence, the factor C is the cause of the effect e. AB is not the cause of e since the effect was present even when AB was absent. Nor is DF the cause of e since e was present when DF was absent. This is then the Method of Agreement. The cause is the common factor in a number of otherwise different cases in which the effect occurs. As Mill put it: “If two or more instances of the phenomenon under investigation have only one circumstance in common, the circumstance in which alone all the instances agree is the cause (or effect) of the given phenomenon.” (1879: 280) The Method of Difference proceeds in an analogous fashion. Suppose that we run an experiment, and we find that an antecedent ABC has the effect abe. Suppose also that we run the experiment once more, this time with AB only as the antecedent factors. So, factor C is absent. If, this time, we only find the part ab of the effect, that is, if e is absent, we conclude that C was the cause of e. In the Method of Difference, then, the cause is the factor that is different in two cases, which are similar except that in the one the effect occurs, while in the other it does not. In Mill’s words:

If an instance in which the phenomenon under investigation occurs, and an instance in which it does not occur, have every circumstance in common save one, that one occurring only in the former; the circumstance in which alone the two instances differ is the effect, or the cause, or an indispensable part of the cause, of the phenomenon. (1879: 280)

It is not difficult to see that what Mill has described are cases of controlled experiments. In such cases, we find causes (or effects) by creating circumstances in which the presence (or the absence) of a factor makes the only difference to the production (or the absence) of an effect. The effect is present (or absent) if and only if a certain causal factor is present (or absent). Mill is adamant that his methods work only if certain metaphysical assumptions are already in place. First, it must be the case that events have causes. Second, it must be the case that events have a limited number of possible causes. In order for the eliminative methods he suggested to work, it must be the case that the number of causal hypotheses considered is relatively small. Third, it must be the case that same causes have same effects, and conversely. Fourth, it must be the case that the presence or absence of causes makes a difference to the presence or absence of their effects. Indeed, Mill (1879: 279) made explicit reference to two “axioms” on which his two Methods depend. The axiom for the Method of Agreement is this:

Whatever circumstances can be excluded, without prejudice to the phenomenon, or can be absent without its presence, is not connected with it in the way of causation. The casual circumstance being thus eliminated, if only one remains, that one is the cause we are in search of: if more than one, they either are, or contain among them, the cause…. (ibid.)

The axiom for the Method of Difference is:

Whatever antecedent cannot be excluded without preventing the phenomenon, is the cause or a condition of that phenomenon: Whatever consequent can be excluded, with no other difference in the antecedent than the absence of the particular one, is the effect of that one. (1879: 280)

What is important to stress is that although only a pair of (or even just a single) carefully controlled experiment(s) might get at the causes of certain effects, what, for Mill, makes this inference possible is that causal connections and laws of nature are embodied in regularities—and these, ultimately, rely on enumerative induction.
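The eliminative logic of the two Methods is algorithmic enough to be rendered schematically. The following sketch is not from Mill; the factor names, trial representation, and function names are purely illustrative. Each trial is modeled as a set of antecedent factors paired with whether the effect occurred, and the two Methods become simple set operations:

```python
# Illustrative sketch (not from Mill): his two eliminative Methods
# as set operations over observed trials.
# A trial is a pair (antecedent_factors, effect_present).

def method_of_agreement(trials):
    """Method of Agreement: the candidate cause is the circumstance
    common to every otherwise different trial in which the effect occurs."""
    positive = [factors for factors, effect in trials if effect]
    if not positive:
        return set()
    common = set(positive[0])
    for factors in positive[1:]:
        common &= set(factors)  # keep only factors present in all positive trials
    return common

def method_of_difference(trial_with, trial_without):
    """Method of Difference: given two trials alike in every circumstance
    save one, with the effect present in the first and absent in the second,
    the differing factor is the cause (or an indispensable part of it)."""
    (factors_with, effect_with), (factors_without, effect_without) = trial_with, trial_without
    assert effect_with and not effect_without
    return set(factors_with) - set(factors_without)

# Agreement: C conjoined with {A, B} and with {D, F} both yield the effect.
print(method_of_agreement([({"A", "B", "C"}, True), ({"D", "F", "C"}, True)]))
# Difference: ABC yields the effect, AB alone does not.
print(method_of_difference(({"A", "B", "C"}, True), ({"A", "B"}, False)))
```

Both calls single out factor C, mirroring Mill’s own ABC/abe example above. The sketch also makes vivid why Mill’s metaphysical assumptions matter: the elimination only works if the candidate factors are few and exhaustively listed, and if same causes have same effects across trials.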

iv. Alexander Bain: The “Sole Guarantee” of the Inference from a Fact Known to a Fact Unknown

The Millian Alexander Bain (1818-1903), Professor of Logic in the University of Aberdeen, in his Logic: Deductive and Inductive (1887), undertook the task of explaining the role of the Principle of Uniformity of Nature in Inductive Logic. He took this principle to be the “sole guarantee” of the inference from a fact known to a fact unknown. He claimed that when it comes to uniformities of succession, the Law of Cause and Effect, or Causation, is a version of the PUN: “Every event is uniformly preceded by some other event. To every event there is some antecedent, which happening, it will happen.” (1887: 20) He actually took it that this particular formulation of PUN has an advantage over more controversial modal formulations of the Principle, such as “every effect must have a cause.” The advantage is precisely that this is a non-modal formulation of the Principle in that it states a meta-regularity.

Bain’s treatment of Induction is interesting, because he takes it that induction proper should be incomplete—that is, it should not enumerate all relevant instances or facts, because then it would yield a summation and not a proper generalization. For Bain, Induction essentially involves the move from some instances to a generalization because only this move constitutes an “advance beyond” the particulars that prompted the Induction. In fact, the scope of an inductive generalization is sweeping. It involves:

The extension of the concurrence from the observed to the unobserved cases—to the future which has not yet come within observation, to the past before observation began, to the remote where there has been no access to observe. (1887: 232)

And precisely because of this sweeping scope, Induction involves a “leap” which is necessary to complete the process. This leap is “the hazard of Induction,” which is, however, inevitable as “an instrument for multiplying and extending knowledge.” So, Induction has to be completed in the end, in that the generalization it delivers expresses “what is conjoined everywhere, and at all times, superseding for ever the labour of fresh observation.” But it is not completed through enumeration of particulars; rather, the completion is achieved by PUN.

Bain then discusses briefly “a more ambitious form of the Inductive Syllogism” offered by Henry Aldrich and Richard Whately in the Elements of Logic (1860). According to this, a proper Induction has the following form:

The magnets that I have observed, together with those that I have not observed, attract iron.

These magnets are all magnets.

All magnets attract iron.

Bain says that this kind of inference begs the question, since it assumes what needs to be proved, namely, that the unobserved magnets attract iron. As he says: “No formal logician is entitled to lay down a premise of this nature.” (1887: 234)

Does, however, the very same problem not arise for Bain’s PUN? Before we attempt to answer this, let us address a prior question: how many instances are required for a legitimate generalization? Here Bain states what he calls the principle of Universal Agreement, which he takes to be the sole evidence for inductive truth. According to this principle, “We must go through the labour of a full examination of instances, until we feel assured that our search is complete, that if contrary cases existed, they must have been met with.” (1887: 276) Note that the application of this principle does not require exhaustive enumeration—rather, it requires a careful search for negative instances. Once this search has been conducted thoroughly, Bain claims that the generalization can be accepted as true (until exceptions are discovered) based on the further claim that “What has never been contradicted (after sufficient search) is to be received as true.” (1887: 237) This kind of justification is not obvious. But it does point to the view that beliefs are epistemically innocent until proven guilty. It is a reflexive principle in that it urges the active search for counter-instances.

Bain accepts the Millian idea that PUN is “the ultimate major premise of every inductive inference.” (1887: 238) The thought here is that an argument of the following form would be a valid syllogism:

All As observed so far have been B

What has been in the past will continue

Therefore, the unobserved As are B.

What then is the status of PUN itself? Bain takes it to be a Universal Postulate. Following Spencer, he does not take it that a Universal Postulate has to be a logical or conceptual truth. That is, a Universal Postulate does not have to be such that it cannot be denied without contradiction. Rather, he takes it that a Universal Postulate is an ultimate principle on which all reasoning of a sort should be based. As such, it is a Principle such that some might say it begs the question, while others might say that it has to be granted for reasoning to be possible. But this dual stance is exactly what is expected when it comes to ultimate principles. And that is why he thinks that, unlike Aldrich and Whately’s case above, his own reliance on PUN is not necessarily question begging.

Besides, unlike Aldrich and Whately, Bain never asserts indiscriminately that whatever holds of the observed As also holds of the unobserved As. (Recall Aldrich and Whately’s premise above: The magnets that I have observed, together with those that I have not observed, attract iron.) Bain, taking a more cautious stance towards PUN, talks about uniformities as opposed to Uniformity. We have evidence for uniformities in nature, and these are the laws of nature, according to Bain. More importantly, however, we have evidence for exceptions in natural uniformities. This “destructive evidence,” Bain says, entitles us to accept the uniformities for which no destructive evidence has been found, despite our best efforts to find it. As he put it:

We go forward in blind faith, until we receive a check; our confidence grows with experience; yet experience has only a negative force, it shows us what has never been contradicted; and on that we run the risk of going forward in the same course. (1887: 672)

So PUN—in the form “What has never been contradicted in any known instance (there being ample means and opportunities of search) will always be true”—is an Ultimate Postulate, which, however, is not arbitrary in that there is ample evidence for and lack of destructive evidence against uniformities in nature.

In fact, Bain takes PUN to be an Ultimate Postulate, alongside the Principle of Non-Contradiction. Here is how he puts it:

The fact, generally expressed as Nature’s Uniformity, is the guarantee, the ultimate major premise, of all Induction. ‘What has been, will be’, justifies the inference that water will assuage thirst in after times. We can give no reason, or evidence, for this uniformity; and, therefore, the course seems to be to adopt this as the finishing postulate. And, undoubtedly, there is no other issue possible. We have a choice of modes of expressing the assumption, but whatever be the expression, the substance is what is conveyed by the fact of Uniformity. (1887: 671)

Does that mean that Bain takes it that PUN is justified as a premise to all inductive inference? Strikingly, he takes the issue to be practical as opposed to theoretical. He admits that it can be seen as question begging from the outset but claims that it is folly to try to avoid this charge by proposing reasons for its justification. For,

If there be a reason, it is not theoretical, but practical. Without the assumption, we could not take the smallest steps in practical matters; we could not pursue any object or end in life. Unless the future is to reproduce the past, it is an enigma, a labyrinth. Our natural prompting is to assume such identity, to believe it first, and prove it afterwards. (1887: 672)

Bain then presages the trend to offer practical or pragmatic “justifications” of Induction.

b. Rationalist Approaches

i. William Whewell on “Collecting General Truths from Particular Observed Facts”

William Whewell (1794-1866) was perhaps the most systematic writer on Induction after Francis Bacon.

1. A Short Digression: Francis Bacon

In his Novum Organum (1620), Bacon spoke of “inductio legitima et vera” in order to characterize his own method. The problem, Bacon thought, lay with the way Induction was supposed to proceed, namely, via simple enumeration without taking “account of the exceptions and distinctions that nature is entitled to.” Having the Aristotelians in mind, he called enumerative Induction “a childish thing” in that it “jumps to conclusions, is exposed to the danger of instant contradiction, observes only familiar things and reaches no result.” (2000: 17) His new form of Induction differed from Aristotle’s (and that of his predecessors in general) in the following: it is a general method for arriving at all kinds of general truths (not just the first principles, but also the “lesser middle axioms,” as he put it); it surveys not only affirmative or positive instances, but also negative ones. It therefore “separate(s) out a nature through appropriate rejections and exclusions.” (2000: 84)

As is well-known, Bacon’s key innovation was that he divided his true and legitimate Induction into three stages, only the third of which was Induction. Stage I is experimental and natural history: a complete inventory of all instances of natural things and their effects. Here, observation and experiment rule. Then at Stage II, tables of presences, absences and degrees of comparison are constructed. Finally, Stage III is Induction. Whatever is present when the nature under investigation is present or absent when this nature is absent or decreases when this nature decreases and conversely, is the form of this nature.

What is really noteworthy is that in denying that all instances have to be surveyed, Bacon reconceptualised how particulars give rise to the universal. By taking a richer view about experience, he did not have to give to the mind a special role in bridging the gap between the particulars and the general.

2. Back to Whewell

Whewell was a central figure of Victorian science. He was among the founders of the British Association for the Advancement of Science, a fellow of the Royal Society, president of the Geological Society, and Master of Trinity College, Cambridge. He was elected Professor of Mineralogy in 1828, and of Moral theology in 1837. Whewell coined the word “scientist” in 1833.

In The Philosophy of the Inductive Sciences, Founded Upon Their History (1840), he took Induction to be the “common process of collecting general truths from particular observed facts,” (1840 v.1: 2) which is such that, as long as it is “duly and legitimately performed,” it yields real substantial truth. Inductive truths are not demonstrative truths. They are “proved, like the guess which answers a riddle, by [their] agreeing with the facts described;” (1840 v.1: 23) they capture relations among existing things and not relations among ideas. They are contingent and not necessary truths. (1840 v.1: 57)

Whewell insisted that experience can never deliver (and justify) necessary truths. Knowledge derived from experience “can only be true as far as experience goes, and can never contain in itself any evidence whatever of its necessity.” (1840 v.1: 166) What is the status of a principle such as “Every event must have a cause”? Of this principle, Whewell argues that it is “rigorously necessary and universal.” Hence, it cannot be based on experience. This kind of principle, which Whewell re-describes as a principle of invariable succession of the form “Every event must have a certain other event invariably preceding it,” is required for inductive extrapolation. Given that we have seen a stone ascending after it was thrown upwards, we have no hesitation in concluding that another stone that is thrown upwards will ascend. Whewell argues that for this kind of judgement to be possible, the mind should take it that there is a connection between the invariably related events and not a mere succession. And then he concludes that “The cause is more than the prelude, the effect is more than the sequel, of the fact. The cause is conceived not as a mere occasion; it is a power, an efficacy, which has a real operation.” (1840 v.1: 169)

This is a striking observation because it introduces a notion of natural necessity between the cause, qua power, and the effect. But this only accentuates the problem of the status of the principle “Every event must have a cause.” For the latter is supposed to be universal and necessary—logically necessary, that is. The logical necessity which underwrites this principle is supposed to give rise to the natural necessity by means of which the effect follows from the cause. In the end, logical and natural necessity become one. And if necessary truths such as the above cannot be known from experience, how are they known?

In briefly recounting the history of this problem, Whewell noted that it was conceived as the co-existence of two “irreconcilable doctrines”: the one was “the indispensable necessity of a cause of every event,” and the other was “the impossibility of our knowing such a necessity.” (1840 v.1: 172) He paid special attention to the thought of Scottish epistemologists, such as Thomas Brown and Dugald Stewart, that a principle of the form “Every event must have a cause” is an “instinctive law of belief, or a fundamental principle of the human mind.” He was critical of this approach precisely because it failed to explain the necessity of this principle. He contrasted this approach to Kant’s, according to which a principle such as the above is a condition for the possibility of experience, being a prerequisite for our understanding of events as objective events. Whewell’s Kantian sympathies were no secret. As he put it: “The Scotch metaphysicians only assert the universality of the relation; the German attempts further to explain its necessity.” (1840 v.1: 174) But in the end, he chose an even stronger line of response. He took it that the Causal Maxim is such that “We cannot even imagine the contrary”—hence it is a truth of reason, which is grounded in the Principle of Non-Contradiction.

Whewell offered no further explanation of this commitment. In the next paragraph, he assumes a softer line by noting that there are necessary truths concerning causes and that “We find such truths universally established and assented to among the cultivators of science, and among speculative men in general.” (1840 v.1: 180) This is a far cry from the claim that their negation is inconceivable. In fact, Mill was quick to point out that this kind of point amounts to claiming that some habitual associations, after having been entrenched, are given the “appearance of necessary ones.” And that is not something that Mill would object to, provided it was not taken to imply that these principles are absolutely necessary. It is fair to say that, though Whewell was struggling with this point, he wanted to argue that some principles are constitutive of scientific inquiry and that the evidence for this is their universal acceptance. But Mill’s persistent (and correct) point was that if the inconceivability criterion is taken as a strict logical criterion, then the negation of the principles Whewell appeals to is not inconceivable; hence they cannot be absolutely necessary, and that is the end of it.

It was the search for the ground of universal and necessary principles that led Whewell to accept that there are Fundamental Ideas (like the one of cause noted above) which yield universality and necessity. Whewell never doubted that universal and necessary principles are known and that they cannot be known from experience. But Induction proceeds on the basis of experience. Hence, it cannot, on its own, yield universal and necessary truths. The thought, however, is that Induction does play a significant role in generating truths which can then be the premises of demonstrative arguments. According to Whewell, each science grows through three stages. It begins with a “prelude” in which a mass of unconnected facts is collected. It then enters an “inductive epoch” in which useful theories put order to these facts through the creative role of the scientists—an act of “colligation.” Finally, a “sequel” follows where the successful theory is extended, refined, and applied.

ii. Induction as Conception

The key element of Induction, for Whewell, is that it is not a mere generalization of singular facts. The general proposition is not the result of “a mere juxtaposition of the cases” or of a mere conjunction and extension of them. (1840 v.2: 47) The proper Induction introduces a new element—what Whewell calls “conception”—which is actively introduced by the mind and was not there in the observed facts. This conceptual novelty is supposed to exhibit the common property—the universal—under which all the singular facts fall. It is supposed to provide a “Principle of Connexion” of the facts that prompted it but did not dictate it. Whewell’s typical example of a Conception is Kepler’s notion of an ellipse. Observing the motion of Mars and trying to understand it, Kepler did not merely juxtapose the known positions. He introduced the conception of an ellipse, namely, the claim that the motion of Mars describes an ellipse. This move, Whewell suggested, was inductive but not enumerative. So, the mind plays an active role in Induction—it does not merely observe and generalize, it introduces conceptual novelties which act as principles of connection. In this sense, the mind does not have to survey all instances. Insofar as it invents the conception that connects them, it is entitled to the generalization. Whewell says:

In each inference made by Induction, there is introduced some General Conception, which is given, not by the phenomena, but by the mind. The conclusion is not contained in the premises, but includes them by the introduction of a New Generality. In order to obtain our inference, we travel beyond the cases which we have before us; we consider them as mere exemplifications of some Ideal Case in which the relations are complete and intelligible. We take a Standard, and measure the facts by it; and this Standard is constructed by us, not offered by Nature. (1840 v.2: 49)

Induction is then genuinely ampliative—not only does it go beyond the observed instances, but it introduces new conceptual content as well, which is not directly suggested by the observed instances. Whewell calls this type of ampliation “superimposition,” because “There is some Conception superinduced upon the Facts” and takes it that this is the proper understanding of Induction. So, proper Induction requires, as he put it, “an idea from within, facts from without, and a coincidence of the two.” (1840 v.2: 619)

c. The Whewell-Mill Controversy

Whewell takes it that this dual aspect is his own important contribution to the Logic of Induction. His account of Induction landed him in a controversy with Mill. Whewell summarized his views and criticized Mill in a little book titled Of Induction, with a Special Reference to John Stuart Mill’s System of Logic, which appeared in 1849. In this, he first stressed the basic elements of his own views. More specifically: Reason plays an ineliminable role in Induction, since Induction requires the Mind’s conscious understanding of the general form under which the individual instances are subsumed. Hence, Whewell insists, Induction cannot be based on instinct, since the latter operates “blindly and unconsciously in particular cases.” The role of Mind is indispensable, he thought, in inventing the right “conception.” Once this is hit upon by the mind, the facts “are seen in a new point of view.” This point of view puts the facts (the inductive basis) in a certain unity and order. Before the conception, “The facts are seen as detached, separate, lawless; afterwards, they are seen as connected, simple, regular; as parts of one general fact, and thereby possessing innumerable new relations before unseen.” (1849: 29) The point here is that the conception is supposed to bridge the gap between the various instances and the generalization; it provides the universal under which all particular instances, seen and unseen, are subsumed.

Mill objected that what Whewell took to be a proper Induction was a mere description of the facts. The debate was focused on Kepler’s first law, namely, that all planets move in ellipses—or, for that matter, that Mars describes an ellipse. We have already seen Whewell arguing that the notion of “ellipse” is not to be found in the facts of Mars’s motion around the sun. Rather, it is a new point of view, a new conception introduced by the mind, and it is such that it provided a “principle of connexion” among the individual facts—that is, the various positions of Mars in the firmament. This “ellipse,” Whewell said, is superinduced on the facts, and this superinduction is an essential element of Induction.

i. On Kepler’s Laws

For Mill, when Kepler introduced the concept of “ellipse” he described the motion of Mars (and of the rest of the planets). Whewell had used the term “colligation” to capture the idea that the various facts are connected under a new conception. For Mill, colligation is just description and not Induction. More specifically, Kepler collected various observations about the positions occupied by Mars, and then he inquired about what sort of curve these points would make. He did end up with an ellipse. But for Mill, this was a description of the trajectory of the planet. There is no doubt that this operation was not easy, but it was not an induction. It is no more an induction than drawing the shape of an island on a map based on observations of successive points of the coast.

What, then, is Induction? As we have already seen, Mill took Induction to involve a transition from the particular to the general. As such, it involves a generalization to the unobserved and a claim that whatever holds for the observed holds for the unobserved too. Then, the inductive move in Kepler’s first law is not the idea of an ellipse, but rather the commitment to the view that when Mars is not observed its positions lie on the ellipse; that is, the inductive claim is that Mars has described and will keep describing an ellipse. Here is how Mill put it:

The only real induction concerned in the case, consisted in inferring that because the observed places of Mars were correctly represented by points in an imaginary ellipse, therefore Mars would continue to revolve in that same ellipse; and in concluding (before the gap had been filled up by further observations) that the positions of the planet during the time which intervened between two observations, must have coincided with the intermediate points of the curve.

In fact, Kepler did not even make the induction, according to Mill, because it was known that the planets periodically return to their positions. Hence, “Knowing already that the planets continued to move in the same paths; when [Kepler] found that an ellipse correctly represented the past path, he knew that it would represent the future path.”

Part of the problem with Whewell’s approach, Mill thought, was that it was verging on idealism. He took Whewell to imply that the mind imposes the conception on the facts. For Mill, the mind simply discovers it (and hence, it describes it). Famously, Mill said that if “the planet left behind it in space a visible track,” it could be seen that it is an ellipse. So, for Mill, Whewell was introducing hypotheses by means of his idea of conception and was not describing Induction. Colligation is the method of hypothesis, he thought, and not of Induction.

Whewell replied that Kepler’s laws are based on Induction in the sense that “The separate facts of any planet (Mars, for instance) being in certain places at certain times, are all included in the general proposition which Kepler discovered, that Mars describes an ellipse of a certain form and position.” (1840: 18)

What can we make of this exchange? Mill and Whewell do agree on some basic facts about Induction. They both agree that Induction is a process that moves from particulars to the universal, from observed instances to a generalization. Mill says, “Induction may be defined the operation of discovering and forming general propositions,” and Whewell agrees with this and emphasizes that generality is essential for Induction, since only this can make Induction create excess content.

Generality is conceived of as true universality. As Mill makes clear (and he credits this thought to all those who have discussed induction in the past), Induction:

  • involves “inferences from known cases to unknown”;
  • affirms “of a class, a predicate which has been found true of some cases belonging to the class”;
  • concludes that “Because some things have a certain property, that other things which resemble them have the same property”;
  • concludes that “Because a thing has manifested a property at a certain time, that it has and will have that property at other times.”

So, inductive generalizations are spatio-temporal universalities. They extend a property possessed by some observed members of a kind to all other (unobserved or unobservable) members of the kind (in different times and different spaces); they extend a property being currently possessed by an individual to its being possessed at all times. There is no doubt that Whewell shares this view too. So where is the difference?

ii. On the Role of Mind in Inductive Inferences

The difference is in the role of the principles of connection in Induction and, concomitantly, on the role of mind in inductive inferences—and this difference is reflected in how exactly Induction is described. Whewell takes it that the only way in which the inductively arrived proposition is truly universal is when the Intellect provides the principle of connection (that is, the conception) of the observed instances. In other words, the principles of connection are necessary for Induction, and, since they cannot be found in experience, the Mind has to provide them. If a principle of connection is provided, and if it is the correct one, then the resulting proposition captures within itself, as it were, its true universality (aka its future extendibility). In the case of Mars, the principle of connection is that Mars describes an ellipse—that is, that an ellipse binds together “particular observations of separate places of Mars.” If Mars does describe an ellipse, or if all planets do describe ellipses, then there is no (need for) further assurance that this claim is truly universal. Its universality follows from its capturing a principle of connection between the various instances (past, present and future).

In this sense, Whewell sees Induction as a one-stage process. The observation of particulars leads the mind to search for a principle of connection (the “conception” that binds them together into a general claim about all particulars of this kind). This is where Induction ends. But Inquiry does not end there for Whewell—for further testing is necessary for finding out whether the introduced principle of connection is the correct one. Recall his point: Induction requires “an idea from within, facts from without, and a coincidence of the two.” The coincidence of the two is precisely a matter of further testing. The well-known consilience of inductions is precisely how the further testing works and secures, if successfully performed, that the principle of connection was the correct one. Consilience, Whewell argued, “is another kind of evidence of theories, very closely approaching to the verification of untried predictions.” (1849: 61) It occurs when “Inductions from classes of facts altogether different have thus jumped together,” (1840 v.2: 65) that is, when a theory is supported by facts that it was not intended to explain. His example is the theory of universal gravitation, which, though obtained by Induction from the motions of the planets, “was found to explain also that peculiar motion of the spheroidal earth which produces the Precession of the Equinoxes.” Whewell thought that the consilience of inductions is a criterion of truth, a “stamp of truth,” or, as he put it, “the point where truth resides.”
Mill objected that no predictions could prove the truth of a theory. But the important point here is that Whewell took it that the principles of connection that the Mind supplies in Induction require further proof to be accepted as true.

For Mill, there are no such principles of connection—just universal and invariant successions—and the mind has no power, nor inclination, to find them. Indeed, there are no such connections to be found. So, Induction is, in essence, described as a two-stage process. In the first stage, there is a description of a regularity; in the second stage, there is a proper universalization, so to speak, of this regularity. The genuinely inductive claim “Mars’s trajectory is an ellipse” asserts a regularity. But this regularity is truly universal only if it asserts that it holds for all past, present, and future trajectories of Mars. In criticizing Whewell, Mill agreed that the assertion “The successive places of Mars are points in an ellipse” is “not the sum of the observations merely,” since the idea of an ellipse is involved in it. Still, he thought, “It was not the sum of more than the observations, as a real induction is.” That is, it rested only on the actual observations and did not extend to the unobserved positions of Mars. “It took in no cases but those which had been actually observed…There was not that transition from known cases to unknown, which constitutes Induction in the original and acknowledged meaning of the term.” (1879: 221) Differently put, the description of the regularity, according to Mill, should be something like “Mars has described an ellipse”; the inductive move should be “Mars describes an ellipse.”

What was at stake, in the end, were two rival metaphysical conceptions of the world. Not only did Whewell take it that “Metaphysics is a necessary part of the inductive movement,” (1858: vii) but he also thought the inductive movement is grounded in the existence of principles of connection in nature, which the mind (and human reason) succeeds in discovering. Mill, on the other hand, warned us against the metaphysical notion of causation:

The notion of causation is deemed, by the schools of metaphysics most in vogue at the present moment, to imply a mysterious and most powerful tie, such as cannot, or at least does not, exist between any physical fact and that other physical fact on which it is invariably consequent, and which is popularly termed its cause: and thence is deduced the supposed necessity of ascending higher, into the essences and inherent constitution of things, to find the true cause, the cause which is not only followed by, but actually produces, the effect.

Mill was adamant that “No such necessity exists for the purposes of the present inquiry…. The only notion of a cause, which the theory of induction requires, is such a notion as can be gained from experience.” (1879: 377)

d. Early Appeals to Probability: From Laplace to Russell via Venn

i. Venn: Induction vs Probability

Induction, for John Venn (1834–1923), “involves a passage from what has been observed to what has not been observed.” (1889: 47) But the very possibility of such a move requires that Nature is such that it enables knowing the unobserved. Hence, Venn asks the key question: “What characteristics then ought we to demand in Nature in order to enable us to effect this step?” Answering this question requires a principle which is both universal (that is, it has universal applicability) and objective (that is, it must express some regularity in the world itself and not something about our beliefs).

Interestingly, Venn took this principle to be the Principle of Uniformity of Nature. But Venn was perhaps the first to associate Hume’s critique of causation with a critique of Induction and, in particular, with a critique of the status of PUN. To be sure, Venn credited Hume with a major shift in the “signification of Cause and Effect” from the once dominant account of causation as efficiency to the new account of causation as regularity. (1889: 49) But this shift brought with it the question: what is the foundation of our belief in the regularity? To which Hume answered, according to Venn, by showing that the foundation of this belief is Induction based on past experience. In this setting, Venn took it that the problem of induction is the problem of establishing the foundation of the belief in the uniformity of nature.

Hence, for Venn, Hume moved smoothly from causation, to regularity, to Induction. Moreover, he took the observed Uniformity of Nature as “the ultimate logical ground of our induction” (1889: 128). And yet, the belief in the Uniformity of Nature is the result of Induction. Hume had shown that the process for extending to the future a past association of two events cannot possibly be based on reasoning, but it is instead a matter of custom or habit. (op.cit. 131)

Venn emphatically claims there is no logical solution to the problem of uniformity. And yet, this is no cause for despair. For inductive reasoning requires taking the Uniformity of Nature as a postulate: “It must be assumed as a postulate, so far as logic is concerned, that the belief in the Uniformity of Nature exists.” (op.cit. 132) This postulate of Uniformity (same antecedents are followed by the same consequents) finds its natural expression in the Law of Causation (same cause, same effect). The Law of Causation captures “a certain permanence in the order of nature.” This permanence is “clearly essential” if we are to generalize from the observed to the unobserved. Hence, “The truth of the law [of causation] is clearly necessary to enable us to obtain our generalisations: in other words, it is necessary for the Inductive part of the process.” (1888: 212)

These inductively-established generalizations are deemed the laws of nature. The laws are regularities; they suggest that some events are “connected together in a regular way.” Induction enables the mind to move from the known to the unknown and hence to acquire knowledge of new facts. As Venn put it:

[The] mind […] dart[s] with its inferences from a few facts completely through a whole class of objects, and thus [it] acquire[s] results the successive individual attainment of which would have involved long and wearisome investigation, and would indeed in multitudes of instances have been out of the question. (1888: 206)

The intended contrast here is between inductive generalizations and next-instance inductions. There are obviously two routes to the conclusion that the next raven will be black, given that all observed ravens have been black. The first is to start with the observed ravens being black and, passing through the generalization that All ravens are black, to conclude that the next raven will be black. The other route is to start with the observed ravens being black and to conclude directly that the next raven will be black. Hence, we can always avoid the generalization and “make our inference from the data afforded by experience directly to the conclusion,” namely, to the next instance. And though Venn adds that “It is a mere arrangement of convenience” (1888: 207) to pass through the generalization, the convenience is so great that generalization is forced upon us when we infer from past experience. The inductive generalizations are not established “with absolute certainty, but with a degree of conviction that is of the utmost practical use” (1888: 207). Nor is the existence of laws of nature “a matter of a priori necessity.” (op.cit.)

Now, following Laplace, Venn thought there is a link between Induction and probability, though he did think that “Induction is quite distinct from Probability”, the latter being, by and large, a mathematical theory. Yet, “[Induction] co-operates [with probability] in almost all its inferences.”

To see the distinction Venn has in mind, we first have to take into account the difference between establishing a generalization and drawing conclusions from it. Following Mill, Venn argued that the first task requires Induction, while the second requires logic. Now suppose the generalization is universal: all As are B. We can use logic to determine what follows from it, for instance, that the next A will be B. But not all generalizations are universal. There are generalizations which assert that a “certain proportion” prevails “among the events in the long run.” (1888: 18) These are what are today called statistical generalizations. Venn thinks of them as expressing “proportional propositions” and claims that probability is needed to “determine what inferences can be made from and by them” (1888: 207). The key point, then, is this: no matter whether a generalization is universal or statistical, it has to rely on the Principle of Uniformity of Nature. For only the latter can render valid the claim that either the regular succession found so far among factors A and B or the statistical correlation found so far among A and B is stable and can be extended to the unobserved As and Bs.

That is a critical point. Take the very sad fact that Venn refers to, namely, that three out of ten infants die in their first four years. It is a matter for Induction, Venn says, and not for Probability, to examine whether the available evidence justifies the generalization that All infants die in that proportion.

Venn distanced himself from those, like Laplace, who thought of a tight link between probability and Induction. He took issue with Laplace’s attempt to forge this tight link by devising a probabilistic rule of Induction, namely, what Venn dubbed the “Rule of Succession.” He could not be more upfront: “The opinion therefore according to which certain Inductive formulae are regarded as composing a portion of Probability, and which finds utterance in the Rule of Succession (…) cannot, I think, be maintained.” (1888: 208)

ii. Laplace: A Probabilistic Rule of Induction

Now, the mighty Pierre Simon, Marquis De Laplace (1749-1827) published in 1814 a book titled A Philosophical Essay on Probabilities, in which he developed a formal mathematical theory of probability based, roughly put, on the idea that, given a partition of a space of events, equipossible events are equiprobable. This account, which defined probability as the degree of ignorance in the occurrence of the event, became known as the classical interpretation of probability. (For a presentation of this interpretation and its main problems, see the IEP article on Probability and Induction.) For the time being, it is worth stressing that part of the motivation for developing the probability calculus was to show that Induction, “the principal means for ascertaining truth,” is based on probability. (1814: 1)

In his attempt to put probability into Induction, Laplace put forward an inductive-probabilistic rule, which Venn called the “Rule of Succession.” It was a rule for the estimation of the probability of an event, given a past record of failures and successes in the occurrence of that event-type:

An event having occurred successively any number of times, the probability that it will happen again the next time is equal to this number increased by unity divided by the same number, increased by two units. (1814: 19)

The rule tells us how to calculate the conditional probability (see section 6.1) that an event will occur, given the evidence that the same event (type) has occurred N times in a row in the past. This probability is:

(N+1)/(N+2).

In the more general case where an event has occurred N times and failed to occur M times in the past, the probability of a subsequent occurrence is:

(N+1)/(N+M+2).

(See Keynes 1921: 423.)

The derivation of the rule in mathematical probability theory is based on two assumptions. The first one is that the only information available is that related to the number of successes and failures of the event examined. And the second one is what Venn called the “physical assumption that the universe may be likened to…a bag” of black and white balls from which we draw independently (1888: 197), that is, the success or failure in the occurrence of an event has no effect on subsequent tests of the same event.
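Under these two assumptions, the rule admits a standard reconstruction (not spelled out in Venn’s text): treat the unknown chance p of the event as uniformly distributed over [0, 1] and update on N consecutive successes. The probability of a further success is then the ratio of two simple integrals:

P(success on trial N+1 | N successes) = [∫ from 0 to 1 of p·p^N dp] / [∫ from 0 to 1 of p^N dp] = [1/(N+2)] / [1/(N+1)] = (N+1)/(N+2).

The general formula (N+1)/(N+M+2) follows in the same way when the record contains M failures, with p^N replaced by p^N(1-p)^M in both integrals.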

Famously, Laplace applied the rule to calculate the probability of the sun rising tomorrow, given the history of observed sunrises, and concluded that it is extremely likely that the sun will rise tomorrow:

Placing the most ancient epoch of history at five thousand years ago, or at 1826213 days, and the sun having risen constantly in the interval at each revolution of twenty-four hours, it is a bet of 1826214 to one that it will rise again tomorrow. (1814: 19)
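To check the arithmetic, here is a minimal sketch of the rule in code (the function name is ours; exact rational arithmetic avoids rounding):

```python
from fractions import Fraction

def rule_of_succession(successes, failures=0):
    """Laplace's rule: P(next occurrence) = (N + 1) / (N + M + 2)."""
    return Fraction(successes + 1, successes + failures + 2)

# Laplace's sunrise: N = 1826213 observed sunrises, no failures.
p_sunrise = rule_of_succession(1826213)
odds_in_favor = p_sunrise / (1 - p_sunrise)  # odds of 1826214 to one

# A single observed occurrence already yields probability 2/3,
# that is, odds of 2 to 1 in favor of recurrence.
p_single = rule_of_succession(1)
```

With N = 1826213 and no failures, the odds in favor come out at 1826214 to one, matching Laplace’s figure; the single-occurrence case yields the 2-to-1 odds that Venn and Keynes found absurd.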

Venn claimed that “It is hard to take such a rule as this seriously.” (1888: 197) The basis of his criticism is that we cannot have a good estimate of the probability of a future recurrence of an event if the event has happened just a few times, let alone if it has happened just once. However, the rule of succession suggests that on the first occasion the odds are 2 to 1 in favor of the event’s recurrence. Commenting on an example suggested by Jevons, Venn claimed that saying anything about the event’s recurrence requires more information than is available just by observing its first occurrence:

For instance, Jevons (Principles of Science p. 258) says “Thus on the first occasion on which a person sees a shark, and notices that it is accompanied by a little pilot fish, the odds are 2 to 1 that the next shark will be so accompanied.” To say nothing of the fact that recognizing and naming the fish implies that they have often been seen before, how many of the observed characteristics of that single ‘event’ are to be considered essential? Must the pilot precede; and at the same distance? Must we consider the latitude, the ocean, the season, the species of shark, as matter also of repetition on the next occasion? and so on. (1888: 198 n.1)

Thus, he concluded that “I cannot see how the Inductive problem can be even intelligibly stated, for quantitative purposes, on the first occurrence of any event.” (1888: 198, n.1)

In a similar vein, Keynes pointed out “the absurdity of supposing that the odds are 2 to 1 in favor of a generalization based on a single instance—a conclusion which this formula would seem to justify.” (1921: 29 n.1) However, his criticism, as we shall see, goes well beyond noticing the problem of the single case.

iii. Russell’s Principle of Induction

Could Induction not be defended on synthetic a priori grounds? This was attempted by Russell (1912) in his famous The Problems of Philosophy. He took the Principle of Induction to assert the following: (1) the greater the number of cases in which A has been found associated with B, the more probable it is that A is always associated with B (if no instance is known of A not associated with B); (2) a sufficient number of cases of association between A and B will make it nearly certain that A is always associated with B.

Clearly, thus stated, the Principle of Induction cannot be refuted by experience, even if an A is actually found not to be associated with B. But neither can it be proved on the basis of experience. Russell’s claim was that without a principle like this, science is impossible and that this principle should be accepted on the ground of its intrinsic evidence. Russell, of course, said this in a period in which the synthetic a priori was still a live option. But, as Keynes observed, Russell’s Principle of Induction requires that the Principle of Limited Variety holds. Though synthetic, this last principle is hardly a priori.

5. Non-Probabilistic Approaches

a. Induction and the Meaning of Rationality

P. F. Strawson discussed the Problem of Induction in the final section of his Introduction to Logical Theory (1952), entitled “The ‘Justification’ of Induction.” After arguing that any attempt to justify Induction in terms of deductive standards is not viable, he went on to argue that the inductive method is the standard of rationality when we reason from experience.

Strawson invited us to consider “the demand that induction shall be shown to be really a kind of deduction.” (1952: 251) This demand stems from considering the ideal of rationality in terms of deductive standards as realized in formal logic. Thus, to justify Induction, one should show its compliance with these standards. He examined two attempts along this line of thought, both of which he found problematic. The first consists in finding “the supreme premise of inductions” that would turn an inductive argument into a deductive one. What, he wondered, would be the logical status of such a premise? If the premise were a non-necessary proposition, then the problem of justification would reappear in a different guise. If it were a necessary truth that, along with the evidence, would yield the conclusion, then there is no need for it, since the evidence would entail the conclusion by itself without the extra premise, and the problem would disappear. A second (more sophisticated) attempt to justify Induction on deductive grounds rests on probability theory. In this case, the justification takes the form of a mathematical theorem. However, Strawson points out that the mathematical modelling of an inductive process requires assumptions that are not of a mathematical nature, and these need, in turn, to be justified. Hence, the problem of justification is relocated rather than solved. As Strawson commented, “This theory represents our inductions as the vague sublunary shadows of deductive calculations we cannot make.” (1952: 256)

Strawson’s major contribution to the problem is related to the conceptual clarification of the meaning of rationality: what do we mean by being rational when we argue about matters of fact? If we answer that question we can (dis-)solve the problem of the rational justification of Induction, since the rationality of Induction is not a “fact about the constitution of the world. It is a matter of what we mean by the word ‘rational’….” (1952: 261) We suggest the following reconstruction of Strawson’s argument: (1952: 256-257)

(1) If someone is rational, then they “have a degree of belief in a statement which is proportional to the strength of the evidence in its favour.”

(2) If someone has “a degree of belief in a statement which is proportional to the strength of the evidence in its favour,” then they have a degree of belief in a generalization as high as “the number of favourable instances, and the variety of circumstances in which they have been found, is great.”

(3) If someone has a degree of belief in a generalization as high as “the number of favourable instances, and the variety of circumstances in which they have been found, is great,” then they apply inductive methodology.

Therefore,

(C) If someone is rational, then they apply inductive methodology.

According to Strawson, all three premises in the above reconstruction are analytic propositions stemming from the definition of rationality, its application to the case of a generalization, and, finally, our understanding of Induction. Hence, that Induction exemplifies rationality when arguing about matters of fact is an inevitable conclusion. Of course, this does not mean that Induction is always successful; that is, the evidence may not be sufficient to assign a high degree of belief to the generalization.

When it comes to the success of Induction, Strawson claimed that to deem successful a method of prediction about the unobserved, Induction is required, since the success of any method is justified in terms of a past record of successful predictions. Thus, the proposition “any successful method of finding out about the unobserved is justified by induction” is an analytic proposition, and “Having, or acquiring, inductive support is a necessary condition of the success of a method.” (1952: 259)

However, those who discuss the success of induction have in mind something quite different. To consider Induction a successful method of inference, the premises of an inductive argument should confer a high degree of belief on its conclusion. But this is not something that should be taken for granted. In a highly disordered, chancy world, the favorable cases for a generalization may be comparable with the unfavorable. Thus, there would be no strong evidence for the conclusion of an inductive argument. (1952: 262) Hence, assumptions that guarantee the success of Induction need to be imposed if Induction is to be considered a successful method. Such conditions, Strawson claimed, are factual, not necessary, truths about the universe. Given a past record of successful predictions about the unobserved, such factual claims are taken to have a good inductive support and speak for the following claim: “[The universe is such that] induction will continue to be successful.” (1952: 261)

Nevertheless, Strawson insisted that we should not confuse the success of Induction with its being rational; hence, it would be utterly senseless and absurd to attempt to justify the rationality of Induction in terms of its being successful. To Strawson, Induction is rational, and this is an analytic truth that is known a priori and independently of our ability to predict successfully unobserved facts, whereas making successful predictions about unobserved rests on contingent facts about the world which can be inductively supported but cannot fortify or impair the rationality of Induction. Thus, Strawson concludes, questions of the following sort: “Is the universe such that inductive procedures are rational?” or “what must the universe be like in order for inductive procedures to be rational?”, are confused and senseless on a par with statements like “The uniformity of nature is a presupposition of the validity of induction.” (1952: 262) In this way, Strawson explains the emergence of the Problem of Induction as a result of a conceptual misunderstanding.

b. Can Induction Support Itself?

Can there be an inductive justification of Induction? For many philosophers the answer is a resounding NO! The key argument for this builds on the well-known sceptical challenge: subject S asserts that she knows that p, where p is some proposition. The sceptic asks her: how do you know that p? S replies: because I have used criterion c (or method m, or whatever). The sceptic then asks: how do you know that criterion c (or whatever) is sufficient for knowledge? It is obvious that this strategy leads to a trilemma: either infinite regress (S replies: because I have used another criterion c’), or circularity (S replies: because I have used criterion c itself) or dogmatism (S replies: because criterion c is sufficient for knowledge). So, the idea is that if Induction is used to vindicate Induction, this move would be infinitely regressive, viciously circular, or merely dogmatic.

What would such a vindication be like? It would rest on what Max Black has called self-supporting inductive arguments (1958). Roughly put, the argument would be: Induction has led to true beliefs in the past (or so far); therefore Induction is reliable, where reliability, in the technical epistemic conception, is a property of a rule of inference such that if it is fed with true premises, it tends to generate true conclusions. So:

(I) Induction has yielded true conclusions in the past; therefore, Induction is likely to work in the future—and hence to be reliable.

A more exact formulation of this argument would use as premises lots of successful individual instances of Induction and would conclude (by a meta-induction or a second-order Induction) the reliability of Induction simpliciter. Or, as Black put it, about a rule of Induction R:

In most instances of the use of R in arguments with true premises examined in a wide variety of conditions, R has been successful. Hence (probably): In the next instance to be encountered of the use of R in an argument with a true premise, R will be successful. The rule of inductive inference R is the following: “Most instances of A’s examined in a wide variety of conditions have been B; hence (probably) The next A to be encountered will be B.” (1958: 719-20)

Arguments such as these have been employed by many philosophers, such as Braithwaite (1953), van Cleve (1984), Papineau (1992), Psillos (1999), and others. What is wrong with them? There is an air of circularity in them, since the rule R is employed in an argument which concludes that R is trustworthy or reliable.

i. Premise-Circularity vs Rule-Circularity

In his path-breaking work, Richard Braithwaite (1953) distinguished between two kinds of circularity: premise-circularity and rule-circularity.

“Premise-circular” describes an argument such that its conclusion is explicitly one of its premises. Suppose you want to prove P, and you deploy an argument with P among its premises. This would be a viciously circular argument. The charge of vicious circularity is an epistemic charge—a viciously circular argument has no epistemic force: It cannot offer reasons to believe its conclusion, since it presupposes it; hence, it cannot be persuasive. Premise-circularity is vicious! But (I) above (even in the rough formulation offered) is not premise-circular.

There is, however, another kind of circularity. This, as Braithwaite put it, “is the circularity involved in the use of a principle of inference being justified by the truth of a proposition which can only be established by the use of the same principle of inference” (1953: 276). It can be called rule-circularity. In general, an argument has a number of premises, P1,…,Pn. Qua argument, it rests on (employs/uses) a rule of inference R, by virtue of which a certain conclusion Q follows. It may be that Q has a certain content: it asserts or implies something about the rule of inference R used in the argument, in particular, that R is reliable. So, rule-circular arguments are such that the argument itself is an instance, or involves essentially an application, of the rule of inference whose reliability is asserted in the conclusion.

If anything, (I) is rule-circular. Is rule-circularity vicious? Obviously, rule-circularity is not premise-circularity. But, one may wonder, is it nonetheless vicious, in the sense of lacking any epistemic force? This issue arises already when it comes to the justification of deductive logic. In the case of the justification of modus ponens (or any other genuinely fundamental rule of logic), if logical scepticism is to be avoided, there is only rule-circular justification. Indeed, any attempt to justify modus ponens by means of an argument has to employ modus ponens itself (see Dummett 1974).

ii. Counter-Induction?

But, one may wonder, couldn’t any mode of reasoning (no matter how crazy or invalid) be justified by a rule-circular argument? A standard worry is that a rule-circular argument could be offered in defense of “counter-induction.” This moves from the premise that “Most observed As are B” to the conclusion “The next A will be not-B.” A “counter-inductivist” might support this rule with the following rule-circular argument: since most counter-inductions so far have failed, conclude, by counter-induction, that the next counter-induction will succeed.

The right reply here is that the employment of rule-circular arguments rests on or requires the absence of specific reasons to doubt the reliability of a rule of inference. We can call this the Fair-Treatment Principle: a doxastic/inferential practice is innocent until proven guilty. This puts the onus on those who want to show guilt. The rationale for this principle is that justification has to start from somewhere and there is no other point to start apart from where we currently are, that is, from our current beliefs and inferential practices. Accordingly, unless there are specific reasons to doubt the reliability of induction, there is no reason to forgo its use in justificatory arguments. Nor is there reason to search for an active justification of it. Things are obviously different with counter-induction, since there are plenty of reasons to doubt its reliability, the chief being that counter-inductions have typically led to false conclusions.

It may be objected that we have no reasons to rely on certain inferential rules. But this is not quite so. Our basic inferential rules (including Induction, of course) are rules we value. And we value them because they are our rules, that is, rules we employ and rely upon to form beliefs. Part of the reason why we value these rules is that they have tended to generate true beliefs—hence, we have some reason to think they are reliable, or at least more reliable than competing rules (say, counter-induction).

Rule-circularity is endemic in any attempt to justify basic methods of inference and basic cognitive processes, such as perception and memory. In fact, as Frank Ramsey noted, it is only via memory that we can examine the reliability of memory (1926). Even if we were to carry out experiments to examine it, we would still have to rely on memory: we would have to remember their outcomes. But there is nothing vicious in using memory to determine and enhance the degree of accuracy of memory, for there is no reason to doubt its general reliability, and we have some reasons to trust it.

If epistemology is not to be paralysed, if inferential scepticism is not to be taken as the default reasonable position, we have to rely on rule-circular arguments for the justification of basic methods and cognitive processes.

c. Popper Against Induction

In the first chapter of the book Objective Knowledge: An Evolutionary Approach, Popper presented his solution to the Problem of Induction. His reading of Hume distinguished between the logical Problem of Induction (1972: 4),

HL: Are we justified in reasoning from [repeated] instances of which we have experience to other instances [conclusion] of which we have no experience?

and the psychological Problem of Induction,

HPs: Why, nevertheless, do all reasonable people expect, and believe, that instances of which we have no experience will conform to those of which we have experience? That is, why do we have expectations in which we have great confidence?

Hume, Popper claimed, answered the logical problem in the negative—no number of observed instances can justify unobserved ones—while he answered the psychological problem positively—custom and habit are responsible for the formation of our expectations. In this way, Popper observes, a huge gap is opened up between rationality and belief formation and, thus, “Hume (…) was turned into a sceptic and, at the same time, into a believer: a believer in an irrationalist epistemology.” (ibid.)

In his own attempt to solve the logical Problem of Induction, Popper suggested the following three reformulations of it (1972: 7-8):

L1: Can the claim that an explanatory universal theory is true be justified by “empirical reasons”; that is, by assuming the truth of certain test statements or observation statements (which, it may be said, are “based on experience”)?

L2: Can the claim that an explanatory universal theory is true or that it is false be justified by “empirical reasons”; that is, can the assumption of the truth of test statements justify either the claim that a universal theory is true or the claim that it is false?

L3: Can a preference, with respect to truth or falsity, for some competing universal theories over others ever be justified by such “empirical reasons”?

Popper considers L2 to be a generalization of L1 and L3 an equivalent formulation of L2. In addition, Popper’s formulations of the logical problem differ from his original formulation of the Humean problem, HL, since, in L1–L3, the conclusion is an empirical generalization and the premises are “observation” or “test” statements, as opposed to instances of experience (1972: 12). In deductive logic, the truth of a universal statement cannot be established by any finite number of true observation or test statements. However, Popper, in L2, added an extra disjunct so as to treat the falsity of universal statements on empirical grounds. He can then point out that a universal statement can always be falsified by a test statement. (1972: 7) Hence, by the very (re)formulation of the logical Problem of Induction, as in L2, in such a way as to include both the (impossible) verification of a universal statement and its (possible) falsification, Popper thinks he has “solved” the logical Problem of Induction. The “solution” is merely stating the “asymmetry between verification and falsification by experience” from the point of view of deductive logic.

After having “solved” the logical Problem of Induction, Popper applies a heuristic conjecture, called the principle of transference, to transfer the logical solution of the Problem of Induction to the realm of psychology and to remove the clash between the answers provided by Hume to the two aspects of the Problem of Induction. This principle states roughly that “What is true in logic is true in psychology.” (1972: 6) Firstly, Popper noticed that “Induction—the formation of a belief by repetition—is a myth”: people have an inborn, instinctual inclination to impose regularities upon their environment and to make the world conform with their expectation in the absence of or prior to any repetitions of phenomena. As a consequence, Hume’s answer to HPs that bases belief formation on custom and habit is considered inadequate. Having disarmed Hume’s answer to the psychological Problem of Induction, Popper applies the principle of transference to align logic and psychology in terms of the following problem and answer:

Ps1: If we look at a theory critically, from the point of view of sufficient evidence rather than from any pragmatic point of view, do we always have the feeling of complete assurance or certainty of its truth, even with respect to the best-tested theories, such as that the sun rises every day? (1972: 26)

Popper’s answer to Ps1 is negative: the feeling of certainty we may experience is not based on evidence; it has its source in pragmatic considerations connected with our instincts and with the assurance of an expectation that one needs to engage in goal-oriented action. The critical examination of a universal statement shows that such a certainty is not justified, although, for pragmatic reasons related to action, we may not take seriously possibilities that are against our expectations. In this way, Popper aligns his answer to the logical Problem of Induction with his treatment of its psychological counterpart.

d. Goodman and the New Riddle of Induction

In Fact, Fiction and Forecast (1955: 61ff), Goodman argued that the “old,” as he called it, Problem of Induction is a pseudo-problem based on a presumed peculiarity of Induction which, nevertheless, does not exist. Both in deduction and in Induction, an inference is correct if it conforms with accepted rules, and rules are accepted if they codify our inferential practices. Hence, we should not seek after a reason that would justify Induction in a non-circular way any more than we do so for deduction, and the noted circularity is, as Goodman says, a “virtuous” one. The task of the philosopher is to find those rules that best codify our inferential practices in order to provide a systematic description of what a valid inference is. As a result, the only problem about Induction that remains is that, contrary to deductive inference, such rules have not been consolidated. The search for such rules is what Goodman called “the constructive task of confirmation theory.”

The new riddle of Induction appeared in the attempt to explicate the relation of confirmation of a general hypothesis by a particular instance of it. It reflects the realization that the confirmation relation is not purely syntactic: while a positive instance of a generalization may confirm it, if it is a lawlike generalization, it does not bear upon its truth if it is an accidental generalization. To illustrate this fact, Goodman used the following examples: firstly, consider the statement, “This piece of copper conducts electricity” that confirms the lawlike generalization, “All pieces of copper conduct electricity.” Secondly, consider the statement, “The man in the room is a third son” that does not confirm the accidental generalization, “All men in the room are third sons.” Obviously, the difference in these examples is not couched in terms of syntax since in both cases the observation statements and the generalizations have the same syntactic form. The new riddle of Induction shows the difficulty of making the required distinction between lawlike and accidental generalizations.

Consider two hypotheses, H1 and H2, that have the form of a universal generalization: “All S is P.” Let H1 be “All emeralds are green” and H2 be “All emeralds are grue,” where “grue” is a one-place predicate defined as follows:

Grue: an object is grue if and only if it is examined before time T and is green, or is not examined before time T and is blue.
At time T, both H1 and H2 are equally well confirmed by reports of observations of green emeralds made before time T. The two hypotheses differ with respect to the predictions they make about the color of the observed emeralds after time T: the predictions, “The next emerald to be observed after time T is green” and “The next emerald to be observed after time T is grue,” are inconsistent. In addition, it may occur that the same prediction made at a time T is equally well-supported by diverse collections of evidence collected before T, as long as these collections of evidence are subsumed under different hypotheses formulated in terms of appropriately constructed predicates. However, Goodman claims that “…only the predictions subsumed under law-like hypotheses are genuinely confirmed” (1955: 74-75). Thus, to distinguish the predictions that are genuinely confirmed from the ones that are not is to distinguish between lawlike generalizations and accidental ones.

The most popular suggestion is to demand that lawlike generalizations should not contain any reference to particular individuals or involve any spatial or temporal restrictions (Goodman 1955: 77). In the new riddle, the predicate “grue” used in H2 violates this criterion, since it references a particular time T; it is a positional predicate. Hence, one may claim that H2 does not qualify as a lawlike generalization. However, this analysis can be challenged as follows. Specify a grue-like predicate, “bleen,” as follows:

Bleen: an object is bleen if and only if it is examined before time T and is blue, or is not examined before time T and is green.

Now notice, we can define “green” (and “blue”) in terms of “grue” and “bleen” as follows:

Green: an object is green if and only if it is examined before time T and is grue, or is not examined before time T and is bleen. (Likewise, an object is blue if and only if it is examined before time T and is bleen, or is not examined before time T and is grue.)
“Thus qualitativeness is an entirely relative matter,” concludes Goodman, “[t]his relativity seems to be completely overlooked by those who contend that the qualitative character of a predicate is a criterion for its good behavior.” (1955: 80)
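The interdefinability Goodman points to can be made concrete in a few lines of code. The following Python sketch is purely illustrative (the predicate functions, their names, and the modeling of objects as color–examination pairs are assumptions made here, not Goodman’s own formalism); it checks that “green,” defined in terms of “grue” and “bleen,” coincides with the ordinary predicate.

```python
# Illustrative sketch of the grue/bleen symmetry. An "object" is
# modeled as a color plus a flag saying whether it is examined
# before the cutoff time T. These names are assumptions made for
# illustration only.

def grue(color, examined_before_T):
    # Grue: green if examined before T, otherwise blue.
    return color == "green" if examined_before_T else color == "blue"

def bleen(color, examined_before_T):
    # Bleen: blue if examined before T, otherwise green.
    return color == "blue" if examined_before_T else color == "green"

def green(color, examined_before_T):
    # Green defined FROM grue and bleen, in exactly the pattern used
    # to define grue from green and blue.
    if examined_before_T:
        return grue(color, examined_before_T)
    return bleen(color, examined_before_T)

# Extensional check: the reconstructed "green" agrees with the
# ordinary predicate in every case.
for c in ("green", "blue"):
    for e in (True, False):
        assert green(c, e) == (c == "green")
```

The point of the symmetry is that which pair of predicates counts as “positional” depends entirely on which pair is taken as primitive.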

Goodman solves the problem in terms of the entrenchment of a predicate. Entrenchment measures the size of the past record of hypotheses, formulated using a predicate, that have actually been projected, that is, adopted after their examined instances have been found true. Hence, the predicate “grue” is less entrenched than the predicate “green,” since it has not been used as often as “green” to construct hypotheses licensing predictions about as yet unexamined objects. Roughly, Goodman’s idea is that lawful or projectible hypotheses use only well-entrenched predicates. On this account, only hypothesis H1 is lawful or projectible, not H2, and only H1 can be confirmed in the light of evidence.

Goodman’s account of lawlikeness is pragmatic, since it rests on the use of predicates in language; it is his suggested solution to the new riddle, and it is restricted to universal hypotheses. Entrenchment has been criticized as an imprecise concept, “a crude measure” says Teller (1969), which has not been properly defined. Anyone who attempts to measure entrenchment faces the problem of dealing with two predicates having the same extension but different past records of actual projections: although they apply to exactly the same objects, their entrenchment differs. Finally, entrenchment seems to suggest an excessively conservative policy for scientific practice that undermines the possibility of progress, since no new predicate would be well-entrenched on the basis of past projections, and “Science could never confirm anything new.” (ibid)

6. Reichenbach on Induction

a. Statistical Frequencies and the Rule of Induction

Hans Reichenbach distinguished between classical and statistical Induction, the former being a special case of the latter. Classical Induction is what is ordinarily called Induction by enumeration, where an initial section of a given sequence of objects or events is found to possess a given attribute, and it is assumed that the attribute persists in the entire sequence. On the other hand, statistical Induction does not presuppose the uniform appearance of an attribute in any section of the sequence. In statistical Induction it is assumed that, in an initial section of a sequence, an attribute is manifested with relative frequency f, and we infer that “The relative frequency observed will persist approximately for the rest of the sequence; or, in other words, that the observed value represents, within certain limits of exactness, the value of the limit for the whole sequence.” (1934: 351) Classical Induction results as the special case of statistical Induction for f = 1.

Consider a sequence of events or objects and an attribute A, which is exhibited by some events of the sequence. Suppose that you flip a coin several times forming a sequence of “Heads” (H) and “Tails” (T), and you focus your attention on the outcome H.

H H T T H T T T H …

By examining the first six elements of the sequence you can calculate the relative frequency of exhibiting H in the six flips by dividing the number of H, that is, three, by the total number of trials, that is, six; hence,

f6 = 3/6 = 1/2.
Generally, by inspecting the first n elements of the sequence, we may calculate the relative frequency,

fn = m/n,

where m is the number among the first n elements that exhibit the attribute A.
In this way, we may define a mathematical sequence, {fn}n∈ℕ, with elements fn representing the relative frequency of appearance of the attribute A in the first n elements of the sequence of events. In the coin-flipping example we have:

n 1 2 3 4 5 6 7 8 9
Outcome H H T T H T T T H
fn 1 1 2/3 2/4 3/5 3/6 3/7 3/8 4/9
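The relative frequencies in the table can be computed mechanically. Here is a minimal Python sketch (the function name and the use of exact fractions are illustrative choices, not part of Reichenbach’s text):

```python
from fractions import Fraction

# Outcomes of the coin flips from the example above.
outcomes = ["H", "H", "T", "T", "H", "T", "T", "T", "H"]

def relative_frequencies(seq, attribute="H"):
    """f_n = (occurrences of the attribute among the first n elements) / n."""
    freqs = []
    count = 0
    for n, outcome in enumerate(seq, start=1):
        if outcome == attribute:
            count += 1
        freqs.append(Fraction(count, n))
    return freqs

fs = relative_frequencies(outcomes)
print(fs[5], fs[8])  # -> 1/2 4/9
```

The printed values are f6 = 3/6 = 1/2 and f9 = 4/9, matching the table.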

 

According to Reichenbach (1934: 445), the rule or principle of Induction makes the following posit (for the concept of posit, see below):

For any given δ > 0, no matter how small we choose it,

|fn − fn0| < δ
for all n > n0.

To apply the rule of Induction to the coin-flipping example we need to fix a δ, say δ = 0.05, and to conjecture, at each trial n0, the relative frequency of H for the flips n > n0 to a δ-degree of approximation.

n 1 2 3 4 5 6 7 8 9
Outcome H H T T H T T T H
fn 1 1 2/3 2/4 3/5 3/6 3/7 3/8 4/9
Conjectured fn 1 ± 0.05 1 ± 0.05 2/3 ± 0.05 2/4 ± 0.05 3/5 ± 0.05 3/6 ± 0.05 3/7 ± 0.05 3/8 ± 0.05 4/9 ± 0.05

 

The sequence of relative frequencies, {fn}n∈ℕ, may or may not converge to a limiting relative frequency p. This limiting relative frequency, if it exists, expresses the probability of occurrence of the attribute A in this sequence of events, according to the frequency interpretation of probability. For a fair coin in the coin-flipping experiment, the sequence of relative frequencies converges to p = ½. Generally, however, we do not know whether such a limit exists, and it is non-trivial to assume its existence. Reichenbach formulated the rule of induction in terms of such a limiting frequency (for further discussion consult the Appendix):

Rule of Induction. If an initial section of n elements of a sequence xi is given, resulting in the frequency fn, and if, furthermore, nothing is known about the probability of the second level for the occurrence of a certain limit p, we posit that the frequency fi (i > n) will approach a limit p within fn ± δ when the sequence is continued. (1934: 446)

Two remarks are in order here. The first is about Reichenbach’s reference to “probability of the second level.” He examined higher-level probabilities in Ch. 8 of his book on probability theory. If the first-level probabilities are limits of relative frequencies in a given sequence of events, expressing the probability of an attribute being manifested in this sequence, second-level probabilities refer to different sequences of events, and they express the probability that a sequence of events exhibits a particular limiting relative frequency for that attribute. By means of second-level probabilities, Reichenbach discussed probability implications that have, as a consequent, a probability implication. In the example of coin flips, this would amount to having an infinite pool of coins, not all of which are fair. The probability of picking out a coin with a limiting relative frequency of ½ for “Heads” is a second-order probability. In the Rule of Induction, it is assumed that we have no information about “the probability of the second level for the occurrence of a certain limit p,” and the posit we make is a blind one (1934: 446); namely, we have no evidence to know how good it is.

Secondly, it is worthwhile to highlight the analogy with classical Induction. An enumerative inductive argument either predicts what will happen in the next occurrence of a similar event or yields a universal statement that claims what happens in all cases. Similarly, statistical Induction either predicts something about the behavior of the relative frequencies that follow the ones already observed, or it yields what corresponds to the universal claim, namely, that the sequence of frequencies as a whole converges to a limiting value that lies within certain bounds of exactness from an already calculated relative frequency.

b. The Pragmatic Justification

Reichenbach claims that the problem of justification of Induction is a problem of justification of a rule of inference. A rule does not state a matter of fact, so it cannot be proved to be true or false; a rule is a directive that tells us what is permissible to do, and it requires justification. But what did Reichenbach mean by justification?

He writes, “It must be shown that the directive serves the purpose for which it is established, that it is a means to a specific end” (1934: 24), and “The recognition of all rules as directives makes it evident that a justification of the rules is indispensable and that justifying a rule means demonstrating a means-end relation.” (1934: 25)

Feigl called this kind of justification, which is based on showing that the use of a rule is appropriate for the attainment of a goal, “vindication,” to distinguish it from “validation,” a different kind of justification that is based on deriving a rule from a more fundamental principle. (Feigl, 1950)

In the case of deductive inferences, a rule is vindicated if it can be proven that its application serves the purpose of truth-preservation, that is, if the rule of inference is applied to true statements, it provides a true statement. This proof is a proof of a meta-theorem. Consider, for instance, modus ponens: by applying this rule to the well-formed formulas φ and φ → ψ we get ψ. It is easy to verify that φ and φ → ψ cannot both have the value “True” while ψ has the value “False.” Reichenbach might have had this kind of justification in mind for deductive rules of inference.
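The meta-theoretic check described here can be carried out by brute-force enumeration of truth values. A small Python sketch (the helper names are illustrative assumptions):

```python
from itertools import product

def implies(p, q):
    # Material conditional: p -> q is false only when p is true and q is false.
    return (not p) or q

# Vindication of modus ponens: search for a valuation in which the
# premises phi and phi -> psi are both true while the conclusion psi
# is false. Truth-preservation holds iff no such valuation exists.
counterexamples = [
    (phi, psi)
    for phi, psi in product([True, False], repeat=2)
    if phi and implies(phi, psi) and not psi
]
assert counterexamples == []  # modus ponens preserves truth in every case
```

The empty list of counterexamples is precisely the meta-theorem: the rule can never lead from true premises to a false conclusion.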

What is the end that would justify the rule of induction as a means to it? The end is to determine within a desired approximation the limiting relative frequency of an attribute in a given sequence, if that limiting relative frequency exists: “The aim is predicting the future—to formulate it as finding the limit of a frequency is but another version of the same aim.” (1951: 246)

And, as we have seen, the rule of induction is the most effective means for accomplishing this goal: “If a limit of the frequency exists, positing the persistence of the frequency is justified because this method, applied repeatedly, must finally lead to true statements;” (1934: 472) “So if you want to find a limit of the frequency, use the inductive inference – it is the best instrument you have, because, if your aim can be reached, you will reach it that way.” (1951: 244)

Does this sort of justification presuppose that the limit of the sequence of relative frequencies exists in a given sequence of events? Reichenbach says “No!”: “If [your aim] cannot be reached, your attempt was in vain; but any other attempt must also break down.” (1951: 244)

In the last two passages quoted from The Rise of Scientific Philosophy, we find Reichenbach’s argument for the justification of Induction:

  1. Either the limit of the relative frequency exists, or it does not exist.
  2. If it does exist, then, by applying the rule of induction, we can find it.
  3. If it does not exist, then no method can find it.
  4. Therefore, either we find the limit of the frequency by induction or by no method at all.

The failure of any method in premise #3 follows from the consideration that if there were a successful alternative method, then the limit of the frequency would exist, and the rule of induction would be successful too. Reichenbach does not deny in principle that methods other than induction may succeed in accomplishing the aim set in certain circumstances; what he claims is that induction is maximally successful in accomplishing this aim.

The statement that there is a limit of a frequency is synthetic, since it says something non-trivial about the world and, Reichenbach claims, “that sequences of events converge toward a limit of the frequency, may be regarded as another and perhaps more precise version of the uniformity postulate.” (1934: 473) In regards to its truth, the principle is commonly taken either as postulated and self-warranted or as inferred from other premises. If postulated, then, Reichenbach says, we are introducing in epistemology a form of synthetic a priori principles. Russell is criticized for having introduced synthetic a priori principles in his theory of probability of Induction and is called to “revise his views.” (1951: 247) On the other hand, if inferred, we are attempting to justify the principle by proving it from other statements, which may lead to circularity or infinite regress.

Reichenbach did not undertake the job of proving from any more fundamental principle that inductive inference yields true or even probable beliefs. He was convinced that this cannot be done. (1951: 94) Instead, he claimed that knowledge consists of assertions for which we have no proof of their truth, although we treat them as true: as posits. As he put it:

The word “posit” is used here in the same sense as the word “wager” or “bet” in games of chance. When we bet on a horse we do not want to say by such a wager that it is true that the horse will win; but we behave as though it were true by staking money on it. A posit is a statement with which we deal as true, although the truth value is unknown. (1934: 373)

And elsewhere he stressed, “All knowledge is probable knowledge and can be asserted only in the sense of posits.” (1951: 246) Thus, as a posit, a predictive statement does not require a proof of its truth, and the classical problem of induction is no longer a problem for knowledge: we do not need to prove from “higher” principles that induction yields true conclusions, since, for a posit, “All that can be asked for is a proof that it is a good posit, or even the best posit available.” (1951: 242)

Induction is justified as the instrument for making good posits:

Thesis θ. The rule of induction is justified as an instrument of positing because it is a method of which we know that if it is possible to make statements about the future we shall find them by means of this method. (1934: 475)

c. Reichenbach’s Views Criticized

One objection to Reichenbach’s vindication of Induction questions the epistemic end of finding the limit of the frequency asymptotically, since, as Keynes’s famous slogan put it, “In the long run we are all dead.” (1923: 80) What we should care about, say the critics, is to justify Induction as a means to the end of finding truth, or the correct limiting frequency, in a finite number of steps, in the short run. This is the only legitimate epistemic end, and in this respect Reichenbach’s convergence to truth does not have much to say.

Everyone agrees that reaching a set goal in a finite number of steps would be a desideratum for any methodology. However, we should notice that any method that can be successful in the short run will be successful in the long run as well. Or, by contraposition, if a method does not guarantee success in the long run, then it will not be successful in the short run either. Hence, although success in the long run is not the optimum one could request from a method, it is still a desirable epistemic end. And Induction is the best candidate for being successful in the short run, since it is maximally successful in the long run. (Glymour 2015: 249) To stress this point, Huber made an analogy with deductive logic. Just as eternal life is impossible, it is impossible to live in any logically possible world other than the actual one. Yet, this does not prevent us from requiring our belief system to be logically consistent, that is, to have an epistemic virtue that is defined in every logically possible world, as a minimum requirement of having true beliefs about the actual world. (Huber 2019: 211)

A second objection rests on the fact that Reichenbach’s rule of Induction is not the only rule that converges to the limit of the relative frequency if the limit exists. Thus, there are many rules, actually an infinite number of rules, that are vindicated. Any rule that posits that the limit of the relative frequency p is found within a δ-interval around cn0 + fn0, for any given δ > 0 and with cn0 → 0 as n0 → ∞, would yield a successful prediction if the limiting frequency p existed.

For instance, let

Then in the coin-flipping example, we obtain the following different conjectures according to Reichenbach’s rule and the cno-rule:

n 1 2 3 4 5 6 7 8 9
Outcome H H T T H T T T H
Conjectured fn 1 ± 0.05 1 ± 0.05 2/3 ± 0.05 2/4 ± 0.05 3/5 ± 0.05 3/6 ± 0.05 3/7 ± 0.05 3/8 ± 0.05 4/9 ± 0.05
cn0–Conjectured fn 0 ± 0.05 1/2 ± 0.05 5/9 ± 0.05 2/4 ± 0.05 14/25 ± 0.05 3/6 ± 0.05 22/49 ± 0.05 13/32 ± 0.05 4/8 ± 0.05

Despite the differences in the short run, the two rules converge to the same relative frequency asymptotically; hence, both rules are vindicated. Why, then, should one choose Reichenbach’s rule (cno = 0) rather than the cno-rule to make predictions?
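The co-vindication of such rival rules can be illustrated numerically. The sketch below assumes, purely for illustration, a correction term cn0 = 1/n0 (which tends to 0) and a deterministic alternating sequence H, T, H, T, … whose limiting frequency of H is ½; neither choice is taken from Reichenbach’s text.

```python
# Reichenbach's rule (correction term 0) versus a rival rule whose
# correction term c_n0 = 1/n0 tends to 0. For the alternating sequence
# H, T, H, T, ... the relative frequency of H has the limit 1/2.

def f(n):
    """Relative frequency of H among the first n flips of H, T, H, T, ..."""
    return ((n + 1) // 2) / n

for n0 in (10, 100, 10_000):
    reichenbach = f(n0)       # posit: limit lies within f_n0 ± δ
    rival = f(n0) + 1 / n0    # posit: limit lies within (f_n0 + c_n0) ± δ
    print(n0, abs(reichenbach - 0.5), abs(rival - 0.5))
```

Both error terms shrink toward 0 as n0 grows, so both rules succeed asymptotically; they differ only in their short-run conjectures.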

Reichenbach was aware of the problem, and he employed descriptive simplicity to select among the rival rules. (1934: 447) According to Reichenbach, descriptive simplicity is a characteristic of a description of the available data that has no bearing on its truth. Using this criterion, we may choose among different hypotheses, not on the basis of their predictions, but on the basis of their convenience or ease of handling: “…The inductive posit is simpler to handle.” (ibid.)

Thus, since all rules converge in the limit of empirical investigation, when all available evidence has been taken into consideration, the more convenient choice is the rule of Induction, with cn0 = 0 for all n0 ∈ ℕ.

Huber claims that all the different rules that converge to the same limiting frequency and are associated with the same sequence of events are functionally equivalent, since they serve the same end, that of finding the limit of the relative frequency. So, an epistemic agent can pick out any of these methods to attain this goal, but only one at a time. Yet, he argues, this is not a peculiar feature of Induction; the situation in deductive logic is similar. There are different systems of rules of inference in classical logic, and all of them justify the same particular inferences. Every time one uses a language, one is committed to a system of rules of inference. If one does not demand a justification of the system of rules in deductive logic, why should one require such a justification of the inductive rule? (Huber 2019: 212)

7. Appendix

This appendix shows the asymptotic and self-corrective nature of the inductive method that establishes its success and the truth of the posit made in Reichenbach’s rule of Induction for a convergent sequence of relative frequencies.

Firstly, assume that the sequence of relative frequencies {fn}n∈ℕ is convergent to a value p.  Then {fn}n∈ℕ is a Cauchy sequence,

∀ε > 0, ∃N(ε) ∈ ℕ such that ∀n, n0 ∈ ℕ, n > n0 > N(ε) ⟹ |fn − fn0| < ε.

Setting ε = δ, where δ is the desired accuracy of our predictions, we conclude that there is always a number of trials, N(δ), after which our conjectured relative frequency fn0, for n0 > N(δ), approximates the frequencies that will be observed, fn, n > n0 > N(δ), to a δ degree of error.

Of course, this mathematical fact does not entail that the inductive posit is necessarily true. It will be true only if the number of items, n0, inspected is sufficient (that is, n0 > N(δ)) to establish deductively the truth of

|fn − fn0| < δ for n > n0.

In the coin-flipping example, as we see in the relevant table, for δ = 0.05, the conjectured relative frequency of H at the 3rd trial is between 185/300 and 215/300 for every n > 3. However, at the fourth trial the conjecture is proved false, since the relative frequency is 150/300.

Now, if the posit is false, we may inspect more elements of the sequence and correct our posit. Hence, for n1 > n0 our posit may become

|fn − fn1| < δ

for all n > n1. Again, if the new posit is false we may correct it anew, and so on. However, since {fn}n∈ℕ is convergent, after a finite number of (k + 1) steps, for some nk, our posit,

|fn − fnk| < δ

for all n > nk > N(δ), will become true.

This is what Reichenbach meant when he called the inductive method self-corrective, or asymptotic:

The inductive procedure, therefore, has the character of a method of trial and error so devised that, for sequences having a limit of the frequency, it will automatically lead to success in a finite number of steps. It may be called a self-corrective method, or an asymptotic method. (1934: 446)
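This trial-and-error procedure can be simulated. In the Python sketch below, the sequence H, H, T, H, H, T, … (limiting frequency of H equal to 2/3), the value of δ, and the number of trials are illustrative assumptions; the code posits fn0 ± δ and corrects the posit whenever a later observed frequency falls outside the interval.

```python
# Simulation of the self-corrective (asymptotic) method: posit that
# future frequencies stay within f_n0 ± delta; whenever a later
# frequency violates the posit, correct it by re-positing from there.

delta = 0.05
outcomes = ["H", "H", "T"] * 666  # limiting frequency of H is 2/3

# Build the sequence of relative frequencies f_1, f_2, ...
freqs = []
count = 0
for n, o in enumerate(outcomes, start=1):
    count += (o == "H")
    freqs.append(count / n)

# Trial and error: start with the first frequency as the posit.
posit, corrections = freqs[0], 0
for fn in freqs[1:]:
    if abs(fn - posit) > delta:  # posit falsified: correct it
        posit = fn
        corrections += 1

print(corrections, abs(posit - 2 / 3) < delta)
```

Because the sequence of frequencies converges, only finitely many corrections occur, and the final posit is true: the limiting frequency 2/3 lies within the last posited interval.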

Secondly, we show that, for a sequence of relative frequencies {fn}n∈ℕ that converges to a number p, the posit that Reichenbach makes in his rule of Induction is true. Namely, we will show that for every desirable degree of accuracy δ > 0, there is an N(δ) ∈ ℕ such that for every n > n0 > N, the limit p lies within fn0 ± δ; that is, |p − fn| < δ/2 and |p − fn0| < δ.

We start from the triangle inequality,

|p − fn0| ≤ |p − fn| + |fn − fn0|.
From the convergence of {fn}n∈ℕ it holds that

∃N1 ∈ ℕ such that ∀n ∈ ℕ, n > N1 ⟹ |p − fn| < δ/2

and

∃N2 ∈ ℕ such that ∀n, n0 ∈ ℕ, n > n0 > N2 ⟹ |fn − fn0| < δ/2.

Let N = max{N1, N2}; then, for every n > n0 > N,

|p − fn0| < δ and |p − fn| < δ/2.

8. References and Further Reading

  • Aristotle, (1985). “On Generation and Corruption,” H. H. Joachim (trans.). In Barnes, J. (ed.). Complete Works of Aristotle v. 1: 512-555. Princeton: Princeton University Press.
  • Bacon, F., (2000). The New Organon. Cambridge: Cambridge University Press.
  • Bain, A., (1887). Logic: Deductive and Inductive. New York: D. Appleton and Company.
  • Black, M., (1958). “Self-Supporting Inductive Arguments.” The Journal of Philosophy 55(17): 718-725.
  • Braithwaite, R. B., (1953). Scientific Explanation: A Study of the Function of Theory, Probability and Law in Science. Cambridge: Cambridge University Press.
  • Broad, C. D., (1952). Ethics and The History of Philosophy: Selected essays. London: Routledge.
  • Dummett, M., (1974). “The Justification of Deduction.” In Dummett, M. (ed.). Truth and Other Enigmas. Oxford: Oxford University Press.
  • Feigl, H., (1950 [1981]). “De Principiis Non Disputandum…? On the Meaning and the Limits of Justification.” In Cohen, R.S. (ed.). Herbert Feigl Inquiries and Provocations: Selected Writings 1929-1974. 237-268. Dordrecht: D. Reidel Publishing Company.
  • Glymour, C., (2015). Thinking Things Through: An Introduction to Philosophical Issues and Achievements. Cambridge, MA: The MIT Press.
  • Goodman, N., (1955 [1981]). Fact, Fiction and Forecast. Cambridge, MA: Harvard University Press.
  • Huber, F., (2019). A Logical Introduction to Probability and Induction. Oxford: Oxford University Press.
  • Hume, D., (1739 [1978]). A Treatise of Human Nature. Selby-Bigge, L. A. & Nidditch, P. H. (eds). Oxford: Clarendon Press.
  • Hume, D., (1740 [1978]). “An Abstract of A Treatise of Human Nature.” In Selby-Bigge, L. A. & Nidditch, P. H., (eds). A Treatise of Human Nature. Oxford: Clarendon Press.
  • Hume, D., (1748 [1975]). “An Enquiry concerning Human Understanding.” In Selby-Bigge, L. A. & Nidditch, P. H., (eds). Enquiries concerning Human Understanding and concerning the Principle of Morals. Oxford: Clarendon Press.
  • Hume, D., (1751 [1975]). “An Enquiry concerning the Principles of Morals.” In Selby-Bigge, L. A. & Nidditch, P. H., (eds). Enquiries concerning Human Understanding and concerning the Principle of Morals. Oxford: Clarendon Press.
  • Jeffrey, R., (1992). Probability and the Art of Judgement. Cambridge: Cambridge University Press.
  • Kant, I., (1783 [2004]). Prolegomena to any Future Metaphysics That Will Be Able to Come Forward as Science. Revised edition.G. Hatfield (trans. and ed). Cambridge: Cambridge University Press.
  • Kant, I., (1781-1787 [1998]). Critique of Pure Reason. Guyer, P. and Wood, A. W. (trans and eds). Cambridge: Cambridge University Press.
  • Kant, I., (1992). Lectures on Logic. Young, J. M (trans. and ed.). Cambridge: Cambridge University Press.
  • Keynes, J. M., (1921). A Treatise on Probability. London: Macmillan and Company.
  • Keynes, J. M., (1923). A Tract on Monetary Reform. London: Macmillan and Company.
  • Laplace, P. S., (1814 [1951]). A Philosophical Essay on Probabilities. New York: Dover Publications, Inc.
  • Leibniz, G. W. (1989). Philosophical Essays. Ariew, R. and Garber, D. (trans.). Indianapolis & Cambridge: Hackett P.C.
  • Leibniz, G. W. (1989a). Philosophical Papers and Letters. Loemker, L. (trans.), Dordrecht: Kluwer.
  • Leibniz, G. W. (1896). New Essays on Human Understanding. New York: The Macmillan Company.
  • Leibniz, G. W. (1710 [1985]). Theodicy: Essays on the Goodness of God, the Freedom of Man and the Origin of Evil. La Salle, IL: Open Court.
  • Malebranche, N. (1674-5 [1997]). The Search after Truth and Elucidations of the Search after Truth. Lennon, T. M. and Olscamp, P. J. (eds). Cambridge: Cambridge University Press.
  • Mill, J. S. (1865). An Examination of Sir William Hamilton’s Philosophy. London: Longman, Roberts and Green.
  • Mill, J. S. (1879). A System of Logic, Ratiocinative and Inductive: Being a Connected View of The Principles of Evidence and the Methods of Scientific Investigation. New York: Harper & Brothers, Publishers.
  • Papineau, D., (1992). “Reliabilism, Induction and Scepticism.” The Philosophical Quarterly 42(66): 1-20.
  • Popper, K., (1972). Objective Knowledge: An evolutionary approach. Oxford: Oxford University Press.
  • Popper, K., (1974). “Replies to My Critics.” In Schilpp, P. A., (ed.). The Philosophy of Karl Popper. 961-1174. Library of Living Philosophers, Volume XIV Book II. La Salle, IL: Open Court Publishing Company.
  • Psillos, S. (1999). Scientific Realism: How Science Tracks Truth. London: Routledge.
  • Psillos, S. (2015). “Induction and Natural Necessity in the Middle Ages.” Philosophical Inquiry 39(1): 92-134.
  • Ramsey, F., (1926). “Truth and Probability.” In Braithwaite, R. B. (ed.). The Foundations of Mathematics and other essays. London: Routledge.
  • Reichenbach, H., (1934 [1949]). The Theory of Probability: An Inquiry into the Logical and Mathematical Foundations of the Calculus of Probability. Berkeley and Los Angeles: University of California Press.
  • Reichenbach, H., (1951). The Rise of Scientific Philosophy. Berkeley and Los Angeles: University of California Press.
  • Russell, B., (1912). The Problems of Philosophy. London: Williams and Norgate; New York: Henry Holt and Company.
  • Russell, B., (1948 [1992]). Human Knowledge—Its Scope and Limits. London: Routledge.
  • Schurz, G., (2019). Hume’s Problem Solved. The Optimality of Meta-Induction. Cambridge, MA: The MIT Press.
  • Strawson, P. F., (1952 [2011]). Introduction to Logical Theory. London: Routledge.
  • Teller, P., (1969). “Goodman’s Theory of Projection.” The British Journal for the Philosophy of Science, 20(3): 219-238.
  • van Cleve, J., (1984). “Reliability, Justification, and the Problem of Induction.” Midwest Studies in Philosophy 9(1): 555-567.
  • Venn, J., (1888). The Logic of Chance. London: Macmillan and Company.
  • Venn, J., (1889). The Principles of Empirical or Inductive Logic. London: Macmillan and Company.
  • Whewell, W., (1840). The Philosophy of the Inductive Sciences, Founded Upon Their History, vol. I, II. London: John W. Parker, West Strand.
  • Whewell, W., (1858). Novum Organum Renovatum. London: John W. Parker, West Strand.
  • Whewell, W., (1849). Of Induction with especial reference to John Stuart Mill’s System of Logic. London: John W. Parker, West Strand.

 

Author Information

Stathis Psillos
Email: psillos@phs.uoa.gr
University of Athens
Greece

and

Chrysovalantis Stergiou
Email: cstergiou@acg.edu
The American College of Greece
Greece

The Benacerraf Problem of Mathematical Truth and Knowledge

Before philosophical theorizing, people tend to believe that most of the claims generally accepted in mathematics—claims like “2+3=5” and “there are infinitely many prime numbers”—are true, and that people know many of them.  Even after philosophical theorizing, most people remain committed to mathematical truth and mathematical knowledge.

These commitments come as a package.  Those committed to mathematical knowledge are committed to mathematical truth because knowledge is factive.  One can only know claims that are true.  And those committed to mathematical truth turn out to be committed to mathematical knowledge as well.  The reasons for this are less transparent.  But regardless of what those reasons are, commitments to mathematical truth and to mathematical knowledge always seem to stand and fall together.

There is a serious problem facing the standard view that we know most of the mathematical claims we think we know, and that these claims are true.  The problem, first presented by Paul Benacerraf (1973), is that plausible accounts of mathematical truth and plausible accounts of mathematical knowledge appear to be incompatible with each other.  It has received a great deal of attention in the philosophy of mathematics throughout the late 20th and early 21st centuries.

This article focuses on illuminating Benacerraf’s mathematical truth-mathematical knowledge problem.  Section 1 outlines how two plausible constraints on accounts of mathematical truth, the semantic constraint and the epistemological constraint, give rise to challenges for those committed to mathematical truth and mathematical knowledge.  The details of these challenges are developed in sections 2–5.  Sections 2 and 3 focus on platonistic accounts of mathematical truth; semantic arguments that support mathematical platonism are addressed in section 2, and epistemological arguments against mathematical platonism are addressed in section 3.  Section 4 expands the epistemological arguments beyond the platonistic context and sketches a range of responses to the epistemological concerns. Section 5 focuses on a category of accounts of mathematical truth that Benacerraf calls combinatorial.  Such accounts appear to make sense of mathematical epistemology but seem semantically implausible.  Taken together, these arguments suggest a serious problem for those committed to both mathematical truth and mathematical knowledge: it is unclear how an account of mathematical truth can fare well both semantically, as an account of truth, and also epistemologically, so that we can make sense of our possession of mathematical knowledge.

This article is about the mathematical truth-mathematical knowledge problem stemming from Benacerraf 1973.  It does not address a different issue also sometimes called “the Benacerraf problem”, stemming from Benacerraf 1968, which centers on the existence of multiple candidates for set-theoretic reductions of the natural numbers.

Table of Contents

  1. Introducing the Problem
    1. Two Constraints
    2. Overview of the Problem
    3. A Terminological Note: “Benacerraf’s Dilemma”
  2. The Semantic Constraint and Mathematical Platonism
    1. The Fregean Semantic Argument for Mathematical Platonism
    2. The Quinean Semantic Argument for Mathematical Platonism
    3. The Semantic Constraint and Benacerraf’s Argument
  3. Epistemological Problems for Platonism
    1. Benacerraf’s Epistemological Argument
    2. Objections to and Reformulations of Benacerraf’s Epistemological Argument
    3. Field’s Epistemological Argument
  4. Related Epistemological Challenges and Responses
    1. Further Challenges and Generalizations
    2. Addressing the Epistemological Challenges
  5. Epistemologically Plausible Accounts of Mathematical Truth
    1. Combinatorial Accounts
    2. Combinatorial Accounts and the Epistemological Constraint
    3. The Problem with Combinatorial Accounts
  6. Conclusion
  7. References and Further Reading

1. Introducing the Problem

a. Two Constraints

Most discussion of the mathematical truth-mathematical knowledge problem traces back to Benacerraf’s 1973 paper “Mathematical Truth.”  In that paper, Benacerraf identifies two constraints on accounts of mathematical truth.  These constraints are:

Semantic Constraint: The account of mathematical truth must cohere with a “homogeneous semantical theory in which semantics for the propositions of mathematics parallel the semantics for the rest of the language” (p. 661).

Epistemological Constraint: “The account of mathematical truth [must] mesh with a reasonable epistemology,” that is, with a plausible general epistemological theory (p. 661).

Why should anyone accept the semantic constraint? Accounts of truth are intimately connected to accounts of what sentences mean, that is, to accounts of the semantics of language.  If the semantics of mathematical language were to differ substantially from the semantics of the rest of language, then mathematical truth would correspondingly differ, again substantially, from truth in other domains.  As Benacerraf puts it,

A theory of truth for the language we speak, argue in, theorize in, mathematize in, etc., should… provide similar truth conditions for similar sentences.  The truth conditions assigned to two sentences containing quantifiers should reflect in relevantly similar ways the contribution made by the quantifiers. (p. 662)

A divide between semantics and truth in mathematics and semantics and truth in other domains would render mathematical truth unrecognizable as truth, rather than some other property.  (The discussion of what Benacerraf calls “combinatorial” accounts of mathematical truth in section 5 illustrates this point more clearly.)  Part of the motivation for the semantic constraint is the idea that an account of mathematical truth should fall under an account of truth quite generally, and that, for this to happen, mathematical language needs the same kind of semantics as the rest of language.

Why should anyone accept the epistemological constraint? First, mathematical knowledge is intertwined with knowledge of other domains.  A lot of science, for example, involves mathematics.  If our ability to possess mathematical knowledge is unintelligible, that unintelligibility is liable to infect the intelligibility of the rest of our knowledge as well.  And second, as discussed in the introduction to this article, commitments to mathematical truth stand and fall with commitments to mathematical knowledge.  On a plausible account of mathematical truth, many mathematical truths are known.  In short, one cannot maintain commitments to both mathematical truth and mathematical knowledge if one’s account of mathematical truth renders mathematical knowledge unintelligible.  For both of these reasons, it is important that an account of mathematical truth leave room for mathematical knowledge.  To do that, the account should cohere with a plausible general account of knowledge.

b. Overview of the Problem

As Benacerraf first presented the mathematical truth-mathematical knowledge problem, the issue is that the semantic constraint and the epistemological constraint appear to be in tension with each other.  He put it like this:

It will be my general thesis that almost all accounts of the concept of mathematical truth can be identified with serving one or another of these masters [the semantic and epistemological constraints] at the expense of the other.  Since I believe further that both concerns must be met by any adequate account, I find myself deeply dissatisfied with any package of semantics and epistemology that purports to account for truth and knowledge both within and outside of mathematics.  (Benacerraf 1973: 661-2).

Although Benacerraf presents this tension as a problem, it is worth noting that he does not appear to abandon his commitments to mathematical truth and mathematical knowledge in light of his concerns.  Instead, he takes it to be reason for dissatisfaction with existing accounts of mathematical truth and mathematical knowledge.  The ultimate intent is to challenge philosophers to develop a better account of mathematical truth and mathematical knowledge—an account that satisfies both constraints.  Indeed, Benacerraf writes, “I hope that it is possible ultimately to produce such an account” (p. 663).

There are two distinct problems that arise from the tension between the semantic and epistemological constraints.  One is a problem for accounts of mathematical truth in general.  The other is a targeted problem that specifically engages with platonistic accounts of mathematical truth.

Benacerraf presents the general problem in the form of a dilemma.  One horn, which is addressed in sections 2c and 3a, is concerned with accounts of mathematical truth that appear to satisfy the semantic constraint. Though there are such accounts, Benacerraf argues that they do not satisfy the epistemological constraint: “accounts of truth that treat mathematical and nonmathematical discourse in relevantly similar ways do so at the cost of leaving it unintelligible how we can have any mathematical knowledge whatsoever.” (p. 662)

The other horn, which is addressed in section 5, is concerned with accounts that appear to satisfy the epistemological constraint.  Though there are accounts of this sort as well, Benacerraf argues that they are inadequate as accounts of truth: “whereas those [accounts] which attribute to mathematical propositions the kinds of truth conditions we can clearly know to obtain, do so at the expense of failing to connect these conditions with any analysis of the sentences which shows how the assigned conditions are conditions of their truth.” (p. 662)

The reasons that such accounts appear to fall short of explaining truth are entwined with reasons for thinking that they rely on semantics that seem quite different from the semantics of the rest of language.   In short, the general problem for accounts of mathematical truth is that, while accounts of mathematical truth can satisfy the semantic constraint or the epistemological constraint, an account that satisfies one constraint appears to be doomed to violate the other.

The targeted problem is effectively one horn of the general dilemma, specifically, the horn engaging with accounts that satisfy the semantic constraint.  The targeted problem involves two separate arguments, which together suggest an incompatibility of the semantic and epistemological constraints.

The first argument in the targeted problem aims to establish that the semantic constraint requires the existence of mathematical objects—objects that presumably are abstract and independent of human minds and language.  Mathematical platonism is the view on which mathematical objects exist, are abstract, and are independent of human minds and language, so the upshot of this argument is that platonistic accounts are the only accounts of mathematical truth that satisfy the semantic constraint.  Section 2c addresses Benacerraf’s semantic argument to this effect.  Frege and Quine also gave semantic arguments for the existence of mathematical objects; their arguments are discussed in sections 2a and 2b, respectively.

The second argument in the targeted problem aims to establish that mathematical truths are not knowable if platonistic accounts are correct, because causal interaction with objects is required for knowledge of those objects. Benacerraf’s argument to this effect is discussed in section 3a, and further epistemological arguments are discussed in sections 3c and 4a. Putting the semantic argument for mathematical platonism together with the epistemological argument against mathematical platonism, the targeted problem is that the specific kind of account of mathematical truth that seems to be required to satisfy the semantic constraint—namely, mathematical platonism—is precluded by the epistemological constraint.

There are several ways of responding to Benacerraf’s mathematical truth-knowledge problem and the apparent incompatibility of the semantic and epistemological constraints.  Some proponents of mathematical truth and knowledge have focused on undermining the arguments that purport to establish the apparent incompatibility (see section 3b).  Other proponents of mathematical truth have focused on developing positive accounts of mathematical truth and mathematical knowledge that are compatible with one another (see section 4b).  And some philosophers have taken the apparent incompatibility as an argument against mathematical truth.

c. A Terminological Note: “Benacerraf’s Dilemma”

Somewhat confusingly, the term “Benacerraf’s dilemma” has been used for two distinct problems in the literature emerging from Benacerraf 1973. In the years before 1990, the term did not get much use, and when it was used, it sometimes denoted the general problem (see Bonevac 1983: 98, Weinstein 1983: 266-267, and Papineau 1988: 15) and sometimes the targeted problem (see Parsons 1979: 161 n.12, Kitcher & Aspray 1987: 14).  Uses of the term were about evenly split between the two problems.  In at least one paper, both problems were described as dilemmas raised by Benacerraf (see Creath 1980: 335, 336).

Starting with three papers in 1991 (Maddy 1991: 155, Hart 1991a, 1991b: 61), usage of the term “Benacerraf’s dilemma” shifted almost exclusively to the targeted problem.  (There are some rare exceptions, for example, Halimi 2016: 45.)  And starting in the early 2000s, the term “Benacerraf’s dilemma” came to be used much more often.  This article adopts the usage that has been standard since about 1991.  That is, it uses the term “Benacerraf’s dilemma” as a name for the targeted problem that focuses on platonistic accounts of mathematical truth.

2. The Semantic Constraint and Mathematical Platonism

Benacerraf’s dilemma, which comprises one horn of the general dilemma, focuses on platonistic accounts of mathematical truth.  The first half of Benacerraf’s dilemma consists of semantic arguments in favor of mathematical platonism.  This section covers three such semantic arguments, each of which starts by presupposing the truth of certain mathematical or number-involving claims, specifically claims that involve no quantification or start with existential quantifiers.  In accord with the semantic constraint, each semantic argument then draws on some general semantical theory thought to cover all of language (and not just mathematics). The arguments differ in the semantical theories they presuppose, but each uses semantic and grammatical features of specific mathematical sentences—ones that are assumed to be true—in conjunction with those general semantical theories to argue for the existence of mathematical objects.  Mathematical platonism will follow from this conclusion if one assumes, as many but not all do, that these mathematical objects must be outside of space, time, and the causal realm, and independent of human minds and language.

The second half of Benacerraf’s dilemma focuses on epistemological problems for mathematical platonism; those are the topic of section 3.

a. The Fregean Semantic Argument for Mathematical Platonism

One prominent semantic argument for mathematical platonism is Frege’s.  Frege’s primary reasons for believing that numbers are objects came from observations about the way numerical expressions are used in language. In The Foundations of Arithmetic, he gives two reasons for endorsing the view that numbers are objects.  The first is that we apply the definite article to numerical expressions. In his words: “[W]e speak of ‘the number 1’, where the definite article serves to class it as an object” (1884: §57).

Furthermore, assuming the truth of basic arithmetic, such definite descriptions appear in true sentences, for example, ‘the number 1 is less than the number 4’.  This ensures that “the number 1” denotes successfully.  (In ordinary, straightforward contexts, definite descriptions that do not denote, for example, “the king of France”, do not appear in true sentences.)  The use of the definite description indicates that “the number 1” plays the grammatical role of denoting an object (as opposed to, say, a concept), and its use in true sentences indicates that it does so successfully.

Frege’s second reason for believing that numbers are objects is that numerical expressions are used in (true) identity statements, where identity is presumed to be a relation between an object and itself. (Frege rejects the idea that there can be identities between, say, concepts.) He appeals to two kinds of numerical statements to establish that numerical expressions are used in this way. First, he claims that statements of the following form are identity statements:

(a) The number of Jupiter’s moons is four.

(Jupiter has dozens of moons, but only its Galilean moons—Io, Europa, Ganymede, and Callisto—were known in 1884.  What follows pretends that (a) is true.)  In treating statements like (a) as identity statements, Frege denies that statements of number are predicative statements akin to ‘The surface of Earth’s moon is rocky’. As he explains it: “Here ‘is’ has the sense of ‘is identical with’ or ‘is the same as.’ So that what we have is an identity, stating that the expression ‘the number of Jupiter’s moons’ signifies the same object as the word ‘four’” (§57).

Second, Frege claims that arithmetical equalities (for example, “1 + 1 = 2”) are numerical identity statements too. With such equalities in mind, he asserts that, “identities are, of all forms of proposition, the most typical of arithmetic” (§57).

If Frege is right, numerical identity statements are overwhelmingly common, and many of them are true.  If he is also right that identity statements are true precisely when one and the same object is signified by the expressions on either side of the identity sign, then it follows that numbers are objects.

A third reason that some neo-Fregeans (but not Frege himself) have offered for thinking that numbers are objects, rather than (Fregean) concepts, is the so-called “Aristotelian test”.  The basic idea, in Bob Hale’s words, is this:

[W]hereas for any given predicate there is always a contradictory predicate, applying to any given object if and only if the original predicate fails to apply, there is not, for singular terms, anything parallel to this—we do not have, for a given singular term, another ‘contradictory’ singular term such that a statement incorporating the one is true if and only if the corresponding statement incorporating the other is not true. (Hale 2001: 40)

Numerical expressions are singular terms—they have the grammatical role of referring to objects—because they do not have contradictories in the sense Hale describes.  Consider, for example, the statement “The number of Jupiter’s moons is greater than the square root of four”.  If (a) is true, then this statement is true too.  But there is no contradictory of the word “four”, for example, “non-four”, that will make both these statements false.  Perhaps, grammatical awkwardness aside, “The number of Jupiter’s moons is non-four” would be false.  But there is no straightforward way to evaluate the truth-value of “The number of Jupiter’s moons is greater than the square root of non-four”.  That is because “four” functions grammatically as a singular term, and not as a predicate.

Fregean arguments for the claim that numbers are objects, then, start with the assumption that a range of mathematical claims are true, and then appeal to grammatical considerations to argue that the truth of those claims requires the existence of mathematical objects like numbers.

There is an obvious objection to these semantic considerations in favor of taking numbers to be objects: not all numerical language seems to fit this pattern.  Numerical expressions are often used as nouns; following Geach (1962) and Dummett (1973), call such uses substantival.  The concern is that not all uses of numerical expressions are substantival.  Frege offers the following example in the Grundlagen:

(b) Jupiter has four moons.

Again, following Geach and Dummett, call such uses of numerical expressions adjectival.  Adjectival uses of numerical expressions do not seem to support the claim that numbers are objects.

Frege does not think that adjectival uses undermine the claim that numbers are objects.  Rather, he thinks that such uses are secondary to substantival uses.  Because of the centrality of identity statements (that is, equations) in arithmetic, substantival uses are more useful for developing “a concept of number usable for the purposes of science” (§57).  Furthermore, he suggests that adjectival uses of numerical expressions are eliminable. Adjectival uses appear in sentences that have the same content as sentences with substantival uses.  For example, sentences like (b) and (a) express the same proposition.  Expressing that proposition in the form of (a) instead of (b) eliminates the adjectival use of ‘four’ in (b) in favor of the substantival use in (a).  But the converse does not hold.  Substantival uses cannot always be eliminated in favor of adjectival uses because identity claims like “1+1=2” cannot be transformed into sentences with adjectival uses of numerical expressions without changing propositional content.  So, since substantival uses of numerical expressions are the central uses, and adjectival uses are eliminable, the substantival uses are the ones that reveal the ontology.

Although Frege does not appeal to adjectival uses of numerical expressions in his arguments for the existence of numbers as objects, his treatment of adjectival uses would seem to combine well with the Quinean semantic arguments for mathematical platonism discussed in section 2b.  As Frege describes it, adjectival uses attribute properties to concepts.  For example, (b) attributes a property to the concept “moon of Jupiter”.  He writes: “This is perhaps clearest with the number 0.  If I say ‘Venus has 0 moons’… what happens is that a property is assigned to the concept ‘moon of Venus’, namely that of including nothing under it” (1884: §46).

More specifically, the properties that Frege takes adjectival uses of numerical expressions to attribute to concepts are properties concerning how many things instantiate the concept in question. For example, to say that there are zero Fs, where F designates a concept, is to say that nothing falls under the concept F, that is, F is not instantiated, that is, there do not exist any Fs.  Or, equivalently, to say that Fs do exist is to say that there are not zero Fs.  In Frege’s words, “affirmation of existence is in fact nothing but denial of the number nought” (§53).

This idea extends to adjectival uses of other numerical expressions.  To say that there is one F is to say that there is something that falls under F, and that anything that falls under F is identical with that thing.  To say that there are two Fs is to say that there exist things a and b that fall under F, that a and b are non-identical, and that anything that falls under F is either a or b.  And so on.  In practice, this means that sentence (b) can be understood as equivalent to the more explicitly quantificational statement “There are four moons of Jupiter”.
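The paraphrases just described can be rendered in standard first-order notation.  (The following is a familiar modern formulation of the so-called numerical quantifiers, not Frege’s own Begriffsschrift symbolism.)

```latex
\begin{align*}
\exists_{0}x\,Fx &\iff \neg\exists x\,Fx \\
\exists_{1}x\,Fx &\iff \exists x\,\big(Fx \wedge \forall y\,(Fy \rightarrow y = x)\big) \\
\exists_{2}x\,Fx &\iff \exists x\,\exists y\,\big(Fx \wedge Fy \wedge x \neq y \wedge \forall z\,(Fz \rightarrow z = x \vee z = y)\big)
\end{align*}
```

On this rendering, adjectival number talk is absorbed into the quantificational apparatus of logic, with no singular term for a number appearing anywhere.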

On Frege’s view, then, adjectival uses of numerical expressions have something akin to quantificational significance.  And that leads us to the Quinean semantic argument.

b. The Quinean Semantic Argument for Mathematical Platonism

Quinean semantic arguments for mathematical platonism appeal to uses of quantification in mathematics to establish that true mathematical sentences commit us to the existence of (presumably platonistic) mathematical objects.  Quine (1948: 28), for example, starts with this sentence:

(c) There is a prime number between 1000 and 1010.

Sentence (c) appears to quantify over prime numbers between 1000 and 1010.

To get from the truth of sentences like (c) to the existence of mathematical objects, proponents of Quinean arguments appeal to Quine’s criterion of ontological commitment: “[A] theory is committed to those and only those entities to which the bound variables of the theory must be capable of referring in order that the affirmations made in the theory be true” (1948: 33).

The Quinean semantic argument depends on a general claim about the connection between true sentences and ontology: we must take the quantificational claims of our accepted theories (including arithmetic) at face value. If theories that we accept include sentences that purport to quantify over mathematical objects, then we are committed to accepting those objects into our ontology. Since we accept arithmetic, and since standard arithmetic entails (c), that sentence commits us to the existence of an object that witnesses its existential claim—the prime number 1009.
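The arithmetic behind Quine’s example is easy to verify.  The following sketch (our own illustration, not anything in Quine) confirms that 1009 is the unique prime strictly between 1000 and 1010:

```python
def is_prime(n):
    """Primality by trial division; adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# The numbers strictly between 1000 and 1010 are 1001 through 1009.
primes_in_range = [n for n in range(1001, 1010) if is_prime(n)]
print(primes_in_range)  # [1009]
```

So sentence (c) has exactly one witness, the prime 1009.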

The Quinean approach differs from the Fregean approach in that the two arguments rely on different grammatical features to generate the commitment to mathematical objects.  The Fregean argument appeals to singular terms, and the Quinean argument appeals to quantification.  But either kind of semantic argument can be used to support the claim that there are mathematical objects—and also to support mathematical platonism, if such objects are assumed to be abstract and independent of human minds and language.  Vestiges of both kinds of arguments seem to appear in Benacerraf’s argument that accounts of mathematical truth that satisfy the semantic constraint are inevitably platonistic.

c. The Semantic Constraint and Benacerraf’s Argument

The semantic constraint on accounts of mathematical truth requires that an account of mathematical truth cohere with a “homogeneous semantical theory in which semantics for the propositions of mathematics parallel the semantics for the rest of the language” (Benacerraf 1973: 661).  An account on which names, descriptions, or predicates function differently in mathematics than they do in the rest of language, for example, will not satisfy the semantic constraint.  Neither will an account on which quantifiers function differently in mathematics than they do in non-mathematical language.

Benacerraf appeals to David Hilbert’s finitist account of mathematical truth in “On the Infinite” (1925) to illustrate how an account of mathematical truth can violate the semantic constraint. The semantic problem with Hilbert’s account arises in his treatment of quantified arithmetical statements.  Consider two different statements about how, given some prime number p, there is a greater prime number:

(d) ∃n (n>p and n is prime)

(e) ∃n (p!+1≥n>p and n is prime)

(Note: “p!” is the product of all the natural numbers up to and including p, that is, “1×2×3×…×(p-1)×p”.)  As Hilbert sees it, both (d) and (e) abbreviate non-quantified statements:

(d’) (n=p+1 and n is prime) or (n=p+2 and n is prime) or (n=p+3 and n is prime) or …

(e’) (n=p+1 and n is prime) or (n=p+2 and n is prime) or … or (n=p!+1 and n is prime)

Statement (e) abbreviates a finitary statement, (e’), because it sets an upper limit of p!+1 on the candidate numbers that might be primes greater than p.  But statement (d) does not; it sets no upper limit and does not abbreviate a finitary statement. Hilbert contends that only finitary statements are meaningful.  So, while statements like (d) are theoretically useful, they are ‘ideal statements’ and strictly speaking “signify nothing”.

[W]e conceive mathematics to be a stock of two kinds of formula: first, those to which meaningful communications of finitary statements correspond; and, secondly, other formulas which signify nothing and which are the ideal structures of our theory. (Hilbert 1925: 146)

One might have a uniform semantics, across language, on which existentially quantified statements abbreviate (possibly infinite) disjunctions and universally quantified statements abbreviate (possibly infinite) conjunctions.  But in claiming that quantified statements like (d) “signify nothing”, Hilbert treats arithmetical quantifiers in a heterogeneous fashion, with no apparent parallel in non-mathematical discourse.  Accordingly, Hilbert’s finitist account does not satisfy the semantic constraint.
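Incidentally, the upper limit p!+1 in (e) is no accident.  By Euclid’s classic argument, p!+1 is not divisible by any number from 2 through p, so its smallest prime factor must exceed p; hence some prime n with p < n ≤ p!+1 always exists.  A quick sketch (our own illustration) confirms the bound for small primes:

```python
import math

def is_prime(n):
    """Primality by trial division; adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n**0.5) + 1))

# For each prime p, find the least prime n with p < n <= p! + 1.
for p in [2, 3, 5, 7]:
    bound = math.factorial(p) + 1
    witness = next(n for n in range(p + 1, bound + 1) if is_prime(n))
    assert p < witness <= bound
    print(p, witness, bound)
```

Because the search space in (e’) is finite, Hilbert can regard it as a meaningful finitary statement, while (d’), which runs on without end, is not.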

This offers a sense of what sort of account fails to satisfy the semantic constraint.  But what, precisely, is required for an account to satisfy that constraint?  In addition to providing a homogeneous semantics, such an account should apply the same general semantic account to mathematical and non-mathematical language.  This was done by both Frege and Quine; both gave arguments for the existence of mathematical objects that depended on the idea that semantic features of language incur the same ontological commitments regardless of whether or not that language is mathematical.

Benacerraf’s key take-away from the semantic constraint is that, if non-mathematical and mathematical sentences are similar in superficial sentential form, they should have parallel semantics.  With that in mind, he focuses his argument for mathematical platonism on two sentences that are both standardly accepted as true:

(1) There are at least three large cities older than New York.

(2) There are at least three perfect numbers greater than 17.

Assuming a compositional semantics, (1) seems to be an instance of a fairly simple “logico-grammatical” form involving straightforward uses of quantification, predicates, and singular terms:

(3) There are at least three FGs that bear R to a.

(In keeping with standard interpretation, the form of (3) is preserved when the English phrase ‘There are at least three’ is replaced with more transparently quantificational notation.)  Furthermore, since (1) and (2) are similar in superficial sentential form, an account that satisfies the semantic constraint ought to attribute the same form to (2).  That is, the reasoning goes, if (1) exhibits the form of (3), then (2) ought to exhibit the form of (3) as well.  (For discussion of Benacerraf’s assimilation of (1) and (2) to (3), see Katz 1995: 497 and Nutting 2018.)
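In more transparently quantificational notation (one standard rendering; Benacerraf does not display the formula himself), the shared form (3) comes out as:

```latex
\exists x\,\exists y\,\exists z\,\big(\,x \neq y \wedge x \neq z \wedge y \neq z
  \,\wedge\, Fx \wedge Gx \wedge Rxa
  \,\wedge\, Fy \wedge Gy \wedge Rya
  \,\wedge\, Fz \wedge Gz \wedge Rza\,\big)
```

For (1), F is “large”, G is “city”, R is “older than”, and a is New York; for (2), F is “perfect”, G is “number”, R is “greater than”, and a is 17.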

At this point in the argument, Benacerraf wants to place an account of mathematical truth that satisfies the semantic constraint within the context of a general account of truth.  He only sees one plausible general account of truth on offer: “I take it that we have only one such account: Tarski’s, and that its essential feature is to define truth in terms of reference (or satisfaction) on the basis of a particular kind of syntactico-semantical analysis of the language” (1973: 667).

The “particular kind of syntactico-semantical analysis of the language” of a Tarskian account is one that will treat both (1) and (2) as having the ‘logico-grammatical’ form of (3).

A “Tarskian” account of truth goes beyond the T-Convention. (For more on the T-Convention, see the IEP article on Truth.)  As Benacerraf sees it, Tarskian accounts are correspondence-style accounts of truth; they “define truth in terms of reference (or satisfaction).”  Because he takes the only plausible accounts of truth to be Tarskian in this sense, and hence referential in nature, Benacerraf takes there to be ontological import to analyzing (2) as exhibiting the form of (3).  The truth of (2) requires both that its singular term (“17”) refer to a mathematical object and that its quantifiers range over mathematical objects such as 28 (a perfect number).  Just as the truth of (1) requires the existence of cities, the truth of (2) requires the existence of numbers.

Furthermore, if there are such mathematical objects, Benacerraf assumes that they are platonistic.  That is, mathematical objects are abstract objects, outside of space, time, and the causal realm, as well as independent of human minds and language.  So, the argument goes, an account of mathematical truth that satisfies the semantic constraint will be an account on which the singular terms of true mathematical sentences refer to platonistic mathematical objects, and the quantifiers of such sentences quantify over the same.  This view is mathematical platonism.  (See the IEP article on mathematical platonism.) So, on the picture presented in Benacerraf’s argument, accounts that satisfy the semantic constraint appear to be versions of mathematical platonism.  (Not all philosophers have accepted that mathematical objects must be platonistic.  For example, Maddy (1990) offers an account on which some mathematical objects, namely impure sets, can be perceived, and hence are causally efficacious.  Alternatively, Kit Fine (2006) and Audrey Yap (2009) both offer views on which mathematical objects depend for their existence on human minds. All of these views are briefly sketched in section 4b.)

Some philosophers (for example, Creath 1980, Tait 1986) have objected to Benacerraf’s characterization of Tarskian accounts of truth, claiming that Tarski’s account is not referential and that it does not have the ontological implications that Benacerraf claims.  But, assuming that statements (1) and (2) both have the form of (3), Benacerraf only needs one of two key claims from his Tarskian assumption to get a semantic argument for mathematical platonism off the ground.  The first, which also is used in Fregean arguments (see section 2a), is that singular terms in true sentences refer to objects.  This claim allows one to argue that, since the term “17” in (2) plays the same role as “New York” in (1), it refers to some specific mathematical object.  The second, which also is used in Quinean arguments (see section 2b), is that quantifiers quantify over objects; that is, true existential claims must be witnessed by objects.  This claim allows one to argue that, given the quantification in (2), the truth of (2) commits us to the existence of perfect numbers greater than 17—and hence, numbers. For the purposes of Benacerraf’s semantic argument, it does not matter whether either of these claims is genuinely Tarskian.  It only matters that any plausible general semantics will uphold at least one of them, and that either claim entails that the truth of (2) requires the existence of mathematical objects.

3. Epistemological Problems for Platonism

Mathematical platonists claim that mathematical sentences involve reference to, and quantification over, abstract mathematical objects like numbers and sets.  Such accounts of mathematical truth appear to fare well with respect to the semantic constraint. But because of the abstract nature of the mathematical objects that such accounts posit, they do not appear to fare as well with respect to the epistemological constraint.  This section addresses the two most central epistemological arguments against mathematical platonism in the literature: Benacerraf’s (1973) and Hartry Field’s (1989).

Benacerraf’s and Field’s arguments have given rise to a range of further epistemological arguments and challenges that bear something of a family resemblance to one another, and that target a broader range of accounts of mathematical truth (and not just mathematical platonism).  More of these epistemological arguments and challenges are addressed in section 4a.

a. Benacerraf’s Epistemological Argument

In “Mathematical Truth,” Benacerraf presents a tentative argument against mathematical platonism.  The argument depends on a causal constraint on accounts of mathematical knowledge: “for X to know that S is true requires some causal relation to obtain between X and the referents of the names, predicates, and quantifiers of S” (Benacerraf 1973: 671).

This suggested causal constraint on knowledge poses a problem for mathematical platonism.  As described above, mathematical platonism entails the view that the terms (including names) in mathematical sentences refer to abstract mathematical objects, which are outside space, time, and the causal realm.  According to platonistic views, the quantifiers of such sentences (“there are…”, “every…”) range over those same abstract mathematical objects.  But causal relations cannot obtain between people and the referents of names, predicates, and quantifiers of mathematical sentences if those referents are abstract, acausal mathematical objects.  As a result, Benacerraf’s causal constraint renders mathematical knowledge impossible if mathematical platonism is true.
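The argument of the preceding two paragraphs can be compressed into a schematic form.  The symbols below are illustrative shorthand, not Benacerraf’s own notation: K(X, S) for “X knows that S is true,” ref(S) for the referents of the names, predicates, and quantifiers of S, and C(X, y) for “some causal relation obtains between X and y.”

```latex
% Benacerraf's causal constraint on knowledge, stated schematically:
\[
K(X, S) \;\rightarrow\; \forall y \in \mathrm{ref}(S)\; C(X, y)
\]
% Platonism takes the referents of mathematical terms to be abstract
% and acausal, so no causal relation can obtain with any of them:
\[
\forall y \in \mathrm{ref}(S_{\mathrm{math}})\; \neg C(X, y)
\]
% By contraposition, the constraint rules out mathematical knowledge:
\[
\neg K(X, S_{\mathrm{math}})
\]
```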

The thrust of this argument is that it appears to be impossible for mathematical platonists to explain mathematical knowledge in a way that coheres with a plausible general epistemological theory.  At least, that is the case if Benacerraf’s causal constraint is a consequence of any plausible general epistemological theory.  But why should someone endorse the causal constraint?

Benacerraf suggests two different reasons for endorsing the causal constraint.  First, he thinks it is a consequence of a causal account of knowledge.  And second, he thinks it is a consequence of a causal theory of reference.

The causal account of knowledge that Benacerraf cites is not precisely specified.  But it seems to be motivated, in part, by the idea that there are limits to what kinds of justification, warrant, or evidence can support knowledge of a claim.  As Benacerraf sees it, “it must be possible to establish an appropriate sort of connection between the truth conditions of p… and the grounds on which p is said to be known” (Benacerraf 1973: 672).

Benacerraf seems to think that, if one is to know a claim, one’s evidence for that claim must trace back to the constraints that the truth of the claim imposes upon the world.  More specifically, Benacerraf thinks one’s evidence must trace back to those constraints—those truth-conditions—in a causal way.  In his words, a person must “make necessary (causal) contact with the grounds of the truth of the proposition… to be in a position of evidence adequate to support the inference (if an inference was relevant)” (1973: 671–672).

Benacerraf’s causal epistemological picture appears to require that instances of knowledge depend on some kind of causal interaction with the facts known.

The causal theories of knowledge that Benacerraf cites are found in the work of Alvin Goldman (1967), Brian Skyrms (1967), and Gilbert Harman (1973).  All three of them propose that causal interaction with at least some facts is a necessary condition for knowledge of those facts.  (None of them, however, actually accepts such a necessary condition for all knowledge.)  But without any sense of what is required to causally interact with a fact, this necessary condition does not yet entail the causal constraint on knowledge set out at the beginning of this section—that knowing a sentence to be true requires causal interactions with the referents of the names, and so forth, that appear in that sentence.

To support the causal constraint on knowledge, Benacerraf also seems to rely on something like Paul Grice’s (1961) causal theory of perception to get that causal interaction with a fact necessarily includes causal interaction with the objects involved in that fact, that is, the referents of the names and quantifiers in sentences that state the fact.  To use an example of Benacerraf’s (1973: 671), Hermione knows that she holds a black truffle because she visually perceives that fact.  The view seems to be that, as part of perceiving that fact, she perceives the truffle itself.  So, not only do instances of knowledge depend on some kind of causal interaction with the facts known, but causal interaction with the facts known seems to require causal interaction with the objects involved in those facts.  Together, these two claims support the causal constraint on knowledge.

The second reason that Benacerraf gives for endorsing the causal constraint is that he explicitly endorses a causal theory of reference.  Maddy offers a standard description of such a theory, focusing on predicates and reference to natural kinds.

According to the causal theory, successful reference to a natural kind is accomplished by means of a chain of communication from the referrer back to an initial baptism. (Of course, the baptist refers without such a chain, but most of us are rarely in that role.) One member of this chain acquires the word by means of a causal interaction with the previous link; that is, I learn a word, in part, by hearing it, reading it, or some such sensory experience, caused, in part, by my predecessor in the chain. (Maddy 1980: 166)

In cases of reference to individual objects through names, the same general picture of baptism and transmission is thought to hold.  Benacerraf takes a causal theory of reference to entail that reference to an object (or kind) through a name (or predicate) requires a causal connection—perhaps a complex one—with the object (or an instance of the kind) so named.

If that is correct, then the successful use of language depends on appropriate causal connections with the referents of the relevant names and predicates.  A person cannot even entertain a proposition expressed by a sentence that includes names, predicates, or quantification without causal connections with the referents of those linguistic elements.  And a person cannot know the truth of a sentence if she cannot entertain the proposition it expresses. Thus, the argument goes, a causal theory of reference also entails Benacerraf’s causal constraint on knowledge.

Benacerraf is not fully committed to his epistemological argument against platonism. Ultimately, his aim is to show that there is a need for a better unified account of mathematical truth and mathematical knowledge.  But regardless of his own attitude towards the argument, the success or failure of the argument ultimately rides on the plausibility of the causal constraint on knowledge.

b. Objections to and Reformulations of Benacerraf’s Epistemological Argument

The standard objection to Benacerraf’s epistemological argument against mathematical platonism is that one ought not to presuppose a causal theory of knowledge.  At least two different sorts of reason are given for this objection.  First, causal theories of knowledge had a short-lived prominence in epistemology.  As David Liggins puts it, “These days, Benacerraf’s argument has little force, since causal theories of knowledge are not taken very seriously” (2006: 137).

Second, causal constraints seem to be designed for empirical knowledge, and not for knowledge in general.  Some philosophers have even proposed the very case at issue, namely mathematics, as a counterexample to fully general causal constraints.  W.D. Hart objects to a general causal constraint on reference: “since I believe that some people and some singular terms refer to causally inert numbers, I am therefore forced to conclude that such causal theories of reference are false of numbers” (Hart 1979: 164).  David Lewis says something similar about knowledge: “Causal accounts of knowledge are all very well in their place, but if they are put forward as general theories, then mathematics refutes them” (Lewis 1986: 109).

But counterexamples are not the only objections that have been raised to the general causal constraint.  Some have objected that the general causal constraint leads to implausible results in domains other than mathematics.  John Burgess and Gideon Rosen (1996: 40), for example, argue that causal theories of knowledge implausibly preclude knowledge of future contingents.  Others have objected that the causal constraint involves an unwarranted assumption of empiricism, or an unwarranted assimilation of all knowledge to empirical knowledge (for example, Katz 1995: 493, Linnebo 2006: 546).  And, in a similar vein, Lewis (1986: 111) has argued that causal interaction is only required for knowledge of contingent truths, not for knowledge of necessary truths (like mathematical truths).  On his view, the role of causal requirements is to ensure that our beliefs are counterfactually dependent on the world around us; had the world been different, a causal requirement can ensure that our beliefs would likewise have been different.  But no causal interaction is required to ensure such counterfactual dependence in the case of necessary truths.  Those facts could not have been different, and so it is vacuously true that if the facts had been different our beliefs would likewise have been different.

A third sort of objection focuses on the resources Benacerraf uses to support the causal constraint.  While Benacerraf cites both a causal theory of reference and a causal theory of knowledge as supporting his causal constraint on knowledge, it is not clear that either kind of causal theory actually entails his causal constraint on knowledge.  Proponents of causal theories of knowledge (for example, Goldman 1967, Skyrms 1967) typically restrict their accounts to empirical or a posteriori knowledge, and do not apply causal constraints to domains like logic and mathematics that are standardly taken to be a priori.  And causal theories of reference are typically accounts of the transmission of a name or predicate from one person to another; they are not accounts of how objects get baptized with names in the first place.  Such theories of reference transmission leave open the possibility that names might be originally assigned without causal interactions with the things named (see Burgess and Rosen 1996: 50).  Causal theories in epistemology and in the philosophy of language do not entail Benacerraf’s causal constraint.

After these objections are raised, Benacerraf’s argument against platonism is left in an unsettled dialectical position.  Proponents and opponents of the argument agree that, if the causal constraint holds, then knowledge of platonistic mathematical truths is impossible.  They simply disagree about the plausibility of the causal constraint.  Proponents of the argument take the causal constraint on knowledge to be secure enough to pose a serious problem to platonistic theories of mathematical truth.  (Recall that commitments to mathematical truth and commitments to mathematical knowledge stand and fall together.)  Opponents of the argument take the causal constraint to be insufficiently secure to do such work.  Opponents like Burgess and Rosen (1996) argue that the burden of proof is on the argument’s proponents to establish its premises.  Burgess and Rosen (1996, 2005) also contend that proponents of the epistemological argument against mathematical platonism unreasonably expect mathematical platonists to develop accounts of knowledge and reference in the mathematical domain that far exceed what is expected of such accounts in, for example, the perceptual domain; for example, one ought not to demand a more detailed account of how the referent of “4” is fixed than of how the referent of “the Rock of Gibraltar” is fixed.

In light of the problems with Benacerraf’s appeal to a causal constraint, or a causal theory of knowledge, Benacerraf’s epistemological argument against mathematical platonism has been recast in several ways.  First, the epistemological argument is often understood instead as a challenge that mathematical platonists must overcome; platonists seem to bear the burden of setting out a plausible epistemological picture on which knowledge of platonistically construed mathematical truths is feasible.  This, in fact, seems to be Benacerraf’s own attitude, given his overall conclusion that a better theory of mathematical truth (and knowledge) is required.

Second, on some interpretations, Benacerraf’s use of the causal constraint on knowledge reduces to a weaker causal constraint.  Some (Maddy 1984: 51, Hart 1991a: 95, Nutting 2016 and 2020) suggest that a weaker causal constraint still applies to direct, immediate, or non-inferential knowledge, something akin to Grice’s (1961) causal theory of perception.  While this interpretation weakens the objectionable causal constraint, it appears to build into Benacerraf’s epistemological argument the assumption that platonists are committed to positing some perception-like cognition of mathematical abstracta.  Colin Cheyne (1998: 35) suggests that a weaker causal constraint applies specifically to knowledge of existence claims, and that it can be used successfully in an argument to the effect that we cannot know that abstract mathematical objects exist.

Third, some philosophers have suggested replacing Benacerraf’s causal constraint, or his appeal to a causal theory of knowledge, with other conditions on knowledge or justification in the epistemology literature.  Penelope Maddy (1984) suggests that the causal constraint might be swapped out for a reliability constraint on justification (and hence knowledge, assuming knowledge entails justification).  While the resulting argument has a more plausible epistemological constraint, she suggests that it is consistent with a platonistic account of mathematical truth.  Against this suggestion, Albert Casullo (1992) argues that reliabilism does not help platonism cohere with a plausible general epistemology.  In a different vein, Joshua Thurow (2013) offers a reformulation of Benacerraf’s argument which sets a ‘no-defeater’ condition on knowledge, instead of a causal one.  Thurow then suggests that the lack of causal interaction with mathematical entities makes it unlikely that beliefs about them are true, and this fact serves as a defeater for mathematical knowledge.  This reformulation avoids an appeal to the causal theory of knowledge, but still entails a kind of causal constraint on knowledge.

Benacerraf’s original epistemological argument does not appear to be easily revised into a plausible, convincing argument against mathematical platonism.  But it does at least pose a serious challenge for platonists.  The proponent of Benacerraf’s epistemological argument may well bear the burden of defending the causal constraint on knowledge, or of showing how some more plausible alternative constraint can be used to argue against mathematical platonism.  But the mathematical platonist still seems to bear the burden of setting out a plausible epistemological picture on which knowledge of platonistically construed mathematical truths is feasible.  Benacerraf’s epistemological challenge exposes that it is unclear how a platonist can make mathematical reference and mathematical knowledge consistent with a plausible general theory of reference or a plausible general epistemological picture.

c. Field’s Epistemological Argument

Unlike Benacerraf, Hartry Field (1989) appears to endorse an epistemological argument against mathematical platonism.  (Though some think he treats it merely as a forceful challenge; see Liggins 2010.)  Field employs his version of such an argument in support of mathematical fictionalism.  (See the IEP article on mathematical fictionalism.)  His view is that there are no mathematical objects, and so mathematical claims that do not begin with universal quantifiers are literally false.  On this view, mathematical claims that begin with universal quantifiers are vacuously true, because they have empty domains of quantification.

Field’s epistemological argument against mathematical platonism is the most influential reformulation of Benacerraf’s in the literature. It avoids the most common objections to Benacerraf’s version by avoiding any appeal to constraints on knowledge whatsoever.  Largely because of this, Field’s version is widely taken to be a refinement of, or improvement on, Benacerraf’s version (see, for example, Burgess and Rosen 1997: 41, Liggins 2010: 71, Clarke-Doane 2014: 246).  (But this perspective is not universal; see Kasa 2010 and Nutting 2020.)

Field avoids appealing to constraints on knowledge in his argument by avoiding any mention of knowledge. Instead, he focuses on an epistemological phenomenon that seems to be entailed by knowledge: believing, or accepting, claims.  Specifically, Field (1989: 230) argues that mathematical platonists are obligated to explain why the following holds for the overwhelming majority of mathematical sentences:

(1) If most mathematicians accept “p”, then p.

In short, platonists think that mathematicians are remarkably reliable about the mathematical realm.  The kind of reliability that platonists posit is both overwhelming and surely not coincidental.  It calls out for explanation.

Field grants that a partial explanation of this reliability is possible.  In particular, the platonist can appeal to the fact that most mathematical claims are accepted on the basis of proof from axioms.  If so, Field writes (1989: 231), “what needs explanation is only the fact that the following holds for all (or most) sentences ‘p’”:

(2) If most mathematicians accept “p” as an axiom, then p.

But this partial explanation merely reduces the challenge of explaining mathematicians’ remarkable reliability to the challenge of explaining why mathematicians reliably accept true axioms.  Field argues that the platonist cannot explain this:

The claims that the platonist makes about mathematical objects appear to rule out any reasonable strategy for explaining the systematic correlation in question.… [I]t is impossible in principle to give a satisfactory explanation of the general fact (2). (Field 1989: 230-231)

The reason that it appears to be impossible to satisfactorily explain this reliability fact is that the way abstract mathematical objects are described appears to leave people without any way to access facts about the mathematical realm.  If mathematicians lack any interaction whatsoever with mathematical objects, then the idea that they would be reliably right about the facts involving those objects seems to be inexplicable.

Field does think that it is possible to explain why the axioms that mathematicians accept are mutually consistent.  Over time, mathematicians discover when combinations of potential axioms lead to contradictions.  Such discoveries lead them to reject or revise axiom candidates in the interest of setting out consistent systems of axioms.  Field thinks that this gradual process explains why we accept consistent systems of axioms (though Leng 2007 registers doubts).  But the mere fact that certain sentences are mutually consistent does not generally entail that they are all true.  Unless one develops an account on which mutually consistent mathematical claims must be true (as in Balaguer 1998), appealing to aspects of mathematical practice that promote consistency will not thereby serve to explain why mathematicians tend to accept axioms that are true.

These concerns about the putative reliability of mathematicians give rise to Field’s epistemological argument against mathematical platonism.  According to mathematical platonism, mathematical facts are facts about abstract mathematical objects, and mathematicians are remarkably reliable in accepting those facts.  But there must be an explanation of such a remarkable correlation between the claims mathematicians accept and the mathematical facts.  And there is not one.  The abstract nature of mathematical objects renders it impossible, in principle, to explain the correlation.  Without the possibility of an explanation, Field concludes, the posited correlation must not hold—or, at the very least, we are not justified in believing that it holds.  So, the argument goes, mathematical platonism is wrong—or, at the very least, unjustified.  Ultimately, Field infers, we ought not to believe mathematical statements unless they are vacuously true (that is, begin with universal quantifiers).

Some philosophers find Field’s argument compelling, if not knock-down (see Liggins 2006, 2010: 74).  Others have raised objections to it.  First, Burgess and Rosen (1996: 41-45) object that the remarkable reliability that Field thinks demands explanation reduces to a much less remarkable correlation between (or, more accurately, conjunction of) exactly two facts:

(i) It is true that the full cumulative hierarchy of sets exists.

(ii) It is believed that the full cumulative hierarchy of sets exists.

They also contend that demanding an explanation of this true belief amounts to demanding a heavy-duty sort of justification of the standards of a science, and this is potentially objectionable.

Second, objections have been raised to Field’s apparent assumption that the correlation between mathematical facts and mathematical beliefs requires a causal explanation.  Ivan Kasa (2010) and Eileen Nutting (2020) argue that Field’s assumption that the explanation must be causal is effectively a minor modification of Benacerraf’s causal constraint on knowledge; this leaves Field’s causal assumption open to some of the same sorts of criticisms as Benacerraf’s.  Øystein Linnebo (2006: 553) points out that some correlations have good non-causal explanations, such as the correlation between the consistency of a first-order theory and the existence of a model for that theory. Linnebo suggests that a dubious analogy between the discipline of mathematics and the discipline of physics motivates Field’s assumption that the correlation at issue in his argument (between mathematical facts and mathematical beliefs) requires a causal explanation.

4. Related Epistemological Challenges and Responses

Section 3 focused specifically on the epistemological arguments against mathematical platonism raised by Benacerraf and Field. Section 4a addresses further epistemological challenges and arguments, most of which are taken to apply equally well to other accounts of mathematical truth, and hence to generalize the epistemological problems raised by Benacerraf and Field. (Many of these challenges also are taken to apply to other domains as well, such as morality, modality, and logic. Those other domains are not addressed in this article.)  Section 4b addresses some proposed solutions to Benacerraf’s dilemma, which involve presenting an account of mathematical knowledge, and, in many cases, also involve shifting to a non-platonistic account of mathematical truth.

a. Further Challenges and Generalizations

Both Benacerraf’s and Field’s epistemological arguments have been generalized or modified in ways that appear to raise epistemological challenges for views beyond mathematical platonism.

Recall that Benacerraf’s argument against platonism is rooted in the problem of combining a causal constraint on knowledge, as well as a causal constraint on reference, with a platonistic account on which mathematical objects are abstract and acausal.  This problem can be generalized in two ways.  First, there are other accounts of mathematical truth on which mathematical claims are true in virtue of some mathematical subject matter with which we do not causally interact.  The causal constraint on knowledge appears to pose just as much of a problem for, for example, accounts of mathematical truth on which mathematical claims are true in virtue of transcendent mathematical properties (as in, for example, Yi 1998), or in virtue of mathematical structures that have dual object/universal natures (as in, for example, Shapiro 1997), or in virtue of systems of concrete possibilia (as in, for example, Hellman 1989).  Strictly speaking, these views (with the possible exception of Shapiro’s) are not platonistic; they do not appeal to abstract mathematical objects.

The second generalization emerges from the fact that concerns have been raised about causal constraints on knowledge and reference.  Even so, Benacerraf’s problem can be posed as the challenge of providing an epistemological account that explains knowledge of the entities in question.  The more general challenge, then, is to combine an account of mathematical truth with an account of the way in which we know those truths.  While this challenge may seem particularly pressing for accounts of mathematical truth on which some causal form of knowledge is impossible, it will arise for any correspondence-style account on which mathematical claims are true in virtue of some kind of subject matter, be that mathematical objects, mathematical properties, mathematical structures, or anything else.  The general strategies for approaching this subject-matter-oriented challenge seem to require either providing an account of knowledge of certain specific kinds of causally inaccessible entities or positing some causally efficacious mathematical subject matter (see Nutting 2016).  (Approaches of both kinds are sketched in section 4b.)

Recall that Field’s epistemological argument against mathematical platonism focuses on the problem of explaining the correlation between mathematicians’ beliefs and mathematical facts, which are assumed to be about acausal mathematical entities. Field’s is the first in a family of epistemological challenges that focus on the reliability of mathematicians.  Some of these are presented as attempts to make Field’s argument more precise; others are presented as new epistemological challenges, distinct from Field’s.

Most of the ensuing reliability challenges generalize the problem as presented by Field to apply equally to a range of non-platonistic accounts.  These challenges take the problem of explaining mathematicians’ reliability to apply to any account of mathematical truth, platonistic or otherwise, that assumes that mathematicians are, in fact, remarkably reliable when it comes to some mind-independent mathematical facts.  Given the kinds of claims and logic that mathematicians usually accept, views on which mathematicians are remarkably reliable typically are ones that entail what is sometimes called semantic realism or truth-value realism, that is, that all (mathematical) claims are either true or false.  However, these challenges may still apply to accounts that allow for some indeterminate claims, for example, in set theory—though the challenges are perhaps less likely to apply to accounts that admit indeterminate claims in, for example, arithmetic.

Some of the reliability challenges rooted in Field’s work arise from attempts to make something in Field’s argument precise.  Field claims that the correlation between mathematical facts and mathematicians’ beliefs is so remarkable as to demand an explanation.  He further suggests that such an explanation is impossible if mathematical objects are abstract and acausal.  But, setting aside Field’s focus on abstract objects, what kind of explanation is required for such a correlation?  One seemingly unsatisfactory explanation is what Linnebo (2006: 554) calls ‘the Boring Explanation’: mathematicians are taught true mathematical claims (especially axioms) as part of their training, and then go on to prove things that logically follow from those true claims, and hence are also true.  Given that the Boring Explanation is unsatisfactory, what conditions must a satisfactory explanation satisfy?

One possibility is that explaining mathematicians’ reliability requires showing that the correlation between facts and beliefs has counterfactual force, where that is usually interpreted as showing that mathematicians’ beliefs are sensitive to the mathematical facts.  For any claim p, a person’s belief that p is sensitive if and only if, had p been false, the person would not have believed p. But intuitively, it seems that demonstrating that mathematicians’ beliefs are sensitive to the mathematical facts will not suffice to explain the reliability in question.  That is because mathematical truths are usually taken to be necessary; they could not have been false.  If so, the antecedent of the sensitivity counterfactual is always false.  That guarantees that mathematical beliefs are sensitive (Lewis 1986: 111, Linnebo 2006: 550-51, Clarke-Doane 2017: 26).  But this seems inadequate; the necessity of mathematical truths does not seem to explain mathematicians’ reliability at all.
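The vacuity point can be stated schematically on the standard Lewis–Stalnaker semantics for counterfactuals.  The box-arrow for the counterfactual conditional and “Bp” for “the subject believes p” are illustrative notation, not drawn from the authors cited:

```latex
% Sensitivity of a belief that p (illustrative notation):
\[
\mathrm{Sens}(p) \;:\; \neg p \mathbin{\Box\!\!\rightarrow} \neg Bp
\]
% On the standard semantics, a counterfactual whose antecedent holds
% at no possible world is vacuously true.  So if p is necessary:
\[
\Box p \;\Rightarrow\; \text{$\neg p$ holds at no world}
\;\Rightarrow\; \bigl(\neg p \mathbin{\Box\!\!\rightarrow} \neg Bp\bigr)
\text{ is vacuously true.}
\]
% Hence every belief in a necessary truth trivially counts as
% sensitive, and sensitivity can do no explanatory work here.
```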

Another possibility is that the epistemological challenge to platonists is to explain why it is—that is, what makes it the case—that mathematicians’ beliefs are reliable.  Linnebo takes such an approach, and he argues that platonists must provide an ‘external explanation’ to show that mathematicians’ methods are “conducive to finding out whether these claims are true” (Linnebo 2006: 563).  But Linnebo takes the sensitivity of mathematical beliefs to be an indication that these methods are, in fact, conducive to determining whether mathematical claims are true.  He tries to rescue sensitivity from the vacuity problem by appealing to meta-semantic facts about what propositions are picked out by sentences.  Mathematical propositions might be necessary, but there is a sense in which mathematical sentences could have been false; they could have expressed different propositions (or no propositions at all).  And, Linnebo contends, there is a sensible question about whether mathematicians would have still accepted those sentences in those meta-semantically different situations.

A slightly different way of casting the epistemological challenge is as an evolutionary debunking argument.  (See Korman 2020 on debunking arguments in general.)  Like standard moral evolutionary debunking arguments, such as those of Street (2006) and Joyce (2006), a mathematical evolutionary debunking argument appeals to the role of evolutionary pressures in guiding many of our most fundamental mathematical beliefs.  These evolutionary pressures provide an explanation of our mathematical beliefs that, according to the argument, is independent of the mathematical facts.  This gives reason to doubt the reliability of the beliefs formed on that evolutionary basis, and even to doubt that there are any mathematical facts at all.

But, as Justin Clarke-Doane (2012, 2014) cashes out the evolutionary debunking argument, the claim that evolutionary pressures are independent of the mathematical facts amounts to a sensitivity claim: had the mathematical facts not been facts, the beliefs resulting from evolutionary pressures would have remained the same.  Like Linnebo, Clarke-Doane tries to make sense of this counterfactual by identifying a sense in which mathematical claims are not necessary.  His suggestion is that, while mathematical truths might be metaphysically necessary, we might appeal to conceptual possibility instead.  On this approach, conceptual possibility does not entail metaphysical possibility, and it is conceptually possible (that is, intelligible to imagine) that mathematical claims could have been different (Clarke-Doane 2012: 321, 2014: 249-50).  A third approach to reviving a sensitivity interpretation of the reliability challenge might also be available if one were to deny the vacuous truth of counterfactual claims with necessary antecedents.  This third approach would involve accepting an account of counterfactuals that depends on counterpossibles.  (See Nolan 1997, Williamson 2017, and Berto and others 2018.)

b. Addressing the Epistemological Challenges

A wide range of new accounts of mathematical knowledge have been developed in attempts to overcome these epistemological challenges.  This section provides an incomplete survey of some of those attempts.

With the notable exception of minimalist responses (vi), the accounts below are primarily targeted at addressing epistemological challenges in the mold of Benacerraf’s, rather than those in the mold of Field’s.  That is, they are primarily concerned with explaining knowledge of some mathematical subject matter, rather than with explaining the reliability of mathematicians.  Many of the accounts described seem to be less well-suited to addressing Fieldian reliability challenges, because they appeal to considerations that likely would not have influenced most mathematicians.  Even when it comes to addressing Benacerrafian concerns, some of the accounts below appear to have trouble explaining all mathematical knowledge.  They might, for example, fail to explain knowledge of those parts of set theory that are not involved in scientific theorizing or that lack clearly observable instances or applications.

In addition to potential failures to fully address the epistemological challenges, it is worth noting that substantive objections have been raised to each of the accounts below, though this article does not engage with those objections.

i. Mathematics’ Role in Empirical Science.

Some philosophers have appealed to the important role of mathematics in empirical science to explain how the epistemological challenges might be avoided, or why mathematical beliefs are justified.  For example, Mark Steiner (1973: 61) suggests that, although people do not causally interact with mathematical objects, mathematical truths play a role in causal explanations of mathematical beliefs.  A thorough causal explanation of a person’s mathematical beliefs inevitably will involve a scientific theory that presupposes mathematics, and so the mathematical claims will play a role in those causal explanations.  If this is correct, then mathematical claims do play a causal role of some sort in mathematical beliefs.

A different science-oriented approach is to appeal to some of the reasoning in so-called indispensability arguments (see for example, Resnik 1995, Colyvan 2001, Baker 2005).  According to these arguments, mathematics and quantification over mathematical objects are indispensable to the best scientific theories, or to our best scientific explanations, and so mathematical theories are supported by the same body of evidence that supports those empirical theories or explanations.  Typically, indispensability arguments are used to establish the existence of mathematical objects.  But similar reasoning supports a more epistemological claim: mathematical claims, and the positing of mathematical objects, are justified by empirical science (see for example, Colyvan 2007).  If that is the case, then mathematical knowledge does not appear to depend on causal interaction with mathematical objects.  (See the IEP article on indispensability arguments.)

ii. Perceiving Sets.

Maddy (1980: 178-84, 1990: 58-63) offers a view on which we do causally interact with some mathematical objects.  Specifically, on this view people routinely perceive sets.  For example, a person might perceive a set of three eggs in the refrigerator.  In the exact same location, that person might also perceive two disjoint sets: a set of two eggs, and a singleton set of one egg.  On this view, we are causally related to the sets we perceive, and we can have non-inferential perceptual knowledge involving those sets.  That non-inferential perceptual knowledge can be numerical, for example, that the set of eggs is three-membered.  The causal constraint, it seems, would not undermine mathematical knowledge if people routinely perceived impure sets and their numerical properties.
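Maddy’s egg example can be made concrete with a small model.  The sketch below (with invented egg labels) shows the structure of the claim: the same three objects constitute one three-membered set and, equally, two disjoint sets, and the numerical property in question is just the cardinality of the perceived set.

```python
# A toy model of Maddy's example: the same three eggs, viewed as one
# three-membered set or as two disjoint sets in the same location.
# (The egg labels are illustrative, not part of Maddy's account.)
eggs = frozenset({"egg1", "egg2", "egg3"})

pair = frozenset({"egg1", "egg2"})
singleton = frozenset({"egg3"})

# The two smaller sets are disjoint and together exhaust the eggs.
assert pair.isdisjoint(singleton)
assert pair | singleton == eggs

# The numerical property Maddy says we can perceive non-inferentially:
assert len(eggs) == 3
```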

iii. Pattern-Recognition and Abstraction.

On certain versions of mathematical structuralism, mathematical objects are abstract mathematical structures and the positions in those structures.  Michael Resnik (1982), Charles Parsons (2008), and Stewart Shapiro (1997) all offer accounts on which mathematical structures, and the positions in those structures, can be epistemically accessed by pattern recognition and/or abstraction that starts from concrete objects.  (Parsons differs from Resnik and Shapiro in talking of ‘intuition’ of what are effectively the types of which concreta are tokens.)  Consider, for example, the following sequence of strokes written (let us suppose) on a piece of paper:

| , || , |||

At each additional step in the sequence, an additional stroke is added.  The pattern here, of course, is the beginning of the sequence of natural numbers.  On Shapiro’s (1997: 115-116) view, for example, the sequence above is a physical instantiation of the 3-ordinal pattern, that is, a system of 3 objects taken in a particular order.  “A, B, C” is another physical instantiation of that same pattern.  People can abstract the ordinal number 3 from sequences such as these by using the ordinary human capacity of pattern recognition.  This is the same capacity that allows us to recognize, for example, the pattern of player positions on a baseball diamond.  Shapiro’s view adds an additional step for access to the natural number 3, which is a position in the structure of the natural numbers, and hence distinct from the 3-ordinal pattern.  Once a person has cognition of various ordinal patterns, the structure of the natural numbers can be abstracted from thinking about the natural ordering of all those ordinals.  This further step, too, can be done using ordinary human capacities of abstraction.  On this sort of account, causal interaction with mathematical objects is not required for cognitive access to those objects; people can access abstract mathematical objects through an ordinary process of abstraction that starts with the recognition of patterns in concrete, perceived things.  (See the structuralism article.)

iv. Abstraction Principles.

Neo-Fregeans like Bob Hale, Crispin Wright, and Linnebo turn to a somewhat different form of abstraction, one that relies on abstraction principles to address epistemological challenges (see, for example, Hale and Wright 2002, Linnebo 2018).  The most-discussed abstraction principle is Hume’s Principle:

∀F∀G (#F=#G ↔ the Fs are in 1-1 correspondence with the Gs).

That is, for any concepts F and G (for example “finger on my left hand”, “U.S. president in the 21st century”), the number of Fs is identical with the number of Gs if and only if there is a relation that pairs each F with exactly one G, and vice versa.  On this view, it is not just a lucky happenstance that concepts have the same number precisely when they are in one-to-one correspondence with each other.  Rather, according to Hale and Wright, that is just how numbers work; the claim about number identity on the left side of the biconditional has the same content as the claim about one-to-one correspondence on the right side of the biconditional, but expresses that content in a different way.  Hume’s Principle is analytic, that is, true in virtue of meaning; it implicitly defines what the term ‘number’ means.  On this view, if we understand one-to-one correspondences between concepts (for example, we understand what it is to be able to pair up the fingers on my left hand with the fingers on my right hand in a tidy way), we can use the implicit definition of number given in Hume’s Principle to come to know things about numbers—even though numbers are abstract objects with which we do not causally interact.  Other abstraction principles can be used in similar ways to provide epistemological access to other kinds of mathematical objects.  Linnebo (2018) even appeals to a different abstraction principle to define the natural numbers as ordinals (positions in orderings), though he still accepts Hume’s Principle as an account of the cardinal numbers (how many). (For more on abstractionism, see the IEP article on abstractionism.)
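For finite concepts, the biconditional in Hume’s Principle can be checked directly: the number of Fs equals the number of Gs exactly when the Fs can be paired one-to-one with the Gs.  The following sketch uses the finger-pairing example from the text (the labels are hypothetical):

```python
# A finite-domain illustration of Hume's Principle.  For finite
# collections, a one-to-one pairing exists iff the collections
# have the same number of members.
fingers_left = ["L-thumb", "L-index", "L-middle", "L-ring", "L-little"]
fingers_right = ["R-thumb", "R-index", "R-middle", "R-ring", "R-little"]

def in_one_one_correspondence(fs, gs):
    """For finite collections, a bijection exists iff their sizes match."""
    return len(set(fs)) == len(set(gs))

# Left side of the biconditional: #F = #G ...
same_number = len(set(fingers_left)) == len(set(fingers_right))
# ... holds exactly when the right side (1-1 correspondence) does.
assert same_number == in_one_one_correspondence(fingers_left, fingers_right)

# An explicit pairing witnesses the correspondence:
pairing = list(zip(fingers_left, fingers_right))
assert len(pairing) == len(fingers_left)
```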

v. Logical Knowledge and Knowledge through Descriptions.

Philosophers with very different accounts of mathematical objects have offered epistemological accounts on which mathematical knowledge is acquired through definitions.  The neo-Fregeans described above, who take abstraction principles like Hume’s Principle to be implicit definitions, are one example.

Other approaches in this general category shift away from the traditional conception of mathematical objects as abstract and mind-independent.  Audrey Yap (2009), for example, follows Richard Dedekind (1888) in taking mathematical objects to be “free creations of the human mind.”  On Yap’s version of the account, because the second-order Peano Axioms (which govern arithmetic) are both consistent and categorical (that is, all models of them are isomorphic/structurally identical), stipulating those axioms effectively serves to create the subject matter of arithmetic—the natural numbers.  People who engage in such generative stipulations are in a position to know what they have stipulated—and so, to know mathematical truths.  In a similar vein, Kit Fine (2006) has a view on which all mathematical knowledge is derived from what he calls ‘procedural postulations’, which serve to generate mathematical objects.  Again, on his picture, mathematical knowledge is possible because our postulations create the subject matter of mathematics.
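The categoricity claim that Yap’s account relies on turns on the second-order induction axiom, which quantifies over all properties P rather than over a countable stock of first-order formulas:

```latex
\forall P \, \Bigl[ \bigl( P(0) \land \forall n \, \bigl( P(n) \rightarrow P(S(n)) \bigr) \bigr) \rightarrow \forall n \, P(n) \Bigr]
```

Dedekind’s categoricity theorem shows that any two models satisfying this axiom together with the other Peano Axioms are isomorphic, which is why the stipulation can be taken to pin down a unique subject matter.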

A rather different account of mathematical knowledge through description is found in Mark Balaguer’s (1998) plenitudinous platonism.  According to plenitudinous platonism, all logically possible mathematical objects exist in some mathematical universe.  There are, for example, many set theoretic universes, and the Continuum Hypothesis is true in some and false in others.  Balaguer thinks that any consistent mathematical theory truly describes some part of the mathematical universe.  On his account, then, all that is required for mathematical knowledge is the description of a consistent mathematical theory together with the knowledge that it is consistent.  Similarly, Sharon Berry argues that any coherent mathematical theory could have expressed a truth, and that “the mathematical realist only needs to explain how we came to accept some logically coherent characterization of ‘the numbers’ and derive our beliefs from that characterization” (2018: 2293).

vi. Minimalist Responses.

Some accounts of mathematical knowledge are in the spirit of what Korman (2019) and Korman and Locke (2020) call minimalist responses.  These accounts involve an explanation of why no interaction with abstract mathematical objects is required for mathematical knowledge.  But they do not provide much in the way of accounts of how mathematical knowledge is acquired; they claim that little is required on this front.  One such account is Lewis’s (1986) (see section 3b).  Lewis claims that only contingent facts require causal explanation, and we have mathematical knowledge because mathematical claims are necessary, and so our beliefs about them are inevitably sensitive (in the sense discussed in section 4a).  Clarke-Doane (2017) spins an idea similar to Lewis’s into a rejection of Field’s challenge.  According to Clarke-Doane, requiring an explanation of the reliability of mathematicians would require explaining the counterfactual dependence of mathematicians’ beliefs on the mathematical facts.  But, as Lewis points out, sensitivity is trivial due to the necessity of mathematical truth.  And, Clarke-Doane argues, the counterfactual notion of safety—that a belief could not easily have been false—does not pose much of a problem either.

vii. Mathematical Fictionalism.

Some philosophers have been convinced that mathematical truth does require a platonistic account, and that the epistemological arguments are compelling enough to reject such an account.  For example, Field (1989) and Mary Leng (2007, 2010) both conclude, on the basis of these arguments, that there are no mathematical objects.  Hence, they conclude that mathematical sentences that purport to refer to or quantify over such objects must be false.  Note that universally quantified sentences are vacuously true on such fictionalist accounts; if there are no numbers, then it is vacuously true that every number has a successor.
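The vacuous-truth point can be seen directly in how a universal quantifier behaves over an empty domain.  The following sketch (the predicate is invented for illustration) mirrors the fictionalist’s situation:

```python
# Vacuous truth: a universally quantified claim over an empty domain
# is true, while the corresponding existential claim is false.  If, as
# the fictionalist holds, there are no numbers, then "every number has
# a successor" comes out true in just this way.
numbers = []  # the fictionalist's ontology: no numbers at all

def has_successor(n):
    # Never actually called when the domain is empty.
    return True

assert all(has_successor(n) for n in numbers)       # vacuously true
assert not any(has_successor(n) for n in numbers)   # "some number has a successor" is false
```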

There are a number of related epistemological arguments against mathematical platonism, and a wide range of attempts to address them.  But if the semantic constraint really does require an account of mathematical truth that involves reference to or quantification over mathematical objects, as the arguments in section 2 suggest, and especially if it requires reference to platonistic mathematical objects, then there are real epistemological challenges for those who accept mathematical truth.

5. Epistemologically Plausible Accounts of Mathematical Truth

This section examines the second horn of the general dilemma, which finds a problem for accounts of mathematical truth that appear to satisfy the epistemological constraint.

Benacerraf uses the description “‘combinatorial’ views of the determinants of mathematical truth” for those accounts that he takes to fare well with respect to the epistemological constraint (Benacerraf 1973: 665).  He also treats such accounts as largely motivated by epistemological concerns.  But what are combinatorial accounts?  Why are they supposed to fare well epistemologically?  And why think they fail to satisfy the semantic constraint?

a. Combinatorial Accounts

Benacerraf’s ‘combinatorial’ classification encompasses views from multiple distinct traditions in the philosophy of mathematics.  He specifically discusses two different kinds of accounts of mathematical truth that qualify as combinatorial:

(a) accounts on which mathematical truth is a matter of formal derivability from axioms; and

(b) conventionalist accounts on which “the truths of logic and mathematics are true (or can be made true) in virtue of explicit conventions where the conventions in question are usually postulates of the theory” (p. 676).

Benacerraf also mentions “certain views of truth in arithmetic on which the Peano axioms are claimed to be ‘analytic’ of the concept of number” (p. 665).  Benacerraf does not develop such accounts any further; neither does this article.

Paradigmatically, accounts of the first type (a) are formalist accounts.  Benacerraf specifically identifies David Hilbert (1925) as a formalist of this stripe; Haskell B. Curry and Johann von Neumann seem to be too.  (Note that formalist accounts of this sort need not adopt Hilbert’s heterogeneous semantics, discussed in section 2c.) On such formalist accounts, formalized axioms are stipulated and taken to give rise to a mathematical system.  Here is how Benacerraf describes it:

The leading idea of combinatorial views is that of assigning truth values to arithmetic sentences on the basis of certain (usually proof-theoretic) syntactic facts about them.  Often, truth is defined as (formal) derivability from certain axioms. (Frequently a more modest claim is made—the claim to truth-in-S, where S is the particular system in question.) (Benacerraf 1973: 665)

A formalist of this stripe might, for example, set out the axioms of Zermelo Fraenkel set theory (ZFC) as sentences in a symbolic language, and then stipulate ZFC to be a set-theoretic system.  The axioms of ZFC would be taken to be “true by definition” (Curry 1964: 153.)  The remaining set-theoretic truths—or the claims that are true-in-ZFC—would be syntactically derivable from the axioms of ZFC using the symbolic manipulations licensed by the inference rules of the specified logical system.  The same formalist might equally accept ZF (ZFC minus the axiom of choice) to be a set-theoretic system, giving rise via derivations to sentences that are true-in-ZF.  The formalist can accept multiple systems at once.

Some formalist accounts characterize mathematics as something of a game of symbol-manipulation.  That is how von Neumann describes Hilbert’s view:

We must regard classical mathematics as a combinatorial game played with the primitive symbols, and we must determine in a finitary combinatorial way to which combinations of primitive symbols the construction methods or “proofs” lead. (von Neumann 1964: 51)

Typically, combinatorial accounts of this formalist stripe do not take the “primitive symbols” of mathematics—for example, “0” or “∈”—to be meaningful outside the mathematical game.  Rather, the meanings of these symbols are implicitly defined by the axioms (and perhaps logical rules) of the system adopted.  As a consequence, such accounts do not take the claims of mathematics to have meanings or truth-values outside of the specified system of axioms.  This, together with a syntactic understanding of logic in terms of derivation rules, motivates the idea that mathematical truth is a matter of derivability from the axioms of the system.
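The “game” picture of truth-as-derivability can be sketched with a deliberately trivial invented system (not any historical formalist’s system): one axiom, one rewrite rule, and a search over derivations.  “Truth-in-S” is then nothing over and above reachability in this search:

```python
# A toy "combinatorial game": the system S has the single axiom "|"
# and the single inference rule "from x, derive x + '|'".  A sentence
# is "true-in-S" just in case it is derivable from the axiom.
AXIOM = "|"

def successors(theorem):
    """The sole rule of inference: append one stroke."""
    return [theorem + "|"]

def derivable(sentence, max_steps=100):
    """Breadth-first search over the space of derivations in S."""
    frontier = [AXIOM]
    for _ in range(max_steps):
        if sentence in frontier:
            return True
        frontier = [t for th in frontier for t in successors(th)]
    return False

assert derivable("|||")      # a theorem of S, hence "true-in-S"
assert not derivable("||-")  # not derivable, hence not true-in-S
```

The point of the sketch is only structural: nothing in the system interprets the strokes, and derivability is settled entirely by syntactic manipulation, just as on the formalist picture described above.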

Conventionalist accounts of type (b) were common in the mid-twentieth century, especially among logical positivists and Wittgensteinians.  According to such views, mathematical sentences are true in virtue of linguistic convention.  The main idea has much in common with formalist views of type (a).  Certain basic sentences are set out as true by fiat, as “stipulated truths” or “truths by convention”, and the rest of the truths of the relevant branch of mathematics are taken to follow logically from them.  Often, the sentences initially set out are axioms for some branch of mathematics, for example, the Peano Axioms for arithmetic or the axioms of ZFC for set theory.  These axioms, which conventionalists often called ‘postulates’, are taken to serve as implicit definitions of the non-logical terms in the theory.  On both formalism and conventionalism, the primitive terms or symbols in the relevant branch of mathematics are not meaningful until the axioms are set out to define them.

For example, according to a conventionalist, a convention can be formed to take the Peano Axioms as true.  Those axioms are then true by convention, and they implicitly define arithmetical terms like “number” and “zero”.  The Peano Axioms would then also be true in virtue of the meaning of the relevant arithmetical terms, because the convention that established the truth of those axioms also set out the conventional meanings of those terms.  Alfred Jules Ayer captures this idea when he says about analytic sentences, among which he specifically includes mathematical truths, “they [analytic sentences] simply record our determination to use words in a certain fashion.  We cannot deny them without infringing the conventions which are presupposed by our very denial, and so falling into self-contradiction” (1964: 299).

For the conventionalist about mathematical truth, denying, for example, one of the Peano Axioms would involve rejecting part of the conventional meaning of some of the relevant arithmetical terms, while also presupposing that same conventional meaning in order to make a meaningful statement using those terms.

There are some differences between these conventionalist and formalist accounts.  One is that, while formalists of the early and mid-20th century were open to different stipulated axiomatic systems, they traditionally defended classical logic against the likes of intuitionists (who deny, for example, the law of excluded middle).  But conventionalists thought of logic, like mathematics, as true by convention.  Accordingly, conventionalists were not wedded to classical logic.  They were free to adopt whatever logic they chose, provided that the relevant conditions for making something a convention were met—whatever those conditions might be.

The most significant difference between formalism and conventionalism is that formalist accounts are primarily concerned with symbols and their manipulations, while conventionalist accounts are primarily concerned with the linguistic meanings of terms.  This difference matters because mathematical language can be used outside the context of pure mathematics.  The conventionalist, but not the formalist, will typically think that mathematical terms should be used with the same conventional meaning in mathematics as in relevant scientific or other applied contexts; changing the mathematical conventions would require changing the use of those mathematical terms in such applications.  This idea can be seen in Hector-Neri Castañeda’s characterization of conventionalist Douglas Gasking’s (1964) view: “Professor Gasking has argued most persuasively for the view that mathematical propositions are like conventions or rules as to how we should describe what happens in the world” (1964: 405).

While the formalist need not ignore applications entirely, the basic formalist project does not require any coordination between mathematical systems and the world.  Indeed, the formalist can equally accept multiple competing mathematical systems governed by different axioms, regardless of how well those systems serve in applications.  The conventionalist typically is constrained to one system (though not always—Carnap (1947) might allow for different logical systems for different linguistic frameworks), and applications serve as pragmatic considerations in deciding which system that will be.

It is worth noting that the conventionalist thinks that these applications can neither confirm nor disconfirm mathematical claims, nor provide any grounds for their truth.  Again, mathematical sentences are true by convention.  As Gasking put it, “I… say that 3×4=12 depends upon no fact about the world, other than some facts about usage of symbols” (1964: 400).  But the conventionalist does think that other conventions are entirely possible.  Again quoting Gasking, “we could use any mathematical rules we liked, and still get on perfectly well in the business of life” (1964: 397).  Indeed, Gasking thinks that “we need never alter our mathematics,” because we can always modify our discussion of the world to fit our existing mathematics.  But he also suggests that there are possible circumstances in which we might want to; there could be pragmatic reasons to change existing mathematical conventions, for example, to simplify particularly cumbersome applications in physics (Gasking 1964: 402-3).  This is a typical conventionalist approach, on which we can form conventions to use whatever mathematics we like, but there may well be practical reasons to choose conventions that are convenient for everyday and scientific applications.  This idea that we can choose which conventions (or frameworks) to adopt on the basis of pragmatic considerations is especially prominent in the conventionalist work of Rudolf Carnap (1946).

Together, these formalist and conventionalist accounts comprise a class of “combinatorial” views on which a mathematical claim is true in virtue of following logically (typically via syntactic derivation rules) from sentences that are stipulated or postulated as starting points or axioms.

b. Combinatorial Accounts and the Epistemological Constraint

Combinatorial views are well-placed to satisfy the epistemological constraint.  In fact, Benacerraf repeatedly cites them as having “epistemological roots” or being “motivated by epistemological concerns” (1973: 668, 675).  Such views appear to start with the idea that mathematical knowledge is acquired via proof, and then work backwards to an account of mathematical truth as the sort of thing that can be known in that way.

Combinatorial accounts appear to fare well epistemologically for two reasons.  First, our knowledge of sentences that are stipulated or postulated is almost trivial.  We know foundational mathematical truths independently of our knowledge or examination of any independent subject matter because we ourselves set them out as truths to define certain terms or symbols.  They are human constructions.  Second, if truth is identified with formal derivability, then in Benacerraf’s words, “We need only account for our ability to produce and survey formal proofs” to explain knowledge of non-foundational truths (Benacerraf 1973: 668). Even if mathematical truth is identified with the results of less formal logical derivation, explaining our knowledge of those truths simply becomes a matter of accounting for our ability to reason deductively.  Mathematical truths are knowable on combinatorial accounts because mathematical truths just are the results of logical derivation from the stipulated or postulated starting points.

All this makes combinatorial accounts likely to mesh with a reasonable epistemology.  That is, they are likely to satisfy the epistemological constraint.  It seems likely that any plausible epistemological account that accommodates our ability to know stipulated or postulated truths, and that accommodates our ability to reason deductively, will accommodate mathematical knowledge as well.

Despite all this, some objections have been raised to the epistemological adequacy of combinatorial accounts.  Nutting (2013) argues that, for reasons having to do with nonstandard models, combinatorial accounts cannot explain our understanding of the structure of the natural numbers.  Clarke-Doane (2022: 19, 34) claims both that proofs are themselves abstracta and hence no more epistemologically accessible than numbers, and also that combinatorial accounts cannot explain why mathematicians typically accept CON(ZF)—the sentence in Gödel coding that is often translated as “ZF is consistent”.

c. The Problem with Combinatorial Accounts

Although combinatorial accounts appear to satisfy the epistemological constraint, they also appear to fare poorly as accounts of mathematical truth. This will seem obvious if one is convinced by Benacerraf’s semantic arguments described in section 2c, and the idea that an account of mathematical truth must invoke reference, denotation, and satisfaction in order to parallel the semantics of the rest of language.  Combinatorial accounts explain mathematical truth in terms of logical consequences from stipulated or postulated sentences; on such accounts, “truth is conspicuously not explained in terms of reference, denotation, or satisfaction” (Benacerraf 1973: 665).  If Benacerraf is right that a Tarskian account of truth is required to satisfy the semantic constraint, then combinatorial accounts seem not to satisfy it.

But appeals to the semantic arguments provided in section 2 are not the only concerns that have been raised about the adequacy of combinatorial accounts as accounts of mathematical truth.  Perhaps the most prominent concern about such views is specifically targeted at accounts, most notably formalist accounts, on which mathematical truth is a matter of syntactic derivability from stipulated axioms.  This concern starts from the observation that, if the initial mathematical truths of such accounts are to be stipulated, they must be enumerable.  But Gödel’s First Incompleteness Theorem shows that, if it is possible to enumerate the axioms of a formal system, then either (a) the formal system is not strong enough to characterize the basic arithmetic of the natural numbers, (b) there are statements in the language of the formal system that can neither be proved nor disproved in the system itself, or (c) it is possible to prove a contradiction in the system.  Accordingly, formalist accounts, and other accounts on which mathematical truths follow from stipulated or postulated truths by syntactic derivation rules, inevitably suffer from at least one of three major problems: (a) they do not include the Peano Axioms governing arithmetic; (b) they leave some arithmetical claims indeterminate in truth-value because neither they nor their negations are syntactically derivable from the initially stipulated/postulated truths; or (c) they are inconsistent.  Accordingly, Gödel’s result suggests that formalist and other combinatorial accounts that rely on syntactic derivation do not capture all the truths of arithmetic.  And this problem is not restricted to arithmetic; similar concerns also arise for set theory.
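The trilemma just described rests on the standard statement of Gödel’s First Incompleteness Theorem, which can be put roughly as follows:

```latex
\text{If } T \text{ is recursively axiomatizable, consistent, and interprets basic arithmetic,} \\
\text{then there is a sentence } G_T \text{ such that } T \nvdash G_T \text{ and } T \nvdash \lnot G_T .
```

Denying any one of the three hypotheses yields, respectively, options (a), (c), and the failure to characterize arithmetic; accepting all three yields the undecidable sentence of option (b).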

Attempts have been made to circumvent the incompleteness problem.  Carnap’s conventionalist account in The Logical Syntax of Language (1937) explains mathematical truth in terms of syntactic derivability from initial sentences that are postulated to be true by convention, and so seems like it might succumb to this Gödelian concern.  But Gödel’s incompleteness theorems do not directly apply to Carnap’s view because Carnap expands the standard syntactic derivation rules of classical logic to include an omega rule:

A(0), A(1), A(2), …, A(i), A(i+1), …
∀n A(n)

Gödel’s Incompleteness Theorems do not apply to systems that include the omega rule.  However, a few problems remain for accounts that, like Carnap’s, include an omega rule.  First, the omega rule cannot be stated without presupposing the structure of the natural numbers; the numbers need to be presupposed in order to be included among the premises.  Second, because the omega rule requires infinitely many premises, it cannot be used in a finite proof.  And third, incompleteness results akin to Gödel’s can be secured for systems, like Carnap’s, that include an omega rule (see Rosser 1937).

Regardless of whether derivability accounts could be extensionally adequate, Benacerraf argues that they are inadequate as accounts of truth.  He argues that derivability, or “theoremhood” in a formal system, is at best a condition that guarantees truth; it is not an account of truth itself.  To put it another way, there is a difference between derivability (which is akin to verification) and truth (or even satisfaction).  A sentence is derivable in a system if it has a proof in that system; in contrast, a sentence is valid in the system if it is satisfied (or true) in all models of the system.  The fact that derivability coincides with validity in many logical systems is a substantive result—we prove that these two distinct features coincide by proving soundness and completeness theorems for the relevant systems.  But to prove that derivability and validity coincide, we need an account of the semantics of the system; we need an account of what makes it the case that a sentence is satisfied (or true) in a model of the system.  The same is required if derivability is to coincide with truth; a substantive account of truth will also be required.  In Benacerraf’s words, “any theory that proffers theoremhood as a condition of truth [must] also explain the connection between truth and theoremhood” (1973: 666).  Combinatorial accounts that identify mathematical truth with derivability fail to explain this connection, and hence are inadequate as accounts of mathematical truth.
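Benacerraf’s distinction can be put in standard notation: derivability (⊢) and semantic validity (⊨) are different relations, linked only by substantive metatheorems:

```latex
\text{Soundness: } \vdash_S \varphi \ \Rightarrow\ \models_S \varphi
\qquad
\text{Completeness: } \models_S \varphi \ \Rightarrow\ \vdash_S \varphi
```

Proving either direction already requires an independent account of what satisfaction in a model of S amounts to, which is exactly Benacerraf’s complaint against identifying truth with theoremhood.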

Certain stripes of conventionalists might avoid both the Gödelian Incompleteness problem and Benacerraf’s derivability-truth problem by taking a different approach to how additional true sentences logically follow from the initial postulates.  A conventionalist need only claim that mathematical truth is to be explained in terms of being a logical consequence of the initial sentences that are postulated to be true by convention, where logical consequence is a semantic notion cashed out in terms of satisfaction in all models.  A later iteration of Carnap, in Meaning and Necessity (1947), takes such an approach (though Carnap strictly speaks of holding in all state descriptions, rather than satisfaction in all models).

Benacerraf argues that such conventionalists encounter a different problem, albeit one that will also be a problem for formalists.  The problem is that merely setting a convention does not guarantee truth; the fact that certain mathematical claims are true cannot simply consist in the fact that there is a convention of taking them as true.  Here is how Benacerraf describes the problem:

[O]nce the logic is fixed, it becomes possible that the conventions thus stipulated turn out to be inconsistent.  Hence it cannot be maintained that setting down conventions guarantees truth.  But if it does not guarantee truth, what distinguishes those cases in which it provides for it from those in which it does not? Consistency cannot be the answer.  To urge it as such is to misconstrue the significance of the fact that inconsistency is proof that truth has not been attained. (Benacerraf 1973: 678-9)

This problem mirrors the formalist’s problem.  Proof and inconsistency are methods of demonstrating that mathematical claims are true, or that systems of postulates cannot be jointly true, respectively.  In either case, the epistemological explanation of how we know or demonstrate that a sentence is true, or how we know or demonstrate that a collection of sentences cannot be jointly true, is not itself an explanation of the concept of truth that is to be explained.  They confuse methods of discovery with the nature of what is discovered. The problem for both the formalist and the conventionalist is that their explanations of what makes mathematical claims true show no clear connections with any property that might be recognizable as truth.

So there are accounts of mathematical truth, namely combinatorial accounts, that appear to fare well with respect to the epistemological constraint.  The concern about such accounts is that they tie mathematical truth too closely to mathematical proof.  In attending too closely to the methods of demonstration or verification, such accounts lose track of the target phenomenon of truth that they are supposed to explain.

6. Conclusion

The mathematical truth-mathematical knowledge problem boils down to the fact that commitments to mathematical truth and commitments to mathematical knowledge stand and fall together, and that it is difficult to develop an account of mathematical truth that is amenable to both.  It seems that accounts of mathematical truth that satisfy the semantic constraint and fare well as accounts of truth (as opposed to some other property) are platonistic accounts that face substantial intuitive epistemological problems.  This is Benacerraf’s dilemma, the platonism-focused problem that has received most of the attention in the literature on the mathematical truth-mathematical knowledge problem.  But it is not the whole of the problem.  It also seems that accounts that are designed to fare well epistemologically are conventionalist or formalist accounts that rely on stipulation and derivation to undergird truth, and that these accounts intuitively fail to provide genuine accounts of truth.

Either way, proponents of mathematical truth face the challenge of developing an account of mathematical truth that satisfies both the semantic constraint and the epistemological constraint.  Some philosophers attempt to take on this challenge, and to develop accounts of mathematical truth and mathematical knowledge that are compatible with each other.  Others conclude that the challenge is so difficult as to be impossible.  Rejecting both mathematical truth and mathematical knowledge, these philosophers tend to adopt mathematical fictionalism instead.

7. References and Further Reading

  • Ayer, Alfred Jules (1958).  Language, Truth, and Logic.  New York: Dover.  Pages 71–87 reprinted in Benacerraf and Putnam (1964), 289–301.
  • Baker, Alan (2005).  “Are There Genuine Mathematical Explanations of Physical Phenomena?” Mind 114 (454): 223–238.
  • Balaguer, Mark (1998). Platonism and Anti-Platonism in Mathematics. Oxford: Oxford University Press.
  • Benacerraf, Paul (1968). “What Numbers Could Not Be.” Philosophical Review 74 (1): 47-73.
  • Benacerraf, Paul (1973). “Mathematical Truth.” The Journal of Philosophy 70 (8): 661-80.
  • Benacerraf, Paul and Putnam, Hilary (eds.) (1964). Philosophy of Mathematics: Selected Readings, 1st edition.  Englewood Cliffs, New Jersey: Prentice-Hall.
  • Berry, Sharon (2018). “(Probably) Not Companions in Guilt.” Philosophical Studies 175 (9): 2285-308.
  • Berto, Francesco, Rohan French, Graham Priest, and David Ripley (2018).  “Williamson on Counterpossibles.”  Journal of Philosophical Logic 47 (4): 693-713.
  • Bonevac, Daniel (1983). “Freedom and Truth in Mathematics.” Erkenntnis 20 (1): 93-102.
  • Burgess, John and Gideon Rosen (1997). A Subject with No Object. Oxford: Oxford University Press.
  • Burgess, John and Gideon Rosen (2005). “Nominalism reconsidered.” In Stewart Shapiro (ed.) The Oxford Handbook of Philosophy of Mathematics and Logic. New York: Oxford University Press.  515-35.
  • Carnap, Rudolf (1937).  The Logical Syntax of Language.  Trans. Amethe Smeaton.  London: Routledge.
  • Carnap, Rudolf (1947). Meaning and Necessity.  Chicago: The University of Chicago Press.  2nd edition: 1956.
  • Castañeda, Hector-Neri (1959). “Arithmetic and Reality.” Australasian Journal of Philosophy 37 (2): 92-107. Reprinted in Benacerraf and Putnam (1964), 404-17.
  • Casullo, Albert (1992). “Causality, Reliabilism, and Mathematical Knowledge.” Philosophy and Phenomenological Research 52 (3): 557-84.
  • Cheyne, Colin (1998). “Existence Claims and Causality.”  Australasian Journal of Philosophy 76 (1): 34-47.
  • Clarke-Doane, Justin (2012).  “Morality and Mathematics: The Evolutionary Challenge.”  Ethics 122 (2): 313-40.
  • Clarke-Doane, Justin (2014). “Moral Epistemology: The Mathematics Analogy.” Noûs 48 (2): 238-55.
  • Clarke-Doane, Justin (2017). “What is the Benacerraf Problem?” In Pataut (2017), 17-44.
  • Clarke-Doane, Justin (2022). Mathematics and Metaphilosophy. Cambridge: Cambridge University Press.
  • Creath, Richard (1980).  “Benacerraf and Mathematical Truth.” Philosophical Studies 37 (4): 335-40.
  • Colyvan, Mark (2001). The Indispensability of Mathematics. Oxford: Oxford University Press.
  • Colyvan, Mark (2007). “Mathematical Recreation vs Mathematical Knowledge.” In Leng, Paseau, and Potter (2007), 109-22.
  • Curry, Haskell B. (1954). “Remarks on the Definition and Nature of Mathematics.” Dialectica 8: 228-33.  Reprinted in Benacerraf and Putnam (1964), 152-56.
  • Dedekind, Richard (1888). “Was sind und was sollen die Zahlen?” tr. as “The Nature and Meaning of Numbers” in Essays on the Theory of Numbers (1924). Translation by W.W. Beman. Chicago: Open Court Publishing.
  • Dummett, Michael (1973). Frege: Philosophy of Language, London: Duckworth.
  • Field, Hartry (1988). “Realism, Mathematics, and Modality.” Philosophical Topics 16 (1): 57-107.
  • Fine, Kit (2006). “Our Knowledge of Mathematical Objects.” In Tamar Z. Gendler and John Hawthorne (eds.) Oxford Studies in Epistemology. Oxford: Clarendon Press. 89-109.
  • Frege, Gottlob (1884). Die Grundlagen der Arithmetik. English translation by J.L. Austin as The Foundations of Arithmetic. 2nd Revised Ed. Evanston, IL: Northwestern University Press, 1980.
  • Gasking, Douglas (1940). “Mathematics and the World.” Australasian Journal of Philosophy 18 (2): 97-116.  Reprinted in Benacerraf and Putnam (1964), 390-403.
  • Geach, Peter Thomas (1962). Reference and Generality: An Examination of some Medieval and Modern Theories. Ithaca: Cornell University Press.
  • Goldman, Alvin (1967). “A Causal Theory of Knowing.” Journal of Philosophy 64 (12): 357-72.
  • Grice, Paul (1961). “The Causal Theory of Perception.” Proceedings of the Aristotelian Society 35 (1): 121-52.
  • Hale, Bob (2001). “Singular Terms (1).” In Hale and Wright (2001), 31-47.
  • Hale, Bob and Wright, Crispin (2001). The Reason’s Proper Study: Essays Towards a Neo-Fregean Philosophy of Mathematics. Oxford: Clarendon Press.
  • Hale, Bob and Wright, Crispin (2002). “Benacerraf’s Dilemma Revisited.” European Journal of Philosophy 10 (1): 101-29.
  • Halimi, Brice (2017). “Benacerraf’s Mathematical Antinomy.” In Pataut (2017), 45-62.
  • Harman, Gilbert H. (1973).  Thought. Princeton, NJ: Princeton University Press.
  • Hart, W.D. (1979). “The Epistemology of Abstract Objects II: Access and Inference.” Proceedings of the Aristotelian Society 53: 153-65.
  • Hart, W.D. (1991a). “Benacerraf’s Dilemma.” Crítica, Revista Hispanoamericana de Filosofía 23 (68): 87-103.
  • Hart, W.D. (1991b). “Natural Numbers.” Crítica, Revista Hispanoamericana de Filosofía 23 (69): 61-81.
  • Hilbert, David (1925).  “On the Infinite.” Trans. Erna Putnam and Gerald J. Massey. Mathematische Annalen 95: 161-90.  Reprinted in Benacerraf and Putnam (1964), 134-51.
  • Joyce, Richard (2006). The Evolution of Morality. Cambridge, Mass: MIT Press.
  • Kasa, Ivan (2010). “On Field’s Epistemological Argument against Platonism.” Studia Logica 96 (2): 141-47.
  • Katz, Jerrold J. (1995). “What Mathematical Knowledge Could Not Be.” Mind 104 (415): 491-522.
  • Kitcher, Philip and Asprey, William (1987).  History and Philosophy of Mathematics Vol. 11.  Minneapolis: University of Minnesota Press.
  • Korman, Daniel (2019). “Debunking Arguments.” Philosophy Compass 14 (12): 1-17.
  • Korman, Daniel and Locke, Dustin (2020). “Against Minimalist Responses to Moral Debunking Arguments.” Oxford Studies in Metaethics 15: 309-332.
  • Leng, Mary (2007). “What’s There to Know? A Fictionalist Account of Mathematical Knowledge.” In Leng, Paseau, and Potter (2007), 84-108.
  • Leng, Mary (2010). Mathematics and Reality. Oxford: Oxford University Press.
  • Leng, Mary; Paseau, Alexander; and Potter, Michael (eds.) (2007). Mathematical Knowledge. Oxford: Oxford University Press.
  • Lewis, David (1986). On the Plurality of Worlds. Malden: Blackwell.
  • Liggins, David (2006). “Is There a Good Epistemological Argument Against Platonism?” Analysis 66 (2): 135-41.
  • Liggins, David (2010). “Epistemological Objections to Platonism.” Philosophy Compass 5 (1): 67-77.
  • Linnebo, Øystein (2006). “Epistemological Challenges to Mathematical Platonism.” Philosophical Studies 129 (3): 545-74.
  • Linnebo, Øystein (2018).  Thin Objects: An Abstractionist Account.  Oxford: Oxford University Press.
  • Maddy, Penelope (1980). “Perception and Mathematical Intuition.” The Philosophical Review 89 (2): 163-96.
  • Maddy, Penelope (1984). “Mathematical Epistemology: What is the Question?” The Monist 67 (1): 46-55.
  • Maddy, Penelope (1990). Realism in Mathematics. Oxford: Oxford University Press.
  • Maddy, Penelope (1991). “Philosophy of Mathematics: Prospects for the 1990s.”  Synthese 88 (2): 155-164.
  • Nolan, Daniel (1997). “Impossible Worlds: A Modest Approach.” Notre Dame Journal of Formal Logic 38 (4): 535-72.
  • Nutting, Eileen S. (2013). Understanding Arithmetic through Definitions. UCLA Dissertation. https://escholarship.org/uc/item/5xr5x4f1
  • Nutting, Eileen S. (2016). “To Bridge Gödel’s Gap.” Philosophical Studies. 173 (8): 2133-50.
  • Nutting, Eileen S. (2018). “Ontological Realism and Sentential Form.” Synthese 195 (11): 5021-5036.
  • Nutting, Eileen S. (2020). “Benacerraf, Field, and the Agreement of Mathematicians.” Synthese 197 (5): 2095-110.
  • Papineau, David (1988).  “Mathematical Fictionalism.”  International Studies in the Philosophy of Science 2 (2): 151-174.
  • Parsons, Charles (1979).  “Mathematical Intuition.” Proceedings of the Aristotelian Society 80: 145-168.
  • Parsons, Charles (2008). Mathematical Thought and Its Objects. Cambridge: Cambridge University Press.
  • Pataut, Fabrice (ed.) (2017).  Truth, Objects, Infinity: New Perspectives on the Philosophy of Paul Benacerraf. (Volume 28: Logic, Epistemology, and Unity of Science).  Dordrecht: Springer.
  • Quine, W.V.O. (1936).  “Truth by Convention.” In Otis E. Lee, ed., Philosophical Essays for A.N. Whitehead. New York: Longmans, Green and Co.  Reprinted in Benacerraf and Putnam (1964), 322-45.
  • Quine, W.V.O. (1948). “On What There Is.” Review of Metaphysics 2 (5): 21-36.
  • Resnik, Michael (1982). “Mathematics as a Science of Patterns: Epistemology.” Noûs 16 (1): 95-105.
  • Resnik, Michael (1995). “Scientific vs. Mathematical Realism: The Indispensability Argument.” Philosophia Mathematica 3 (2): 166-174.
  • Rosser, Barkley (1937). “Gödel Theorems for Non-Constructive Logics.” The Journal of Symbolic Logic 2 (3): 129-137.
  • Shapiro, Stewart (1997). Philosophy of Mathematics: Structure and Ontology. Oxford: Oxford University Press.
  • Skyrms, Brian (1967). “The Explication of ‘X knows that p’.” Journal of Philosophy 64 (12): 373-89.
  • Steiner, Mark (1973). “Platonism and the Causal Theory of Knowledge.” The Journal of Philosophy 70 (3): 57-66.
  • Street, Sharon (2006). “A Darwinian Dilemma for Realist Theories of Value.” Philosophical Studies 127 (1): 109-66.
  • Tait, W.W. (1986). “Truth and Proof: The Platonism of Mathematics.” Synthese 69 (3): 341-70.
  • Thurow, Joshua C. “The Defeater Version of Benacerraf’s Problem for A Priori Knowledge.” Synthese 190 (9): 1587-603.
  • von Neumann, John (1964).  “The Formalist Foundations of Mathematics.” In Benacerraf and Putnam (1964), 50-54.
  • Weinstein, Scott (1983). “The Intended Interpretation of Intuitionistic Logic.”  Journal of Philosophical Logic 12 (2): 261-270.
  • Williamson, Timothy (2017).  “Counterpossibles in Semantics and Metaphysics.”  Argumenta 2 (2): 195-226.
  • Yap, Audrey (2009). “Logical Structuralism and Benacerraf’s Problem.” Synthese 171 (1): 157-73.

 

Author Information

Eileen S. Nutting
Email: nutting@ku.edu
The University of Kansas
U. S. A.

Michel de Montaigne (1533-1592)

Michel de Montaigne, the sixteenth-century French essayist, is one of the most renowned literary and philosophical figures of the late Renaissance.  The one book he wrote, Les Essais de Michel de Montaigne, is not a traditional work of philosophy.  Having begun work on it around 1572, he published the first edition in 1580.  He then went on to publish four more editions during the 1580s, adding new material each time, and was at work on a sixth edition—which would extend the length of the book by nearly a third—when he died in 1592.  Over the course of 107 chapters he ranges over a great number of typical philosophical topics such as skepticism, education, virtue, friendship, politics, poetry, and death, as well as many less traditional topics such as drunkenness, horse riding techniques, smells, and his own dietary preferences.  There is even a chapter on thumbs.  Aiming both to address these topics and to make himself known to the reader, Montaigne relates stories from ancient and contemporary sources, recounts his own experiences, interjects quotations from ancient Greek and Roman texts, and offers his own personal judgments.  In the text, digressions, inconsistencies, and exaggerations abound; Montaigne himself described it as “a book with a wild and eccentric plan” and “a bundle of so many disparate pieces.”  His motto was “What do I know?”

To some of Montaigne’s sixteenth-century European contemporaries, the Essais seemed to mark the birth of French philosophy.  One dubbed him “The French Thales”; others called him “The French Socrates.”  While for most of the twentieth century philosophers’ interests in Montaigne were largely limited to his role in the history of skepticism, in the last forty years he has begun to receive more scholarly attention for his contributions to moral and political philosophy, as well as to the ways in which his work anticipates various subsequent philosophical and political movements, such as liberalism, pragmatism, and postmodernism.

Table of Contents

  1. Life
  2. The Philosophical Projects of the Essays
  3. Skepticism
  4. Moral Relativism
  5. Moral and Political Philosophy
  6. Philosophical Legacy
  7. References and Further Reading
    1. Selected Editions of Montaigne’s Essays in French and English
    2. Secondary Sources

1. Life

Michel Eyquem de Montaigne was born in the château Montaigne, thirty miles east of Bordeaux, on February 28, 1533.   His father, Pierre Eyquem, was the first in the family to lead the life of a minor nobleman, living entirely off of his assets and serving as a soldier in the armies of King Francis I before returning in 1528 to live on the estate that his grandfather, a wealthy herring merchant, had purchased in 1477.  Montaigne’s mother, Antoinette de Loupes de Villeneuve, came from a wealthy bourgeois family that had settled in Toulouse at the end of the 15th century.  Montaigne describes Eyquem as “the best father that ever was,” and mentions him often in the Essays.  Montaigne’s mother, on the other hand, is almost totally absent from her son’s book.  Amidst the turbulent religious atmosphere of sixteenth century France, Eyquem and his wife raised their children Catholic.  Michel, the eldest of eight children, remained loyal to the Catholic Church his entire life, while three of his siblings became Protestants.

Montaigne reports that as an infant he was sent to live with a poor family in a nearby village so as to cultivate in him a natural devotion to “that class of men that needs our help” (“Of experience”).  When Montaigne returned as a young child to live at the château, Eyquem arranged for Michel to be awakened each morning to music.  He then hired a German tutor to teach Michel Latin.  Members of the household were forbidden to speak to the young Michel in any other language; as a result, Montaigne reports that he was six years old before he learned any French.  It was at this time that Eyquem sent Montaigne to attend the prestigious Collège de Guyenne, where he studied under the Scottish humanist George Buchanan.

The details of Montaigne’s life between his departure from the Collège at age thirteen and his appointment as a Bordeaux magistrate in his early twenties are largely unknown.  He is thought to have studied the law, perhaps at Toulouse.  In any case, by 1557 he had begun his career as a magistrate, first in the Cour des Aides de Périgueux, a court with sovereign jurisdiction in the region over cases concerning taxation, and later in the Bordeaux Parlement, the highest court of appeals in Guyenne.  There he encountered Etienne La Boétie, with whom he formed an intense friendship that lasted until La Boétie’s sudden death in 1563.  Years later, the bond he shared with La Boétie would inspire one of Montaigne’s best-known essays, “Of Friendship.”  Two years after La Boétie’s death Montaigne married Françoise de la Chassaigne.  His relationship with his wife seems to have been amiable but cool; it lacked the spiritual and intellectual connection that Montaigne had shared with La Boétie.  Their marriage produced six children, but only one survived infancy: a daughter named Léonor.

Montaigne’s career in the Parlement was not a distinguished one, and he was passed over for higher offices.  Meanwhile, after years of simmering tensions between Catholics and Protestants in France, the French Wars of Religion had begun in 1562.  They would continue intermittently throughout the rest of Montaigne’s life, and thus provide the context for much of Montaigne’s social and political thought.  In 1570 Montaigne sold his office in the Parlement and retreated to his château, where in 1571 he built the tower that was to house the famous study where he had Greek, Roman, and Biblical epigrams painted onto the ceiling joists in Latin and Greek.  Less than a year later he began to write the first chapters of what would become his Essais.  Nevertheless, retirement from the Parlement did not mean the abandonment of political aspirations.  Montaigne courted the patronage of several regional nobles who seem to have helped to bring him to the attention of King Charles IX, who made him a Gentleman of the King’s Chamber and a Knight of the Order of Saint Michel in 1571.  He occasionally served as an envoy on behalf of members of the high nobility during the 1570s, and in 1577 Montaigne was made a Gentleman of the King’s Chamber by Henri, King of Navarre, an independent kingdom just north of the Pyrenees in what is now southwest France.  Between diplomatic missions, he continued to write.

By 1580 he had completed his book.  It took the form of ninety-four chapters divided into two books bound in a single volume, and he gave it the title Essais de Messire Michel Seigneur de Montaigne, adding on the title page his honorific titles of “Knight of the Order of the King,” and “Ordinary Gentleman of His Chamber.”  He printed the book in Bordeaux, and then personally delivered a copy to Henri III at Saint-Maur-des-Fossés.  Shortly after his audience with the king, Montaigne embarked on a trip to Rome via Germany and Switzerland.  Montaigne recorded the trip in a journal that he apparently never intended to publish.  Lost after his death, it was rediscovered and published for the first time in the 18th century as the Journal de Voyage.  While Montaigne tells us in later editions of the Essais that the reasons for his trip were his hope of finding relief from his kidney stones in the mineral baths of Germany, his desire to see Rome, and his love of travel, it has recently been argued that the 1580 edition of the Essais was more a political project than a theoretical one, and that in writing his book, Montaigne intended to gain the attention of the king and demonstrate how well-suited he would be for a career as a high-level diplomat or counselor.  Thus, his primary motivation for the trip to Rome may have been his hope that Henri III would make him an interim ambassador there.  As it turned out, Montaigne was never offered the post, and in 1581, the king called him back to Bordeaux to serve as the city’s mayor.

Montaigne’s first two-year term as mayor was mostly uneventful.  His second term was much busier, as the death of the Duke of Anjou made Henri of Navarre, a Protestant, heir to the French throne.  This resulted in a three-way conflict between the reigning Catholic King Henri III of France, Henri de Guise, leader of the conservative Catholic League, and Henri of Navarre.  Bordeaux, which remained Catholic and loyal to Henri III, was situated in close proximity to Navarre’s Protestant forces in southwest France.  As a mayor loyal to the king and on friendly terms with Navarre, who visited Montaigne twice in the 1580s, Montaigne worked successfully to keep the peace, protecting the city from seizure by the League while also maintaining diplomatic relations with Navarre.  By the end of his second term, however, relations between Catholics and Protestants, and between Henri III and Navarre, had deteriorated.  Returning to his château in 1586, he began to write what would become the third part of his Essais.  Though relegated to the sidelines, his political career was not quite over.  Regarded by both kings as diplomatically capable and trustworthy, it seems that Navarre sent him on a secret mission to Henri III in Paris in February 1588.  Montaigne took the occasion of the trip to deliver the manuscript for the fifth edition of the Essais to his printer in Paris.  Apparently his mission was unsuccessful; no agreement was reached between Henri III and Navarre.  In May 1588, when Henri III was forced to flee Paris due to an uprising instigated by the Catholic League, Montaigne went with him.  When he returned to Paris in July, Montaigne was imprisoned in the Bastille on the orders of a duke loyal to the League, “by right of reprisal” against Henri III.  Released on the same day at the request of Catherine de Medici, the Queen mother, Montaigne collected his copies of the newly printed fifth edition of his book, and left Paris immediately.

He did not, however, go home to Montaigne.  Earlier that spring, he had made the acquaintance of Marie de Gournay, daughter of the king’s treasurer and, as a result of her having read the Essais years earlier, a great admirer of Montaigne’s.  So, instead of returning to Bordeaux, Montaigne travelled to Picardy, to pay a visit to Gournay and her mother.  He would return to their home perhaps three times that summer and fall, forming a friendship that would result in Gournay becoming Montaigne’s literary executrix.  Gournay turned out to be a notable philosopher in her own right, and went on to compose essays on a variety of topics, including equality between the sexes, in addition to faithfully bringing out new editions of the Essais throughout the rest of her life.  (See Gournay.)

When Navarre finally succeeded Henri III as king of France in 1589, he invited Montaigne to join him at court, but Montaigne was too ill to travel.  He spent the majority of the last three years of his life at the château, where he continued to make additions to the Essais by writing new material in the margins of a copy of the 1588 edition, thereby extending the length of his book by about one-third.  He died on September 13, 1592, never having published what he intended to be the sixth edition of his Essais.

Gournay learned of Montaigne’s death three months later from Justus Lipsius, and was given what is now known as the “Exemplar,” one of the two personal copies of the fifth edition of the Essais into the margins of which Montaigne had written corrections and additions for the purposes of publishing a sixth edition.  Gournay used this text to put together the first posthumous edition of the book, which she edited and published in 1595.  With the “Exemplar” having been destroyed during the printing process (as was customary at the time), Gournay’s edition of the Essais was the only version that would be read for the next two hundred years, until the other personal copy marked with Montaigne’s handwritten corrections and additions was discovered.  This text, known today as the “Bordeaux Copy,” contained roughly two hundred passages that differed in minor ways from the 1595 edition, and eventually achieved near-canonical status as the authoritative text of the Essais in the twentieth century.  Still, the scholarly debate over which version of Montaigne’s text should be considered authoritative continues today, as exemplified by the 2007 publication of a Pléiade edition of the Essais based on the 1595 text.

2. The Philosophical Projects of the Essays

Montaigne wrote different portions of his book at different times and in different personal and political contexts, and his fundamental interests in life were neither purely philosophical nor purely political.  Thus, it should come as no surprise that Montaigne writes in “Of friendship” that his book is monstrous, that is, “pieced together of diverse members, without definite shape.”  This is certainly the way the book initially presents itself to the reader, and consequently, piecing together Montaigne’s fundamental goals and purposes in writing his Essais is a contentious business.

Since Montaigne was the first author to call his writings “essays,” he is often described as the “inventor of the essay,” a description that is at once apt and misleading.  It is misleading in that today we tend to think of an essay as a free-standing literary unit with its own title and subject, composed and published independently, and perhaps later collected into an anthology with previously published pieces of the same kind.  If this is what we mean by an “essay” today, then Montaigne could not be said to have invented the essay, for two reasons.  First, this genre dates back to the ancient world; Plutarch, for example, Montaigne’s favorite writer and philosopher, could be said to have written such “essays,” as could Seneca, another ancient author from whom Montaigne borrows liberally.  Second, Montaigne, who referred to the individual units of his book as “chapters,” never published any of those chapters independently.

When Montaigne gives the title Essais to his book, he does not intend to denote the literary genre of the works contained therein so much as to refer to the spirit in which they are written and the nature of the project out of which they emerge.  The title is taken from the French verb “essayer,” which  Montaigne employs in a variety of senses throughout his Essais, where it carries such meanings as “to attempt,” “to test,” “to exercise,” and “to experiment.”  Each of these expressions captures an aspect of Montaigne’s project in the Essais.  To translate the title of his book as “Attempts” would capture the epistemic modesty of Montaigne’s essays, while to translate it as “Tests” would reflect the fact that he takes himself to be testing his judgment.  “Exercises” would communicate the sense in which essaying is a way of working on oneself, while “Experiments” would convey the exploratory spirit of the book.

That he presented his philosophical reflections in a particular literary form is, of course, no accident.  And while it is somewhat misleading to understand the chapters of Montaigne’s book to be essays in our current sense of the term, they do certainly possess a number of features that remain associated with the essay as a literary form today.  For the most part, they are short, covering less than twenty pages, and to a certain extent they can be taken to constitute free-standing literary and philosophical units.  Stylistically, they are suitable for a general audience: informal and conversational in tone, they are free of philosophical jargon and formal argumentation.  In “The Essay as Form,” a reflection on the contemporary genre of the essay, Theodor Adorno contrasts the spirit in which essays are written with the four methodological principles that Descartes introduces in his Discourse on Method.  Whereas Descartes vows to assent only to propositions that are clear and distinct; to analyze problems into their simple component parts; to proceed in an orderly fashion, starting with the simple and then moving to the most complex; and to ensure that matters are dealt with completely, the essay, according to Adorno, does the opposite, eschewing certainty, analysis into parts, logical order, and completeness.  The same can be said for the individual chapters of Montaigne’s book as well as for the book as a whole.  For the Essais appears to be a decidedly unsystematic work in almost every respect.  The sixth and final edition of the text is composed of 107 chapters on a wide range of topics, including—to name a few—knowledge, education, love, the body, death, politics, the nature and power of custom, and the colonization of the “New World.”  Chapter titles often seem only tangentially related to their contents, and there rarely seems to be any explicit connection between one chapter and the next.
The lack of logical progression from one chapter to the next creates a sense of disorder that is compounded by Montaigne’s style, which can be described as deliberately nonchalant.  Montaigne intersperses historical anecdotes, Latin quotations—often unattributed—from ancient authors, and autobiographical remarks throughout the book, and most essays include a number of digressions.  In some cases the digressions seem to be due to Montaigne’s almost stream-of-consciousness style,  while in others they are the result of his habit of inserting additions (sometimes just a sentence or two, other times a number of paragraphs) into essays years after they were first written.

Still, it should be noted that in “Of vanity,” Montaigne warns readers against mistaking the disorderly form of his text for a lack of coherence: “I go out of my way, but rather by license than carelessness.  My ideas follow one another, but sometimes it is from a distance, and look at each other, but with a sidelong glance . . . It is the inattentive reader who loses my subject, not I.  Some word about it will always be found off in a corner, which will not fail to be sufficient, though it takes little room.”  And indeed, in many cases, scholars have discovered connections that link one chapter to the next, and found both individual chapters and the book as a whole to be less disjointed than they initially appear to be.  Thus, while individual chapters can be read profitably on their own, a full appreciation of each chapter’s meaning and significance requires close attention to its relation to surrounding chapters and the Essais as a whole.  Moreover, it requires study of the literary elements of the Essais, such as the images, metaphors, and stories mentioned above.  These elements are not merely ornamental; Montaigne’s decision to deploy these literary elements derives from his anthropology, according to which we are better understood as imaginative creatures than rational animals.  For Montaigne, then, the form and the content of the Essais are internally related.

One example of this is the way that the nature of Montaigne’s project itself contributes to the disorderly style of his book.  Part of that project is to cultivate his own judgment.  For Montaigne, “judgment” refers to the sum total of our intellectual faculties; in effect, it denotes the interpretive lens through which we view the world.  One way in which he cultivates his judgment is simply by exercising it through simple practice.  As he writes in “Of Democritus and Heraclitus”:

Judgment is a tool to use on all subjects, and comes in everywhere. Therefore in the tests (essais) that I make of it here, I use every sort of occasion.  If it is a subject I do not understand at all, even on that I essay my judgment, sounding the ford from a good distance; and then, finding it too deep for my height, I stick to the bank.  And this acknowledgment that I cannot cross over is a token of its action, indeed one of those it is most proud of.  Sometimes in a vain and nonexistent subject I try (j’essaye) to see if [my judgment] will find the wherewithal to give it body, prop it up, and support it.  Sometimes I lead it to a noble and well-worn subject in which it has nothing original to discover, the road being so beaten that it can only walk in others’ footsteps.  There it plays its part by choosing the way that seems best to it, and of a thousand paths it says that this one or that was the most wisely chosen.

One look at the Essais’ table of contents will convince readers that he is true to his word when he writes of taking up what would seem like “vain and nonexistent” subjects.  Chapter titles include: “Of smells”; “Of thumbs”; “A trait of certain ambassadors”; and “Of the arms of the Parthians.”  Montaigne holds that in cultivating one’s judgment, “everything that comes to our eyes is book enough: a page’s prank, a servant’s blunder, a remark at table, are so many new materials” (“Of the education of children”).  The goal of cultivating his judgment and the conviction that everything one encounters in the world can be useful for this purpose result in a book that contains topics that seem out of place in an ordinary philosophical treatise and thus give rise to the reader’s sense of the haphazard character of the book.

An additional way in which he aims to cultivate his judgment is through attempting to transform his customary or habitual judgments into reflective judgments that he can self-consciously appropriate as his own.  In a well-known passage from “Of custom, and not easily changing an accepted law,” Montaigne discusses how habit “puts to sleep the eye of our judgment.”  To “wake up” his judgment from its habitual slumber, Montaigne must call into question those beliefs, values, and judgments that ordinarily go unquestioned.  By doing so, he is able to see more clearly the extent to which they seem to be reasonable, and so decide whether to take full ownership of them or to abandon them.  In this sense we can talk of Montaigne essaying, or testing, his judgment.  We find clear examples of this in essays such as “Of drunkenness” and “Of the resemblance of children to their fathers,” where he tests his pre-reflective attitudes toward drunkenness and doctors, respectively.

Another part of Montaigne’s project that contributes to the form his book takes is to paint a vivid and accurate portrait of himself in words.  For Montaigne, this task is complicated by his conception of the self.  In “Of repentance,” for example, he announces that while others try to form man, he simply tells of a particular man, one who is constantly changing:

I cannot keep my subject still.  It goes along befuddled and staggering, with a natural drunkenness.  I take it in this condition, just as it is at the moment I give my attention to it.  I do not portray being: I portray passing….  I may presently change, not only by chance, but also by intention.  This is a record of various and changeable occurrences, and of irresolute and, when it so befalls, contradictory ideas: whether I am different myself, or whether I take hold of my subjects in different circumstances and aspects.  So, all in all, I may indeed contradict myself now and then; but truth, as Demades said, I do not contradict. (“Of repentance”)

Given Montaigne’s expression of this conception of the self as a fragmented and ever-changing entity, it should come as no surprise that we find contradictions throughout the Essays.  Indeed, one of the apparent contradictions in Montaigne’s thought concerns his view of the self.  While on the one hand he expresses the conception of the self outlined in the passage above, in the very same essay – as if to illustrate the principle articulated above – he asserts that his self is unified by his judgment, which has remained essentially the same his entire life, as well as by what he calls his “ruling pattern,” which he claims is resistant to education and reform.

In part, his self-portraiture is motivated by a desire for self-knowledge.  There are two components to Montaigne’s pursuit of self-knowledge.  The first is the attempt to understand the human condition in general.  This involves reflecting on the beliefs, values, and behavior of human beings as represented both in literary, historical, and philosophical texts, and in his own experience.  The second is to understand himself as a particular human being.  This involves recording and reflecting upon his own idiosyncratic tastes, habits, and dispositions.  Thus while the Essais is not an autobiography, it contains a great deal of autobiographical content, some of which may seem arbitrary and insignificant to readers.  Yet for Montaigne, there is no detail that is insignificant when it comes to understanding ourselves: “each particle, each occupation, of a man betrays and reveals him just as well as any other” (“Of Democritus and Heraclitus”).

Still another fundamental goal of essaying himself is to present his unorthodox ways of living and thinking to the reading public of 16th century France.  Living in a time of war and intolerance, in which men were concerned above all with honor and rank in a hierarchical French society, Montaigne presents his own way of life as an attractive alternative.  He presents to readers not the life of a great public figure, such as one would find in Plutarch’s Lives, but the merely passable and ordinary life of an individual who for the most part led a private life, distinguishing himself neither on the battlefield nor in government.  Eschewing self-mastery and the pursuit of moral perfection that one finds among ancient Greek and Roman philosophers and Christian ascetics, he claims to be basically satisfied with himself (“Of repentance”), and in his one public role, as mayor of Bordeaux, he praises himself for not having made things worse (“Of husbanding your will”).  Montaigne’s character marries compassion, innocence, and self-acceptance to courage, prudence, and moderation, and in presenting such a figure to his audience, he thereby problematizes prevailing conceptions of the good life that emphasized Stoic self-discipline, heroic virtue, and religious zeal.

Similarly, he presents his ways of behaving in the intellectual sphere as alternatives to what he takes to be prevailing habits among Renaissance philosophers.  He claims not to have spent much time studying Aristotle, the “god of scholastic knowledge” (“Apology for Raymond Sebond”).  He eschews definition and deduction, instead opting for description of particulars, and he does not do natural philosophy or metaphysics, as traditionally conceived: “I study myself more than any other subject.  That is my metaphysics, that is my physics” (“Of repentance”).  While he discusses historical events and testimonies frequently, and eagerly reports what he has learned about the “New World,” he confesses that he cannot vouch for the truth of what he relays to his readers and admits that in the end, whether the accounts he relates are accurate or not is not so important as the fact that they represent “some human potentiality” (“Of the power of the imagination”).  Moreover, Montaigne rarely makes what philosophers would recognize as arguments. Rather than discursively justifying the value of his ways of being by appeal to general principles, Montaigne simply presents them to his readers: “These are my humors and my opinions; I offer them as what I believe, not what is to be believed.  I aim here only at revealing myself, who will perhaps be different tomorrow, if I learn something new which changes me.  I have no authority to be believed, nor do I want it, feeling myself too ill-instructed to instruct others” (“Of the education of children”).  Yet while he disavows his own authority, he admits that he presents this portrait of himself in the hopes that others may learn from it (“Of practice”).  In essaying himself, then, Montaigne’s ends are both private and public: on the one hand, he desires to cultivate his judgment and to develop his self-understanding; on the other hand, he seeks to offer examples of his own habits as salutary alternatives to those around him.

3. Skepticism

One topic on which Montaigne does offer readers traditional philosophical arguments is skepticism, a philosophical position of which he often speaks approvingly, especially in the longest chapter of the Essais, “Apology for Raymond Sebond.”  Just what exactly Montaigne’s own skepticism amounts to has been the subject of considerable scholarly debate.  Given the fact that he undoubtedly draws inspiration for his skeptical arguments from the ancient Greeks, the tendency has been for scholars to locate him in one of those skeptical traditions.  While some interpret him as a modern Pyrrhonist, others have emphasized what they take to be the influence of the Academics.  Still other scholars have argued that while there are clearly skeptical moments in his thought, characterizing Montaigne as a skeptic fails to capture the true nature of Montaigne’s philosophical orientation.  Each of these readings captures an aspect of Montaigne’s thought, and consideration of the virtues of each of them in turn provides us with a fairly comprehensive view of Montaigne’s relation to the various philosophical positions that we tend to identify as “skeptical.”

The Pyrrhonian skeptics, according to Sextus Empiricus’ Outlines of Pyrrhonism, use skeptical arguments to bring about what they call equipollence between opposing beliefs.  Once they recognize two mutually exclusive and equipollent arguments for and against a certain belief, they have no choice but to suspend judgment.  This suspension of judgment, they say, is followed by tranquility, or peace of mind, which is the goal of their philosophical inquiry.

In “Apology for Raymond Sebond,” Montaigne expresses great admiration for the Pyrrhonists and their ability to maintain the freedom of their judgment by avoiding commitment to any particular theoretical position.  We find him employing the skeptical tropes introduced by Sextus in order to arrive at equipollence and then the suspension of judgment concerning a number of theoretical issues, from the nature of the divine to the veracity of perception.  Elsewhere, such as in the very first essay of his book, “By diverse means we arrive at the same end,” Montaigne employs skeptical arguments to bring about the suspension of judgment concerning practical matters, such as whether the best way to obtain mercy is by submission or defiance.  Introducing historical examples that speak for each of the two positions, he concludes that “truly man is a marvelously vain, diverse, and undulating object.  It is hard to found any constant and uniform judgment on him.”  It seems that we cannot, then, achieve certainty regarding practical matters any more than we can regarding theoretical matters.

If there are equipollent arguments for and against any practical course of action, however, we might wonder how Montaigne is to avoid the practical paralysis that would seem to follow from the suspension of judgment.  Here Sextus tells us that Pyrrhonists do not suffer from practical paralysis because they allow themselves to be guided by the way things seem to them, all the while withholding assent regarding the veracity of these appearances.  Thus Pyrrhonists are guided by passive acceptance of what Sextus calls the “fourfold observances”: guidance by nature, necessitation by feelings, the handing down of laws and customs, and the teaching of practical expertise.  The Pyrrhonist, then, having no reason to oppose what seems evident to her, will seek food when hungry, avoid pain, abide by local customs, and consult experts when necessary – all without holding any theoretical opinions or beliefs.

In certain cases, Montaigne seems to abide by the fourfold observances himself.  At one point in “Apology for Raymond Sebond,” for instance, he seems to suggest that his allegiance to the Catholic Church is due to the fact that he was raised Catholic and Catholicism is the traditional religion of his country.  This has led some scholars to interpret him as a skeptical fideist who is arguing that because we have no reasons to abandon our customary beliefs and practices, we should remain loyal to them.  Indeed, some Catholics would employ this argument in the Counter-Reformation movement of the sixteenth and seventeenth centuries.  (Nonetheless, other readers have argued that Montaigne is actually an atheist, and in fact the Essais would be placed on the Catholic Church’s Index of Prohibited Books in the late seventeenth century, where it would remain for nearly two hundred years.)

Yet, for all the affinities between Montaigne and the Pyrrhonists, he does not always suspend judgment, and he does not seem to take tranquility to be the goal of his philosophical reflections.  Thus some scholars have argued that Montaigne has more in common with the Academic Skeptics than with the Pyrrhonists.  The Academics, at certain points in the history of their school, seem to have allowed that some judgments are more probable or justified than others, thereby permitting themselves to make judgments, albeit with a clear sense of their fallibility, and this is precisely the stance towards his judgments that Montaigne seems to take throughout the Essais.  Thus Montaigne’s remarks are almost always prefaced by acknowledgments of their fallibility: “I like these words, which soften and moderate the rashness of our propositions: ‘perhaps,’ ‘to some extent,’ ‘some,’ ‘they say,’ ‘I think,’ and the like” (“Of experience”).  Another hallmark of Academic Skepticism was the strategy of dialectically assuming the premises of interlocutors in order to show that they lead to conclusions at odds with the interlocutors’ beliefs.  Montaigne seems to employ this argumentative strategy in “Apology for Raymond Sebond.” There he dialectically accepts the premises of Sebond’s critics in order to reveal the presumption and confusion involved in their objections to Sebond’s project.  For example, Montaigne shows that according to the understanding of knowledge held by Sebond’s secular critics, there can be no knowledge.  This is not necessarily the dramatic and dogmatic conclusion that it has appeared to be to some scholars, since Montaigne’s conclusion may be founded upon a premise that he himself rejects.  If we understand knowledge as Sebond’s critics do, then there can be no knowledge.  But there is no reason why we must accept their notion of knowledge in the first place.
In this way, just as the Academic Skeptics argued that their Stoic opponents ought to suspend judgment, given the Stoic principles to which they subscribe, so Montaigne shows that Sebond’s secular critics must suspend judgment, given the epistemological principles that they claim to espouse.

Still other scholars have argued that while Montaigne certainly employs Pyrrhonian and Academic argumentative strategies in the Essais, in the final analysis it is misleading to characterize him as a skeptic.  While they acknowledge both that there is a skeptical moment in his thought and that he takes a fallibilistic stance toward his own judgments, such scholars point to the fact that Montaigne not only seems to hold some beliefs with a degree of conviction inappropriate for a traditional skeptic, but also argues for unconventional moral positions.  When we take a broader view of the Essays as a whole, they suggest, we find that Montaigne’s employment of skeptical tropes is fairly limited and that while he shares the ancient skeptics’ concern to undermine human presumption, that is not the only lesson that he sets out to teach his readers.

4. Moral Relativism

One of the primary targets of Montaigne’s attack on presumption is ethnocentrism, or the belief that one’s culture is superior to others and therefore is the standard against which all other cultures, and their moral beliefs and practices, should be measured.  This belief in the moral and cultural superiority of one’s own people, Montaigne finds, is widespread.  It seems to be the default belief of all human beings.  The first step he takes toward undermining this prejudice is to display the sheer multiplicity of human beliefs and practices.  Thus, in essays such as “Of some ancient customs,” “Of custom, and not easily changing an accepted law,” and “Apology for Raymond Sebond,” Montaigne catalogues the variety of behaviors to be found in the world in order to draw attention to the contingency of his own cultural norms.  By reporting so many practices that are at odds with contemporary European customs, he creates something like an inverted world for his readers, stunning their judgment by forcing them to question which way is up: here men urinate standing up and women do so sitting down; elsewhere it is the opposite.  Here we bury our dead; there they eat them.  Here we believe in the immortality of the soul; in other societies such a belief is nonsense, and so on.

Montaigne is not terribly optimistic about reforming the prejudices of his contemporaries, for simply reminding them of the apparent contingency of their own practices in most cases will not be enough.  The power of custom over our habits and beliefs, he argues, is stronger than we tend to recognize.  Indeed, Montaigne devotes almost as much time in the Essays to discussing the power of custom to shape the way we see the world as he does to revealing the various customs that he has come across in his reading and his travels.  Custom, whether personal or social, puts to sleep the eye of our judgment, thereby tightening its grip over us, since its effects can only be diminished through deliberate and self-conscious questioning.  It begins to seem as if it is impossible to escape custom’s power over our judgment: “Each man calls barbarism whatever is not his own practice; for indeed it seems we have no other test of truth and reason than the example and pattern of the opinions and customs of the country we live in” (“Of cannibals”).

Montaigne’s concern with custom and cultural diversity, combined with his rejection of ethnocentrism, has led many scholars to argue that Montaigne is a moral relativist, which is to say that he holds that there is no objective moral truth and that therefore moral values are simply expressions of conventions that enjoy widespread acceptance at a given time and place.  And there are passages that seem to support this interpretation: “The laws of conscience, which we say are born of nature, are born of custom.  Each man, holding in inward veneration the opinions and behavior approved and accepted around him, cannot break loose from them without remorse, or apply himself to them without self-satisfaction.”

Yet elsewhere in the Essais Montaigne says and does things that suggest a commitment to moral objectivism, or the theory that there is in fact objective moral truth.  First, Montaigne does not hesitate to criticize customary values and practices.  For instance, in “Of cannibals,” after praising the virtues of the cannibals, he criticizes them for certain behaviors that he identifies as morally vicious, and then goes on to criticize his own culture.  For a relativist, such criticism would be unintelligible: if there is no objective moral truth, it makes little sense to criticize others for having failed to abide by it.  Rather, since there is no external standard by which to judge other cultures, the only logical course of action is to pass over them in silence. Then there are moments when Montaigne seems to refer to categorical duties, or moral obligations that are not contingent upon either our own preferences or cultural norms (see, for example, the conclusion of “Of cruelty”).  Finally, Montaigne sometimes seems to allude to the existence of objective moral truth, for instance in “Of some verses of Virgil” and “Of the useful and the honorable,” where he distinguishes between relative and absolute values.

Thus, Montaigne’s position regarding moral relativism remains the subject of scholarly dispute.  What is not a matter of dispute, however, is that Montaigne was keenly interested in undermining his readers’ thoughtless attitudes towards other cultures, as well as their naïve acceptance of the customs of their own.

5. Moral and Political Philosophy

Montaigne rarely makes explicitly prescriptive moral or political arguments.  Still, the Essais are the expression of a distinctive view of the good life, a view that is self-consciously at odds with views and attitudes that Montaigne takes to be both fairly widespread among his audience and in some sense derived from or connected to major currents in the history of Western philosophy and Christian theology.  And while he presents himself as telling readers about his way of life, rather than teaching them how they ought to live, he admits at one point in “Of giving the lie” that he does aim to edify his reader, albeit indirectly.  Rather than a systematically elaborated and discursively justified ethics, then, he offers readers a series of provocations built into a descriptive account of a particular vision of the good.  These provocations can take any number of forms, including bald assertions, juxtapositions of familiar figures from the ancient world, stories, appeals to the authority of poets and ancient philosophers, and anecdotes about himself.  Ultimately, each contributes to what scholars have variously referred to as Montaigne’s attempt to effect “a transvaluation of values” or “a reordering” of his contemporaries’ conceptions of virtue and vice.

An essential element of his “reordering” is his account of the human condition.  While Montaigne does not frame it this way himself, it might be helpful to readers to juxtapose Montaigne’s anthropology and ethics with those that Giovanni Pico della Mirandola propounds in his famous Oration on the Dignity of Man (published in 1496).  There human beings are celebrated for the freedom that they possess to transform themselves into angels by means of the use of reason.  Montaigne, on the other hand, moves readers in the opposite direction, drawing our attention to our animality, challenging the pretensions of reason, and emphasizing the ways in which our agency is always limited and often thwarted.

Thus, Montaigne repeatedly challenges dualistic conceptions of the human being.  “It is a wonder how physical [our] nature is” (“Of the art of discussion”), he writes, and to remind readers of this basic fact of our being, which he fears we tend to forget, Montaigne spends a great deal of time in “Apology for Raymond Sebond” drawing readers’ attention to our own animality and the ways in which we resemble other animals, while chiding us for our presumptuous confusion, both about what we are and which goods are most deserving of our care and attention: “We attribute to ourselves imaginary and fanciful goods, goods future and absent, for which human capacity by itself cannot answer, or goods which we attribute to ourselves falsely through the license of our opinion, like reason, knowledge, and honor.  And to them for their share we leave essential, tangible, and palpable goods: peace, repose, security, innocence, and health—health, I say, the finest and richest present that nature can give us.”  Elsewhere he takes a different tack, reminding readers of the vulnerability of our bodies to injury, disease, and death, pointing out the way that experience teaches us that our capacity for philosophical reflection depends entirely upon our physical condition, and thus that philosophers ought to acknowledge more vocally and explicitly the great good that is health: “Health is a precious thing, and the only one, in truth, which deserves that we employ in its pursuit not only time, sweat, trouble, and worldly goods, but even life; inasmuch as without it life comes to be painful and oppressive to us.  
Pleasure, wisdom, knowledge, and virtue, without it, grow tarnished and vanish away; and to the strongest and most rigorous arguments that philosophy would impress on us to the contrary, we have only to oppose the picture of Plato being struck with a fit of epilepsy or apoplexy; and on this supposition to defy him to call to his aid these noble and rich faculties of his soul” (“Of the resemblance of children to fathers”).

It is no accident that Montaigne here adds pleasure to wisdom, knowledge, and virtue on this list of the greatest goods for human beings.  While Montaigne consistently describes pleasure, whether intellectual or physical, as a good for human beings, he positively celebrates the place of earthly pleasures—enjoyed in moderation, of course—throughout Book Three, and he devotes the final eight or so pages of the Essais to what could be described as an apology for their rightful place in a life well-lived.  Philosophically, Montaigne argues, to disparage or try to set aside the body and its desires betrays a lack of self-knowledge, and can have only destructive consequences for most of us.  Theologically, he argues, we are wrong to refuse to love mere life itself and the pleasures that go with it, all of which are gifts from God.  While most scholars no longer accept Pierre Villey’s theory that Montaigne’s thought can be divided into three successive periods corresponding to his allegiance to Stoicism, Skepticism, and finally Epicureanism, there is little doubt that he, more than most philosophers in the Western tradition, constantly reminds us of our embodiment and revels in our “mixed constitution,” which he describes as “intellectually sensual, sensually intellectual” (“Of experience”).

However one understands Montaigne’s relation to skepticism, it is certainly clear that Montaigne consistently attempts to challenge the philosophical tendency to privilege and esteem reason as defining human nature and as making us worthy of special respect.  On the one hand, if we use the term to refer to our capacity to learn from experience and calculate costs and benefits, he introduces evidence that other animals possess this same capacity, even if not to the same degree.  On the other hand, if we take reason to be the capacity to grasp the theoretical truths of metaphysics, he has little confidence that it is a reliable guide: “I always call reason that semblance of intellect that each man fabricates in himself.  That reason, of which, by its condition, there can be a hundred different contradictory ones about one and the same subject, is an instrument of lead and of wax, stretchable, pliable, and adaptable to all biases and measures; all that is needed is the ability to mold it” (“Apology for Raymond Sebond”).  Experience, Montaigne holds, is often a more reliable guide than reason, and while he does not exactly enter the fray regarding whether human beings possess innate knowledge, he clearly takes the senses to be the source of virtually all our knowledge of the world.  Moreover, practically speaking, he takes the imagination to be our most important cognitive faculty.  On the one hand Montaigne explicitly says that it is responsible for our most grievous difficulties.  It contributes not only to human presumption, as discussed above, but also to problematic ways in which we relate to each other, one example of this being the tendency to fail to recognize that our “betters” are, ultimately, human beings just like us.  On the other hand, with the style in which he composes his Essais, Montaigne implicitly suggests that the imagination can be a useful tool for combatting its own misperceptions.
Thus in the Essais he often evokes readers’ imaginations with remarks that challenge our imaginative preconceptions: “Kings and philosophers shit, and ladies do, too” (“Of experience”).

As this example suggests, there is an egalitarian thread that runs throughout the Essais.  Much of our sense of the superiority of some persons to others is a function of our imagination’s tendency to be moved too greatly by appearances, and by our judgment’s tendency to take accidents for essences, as he writes in “Of the inequality that is between us”: “If we consider a peasant and a king, a nobleman and a plebeian, a magistrate and a private citizen, a rich man and a pauper, there immediately appears to our eyes an extreme disparity between them, though they are different, so to speak, only in their breeches. . .  Yet these things are only coats of paint, which make no essential difference.  For like actors in a comedy . . . so the emperor, whose pomp dazzles you in public . . . see him behind the curtain: he is nothing but an ordinary man, and perhaps viler than the least of his subjects.” This is one way in which his ethics is at odds with that of Aristotle, to whom Montaigne refers as that “monarch of modern learning” (“Of the education of children”).  For Aristotle’s ethics can be understood to be hierarchical in a rather categorical fashion.  While in one sense every member of the species possesses the form of that species, in another sense, the form, or nature of the species, which is defined by the perfect instance of that species, belongs to individuals to greater or lesser degrees.  Montaigne, on the other hand, insists that “You can tie up all moral philosophy with common and private life just as well as with a life of richer stuff.  Each man bears the entire form of l’humaine condition” (“Of repentance”).  Thus Montaigne both rejects Aristotle’s hierarchical conception of human nature and refocuses readers’ attention away from varying human capacities and onto the universally shared human condition.

Part of what belongs to that condition is to be profoundly shaped by custom and habit, and to be subjected to the vicissitudes of fortune, not only in the external circumstances of one’s life, but even in one’s very nature and thinking.  Indeed, Montaigne makes much of the way that fortune plays a role in making the great what they are, and in this way again he both challenges the notion that they are creatures of a different order than the common person and presents human beings as possessing much less agency than they are wont to attribute to themselves: “We do not go; we are carried away, like floating objects, now gently, now violently, according as the water is angry or calm…  We float between different states of mind; we wish nothing freely, nothing absolutely, nothing constantly” (“Of the inconsistency of our actions”).

Another thematic element in Montaigne’s account of the human condition is diversity.  In part due to custom and habit, and in part due to forces not entirely understood, human beings are remarkably diverse in their practices, priorities, values, and opinions.  Not only are we different from those whom we naturally take to be different from ourselves, but we are also quite different from our friends, a fact that Montaigne indirectly emphasizes in his famous essay, “Of friendship.”  Moreover, difference reigns within us as well as without, and in more ways than one.  For starters, Montaigne suggests that we are monstrous creatures, composed of incongruous parts, and thus often at odds with ourselves in various ways.  Then we also differ from ourselves temporally, in that we are inconstant creatures who think and behave differently over time.

Not only does Montaigne emphasize human diversity, but he also casts doubt on the idea that there is one way to achieve happiness that is the same for all human beings: “All the glory that I aspire to in my life is to have lived it tranquilly – tranquilly not according to Metrodorus or Arcesilaus or Aristippus, but according to me.  Since philosophy has not been able to find a way to tranquility that is suitable to all, let everyone seek it individually” (“Of glory”).  Combined with his insistence that every person bears the entire form of the human condition, this suggests that the good life is available to us all, regardless of our social, political, or economic standing, and that we must each find our own individual path to it.  For some, at least, that good will be found privately and idiosyncratically, rather than in the public realm or according to a common pattern.  Therefore, Montaigne consistently emphasizes the importance of the private realm.  One ought to maintain “a back shop” all one’s own, “entirely free, in which to establish our real liberty and our principal retreat and solitude” (“Of solitude”).  While we certainly have public obligations and duties to others, Montaigne is generally averse to sacrificing oneself for the sake of others and at one point remarks that “the greatest thing in the world is to know how to belong to oneself” (“Of solitude”).  One’s identity, then, is not exhausted by one’s status or role in the public realm, nor is one’s good to be found solely by means of the virtuous performance of that role.

Another important feature of the human condition, according to Montaigne, is imperfection.  He constantly emphasizes what he takes to be the inevitable limits and inadequacies of human beings, their cultures, and their institutions.  Whether this conviction derives from his study of Plutarch, the ancient philosopher whom he respects more than any other, or from his Christian faith and the doctrine of original sin is not clear.  What is clear is that Montaigne holds that it is vital for human beings to recognize and count on such imperfection.  In this way he seeks to lower readers’ expectations of themselves, other human beings, and human institutions, such as governments.

With so much diversity and imperfection, there is bound to be conflict, internal and external, and conflict is thus another feature of the human condition that Montaigne emphasizes, beginning with the first chapter of the Essais, “By diverse means we arrive at the same end.” Discussing whether standing one’s ground or submitting is the most efficacious way of engendering mercy in one’s conqueror, Montaigne introduces what he calls his “marvelous weakness in the direction of mercy and gentleness,” and points out that he is easily moved to pity, thereby setting up an explicit contrast between himself and the Stoics, and an implicit one between himself and Alexander, whose merciless and cruel treatment of the Thebans he describes at the end of the chapter.  Compassion, innocence, and flexible goodness, all united to courage, independence, and openness, become the hallmarks of the best life in the Essais.  Thus in “Of the most outstanding men,” Montaigne ranks the Theban general Epaminondas highest, above Alexander the Great and even Socrates, of whom Montaigne nearly always speaks highly.  Each of these men possesses many virtues, according to Montaigne, but what sets Epaminondas apart is his character and conscience, as exemplified by his “exceeding goodness” and his unwillingness to do unnecessary harm to others, as well as by his reverence for his parents and his humane treatment of his enemies.  Montaigne shares these sociable virtues, and thus while he explicitly presents Epaminondas as a moral exemplar for the great and powerful, he implicitly presents himself as an exemplar for those leading ordinary private lives.

As scholars have pointed out, the virtues that Montaigne foregrounds in his portrayal of himself are those conducive to peaceful co-existence in a pluralistic society composed of diverse and imperfect individuals pursuing their own vision of the good.  Hence Montaigne’s well-known esteem for the virtue of tolerance.  Known in his lifetime as a politique who could get along with both Catholics and Huguenots in France, in the Essais Montaigne regularly models the ability to recognize the virtues in his enemies and the vices in his friends.  In “Of freedom of conscience,” for example, he begins with the theme of recognizing the faults of one’s own side and the commendable qualities in those whom one opposes, before going on to celebrate the Roman Emperor Julian, known to Christians as “The Apostate.”  After cataloguing Julian’s many political and philosophical virtues, he wryly remarks: “In the matter of religion he was bad throughout.”  Still, he notes that Julian was a harsh enemy to Christians, but not a cruel one, and it has been suggested that his positive portrait of Julian in the central chapter of Book Two was meant as a rebuke of the Christian kings of France, who granted freedom of conscience to their opponents only when they could not do otherwise.  Montaigne also recommends tolerance in private life in “Of friendship,” where he makes the striking remark that his doctor’s or lawyer’s religion means nothing to him because he concerns himself only with whether they are good at their respective professions.  Finally, it has recently been argued that one of the primary purposes of later editions of the Essais was to model for readers the basic capacities necessary for engaging with ideological opponents in a way that preserves the possibility of social cooperation, even where mutual respect seems to be lacking.

Montaigne’s re-ordering of the vices follows this same pattern.  He argues that drunkenness and lust, for example, are not so bad as even he himself had once taken them to be, insofar as he comes to recognize that they do not do as much damage to society as other vices such as lying, ambition, and vainglory, and, above all, physical cruelty, which Montaigne ranks as the most extreme of all vices.  Montaigne’s ethics has been called an ethics of moderation; indeed, about the only immoderate element in his ethics is his hatred of cruelty, which he himself describes as a cruel hatred.

It might be said, then, that Montaigne does for ethics what Socrates was said to have done for philosophy: he brings it back down from the heavens.  The conception of human perfection that he presents to readers aims at merely human goods, and does not involve the attempt to approximate the divine.  In other words, instead of valorizing the philosophical or theological pursuit of divine perfection on the one hand, or the glory that comes with political greatness on the other, Montaigne directs our attention to the social virtues and the humble goods of private life, goods accessible to all, such as friendship, conversation, food, drink, sex, and even sleep.  Even with respect to the great public figures of the classical world, Montaigne insists that their true greatness was to be measured by their ordinary conduct, private and hidden from public view as it was, rather than their military exploits, which depended a great deal on fortune.  Any life that seeks to transcend the human condition—in all its fleshy, vulnerable, limited animality—is met with mockery in the Essais: “Between ourselves, these are two things that I have always observed to be in singular accord: supercelestial thoughts and subterranean conduct” (“Of experience”).  The most “beautiful lives,” according to Montaigne, are those that are lived well both in public and in private, in a tranquil and orderly fashion, fully enjoying the pleasures of both body and mind, compassionate and innocent of harm done to others, and possessed of what Montaigne calls a “gay and sociable wisdom.”  In this way Montaigne challenges some of the Platonic, Aristotelian, Stoic, Christian, and aristocratic elements of the sixteenth-century French ethical imaginary.

This view of the good life has implications for the nature of politics.  Rather than being one of the two best and most human lives available, in Montaigne’s hands politics becomes the realm of necessity; it is a practice whose value is primarily instrumental.  Thus, it has been argued, Montaigne reverses the Aristotelian order, according to which the private is the realm of necessity and the public is the realm of excellence where human beings define themselves by their political actions.  In this way, Montaigne seems to be in accord with Machiavelli’s modern understanding of the political realm.  On the other hand, he parts ways with Machiavelli—at least as the latter is commonly understood—as well as his own countryman Jean Bodin, insofar as he rejects the notion that there can be a science of politics.  Human beings are too various, both inside and out, and too inconstant, for a science of politics to be of much practical use.  Machiavelli’s arguments, Montaigne says, can always be refuted with contrary examples, so diverse are human affairs (“Of presumption”).  Fortune reigns supreme, not only in the outcomes of our projects but also in our very act of reasoning about them (see 3.8.713).  This unpredictability of human belief and behavior, along with his fundamental conviction that human beings and institutions are necessarily and so inevitably imperfect and unstable, results in Montaigne’s impatience with theorizing the best regime for human beings per se.  Thus in “Of vanity,” where he suggests that in theory he favors republican regimes over monarchies, he criticizes those who attempt to describe utopian political regimes, arguing that we must take men as they are, and that in truth—as opposed to theory—the best government is the one under which a nation preserves its existence.
Politics, for Montaigne, is a prudential art that must always take into account historical and cultural context and aim low, so to speak, targeting achievable goals such as peace and order.  He rejects political justifications that appeal to “reasons of state” or even “the common good,” since such rhetoric can give a “pretext of reason” to “wicked, bloody, and treacherous natures” (“Of the useful and the honorable”).  Thus, he refuses to apologize for the fact that he did not accomplish more during his two terms as mayor of Bordeaux.  It was enough, he says, that he managed to keep the peace.  As scholars have pointed out, readers must keep in mind that he endorses these modest political goals in the context of civil war and religious hostility.  For this reason it is not clear that he deserves the conservative and quietist labels that some critics have been quick to pin on him.

In addition to addressing these relatively abstract questions of contemporary political theory, Montaigne also took up notable positions on specific matters such as the treatment of alleged witches, heretics, and the indigenous peoples of the Americas.  In each case, Montaigne urges moderation and argues against any form of the use of force or violence.  In “Of cripples,” he opposes the position staked out by Jean Bodin in On the Demon-Mania of Witches (1580), arguing—based on his understanding of human nature and his encounter with people accused of witchcraft—against imprisonment and capital punishment for alleged witches on the grounds that it is nearly always more likely that the judgment of the accusers is deranged or malevolent than that the accused actually performed the supernatural feats attributed to them.  As he famously says, “it is putting a very high price on one’s conjectures to have a man roasted alive because of them” (“Of cripples”).

Similarly, while Montaigne remained Catholic and made clear that he opposed the Protestant Reformation, at the same time he consistently argues, sometimes rather subtly, against the violent suppression of the Huguenots and other religious minorities.  These arguments for religious tolerance come in several forms.  There is the explicit rejection of the use of force against heretics and unbelievers (“Of the punishment of cowardice”).  There are less explicit condemnations of particular cases of religious intolerance (“That the taste of good and evil depends in large part on the opinion we have of them”).  Then there is the portrait that he paints of himself throughout the Essais, which is one of a man who is “without hate, without ambition, without avarice, and without violence” (“Of husbanding your will”), and who, far from being threatened by the variety of beliefs, values, and practices that obtain in the human world, takes active pleasure in contemplating them, and welcomes discussion with those whose words and deeds differ from his own (“Of the art of discussion”).

The pleasure Montaigne takes in contemplating other ways of living is evident in the way he relates what he has learned about the indigenous peoples of the “New World.” In “Of coaches,” he condemns the Europeans’ dishonest, cowardly, rapacious, and cruel treatment of indigenous peoples in the Americas, arguing that while the Europeans may have possessed superior technology and an element of surprise that allowed them to dominate their hosts, they in no way surpassed the Americans with respect to virtue.  In “Of cannibals” he makes use of what he knows of a certain Brazilian tribe to critique his own culture.  While not unequivocal in his praise of the Brazilians, he makes it clear that he judges them to be superior to the French in a variety of ways, not the least of which is their commitment to equality and, consequently, their shock at the tremendous gap between the rich and the poor that they find when they visit France.  Mocking the human tendency to fail to recognize oneself in the other as a result of being overwhelmed by superficial differences, Montaigne famously concludes the chapter with remarks about the Brazilians’ sound judgment before exclaiming: “All this is not too bad—but what’s the use?  They don’t wear breeches.”

6. Philosophical Legacy

The philosophical fortunes of the Essais have varied considerably over the last four hundred years.  In the early seventeenth century, Montaigne’s skepticism was quite influential among French philosophers and theologians.  After his death, his friend Pierre Charron, himself a prominent Catholic theologian, produced two works that drew heavily from the Essais: Les Trois Véritez (1594) and La Sagesse (1601).  The former was primarily a theological treatise that united Pyrrhonian skepticism and Christian negative theology in an attempt to undermine Protestant challenges to the authority of the Catholic Church.  The latter is considered by many to be little more than a systematized version of “Apology for Raymond Sebond.”  Nonetheless, it was immensely popular, and consequently it was the means by which Montaigne’s thought reached many readers in the first part of the seventeenth century.  Indeed, influence can express itself positively or negatively, and the skeptical “problems” that Montaigne brought to the fore in “Apology for Raymond Sebond” set the stage for the rationalist response from René Descartes.  Without citing him by name, Descartes borrowed from Montaigne liberally, particularly in the Discourse on Method (1637), even as he seemed to reach epistemological and metaphysical conclusions that were fundamentally at odds with the spirit, method, and conclusions of the Essays.  Blaise Pascal, unlike Descartes, agreed with Montaigne that reason cannot answer the most fundamental questions about ultimate reality, such as the theoretical question of the existence of God.  This led Pascal to inquire, famously, into the practical rationality of religious belief in his Pensées (1670).  
All the same, while sharing Montaigne’s aversion to speculative theology, recognizing himself in much of Montaigne’s self-portrait, and drawing in many respects on Montaigne’s conception of the human condition, Pascal often sets himself up in opposition to the self-absorption and lack of concern with salvation that he found in Montaigne. Meanwhile, Pascal’s associates at the Abbey of Port-Royal, Antoine Arnauld and Pierre Nicole, troubled by what they took to be Montaigne’s Pyrrhonism, excessive self-love, and lack of religious feeling, rejected him as scandalous.  Their harsh criticisms in the Port-Royal Logic, published in 1662, combined with the Roman Catholic Church’s placement of the Essays on the Index of Prohibited Books in 1676, effectively reduced the scope of Montaigne’s influence in France for the next fifty years.

In England, Francis Bacon borrowed the title of Montaigne’s book when he published his own Essayes in 1597, and it has been suggested that his work on scientific methodology in The New Organon bears the marks of the influence of Montaigne’s meditations on the frailties of human judgment.  John Florio produced the first English translation of the Essais in 1603, under the title Essayes or Morall, Politike, and Millitarie Discourses; scholars have argued that Shakespeare read them around this time, and have found evidence for this in plays such as Hamlet, The Tempest, and King Lear.  Recently, scholars have also begun to draw attention to connections between Montaigne’s anthropological and political views and those of Hobbes.

In the eighteenth century, Montaigne once again found favor in France, and loomed large in the literary and philosophical imaginations of les philosophes, who, eager to distance themselves from Cartesian rationalism, eagerly embraced Montaigne’s skepticism, empiricism, and opposition to religious fanaticism.  Scholars point out that Jean-Jacques Rousseau and Denis Diderot in particular bear the signs of Montaigne’s influence.  The former borrowed a great deal from essays such as “Of cannibals” and “Of the education of children,” while the latter shares Montaigne’s skepticism, naturalism, and digressive literary style.  Meanwhile, David Hume, who himself spent many years in France, developed a form of mitigated skepticism that bears a clear resemblance to Montaigne’s own epistemic stance, and wrote his own Essays for the purpose of establishing discourse between the “learned” and the “conversable” worlds.

In the nineteenth century, Montaigne would become a favorite of both Ralph Waldo Emerson and Friedrich Nietzsche.  In Emerson’s essay “Montaigne; or, the Skeptic,” he extols the virtues of Montaigne’s brand of skepticism and celebrates Montaigne’s capacity to present himself in the fullness of his being on the written page: “The sincerity and marrow of the man reaches into his sentences.  I know not anywhere the book that seems less written.  Cut these words, and they would bleed; they are vascular and alive.”  Montaigne’s view of inquiry as a never-ending process constantly open to revision would shape, by means of Emerson, the American pragmatist tradition as found in the likes of William James, Charles Sanders Peirce, and John Dewey.  Nietzsche, for his part, admired what he took to be Montaigne’s clear-sighted honesty and his ability to both appreciate and communicate the joy of existence.  Moreover, Nietzsche’s aphorisms, insofar as they offer readers not a systematic account of reality but rather “point[s] of departure”—to borrow the expression that T.S. Eliot once used when describing Montaigne’s essays—could be read as a variation on the theme that is the Montaignian essay.

For most of the twentieth century, Montaigne was largely ignored by philosophers of all stripes, whether their interests were positivistic, analytic, existential, or phenomenological.  It was not until the end of the century that Montaigne began to attract more philosophical attention, and to be identified as a forerunner of various philosophical movements, such as liberalism and postmodernism.  Judith Shklar, in her book Ordinary Vices, identified Montaigne as the first modern liberal, by which she meant that Montaigne was the first to argue that physical cruelty is the most extreme of the vices and the one above all that we must guard against.  Meanwhile, Jean-François Lyotard suggested that the Essays could be read as a postmodern text avant la lettre.  Michel Foucault, for his part, described his own work as a type of essaying, and identified the essay—understood in the Montaignian sense of the philosophical activity of testing or experimenting on one’s way of seeing things—as “the living substance of philosophy.” In Contingency, Irony, and Solidarity, Richard Rorty borrowed Shklar’s definition of a liberal to introduce the figure of the “liberal ironist.”  Rorty’s description of the liberal ironist as someone who is both a radical skeptic and a liberal in Shklar’s sense has led some to interpret Montaigne as having been a sort of liberal ironist himself.

As many scholars have noted, the style of the Essais makes them amenable to a wide range of interpretations, and in fact philosophers over the years have often seemed to be as much or more interested in what the Essais have to say to them and their own cultural milieu as they have been in the history of the Essais’ reception or the details of the historical context in which they were written.  This would likely please Montaigne, who himself celebrated the notion that sometimes a book can be better understood by its readers than its author (“Various outcomes of the same plan”), and who claimed to have been more interested in uncovering possibilities than determining historical facts (“Of the power of the imagination”).

7. References and Further Reading

a. Selected Editions of Montaigne’s Essays in French and English

  • Montaigne, Michel de.  Essais. 2nd Ed. Edited by Pierre Villey and V.-L. Saulnier. Paris: Presses Universitaires de France, 1992.
  • Montaigne, Michel de. Essais. Edited by André Tournon.  Paris: Imprimerie nationale, 1998.
  • Montaigne, Michel de. Essais. Edited by J. Balsamo, M. Magnien, and C. Magnien-Simonin. Paris: Gallimard, 2007.
  • Montaigne, Michel de. The Complete Essays of Montaigne. Translated by Donald M. Frame.  Stanford: Stanford University Press, 1943.
    • The translation used in the quotations above.
  • Montaigne, Michel de. The Complete Essays. Translated by M.A. Screech.  New York: Penguin, 1991.
  • Montaigne, Michel de. Michel de Montaigne: The Complete Works. Translated by Donald M. Frame. New York: Alfred A. Knopf, 2003.
    • Includes the “Travel Journal” from Montaigne’s trip to Rome as well as letters from his correspondence.

b. Secondary Sources

  • Adorno, T.W. “The Essay as Form.” New German Critique 32 (Spring 1984): 151-171.
    • Although not dealing with Montaigne specifically, Adorno’s meditation on the essay as literary-philosophical form shows how much that form owes to Montaigne.
  • Bakewell, Sarah. How to Live or A Life of Montaigne in One Question and Twenty Attempts at an Answer. New York: Other Press, 2011.
    • A biography of Montaigne written for a general audience.
  • Brush, Craig B. Montaigne and Bayle:  Variations on the Theme of Skepticism.  The Hague: Martinus Nijhoff, 1966.
    • Includes a lengthy commentary on “Apology for Raymond Sebond.”
  • Cave, Terence. How to Read Montaigne. London: Granta Books, 2007.
    • An introduction to Montaigne’s thought.
  • Desan, Philippe. Montaigne: A Life. Trans. Steven Rendall and Lisa Neal.  Princeton: Princeton University Press, 2017.
    • Scholarly biography that argues for an evolutionary understanding of the development of the Essays grounded in Montaigne’s changing social and political ambitions and prospects.  Argues that Montaigne wrote the first edition of his book for the purpose of advancing his own political career.
  • Desan, Philippe, ed. The Oxford Handbook of Montaigne.  Oxford: Oxford University Press, 2016.
  • Fontana, Biancamaria.  Montaigne’s Politics: Authority and Governance in the Essais.  Princeton: Princeton University Press, 2008.
    • Study of Montaigne’s account of the relationships among private opinion, political authority, and the preservation of peace and freedom.  Characterizes Montaigne’s ethics as one of moderation and includes material on Montaigne’s treatments of freedom of conscience, toleration, and Julian the Apostate.
  • Frame, Donald M. Montaigne: A Biography. New York: Harcourt, Brace and World, 1965.
    • Biography suitable for students and scholars alike.
  • Friedrich, Hugo. Montaigne. Edited by Philippe Desan. Translated by Dawn Eng. Berkeley: University of California Press, 1991.
    • A landmark work in Montaigne studies; provides a thorough account of both the Essays themselves and the cultural context out of which they emerged.
  • Green, Felicity. “Reading Montaigne in the Twenty-First Century.”  The Historical Journal vol. 52, no. 4 (December 2009): 1085-1109.
    • A review article that gives an account of recent developments in Montaigne studies, especially those concerning the contested question of which text of the Essais—the 1595 edition or the “Bordeaux Copy”—should be taken as authoritative.
  • Hallie, Philip. The Scar of Montaigne: An Essay in Personal Philosophy. Middletown:  Wesleyan University Press, 1966.
    • An accessible account of Montaigne as a skeptic for whom the practice of philosophy is intimately tied to one’s way of life.
  • Hartle, Ann. Michel de Montaigne: Accidental Philosopher. Cambridge: Cambridge University Press, 2003.
    • Presents Montaigne as an original philosopher whose thought is best understood not as skeptical, but as dialectical.  Argues that Montaigne is undertaking to reorder the virtues.
  • Hartle, Ann. Montaigne and the Origins of Modern Philosophy. Evanston: Northwestern University Press, 2013.
    • Elucidates modern features of Montaigne’s project, especially the ways in which it poses a challenge to the dominant Aristotelian paradigm of his time and develops a new standard of morality that is neither Aristotelian nor Christian.
  • La Charité, Raymond C. The Concept of Judgment in Montaigne. The Hague: Martinus Nijhoff, 1968.
    • Studies the role that judgment plays in Montaigne’s philosophical project.
  • Langer, Ullrich, ed. The Cambridge Companion to Montaigne. Cambridge: Cambridge University Press, 2005.
  • Levine, Alan. Sensual Philosophy: Toleration, Skepticism, and Montaigne’s Politics of the Self. Lanham: Lexington Books, 2001.
    • Interprets Montaigne as a champion of modern liberal values such as tolerance and the protection of a robust private sphere.
  • Nehamas, Alexander.  The Art of Living: Socratic Reflections from Plato to Foucault. Berkeley: University of California Press, 1998.
    • Includes a study of Montaigne’s relationship to Socrates, especially in connection with the essay “Of Physiognomy.”
  • Popkin, Richard. The History of Scepticism from Savonarola to Bayle. Oxford: Oxford University Press, 2003.
    • Interprets Montaigne as a skeptical fideist in the Pyrrhonian tradition.
  • Quint, David. Montaigne and the Quality of Mercy: Ethical and Political Themes in the Essais. Princeton: Princeton University Press, 1998.
    • Argues that Montaigne’s primary concern in the Essays is to replace the martial conception of virtue prevalent during his time with a new conception of virtue more conducive to the preservation of public peace.  Draws attention to Montaigne’s celebration of Epaminondas as a moral exemplar as well as the way that Montaigne presents himself as a private analogue to Epaminondas.
  • Regosin, Richard. The Matter of My Book: Montaigne’s Essays as the Book of the Self. Berkeley: University of California Press, 1977.
    • A literary study examining the relation between Montaigne’s text and his conception of the self.
  • Sayce, Richard. The Essays of Montaigne: A Critical Exploration. London: Weidenfeld and Nicolson, 1972.
    • A classic comprehensive study of the Essays.
  • Schaefer, David Lewis. The Political Philosophy of Montaigne. Ithaca: Cornell University Press, 1990.
    • Argues that the Essays are more systematic than they initially appear, and that Montaigne’s primary project in writing them was to initiate a transvaluation of values that would usher in a new moral and political order centered around the individualistic pursuit of the humble and earthly pleasures of private life.
  • Schachter, Marc D. “That Friendship Which Possesses the Soul.” Journal of Homosexuality 41, no. 3-4: 5-21.
    • An analysis of Montaigne’s “Of friendship” and the nature of his friendship with La Boétie.
  • Schwartz, Jerome.  Diderot and Montaigne: The Essais and the Shaping of Diderot’s Humanism. Geneva: Librairie Droz, 1966.
    • A lucid account of Diderot’s literary and philosophical relationship with Montaigne’s Essais.
  • Shklar, Judith. Ordinary Vices. Cambridge: Harvard University Press, 1984.
    • Interprets Montaigne’s ranking of physical cruelty as the worst vice as both a radical rejection of the religious and political conventions of his time and a foundational moment in the history of liberalism.
  • Starobinski, Jean. Montaigne in Motion. Translated by Arthur Goldhammer. Chicago: University of Chicago Press, 1985.
    • Traces the dialectical movement of Montaigne’s engagement with the world in connection with major themes of the Essais such as the body, friendship, the public and the private, and death.
  • Thompson, Douglas I. Montaigne and the Tolerance of Politics. Oxford: Oxford University Press, 2018.
    • Argues that Montaigne seeks to teach readers tolerance as a political capacity necessary for living in a pluralistic age.
  • Taylor, Charles. Sources of the Self: The Making of the Modern Identity. Cambridge, MA:  Harvard University Press, 1989.
    • Situates Montaigne in the history of modern conceptions of the self.


Author Information

Christopher Edelman
Email: edelman@uiwtx.edu
University of the Incarnate Word
U. S. A.

Immanuel Kant: Transcendental Idealism

Transcendental idealism is one of the most important sets of claims defended by Immanuel Kant (1724–1804) in the Critique of Pure Reason. According to this famous doctrine, we must distinguish between appearances and things in themselves, that is, between that which is mind-dependent and that which is not. In Kant’s view, human cognition is limited to objects that somehow depend on our minds (namely, appearances), whereas the mind-independent world (things in themselves) lies beyond the limits of our experience and cognition. The doctrine of transcendental idealism is fundamental to Kant’s entire critical philosophy: its adoption marks the distinction that is typically drawn between Kant’s “pre-critical” phase (preceding the publication of the Critique of Pure Reason, that is, Kant’s first Critique) and his “critical” phase (typically taken to start—in its full-blown form—with the first Critique and to extend to all works produced thereafter). For this reason, the term ‘transcendental idealism’ is sometimes used in a (perhaps unduly) broad sense to refer to Kant’s critical philosophy in general. Kant himself uses the term in the more specific sense just outlined, which rests on the distinction between appearances and things in themselves. How and to what extent this distinction is linked to other major views and arguments in Kant’s critical philosophy is a question that has no easy or uncontroversial answer.

The focus of this article is Kant’s transcendental idealism (in the strict sense of the term) from the perspective of Kant’s theoretical philosophy and the first Critique in particular. Transcendental idealism is certainly one of the most vigorously discussed Kantian views, and there is a—sometimes astonishing—lack of consensus in these discussions. Substantial controversies concern interpretive questions (for example: What does the doctrine mean? What are the arguments that Kant formulates in its favor? How are we to understand these arguments exactly?) as well as philosophical questions (for example: How good are the arguments? Is the resulting view coherent?). To do justice to the deeply controversial nature of the issues discussed here, this article begins (Section 1) with an overview of key claims and arguments as presented by Kant in the first Critique to inform the reader about the key texts and considerations with respect to transcendental idealism, without straying into deeper issues of interpretation and evaluation. The remainder of the article introduces controversial interpretive and philosophical questions surrounding these texts, claims, and arguments. Section 2 discusses Kant’s argument(s) for transcendental idealism, as well as some relevant issues that have sparked heated debate. The following two sections focus on the doctrine itself—that is, the distinction between appearances and things in themselves—as opposed to the argument intended to establish its truth. Section 3 discusses questions and controversies related to the first part of the distinction, namely the status of Kantian appearances. Section 4 then focuses on the second part of this distinction, namely the status of Kantian things in themselves. The article concludes with some general remarks (Section 5) and references for further reading (Section 6).

Table of Contents

  1. Transcendental Idealism in the Critique of Pure Reason
    1. The World of Experience and Cognition as a World of Appearances, not Things in Themselves
    2. Transcendental Idealism as a Moderate Brand of Idealism
  2. The Argument(s) for Transcendental Idealism and Some Related Disputes
    1. The Ideality of Space and Time
      1. (Representations of) Space and Time as A Priori Intuitions
      2. The Ideality of Space and Time and the Neglected Alternative
      3. Antinomial Conflict and the Indirect Argument for Idealism
    2. Beyond Space and Time: Sensibility, Understanding, and the Case for Idealism
      1. Sensibility, Receptivity, and (Short) Arguments for Idealism
      2. Understanding and the Role of the Categories in Transcendental Idealism
  3. Controversies with Respect to the Status of Kantian Appearances
    1. The Radical Idealism Charge and the Affection Dilemma: The Problem of Appearances
    2. Rival Interpretations of the Mind-Dependence of Kantian Appearances
      1. One- vs. Two-World Interpretations of Transcendental Idealism
      2. Textual and Philosophical Motivation
  4. Controversies with Respect to the Status of Kantian Things in Themselves
    1. The Radical Idealism Charge and the Affection Dilemma: The Problem of Things in Themselves
    2. Commitment to Things in Themselves
      1. The Exegetical Question: Noumenon, Transcendental Object, and Thing in Itself
      2. The Philosophical Question: The Legitimacy of a Commitment to Things in Themselves
  5. Concluding Remarks
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Transcendental Idealism in the Critique of Pure Reason

In the Critique of Pure Reason, Kant develops and advocates the doctrine of transcendental idealism: we can have cognition only within the realm of experience; objects in this realm, that is, empirical objects, are mind-dependent. Kant calls such objects ‘appearances’ (in German: Erscheinungen), which are to be contrasted with ‘things in themselves’ (Ding(e) an sich). In contrast to ‘appearances’, the terms ‘thing in itself’ and ‘things in themselves’ stand for the mind-independent world. According to Kant’s idealism, things in themselves—the mind-independent world—are beyond our epistemic reach and cannot be an object of cognition (or knowledge) for epistemic agents such as ourselves, that is, human beings (or perhaps finite cognizers more broadly). (Although the terms ‘cognition’ and ‘knowledge’ are used interchangeably in this article, the former is used more commonly. This is an issue to which we return in Section 4.)

a. The World of Experience and Cognition as a World of Appearances, not Things in Themselves

The first Critique is an inquiry into the possibility, scope, and limits of a priori cognition, that is, cognition that is pre-empirical and as such independent of experience—in a suitably qualified sense of ‘independence’. (Kant has no quarrel with the idea that “all our cognition commences with experience”; this, however, does not yet mean that all cognition “arises from experience” (B1).) (References to the Critique follow the standard A/B edition pagination. References to other Kantian works are cited by the volume and page number of the Academy edition: Immanuel Kant, Gesammelte Schriften, ed. by Preußische Akademie der Wissenschaften, Berlin 1900ff. For information on the translations used, see Section 6.) In the first Critique, Kant distinguishes three faculties of cognition (that is, cognitive powers of the mind)—sensibility, understanding, and reason—which are examined in turn with respect to their potential for a priori cognition. Transcendental idealism is developed in the course of this project, but most notably in the Transcendental Aesthetic, which is the part of the Critique in which Kant’s investigation into the possibility and limits of a priori cognition begins; the focus of this part is on sensibility. Kant’s account of sensibility is, in its entirety, of special relevance for his idealism.

Sensibility is the power of the mind to have intuitions, which are singular, immediate “representations” (that is, mental states) (A19/B33). (As you are reading this article, you are most probably perceiving the computer screen that is in front of you; in Kant’s parlance, you are having an intuition of your computer screen.) Kant asks whether intuitions and sensibility could be said to have a form that is characteristic of human perceiving subjects, allowing us to anticipate some features of objects, independently of our experience of them. He formulates his space and time arguments (A22–25/B37–41, A30–32/B46–49) and concludes that this is indeed the case. In Kant’s view, space and time are “nothing other than merely forms”: they are somehow subjective, to be attributed to human minds and their workings (as opposed to the mind-independent world) (A26/B42, A32–33/B49–50). In a similar, albeit more negative sense, he speaks of the “transcendental ideality” of space and time: space and time are not entities or properties of the mind-independent world (A27–28/B43–44, A35–36/B52–53). It is thanks to this mind-dependent nature of space that the possibility of mathematical cognition, for example (which, in Kant’s view, is a species of a priori cognition), can be explained (4: 287–288). Kant thinks that his views on space and time have idealist implications for all objects encountered in sensible experience: these objects are spatiotemporal objects and, as such, merely appearances (A30). This claim regarding appearances is often framed as a claim about these objects being representations (of some sort) (ibid.); elsewhere, however, Kant suggests that appearances are the same things as the entities he calls ‘things in themselves’. In any case, a crucial claim of Kant’s idealism is that all our cognition of empirical, spatiotemporal objects amounts only to cognition of appearances, not things in themselves.

In the next major part of the Critique, the Transcendental Analytic, Kant further carries out his project with respect to the possibility and limits of a priori cognition, turning his attention to the second faculty of cognition, namely the understanding. The understanding is the power of the mind that allows us to have concepts, which, in contrast to intuitions, are general representations. (Think of the previous example: the intuition was described as a mental state that represents your very own computer screen; the concept <computer screen> is, by contrast, general: all computer screens, not just your screen, fall under the concept.) In examining the understanding and its concepts, Kant focuses on a priori, non-empirical concepts—that is, concepts that could not arise from mere experience. According to Kant, there are twelve such concepts. These concepts are the “categories”, paradigm cases of which are the concept of substance and the concept of cause and effect (A79–80/B105–106). Kant’s entire treatment of the understanding has implications for his idealism. The chapter on Phenomena and Noumena at the end of the Analytic explicitly takes up the question of what lessons follow for Kant’s idealism from his overall treatment of the understanding.

A prominent part of this overall treatment, forming the basis of these lessons, is the Transcendental Deduction, in which (A84–130/B116–169) Kant asks how we can legitimately apply non-empirical concepts (such as the concept of causality) to objects. Kant argues that this is possible because the categories function as “conditions of experience”. (‘Experience’ is a technical term in Kant and is synonymous with ‘empirical cognition’.) According to Kant, all objects that can be empirically cognized by finite/human cognizers are constituted in such a way that the categories are valid of them—we can legitimately apply our a priori concepts to them. The way Kant establishes the “objective validity” of the categories has an important implication, however: the objects to which we legitimately apply the categories can only be empirical objects—appearances, not things in themselves (see especially A129–130, B163–165). Again, Kant underlines the transcendental idealist implications of his account of the possibility of a priori cognition (this time in the course of an examination of the faculty of the understanding). It is in the context of this discussion (in the A-version of the Transcendental Deduction, as found in the first, 1781, edition of the Critique) that we encounter some influential Kantian formulations to the effect that the object of our cognition and thought is merely a transcendental object (A108–109).

In the chapter on Phenomena and Noumena (A235–260/B294–315), Kant discusses extensively the transcendental idealist implications of his account and (once again) draws attention to the fact that the domain of application of the categories is the domain of appearances, not things in themselves. Kant contrasts his own conception with earlier approaches (including his own, as it was presented in his Inaugural Dissertation – De mundi from 1770), which frame the distinction between appearances and things in themselves in more traditional terms, namely, as a division between a sensible and an intelligible world, or between phenomena and noumena. Things in themselves, by his account, are not noumena (at least not so far as a specific, “positive”, construal of the term is concerned—this is made clear in the second, 1787, edition of the Critique), and his own (critical) view does not amount to a division between an intelligible world and a sensible world (in the “positive” sense) (A255–256/B311–312).

The Transcendental Dialectic, which follows the Transcendental Analytic, contains the only two explicit definitions of ‘transcendental idealism’ in the entire Critique, and presents further notable considerations in favor of idealism. In this part of the work, Kant’s inquiry into the possibility and limits of a priori cognition concerns the third faculty of cognition, namely reason. Kant’s conception and characterization of reason are complex, but one important idea is that reason is a “faculty of drawing inferences” and a “faculty of principles” (A299/B355–356). (Reason is, among other things, the power of the mind to explore the relationships between propositions, drawing inferences from them: where propositions p and q, for instance, act as premises in a syllogism, proposition r, the conclusion, is drawn thanks to the faculty of reason. It is the task of reason to also move in the other direction: given a proposition p, we could seek further “cognitions”—expressed by further propositions (s, t)—that could serve as premises, such that p is a valid conclusion from them.)

Unlike the previous parts of the Critique, which established the possibility of a priori cognition with respect to the faculties of sensibility and of the understanding, a large part of the Dialectic answers in the negative the question of whether a priori cognition through reason is possible: metaphysica specialis in its traditional form—which had objects such as the soul, the world as a whole, and God—comes under attack. Of special relevance for idealism are the Antinomy chapter (A405–567/B432–595) and a particular section in the Paralogisms chapter (A366–380). (The latter played a historically important role in the reception and interpretation of Kant’s idealism.)

Kant thinks that reason inevitably leads to (a four-fold) antinomy. Some important metaphysical questions (for example, the question of whether the world has a beginning in time, or whether freedom could be compatible with natural necessity) are such that they (seem to) admit contradictory answers: that is, we can formulate good arguments for both a positive and a negative answer to such questions. According to Kant, we shall be confronted with this depressing situation of (equally) good arguments for opposing substantial metaphysical claims so long as we do not embrace transcendental idealism. Only if we accept transcendental idealism shall we have a way out of this situation and thus avoid the “contradiction of reason with itself” (A740/B768). Thus, Kant’s resolution of the antinomy functions as a further, indirect argument for transcendental idealism (A490–496/B518–525, A506–507/B534–535; for an important internal differentiation between the four antinomies that affects how the argument works, see Section 2.a.iii).

It is in this context that we find the second explicit definition of transcendental idealism in the entire Critique. (The first is mentioned shortly.) Kant writes:

We have sufficiently proved in the Transcendental Aesthetic that everything intuited in space or in time, hence all objects of an experience possible for us, are nothing but appearances, i.e., mere representations, which, as they are represented, as extended beings or series of alterations, have outside our thoughts no existence grounded in itself. This doctrine I call transcendental idealism. (A490–491/B518–519)

There is a further section in the Dialectic that is worth mentioning: a section that appeared in the A-edition and was so substantially revised in the B-edition that it was practically deleted. In the A-version of the Fourth Paralogism, Kant addresses questions of external world skepticism. Engaging with a “Cartesian” who calls into question our justification for the belief in an external world, Kant seeks to show that the skeptical worry is the product of fallacious reasoning, a “paralogism”. In response to this worry, Kant invokes transcendental idealism and gives us the first explicit definition of the doctrine in the Critique:

I understand by the transcendental idealism of all appearances the doctrine that they are all together to be regarded as mere representations and not as things in themselves, and accordingly that space and time are only sensible forms of our intuition, but not determinations given for themselves or conditions of objects as things in themselves. (A369)

Kant thinks that adopting this doctrine enables one to undercut the skeptical worry. Immunity to skeptical worries could thus be deemed a further consideration in favor of Kant’s idealism.

b. Transcendental Idealism as a Moderate Brand of Idealism

This article has thus far looked at Kant’s way of proceeding, the main arguments, and the most relevant passages for establishing and defining transcendental idealism in the first Critique. Before going deeper into controversial questions of interpretation and philosophical assessment in the sections to follow, there is one further general feature of Kant’s transcendental idealism that is worth noting, given that Kant himself highlights it in different ways and in different places. On multiple occasions, Kant insists that transcendental idealism, despite being called ‘idealism’, is somehow less idealist than doctrines that normally fall under that label. When first arguing for his doctrine in the Transcendental Aesthetic—in the A-edition of the Critique—Kant makes it clear that he is defending both the transcendental ideality and the empirical reality of space and time (A27–28/B43–44, A35–36/B52–53). (See also Kant’s take on the relationship between transcendental idealism and empirical realism in A369–370.)

In the Prolegomena, which was published two years after the first edition of the Critique and was meant to summarize the main claims and arguments of the Critique in a more accessible form, Kant elaborates further on the non-idealist features of his position through remarks dedicated to the issue (4: 288–294) and the inclusion of a relevant Appendix (4: 371–383). In this work—in response to initial reactions to his doctrine, which had already seen the light of day—Kant even suggests dropping the term ‘transcendental idealism’ in favor of terms such as ‘critical’ (4: 293–294, 4: 375) and ‘formal idealism’ (4: 375), which he thought to be more accurate and less misleading formulations of his actual views. (See also B519 note.)

In the second, revised edition of the Critique, Kant seeks to clarify matters further and to highlight these non-idealist features by making changes and adding footnotes and remarks that directly concern his idealism. In this context, he emphasizes that his claims about spatiotemporal objects as appearances do not amount to the claim “that these objects would be a mere illusion” (B69–71). He even adds a section called “Refutation of Idealism” (B274–279), in which he opposes idealism by addressing questions of external world skepticism, to replace the Fourth Paralogism section of the A-edition (a section that had already acquired a questionable reputation).

Despite their differences, all of Kant’s attempts to qualify his idealism point in the same direction: Kantian idealism is—at least by intention—supposed to be a moderate brand of idealism, to be distinguished from more radical—and, in Kant’s view, untenable—versions thereof. The label ‘formal idealism’, which Kant presents as an alternative to ‘transcendental idealism’, is informative in this respect. Kant distinguishes between the matter and form of our experience. (This is a distinction made in the very first section of the Transcendental Aesthetic, where Kant begins by discussing sensibility and intuitions and making his case for the claim that time and space are their forms; A19–22/B33–36.) The Kantian claim that time and space are (transcendentally) ideal, and that empirical objects are mind-dependent appearances, is to be understood as a thesis about the form of our experience. Idealism, as a view about the mind-dependence of empirical objects, is not the whole story, however. With regard to the matter of experience, Kant’s view is not meant to be idealist. In a letter to Beck from 1792, Kant reacts to certain criticisms: “I speak of ideality in reference to the form of representation while they [Kant’s critics, Garve and Eberhard] construe it as ideality with respect to the matter, i.e., ideality of the object and its existence itself” (11: 395).

With this general overview of Kant’s idealism, we are now in a position to look more carefully at the main texts, claims, and arguments just presented, paying particular attention to controversial questions of interpretation and philosophical assessment. The next section looks more closely at the argument(s) for idealism, whereas the following two main sections discuss in more detail the doctrine itself.

2. The Argument(s) for Transcendental Idealism and Some Related Disputes

If there is one set of considerations that is indisputably central to Kant’s case for idealism, it is those relating to time and space that we find in the Aesthetic. A particular focus of this section is thus on the space and time arguments, the ideality of space and time, and related disputes. However, there are important (and controversial) questions to be raised with respect to other considerations (beyond those concerning time and space) that have a role in Kant’s overall case for idealism, which are therefore also addressed in this section.

a. The Ideality of Space and Time

As we have seen, Kant’s case for the ideality of space and time in the Aesthetic is of fundamental importance for establishing transcendental idealism. His arguments there have attracted considerable attention in the history of the reception of Kant as well as in Kant scholarship. The focus in this subsection is mostly on these arguments; towards the end of the subsection, Kant’s indirect argument for idealism in the Antinomy is also discussed.

i. (Representations of) Space and Time as A Priori Intuitions

Kant rests his case for the ideality of space and time on his so-called “space and time arguments” in the Aesthetic (A22–25/B37–41, A30–32/B46–49), which are intended to show that (representations of) space and time are a priori intuitions. It is worth noting here a subtlety that can have important consequences for our interpretation and that relates to the extent to which Kant’s space and time arguments (up to A25/B41 and A32/B49, respectively) concern merely our representations of space and time (that is, the nature and origin of the mental states of subjects that represent space and time) or, rather, space and time themselves (as the objects represented and as opposed to our representations of them). Although Kant’s own presentation is not always clear in this respect—some of his formulations would suggest that we are talking about space and time themselves—there are good reasons to think that Kant’s considered view is the former. The expression “(representation of) space and time” that is adopted in the following presentation of the space and time arguments is meant to call attention to this subtle and important issue—an issue to which we briefly return in the next subsection.

These arguments underwent some changes in the second edition of the Critique. In the B-edition, which is often thought to present a more helpful version of Kant’s arguments and is thus treated as the standard presentation here, Kant splits his arguments into two broader categories: he distinguishes a “metaphysical” and a “transcendental exposition” of the concepts of space and time.

The arguments of the first group, the metaphysical exposition, proceed from certain reflections on some features of (our representations of) space and time to make claims about their a priori origin or intuitive nature. Kant thinks, for instance, that we need to presuppose the representation of space to be able to represent objects as “outside” us and “next to one another”. Based on such a consideration, he claims that the representation of space cannot be empirical and must be a priori (A23/B38, first space argument). (For a discussion of the first space argument, see Warren 1998.) Other arguments of this group seek to establish the claim that our representation of space and time is an intuition, as opposed to a concept. A key idea is that, in the case of (our representation of) space, “if one speaks of many spaces, one understands by that only parts of one and the same unique space” (A25/B39, third space argument). (For discussions of these types of arguments, see Allison 2004: 99–116, Falkenstein 1995: 159–252, Vaihinger 1892: 151–263.)

The “transcendental exposition” proceeds in a different way, and, in the case of both space and time, one specific argument is singled out. On the standard interpretation of the “argument from geometry” (the space version of the transcendental exposition, B40–41), Kant argues that our representation of space must be an a priori intuition by invoking certain features that he takes geometrical cognition to have. According to Kant, the sort of truths one establishes when engaging in geometry are necessary (as opposed to contingent) and synthetic (as opposed to analytic; that is, we cannot tell the truth value of geometrical propositions merely on the basis of their meaning). On this widely accepted reading, our representation of space has to be an a priori intuition if one wishes to account for these features of geometrical cognition. The argument can thus be described as a regressive transcendental argument: it proceeds from certain assumptions (that we have synthetic a priori knowledge in the form of geometrical cognition) to a conclusion (the representation of space as an a priori intuition) that is a requirement, or an explanation, for the possibility of the assumed phenomenon. (For a reconstruction along such lines, see, for example, Strawson 1966: 277–278, Van Cleve 1999: 34–43.)

This argument has attracted considerable attention. One question concerns the argumentative weight of this type of (transcendental) consideration in Kant’s overall case for the transcendental ideality of space: it is sometimes thought that this is indeed Kant’s central argument (see, for example, Strawson 1966: 277), whereas some scholars assign more weight to the metaphysical expositions (see, for example, Allison 2004: 116–118). The latter stance can be motivated by the idea that the argument sketched above seems especially vulnerable, as it operates with substantive assumptions that many modern readers would want to reject. (For an alternative reading of the argument that resists reading it as a regressive transcendental argument, see Shabel 2004.)

ii. The Ideality of Space and Time and the Neglected Alternative

The space and time arguments aim to establish that (our representations of) space and time are a priori intuitions. Building on these conclusions, Kant proceeds to draw some further conclusions in the section that immediately follows, entitled “Conclusions from the above concepts” (A26–28/B42–44, A32–36/B49–53). It is claimed there that space and time are merely a subjective condition of our sensibility and as such transcendentally ideal. Kant argues that “space represents no property at all of any things in themselves nor any relation of them to each other” (A26/B42); similarly, “time is not something that would subsist for itself or attach to things as an objective determination” (A32/B49). Things in themselves—the mind-independent world—are not in space and time.

These additional conclusions—decisive, since they play a key role in establishing transcendental idealism—have attracted much criticism and have led to a famous problem. It has been argued that one could accept the space and time arguments and the intermediate conclusion that they establish, and still resist the further conclusion. That is, we could accept that (our representations of) space and time are a priori intuitions (as established in the metaphysical and transcendental expositions) but deem the further conclusion(s) regarding the transcendental ideality of space and time unwarranted. This sort of criticism is often referred to as the “neglected alternative” charge or the “Trendelenburg’s gap” problem. Kant’s argument for the transcendental ideality of space and time is thought to involve a gap between the conclusion with respect to a priori intuitions and the further conclusion with respect to ideality. Another way of framing the point is to say that Kant failed to acknowledge and argue against an alternative, realist, possibility, namely that, although our representation of space and time has a non-empirical and intuitive nature, the mind-independent world is still in space and time. (The previous subsection mentioned that there is a subtle interpretive issue concerning whether Kant’s concern in the space and time arguments is with our representations of space and time or, rather, with space and time themselves. This subtlety is intimately connected to the neglected alternative problem: how one frames these arguments—as arguments about space and time or, rather, as arguments about our representations of them—has an influence on where exactly and in what precise form the potential gap arises.)

This sort of objection has been raised in different—and perhaps not equivalent—forms against Kant. A much discussed and influential version is to be found in Trendelenburg (1870: 156–168), but considerations along similar lines can be found in some of the earliest responses to the Critique (see, for example, Pistorius 1786 and Pistorius 1788 for a somewhat different version of the neglected alternative charge). Similarly, Kantian defenses against this type of charge can assume different forms. A particularly heated debate from Kantian and anti-Kantian perspectives was conducted in 19th-century Germany (and is described, going back to the very early reception of the Critique, in Vaihinger 1892: 134–151). Contemporary Kant scholarship sees Kant’s distinctive account of intuitions, and the way he understands the distinction between sensibility and understanding in general, as key to understanding (and justifying) the move from the intermediate conclusion about (our representations of) space and time as a priori intuitions to the further conclusion about the ideality of space and time. (For treatments of the neglected alternative charge that focus on the intuitive nature of our representation of space and time, and how Kant understands this as key, see Allais 2015: 145–204, Willaschek 1997. For further discussions of the neglected alternative, see Falkenstein 1995: 289–309 and Allison 2004: 128–132.)

As a concluding remark on the problem of the neglected alternative, it is worth reminding ourselves of a feature of Kant’s case for idealism in the Critique that was mentioned above: the Aesthetic presents us with the direct, and central, argument, but Kant also claims to have provided us with an indirect argument for idealism in the Antinomy chapter of the Dialectic. Thus, even if the direct argument were faced with (insurmountable) difficulties, one might think that the indirect argument might fare better. For this reason, we shall now briefly turn to this further, indirect argument in favor of idealism.

iii. Antinomial Conflict and the Indirect Argument for Idealism

Kant thinks that reason is faced with an antinomial conflict. The antinomial conflict arises with respect to the following questions: Does the world have a beginning in time and bounds in space (first antinomy, A426–433/B454–461)? Do composite substances consist of simple parts (second antinomy, A434–443/B462–471)? Could (or should) causality in accordance with the laws of nature be combined with a different kind of causality, namely causality through freedom (third antinomy, A444–451/B472–479)? Does a necessary being belong to the world (fourth antinomy, A452–461/B480–489)?

These questions (seem to) admit contradictory answers, supported by (equally) good arguments, and it is in this that the conflict resides. The “thesis” and the “antithesis”, as the two conflicting claims in response to the questions in each of the four cases, along with the arguments that are supposed to establish their truth, are the two sides of the antinomial conflict. Kant thinks that the conflict can be resolved (only) if we accept transcendental idealism; his resolution of the antinomy thus serves as an indirect argument for his idealism. This is the general picture—for a more specific sense of how the indirect argument works, it is helpful to take note of an important internal differentiation among the four questions/antinomies; namely, we ought to distinguish between the mathematical antinomies (first and second antinomies) and the dynamical antinomies (third and fourth antinomies).

Kant’s way of resolving the antinomial conflict is quite different with respect to each of these. In the case of the mathematical antinomies, he argues that the thesis and the antithesis are merely contraries, not contradictories. (Contradictories always differ in truth value—if there is opposition in the sense of ‘contradiction’ between two claims, then it has to be the case that one of the opposing claims is true and the other one is false; by contrast, when the opposing claims are merely contraries, they both can be false, although they cannot both be true.) Kant takes the view that both the thesis and the antithesis are actually false, and that a third option emerges once one accepts transcendental idealism. In the case of the dynamical antinomies, the strategy differs. Kant argues that there is a sense in which both the thesis and the antithesis are true, and that the seemingly conflicting claims of each side of the conflict are in fact compatible (A528–532/B556–560). Because of these differences in strategy, there is a sense in which only the resolution of the mathematical antinomies could lay claim to being an “indirect proof” of transcendental idealism (A502–507/B530–535), given that it is supposed to be a reductio ad absurdum of the opposite of transcendental idealism (transcendental realism), elevating transcendental idealism to a necessary condition for avoiding the contradictory implications of the alternative. The dynamical antinomies are still supposed to play an invaluable role in Kant’s case for idealism, however, because the appeal to idealism is at least a sufficient condition for resolving the antinomial conflict that would otherwise arise in this case. (For an account of Kant’s resolution of the antinomy, which is followed closely in the presentation here, along with a discussion of some criticisms of Kant’s resolution, see Allison 2004: 384–395. See also Chiba 2012: 130–158 and 252–332.)

It is thus clear and indisputable that Kant’s considerations in the Antinomy play an important dialectical role, motivating the case for transcendental idealism. Kant himself takes special note of the dialectical role of the antinomy, when he writes, for instance (in a Letter to Garve from 1798), that the point from which he started was “the antinomy of pure reason” (as opposed to “the investigation of the existence of God, of immortality, and so on”); “that is what first aroused” him “from the dogmatic slumber” and drove him “to the critique of reason itself, in order to fix the scandal of ostensible contradiction of reason with itself” (12: 257–258). (Note that in a further famous remark in Prolegomena 4: 260, Kant says that it was David Hume that interrupted his dogmatic slumber. For a discussion of Hume’s role in the Kantian project, see Watkins 2005: 363–389.) In a so-called “Reflexion”, Kant speaks of the system of the Critique of Pure Reason as revolving “around 2 cardinal points: as system of nature and of freedom, each of which leads to the necessity of the other. – The ideality of space and time and the reality of the concept of freedom, from each of which one is unavoidably led to the other analytically” (Reflexion 6353, 18: 679). He goes on to establish some connections with his practical philosophy, but, in any case, the passage indirectly points to the crucial connection between the (third) antinomy and Kant’s idealism.

It is clear that the Antinomy is supposed to strengthen the case for transcendental idealism. How should we assess the arguments there from a philosophical perspective? We started the discussion of the Antinomy as a possible reaction to the problem of the neglected alternative with respect to the ideality of time and space, presented above. Given that issues around space and time play a major role in the Antinomy, and that the Antinomy is supposed to furnish us with an indirect argument for idealism, the considerations there could be deemed directly relevant to this kind of problem. However, there might be some problems with invoking the Antinomy in response to this sort of problem. A case can be made for the view that the Antinomy does not concern the nature of time and space as such, but the relationship between the world, on the one hand, and space and time, on the other. (This is emphasized, for example, in Al-Azm 1972, throughout the analysis of the antinomies, which does not focus on the more specific issue of idealism; see especially p. 8.) In relation to this, the claim for which the Antinomy clearly functions as an indirect argument is not the transcendental ideality of time and space as such, but a more general core claim of transcendental idealism, namely that empirical objects (that is, the world we experience and cognize) are appearances, not things in themselves—the world of our experience and cognition is a mind-dependent world.

In any case, even if the Antinomy does not afford a solution to the neglected alternative problem, it is still a major indirect argument for idealism, understood more broadly as the idea that the empirical world is a world of appearances. It has traditionally been thought that the indirect case for idealism in the Antinomy is less likely to convince than the direct argument in the Aesthetic. Questions can be raised as to whether the resolution of the antinomy really depends on idealism or, rather, on key thoughts with respect to reason and sensibility that could be disentangled from it. (For a discussion of such questions see Wood et al. 2007. For the relevant exchange between Wood and Allison, see pp. 1–10 and 24–31, as well as Allison’s treatment of the Antinomy mentioned above. For an extensive treatment of the antinomy in its relationship to idealism, and in particular of how it lends support to a particular interpretation of Kant’s idealism, see Chiba’s discussion of the antinomy referred to above. For a critical analysis of the Second Antinomy that establishes some explicit connections with the role of idealism in the resolution of the antinomy, see Van Cleve 1999: 6–272. For a further treatment of the antinomy and its relationship to idealism, albeit embedded in a broader discussion of other issues, see Ameriks 1992.) A further source of worry concerns the antinomy itself (the conflict between the thesis and the antithesis) and whether it arises in the first place. The whole idea of the resolution of the antinomy is that we desperately need idealism to come to the rescue, but if the arguments of the thesis or the antithesis that are supposed to establish the conflicting claims are not sufficiently good, then no rescue would be needed at all. 
(For an overview of criticisms concerning the potential of these arguments to establish the claims they purport to establish and thus generate the antinomy in the first place—along with some responses from a Kantian perspective—see Allison 2004: 366–384. A notable type of criticism that has been voiced concerns the following problem: the arguments for the thesis or the antithesis, which should lead to a conflict to be resolved by appeal to transcendental idealism, could be taken to presuppose transcendental idealism themselves. If the very same claims that are supposed to be established in the course of the argument in the Antinomy were already presupposed by it, then such an argument in favor of idealism would of course be profoundly unsatisfactory. For versions of this type of criticism, see Kreimendahl 1998: 424–444.)

Against this background, a not particularly ambitious line of defense would be to assign to the indirect argument in the Antinomy a more heuristic, dialectical role in Kant’s case for idealism. One could view the considerations in the Antinomy as genetically important for motivating the case for idealism and historically alerting Kant to its potential merits; this would be compatible with letting the justificatory burden be carried by the direct argument in the Aesthetic. (Such an option is briefly discussed, without being adopted, in Jauernig 2021: 348–353, esp. 350–351.)

What is controversial about Kant’s case for idealism is not only how to assess and interpret his direct and indirect arguments for idealism, but also whether these two types of arguments are indeed the only considerations on which Kant rests his case, or rather, whether further considerations play an essential role. It is to such questions that we now turn.

b. Beyond Space and Time: Sensibility, Understanding, and the Case for Idealism

Transcendental idealism is the claim that all empirical objects, objects in space and time, are mind-dependent, and that we cannot cognize the mind-independent world. We have taken a look at arguments that put time and space center stage in order to establish this doctrine: the reason why we should think that the empirical, spatiotemporal world is a mind-dependent world is the fact that time and space are mind-dependent; this is the central idea in the Aesthetic. (In the case of the indirect argument in the Antinomy, things are somewhat trickier, as mentioned above.)

In any case, there are two notable sets of considerations in the Critique that clearly do not concern the status of space and time and are worth discussing here: one rests on Kant’s generic views on sensibility, quite independently of his specific views on time and space; the other rests on Kant’s account of the understanding and, in particular, the role that his views on a priori concepts and their objective validity play in his idealism.

i. Sensibility, Receptivity, and (Short) Arguments for Idealism

As we have seen, the Aesthetic discusses the faculty of cognition called ‘sensibility’. Kant asks whether intuitions and sensibility could be said to have a form that is characteristic of human perceiving subjects; space and time are then shown to be “nothing other than merely forms”, and it is on such grounds that transcendental idealism is established. But there is a different kind of key thought, operative at the level of sensibility but independent of considerations relating to space and time, which could be taken to lead to idealism. This key thought pertains to the “matter” (as opposed to the “form”) of experience. Sensibility is “the receptivity of our mind to receive representations insofar as it is affected in some way” (A51/B75). Kant begins his account of sensibility in the Aesthetic by noting that objects affect us, that is, have a (causal) impact on us, thereby supplying us with sensations. These sensations are “impressions” supplied by the senses; Kantian empirical intuitions have the form of space and time, and—as opposed to a priori intuitions—include the “material” component of sensations; through “affection” by objects we thus “receive” the “matter” of experience (A19–20/B33–34).

These thoughts on sensibility, affection, receptivity, and matter could be said to be intimately connected to idealism, quite independently of the views one holds with respect to space and time specifically. In fact, in his proto-critical Inaugural Dissertation, Kant explicitly presents considerations with respect to affection, namely how the “representative state” of different subjects is “modified” differently by the presence of objects “according to the variations in the subjects”, as sufficient reason for the conclusion that through the senses we represent only “things as they appear, while things which are intellectual are representations of things as they are” (2: 392). It is only a few sections later that Kant goes on to present considerations regarding time and space, which closely parallel his arguments with respect to space and time in the Aesthetic of the Critique (2: 398–406). (In the main body of the Critique, the claim that the sensible world is a world of appearances, not things in themselves, is introduced—at least explicitly/officially—as a conclusion that only follows from the specific considerations with respect to space and time in the Aesthetic.)

The idea that a generic feature of sensibility, as opposed to specific considerations with respect to space and time, leads to Kantian idealism is an influential but controversial reading. One could describe an argument for idealism that bypasses specific considerations with respect to space and time as a “short argument to idealism”. (This is an expression used in Ameriks 1990, where the early reception of Kant’s idealism by figures such as Reinhold and Fichte is described as resting on such an idea and is criticized on precisely such grounds; see also Ameriks 2003: 134–158. Some versions of “short” arguments rest the case for idealism on Kant’s account of the understanding and its concepts as opposed to Kant’s specific account of sensibility and its forms; see Ameriks 1992. For a prominent reading that interprets Kant’s idealism as turning on generic considerations with respect to sensibility, see Strawson 1966: 250.) Although readings that attribute a “short argument” to Kant have been criticized as interpretations of Kant, not everyone in contemporary Kant scholarship agrees that an argument which does not focus on space and time is thereby problematic. In the influential interpretation of Kant’s idealism developed in Langton 1998, it is argued extensively, on exegetical and philosophical grounds, that Kant’s idealism follows from his distinctive views on sensibility, receptivity, and affection. (See esp. pp. 43–47 and 210–218.)

ii. Understanding and the Role of the Categories in Transcendental Idealism

Although transcendental idealism is already established in the Aesthetic, that is, the part of the Critique that concerns sensibility, Kant’s treatment of the understanding and its a priori concepts is of particular importance—and arguably presupposed and anticipated in the Aesthetic—for establishing idealism. As we saw in Section 1, the objects to which we legitimately apply the categories can only be empirical objects, appearances, not things in themselves. This is certainly crucial for establishing transcendental idealism; however, the exact relationship between Kant’s account of the understanding and his idealism raises difficult interpretive and philosophical questions. The presentation of Kant’s argument(s) for idealism and related disputes shall be concluded by taking note of two such central questions. (For a series of contributions on transcendental idealism, where the (contested) role of the categories and the understanding takes center stage, see Schulting and Verburgt 2011, in particular the contributions on “transcendental idealism and logic”.)

With regard to interpretation, it is clear that a specific Kantian view of the understanding is necessary to establish one of the key claims of transcendental idealism, namely the non-cognizability of things in themselves: Kantian idealism requires the view that the understanding and its concepts cannot supply us with cognition of things in themselves. The more optimistic view, according to which the understanding can supply us with such cognition, would be a view that Kant himself held in his Inaugural Dissertation and abandoned in the Critique. As previously mentioned, the Inaugural Dissertation contains considerations with respect to the ideality of space and time which closely parallel Kant’s arguments with respect to space and time in the Critique. In the Inaugural Dissertation, Kant already thought that sensibility presents us with appearances, not things in themselves. But he thought simultaneously that the understanding does allow us to cognize things in themselves; hence, although cognitive access to the mind-independent world was precluded with respect to sensibility, the understanding did provide a route to this mind-independent world. It is precisely this view of the understanding that marks this work as “pre-critical”, whereas Kant’s view of sensibility as developed there already broadly corresponds to the “critical” view of the Critique, thus leading to the distinctive “proto-critical” status of the work as a whole.

The indisputable fact that Kant changes his view in the Critique could be interpreted as a rather radical break: one possible (and influential) reading is to interpret Kant’s treatment of the understanding and his claims with respect to a priori concepts as essentially parallel to his treatment of sensibility and his claims with respect to a priori intuitions. In such an understanding, we would essentially have two Kantian arguments for idealism: one argument that establishes the mind-dependence of purely “sensible” properties (such as spatiotemporal properties) in the Aesthetic and another that establishes a similar result with respect to a different kind of property, those that are described by means of the a priori concepts of the understanding (the categories) and could thus be called ‘categorial’—for example being a cause or a substance—in the Analytic. Even if Kant had not written a word on sensibility and space and time, he would still be committed to the view that the objects we cognize are mind-dependent appearances, not things in themselves, on the grounds of his distinctive account of the understanding and categorial properties; he would be committed to a form of conceptual idealism. (For a description of this kind of view, see Allais 2015: 292–303. This is not a view endorsed by Allais.)

In contrast to this interpretation, an alternative—and more moderate—reading of the kind of rupture between the “pre-critical” and “critical” Kant could operate along the following lines: the mind-dependence of objects of cognition and their properties—including categorial properties—is not to be attributed to the contribution of the understanding per se; it is instead to be attributed to the Kantian view that cognition requires the contribution of both the understanding and sensibility (A51–52/B75–76), so that the understanding alone is not sufficient. We cannot cognize the mind-independent world because we can have no intuition of it as such. (For this kind of interpretation, see Allais 2015: esp. 292–303. For a defense of the claim that the central arguments in the Analytic cannot establish the mind-dependence of categorial properties, see Watkins 2002. The critique of “short arguments” for idealism in Ameriks 1990, 1992 is of relevance here as well.)

Related questions also arise at a more philosophical level. Kant’s (otherwise) valuable project in the Analytic has often been thought to be too intimately connected to his idealism. This has led to the influential view that one should try, from a philosophical perspective and with all due respect to Kant himself, to disentangle two aspects of Kant’s treatment that seem closely linked to each other in the Critique: (i) Kant’s defense of some a priori, non-empirical elements in our cognition of objects of experience, and (ii) the additional, idealist claim that objects of experience thus have to be mind-dependent. (For this kind of approach see Strawson 1966: esp. 15–24, Guyer 1987, Westphal 2004: 68–126.) How one could react to this sort of criticism from a Kantian perspective depends partly on how one interprets the exact relationship between Kant’s account of the understanding and his idealism in the first place; there are connections between this philosophical criticism and the more exegetical point of controversy presented above.

Some prominent philosophical and interpretive issues that surround Kant’s argument(s) for transcendental idealism have now been considered. We shall now take a closer look at the doctrine itself.

3. Controversies with Respect to the Status of Kantian Appearances

Transcendental idealism is a set of claims about appearances and things in themselves. Even independently of how one argues for this doctrine, there are additional, and difficult, questions concerning the doctrine itself: what does the distinction between appearances and things in themselves amount to, exactly, and how are we to understand the claim about the non-cognizability of the latter? A host of controversies surrounds these questions, and the rest of this article is dedicated to some of them. To understand the distinction between appearances and things in themselves, one, naturally, has to get a grip on issues that pertain to appearances as well as issues pertaining to things in themselves; these issues are often interconnected. For the purposes of presentation, we shall try to focus on each of these two sets of issues in turn. In this section, the focus is on Kant’s doctrine of appearances—that is, the status of mind-dependent objects—whereas the next section focuses on the doctrine of things in themselves—that is, the status of the mind-independent world according to Kant.

In the course of this discussion of Kantian appearances, we look at (part of) an influential objection to transcendental idealism—namely that it is too radical—and are introduced to a famous controversy in Kant scholarship, the debate between “one-world” and “two-world” interpretations of Kant’s idealism.

a. The Radical Idealism Charge and the Affection Dilemma: The Problem of Appearances

According to Kant, the empirical, spatiotemporal world is a world of appearances, not things in themselves; appearances, as opposed to things in themselves, somehow depend on minds. That much is clear. The doctrine has traditionally raised some eyebrows. As shown above, Kant’s idealism is intended to be a moderate version of idealism, to be distinguished from more radical—and in Kant’s view highly unattractive—versions thereof. Readers of the Critique have often felt that this is not quite the case and that transcendental idealism is, at least by implication, more radical than advertised. (The very first readers focused on precisely this sort of problem and inaugurated a long tradition of such worries; see especially Feder and Garve 1782, Jacobi 2004 [1787]. For a collection of early critical responses to Kant, in English translation, see Sassen 2000.) The concern has also been voiced that Kant himself openly admits sometimes how radical his idealism is, for instance in the Fourth Paralogism in the A-edition of the Critique. It is there that we find the first official definition of transcendental idealism, which is enlisted as a solution to the problem of external world skepticism. Kant (in)famously says there that “external objects (bodies) are merely appearances, hence also nothing other than a species of my representations, whose objects are something only through these representations, but are nothing separated from them” (A 370). In the historically dominant reading, Kant pursues an anti-skeptical strategy of a Berkeleyan stripe, aiming to secure our belief in the existence of the external world by reducing this world to a mind-dependent, mental entity to which we have privileged access. This strategy is often thought to be too radical and unattractive. (For readings of Kant’s Fourth Paralogism along such lines see Jacobi 2004 [1787]: 103–106, Kemp Smith 1923: 304–305, Turbayne 1955: 228–239.)

The radical idealism charge is general and complex and has been framed in a number of ways—but there is a famous problem that has played a particularly important role in framing this sort of criticism, which merits special mention. Kant starts presenting his account of sensibility (in the Aesthetic) by speaking about objects that affect us, that is, that have a (causal) impact on us, thereby supplying us with sensations—in this sense they supply us with the “matter” of experience. The affecting object-talk soon raised the question: what object are we talking about here? Given Kant’s distinction between appearances and things in themselves, it is natural to think that there are two candidates: the objects that affect human minds are either appearances or things in themselves. The concern has been raised that both options are untenable, the principal worries being (i) that objects as appearances are simply not fit for purpose, and (ii) that embracing the claim that things in themselves are the affecting objects would lead to serious problems and inconsistencies in the Kantian system. (For a classical statement of the problem see Jacobi 2004 [1787]; see also Vaihinger 1892: 35–55.)

For the purposes of this section, we shall briefly look at the first option. The reason for appearances being considered unfit for purpose has to do with what it means for something to be an appearance. If we understand Kantian appearances as representations in human minds—as Kant himself sometimes says, and as some readers who press this sort of problem do—then we get the following picture: the “objects” that have a causal impact on minds (“affect” minds), thereby serving as sources of some mental states in these minds, namely their sensations, would themselves be mental states. This problem is intimately connected to the radical idealism charge and serves to illustrate it. From the point of view of commonsense realism, your mental state of perceiving a computer screen right now is (at least in part) the effect of there actually being an object out there, which is not itself a mental state in some human mind, but a “real” object, that is, an actual computer screen, or something close to it. In the kind of Kantian picture just sketched, however, there seem to be no such objects at all—and even if they exist, we are fundamentally cut off from them—and all we have are minds and their mental states.

b. Rival Interpretations of the Mind-Dependence of Kantian Appearances

In the kind of Kantian picture just presented, Kant’s talk of appearances as representations is taken at face value. Kantian appearances are mind-dependent, in the sense of being mind-immanent; they are taken to be mental states in our minds (or some construction out of such states). This reading was prevalent in the early reception of Kant, but it has subsequently been widely called into question by Kant scholars, leading to a debate that has shaped contemporary discussions of Kant’s idealism: the debate between so-called “one-world” and “two-world” interpretations of Kant’s idealism.

i. One- vs. Two-World Interpretations of Transcendental Idealism

The interpretation of the mind-dependence of appearances just described has been subsequently characterized as a two-world interpretation of Kant’s idealism: the main idea behind this characterization is that such entities, in being mental states (of some sort), are to be contrasted with mind-transcendent, external world objects that would somehow exist “behind” the veil of appearances. In this kind of picture, the distinction between appearances and things in themselves amounts to a distinction between numerically distinct entities that form two disjoint sets. It is in this sense that we end up with two “worlds”: the world of mental states (appearances) on the one hand, and the world of mind-transcendent objects (things in themselves) on the other. This sort of interpretation of the mind-dependence of appearances is also often called ‘phenomenalist’, as it somehow “mentalizes” Kantian appearances. (For some older interpretations of Kant’s idealism along such lines see, for example, Strawson 1966: 256–263, Vaihinger 1892: 51, as well as the interpretations by early readers of Kant mentioned above, such as Jacobi, Feder and Garve. There have also been some newer interpretations that fall under this category, which explicitly react to the alternative discussed shortly and which are mentioned again further below; see Guyer 1987: 333–344, Van Cleve 1999: esp. 8–12 and 143–150, Stang 2014, and Jauernig 2021.) It should also be noted that, strictly speaking, not all phenomenalist interpretations subscribe to the claim that Kantian appearances are mental states; in some versions, appearances are some sort of intentional object of our representations.
(This is the case with respect to Jauernig’s interpretation; Van Cleve’s view of appearances as “virtual objects” is also closer to this reading; see also Aquila 1979, Robinson 1994, Robinson 1996 and Sellars 1968: 31–53, where an intentional object view is upheld.) In any case, two-world interpreters agree that Kantian appearances are not—strictly speaking—mind-transcendent, external world objects. (It is on the basis of this criterion that some readings are classified here as two-world readings, despite the fact that their proponents stress their differences from (traditional/standard) two-world readings, or even want to resist such a classification altogether, as is the case with Guyer’s remarks in Wood et al. 2007: 12–13, for example.)

This sort of interpretation is highly controversial and has come under attack by readers who argue for a one-world interpretation of Kant’s idealism. In the alternative reading, Kant’s talk of appearances as representations is not to be taken at face value: appearances are not to be understood as (constructions out of) mental states, and the relevant sense of mind-dependence is not mind-immanence. In this contrasting view, Kantian appearances are not just “in our minds”. This view results in a one-world interpretation of transcendental idealism, which has also been dubbed a “double-aspect” view. The main idea is that appearances are external world objects that are numerically identical to things as they are in themselves. (For some related remarks on the endorsement of the numerical identity claim by one-world theorists, see Section 3.b.ii.) The distinction between appearances and things in themselves does not amount to a distinction between distinct entities; rather, it is a distinction between different “aspects” of one and the same object. Since the 1980s, one-world interpretations of transcendental idealism have become increasingly popular.

Talk of aspects can be metaphorical, and the way one construes it can make a big difference to our understanding. In some of the first (explicit) formulations of one-world interpretations, talk of aspects was understood as talk about different ways of considering things: the very same object is taken to be an appearance, when considered in its relation to epistemic subjects, and a thing in itself, when considered independently of such a relation. Interpretations that understand the distinction in this way are often called ‘methodological’ or ‘epistemic’. (Interpretations along such lines have been proposed in Bird 1962, Prauss 1974, Allison 1983 and 2004. It is common to characterize methodological readings in terms of abstraction: we distinguish between appearances and things in themselves, because we abstract from the objects as appearances and the conditions of knowing them; see, for example, Guyer’s characterization of Allison’s view in Wood et al. 2007: 11–18. For some thoughts that complicate this picture, however, see Allison’s reaction: ibid., 32–39.)

This way of understanding the distinction contrasts with a metaphysical reading of Kant’s idealism. In that reading, the distinction between appearances and things in themselves concerns how things are, not how we consider them. From the perspective of a metaphysical version of one-world interpretations of transcendental idealism, the distinction between “two aspects of one and the same object” is to be understood as a distinction between two different sets of properties. There are different ways of understanding the distinction between these two sets of properties: for instance, as a distinction between relational properties on the one hand and intrinsic properties on the other—or a related distinction between essentially manifest qualities and intrinsic natures of things. Another example is a distinction between dispositional properties, on the one hand, and categorical ones, on the other. (For different metaphysical accounts of the distinctions that illustrate each of these options, see Langton 1998: esp. 15–40, Allais 2007, 2015: 116–144 and 230–258, and Rosefeldt 2007. For a helpful discussion of the distinction between appearances and things in themselves that broadly falls in the one-world camp, see Onof 2019. For one of the oldest interpretations that is often read as a one-world account avant la lettre, see Adickes 1924: esp. 20–27.) The main idea of the one-world reading, in its metaphysical construal, is that the bearer of mind-dependent properties (appearance) and the bearer of mind-independent properties (thing in itself) are the very same object.

One-world interpretations have not gone unchallenged; both methodological and metaphysical versions have received thoroughgoing criticism. (Objections against Allison’s influential methodological reading have been raised, for instance, by Guyer and Van Cleve, who think that the two-world interpretation is ultimately correct; see Guyer 1987: 336–342, Van Cleve 1999: 6–8. See also the previously mentioned exchange between Guyer and Allison in Wood et al. 2007. For a further critique of methodological readings, which is also sympathetic to two-world readings, see Chiba 2012: 72–88.) Metaphysical versions of one-world interpretations are themselves partly a reaction to the problems of methodological one-world readings, as methodological readings were the first full-blown versions of one-world interpretations. (For some criticism of methodological interpretations from the perspective of metaphysical one-world interpretations, see Allais 2015: 77–97, Langton 1998: 7–14; see also Westphal 2001.) A rejection of methodological readings in favor of a metaphysical interpretation—one critical of prominent two-world readings, without committing to a double-aspect view—had already been suggested and defended in Ameriks 1982 and 1992. More objections have been raised against newer, metaphysical versions of one-world interpretations, and newer versions of two-world interpretations have been defended as an alternative. (See especially Jauernig 2021, Stang 2014.)

ii. Textual and Philosophical Motivation

The controversy just outlined turns on multiple interpretive and philosophical problems, which have received sustained attention and discussion. A first, major issue concerns direct textual evidence: what does Kant himself say about the status of appearances and the way we should understand the distinction between appearances and things in themselves? As we have already seen, there are plenty of passages in which Kant characterizes appearances as representations (in German: Vorstellungen; A30/B45, A104, A370, A375 note, A490–491/B518–519, A494–495/B523, A563/B691). Such passages suggest a phenomenalist interpretation of appearances as mental states of some sort, supporting a two-world interpretation. On the other hand, proponents of one-world interpretations point to the many passages in the Critique that suggest that appearances and things in themselves are the same thing, and that Kant is thus committed to the numerical identity of an object as an appearance and as it is in itself (Bxx, Bxxv-xxviii, A27–28/B44, A38/B55, B69; B306). Taking a cue from this kind of passage, one could claim that Kant’s considered view cannot be that appearances are mental states of some sort, because in this case they would have to be numerically distinct from things in themselves (which would clearly be mind-independent and as such mind-transcendent entities, thus distinct from “representations”). Given this state of textual evidence, it is typical for Kant commentators to read one of these two categories of passages as some form of loose talk on Kant’s part, such that there is no contradiction with their overall interpretation.

A further prominent issue turns on philosophical considerations with respect to claims about numerical identity within the framework of transcendental idealism. A problem often pressed by two-world theorists against one-world interpretations concerns the coherence of a view that combines (i) the claim that appearances, but not things in themselves, are in space and time, with (ii) the claim about numerical identity between appearances and things in themselves. The worry is that one-world theorists, in claiming that appearances and things in themselves are numerically identical, are essentially claiming that one and the same object has contradictory properties—for example being spatial and not being spatial. (For such, or similar, philosophical objections with respect to numerical identity claims, see Van Cleve 1999: 146–150, Stang 2014. For a “one-world” perspective on such issues, see Allais 2015: 71–76. It is worth noting that one-world theorists tend to qualify and weaken the numerical identity claim that was—originally—characteristic of one-world interpretations, as part of their response to this kind of philosophical objection; for instance, the “one-world” terminology that was characteristic in Allais 2004 is dropped in Allais 2015: esp. 8–9 and 71–76. For an account that is favorable to the idea that questions of “one” vs. “two worlds” should not be at the heart of the debates on Kant’s idealism, see Walker 2010; see also Adams 1997: esp. 821–825.)

A third type of prominent philosophical and textual consideration in the debate revolves around Kant’s claim to have established a moderate brand of idealism that somehow incorporates realist features, being a version of merely formal and transcendental—rather than material or empirical—idealism. This is often taken to count against phenomenalist interpretations of Kantian appearances and to support one-world readings: in a one-world view, Kantian appearances are public, mind-transcendent objects of the external world; these objects are considered in their relation to epistemic subjects and our conditions of knowing such objects (methodological reading), or they are bearers of mind-dependent properties (metaphysical reading). This sort of view seems to accommodate Kant’s realism better than the view that Kantian appearances are somehow mental. (For a defense of this view, see, for instance, Allais 2015: 44–56. For a two-world perspective on such worries, see Jauernig 2021: esp. 114–129, 155–172.)

This final problem concerning the exact relationship between transcendental idealism and realism raises interesting questions as to whether one can confine oneself to an (alternative) interpretation of the status of appearances, or, rather, whether things in themselves have to be considered, to fully account for the non-idealist features of Kant’s doctrine. Historically, part of the motivation for one-world interpretations was the idea that if we have an understanding of appearances as sufficiently robust entities, then one can do justice to the realist features of Kant’s position, while in a sense dispensing with things in themselves. (In some of the first worked-out one-world readings, this idea was central to the discussion of the problem of affection introduced above. According to the interpretation developed in Bird 1962: 18–35 and Prauss 1974: 192–227, for instance, the role of affection is assigned to—robust, extramental—appearances; things in themselves are thought to be dispensable. See also Breitenbach 2004: 142–146.) Moreover, Kant himself, in (part of) his explicit reaction to one of the main charges pressed against him by his early readers, namely the complaint that transcendental idealism is not sufficiently realist, formulates some thoughts in his defense that make no appeal to things in themselves and merely turn on questions pertaining to the realm of appearances (4: 375).

That being said, there are some reasons—touched upon further below—to think that a solution to the radical idealism charge—and the problem of affection more specifically—that bypasses the problem of things in themselves might not be satisfactory. This also means that a discussion of the exact relationship between rival interpretations of transcendental idealism and realism can be inconclusive without an explicit discussion of the exact role that the mind-independent world—the things in themselves—plays in the overall account, in each proposed view. It is to the role of things in themselves and their status in Kant’s idealism that we now turn.

4. Controversies with Respect to the Status of Kantian Things in Themselves

We have taken a closer look at Kant’s doctrine of transcendental idealism by focusing on philosophical and exegetical issues with respect to appearances—as contrasted with things in themselves—and by acquainting ourselves with notable related controversies. In this section, the focus is on the other point of the contrast, namely the status of things in themselves, which raises issues equally fraught with deep controversies.

Although both points of the contrast—appearances vs. things in themselves—are the subject of heated debate, there is a sense in which the main controversy is located elsewhere in each case. In the case of appearances, the main controversy mostly concerns how one should understand the concept of an appearance and how one should cash out the exact kind of mind-dependence implied by this concept. Some think, for instance, that it implies the mind-immanence of the object in question (two-world reading), whereas others disagree (one-world reading). Despite this lack of consensus as to how exactly we should understand the concept of appearances, there is agreement on the fact that the concept is, according to Kant, clearly instantiated: Kant is clearly committed to there being objects of some sort that he calls ‘appearances’—even if these objects are taken to be very “insubstantial”, “minimal” or “virtual”. In the case of things in themselves, and putting some subtle and complicated details aside, there is a sense in which we have some consensus as regards how one should understand the concept of things in themselves, namely as the concept of a mind-independent world. The substantial controversy arises from further questions with respect to the instantiation of the concept. In the history of the reception and interpretation of Kant’s idealism, not everyone has agreed that Kant is committed to the existence of things in themselves, and many have thought that it would be philosophically problematic for Kant to do so.

a. The Radical Idealism Charge and the Affection Dilemma: The Problem of Things in Themselves

Faced with the charge of offering too radical a version of idealism, Kant explicitly reacted by pursuing a two-pronged strategy. One part of the strategy, pursued in the Appendix in the Prolegomena (4: 375), was alluded to shortly before: it is a strategy that confines itself to the level of appearances and shows how Kant’s idealism has the resources to be distinguished from older—and untenable—versions of idealism; a key thought in this respect is the Kantian idea that experience has an a priori aspect. (For a defense of this sort of consideration, as well as related responses to the radical idealism charge, see Emundts 2008.) Kant’s strategy in the Prolegomena has a second part as well, however: in some remarks dedicated to the topic of idealism (esp. 4: 288–290; see also 4: 292–295), Kant invokes his commitment to there being a mind-independent world, things in themselves, as a feature of his overall view that distinguishes it from radical versions of idealism: the moderate, transcendental idealist thinks that the objects of experience are appearances; however, this does not mean that all that exists is an appearance—we have to add things in themselves into the mix.

The idea that things in themselves are an indispensable part of the mix has received special attention in the context of the—more specific—problem of affection. Recall the problem: Kant’s talk of affection by objects, which provides human minds with sensations, has raised the dilemma of whether these objects are appearances or things in themselves. In the view of Kant’s critics, both options are untenable. We saw above that the main worry with respect to the first option, appearances, was that such entities are not fit for purpose. Historically, this diagnosis was motivated by a two-world interpretation of Kantian appearances; it was also noted in passing that defending appearances in their role as affecting objects was part of the motivation for some of the first versions of one-world readings of appearances as robust, mind-transcendent entities. (See Bird 1962: 18–35, Prauss 1974: 192–227. For a further interpretation that wishes to resist attributing to Kant the idea that we are affected by things in themselves, see de Boer 2014.) Nonetheless, it has been argued—in some cases by proponents of one-world interpretations—that appearances cannot play the role of the affecting object, even if they are understood as mind-transcendent entities of the external world. (For such a view see, for example, Allison 2004: 64–68; for further critique of the idea that appearances could do the affection job, see Jauernig 2021: 310–312.)

Philosophical and exegetical considerations can thus be cited in support of the idea that things in themselves have to play a role in the overall picture—as part of a more plausible and coherent story about affection, but also to do justice to the moderate nature of Kant’s idealism more generally. This, however, leads us to the second horn of the affection dilemma: embracing the claim that things in themselves exist and affect human minds has been thought to lead to serious problems in the Kantian system. One difficulty is that this claim sounds like an unjustified assumption that begs the question against the external world skeptic. (For an old and influential criticism along such lines, see Schulze 1996 [1792]: esp. 183–192/[262–275].) Moreover, there is the further prominent concern that such an assumption would introduce a major inconsistency in the overall account, as it would be incompatible with Kantian epistemic strictures with respect to things in themselves. Two related problems stand out. First, according to transcendental idealism, we are supposed to have no knowledge or cognition of things in themselves—but if I claim that they exist and affect human minds, then I do seem to know a great deal about them. Second, a natural way of understanding Kant’s affection-talk is to refer to some sort of causal impact that external world objects have on human minds, thereby providing them with sensory input. According to Kant, however, the concept of cause and effect is supposed to be an a priori concept, a category, and as such to be valid only of appearances, not things in themselves. Assigning the role of affection to things in themselves seems thus to require an (illegitimate) application of the category of cause and effect to things in themselves, contrary to Kant’s own preaching. (For this type of criticism, see especially Schulze 1996 [1792]: 183–264/[263–389], Jacobi 2004 [1787].)

b. Commitment to Things in Themselves

Things in themselves have thus historically been thought to present us with fundamental problems. This raises both exegetical questions, as regards what sort of view Kant ultimately held, and philosophical questions, as regards how defensible Kant’s view on that matter actually is. This presentation of Kant’s idealism is completed by taking these two broad sets of questions in turn.

i. The Exegetical Question: Noumenon, Transcendental Object, and Thing in Itself

The claim that, as a matter of interpretation, Kant accepts the existence of things in themselves and assigns the role of affection to them has been traditionally a matter of (fierce) controversy. Although there are passages that clearly support attributing to Kant such a view, the overall picture is more complex. How one handles the relevant textual evidence depends on the stance one takes with respect to some complicated interpretive questions around terms such as ‘transcendental object’ and ‘noumenon’—terms that were briefly mentioned above but that now warrant further discussion.

With respect to direct evidence for Kant’s commitment to things in themselves, there are a number of relevant passages in the Critique and further critical works (8: 215, 4: 314–5, A251–252, A190/B235, 4: 451). In some of these passages, we have Kant speaking explicitly of things in themselves that affect perceiving subjects and provide them with the “matter” of empirical intuitions. Moreover, there are a number of passages in which Kant speaks clearly of an object that “grounds” appearances and constitutes their “intelligible cause” (A379–380, A393, A494/B522, A613–614/B641–642). (For a discussion of textual evidence for Kant’s commitment to things in themselves, see Adickes 1924: 4–37.) However, it is often noted that, as far as the latter category of passages is concerned, this object is not characterized as a ‘thing in itself’; the object is characterized instead as a ‘transcendental object’. This makes the interpretation of such passages contingent on our stance regarding the controversial question of how to best understand Kant’s references to such an object.

Kant’s discussion of a transcendental object has given rise to rather different interpretations. Some uphold the view that the expression does refer to things in themselves (see, for instance, Kemp Smith 1923: 212–219); others have denied that this is the case (see, for instance, Bird 1962: 68–69, 76–81); a weaker ambiguity thesis has also been advocated, according to which the expression refers to things in themselves only in some places, but not in some others (for this view, see Allison 1968). The main motivation for the view that ‘transcendental object’—at least in some places—does not refer to things in themselves has to do with certain passages in which Kant analyzes the concept of a transcendental object and in which he seems to be strongly suggesting that ‘transcendental object’, unlike ‘thing in itself’, does not stand for the mind-independent world, standing instead for a mind-immanent, merely intentional object. (This is often taken to be implied by Kant’s analysis of the concept of an object in the A-version of the Transcendental Deduction (A104–105, A108–109), in the course of his highly complex argument for the “objective validity” of the categories.) The main motivation for the opposing view, namely that ‘thing in itself’ and ‘transcendental object’ can be used somewhat interchangeably, is the fact that in many places Kant seems to be doing precisely that, without, for instance, feeling the need to explain the term ‘transcendental object’ when he first introduces it (A46/B63). (It is worth noting that part of the difficulty and controversy stems from the question of how we should best understand the term ‘transcendental’ more generally. Even the term ‘transcendental idealism’ is difficult to express in more standard philosophical vocabulary in an uncontroversial way for precisely this reason. 
For some standard definitions of ‘transcendental’ provided by Kant, see 4: 373 note, 4: 294, B25; for some considerations that challenge the idea that these definitions indeed capture Kant’s actual usage in some cases, see Vaihinger 1892: 35–355.)

A further complication when trying to evaluate the textual evidence for Kant’s commitment to things in themselves arises from Kant’s talk of noumena, certain views he holds with respect to those, and how all this connects to things in themselves. Although the predominant view in Kant scholarship is that there is evidence that Kant is committed to the existence of things in themselves and an affection by these, there are passages in the Critique that have often been thought to cast doubt on the firmness of Kant’s commitment. Some passages in the chapter on Phenomena and Noumena are particularly important in this respect, because they clearly—and intentionally—survived changes between the two editions and stem from a section in which Kant discusses his idealism in detail, thereby giving such passages particular weight. (For other passages in the Critique that have been cited as evidence against Kant’s commitment, see especially A288/B344–345, A368, A372, A375–376. For a discussion of a range of textual evidence, with the aim of showing that Kant is not committed to things in themselves, see Bird 1962: 18–35, Bird 2006: 40–44, 122–126, 553–585.)

In the chapter on Phenomena and Noumena, Kant expresses a clear agnosticism with respect to the existence of noumena (A-edition), or at least the existence of noumena in the positive sense of the term (B-edition). Kant tells us that “the concept of a noumenon” is “merely a boundary concept, in order to limit the pretension of sensibility” (A255/B311); although Kant’s stance towards this concept is not entirely dismissive, he notes that the concept has to be “taken merely problematically” (A256/B311), which, in Kant’s terminology, means that noumena, although conceptually possible, cannot be assumed to exist, because we do not know if their concept is instantiated. If ‘noumenon’ refers to things in themselves, we could infer from such passages that according to Kant we have to be non-committal with respect to the existence of a mind-independent world. (For readings along such lines, see, for instance, Bird 1962: 73–77, Cohen 1918: 658–662, Emundts 2008: 135–136, Senderowicz 2005: 162–168.)

However, the interpretation of Kant’s stance based on such passages is complicated and contentious, for a number of reasons. First, the changes Kant made in the B-edition, by introducing two senses of ‘noumenon’ (positive and negative), raise some questions of interpretation. Second, the idea that ‘noumenon’ (in the positive sense) and ‘thing in itself’ can be used interchangeably could be challenged. One line of thought that serves to challenge this idea turns on the debate between one- and two-world interpretations of transcendental idealism. Proponents of one-world interpretations have suggested that noumena, unlike things in themselves, are objects that are numerically distinct from appearances; in such an interpretation, when Kant expresses agnosticism with respect to noumena, he wants only to rule out two-world readings of transcendental idealism. (See, for instance, Allais 2010: 9–12 for this kind of strategy.) A different line of thought to the same effect would focus less on the metaphysical status of noumena—their numerical identity with, or distinctness from, things in themselves—and more on their epistemic status: noumena are objects of the (pure) understanding according to Kant and are as such fully cognizable by it; this is not the case with respect to things in themselves. Not affirming the possibility of an object of such demanding cognition would be compatible with a commitment to the mere existence of things in themselves; for some passages in support of this view, see A249, A252, A253, A249–250, A251, B306.

The exegetical and philosophical questions with respect to Kant’s commitment to things in themselves are deeply intertwined. Concerns about philosophical problems have often motivated certain interpretive stances, which aim to jettison the thing in itself. (Fichte, who opposed attributing to Kant such a commitment, is characteristic in this respect; see esp. Fichte 1970 [1797/1798]: 209–269.) Moreover, some approaches to the exegetical question already provide a hint of how a proponent of Kant’s commitment to things in themselves would deal with this from a philosophical perspective. This becomes obvious in the following subsection, which addresses the issue from a more philosophical perspective, and with which we conclude the presentation of Kant’s doctrine of appearances and things in themselves.

ii. The Philosophical Question: The Legitimacy of a Commitment to Things in Themselves

For those who ascribe to Kant a commitment to things in themselves, the question arises as to how one could defend Kant’s idealism against the worries raised above. As for the concern that Kant’s commitment to things in themselves is an unjustified assumption that begs the question against the external world skeptic, different strategies have been suggested. One way of proceeding would be to argue that Kant does provide an argument for the existence of things in themselves; of relevance are passages in Kant that sound like arguments for the view that the existence of appearances implies the existence of things in themselves (Bxxvi-xxvii, A251–252, 4: 314, 4: 451), as well as the argument in the Refutation of Idealism in the B-edition (more controversially). (For some relevant thoughts—and problems—see Langton 1998: 21–22 and Guyer 1987: 290–329. For a further reconstruction of a Kantian argument that draws on less obvious resources—and is also relevant for further concerns with respect to Kant’s commitment—see Hogan 2009a and Hogan 2009b.) A different way of proceeding consists in not conceding that Kant has to argue for the existence of a mind-independent world: according to this line of defense, it is not essential to the Kantian project to refute external world skepticism and to give us an argument for the existence of things in themselves; rather, one could interpret the claim about the existence of a mind-independent world as a commonsense assumption that can be taken for granted. Since the Critique is an inquiry into the possibility and limits of a priori cognition, there is no question begging in accepting some commonsense assumptions (for instance about empirical knowledge or the existence of the external world) in this type of project. (For this type of reaction, see especially Ameriks 2003: 9–34.)

As far as the inconsistency worry (incompatibility with epistemic strictures and illegitimate application of the categories) is concerned, the following, to some extent interrelated, lines of defense stand out. It is often noted that, from a Kantian perspective, there is a distinction to be made between thinking and cognizing/knowing things in themselves through the categories; Kantian restrictions with respect to the application of the categories to things in themselves concern only the latter. (For more on this interpretation, see especially Adickes 1924: 49–74.) Moreover, it has been argued that we could interpret Kant’s epistemic strictures differently, so that no incompatibility arises between the non-cognizability of things in themselves and a commitment to them. Kant’s epistemic strictures are not a sweeping claim against all knowledge with respect to things in themselves; Kant merely prohibits determinate cognition of things in themselves (which would involve having intuition of the object as such and representing its properties). This is compatible with a more minimal commitment to the existence of the entity in question. Another (related) way of framing the point is to distinguish between knowing that an object has some mind-independent properties and knowing which properties these are. (For this type of reaction, see especially Langton 1998: 12–24, Chiba 2012: 360–368, Rosefeldt 2013: 248–256.)

These lines of defense can be connected with contemporary work on Kant’s concept of cognition (in German: Erkenntnis), which suggests that there is a distinction to be drawn between cognition and knowledge (in German: Wissen), and that Kant’s strictures are to be interpreted as non-cognizability, not as unknowability claims (which would preclude any form of knowledge about things in themselves). (For the distinction between cognition and knowledge, see Watkins/Willaschek 2017: esp. 85–89, 109 and Chignell 2014: 574–579.) The main idea in this distinction is that Kantian knowledge is closer to our contemporary notion of knowledge—which is generally analyzed as a form of justified true belief—whereas Kantian cognition is a kind of mental state that represents the object and involves concepts and intuitions. The upshot of these lines of defense is that the problem of things in themselves in Kant’s idealism might be more manageable than is often thought.

5. Concluding Remarks

As has become evident, Kant’s transcendental idealism is a highly controversial doctrine, both in terms of interpretation and in terms of philosophical evaluation. As soon as one tries to translate Kantian jargon into more standard contemporary philosophical vocabulary, one often has to take a stance with respect to notable controversies. The conflicting interpretations are often subtle and very well worked-out, and in many cases aim to present Kant’s transcendental idealism as a coherent position that does not rest on crude considerations and blatant mistakes; in some other cases, the aim is to salvage parts of Kant’s doctrine that are philosophically defensible, while explicitly letting go of certain aspects of the official overall view. In any case, Kant’s transcendental idealism has been the topic of intensive scholarly engagement and rather vigorous philosophical discussions, which have led to deep controversies that are yet to reach consensus but that have also greatly advanced our understanding of Kant’s philosophy as a whole.

One clearly discernible tendency in contemporary interpretations of Kant, which exemplifies the enduring lack of consensus but also shows how much the interpretation of Kant’s idealism has evolved, concerns the debate between one-world and two-world interpretations. In its initial formulation, the debate was between two clear-cut interpretations of transcendental idealism: the old, traditional, metaphysical two-world interpretation on the one hand, and a newer, more methodological, one-world interpretation on the other hand. Metaphysical interpretations of Kant’s idealism have been making a comeback, complicating the picture with respect to one-world interpretations, as metaphysical versions thereof have been proposed. Moreover, there is a sense in which even the distinction between one- and two-world interpretations is being eroded. Partly under pressure from proponents of two-world interpretations—which are also making something of a comeback—views initially associated with (metaphysical) one-world readings are now no longer cashed out in terms of the “one-world” or “numerical identity” terminology; the resulting view is a weaker, qualified version of “double aspect” readings. (This was noted in passing in Section 3.) Notwithstanding these signs of convergence, substantial differences remain between opposing readings of Kant’s idealism, but more clarity has been achieved as to what the points of contention ultimately are.

A further contemporary tendency with wide-ranging implications, which was indirectly noted at different junctures of this article, concerns an emerging, more sophisticated understanding of Kant’s account of cognition as well as the (distinctive) role that intuitions and concepts play there. Getting a better grasp of these features of Kant’s view is central for understanding Kant’s entire philosophy. But it is also becoming increasingly clear that this is key to understanding and evaluating Kant’s idealism; it affects where we locate the argument(s) for transcendental idealism, whether Kant’s conclusions follow from the premises from which they are supposed to follow, and how coherent the overall resulting view is. Subtler accounts of these issues are important resources, with many suggesting that Kant’s idealism is more philosophically viable than was traditionally thought.

6. References and Further Reading

a. Primary Sources

  • Kant, Immanuel. Gesammelte Schriften. Ed. by Preußische Akademie der Wissenschaften. Berlin: Georg Reimer, 1900ff.
  • Translations used are from:
  • Kant, Immanuel. Correspondence. Transl. and ed. by Arnulf Zweig. Cambridge: Cambridge University Press, 1999.
  • Kant, Immanuel. Critique of Pure Reason. Transl. and ed. by Paul Guyer and Allen W. Wood. Cambridge: Cambridge University Press, 1998.
  • Kant, Immanuel. Notes and Fragments. Ed. by Paul Guyer, transl. by Curtis Bowman, Paul Guyer, and Frederick Rauscher. Cambridge: Cambridge University Press, 2005.
  • Kant, Immanuel. Theoretical Philosophy After 1781. Ed. by Henry Allison and Peter Heath, transl. by Gary Hatfield, Michael Friedman, Henry Allison, and Peter Heath. Cambridge: Cambridge University Press, 2002.

b. Secondary Sources

  • Adams, Robert Merrihew (1997): “Things in Themselves”. Philosophy and Phenomenological Research 57:4, 801–25.
  • Adickes, Erich (1924): Kant und das Ding an sich. Berlin: Rolf Heise.
  • Al-Azm, Sadik J. (1972): The Origins of Kant’s Argument in the Antinomies. Oxford: Clarendon Press.
  • Allais, Lucy (2004): “Kant’s One World: Interpreting ‘transcendental idealism’”. British Journal for the History of Philosophy 12:4, 665–84.
  • Allais, Lucy (2007): “Kant’s Idealism and the Secondary Quality Analogy”. Journal of the History of Philosophy 45:3, 459–84.
  • Allais, Lucy (2010): “Transcendental idealism and metaphysics: Kant’s commitment to things as they are in themselves”. Kant Yearbook 2:1, 1–32.
  • Allais, Lucy (2015): Manifest Reality: Kant’s Idealism and His Realism. Oxford/New York: Oxford University Press.
  • Allison, Henry E. (1968): “Kant’s Concept of the Transcendental Object”. Kant-Studien 59:2, 165–86.
  • Allison, Henry E. (1983): Kant’s Transcendental Idealism: An Interpretation and Defense. New Haven/London: Yale University Press.
  • Allison, Henry E. (2004): Kant’s Transcendental Idealism: An Interpretation and Defense. Revised and enlarged edition. New Haven/London: Yale University Press.
  • Ameriks, Karl (1982): “Recent Work on Kant’s Theoretical Philosophy”. American Philosophical Quarterly 19:1, 1–24.
  • Ameriks, Karl (1990): “Kant, Fichte, and Short Arguments to Idealism”. Archiv für Geschichte der Philosophie 72:1, 63–85.
  • Ameriks, Karl (1992): “Kantian Idealism Today”. History of Philosophy Quarterly 9:3, 329–342.
  • Ameriks, Karl (2003): Interpreting Kant’s Critiques. Oxford/New York: Oxford University Press.
  • Aquila, Richard E. (1979): “Things in Themselves and Appearances: Intentionality and Reality in Kant”. Archiv für Geschichte der Philosophie 61:3, 293–307.
  • Bird, Graham (1962): Kant’s Theory of Knowledge: An Outline of One Central Argument in the Critique of Pure Reason. London: Routledge & Kegan Paul.
  • Bird, Graham (2006): The Revolutionary Kant: A Commentary on the Critique of Pure Reason. Chicago/La Salle: Open Court.
  • Breitenbach, Angela (2004): “Langton on things in themselves: a critique of Kantian Humility”. Studies in History and Philosophy of Science 35:1, 137–48.
  • Chiba, Kiyoshi (2012): Kants Ontologie der raumzeitlichen Wirklichkeit: Versuch einer anti-realistischen Interpretation. Berlin/Boston: De Gruyter.
  • Chignell, Andrew (2014): “Modal Motivations for Noumenal Ignorance: Knowledge, Cognition, and Coherence”. Kant-Studien 105:4, 573–597.
  • Cohen, Hermann (1918): Kants Theorie der Erfahrung. 3rd edition. Berlin: B. Cassirer.
  • de Boer, Karin (2014): “Kant’s Multi-Layered Conception of Things in Themselves, Transcendental Objects, and Monads”. Kant-Studien 105:2, 221–60.
  • Emundts, Dina (2008): “Kant’s Critique of Berkeley’s Concept of Objectivity”. Kant and the Early Moderns. Ed. by Béatrice Longuenesse and Daniel Garber. Princeton: Princeton University Press, 117–41.
  • Falkenstein, Lorne (1995): Kant’s Intuitionism: A Commentary on the Transcendental Aesthetic. Toronto: University of Toronto Press.
  • [Feder, Johann Georg Heinrich and Garve, Christian] (1782): “Critic der reinen Vernunft. Von Imman. Kant”. Zugabe zu den Göttingischen Anzeigen von gelehrten Sachen 3, 40–48.
  • Fichte, Johann Gottlieb (1970) [1797/1798]: Versuch einer neuen Darstellung der Wissenschaftslehre. In: J. G. Fichte, Gesamtausgabe der Bayerischen Akademie der Wissenschaften. Vol. I, 4. Ed. by Reinhard Lauth and Hans Gliwitzky. Stuttgart-Bad Cannstatt: Frommann-Holzboog, 169–281.
  • Guyer, Paul (1987): Kant and the Claims of Knowledge. Cambridge/New York: Cambridge University Press.
  • Hogan, Desmond (2009a): “How to Know Unknowable Things in Themselves”. Noûs 43:1, 49–63.
  • Hogan, Desmond (2009b): “Noumenal Affection”. The Philosophical Review 118:4, 501–532.
  • Jacobi, Friedrich Heinrich (2004) [1787]: David Hume über den Glauben oder Idealismus und Realismus. Ein Gespräch. In: F. H. Jacobi, Werke – Gesamtausgabe. Vol. 2,1. Ed. by Walter Jaeschke and Irmgard-Maria Piske. Hamburg: Meiner/Frommann-Holzboog, 5–112.
  • Jauernig, Anja (2021): The World According to Kant. Appearances and Things in Themselves in Critical Idealism. Oxford/New York: Oxford University Press.
  • Kemp Smith, Norman (1923): A Commentary to Kant’s ‘Critique of Pure Reason’. 2nd edition, revised and enlarged. London, Basingstoke: Macmillan.
  • Kreimendahl, Lothar (1998): “Die Antinomie der reinen Vernunft, 1. und 2. Abschnitt”. Immanuel Kant: Kritik der reinen Vernunft. Ed. by Georg Mohr & Marcus Willaschek. Berlin: Akademie Verlag, 413–446.
  • Langton, Rae (1998): Kantian Humility: Our Ignorance of Things in Themselves. Oxford/New York: Oxford University Press.
  • Onof, Christian (2019): “Reality in-itself and the Ground of Causality”. Kantian Review 24:2, 197–222.
  • [Pistorius, Hermann Andreas] (1786): “Erläuterungen über des Herrn Professor Kant Critic der reinen Vernunft von Joh. Schultze, Königl. Preußischem Hofprediger. Königsberg 1784”. Allgemeine Deutsche Bibliothek 66:1, 92–123.
  • [Pistorius, Hermann Andreas] (1788): “Prüfung der Mendelssohnschen Morgenstunden, oder aller spekulativen Beweise für das Daseyn Gottes in Vorlesungen von Ludwig Heinrich Jakob, Doctor der Philosophie in Halle. Nebst einer Abhandlung vom Herrn Professor Kant. Leipzig 1786”. Allgemeine Deutsche Bibliothek 82:2, 427–470.
  • Prauss, Gerold (1974): Kant und das Problem der Dinge an sich. Bonn: Bouvier.
  • Robinson, Hoke (1994): “Two perspectives on Kant’s appearances and things in themselves”. Journal of the History of Philosophy 32:3, 411–41.
  • Robinson, Hoke (1996): “Kantian appearances and intentional objects”. Kant-Studien 87:4, 448–54.
  • Rosefeldt, Tobias (2007): “Dinge an sich und sekundäre Qualitäten”. Kant in der Gegenwart. Ed. by Jürgen Stolzenberg. Berlin/New York: De Gruyter, 167–209.
  • Rosefeldt, Tobias (2013): “Dinge an sich und der Außenweltskeptizismus: Über ein Missverständnis der frühen Kant-Rezeption”. Self, World, and Art. Metaphysical Topics in Kant and Hegel. Ed. by Dina Emundts. Berlin/Boston: De Gruyter, 221–260.
  • Sassen, Brigitte (ed.) (2000): Kant’s Early Critics: The Empiricist Critique of the Theoretical Philosophy. Edited and translated by Brigitte Sassen. Cambridge: Cambridge University Press.
  • Schulting, Dennis and Verburgt, Jacco (eds.) (2011): Kant’s Idealism: New Interpretations of a Controversial Doctrine. Dordrecht: Springer.
  • [Schulze, Gottlob Ernst] (1996) [1792]: Aenesidemus oder über die Fundamente der von dem Herrn Professor Reinhold in Jena gelieferten Elementar-Philosophie. Nebst einer Verteidigung des Skeptizismus gegen die Anmaßungen der Vernunftkritik. Ed. by Manfred Frank. Hamburg: Meiner.
  • Sellars, Wilfrid (1968): Science and Metaphysics. Variations on Kantian Themes. London: Routledge & Kegan Paul.
  • Senderowicz, Yaron M. (2005): The Coherence of Kant’s Transcendental Idealism. Dordrecht: Springer.
  • Shabel, Lisa (2004): “Kant’s Argument from Geometry”. Journal of the History of Philosophy 42:2, 195–215.
  • Stang, Nicholas F. (2014): “The Non-Identity of Appearances and Things in Themselves”. Noûs 48:1, 106–36.
  • Strawson, Peter F. (1966): The Bounds of Sense: An Essay on Kant’s Critique of Pure Reason. London: Methuen.
  • Trendelenburg, Adolf (1870): Logische Untersuchungen. Vol. 1. 3rd, enlarged edition. Leipzig: S. Hirzel.
  • Turbayne, Colin M. (1955): “Kant’s Refutation of Dogmatic Idealism”. The Philosophical Quarterly 5, 225–44.
  • Vaihinger, Hans (1892): Commentar zu Kants Kritik der reinen Vernunft. Vol. 2. Stuttgart: Union Deutsche Verlagsgesellschaft.
  • Van Cleve, James (1999): Problems from Kant. Oxford/New York: Oxford University Press.
  • Walker, Ralph C. S. (2010): “Kant on the Number of Worlds”. British Journal for the History of Philosophy 18:5, 821–843.
  • Warren, Daniel (1998): “Kant and the Apriority of Space”. Philosophical Review 107:2, 179–224.
  • Watkins, Eric (2002): “Kant’s Transcendental Idealism and the Categories”. History of Philosophy Quarterly 19:2, 191–215.
  • Watkins, Eric (2005): Kant and the Metaphysics of Causality. Cambridge: Cambridge University Press.
  • Watkins, Eric and Willaschek, Marcus (2017): “Kant’s Account of Cognition”. Journal of the History of Philosophy 55:1, 83–112.
  • Westphal, Kenneth R. (2001): “Freedom and the Distinction Between Phenomena and Noumena: Is Allison’s View Methodological, Metaphysical, or Equivocal?”. Journal of Philosophical Research 26, 593–622.
  • Westphal, Kenneth R. (2004): Kant’s Transcendental Proof of Realism. Cambridge: Cambridge University Press.
  • Willaschek, Marcus (1997): “Der transzendentale Idealismus und die Idealität von Raum und Zeit. Eine ‘lückenlose’ Interpretation von Kants Beweis in der ‘Transzendentalen Ästhetik’”. Zeitschrift für philosophische Forschung 51:4, 537–64.
  • Wood, Allen, Guyer, Paul, and Allison, Henry E. (2007): “Debating Allison on Transcendental Idealism”. Kantian Review 12:2, 1–39.

 

Author Information

Marialena Karampatsou
Email: marialena.karampatsou.1@hu-berlin.de
Humboldt University of Berlin
Germany

Divine Hiddenness Argument against God’s Existence

The “Argument from Divine Hiddenness” or the “Hiddenness Argument” refers to a family of arguments for atheism. Broadly speaking, these arguments try to demonstrate that, if God existed, He would (or would likely) make the truth of His existence more obvious to everyone than it is. Since the truth of God’s existence is not as obvious to everyone as it should be if God existed, proponents of arguments from divine hiddenness conclude that God must not (or probably does not) exist. While there is disagreement about how obvious God would make His existence, all the most prominent arguments from divine hiddenness maintain that God would (or would likely) make Himself obvious enough to everyone that nonbelief (or particular kinds of nonbelief) in God’s existence would not occur (or would not be nearly as common).

While the “argument from divine hiddenness” refers to a family of arguments for atheism, that term is often used interchangeably with the term “problem of divine hiddenness”. But the “problem of divine hiddenness” may refer to a much broader range of concerns than arguments for atheism. For example, those who want to believe in God’s existence, but who find themselves unable to believe, may experience pain or anxiety because of their lack of belief. This pain or anxiety can be considered an experiential problem of divine hiddenness, even if these nonbelievers never consider their own nonbelief as a piece of evidence against God’s existence. While other problems of divine hiddenness are briefly addressed at the end of this article, the bulk of what follows discusses hiddenness in the context of arguments for atheism.

Table of Contents

  1. Arguments from Divine Hiddenness
    1. J.L. Schellenberg’s Argument from Divine Hiddenness
    2. Other Arguments from Divine Hiddenness
  2. Responses to the Arguments from Divine Hiddenness
    1. There is no Nonbelief (of the Wrong Kind)
    2. Personal Relationship with God does not Require Belief that God Exists
    3. Greater Goods and Other Reasons God Might Hide
    4. Other Responses
  3. Divine Hiddenness and Specific Faith Traditions
  4. Other Problems of Divine Hiddenness
  5. References and Further Reading

1. Arguments from Divine Hiddenness

As implied above, there is no single “argument from divine hiddenness”, but instead there are several arguments from divine hiddenness. Assuming that the primary concern of proponents of arguments from divine hiddenness involves nonbelief as it occurs in the actual world, the simplest version of a deductive argument from divine hiddenness might be formulated in the following way:

IP: If God exists, then nonbelief does not occur.

EP: Nonbelief occurs.

Therefore,

C: God does not exist.

While no one has proposed a version of the argument as simple as this, it demonstrates the basic structure that virtually all deductive arguments from hiddenness have at their core. First, they have an “incompatibility” premise (IP) which identifies two states of affairs that are supposed to be incompatible with each other: the existence of God (on some conception of “God”) and the occurrence of some kind of “nonbelief phenomenon”. A nonbelief phenomenon could include patterns of nonbelief, such as the geographic or temporal distribution of nonbelief, as well as different kinds of nonbelief, such as “nonresistant” or “reasonable” nonbelief. Second, deductive hiddenness arguments have an “existential” premise (EP) which states that the nonbelief phenomenon specified in the incompatibility premise occurs in the actual world. From this, all deductive arguments from hiddenness conclude that God (as understood by the conception of God specified by the incompatibility premise) does not exist.
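
The shared deductive skeleton just described is an instance of modus tollens, and its validity can be checked mechanically. The following sketch formalizes it in the Lean proof assistant; the proposition names `GodExists` and `NonbeliefOccurs` are placeholders chosen here for illustration, not part of any published formulation:

```lean
-- Propositional skeleton of a deductive hiddenness argument.
-- IP (incompatibility premise): if God exists, nonbelief does not occur.
-- EP (existential premise): nonbelief occurs.
-- C  (conclusion): God does not exist, by modus tollens.
theorem simple_hiddenness_argument
    (GodExists NonbeliefOccurs : Prop)
    (IP : GodExists → ¬NonbeliefOccurs)
    (EP : NonbeliefOccurs) :
    ¬GodExists :=
  fun hGod => IP hGod EP
```

Note that the formalization only certifies validity: any philosophical work lies in defending the two premises, not in the inference from them.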

Inductive arguments from hiddenness follow a similar pattern, but differ from the deductive pattern above in at least one of a couple of ways, as seen in the following simplified version of an inductive hiddenness argument:

IPP: If nonbelief exists, then, probably, God does not.

PP: Nonbelief probably exists.

Therefore,

C: God probably does not exist.

Some inductive arguments from hiddenness, in place of an incompatibility premise, have an “improbability premise” (IPP). Improbability premises state that the existence of some nonbelief phenomenon makes some variety of theism less likely than we might have initially thought. Improbability premises may make even bolder claims than this, as seen in the sample improbability premise above (IPP), but they are at least this bold. Improbability premises might be spelled out in a few different ways. For example, some improbability premises might argue that nonbelief (of a certain kind) is more likely on atheism than on theism, all else being equal, and therefore counts as evidence of atheism, while other improbability premises may argue that the probability of God existing given the existence of nonbelief is less than the probability of God existing given no nonbelief. Arguments with an improbability premise can also be called evidential arguments, since they take the occurrence of the nonbelief phenomenon as evidence that God does not exist. Some inductive arguments from hiddenness, in place of an existential premise, may have a “probability premise” (PP), which states that the nonbelief phenomenon in question probably exists in the actual world. And so, there are at least three basic models for inductive arguments from hiddenness, since it is also possible to have an inductive argument from hiddenness with both an improbability premise and a probability premise (as seen above). Depending on which adjustments are made, the conclusion will have to be adjusted accordingly. For further discussion of inductive arguments from divine hiddenness—especially evidential arguments—see Anderson and Russell (2021).
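
One way to make an improbability premise concrete is as a Bayesian likelihood comparison. The following sketch uses purely hypothetical numbers (none are defended in the literature surveyed here) to show how evidence that is more probable on naturalism than on theism lowers the posterior probability of theism:

```python
# Illustrative Bayesian reading of an evidential hiddenness argument.
# All probabilities below are hypothetical placeholders, chosen only
# to show the mechanics of the inference.

prior_theism = 0.5                  # P(theism) before considering the evidence
p_nonbelief_given_theism = 0.1      # P(nonbelief phenomenon | theism)
p_nonbelief_given_naturalism = 0.9  # P(nonbelief phenomenon | naturalism)

# Bayes' theorem, treating naturalism as the only alternative hypothesis:
# P(theism | nonbelief) =
#   P(nonbelief | theism) * P(theism) / P(nonbelief)
numerator = p_nonbelief_given_theism * prior_theism
denominator = numerator + p_nonbelief_given_naturalism * (1 - prior_theism)
posterior_theism = numerator / denominator

print(round(posterior_theism, 2))  # prints 0.1: the evidence lowers P(theism)
```

On these (stipulated) likelihoods, observing the nonbelief phenomenon drops the probability of theism from 0.5 to 0.1; a defender of theism would instead contest the likelihood assignments themselves.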

Considering deductive and inductive arguments, there are already several different arguments from divine hiddenness one could make. The number of potential arguments from divine hiddenness increases considerably when we factor in the two variables in the basic structure of hiddenness arguments. Arguments from divine hiddenness can vary with regards to which conception of God is under consideration, and which nonbelief phenomenon is under consideration. Of course, most potential arguments from divine hiddenness have not been defended, but this framework allows us to recognize the common structure that virtually all arguments from divine hiddenness share.

There is one final and important way that we can draw distinctions between the various arguments from divine hiddenness: how they defend their core premises. Different arguments provide different reasons for thinking that God’s existence (under a certain conception) is incompatible with a particular nonbelief phenomenon. And different arguments will provide different reasons for thinking that the nonbelief phenomenon in question actually occurs.

a. J.L. Schellenberg’s Argument from Divine Hiddenness

The most commonly discussed argument from divine hiddenness was proposed and primarily defended by J.L. Schellenberg. While Schellenberg has gone through numerous formulations of his argument from divine hiddenness, the basic idea has remained the same. Schellenberg’s core argument is that, if God exists, then, necessarily, there will be no nonresistant nonbelief. But since nonresistant nonbelief exists, we must conclude that God does not.

Viewing this argument in light of the framework established above, we can identify the nonbelief phenomenon that Schellenberg is concerned with as nonresistant nonbelief. Initially, Schellenberg used the terms reasonable nonbelief and inculpable nonbelief (Schellenberg 1993). But later he clarified reasonable/inculpable nonbelief as one species of a broader kind of nonbelief which he labeled “nonresistant nonbelief” (Schellenberg 2007). It should be noted that the term “nonresistant nonbelief” is primarily employed by Schellenberg (and those defending or responding to his argument) for the sake of simplicity. “Nonresistant nonbeliever” is really a shorthand for someone who is (i) not resisting God and (ii) capable of a meaningful conscious relationship with God, and yet who does not (iii) believe that God exists (Schellenberg 2007).

On Schellenberg’s usage, “God” refers to the concept you end up with if you suppose that there is an “ultimate being” who is also a person. From this, it follows at least that God is a perfect person. That is, God is perfect in all the properties appropriate to persons, including (but not limited to) power, knowledge, creativity, and love. Schellenberg takes this concept of God to be the traditional concept of God as understood by major western monotheistic religions, but warns against taking too much from tradition. He argues that we should not include anything in our idea of God that conflicts with the central idea of God as the ultimate, perfect person.

So, the core of Schellenberg’s argument is that, necessarily, God (as the ultimate, perfect person) would ensure that there are no nonresistant nonbelievers, but since there actually are nonresistant nonbelievers, we must conclude that God does not exist.

One distinguishing feature of Schellenberg’s argument is the way he supports the claim that God would ensure that there are no nonresistant nonbelievers. He claims that this simply follows as a logically necessary entailment of his concept of God as the ultimate, perfect person. Since a perfect personal God would be perfect in love, any personal creatures God creates would, necessarily, be created as an expression of that love, and towards the end of loving and being loved by God. In other words, God would create persons for the purpose of engaging in positively meaningful conscious relationships with God. It follows, necessarily, according to Schellenberg, that such a God would do everything He could to prevent any created persons from being deprived of the possibility of such relationship. Since a conscious relationship with God necessarily requires that one believes that God exists, God would do everything He could to prevent created persons from failing to believe that God exists. And since God’s perfections would also include perfect power and knowledge, it follows that only culpable resistance to belief on the part of an individual person could prevent God from ensuring that all persons believe in His existence. It should be noted that Schellenberg makes a concession here by assuming that God might allow culpably resistant nonbelief, because he knows many will balk at the idea that God would either force belief in His existence on those who resist it or, alternatively, create humans without the free will to resist God in the first place. Schellenberg himself, however, doubts that God would create free will of the sort that would allow resistance to belief in God (Schellenberg 2004, 2007). Keeping that concession in mind, it follows that, if such a God exists, then, necessarily, the only kind of nonbelief that should exist is resistant nonbelief. In other words, there would be no nonresistant nonbelief.

The idea that God would ensure that no nonresistant nonbelief occurs because He would always make a relationship with Himself available to all created persons has become the focal point of much of the hiddenness literature, with much discussion focusing on defending or refuting this idea specifically. But it is important to note that the class of hiddenness arguments, in general, do not obviously stand or fall based solely on whether this idea turns out to be right.

A common formal statement of Schellenberg’s hiddenness argument is as follows:

  1. Necessarily, if God exists, anyone who is (i) not resisting God and (ii) capable of meaningful conscious relationship with God is also (iii) in a position to participate in such relationship (able to do so just by trying). (PREMISE)
  2. Necessarily, one is at a time in a position to participate in meaningful conscious relationship with God only if at that time one believes that God exists. (PREMISE)
  3. Necessarily, if God exists, anyone who is (i) not resisting God and (ii) capable of meaningful conscious relationship with God also (iii) believes that God exists. (From 1 and 2)
  4. There are (and often have been) people who are (i) not resisting God and (ii) capable of meaningful conscious relationship with God without also (iii) believing that God exists. (PREMISE)
  5. God does not exist. (From 3 and 4) (Schellenberg 2007)
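
The validity of this first-order formulation can likewise be checked mechanically. The following Lean sketch encodes premises 1, 2, and 4 and derives the conclusion; the predicate names are illustrative encodings chosen here, not Schellenberg’s own notation:

```lean
-- A first-order sketch of Schellenberg's hiddenness argument.
-- Predicate names are placeholders for the conditions in the text.
theorem schellenberg_sketch {Person : Type} (GodExists : Prop)
    (Nonresistant Capable InPosition Believes : Person → Prop)
    -- Premise 1: if God exists, every nonresistant person capable of
    -- relationship with God is in a position to participate in it.
    (p1 : GodExists → ∀ x, Nonresistant x → Capable x → InPosition x)
    -- Premise 2: being in a position to participate requires belief.
    (p2 : ∀ x, InPosition x → Believes x)
    -- Premise 4: some nonresistant, capable person does not believe.
    (p4 : ∃ x, Nonresistant x ∧ Capable x ∧ ¬Believes x) :
    ¬GodExists :=
  fun hGod =>
    match p4 with
    | ⟨x, hNR, hCap, hNB⟩ => hNB (p2 x (p1 hGod x hNR hCap))
```

As before, the proof only records that the conclusion follows from the premises; the philosophical dispute concerns whether the premises are true.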

For an in-depth examination of Schellenberg’s hiddenness argument, how its fine details have developed over time, and much of the discussion it has prompted, see Veronika Weidner (2018).

b. Other Arguments from Divine Hiddenness

While most of the literature on the problem of divine hiddenness concerns itself with Schellenberg’s hiddenness argument, there are several other notable hiddenness arguments.

One of the earliest hiddenness arguments of note comes from Theodore Drange. Drange’s argument is inductive, rather than deductive, and it specifically targets God as conceived by evangelical Christianity. The nonbelief phenomenon he claims as evidence against the existence of such a God is the sheer amount of nonbelief in that God. The idea is that, on evangelical Christianity, it seems that God wants everyone to believe that He exists, and it seems very likely that God could do much more to ensure that a much greater number of people believe that He exists. If those two propositions are true, then it’s very likely that, if such a God did exist, there would be far fewer people who lacked belief in the existence of that God. God would ensure that there were fewer nonbelievers. But since so many people do lack belief in God’s existence, Drange concludes that it’s very likely that God (as conceived of by evangelical Christians) does not exist (Drange 1993).

Drange offers several reasons for thinking that God (as conceived by evangelical Christians) would want everyone to believe in His existence, but the one with perhaps the most intuitive force is that, according to the Bible, belief in God is necessary to achieve salvation. Since evangelical Christians also believe that God wants everyone to be saved, it follows that God would want to ensure that everyone believes in His existence. A somewhat similar argument is made by Greg Janzen (2011).

Stephen Maitzen proposes an evidential hiddenness argument in which he suggests that the uneven distribution of theism, both geographically and historically, is much more likely on naturalism than on theism. The sort of theism he takes as his target conceives of God as a personal creator who is unsurpassably loving. He argues that, on this sort of theism, it would be quite unlikely for it to be the case that, in some parts of the world, theism is exceptionally common, while in other parts of the world, theism is exceptionally uncommon. But he notes that this is exactly the kind of world in which we find ourselves. As an overwhelmingly Muslim nation, Saudi Arabia, for example, is around 95% theistic, while Thailand, an overwhelmingly Buddhist nation, has very few theists, at most comprising 5% of the population (Maitzen 2006). This disparity of theistic belief between different parts of the world seems surprising if God exists and loves everyone equally. But it is not at all surprising on naturalism, since, on naturalism, theism primarily spreads person-to-person and its spread is heavily influenced by history, culture, and geographic proximity between groups of people, and there is no force for spreading theism that can transcend those influences. Thus, the uneven distribution of theism provides some evidence for naturalism and some evidence against theism.

While Maitzen never criticizes other hiddenness arguments, he notes that one virtue of his argument is that it easily avoids most of the criticisms made against other arguments from divine hiddenness. For example, while certain responses to hiddenness arguments claim that nonbelief arises from one’s own culpability, Maitzen notes that we can’t extrapolate that to a global scale. It seems highly unlikely that one part of the world would have such a high concentration of culpable nonbelievers compared with another part of the world. And even if God has some good reason for allowing nonbelief (see below for more on these responses), it would still be surprising that God’s reasons allowed for so much nonbelief in one part of the world but allowed for very little nonbelief in another part of the world (Maitzen 2006).

Another notable hiddenness argument comes from Jason Marsh. He considers a nonbelief phenomenon he dubs “natural nonbelief”. Natural nonbelief is the belief-state prehistoric humans were in (with regards to God’s existence) prior to the emergence of anything even close to the relatively modern concept of monotheism. The idea here is that, for a large proportion of human history, humans lacked not only the concept of an ultimate creator God with unsurpassable power, intelligence, and love, but even the concept of a “theistic-like” god: a “high” god with significant power, knowledge, and so on (though not to the maximal degree), who may or may not exist alongside but above lower gods. And so, for much of human existence, all humans were nonbelievers in the existence of not just a monotheistic God, but also any sort of high god at all. To use Marsh’s term, they were natural nonbelievers. Marsh thinks that the existence of so many natural nonbelievers is highly unlikely if a God exists who is conceived of in the way that most modern monotheists conceive of Him, since natural nonbelief precludes the possibility of personal relationship with such a God (whereas believers in a “high god” of some kind, even though it would turn out they do not fully understand God, might still have the possibility of a relationship with God as conceived of by monotheists, if such a God existed). By contrast, the occurrence of natural nonbelief is not at all unlikely if naturalism is true. Thus, the existence of so much natural nonbelief provides evidence against theism and for naturalism (Marsh 2013).

2. Responses to the Arguments from Divine Hiddenness

Given that, at their core, hiddenness arguments have two central premises, any response to an argument from divine hiddenness must deny one of those premises. Responses to deductive hiddenness arguments must deny either the claim that the existence of God is incompatible with the existence of some stated nonbelief phenomenon, or the claim that the stated nonbelief phenomenon occurs. Responses to inductive arguments from divine hiddenness may involve several different strategies, depending on what sort of inductive argument is in question. For inductive arguments with an “improbability premise”, responses may argue that the existence of God is not rendered significantly less likely by the occurrence of some nonbelief phenomenon. For inductive arguments with a “probability premise”, responses may argue that the probability that the nonbelief phenomenon in question really does occur in the actual world is lower than the probability premise claims. And for inductive arguments with one probabilistic premise and one non-probabilistic premise, responses may simply deny the non-probabilistic premise; those responses will resemble responses to deductive hiddenness arguments. In practice, many replies to hiddenness arguments are implied to be relevant to both inductive and deductive hiddenness arguments, even though they often directly address a deductive hiddenness argument.

Although most prominent hiddenness arguments, such as those defended by Drange, Maitzen, and Marsh, have received some direct engagement in the literature, most responses to hiddenness arguments target Schellenberg’s hiddenness argument specifically, or some generalized hiddenness argument that strongly resembles Schellenberg’s hiddenness argument. This usually involves one of three strategies. First, some authors attempt to cast doubt on the claim that Schellenberg’s proposed nonbelief phenomenon actually occurs. Second, some authors argue that personal relationship with God (of a positively meaningful sort) is possible even without belief that God exists, and thus for that reason God does not need to eliminate nonresistant nonbelief, which would undercut Schellenberg’s support for his claim that if God exists then there is no nonresistant nonbelief. Third (and most commonly), many authors try to propose some reason God might have for withholding (at least temporarily) the possibility of a meaningfully positive conscious relationship with God. And if God has a reason for withholding such a relationship, that reason will also constitute a reason that God might sometimes withhold belief from people despite a lack of resistance on their part.

The following section discusses each of these kinds of responses in turn.

a. There is no Nonbelief (of the Wrong Kind)

Some authors have responded to Schellenberg by denying that there is reasonable nonbelief, inculpable nonbelief, or nonresistant nonbelief. Given that Schellenberg did not initially use the term “nonresistant nonbelief” (as defined above), some responses of this kind do not directly target nonresistant nonbelief, but nonbelief understood in light of previous terms used by Schellenberg, such as “reasonable nonbelief” or “inculpable nonbelief”. Schellenberg maintains that many such responses have misunderstood what sort of nonbelief phenomenon he had in mind, which motivated him to use the label “nonresistant nonbelief” and further clarify what exactly he means by that label (Schellenberg 2007). Due to this misunderstanding, responses targeting (for example) “reasonable nonbelief” might fail to address the concept Schellenberg actually has in mind (i.e. nonresistant nonbelief, as defined above).

Douglas Henry, for example, argues that reasonable nonbelief is not likely to exist. A reasonable nonbeliever who is aware of the question of God’s existence would recognize the importance of that question and take the time to adequately investigate whether God exists. This should involve not just armchair philosophy, but a more active search for evidence or arguments outside of what one is able to consider on one’s own. But Henry notes that, given the importance of the question of God’s existence, it is unlikely that many nonbelievers (if any) have conducted an adequate investigation into finding the answer. He concludes that it is not likely that a large number of reasonable nonbelievers exists, and adds that he suspects that there are no reasonable nonbelievers (Henry 2001).

But, as noted above, Schellenberg has clarified that the sort of nonbelief he is concerned with is nonresistant nonbelief more broadly, not just reasonable nonbelief. One might fail to adequately investigate the question of God’s existence, not due to any active resistance to God, but due to something else: for example, a failure to properly recognize the priority of the question compared with other pursuits in one’s life, or an inability to adequately investigate or even to recognize whether one’s investigation is adequate. There are also geographically isolated nonbelievers who do not even know about the question of God’s existence, and so do not know there is anything to investigate (Schellenberg 2007). So, it seems that, even if there is a reason for thinking that reasonable nonbelief does not exist, it is still possible that nonresistant nonbelief exists, and Henry’s reply fails to demonstrate that Schellenberg’s hiddenness argument is unsound.

Ebrahim Azadegan provides an argument denying that inculpable nonbelief exists. His idea is that every case of nonbelief occurs due to sin. By acting wrongly, we develop wrong desires and a proneness to act in ways we know are wrong. Some of those desires conflict with what we know we ought to do if God exists (Azadegan 2013). How might this demonstrate that there is no inculpable nonbelief? It could be that desires that conflict with what we know we ought to do if God exists can blind us when assessing the evidence for God’s existence, leading us to favour nonbelief. If one’s sin in this way leads one to nonbelief, then that nonbelief is culpable, argues Azadegan. Thus, since plausibly everyone has done wrong, it might be that all nonbelief results from a bias towards nonbelief, due to a desire to act in ways that are wrong if God exists.

Tyler Taber and Tyler Dalton McNabb offer a somewhat similar response, arguing that divine hiddenness is not a problem for reformed epistemologists. They argue that divine hiddenness simply amounts to the problem of sin’s consequences (Taber and McNabb 2016). V. Martin Nemoianu also argues along similar lines that there are no nonresistant nonbelievers, defending Pascal’s view that God’s hiddenness from us is primarily due to our own choices (Nemoianu 2016).

While the above arguments were global in scale, in that they attempt to use premises that could apply to all apparent nonresistant nonbelievers, there is another style of argument that may cast doubt on claims that certain particular individuals are nonresistant nonbelievers, even when those individuals honestly self-identify as nonresistant nonbelievers. This sort of argument appeals to evidence from various sources that seems to show that people often need more evidence than expected in order to accept beliefs of certain kinds. Miles Andrews, for example, cites findings from psychology that show that we are bad at predicting how we will respond to being confronted with evidence. While we may think, “If I had evidence of type X, then I would believe,” we are often wrong about that, and may not believe even when confronted with evidence of type X (Andrews 2014). Another example of an argument like this comes from Jake O’Connell, who points to historical cases of people witnessing what they themselves believed to be miracles, and yet who did not come to believe that God exists. O’Connell argues that this provides some reason for thinking that an increase in miracles would not necessarily reduce the number of nonbelievers. And even those who claim that they would believe if they witnessed a miracle might be mistaken about that (O’Connell 2013). Ultimately, these “local” arguments seem unlikely on their own to demonstrate that nonresistant nonbelief does not exist (or even that its existence is unlikely). But what they might demonstrate is that some self-professed nonresistant nonbelievers may require more than just additional evidence in order to come to a belief in God; one impediment could plausibly be unrecognized resistance to belief. If this is right, then there are plausibly fewer nonresistant nonbelievers than we might initially think.

An objection of this first sort has also been raised against Jason Marsh’s argument from “natural nonbelief”. Recall that Marsh claimed that prehistoric humans were “natural nonbelievers”, which means that they had no opportunity for a personal relationship with God because they lacked the very concept of not just monotheism, but also any “theistic-like” belief that posited a “high god”. Matthew Braddock questions Marsh’s claim that there really was overwhelming natural nonbelief amongst prehistoric human beings. As just one of several critiques of Marsh’s support for that claim, Braddock notes that the findings of cognitive science of religion suggest that humans are predisposed to several different supernatural beliefs which, taken together, could plausibly lead one to believe in a high god. So rather than providing evidence that prehistoric humans were natural nonbelievers, cognitive science of religion seems to provide evidence that many of them might not have been (Braddock 2022).

b. Personal Relationship with God Does Not Require Belief that God Exists

Another relatively common type of response to Schellenberg-style hiddenness arguments involves the claim that one can enjoy a positively meaningful personal relationship with God even without the explicit belief that God exists. The idea here is to undercut Schellenberg’s reason for thinking that God would ensure that there are no nonresistant nonbelievers. Schellenberg’s stated reason is that God would do what He could to ensure that all capable persons can participate in a positively meaningful conscious relationship with God. Since Schellenberg claims that the belief that God exists is necessary for such a relationship, God would therefore do what He could to eliminate nonbelief in God. And this means that the only nonbelievers would be resistant. But, as some have objected, if belief in God is not necessary for such a relationship, then God will not necessarily do what He can to eliminate all nonresistant nonbelief.

There’s an important distinction to make between conscious and nonconscious relationships. Conscious relationships involve an awareness of the relationship, while nonconscious relationships (if they are possible) involve no awareness of the relationship. Schellenberg’s focus is on conscious relationships specifically. He thinks that God would do what He could to ensure that all capable creatures can participate in a conscious relationship with God (that is positively meaningful). And so, because a conscious relationship with God involves an awareness of that relationship, Schellenberg thinks that it necessarily follows that such a relationship with God involves the belief that God exists (Schellenberg 2007).

We can therefore identify two kinds of responses that attempt to argue that personal relationships with God do not require the belief that God exists. One type of response focuses on conscious relationships, and the other type of response focuses on nonconscious relationships. These two types of responses will be discussed in that order.

The first type of response attempts to demonstrate that conscious personal relationships with God do not require the belief that God exists. This type directly denies Schellenberg’s premise that states that such relationships do require the belief that God exists, and so if such a response succeeds, it demonstrates that Schellenberg’s argument (as stated) is unsound. Call these “conscious” responses.

The second type of response attempts to demonstrate that a nonconscious personal relationship with God is possible and that it does not require the belief that God exists. This type does not directly contradict Schellenberg’s premise. Responses of this type, therefore, require an argument (or an assumption) that a nonconscious relationship with God is not relevantly different from a conscious relationship (or that, if there is a relevant difference, that difference can somehow be made up for), and so God would not necessarily ensure that a conscious relationship with Him is possible for every capable person. A nonconscious relationship will do fine. Call these responses “nonconscious” responses.

Some authors have proposed conscious responses. Andrew Cullison, for example, describes a hypothetical case in which two people—Bob and Julie—develop a romantic relationship online. They have long personal discussions, encourage and comfort each other, and generally engage in activities indicative of friendship and romantic relationships. But upon learning that sophisticated computer programs can simulate human conversation so well that humans cannot tell the difference between the program and a real person, Bob begins to doubt that Julie exists, and his doubt becomes strong enough that he no longer feels justified in the belief that she exists. Despite lacking belief, he holds out hope that she exists and decides to continue his interactions with her. Ultimately, it turns out that Julie does exist, and it seems that Bob and Julie were able to have a personal relationship with each other despite Bob failing to believe that she exists. Cullison further reasons that if this can be true of two humans, then it can also be true for divine-human relationships (Cullison 2010). Ted Poston and Trent Dougherty defend a similar argument. They maintain that even a person with low confidence (well below full belief) in the existence of another might have a meaningful, personal relationship with that other person. And this may be so even if one person has no idea who that other person might be—they can only identify them by their interactions. They provide an example of two prisoners tapping back and forth on a shared prison wall. For all the prisoners know, they might be hallucinating or mistakenly identifying random patterns as purposeful and caused by another person. So, neither prisoner is certain that the other person exists, or even who that person might be, and yet Poston and Dougherty suggest that they plausibly share in a personal relationship (Dougherty and Poston 2007).

In both Cullison’s example and Poston and Dougherty’s example, the person who lacks belief is at least aware that they might be personally relating to someone else. Thus, their arguments attempt to establish that conscious personal relationships with God are possible without (full) belief that God exists. Terence Cuneo, on the other hand, argues that a personal relationship with God might be possible even for those who are entirely unaware of that relationship. Thus, Cuneo’s argument is that nonconscious personal relationships with God are possible even without the belief that God exists. Cuneo argues that the vitally important elements of divine-human relationships do not rely on the belief that God exists. His argument relies on understanding God as importantly different from created persons. So, for example, unlike with human persons, we can unknowingly experience God through our experience of the world. And there are also actions we can do in the world towards other created persons which can count as being done towards God. For example, the Bible claims that doing good towards other created people can constitute doing good towards God (see Matthew 25:34-40). If we can experience God through the world, and if we can do good towards God by doing good to others, then there is a kind of reciprocal relationship that at least parallels positively meaningful personal relationships. Cuneo argues that one might relate to God in ways like this without any awareness of God, and without any awareness that one might be relating to God. And, if this is so, then one might have a nonconscious relationship with God without any belief that God exists (or even that God might exist) (Cuneo 2013).

c. Greater Goods and Other Reasons God Might Hide

The most common strategy for responding to arguments from divine hiddenness is to argue that there is some reason God allows nonresistant nonbelief (or whatever nonbelief phenomenon is in question). Usually, this involves identifying some good thing that God wants to bring about, but which He cannot bring about without also bringing about (or risking) an undesirable nonbelief phenomenon. But there are also responses that cite a negative state of affairs that God wants to prevent as God’s reason for allowing such nonbelief. These responses argue that such negative states of affairs cannot be prevented (or cannot be guaranteed to be prevented) without allowing some undesirable nonbelief phenomenon. There are also responses which cite neither a greater good nor a prevented evil as God’s reason for allowing an undesirable nonbelief phenomenon.

The first category of proposed reasons for God’s hiddenness is composed of “greater goods” responses. The first question one might ask regarding the notion of a greater goods response is, “Greater than what?” This is not always addressed each time an author proposes a greater goods response, but plausibly the good for the sake of which God might allow nonbelief of some kind should be greater than the total of the lost value and the negative value caused by the nonbelief phenomenon in question. The lost value might include the missed opportunity for a meaningfully positive conscious relationship with God during each nonbeliever’s period of nonbelief (at least for Schellenberg-style hiddenness arguments), and the negative value could include any psychological pain experienced by nonbelievers who want to believe.

Depending on one’s view of God’s foreknowledge, it may not be a simple case of weighing lost and negative value against the positive value of the greater good. If one is an open theist, for example, then one will instead have to determine the probability of the greater good occurring if God allows for its possibility, and weigh that against the risk of nonbelief that comes with God allowing for the possibility of that greater good, in order to determine the expected value of God trying to bring about the greater good. For simplicity, the following will discuss greater goods responses under the assumption that God has perfect foreknowledge.

But if the value of the good must outweigh the value of the sum of the lost value and negative value of the nonbelief phenomenon, there is a potential problem. One might wonder if there could even possibly be a greater good. This problem is particularly relevant to Schellenberg-style hiddenness arguments. Schellenberg argues that plausibly there could not be any greater goods. A positively meaningful conscious relationship with God would be the greatest good for created persons (Schellenberg 2007). And so, one might think that greater goods responses must fail in principle; one doesn’t have to analyze the details of a greater goods response to know whether it fails. It fails just by virtue of being a greater goods response. Luke Teeninga attempts to address this problem, arguing that a lesser good, even at the temporary expense of a greater good, may actually lead to more value overall (Teeninga 2020). Nevertheless, most authors do not address the question of whether there could even possibly be a good great enough to justify God in allowing nonresistant nonbelief, and instead just attempt to propose such a good (or at least a good that constitutes part of God’s reason).

Several goods have been proposed as the reason (or part of the reason) that God allows undesirable nonbelief phenomena. One such good is morally significant free will. The idea here is that the greater awareness one has of God, the greater the motivation one has to act rightly (due to a desire to please God, a fear of punishment for doing wrong, and so forth), and therefore if God were too obvious, we would have such a strong motivation to do good that doing good would cease to be a true choice. This has been defended by Richard Swinburne (1998). Helen De Cruz also addresses this question, examining it through the lens of cognitive science of religion. She suggests that there is some empirical evidence for the claim that a conscious awareness of God heightens one’s motivation to do good (De Cruz 2016).

A similar idea to Swinburne’s can be found in John Hick’s “soul-making theodicy”, which is primarily presented as a general response to the problem of evil. Hick argues that in order to experience the highest goods possible for created beings (including the deepest kind of personal relationship with God), humans must begin in an imperfect state and, through moral striving, develop virtuous characters. Hick’s idea connects to hiddenness because part of the imperfect state that Hick describes necessarily involves being at an “epistemic distance” from God because, as with Swinburne, Hick argues that such epistemic distance is necessary to allow for the capacity for genuine moral choice that is necessary for the development of virtue (Hick 1966).

Richard Swinburne proposes other goods, including the opportunity to find out for oneself whether there is a God and the opportunity for believers to help nonbelievers to come to believe in God (Swinburne 1998). In both cases, it seems there must be nonbelief of some kind to begin with in order for these goods to be possible. Travis Dumsday builds on the latter response (which has become known as the “responsibility” response) and suggests that a believer’s friendship with God is greatly benefited if they can together pursue joint aims. Bringing knowledge of God to others is one such aim (Dumsday 2010a). But of course, this joint aim is impossible if there are no nonbelievers. Dustin Crummett expands on the responsibility response further, noting that communities can be responsible for fostering an atmosphere where individuals within or near to that community are more likely to experience God’s presence (or, conversely, by neglecting our duties, we can create an atmosphere where individuals are less likely to experience God’s presence). These duties go beyond direct evangelization through natural theology or preaching the gospel, and include acts such as encouraging one another in prayer, joining together in collective worship, doing good deeds, and building one another up (Crummett 2015).

Another good often cited as a reason God might allow nonbelief is an increased longing for God (for example, Sarah Coakley 2016 and Robert Lehe 2004). Some authors suggest that spiritual maturity can sometimes require God to hide Himself. A temporary period of God’s hiddenness from us may increase the longing we have for God, which may ultimately be necessary to grow deeper in our relationship with Him.

Several other authors have proposed goods apart from those listed. Aaron Cobb suggests that God hides to increase the opportunity to practice the virtue of hope (Cobb 2017). Travis Dumsday suggests that salvation itself may require God to make Himself less obvious, if salvation requires faith, since plausibly faith requires at least enough doubt to make nonbelief warranted (Dumsday 2015). Andrew Cullison argues that the opportunity for acts of genuine self-sacrifice may naturally result in nonresistant nonbelief, since, in a world where everyone is psychologically certain that a perfectly just God exists, everyone would know that God would compensate us in the afterlife if we sacrificed our lives for the sake of others. Genuine self-sacrifice, Cullison argues, requires that the world contains enough room to doubt God’s existence so that there is enough room for one to think that one truly is accepting the end of one’s own consciousness for eternity when one dies to save another (Cullison 2010). Kirk Lougheed connects the hiddenness literature to the literature on the axiology of theism (the question of what value, lack of value, or negative value might exist due to God’s existence or nonexistence) by arguing that the experience of many of the goods that “antitheists” claim would come from the nonexistence of God (such as privacy, independence, and autonomy) can actually obtain if God exists but hides (Lougheed 2018).

Some authors also propose greater goods responses that specifically apply to hiddenness arguments apart from Schellenberg’s. Max Baker-Hytch, in response to Stephen Maitzen’s demographics of theism argument, suggests that one good thing God wants is for humans to be mutually epistemically dependent on one another. That is, God wants us to rely on each other for our knowledge. Baker-Hytch argues that if mutual epistemic dependence is a good God wants, then it would be no surprise that there is an uneven distribution of theism, both geographically and temporally. In a world where we rely on each other for knowledge, but people are separated from each other geographically and temporally, what one group tends to believe may very likely not reach another group (Baker-Hytch 2016). Kevin Vandergriff proposes a good as a response to Jason Marsh’s argument from natural nonbelief. He suggests that natural nonbelievers, who had no opportunity for belief in God, had instead an opportunity to enjoy particularly unique kinds of meaningfulness. For example, natural nonbelievers had the opportunity to be part of what enabled later humans to relate to God. The idea is that religious concepts developed over time to eventually give rise to theism as understood today, and that natural nonbelievers had the opportunity to contribute to that development, which is in itself a meaningful role to play in history (Vandergriff 2016).

After greater goods responses, the most common type of reason proposed for why God might allow undesirable nonbelief phenomena is that God does so in order to prevent some negative state of affairs. Strictly speaking, these can all be thought of as greater goods responses as well, because in each case God could prevent the evil in question by withholding some good or another (for example, free will) or, in the most extreme case, God could prevent the evil in question by creating nothing at all. Several reasons proposed of this sort suggest that God hides from some people to prevent those people from making morally sub-optimal responses of one sort or another upon learning that God exists.

There are several kinds of morally sub-optimal response God might want to prevent. To start off, despite their nonresistance while nonbelievers, some nonresistant nonbelievers may nevertheless reject God upon learning of His existence and grasping a full understanding of who He is and what acceptance of Him means for their lives (See Dumsday 2010b and Howard-Snyder 1996 and 2016). Others might accept God but for the wrong reasons—for their own benefit, for example, rather than for His own sake. And even those who accept God for the right reasons might not do so due to their own moral merit (Howard-Snyder 1996 and 2016). God might also want to prevent some people from resenting Him due to the evil they see or experience in the world (Dumsday 2012a). He might also be motivated to prevent people from using the experience of God merely as a drug, rather than attempting to foster a real loving relationship with God (Dumsday 2014b). God might also care about human beings fostering their personal relationships with other created persons, and thus in some cases He might hide to prevent some people from neglecting those relationships due to a sole focus on God (Dumsday 2018).

There are other proposed negative states of affairs God might want to prevent by hiding that do not fit neatly into the previous category. For example, Michael J. Murray suggests that God hides because, if His existence were too obvious, created persons would be coerced into following God, and God wants to prevent this (Murray 1993). Travis Dumsday argues that one who sins with knowledge of God’s existence is more culpable than one who sins without knowledge of God’s existence, and so God might hide to prevent certain people from gaining more moral culpability (Dumsday 2012b). Dumsday suggests that God might also hide for the benefit of resistant nonbelievers. If God made Himself known to all nonresistant nonbelievers, then that could constitute evidence for God’s existence to the resistant nonbelievers, who may respond by doubling down in their resistance (or they may respond in other undesirable ways, such as acting in morally bad ways to spite God). God thus hides to prevent this negative state of affairs, and so that He can work in the hearts of the resistant until they are ready to accept His revelation of Himself.

It is important to note that most (if not all) of the negative states of affairs proposed as reasons God might hide could be prevented by God removing human free will. Schellenberg has argued elsewhere that God would not create human free will (of this kind) precisely because of all of the negative states of affairs that it makes possible (Schellenberg 2004). Nevertheless, it remains a standard practice in the literature on the problem of divine hiddenness to appeal to free will of the kind that risks such negative states of affairs.

In addition to greater goods God wants to bring about, and negative states of affairs God wants to prevent, there are other reasons proposed as the explanation (or part of the explanation) for why God allows undesirable nonbelief phenomena. Paul Moser suggests that propositional knowledge of God’s existence does no good in bringing created persons closer to a personal relationship with God. Thus, God instead works to ready a person’s will to accept a relationship with Him, rather than working with their minds to accept belief that He exists (Moser 2014). Ebrahim Azadegan appeals to a similar idea, arguing that, if God’s love includes eros (the kind of love typical of intimate relationships) then created persons must be more than merely nonresistant to engage in a personal relationship with God; they must also act (for example, repent, pray, reflect on revelation). And so, God must work on people’s hearts, not their minds (Azadegan 2014). For their arguments to succeed, Moser and Azadegan must both assume that all apparently nonresistant nonbelievers (including any who really are nonresistant) would not accept a relationship with God upon learning of His existence, and that their current lack of a personal relationship with God is, in fact, an issue of the heart rather than of propositional knowledge (even though, in some cases, they would not accurately be described as “resistant” to a relationship with God).

The final kind of reason to discuss that is cited as an explanation for why God might allow undesirable nonbelief phenomena is that some of God’s other attributes might explain God’s motivation for allowing those nonbelief phenomena. For example, Travis Dumsday suggests that God’s justice may be what explains divine hiddenness, rather than His love. According to Dumsday, we may not deserve knowledge of God’s existence, and so that is why God does not do more to reveal Himself to us. While Schellenberg thinks that, in this case, God’s love should bring Him to offer us more than what we deserve, Dumsday thinks that this prioritizes God’s love over His justice and that we can’t be so certain that love would always trump justice (Dumsday 2014).

Michael Rea also argues that God’s other attributes may explain why God does not make His existence more obvious to everyone. Rea appeals to God’s personality. It may be possible that God has a distinct personality, and furthermore it may be very good for God to act in alignment with that personality. Rea says that what we think of as divine hiddenness may actually be better characterized as divine silence. According to Rea, God may be justified in remaining “silent” if doing so is in accordance with His personality, so long as He provides other ways of experiencing His presence. Rea thinks God has done this by providing us with liturgy (especially the Eucharist) and with the biblical narrative (Rea 2009).

d. Other Responses

There are other kinds of responses to hiddenness arguments that do not fit neatly into the previous categories. The first of these, which was borrowed from the literature on the problem of evil, is “skeptical theism”. Like the previous category of responses, skeptical theism does suggest that God may have a reason (or reasons) for remaining hidden, but, unlike the previous category, skeptical theism does not claim to know what that reason (or those reasons) are (or even what they might be). Skeptical theists argue that no one is in a position to know whether or not God has reasons for allowing nonresistant nonbelief. And so long as, for all we know, God might have a reason, we cannot conclude that the existence of nonresistant nonbelief is strong evidence against God’s existence (McBrayer and Swenson 2012).

One final kind of response claims, contra Schellenberg, that divine love does not entail that God would be open to a personal relationship with all created persons at all times. These sorts of arguments attempt to respond specifically to Schellenberg-style hiddenness arguments. Michael Rea, for example, argues that Schellenberg’s understanding of divine love, which is heavily analogous to human parental love, is mistaken, and pays very little attention to the ways many theologians have understood divine love throughout history. God is, according to the tradition, completely transcendent, and beyond human comprehension. Divine love must be understood in this light, and thus Rea argues that human parent-child relationships are a bad analogy for the kind of relationship a created person might have with God. And so, he concludes that God would not always be open to a relationship with every created person, at all times, if “relationship” is understood how Schellenberg understands it (Rea 2016).

Ebrahim Azadegan also employs the idea that God’s love might not entail openness to a personal relationship for all created persons, at all times. He frames it in terms of a dilemma: either God’s love is pure “agape” (benevolent love) or it also includes “eros” (the kind of love typical in intimate relationships). As mentioned previously, Azadegan thinks that created persons must be much more than merely nonresistant in order to be ready for personal relationship with God if God’s love includes eros. But if God’s love is purely agape, then there is no room for personal relationship with God. God, in that case, would be a purely benevolent giver, with no need for reciprocal relationship (Azadegan 2014).

3. Divine Hiddenness and Specific Faith Traditions

While Schellenberg intends his argument to apply to any being who could rightly be called “God”, most of the literature regarding hiddenness arguments concerns itself with a more-or-less Christian understanding of God, whether implicitly or explicitly. But some authors have tried to address what other faith traditions might say when faced with hiddenness arguments.

Jerome Gellman argues that, in certain understandings of Judaism, God is explicitly understood as hidden by His very nature. He is utterly inaccessible to created persons, and all we can do is yearn for God. Thus, God’s hiddenness is built into the very concept of God (Gellman 2016).

Jon McGinnis looks at hiddenness from the perspective of medieval Muslim philosophers. He argues that Schellenberg’s arguments don’t apply to God as understood in the Islamic tradition. “Love” is not a perfection, according to Islam. And insofar as God might be loving, God is both lover and beloved. “Personal relationship” with created persons would not be sought by God, since there is no relevant sense in which God is a person, nor is there a relevant sense in which God can relate to created persons (McGinnis 2016).

Nick Trakakis addresses the hiddenness argument from the perspective of several eastern religions, including Eastern Orthodox Christianity, Islam, and Hinduism. He argues, like Michael Rea, that western philosophers anthropomorphize God, assuming His attributes are comparable to human attributes. But in these eastern traditions, God is incomprehensible. He is not merely one being amongst other beings; He is utterly different from created beings. One implication here is that, according to these religious traditions, we can’t understand “personal relationship” with God in the sense Schellenberg uses in his argument (Trakakis 2016).

4. Other Problems of Divine Hiddenness

While the “argument from divine hiddenness” refers to a family of arguments for atheism, that term is sometimes used interchangeably with the term “problem of divine hiddenness”. But the “problem of divine hiddenness” may refer to a much broader range of concerns than what has been covered above. Mirroring what is often said about the “problem of evil”, we might identify “theological” and “experiential” problems of divine hiddenness, in addition to the “philosophical” problem that has been the focus above.

“Theological” problems of divine hiddenness differ from philosophical problems in that they are not posed as arguments against God’s existence, but instead as puzzles that need to be solved, usually with the assumption that there is a theistic solution. Until the late 20th century, when philosophers such as Schellenberg and Drange began rigorously defending hiddenness as an argument for atheism, hiddenness had primarily been treated as a theological (or experiential) problem. St. Anselm of Canterbury expresses a theological problem when he writes, “But if you are everywhere, why do I not see you, since you are present?” (Anselm trans. 1995). While there is significant overlap between the theological and philosophical problems of divine hiddenness, one reason for considering them distinct problems is that theological problems would not necessarily be solved just by determining that atheistic hiddenness arguments fail to establish the truth of atheism. One who is concerned with the theological problem of divine hiddenness wants to know why God is (or at least seems to be) hidden from some people. And thus, for example, the skeptical theist response (that we are not in a position to know whether God has reason to remain hidden) is not satisfying to one who is interested in the theological problem of divine hiddenness. Even if it turns out that skeptical theism solves the philosophical problem of divine hiddenness, it plausibly cannot solve the theological problem of divine hiddenness. Nevertheless, many of the ideas considered regarding the philosophical problem are relevant to the theological problem. Consider, for example, responses that aim to propose a good for the sake of which God would be willing to hide. These would plausibly also be relevant to the theological problem.

Experiential problems of divine hiddenness overlap far less with the philosophical problems than theological problems do. An “experiential” problem is the lived experience of someone to whom God seems hidden. It includes the unmet desire, and any suffering, that results from failing to know or experience God or God’s presence. Although some of those who feel that God is hidden from them may find a degree of comfort in considering certain responses to the philosophical problem (if they judge any of those responses plausible), for the most part such responses do little to ease the difficulty of their experiences of hiddenness.

While some might think that philosophy is impotent to address experiential problems of divine hiddenness, some philosophers, including Yujin Nagasawa (2016) and Ian DeWeese-Boyd (2016), have nevertheless attempted to address experiential problems.

5. References and Further Reading

  • Anderson, Charity and Jeffrey Sanford Russell. “Divine Hiddenness and Other Evidence.” Oxford Studies in Philosophy of Religion 10 (2021).
    • Discusses evidential arguments from divine hiddenness, and responds to two kinds of objections to such arguments, suggesting that both objections fail to demonstrate that hiddenness has no evidential bearing on the existence of God.
  • Andrews, M. “Divine Hiddenness and Affective Forecasting.” Res Cogitans 5(1) (2014): 102-110.
    • Argues, by appealing to psychological data, that humans are bad at predicting how we would respond to confronting evidence of God’s existence.
  • Anselm. Proslogion. Translated by Thomas Williams. Hackett Publishing Company, 1995.
    • Features a historical example of a hiddenness sentiment being expressed.
  • Azadegan, Ebrahim. “Divine Hiddenness and Human Sin: The Noetic Effects of Sin.” Journal of Reformed Theology 7(1) (2013): 69-90.
    • Argues that there is no inculpable nonbelief, because sin impairs our cognitive faculties.
  • Azadegan, Ebrahim. “Divine Love and the Argument from Divine Hiddenness.” European Journal for the Philosophy of Religion 6(2) (2014): 101-116.
    • Argues that Divine Love does not entail that God would ensure there are no nonresistant nonbelievers.
  • Baker-Hytch, Max. “Mutual Epistemic Dependence and the Demographic Divine Hiddenness Problem.” Religious Studies 52(3) (2016): 375-394.
    • Responds to Stephen Maitzen’s “demographics of theism” argument, arguing that the actual distribution of theism is to be expected on theism if God wants humans to be mutually epistemically dependent on each other.
  • Braddock, Matthew. “Natural Nonbelief in God: Prehistoric Humans, Divine Hiddenness, and Debunking.” In Evolutionary Debunking Arguments: Ethics, Philosophy of Religion, Philosophy of Mathematics, Metaphysics, and Epistemology, edited by Diego Machuca. Routledge, 2022.
    • Responds to Jason Marsh’s “natural nonbelief” argument, undercutting Marsh’s support for the claim that prehistoric humans were natural nonbelievers.
  • Coakley, Sarah. “Divine Hiddenness or Dark Intimacy? How John of the Cross Dissolves a Contemporary Philosophical Dilemma.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Adam Green and Eleonore Stump, 229-245. Cambridge University Press, 2016.
    • Argues that Divine Hiddenness is actually a unique way that God reveals Himself to us.
  • Cobb, Aaron. “The Silence of God and the Theological Virtue of Hope.” Res Philosophica 94(1) (2017): 23-41.
    • Argues that God may remain silent to make space for humans to cultivate the virtue of hope.
  • Crummett, Dustin. “’We Are Here to Help Each Other’: Religious Community, Divine Hiddenness, and the Responsibility Argument.” Faith and Philosophy 32(1) (2015): 45-62.
    • Builds on the “responsibility argument” developed by Richard Swinburne by noting additional responsibilities humans and faith communities have towards each other that influence individuals’ dispositions to believe in God.
  • Cullison, Andrew. “Two Solutions to the Problem of Divine Hiddenness.” American Philosophical Quarterly 47(2) (2010): 119-134.
    • Argues that personal relationship with God is possible even if one lacks belief that God exists, and proposes that one benefit of divine hiddenness is the possibility for genuine self-sacrifice.
  • Cuneo, Terence. “Another Look at Divine Hiddenness.” Religious Studies 49 (2013): 151-164.
    • Argues that the vitally important elements of divine-human relationships don’t rely on believing at all times that God exists.
  • De Cruz, Helen. “Divine Hiddenness and the Cognitive Science of Religion.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Eleonore Stump and Adam Green, 53-68. Cambridge University Press, 2016.
    • Analyzes certain responses to hiddenness arguments through the lens of Cognitive Science of Religion.
  • DeWeese-Boyd, Ian. “Lyric Theodicy: Gerard Manley Hopkins and the Problem of Existential Hiddenness.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Adam Green and Eleonore Stump, 260-277. Cambridge University Press, 2016.
    • Addresses the existential problem of divine hiddenness by looking at the poetry of G.M. Hopkins.
  • Dougherty, Trent, and Ted Poston. “Divine Hiddenness and the Nature of Belief.” Religious Studies 43 (2007): 183-196.
    • Argues that relationship with God might be possible (for a time) with merely partial, de re belief that God exists.
  • Drange, Theodore. “The Argument from Non-Belief.” Religious Studies 29 (1993): 417-432.
    • Argues that the God of evangelical Biblical Christianity is unlikely to exist given the prevalence of nonbelief.
  • Dumsday, Travis. “Divine Hiddenness and the Responsibility Argument: Assessing Schellenberg’s Argument against Theism.” Philosophia Christi 12(2) (2010a): 357-371.
    • Argues that God might hide in order to make deeper relationships possible with some created persons, by providing them the opportunity to work alongside God to share news of Him with other created persons.
  • Dumsday, Travis. “Divine Hiddenness, Free Will, and the Victims of Wrongdoing.” Faith and Philosophy 27(4) (2010b): 423-438.
    • Argues that God might hide to protect victims of suffering from reacting sub-optimally to knowledge of God’s existence.
  • Dumsday, Travis. “Divine Hiddenness and Creaturely Resentment.” International Journal for Philosophy of Religion 72 (2012a): 41-51.
    • Argues that God might hide to prevent some created persons from resenting God’s greatness.
  • Dumsday, Travis. “Divine Hiddenness as Divine Mercy.” Religious Studies 48(2) (2012b): 183-198.
    • Argues that God hides out of mercy, since we gain greater culpability if we sin with knowledge of God.
  • Dumsday, Travis. “Divine Hiddenness as Deserved.” Faith and Philosophy 31 (2014): 286-302.
    • Argues that God might hide as an expression of His perfect justice, given that we do not deserve knowledge of God’s existence.
  • Dumsday, Travis. “Divine Hiddenness and Special Revelation.” Religious Studies 51(2) (2015): 241-259.
    • Argues that God might hide to make possible salvation through faith.
  • Dumsday, Travis. “Divine Hiddenness and Alienation.” Heythrop Journal 59(3) (2018): 433-447.
    • Argues that God may hide so that we do not neglect our relationships with other humans.
  • Gellman, Jerome. “The Hidden God of the Jews: Hegel, Reb Nachman, and the Aqedah.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Eleonore Stump and Adam Green, 175-191. Cambridge University Press, 2016.
    • Approaches hiddenness from the perspective of Jewish authors, some of whom embrace God as essentially hidden.
  • Henry, Douglas. “Does Reasonable Nonbelief Exist?” Faith and Philosophy 18(1) (2001): 74-92.
    • Argues that reasonable nonbelief plausibly does not exist.
  • Hick, John. Evil and the God of Love. Macmillan, 1966.
    • As part of his general theodicy, Hick argues that humans need to start out in a state that involves a certain epistemic distance from God.
  • Howard-Snyder, Daniel. “The Argument from Divine Hiddenness.” Canadian Journal of Philosophy 26(3) (1996): 433-453.
    • Argues that God hides to prevent created persons from reacting inappropriately in one way or another to knowledge of God’s existence.
  • Howard-Snyder, Daniel. “Divine Openness and Creaturely Non-Resistant Non-Belief.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Eleonore Stump and Adam Green, 126-138. Cambridge University Press, 2016.
    • Continues to build on his 1996 argument and defend it against various criticisms.
  • Janzen, Greg. “Is God’s Belief Requirement Rational?” Religious Studies 47 (2011): 465-478.
    • Argues that the existence of nonbelief is evidence against any kind of theism that requires belief in God for salvation.
  • Lehe, Robert. “A Response to the Argument from the Reasonableness of Nonbelief.” Faith and Philosophy 21(2) (2004): 159-174.
    • Argues that God might hide to intensify one’s longing for God, and thus make one more likely to embrace a personal relationship with God upon revelation of His existence.
  • Lougheed, Kirk. “The Axiological Solution to Divine Hiddenness.” Ratio 31(3) (2018): 331-341.
    • Argues that the experiences of several goods claimed by anti-theists to require atheism are possible even if God exists, so long as He hides.
  • Maitzen, Stephen. “Divine Hiddenness and the Demographics of Theism.” Religious Studies 42 (2006): 177-191.
    • Argues that the actual temporal and geographic distribution of theism is more expected on naturalism than on theism, and so that distribution provides evidence for naturalism and against theism.
  • Marsh, Jason. “Darwin and the Problem of Natural Nonbelief.” The Monist 96 (2013): 349-376.
    • Argues that the existence of early human nonbelief, prior to the advent of monotheism, is more probable on atheism than theism, and so this kind of nonbelief provides evidence for atheism and against theism.
  • McBrayer, Justin P., and Philip Swenson, “Scepticism about the Argument from Divine Hiddenness.” Religious Studies 48(2) (2012): 129-150.
    • Argues that we are not in a position to know whether there is any good reason for the existence of nonresistant nonbelief, and so the existence of nonresistant nonbelief is not evidence against theism.
  • McGinnis, Jon. “The Hiddenness of ‘Divine Hiddenness’: Divine Love in Medieval Islamic Lands.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Eleonore Stump and Adam Green, 157-174. Cambridge University Press, 2016.
    • Argues that the assumptions made by Schellenberg’s hiddenness argument do not apply to Islam.
  • Moser, Paul. “The Virtue of Friendship with God.” In Religious Faith and Intellectual Virtue, edited by L.F. Callahan and Timothy O’Connor, 140-156. New York: Oxford University Press, 2014.
    • Argues that faith, as friendship with God, has an irreducible volitional component, and thus God is not motivated to provide humans with mere propositional belief in God’s existence.
  • Murray, Michael J. “Coercion and the Hiddenness of God.” American Philosophical Quarterly 30(1) (1993): 27-38.
    • Argues that God hides because full revelation of God’s existence to created persons might constitute a kind of coercion.
  • Nagasawa, Yujin. “Silence, Evil, and Shusaku Endo.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Eleonore Stump and Adam Green, 246-259. Cambridge University Press, 2016.
    • Suggests a kind of response to the experiential problem of hiddenness.
  • Nemoianu, V.M. “Pascal on Divine Hiddenness.” International Philosophical Quarterly 55(3) (2015): 325-343.
    • Discusses Divine Hiddenness through the lens of Blaise Pascal.
  • O’Connell, Jake. “Divine Hiddenness: Would More Miracles Solve the Problem?” Heythrop Journal 54 (2013): 261-267.
    • Argues that there’s a high probability that many people would not believe in God even if there were significantly more miracles.
  • Rea, Michael. “Narrative, Liturgy, and the Hiddenness of God.” In Metaphysics and God: Essays in Honor of Eleonore Stump, edited by Kevin Timpe, 76-96. New York: Routledge, 2009.
    • Argues that God would be justified in remaining silent so long as it is in accordance with His personality, and He provides a widely accessible way to experience His presence.
  • Rea, Michael. “Hiddenness and Transcendence.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Eleonore Stump and Adam Green, 210-226. Cambridge University Press, 2016.
    • Argues that Schellenberg’s hiddenness argument relies on assumptions about Divine Love not shared by much of the tradition of Christian theology.
  • Schellenberg, J.L. Divine Hiddenness and Human Reason. Ithaca: Cornell University Press, 1993.
    • Argues that the existence of reasonable nonbelief is a reason for thinking that God does not exist.
  • Schellenberg, J.L. “The Atheist’s Free Will Offence.” International Journal for Philosophy of Religion 56 (2004): 1-15.
    • Argues that the existence of libertarian free will would provide evidence against God’s existence.
  • Schellenberg, J.L. The Wisdom to Doubt. Ithaca and London: Cornell University Press, 2007.
    • See especially chapters 9 and 10. Argues that, if God exists, then all nonbelievers must either be resistant or incapable of a positively meaningful conscious relationship with God.
  • Swinburne, Richard. Providence and the Problem of Evil. Oxford University Press, 1998.
    • See especially chapter 10. Proposes several reasons God hides.
  • Taber, Tyler, and Tyler Dalton McNabb. “Is the Problem of Divine Hiddenness a Problem for the Reformed Epistemologist?” Heythrop Journal 57(6) (2016): 783-793.
    • Argues that divine hiddenness amounts to the problem of sin’s consequences.
  • Teeninga, Luke. “Divine Hiddenness and the Problem of No Greater Goods.” International Journal for Philosophy of Religion 89 (2020): 107-123.
    • Addresses the problem of whether there could possibly be a greater good than that of a conscious personal relationship with God, and thus whether all “greater goods” responses to the hiddenness argument must fail.
  • Trakakis, N.N. “The Hidden Divinity and What It Reveals.” In Hidden Divinity and Religious Belief: New Perspectives, edited by Eleonore Stump and Adam Green, 192-209. Cambridge University Press, 2016.
    • Looks at hiddenness from the perspective of eastern philosophies and religions including Eastern Orthodox Christianity, Islam, and Hinduism.
  • Vandergriff, Kevin. “Natural Nonbelief as a Necessary Means to a Life of Choiceworthy Meaning.” Open Theology 2(1) (2016): 34-52.
    • Responds to Jason Marsh’s problem of “natural nonbelief”, arguing that nonbelief allowed early humans to have a particular kind of meaningful life not available to modern humans.
  • Weidner, Veronika. Examining Schellenberg’s Hiddenness Argument. Springer Verlag, 2018.
    • Provides an in-depth examination of Schellenberg’s hiddenness argument, how its fine details have developed over time, and much of the discussion it has prompted. Weidner also provides an argument that explicit belief in God is not necessary for one to have a personal relationship with God.


Author Information

Luke Teeninga
Email: luketeeninga@gmail.com
Tyndale University
Canada

Renaissance Skepticism

The term “Renaissance skepticism” refers to a diverse range of approaches to the problem of knowledge that were inspired by the revitalization of Ancient Greek Skepticism in fifteenth- and sixteenth-century Europe. Much like its ancient counterpart, Renaissance skepticism refers to a wide array of epistemological positions rather than a single doctrine or unified school of thought. These various positions are unified to the extent that they share an emphasis on the epistemic limitations of human beings and offer the suspension of judgment as a response to those limits.

The defining feature of Renaissance skepticism (as opposed to its ancient counterpart) is that many of its representative figures deployed skeptical strategies in response to religious questions, especially dilemmas concerning the criterion of religious truth. Whereas some Renaissance thinkers viewed skepticism as a threat to religious orthodoxy, others viewed skepticism as a powerful strategy to be adopted in Christian apologetics.

Philosophers who are typically associated with Renaissance skepticism include Gianfrancesco Pico della Mirandola, Michel de Montaigne, Pierre Charron, and Francisco Sanches. Beyond philosophy, the revitalization of skepticism in Renaissance Europe can also be seen through the writings of religious thinkers such as Martin Luther, Sebastian Castellio, and Desiderius Erasmus; pedagogical reformers such as Omer Talon and Petrus Ramus; and philologists such as Henri Estienne and Gentian Hervet. This article provides an overview of the revitalization of skepticism in Renaissance Philosophy through the principal figures and themes associated with this movement.

Table of Contents

  1. The Ancient Sources of Renaissance Skepticism
  2. The Transmission of Ancient Skepticism into the Renaissance
  3. Popkin’s Narrative of the History of Renaissance Skepticism
  4. Medieval Skepticism and Anti-Skepticism
  5. Renaissance Skepticism Pre-1562: Skepticism before the Publication of Sextus Empiricus
    1. Gianfrancesco Pico della Mirandola’s use of Pyrrhonism
    2. Skepticism and Anti-Skepticism in the Context of the Reformation
    3. Skepticism and Anti-Skepticism in Pedagogical Reforms
  6. Renaissance Skepticism Post-1562: The Publication of Sextus Empiricus
    1. Henri Estienne’s Preface to Sextus Empiricus’ Outlines of Skepticism
    2. Gentian Hervet’s Preface to Sextus Empiricus’ Adversus Mathematicos
  7. Late Renaissance Skepticism: Montaigne, Charron, and Sanches
    1. Michel de Montaigne
      1. Montaigne and Pyrrhonism
      2. Pyrrhonism in the “Apology for Raymond Sebond”
      3. Pyrrhonian Strategies Beyond the “Apology”
      4. Montaigne and Academic Skepticism
    2. Pierre Charron
    3. Francisco Sanches
  8. The Influence of Renaissance Skepticism
  9. References and Further Reading
    1. Primary Sources
    2. Additional Primary Sources
    3. Secondary Sources

1. The Ancient Sources of Renaissance Skepticism

Ancient Greek skepticism is traditionally divided into two distinct strains: “Academic skepticism” and “Pyrrhonian skepticism.” Both types of skepticism had a considerable influence on Renaissance philosophy, albeit at different times and places. The term “Academic skepticism” refers to the various positions adopted by different members of Plato’s Academy in its “middle” and “late” periods. Figures such as Arcesilaus (c. 318-243 B.C.E.), Carneades (c. 213-129 B.C.E.), Clitomachus (187-110 B.C.E.), Antiochus (c. 130-c. 68 B.C.E.), Philo of Larissa (c. 159/8-c. 84/3 B.C.E.), and Cicero (106-43 B.C.E.) are associated with Academic skepticism. The term “Pyrrhonian skepticism” refers to an approach adopted by a later group of philosophers who sought to revive a more radical form of skepticism that they associated with Pyrrho (c. 365-270 B.C.E.). Figures associated with Pyrrhonian skepticism include Aenesidemus in the first century B.C.E., and Sextus Empiricus in the second century C.E.

Both strains of ancient skepticism share an emphasis on the epistemic limitations of human beings and recommend the suspension of assent in the absence of knowledge. Both varieties of skepticism advance their arguments in response to dogmatism, although they differ in their specific opponents. The Academic skeptics direct their arguments primarily in response to Stoic epistemology, particularly the theory of cognitive impressions. In contrast, the Pyrrhonian skeptics direct their arguments in response to Academic skepticism as well as other ancient schools of thought.

One key distinction between the two strains of ancient skepticism can be found in their differing stances on the nature and scope of the suspension of assent. Arcesilaus, for example, maintains the radical view that nothing can be known with certainty. In response to the absence of knowledge, he recommends the suspension of assent. In response to the Stoic objection that the suspension of judgment impedes all rational and moral activity, Arcesilaus offers a practical criterion, “the reasonable” (to eulogon), as a guide to conduct in the absence of knowledge. Carneades, a later head of the Academy, presents another kind of practical criterion, “the persuasive” (to pithanon), as a guide for life in the absence of knowledge. In response to the inactivity charge of the Stoics, he maintains that in the absence of knowledge, we can still be guided by convincing or plausible impressions.

Philo offers a “mitigated” interpretation of Academic skepticism. His mitigated skepticism consists in the view that an inquirer can offer tentative approval to “convincing” or “plausible” impressions that survive skeptical scrutiny. Cicero discusses Philo’s mitigated interpretation of Academic skepticism in his Academica, translating Carneades’ practical criterion as “probabile” and “veri simile.” Cicero gives this practical criterion a “constructive” interpretation. In other words, he proposes that probability and verisimilitude can bring the inquirer to ever closer approximations of the truth. Admittedly, the question of whether Academic skeptics such as Cicero, Carneades, and Arcesilaus are “fallibilists” who put forth minimally positive views, or “dialectical” skeptics who only advance their arguments to draw out the unacceptable positions of their opponents, is a subject of considerable scholarly debate. For further discussion of this issue, see Ancient Greek Skepticism.

The Pyrrhonian skeptics introduce a more radical approach to the suspension of assent in the absence of knowledge. They offer this approach in response to what they perceive to be dogmatic in the Academic position. Aenesidemus, for example, interprets the Academic view that “nothing can be known” as a form of “negative dogmatism.” That is, he views this position as a positive and therefore insufficiently skeptical claim about the impossibility of knowledge. As an alternative, Aenesidemus endeavors to “determine nothing.” In other words, he seeks to neither assert nor deny anything unconditionally. Sextus Empiricus, another representative figure of Pyrrhonian skepticism, offers another alternative to the allegedly incomplete skepticism of the Academics. In the absence of an adequate criterion of knowledge, Sextus practices the suspension of assent (epoché). Although the Academics also practice the suspension of assent, Sextus extends its scope. He recommends the suspension of assent not only regarding positive knowledge claims, but also regarding the skeptical thesis that nothing can be known.

2. The Transmission of Ancient Skepticism into the Renaissance

In fourteenth- to mid-sixteenth-century Europe, the writings of Augustine, Cicero, Diogenes Laertius, Galen, and Plutarch served as the primary sources on ancient skepticism. The writings of Sextus Empiricus were not widely available until 1562, when they were published in Latin by Henri Estienne. Due to the limited availability of Sextus Empiricus in the first half of the sixteenth century, philosophical discussions of skepticism were largely confined to the Academic skeptical tradition, with very few exceptions. It was only in the latter half of the sixteenth century that Renaissance discussions of skepticism began to center on Pyrrhonism. For this reason, this article divides Renaissance skepticism into two distinct periods: pre-1562 and post-1562.

Throughout the Renaissance, the distinction between Academic and Pyrrhonian skepticism was neither clearly nor consistently delineated. Before the publication of Sextus Empiricus in the 1560s, many authors were unaware of Pyrrhonian skepticism, often treating “skepticism” and “Academic skepticism” as synonyms. Those who were aware of the difference did not always consistently distinguish between the two strains. Following the publication of Sextus Empiricus, many thinkers began to use the terms “Pyrrhonism” and “skepticism” interchangeably. For some, this was in apparent acceptance of Sextus’ view that the Academic skeptics are negative dogmatists rather than genuine skeptics. For others, this was due to a more syncretic interpretation of the skeptical tradition, according to which there is common ground between the various strains.

3. Popkin’s Narrative of the History of Renaissance Skepticism

Scholarly debate surrounding the revitalization of ancient skepticism in the Renaissance has been largely shaped by Richard Popkin’s History of Scepticism, first published in 1960, and expanded and revised in 1979 and 2003. This section presents Popkin’s influential account of the history of skepticism, addressing both its merits and limitations.

The central thesis of Popkin’s History of Scepticism is that the revitalization of Pyrrhonian skepticism in Renaissance Europe instigated a crisis of doubt concerning the human capacity for knowledge. According to Popkin, this skeptical crisis had a significant impact on the development of early modern philosophy. On Popkin’s account, the battles over theological authority in the wake of the Protestant Reformation set the initial scene for this skeptical crisis of doubt. This crisis of uncertainty was brought into full force following the popularization of Pyrrhonian skepticism among figures such as Michel de Montaigne.

While influential, Popkin’s narrative of the history of skepticism in early modernity has drawn criticism from many angles. One common charge is that Popkin exaggerated the impact of Pyrrhonian skepticism at the expense of Academic skepticism and other sources and testimonia such as Augustine, Plutarch, Plato, and Galen. Another common criticism is that he overstated the extent to which skepticism was forgotten throughout late Antiquity and the Middle Ages and only recovered in the Renaissance. This section provides an overview of these two main criticisms.

Charles Schmitt’s 1972 study of the reception of Cicero’s Academica in Renaissance Europe demonstrates that the impact of Academic skepticism on Renaissance thought was considerable. Schmitt argues that although the Academica was one of Cicero’s more obscure works throughout the Latin Middle Ages, it witnessed increased visibility and popularity throughout the fifteenth and sixteenth centuries. By the sixteenth century, Cicero’s Academica had become the topic of numerous commentaries, such as those by Johannes Rosa (1571) and Pedro de Valencia (1596) (for an analysis of these commentaries, see Schmitt 1972). The Academica also became an object of critique among scholars such as Giulio Castellani (Schmitt 1972). Although Schmitt ultimately concedes that the impact of Academic skepticism on Renaissance thought was minimal in comparison to that of Pyrrhonism, he maintains that it was not as marginal as Popkin had initially suggested (Schmitt 1972; 1983). Over the past few decades, scholars such as José Raimundo Maia Neto have studied the impact of Academic skepticism further, arguing that its influence on early modern philosophy was substantial (Maia Neto 1997; 2013; 2017; see also Smith and Charles, eds., 2017 for further discussion of the impact of Academic skepticism on early modern philosophy).

Popkin’s “rediscovery” narrative has also been challenged, specifically the idea that Pyrrhonism was largely forgotten throughout Late Antiquity and the Middle Ages only to be rediscovered in the Renaissance. One notable example is Luciano Floridi’s study on the transmission of Sextus Empiricus, which documents the availability of manuscripts throughout late antiquity and the Middle Ages. Floridi shows that although Sextus was admittedly obscure throughout Late Antiquity and the Middle Ages, he was not quite as unknown as Popkin had initially supposed (Floridi 2002).

Increased scholarly attention to medieval discussions of skepticism has shown further limitations to Popkin’s rediscovery narrative (Perler 2006; Lagerlund ed. 2010; Lagerlund 2020). Although neither strain of ancient Greek skepticism was particularly influential in the Latin Middle Ages, discussions of skeptical challenges and rehearsals of skeptical arguments occurred in entirely new contexts, such as debates regarding God’s power, the contingency of creation, and the limits of human knowledge in relation to the divine (Funkenstein 1987). Most medieval discussions of skepticism were critical, such as that of Henry of Ghent, who drew on Augustine’s Contra Academicos in his attack on skeptical epistemology. Some, however, were sympathetic, such as that of John of Salisbury, who discussed the New Academy in a favorable light and adopted elements of Cicero’s and Philo of Larissa’s probabilism in his own epistemology (Schmitt 1972; see also Grellard 2013 for a discussion of John of Salisbury’s probabilism). The following section provides an overview of medieval treatments of skepticism.

4. Medieval Skepticism and Anti-Skepticism

Philosophers typically associated with medieval skepticism and anti-skepticism include John of Salisbury, Henry of Ghent, John Duns Scotus, Nicholas of Autrecourt, and John Buridan, among others. Although not all of these thinkers engaged directly with ancient Greek skepticism, they still responded to epistemological challenges that can be called “skeptical” in a broader sense.

John of Salisbury (1115-1180) was one of the first philosophers of the Latin Middle Ages to discuss Academic skepticism in any significant detail and to openly embrace certain views associated with the New Academy. In the Prologue to the Metalogicon, for example, John associates his own methodology with Academic probabilism. He writes, “[b]eing an Academician in matters that are doubtful to a wise man, I cannot swear to the truth of what I say. Whether such propositions be true or false, I am content with probable certitude” (ML 6). Similarly, in the Prologue to the Policraticus, John writes that “[i]n philosophy, I am a devotee of Academic dispute, which measures by reason that which presents itself as more probable. I am not ashamed of the declarations of the Academics, so that I do not recede from their footprints in matters about which wise men have doubts” (PC 7). John associates Academic methodology with epistemic modesty toward claims that have not been conclusively demonstrated and combines this humility with an openness toward the possibility of truth.

Although John associates his own methodology with Academic probabilism, he stipulates very clear limits to his skepticism. He restricts his skeptical doubt to the inferences derived from ordinary experience, maintaining that these inferences should be affirmed as probable rather than necessary. Although John believes that it is reasonable to doubt the inferences derived from ordinary experience, he maintains that we can still affirm the truth of what can be known rationally. He argues, for example, that we cannot doubt the certainty of God’s existence, the principle of non-contradiction, or the certainty of mathematical and logical inferences (PC 153-156).

In the thirteenth century, both Henry of Ghent (c. 1217-1293) and John Duns Scotus (1265/1266-1308) were concerned with establishing the possibility of knowledge in opposition to skeptical challenges (for a discussion of their positions in relation to skepticism, see Lagerlund 2020). Henry’s Summa begins by posing the question of whether we can know anything at all. Henry attempts to guarantee the possibility of knowledge through a theory of divine illumination that he attributes to Augustine. John Duns Scotus discusses and rejects Henry’s divine illumination theory of knowledge, arguing that the natural intellect is indeed capable of achieving certainty through its own powers. Like Henry, Scotus develops his theory of knowledge in response to a skeptical challenge to the possibility of knowledge (Lagerlund 2020). In contrast to Henry, Scotus maintains that the natural intellect can achieve certitude regarding certain kinds of knowledge, such as analytic truths and conclusions derived from them, and thus requires no assistance from divine illumination.

In early fourteenth century Latin philosophy, a new type of skeptical argument, namely the “divine deception” argument, began to emerge (Lagerlund 2020). The divine deception argument, made famous much later by Descartes, explores the possibility that God is deceiving us, thus threatening the very possibility of knowledge. Philosophers such as Nicholas of Autrecourt and John Buridan developed epistemologies that could respond to the threat to the possibility of knowledge posed by this type of skeptical argument (Lagerlund 2020). Nicholas offers an infallibilist and foundationalist epistemology whereas Buridan offers a fallibilist one (Lagerlund 2020).

Nicholas of Autrecourt (c. 1300-c. 1350) entertains and engages with skeptical challenges in his Letters to Bernard of Arezzo. In these letters, Nicholas draws out what he takes to be unacceptable implications of Bernard’s epistemology. Nicholas takes Bernard’s position to entail an extreme and untenable form of skepticism about the external world and even about one’s own mental acts (Lagerlund 2020). In response to this hyperbolic skepticism, Nicholas develops a positive account of knowledge, offering what Lagerlund calls a “defense of infallible knowledge” (Lagerlund 2020). Nicholas’s epistemology is “infallibilist” insofar as he maintains that the principle of noncontradiction, and everything that can be resolved into this principle, is immune to skeptical doubt. This infallibilist epistemology is tailored to respond to the skeptical challenge of divine deception.

Nicholas’s approach to skepticism sets a very high bar for the possibility of knowledge. This exceedingly high standard is challenged by John Buridan (c. 1295-1361). Like Nicholas, Buridan also develops an epistemology that can withstand the skeptical challenge of divine deception. Unlike Nicholas, the epistemology he develops is a “fallibilist” one (Lagerlund 2020). As Jack Zupko argues, Buridan’s strategy against the skeptical challenges entertained by Nicholas is to show that it is unreasonable to accept the excessively high criterion of knowledge presupposed by the hypothesis of divine deception (Zupko 1993). As Zupko shows, Buridan’s response to the divine deception argument is to “acknowledge it, and then to ignore it” (Zupko 1993). Instead, Buridan develops a fallibilist epistemology in which knowledge admits of degrees corresponding to three distinct levels of “evidentness” (Lagerlund 2020).

Skepticism, then, did not disappear during the Latin Middle Ages to the extent that Popkin suggests. Nevertheless, while many medieval philosophers deal with skeptical challenges to the possibility of knowledge and develop epistemologies tailored to withstand skeptical attack, their approaches to these issues are not always shaped by ancient Greek skepticism. In the Renaissance, this began to change due to the increasing availability of classical texts. The next section discusses Renaissance treatments of skepticism both before and after the publication of Sextus Empiricus.

5. Renaissance Skepticism Pre-1562: Skepticism before the Publication of Sextus Empiricus

In Renaissance Europe, philosophical treatments of skepticism began to change as Cicero’s Academica witnessed increased popularity and Sextus Empiricus’ works were translated into Latin. This section discusses how Renaissance thinkers approached the issue of skepticism (both directly and indirectly) from the early sixteenth century up until the 1562 publication of Sextus Empiricus by Henri Estienne. Due to the limited availability of Sextus Empiricus in Renaissance Europe, most discussions of skepticism prior to 1562 draw primarily on the Academic skeptical tradition. One notable exception is Gianfrancesco Pico della Mirandola.

a. Gianfrancesco Pico della Mirandola’s use of Pyrrhonism

Gianfrancesco Pico della Mirandola (1469-1533) is the earliest Renaissance thinker associated with Pyrrhonian skepticism. His Examination of the Vanity of the Doctrines of the Gentiles and of the Truth of the Christian Teaching (1520) is often acknowledged as the first use of Pyrrhonism in Christian apologetics (Popkin 2003). Although Sextus Empiricus was not widely available in Pico’s time, he had access to a manuscript that was housed in Florence (Popkin 2003).

In the Examination of the Vanity of the Doctrines of the Gentiles and of the Truth of the Christian Teaching, Pico deploys skeptical strategies toward both positive and negative ends. His negative aim is to undermine the authority of Aristotle among Christian theologians and to discredit the syncretic appropriation of ancient pagan authors among humanists such as his uncle, Giovanni Pico della Mirandola. Pico’s more positive aim is to support the doctrines of Christianity by demonstrating that revelation is the only genuine source of certitude (Schmitt 1967; Popkin 2003). Pico subjects the various schools of ancient philosophy to skeptical scrutiny in order to demonstrate their fundamental incertitude (Schmitt 1967; Popkin 2003). In so doing, he seeks to reveal the special character of divinely revealed knowledge.

Although Pico uses skeptical strategies to attack the knowledge claims advanced by ancient pagan philosophers, he maintains that the truths revealed in Scripture are immune to skeptical attack. One reason for this is that he understands the principles of faith to be drawn directly from God rather than through any natural capacity such as reason or the senses. Since Pyrrhonian arguments target reason and the senses as criteria of knowledge, they do not apply to the truths revealed in Scripture. Not only does Pico maintain that Pyrrhonian arguments are incapable of threatening the certainty of revelation, he also suggests that this Pyrrhonian attack on natural knowledge has the positive potential to assist Christian faith.

Pico’s use of Pyrrhonism presents a case of what would later become common throughout the Reformation and Counter-Reformation, namely the deployment of Pyrrhonian skeptical strategies towards non-skeptical Christian ends. Pico did not subject the doctrines of Christianity to skeptical attack. Instead, he deployed Pyrrhonism in a highly circumscribed context, namely as an instrument for detaching Christianity from ancient pagan philosophy and defending the certitude of Christian Revelation (see Copenhaver 1992 for a discussion of Pico’s detachment of Christianity from ancient philosophy).

b. Skepticism and Anti-Skepticism in the Context of the Reformation

As Popkin argues, skeptical dilemmas such as the problem of the criterion appear both directly and indirectly in Reformation-era debates concerning the standard of religious truth. The “problem of the criterion” is the issue of how to justify a standard of truth and settle disputes over this standard without engaging in circular reasoning. According to Popkin, this skeptical problem of the criterion entered into religious debates when Reformers challenged the authority of the Pope on matters of religious truth and endeavored to replace this criterion with individual conscience and personal interpretation of Scripture (Popkin 2003).

Popkin draws on the controversy between Martin Luther (1483-1546) and Desiderius Erasmus (1466-1536) on the freedom of the will as one example of how the skeptical problem of the criterion figured indirectly into debates concerning the criterion of religious truth (Popkin 2003). In On Free Will (1524), Erasmus attacks Luther’s treatment of free will and predestination on the grounds that it treats obscure questions that exceed the scope of human understanding (Popkin 2003; Maia Neto 2017). Erasmus offers a loosely skeptical response, proposing that we accept our epistemic limitations and rely on the authority of the Catholic Church to settle questions such as those posed by Luther (Popkin 2003; Maia Neto 2017). Luther’s response to Erasmus, entitled The Bondage of the Will (1525), argues against Erasmus’ skeptical emphasis on the epistemic limitations of human beings and acquiescence to tradition in response to those limits. Luther argues instead that a true Christian must have inner certainty regarding religious knowledge (Popkin 2003).

Sebastian Castellio (1515/29-1563), another reformer, takes a more moderate approach to the compatibility between faith and epistemic modesty in his On the Art of Doubting and Concerning Heretics (1554). In Castellio’s stance, Popkin identifies yet another approach to the skeptical problem of the criterion (Popkin 2003). Like Erasmus, Castellio emphasizes the epistemic limitations of human beings and the resultant difficulty of settling obscure theological disputes. In contrast to Erasmus, Castellio does not take these epistemic limitations to require submission to the authority of the Catholic Church. In contrast to Luther, Castellio does not stipulate inner certainty as a requirement for genuine Christian faith. Instead, Castellio argues that human beings can still draw “reasonable” conclusions based on judgment and experience rather than either the authority of tradition or the authority of inner certitude (Popkin 2003; Maia Neto 2017).

c. Skepticism and Anti-Skepticism in Pedagogical Reforms

Academic skepticism had a major impact in mid-sixteenth century France through the pedagogical reforms proposed by Petrus Ramus (1515-1572) and his student Omer Talon (c. 1510-1562) (Schmitt 1972). Ramus developed a Ciceronian and anti-Scholastic model of education that sought to unite dialectic with rhetoric. Although Ramus expressed enthusiastic admiration for Cicero, he never explicitly identified his pedagogical reforms with Academic skepticism, and his association with it was always indirect.

Omer Talon had a direct and explicit connection to Academic skepticism, publishing an edition of the Academica Posteriora in 1547, and an expanded and revised version which included the Lucullus in 1550. Talon included a detailed introductory essay and commentary that Schmitt has called the “first serious study of the Academica to appear in print” (Schmitt 1972). Talon’s introductory essay explicitly aligns Ramus’ pedagogical reforms with the philosophical methodology of Academic skepticism. He presents Academic methodology as an alternative to Scholastic models of education, defending its potential to cultivate intellectual freedom.

Talon adopts the Academic method of argument in utramque partem, or the examination of both sides of the issue, as his preferred pedagogical model. Cicero also claims this as his preferred method, maintaining that it is the best way to establish probable views in the absence of certain knowledge through necessary causes (Tusculan Disputations II.9). Although this method is typically associated with Cicero and Academic skepticism, Talon attributes it to Aristotle as well, who discusses this method in Topics I-II, 100a-101b (Maia Neto 2017). Despite the Aristotelian origins of the method of argument in utramque partem, it was not popular among the Scholastic philosophers of Talon’s time (Maia Neto 2017).

Talon’s use of skepticism is constructive rather than dialectical, insofar as he interprets the Academic model of argument in utramque partem as a positive tool for the pursuit of probable beliefs, rather than as a negative strategy for the elimination of beliefs. Specifically, he presents it as a method for the acquisition of probable knowledge in the absence of certain cognition through necessary causes. Following Cicero, Talon maintains that in a scenario where such knowledge is impossible, the inquirer can still establish the most probable view and attain by degrees a closer and closer approximation of the truth.

Talon’s main defense of Academic skepticism hinges on the idea of intellectual freedom (Schmitt 1972). He follows Cicero’s view that the Academic skeptics are “freer and less hampered” than the other ancient schools of thought because they explore all views without offering unqualified assent to any one of them (see Cicero’s Acad. II.3, 8). Like Cicero, Talon maintains that probable views can be found in all philosophical positions, including Platonism, Aristotelianism, Stoicism, and Epicureanism. To establish the most probable view, the inquirer should freely examine all positions without offering unqualified assent to any one of them.

Talon’s syncretism is another distinctive feature of his appropriation of Ciceronian skepticism. His syncretism consists in his presentation of Academic skepticism as harmonious with Socratic, Platonic, and even at times, Aristotelian philosophy. Throughout his introductory essay, Talon makes a point of demonstrating that Academic skepticism has an ancient precedent with Socrates, Plato, Aristotle, and even some earlier Pre-Socratic philosophers. He takes great care to clear Academic skepticism of common charges such as negative dogmatism, presenting it in a more positive light that emphasizes its common ground with other philosophical schools. Talon places particular emphasis on Socratic learned ignorance and commitment to inquiry as central to skeptical inquiry.

The impact of Academic skepticism in mid-sixteenth-century France can also be seen through the emergence of several anti-skeptical works. Ramus’ and Talon’s proposed pedagogical reforms were controversial for many reasons, one of which was the problem of skepticism. Pierre Galland (1510-1559), one of Ramus’ colleagues at the Collège de France, launched a fierce attack on the role of skepticism in these proposed reforms (for a detailed discussion of Galland’s critique, see Schmitt 1972). Galland’s main concern was that Ramus’ and Talon’s pedagogical reforms threatened to undermine philosophy and Christianity alike (Schmitt 1972). He argues that a skeptical attack on the authority of reason would eventually lead to an attack on all authority, including theological authority (Schmitt 1972).

Another example of anti-skepticism can be seen in a work by Guy de Brués entitled Dialogues contre les nouveaux académiciens (1557) (for a discussion of de Brués, see Schmitt 1972; see also Morphos’ commentary to his 1953 translation). In this work, Brués advances an extended attack on Academic skepticism through a dialogue between four figures associated with the Pléiade circle: Pierre de Ronsard, Jean-Antoine de Baïf, Guillaume Aubert, and Jean Nicot. In his dedicatory epistle to the Cardinal of Lorraine, Brués states that the goal of his anti-skeptical dialogue is to prevent the youth from being corrupted by the idea that “all things are a matter of opinion,” an idea he attributes to the New Academy. He argues that skepticism will lead the youth to disdain the authority of religion, God, their superiors, justice, and the sciences. Much like that of Galland, Brués’ critique of skepticism centers on the threat of relativism and the rejection of universal standards (Schmitt 1972).

6. Renaissance Skepticism Post-1562: The Publication of Sextus Empiricus

In 1562, the Calvinist printer and classical philologist Henri Estienne (Henricus Stephanus) (c. 1528-1598) published the first Latin translation of, and commentary on, Sextus Empiricus’ Outlines of Skepticism. The publication of Sextus’ works in Latin reshaped Renaissance discussions of skepticism. In 1569, Estienne printed an expanded edition of Sextus’ works that included a translation and introductory essay on Adversus Mathematicos by the Catholic counter-reformer, Gentian Hervet (1499-1584). This edition also included a translation of Diogenes Laertius’ Life of Pyrrho, and Erasmus’ translation of Galen’s anti-skeptical work, The Best Method of Teaching. In contrast to the numerous editions of and commentaries on the Academica available throughout the sixteenth century, Estienne’s editions were the only editions of Sextus’ works that were widely available in that century. A Greek edition was not printed until 1621.

Estienne and Hervet both include substantial prefaces with their translations (for a discussion of these prefaces, see Popkin 2003; for a translation and discussion of Estienne’s preface, see Naya 2001). In each preface, the translator comments on the philosophical value of Sextus and states his goals in making Pyrrhonism available to a wider audience. Both prefaces treat the question of whether and how Pyrrhonism can be used in Christian apologetics, and both respond to the common objection that Pyrrhonism poses a threat to Christianity. Although Estienne was a Calvinist and Hervet an ardent Counter-Reformer, the two offer a similar position on the compatibility of Christianity with Pyrrhonism: both agree that Pyrrhonism is a powerful resource for undermining confidence in natural reason and affirming the special character of revelation. Although Estienne and Hervet were not philosophers, their framing of skepticism and its significance for religious debates shaped how philosophers took up these issues, especially since theirs were the only widely available editions of Sextus in the sixteenth century.

a. Henri Estienne’s Preface to Sextus Empiricus’ Outlines of Skepticism

Henri Estienne’s preface to Sextus’ Outlines combines the loosely skeptical “praise of folly” genre popularized by Erasmus with a fideistic agenda resembling that of Gianfrancesco Pico della Mirandola. Estienne begins with a series of jokes, playfully presenting the Outlines as a kind of joke book. The preface opens as a dialogue between the translator and his friend, Henri Mesmes, in which the one Henri inquires about the nature and value of skepticism, and the other offers responses that parody the traditional Pyrrhonian formulae.

When asked about the nature and value of skepticism, Estienne recounts the “tragicomic” story of his own “divine and miraculous metamorphosis” into a skeptic. Drawing on conventional Renaissance representations of melancholy, Estienne recounts a time when he suffered from quartan fever, a disease associated with an excess of black bile. This melancholy prevented him from pursuing his translation work. One day, Estienne wandered into his library with his eyes closed out of fear that the mere sight of books would sicken him, and fortuitously came across his old notes for a translation of Sextus Empiricus. While reading the Outlines, Estienne began to laugh at Sextus’ skeptical attack on the pretensions of reason. Estienne’s laughter counterbalanced his melancholy, allowing him to return to his translation work afresh.

Estienne discusses the “sympathy” between his illness and its skeptical cure, describing an “antiperistasis” in which his excess of learning was counterbalanced by its opposite (namely skepticism). Much to his surprise, this skeptical cure had the fortuitous result of reconciling him with his scholarly work, albeit on new terms. Estienne’s encounter with skepticism allowed him to return to the study of classical texts by reframing his understanding of the proper relationship between philosophy and religion.

In the second half of his preface, Estienne turns to the question of whether skepticism poses a threat to Christianity. Anticipating the common objection that skepticism leads to impiety and atheism, he replies that it is the dogmatist rather than the skeptic who poses a genuine threat to Christianity and is at greater risk of falling into atheism. Whereas skeptics are content to follow local customs and tradition, dogmatists endeavor to measure the world according to their reason and natural faculties.

In the final paragraphs of the preface, Estienne addresses his reasons for publishing the first Latin translation of Sextus Empiricus. Returning to the themes of illness discussed at the beginning of the preface, he remarks that his goal is a therapeutic one. That is, his aim is to cure the learned of the “impiety they have contracted by contact with ancient dogmatic philosophers” and to relieve those with an excessive reverence for philosophy. Here, Estienne presents skepticism as a cure for the pride of the learned, playing on the ancient medical idea that health is a humoral balance that can be restored by counterbalancing one excess with another.

Finally, Estienne responds to the common objection that skepticism is an anti-philosophical method that will destroy the possibility of establishing any kind of truth. Estienne argues that this skeptical critique does not pertain to religious truths revealed in Scripture. He suggests instead that a skeptical attack on natural knowledge will only serve to reaffirm the prerogative of faith. Much like his predecessor, Gianfrancesco Pico della Mirandola, Estienne envisions Pyrrhonism as a tool to be used toward non-skeptical religious ends. He presents Pyrrhonian skepticism both as a therapy to disabuse the learned of their overconfidence in natural reason, and as a tool for affirming the special character of the truths revealed in Scripture.

b. Gentian Hervet’s Preface to Sextus Empiricus’ Adversus Mathematicos

Gentian Hervet’s 1569 preface to Sextus’ Adversus Mathematicos is more somber in tone than Estienne’s and places a more transparent emphasis on the use of Sextus in Christian apologetics. Hervet frames his interest in skepticism in terms of his desire to uphold the doctrines of Christianity, voicing explicit approval for the project of Gianfrancesco Pico della Mirandola. Hervet adds a new dimension to his appropriation of Pyrrhonism, presenting it not only as a means of loosening the grip of ancient philosophy on Christianity, but also as a tool for combating the Reformation.

Much like Estienne’s preface, Hervet’s preface begins with a brief history of his encounter with Sextus. He reports that he fortuitously stumbled across the manuscript in the Cardinal of Lorraine’s library when in need of a diversion. He recounts the great pleasure he took in reading Adversus Mathematicos, noting its particular success at demonstrating that no human knowledge is immune to attack. Like Estienne, Hervet argues that a skeptical critique of natural reason can help reinforce the special character of the truths revealed in Scripture. In contrast to Estienne, Hervet emphasizes the potential of Pyrrhonian arguments for undermining the Reformation.

Within his preface, Hervet discusses the value of Pyrrhonism for resolving religious controversies concerning the rule of faith. He raises the problem of the criterion in the context of religious authority, condemning the Reformers for taking their natural faculties as the criterion of religious truth, and for rejecting the authority of tradition (that is, the Catholic Church). He suggests that there is a fundamental incommensurability between our natural faculties and the nature of the divine, and thus that the effort to measure the divine based on one’s own natural faculties is fundamentally misguided. Hervet expresses hope that Pyrrhonism might persuade the Reformers to return to Catholicism, presumably due to the Pyrrhonian emphasis on acquiescence to tradition in the absence of certainty.

Hervet’s preface also discusses the potential utility of Pyrrhonism in Christian pedagogy. Anticipating the common objection that skepticism will corrupt the morals of the youth and lead them to challenge the authority of Christianity, Hervet argues instead that the skeptical method of argument in utramque partem—a method he mistakenly attributes to the Pyrrhonians rather than to the Academics—will eventually lead an inquirer closer and closer to the truth of Christianity. Far from undercutting faith, Hervet proposes that skeptical inquiry will ultimately support it. Specifically, he argues that the method of argument in utramque partem can help the student distinguish the ‘verisimile,’ or the truth-like, from the truth itself.

Hervet borrows this vocabulary of verisimilitude from Cicero, who translates Carneades’ practical criterion, to pithanon, as veri simile and probabile. Although Hervet is here ostensibly discussing the merits of Pyrrhonian methodology rather than Academic methodology, his description of the goals of argument in utramque partem conflates Pyrrhonism with Academic skepticism. Whereas the Academic practice of arguing both sides of every issue aims at the discovery of the most probable view, at least in certain cases and on certain interpretations, the Pyrrhonian practice of pitting opposing arguments against each other aims at equipollence.

7. Late Renaissance Skepticism: Montaigne, Charron, and Sanches

The most influential philosophers associated with Renaissance skepticism are Michel de Montaigne, Pierre Charron, and Francisco Sanches. Unlike their predecessors whose appropriations of ancient skepticism were largely subordinated to religious ends, these thinkers drew on skeptical strategies to address a wider range of philosophical questions and themes in areas ranging from epistemology to practical philosophy.

a. Michel de Montaigne

The most famous thinker associated with Renaissance skepticism is the French essayist and philosopher Michel de Montaigne (1533-1592). His Essays, first published in 1580, and expanded and revised up until his death, draw extensively on both Academic and Pyrrhonian skepticism among many other ancient and medieval sources. Throughout the Essays, Montaigne treats a great number of skeptical themes including the diversity of human custom and opinion, the inconsistency of human actions and judgment, the relativity of sense-perception to the perceiver, and the problem of the criterion.

The precise nature and scope of Montaigne’s skepticism is a topic of considerable scholarly debate. Some have located Montaigne’s inspiration in the Pyrrhonian skeptical tradition (Popkin 2003; Brahami 1997). Others have noted how Cicero’s Academica serves as one of Montaigne’s most frequently cited skeptical sources (Limbrick 1977; Eva 2013; Prat 2017). Still others have maintained that the philosophical character of Montaigne’s thought is not reducible to skepticism (Hartle 2003; 2005; Sève 2007). The following sections present a range of different views on the sources, nature, and scope of Montaigne’s skepticism, considering the merits and limitations of each.

i. Montaigne and Pyrrhonism

Among commentators, Montaigne is primarily associated with Pyrrhonian skepticism. In large part, this is due to Richard Popkin’s influential account of the central role that Montaigne played in the transmission of Pyrrhonian skepticism into early modernity. On Popkin’s account, Montaigne played a pivotal role in the revitalization of skepticism by applying Pyrrhonian strategies in a broader epistemological context than the one envisioned by his predecessors and contemporaries (Popkin 2003). Whereas earlier Renaissance thinkers used Pyrrhonian arguments to debate questions concerning the criterion of religious truth, Montaigne applies Pyrrhonian arguments to all domains of human understanding, thus launching what Popkin has termed the “Pyrrhonian crisis” of Early Modern Europe (Popkin 2003).

Popkin’s concept of the “Pyrrhonian crisis” is deeply indebted to Pierre Villey’s influential account of a personal “Pyrrhonian crisis” that Montaigne allegedly underwent while reading Sextus Empiricus. According to Villey, Montaigne’s intellectual development maps onto roughly three stages corresponding to the three books of the Essays: the earlier chapters exhibit an austere Stoicism, the middle chapters exhibit a Pyrrhonian crisis of uncertainty, and the final chapters exhibit an embrace of Epicurean naturalism (Villey 1908). Admittedly, most Montaigne scholars have rejected Villey’s three-stage developmental account for a variety of reasons. Some reject the idea that Montaigne’s skepticism was the result of a personal “Pyrrhonian crisis,” preferring to assess his skepticism in philosophical rather than psychological terms. Others question whether the Essays developed according to three clearly defined stages at all, pointing to evidence that Montaigne’s engagement with skepticism, Stoicism, and Epicureanism extends beyond the confines of each book.

Scholars typically draw on Montaigne’s longest and most overtly philosophical chapter, the “Apology for Raymond Sebond,” as evidence for his Pyrrhonism. Here, Montaigne provides an explicit discussion of Pyrrhonian skepticism, voicing sympathetic approval for the Pyrrhonians’ intellectual freedom and commitment to inquiry in the absence of certitude. In a detailed description of ancient skepticism, he commends the Pyrrhonians over both the Academic and dogmatic schools, arguing that the Academics hold the allegedly inconsistent view that knowledge is unattainable and yet that some opinions are more probable than others. The Pyrrhonians, by contrast, earn his praise both for remaining agnostic on whether knowledge is possible and for committing to inquiry in its absence.

ii. Pyrrhonism in the “Apology for Raymond Sebond”

Since the “Apology” is the longest and most overtly philosophical chapter of the Essays, many scholars, such as Popkin, have treated it as a summation of Montaigne’s thought. They have also treated Montaigne’s sympathetic exposition of Pyrrhonism as an expression of the author’s personal sympathies (Popkin 2003). Although scholars generally agree that the “Apology” is heavily influenced by Pyrrhonism, the precise role of that influence is a matter of considerable debate. The main reasons have to do with the essay’s format and context.

As for the issue of context, the “Apology” was likely written at the request of the Catholic princess, Marguerite of Valois, to defend the Natural Theology of Raymond Sebond (1385-1436), a Catalan theologian whose work Montaigne translated in 1569. Montaigne’s defense of Sebond is (at least in part) intended to support Marguerite’s specific concerns in defending her Catholic faith against the Reformers (Maia Neto 2013; 2017).

As for the issue of format, the “Apology” is loosely structured as a disputed question rather than as a direct articulation of the author’s own position (Hartle 2003). In the manner of a disputed question, Montaigne defends Sebond’s Natural Theology against two principal objections, offering responses to each objection that are tailored to the views of each specific objector. For this reason, the statements that Montaigne makes within this essay cannot easily be removed from their context and taken to represent the author’s own voice in any unqualified sense (Hartle 2003; Maia Neto 2017).

Within the “Apology,” Montaigne ostensibly sets out to defend Sebond’s view that the articles of faith can be demonstrated through natural reason. The first objection he frames is that Christians should not support their faith with natural reason because faith has a supernatural origin in Divine grace (II: 12, F 321; VS 440). The second objection is that Sebond’s arguments fail to demonstrate the doctrines they allege to support (II: 12, F 327; VS 448). The first objection hinges on a dispute about the meaning of faith, and the second objection hinges on a dispute concerning the strength of Sebond’s arguments. Montaigne responds to both objections, conceding and rejecting aspects of each. In response to the first objection, Montaigne concedes that the foundation of faith is indeed Divine grace but denies the objector’s conclusion that faith has no need of rational support (II: 12, F 321; VS 441). In response to the second objection, Montaigne presents a Pyrrhonian critique of the power of reason to demonstrate anything conclusively—not only in the domain of religious dogma, but in any domain of human understanding (II: 12, F 327-418; VS 448-557).

It is in the context of this second objection that Montaigne provides his detailed and sympathetic presentation of Pyrrhonism. Montaigne’s response to the second objection begins with a long critique of reason (II: 12, F 370-418; VS 500-557). Drawing on the first of Sextus’ modes, Montaigne presents an extended discussion of animal behavior to undermine human presumption about the power of reason. Following Sextus, Montaigne compares the different behaviors of animals to show that we have no suitable criterion for preferring our own impressions over those of the allegedly inferior animals. Drawing on the second mode, Montaigne points to the diversity of human opinion as a critique of the power of reason to arrive at universal truth. Montaigne places special emphasis on the diversity of opinion in the context of philosophy: despite centuries of philosophical inquiry, no theory has yielded universal assent. Finally, Montaigne attacks reason on the grounds of its utility, arguing that knowledge has failed to bring happiness and moral improvement to human beings.

Following this critique of reason, Montaigne turns to an explicit discussion of Pyrrhonian skepticism, paraphrasing the opening of Sextus’ Outlines (II: 12, F 371; VS 501). Identifying three possible approaches to philosophical inquiry, he writes that investigation will end in the discovery of truth, in the denial that it can be found, or in the continuation of the search. Following Sextus, Montaigne frames these approaches as dogmatism, negative dogmatism, and Pyrrhonism. In contrast to the two alternative dogmatisms, which assert either that the truth has been attained or that it cannot be found, Montaigne commends the Pyrrhonists for committing to the search in the absence of knowledge.

Montaigne devotes the next few paragraphs to a detailed description of Pyrrhonian strategies (II:12 F 372; VS 502-3). He provides a sympathetic consideration of Pyrrhonian strategies in contrast to dogmatism and the New Academy (II: 12, F 374; VS 505). He concludes by commending Pyrrhonism for its utility in a religious context, writing that: “There is nothing in man’s invention that has so much verisimilitude and usefulness [as Pyrrhonism]. It presents man naked and empty, acknowledging his natural weakness, fit to receive from above some outside power; stripped of human knowledge in himself, annihilating his judgment to make room for faith” (II:12, F 375; VS 506). By undermining the pretenses of reason, Pyrrhonism prepares human beings for the dispensation of Divine grace.

This connection Montaigne draws between the Pyrrhonian critique of reason and the embrace of faith in the absence of any rational grounds for adjudicating religious disputes has led to his related reputation as a “skeptical fideist” (for a discussion of Montaigne’s skeptical fideism, see Popkin 2003 and Brahami 1997; for the view that Montaigne is not a fideist, see Hartle 2003; 2013). Those who interpret Montaigne as a “skeptical fideist” often take his exposition of Pyrrhonism and its utility in a Christian context as an expression of Montaigne’s personal view on the limited role of reason in the context of faith (Popkin 2003).

Others, however, have argued that Montaigne’s endorsement of Pyrrhonism and its utility in a religious context cannot be taken as a simple expression of Montaigne’s own position (see, for example, Hartle 2003 and 2013. See also Maia Neto, 2017). In the context of the “Apology” as a whole, Montaigne’s endorsement of Pyrrhonism and its utility in a religious context is part of a response to the second objection to Sebond. In his response to the second objection, Montaigne is arguing on the basis of the assumptions of Sebond’s opponents. He counters the conclusion that Sebond’s arguments fail to adequately demonstrate religious dogma by showing that all rational demonstrations (and not just Sebond’s specific effort to demonstrate the Articles of Faith) are similarly doomed.

Further evidence that suggests that Montaigne’s endorsement of Pyrrhonism ought to be understood as a qualified one can be found in his address to the intended recipient of the essay. Following his detailed exposition of Pyrrhonism and its potential to assist a fideistic version of Catholicism, Montaigne addresses an unnamed person as the intended recipient of his defense of Sebond (II: 12 F 419; VS 558). This addressee is typically assumed to be Princess Marguerite of Valois (Maia Neto 2013; 2017). In Montaigne’s address to the princess, he qualifies his sympathetic presentation of Pyrrhonism in ambivalent terms, calling it a “final fencer’s trick” and a “desperate stroke” that should only be used “reservedly,” and even then, only as a last resort (II: 12, F 419; VS 558). Montaigne urges the Princess to avoid Pyrrhonist arguments and continue to rely on traditional arguments to defend Sebond’s natural theology against the Reformers. Montaigne warns the Princess of the consequences of undermining reason in defense of her Catholic faith (II: 12, F 420; VS 559).

Following this warning to the Princess, Montaigne returns to some further consequences of the Pyrrhonian critique of reason and the senses. He revisits Pyrrhonian themes, such as the relativity of sense-perception to the perceiver, challenging the idea that the senses can serve as an adequate criterion of knowledge. Borrowing from the third mode, Montaigne argues (1) that if we were lacking in certain senses, we would have a different picture of the world, and (2) that we might perceive additional qualities beyond those that we perceive through our existing faculties. He raises further challenges to the senses using the first and second modes, pointing to (1) the lack of perceptual agreement between humans and animals, and (2) the lack of perceptual agreement between different human beings (II:12, F 443-54).

After reformulating the skeptical modes, Montaigne turns to a restatement of the problem of the criterion: “To judge the appearances that we receive of objects, we would need a judicatory instrument; to verify this instrument, we need a demonstration; to verify the demonstration, an instrument: there we are in a circle” (II:12 F 454; VS 600-01). If the senses cannot serve as a criterion of truth, Montaigne then asks whether reason can, but concludes that demonstration leads to an infinite regress (II: 12, F 454; VS 601).

The suspension of assent is the traditional skeptical response to the absence of an adequate criterion of knowledge. This can be done either in the manner of certain Academics by provisionally approving of likely appearances, or in the manner of the Pyrrhonians by suspending assent while permitting oneself to be non-dogmatically guided by appearances. Montaigne expresses reservations toward both solutions. In response to the Academic solution, he raises the Augustinian objection that we have no criterion for selecting certain appearances as likelier than others, a problem that introduces yet another infinite regress (II: 12, F 455; VS 601). In response to the Pyrrhonian solution, he expresses reservations about acquiescence to changeable custom. He concludes that “[t]hus nothing certain can be established about one thing by another, both the judging and the judged being in continual change and motion” (II: 12 F 455; VS 601).

This claim that “nothing certain can be established about one thing by another” is the conclusion to Montaigne’s response to the second objection. To recall, the second objection was that Sebond’s arguments fail to prove what he set out to demonstrate, namely the Articles of Faith. The Pyrrhonian response to the second objection is that reason is incapable of establishing anything conclusive at all, not only in matters of religious truth but in any domain of human understanding. Montaigne concludes his response to the second objection with the idea that the only knowledge we can attain would have to be through the assistance of divine grace. This conclusion to the essay is often taken as additional evidence for Montaigne’s “skeptical fideism” (Popkin 2003).

Again, whether this conclusion should serve as evidence that Montaigne is personally committed to Pyrrhonism or to skeptical fideism depends on whether we interpret his response to the objections to Sebond as dialectical. That is, it depends on whether we view Montaigne’s arguments as responses he generates on the basis of the assumptions of his opponents—assumptions to which he is not personally committed—in order to generate a contradiction or some other conclusion deemed unacceptable by his opponents.

When taken as a dialectical strategy, however, Montaigne’s use of skepticism still shares much in common with Pyrrhonism understood as a “practice” and “way of life.” For this reason, one might still conclude that although Montaigne’s endorsements of Pyrrhonism within the “Apology” do not necessarily represent the author’s own voice, the methodology and argumentative strategies he adopts within the “Apology” do indeed share much in common with the practice of Pyrrhonism.

iii. Pyrrhonian Strategies Beyond the “Apology”

Although most discussions of Montaigne’s skepticism focus on the “Apology for Raymond Sebond,” this is hardly the sole example of his use of Pyrrhonian strategies. In chapters such as “Of cannibals” and “Of custom and not easily changing an accepted law,” Montaigne adopts the Pyrrhonian mode concerning the diversity of custom to challenge his culture’s unexamined claims to moral superiority. In chapters such as “We taste nothing pure,” he adopts the modes concerning the relativity of sense-perception to the perceiver to challenge the authority and objectivity of the senses. In chapters such as “That it is folly to measure the true and the false by our own capacity” and “Of cripples,” Montaigne adopts skeptical arguments to arrive at the suspension of judgment concerning matters such as knowledge of causes and the possibility of miracles and other supernatural events.

In the domain of practical philosophy, Montaigne borrows from the Pyrrhonian tradition yet again, often recommending behavior that resembles the Pyrrhonian skeptic’s fourfold observances. In the absence of an adequate criterion of knowledge, the Pyrrhonian skeptics live in accordance with four guidelines that they claim to follow “non-dogmatically” (PH 1.11). These observances include the guidance of nature; the necessity of feelings; local customs and law; and instruction in the arts (PH 1.11). Montaigne frequently recommends conformity to similar observances. In “Of custom and not easily changing an accepted law,” for example, he recommends obedience to custom, and criticizes the presumption of those who endeavor to change it. In many cases, Montaigne’s recommendation to obey custom in the absence of knowledge extends to matters of religion. This is yet another reason why Montaigne’s affinities with Pyrrhonian skepticism are often associated with “fideism.” On the “skeptical fideist” interpretation, Montaigne’s obedience to Catholicism is due to the skeptical acquiescence to custom (Popkin 2003; Brahami 1997). The question of Montaigne’s religious convictions as well as his alleged “fideism” is a matter of considerable debate (see, for example, Hartle 2003; 2013).

iv. Montaigne and Academic Skepticism

Although most commentators focus on the Pyrrhonian sources of Montaigne’s skepticism, some scholars have emphasized the influence of Academic skepticism on Montaigne’s thought (see, for example, Limbrick 1977; Eva 2013; Prat 2017; and Maia Neto 2017). One reason to emphasize this role is that Cicero’s Academica was Montaigne’s most frequently quoted skeptical source (Limbrick 1977). Another reason has to do with the philosophical form and content of the Essays, especially Montaigne’s emphasis on the formation of judgment (as opposed to the suspension of judgment and elimination of beliefs) and his emphasis on intellectual freedom from authority as the defining result of skeptical doubt.

We can find one example of Cicero’s influence on Montaigne’s skepticism in his detailed exposition of skepticism in the “Apology for Raymond Sebond.” Here Montaigne weighs the relative merits of the dogmatic and skeptical approaches to assent, embedding two direct quotations from Cicero’s Academica (II:12, F 373; VS 504). Following the distinction Cicero draws in Academica 2.8, Montaigne articulates the value of suspending judgment in terms of intellectual freedom (II:12, F 373; VS 504). Although Montaigne is admittedly discussing the value of Pyrrhonian skepticism in this context, he borrows Cicero’s language and emphasis on intellectual freedom as the defining result of the epoché (for a discussion of Montaigne’s blending of Academic and Pyrrhonian references, see Limbrick 1977; Eva 2013; and Prat 2017).

Although these passages in the “Apology” provide evidence for the influence of Cicero’s Academica on Montaigne’s skepticism, they also admittedly provide evidence for a critical view of the New Academy. As discussed above, within this exposition of skepticism, Montaigne voices explicit approval for the Pyrrhonians over the Academics. Although his characterizations of skepticism borrow significantly from Cicero, he uses these descriptions to present the Pyrrhonians in a more favorable light. Whether these statements should be taken as representative of Montaigne’s own voice, or whether they are part of a dialectical strategy, is discussed above.

Beyond the “Apology for Sebond,” we can see further examples of Montaigne’s debt to Academic skepticism. In contrast to the Pyrrhonian emphasis on the elimination of beliefs, Montaigne adopts skeptical strategies in ways that appear to accommodate the limited possession of beliefs. In this respect, Montaigne’s skepticism resembles the “mitigated” skepticism attributed to Cicero, whose “probabilism” permits the acquisition of tentative beliefs on a provisional basis (see Academica 2.7-9).

One example of the influence of Cicero’s mitigated skepticism can be seen in Montaigne’s discussion of education in chapter I: 26, “Of the education of children.” Given the prominent role of Cicero’s Academica in pedagogical debates in sixteenth century France, this context is hardly surprising (see discussion of Omer Talon above). Throughout “Of the education of children,” Montaigne articulates the goal of education as the formation of individual judgment and the cultivation of intellectual freedom (I: 26, F 111; VS 151). Montaigne recommends a practice that closely resembles the Academic method of argument in utramque partem as a means of attaining intellectual freedom. He recommends that the student weigh the relative merits of all schools of thought, lending provisional assent to the conclusions that appear most probable. The student should be presented with as wide a range of views as possible in the effort to carefully examine the pros and cons of each (I: 26, F 111; VS 151). The student should resist unqualified assent to any doctrine before a thorough exploration of the variety of available positions.

Montaigne presents this exercise of exploring all available positions as a means to attaining a free judgment (I: 26, F 111; VS 151). Through this emphasis on the freedom of judgment, Montaigne’s discussion of the nature and goals of education has clear resonances with that of his contemporary, Omer Talon. Like Talon, Montaigne presents skeptical strategies as a positive tool for cultivating intellectual freedom from authority rather than as a negative strategy for undermining unqualified assent to dogmatic knowledge claims. This emphasis on intellectual freedom and the freedom of judgment resonates more clearly with Ciceronian skepticism than with Pyrrhonism.

Montaigne’s appropriation of Academic skeptical strategies extends beyond his discussions of pedagogy. Outside the essay “Of the education of children,” Montaigne emphasizes the formation of judgment as a goal of his essay project, often referring to his Essays as the “essays of his judgment” (see II: 17, F 495; VS 653; II: 10, F 296; VS 407; and I: 50, F 219; VS 301-302). In “Of Democritus and Heraclitus,” for example, Montaigne writes: “Judgement is a tool to use on all subjects and comes in everywhere. Therefore in the essays [essais] that I make of it here, I use every sort of occasion. If it is a subject I do not understand at all, even on that I essay [je l’essaye] my judgment” (I: 50, F 219; VS 301-302). Rather than containing a finished product, or set of conclusions, the Essays embody the very activity of testing or “essaying” judgment (see La Charité 1968 and Foglia 2011 for the role of judgment in Montaigne’s thought).

Throughout the Essays, Montaigne tests or “essays” his judgment on a wide range of topics, attempting to explore these topics from all possible directions. At times he entertains the evidence for and against any given position in a manner that resembles the Academic method of argument in utramque partem. At other times, his method resembles the Pyrrhonian practice of counterbalancing opposing arguments, appearances, and beliefs. Although the “method” of the Essays shares aspects of both skeptical traditions, where it appears closer to Ciceronian skepticism is in Montaigne’s apparent acceptance of certain positive beliefs. Like Cicero, Montaigne appears to hold some beliefs that he accepts on a tentative and provisional basis. In this respect, his skepticism is closer to Cicero’s “mitigated” skepticism than to Sextus’ more radical skepticism that aspires to a life without beliefs.

Although the precise character and extent of Montaigne’s skepticism remain a topic of considerable scholarly debate, most commentators would likely agree on at least some version of the following points: Montaigne was deeply influenced by ancient skepticism and incorporates elements of this tradition into his own thought. Whatever the precise nature of this influence, Montaigne appropriates aspects of ancient skepticism in an original way that goes beyond what was envisioned by its ancient proponents. Montaigne’s essay form, for example, is just one way that he appropriates skeptical strategies toward new ends.

b. Pierre Charron

After Montaigne, Pierre Charron (1541-1603) is one of the most influential figures of Renaissance skepticism. Charron was a close friend and follower of Montaigne. He draws heavily on Montaigne and the Academic skeptical tradition in his major work, Of Wisdom (1601, 1604). According to Maia Neto, Charron’s Of Wisdom was “the single most influential book in French philosophy during the first half of the seventeenth century” (Maia Neto 2017).

In Of Wisdom, Charron expounds what he takes to be the core of Montaigne’s thought. He does so through a method and model of knowledge adopted from Academic skepticism. Charron’s indebtedness to the Academic skeptical tradition can be seen in his emphasis on intellectual freedom from authority and his idea that wisdom consists in the avoidance of error. Following certain Academic skeptics, Charron maintains that truth is not fully accessible to human beings (Maia Neto 2017). Instead, he argues that the truth is only fully available to God. Despite the inaccessibility of truth to human beings, Charron proposes that through the proper use of reason, we can nonetheless avoid error. In Charron’s view, it is the avoidance of error rather than the establishment of a positive body of knowledge that constitutes genuine wisdom. In this respect, he develops what Maia Neto calls a “critical rationalism not unlike that held earlier by Omer Talon and by Karl Popper in the twentieth century” (Maia Neto 2017).

c. Francisco Sanches

Along with Montaigne and Charron, the Iberian physician and philosopher Francisco Sanches (1551-1623) is one of the most notable thinkers associated with Renaissance skepticism. His skeptical treatise, That Nothing is Known (1581), sets out a detailed critique of Aristotelian epistemology drawing on familiar skeptical lines of attack. Sanches’ use of skepticism stands out from many of his predecessors and contemporaries insofar as he applies it to epistemological issues rather than strictly religious ones.

In That Nothing is Known, Sanches targets the Scholastic concept of scientia, or knowledge through necessary causes. Throughout this work, Sanches mobilizes skeptical arguments to attack several Aristotelian ideas, including the idea that particulars can be explained through universals (TNK 174-179) and the idea that the syllogism can generate new knowledge (TNK 181-182). Based on these critiques, Sanches concludes that the Aristotelian concept of scientia results in an infinite regress and is therefore impossible (TNK 195-196). We cannot have scientia of first principles or of any conclusions derived from first principles (TNK 199). It is in this sense that Sanches argues for the skeptical thesis suggested by his title.

Much like that of Montaigne, the precise character of Sanches’ skepticism is a topic of considerable debate. Some scholars maintain that Sanches’ skepticism was inspired by Pyrrhonism. This interpretation was first advanced by Pierre Bayle, who refers to Sanches as a “Pyrrhonian” skeptic in his 1697 Dictionary entry (Limbrick 1988; Popkin 2003). This interpretation finds support in Sanches’ use of skeptical arguments against sense perception as a criterion of knowledge, a strategy resembling the Pyrrhonian modes. One issue with this interpretation, however, is that many thinkers of Bayle’s time used the terms “Pyrrhonism” and “skepticism” interchangeably. Another issue with this interpretation is that there is no conclusive evidence that Sanches read Sextus Empiricus (Limbrick 1988).

For this reason, many scholars maintain that Sanches drew inspiration from Academic skepticism instead (Limbrick 1988; Popkin 2003). This interpretation finds support in the title of Sanches’ work—a clear reference to the skeptical thesis attributed to Arcesilaus. As further evidence of Sanches’ affinities with the New Academy, scholars often point to a letter to the mathematician Clavius (Limbrick 1988; Popkin 2003). In this letter, Sanches uses skeptical arguments to challenge the certainty of mathematical knowledge. He even signs his name as “Carneades philosophus,” explicitly associating himself with a famous representative of Academic skepticism.

Still others have argued that the Galenic medical tradition serves as another source of inspiration for Sanches’ skepticism. Elaine Limbrick, for example, shows that Sanches’ medical training was particularly influential for his skepticism and epistemology in general (Limbrick 1988). She argues that Galen’s emphasis on empirical observation and experiment was fundamental to Sanches’ rejection of Aristotelianism and his effort to develop a new scientific methodology (Limbrick 1988).

Although Sanches uses skeptical strategies in his attack on Aristotelian epistemology, he was not himself a thoroughgoing skeptic. While he concludes that the Aristotelian concept of scientia is impossible, he does not therefore conclude that all knowledge is impossible. One indication of this is that throughout That Nothing is Known, Sanches refers to other works, one of which deals with methodology, and another of which deals with the acquisition of positive knowledge of the natural world (TNK 290). Sanches appears to have intended these works to explain what knowledge might look like—specifically knowledge of the natural world—in the absence of scientia. Unfortunately, the fate of Sanches’ additional works on the positive acquisition of knowledge remains unknown. They were either lost or never published.

Although we can conclude that Sanches never intended That Nothing is Known to serve as a final statement on his own epistemology, we can only speculate as to what his positive epistemology might have looked like. Since Sanches uses skeptical arguments to undermine the Aristotelian conception of knowledge and pave the way for a different approach to knowledge of the natural world, Popkin and many others have characterized his skepticism as “mitigated” and “constructive” (Popkin 2003). Popkin goes further to argue that Sanches’ theory of knowledge would have been “experimental” and “fallibilist” (Popkin 2003). In this view, although Sanches uses skeptical strategies to undermine the Aristotelian conception of scientia, his ultimate goal is not to undermine the possibility of knowledge as such, but to show that in the absence of scientia, a more modest kind of fallible knowledge is nonetheless possible.

8. The Influence of Renaissance Skepticism

Renaissance skepticism had a considerable impact on the development of seventeenth-century European philosophy. Thinkers ranging from Descartes to Bacon developed their philosophical systems in response to these skeptical challenges (Popkin 2003). A close friend of Montaigne’s, Marie Le Jars de Gournay (1565-1645), for example, draws on skeptical arguments in her Equality of Men and Women (1641). In this work, Gournay deploys traditional skeptical strategies to draw out the logically unacceptable conclusions of arguments for gender inequality (O’Neill 2007). François La Mothe Le Vayer (1588-1672), often associated with the “free-thinking” movement in seventeenth-century France, also deploys skeptical strategies in his attacks on superstition (Popkin 2003; Giocanti 2001). Pierre Gassendi (1592-1655), known for his revival of Epicureanism, adopts skeptical challenges to Aristotelianism in his Exercises Against the Aristotelians (1624) and draws on the probabilism of the New Academy in his experimental and fallibilist approach to science (Popkin 2003). René Descartes (1596-1650) takes a methodical and hyperbolic form of skeptical doubt as the starting point in his effort to establish knowledge on secure foundations. Although Descartes uses skeptical strategies, he does so only in an instrumental sense, that is, as a tool for establishing a model of scientific knowledge that can withstand skeptical attack. Much like Descartes, Blaise Pascal (1623-1662) was both influenced by skeptics such as Montaigne and deeply critical of them. Although he arguably embraced a version of fideism that shared much in common with thinkers such as Charron and Montaigne, he also attacks these thinkers for their skepticism.

Much like Renaissance skepticism, post-Renaissance treatments of skepticism represent a diverse set of philosophical preoccupations rather than a unified school of thought. To the extent that a central distinction between Renaissance and post-Renaissance skepticism can be identified, it could be said that most Renaissance skeptics place a greater emphasis on debates concerning the criterion of religious truth, whereas most post-Renaissance skeptics place a greater emphasis on the application of skeptical arguments to epistemological considerations. Moreover, most Renaissance skeptics, much like their ancient counterparts, are explicitly concerned with the practical implications of skepticism. In other words, many of the representative figures of Renaissance skepticism are concerned not only with identifying our epistemic limitations, but with living well in response to those limits.

9. References and Further Reading

a. Primary Sources

  • Brués, Guy de. Dialogues: Critical Edition with a Study in Renaissance Scepticism and Relativism. Translated and edited by Panos Paul Morphos. Johns Hopkins Studies in Romance Literatures and Languages. Baltimore: Johns Hopkins Press, 1953.
    • Translation and commentary on Guy de Brués’ Dialogues in English.
  • Castellio, Sebastien. Concerning Heretics. Trans. and ed. Roland H. Bainton. New York: Columbia University Press, 1935. English translation.
  • Castellio, Sebastien. De Arte Dubitandi et confidendi Ignorandi et Sciendi. Ed. Elisabeth Feist Hirsch. Leiden: E. J. Brill, 1981.
  • Charron, Pierre. De la Sagesse. Corpus des Oeuvres de Philosophie en Langue Française, revised by B. de Negroni. Paris: Fayard, 1986.
  • Cicero, Marcus Tullius. Academica and De Natura Deorum. Loeb Classical Library, trans. H. Rackham, Cambridge: Harvard University Press, 1933.
  • Erasmus, Desiderius, and Martin Luther. Discourse on Free Will. Trans. and ed. Ernst F. Winter. Milestones of Thought. New York: Continuum, 1997.
    • English translation of Erasmus’ Free Will and Luther’s Bondage of the Will.
  • Henry of Ghent, “Can a Human Being Know Anything?” and “Can a Human Being Know Anything Without Divine Illumination?” in The Cambridge Translations of Medieval Philosophical Texts, Volume 3: Mind and Knowledge. Edited and translated by R. Pasnau, Cambridge University Press, 2002, pp. 93-135.
  • John Buridan, “John Buridan on Scientific Knowledge,” in Medieval Philosophy: Essential Readings with Commentary, G. Klima (ed.), Blackwell, 2007, pp. 143-150.
  • John Duns Scotus, Philosophical Writings, ed. and trans. Allan B. Walter, Cambridge: Hackett Publishing Co., 1987.
  • John of Salisbury, The Metalogicon of John of Salisbury, trans. with an introduction by Daniel McGarry, Berkeley: University of California Press, 1955.
    • The translation used in the quotations above, parenthetically cited as “ML” followed by page number.
  • John of Salisbury, Policraticus: Of the Frivolities of Courtiers and the Footprints of Philosophers, ed. C.J. Nederman, Cambridge: Cambridge University Press, 1990.
    • The translation used in the quotations above, parenthetically cited as “PC” followed by page number.
  • Montaigne, Michel de. Les Essais. Ed. Pierre Villey and V.-L. Saulnier. 3 vols., 2nd ed. Paris: Presses Universitaires de France, 1992.
    • The French edition used in the quotations above, parenthetically cited as “VS” followed by page number.
  • Montaigne, Michel de. Œuvres complètes. Ed. Albert Thibaudet and Maurice Rat. Paris: Gallimard, Bibliothèque de la Pléiade, 1962.
  • Montaigne, Michel de. The Complete Essays of Montaigne. Translated by Donald M. Frame. Stanford: Stanford University Press, 1943.
    • The English translation used in the quotations above, parenthetically cited as “F” followed by page number.
  • Montaigne, Michel de. The Complete Works of Montaigne: Essays, Travel Journal, Letters. Trans. Donald Frame. Stanford, Calif.: Stanford University Press, 1967.
  • Naya, Emmanuel. “Traduire les Hypotyposes pyrrhoniennes : Henri Estienne entre la fièvre quarte et la folie chrétienne.” In Le Scepticisme Au XVIe Et Au XVIIe Siècle. Ed. Pierre-François Moreau. Bibliothèque Albin Michel Des Idées. Paris: A. Michel, 2001.
    • Includes a French translation of Henri Estienne’s introductory essay to his translation of Sextus Empiricus’ Outlines of Skepticism.
  • Nicholas of Autrecourt, His Correspondence with Master Giles and Bernard of Arezzo: A Critical Edition and English Translation by L.M. de Rijk, Leiden: E.J. Brill, 1994.
  • Popkin, Richard H., and José Raimundo Maia Neto. Skepticism: An Anthology. Amherst, N.Y.: Prometheus Books, 2007.
    • Includes an English translation of Hervet’s introductory essay to his translation of Sextus’ Adversus Mathematicos, and excerpts from Gianfrancesco Pico della Mirandola’s Examination of the Vanity of the Doctrines of the Gentiles and of the Truth of the Christian Teaching (1520), among other sources.
  • Sanches, Francisco, That Nothing Is Known = (Quod Nihil Scitur), introduction, notes, and bibliography by Elaine Limbrick, and text established, annotated, and translated by D. F. S. Thomson. Cambridge; New York: Cambridge University Press, 1988.
    • Critical edition with an extensive introduction. Latin text and English translation. The English translation is parenthetically cited above as “TNK” followed by page number.
  • Sextus Empiricus, Outlines of Pyrrhonism, Loeb Classical Library, trans. R.G. Bury, Cambridge: Harvard University Press, 1933.
  • Sextus Empiricus, Adversus Mathematicos, Loeb Classical Library, trans. R.G. Bury, Cambridge: Harvard University Press, 1935.

c. Secondary Sources

  • Brahami, Frédéric. Le scepticisme de Montaigne. Paris: Presses Universitaires de France, 1997.
  • Brush, Craig B. Montaigne and Bayle:  Variations on the Theme of Skepticism. The Hague: Martinus Nijhoff, 1966.
  • Carraud, Vincent, and J.-L. Marion, eds. Montaigne: scepticisme, métaphysique, théologie. Paris: Presses Universitaires de France, 2004.
  • La Charité, Raymond C. The Concept of Judgment in Montaigne. The Hague: Martinus Nijhoff, 1968.
  • Copenhaver, B. P., & Schmitt, C. B., Renaissance Philosophy. Oxford: Oxford University Press, 1992.
  • Eva, Luiz, “Montaigne et les Academica de Cicéron,” Astérion, 11, 2013.
  • Floridi, Luciano. Sextus Empiricus: The Transmission and Recovery of Pyrrhonism. American Classical Studies. New York: Oxford University Press, 2002.
  • Foglia, Marc. Montaigne, pédagogue du jugement. Paris: Classiques Garnier, 2011.
  • Friedrich, Hugo. Montaigne. Edited by Philippe Desan. Translated by Dawn Eng. Berkeley: University of California Press, 1991.
  • Funkenstein, Amos. “Scholasticism, Scepticism, and Secular Theology,” in R. Popkin and C. Schmitt (eds.), Scepticism from the Renaissance to the Enlightenment, Wiesbaden: Harrassowitz. 1987 : 45-54.
  • Giocanti, Sylvia. Penser l’irrésolution : Montaigne, Pascal, La Mothe Le Vayer: Trois itinéraires sceptiques. Paris: Honoré Champion, 2001.
  • Grellard, Christophe. Jean de Salisbury et la renaissance médiévale du scepticisme. Paris: Les Belles Lettres, 2013.
  • Hartle, Ann. Michel de Montaigne: Accidental Philosopher. Cambridge: Cambridge University Press, 2003.
  • Hartle, Ann. "Montaigne and Skepticism" in The Cambridge Companion to Montaigne, ed. Langer, Ullrich. Cambridge: Cambridge University Press, 2005.
  • Hartle, Ann. Montaigne and the Origins of Modern Philosophy. Evanston: Northwestern University Press, 2013.
  • Lagerlund, Henrik. Rethinking the History of Skepticism: The Missing Medieval Background. Studien Und Texte Zur Geistesgeschichte Des Mittelalters; Bd. 103. Leiden; Boston: Brill, 2010.
  • Lagerlund, Henrik. Skepticism in Philosophy, a Comprehensive Historical Introduction. New York: Routledge, 2020.
  • Larmore, Charles. “Un scepticisme sans tranquillité: Montaigne et ses modèles antiques.” In Montaigne: scepticisme, métaphysique, théologie, edited by V. Carraud and J.- L. Marion, 15-31. Paris: Presses Universitaires de France, 2004.
  • Limbrick, Elaine, “Was Montaigne Really a Pyrrhonian?” Bibliothèque d’Humanisme et Renaissance 39, no. 1 (1977): 67-80.
  • Maia Neto, José Raimundo. “Academic Skepticism in Early Modern Philosophy.” Journal of the History of Ideas 58, no. 2 (1997): 199-220.
  • Maia Neto, José Raimundo & Richard H. Popkin (ed.), Skepticism in Renaissance and Post-Renaissance Thought: New Interpretations. Humanity Books, 2004.
  • Maia Neto, José Raimundo. "Le probabilisme académicien dans le scepticisme français de Montaigne à Descartes." Revue Philosophique De La France Et De L'Étranger 203, no. 4 (2013): 467-84.
  • Maia Neto, José Raimundo. “Scepticism” in Lagerlund, Henrik, and Benjamin Hill eds. Routledge Companion to Sixteenth Century Philosophy. New York: Routledge, 2017.
  • Naya, Emmanuel. “Renaissance Pyrrhonism, a Relative Phenomenon,” In Maia Neto J.R., Paganini G. (eds) Renaissance Scepticisms. International Archives of the History of Ideas, vol 199. Dordrecht: Springer, 2009.
  • O’Neill, Eileen. “Justifying the Inclusion of Women in Our Histories of Philosophy: The Case of Marie de Gournay.” In The Blackwell Guide to Feminist Philosophy (eds L.M. Alcoff and E.F. Kittay), 2007.
  • Paganini, G., & Maia Neto, J. R., eds., Renaissance Scepticisms. International Archives of the History of Ideas, vol 199. Dordrecht: Springer, 2009.
  • Paganini, Gianni. Skepsis. Le débat des modernes sur le scepticisme. Montaigne – Le Vayer – Campanella – Hobbes – Descartes – Bayle. Paris: J. Vrin, 2008.
  • Perler, Dominik. Zweifel und Gewissheit: Skeptische Debatten im Mittelalter. Frankfurt am Main: Klosterman, 2006.
  • Popkin, R. H., The History of Scepticism from Savonarola to Bayle. Oxford: Oxford University Press, 2003.
  • Prat, Sebastien. "La réception des Académiques dans les Essais: une manière voisine et inavouée de faire usage du doute sceptique" in Academic Scepticism in Early Modern Philosophy, eds. Smith, Plínio J., and Sébastien Charles. Archives Internationales D'histoire des Idées; 221. Cham, Switzerland: Springer, 2017: 25-43.
  • Schiffman, Zachary S. “Montaigne and the Rise of Skepticism in Early Modern Europe: A Reappraisal,” Journal of the History of Ideas, Vol. 45, No. 4 (Oct. – Dec. 1984).
  • Smith, Plínio J., and Sébastien Charles. Academic Scepticism in Early Modern Philosophy. Archives Internationales D’histoire des Idées; 221. Cham, Switzerland: Springer, 2017.
  • Schmitt, C. B., Gianfrancesco Pico Della Mirandola (1469–1533) and His Critique of Aristotle. The Hague: Nijhoff, 1967.
  • Schmitt, C. B., Cicero Scepticus: A Study of the Influence of the Academica in the Renaissance. The Hague: Nijhoff, 1972.
  • Schmitt, C.B., “The Rediscovery of Ancient Skepticism in Modern Times,” in Myles Burnyeat, ed., The Skeptical Tradition. Berkeley: University of California Press, 1983: 226-37.
  • Sève, Bernard. Montaigne: Des Règles Pour L’esprit. Philosophie D’aujourd’hui. Paris: PUF, 2007.
  • Villey, Pierre. Les Sources & L’évolution Des Essais De Montaigne. 1908. Reprint, Paris: Hachette & Cie, 1933.
  • Zupko, Jack. "Buridan and Skepticism." Journal of the History of Philosophy 31, no. 2 (1993): 191-221.

Author Information

Margaret Matthews
Email: margaret.matthews@villanova.edu
Villanova University
U. S. A.

George Berkeley: Philosophy of Science

George Berkeley announces at the very outset of Three Dialogues Between Hylas and Philonous that the goals of his philosophical system are to demonstrate the reality of genuine knowledge, the incorporeal nature of the soul, and the ever-present guidance and care of God for us. He will do this in opposition to skeptics and atheists.

A proper understanding of science, as Berkeley sees it, will be compatible with his wider philosophy in achieving its goals. His project is not to rail against science or to add to the scientific corpus. Quite the contrary: he admires the great scientific achievements of his day and has no quarrel with the predictive power, and hence the usefulness, of its best theories.

His project is to understand the nature of science, including its limits and what it commits us to. A proper understanding of science will show, for example, that it carries no commitment to material objects or efficient causation. Exposing this and other philosophical prejudices undercuts many of the assumptions that lead to skepticism and atheism.

In exploring the nature of science, Berkeley provides insights into several of the central topics of what is now called the philosophy of science. They include the nature of causation, the nature of scientific laws and explanation, the nature of space, time, and motion, and the ontological status of unobserved scientific entities. Berkeley concludes that causation is mere regularity; laws are descriptions of fundamental regularities; explanation consists in showing that phenomena are to be expected given the laws of nature; absolute space and time are inconceivable; and at least some of the unobserved entities in science do not exist, though they are useful to science. Each of these topics is explored in some detail in this article.

Table of Contents

  1. Background
  2. Causation
    1. Physical Causation
    2. Efficient Causation
  3. Laws of Nature
  4. Explanation
  5. Theories and Theoretical Entities
    1. Scientific Instrumentalism and Newtonian Forces
    2. Scientific Realism and Corpuscularianism
    3. Absolute Space and Motion
    4. General Anti-Realism Arguments
  6. References and Further Reading

1. Background

Philosophy of Science emerged as a specialized academic discipline in the mid-20th century, but philosophers as early as Plato and Aristotle developed views about science that we recognize today as addressing central topics of the discipline. Philosophy of Science addresses the nature of science including its methods, goals, and institutions. Recent Philosophy of Science has drawn heavily from the history and sociology of science (Marcum). Typical topics are the structure of explanation, theories, confirmation, the objectivity of science, the role of values in science, and the difference between science and pseudoscience. It is especially important to reflect on science since it appears to give us our very best examples of knowledge and our best tools for understanding nature.

Periods of significant scientific change, such as the introduction of general relativity and quantum mechanics or Darwin's theory of evolution, have provoked, and continue to provoke, heightened philosophical reflection. George Berkeley had the good fortune of living during one of these periods. Through a critique of Scholasticism (an amalgam of Aristotelianism and Catholicism), what is now recognized as the beginning of modern science emerged. The period ran roughly from 1550 to 1750. Its luminaries included Copernicus, Kepler, Galileo, Descartes, Boyle, Torricelli, and Newton. Berkeley had a broad understanding of the science of his day, including what we now call the psychology of visual perception, medicine, biology, chemistry, and physics. He also had a keen grasp of the mathematics of his time.

Building or elaborating scientific theories was not Berkeley’s goal. He had no quarrels with the empirical content of the best theories. He welcomed their usefulness in bettering our lives. His project was to critique mistaken philosophical interpretations and mistaken popularizations of some theories, especially those that led to skepticism and atheism. His philosophical system is largely a reaction to the materialistic mechanism as espoused by many scientists and philosophers, in particular, Descartes and Locke. Berkeley’s critique rejects a key provision of the theory: an ordinary object (apple or chair) is a material substance—an unthinking something that exists independently of minds. Berkeley’s ontology only includes spirits or minds and ideas. Our senses are to be trusted and all physical knowledge comes by way of experience (3Diii 238, DM ϸ21).

This is an oversimplification, but here is not the place to consider his arguments and qualifications for immaterialism (Flage).

In the course of his reaction to materialistic mechanism and other scientific theories, Berkeley made important and novel contributions to understanding concepts crucial to the nature of science: for example, causation, laws of nature, explanation, the cognitive status of theoretical entities, and space and time. His contribution to these topics is examined below. Berkeley's reflection on science occurs throughout his many works, from the Essay on Vision to Siris (S), but the bulk of his thought is contained in The Principles of Human Knowledge (PHK), Three Dialogues Between Hylas and Philonous (3D), and De Motu (DM). His views on the important topics mentioned continued to evolve throughout his writings, becoming more sensitive to actual scientific practice.

2. Causation

a. Physical Causation

Causal claims occur throughout ordinary language and science. Overcooking caused the chicken to be tough. Salt caused the freezing point of the water to fall. Diabetes is caused by insulin insufficiency. Causes, as commonly understood, make their effects happen. Many verbs in English, such as 'produce' or 'bring about', capture the "make happen" feature of causation.

Berkeley’s account of causation plays a central role in his philosophical system and his understanding of the methods, goals, and limits of science. Take the example of fire causing water to boil. When one examines the case, according to Berkeley, ideas of yellow, orange, and red in shimmering motion are followed by ideas of a translucent bubbly haze. In short, one set of ideas is accompanied by another set of ideas. The crucial point is that no “making happen” or “producing” is available to the senses.

All our ideas, sensations, or the things which we perceive, . . . are visibly inactive: there is nothing of power or agency included in them. So that one idea or object of thought cannot produce or make any alteration in another. To be satisfied of the truth of this, there is nothing else requisite but a bare observation of our ideas. (PHK ϸ25)

The basic argument is as follows:

    a. Efficient causes are active.
    b. Ideas are inert (inactive).
    c. Therefore, ideas are not efficient causes.

The justification for premise b is d.

    d. Ideas when observed are found to be inert.

Ideas do undergo changes and we do have ideas of motion, but none of this counts as activity for Berkeley. What constitutes activity in an idea? Could there not be some feature or aspect of ideas that is hidden from sense, some feature that is active? Berkeley's answer is no.

    e. Ideas exist only in the mind.
    f. Therefore, there is nothing in them but what is perceived.

Causation in the physical world amounts to one set of ideas regularly followed by another set of ideas. Berkeley uses a variety of terms to mark the contrast with efficient causation: 'natural causes,' 'second causes,' 'material causes,' 'instruments,' 'physical causes,' and 'occasional causes' (S ϸ160, 245; PC Berkeley to Johnson ϸ2). There is no necessary connection between the relata in a causal relation. Berkeley suggests that a better way to conceive of the regularity among ideas in a "causal" relation is as that of signs to things signified. Fire is a sign of boiling water. Moreover, signs do not make happen what they signify. The appropriateness of the sign/thing-signified relation is further explored in a later section.

This account does not fit our common understanding of causation. Berkeley recognizes this and has no desire to propose that we speak differently in ordinary affairs. In fact, he often lapses into the vernacular. Our common speech presents no problems in ordinary practical affairs, but the philosopher, when being careful, knows that physical causes do not make their effects happen.

b. Efficient Causation

There is a domain where real or efficient causes occur, as opposed to the mere physical regularities described above. When one intends to raise her arm and by force of will raises it, this stands as an example of efficient causation. Real causation is carried out by an act of mind. Considering the example, Berkeley believes we know this is efficient causation, satisfying the activity requirement for causation, though he thinks we have no sensible idea of it.

Returning to physical causation, the regularities among ideas are created and maintained by God's will. Although we as creatures with minds have the ability to will certain ideas, many ideas are forced upon us, independently of our will. These are caused by God.

An important consequence of the distinction between physical and efficient causes is what natural philosophy should and should not study. Natural philosophy should focus on understanding the world in terms of physical causes. Efficient causation is the business of theology and metaphysics. Only these disciplines should consider explanations invoking efficient causation (DM ϸ41).

It is not known to what extent Berkeley influenced David Hume. Hume, the third member of the British Empiricists along with John Locke and Berkeley, developed a more detailed version of a regularity theory of causation. Though Berkeley denies any necessary connection between the causal relata in physical causation, he provides no account of our strong tendency to believe that the relation between the relata is more than constant conjunction. For Hume, the power or necessity in causation is produced from our experience; it is in us not in the objects themselves. He also speaks to the temporal and spatial requirements for the relation between cause and effect and considers what counts as an appropriate regularity (Lorkowski). Hume’s theory is importantly different from Berkeley’s in that he holds that all causation is mere regularity. Acts of the will are no exception. Using Berkeley’s terminology, on Hume’s account, all causes are physical causes.

3. Laws of Nature

The early account of laws of nature in The Principles of Human Knowledge treats them as the regularities discussed under causation:

The ideas of sense . . . have likewise a steadiness, order and coherence, and are not excited at random, . . . but in a regular train or series . . . Now the set rules, or established methods, wherein the Mind we depend on excites in us the ideas of Sense, are called the laws of nature; and these we learn by experience, which teaches us that such and such ideas are attended with such and such other ideas, in the ordinary course of things (PHK ϸ30).

The same account is repeated in the Three Dialogues. Laws are “no more than a correspondence in the order of Nature between two sets of ideas, or things immediately perceived” (3Diii 24).

Here laws of nature are low-level empirical generalizations that assert a regularity between phenomena or aspects of phenomena. They are learned by experience by natural philosophers and ordinary people alike and are found useful in guiding their lives. Berkeley emphasizes that the relation between the relata in a law of nature is not a necessary relation. God has conjoined smoke with fire, but he could have conjoined fire with a high-frequency tone or anything else He pleased. Though Berkeley is not explicit on this matter, it does appear that laws of nature are not restricted to a universal logical form, that is, the form: whenever phenomena of type A occur, phenomena of type B occur, without exception. Statements expressing probabilities count as laws as well. So, both "breeding albatrosses lay one egg per year" and "most people with lung cancer are smokers" are laws. Berkeley persistently stresses that the important feature of laws is that they are useful. For Berkeley, this usefulness attests to the wisdom and benevolence of God, who has created and maintains them.

In addition to laws of modest scope whose terms refer to what is immediately perceived, “there are certain general laws that run through the whole chain of natural effects . . .” (PHK ϸ61).  An example is Galileo’s Law: Any body falling from rest freely to earth covers a distance proportional to the square of the time it has spent falling. These general laws and sets of general laws such as Newton’s Laws of Motion provide a “largeness of comprehension” that occupy the attention of the natural philosopher. They enable one to see the unity in apparently diverse phenomena. For example, the unity in falling bodies, tides, and planetary orbits. Some very general and fundamental laws allow for the explanation of other laws.

In mechanical philosophy those are to be called principles, in which the whole discipline is grounded and contained, these primary laws of motion which have been proved by experiments, elaborated by reason and rendered universal. These laws of motion are conveniently called principles, since from them are derived both general mechanical theorems and particular explanations of phenomena (DM ϸ36).

These more fundamental laws are no longer simple correlations or inductive generalizations perceived and learned by experience. Instead, they are laws of great generality containing theoretical terms (such as “force”) and proved by experiments.
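Galileo's Law, mentioned above as an example of a general law, lends itself to a short numeric sketch. Assuming g = 10 m/s² (the round value the article uses later in its pendulum example; the function name is illustrative):

```python
def distance_fallen(t, g=10.0):
    """Galileo's Law: a body falling from rest covers d = (1/2)*g*t**2,
    so the distance is proportional to the square of the elapsed time."""
    return 0.5 * g * t ** 2

# Doubling the elapsed time quadruples the distance, as the law predicts.
print(distance_fallen(1.0))  # 5.0 meters
print(distance_fallen(2.0))  # 20.0 meters
```

The "largeness of comprehension" Berkeley praises lies in the fact that one such formula covers every freely falling body, whatever its particular time of fall.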

4. Explanation

To explain phenomena, they must be "reduced to general rules" (PHK ϸ105) or alternatively be shown to conform to the laws of nature (PHK ϸ61). This account is a very early version of what is now called the covering law account of explanation. The sentence describing the phenomenon or event to be accounted for is called the explanandum. The sentences describing the information that does the explaining are called the explanans. According to Berkeley's account, the explanans must contain a law of nature (DM ϸ37). It will typically also contain sentences describing a number of facts. Consider a simple example that would have been quite familiar to Berkeley: Suppose a pendulum oscillates with a period of 6.28 seconds. The explanandum, a period of 6.28 seconds, must be shown to be in conformity with a law. The relevant law is T=2π√(L/g) where T is the period, L is the length of the pendulum, and g is the acceleration due to gravity (10 meters per second squared). If L is 10 meters, the period will be 6.28 seconds. The explanandum follows deductively from the explanans. The length of the pendulum being 10 meters, together with the law cited, explains the period being 6.28 seconds.

Explanans (1) T = 2π√(L/g)
(2) L = 10 meters
————————
Explanandum T = 6.28 seconds

There is nothing puzzling about the period once the law and pendulum length are known. The period was to be expected.

Figure 1. Diagram of simple pendulum.
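The covering-law derivation above can be sketched as a small computation, using the article's figures (g = 10 m/s², L = 10 m); the function name is an illustrative choice:

```python
import math

def pendulum_period(length, g=10.0):
    """Covering law for a simple pendulum: T = 2*pi*sqrt(L/g)."""
    return 2 * math.pi * math.sqrt(length / g)

# Particular fact: L = 10 meters. The explanandum follows deductively.
T = pendulum_period(10.0)
print(round(T, 2))  # 6.28 seconds, as in the example
```

Once the law and the length are supplied, the period is simply what the law yields; nothing further is puzzling about it.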

An important difference between the contemporary covering law account of explanation and Berkeley’s version is that the contemporary account requires that the sentences making up the explanans, including the law(s), be true (Hempel 89-90). As discussed in the next section, Berkeley regards some laws of nature, most notably Newton’s laws of motion, as neither true nor false. They are not the sort of things that can be true or false. They are guides, calculating devices, and useful fictions. This is not to disparage them. Berkeley regards Newton’s laws as the greatest achievement in natural philosophy and a model for future science (PHK ϸ110, S ϸ243, 245). The role of laws is to enable us to expect what will happen. Newton’s laws are remarkably successful at this goal.

Berkeley argues that the goal of science is not necessarily to uncover true laws, nor will true laws be better at helping us expect phenomena. The goal of mature science is to produce general laws that are easy to use, few in number, and give predictive control of a wide range of phenomena. The virtue of laws, and of the explanations they enable, is serving these practical goals. His insight is that true laws may be in tension with these practical virtues: true laws may be too complex, too cumbersome to apply, or too numerous to serve the practical goal of simplicity. The first objective of laws and explanations is usefulness.

The covering law account of explanation has received a range of criticisms. This is not the place to rehearse these criticisms and evaluate their force. But there is one prominent criticism that deserves consideration. Seeing how Berkeley would respond brings together his positions on causation, laws of nature, and explanation.

Consider the pendulum example again: Intuitively there is an asymmetry between explaining the period in terms of the length of the pendulum versus explaining the length in terms of the period. L explains T. T does not explain L, but T can be calculated from L and L can be calculated from T. Using Berkeley's position on how explanations make phenomena intelligible, given L, T is expected, and given T, L is expected. So, it appears that the covering law view of explanation cannot account for the asymmetry. The covering law view lets in too much. It sanctions that T explains L, yet this conflicts with strong intuitions. The problem is not merely an artifact of the pendulum case. It arises with many natural laws, including Boyle's Law, Ohm's Law, and the laws of geometric optics, along with others.
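The calculational symmetry the objection exploits is easy to exhibit: the same law, solved for L, lets the period "predict" the length. A sketch using the article's numbers, with the 6.28-second period taken exactly as 2π seconds:

```python
import math

g = 10.0  # acceleration due to gravity, as in the pendulum example

def length_from_period(T):
    """The pendulum law T = 2*pi*sqrt(L/g) solved for L: L = g*(T/(2*pi))**2."""
    return g * (T / (2 * math.pi)) ** 2

# Given the period, the law yields the length, just as the length
# yields the period: the calculation runs equally well in both directions.
L = length_from_period(2 * math.pi)
print(L)  # 10.0 meters
```

The law itself supplies no asymmetry; any asymmetry must come from somewhere else, which is precisely the issue the next paragraphs take up.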

In response to this, Berkeley would insist that there are no efficient causes in nature. The alleged asymmetry is a relic of the mistaken view that the length of the pendulum causes its period, but the period does not cause the length of the pendulum. Causal relations and laws of nature describe regularities, not what makes things happen:

. . . the connexion of ideas does not imply the relation of cause and effect, but only the mark of sign and thing signified.  The fire which I see is not the cause of the pain I suffer upon my approaching it. In like manner the noise that I hear is not the effect of that motion or collision . . . , but the sign thereof (PHK ϸ65).

In customary talk, there may be an asymmetry where causes can explain effects but not vice versa, but when efficient causation is replaced with regularities between sign and thing signified, the asymmetry disappears. “Causes” can be signs of “effects” and, as in the above quotation, “effects” can be signs of “causes”. Noise is the sign of a collision.

The Berkeleyan defense of the covering law account rests on the claim that the way in which explanations make phenomena intelligible is by giving one reason to expect them or to calculate their occurrence (PHK ϸ31, S ϸ234). This is undoubtedly Berkeley’s official position. Carl Hempel, the leading contemporary defender of the covering law account of explanation, would agree with Berkeley on the point of explanation and how to handle the asymmetries. The asymmetries according to Hempel are due to “preanalytic causal and teleological ideas” (Hempel 95). These ideas are hardly the basis for a systematic and precise analysis of explanation.

In De Motu Berkeley hints at a very different account of how explanations make phenomena intelligible:

For once the laws of nature have been found out, then it is the philosopher’s task to show that each phenomenon is in constant conformity with those laws, that is, necessarily follows from those principles. In that consist the explanation and solution of phenomena and assigning their cause, i. e. the reason why they take place (DM ϸ37).

There are two issues of concern here: 1) Berkeley asserts that the explanandum must follow necessarily from the explanans. This is inconsistent with allowing statistical laws in explanations. As has been suggested, there is no reason Berkeley cannot allow this. God created and maintains the laws of nature to help us know what to expect. Their practical nature is well served by statistical laws. 2) Much more importantly, he invokes a different rationale for how explanations make phenomena intelligible. There is a significant difference between providing grounds for expecting or calculating events and providing “the reason why they take place.” In the pendulum example, the period allows for the calculation of the length, but it does not provide the cause or reason why it is 10 meters. That rests with the designer of the pendulum or the manufacturing process.

Perhaps Berkeley has misspoken or is speaking not as a philosopher, or perhaps he is under the spell of the very view of causation he has rejected. If Berkeley wants to maintain the requirement that explanations tell us why events take place, he will need an account of the asymmetry discussed. Of course, he must do this without appeal to efficient causation. There are numerous ways to do this. For one, the length of the pendulum can be given a covering law explanation independently of the period, but an explanation of the period appears to require appeal to the length of the pendulum (Jobe). This suggestion, and others, need careful development, including an account of their relevance to the larger issue of explanation. The point here is that answers to the asymmetry problem might be available that do not invoke efficient causation.

5. Theories and Theoretical Entities

a. Scientific Instrumentalism and Newtonian Forces

Much of De Motu is an argument for how to understand the status of forces in Newton’s theories of motion and gravitation. In the first section Berkeley warns the reader of “. . . being misled by terms we do not rightly understand” (DM ϸ1). The suspect terms at issue occur in the science of motion. They fall into two groups: The first includes ‘urge,’ ‘conation,’ and ‘solicitation.’ These play no role in the best accounts of motion and have no legitimate role in physical science. They are “of somewhat abstract and obscure signification” (DM ϸ2) and on reflection clearly apply solely to animate beings (DM ϸ3). The second group includes ‘force,’ ‘gravitation,’ and allied terms. Berkeley’s attention is focused on this group. He expresses a worry about these terms by way of an example. When a body falls toward the center of the earth it accelerates. Some natural philosophers are not satisfied with simply describing what happens and formulating the appropriate regularity. In addition, a cause of the acceleration is assigned—gravity.

A major motivation for Berkeley writing De Motu was to resist treating forces and gravitation as efficient causes. Some of Newton’s followers and perhaps Newton himself held this view. Given the prestige of Newton’s physics, it was particularly important for Berkeley to respond. Treating forces as efficient causes would undermine Berkeley’s immaterialism, but Berkeley is not merely defending his own philosophical territory. Regardless of one’s commitment, or lack of it, to immaterialism, Berkeley raises significant issues about forces.

One could simply argue that there are no forces. So, force-talk should be abandoned. This would certainly rid the scene of forces as causes. Much the same has happened with caloric, phlogiston, ether, and witches. The terms have disappeared from highly confirmed theories along with any causal role assigned to the entities. Berkeley’s view is more subtle than this. His general thesis is that “force,” “gravity,” and allied terms lack the significance required to indicate the real nature of things. The terms are not meaningless, as they have a useful role to play in scientific theories, but they lack the sort of significance needed to support a realistic understanding of forces. They fail to indicate distinct entities or qualities.

Lisa Downing has detailed Berkeley’s argument for an anti-realistic understanding of forces (Downing 1996, 2005 238-249). The key premise is as follows:

P. Forces are unknown qualities of bodies, that is, unsensed.

From this he concludes:

C. Force terms (‘force,’ ‘gravity,’ ‘attraction’) fail to indicate (refer to) distinct qualities.

Though Berkeley takes P as obvious, he does have an argument for it. Forces as efficient causes are active qualities of bodies. They must be unsensed, for on careful examination all the sensed qualities of bodies are passive.

What licenses the move from P to C? Naming or referring to forces requires conceiving of forces. To conceive of physical entities requires a sense-based idea of them (Downing 2005 247).

Berkeley does not hold that all meaningful words stand for ideas. This view, often attributed to John Locke, is aggressively criticized by Berkeley (Pearce 194-196). Words need not bring a distinct idea to the speaker's or hearer's mind. In fact, force terms are meaningful without standing for ideas. Their significance comes from their usefulness in Newtonian dynamics. A system of mathematical rules employing force terms allows for precise predictions. This is accomplished even though the terms lack the kind of significance needed to secure reference. With 'force' failing to name anything, forces cannot be understood realistically.

Berkeley’s examination of forces is not only destructive. He had a great appreciation of the explanatory success of Newtonian dynamics. He saw that force terms play an important role in the theory. He interprets those terms instrumentally. They do not “indicate so many distinct qualities,” but they are useful in reasoning about motion:

Force, gravity, attraction and terms of this sort are useful for reasonings and reckonings about motion . . . As for attraction, it was certainly introduced by Newton, not as a true physical quantity, but only as a mathematical hypothesis (DM ϸ17).

Berkeley gives perspicuous illustrations of what he means by mathematical hypotheses and being useful in reasoning. The first, occurring just after the above quote, concerns the parallelogram of forces. This mathematical technique allows for the computation of the resultant force. But this force is not proffered as a "true physical quantity," though it is very useful for predicting the motion of bodies (DM ϸ18). The second illustration reminds us of how considering a curve as an infinite number of straight lines (though it is not in reality) can be of great utility. For instance, it allows a geometrical proof of the common formula for the area of a circle, A = πr², and in mechanics it is also useful to think of circular motion as "arising from an infinite number of rectilinear directions" (DM ϸ61).

Figure 2

For numerous practical purposes a circle can be regarded as composed of many straight lines.
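Berkeley's circle illustration can be made concrete: treat the circle as an inscribed regular polygon of n straight sides and sum the areas of the resulting triangles. As n grows, the result approaches A = πr². The triangle decomposition is one natural way to formalize the "infinite number of straight lines," not Berkeley's own construction:

```python
import math

def polygon_area(r, n):
    """Area of a regular n-sided polygon inscribed in a circle of radius r.
    The polygon splits into n triangles, each of area (1/2)*r**2*sin(2*pi/n)."""
    return 0.5 * n * r ** 2 * math.sin(2 * math.pi / n)

# Treating the circle as more and more straight lines approaches pi*r**2.
for n in (6, 60, 600, 6000):
    print(n, round(polygon_area(1.0, n), 6))
```

The fiction of straight sides never becomes literally true, yet it delivers the correct area in the limit, which is exactly the kind of usefulness Berkeley claims for mathematical hypotheses.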

b. Scientific Realism and Corpuscularianism

Corpuscularianism was the dominant theoretical framework for the physical sciences in the 17th century.  The basic position is a form of atomism. Bodies are material objects existing independently of human minds and composed of minute particles (corpuscles) that are unobservable. Their properties are restricted to size, shape, position, and motion (the primary qualities). Corpuscles explain the properties of bodies including their color, smell, temperature, and sound (the secondary qualities).

Given the prominence of the corpuscularian theoretical framework and Berkeley’s intimate familiarity with the works of many of the theory’s proponents (notably René Descartes, Robert Boyle, and John Locke), it is appropriate to ask how he understood the status of the framework’s fundamental entities—corpuscles. The received view has been that Berkeley must hold instrumentalism for all theoretical entities (Popper; Warnock 202; Newton-Smith 152; Armstrong 32-34). This position is encouraged by at least two considerations: (1) When Berkeley explicitly addresses the cognitive status of theoretical entities, it is always to argue against realism. He never offers arguments for a realistic understanding of some theoretical entities. (2) Berkeley’s immaterialist maxim, esse est percipi (to be is to be perceived), was thought to be incompatible with realism for theoretical entities.

More recent scholarship attempts to show that a realistic understanding of corpuscles is compatible with Berkeley’s wider philosophical position, if not embraced by him (Downing 1995, 2005 230-235; Garber; Winkler 238-275). Berkeley’s immaterialist version of corpuscularianism must be qualified in several important ways: First, corpuscles are not bits of matter that are mind independent. They are sets of ideas just as ordinary objects are. Second, corpuscles do not cause anything, but they can be signs of things signified. Third, Berkeley does not endorse the primary/secondary quality distinction. The ideas that make up corpuscles have the same range of qualities as the ideas that make up ordinary objects. This does not prohibit him from recognizing that the primary qualities may be more useful in formulating laws with predictive power. Fourth, corpuscles are in principle sensible. This qualification was accepted by many practicing corpuscularian scientists. Sensing corpuscles is neither logically nor scientifically impossible. It allows a response to the charge that esse est percipi rules out a realistic account of corpuscles.

At the beginning of the Principles, Berkeley spells out his account of ordinary physical objects—apples, stones, books, and so forth. A group of ideas are “observed to accompany each other,” given a name, and regarded as one thing (P §1). An apple has a certain odor, color, shape, and texture associated with it. Berkeley immediately recognizes a problem. If things are sets or bundles of ideas, what happens to the existence of things when not sensed? “The table I write on I say exists; that is, I see and feel it: and if I were out of my study I should say it existed; meaning thereby that if I was in my study I might perceive it . . .” (P §3). The counterfactual account is not needed just to explain the continuity of physical objects when unsensed. Apples have a backside and a core. When held in one’s hand, only a part of the apple is seen. But under certain conditions, according to Berkeley, one would see the backside and the core. Consider an apple that has fallen from a tree and rolled under leaves, never to be sensed by anyone. Quite plausibly there are such apples. Again, Berkeley can use his counterfactual analysis to deal with their existence: if one were walking through the orchard and removed the leaves, she would perceive the apple. This account of the continuity of ordinary objects is clear, but unfortunately it appears to violate common sense—something Berkeley claims to champion. Berkeley’s table goes in and out of existence. To say he would see it when he enters his study is not to say it exists when he is absent from his study. Berkeley sees this as problematic and considers various approaches to continuity in his writings. There is disagreement among scholars about what Berkeley’s preferred position is and about what position fits best with the core principles of his immaterialism (Pitcher 163-179; Winkler 207-244).

In the Three Dialogues, Berkeley hints at a position that both elaborates the counterfactual account and speaks directly to what entities actually exist. Hylas, the spokesperson for materialism, claims that immaterialism is incompatible with the scriptural account of creation. Everything exists eternally in the mind of God; hence everything exists from eternity. So how can entities both exist from eternity and be created in time? Berkeley agrees with Hylas that nothing is new or begins in God’s mind. The creation story must be relativized to finite minds. What actually exists is what God has decreed to be perceptible in accord with the laws of nature. He has made his decrees in the order of the biblical account. If finite minds would have been present, they would have had the appropriate perceptions (3Diii 253).

Obviously, God has decreed that apples are perceivable by finite minds. Given the laws of nature, the core, the backside, and buried apples would be perceived provided one is in the right location. Once God has decreed that something is perceivable, the relevant counterfactuals are supported by the laws of nature, which God created and maintains.

Berkeley’s account is situational. It depends on the observer being in the right place at the right time, with no barriers interfering with the light and with well-working visual faculties. If corpuscles exist, God has decreed that they are observable under certain conditions. Perhaps corpuscles are analogous to the apple under the leaves. Though neither has been observed, they are both observable in principle. Observing the buried apple requires removing the leaves. Observing corpuscles requires being in the right place with a sufficiently powerful microscope. It is not required that the appropriate microscope ever be invented. Economic conditions, for example, may prevent its development. What is required is that such a microscope be scientifically possible.

The analogy is not perfect. First, in the 18th century, some apples had been observed; no corpuscles had been observed. Second, a special apparatus is not required to see apples. Seeing corpuscles requires a very powerful microscope.

The fact that apples have generic observability (some apples have been observed) whereas no corpuscles have been observed will be damning only if it provides a reason for thinking corpuscles inconceivable. As discussed below, it does not. The need for a special apparatus in the case of corpuscles can be answered. Surely eyeglasses are a permissible apparatus. The principles by which light microscopes work are known; they work in basically the same way eyeglasses do. Microscopes do not merely enable one to detect some entity or see its effects; they, like eyeglasses, enable one to see the entity.

This raises the question of how corpuscles can be treated realistically when forces cannot. In both cases they are unsensed. There are two important differences for Berkeley: (1) forces are unperceivable in principle whereas corpuscles are not; (2) corpuscles can be imagined, and forces cannot be. For Berkeley, imagining is a kind of inner perceiving. Images are constructed by us from ideas that are copies of ideas originally “imprinted on the senses” (PHK §1). One can imagine elephants with train wheels for legs moving about on tracks. Similarly, scientists can imagine corpuscles as tiny objects with a certain shape, size, and texture. Berkeley does not think a construction of any sort is available for forces (DM §6). So, though no corpuscles have been perceived, they are conceivable, and the term ‘corpuscle’ is not without meaning.

The textual evidence for Berkeley endorsing corpuscularianism comes from the Principles (§60-66), where Berkeley answers a particular objection to his philosophy. What purpose do the detailed mechanisms of plants and animals serve when those mechanisms are ideas caused by God and have no causal power? Similarly, why the inner wheels and springs of a watch? Why does God not simply have the hands turn appropriately without internal complexity?

Berkeley’s answer is that God could do without the inner mechanisms of watches and nature, but he chooses not to, so that their behavior is consonant with the general laws that run throughout nature. These laws, manageable in number, have been created and maintained by God to enable us to explain and anticipate phenomena. A world without internal mechanisms would be a world whose laws of nature were so numerous as to be of little use.

Berkeley describes the mechanisms as “elegantly contrived and put together” and “wonderfully fine and subtle as scarce to be discerned by the best microscope” (P §60). Admittedly he does not explicitly mention corpuscularian mechanisms, but Garber (182-184) gives several reasons for thinking Berkeley included them. Nowhere does Berkeley deny the existence of the subtle mechanisms or suggest that they should be treated instrumentally. His descriptions of the mechanisms often mirror those of John Locke speaking of corpuscles. Perhaps most importantly, if the science of Berkeley’s day is to explain various phenomena, including heat, combustion, and magnetism, it must refer to hidden mechanisms, including corpuscles.

Siris, Berkeley’s last major work, provides textual support for corpuscularian realism. Siris is a peculiar work. Much of it is devoted to praising the medicinal virtues of tar water (a mixture of pine tar and water) and explaining the scientific basis for its efficacy. The latter project explores parts of 18th-century chemistry, drawing on a number of corpuscularian hypotheses. The key point is that Berkeley never raises anti-realist concerns about the relevant entities, even while affirming his immaterialism and pointedly repeating the instrumental account of Newtonian forces found in De Motu (Downing 205).

Figure 3: Cartesian diagram showing how screw shaped particles accounted for magnetism.

Berkeley’s familiarity with the advances in microscopy provides further indirect support for immaterialistic corpuscularianism. Berkeley knew that there were many entities that were unobservable at one time and later became observable. There was no reason to believe that progress in microscope technology would not continue revealing further mechanisms. In fact, some of Locke’s contemporaries believed that microscopes would improve to a point where corpuscles could be seen.

The general point, one supporting realism, is that mere current unobservability does not speak against realism. To the contrary, the progressive unveiling of nature supports realism.

If Berkeley is a scientific realist about corpuscles, aether, and other entities, this might explain his lack of an argument for realism. He thought that all that was valuable in the best science was compatible with immaterialism. Immaterialism combined with realism about theoretical entities was perhaps regarded as the norm; the outlier is Newtonian forces, which require special argument.

c. Absolute Space and Motion

Absolute motion and absolute space are understood by Berkeley neither realistically nor instrumentally. He recommends that natural philosophers dismiss the concepts; relative space and motion will more than adequately serve the purposes of physics. The debate about absolute motion and space has a long and complex history. Berkeley’s critique is often regarded as an anticipation of that of Ernst Mach (Popper).

According to Newton, absolute space “. . . in its own nature and without regard to anything external, always remains similar and immovable.” Absolute space is not perceivable. It is known only by its effects. It is not a physical object or a relation between physical objects. It is a “container” in which motions occur. Absolute motion is the motion of a physical object with respect to absolute space. Relative space, as Berkeley understood it, is “. . . defined by bodies; and therefore, an object of sense” (DM §52). Relative motion requires at least two bodies. One body changes its direction and distance relative to another body (DM §58). If all bodies were annihilated but one, it could not be in motion.

Newton had many reasons, including theological ones, for endorsing absolute space. In Newtonian physics a special frame of reference must be stipulated in order to apply the laws of motion. There are many possible frames of reference—the earth, the sun, our galaxy, and so on. Are they all equally adequate? A falling object will have a different acceleration and trajectory depending on the chosen reference frame. The differences may be slight and of minimal practical importance, but they present a significant theoretical problem. If Newton’s laws are to apply in every reference frame, various forces will need to be postulated from frame to frame. This appears ad hoc and leads to great complexity. To blunt the problem, Newton thought a privileged frame was needed—absolute space (Nagel 204-205).

Berkeley argued against Newton’s position from his early writings in the Notebooks through the Principles of Human Knowledge and De Motu. As with forces, he wanted to reject absolute space as an efficient cause, but he also had theological motivations. He found abhorrent the view that absolute space necessarily exists, is uncreated, and cannot be annihilated; it put absolute space in some respects on the level of God. Nevertheless, Berkeley’s arguments against absolute space do not involve theological principles. The focus here is on the critique in De Motu, Berkeley’s last and most thorough treatment of the topic.

Berkeley has two lines of criticism of absolute space and, in turn, absolute motion. The first is a general argument from his theory of language; the second responds to Newton’s demonstration of absolute space. On the first line of criticism, imagine all bodies in the universe being destroyed. Supposedly what remains is absolute space. All its qualities (infinite, immovable, indivisible, insensible, and without relation and distinction) are negative qualities (DM §53). There is one exception: absolute space is extended, a positive quality. But Berkeley asks what kind of extension it is that can be neither measured nor divided nor sensed nor even imagined. He concludes that absolute space is pure negation, a mere nothing. The term “absolute space” fails to refer to anything since it is neither sensible nor imaginable (DM §53). This reasoning is similar to the argument against forces, though absolute space, unlike force, has no instrumental value in theorizing.

In the second line of criticism, two thought experiments of Newton designed to demonstrate the existence of absolute space and motion are examined. Though Newton admitted that absolute space was insensible, he thought it could be known through its effects. It was essential that Berkeley take up these experiments. Even though the first line of criticism showed, if cogent, that ‘absolute space’ fails to name anything in nature, further argument was required to show that it was not needed, even instrumentally, for an adequate physical account of motion.

The first thought experiment involves two globes attached by a cord spinning in circular motion. No other physical bodies exist. There is no relative motion of the globes, but there is a tension in the cord. Newton believes the tension is a centrifugal effect and is explained by the globes being in motion with respect to absolute space. Berkeley’s response is to deny the conceivability of the experiment. The circular motion of the globes “cannot be conceived by the imagination” (DM §59). In other words, given Newton’s description of the experiment there can be no motion of the globes. Berkeley then supposes that the fixed stars are suddenly created. Now the motion of the globes can be conceived as they approach and recede from different heavenly bodies. As for the tension in the cord, Berkeley does not speak to it. Presumably, there is no tension or motion until the stars are created.

In the much-discussed second thought experiment, a bucket half-filled with water is suspended from a tightly twisted cord. In Phase 1 the bucket is released and starts spinning: the surface of the water remains a plane, and the sides of the bucket accelerate relative to the water. In Phase 2 the rotating water catches up with the bucket sides and is at rest relative to them; now the surface of the water is concave, having climbed the sides of the bucket. In Phase 3 the bucket is stopped: the water remains concave and is accelerated relative to the sides of the bucket. In Phase 4 the water ceases to rotate and is at rest relative to the sides.

On Newton’s understanding, the shape of the water does not depend on the water’s motion relative to the sides of the bucket. It is a plane in Phase 1 and Phase 4 and concave in Phase 2 and Phase 3. However, the concave shape of the water demands explanation. A force must be responsible for it. According to his second law (the force acting on an object is equal to the mass of the object times its acceleration), a force indicates an acceleration. Since the acceleration is not relative to the bucket sides, it must be relative to absolute space (Nagel 207-209).

Figure 4: Relevant phases in bucket experiment.
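On the standard modern analysis (which goes beyond anything in Newton’s or Berkeley’s texts), the concave surface in Phases 2 and 3 is a paraboloid: each ring of water requires a net inward force of magnitude mω²r, supplied by the tilt of the surface, giving dz/dr = ω²r/g and hence z = ω²r²/(2g). The following sketch is an illustrative computation under these standard assumptions; the function name is my own.

```python
import math

def surface_height(omega, r, g=9.81):
    """Height of the rotating water surface above its lowest point.
    Balancing gravity against the centripetal acceleration omega^2 * r
    gives dz/dr = omega^2 * r / g, hence z = omega^2 * r^2 / (2 * g)."""
    return omega**2 * r**2 / (2 * g)

# Water spun at 2 revolutions per second in a bucket of radius 0.15 m:
omega = 2 * 2 * math.pi  # angular velocity in radians per second
print(surface_height(omega, 0.15))  # rise of the water at the rim, in metres
```

The rise depends only on the angular velocity of the water, not on its motion relative to the bucket sides, which is exactly the feature Newton exploits: the concavity tracks “true” rotation, however that rotation is ultimately to be understood.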

Berkeley has a response. Given a body moving in a circular orbit, its motion at any instant is the result of two motions: one along the radius and one along the tangent of the orbit. The concave shape of the water in Phase 2 is due to an increase of the tangential forces on the particles of water without a corresponding force along the radius. Though Berkeley’s account of the deformation of the water by factors internal to the bucket system is an appropriate strategy for undermining Newton (showing that absolute space is unnecessary), it fails because his alternative explanation does not in fact correctly explain the deformation (Suchting 194-195; Brook 167-168).

Following his “solution” to the bucket experiment, Berkeley points out that, given relative space, a body may be in motion relative to one frame of reference and at rest with respect to another. To determine true motion or rest, to remove ambiguity, and to serve the purposes of natural philosophers in achieving a widely applicable account of motion, the fixed stars, regarded as at rest, will serve admirably. Absolute space will not be needed (DM §64).

The fixed stars are not explicitly invoked to account for the centrifugal effect in the bucket experiment as they were in the two globes experiment. It is a promising solution available to Berkeley. Karl Popper and Warren Asher, among others, assume that Berkeley understands it as a cogent response to the bucket experiment (Popper 232, Asher 458).

d. General Anti-Realism Arguments

In two very brief passages, one in De Motu and one in Siris, Berkeley appears to offer arguments that would undermine realism not only for corpuscles but for all theoretical entities. These arguments are difficult to interpret given that they are not amplified in any other works. They are intriguing, for they hint at widely discussed issues in contemporary philosophy of science.

Berkeley briefly examines a pattern of inference, the hypothetico-deductive method, commonly used to justify theoretical hypotheses. The pattern of inference, as he understands it, is to derive certain consequences, C, from a hypothesis, H. If the consequences are borne out (observed to occur), then they are evidence for H. Berkeley expresses skepticism that the method allows for the discovery of “principles true in fact and nature” (S §228). He defends his position by making a logical point and giving an example: if H implies C, and H is true, then one can infer C. But from H implies C, together with C, one cannot infer H; to do so is to affirm the consequent. The Ptolemaic system of epicycles has as a consequence the observed movements of the planets. This, however, does not establish the truth of the Ptolemaic system.
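Berkeley’s logical point can be checked mechanically. The sketch below (a modern illustration of the logic, not anything in Berkeley’s text; the helper names are my own) tests entailment by brute force over all truth assignments to H and C: modus ponens comes out valid, while inferring H from “H implies C” and C does not.

```python
from itertools import product

def entails(premises, conclusion):
    """True iff the conclusion holds in every truth assignment
    (over H and C) that makes all the premises true."""
    for H, C in product([True, False], repeat=2):
        if all(p(H, C) for p in premises) and not conclusion(H, C):
            return False
    return True

implies = lambda a, b: (not a) or b

# Modus ponens is valid: from H -> C and H, infer C.
print(entails([lambda H, C: implies(H, C), lambda H, C: H],
              lambda H, C: C))   # True

# Affirming the consequent is invalid: from H -> C and C, H does not follow.
print(entails([lambda H, C: implies(H, C), lambda H, C: C],
              lambda H, C: H))   # False
```

The countermodel the second check finds is exactly Berkeley’s Ptolemaic case: the consequences (C) obtain while the hypothesis (H) is false.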

Berkeley’s description of the hypothetico-deductive method is overly simplified. In actual scientific practice many factors are considered in accepting a hypothesis, including the number of positive predictions, the existence of negative predictions, the riskiness of the predictions, the plausibility of competing hypotheses, and the simplicity of the hypothesis. Nevertheless, even in its most refined form the method does not guarantee the truth of the hypothesis under consideration. If this is Berkeley’s point, it is well taken; a certain caution is warranted. But if anti-realism is to follow from the lack of certainty that the hypothesis is true, additional argument is required, including an account of how corpuscularianism escapes the same anti-realism.

The passage is important in another regard. It reinforces Berkeley’s pragmatic understanding of explanation. Though the Ptolemaic system is not “true in fact,” it “explained the motions and appearances of the planets” (S §238). Whether true or not, it has significant predictive power: it helps us anticipate how the planets will move.

A fascinating and complex passage in De Motu (§67) has been interpreted by at least one commentator as offering an argument for instrumentalism based on the underdetermination of theory by data (Newton-Smith). For any theory, T, there is another theory, T*. T and T* are about the same subject matter, logically incompatible, and fit all possible evidence. This lands us in skepticism: which theory is true is beyond our grasp. Berkeley cannot accept this result. A chief motivation for his philosophical system is to avoid skepticism. Skepticism, for Berkeley, is the thesis that our sense experience is not reliable: it is insufficient to determine the true nature of physical reality and often outright misleads us as to that reality. According to the underdetermination thesis, despite complete observational evidence (evidence provided by the senses), the correct theory still cannot be determined.

But given instrumentalism, the skeptical consequences of the underdetermination thesis can be avoided. Since theories are understood as calculating devices, not sets of propositions that are true or false, logical incompatibility can be avoided, and skepticism with it.

In an effort to strengthen his instrumental account of forces, Berkeley does appear to offer an underdetermination argument: “. . . great men advance very different opinions, even contrary opinions . . . and yet in their results attain the truth” (DM §67). He provides an illustration: when one body impresses a force on another, according to Newton, the impressed force is action alone and does not persist in the body acted upon. For Torricelli, the impressed force is received by the other body and remains there as impetus. Both theories fit the observational evidence.

A sketch of one example hardly establishes the underdetermination thesis; an argument for the underdetermination thesis is needed. Perhaps a crucial experiment will settle the Newton/Torricelli disagreement. Perhaps the two theories differ only verbally.

Berkeley was aware that at certain moments in the history of science two or more competing theories were consistent with the known evidence, but it is a much stronger thesis to claim that the theories are compatible with all possible evidence. Although there is no textual indication that Berkeley holds this strong thesis, without it, the argument from underdetermination for instrumentalism fails.

Margaret Atherton provides an alternative to Newton-Smith’s analysis (Atherton 248-250). She does not see Berkeley employing the underdetermination thesis. Rather he is explicating how natural philosophers use mathematical hypotheses. Newton and Torricelli “attain the truth” while supposing contrary theoretical positions on how motion is communicated.

Despite Newton and Torricelli sharing the same set of observations—the same sense-based descriptions of how bodies actually move—“they use different pictures to describe what links instances of this sort together . . .” (Atherton 249). The same regularities are discovered regardless of which picture is operative.

This raises questions about the cognitive status of the pictures. Do they differ only verbally? Are they shorthand descriptions for the movements of bodies? If they are genuinely different calculating devices, what guarantees that they will continue to fit or predict the same future observations? How to understand De Motu §67, as well as Siris §228, remains contentious.

6. References and Further Reading

  • Armstrong, David. “Editor’s Introduction” in Berkeley’s Philosophical Writings, edited by David Armstrong, Collier Books, New York, 1965, pp 7-34.
    • Contains a very brief introduction to the whole of Berkeley’s philosophy including his philosophy of science.
  • Asher, Warren O. “Berkeley on Absolute Motion.” History of Philosophy Quarterly. 1987, pp 447-466.
    • Examines the differing accounts of absolute motion in the Principles and De Motu.
  • Atherton, Margaret. “Berkeley’s Philosophy of Science” in The Oxford Handbook of Berkeley, edited by Samuel C. Rickless, Oxford University Press, Oxford, 2022, pp 237-255.
  • Berkeley, George. Philosophical Works, Including the Works on Vision. Edited by Michael R. Ayers. Everyman edition. London: J.M. Dent, 1975.
    • This is a readily available edition of most of Berkeley’s important works. When a text is without section numbers the marginal page numbers refer to the corresponding page in The Works of George Berkeley.
  • Berkeley, George. The Works of George Berkeley, Bishop of Cloyne. Edited by A.A. Luce and T.E. Jessop. Nine volumes. London: Thomas Nelson and Sons, 1948-1957.
    • Standard edition of Berkeley’s works. All references are to this edition.
  • Brook, Richard. “DeMotu: Berkeley’s Philosophy of Science” in The Bloomsbury Companion to Berkeley, edited by Richard Brook and Bertil Belfrage, Bloomsbury, London, 2017, pp 158-173.
    • Brief survey of Berkeley’s philosophy of science. Includes references to important scholarly work on the topic.
  • Dear, Peter. Revolutionizing The Sciences. Second Edition. Princeton University Press, Princeton, 2009.
  • Downing, Lisa. “Berkeley’s Case Against Realism about Dynamics” in Berkeley’s Metaphysics: Structural, Interpretive, and Critical Essays, edited by Robert Muehlmann, Pennsylvania State University Press, University Park, PA, 1996, pp 197-214.
    • Detailed treatment of Berkeley’s antirealism for Newtonian forces.
  • Downing, Lisa. “Berkeley’s Natural Philosophy and Philosophy of Science” In The Cambridge Companion to Berkeley, edited by Kenneth P. Winkler, Cambridge University Press, Cambridge, 2005, pp 230-265.
  • Downing, Lisa. “’Siris’ and the Scope of Berkeley’s Instrumentalism”. British Journal for the History of Philosophy, 1995, 3:2, pp 279-300.
    • Looks at the realism/antirealism issue in the context of Siris.  Argues that corpuscular theories are not subject to the anti-realism consequences of the hypothetico-deductive method.
  • Flage, Daniel E. “Berkeley” in Internet Encyclopedia of Philosophy.
    • Provides a broad discussion of Berkeley’s philosophy.
  • Garber, Dan. “Locke, Berkeley, and Corpuscular Scepticism” in Berkeley: Critical and Interpretative Essays, edited by Colin M. Turbayne, University of Minnesota Press, Minneapolis, 1982, pp 174-194.
    • Defense of realism for corpuscles in Berkeley.
  • Hempel, Carl. “Deductive-Nomological versus Statistical Explanation” in The Philosophy of Carl G. Hempel, edited by James H. Fetzer, Oxford University Press, New York, 2001, pp 87-145.
  • Jobe, Evan K. “A Puzzle Concerning D-N Explanation”. Philosophy of Science, 43:4, pp 542-547.
  • Lorkowski, C. M. “David Hume: Causation” in Internet Encyclopedia of Philosophy.
    • Thorough discussion of Hume’s account of causation.
  • Marcum, James A. “Thomas S. Kuhn” in Internet Encyclopedia of Philosophy.
    • Reviews the work of historian and philosopher of science Thomas Kuhn.  Kuhn was instrumental in initiating a historiographical turn for many philosophers of science.  His work challenged prevailing views on the nature of science, especially accounts of scientific change.
  • Nagel, Ernest. The Structure of Science. Harcourt, Brace and World, New York, 1961.
    • Classic introduction to the philosophy of science. Excellent on the cognitive status of theories of space and geometry.
  • Newton-Smith, W. H. “Berkeley’s Philosophy of Science” in Essays on Berkeley, edited by John Foster and Howard Robinson, Clarendon Press, Oxford, 1985, pp 149-161.
    • Argues that Berkeley gives an argument for instrumentalism from the underdetermination of theories.
  • Pearce, Kenneth L. “Berkeley’s Theory of Language” in The Oxford Handbook of Berkeley, edited by Samuel C. Rickless, Oxford University Press, Oxford, 2022, pp 194-218.
    • Discusses four accounts of Berkeley’s theory of language.  Defends the use theory.
  • Pitcher, George.  Berkeley. Routledge & Kegan Paul, London, 1977.
    • Account of Berkeley’s main philosophical positions.
  • Popper, Karl. “A Note on Berkeley as Precursor of Mach and Einstein” in Conjectures and Refutations, Routledge, London, 2002, pp 224-236.
    • Early explication of Berkeley’s instrumentalism by an influential philosopher of science.
  • Suchting, W. A. “Berkeley’s Criticism of Newton on Space and Motion”. Isis, 58:2, pp 186-197.
  • Warnock, G.J. Berkeley. Penguin Books, Baltimore, 1953.
    • Introduction to Berkeley’s thought.
  • Wilson, Margaret D. “Berkeley and the Essences of the Corpuscularians” in Essays on Berkeley, edited by John Foster and Howard Robinson, Clarendon Press, Oxford, 1985, pp 131-147.
    • Raises concerns about interpreting Berkeley as a scientific realist for corpuscles.
  • Winkler, Kenneth. Berkeley: An Interpretation. Clarendon Press, Oxford, 1989.
    • Thorough discussions of both the continuity of physical objects and corpuscularianism.

 

Author Information

A. David Kline
Email: akline@unf.edu
University of North Florida
U. S. A.

The Experience Machine

The experience machine is a thought experiment first devised by Robert Nozick in the 1970s. In the last decades of the 20th century, an argument based on this thought experiment was considered a knock-down objection to hedonism about well-being, the thesis that our well-being—that is, the goodness or badness of our lives for us—is entirely determined by our pains and pleasures. The consensus about the strength of this argument was so strong that, in ethics textbooks, it had become canonical to present hedonism as a surely false view because of the experience machine thought experiment. However, in the second decade of the 21st century, an experimental literature emerged that successfully questioned whether this thought experiment is compelling. This suggests that the experience machine thought experiment, in addition to being central to the debate on hedonism about well-being, touches other topical debates, such as the desirability of an experimental method in philosophy and the possibility of progress in this discipline. Moreover, since the experience machine thought experiment addresses the question of the value of virtual lives, it has become particularly relevant with the technological developments of virtual reality. In fact, the debate on the experience machine thought experiment, or “intuition pump,” also bears on the debate on the value of virtual lives in relation to technological advances.

In this article, one of the original formulations of the experience machine thought experiment (EMTE) is first presented, together with the question that it is meant to isolate, its target theory, how to best understand the argument based on it, and the implications that have historically been attributed to it. Second, a revisionist trend in the scholarship that undermines traditional confidence in the argument based on the experience machine thought experiment is introduced. Third, some objections to this revisionist trend, especially the expertise objection, are considered. Finally, some further versions of the experience machine thought experiment are discussed that have been advanced in response to the “death” of the original one.

Table of Contents

  1. Nozick’s Thought Experiment
  2. Target Theory: Mental Statism
  3. Some Stipulations of the Experience Machine Thought Experiment
  4. The Argument Based on the Experience Machine Thought Experiment
  5. The Experience Machine Thought Experiment as an Intuition Pump
  6. Imaginative Failures
    1. Memory’s Erasure
    2. Moral Concerns
  7. The Status Quo Bias
  8. Methodological Challenges
  9. The Expertise Objection
  10. The Experience Pill
  11. A New Generation of Experience Machine Thought Experiments
  12. Concluding Remarks
  13. References and Further Reading

1. Nozick’s Thought Experiment

Nozick first introduced the experience machine thought experiment in 1974 in his book Anarchy, State, and Utopia. This section focuses, however, on the formulation found in Nozick’s book The Examined Life (1989), because this version is particularly effective in capturing the narrative of the thought experiment. After this presentation, the structure of the thought experiment and the implications that it has traditionally been thought to have are summarized.

In The Examined Life (1989), Nozick presented the EMTE as follows:

Imagine a machine that could give you any experience (or sequence of experiences) you might desire. When connected to this experience machine, you can have the experience of writing a great poem or bringing about world peace or loving someone and being loved in return. You can experience the felt pleasures of these things, how they feel “from the inside.” You can program your experiences for tomorrow, or this week, or this year, or even for the rest of your life. If your imagination is impoverished, you can use the library of suggestions extracted from biographies and enhanced by novelists and psychologists. You can live your fondest dreams “from the inside.” Would you choose to do this for the rest of your life? If not, why not? (Other people also have the same option of using these machines which, let us suppose, are provided by friendly and trustworthy beings from another galaxy, so you need not refuse connecting in order to help others.) The question is not whether to try the machine temporarily, but whether to enter it for the rest of your life. Upon entering, you will not remember having done this; so no pleasures will get ruined by realizing they are machine-produced. Uncertainty too might be programmed by using the machine’s optional random device (upon which various preselected alternatives can depend).

The most relevant difference between Nozick’s two versions of the thought experiment lies in the duration of the plugging in. In the 1974 EMTE the plugging in is for two years at a time, while in the 1989 EMTE it is for life. In his testing of the 1974 EMTE, Weijers (2014) reported that 9% of the participants averse to plugging in justified their choice by saying something like “getting out every two years would be depressing”. On the one hand, this kind of reply is legitimate: well-being concerns lives, and to maximize a life’s net pleasure it is reasonable to consider the possible displeasure felt every two years when unplugging. On the other hand, this kind of reply eludes the question that the thought experiment is designed to isolate. Thus, the 1989 EMTE is more effective in tracking the choice that the thought experiment aims at isolating: the choice between two lives, one spent in touch with reality and one spent inside an experience machine (EM).

Several studies have suggested that the majority of readers of the EMTE are averse to plugging in. Weijers (2014) found that this judgement was shared by 84% of the participants asked to respond to Nozick’s 1974 EMTE. Similarly, 71% of the subjects facing the succinct version of the EMTE developed by Hindriks and Douven (2018) shared the pro-reality judgement, a percentage different from Weijers’ but still a considerable majority. Since spending one’s life, or at least a part of it, inside the EM should be favored according to mental state theories of well-being in general and prudential hedonism—that is, hedonism about well-being—in particular, the majority’s preferences might be taken as evidence against these theories. In fact, people’s judgements in favor of living in touch with reality have been thought to show that reality must be intrinsically prudentially valuable. In this context, the term “prudential” refers to what is good for a person, which is often taken to correspond to well-being. If reality is intrinsically prudentially valuable, theories of well-being that hold that only how experiences feel “from the inside” directly contributes to well-being are false. On the strength of this argument and of the response the thought experiment elicits in the majority of subjects, the EMTE has been widely considered as providing a knock-down argument against mental state theories of well-being and prudential hedonism. In other words, these theories have traditionally been quickly dismissed through appeal to the EMTE. Weijers (2014), for example, compiled a non-exhaustive list of twenty-eight scholars writing that the EMTE constitutes a successful refutation of prudential hedonism and mental state theories of well-being.

2. Target Theory: Mental Statism

This section identifies the target theory of the thought experiment. Traditionally, the experience machine has been mostly understood as a thought experiment directed against prudential hedonism. It should however be noted that the points made against prudential hedonism by the EMTE equally apply to non-hedonistic mental state theories of well-being. Mental state theories of well-being value subjective mental states—how our experiences feel to us from the inside—and nothing else. Put simply, what does not affect our consciousness cannot be good or bad for us. Accordingly, for mental state theories, well-being is necessarily experiential. Notice that these theories do not dispute that states of affairs contribute to well-being. For example, they do not dispute that winning a Nobel Prize makes one’s life go better. What they dispute is that states of affairs intrinsically affect well-being. According to these theories, winning a Nobel Prize makes one’s life go better only instrumentally because, for example, it causes pleasure.

Different mental state theories can point to different mental states as the ultimate prudential good. For example, according to subjective desire-satisfactionism, well-being is increased by believing that one is getting what one wants, rather than by states of affairs aligning with what one wants, as in the standard version of desire-satisfactionism. Standard desire-satisfactionism—a prominent alternative to hedonism in the philosophy of well-being—is usually thought to be immune from objections based on the EMTE: since most of us want to live in touch with reality, plugging into the EM would frustrate this desire and make our lives go worse. However, the supposed insusceptibility of standard desire-satisfactionism to the EMTE is questionable. In fact, given that a minority of people want to plug into the EM, these people’s lives, according to standard desire-satisfactionism, would be better inside the EM. This implication conflicts with the majority’s judgement that a life inside the EM is not a good life. Note that if a person’s desires concern only mental states, standard desire-satisfactionism becomes indistinguishable from a mental state theory of well-being.

In any case, probably because prudential hedonism is the most famous mental state theory of well-being, the EMTE has traditionally been used against this particular theory. Thus, this article refers to prudential hedonism as the target theory of the EMTE, although the argument based on it is equally applicable to any other mental state theory of well-being.

3. Some Stipulations of the Experience Machine Thought Experiment

By Nozick’s stipulation, we should be able to disregard any metaphysical and epistemological concerns that the thought experiment might elicit. Since the EMTE is meant to evoke the intuition that physical reality, in contrast to the virtual reality of the EM, is intrinsically valuable, it might seem natural to ask “what is reality?” and “how can we know it?”. If there is no such thing as reality, reality cannot be intrinsically valuable. In other words, if there is no mind-independent reality, mental state theories of well-being cannot be objected to on the grounds of not intrinsically valuing mind-independent reality (the metaphysical issue).

Similarly, someone might say that even if there is a mind-independent reality, we cannot know it. In this case, reality would collapse into a supposed intrinsic value of no use in evaluating lives—if we cannot know what is real, we cannot judge whether a life has more or less of it. For example, if we do not have knowledge of reality, we cannot say whether a life in touch with the physical world or a life inside an EM is more real (the epistemological issue).

Nevertheless, the EMTE is designed to isolate a prudential concern and stipulates that we should ignore any metaphysical or epistemological concern elicited by the narrative of the thought experiment. Thus, below, Nozick’s stipulations of a common-sense conception of reality and of our access to it are adopted (for a thought experiment with an EMTE-like narrative directed against metaphysical realism, see The Brain in a Vat Argument).

Nozick also asks readers to ignore contextual factors. For example, he claims, we should not evaluate whether a life inside an EM is worse than a life of torture. It seems reasonable to prefer a life plugged into an EM to a life of intense suffering, but this preference does not respect the thought experiment’s stipulation. To isolate the relevant prudential question, we should think of a hedonically average life. Having said that, we might doubt that our trade-off between pleasure and reality can be insensitive to contextual factors. For the hedonically less privileged, for example someone afflicted by chronic depression or pain, it seems reasonable to want to plug in.

4. The Argument Based on the Experience Machine Thought Experiment

The argument based on the EMTE has sometimes been interpreted as a deductive argument. According to this version of the argument, if the vast majority of reasonable people value reality in addition to pleasure, then reality has intrinsic prudential value; therefore, prudential hedonism is false. The main problem with this deductive argument is that it disregards the is-ought dichotomy: knowing “what is” does not by itself entail knowing “what ought to be”. The argument jumps too boldly from a descriptive claim—the majority of people prefer reality—to a normative claim—reality is intrinsically valuable. The deductive argument is thus invalid because the fact that reality intrinsically matters to many of us does not necessarily imply that it should be intrinsically valued by all of us. For example, the majority of us, perhaps instrumentally, value wealth, but it does not follow that it is wrong not to value wealth.

Instead, the most convincing argument based on the EMTE seems to be an appeal to the best explanation. According to this version of the argument, the best explanation for something intrinsically mattering to many people is something being intrinsically valuable. In the abductive argument, the passage from the descriptive level to the normative level, from “reality intrinsically matters to the majority of people” to “reality is intrinsically valuable”, is more plausibly understood as an inference to the best explanation.

5. The Experience Machine Thought Experiment as an Intuition Pump

As explained above, according to the abductive argument based on the EMTE intuition pump, reality being intrinsically prudentially valuable is the best explanation for reality intrinsically mattering to the majority of people. One can however wonder whether this is really the best explanation available. In the first two decades of the 21st century, a trend in the scholarship on the EMTE questioned this abduction by pointing to several biases that might determine, and thus explain, people’s apparent preference for reality. This and the next two sections present the phenomena advanced by this revisionist scholarship that seem to partially or significantly bias judgements about the EMTE. These distorting factors are grouped under hedonistic bias, imaginative failures, and status quo bias.

The hedonistic bias is the most speculative of the proposed biases that have been thought to affect our responses to the EMTE. According to Silverstein (2000), who argued for the influence of such a hedonistic bias on our reactions to the EMTE, the preferences apparently conflicting with prudential hedonism are themselves hedonistically motivated, because, he claimed, the preference for not plugging in is motivated by a pleasure-maximizing concern. Silverstein’s argument is based on the thesis that the desire for pleasure is at the heart of our motivational system, in the sense that pleasure determines the formation of all desires.

The existence of a similar phenomenon affecting the formation of preferences has also been put forward by Hewitt (2009). According to Hewitt, reported judgements cannot be directly taken as evidence regarding intrinsic value. We usually devise thought experiments to investigate our pre-reflective preferences. The resulting judgements are therefore also pre-reflective, which means that their genesis is not transparent to us and that reflection on them does not guarantee that their sources become transparent. Thus, our judgements elicited by the EMTE do not necessarily track intrinsic value.

Notice that Silverstein’s argument for the claim that pleasure-maximization alone explains the anti-hedonistic preferences depends on the truth of psychological hedonism—that is, the idea that our motivational system is exclusively directed at pleasure. However, the EMTE can itself be taken as a counterexample to psychological hedonism. The majority of us, when facing the choice of plugging into an EM, have a preference for the pleasure-minimizing option. What the studies on our responses to the EMTE tell us is precisely that most people have preferences conflicting with psychological hedonism: the majority of people do not seem to have an exclusively pleasure-maximizing motivational system. The descriptive claim of psychological hedonism thus faces a convincing counterexample. Psychological hedonists are forced to appeal to unproven unconscious desires—conscious pleasure-minimizing preferences as the result of an unconscious desire for pleasure—to defend their theory.

Nevertheless, a weak version of Silverstein’s hedonistic bias, according to which pleasure-maximization partly explains the anti-hedonistic judgements, seems plausible. Empirical research has shown that immediate pleasure-maximization plays a partial role in decision-making. This conclusion points in the direction of a weak hedonistic bias—that is, the possibility that apparently non-hedonistic judgements are partly motivated by pleasure-maximization. For example, Nozick asks us to disregard the distress that choosing to plug in might cause in the short term. According to Nozick, we should be rational and accept immediate suffering for the sake of long-term pleasure. Still, as everyday experience shows us, we do not always have such a rational attitude toward immediate suffering for a long-term gain. Some people do not go to the dentist although it would benefit them, or do not overcome their fear of flying although they would love to visit a different continent. It thus seems doubtful that the factor “distress about plugging in” is actually disregarded just because Nozick asks us to disregard it. Our adverse judgement about plugging in might be hedonistically motivated by the avoidance of this displeasure, regardless of Nozick’s instruction. However, the claim that pleasure-maximization plays a remarkable role in our anti-hedonistic responses to the EMTE is empirically testable. As a result, even if the hedonistic bias seems to be a real phenomenon, it would be speculative to claim that it crucially affects our judgements about the EMTE without appealing to empirical evidence.

6. Imaginative Failures

Thought experiments are devices of the imagination. In this section, two confounding factors involving imagination are discussed: imaginative resistance and overactive imagination. These phenomena have been empirically shown to significantly distort our judgements about the EMTE. Imaginative resistance occurs when subjects reject some important stipulation of a thought experiment. Regarding the EMTE, examples include worrying about an EM’s malfunctioning or its inability to provide the promised bliss, although the scenario is explicit that the EM works perfectly and provides blissful feelings. According to Weijers’ study (2014), imaginative resistance affected 34% of the subjects that did not want to plug into the EM. In other words, one third of the participants that chose reality appeared to disregard some of the thought experiment’s stipulations. This is important because it shows, in general, that imagined scenarios are not fully reliable tools of investigation and, in particular, that a large portion of the pro-reality judgements are potentially untrustworthy because they do not comply with the EMTE’s stipulations.

Notice that philosophers can suffer from imaginative resistance too. Bramble (2016), while arguing that prudential hedonism might not entail the choice of plugging in, claims that the EM does not provide the pleasures of love and friendship. According to him, artificial intelligence is so primitive in regard to language, facial expressions, bodily gestures, and actions that it cannot deliver the full extent of social pleasures. While his claim seems true of the technology of the mid-2010s, it clearly violates the thought experiment’s stipulations. In addition to being implied by the 1974 version of the EMTE, Nozick says explicitly in the 1989 version that the machine has to be imagined as perfectly simulating the pleasure of loving and being loved.

Overactive imagination is another distorting phenomenon related to imagination. It consists in subjects imagining non-intended features of the EMTE. In his test of Nozick’s 1974 scenario, Weijers (2014) claimed to have found that 10% of the pro-reality responses displayed signs of overactive imagination. In other words, he claimed that a non-negligible proportion of participants unnecessarily exaggerated aspects of the thought experiment’s narrative. Notice that, here, Weijers’ claim seems problematic. Weijers reported that some subjects declared that they did not want to plug in because “the machine seems scary or unnatural”, and he took these declarations as indicating cases of overactive imagination. Yet, the artificiality of the EM is one of the main reasons advanced by Nozick for not plugging in: ruling out such a response as biased therefore seems unfair. Nevertheless, putting aside this issue, the possibility of the EMTE eliciting judgements biased by technophobic concerns seems very plausible. This possibility has been made more likely by the popularity of the film The Matrix, in which a similar choice between reality and comfort is presented. Yet, this movie elicits a set of intuitions that the EMTE is not supposed to elicit. For example, political freedom is severely hampered in The Matrix: the machines, after having defeated us in a war, have enslaved us. Notice the difference with the 1989 version of the EMTE, where “friendly and trustworthy beings from another galaxy” serve us. Thus, the narrative of The Matrix should not be used to understand the EMTE because it elicits a further layer of intuitions, such as the desire not to be exploited.

Considering imaginative failures overall (imaginative resistance and overactive imagination together), Löhr’s study (2018) on the EMTE found that they affected 46% of the pro-reality philosophers and 39% of the pro-reality laypeople. Given the imaginative failures that affect the EMTE, it seems that this thought experiment may legitimately be accused of being far-fetched both in its narrative—at least in its first version, as the second version clarifies the benevolent intention of the EM providers—and in its stipulations. It might be that we lack the capacity to properly form judgements in outlandish cases, such as the one the EMTE asks us to imagine.

Nevertheless, concerning the role of technophobia and fantasy in imaginative failures, consider that the technological innovations of the beginning of the 21st century have rendered virtual reality progressively less fantastic. This increasing concreteness of virtual reality technology, compared to the 1970s when the thought experiment was first devised, might lead to a progressive reduction of the influence of these factors on responses to the EMTE. It is even plausible that one day the pro-reality judgement will no longer be shared by the majority of people. The evidential power of thought experiments is likely to be locally and historically restricted; therefore, we cannot exclude that changes in technology and culture will produce different judgements in subjects presented with the EMTE.

a. Memory’s Erasure

Remember that the EMTE’s target theory is prudential hedonism, not hedonistic utilitarianism. The offer to plug in does not concern maximizing pleasure in general, but one’s own pleasure. Well-being concerns what is ultimately good for a person. Thus, in deliberating about what is in your best interest, you need to be certain about the persistence of the you in question. Given that, the thought experiment would be disrupted if the continuation of your personal identity were not guaranteed by the EM.

Bramble (2016) expressed precisely this worry. Remember that the EM is thought to provide a virtual reality that is experientially real; thus, the users need to be oblivious of the experiences and choices that led them to plug in. Following Nozick’s infelicitous description of plugging in as a “kind of suicide”, Bramble held that the EM, in order to provide this feeling of reality, might kill you in the sense that your consciousness would be replaced with a distinct one. Personal identity would therefore be threatened by the EM, and since we are trying to understand what a good life is for the person living it, ending that life could hardly be good. Following Bramble, we thus seem to have a strong reason for not plugging in: it is bad for us to die. Similarly, Belshaw (2014) expressed concerns about the EMTE and personal identity. In particular, Belshaw claimed that to preserve a sense of reality inside an EM, the memory erasure operated by the machine would have to be invasive. Belshaw’s point seems stronger than Bramble’s because it does not concern a small memory-dampening. Belshaw points to a tension between two requirements of the EM: preserving personal identity and providing exceptional experiences that feel real (“You are, as you know, nothing special. So, seeming to rush up mountains, rival Proust as a novelist, become the love-interest of scores of youngsters, will all strike you as odd”). For him, even if some alterations of one’s psychology do not threaten personal identity, this is not the case with the EM, where invasive alterations are required to provide realistic exceptional experiences.

Nevertheless, both Bramble’s and Belshaw’s points can be seen as cases of imaginative resistance. Although the experience machine thought experiment does not explicitly stipulate that the EM’s memory erasure can occur while guaranteeing the persistence of personal identity, this can be considered as implied by it. It should be imagined that the amnestic re-embodiment—that is, the re-embodiment of the subject of experience inside the EM without the conscious knowledge that he is presently immersed in a virtual environment and possesses a virtual body—preserves personal identity. Nothing in the wording of the thought experiment insinuates that personal identity is not preserved; its continuation is thus implicitly stipulated. Neither Bramble’s point nor Belshaw’s complies with this implicit stipulation, and both thus end up constituting cases of imaginative resistance. Whether the preservation of personal identity is technically problematic or not does not concern the prudential question at stake.

b. Moral Concerns

Drawing a clear-cut distinction between moral and prudential concerns should help refine the relevant judgements regarding the EMTE. By Nozick’s stipulation, only prudential judgements are at stake in this thought experiment. However, imaginative resistance—subjects not fully complying with the stipulations of thought experiments—is a plausible phenomenon supported by empirical evidence. The possibility that judgements elicited by the EMTE are distorted by moral concerns therefore seems likely. Indeed, according to experimental evidence, the absence of a clear-cut distinction between morality and well-being, such as laypeople’s evaluative conception of happiness, seems to be the default framework.

Weijers (2014) reported answers to the EMTE like “I can’t because I have responsibilities to others” among participants that did not want to connect. Similarly, Löhr (2018) mentioned pro-reality philosophers’ answers like: “I cannot ignore my husband and son,” “I cannot ignore the dependents”, or “Gf [girlfriend] would be sad”. These answers can be seen as examples of imaginative resistance. When considering the EMTE, we should by stipulation disregard our moral judgements—that is, to “play by the rules” of the thought experiment one should be able to disregard morality. In the 1974 version, Nozick claims “others can also plug in to have the experiences they want, so there’s no need to stay unplugged to serve them. Ignore problems such as who will service the machines if everyone plugs in”. Nozick asks us to imagine a scenario where everyone could plug into an EM. Since, by stipulation, there is no need to care for others, we should disregard our concern for them. Taking moral evaluations into account in one’s decision about plugging into an EM thus constitutes a possible case of imaginative resistance.

However, it is far from clear that we are actually able to disregard our moral concerns. Stipulating that we should not worry about something does not imply that we will actually not worry about it. For example, being told to suspend our moral judgement in a sexual violence case because of the perpetrator’s mental incapacity does not imply that, as jurors, we will be able to do so. Prudential value is not the only kind of value that we employ in evaluating life-choices: the majority of people value more in life than their well-being. Concerning the EMTE, common-sense morality seems to deny the moral goodness of plugging in: it views plugging in as self-indulgent and therefore blameworthy. Moreover, it values making a real impact on the world, such as saving lives, not just having the experience of making such an impact.

To understand the imaginative resistance observed in philosophers’ answers to the EMTE, it should be noted that the main philosophical ethical systems seem to deny the moral goodness of plugging in. Even hedonistic utilitarianism, the only ethical system prima facie sympathetic to plugging in, would not seem to consider this choice morally good. To plug in morally, a hedonistic utilitarian agent should believe that this would maximize net happiness. This seems plausible only if all the other existing sentient beings are already inside an EM (and they have no obligations toward future generations). Otherwise, net happiness would be maximized by the agent’s not plugging in, since this would allow her to eventually convince two or more other beings to plug in, and two or more blissful lives, rather than only hers, would make a greater contribution to overall happiness. Given that moral philosophical concerns seem to oppose the choice of plugging in, it appears plausible that philosophers’ judgements elicited by the EMTE are also distorted by morality.

To sum up, moral concerns constitute a plausible case of imaginative resistance distorting philosophers’ and laypeople’s judgements about the EMTE. Most people seem to agree that pleasant mental states are valuable. Yet, it is unlikely that everyone is persuaded by the claim that, all things considered, only personal pleasure is intrinsically valuable. Nevertheless, if we consider only prudential good, this claim becomes considerably more convincing. In other words, if we carefully reason so as to dismiss our moral concerns, plugging into an EM seems a more appealing choice.

7. The Status Quo Bias

In addition to the biases mentioned above, the status quo bias has received special attention in the literature. The status quo bias is the phenomenon whereby subjects tend to irrationally prefer the status quo—that is, the way things currently are. In other words, when facing complex decision-making, subjects tend to follow the adage “when in doubt, do nothing”. This bias is thought to show up in many decisions, such as voting for an incumbent office holder or not trading in a car. Moving to the relevance of the status quo bias for the EMTE, when subjects are presented with the choice of leaving reality and plugging in, most appear averse to it. However, when they are presented with the choice of leaving the EM to “move” into reality, they also appear averse to it (see Kolber, 1994). This pattern seems best explained by our irrational preference for the status quo, rather than by a constant valuing of pleasure and reality. In 1994, Kolber advanced the idea of the reverse experience machine (REM). In this revised version of the thought experiment, readers are asked: “would you get off of an experience machine to which you are already connected?”. In the REM, subjects thus have to choose between staying in the EM and moving to reality while losing a significant amount of net pleasure. Since the REM is supposed to isolate the same prudential concern as the EMTE through a choice between pleasure and reality (with a similar proportion of pleasure and reality in both thought experiments), the REM should elicit the same reactions as the EMTE. The replication of the results would indicate that Nozick’s thought experiment is able to isolate this concern. Instead, when De Brigard tested a version of the REM, the results did not fulfill this prediction. While a large majority of readers of the original EMTE are unwilling to plug in, when imagining being already connected to an EM and having to decide whether to unplug or stay, the percentage of subjects that chose reality over the machine dropped significantly to 13%. De Brigard (2010) and the subsequent literature have interpreted this result as demonstrating the influence of the status quo bias. Because of the status quo bias, when choosing between alternatives, subjects display an unreasonable tendency to leave things as they are. Applied to the EMTE, the status quo bias explains why the majority of subjects prefer to stay in reality when they are (or think they are) in reality and to stay in an EM when imagining being already inside one.

This interpretation is also supported by another empirical study conducted by Weijers (2014). Weijers introduced a scenario—called “the Stranger No Status Quo scenario” (or “the Stranger NSQ”)—that is meant to reduce the impact of the status quo bias. This scenario is partly based on the idea that the more detached we are from the subject for whom we have to take a decision, the more rational we should be. Accordingly, the Stranger NSQ scenario asks us to decide not whether we would plug into an EM, but whether a stranger should. Moreover, the scenario adds a 50-50 time split: at the time of the choice, the stranger has already spent half of her time inside an EM and has had most of her enjoyable experiences while plugged into it. Both elements—choosing for a stranger, and the stranger having already spent half of her life inside an EM—are meant to minimize the influence of the status quo bias. Weijers observed that in this case a small majority (55%) of the participants chose pleasure over reality. This result again contradicts the vast majority of pro-reality responses elicited by Nozick’s original thought experiment. Weijers’ study is also noteworthy because it avoided the main methodological flaws of De Brigard’s (2010), such as a small sample size and a lack of details on the conduct of the experiments.

To sum up, the aforementioned studies and the scholarship on them have challenged the inference to the best explanation in the abductive argument based on the EMTE. Note that something counts as good evidence in favor of a hypothesis only when it is consistent with that hypothesis alone. According to this new scholarship, the fact that the large majority of people respond to the original EMTE in a non-hedonistic way, by choosing reality over pleasure, is not best explained by reality being intrinsically valuable. In fact, modifications of the EMTE such as the REM and the Stranger NSQ scenario, while supposedly isolating the same prudential question, elicit considerably different preferences in experimental subjects. The best explanation of this phenomenon seems to be the status quo bias, a deviation from rational choice that psychologists have repeatedly observed in many contexts.

8. Methodological Challenges

Smith (2011) criticized the above-mentioned studies for the lack of representativeness of their experimental subjects. Indeed, De Brigard’s studies were conducted on philosophy students and Weijers’ on marketing and philosophy students, both at Anglophone universities. Obviously, these groups do not represent the whole English-speaking population, let alone the whole human population. Nevertheless, this objection seems misplaced. Although it would be interesting to know what the whole world thinks of the EMTE, or to test an indigenous population that has never had any contact with Western philosophy, that is not what is relevant for the negative experimental program concerning the EMTE—that is, the experimental program devoted to questioning the abductive argument against prudential hedonism based on the EMTE. Smith seems to confuse the revisionist scholarship’s goal of challenging philosophers’ previous use of intuitions with the sociological or anthropological goal of knowing what humans think.

Another methodological objection advanced by Smith (2011) concerns the fact that experimental subjects in these studies are not in the position of confronted agents. The participants are asked to imagine some fantastical scenarios rather than being in a real decision-making situation, with the affective responses that this would elicit. Again, Smith’s objection seems flawed: what Smith considers a methodological problem might actually be a methodological strength. Unconfronted agents are very likely to be more rational in the formation of their judgements about the EMTE. Once again, the experimental program on the EMTE is interested in how to refine and properly use intuitions for the sake of rational deliberation, not in the psychological project of knowing what people would choose, under the grip of affects, in a real situation. In other words, the reported judgements expressed in questionnaires, although not indicative of what intuitions we would have in front of a real EM, seem less biased by affects and more apt to be the starting point for a rational judgement about what has intrinsic prudential value.

9. The Expertise Objection

A major methodological challenge to much of experimental philosophy concerns the use of laypeople’s judgements as evidence. According to the expertise objection, the judgements reported by laypeople cannot be granted the same epistemic status as the judgements of philosophers (that is, the responses of trained professionals with years of experience in thinking about philosophical issues). Philosophers, accordingly, should know how to come up with “better” judgements. Following this objection, the responses of subjects with no prior background in philosophy, which inform the aforementioned studies, lack philosophical significance.

Although the concern appears legitimate, it seems disproved by empirical evidence, both from experimental philosophy in general and from experiments about the EMTE in particular. Concerning the EMTE, Löhr (2018) tested whether philosophers are more proficient than laypeople at disregarding irrelevant factors when thinking about several versions of the EMTE. He observed that philosophers gave inconsistent answers when presented with different versions and that their degree of consistency was only slightly superior to that of laypeople. Philosophers were also found to be roughly as susceptible to imaginative failures as laypeople. This suggests that philosophers are not substantially more proficient than laypeople at complying with the stipulations of the thought experiment.

The empirical evidence we possess on philosophers’ judgements in general, and on philosophers’ judgements concerning the EMTE in particular, therefore seems to cast much doubt on the expertise objection. The current empirical evidence does not support granting an inferior epistemic status to the preferences of laypeople that inform the aforementioned studies on the EMTE. The burden of proof, it seems, lies squarely on anyone wishing to revive the expertise objection. Moreover, given the value of equality that informs our democratic worldview, the burden of proof should always lie on the individual or group—philosophers in this case—that aspires to a privileged status.

Furthermore, in addition to philosophical expertise not significantly reducing the influence of biases, philosophers might have their own environmental and training-specific set of biases. For example, a philosopher assessing a thought experiment might be biased by the dominant view about this thought experiment in the previous literature or in the philosophical community. This worry seems particularly plausible in the case of the EMTE because there is a strong consensus among philosophers not specialized in this thought experiment that one should not enter the EM. In other words, it is reasonable to hypothesize that the “textbook consensus”—that is, the philosophical mainstream position as expressed by undergraduate textbooks—adds a further layer of difficulty for philosophers trying to have an unbiased response to the EMTE.

10. The Experience Pill

In a recent study, Hindriks and Douven (2018) changed the EM into an experience pill. With this modification, pro-pleasure judgements increased from 29% to 53%. In other words, replacing the science-fiction technology in the narrative of the thought experiment with a pill seems to cause a significant shift in subjects’ responses. This can be attributed to the more familiar delivery mechanism and, more importantly, to the fact that the experience pill does not threaten the relationship with reality in the same respects. The experience pill does not resemble psychedelic drugs such as LSD (interestingly, Nozick cited the views of devotees of psychedelic drugs, together with traditional religious views, as examples of views that deeply value reality). In fact, while the experience pill drastically alters hedonic experience, perhaps similarly to amphetamines or cocaine, it does not affect the perception of the world.

Therefore, the experience pill thought experiment does not seem to propose a narrative that can be compared with the EMTE. Here, the choice is not between reality and pleasure but rather between affective appropriateness—having feelings considered appropriate to the situation—and pleasure. Thus, the experience pill should be seen as an interesting but different thought experiment, not to be compared with the EMTE. Concerning this issue, it should be noted that the EMTE scenarios used across both the armchair and the experimental philosophy literature vary significantly. This is worrying because the experimental philosophy and psychology literature on intuitions seems to show that the wording of scenarios can greatly affect the responses they elicit. We might thus find that a particular wording of the scenario gets different results, adding new layers of difficulty to answering the question at stake. In other words, the inter-comparability of the different scenarios adopted by different authors is limited.

11. A New Generation of Experience Machine Thought Experiments

Some authors have challenged the revisionist scholarship on the EMTE presented above by claiming that it does not address the most effective version of this thought experiment. According to them, the narrative of the original EMTE should be drastically modified in order to effectively isolate the question at stake. Moreover, they claim that a new argument based on this transformed version of the EMTE can be advanced against prudential hedonism: the experientially identical lifetime comparison argument. For example, Crisp (2006) attempted to eliminate the status quo bias and the concern that the technology may malfunction (imaginative resistance) by significantly modifying the narrative of the EMTE. He asks us to compare two lives. Life A is pleasant, rich, full, and autonomously chosen; it involves writing a great novel and making important scientific discoveries, as well as virtues such as courage, wittiness, and love. Life B is experientially identical to A but lived inside an EM. A and B, according to prudential hedonism, are equal in value. However, most of us seem to have the contrary intuition. This is the starting point of the experientially identical lifetime comparison argument. Likewise, according to Lin (2016), to isolate the question that the EMTE is supposed to address, we should consider the choice between two lives that are experientially identical but differently related to reality, because this locates reality as the value in question. According to Lin, his version of the EMTE also has the advantages of not being affected by the status quo bias and of not involving claims about whether we would or should plug in.

Rowland (2017) conducted empirical research on a version of the EMTE in which two hedonically equal lives of a stranger must be compared. Presented with Rowland’s EMTE, more than 90% of the subjects answered that the stranger should choose the life in touch with reality. Surprisingly, Rowland did not offer the option of answering that the two lives are equal in value. Unfortunately, this methodological flaw is so glaring that it severely undermines the significance of Rowland’s study.

Notice that, once the narrative of the thought experiment is devised in this way, it assumes the same structure as Kagan’s deceived businessman thought experiment (Kagan, 1994). Both thought experiments argue against a view according to which B-facts are equal in value to A-facts by devising a scenario in which there is intuitively a difference in value between the A-facts and the B-facts. In his thought experiment, Kagan asks us to imagine life A, that of a successful businessman who is happy because he is loved by his family and respected by his community and colleagues. Then, Kagan asks us to imagine an experientially identical life B in which the businessman is deceived about the causes of his happiness—everyone is deceiving him for their personal gain. Lives A and B contain the same amount of pleasure; thus, according to prudential hedonism, they are equal in value. Nevertheless, we again have the intuition that life A is better than life B.

Discussing this new version of the EMTE, de Lazari-Radek and Singer (2014) concluded that our judgements about it are also biased. They attributed this biased component to morality. Life A contributes to the world while life B does not; thus, life A is morally superior to life B. Therefore, according to them, our judgement that life A is better is affected by moral considerations extraneous to the prudential question at stake. As in the case of imaginative failures regarding the original EMTE, it seems possible that the comparison intuition is based on scales of evaluation other than well-being.

Moreover, the structure of this new version of the thought experiment seems to suffer from the freebie problem. Since it is irrational to have 100% confidence in the truth of prudential hedonism, it is irrational not to prefer life A to life B: if you are not 100% confident in the truth of prudential hedonism, life A has a greater than 0% chance of being more prudentially valuable than life B, making it unreasonable to decline the reality freebie. This is especially true when the decision between the two lives is forced (that is, when there is no “equal value” option), as in Rowland’s study. Because of the freebie problem, transforming the narrative of the EMTE in this drastic way does not seem to increase its strength. Rather, it seems to make the thought experiment unhelpful for comparing our judgements about two lives that roughly track the competing values of pleasure and reality.

To reiterate, since a person cannot be 100% sure of the truth of prudential hedonism, they would be a bad decision-maker if they did not choose the life with both pleasure and reality. Reality has a greater than 0% chance of being intrinsically prudentially valuable, as is presumably true of all the other candidate goods that philosophers of well-being discuss. Importantly, the original structure of the EMTE traded off more reality against more pleasure. That the vast majority of people reported a preference for reality was therefore a sign that they really valued reality, since they were ready to sacrifice something of value (pleasure) to get more of another value (reality). A properly devised EMTE, one aimed at revealing subjects’ relevant preferences, has to trade non-negligible amounts of two competing goods against each other. The supposed intrinsic value of reality can be intuitively apprehended only if one has to sacrifice an amount of pleasure that registers as significant. The epistemic value of the EMTE lies in presenting two options, one capturing the pro-reality intuition and one the pro-pleasure intuition. Indeed, the strength of the EMTE against prudential hedonism is that the vast majority of subjects agree that connecting to an EM is undesirable even though the machine offers bliss. Thus, the proper design of the thought experiment involves a meaningful pairwise comparison, that is, the comparison of entities in pairs in order to reveal our preferences toward them. Such binary comparisons can serve as the building blocks of more complex decision-making. That is indeed what we want from the EMTE: reducing a complex question about intrinsic prudential value to a binary comparison between two competing lives, which allows us to study people’s judgements about the prudential value of two competing goods.

Another example of this new generation of the EMTE is found in Inglis’s (2021) universal pure pleasure machine (UPPM). Inglis imagined a machine that provides a constant heroin-like high—that is, a machine that provides pure pleasure without producing any virtual reality—and a world where every sentient being is plugged into such a machine (the universality condition). Inglis then asked her participants: “is this a good future that we should desire to achieve?”. Only 5.3% of the subjects presented with this question replied positively. Interestingly, this was the first such study to be conducted at a Chinese university. From her results, Inglis concluded that the UPPM is once again able to disprove prudential hedonism. Nevertheless, more studies are needed before this conclusion can be confidently accepted. For example, the universality condition, which according to Inglis reduces biases stemming from morality, might on the contrary work as a moral intuition pump. In fact, empirical evidence suggests that moral judgements, unlike prudential judgements, are characterized by universality (for example, it is wrong for everyone to commit murder vs. it is wrong for me to play videogames). The UPPM might also trigger significant imaginative failures, for example if subjects view a machine with no virtual reality as boring (imaginative resistance) or perceive its heroin-like bliss as a disgusting kind of existence (overactive imagination).

12. Concluding Remarks

This article has reviewed the salient points of the literature on the EMTE, from its introduction by Nozick in 1974 until the beginning of the 2020s. In presenting the scholarship on this thought experiment, a historical turn was emphasized. The debate on the EMTE can be divided into two phases. In the first phase, starting with the publication of Nozick’s Anarchy, State, and Utopia in 1974 and ending around 2010, we observe a broad consensus on the strength of the EMTE in proving prudential hedonism and mental state theories of well-being wrong. In the second phase, starting more or less at the beginning of the 2010s, we witness the emergence of a scholarship specialized in the EMTE that undermines confidence in its ability to generate a knock-down argument against prudential hedonism and mental statism about well-being. Anecdotally, it should be noticed that the philosophical community at large—that is, those not specialized in the EMTE—has not necessarily caught up with the latest scholarship, and it is common to encounter views more in line with the earlier confidence. Nevertheless, the need felt by anti-hedonistic scholars to devise a new generation of EMTEs suggests that the first generation is dead. Further scholarship is needed to establish whether and to what extent these new versions are able to resuscitate the EMTE and its goal.

13. References and Further Reading

  • Belshaw, C. (2014). What’s wrong with the experience machine? European Journal of Philosophy, 22(4), 573–592.
  • Bramble, B. (2016). The experience machine. Philosophy Compass, 11(3), 136–145.
  • Crisp, R. (2006). Hedonism reconsidered. Philosophy and Phenomenological Research, 73(3), 619–645.
  • De Brigard, F. (2010). If you like it, does it matter if it’s real? Philosophical Psychology, 23(1), 43–57.
  • De Lazari-Radek, K., & Singer, P. (2014). The point of view of the universe: Sidgwick and contemporary ethics. Oxford University Press.
  • Feldman, F. (2011). What we learn from the experience machine. In R. M. Bader & J. Meadowcroft (Eds.), The Cambridge Companion to Nozick’s Anarchy, State, and Utopia (pp. 59–86). Cambridge University Press.
  • Forcehimes, A. T., & Semrau, L. (2016). Well-being: Reality’s role. Journal of the American Philosophical Association, 2(3), 456–468.
  • Hewitt, S. (2009). What do our intuitions about the experience machine really tell us about hedonism? Philosophical Studies, 151(3), 331–349.
  • Hindriks, F., & Douven, I. (2018). Nozick’s experience machine: An empirical study. Philosophical Psychology, 31(2), 278–298.
  • Inglis, K. (2021). The universal pure pleasure machine: Suicide or nirvana? Philosophical Psychology, 34(8), 1077–1096.
  • Kagan, S. (1994). Me and my life. Proceedings of the Aristotelian Society, 94, 309–324.
  • Kawall, J. (1999). The experience machine and mental state theories of well-being. Journal of Value Inquiry, 33(3), 381–387.
  • Kolber, A. J. (1994). Mental statism and the experience machine. Bard Journal of Social Sciences, 3, 10–17.
  • Lin, E. (2016). How to use the experience machine. Utilitas, 28(3), 314–332.
  • Löhr, G. (2018). The experience machine and the expertise defense. Philosophical Psychology, 32(2), 257–273.
  • Nozick, R. (1974). Anarchy, State, and Utopia. Blackwell.
  • Nozick, R. (1989). The Examined Life. Simon & Schuster.
  • Rowland, R. (2017). Our intuitions about the experience machine. Journal of Ethics and Social Philosophy, 12(1), 110–117.
  • Silverstein, M. (2000). In defense of happiness. Social Theory and Practice, 26(2), 279–300.
  • Smith, B. (2011). Can we test the experience machine? Ethical Perspectives, 18(1), 29–51.
  • Stevenson, C. (2018). Experience machines, conflicting intuitions and the bipartite characterization of well-being. Utilitas, 30(4), 383–398.
  • Weijers, D. (2014). Nozick’s experience machine is dead, long live the experience machine! Philosophical Psychology, 27(4), 513–535.


Author Information

Lorenzo Buscicchi
Email: lorenzobuscicchi@hotmail.it
University of Waikato
New Zealand

Bonaventure (1217/1221-1274)

Bonaventure (d. 1274) was a philosopher, a theologian, a prolific author of spiritual treatises, an influential prelate of the Medieval Church, the Minister General of the Franciscan Order, and, later in his life, a Cardinal. He has often been placed in the Augustinian tradition in opposition to the work of his peer, Thomas Aquinas, and his successors in the Franciscan Order, John Duns Scotus and William of Ockham, who relied more heavily on the recent recovery of Aristotle’s philosophical texts and those of Aristotle’s commentators, notably Ibn Rushd. However, a more accurate reading of the relevant sources places Bonaventure at one end of a spectrum of a wide range of classical traditions, Pythagorean, Platonic, Neo-Platonic, Augustinian, and Stoic, as well as that of Aristotle and the commentators, in his effort to develop a distinct philosophy, philosophical theology, and spiritual tradition that remains influential to this day. His philosophy was part and parcel of his greater effort to further the knowledge and love of God; nevertheless, he clearly distinguished his philosophy from his theology, although he did not separate them, and this distinction is the basis for his status as one of the most innovative and influential philosophers of the later Middle Ages—a list that also includes Aquinas, Scotus, Ockham, and, perhaps, Buridan.

Bonaventure derived the architectonic structure of his thought from a Neo-Platonic process that began in the logical analysis of a Divine First Principle, continued in the analysis of the First Principle’s emanation into the created order, and ended in an analysis of the consummation of that order in its reunion with the First Principle from which it came. He was a classical theist, indeed, he contributed to the formation of classical theism, and he advanced the depth of that tradition in his development of a logically rigorous series of epistemic, cosmological, and ontological arguments for the First Principle. He argued that the First Principle created the heavens and earth in time and ex nihilo, contra the dominant opinion of the philosophers of classical antiquity, and he based his argument on the classical paradoxes of infinity. He emphasized the rational soul’s apprehension of the physical realm of being—although he was no empiricist—and argued for a cooperative epistemology, in which the rational soul abstracts concepts from its apprehension of the physical realm of being, but it does so in the light, so to speak, of a divine illumination that renders its judgments of the truth of those concepts certain. He revised a classical eudaimonism, steeped in Aristotelian virtue theory, in the context of the Christian doctrines and spiritual practices of the thirteenth century. He wove these and other elements into a memorable account of the soul’s causal reductio of the cosmos, its efficient cause, final cause, and formal cause, to its origin in the First Principle, and the soul’s moral reformation that renders it fit for ecstatic union with the First Principle.

Table of Contents

  1. Life, Work, and Influence
    1. Life and Work
    2. Influence
  2. The Light of Philosophy
  3. The First Principle
    1. The Properties of the First Principle
    2. The Theory of the Forms
    3. Truth and the Arguments for the First Principle
      1. The Epistemological Argument
      2. The Cosmological Argument
      3. The Ontological Argument
    4. Epistemic Doubt
  4. The Emanation of the Created Order
  5. The Epistemological Process
    1. Apprehension
    2. Delight
    3. Judgment
      1. The Agent Intellect
      2. Divine Illumination
  6. Moral Philosophy
  7. The Ascent of the Soul into Ecstasy
  8. References and Further Reading
    1. Critical Editions
    2. Translations into English
    3. General Introductions
    4. Studies

1. Life, Work, and Influence

a. Life and Work

Giovanni, later Bonaventure, was born in 1217—or perhaps as late as 1221—in the old City of Bagnoregio, the Civita di Bagnoregio, on the border between Tuscany and Lazio in central Italy. The view is striking. The Civita stands atop a scarped hill of volcanic stone that overlooks a valley at the foot of the Apennines. Bonaventure’s home has since collapsed into the valley, but a plaque remains to mark its former location. His father, Giovanni di Fidanza, was reportedly a physician, and his status as a member of the small but relatively prosperous professional classes provided the young Giovanni the opportunity to study at the local Franciscan convent. His mother, Maria di Ritello, was devoted to St. Francis of Assisi (d. 1226), and her devotion provides the context for one of Giovanni’s few autobiographical reflections. He tells us that he suffered a grave illness when he was a young boy, but his mother’s prayers to St. Francis saved him from an early death (Bonaventure, Legenda maior prol. 3). He thus inherited his mother’s devotion to Francis, affectionately known as the Poor One (il Poverello).

Giovanni arrived in Paris in 1234, or perhaps early in 1235, to attend the newly chartered Université de Paris. He may well have found the city overwhelming. Philip II (d. 1223) had transformed France into the most prosperous kingdom in medieval Europe and rebuilt Paris to display its prosperity. He and his descendants oversaw a renaissance in art, architecture, literature, music, and learning. The requirements for the degree in the arts at the University focused on the trivium of the classical liberal arts: grammar, rhetoric, and logic. But they also included the quadrivium, arithmetic, geometry, music, and astronomy, which emphasized the role of number and other mathematical concepts in the structure of the universe, and Aristotle’s texts on philosophy and the natural sciences—the students and masters of the university routinely ignored the prohibitions against studying Aristotle and his commentators, first issued in 1210. Giovanni made good use of these “arts” throughout his career. Priscian’s grammar, the cadences of Cicero and other classical authors, deductive, inductive, and rhetorical argument, the prevalence of the concept of numerical order, and a firm grasp of the then current state of the natural sciences inform the entire range of his works.

Giovanni’s encounter with Alexander of Hales (d. 1245), an innovative Master of Theology at the University, would set the course for his future. Alexander entered the Franciscan Order in 1236 and established what would soon become a vibrant Franciscan school of theology within the University. Giovanni, who regarded Alexander as both his “master” and “father”, followed him into the Order in 1238, or perhaps as late as 1243, and began to study for an advanced degree in theology. He took the name Bonaventure when he entered the Order to celebrate his “good fortune”.

Alexander set the standard for Bonaventure and a long list of other students who emphasized a philosophical approach to the study of the scriptures and theology, with particular attention to Aristotle and Aristotle’s commentators, whose entire corpus, with the notable exception of the Politics, would have been available to Bonaventure as a student of the arts, as well as to the Liber de Causis, an influential Neo-Platonic treatise attributed to Aristotle. Alexander was fundamentally a Christian Platonist in the tradition of Augustine, Anselm, and the School of St. Victor; nevertheless, he was one of the first to incorporate Aristotelian doctrines into the broader Platonic framework that dominated the Franciscan school of thought in the thirteenth century.

Bonaventure continued his studies under Alexander’s successors, Jean de la Rochelle, Odo Rigaud, and William of Melitona. He began his commentaries on the scriptures, Ecclesiastes, Luke, and John, in 1248, and his commentary on the Sentences of Peter Lombard, the standard text for the advanced degree in theology, in 1250. He completed his studies in 1252 and began to lecture, engage in public disputations, and preach—the critical edition of his works includes over 500 sermons preached throughout the course of his life. He received his license to teach (licentia docendi) and succeeded William as Master and Franciscan Chair of Theology in 1254. The Reduction of the Arts to Theology is probably a revision of his inaugural lecture. His works from this period also include a revised version of the Commentary on the Sentences of Peter Lombard, his most extensive treatise in philosophical theology, and a series of disputations On the Knowledge of Christ, in which he presented his first extensive defense of his doctrine of divine illumination, On the Trinity, in which he summarized his arguments for the existence of God, and On Evangelical Perfection, in which he defended the Franciscan commitment to poverty.

But the secular masters of the University—those professors who did not belong to a religious order—refused to recognize his title and position. They had long been at odds with members of the religious orders, who often flouted the rules of the University in deference to those of their own orders. When the secular masters suspended the work of the University in a dispute with the ecclesial authorities of Paris in 1253, the religious orders refused to join them. The secular masters then attempted to expel them from the University. Pope Alexander IV intervened and settled the dispute in favor of the religious orders. The secular masters formally recognized Bonaventure as Chair of Theology in August of 1257, but Bonaventure had already relinquished his title and position. The Franciscan friars had elected him Minister General of the Order in February of 1257.

His initial task as Minister General proved difficult. His predecessor, John of Parma (d. 1289), endorsed some of the heretical tendencies of Joachim of Fiore (d. 1202), who had foretold that a New Age of the Holy Spirit would descend on the faithful and transcend the prominence of Christ, the papacy, Christ’s vicar on earth, and the current ecclesial leadership who served the papacy. John and other Franciscans, notably Gerard of Borgo San Donnino, had identified Francis as the herald of that New Age and his disciples, the Franciscans, as the Order of the Spirit. The papacy and other members of the ecclesial hierarchy formally condemned some aspects of Joachim’s doctrine at the Fourth Lateran Council in 1215 and issued a more thorough condemnation, in response to Franciscan support of his doctrine, at the Council of Arles in 1260.  Bonaventure would display some degree of sympathy for their claims. He, too, insisted that Francis was the Angel of the Sixth Seal who had heralded the start of the New Age of the Spirit, but he also insisted Francis’ disciples remain in full obedience to the current ecclesial hierarchy (Bonaventure, Legenda maior prol. 1).

His second challenge stemmed from a dispute that emerged within the Order during Francis’ lifetime. Francis had practiced a life of extreme poverty in obedience to Christ’s admonition to the rich young man to “sell everything you have and distribute the proceeds to the poor… and come and follow me” (Luke 18:18-30). Francis, like the rich young man, had been rather wealthy until he renounced his father’s inheritance in obedience to the admonition and spent the rest of his life as a charismatic preacher. But many of his followers argued for some degree of mitigation to their life of extreme poverty so they could better serve in other capacities, as administrators, teachers, and more learned preachers. The debate came to a head shortly after Francis’ death, since many of the friars who practiced a more rigorous commitment to poverty also supported Joachim’s apocalypticism. Bonaventure strongly supported Francis’ commitment to poverty as evidenced in his initial Encyclical Letter to the Provincial Ministers of the Order in 1257 and his codification of the Franciscan Rule in the Constitutions of Narbonne in 1260; nevertheless, he also permitted some degree of mitigation for specific purposes in specific contexts—the books, for example, students needed to complete their studies at Paris and other universities to become administrators, teachers, and preachers. Bonaventure maintained the peace in these disputes largely through his own commitment to poverty. But that peace would collapse shortly after his death when the Fraticelli, also known as the Spiritual Franciscans, who argued for a more rigorous life of poverty, and the Conventuals, who argued for a degree of mitigation, would split into factions. Many of the Fraticelli would oppose the papacy and the established hierarchy in their zeal for poverty and suffer censure, imprisonment, and, on occasion, death. 
Boniface VIII pronounced them heretics in 1296, and in 1318 four of the Fraticelli were burned at the stake in Marseille under Pope John XXII.

Bonaventure resided in the convent of Mantes-sur-Seine, to the west of Paris, throughout his term as Minister General and visited the university often—it was the center of the European intellectual world. He also travelled widely, in frequent visits to the friars throughout France, England, Italy, Germany, the Low Countries, and Spain, and he did so on foot, the standard means of transportation for those who had pledged themselves to Francis’ Lady Poverty. His works from this period reveal his careful attention to his friars’ spiritual needs. He published the Breviloquium, a short summary of his philosophical theology, at the behest of the friars in 1257, shortly after his election, and a number of spiritual treatises in which he displays a deft ability to weave his philosophical theology into sophisticated and often moving prose. These include the Soliloquies, a series of the soul’s dialogues with its innermost self in its effort to further its knowledge and love of God, the Threefold Way, a treatise on spiritual development, the Tree of Life, a series of meditations on Christ’s life, death, and resurrection that furthers the late medieval emphasis on the suffering of Christ, and the Longer Life of Saint Francis of Assisi, which would become the most influential biography of the saint until the nineteenth century. But the most influential of these texts is the Soul’s Journey into God (Itinerarium Mentis in Deum), a short summary of the ascent of the soul on the steps of Bonaventure’s reformulation of the Platonic Ladder of Love that ends in an ecstatic union with God. Those interested in Bonaventure’s thought should begin their reading with the Itinerarium.

His final challenge as Minister General dealt directly with the proper relationship between reason and faith. Certain masters at Paris, the so-called radical Aristotelians, had followed Aristotle and his commentators in arguing for a number of doctrines that contradicted the orthodox reading of the Christian scriptures. He met this challenge in a series of Collations, academic conferences in which he singled out their errors: the Collations on the Ten Commandments, the Seven Gifts of the Holy Spirit, and the Six Days of Creation—the last of these remains unfinished. These errors included the eternity of the world, the unicity of the agent intellect in all human beings, the denial of Platonic realism in regard to the theory of metaphysical forms, the denial of God’s direct knowledge of the world, the denial of the freedom of the will, and the denial of reward or punishment in the world to come. Bonaventure provided detailed arguments against each of these positions in his Collations and other works, but his principal response was the concept of Christ as the Center (medium) of all things—neither Aristotle, whom Bonaventure regarded as the Philosopher par excellence, nor his commentators (Bonaventure, Hexaëmeron 1:10-39). Thus, Christ’s teaching and, by extension, the entire scriptures remained the only reliable guard against the tendency of the human intellect to error.

Pope Clement IV attempted to appoint Bonaventure Bishop of York in 1265, but Bonaventure refused the honor. Clement’s death in 1268 then precipitated a papal crisis. Louis IX of France and his younger brother, Charles of Anjou, attempted to intervene, but the Italian Cardinals and a number of other factions resisted. Bonaventure supported a compromise candidate, Teobaldo Visconti, whose election in 1271 brought the crisis to an end. Teobaldo, now Pope Gregory X, appointed Bonaventure the Cardinal Bishop of Albano in 1273, perhaps in gratitude for his support, and called on him to lead the effort to reunify the Roman Catholic Church and the Orthodox Church at the Second Council of Lyon. Once again, his efforts proved instrumental. The Council celebrated the reunion on July 6, 1274. It would not last. The representatives of the Emperor, Michael VIII Palaiologos, and the Orthodox Patriarch had agreed to the terms of union without the support of their clergy or the faithful. Bonaventure passed away unexpectedly shortly thereafter, on July 15, 1274, while the Council was in session. Gregory and the delegates of the Council celebrated his funeral mass. Pope Sixtus IV declared him a saint in 1482 and Sixtus V declared him a Doctor of the Church, the Doctor Seraphicus, in 1588.

b. Influence

Historically, Bonaventure remains the preeminent representative of the Christian Neo-Platonic tradition in the thirteenth century and the last influential representative of that tradition. He was also the last single individual to master the three critical components of the Christian intellectual tradition: philosophy, theology, and spirituality. His prominent disciples in the thirteenth century include Eustace of Arras, Walter of Bruges, John Peckham, William de la Mare, Matthew of Aquasparta, William of Falgar, Richard of Middleton, Roger Marston, and Peter John Olivi.

Bonaventure fell out of favor in the fourteenth century. Scotus, Ockham, and other, less influential philosophers possessed less confidence in reason’s ability to ascend the Ladder of Love without the assistance of faith. They began to dismantle the close-knit harmony between the two that Bonaventure had wrought, and set the stage for the opposition between them that emerged in the Enlightenment.

Nevertheless, the Franciscans revived interest in Bonaventure’s thought in response to his canonization in the fifteenth century and again in the sixteenth. The Conventual Franciscans, one of the three current branches of the medieval Order of Friars Minor, established the College of St. Bonaventure in Rome in 1587 to further interest in Bonaventure’s thought. They produced the first edition of his works shortly thereafter in 1588-1599, revised in 1609, 1678, 1751, and, finally, in 1860. The Conventuals and other Franciscans also supported the effort to establish medieval philosophy as a distinct field of academic inquiry in the nineteenth century, and rallied to include Bonaventure in the standard canon of medieval philosophers. The Observant branch of the Friars Minor founded the College of St. Bonaventure in Quaracchi, just outside Florence, in 1877 to prepare a new critical edition of Bonaventure’s works in support of this effort. It appeared in 1882-1902 and remains, with some relatively minor revisions, the foundation of current scholarship.

Bonaventure’s philosophy continued, and still continues, to command considerable interest, particularly among historians of medieval thought and philosophers in the Roman Catholic and other Christian traditions in their effort to distinguish philosophy from theology and develop a metaphysics, epistemology, ethics, and even aesthetics within their respective traditions. Notable examples include Malebranche, Gioberti, and other ontologists who revived a robust Platonism to argue that the human intellect possesses direct access to the divine ideas, Tillich and other Christian existentialists who developed Bonaventure’s epistemology into an existential encounter with the “truth” of the Divine Being, and, most recently, Delio, among others, who have relied on Bonaventure and the wider Franciscan intellectual tradition in their attempt to solve current problems in environmental ethics, health care, and other areas of social justice.

2. The Light of Philosophy

Bonaventure’s reputation as a philosopher had been the subject of debate throughout the nineteenth and early twentieth centuries. He never penned a philosophical treatise independent of his theological convictions, such as Aquinas’ On Being and Essence, and he embedded his philosophy within his theological and spiritual treatises to a greater extent than Aquinas, Scotus, Ockham, and other medieval philosophers. Nevertheless, he clearly distinguished philosophy from theology and insisted on the essential role of reason in the practice of theology, the rational reflection on the data of revelation contained in the Christian scriptures. This distinction provides the basis for a successful survey of his philosophy. But its integral role in his larger enterprise, the rational reflection on the data of revelation, requires some degree of reference to fields that normally fall outside the scope of philosophy as practiced today, namely, his theology, spirituality, and, on occasion, even mysticism.

Bonaventure classified philosophy as a rational “light” (lux), a gift from God, “the Father of Lights and the Giver of every good and perfect gift” (Bonaventure, De reductione artium 1). It would prove critical in Bonaventure’s overarching goal to further his reader’s knowledge and love of God, and an indispensable “handmaiden” to theology, the greater “light” and the “queen of the sciences”. It reveals intelligible truth. It inquires into the cause of being, the principles of knowledge, and the proper order of the human person’s life. It is a critical component in a Christian system of education (paideia) with its roots in the thought of Clement of Alexandria, Augustine, Boethius, and Capella, who first delineated the classical system of the seven liberal arts. But it is also important to note that it possessed a wider range of denotation than it does today: according to Bonaventure, philosophy included grammar, rhetoric, and logic, mathematics, physics, and the natural sciences, ethics, household economics, politics, and metaphysics, in sum, rational investigation into the full extent of the created order and its Creator independent of the data of revelation.

The light of reason also played a critical role in Bonaventure’s approach to theology (Bonaventure, De reductione artium 5). Alexander and his heirs in the Franciscan school at Paris had pioneered the transformation of theology into a rationally demonstrative science on the basis of Aristotle’s conception of scientia. Bonaventure brought those efforts to perfection in a rigorous causal analysis of the discipline (Bonaventure, 1 Sent. prol., q. 1-4). Its subject is the sum total of all things, God, the First Principle, the absolute first cause of all other things, and the full extent of God’s creation revealed in the scriptures and the long list of councils, creeds, and commentaries on the doctrine contained in its pages. Its method, the “sharp teeth” of rational inquiry, analysis, and argument (Aristotle, Physics 2.9). And its goal, the perfection of the knowledge and love of God that ends in an ecstatic union with God. But for what purpose? Why engage in a rational demonstration of the faith rather than a pious reading of the scriptures? Bonaventure listed three reasons: (1) the defense of the faith against those who oppose it, (2) the support of those who doubt it, and (3) the delight of the rational soul of those who believe it. “There is nothing,” Bonaventure explained, “which we understand with greater pleasure than those things which we already believe” (Bonaventure, 1 Sent. prol., q. 2, resp.). Nevertheless, reason was a handmaiden who knew her own mind. Bonaventure routinely admonished his readers against the “darkness of error” that diminished the light of intelligible truth and led them into the sin of pride: “Many philosophers,” he lamented, “have become fools. They boasted of their knowledge and have become like Lucifer” (Bonaventure, Hexaëmeron 4:1).

The evaluation of Bonaventure’s status as a philosopher had been closely bound to his attitude toward Aristotle and Aristotle’s commentators. Mandonnet had placed Bonaventure within a Neo-Platonic school of thought, largely Augustinian, that rejected Aristotle and his commentators and failed to develop a formal distinction between philosophy and theology (Quinn, Historical Constitution, 17-99). Van Steenberghen had argued that Bonaventure relied on a wide range of sources, Platonic, Neo-Platonic, and Aristotelian, for his philosophy, but it was an eclectic philosophy that served only to provide ad hoc support for his theological doctrines. Gilson had argued that Bonaventure developed a Christian Neo-Platonic philosophy, largely Augustinian, distinct from theology but in support of it, and he did so in a deliberate effort to distance himself from the radical Aristotelians who opposed the doctrinal positions of the Christian tradition. The debate on particular aspects of Bonaventure’s status as a philosopher and his debt to Aristotle and Aristotle’s commentators continues, but current consensus recognizes that Bonaventure developed a distinct and cohesive philosophy in support of his theology, and that he relied on a wide range of sources, Pythagorean, Platonic, Neo-Platonic, Stoic, Augustinian, and even Aristotelian to do so.

3. The First Principle

Bonaventure began the comprehensive presentations of his philosophy and philosophical theology in the beginning (in principio), in a statement of faith that testifies to the existence of the First Principle (Primum Principium) of Genesis, the God of Abraham, Isaac, and Jacob or, more specifically, God the Father, the first person of the Christian Trinity (Bonaventure, 1 Sent. d. 2, a. 1, q. 1; Breviloquium 1.1; and Itinerarium prol. 1). But he also insisted that this Principle is the fundamental cause of each and every other thing in heaven and earth and so, through the rational reductio of each thing to its efficient, formal, and final cause, this Principle is known to the human intellect independent of divine revelation. It is also common, in some form, to the philosophical traditions of classical antiquity, Pythagorean, Platonic, Neo-Platonic, Peripatetic, and Stoic. Indeed, Bonaventure absorbed much of that heritage in his own exposition of the existence and nature of the One God.

a. The Properties of the First Principle

He also developed a philosophical description of the fundamental properties of the First Principle on the basis of that classical heritage (Bonaventure, Itinerarium 5.5). The First Principle is being itself (ipsum esse). It comes neither from nothing nor from something else and is the absolute first cause of every other thing (Bonaventure, 1 Sent. d. 28, a. 1, q. 2 ad 4). If not, it would possess some degree of potential and not be absolute being. It is also eternal, simple, actual, perfect, and one in the sense of its numerical unity and the simplicity of its internal unification. If not, it would, again, possess some degree of potential and thus not be absolute being. Bonaventure developed slightly different lists of these divine properties throughout his works, but they all share the common root in the concept of absolute being.

b. The Theory of the Forms

Bonaventure’s arguments for the existence of the First Principle depended on his revision of the Platonic theory of the forms and an analysis of truth on the basis of those formal principles. Plato had developed his theory in response to a problem Heraclitus first proposed. All things in the physical realm of being are in a constant state of change. So much so that when we claim to know them, we fail. The things we claim to know no longer exist. They have changed into something else. Our claim of knowledge, then, is at best a fleeting glimpse of the past and a fiction in the present. But Plato argued that we do, in fact, know things and we know some of them with certainty, such as mathematical principles and evaluative concepts like justice. If so, what accounts for them? Plato proposed his theory of forms to answer the question. The forms (eíde) are the paradigmatic exemplars of the things they inform. They exist in a permanent realm of being independent of the things they inform, and they persist in spite of the changes within those things. The mind grasps the forms through its recollection (anamnesis) of them or, in the testimony of Diotima in the Symposium, in an ecstatic vision of those forms in themselves.

But Plato and his successors, notably Aristotle, continued to debate particular aspects of the theory. Do the forms exist within a divine realm of being (ante rem) independent of the things they inform? Do they serve as exemplars so that the individual instantiations of those forms in the physical realm of being imitate them? Do they exist in the things they inform so that those things participate, in some way, in the forms (in re)? Do they exist in the mind that conceives them (post rem)? If so, how does the mind acquire those forms? Or, as later philosophers would argue, are they merely a linguistic expression (flatus vocis)? Or some combination of the above?

Bonaventure relied on Plato’s theory, transmitted through Augustine, and Aristotle’s criticisms of that theory to develop a robust “three-fold” solution that embraced the full spectrum of possibilities (Bonaventure, Itinerarium 1.3; Christus unus magister 18). The forms, he argued, exist eternally in the Eternal Art (ante rem), in the things they inform (in re), and in the mind that apprehends them (post rem), and this included their expression in the speech of the person who apprehends them (flatus vocis). They serve as exemplars so that the individual instantiations of them in the physical realm of being imitate them, and the things they inform participate in those forms through their presence in re. Finally, the mind acquires them, Bonaventure argued, through the cooperative effort of the rational soul that abstracts them from its sensory apprehension of the things they inform and the illumination of the Eternal Art that preserves the certainty of their truth (Bonaventure, Itinerarium 2.9).

Bonaventure’s commitment to a robust theory of Platonic realism in regard to the forms has earned him the title of the last of the great Platonists of the Middle Ages, but the praise is a thinly veiled criticism. It often implies his endorsement of a dead end in philosophical metaphysics in contrast to more enlightened philosophers, such as Aquinas, Scotus, and Ockham, who would reject a robust Platonism in their anticipation of a more thoroughly rational and naturalistic metaphysics. But it is important to note that Platonism endured. Renaissance Platonists, with the benefit of the full scope of the Platonic corpus, would reinvigorate a tradition that survived and often thrived in subsequent generations and continues to do so, particularly in the field of the philosophy of mathematics, to the present day.

c. Truth and the Arguments for the First Principle

Bonaventure’s arguments for the existence of the First Principle remain impressive for their depth and breadth (Houser, 9). He classified his arguments for the existence of the First Principle on the basis of the metaphysical forms, ante rem, in re, and post rem, and a type of correspondence theory of truth mapped onto the three fundamental divisions of the Neo-Platonic concept of being (esse): the cosmological truth of physical being, the epistemological truth of intelligible being, and the ontological truth of Divine Being (Bonaventure, 1 Sent. d. 8, p. 1, a. 1, q. 1). Cosmological truth depends on the correspondence between an object and its form in the divine mind (ante rem), epistemological truth on the correspondence between an object and its intelligible form in the human mind (post rem), and ontological truth on the correspondence between an object and the form within it (in re) that renders it into a particular type of thing. But the First Principle, the absolute origin of every other thing, does not possess the material principle of potential. It is the “pure” act of being—a concept Bonaventure will develop throughout his arguments. It and It alone is the perfect instantiation of its form, so to speak, and, thus, It and It alone is necessarily true in Itself.

i. The Epistemological Argument

Bonaventure began with the epistemological argument which, he claimed, is certain, but added that the arguments in the other categories are more certain. His initial formulation of the argument asserted that the rational soul, in its self-reflection, recognizes the “impression” (impressio) of the First Principle on its higher faculties, its memory, intellect, and will, and their proper end in the knowledge and love of that First Principle (Bonaventure, Mysterio Trinitatis q.1, a.1 fund. 1-10). The argument is not as viciously circular as it appears. Bonaventure contended that the soul possesses an innate desire for knowledge and love that remains unsated in the physical and intelligible realms of being. These realms, as Bonaventure will explain in more detail in his revision of the argument, possess a degree of potential that necessarily renders them less than fully satisfying. But, as Aristotle had frequently insisted, nature does nothing in vain. Thus, per the process of elimination, the soul finds satisfaction in the knowledge and love of a Divine Being.

He revised the epistemological argument in his later works into a more sophisticated, and less circular, argument from divine illumination (Bonaventure, Scientia Christi q. 4; Christus unus magister 6-10, 18; and Itinerarium 2.9).

  1. The rational soul (mens) possesses knowledge of certain truth.

Bonaventure presumed that the soul possesses certain truth. Paradigmatic examples include the principles of discrete quantity, the point, the instant, and the unit, and the logical axioms of the Aristotelian sciences, such as the principle of non-contradiction. He also cited Plato’s account of the young boy who possessed an innate knowledge of the principles of geometry that enabled him, without the benefit of formal education, to successfully double the area of a square.

  2. The rational soul is fallible and the object of its knowledge mutable and thus fails to provide the proper basis for certain truth.

Bonaventure appears to have some sympathy for Heraclitus’ argument that the world is in a constant state of change. If so, our knowledge of the world at any given time is accurate for only a fleeting passage of time. But the thrust of his argument in support of this premise stems from his conception of truth in relation to his theory of the metaphysical forms that exist in the divine realm of being (ante rem), in the intelligible (post rem), and in the physical (in re). Empirical observation reveals that the physical realm of being is always in some degree of potency in relation to its participation in the metaphysical forms in re. This potential testifies to its existence ex nihilo—it is the root cause of its possession of some degree of potency. Its truth depends on the degree to which it participates in its metaphysical form in re and the degree to which it imitates its form ante rem. Thus, its truth necessarily falls short of its ideal. The soul, also created ex nihilo, also possesses some degree of potency in relation to its ideal. Thus, its abstraction of these imperfect forms from the physical realm of being is fallible.

  3. Therefore, the rational soul relies on a divine “light” to render itself infallible and the object of its knowledge immutable.

Bonaventure divided the sum total of the cosmos into three realms of being without remainder: physical, intelligible, and divine. Thus, per the process of elimination, the soul relies on an “eternal reason” from the divine realm of being for its possession of certain truth.
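The mathematical example Bonaventure cites under the first premise—Plato’s account of the untutored boy who doubles the area of a square—turns on a simple geometric fact, stated here in modern notation as a gloss (the algebra is ours, not Bonaventure’s):

```latex
% A square of side $s$ has area $s^2$. In Plato's account the boy is led
% to construct a new square on the diagonal of the original. By the
% Pythagorean theorem the diagonal has length
d = \sqrt{s^2 + s^2} = s\sqrt{2},
% so the square built on the diagonal has area
d^2 = \left(s\sqrt{2}\right)^2 = 2s^2,
% exactly double the original---a certain truth the boy reaches by
% reasoning alone, without formal instruction.
```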

ii. The Cosmological Argument

Bonaventure relied on Aristotle’s analysis of the concept of being (esse) into being-in-potency and being-in-act for his cosmological argument (Mysterio Trinitatis q. 1, a. 1 fund. 11-20). He began with a series of empirical observations. The physical realm of being possesses a number of conditions that reveal it is being-in-potency. It is posterior, from another, possible, and so on. Its possession of these properties indicates that it depends on something else to account for its existence, something prior, something not from another, something necessary. In fact, its being-in-potency reveals its origin ex nihilo, and it continues to possess some degree of the “nothingness” from which it came—contra Aristotle, who had argued for the eternal existence of a continuous substratum of matter on the basis of Parmenides’ axiom that nothing comes ex nihilo. Thus, it “cries out” its dependence on something prior, something not from another, something necessary. This being cannot be another instance of physical being, since each and every physical thing is posterior, from another, possible, and so on. It cannot be intelligible being, which shares the same dependencies. Thus, again per the process of elimination, it is the Divine Being.

iii. The Ontological Argument

Bonaventure was the first philosopher of the thirteenth century to possess a thorough grasp of the ontological argument and advance its formulation (Seifert, Si Deus est Deus). He did so in two significant directions. First, he provided an affirmative complement to the emphasis on the logical contradiction of the reductio ad absurdum common among traditional forms of the argument (Bonaventure, Mysterio Trinitatis q. 1, a.1 fund. 21-29 and Itinerarium mentis in Deum 5.3). He began with the first of Aristotle’s conceptual categories of being, non-being, and argued that the concept of non-being is a privation of the concept of being and thus presupposes the concept of being. The next category, being-in-potency, refers to the potential inherent in things to change. An acorn, for example, possesses the potential to become a sapling. The final category, being-in-act, refers to the degree to which something has realized its potential. The thing that had been an acorn is now a sapling. Thus, the concept of being-in-potency depends on the concept of being-in-act. But if so, then the concept of being-in-act depends on a final conceptual category, the concept of a “pure” act of being without potential, and this final concept, Bonaventure argued, is being itself (ipsum esse).

Second, he reinforced the argument to include the transcendental properties of being, the one, the true, and the good. The concept of the transcendentals (transcendentia) was a distinctive innovation in the effort of medieval philosophers and theologians to reengage the sources of the syncretic philosophical systems of late antiquity, such as Porphyry’s Introduction to Aristotle’s Categories, Boethius’ On the Cycles of the Seven Days—often cited by its Latin title, De hebdomadibus—and Dionysius’ On the Divine Names. The aim was to identify the most common notions (communissima) of the concrete existence of being (ens) that “transcended” the traditional Peripatetic division of things, or perhaps the names of things, into the categories of substance and its accidents: quantity, quality, relation, and so on. Each and every particular thing that exists in the physical realm of being is one, true, and good. But it is imperfectly one, true, and good. It is one-in-potency, truth-in-potency, and good-in-potency. It depends on one-in-act, truth-in-act, and good-in-act, and thus on the “pure” act of being, the one, the true, and the good.

His final step is to locate this being itself, the one, the true, and the good, within the Neo-Platonic division of being. It does not fall within the category of the physical realm of being “which is mixed with potency”. It does exist within the intelligible realm of being, but not entirely so. If it existed in the rational soul and only in the soul, it would exist only as a concept, an intelligible fictum, and thus possess “only a minimal degree of being”. And so, per the process of elimination, being itself is the Divine Being.

d. Epistemic Doubt

Bonaventure insisted that these arguments are self-evident (per se notum) and thus indubitable. But if so, what accounts for doubt or disbelief? Bonaventure located the root of doubt in a series of objective failures, a vague definition of terms, insufficient evidence in support of the truth of propositions, or a formal error in the logical process (Bonaventure, 1 Sent. d. 8, p. 1, a. 1, q. 2 resp.). None of these, he argued, applied in the case of his arguments for the existence of God. Nevertheless, he recognized the reality of those who would deny them. But if so, what accounts for their denial? Either a subjective failure to grasp the terms, the propositions, or the arguments or, perversely, a willful refusal to do so.

4. The Emanation of the Created Order

Bonaventure’s account of creation depended on Plato’s Timaeus which he read in a careful counterpoint with Genesis as well as Aristotle’s Physics and other texts in natural philosophy. It begins with a revision of Plato’s myth of the Divine Architect (Bonaventure, De reductione artium 11-14). The First Principle, God the Father, similar to Plato’s Divine Architect, carefully studied the metaphysical forms in God the Son, the Eternal Art “in whom He has disposed all things” (Wisdom 11:21). The Father then fashioned the artifacts (artificia) of the created realm of being in imitation of those formal exemplars and declared them good in terms of their utility and their beauty. Finally, He created human beings in the image of Himself, so that they would recognize the presence of the artist in the work of His hands and praise Him, serve Him, and delight in Him.

Bonaventure’s description of the Divine Architect differs from Plato’s in one crucial respect. His fidelity to the orthodox formula of the ecumenical councils compelled him to insist that the First Principle exists in one and the same substance (ousia) with the Eternal Art (Breviloquium 1.2). The distinction between them, in the phrase of the councils, is entirely personal (hypostatic). Bonaventure argued that this distinction is the result of their respective modes of origin. God the Father, the first of the divine hypostases, is the absolute First Principle without beginning, and the Son, the second divine hypostasis, comes from the Father. Philosophically, Bonaventure’s distinction between the two on the basis of origin seems too metaphysically thin to account for a real distinction between them. Nevertheless, his commitment to orthodox monotheism precluded a substantial distinction between the two.

Bonaventure also insisted that the First Principle created the world in time and out of nothing (Bonaventure, 2 Sent. d. 1, p. 1, a. 1, q. 2). Aristotle proposed the first detailed argument for the eternity of the world and most ancient philosophers, notably Proclus, endorsed it. Jewish and Christian philosophers, notably Augustine, developed the doctrine of creation in time and ex nihilo to oppose it. But the rediscovery of Aristotle’s natural philosophy in the twelfth century revived the debate. Bonaventure was the first of the philosopher-theologians of the later Middle Ages to possess a firm grasp of the classical arguments for and against the proposition, particularly Philoponus’ reflection on the paradoxes of infinity, the impossibility of the addition, order, traversal, comprehension, and simultaneous existence of an infinite number of physical entities (Dales, 86-108). He may not have thought the argument against the eternity of the world on the basis of these paradoxes was strictly demonstrative; nevertheless, he clearly thought that the eternal existence of a created world was an impossibility.

Bonaventure relied on three closely related concepts, the Aristotelian principle of matter, the Stoic concept of causal principles, and the Neo-Platonic concept of metaphysical light, to further develop his account of creation (Bonaventure, Breviloquium 2.1-12). He relied on Aristotle’s principle of matter, distinct from matter in the sense of concrete, physical things, to account for the continuity of those things through change. The metaphysical forms within them rendered them into something particular and directed the changes within them in the course of time. The principle of matter received those forms, and rendered those concrete, physical particulars into stable receptacles of change. It was the foundation (fundamentum) of the physical realm of being. But Bonaventure also argued, contra Aristotle, that the rational souls of human persons and other intelligent creatures in the intelligible realm of being, angels and even the devil and his demons, possess the potential to change and thus they possess the same principle of matter. Their metaphysical forms rendered them into something intelligible, distinct from other concrete, physical things, but those forms subsist in the same material principle. Bonaventure’s doctrine of universal hylomorphism, in which both physical and intelligible creatures possess the principles of matter and form, is an essential feature in his distinction between the First Principle, who does not change (Malachi 3:6), and Its creation.

Early Jewish and Christian philosophers developed the Stoic concept of causal principles or “seeds” (rationes seminales) to account for the potential within concrete particulars to change. The metaphysical forms that informed those changes exist in potentia within them. They exist as metaphysical seeds, so to speak, that will develop into forms in re in the fullness of time in response to a secondary agent, such as the human artist who, in imitation of the Eternal Art, creates artifacts that are useful and beautiful. Bonaventure insisted on the presence of these metaphysical seeds in his rejection of Ibn Sina’s doctrine of occasionalism, in which the First Principle, and the First Principle alone, is the efficient cause of each and every change in the created realm of being. Bonaventure argued, contra Ibn Sina, that the dignity of the human person, created in the image of God, demands that it, too, serve as an efficient cause in the created order and cooperate with Him in its effort to know and love Him.

Bonaventure developed his light metaphysics in opposition to Aristotle, who had argued that light was an accidental form that rendered things bright, not a substantial form (Bonaventure, 2 Sent. d. 13). Aristotle’s approach accounted for bright things, such as the sun, the moon, and the stars, but did not allow for the existence of light in itself. Bonaventure favored the less popular but more syncretic approach of Grosseteste, who argued that a metaphysical light (lux) was the substantial form of corporeity. It bestowed extension on physical things and rendered them visible—the fundamental properties of all physical things. It also prepared them for further formation. Bonaventure was a proponent of the “most famous pair” (binarium famosissimum), the paired theses of (near) universal hylomorphism and the plurality of metaphysical forms that distinguished the Franciscan school of thought throughout the thirteenth century. He argued that all created things possess the metaphysical attributes of matter and form—his advocacy of the doctrine of divine simplicity precluded his application of the thesis to the Divinity. He also argued that a series of forms determined the precise nature of each thing. The form of light (lux) was common to all physical things, but other forms rendered them into particular types of physical things according to an Aristotelian hierarchy of forms: the nutritive form common to all living things, the sensory form common to all animals, and the rational form, or soul, which distinguishes the human person from other terrestrial creatures.

The First Principle wove these threads together in Its creation of the corporeal light (lumen) on the first day of creation. This light was a single, undifferentiated physical substance in itself, not an accidental property of something else, extended in space and, in potential, time. It possessed the inchoate forms that would guide its further development and stood, the principle of matter made manifest, as the cornerstone of the physical cosmos.

The First Principle divided this primal light into three realms of physical being (naturae), the luminous on the first day of creation, the translucent on the second, and the opaque on the third, and then filled them with their proper inhabitants on the subsequent days of creation. The luminous realm consists of the purest form of metaphysical light (lux) and corresponds to the heavens, bright with the light of its form and a modest amount of prime matter. The translucent realm consists of air and, to a lesser extent, water, and contains a less pure degree of the primordial light in its mixture with prime matter. The opaque consists of earth and contains the least pure degree of light in its mixture. He relied on Aristotelian cosmology to further divide the cosmos into the empyrean heaven, the crystalline heaven, and the firmament; the planetary spheres, Saturn, Jupiter, Mars, the Sun, Venus, Mercury, and the Moon; the elemental natures of fire, air, water, and earth; and, finally, the four qualities, the hot, the cold, the wet, and the dry, the most basic elements of Aristotelian physics. The heavenly spheres, he explained, correspond to the luminous realm. The elemental natures of air, water, and earth correspond to the sublunar realms and contain the birds of the air, the fish of the sea, and each and every thing that crawls on the earth. Fire is a special case. Although elemental, it shares much in common with the luminous, and thus consumes the air around it in its effort to rise to the heavens.

The process came to its end in the formation of the human person in the image of God. Bonaventure adopted a definition of the human person common among Jewish, Christian, and Islamic philosophers and theologians throughout the Middle Ages: the human person is a composite of a soul (anima) and body, “formed from the mire (limus) of the earth” (Breviloquium 2.10). The human soul is the metaphysical forma of its body. It perfects its body in so far as its union with its body brings the act of creation to its proper end in the formation of the human person, the sum of all creation, in the image of God. It then directs its body in the completion of its principal task, to enable the human person to recognize creation’s testimony to its Creator so that it might come to its proper end in union with its Creator.

He distinguished his definition of the human composite from his peers in his juxtaposition of two convictions that initially seem to oppose one another: the ontological independence of the soul as a substantial, self-subsisting entity and the degree to which he emphasized the soul’s disposition to unite with its body. Plato and his heirs, who had insisted on the soul’s substantial independence, tended to denigrate its relationship with the body. Plotinus’ complaint is indicative if hyperbolic: Porphyry, his biographer, tells us that he “seemed ashamed of being in the body” (Porphyry, On the Life of Plotinus 1). Bonaventure rejected this tendency. He agreed that the soul is an independent substance on the basis of his conviction that it possesses its own passive potential. It is able to live, perceive, reason, and will independently of its body in this life and the next and, after its reunion with a new, “spiritual” body, in its eternal contemplation of God. The soul, Bonaventure insisted, is something in itself (Bonaventure, 2 Sent. d. 17, a. 1, q. 2). The human spirit is a fully functioning organism with or without its corporeal body.

But he also argued that the soul is the active principle that brings existence to the human composite in its union with its body and enables it to function properly in the physical realm of being (Bonaventure, 4 Sent. d. 43, a. 1, q. 1, fund. 5). Thus, the soul possesses an innate tendency to unite with its body (unibilitas). The soul is ordered to its body, not imprisoned within it. It realizes its perfection in union with its body, not in spite of it; and with its body, it engages in the cognitive reductio that leads to its proper end in the knowledge of God and ecstatic union with God. Its relationship with its body is so intimate that it no longer functions properly at the moment of its body’s death. It yearns for its reunion with its risen body in the world to come, a clear, impassible, subtle, and agile body that furthers its access to the beatific vision.

5. The Epistemological Process

Bonaventure began his account of the epistemological process with a classical theme common throughout the Middle Ages: the human person, body and soul, is a microcosm (minor mundus) of the wider world (Bonaventure, Itinerarium 2.2). Its body consists of the perfect proportion of the fundamental elements that comprise the physical realm of being, the primordial light (lux) of the first day of creation that regulates the composition of the other four elements: the earth that renders the body into something substantial, the air that gives it breath, the bodily fluids that regulate its health, and the fire that instills the physiological basis for its passions. Its soul (anima) renders it into the most well-developed of all creatures in its capacities for nutrition, which it shares with the plant kingdom, sensation, which it shares with the animal kingdom, and reason, which belongs to rational creatures alone. But above all, it possesses the capacity to know all things throughout the full extent of the created order, the luminous, translucent, and opaque realms of the cosmos.

Bonaventure divided the epistemological process through which the rational soul (mens), the definitive aspect of the human person, comes to know all things into three distinct stages: apprehension, delight, and rational judgment.

a. Apprehension

Bonaventure’s theory of sense apprehension (apprehensio) depended on the state of the physical sciences in the early thirteenth century: psychology, biology, physiology, neurology, and physics. He located the start of the process in the rational soul’s sensory apprehension of the physical realm of being. But Bonaventure was not an empiricist. He admitted that the soul possesses the rational principles that enable it to reason in its estimation of the physical realm of being and the ability to know itself and other intelligible beings, namely, angels, the devil, demons, and the Divine Being. The internal sense is the first to engage the physical realm of being. It determines the degree of threat the physical realm poses and thus serves to protect the soul and its body from harm. The next series of senses includes the more familiar senses of sight, hearing, smell, taste, and touch. Each sense is a “door” (porta) that opens onto a particular element or combination of elements. Sight opens to the primordial light of the first day of creation; hearing opens to air; smell to “vapors”, an admixture of air, water, and fire, the remnant of the elemental particle of heat; taste to water; and touch to earth. Each sense also apprehends common aspects of physical things: their number, size, physical form, rest, and motion.

Bonaventure insisted that, for most things, the human person invokes each of the senses in tandem with the others. Each sense opens onto particular properties inherent within physical things and when the rational soul applies them in conjunction with one another, they provide a comprehensive grasp of the universe in its totality. Some things, like the morning star, remain so bright, so pure, that they are accessible only to sight. But most things contain a more thorough mixture of the primordial light of creation and the more substantial elements of earth, air, fire, and water that in themselves consist of the fundamental particles of medieval physics, the hot, the cold, the wet, and the dry. The rational soul’s apprehension of the macrocosmus as a whole demands the use of all its senses.

Bonaventure’s metaphor of the door reveals his debt to an Aristotelian intromission theory of sense perception. The sense organs, the eyes, ears, nose, and so on, are passive, but the metaphysical light within the objects of the senses renders them into something active. They shine, so to speak, and impress a likeness (similitudo) of themselves onto the media that surround them, the light, fire, air, water, or in some cases, earth. Each impression is an exemplum of an exemplar, like the wax impression of a signet ring. These impressions in the media impress another exemplum of themselves onto the exterior sense organs, the eyes, ears, nose, and so on. These impress an image of themselves onto the inner sense organs of the nervous system and, finally, onto the apprehension, the outermost layer of the mind. The physical realm of being is filled with the “brightness” of its self-disclosure, “loud” with its cries, pungent, savory, and tangible. The soul cannot escape its self-disclosure. It remains in “tangible” contact with each and every thing it apprehends—albeit through a series of intermediary impressions.

These sensory species or similitudines within the soul’s apprehension are “intentions” in the sense of signs, rarified, information bearing objects within itself, not merely the soul’s awareness of the objects of its apprehension. They contain information about the way things look, sound, smell, taste, and feel, information about their size, shape, whether they are in rest or in motion, hic et nunc, the imprint of the concrete reality of physical being. But the soul’s apprehension of them is an apprehension of the impression of a series of impressions of the object, not the object in itself, and the subtle decline in the accuracy of each impression accounts for the errors of perception.

b. Delight

Bonaventure emphasized the role of delight (delectatio) in this process to a greater extent than his peers (Lang, Bonaventure’s Delight in Sensation). He identified three sources of the rational soul’s delight in its apprehension of the “abstracted” impression: the beautiful (speciositas), the agreeable (suavitas), and the good (salubritas). (It is clear from the context of the passage that the soul delights in its “abstraction” of a sensory impression at this stage of the process, not in its abstraction of a metaphysical form from those sensory impressions.) His immediate source for the innovation was the incipit of Aristotle’s Metaphysics: “All men [and women] naturally desire to know” and the further claim that the “delight” they take in their senses is evidence of that desire (Aristotle, Metaphysics 1.1). But he derived his classification of those delights from a long tradition of Pythagorean, Platonic, and Peripatetic texts on the proper objects of natural desire, distilled in Ibn Sina’s On First Philosophy.

The first of these three, the soul’s delight in its apprehension of beauty, was wholly original. It does not appear in the pertinent sources. It refers to the “proportion” (proportio) between the sensory impression and its object, the sensory impression of a sunset, for example, and the sunset itself. Thus, beauty is subject to truth. The greater the degree of similarity between the sensory impression of an object and the object, the greater the degree of its truth and thus its beauty. The second, agreeableness, was common in the pertinent sources. It refers to the proportion between the sensory impression and the media through which it passes. A pleasant light, for example, is proportional to its media. A blinding light is disproportional. The third, goodness, was also common. It refers to the proportion between a sensory impression and the needs of the recipient, like a cool glass of water on a hot day.

Bonaventure aligned particular delights with particular senses, beauty with sight, agreeableness with hearing or smell, and goodness with taste or touch, but he did so “through appropriation in the manner of speech” (appropriate loquendo), that is, to “speak” about what is “proper” to each sense, not what is exclusive to each of them. The soul’s delight in the beauty of an object is most proper to sight, not restricted to it, and so, too, for the other forms of delight and their proper senses. The soul can also delight in the beauty of the sound of well-proportioned verse, the smell of a well-proportioned perfume, the taste of well-proportioned ingredients, or even the touch of a well-proportioned body. The soul is able to access beauty through all its senses, and the loss of one or more of them does not deny it the opportunity to delight in the beauty of the world.

Finally, it is important to note that he distinguished between the beautiful, agreeable, and good properties within the soul’s apprehensions of things, not beautiful, agreeable, or good things. The same objects are, at once, beautiful, agreeable, and good.

The similarity between Bonaventure’s distinction of the soul’s delight in speciositas, suavitas, and salubritas and Kant’s seminal distinction of the beautiful (das Schöne), the agreeable (das Angenehme), and the good (das Gute) is striking (Kant, Critique of Judgment 5). But the lists are not coordinate. Bonaventure’s concept of suavitas is comparable to Kant’s concept of das Angenehme, but neither his conceptualization of speciositas to Kant’s das Schöne nor his conceptualization of salubritas to Kant’s das Gute. Bonaventure’s conceptualization of the soul’s pleasure in the apprehension of beauty, speciositas, depends on the degree of correspondence between the soul’s apprehension of the sensible species and its proper object, not on the free play of the mind’s higher cognitive faculties, the intellect and the imagination; and his conceptualization of the pleasure in salubritas depends on the wholesomeness of the object, not on the degree of esteem or approval we place upon it. Nor is there evidence for Kant’s familiarity with Bonaventure’s text. The best explanation for the similarity is that Kant relied on the common themes of antiquity, namely, the beauty of proper proportions and the pleasure in the contemplation of them, and perhaps common texts, but not Bonaventure’s direct influence on Kant.

c. Judgment

Bonaventure brought his account of this epistemological process to completion in its third stage, rational judgment (diiudicatio). It is in this stage, and only this stage, that the soul determines the reason for its delight, and it does so in its abstraction of concepts, the metaphysical forms in re, from the sensory species in its apprehension. He developed an innovative two-part process to account for the rational soul’s abstraction of concepts from the sensible species: an Aristotelian abstraction theory of concept formation and a Platonic doctrine of divine illumination that rendered its judgments on the basis of those species certain. Most other philosophers depended on one or the other, or principally one or the other, not both.

i. The Agent Intellect

Bonaventure depended on the common distinction between two fundamental powers (potentia) of the intellect for his development of an abstraction theory of concept formation: the active power (intellectus agens) and the passive (intellectus possibilis). The active power abstracts the intelligible forms from the sensory species and impresses them in the intellect. The passive power receives the impressions of those intelligible forms. But he also insisted that the agent power subsists in one and the same substance (substantia) with the possible. He did so to distinguish his theory from those who identified the active power with a distinct substance, the Divine Being or a semi-divine intelligence. Such an identification, he argued, would render the human person entirely passive in its acquisition of knowledge and reduce its dignity as a rational creature in the image of God. Thus, the phrase “intellectus agens” refers to a distinction (differentia) in the action of one and the same intellectual faculty. It is a natural “light” that “shines” on the intelligible properties of the sensible species and reveals them. It makes them “known” and then “impresses” them upon the intellect. It also depends on the potential of the intellect to do so. The agent power, in itself, cannot retain the impression of the forms, and the intellect’s potential, in itself, is unable to abstract them. The intellect requires the interdependent actions of both its active and passive powers to function properly.

ii. Divine Illumination

Bonaventure insisted that the rational soul possesses the innate ability to abstract the intelligible species from the impressions of its sensory apprehension and thus come to know the created order without the assistance of the Divine Being or other, semi-divine intelligences. But he also insisted that the soul requires the assistance of the illumination of the forms in the Eternal Art (ante rem) to do so with certitude.

Bonaventure presented his doctrine of divine illumination in the context of his epistemological argument for the existence of God. The rational soul is fallible, and the object of its knowledge in the physical realm of being is mutable. Thus, it relies on a divine “light” that is infallible and immutable to render its abstraction of the metaphysical forms from its sensory apprehension infallible and the object of its knowledge immutable. But precisely how this occurs has been the subject of a wide range of debate (Cullen, Bonaventure 77-87). Gioberti had placed Bonaventure within the tradition of Malebranche and other advocates of ontologism, who had argued that the soul has direct access to the divine forms as they exist in the Eternal Art ante rem. Portalie had placed him within the tradition of Augustine who, so Portalie insisted, had argued that the Eternal Art impresses the forms directly on the rational soul. (Note, however, that Augustine’s theory of illumination is also the subject of a wide range of debate.) Gilson argued for a formalist position in which the rational soul depends on the light of the divine forms to judge the accuracy, objectivity, and certainty of its conceptual knowledge, but denied its role in the formation of concepts.

Gendreau pioneered the current consensus that endorses an interpretative synthesis between Portalie’s reading of Bonaventure’s doctrine and Gilson’s (Gendreau, The Quest for Certainty). Bonaventure explicitly affirmed that the “light” of the forms in the Eternal Art, but not the forms as they exist in the Eternal Art ante rem, “shines” on the soul to “motivate” and “regulate” its abstraction of the intelligible forms from its sensory apprehension of the physical realm of being. But he explicitly denied that it is the “sole” principle in “its full clarity”. Furthermore, if the soul possessed direct access to the divine forms in the Eternal Art (ante rem) or if that Art impressed those forms directly onto its faculties (post rem), there would be no need for the agent power of the intellect to abstract the forms from the sensory species (in re). Bonaventure would have undermined his careful effort to delineate the subtle distinctions between the agent power of the intellect and the possible in their role in concept formation.

Thus, Bonaventure developed a cooperative epistemology in which the Eternal Art projects some type of image of the divine forms ante rem onto the rational soul’s higher cognitive faculties, its memory, intellect, and will. But this projection is not the “sole” principle of cognition. The rational soul “sees” its abstraction of the intelligible forms within itself (post rem) in the “light” of the projection of the forms in the Eternal Art, and this light enables it to overcome the imperfection of its abstraction of the intelligible forms and renders its knowledge of them certain—although it does not see the forms ante rem in themselves, that is, in their full clarity. Nevertheless, Bonaventure did not think that the projection of this light of the Eternal Art was fully determinative. The soul could and would occasionally err in its judgment of the intelligible forms within itself even in light of the projection of the Eternal Art through either natural defect or willful ignorance.

6. Moral Philosophy

The goal of Bonaventure’s moral philosophy is happiness, a state of beatitude in which the soul satisfies its fundamental desire to know and love God (Bonaventure, 4 Sent. d. 49, p. 1, a. 1, q. 2). He admitted that the human person must attend to the needs of its body—at least in this lifetime. His emphasis on the virtue of charity and the care of the lepers, the poor, and others in need, in body and soul, attests to that commitment (Bonaventure, Legenda maior 1.5-6). But while necessary, the satisfaction of these physical needs remains insufficient. The rational soul is the essential core of the human person and thus the soul and its needs set the terms for its happiness in this life and the next.

The structure of the rational soul establishes its proper end. Its rational faculties, its memory, intellect, and will, work in close cooperation with one another in its effort to know and love the full extent of the cosmos in the physical realm of being, the intelligible, and the divine until it comes “face to face” with the divine in an ecstatic union that defies rational analysis. Bonaventure readily admitted that the soul finds delight in its contemplation of the physical realm of being. Indeed, he encouraged the proper measure of the soul’s delight in “the origin, magnitude, multitude, plenitude, operation, order, and beauty” of the full extent of the physical realm of being (Bonaventure, Itinerarium 1:14). But, Bonaventure argued, the soul’s knowledge and love of the physical realm of being fails to provide full satisfaction. Even if the soul could plumb the full extent of its depths, it would still fail to satisfy. The physical realm of being comes from nothing (ex nihilo) and is therefore fundamentally nothing in itself. “Everything is vanity (vanitas), says the teacher” (Ecclesiastes 1:1). It is fleeting (vanum) and vain (vanitas) and, at most, provides a fleeting degree of satisfaction (Bonaventure, Ecclesiastae c. 1, p. 1, a. 1). So, too, the intelligible realm. Even if the soul could come to a full comprehension of itself, it comes from nothing. Thus, by process of elimination, the soul finds its satisfaction only in its knowledge and love of the Divine Being, Being Itself, the Pure Act of Being, first, eternal, simple, actual, perfect, and unsurpassed in its unity and splendor.

Bonaventure reinforced this argument with a careful analysis of Aristotle’s conception of happiness in the first book of the Nicomachean Ethics. He pointed out that the Philosopher had defined happiness as the soul’s practice or possession of its proper excellence (areté). Thus, the human person, precisely as a rational animal, finds happiness in its rational contemplation of truth and, in particular, the highest truth of the first, eternal, immutable cause of every other thing, the contemplation of the Thought that Thinks Itself.

Bonaventure accepted much of Aristotle’s account of happiness, but relied on Augustine’s critique of eudaimonism in the City of God to point out three critical errors: (1) it lacks the permanence of immortality, (2) the contemplation of abstract truth is insufficient, and (3) the soul cannot attain its proper end in itself (Bonaventure, Breviloquium 2.9). He argued, contra Aristotle, that the soul’s perfect happiness is found in its eternal knowledge and love of the highest truth in an ecstatic union with the concrete instantiation of that truth in the Divine Being, not merely the rational contemplation of that Divine Truth. He also argued, contra Aristotle, that the soul relied on the assistance (gratia) of the Divine Being to ensure that it comes to its proper end in union with the Divine Being.

Bonaventure proposed two means to attain this end. The first was the rational reductio of the physical realm of being and, through self-reflection, the intelligible, to its fundamental causes, efficient, formal, and final, in the First Principle. The second was the practice of the virtues, the moral counterpart to the soul’s rational ascent in its effort to know and love the First Principle and come to its proper end in union with that Principle.

Bonaventure relied primarily on Aristotle to derive his definition of virtue in its proper sense: a rationally determined voluntary disposition (habitus) that consists in a mean [between extremes] (Bonaventure 2 Sent. d. 27, dub. 3). The definition requires some explanation. First, Bonaventure insisted that the higher faculties of the rational soul, its memory, intellect, and will, worked in close cooperation with one another to exercise the free decision of its will (liberum arbitrium). The process begins in the memory, the depths of the human person, in which the First Principle infused the dispositions of the virtues. The process continues in the intellect that, with the cooperation of the light of divine illumination, recognizes those dispositions and directs its will to act on them. The process comes to its end in an act of the will that freely chooses to put them into practice.

Second, Bonaventure argued that all the virtues reside in the rational faculties of the soul, its memory, intellect, and will, and not in its sensory appetites, such as its natural desire for those things that benefit its health. He provided a number of reasons to support his claim, but two are of particular importance. First, some of the virtues may be prior to others in the sense that love, discussed below, is the form of all the other virtues, but all of them are equal in the degree to which they provide a source of merit. Bonaventure had argued that the rational soul is unable to reform itself. Thus, it relies on divine grace to help it reform itself in a process of cooperative development in which the soul’s efforts merit divine assistance. Second, the rational faculties render free decision possible, and free decision, in cooperation with divine grace, is the essential criterion for merit.

Finally, Bonaventure’s insistence on the mean between extremes in the practice of virtue corrects the tendency to read him and other medieval philosophers through the lens of a dichotomy in which the pilgrim soul must choose between heaven and earth. Bonaventure, as mentioned, encouraged the rational soul to delight in the physical realm of being in its proper measure. Nevertheless, his conception of this proper measure is often rather closer to dearth than excess. He practiced a degree of asceticism that, while not as extraordinary as that of his spiritual father, St Francis, exceeded even the standards of his own day. His conception of a middle way consisted in the minimum necessary for sustenance and the practice of one’s vocation (Bonaventure, Hexaëmeron 5.4). He provided a memorable rebuke to illustrate this standard. In response to the criticism that a person requires a modest degree of possessions to practice the mean between the extremes of dearth and excess, Bonaventure replied that having sexual intercourse with half the potential partners in the world is hardly the proper mean between having intercourse with all of them and none. One, he argued, would suffice.

Bonaventure also argued, contra Aristotle, that virtue in its most proper sense referred to an infused disposition of the soul in fidelity to the Platonic tradition passed down through Augustine and the orthodox doctrines of the Christian theological tradition. But this posed a problem. Is Aristotle correct in his claim that virtue is an acquired disposition of the soul or is Augustine correct? Bonaventure’s solution is not entirely clear. He appeared to reject Aristotle on this point and almost every other in the critical edition of his final but unfinished treatise, the Collations on the Six Days of Creation. But DeLorme has edited another reportatio of the Collations in which Bonaventure provided a more subtle critique of Aristotle and his commentators. The weight of evidence suggests that DeLorme’s edition of the Collations is more accurate. Bonaventure appears to have argued that virtue is an infused disposition of the soul, but it does not fully determine the free decision of its will. Rather, the First Principle plants the seeds (rationes seminales) of virtue in the soul, and that soul must carefully cultivate them, with the further assistance of divine grace, to bring them to fruition.

Bonaventure presented long lists of virtues, gifts of the spirit, and beatitudes in the development of his moral philosophy (Bonaventure, 3 Sent. d. 23-36; Breviloquium 5.4; Hexaëmeron 5:2-13 and 6.6-32). These include: the theological virtues, faith, hope, and love; the cardinal virtues, justice, temperance, fortitude, and prudence; the intellectual virtues of science, art, prudence, understanding, and wisdom—some virtues, such as prudence, appear in more than one category; the gifts of the spirit, fear of the Lord, piety, knowledge, fortitude, counsel, understanding, and wisdom; and the beatitudes, poverty of spirit, meekness, mourning, thirst for justice, mercy, cleanliness of heart, and peace. The authors of the secondary literature on Bonaventure’s moral philosophy tend to restrict themselves to the virtues, but the distinction between them and other dispositions of the soul is slight. Bonaventure derived the theological virtues from the scriptures and placed them in the same broad category as the cardinal and intellectual virtues. Furthermore, all three of the categories—the virtues, gifts, and beatitudes—dispose the soul to the rational consideration of the mean, and they do so to order the soul to its proper end.

Bonaventure insisted that all of the virtues and, by extension, the gifts and beatitudes retain the same degree of value in relation to their end. Nevertheless, some of them are more fundamental than others, namely, love, justice, humility, poverty, and peace.

Love is the first and most important of the virtues (Bonaventure, Breviloquium 5.8). It is the metaphysical form of the other virtues and common to all of them. It brings them into being and renders them effective. Without love, the other virtues exist in the rational soul in potentia, and thus fail to dispose the soul’s will to its proper end. It also provides the fundamental impetus (pondus inclinationis) that inclines the affections of the will to the First Principle, itself, others, and, finally, its body and the full extent of the physical realm of being. Bonaventure’s ethics, like Francis’, includes a substantial degree of regard for the wider world in itself and as a sign (signum) that testifies to the existence of the First Principle in its causal dependence on that Principle.

Justice is the “sum” of the virtues (Bonaventure, De reductione artium 23). It inclines the will to the good. It further refines the proper order of its inclination to the First Principle, itself, others, and the physical realm of being and, finally, it establishes the proper measure of its affection to the First Principle, itself, and others.

Humility is the “foundation” of the virtues and the principal antidote to pride (Bonaventure, Hexaëmeron 6.12). It is the soul’s recognition of its creation ex nihilo by the First Principle and of its own inherent nothingness (nihilitatem). It thus enables the will to overcome its inordinate love for itself and to love the First Principle, itself, and others in their proper order and measure.

Bonaventure relegated poverty, the most characteristic of Francis’ virtues, to a subordinate position relative to love and the other virtues to correct the tendency of some of his Franciscan brothers and sisters to take excessive pride in their practice of poverty (Bonaventure, De perfectione evangelica q. 2, a. 1). Bonaventure encouraged the practice of poverty, but argued that it is the necessary but insufficient instrumental cause of love, humility, and all the other virtues, not an end in itself. It corrects the tendency to cupidity, the narcissistic cycle in which the soul’s regard for itself dominates its regard for other things. Indeed, it is poverty that is the mean between extremes and not, as the opponents of the mendicant orders had argued, the violation of the mean.

Peace is the disposition of the will to its final end (Bonaventure, Triplici via 7). The disorder of the soul leads to conflict between the soul and the First Principle, itself, others, and its body. The practice of love, justice, and the long list of virtues, gifts, and beatitudes restores the proper order of the will and dissolves that conflict. Peace is the result of that effort. It is the tranquility of the perfect rectitude of the will. It is the state of the soul’s complete satisfaction of its desires in its union with the First Principle.

Bonaventure delineated the soul’s progress in its practice of these virtues, gifts, and beatitudes in his reformulation of the Neo-Platonic process of the triple way (triplex via): the purgation, illumination, and perfection that render the soul fit for its proper end in ecstatic union with the First Principle (Bonaventure, Triplici via 1). The first stage consists in the purgation of sin, in which the practice of the virtues rids the soul of its tendencies toward vice, for example, the practice of love in opposition to greed, justice to malice, fortitude to weakness, and so on. The second stage consists in the imitation of Christ, Francis, and other moral exemplars. Bonaventure authored a number of innovative spiritual treatises in which he asked his readers to contemplate the lives of Christ, Francis, and others who modeled their lives on Christ, and then to imagine their own participation in the life of Christ, to imagine that they, too, cared for the lepers, for example, to foster their practice of the virtues (Bonaventure, Lignum vitae, prol. 1-6; Legenda maior, prol. 1.5-6). The third and final stage consists in the perfect order of the soul in relation to the First Principle, itself, others, its body, and the full extent of the physical realm of being. This stage restores the soul’s status as an image of the First Principle (deiformitas) and renders it fit for union with that Principle in the perfection of its well-ordered love (Bonaventure, Breviloquium 5.1.3).

Bonaventure’s reformulation of this hierarchic process differed from its original formulation in the Neo-Platonic tradition in three significant ways. First, the original process had been primarily epistemic and referred to the rational soul’s purgation of the metaphysical forms from the physical realm of being (in re), its illumination of those forms in the intelligible realm of being (post rem), and the perfection of those forms in the divine realm (ante rem). Second, he allotted a more significant role to the imitation of Christ and other moral exemplars in the process than even his predecessors in the Christian tradition. Finally, he insisted that the soul progresses along the three ways simultaneously. The soul engages in purgation, illumination, and perfection throughout its progress in its effort to reform itself into the ever more perfect image of the First Principle.

7. The Ascent of the Soul into Ecstasy

“This is the sum of my metaphysics: It consists in its entirety in emanation, exemplarity, and consummation, the spiritual radiations that enlighten [the soul], and lead it back to the highest reality” (Bonaventure, Hexaëmeron 1.17).

Bonaventure had argued that the rational soul’s proper end is union with its Creator. But he also argued that the rational soul, created ex nihilo, possesses a limit to its intellectual capacities that prevents the application of its proper function, reason, in the full attainment of its proper end in union with God. The human mind, Bonaventure argued, cannot fully comprehend its Creator.

Plotinus and his heirs in late antiquity, principally Proclus, developed an elegant three-part formula that provided Bonaventure with the raw material to resolve the dilemma: It began with (1) the existence of the First Principle, the One (to Hen), the foundation of the Neo-Platonic cosmos, continued in (2) the emanation (exitus) of all other things from the First Principle, and ended in (3) its recapitulation (reditus) into the First Principle.

Bonaventure did not possess direct access to the formula. He relied on Augustine, Dionysius, and Dionysius’s heirs in the medieval west, Hugh, Richard, and Thomas Gallus of the School of St. Victor, to access the formula and refine it into a viable solution. He contracted the first two movements of the process into one, “emanation” (emanatio), and reformulated his contraction in two significant ways (Bonaventure, Hexaëmeron 1.17). First, Plotinus and other classical Neo-Platonists envisioned a linear exitus: the First Principle “expresses” Itself in a series of distinct hypostases, the Nous, the Psyché, and their further expression into the intelligible and physical realms of being from eternity (ab aeterno). Bonaventure divided that exitus into two distinct movements: (1) the “emanation” of the First Principle (Principium) that exists in one substance with Its Eternal Art and Spirit ab aeterno and (2) the further “emanation” of that First Principle, in its perfect perichoresis—its reciprocal coinherence—with Its Art and Spirit, in Its creation of the intelligible and physical realms of being in time and ex nihilo.

Second, he interposed a middle term, so to speak, in the process, “exemplarity” (exemplaritas). The created realm of being exemplifies its origins in the First Principle in Its perfect perichoresis with Its Art and Spirit through a carefully graded series of resemblances (Bonaventure, 1 Sent. d. 3). The first degree of its resemblance, the shadow (umbra), exemplified its indeterminate causal dependence on the First Principle. The second degree, the vestige (vestigium), exemplified its determinate causal dependence, efficient, formal, and final, on the First Principle. The third degree, the image (imago), exemplified its explicit dependence on the First Principle in Its perfect perichoresis with Its Eternal Art and Spirit. Bonaventure would abandon the first degree of resemblance, the shadow, in his later works and introduce a fourth, the moral reformation of the soul into a more perfect image, the similitude (similitudo), fit for union with the First Principle.

Bonaventure also reimagined the final stage of the Neo-Platonic process as a “consummation” (consummatio) that consisted of two movements: (1) the soul’s recognition of the carefully graded series of resemblances, the “spiritual radiations that enlighten the soul” and testify to its causal dependence on the “highest reality” of the First Principle, in Its perfect perichoresis with Its Eternal Art and Spirit, and (2) its transformation into a more perfect image (similitudo) of the First Principle that fits it for union with that Principle, Its Art, and Spirit. Thus, he explained, the process curves into itself “in the manner of an intelligible circle” and ends in principium (Bonaventure, Mysterio Trinitatis q. 8 ad 7).

Bonaventure provided a particularly rich account of his reformulation of this Neo-Platonic process in his most celebrated text, the Itinerarium mentis in Deum. It is a difficult text to categorize. It is a philosophical text, but not exclusively so: it is steeped in a Neo-Platonic Christian tradition that relies heavily on the data of revelation contained in the Christian scriptures and on the spiritual practices of the thirteenth century to construct a Platonic Ladder of Love in the context of that syncretic tradition. Bonaventure’s distinction between philosophy and theology provides the means to distinguish the philosophical core of the text from its theological setting—with occasional reference to its theological dimensions to provide a comprehensive analysis of each rung of that ladder.

Bonaventure derived the initial division of the rungs of that ladder from the Neo-Platonic division of the cosmos that permeates so much of his thought: the rational soul’s contemplation of the vestige (vestigium) of the First Principle in the physical realm of being (esse), its contemplation of the image (imago) of the First Principle in the intelligible realm of being, and its contemplation of the First Principle in Itself in the divine realm of being that prepares it for union with that Principle. The path is a deft harmony of Dionysius’ contrast between the soul’s cataphatic contemplation of creation—in which it applies its intellect—and its apophatic contemplation of the divine—in which it suspends its intellect in mystical union (McGinn, “Ascension and Introversion”). The soul moves from its contemplation of the physical realm of being outside itself, to its contemplation of the intelligible realm of being within itself, and ends in the contemplation of the divine above itself.

He further subdivided each of these three stages of contemplation into two, for a total of six steps. The first step in each stage focuses on that stage’s testimony to (per) the First Principle, the second on the presence of the First Principle in that stage. This pattern dissolves in the soul’s contemplation of the First Principle in Itself in the third stage. The first step of this stage focuses on the contemplation of the First Principle as the One God of the Christian tradition. The second step focuses on the contemplation of the emanation of the One God in Three Hypostases or, more commonly, Persons. The ascent comes to its end in a seventh step in which the soul enters into an ecstatic union with the First Principle in Its perfect perichoresis with Its Eternal Art and Spirit. The philosophical core of the text is particularly apparent on steps one, three, and five, and the theological core on steps two, four, and six. The two come together on the seventh step.

It is also important to reiterate that Bonaventure insisted on the necessity of grace for the soul to achieve its goal contra Plotinus and his immediate heirs in classical antiquity, Porphyry, Iamblichus, and Proclus. Thus, Bonaventure included a series of prayers and petitions to the First Principle, the incarnation of the Eternal Art in the person of Christ, St. Francis, Bonaventure’s spiritual father, and other potential patrons to “guide the feet” of the pilgrim soul in its ascent into “that peace that surpasses all understanding” (Philippians 4:7) in its union with the First Principle.

The first step of the soul’s ascent consists in its rational reductio of the vestige of the physical realm of being to its efficient, formal, and final cause in the First Principle. Bonaventure relied on yet another reformulation of a Neo-Platonic triad to align each of these causes with particular properties of the First Principle: the power of the First Principle as the efficient cause that created the physical realm of being ex nihilo, the wisdom of the First Principle as the formal cause that formed the physical realm of being, and the goodness of that Principle as the final cause that leads it to its proper end in union with Itself. The rational soul relies on the testimony of the entire physical realm of being to achieve this union, “the origin, magnitude, multitude, plenitude, operation, order, and beauty of all things” (Bonaventure, Itinerarium 1.14), even though the rest of that realm will end in a final conflagration (Bonaventure, Breviloquium 7.4). It will have served its purpose and persist only in the memory of rational beings.

Bonaventure paired this philosophical argument with an analogy that takes the reader into the theological dimensions of the text: The power, wisdom, and goodness of God suggest some degree of distinction within the First Principle. The power of the First Principle points to God the Father as the efficient cause of all other things, Its wisdom to the Son, the Eternal Art, as the formal cause, and Its goodness to the Spirit as the final cause through “appropriation in the manner of speech” (appropriate loquendo). Bonaventure insisted that, properly speaking, the One God, the First Principle in Its perfect perichoresis with Its Art and Spirit, is the efficient, formal, and final cause of all things, but he also insisted that it is proper to attribute particular properties to each of the Divine Persons to distinguish them from one another. Nevertheless, he admitted that the analogical argument remained inconclusive. The rational soul, without the light of divine revelation, is able to realize that creation testifies to the power, wisdom, and goodness of the First Principle, but it is not able to realize that the power, wisdom, and goodness of that Principle testifies to Its existence in Three Persons.

The second step consists in the soul’s contemplation of the epistemological process, its apprehension, delight, and judgment of the sensory species of the physical realm of being. Its contemplation of this process reveals that it depends on the presence of the “light” of the Eternal Art in its cooperative effort to discern certain truth through the careful consideration of the propositions of the epistemological argument: It possesses certain truth, but it is fallible and the object of its knowledge mutable, so it relies on the “light” of the Eternal Art to render itself infallible and the object of its knowledge immutable. But the thrust of this step is the derivation of the first of three analogies between the epistemological process and distinct types of mysticism. If the soul looks at the “light” of the Eternal Art, so to speak, rather than the intelligible forms post rem it illumines, then the epistemological process becomes the occasion for an epistemic mysticism in which the soul apprehends, delights, and judges a Divine Species of the Eternal Art, although not the Eternal Art Itself, in its epistemological union with the Eternal Art.

The third step consists in the soul’s contemplation of itself as an image (imago) of the First Principle in its higher faculties of memory, intellect, and will. These, too, testify to the power of the First Principle as the efficient cause that created it ex nihilo, the wisdom of the First Principle as the formal cause that formed it, and the goodness of that Principle as the final cause that leads it to its proper end in its union with Itself. But the analogical argument is more prominent on this step. The rational soul is one substance (ousia) that consists of three distinct faculties, memory, intellect, and will, and this suggests that the First Principle, which is one in substance, consists of three distinct persons, Father, Son, and Spirit. But again, the analogical argument remains inconclusive without the benefit of the light of revelation.

The fourth step consists in the soul’s contemplation of its moral reformation into a more perfect image or similitude (similitudo) of the First Principle through its progress on the triplex via of purgation, illumination, and perfection in its practice of the virtues. Bonaventure insisted that its moral reformation depends on the presence of the Eternal Art in the person of Christ as a moral principle, similar to its dependence on the Eternal Art as an epistemological principle, to motivate and guide its pursuit of perfection. But the thrust of this step is the derivation of the second of three analogies between the epistemological process and distinct types of mysticism. The soul’s progress along the triplex via restores its “spiritual” senses. It is able to see, hear, smell, taste, and touch the Divine Species of the Eternal Art in the form of the mystical presence of Christ, delight in that Species, and judge the reasons for its delight in a type of nuptial mysticism, like “the Bride in the Song of Solomon,” Bonaventure explains, who “rests wholly on her beloved” (Song of Solomon 8:5)—a thinly veiled reference to the intimacy of sexual union.

The fifth step consists in the soul’s direct contemplation of the First Principle as Being Itself in its careful analysis of the propositions of his reformulation of the ontological argument. The concept of beings falls into three initial categories: non-being, being-in-potency, and being-in-act. The concept of non-being is a privation of being and presupposes the concept of being. The concept of being-in-potency presupposes the concept of being-in-act. If so, the concept of being-in-act depends on the concept of a “pure” act of being without potential, and this final concept is being itself (ipsum esse). But this “pure” act of being does not fall within the category of the physical realm of being, “which is mixed with potency”. It does exist within the intelligible realm of being, but not entirely so. If it existed in the rational soul and only in the soul, it would exist only as a concept, and thus possess “only a minimal degree of being”. And so, by process of elimination, being itself is the Divine Being.

Bonaventure extended this ontological argument on the sixth step to provide rational justification for the theological doctrine of the One God in Three Persons. He began with the Neo-Platonic concept of the One (to Hen) as the Self-Diffusive Good. He derived this definition principally from the Neo-Platonic tradition, particularly Dionysius, for whom the “Good” was the perfect and preeminent name of God, the name that subsumed all other names (Dionysius, Divine Names 3.1 and 4.1-35). But he also derived it from his notion of the transcendental properties of being that “transcended” the traditional Peripatetic division of things into the categories of substance and accident and thus applied to all beings, physical, intelligible, and divine. He identified three and only three of these “highest notions” of being: unity, truth, and goodness—although he listed others, notably beauty, as second, third, or fourth order properties of being (Aertsen, “Beauty in the Middle Ages”). So, the Divine Being Itself, the highest Being, is also the highest unity, truth, and goodness. The good is self-diffusive per definitionem and the highest good, the most self-diffusive—a proposition he inherited from the Neo-Platonists. Thus, Divine Being Itself diffuses Itself in a plurality of Divine Hypostases, God the Father, Son, and Spirit.

Bonaventure brought his account of the soul’s ascent to its proper end: a direct encounter with the First Principle in Itself, in Its perfect perichoresis with Its Eternal Art and Spirit. He stands in a long tradition of philosophers who had attempted to provide a description of that mystical experience: Plato, Plotinus, Dionysius, and Bonaventure’s immediate predecessors, Hugh, Richard, and Thomas Gallus of the School of St. Victor. All of them have fallen short—perhaps necessarily so. Bonaventure began his attempt with an analogy of the epistemological process, the apprehension, delight, and judgment of the sensory species, but he deliberately undermined his own effort. He relied on two rhetorical devices he derived from Dionysius’ Mystical Theology to do so. The first is a series of denials of the soul’s intellectual capabilities that he drew from Dionysius’ practice of negative theology: the soul sees, but it does so in a dark light; it hears, but in the silence of secrets whispered in the dark; it learns, but it learns in ignorance. The second is a series of metaphors: the fire of the affections of the will, the blindness of the intellect and its slumber, the hanging, crucifixion, and death of the soul’s cognitive faculties in its inability to comprehend the incomprehensible.

Bonaventure’s rhetoric, similar to the excess of Plato, Plotinus, and others in the same tradition, has supported a wide range of interpretation (McGinn, Flowering, 189-205). Some scholars have emphasized the cognitive dimensions of the soul’s contemplation of the First Principle, even though the object of its vision exceeds its cognitive capabilities: a vision, so to speak, of a light so bright that it blinds the intellect, which then seems to see nothing. Others have emphasized the affective dimensions of the experience: the soul’s contemplation of the First Principle is a type of experiential knowledge in which the affections of the will outpace the intellect in “that peace which surpasses all understanding”. Still others have disengaged the rational faculties of the soul from the experience entirely.

McGinn laid the groundwork for the current consensus that argues for a mean between these extremes. The soul’s cognitive faculties remain intact, but the object of their contemplation exceeds their capabilities. The soul knows, but its knowledge is experiential, not propositional. It may even strive to know in the so-called proper, propositional sense of the concept—after all, it possesses the inclination to do so. But it fails. It knows the First Principle in Its eternal perichoresis with Its Art and Spirit in the sense that it experiences the real presence of that Principle. But it cannot apprehend that Principle; it cannot abstract an intelligible species of that Principle; it cannot imagine, compound, divide, estimate, or remember that Principle. Nevertheless, it experiences the immediate presence of that Principle, Its Art, and Its Spirit, an experience that remains forever inexplicable and that ignites its affections to an unfathomable degree of intensity. “Let it be, let it be”, Bonaventure pleaded as he brought his account of the soul’s ascent to a close. “Amen” (Bonaventure, Itinerarium 7.6).

8. References and Further Reading

a. Critical Editions

  • Doctoris Seraphici S. Bonaventurae Opera Omnia. 10 vols. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • This is the current standard critical edition of Bonaventure’s works. Since its publication, scholars have determined that a small portion of its contents are spurious. See A. Horowski and P. Maranesi, listed below, for recent discussions of the question.
  • Breviloquium. In Opuscula Varia Theologica, 199-292. S. Bonaventurae Opera Omnia. Vol. 5.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Breviloquium is a short summary of Bonaventure’s philosophical theology.
  • Christus unus omnium magister. In Opuscula Varia Theologica, 567-574. S. Bonaventurae Opera Omnia. Vol. 5.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • Christ the One Master of All is an academic sermon that contains a discussion of Bonaventure’s theory of the forms and divine illumination.
  • Collationes in Hexaëmeron. In Opuscula Varia Theologica, 327-454. S. Bonaventurae Opera Omnia. Vol. 5.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Collations on the Six Days of Creation is Bonaventure’s final and one of his most important texts in philosophical theology. It remained unfinished at the time of his death. This reportatio of the Collationes contains a harsh criticism of Aristotle and the radical Aristotelians. See also DeLorme’s edition below.
  • Collationes in Hexaëmeron et Bonaventuriana Quaedam Selecta. Edited by F. Delorme. In Bibliotheca Franciscana Scholastica Medii Aevi. Vol. 8. Quaracchi: Collegium S. Bonaventurae, 1934.
    • Delorme based his edition of the Collationes in Hexaëmeron on a single manuscript. It contains a less harsh criticism of Aristotle and the radical Aristotelians. Scholars remain divided on the question of which reportatio is more authentic.
  • Commentarius in librum Ecclesiastae. In Commentarii in Sacram Scripturam, 1-103. S. Bonaventurae Opera Omnia. Vol. 6.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • Bonaventure’s Commentary on the Book of Ecclesiastes contains a discussion of the concept of non-being and the inherent nothingness of the world.
  • Commentarius in I Librum Sententiarum: De Dei Unitate et Trinitate. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 1.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The First Book of the Commentary on the Sentences is Bonaventure’s most extensive discussion of his philosophy and philosophical theology of the One God, the First Principle, in Three Persons.
  • Commentarius in II Librum Sententiarum: De Rerum Creatione et Formatione Corporalium et Spiritualium. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 2. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Second Book of the Commentary on the Sentences contains Bonaventure’s most extensive discussion on creation.
  • Commentarius in IV Librum Sententiarum: De Doctrina Signorum. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 4. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The fourth book of the Sentences, On the Sacraments, contains Bonaventure’s exhaustive treatise on sacramental theology, but it also contains passages on his philosophy and philosophical psychology of the human person.
  • Itinerarium mentis in Deum. In Opuscula Varia Theologica, 293-316. S. Bonaventurae Opera Omnia. Vol. 5.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Itinerarium is Bonaventure’s treatise on the soul’s ascent into God and his most popular work.
  • Lignum vitae. In Opuscula Varia ad Theologiam Mysticam, 68-87. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 8. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Tree of Life is Bonaventure’s innovative life of Christ and an often neglected source for his virtue theory.
  • Quaestiones disputatae de scientia Christi. In Opuscula Varia Theologica, 1-43. S. Bonaventurae Opera Omnia. Vol. 5.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Disputed Questions on the Knowledge of Christ contains information on philosophical psychology and epistemology. The fourth question is a detailed discussion of divine illumination.
  • Quaestiones disputatae de mysterio Ss. Trinitatis. In Opuscula Varia Theologica, 45-115. S. Bonaventurae Opera Omnia. Vol. 5.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Disputed Questions on the Mystery of the Trinity contains a detailed series of debates on the existence and nature of the First Principle. The first article of each question is philosophical, the second theological.
  • Quaestiones disputatae de perfectione evangelica. In Opuscula Varia Theologica, 117-198. S. Bonaventurae Opera Omnia. Vol. 5.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Disputed Questions on Evangelical Perfection is an important text in moral philosophy and philosophical theology.
  • Opusculam de reductione artium ad theologiam. In Opuscula Varia Theologica, 317-326. S. Bonaventurae Opera Omnia. Vol. 5.  Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • On the Reduction of the Arts to Theology contains a discussion of philosophy and its distinction from philosophical theology.
  • De triplici via. In Opuscula Varia ad Theologiam Mysticam, 3-27. Doctoris Seraphici S. Bonaventurae Opera Omnia. Vol. 8. Quaracchi: Collegium S. Bonaventurae, 1882-1902.
    • The Triple Way is Bonaventure’s treatise on spiritual and moral reformation.
  • Doctoris Seraphici S. Bonaventurae Opera Theologica Selecta. 5 vols. Quaracchi: Collegium S. Bonaventurae, 1934-1965.
    • This is a smaller edition of the Commentary on the Sentences and three short works, the Breviloquium, the Itinerarium, and the De reductione artium ad theologiam. The text is complete but the critical apparatus is significantly reduced.
  • Legenda Maior. In Analecta Franciscana 10 (1941): 555-652.
    • This is the revised critical edition of the Longer Life of St. Francis, and another often neglected source for Bonaventure’s virtue theory.

b. Translations into English

  • Bonaventure: The Soul’s Journey into God, The Tree of Life, The Life of St. Francis. Translated by E. Cousins. New York: Paulist Press, 1978.
    • Cousins’ translations of these short but influential works are refreshingly dynamic but faithful.
  • “Christ, the One Teacher of All”. In What Manner of Man: Sermons on Christ by Bonaventure, 21-55. Translated by Z. Hayes. Chicago: Franciscan Herald Press, 1974.
  • Breviloquium. Edited by D. V. Monti. Works of Bonaventure. Vol. 9. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2005.
  • Collations on the Hexaemeron. Edited by J. M. Hammond. Works of Bonaventure. Vol. 18. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2018.
  • Commentary on Ecclesiastes. Edited by R. J. Harris and C. Murray. Works of Bonaventure. Vol. 7. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2005.
  • Commentary on the Sentences: The Philosophy of God. Edited by R. E. Houser and T. B. Noone. Works of Bonaventure. Vol. 16. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2013.
    • This rather large volume contains only a small selection of texts from the Commentary on the First Book of the Sentences.
  • Disputed Questions on Evangelical Perfection. Edited by R. J. Harris and T. Reist. Works of Bonaventure. Vol. 13. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2008.
  • Disputed Questions on the Knowledge of Christ. Edited by Zachary Hayes. Works of Bonaventure. Vol. 4. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 1992.
  • Disputed Questions on the Mystery of the Trinity. Edited by Z. Hayes. Works of Bonaventure. Vol. 3. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 1979.
  • Itinerarium Mentis in Deum. Edited by P. Boehner and Z. Hayes. Works of Bonaventure. Vol. 2. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2002.
  • On the Reduction of the Arts to Theology. Edited by Z. Hayes. Works of Bonaventure. Vol. 1. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 1996.
  • The Threefold Way. In Writings on the Spiritual Life, 81-133. Edited by F. E. Coughlin. Works of Bonaventure. Vol. 10. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2006.
  • Works of Bonaventure. 18 vols. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 1955.
    • The Franciscan Institute of St. Bonaventure University, a major research center for Franciscan Studies, began to publish this series in 1955. The pace of publication has increased in recent years, but the series remains incomplete—Bonaventure authored a vast amount of material. This is the standard series of translations in English.

c. General Introductions

  • Bettoni, E. S. Bonaventura. Brescia: La Scuola, 1944. Translated by A. Gambatese as St. Bonaventure (Notre Dame, IN: University of Notre Dame Press, 1964).
    • Bettoni’s St. Bonaventure is the best short work on Bonaventure’s life and thought. Unfortunately, it is out of print.
  • Bougerol, G. Introduction à Saint Bonaventure. Paris: J. Vrin, 1961. Revised 1988. Translated by J. de Vinck as Introduction to the Works of St. Bonaventure (Paterson, NJ: St. Anthony Guild Press, 1963).
    • Bougerol’s Introduction is an insightful commentary on the literary genres of Bonaventure’s works. Note that the English translation is of the first French edition, not the second.
  • Cullen, C. M. Bonaventure. Oxford: Oxford University Press, 2006.
    • Cullen’s Bonaventure is the most recent comprehensive introduction to his life and thought.
  • Delio, I. Simply Bonaventure. Hyde Park: New City Press, 2018.
    • Delio’s Simply Bonaventure, now in its second edition, is intended for those with little or no background in medieval philosophy or theology.
  • Gilson, É. La philosophie de saint Bonaventure. Paris: J. Vrin, 1924. Revised 1943. Translated by I. Trethowan and F. J. Sheed as The Philosophy of St. Bonaventure (London: Sheed and Ward, 1938. Reprinted 1940, 1965).
    • Gilson’s Philosophy of St. Bonaventure is foundational. He was the first to insist on Bonaventure’s careful distinction between philosophy and theology and to identify Bonaventure as the principal representative of Christian Neo-Platonism in the Middle Ages. Note that the English translation is of an earlier edition.

d. Studies

  • Aertsen, J. A. “Beauty in the Middle Ages: A Forgotten Transcendental?” Medieval Philosophy and Theology 1 (1991): 68-97.
  • Aertsen, J. A. Medieval Philosophy as Transcendental Thought: From Philip the Chancellor to Francisco Suárez. Leiden: Brill, 2012.
    • Aertsen’s is an exhaustive study of the concept of the transcendentals with reference to Bonaventure and other philosopher-theologians in the latter Middle Ages. He refutes the widespread assumption that Bonaventure had listed beauty as a transcendental on par with the one, the true, and the good.
  • Baldner, S. “St. Bonaventure and the Demonstrability of a Temporal Beginning: A Reply to Richard Davis.” American Catholic Philosophical Quarterly 71 (1997): 225-236.
  • Baldner, S. “St. Bonaventure and the Temporal Beginning of the World.” New Scholasticism 63 (1989): 206-228.
    • Baldner’s two pieces are among the more important recent discussions of the question. See also Dales, Davis, and Walz.
  • Bissen, J. M. L’exemplarisme divin selon saint Bonaventure. Paris: Vrin, 1929.
    • This is the foundational study of Bonaventure’s exemplarism and remains unsurpassed in breadth. See also Reynolds.
  • Bonnefoy, J. F. Une somme Bonaventurienne de Theologie Mystique: le De Triplici Via. Paris: Librarie Saint-François, 1934.
    • This is the seminal analysis of Bonaventure’s treatise on the soul’s moral reformation.
  • Bowman, L. “The Development of the Doctrine of the Agent Intellect in the Franciscan School of the Thirteenth Century.” The Modern Schoolman 50 (1973): 251–79.
    • Bowman provides one of the few extensive treatments of Bonaventure’s doctrine of the agent intellect.
  • Burr, D. The Spiritual Franciscans: From Protest to Persecution in the Century after Francis of Assisi. University Park, PA: The Pennsylvania State University Press, 2001.
    • The first chapter provides a summary of the state of the conflict between the Fraticelli and the Conventuals during Bonaventure’s tenure as Minister General.
  • Cullen, C. M. “Bonaventure’s Philosophical Method.” In A Companion to Bonaventure, 121-163. Edited by J. M. Hammond, J. A. Wayne Hellmann, and J. Goff. Leiden: Brill, 2014.
    • Cullen provides a precise summary of Bonaventure as a philosopher and his method.
  • Dales, R. C. Medieval Discussions of the Eternity of the World. Leiden: Brill, 1990.
    • Dales locates Bonaventure in the larger stream of thought on this question. A companion volume includes the relevant Latin texts.
  • Davis, R. “Bonaventure and the Arguments for the Impossibility of an Infinite Temporal Regression.” American Catholic Philosophical Quarterly 70 (1996): 361-380.
  • Delio, I. A Franciscan View of Creation: Learning to Live in a Sacramental World. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2003.
    • Delio derives her sacramental view of creation from a careful consideration of the thought of Francis, Clare, Bonaventure, and Scotus.
  • Gendreau, B. “The Quest for Certainty in Bonaventure.” Franciscan Studies 21 (1961): 104-227.
    • Gendreau first proposed what is now the standard solution to the problem of Bonaventure’s theory of divine illumination. Compare with Speer.
  • Horowski, A. “Opere autentiche e spurie di San Bonaventura.” Collectanea Franciscana 86 (2016): 461-606.
    • This is the most recent assessment of the current state of the critical edition of Bonaventure’s works. See also Maranesi.
  • Houser, R. E. “Bonaventure’s Three-Fold Way to God.” In Medieval Masters: Essays in Honor of E. A. Synan, 91-145. Houston: University of St. Thomas Press, 1999.
    • Houser’s analysis of Bonaventure’s arguments for the existence of God emphasizes their logical structure and highlights Bonaventure’s command of the formal logic of the Aristotelian tradition.
  • Johnson, T. J., K. Wrisley-Shelby, and M. K. Zamora, eds. Saint Bonaventure: Friar, Teacher, Minister, Bishop: A Celebration of the Eighth Centenary of His Birth. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2020.
    • A collection of papers delivered at a major conference held at St. Bonaventure University to celebrate the eighth centenary of Bonaventure’s birth. It provides a thorough overview of the current state of research into Bonaventure’s philosophy, philosophical theology, and mysticism.
  • Lang, H. “Bonaventure’s Delight in Sensation.” New Scholasticism 60 (1986): 72-90.
    • Lang was the first to highlight the role of delight in Bonaventure’s account of the epistemological process.
  • Malebranche, N. De la recherche de la vérité. 1674-1675. Translated by T. M. Lennon and P. J. Olscamp as The Search after Truth (Cambridge: Cambridge University Press, 1997).
    • Malebranche presented his famous, or perhaps infamous, doctrine of the vision in God in Book 3. He was incorrect in his interpretation of Bonaventure’s epistemology. See Gendreau.
  • Maranesi, P. “The Opera Omnia of St. Bonaventure: History and Present Situation.” In A Companion to Bonaventure, 61-80. Edited by J. M. Hammond, J. A. Wayne Hellmann, and J. Goff. Leiden: Brill, 2014.
    • This is an indispensable assessment of the current state of the critical edition of Bonaventure’s works.
  • McEvoy, J. “Microcosm and Macrocosm in the Writing of St. Bonaventure.” In S. Bonaventura 1274-1974, 2:309-343. Edited by F. P. Papini. Quaracchi: Collegium S. Bonaventurae, 1973.
    • McEvoy places this theme in its wider context.
  • McGinn, B. “Ascension and Introversion in the Itinerarium mentis in Deum.” In S. Bonaventura 1274-1974, 3:535-552. Edited by F. P. Papini. Quaracchi: Collegium S. Bonaventurae, 1973.
  • McGinn, B. The Flowering of Mysticism. New York: Crossroad, 1998.
    • McGinn provides a thorough introduction to the structure and content of Bonaventure’s Itinerarium in the context of the mystical practices of the latter Middle Ages with particular attention to the cognitive dimensions—or lack thereof—of the soul’s ecstatic union with the First Principle.
  • McKenna, T. J. Bonaventure’s Aesthetics: The Delight of the Soul in Its Ascent into God. London: Lexington Books, 2020.
    • This is the first comprehensive analysis of Bonaventure’s philosophy and philosophical theology of beauty since Balthasar’s Herrlichkeit (1961).
  • Monti, D. V. and K. W. Shelby, eds. Bonaventure Revisited: Companion to the Breviloquium. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2017.
    • A helpful commentary on Bonaventure’s own summary of his philosophical theology in the Breviloquium.
  • Noone, T. “Divine Illumination.” In The Cambridge History of Medieval Philosophy, I: 369-383. Edited by R. Pasnau. Cambridge: Cambridge University Press, 2010.
    • Noone provides a helpful overview of the doctrine of divine illumination in the medieval west.
  • Noone, T. “St. Bonaventure: Itinerarium mentis in Deum.” In Debates in Medieval Philosophy: Essential Readings and Contemporary Responses, 204-213. Edited by J. Hause. London: Routledge, 2014.
    • Noone provides insight into Bonaventure’s sources for his analysis of the epistemological process.
  • Panster, K. “Bonaventure and Virtue.” In Saint Bonaventure: Friar, Teacher, Minister, Bishop: A Celebration of the Eighth Centenary of His Birth, 209-225. Edited by T. J. Johnson, K. Wrisley Shelby, and M. K. Zamora. St. Bonaventure, NY: The Franciscan Institute of St. Bonaventure University, 2021.
    • Panster provides an insightful overview of the current state of research on Bonaventure’s virtue theory.
  • Pegis, A. C. “The Bonaventurean Way to God.” Medieval Studies 29 (1967): 206-242.
    • Pegis, an expert on Thomas Aquinas, was one of the first to recognize and clearly distinguish Bonaventure’s approach from Aquinas’.
  • Quinn, J. F. The Historical Constitution of St. Bonaventure’s Philosophy. Toronto: Pontifical Institute of Medieval Studies, 1973.
    • Quinn’s Historical Constitution includes a detailed historiographical essay on early approaches to Bonaventure’s thought. Despite its title, however, he devotes most of the volume to an extensive if somewhat controversial analysis of Bonaventure’s epistemology.
  • Reynolds, P. L. “Bonaventure’s Theory of Resemblance.” Traditio 49 (2003): 219-255.
    • Reynolds’ is an analytic approach to Bonaventure’s theory of exemplarity that highlights Bonaventure’s command of formal logic.
  • Schaeffer, A. “Corrigenda: The Position and Function of Man in the Created World According to Bonaventure.” Franciscan Studies 22 (1962): 1.
  • Schaeffer, A. “The Position and Function of Man in the Created World According to Bonaventure.” Franciscan Studies 20 (1960): 261-316 and 21 (1961): 233-382.
    • Schaeffer’s remains one of the most detailed analyses of Bonaventure’s philosophy and philosophical psychology of the human person.
  • Schlosser, M. “Bonaventure: Life and Works.” In A Companion to Bonaventure, 7-59. Edited by J. M. Hammond, J. A. Wayne Hellmann, and J. Goff. Leiden: Brill, 2014.
    • Schlosser considers the current state of research on Bonaventure’s biography.
  • Seifert, J. “Si Deus est Deus, Deus est: Reflections on St. Bonaventure’s Interpretation of St. Anselm’s Ontological Argument.” Franciscan Studies 52 (1992): 215-231.
    • Seifert was the first to recognize the full force of Bonaventure’s version of the argument.
  • Speer, A. “Bonaventure and the Question of a Medieval Philosophy.” Medieval Philosophy and Theology 6 (1997): 25-46.
    • Speer provides a candid discussion of the question. See also Cullen on Bonaventure’s philosophical method.
  • Speer, A. “Illumination and Certitude: The Foundation of Knowledge in Bonaventure.” American Catholic Philosophical Quarterly 85 (2011): 127–141.
    • Speer provides further insight into Bonaventure’s doctrine of divine illumination. See also Gendreau.
  • Tillich, P. Systematic Theology. 3 vols. Chicago: The University of Chicago Press, 1973.
    • Tillich acknowledges his debt to the mystical aspect of Bonaventure’s doctrine of divine illumination in the introduction to the first volume of the series.
  • Walz, M. D. “Theological and Philosophical Dependencies in St. Bonaventure’s Argument against an Eternal World and a Brief Thomistic Reply.” American Catholic Philosophical Quarterly 72 (1998): 75-98.

 

Author Information

Thomas J. McKenna
Email: tjmckenna@concord.edu
Concord University
U. S. A.

Animism

Animism is a religious and ontological perspective common to many indigenous cultures across the globe. According to an oft-quoted definition from the Victorian anthropologist E. B. Tylor, animists believe in the “animation of all nature”, and are characterized as having “a sense of spiritual beings…inhabiting trees and rocks and waterfalls”. More recently, ethnographers and anthropologists have moved beyond Tylor’s initial definition and have sought to understand how indigenous communities, in particular, enact social relations between humans and non-human others in ways that apparently challenge secular, Western views of what constitutes the social world. (This new approach in anthropology is sometimes called the “new animism”.) At a minimum, animists accept that some features of the natural environment such as trees, lakes, mountains, thunderstorms, and animals are non-human persons with whom we may maintain and develop social relationships. Additionally, many animist traditions regard features of the environment to be non-human relatives or ancestors from whom members of the community are descended.

Animism, in some form or other, has been the dominant religious tradition across all human societies since our ancestors first left Africa. Despite the near ubiquity of animistic beliefs and practices among indigenous peoples of every continent, and despite the crucial role of animism in the early emergence and development of human religious thought, contemporary academic philosophy of religion is virtually silent on the subject. This article outlines some key ideas and positions in the current philosophical and social scientific discourse on animism.

Table of Contents

  1. Concepts of Animism
    1. Hylozoism, Panpsychism, and Vitalism
    2. Modernist Animism
    3. Enactivist Animism
    4. Animism as Ontology
    5. Social-Relational Animism
  2. The Neglect of Animism
  3. Public Arguments for Animism
    1. Argument from Innateness
    2. Argument from Common Consent
  4. Private Arguments for Animism
  5. Pragmatic Arguments for Animism
    1. Environmentalism
    2. Feminism
    3. Nationalism and Sovereignty
  6. References and Further Reading

1. Concepts of Animism

Animist religious traditions have been particularly prevalent among hunter-gatherer societies worldwide. A variety of different and conflicting religious traditions across the globe have been labeled “animist”. So, animism is not a single religious tradition, but is instead a category to which various differing traditions appear to belong. Just as “theism” is a term that extends to cover any belief system committed to the existence of a god, “animism” is a term that extends to cover any belief system satisfying the appropriate definition (such as the classical Tylorian definition given in the introduction to this article). Note that the terms “theism” and “animism” are not mutually exclusive: an animist may or may not also be a theist. There is some dispute, particularly among anthropologists, as to whether there is a single definition that works to draw the wide variety of traditions typically considered as animist under a single umbrella.

Contemporary social scientific discussion of animism has witnessed a renaissance beginning in the late twentieth century, and this has led different authors to consider a range of alternative ways in which we might conceive of the characteristic qualities of animist thought. Some noteworthy recent contributors to this debate are Nurit Bird-David, Philippe Descola, Tim Ingold, Graham Harvey, and Stewart Guthrie. Before surveying a few of the conceptions currently discussed in this literature, it is worthwhile to be clear on what is not meant by the term “animism”.

a. Hylozoism, Panpsychism, and Vitalism

When attempting to define “animism”, it is important to first disentangle the concept from three closely related philosophical doctrines: Hylozoism, Panpsychism, and Vitalism. Animism is often conflated with these three doctrines as scholarly concepts of animism have traditionally drawn from the work of Tylor, and particularly from his conception of animism as a belief in the “animation of all nature,” a doctrine which he also labels “universal vitality”. Phrases such as these, with their allusions to a “world consciousness”, have given rise to the mistaken impression that animism is a doctrine about the entire universe being fundamentally alive or sentient or filled with soul.

Hylozoism is the view that the universe is itself a living organism. It is a doctrine often attributed (although erroneously; see Fortenbaugh 2011, 63) to the third director of Aristotle’s Lyceum, Strato of Lampsacus, who argued that motion in the universe was explicable by internal, unconscious, naturalistic mechanisms, without any need for an Aristotelian prime mover (ibid., 61). This characterization of the universe’s motion as sustained by internal, unconscious mechanisms is seen as analogous to the biological mechanisms and processes sustaining life. However, religious animists typically reject the claim that all things are living, and they also reject that the universe as a whole is a living being. Typically, the animist takes particular features of the natural world as endowed with personhood or some form of interiority, often having their own cultural lives and communities akin to those of human beings.

Panpsychism is the view that “mentality is fundamental and ubiquitous in the natural world” (Goff and others 2017, §2.1). Mind is, on this view, a building block of the universe. Unlike the animist, the panpsychist does not take features of the natural world to have a fully-fledged interior or cultural life akin to that of human beings. Additionally, it is not characteristic of animism to take mental properties to be fundamental to the universe or to be distributed in all systems or objects of a given type. For example, the animist need not accept that all rocks have an interior life, but only that some particular rocks do (perhaps a rock with unusual features, or one which moves spontaneously or unpredictably).

Vitalism is the out-of-favour scientific view that biological phenomena cannot be explained in purely mechanical terms and that a complete explanation of such phenomena will require an appeal to spiritual substances or forces. Proponents of the view included Francis Glisson (1597-1677), Xavier Bichat (1771-1802), and Alessandro Volta (1745-1827). Vitalists hold that all living things share in common a spiritual quality or fluid (famously dubbed by Henri Bergson as the “élan vital”). With this élan vital in hand, it was thought that phenomena that appeared recalcitrant to purely mechanical explanation (for example, the blossoming of flowers, the reproduction of worms, the musings of humans, the growth of algae, and so forth) could be explained in other, more spiritual terms. Animists, unlike vitalists, need not be committed to the existence of any sort of metaphysically special spirit or soul phenomena. Additionally, animists very often take non-biological phenomena (rivers, winds, and the like) to be animate.

b. Modernist Animism

In his Natural History of Religion, David Hume speaks of a tendency for primitive human beings “to conceive all beings like themselves.” Natural phenomena are attributed to “invisible powers, possessed of sentiment and intelligence” and this can be “corrected by experience and reflection”. For Tylor, “mythic personification” drives the primitive animist to posit souls inhabiting inanimate bodies. In a similar vein, Sigmund Freud writes that the animist views animals, plants, and objects as having souls “constructed on the analogy of human souls” (1950, 76). James Frazer’s Golden Bough is a particularly good example, with animism being referred to as “savage dogma” (1900, 171). Rites and rituals relating to animism are described as “mistaken applications” of basic principles of analogy between the human world and the natural world (62). For these writers, animism is understood as a kind of promiscuous dualism and stray anthropomorphism. The animist is committed to a superstitious belief in anthropomorphic spirits, which reside within non-human animals or altogether inanimate objects. It is considered an erroneous view.

Although this position has faced criticism for presenting the animist’s worldview as a kind of mistake (see, for example, Bird-David 1999), similar modernist conceptions of animism persist, particularly in the evolutionary psychology of religion. A notable modern proponent is Stewart Guthrie, who takes animist belief as a problem requiring an explanation. The problem in need of explanation is why people are so disposed to ascribe agency and personhood to non-agents and non-persons. Setting the problem in this light, Guthrie rejects post-modern and relativist tendencies in the contemporary anthropological literature, which seek to investigate animism as an ontology that differs from, but is not inferior to, naturalistic scientific understandings of the world. This post-modernist approach, Guthrie argues, makes “local imagination the arbiter of what exists,” and thereby abandons many important realist commitments inherent in the scientific project (2000, 107).

Guthrie’s own view is that animistic thinking is the result of an evolutionarily adaptive survival strategy. Animistic interpretations of nature are “failures of a generally good strategy” to perceive agents in non-agentive phenomena. The strategy is generally good since, as he puts it, “it is better for a hiker [for example] to perceive a boulder as a bear than to mistake a bear for a boulder” (2015, 6). If we are mistaken in seeing agents everywhere, the price to pay is small. Whereas if we are not mistaken, the payoff is high. This idea has been developed further by other cognitive scientists of religion, such as Justin Barrett, who accounts for this propensity as resulting from what he calls a hyperactive agency detection device (usually abbreviated to HADD): an innate and adaptive module of human cognition.

This modernist or positivist view of animism can be contrasted with several post-modernist views, which are surveyed below.

c. Enactivist Animism

Another approach to animism takes it as a kind of non-propositional, experiential state. Tim Ingold characterizes animism as a lived practice of active listening. Animism, he says, is “a condition of being alive to the world, characterized by a heightened sensitivity and responsiveness, in perception and action, to an environment that is always in flux, never the same from one moment to the next” (2006, 10). Borrowing a phrase from Merleau-Ponty, Ingold characterizes the lived experience of animism as “the sense of wonder that comes from riding the crest of the world’s continued birth”. The animist does not so much believe of the world that it contains spooky “nature spirits”; rather, she participates in a natural world, which is responsive and communicative. Animism is not a system to which animists relate, but rather it is immanent in their ways of relating.

On this enactivist account, the crucial thread of animist thinking is not characterized by a belief in spirits or a belief in the intentionality of non-intentional objects. Animist thinking is instead construed as a kind of experience—the living of a particular form of life, one which is responsive and communicative with the local environment, and one which engages with the natural environment as subject, not object. Thus, there is a distinctive and characteristically interpersonal quality of animist phenomenology. The animist’s claim—say, that whales are persons—is not a belief to which she assents, nor is it a hypothesis which she might aim to demonstrate or falsify. On the contrary, the animist does not know that whales are persons, but rather knows how to get along with whales.

This understanding of animism echoes the philosophy of religion of Ludwig Wittgenstein, who famously rejected the notion that the substantive empirical claims of religion should be understood as attempts at objective description or explanation. Instead, religion should be understood as a frame of experience or “world picture”. A similarly permissive approach to animism and indigenous religion has recently been championed by Mikel Burley, who stresses the importance of evaluating competing religious world-pictures according to their own internal criteria and the effects that such world pictures have on the lived experience of their adherents (see Burley [2018] and [2019]).

A similar view can also be found in work outside of the analytic philosophical tradition. Continental philosophers such as Max Horkheimer and Theodor Adorno, for example, argue that the modern scientistic worldview alienates us from our environment and is the cause of widespread disenchantment by way of the “extirpation of animism” (2002, 2). Thus, it is our experience of the world around us which is diminished on the scientistic frame, and this disenchantment can be cured by taking up an animistic frame of experience. Martin Buber is another philosopher who stresses the fundamentally spiritual nature of what he calls the “I-Thou” aspect of experience (a subject-subject relation), which can be contrasted with the “I-It” aspect (a subject-object relation). The pragmatist philosopher William James uses the very same terms in his expression of what is characteristic of a religious perception of the universe: “The universe is no longer a mere It to us, but a Thou” (2000 [1896], 240).

It is through the animist’s experience of the world as fundamentally grounded in interpersonal relations that her experience is characterized as distinct from the Western, naturalistic world picture, in which interpersonal encounters are austerely restricted to encounters between human beings.

d. Animism as Ontology

For many post-modern anthropologists, the purpose of research is understood to be a mediation between different but equally valid constructions of reality or “ontologies”. This modern shift of emphasis is sometimes labeled the “ontological turn” in anthropology. According to theorists in this school, animism should be understood as consisting in a distinct ontology with distinct commitments. Philippe Descola is one writer who characterizes animism as just one competing way among several in which the culturally universal notions of “interiority” (subjective or private states of experience) and “exteriority” (physical states) can be carved up. For Descola, the animist views elements of the external world as sharing a common interiority while differing in external features. This can be contrasted with the naturalist’s worldview, which holds that the world contains beings which are similar in their physicality (being made of the same or similar substances), yet which differ in their interiority. Thus, for the animist, while humans and trees, for example, differ in exteriority, they nevertheless share in common the possession of similar interior states. In “animic” systems, humans and non-humans possess the same type of interiority. Since this interiority is understood to be common to both humans and non-humans alike, it follows that non-humans are understood as having social characteristics, such as respecting kinship rules and ethical codes.

On such an account, the animist takes the interiority of any given creature to differ from human interiority only to the extent that it is grounded in different cognitive and perceptual instruments. A tree, for example, cannot change location at will, and so has an interior life very different from that of a human being or a raven. Nevertheless, trees, humans, and ravens share in common the quality of interiority.

A more radical view in the same vein, dubbed “perspectivism”, is described by Viveiros de Castro, who notes that among various Amerindian indigenous religions, a common interiority is understood to consist in a common cultural life, and it is this common culture which is cloaked in diverse exterior appearances. This view turns the traditional, Western, naturalistic notion of the unity of nature and the plurality of culture on its head. Instead, the unity of culture is a fundamental feature of the animists’ world. Whereas in normal conditions, humans see humans as humans, and animals as animals, it is not the case that animals see themselves as animals. On the contrary, animals may see humans as possessing the exteriority of animals, while viewing themselves as humans. Jaguars, for example, see the blood they drink as beer, while vultures see maggots as grilled fish. They see their fur, feathers, claws, beaks, and so forth as cloaks and jewellery. In addition, they have their own social system organized in the same way as human institutions are (Viveiros de Castro, 1998, 470).

Although a particularly interesting position in its own right, perspectivism seems to apply only to the limited number of Amerindian cultures that are the objects of Viveiros de Castro’s studies, and so it may not serve as an inclusive account capable of covering the wide range of traditions which are, on their face, animistic.

Both Descola’s and Viveiros de Castro’s accounts assume that the animist ascribes interiority to non-human animals as well as to non-living things. However, it is unclear whether all or even most putatively animist communities share this view. At least some communities regarded as animist appear to enact social relationships with non-human persons, yet do not appear to be committed to any dualist ontological view according to which non-human persons are actually sentient or have their own unique interior states or souls (for a discussion with reference to indigenous Australian animism, see Peterson 2011, 177).

e. Social-Relational Animism

An increasingly popular view understands animism, not as depending upon some abstract notion of interiority or soul, but rather as being fundamentally to do with relationships between human and other-than-human persons. Irving Hallowell, for example, emphasizes an ontology of social relations that holds between the world’s persons, only some of whom are human (1960, 22). Thus, what is fundamental to the animist’s worldview is a commitment to the existence of a broad set of non-human persons. This approach has been championed by Graham Harvey, who summarizes the animist’s belief as the position that “the world is full of persons, only some of whom are human, and that life is always lived in relationship with others” (2005, xi). That is not to say that animists have no concept of objecthood as divorced from personhood, but rather that animist traditions seriously challenge traditional Western views of what sorts of things can count as persons.

A version of this view has been championed by Nurit Bird-David (1999) who takes animism to be a “relational epistemology”, in which social relations between humans and non-humans are fundamental to animist ontology. What is fundamental to the animist’s worldview is the subject-subject relation, in contrast to the subject-object relation taken up in a naturalist’s understanding of the world. This in no way hinges on a metaphysical dualism that makes any distinction between spirits/souls and bodies/objects. Rather, this account hinges on a particular conception of the world as coming to be known principally via socialization. The animist does not hypothesize that some particular tree is a person, and socialize accordingly. Instead, one personifies some particular tree as, when and because one socializes with it. Thus, the commitment to the idea of non-human personhood is a commitment that develops across time and through social interaction.

The animist’s common adoption of kinship terms (such as “grandfather”) for animals and other natural phenomena may also be elucidated on this picture. In earlier writing, Hallowell (1926) describes the extent to which “bear cults” of the circumpolar region carefully avoid general terms, such as “bear” or “animal”, when addressing bears both pre- and post-mortem. Instead, kinship terms are regularly adopted. This can be explained on the assumption that the social role of the kinship term is being invoked (“grandfather”, for example, refers to one who is wise, who deserves to be listened to, who has authority within the social life of the community, and so on). Indeed, in more recent writing, Bird-David considers whether an understanding of animist belief as fundamentally built on the notion of relatives rather than persons may more accurately account for the sense in which the animist relates to non-human others (2017). For the animist, on this revised account, the world does not so much consist in a variety of human and non-human persons, who differ in their species-specific and special forms of community life; rather, the world is composed of a network of human and non-human relatives, and what is fundamental to the animist’s worldview is this network as well as the maintenance of good relations within it.

2. The Neglect of Animism

Unlike theism, animism has seldom been the focus of any sustained critical or philosophical discourse. Perhaps unsurprisingly, where such traditions of critical discourse have flourished, a realist interpretation of animist belief has been received negatively. 17th century Japanese commentaries on Shinto provide a rare example of such a critical tradition. Writers such as Fukansai Habian, Arai Hakuseki, and Ando Shoeki critically engaged with the mythological and animistic aspects of Shinto, while also illuminating their historical and political subtexts. Interestingly, in his philosophical discourse On Shinto, Habian produces several naturalistic debunking arguments against animism, among which is the argument that the Japanese have developed their peculiar ontology in the same way as other island peoples, who all developed similar mythologies pertaining to a familial relationship with the unique piece of land on which they find themselves. He goes on to argue that the Japanese cannot truly be descendants of the Sun, since in the ordinary course of events, humans beget humans, dogs beget dogs, and so on. Thus, human beings could not actually be born of the Sun, and it follows that the Japanese race could not be the descendants of Amaterasu, the Sun kami. Another critical argument from Habian runs that the heavenly bodies cannot be animate beings, since their movements are too linear and predictable. Were the moon truly animate, he argues, we should witness it zig-zagging as does an ant (Baskind and Bowring 2015, 147). Such debates, naturally, had little impact in the West, where they were largely inaccessible (and anyway considered irrelevant). After the voyages of discovery and during the age of empire, the steady conversion of colonized peoples to the proselytizing theistic traditions of colonial powers added credence to the notion that primitive animist religions had indeed been discarded for more sophisticated religious rivals (particularly, Christianity and Islam).

The modernist view (championed by the likes of Hume, Tylor, and Frazer) according to which animism is an unsophisticated, primitive, and superstitious belief was carried over wholesale into 20th-century analytic philosophy of religion. One might expect that as religious exclusivism waned in popularity in philosophy and popular culture, animism would come to be appreciated as one valid religious perspective among many contenders. Yet even permissive religious pluralists, such as the philosopher John Hick, denied that primitive animistic traditions count as genuine transformative encounters with transcendental ultimacy or “the Real” (1989, 278). One recent attempt to reconcile religious pluralism and indigenous religious traditions can be found in the work of Mikel Burley (2019), although it remains to be seen what impact this approach will have on the field.

Other philosophers of religion (such as Kevin Schilbrack [2014] and Graham Oppy [2014]) argue that the philosophy of religion needs to engage critically with a greater diversity of viewpoints and traditions, including animism, as well as ancestor worship, shamanism, and the like. Of course, it is not incumbent on these philosophers to celebrate animism. But it is important that the field dubbed “philosophy of religion” engage with religion as a broad and varied human phenomenon. The cursory dismissal that animism receives within the discipline is, apparently, little more than a hangover of colonial biases.

The view that animist traditions fail to compete with the “great world religions” remains surprisingly pervasive in mainstream philosophy of religion. A reason for this may have to do with a prevailing conception of animist traditions as having no transformative or transcendental aspects. They are immanentist religions, which seldom treat salvation or liberation as a central religious aim. Instead, there is a focus on the immediate needs of the community and on the good working relationships that hold between human persons and the environment. Because animists are immanentists, their traditions are seen as failing to lead believers to the ultimate religious goal: salvation. However, it is clear enough that there are no non-circular grounds on which to base this appraisal. Why would we judge transcendentalist religions superior, or more efficacious, compared to immanentist ones, unless we were already committed to the view that the ultimate goal of religion is salvation?

3. Public Arguments for Animism

Some philosophical arguments can be mounted in support of animism. Some of these arguments hinge on evidence which is publicly available (call them “public arguments”). Others may hinge on what is ultimately private or person-relative evidence (call them “private arguments”). Two closely related public arguments may be proffered in support of animism:

a. Argument from Innateness

Within the field of psychology, it has been observed that children have a “tendency to regard objects as living and endowed with will” (Piaget 1929, 170). Young children are more inclined than adults to ascribe agency to what most adults regard as inanimate objects. Evidence suggests that this tendency decreases markedly between three and five years of age (Bullock 1985, 222), and that it is not the result of training by caregivers. Implicit in much of the psychological research is the idea that the child’s perception of widespread agency is naive and is corrected in the course of development to adulthood. A version of this argument was already stated by the Scottish Enlightenment philosopher Thomas Reid in the 18th century. Yet it is unclear on what grounds, apart from our pre-existing naturalistic commitments, we might base this appraisal.

Against the view that childhood animism is corrected by experience, David Kennedy writes that the shift towards naturalist modernism has left the child’s animist commitments in a state of abandonment. Kennedy asks somewhat rhetorically: “Do young children, because of their different situation, have some insight into nature that adults do not? Does their “folly” actually represent a form of wisdom, or at least a philosophical openness lost to adults, who have learned, before they knew it, to read soul out of nature?” (Kennedy 1989). The idea that childhood animism is corrected by experience is the natural consequence of a commitment to a modernist conception of animism, but it would be a harder position to maintain according to the alternative conceptions surveyed above.

b. Argument from Common Consent

The traditional argument from common consent runs that because the majority of religious believers are theists, theism is probably true. A revised common consent argument runs instead that since separate and isolated religious communities independently agree to the proposition that features of the natural world (such as rocks, rivers, and whales) are persons, animism is probably true (Smith 2019). This argument relies on the social-epistemic claim that independent agreement is prima facie evidence for the truth of a proposition. A similar argument supporting ancestor worship can be found in a recent article by Thomas Reuter (2014). Moreover, it is argued that since the widespread distribution of theists has been brought about by the relatively recent proselytization of politically disempowered peoples, such widespread agreement is not compelling evidence for the truth of theism. Indeed, even contemporary defenders of the common consent argument for the existence of God accept that independent agreement is stronger evidence for the truth of a proposition than agreement generated by some other means, such as “word of mouth” or indoctrination (see, for example, Zagzebski [2012, 343]). It would seem, then, that even on their own terms, contemporary proponents of the common consent argument for the existence of God ought to consider animism as a serious rival to theism.

4. Private Arguments for Animism

Recently, it has been popular to move beyond public defenses of religious belief and toward private or person-relative defenses. Such defenses typically charge that believers are warranted to accept their religious beliefs, even if they lack compelling discursive arguments or public evidence that their views are reasonable to believe or probably true. Alvin Plantinga’s Warranted Christian Belief is the best-known work in this vein.

If such defenses are not inherently epistemologically suspicious, then it remains open for the animist to argue that while there may be no overwhelmingly convincing arguments for animism, animist beliefs are nevertheless internally vindicated according to the standards that animists themselves hold, whatever those standards might be. It could be argued that animist belief is properly basic in the same way that Plantinga takes theistic belief to be (Plantinga 1981). In addition, it may be argued that animist beliefs are not defeated by any external challenges (Smith [2019], for example, gives rebuttals to evolutionary debunking arguments of animism). Thus, it might be argued that animism is vindicated not by external or discursive arguments according to which animism can be shown to be probably true, but by epistemic features internal to the relevant animistic belief system. The animist may argue that although animist belief is thereby justified in a circular manner, this is in no way inferior to the justification afforded to other fundamental beliefs (beliefs about perception, for example), since epistemic circularity is a feature of many of our most fundamental beliefs (William Alston [1986] defends Christian mystical practices by appeal to this kind of tu quoque argument).

5. Pragmatic Arguments for Animism

It is the nature of pragmatic arguments to present some aim as worthwhile, and to recommend some policy conducive to the achievement of that aim. Animist belief has been recommended by some writers as conducive to achieving the following three aims.

a. Environmentalism

An understanding of the environment as rich with persons clearly has implications for conservation, resource management, and sustainability. The scope of human moral decision-making, which may affect the well-being of other persons, is broadened beyond a concern only for human persons. It is on these grounds that philosophers and environmental theorists have argued that a shift towards animism is conducive to the success of worldwide conservation and sustainability efforts. Val Plumwood, for example, argues that an appreciation of non-human others is nothing short of a “basic survival project”. She writes that “reversing our drive towards destroying our planetary habitat,” may require “a thorough and open rethink which has the courage to question our most basic cultural narratives” (2010, 47). The argument runs that within a positivist, scientistic paradigm, reverence and appreciation for the natural world is replaced by a disregard or even an antipathy, according to which the natural world is understood as a mere resource for human consumption.

Much of the appeal of this view appears to hinge on the popular belief that many indigenous societies lived in harmony with nature and that this harmony is a direct result of their understanding of the outside world as an extension of their own society and culture. Against this ecological “noble savage” view, some scholars have charged that this romanticized picture of the animist is unrealistic, as there seems to be at best a tenuous causal connection between traditional animist belief systems and enhanced conservation practices (Tiedje 2008, 97). Any link between animism and environmentalism will also hinge importantly on precisely which natural phenomena are understood to be persons, and whether such persons require much or any respect at all. A tradition that views a fire as a subject and a grassland as a mere object is unlikely to be concerned when the former consumes the latter.

b. Feminism

It has been argued that the liberation of women is a project which cannot be disentangled from the liberation of (and political recognition of) the environment. The objectification of nature is seen as an aspect of patriarchy, which may be undone by the acceptance of an ethics of care which acknowledges the existence of non-human persons. The frame of thinking in which patriarchy flourishes depends upon a system of binary opposition, according to which “nature” is contrasted with “reason”, and according to which anything considered to fall within the sphere of the former (women, indigenous peoples, animals, and so forth) is devalued and systematically disempowered (Mathews 2008, 319). Thus, animism, in so far as it rejects the traditional binary, is perceived as an ally of (a thoroughly intersectional) feminism.

Moreover, the argument is made that animism, as an epistemological world picture, itself constitutes a feminist challenge to patriarchal epistemologies and the conclusions drawn from them. So, whereas monotheistic religious traditions are taken to be grounded in abstract reasoning about ultimate causes and ultimate justice (supposedly “masculine” reasoning), animism is taken to be grounded in intuition and a concern for the maintenance of interpersonal relationships. Likewise, while an austere philosophical naturalism views the external world as fundamentally composed of unconscious, mechanistic, and deterministic causal objects whose real natures are grasped by sense perception and abstract reasoning, an animist epistemology is sensitive to the fundamentality of knowing others, and so shares common cause with feminist epistemological approaches (such as Stuckey [2010, 190]).

c. Nationalism and Sovereignty

Given the intimate connection that the animist draws between communities and their local environment, animism has been invoked both in promoting nationalist political agendas and in reasserting indigenous sovereignty over contested ancestral lands. In New Zealand, for example, legal personhood has been granted both to Te Urewera, a forested former national park (Te Urewera Act 2014), and to the Whanganui River (Te Awa Tupua (Whanganui River Claims Settlement) Act 2017). In both cases, legal personhood was granted in accordance with the traditional animist commitments of local Māori, and the acts were thereby seen as reasserting indigenous sovereignty over these lands.

Nationalist political movements have also appealed to animism and neo-paganism, particularly in aggressive projects of expansion. Since animist traditions draw strong connections between environment and culture, land and relatedness, there is fertile ground for such traditions to be invoked in support of exclusive rights to the use and habitation of the environment. The promotion of völkisch neo-paganism, for example, was used to motivate Nazi arguments for German Lebensraum, or living space—the expansion into “ancestral” German lands (Kurlander 2017, 3-32). Similarly, Shinto was instituted as the state religion in Japan in 1868 to consolidate the nation after the Meiji Restoration, and it was further invoked to defend notions of racial superiority up to the Second World War. As direct descendants of Amaterasu (the sun kami), the Japanese were held to have a claim to racial superiority, particularly over other Asian peoples. This claim to Japanese racial supremacy, itself a consequence of animist aspects of Shinto mythology, was often used in defense of the expansion of the Japanese empire throughout the Asia-Pacific region (Holtom 1947, 16).

6. References and Further Reading

  • Alston, W. (1986) “Epistemic Circularity” Philosophy and Phenomenological Research. 47 (1): pp. 1-30.
  • Baskind, J. and Bowring, R. (2015) The Myotei Dialogues: A Japanese Christian Critique of Native Traditions. Boston: Brill.
  • Bird-David, N. (1999) “‘Animism’ Revisited: Personhood, Environment, and Relational Epistemology” Current Anthropology. 40 (S1): pp. S67-S91.
  • Bird-David, N. (2017) Us, Relatives: Scaling and Plural Life in a Forager World. Oakland: University of California Press.
  • Buber, M. (1970) in Kaufman, W. (trans.). I and Thou. New York: Charles Scribner’s and Sons.
  • Bullock, M. (1985) “Animism in Childhood Thinking: a New Look at an Old Question” Developmental Psychology. 21 (2): pp. 217-225.
  • Burley, M. (2019) A Radical Pluralist Philosophy of Religion. New York: Bloomsbury Academic.
  • Descola, P. (2013) in Lloyd, J. (trans.) Beyond Nature and Culture. Chicago. Chicago University Press.
  • Fortenbaugh, W. (2011) Strato of Lampsacus: Text, Translation and Discussion. New York: Routledge.
  • Frazer, J. (1900) The Golden Bough: A Study in Magic and Religion. New York: Macmillan and Co.
  • Freud, S. (1950) Totem and Taboo. London: Routledge and Kegan Paul Ltd.
  • Guthrie, S. (2015) Faces in the Clouds: A New Theory of Religion. Oxford: Oxford University Press.
  • Guthrie, S. (2000) “On Animism” Current Anthropology. 41 (1): pp. 106-107.
  • Hallowell, I. (1926) “Bear Ceremonialism in the Northern Hemisphere” American Anthropologist. 28 (1): 1-175.
  • Hallowell, I. (1960) “Ojibwa Ontology, Behavior and World View” in Harvey, G. (ed.) (2002) Readings in Indigenous Religions. London. Continuum: pp. 18-49.
  • Harvey, G. (2005) Animism: Respecting the Living World. New York: Columbia University Press.
  • Hick, J. (1989) An Interpretation of Religion. New Haven: Yale University Press.
  • Holtom, D. C. (1963) Modern Japan and Shinto Nationalism (3rd edition). New York: Reprinted with special arrangement with University of Chicago Press by Paragon Book Reprint Corp.
  • Horkheimer, M. and Adorno, T. (2002) Dialectic of Enlightenment. Stanford: Stanford University Press.
  • Hume, D. in Gaskin, J. C. A. (ed.) (2008) Dialogues Concerning Natural Religion, and The Natural History of Religion. Oxford: Oxford University Press.
  • Ingold, T. (2006) “Rethinking the Animate, Re-Animating Thought” Ethnos. 71 (1): pp. 9-20.
  • James, W. (1896) “The Will to Believe” in Stuhr, J. (ed.) (2007) Pragmatism and Classical American Philosophy: Essential Readings and Interpretive Essays 2nd ed. Oxford: Oxford University Press. pp. 230-243.
  • Kennedy, D. (1989) “Fools, Young Children, Animism, and the Scientific World Picture” Philosophy Today. 33 (4): pp. 374-381.
  • Kurlander, E. (2017) Hitler’s Monsters. New Haven: Yale University Press.
  • Mathews, F. (2008) “Vale Val: In Memory of Val Plumwood” Environmental Values. 17 (3): pp. 317-321.
  • Oppy, G. (2014) Reinventing Philosophy of Religion: An Opinionated Introduction. New York: Palgrave Macmillan.
  • Peoples, H., Duda, P. and Marlowe, F. (2016) “Hunter-Gatherers and the Origins of Religion” Human Nature. 27. pp. 261-282.
  • Peterson, N. (2011) “Is the Aboriginal Landscape Sentient? Animism, the New Animism and the Warlpiri” Oceania 81 (2): pp. 167-179.
  • Piaget, J. (1929) The Child’s Conception of the World. New York: Harcourt Brace.
  • Plantinga, A. (1981) “Is Belief in God Properly Basic?” Noûs. 15 (1): pp. 41-51.
  • Plumwood, V. (2010) “Nature in the Active Voice” in Irwin, R. (ed.) Climate Change and Philosophy: Transformational Possibilities. London: Continuum. pp. 32-47.
  • Reid, T. (1975) Inquiries and Essays. Lehrer, K. and Beanblossom, R. E. (eds.). Indianapolis: Bobbs-Merrill.
  • Reuter, T. (2014) “Is Ancestor Veneration the Most Universal of All World Religions? A Critique of Modernist Cosmological Bias” Wacana. 15 (2): pp. 223-253.
  • Skrbina, D. (2018) “Panpsychism” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/panpsych/ (Accessed 25-May-2018).
  • Schilbrack, K. (2014) Philosophy and the Study of Religions: A Manifesto. Oxford: Wiley Blackwell.
  • Smith, T. (2019) “The Common Consent Argument for the Existence of Nature Spirits” Australasian Journal of Philosophy.
  • Stuckey, P. (2010) “Being Known by a Birch Tree: Animist Refigurings of Western Epistemology” Journal for the Study of Religion, Nature and Culture. 4 (3): pp. 182-205.
  • Tiedje, K. (2008) “Situating the Corn Child: Articulating Animism and Conservation from a Nahua Perspective” Journal for the Study of Religion, Nature and Culture. 2 (1): pp. 93-115.
  • Tylor, E. B. (1929) Primitive Culture: Researches into the Development of Mythology, Philosophy, Religion, Language, Art and Custom. Vol. 1. London: John Murray.
  • Viveiros De Castro, E. (1998) “Cosmological Deixis and Amerindian Perspectivism” Journal of the Royal Anthropological Institute. 4 (3): pp. 469-488.
  • Zagzebski, L. (2012) Epistemic Authority: A Theory of Trust, Authority, and Autonomy in Belief. Oxford: Oxford University Press.

Author Information

Tiddy Smith
Email: smithtiddy@gmail.com
Otago University
New Zealand

Zeno’s Paradoxes

In the fifth century B.C.E., Zeno offered arguments that led to conclusions contradicting what we all know from our physical experience—that runners run, that arrows fly, and that there are many different things in the world. The arguments were paradoxes for the ancient Greek philosophers. Because many of Zeno’s arguments turn crucially on the notion that space and time are infinitely divisible, he was the first person to show that the concept of infinity is problematical.

In his Achilles Paradox, Achilles races to catch a slower runner—for example, a tortoise that is crawling in a straight line away from him. The tortoise has a head start, so if Achilles hopes to overtake it, he must run at least to the place where the tortoise presently is, reasons Zeno, but by the time he arrives there, it will have crawled to a new place, so then Achilles must run at least to this new place, and so forth. According to this reasoning, Achilles will never catch the tortoise, says Zeno. Whether Zeno and Parmenides themselves denied motion is very controversial, but subsequent scholars over the centuries assumed this, so it has been the majority position. One minority position is that they were not denying motion, but only showing that their opponents were committed to this.

We cannot escape the Achilles paradox by jumping up from our seat and chasing down a tortoise, nor by saying Zeno’s opponents should have constructed a new argument in which Achilles takes better aim and runs toward a place ahead of the tortoise. Because Zeno was correct in saying Achilles needs to run at least to all those places where the tortoise once was, what is required is an analysis of Zeno’s own argument.

This article explains his ten known paradoxes and considers the treatments that have been offered. In the Achilles Paradox, Zeno assumed distances and durations are infinitely divisible in the sense of having an actual infinity of parts, and he assumed there are too many of these parts for the runner to complete. Aristotle’s treatment says Zeno should have assumed instead that there is only a potential infinity of places to run to, so that at any time the hypothetical division into parts produces only a finite number of parts, and the runner has time to complete all these parts. Aristotle’s treatment was generally accepted until the late 19th century. The current standard treatment, the so-called “Standard Solution,” implies Achilles’s path contains an actual infinity of parts, but that Zeno was mistaken to assume this is too many parts for a runner to complete. This treatment employs the mathematical apparatus of calculus, which has proved its indispensability for the development of modern science. The article ends by exploring newer treatments of the paradoxes—and related paradoxes such as Thomson’s Lamp Paradox—that were developed since the 1950s.

Table of Contents

  1. Zeno of Elea
    1. His Life
    2. His Book
    3. His Goals
    4. His Method
  2. The Standard Solution to the Paradoxes
  3. The Ten Paradoxes
    1. Paradoxes of Motion
      1. The Achilles
      2. The Dichotomy (The Racetrack)
      3. The Arrow
      4. The Moving Rows (The Stadium)
    2. Paradoxes of Plurality
      1. Alike and Unlike
      2. Limited and Unlimited
      3. Large and Small
      4. Infinite Divisibility
    3. Other Paradoxes
      1. The Grain of Millet
      2. Against Place
  4. Aristotle’s Treatment of the Paradoxes
  5. Other Issues Involving the Paradoxes
    1. Consequences of Accepting the Standard Solution
    2. Criticisms of the Standard Solution
    3. Supertasks and Infinity Machines
    4. Constructivism
    5. Nonstandard Analysis
    6. Smooth Infinitesimal Analysis
  6. The Legacy and Current Significance of the Paradoxes
  7. References and Further Reading

1. Zeno of Elea

a. His Life

Zeno was born in about 490 B.C.E. in the city-state of Elea, now Velia, on the west coast of southern Italy; and he died in about 430 B.C.E. He was a friend and student of Parmenides, who was twenty-five years older and also from Elea. He was not a mathematician.

There is little additional, reliable information about Zeno’s life. Plato remarked (in Parmenides 127b) that Parmenides took Zeno to Athens with him where he encountered Socrates, who was about twenty years younger than Zeno, but today’s scholars consider this encounter to have been invented by Plato to improve the story line. Zeno is reported to have been arrested for taking weapons to rebels opposed to the tyrant who ruled Elea. When asked about his accomplices, Zeno said he wished to whisper something privately to the tyrant. But when the tyrant came near, Zeno bit him, and would not let go until he was stabbed. Diogenes Laërtius reported this apocryphal story seven hundred years after Zeno’s death.

b. His Book

According to Plato’s commentary in his Parmenides (127a to 128e), Zeno brought a treatise with him when he visited Athens. It was said to be a book of paradoxes defending the philosophy of Parmenides. Plato and Aristotle may have had access to the book, but Plato did not state any of the arguments, and Aristotle’s presentations of the arguments are very compressed. The Greek philosophers Proclus and Simplicius commented on the book and its arguments. They had access to some of the book, perhaps to all of it, but it has not survived. Proclus is the first person to tell us that the book contained forty arguments. This number is confirmed by the sixth century commentator Elias, who is regarded as an independent source because he does not mention Proclus. Unfortunately, we know of no specific dates for when Zeno composed any of his paradoxes, and we know very little of how Zeno stated his own paradoxes. We do have a direct quotation via Simplicius of the Paradox of Denseness and a partial quotation via Simplicius of the Large and Small Paradox. In total we know of less than two hundred words that can be attributed to Zeno. Our knowledge of these two paradoxes and the other seven discussed in this article comes to us indirectly through paraphrases of them, and comments on them, primarily by his opponents Aristotle (384-322 B.C.E.), Plato (427-347 B.C.E.), Proclus (410-485 C.E.), and Simplicius (490-560 C.E.). The names of the paradoxes were created by later commentators, not by Zeno. A thousand years after Zeno, one comment by Hesychius suggested that there were perhaps three more books by Zeno than the one mentioned by Plato, but scholars do not generally accept this claim because at least three of the book names by Hesychius are believed to be the name for just one book.

c. His Goals

In the early fifth century B.C.E., Parmenides emphasized the distinction between appearance and reality. Reality, he said, is a seamless unity that is unchanging and cannot be destroyed, so appearances of reality are deceptive. Our ordinary observation reports are false; they do not report what is real. This metaphysical theory is the opposite of Heraclitus’ theory, but evidently it was supported by Zeno. Although we do not know from Zeno himself whether he accepted his own paradoxical arguments, exactly what point he was making with them, or exactly what the relationship was between Parmenides’ views and Zeno’s, the historically most influential position is Plato’s. Plato said the paradoxes were designed to provide detailed, supporting arguments for Parmenides’ beliefs by demonstrating that Greek common-sense confidence in the reality of motion, change, and ontological plurality (that is, that there exist many things) involves absurdities. Plato’s classical interpretation of Zeno was accepted by Aristotle and by most other commentators throughout the intervening centuries. On Plato’s interpretation, it could reasonably be said that Zeno’s goal was to show that his Dichotomy and Achilles paradoxes demonstrate that any continuous process takes an infinite amount of time, which is paradoxical, while his Arrow and Stadium paradoxes demonstrate that the concept of discontinuous change is paradoxical. Because both continuous and discontinuous change are paradoxical, so is any change.

This is Gregory Vlastos’ position regarding Zeno’s goals. Eudemus, a student of Aristotle, offered another interpretation. He suggested that Zeno was challenging both pluralism and Parmenides’ monism, which would imply that Zeno was a nihilist. Paul Tannery in 1885 and Wallace Matson in 2001 offered a third interpretation of Zeno’s goals regarding the paradoxes of motion. Plato and Aristotle, they say, understood neither Zeno’s arguments nor his purpose. Zeno was actually challenging the Pythagoreans and their particular brand of pluralism, not Greek common sense. Tannery and Matson suggest Zeno himself did not believe the conclusions of his own paradoxes. The controversial issue of interpreting Zeno’s true goals and purposes is not pursued further in this article. Instead, Plato’s classical interpretation is assumed, because it is the one that was so influential throughout history and because the paradoxes as classically interpreted need to be countered even if Matson and Tannery are correct about Zeno’s own position.

Aristotle believed Zeno’s Paradoxes were trivial and easily resolved, but later philosophers have not agreed on the triviality.

d. His Method

Before Zeno, Greek thinkers favored presenting their philosophical views by writing poetry. Zeno began the grand shift away from poetry toward a prose that contained explicit premises and conclusions. And he employed the method of indirect proof in his paradoxes by temporarily assuming some thesis that he opposed and then attempting to deduce an absurd conclusion or a contradiction, thereby undermining the temporary assumption. This method of indirect proof or reductio ad absurdum probably originated with Greek mathematicians, but Zeno used it more systematically and self-consciously.

2. The Standard Solution to the Paradoxes

Any paradox can be treated by abandoning enough of its crucial assumptions. For Zeno’s paradoxes, it is very interesting to consider which assumptions to abandon, and why those. A paradox is an argument that reaches a contradiction by apparently legitimate steps from apparently reasonable assumptions, while the experts at the time cannot agree on the way out of the paradox, that is, agree on its resolution. It is this latter point about disagreement among the experts that distinguishes a paradox from a mere puzzle in the ordinary sense of that term. Zeno’s paradoxes are now generally considered to be puzzles because of the wide agreement among today’s experts that there is at least one acceptable resolution of the paradoxes.

This resolution is called the Standard Solution. It points out that, although Zeno was correct in saying that at any point or instant before reaching the goal there is always some as yet uncompleted path to cover, this does not imply that the goal is never reached. More specifically, the Standard Solution says that for the runners in the Achilles Paradox and the Dichotomy Paradox, the runner’s path is a physical continuum that is completed by using a positive, finite speed. The details presuppose differential calculus and classical mechanics. The Standard Solution treats speed as the derivative of distance with respect to time. It assumes that physical processes are sets of point-events. It implies that durations, distances and line segments are all linear continua composed of indivisible points; it then uses these ideas to challenge various assumptions made, and inference steps taken, by Zeno. To be very brief and anachronistic, Zeno’s mistake (and Aristotle’s mistake) was to fail to use calculus. More specifically, in the case of the paradoxes of motion such as the Achilles and the Dichotomy, Zeno’s mistake was not his assuming there is a completed infinity of places for the runner to go, which was what Aristotle said was Zeno’s mistake. Instead, Zeno’s and Aristotle’s mistake was in assuming that this is too many places (for the runner to go to in a finite time).

A key background assumption of the Standard Solution is that this resolution is not simply employing some concepts that will undermine Zeno’s reasoning—Aristotle’s reasoning does that, too, at least for most of the paradoxes—but that it is employing concepts which have been shown to be appropriate for the development of a coherent and fruitful system of mathematics and physical science. Aristotle’s treatment of the paradoxes does not employ these fruitful concepts of mathematical physics. Aristotle did not believe that the use of mathematics was needed to understand the world. The Standard Solution is much more complicated than Aristotle’s treatment, and no single person can be credited with creating it.

The Standard Solution allows us to speak of one event happening pi seconds after another, and of one event happening the square root of three seconds after another. In ordinary discourse outside of science we would never need this kind of precision, but it is needed in mathematical physics and its calculus. The need for this precision has led to requiring time to be a linear continuum, very much like a segment of the real number line. By “real numbers” we do not mean actual numbers but rather decimal numbers.

Calculus was invented in the late 1600s by Newton and Leibniz. Their calculus is a technique for treating continuous motion as being composed of an infinite number of infinitesimal steps. After the acceptance of calculus, almost all mathematicians and physicists believed that continuous motion should be modeled by a function which takes real numbers representing time as its argument and which gives real numbers representing spatial position as its value. This position function should be continuous or gap-free. In addition, the position function should be differentiable in order to make sense of speed, which is treated as the rate of change of position. By the early 20th century most mathematicians had come to believe that, to make rigorous sense of motion, mathematics needs a fully developed set theory that rigorously defines the key concepts of real number, continuity and differentiability. Doing this requires a well-defined concept of the continuum. Unfortunately, Newton and Leibniz did not have a good definition of the continuum, and finding a good one required over two hundred years of work.

The continuum is a very special set; it is the standard model of the real numbers. Intuitively, a continuum is a continuous entity; it is a whole thing that has no gaps. Some examples of a continuum are the path of a runner’s center of mass, the time elapsed during this motion, ocean salinity, and the temperature along a metal rod. Distances and durations are normally considered to be real physical continua whereas treating the ocean salinity and the rod’s temperature as continua is a very useful approximation for many calculations in physics even though we know that at the atomic level the approximation breaks down.

The distinction between “a” continuum and “the” continuum is that “the” continuum is the paradigm of “a” continuum. The continuum is the mathematical line, the line of geometry, which is standardly understood to have the same structure as the real numbers in their natural order. Real numbers and points on the continuum can be put into a one-to-one order-preserving correspondence. There are not enough rational numbers for this correspondence even though the rational numbers are dense, too (in the sense that between any two rational numbers there is another rational number).
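The density claim in the parenthesis can be checked mechanically. The following sketch (illustrative, not part of the original text) uses Python’s exact rational arithmetic to exhibit a rational strictly between any two given rationals:

```python
from fractions import Fraction

# Density of the rationals: between any two distinct rationals a and b,
# the midpoint (a + b)/2 is another rational strictly between them.
# (Density alone does not yield a continuum; the rationals still have gaps,
# such as the one at the square root of 2.)

def rational_between(a: Fraction, b: Fraction) -> Fraction:
    return (a + b) / 2

a, b = Fraction(1, 3), Fraction(1, 2)
m = rational_between(a, b)
assert a < m < b
print(m)  # 5/12
```

Because the midpoint construction can be repeated forever, there are infinitely many rationals in any interval; yet, as the article explains, there are still not enough of them to match the points of the continuum one-to-one.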

For Zeno’s paradoxes, standard analysis assumes that length should be defined in terms of measure, and motion should be defined in terms of the derivative. These definitions are given in terms of the linear continuum. The most important features of any linear continuum are that (a) it is composed of indivisible points, (b) it is an actually infinite set, that is, a transfinite set, and not merely a potentially infinite set that gets bigger over time, (c) it is undivided yet infinitely divisible (that is, it is gap-free), (d) the points are so close together that no point can have a point immediately next to it, (e) between any two points there are other points, (f) the measure (such as length) of a continuum is not a matter of adding up the measures of its points nor adding up the number of its points, (g) any connected part of a continuum is also a continuum, and (h) there are an aleph-one number of points between any two points.

Physical space is not a linear continuum because it is three-dimensional and not linear; but it has one-dimensional subspaces such as paths of runners and orbits of planets; and these are linear continua if we use the path created by only one point on the runner and the orbit created by only one point on the planet. Regarding time, each (point) instant is assigned a real number as its time, and each instant is assigned a duration of zero. The time taken by Achilles to catch the tortoise is a temporal interval, a linear continuum of instants, according to the Standard Solution (but not according to Zeno or Aristotle). The Standard Solution says that the sequence of Achilles’ goals (the goals of reaching the point where the tortoise is) should be abstracted from a pre-existing transfinite set, namely a linear continuum of point places along the tortoise’s path. Aristotle’s treatment does not do this. The next section of this article presents the details of how the concepts of the Standard Solution are used to resolve each of Zeno’s Paradoxes.

Of the ten known paradoxes, the Achilles attracted the most attention over the centuries. Aristotle’s treatment of the paradox involved accusing Zeno of using the concept of an actual or completed infinity instead of the concept of a potential infinity, and accusing Zeno of failing to appreciate that a line cannot be composed of indivisible points. Aristotle’s treatment is described in detail below. It was generally accepted until the 19th century, but slowly lost ground to the Standard Solution. Some historians say Aristotle had no solution but only a verbal quibble. This article takes no side on this dispute and speaks of Aristotle’s “treatment.”

The development of calculus was the most important step in the Standard Solution of Zeno’s paradoxes, so why did it take so long for the Standard Solution to be accepted after Newton and Leibniz developed their calculus? The period lasted about two hundred years. There are four reasons. (1) It took time for calculus and the rest of real analysis to prove its applicability and fruitfulness in physics, especially during the eighteenth century. (2) It took time for the relative shallowness of Aristotle’s treatment of Zeno’s paradoxes to be recognized. (3) It took time for philosophers of science to appreciate that each theoretical concept used in a physical theory need not have its own correlate in our experience. (4) It took time for certain problems in the foundations of mathematics to be resolved, such as finding a better definition of the continuum and avoiding the paradoxes of Cantor’s naive set theory.

Point (2) is discussed in section 4 below.

Point (3) is about the time it took for philosophers of science to reject the demand, favored by Ernst Mach and most Logical Positivists, that each meaningful term in science must have “empirical meaning.” This was the demand that each physical concept be separately definable with observation terms. It was thought that, because our experience is finite, the term “actual infinite” or “completed infinity” could not have empirical meaning, but “potential infinity” could. Today, most philosophers would not restrict meaning to empirical meaning.

Point (1) is about the time it took for classical mechanics to develop to the point where it was accepted as giving correct solutions to problems involving motion. Point (1) was, and still is, challenged in the metaphysical literature on the grounds that the abstract account of continuity in real analysis does not truly describe time, space, or concrete physical reality. This challenge is discussed in later sections.

Point (4) arises because the standard of rigorous proof and rigorous definition of concepts has increased over the years. As a consequence, the difficulties in the foundations of real analysis, which began with George Berkeley’s criticism of inconsistencies in the use of infinitesimals in the calculus, were not satisfactorily resolved until the early 20th century with the development of Zermelo-Fraenkel set theory. The key idea was to work out the necessary and sufficient conditions for being a continuum. To achieve the goal, the conditions for being a mathematical continuum had to be strictly arithmetical and not dependent on our intuitions about space, time and motion. The idea was to revise or “tweak” the definition until it would not create new paradoxes and would still give useful theorems. When this revision was completed, it could be declared that the set of real numbers is an actual infinity, not a potential infinity, and that not only is any interval of real numbers a linear continuum, but so are the spatial paths, the temporal durations, and the motions that are mentioned in Zeno’s paradoxes. In addition, it was important to clarify how to compute the sum of an infinite series (such as 1/2 + 1/4 + 1/8 + …) without requiring any person to manually add or otherwise perform some action that requires an infinite amount of time. Finally, mathematicians needed to define motion in terms of the derivative. This new mathematical system required many new well-defined concepts such as compact set, connected set, continuity, continuous function, convergence-to-a-limit of an infinite sequence (such as 1/2, 1/4, 1/8, …), curvature at a point, cut, derivative, dimension, function, integral, limit, measure, reference frame, set, and size of a set. Just as for those new mathematical concepts, rigor was added to the definitions of these physical concepts: place, instant, duration, distance, and instantaneous speed.
The relevant revisions were made by Euler in the 18th century and by Bolzano, Cantor, Cauchy, Dedekind, Frege, Hilbert, Lebesgue, Peano, Russell, Weierstrass, and Whitehead, among others, during the 19th and early 20th centuries.

What happened over these centuries to Leibniz’s infinitesimals and Newton’s fluxions? Let’s stick with infinitesimals, since fluxions have the same problems and same resolution. In 1734, Berkeley had properly criticized the use of infinitesimals as being “ghosts of departed quantities” that are used inconsistently in calculus. Earlier, Newton had defined instantaneous speed as the ratio of an infinitesimally small distance and an infinitesimally small duration, and he and Leibniz produced a system of calculating variable speeds that was very fruitful. But nobody in that century or the next could adequately explain what an infinitesimal was. Newton had called them “evanescent divisible quantities,” whatever that meant. Leibniz called them “vanishingly small,” but that was just as vague.

The practical use of infinitesimals was unsystematic. For example, the infinitesimal dx is treated as being equal to zero when it is declared that x + dx = x, but is treated as not being zero when used in the denominator of the fraction [f(x + dx) - f(x)]/dx which is used in the derivative of the function f. In addition, consider the seemingly obvious Archimedean property of pairs of positive numbers: given any two positive numbers A and B, if you add enough copies of A, then you can produce a sum greater than B. This property fails if A is an infinitesimal. Finally, mathematicians gave up on answering Berkeley’s charges (and thus re-defined what we mean by standard analysis) because, in 1821, Cauchy showed how to achieve the same useful theorems of calculus by using the idea of a limit instead of an infinitesimal. Later in the 19th century, Weierstrass resolved some of the inconsistencies in Cauchy’s account and satisfactorily showed how to define continuity in terms of limits (his epsilon-delta method). As J. O. Wisdom points out (1953, p. 23), “At the same time it became clear that [Leibniz’s and] Newton’s theory, with suitable amendments and additions, could be soundly based” provided Leibniz’s infinitesimals and Newton’s fluxions were removed. In an effort to provide this sound basis according to the latest, heightened standard of what counts as “sound,” Peano, Frege, Hilbert, and Russell attempted to properly axiomatize real analysis. Unfortunately, this led in 1901 to Russell’s paradox and the fruitful controversy about how to provide a foundation to all of mathematics. That controversy still exists, but the majority view is that axiomatic Zermelo-Fraenkel set theory with the axiom of choice blocks all the paradoxes, legitimizes Cantor’s theory of transfinite sets, and provides the proper foundation for real analysis and other areas of mathematics, and indirectly resolves Zeno’s paradoxes.
This standard real analysis lacks infinitesimals, thanks to Cauchy and Weierstrass. Standard real analysis is the mathematics that the Standard Solution applies to Zeno’s Paradoxes.
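The limit idea that replaced infinitesimals can be illustrated numerically. In this sketch the function f(x) = x² is an illustrative assumption, not something from the historical record; the point is that the difference quotient uses only ordinary finite numbers, so no quantity is ever treated as both zero and nonzero, which was Berkeley’s charge:

```python
# The derivative of f at x, on the limit account, is the value approached by
# the difference quotient [f(x + h) - f(x)] / h as the ordinary real number h
# shrinks toward 0. No infinitesimal is involved at any stage.

def difference_quotient(f, x, h):
    return (f(x + h) - f(x)) / h

f = lambda x: x ** 2  # illustrative choice; the exact derivative at x is 2x

# At x = 2 the quotients approach 4 as h shrinks, so the derivative there is 4.
for h in [0.1, 0.01, 0.001, 0.0001]:
    print(h, difference_quotient(f, 2.0, h))
```

Each h in the loop is a perfectly ordinary nonzero number, so dividing by it is legitimate; the derivative is defined as the limit of these quotients, not as a ratio of vanishing quantities.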

In standard real analysis, the rational numbers are not continuous although they are infinitely numerous and infinitely dense. To come up with a foundation for calculus there had to be a good definition of the continuity of the real numbers. But this required having a good definition of irrational numbers. There wasn’t one before 1872. Dedekind’s definition of 1872 defines the mysterious irrationals in terms of the familiar rationals. The result is a clear and useful definition of real numbers. The usefulness of Dedekind’s definition, and the lack of any better definition, convinced many mathematicians to be more open to accepting the real numbers and actually-infinite sets.

Let’s take a short interlude and introduce Dedekind’s key, new idea that he discovered in the 1870s about the reals and their relationship to the rationals. He envisioned how to define a real number to be a cut of the rational numbers, where a cut is a certain ordered pair of actually-infinite sets of rational numbers.

A Dedekind cut (A,B) is defined to be a partition or cutting of the standardly-ordered set of all the rational numbers into a left part A and a right part B. A and B are non-empty, and they partition all the rationals so that the numbers in A are less than all those in B, and also A contains no greatest number. Every real number is a unique Dedekind cut. The cut can be made at a rational number or at an irrational number. Here are examples of each:

Dedekind’s real number 1/2 is ({x : x < 1/2} , {x: x ≥ 1/2}).

Dedekind’s positive real number √2 is ({x : x < 0 or x² < 2} , {x: x² ≥ 2}).

In these definitions, x ranges over the rational numbers only. For any cut (A,B), if B has a smallest number, then the real number for that cut corresponds to this smallest number, as in the definition of 1/2 above. Otherwise, the cut defines an irrational number which, loosely speaking, fills the gap between A and B, as in the definition of the square root of 2 above. By defining reals in terms of rationals this way, Dedekind gave a foundation to the reals, and legitimized them by showing they are as acceptable as actually-infinite sets of rationals.
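The two defining properties of the √2 cut, that every rational lands in exactly one part and that A has no greatest member, can be checked with exact rational arithmetic. This is only an illustrative sketch: the helper names are hypothetical, and the formula (2a + 2)/(a + 2) is one standard way to produce a larger rational that is still below √2.

```python
from fractions import Fraction

# Membership in the left part A of the Dedekind cut for the square root of 2:
# A = {x rational : x < 0 or x^2 < 2}. Every rational falls in exactly one part.

def in_left_part(x: Fraction) -> bool:
    return x < 0 or x * x < 2

def larger_member_of_A(a: Fraction) -> Fraction:
    # For a nonnegative rational a in A, (2a + 2)/(a + 2) is a strictly larger
    # rational whose square is still below 2, so A has no greatest member.
    return (2 * a + 2) / (a + 2)

a = Fraction(7, 5)                         # 1.4 lies in A, since 1.96 < 2
assert in_left_part(a)
assert not in_left_part(Fraction(3, 2))    # 1.5 lies in B, since 2.25 >= 2

b = larger_member_of_A(a)                  # 24/17, roughly 1.4118
assert b > a and in_left_part(b)           # A keeps climbing toward sqrt(2)
```

Because A has no greatest member and B (for this cut) has no least member, the cut itself marks a gap in the rationals, and it is exactly this gap that Dedekind identifies with the irrational number √2.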

But what exactly is an actually-infinite (or transfinite) set, and does this idea lead to contradictions? This question needs an answer if there is to be a good theory of continuity and of real numbers. In the 1870s, Cantor clarified what an actually-infinite set is and made a convincing case that the concept does not lead to inconsistencies. These accomplishments by Cantor are why he (along with Dedekind and Weierstrass) is said by Russell to have “solved Zeno’s Paradoxes.”

That solution recommends using very different concepts and theories than those used by Zeno. The argument that this is the correct solution was presented by many people, but it was especially influenced by the work of Bertrand Russell (1914, lecture 6) and the more detailed work of Adolf Grünbaum (1967). In brief, the argument for the Standard Solution is that we have solid grounds for believing our best scientific theories, but the theories of mathematics such as calculus and Zermelo-Fraenkel set theory are indispensable to these theories, so we have solid grounds for believing in them, too. The scientific theories require a resolution of Zeno’s paradoxes and the other paradoxes; and the Standard Solution to Zeno’s Paradoxes that uses standard calculus and Zermelo-Fraenkel set theory is indispensable to this resolution or at least is the best resolution, or, if not, then we can be fairly sure there is no better solution, or, if not that either, then we can be confident that the solution is good enough (for our purposes). Aristotle’s treatment, on the other hand, uses concepts that hamper the growth of mathematics and science. Therefore, we should accept the Standard Solution.

In the next section, this solution will be applied to each of Zeno’s ten paradoxes.

To be optimistic, the Standard Solution represents a counterexample to the claim that philosophical problems never get solved. To be less optimistic, the Standard Solution has its drawbacks and its alternatives, and these have generated new and interesting philosophical controversies beginning in the last half of the 20th century, as will be seen in later sections. The primary alternatives contain different treatments of calculus from that developed at the end of the 19th century. Whether this implies that Zeno’s paradoxes have multiple solutions or only one is still an open question.

Did Zeno make mistakes? And was he superficial or profound? These questions are a matter of dispute in the philosophical literature. The majority position is as follows. If we give his paradoxes a sympathetic reconstruction, he correctly demonstrated that some important, classical Greek concepts are logically inconsistent, and he did not make a mistake in doing this, except in the Moving Rows Paradox, the Paradox of Alike and Unlike and the Grain of Millet Paradox, his weakest paradoxes. Zeno did assume that the classical Greek concepts were the correct concepts to use in reasoning about his paradoxes, and now we prefer revised concepts, though it would be unfair to say he blundered for not foreseeing later developments in mathematics and physics.

3. The Ten Paradoxes

Zeno probably created forty paradoxes, of which only the following ten are known. Only the first four have standard names, and the first two have received the most attention. The ten are of uneven quality. Zeno and his ancient interpreters usually stated his paradoxes badly, so it has taken some clever reconstruction over the years to reveal their full force. Below, the paradoxes are reconstructed sympathetically, and then the Standard Solution is applied to them. These reconstructions use just one of several reasonable schemes for presenting the paradoxes, but the present article does not explore the historical research about the variety of interpretive schemes and their relative plausibility.

a. Paradoxes of Motion

i. The Achilles

Achilles, who we may assume is the fastest runner of antiquity, is racing to catch the tortoise that is slowly crawling away from him. Both are moving along a linear path at constant speeds. In order to catch the tortoise, Achilles will have to reach the place where the tortoise presently is. However, by the time Achilles gets there, the tortoise will have crawled to a new location. Achilles will then have to reach this new location. By the time Achilles reaches that location, the tortoise will have moved on to yet another location, and so on forever. Zeno claims Achilles will never catch the tortoise. This argument shows, he believes, that anyone who believes Achilles will succeed in catching the tortoise and who believes more generally that motion is physically possible is the victim of illusion. The claim that motion is an illusion was advanced by Zeno’s mentor Parmenides.

The source for all of Zeno’s arguments is the writings of his opponents. The Achilles Paradox is reconstructed from Aristotle (Physics Book VI, Chapter 8, 239b14-16) and some passages from Simplicius in the fifth century C.E. There is no evidence that Zeno used a tortoise rather than a slow human. The tortoise is a later commentator’s addition. Aristotle spoke simply of “the runner” who competes with Achilles.

It won’t do to react and say the solution to the paradox is that there are biological limitations on how small a step Achilles can take. Achilles’ feet are not obligated to stop and start again at each of the locations described above, so there is no limit to how close one of those locations can be to another. A stronger version of his paradox would ask us to consider the movement of Achilles’ center of mass. It is best to think of Achilles’ change from one location to another as a continuous movement rather than as incremental steps requiring halting and starting again. Zeno is assuming that space and time are infinitely divisible; they are not discrete or atomistic. If they were, this Paradox’s argument would not work.

[Figure: continuous versus discrete space]

One common complaint with Zeno’s reasoning is that he is setting up a straw man because it is obvious that Achilles cannot catch the tortoise if he continually takes a bad aim toward the place where the tortoise is; he should aim farther ahead. The mistake in this complaint is that even if Achilles took some sort of better aim, it is still true that he is required to go to every one of those locations that are the goals of the so-called “bad aims,” so remarking about a bad aim is not a way to successfully treat Zeno’s argument.

The treatment called the “Standard Solution” to the Achilles Paradox uses calculus and other parts of real analysis to describe the situation. It implies that Zeno is assuming Achilles cannot achieve his goal because

(1) there is too far to run, or

(2) there is not enough time, or

(3) there are too many places to go, or

(4) there is no final step, or

(5) there are too many tasks.

The historical record does not tell us which of these was Zeno’s real assumption, but they are all false assumptions, according to the Standard Solution.

Let’s consider assumption (1). Presumably Zeno would defend that assumption by remarking that there are an infinity of sub-distances involved in Achilles’ run, and the sum of the sub-distances is an actual infinity, which is too much distance even for Achilles. However, the advocate of the Standard Solution will remark, “How does Zeno know what the sum of this infinite series is, since in Zeno’s day the mathematicians could make sense of a sum of a series of terms only if there were a finite number of terms in the series? Maybe he is just guessing that the sum of an infinite number of terms could somehow be well-defined and be infinite.” According to the Standard Solution the sum is finite. Here is a graph using the methods of the Standard Solution showing the activity of Achilles as he chases the tortoise and overtakes it.

[Graph of Achilles and the tortoise.] For ease of understanding, Achilles and the tortoise are assumed to be point masses or infinitesimal particles, each moving at a constant velocity (that is, a constant speed in one direction). The graph displays the fact that Achilles’ path is a linear continuum and so is composed of an actual infinity of points. (An actual infinity is also called a “completed infinity” or “transfinite infinity.” The word “actual” does not mean “real” as opposed to “imaginary.”) Zeno’s failure to assume that Achilles’ path is a linear continuum is a fatal step in his argument, according to the Standard Solution, which requires that the reasoner use the concepts of contemporary mathematical physics.

Achilles travels a distance d1 in reaching the point x1 where the tortoise starts, but by the time Achilles reaches x1, the tortoise has moved on to a new point x2. When Achilles reaches x2, having gone an additional distance d2, the tortoise has moved on to point x3, requiring Achilles to cover an additional distance d3, and so forth. This sequence of non-overlapping distances (or intervals or sub-paths) is an actual infinity, but happily the geometric series converges. The sum of its terms d1 + d2 + d3 +… is a finite distance that Achilles can readily complete while moving at a constant speed.
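The convergence claim can be checked with a small computation. The speeds and head start below are illustrative assumptions, not figures from the historical record; the point is only that Zeno’s infinitely many sub-distances sum to the same finite distance that elementary kinematics gives directly:

```python
# Illustrative numbers: Achilles runs 10 m/s, the tortoise 1 m/s,
# and the tortoise starts 100 m ahead.

achilles_speed, tortoise_speed, head_start = 10.0, 1.0, 100.0

# Direct calculation: the catch-up time t solves 10 t = 100 + 1 t.
catch_time = head_start / (achilles_speed - tortoise_speed)
catch_distance = achilles_speed * catch_time

# Zeno's decomposition: at each stage Achilles covers the previous gap,
# while the tortoise opens a new gap one tenth as long. The sub-distances
# d1, d2, d3, ... form a geometric series with ratio 1/10.
gap, total = head_start, 0.0
for _ in range(50):                  # 50 stages already pin down the sum
    total += gap
    gap *= tortoise_speed / achilles_speed

print(catch_distance, total)         # both approach 1000/9, about 111.11 m
```

The loop never finishes Zeno’s infinitely many stages, of course; what the computation illustrates is that the partial sums settle on a single finite value, which is exactly the value the convergent geometric series d1 + d2 + d3 + … is defined to have.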

Similar reasoning would apply if Zeno were to have made assumptions (2) or (3) above about there not being enough time for Achilles or there being too many places for him to run. Regarding assumption (4), Zeno’s requirement that there be a final step or final sub-path in Achilles’ run is simply mistaken—according to the Standard Solution. (More will be said about assumption (5) in Section 5c when we discuss supertasks.)

In Zeno’s day, since the mathematicians could make sense only of the sum of a finite number of distances, it was Aristotle’s genius to claim that Achilles covered only a potential infinity of distances, not an actual infinity, since the sum of a potential infinity is a finite number at any time; thus Achilles can in that sense achieve an infinity of tasks while covering a finite distance in a finite duration. When Aristotle made this claim and used it to treat Zeno’s paradoxes, there was no better solution to the Achilles Paradox, and a better solution would not be discovered for many more centuries. In Zeno’s day, no person had a clear notion of continuous space, nor of the limit of an actually infinite series, nor even of zero.

The Achilles Argument, if strengthened and not left as vague as it was in Zeno’s day, presumes that space and time are continuous or infinitely divisible. So, Zeno’s conclusion might have more cautiously asserted that Achilles cannot catch the tortoise if space and time are infinitely divisible in the sense of actual infinity. Perhaps, as some commentators have speculated, Zeno used or should have used the Achilles Paradox only to attack continuous space, and he used or should have used his other paradoxes such as the “Arrow” and the “The Moving Rows” to attack discrete space.

ii. The Dichotomy (The Racetrack)

As Aristotle realized, the Dichotomy Paradox is just the Achilles Paradox in which Achilles stands still ahead of the tortoise. In his Progressive Dichotomy Paradox, Zeno argued that a runner will never reach the stationary goal line on a straight racetrack. The reason is that the runner must first reach half the distance to the goal, but when there he must still cross half the remaining distance to the goal, but having done that the runner must cover half of the new remainder, and so on. If the goal is one meter away, the runner must cover a distance of 1/2 meter, then 1/4 meter, then 1/8 meter, and so on ad infinitum. The runner cannot reach the final goal, says Zeno. Why not? There are few traces of Zeno’s reasoning here, but for reconstructions that give the strongest reasoning, we may say that the runner will not reach the final goal because there is too far to run: the sum of the path lengths is actually infinite. The Standard Solution argues instead that the sum of this infinite geometric series is one, not infinity.

The problem of the runner getting to the goal can be viewed from a different perspective. According to the Regressive version of the Dichotomy Paradox, the runner cannot even take a first step. Here is why. Any step may be divided conceptually into a first half and a second half. Before taking a full step, the runner must take a 1/2 step, but before that he must take a 1/4 step, but before that a 1/8 step, and so forth ad infinitum, so the runner will never get going. Like the Achilles Paradox, this paradox also concludes that any motion is impossible.

The Dichotomy paradox, in either its Progressive version or its Regressive version, assumes here for the sake of simplicity and strength of argumentation that the runner’s positions are point places. Actual runners take up some larger volume, but the assumption of point places is not a controversial assumption because Zeno could have reconstructed his paradox by speaking of the point places occupied by, say, the tip of the runner’s nose or the center of his mass, and this assumption makes for a clearer and stronger paradox.

In the Dichotomy Paradox, the runner reaches the points 1/2 and 3/4 and 7/8 and so forth on the way to his goal, but under the influence of Bolzano and Dedekind and Cantor, who developed the first theory of sets, the set of those points is no longer considered to be potentially infinite. It is an actually infinite set of points abstracted from a continuum of points, in which the word “continuum” is used in the late 19th century sense that is at the heart of calculus. And any ancient idea that the sum of the actually infinite series of path lengths or segments 1/2 + 1/4 + 1/8 + … is infinite now has to be rejected in favor of the theory that the sum converges to 1. This is key to solving the Dichotomy Paradox according to the Standard Solution. It is basically the same treatment as that given to the Achilles. The Dichotomy Paradox has been called “The Stadium” by some commentators, but that name is also commonly used for the Paradox of the Moving Rows, so readers need to be on the alert for ambiguity in the literature.
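The convergence of 1/2 + 1/4 + 1/8 + … to 1 can be verified with exact arithmetic: every partial sum equals 1 - (1/2)^n, so the sums approach 1 without any step ever reaching or exceeding it. A minimal sketch:

```python
from fractions import Fraction

# Partial sums of 1/2 + 1/4 + 1/8 + ... computed exactly with rationals.
# After n terms the partial sum is exactly 1 - (1/2)^n, so the sums converge
# to 1: the remaining shortfall halves at every step and never goes negative.

partial = Fraction(0)
for n in range(1, 21):
    partial += Fraction(1, 2 ** n)

assert partial == 1 - Fraction(1, 2 ** 20)   # exact identity, no rounding
print(float(partial))  # 0.9999990463256836
```

No person has to “add up” infinitely many terms: the sum of the series is defined as the limit of these partial sums, and the exact identity above shows that limit is 1.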

Aristotle, in Physics Z9, said of the Dichotomy that it is possible for a runner to come in contact with a potentially infinite number of things in a finite time provided the time intervals become shorter and shorter. Aristotle said Zeno assumed this is impossible, and that is one of his errors in the Dichotomy. However, Aristotle merely asserted this and could give no detailed theory that enables the computation of the finite amount of time. So, Aristotle could not really defend his diagnosis of Zeno’s error. Today the calculus is used to provide the Standard Solution with that detailed theory.

There is another detail of the Dichotomy that needs resolution. How does Zeno’s runner complete the trip if there is no final step or last member of the infinite sequence of steps (intervals and goals)? Don’t trips need last steps? The Standard Solution answers “no” and says the intuitive answer “yes” is one of many intuitions held by Zeno and Aristotle and the average person today that must be rejected when embracing the Standard Solution.

iii. The Arrow

Zeno’s Arrow Paradox takes a different approach to challenging the coherence of our common sense concepts of time and motion. Think of how you would distinguish an arrow that is stationary in space from one that is flying through space, given that you look only at a snapshot (an instantaneous photo) of them. Would there be any difference? As Aristotle explains, from Zeno’s “assumption that time is composed of moments,” a moving arrow must occupy a space equal to itself during any moment. That is, during any indivisible moment or instant it is at the place where it is. But places do not move. So, if in each moment, the arrow is occupying a space equal to itself, then the arrow is not moving in that moment. The reason it is not moving is that it has no time in which to move; it is simply there at the place. It cannot move during the moment because there is not enough time for any motion, and the moment is indivisible. The same reasoning holds for any other moment during the so-called “flight” of the arrow. So, the arrow is never moving. By a similar argument, Zeno can establish that nothing else moves. The source for Zeno’s argument is Aristotle (Physics, Book VI, chapter 5, 239b5-32).

The Standard Solution to the Arrow Paradox requires appeal to our contemporary theory of speed from calculus. This theory defines instantaneous motion, that is, motion at an instant, without defining motion during an instant. This new treatment of motion originated with Newton and Leibniz in the seventeenth century, and it employs what is called the “at-at” theory of motion, which says motion is being at different places at different times. Motion is not some feature that reveals itself only within a moment. The modern difference between rest and motion, as opposed to the difference in antiquity, has to do with what is happening at nearby moments and—contra Zeno—has nothing to do with what is happening during a moment.

Some researchers have speculated that the Arrow Paradox was designed by Zeno to attack discrete time and space rather than continuous time and space. This is not clear, and the Standard Solution works for both. That is, regardless of whether time is continuous and Zeno’s instant has no finite duration, or time is discrete and Zeno’s instant lasts for, say, 10^-44 seconds, there is insufficient time for the arrow to move during the instant. Yet regardless of how long the instant lasts, there still can be instantaneous motion, namely motion at that instant provided the object is in a different place at some other instant.

To re-emphasize this crucial point, note that both Zeno and 21st century mathematical physicists agree that the arrow cannot be in motion within or during an instant (an instantaneous time), but the physicists will point out that the arrow can be in motion at an instant in the sense of having a positive speed at that instant (its so-called instantaneous speed), provided the arrow occupies different positions at times before or after that instant so that the instant is part of a period in which the arrow is continuously in motion. If we do not pay attention to what happens at nearby instants, it is impossible to distinguish instantaneous motion from instantaneous rest, but distinguishing the two is the way out of the Arrow Paradox. Zeno would have balked at the idea of motion at an instant, and Aristotle explicitly denied it.

The Arrow Paradox is refuted by the Standard Solution with its new at-at theory of motion, but the paradox seems especially strong to someone who would prefer instead to say that motion is an intrinsic property of an instant, being some propensity or disposition to be elsewhere.

Let’s reconsider the details of the Standard Solution assuming continuous motion rather than discrete motion. In calculus, the speed of an object at an instant (its instantaneous speed) is the time derivative of the object’s position; this means the object’s speed is the limit of its series of average speeds during smaller and smaller intervals of time containing the instant. We make essentially the same point when we say the object’s speed is the limit of its average speed over an interval as the length of the interval tends to zero. The derivative of the arrow’s position x with respect to time t, namely dx/dt, is the arrow’s instantaneous speed, and it has non-zero values at specific places at specific instants during the arrow’s flight, contra Zeno and Aristotle. The speed during an instant or in an instant, which is what Zeno is calling for, would be 0/0 and is undefined. But the speed at an instant is well defined. If we require the use of these modern concepts, then Zeno cannot successfully produce a contradiction as he tries to do by his assuming that in each moment the speed of the arrow is zero—because it is not zero. Therefore, advocates of the Standard Solution conclude that Zeno’s Arrow Paradox has a false, but crucial, assumption and so is unsound.
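The limit just described can be illustrated numerically. In this sketch the position function x(t) = t² is a made-up example; the point is only that average speeds over shrinking intervals settle on a definite value at the instant, even though speed during the instant would be the undefined 0/0:

```python
# Instantaneous speed as the limit of average speeds (illustrative sketch;
# the position function is hypothetical, not from the article).

def x(t):
    return t ** 2  # arrow's position in meters at time t

def average_speed(t, h):
    """Average speed over the interval [t, t + h]."""
    return (x(t + h) - x(t)) / h

t0 = 1.0
for h in (1.0, 0.1, 0.001, 1e-6):
    print(h, average_speed(t0, h))

# The averages (3.0, 2.1, 2.001, ...) tend to dx/dt = 2.0 at t0.
# Speed *during* the instant would be 0/0; speed *at* the instant is 2.0.
```

As the interval length h shrinks toward zero, the average speeds converge to the derivative, which is exactly the well-defined "speed at an instant" that the Standard Solution invokes against Zeno.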

Independently of Zeno, the Arrow Paradox was discovered by the Chinese dialectician Kung-sun Lung (Gongsun Long, ca. 325–250 B.C.E.). A lingering philosophical question about the arrow paradox is whether there is a way to properly refute Zeno’s argument that motion is impossible without using the apparatus of calculus.

iv. The Moving Rows (The Stadium)

According to Aristotle (Physics, Book VI, chapter 9, 239b33-240a18), Zeno tried to create a paradox by considering bodies (that is, physical objects) of equal length aligned along three parallel rows within a stadium. One track contains A bodies (three A bodies are shown below); another contains B bodies; and a third contains C bodies. Each body is the same distance from its neighbors along its track. The A bodies are stationary. The Bs are moving to the right, and the Cs are moving with the same speed to the left. Here are two snapshots of the situation, before and after. They are taken one instant apart.

Diagram of Zeno's Moving Rows

Zeno points out that, in the time between the before-snapshot and the after-snapshot, the leftmost C passes two Bs but only one A, contradicting his (very controversial) assumption that the C should take longer to pass two Bs than one A. The usual way out of this paradox is to reject that controversial assumption.

Aristotle argues that how long it takes to pass a body depends on the speed of the body; for example, if the body is coming towards you, then you can pass it in less time than if it is stationary. Today’s analysts agree with Aristotle’s diagnosis, and historically this paradox of motion has seemed weaker than the previous three. This paradox has been called “The Stadium,” but occasionally so has the Dichotomy Paradox.

Some analysts, for example Tannery (1887), believe Zeno may have had in mind that the paradox was supposed to have assumed that both space and time are discrete (quantized, atomized) as opposed to continuous, and Zeno intended his argument to challenge the coherence of the idea of discrete space and time.

Well, the paradox could be interpreted this way. If so, assume the three objects A, B, and C are adjacent to each other in their tracks, and that each A, B, and C body occupies a space that is one atom long. Then, if all motion occurs at the rate of one atom of space in one atom of time, the leftmost C would pass two atoms of B-space in the time it passed one atom of A-space, which contradicts our assumption about rates. There is another paradoxical consequence. Look at the space occupied by the leftmost C object. During the instant of movement, it passes the middle B object, yet there is no time at which the two are adjacent, which is odd.
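The discrete reading can be made concrete with a tiny simulation. This is our own sketch, assuming one-atom bodies at integer positions, with the Bs stepping right and the Cs stepping left by one space-atom per time-atom:

```python
# Moving Rows on a discrete grid (hypothetical atomized space and time;
# all names and positions are illustrative).

A = [0, 1, 2]            # stationary bodies
B = [0, 1, 2]            # will step right
C = [0, 1, 2]            # will step left

B1 = [p + 1 for p in B]  # positions one time-atom later
C1 = [p - 1 for p in C]

# Displacement of the leftmost C relative to the As and to the Bs:
moved_past_A = (C[0] - A[0]) - (C1[0] - A[0])    # 1 atom of A-space
moved_past_B = (C[0] - B[0]) - (C1[0] - B1[0])   # 2 atoms of B-space
print(moved_past_A, moved_past_B)

# In a single indivisible time-atom the C slides 2 atoms along the B row,
# so it passes the middle B with no instant at which the two are aligned.
```

The arithmetic shows both oddities at once: the C covers twice as much B-space as A-space in the same time-atom, and it skips past a B without ever being adjacent to it.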

So, Zeno’s argument can be interpreted as producing a challenge to the idea that space and time are discrete. However, most commentators suspect Zeno himself did not interpret his paradox this way.

b. Paradoxes of Plurality

Zeno’s paradoxes of motion are attacks on the commonly held belief that motion is real, but because motion is a kind of plurality, namely a process along a plurality of places in a plurality of times, they are also attacks on this kind of plurality. Zeno offered more direct attacks on all kinds of plurality. The first is his Paradox of Alike and Unlike.

i. Alike and Unlike

According to Plato in Parmenides 127-9, Zeno argued that the assumption of plurality–the assumption that there are many things–leads to a contradiction. He quotes Zeno as saying: “If things are many, . . . they must be both like and unlike. But that is impossible; unlike things cannot be like, nor like things unlike” (Hamilton and Cairns (1961), 922).

Zeno’s point is this. Consider a plurality of things, such as some people and some mountains. These things have in common the property of being heavy. But if they all have this property in common, then they really are all the same kind of thing, and so are not a plurality. They are a one. By this reasoning, Zeno believes it has been shown that the plurality is one (or the many is not many), which is a contradiction. Therefore, by reductio ad absurdum, there is no plurality, as Parmenides has always claimed.

Plato immediately accuses Zeno of equivocating. A thing can be like some other thing in one respect while being unlike it in a different respect. Your having a property in common with some other thing does not make you identical with that other thing. Consider again our plurality of people and mountains. People and mountains are all alike in being heavy, but are unlike in intelligence. And they are unlike in being mountains; the mountains are mountains, but the people are not. As Plato says, when Zeno tries to conclude “that the same thing is many and one, we shall [instead] say that what he is proving is that something is many and one [in different respects], not that unity is many or that plurality is one….” [129d] So, there is no contradiction, and the paradox is solved by Plato. This paradox is generally considered to be one of Zeno’s weakest paradoxes, and it is now rarely discussed. [See Rescher (2001), pp. 94-6 for some discussion.]

ii. Limited and Unlimited

This paradox is also called the Paradox of Denseness. Suppose there exist many things rather than, as Parmenides would say, just one thing. Then there will be a definite or fixed number of those many things, and so they will be “limited.” But if there are many things, say two things, then they must be distinct, and to keep them distinct there must be a third thing separating them. So, there are three things. But between these, …. In other words, things are dense and there is no definite or fixed number of them, so they will be “unlimited.” This is a contradiction, because the plurality would be both limited and unlimited. Therefore, there are no pluralities; there exists only one thing, not many things. This argument is reconstructed from Zeno’s own words, as quoted by Simplicius in his commentary on Book 1 of Aristotle’s Physics.

According to the Standard Solution to this paradox, the weakness of Zeno’s argument can be said to lie in the assumption that “to keep them distinct, there must be a third thing separating them.” Zeno would have been correct to say that between any two physical objects that are separated in space, there is a place between them, because space is dense, but he is mistaken to claim that there must be a third physical object there between them. Two objects can be distinct at a time simply by one having a property the other does not have.

iii. Large and Small

Suppose there exist many things rather than, as Parmenides says, just one thing. Then every part of any plurality is both so small as to have no size but also so large as to be infinite, says Zeno. His reasoning for why they have no size has been lost, but many commentators suggest that he’d reason as follows. If there is a plurality, then it must be composed of parts which are not themselves pluralities. Yet things that are not pluralities cannot have a size or else they’d be divisible into parts and thus be pluralities themselves.

Now, why are the parts of pluralities so large as to be infinite? Well, the parts cannot be so small as to have no size since adding such things together would never contribute anything to the whole so far as size is concerned. So, the parts have some non-zero size. If so, then each of these parts will have two spatially distinct sub-parts, one in front of the other. Each of these sub-parts also will have a size. The front part, being a thing, will have its own two spatially distinct sub-parts, one in front of the other; and these two sub-parts will have sizes. Ditto for the back part. And so on without end. A sum of all these sub-parts would be infinite. Therefore, each part of a plurality will be so large as to be infinite.

This sympathetic reconstruction of the argument is based on Simplicius’ On Aristotle’s Physics, where Simplicius quotes Zeno’s own words for part of the paradox, although he does not say what he is quoting from.

There are many errors here in Zeno’s reasoning, according to the Standard Solution. He is mistaken at the beginning when he says, “If there is a plurality, then it must be composed of parts which are not themselves pluralities.” A university is an illustrative counterexample. A university is a plurality of students, but we need not rule out the possibility that a student is a plurality. What’s a whole and what’s a plurality depends on our purposes. When we consider a university to be a plurality of students, we consider the students to be wholes without parts. But for another purpose we might want to say that a student is a plurality of biological cells. Zeno is confused about this notion of relativity, and about part-whole reasoning; and as commentators began to appreciate this they lost interest in Zeno as a player in the great metaphysical debate between pluralism and monism.

A second error occurs in arguing that each part of a plurality must have a non-zero size. The contemporary notion of measure (developed in the 20th century by Brouwer, Lebesgue, and others) showed how to properly define the measure function so that a line segment has nonzero measure even though (the singleton set of) any point has a zero measure. The measure of the line segment [a, b] is b – a; the measure of a cube with side a is a^3. This theory of measure is now properly used by our civilization for length, volume, duration, mass, voltage, brightness, and other continuous magnitudes.
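The key property of the modern measure can be stated compactly. As a sketch in standard notation (our own formulation, not a quotation from the article):

```latex
\mu([a,b]) = b - a, \qquad \mu(\{x\}) = 0 \quad \text{for every point } x .
```

Measure is countably additive, so a countable union of disjoint zero-measure points still has measure zero; but a segment contains uncountably many points, and additivity does not extend to uncountable unions. That is why a line segment can be composed of zero-size points and yet have positive length, contra Zeno.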

Thanks to Aristotle’s support, Zeno’s Paradoxes of Large and Small and of Infinite Divisibility (to be discussed below) were generally considered to have shown that a continuous magnitude cannot be composed of points. Interest was rekindled in this topic in the 18th century. The physical objects in Newton’s classical mechanics of 1726 were interpreted by R. J. Boscovich in 1763 as being collections of point masses. Each point mass is a movable point carrying a fixed mass. This idealization of continuous bodies as if they were compositions of point particles was very fruitful; it could be used to easily solve otherwise very difficult problems in physics. This success led scientists, mathematicians, and philosophers to recognize that the strength of Zeno’s Paradoxes of Large and Small and of Infinite Divisibility had been overestimated; they did not prevent a continuous magnitude from being composed of points.

iv. Infinite Divisibility

This is the most challenging of all the paradoxes of plurality. Consider the difficulties that arise if we assume that an object theoretically can be divided into a plurality of parts. According to Zeno, there is a reassembly problem. Imagine cutting the object into two non-overlapping parts, then similarly cutting these parts into parts, and so on until the process of repeated division is complete. Assuming the hypothetical division is “exhaustive” or does come to an end, then at the end we reach what Zeno calls “the elements.” Here there is a problem about reassembly. There are three possibilities. (1) The elements are nothing. In that case the original object will be a composite of nothing, and so the whole object will be a mere appearance, which is absurd. (2) The elements are something, but they have zero size. So, the original object is composed of elements of zero size. Adding an infinity of zeros yields a zero sum, so the original object had no size, which is absurd. (3) The elements are something, but they do not have zero size. If so, these can be further divided, and the process of division was not complete after all, which contradicts our assumption that the process was already complete. In summary, there were three possibilities, but all three possibilities lead to absurdity. So, objects are not divisible into a plurality of parts.

Simplicius says this argument is due to Zeno even though it is in Aristotle (On Generation and Corruption, 316a15-34, 316b34 and 325a8-12) and is not attributed there to Zeno, which is odd. Aristotle says the argument convinced the atomists to reject infinite divisibility. The argument has been called the Paradox of Parts and Wholes, but it has no traditional name.

The Standard Solution says we first should ask Zeno to be clearer about what he is dividing. Is it concrete or abstract? When dividing a concrete, material stick into its components, we reach ultimate constituents of matter such as quarks and electrons that cannot be further divided. These have zero size (according to quantum electrodynamics), but it is incorrect to conclude that the whole stick has no size just because its constituents have zero size. [Due to the forces involved, point particles have finite “cross sections,” and configurations of those particles, such as atoms, do have finite size.] So, Zeno is wrong here. On the other hand, is Zeno dividing an abstract path or trajectory? Let’s assume he is, since this produces a more challenging paradox. If so, then choice (2) above is the one to think about. It’s the one that talks about addition of zeroes. Let’s assume the object is one-dimensional, like a path. According to the Standard Solution, this “object” that gets divided should be considered to be a continuum with its elements arranged into the order type of the linear continuum, and we should use the contemporary notion of measure to find the size of the object. The size (length, measure) of a point-element is zero, but Zeno is mistaken in saying the total size (length, measure) of all the zero-size elements is zero. The size of the object is determined instead by the difference in coordinate numbers assigned to the end points of the object. An object extending along a straight line that has one of its end points at one meter from the origin and its other end point at three meters from the origin has a size of two meters, not zero meters. So, there is no reassembly problem, and a crucial step in Zeno’s argument breaks down.

c. Other Paradoxes

i. The Grain of Millet

There are two common interpretations of this paradox. According to the first, which is the standard interpretation, when a bushel of millet (or wheat) grains falls out of its container and crashes to the floor, it makes a sound. Since the bushel is composed of individual grains, each individual grain also makes a sound, as should each thousandth part of the grain, and so on to its ultimate parts. But this result contradicts the fact that we actually hear no sound for portions like a thousandth part of a grain, and so we surely would hear no sound for an ultimate part of a grain. Yet, how can the bushel make a sound if none of its ultimate parts make a sound? The original source of this argument is Aristotle’s Physics (Book VII, chapter 4, 250a19-21). There seems to be an appeal to the iterative rule that if a grain or a part of a grain makes a sound, then so should the next smaller part.

We do not have Zeno’s words on what conclusion we are supposed to draw from this. Perhaps he would conclude it is a mistake to suppose that whole bushels of millet have millet parts. This is an attack on plurality.

The Standard Solution to this interpretation of the paradox accuses Zeno of mistakenly assuming that there is no lower bound on the size of something that can make a sound. There is no problem, we now say, with parts having very different properties from the wholes that they constitute. The iterative rule is initially plausible but ultimately not trustworthy, and Zeno is committing both the fallacy of division and the fallacy of composition.

Some analysts interpret Zeno’s paradox a second way, as challenging our trust in our sense of hearing, as follows. When a bushel of millet grains crashes to the floor, it makes a sound. The bushel is composed of individual grains, so they, too, make an audible sound. But if you drop an individual millet grain or a small part of one or an even smaller part, then eventually your hearing detects no sound, even though there is one. Therefore, you cannot trust your sense of hearing.

This reasoning about our not detecting low amplitude sounds is similar to making the mistake of arguing that you cannot trust your thermometer because there are some ranges of temperature that it is not sensitive to. So, on this second interpretation, the paradox is also easy to solve. One reason given in the literature for believing that this second interpretation is not the one that Zeno had in mind is that Aristotle’s criticism given below applies to the first interpretation and not the second, and it is unlikely that Aristotle would have misinterpreted the paradox.

ii. Against Place

Given an object, we may assume that there is a single, correct answer to the question, “What is its place?” Because everything that exists has a place, and because place itself exists, it too must have a place, and so on forever. That is too many places, so there is a contradiction. The original source is Aristotle’s Physics (209a23-25 and 210b22-24).

The standard response to Zeno’s Paradox Against Place is to deny that places have places, and to point out that the notion of place should be relative to reference frame. But Zeno’s assumption that places have places was common in ancient Greece at the time, and Zeno is to be praised for showing that it is a faulty assumption.

4. Aristotle’s Treatment of the Paradoxes

Aristotle’s views about Zeno’s paradoxes can be found in his Physics, book 4, chapter 2, and book 6, chapters 2 and 9. Regarding the Dichotomy Paradox, Aristotle is to be applauded for his insight that Achilles has time to reach his goal because during the run ever shorter paths take correspondingly ever shorter times.

Aristotle had several criticisms of Zeno. Regarding the paradoxes of motion, he complained that Zeno should not suppose the runner’s path is dependent on its parts; instead, the path is there first, and the parts are constructed by the analyst. His second complaint was that Zeno should not suppose that lines contain indivisible points. Aristotle’s third and most influential, critical idea involves a complaint about potential infinity. On this point, in remarking about the Achilles Paradox, Aristotle said, “Zeno’s argument makes a false assumption in asserting that it is impossible for a thing to pass over…infinite things in a finite time.” Aristotle believed it is impossible for a thing to pass over an actually infinite number of things in a finite time, but he believed that it is possible for a thing to pass over a potentially infinite number of things in a finite time. Here is how Aristotle expressed the point:

For motion…, although what is continuous contains an infinite number of halves, they are not actual but potential halves. (Physics 263a25-27). …Therefore to the question whether it is possible to pass through an infinite number of units either of time or of distance we must reply that in a sense it is and in a sense it is not. If the units are actual, it is not possible: if they are potential, it is possible. (Physics 263b2-5).

Aristotle denied the existence of the actual infinite both in the physical world and in mathematics, but he accepted potential infinities there. By calling them potential infinities he did not mean they have the potential to become actually infinite; potential infinity is a technical term that suggests a process that has not been completed. The term actual infinite does not imply being actual or real. It implies being complete, with no dependency on some process in time.

A potential infinity is an unlimited iteration of some operation—unlimited in time. Aristotle claimed correctly that if Zeno had not used the concepts of actual infinity and of indivisible points, then the paradoxes of motion such as the Achilles Paradox (and the Dichotomy Paradox) could not be created.

Here is why doing so is a way out of these paradoxes. Zeno said that to go from the start to the finish line, the runner Achilles must reach the place that is halfway-there, then after arriving at this place he still must reach the place that is half of that remaining distance, and after arriving there he must again reach the new place that is now halfway to the goal, and so on. These are too many places to reach. Zeno made the mistake, according to Aristotle, of supposing that this infinite process needs completing when it really does not need completing and cannot be completed; the finitely long path from start to finish exists undivided for the runner, and it is the mathematician who is demanding the completion of such a process. Without using that concept of a completed infinity there is no paradox. Aristotle is correct about this being a treatment that avoids paradox.

Aristotle and Zeno disagree about the nature of division of a runner’s path. Aristotle’s complaint can be expressed succinctly this way: Zeno was correct to suppose that at any time a runner’s path can be divided anywhere, but incorrect to suppose the path can be divided everywhere at the same time.

Today’s standard treatment of the Achilles paradox disagrees with Aristotle’s way out of the paradox and says Zeno was correct to use the concept of a completed infinity and correct to imply that the runner must go to an actual infinity of places in a finite time and correct to suppose the runner’s path can be divided everywhere at the same time.

From what Aristotle says, one can infer, reading between the lines, that he believes there is another reason to reject actual infinities: doing so is the only way out of these paradoxes of motion. Today we know better. There is another way out, namely, the Standard Solution that uses actual infinities, which are analyzable in terms of Cantor’s transfinite sets.

Aristotle’s treatment, which disallowed actual infinity while allowing potential infinity, was clever, and it satisfied nearly all scholars for 1,500 years, being buttressed during that time by the Church’s doctrine that only God is actually infinite. George Berkeley, Immanuel Kant, Carl Friedrich Gauss, and Henri Poincaré were influential defenders of potential infinity. Leibniz accepted actual infinitesimals, but other mathematicians and physicists in European universities during these centuries were careful to distinguish between actual and potential infinities and to avoid using actual infinities.

Given 1,500 years of opposition to actual infinities, the burden of proof was on anyone advocating them. Bernard Bolzano and Georg Cantor accepted this burden in the 19th century. The key idea is to see a potentially infinite set as a variable quantity that is dependent on being abstracted from a pre-existing actually infinite set. Bolzano argued that the natural numbers should be conceived of as a set, a determinate set, not one with a variable number of elements. Cantor argued that any potential infinity must be interpreted as varying over a predefined fixed set of possible values, a set that is actually infinite. He put it this way:

In order for there to be a variable quantity in some mathematical study, the “domain” of its variability must strictly speaking be known beforehand through a definition. However, this domain cannot itself be something variable…. Thus this “domain” is a definite, actually infinite set of values. Thus each potential infinite…presupposes an actual infinite. (Cantor 1887)

From this standpoint, Dedekind’s 1872 axiom of continuity and his definition of real numbers as certain infinite subsets of rational numbers suggested to Cantor and then to many other mathematicians that arbitrarily large sets of rational numbers are most naturally seen to be subsets of an actually infinite set of rational numbers. The same can be said for sets of real numbers. An actually infinite set is what we today call a “transfinite set.” Cantor’s idea is then to treat a potentially infinite set as being a sequence of definite subsets of a transfinite set. Aristotle had said mathematicians need only the concept of a finite straight line that may be produced as far as they wish, or divided as finely as they wish, but Cantor would say that this way of thinking presupposes a completed infinite continuum from which that finite line is abstracted at any particular time.

[When Cantor says the mathematical concept of potential infinity presupposes the mathematical concept of actual infinity, this does not imply that, if future time were to be potentially infinite, then future time also would be actually infinite.]

Dedekind’s primary contribution to our topic was to give the first rigorous definition of infinite set—an actual infinity—showing that the notion is useful and not self-contradictory. Cantor provided the missing ingredient—that the mathematical line can fruitfully be treated as a dense linear ordering of uncountably many points, and he went on to develop set theory and to give the continuum a set-theoretic basis which convinced mathematicians that the concept was rigorously defined.

These ideas now form the basis of modern real analysis. The implication for the Achilles and Dichotomy paradoxes is that, once the rigorous definition of a linear continuum is in place, and once we have Cauchy’s rigorous theory of how to assess the value of an infinite series, then we can point to the successful use of calculus in physical science, especially in the treatment of time and of motion through space, and say that the sequence of intervals or paths described by Zeno is most properly treated as a sequence of subsets of an actually infinite set [that is, Aristotle’s potential infinity of places that Achilles reaches are really a variable subset of an already existing actually infinite set of point places], and we can be confident that Aristotle’s treatment of the paradoxes is inferior to the Standard Solution’s.

Zeno said Achilles cannot achieve his goal in a finite time, but there is no record of the details of how he defended this conclusion. He might have said the reason is (i) that there is no last goal in the sequence of sub-goals, or perhaps (ii) that it would take too long to achieve all the sub-goals, or perhaps (iii) that covering all the sub-paths is too great a distance to run. Zeno might have offered all these defenses. In attacking justification (ii), Aristotle objects that, if Zeno were to confine his notion of infinity to a potential infinity and were to reject the idea of zero-length sub-paths, then Achilles achieves his goal in a finite time, so this is a way out of the paradox. However, an advocate of the Standard Solution says Achilles achieves his goal by covering an actual infinity of paths in a finite time, and this is the way out of the paradox. (The discussion of whether Achilles can properly be described as completing an actual infinity of tasks rather than goals will be considered in Section 5c.) Aristotle’s treatment of the paradoxes is basically criticized for being inconsistent with current standard real analysis that is based upon Zermelo-Fraenkel set theory and its actually infinite sets. To summarize the errors of Zeno and Aristotle in the Achilles Paradox and in the Dichotomy Paradox, they both made the mistake of thinking that if a runner has to cover an actually infinite number of sub-paths to reach his goal, then he will never reach it; calculus shows how Achilles can do this and reach his goal in a finite time, and the fruitfulness of the tools of calculus implies that the Standard Solution is a better treatment than Aristotle’s.

Let’s turn to the other paradoxes. In proposing his treatment of the Paradox of the Large and Small and of the Paradox of Infinite Divisibility, Aristotle said that

…a line cannot be composed of points, the line being continuous and the point indivisible. (Physics, 231a25)

In modern real analysis, a continuum is composed of points, but Aristotle, ever the advocate of common sense reasoning, claimed that a continuum cannot be composed of points. Aristotle believed a line can be composed only of smaller, indefinitely divisible lines and not of points without magnitude. Similarly, a distance cannot be composed of point places, and a duration cannot be composed of instants. This is one of Aristotle’s key errors, according to advocates of the Standard Solution, because by maintaining this common sense view he created an obstacle to the fruitful development of real analysis. In addition to complaining about points, Aristotelians object to the idea of an actually infinite number of them.

In his analysis of the Arrow Paradox, Aristotle said Zeno mistakenly assumes time is composed of indivisible moments, but “This is false, for time is not composed of indivisible moments any more than any other magnitude is composed of indivisibles.” (Physics, 239b8-9) Zeno needs those instantaneous moments so that he can say the arrow does not move during a moment. Aristotle recommends not allowing Zeno to appeal to instantaneous moments and restricting him to saying that motion can be divided only into a potential infinity of intervals. That restriction implies the arrow’s path can be divided only into finitely many intervals at any time. So, at any time, there is a finite interval during which the arrow can exhibit motion by changing location. So the arrow flies, after all. That is, Aristotle declares that Zeno’s argument is based on false assumptions without which there is no problem with the arrow’s motion. However, the Standard Solution agrees with Zeno that time can be composed of indivisible moments or instants, and it implies that Aristotle has mis-diagnosed where the error lies in the Arrow Paradox. Advocates of the Standard Solution would add that allowing a duration to be composed of indivisible moments is what is needed for having a fruitful calculus, and that Aristotle’s recommendation is an obstacle to the development of calculus.

Aristotle’s treatment of the Paradox of the Moving Rows is basically in agreement with the Standard Solution to that paradox: that Zeno did not appreciate the difference between speed and relative speed.

Regarding the Paradox of the Grain of Millet, Aristotle said that parts need not have all the properties of the whole, and so grains need not make sounds just because bushels of grains do. (Physics, 250a, 22) And if the parts make no sounds, we should not conclude that the whole can make no sound. It would have been helpful for Aristotle to have said more about what are today called the Fallacies of Division and Composition that Zeno is committing. However, Aristotle’s response to the Grain of Millet is brief but accurate by today’s standards.

In conclusion, are there two adequate but different solutions to Zeno’s paradoxes, Aristotle’s Solution and the Standard Solution? No. Aristotle’s treatment does not stand up to criticism in a manner that most scholars deem adequate. The Standard Solution uses contemporary concepts that have proved to be more valuable for solving and resolving so many other problems in mathematics and physics. Replacing Aristotle’s common sense concepts with the new concepts from real analysis and classical mechanics has been a key ingredient in the successful development of mathematics and science, and for this reason the vast majority of scientists, mathematicians, and philosophers reject Aristotle’s treatment. Nevertheless, there is a significant minority in the philosophical community who do not agree, as we shall see in the sections that follow.

See Wallace (2003) for a deeper treatment of Aristotle and how the development of the concept of infinity led to the Standard Solution to Zeno’s Paradoxes.

5. Other Issues Involving the Paradoxes

a. Consequences of Accepting the Standard Solution

There is a price to pay for accepting the Standard Solution to Zeno’s Paradoxes. The following intuitions or assumptions, which once were presumably safe, must be rejected:

  1. A continuum is too smooth to be composed of indivisible points.
  2. Runners do not have time to go to an actual infinity of places in a finite time.
  3. The sum of an infinite series of positive terms is always infinite.
  4. For each instant there is a next instant and for each place along a line there is a next place.
  5. A finite distance along a line cannot contain an actually infinite number of points.
  6. The more points there are on a line, the longer the line is.
  7. It is absurd for there to be numbers that are bigger than every integer.
  8. A one-dimensional curve cannot fill a two-dimensional area, nor can an infinitely long curve enclose a finite area.
  9. A whole is always greater than any of its parts.

Item (8) was undermined when it was discovered that the continuum implies the existence of fractal curves. However, the loss of intuition (1) has caused the greatest stir because so many philosophers object to a continuum being constructed from points. Aristotle had said, “Nothing continuous can be composed of things having no parts,” (Physics VI.3 234a 7-8). The Austrian philosopher Franz Brentano believed with Aristotle that scientific theories should be literal descriptions of reality, as opposed to today’s more popular view that theories are idealizations or approximations of reality. Continuity is something given in perception, said Brentano, and not in a mathematical construction; therefore, mathematics misrepresents. In a 1905 letter to Husserl, he said, “I regard it as absurd to interpret a continuum as a set of points.”

But the Standard Solution needs to be thought of as a package to be evaluated in terms of all of its costs and benefits. From this perspective the Standard Solution’s point-set analysis of continua has withstood the criticism and demonstrated its value in mathematics and mathematical physics. As a consequence, advocates of the Standard Solution say we must live with rejecting the nine intuitions listed above, and accept the counterintuitive implications such as there being divisible continua, infinite sets of different sizes, and space-filling curves. They agree with the philosopher W. V. O. Quine who demands that we be conservative when revising the system of claims that we believe and who recommends “minimum mutilation.” Advocates of the Standard Solution say no less mutilation will work satisfactorily.

b. Criticisms of the Standard Solution

Balking at having to reject so many of our intuitions, Henri-Louis Bergson, Max Black, Franz Brentano, L. E. J. Brouwer, Solomon Feferman, William James, Charles S. Peirce, James Thomson, Alfred North Whitehead, and Hermann Weyl argued in different ways that the standard mathematical account of continuity does not apply to physical processes, or is improper for describing those processes. Here are their main reasons: (1) the actual infinite cannot be encountered in experience and thus is unreal, (2) human intelligence is not capable of understanding motion, (3) the sequence of tasks that Achilles performs is finite and the illusion that it is infinite is due to mathematicians who confuse their mathematical representations with what is represented, (4) motion is unitary or “smooth” even though its spatial trajectory is infinitely divisible, (5) treating time as being made of instants is to treat time as static rather than as the dynamic aspect of consciousness that it truly is, (6) actual infinities and the contemporary continuum are not indispensable to solving the paradoxes, and (7) the Standard Solution’s implicit assumption of the primacy of the coherence of the sciences is unjustified because what is really primary is coherence with a priori knowledge and common sense.

See Salmon (1970, Introduction) and Feferman (1998) for a discussion of the controversy about the quality of Zeno’s arguments, and an introduction to its vast literature. This controversy is much less actively pursued in today’s mathematical literature, and hardly at all in today’s scientific literature. A minority of philosophers are actively involved in attempting to retain one or more of the nine intuitions listed in the previous section. An important philosophical issue is whether the paradoxes should be solved by the Standard Solution or instead by assuming that a line is not composed of points but of intervals, and whether use of infinitesimals is essential to a proper understanding of the paradoxes. For an example of how to solve Zeno’s Paradoxes without using the continuum and with retaining Democritus’ intuition that there is a lower limit to the divisibility of space, see “Atoms of Space” in Rovelli’s theory of loop quantum gravity (Rovelli 2017, pp. 169-171).

c. Supertasks and Infinity Machines

In Zeno’s Achilles Paradox, Achilles does not cover an infinite distance, but he does cover an infinite number of distances. In doing so, does he need to complete an infinite sequence of tasks or actions? In other words, assuming Achilles does complete the task of reaching the tortoise, does he thereby complete a supertask, a transfinite number of tasks in a finite time?

Bertrand Russell said “yes.” He argued that it is possible to perform a task in one-half minute, then perform another task in the next quarter-minute, and so on, for a full minute. At the end of the minute, an infinite number of tasks would have been performed. In fact, Achilles does this in catching the tortoise, Russell said. In the mid-twentieth century, Hermann Weyl, Max Black, James Thomson, and others objected, and thus began an ongoing controversy about the number of tasks that can be completed in a finite time.

That controversy has sparked a related discussion about whether there could be a machine that can perform an infinite number of tasks in a finite time. A machine that can is called an infinity machine. In 1954, in an effort to undermine Russell’s argument, the philosopher James Thomson described a lamp that is intended to be a typical infinity machine. Let the machine switch the lamp on for a half-minute; then switch it off for a quarter-minute; then on for an eighth-minute; off for a sixteenth-minute; and so on. Would the lamp be lit or dark at the end of the minute? Thomson argued that it must be one or the other, but it cannot be either, because every period in which it is off is followed by a period in which it is on, and vice versa. He concluded that there can be no such lamp and that the specific mistake in the reasoning was to suppose that it is logically possible to perform a supertask. The implication for Zeno’s paradoxes is that Thomson is denying Russell’s description of Achilles’ task as a supertask, that is, as the completion of an infinite number of sub-tasks in a finite time.

Paul Benacerraf (1962) complains that Thomson’s reasoning is faulty because it fails to notice that the initial description of the lamp determines the state of the lamp at each period in the sequence of switching, but it determines nothing about the state of the lamp at the limit of the sequence, namely at the end of the minute. The lamp could be either on or off at the limit. The limit of the infinite converging sequence is not in the sequence. So, Thomson has not established the logical impossibility of completing this supertask, but only that the setup’s description is not as complete as he had hoped.
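Benacerraf’s point can be made concrete with a small sketch (the function names are mine, introduced only for illustration): the switching schedule fixes the lamp’s state at every finite stage, and the stage times converge to one minute, yet no finite stage occurs at the one-minute mark itself.

```python
# An illustrative model of Thomson's schedule: the lamp is on for 1/2
# minute, off for 1/4, on for 1/8, and so on.

def lamp_state_after(n_switches: int) -> bool:
    """Lamp state (True = on) after n switches; it starts on at t = 0."""
    # Each switch toggles the lamp, so the state depends only on parity.
    return n_switches % 2 == 0

def elapsed_time(n_switches: int) -> float:
    """Total minutes elapsed when the n-th switch occurs."""
    return sum(0.5 ** k for k in range(1, n_switches + 1))

# Every stage time is strictly less than 1 minute, though the times
# converge to 1; the schedule determines the state at each stage but
# says nothing about the state at the limit, t = 1.
print(elapsed_time(20))                             # just under 1 minute
print(lamp_state_after(20), lamp_state_after(21))   # True False
```

Since the limit of the sequence of switching times is not itself in the sequence, nothing in this description settles whether the lamp is lit or dark at the end of the minute, which is exactly Benacerraf’s diagnosis.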

Could some other argument establish this impossibility? Benacerraf suggests that an answer depends on what we ordinarily mean by the term “completing a task.” If the meaning does not require that tasks have minimum times for their completion, then maybe Russell is right that some supertasks can be completed, he says; but if a minimum time is always required, then Russell is mistaken because an infinite time would be required. What is needed is a better account of the meaning of the term “task.” Grünbaum objects to Benacerraf’s reliance on ordinary meaning. “We need to heed the commitments of ordinary language,” says Grünbaum, “only to the extent of guarding against being victimized or stultified by them.”

The Thomson Lamp Argument has generated a sizable literature in philosophy. Here are some of the issues. What is the proper definition of “task”? For example, does it require a minimum amount of time in the physicists’ technical sense of that term? Even if it is physically impossible to flip the switch in Thomson’s lamp because the speed of flipping the toggle switch will exceed the speed of light, suppose physics were different and there were no limit on speed; what then? Is the lamp logically impossible or physically impossible? Is the lamp metaphysically impossible? Was it proper of Thomson to suppose that the question of whether the lamp is lit or dark at the end of the minute must have a determinate answer? Does Thomson’s question have no answer, given the initial description of the situation, or does it have an answer which we are unable to compute? Should we conclude that it makes no sense to divide a finite task into an infinite number of ever shorter sub-tasks? Is there an important difference between completing a countable infinity of tasks and completing an uncountable infinity of tasks? Interesting issues arise when we bring in Einstein’s theory of relativity and consider a bifurcated supertask. This is an infinite sequence of tasks in a finite interval of an external observer’s proper time, but not in the machine’s own proper time. See Earman and Norton (1996) for an introduction to the extensive literature on these topics. Unfortunately, there is no agreement in the philosophical community on most of the questions we’ve just entertained.

d. Constructivism

The spirit of Aristotle’s opposition to actual infinities persists today in the philosophy of mathematics called constructivism. Constructivism is not a precisely defined position, but it implies that acceptable mathematical objects and procedures have to be founded on constructions and not, say, on assuming the object does not exist, then deducing a contradiction from that assumption. Most constructivists believe acceptable constructions must be performable ideally by humans independently of practical limitations of time or money. So they would say potential infinities, recursive functions, mathematical induction, and Cantor’s diagonal argument are constructive, but the following are not: the axiom of choice, the law of excluded middle, the law of double negation, completed infinities, and the classical continuum of the Standard Solution. The implication is that Zeno’s Paradoxes were not solved correctly by using the methods of the Standard Solution. More conservative constructivists, the finitists, would go even further and reject potential infinities because of the human being’s finite computational resources, but this conservative sub-group of constructivists is very much out of favor.

L. E. J. Brouwer’s intuitionism was the leading constructivist theory of the early 20th century. In response to suspicions raised by the discovery of Russell’s Paradox and the introduction into set theory of the controversial non-constructive axiom of choice, Brouwer attempted to place mathematics on what he believed to be a firmer epistemological foundation by arguing that mathematical concepts are admissible only if they can be constructed from, and thus grounded in, an ideal mathematician’s vivid temporal intuitions, which are a priori intuitions of time.

Brouwer’s intuitionistic continuum has the Aristotelian property of unsplitability. In the Standard Solution’s set-theoretic continuum, the closed interval of real numbers from zero to one can be split or cut into (that is, shown to be the union of) the set of numbers in the interval that are less than one-half and the set of those that are greater than or equal to one-half. The corresponding closed interval of the intuitionistic continuum cannot be split this way into two disjoint sets. This unsplitability or inseparability agrees in spirit with Aristotle’s idea of the continuity of a real continuum, but disagrees in spirit with Aristotle’s idea of not allowing the continuum to be composed of points. [For more on this topic, see Posy (2005) pp. 346-7.]

Although everyone agrees that any legitimate mathematical proof must use only a finite number of steps and be constructive in that sense, the majority of mathematicians in the first half of the twentieth century claimed that constructive mathematics could not produce an adequate theory of the continuum because essential theorems would no longer be theorems, and constructivist principles and procedures are too awkward to use successfully. In 1927, David Hilbert exemplified this attitude when he objected that Brouwer’s restrictions on allowable mathematics—such as rejecting proof by contradiction—were like taking the telescope away from the astronomer.

But thanks in large part to the later development of constructive mathematics by Errett Bishop and Douglas Bridges in the second half of the 20th century, most contemporary philosophers of mathematics believe the question of whether constructivism could be successful in the sense of producing an adequate theory of the continuum is still open [see Wolf (2005) p. 346, and McCarty (2005) p. 382], and to that extent so is the question of whether the Standard Solution to Zeno’s Paradoxes needs to be rejected or perhaps revised to embrace constructivism. Frank Arntzenius (2000), Michael Dummett (2000), and Solomon Feferman (1998) have done important philosophical work to promote the constructivist tradition. Nevertheless, the vast majority of today’s practicing mathematicians routinely use nonconstructive mathematics.

e. Nonstandard Analysis

Although Zeno and Aristotle had the concept of small, they did not have the concept of infinitesimally small, which is the informal concept that was used by Leibniz (and Newton) in the development of calculus. In the 19th century, infinitesimals were eliminated from the standard development of calculus due to the work of Cauchy and Weierstrass on defining a derivative in terms of limits using the epsilon-delta method. But in 1881, C. S. Peirce advocated restoring infinitesimals because of their intuitive appeal. Unfortunately, he was unable to work out the details, as were all other mathematicians, until 1960 when Abraham Robinson produced his nonstandard analysis. At that point it was no longer reasonable to say that banishing infinitesimals from analysis was an intellectual advance. What Robinson did was to extend the standard real numbers to include infinitesimals, using this definition: h is infinitesimal if and only if its absolute value is less than 1/n, for every positive standard number n. Robinson went on to create a nonstandard model of analysis using hyperreal numbers. The class of hyperreal numbers contains counterparts of the reals, but in addition it contains any number that is the sum, or difference, of both a standard real number and an infinitesimal number, such as 3 + h and 3 − 4h². The reciprocal of an infinitesimal is an infinite hyperreal number. These hyperreals obey the usual rules of real numbers except for the Archimedean axiom. Infinitesimal distances between distinct points are allowed, unlike with standard real analysis. The derivative is defined in terms of the ratio of infinitesimals, in the style of Leibniz, rather than in terms of a limit as in standard real analysis in the style of Weierstrass.
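The Leibniz-style computation of a derivative from infinitesimals can be imitated on a computer with dual numbers. This is only a loose analogue — dual numbers, in which the infinitesimal satisfies h² = 0, are not Robinson’s hyperreals — and the class and function names below are my own, introduced purely for illustration of the algebraic style.

```python
# A loose computational analogue (dual numbers, not hyperreals) of
# computing a derivative Leibniz-style: evaluate f at x + h, where h is
# a formal infinitesimal with h*h = 0, and read off the h-coefficient.

class Dual:
    """Numbers of the form a + b*h, where h*h = 0."""
    def __init__(self, real, infinitesimal=0.0):
        self.a, self.b = real, infinitesimal

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a1 + b1*h)(a2 + b2*h) = a1*a2 + (a1*b2 + a2*b1)*h, since h*h = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

h = Dual(0.0, 1.0)  # the formal infinitesimal

def derivative(f, x):
    """f'(x), read off as the infinitesimal part of f(x + h)."""
    return f(Dual(x) + h).b

# d/dx of x^3 at x = 2 is 3*x^2 = 12
print(derivative(lambda t: t * t * t, 2.0))  # 12.0
```

The point of the sketch is only that a derivative can be obtained algebraically from an infinitesimal increment, without taking an epsilon-delta limit; genuine nonstandard analysis achieves this with a much richer number system.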

Nonstandard analysis is called “nonstandard” because it was inspired by Thoralf Skolem’s demonstration in 1933 of the existence of models of first-order arithmetic that are not isomorphic to the standard model of arithmetic. What makes them nonstandard is especially that they contain infinitely large (hyper)integers. For nonstandard calculus one needs nonstandard models of real analysis rather than just of arithmetic. An important feature demonstrating the usefulness of nonstandard analysis is that it achieves essentially the same theorems as those in classical calculus. The treatment of Zeno’s paradoxes is interesting from this perspective. See McLaughlin (1994) for how Zeno’s paradoxes may be treated using infinitesimals. McLaughlin believes this approach to the paradoxes is the only successful one, but commentators generally do not agree with that conclusion, and consider it merely to be an alternative solution. See Dainton (2010) pp. 306-9 for some discussion of this.

f. Smooth Infinitesimal Analysis

Abraham Robinson in the 1960s resurrected the infinitesimal as an infinitesimal number, but F. W. Lawvere in the 1970s resurrected the infinitesimal as an infinitesimal magnitude. His work is called “smooth infinitesimal analysis” and is part of “synthetic differential geometry.” In smooth infinitesimal analysis, a curved line is composed of infinitesimal tangent vectors. One significant difference from a nonstandard analysis, such as Robinson’s above, is that all smooth curves are straight over infinitesimal distances, whereas Robinson’s can curve over infinitesimal distances. In smooth infinitesimal analysis, Zeno’s arrow does not have time to change its speed during an infinitesimal interval. Smooth infinitesimal analysis retains the intuition that a continuum should be smoother than the continuum of the Standard Solution. Unlike both standard analysis and nonstandard analysis whose real number systems are set-theoretical entities and are based on classical logic, the real number system of smooth infinitesimal analysis is not a set-theoretic entity but rather an object in a topos of category theory, and its logic is intuitionist (Harrison, 1996, p. 283). Like Robinson’s nonstandard analysis, Lawvere’s smooth infinitesimal analysis may also be a promising approach to a foundation for real analysis and thus to solving Zeno’s paradoxes, but there is no consensus that Zeno’s Paradoxes need to be solved this way. For more discussion see note 11 in Dainton (2010) pp. 420-1.

6. The Legacy and Current Significance of the Paradoxes

What influence has Zeno had? He had none in the East, but in the West there has been continued influence and interest up to today.

Let’s begin with his influence on the ancient Greeks. Before Zeno, philosophers expressed their philosophy in poetry, and he was the first philosopher to use prose arguments. This new method of presentation was destined to shape almost all later philosophy, mathematics, and science. Zeno drew new attention to the idea that the way the world appears to us is not how it is in reality. Zeno probably also influenced the Greek atomists to accept atoms. Aristotle was influenced by Zeno to use the distinction between actual and potential infinity as a way out of the paradoxes, and careful attention to this distinction has influenced mathematicians ever since. The proofs in Euclid’s Elements, for example, used only potentially infinite procedures. Awareness of Zeno’s paradoxes made Greek and all later Western intellectuals more aware that mistakes can be made when thinking about infinity, continuity, and the structure of space and time, and it made them wary of any claim that a continuous magnitude could be made of discrete parts. “Zeno’s arguments, in some form, have afforded grounds for almost all theories of space and time and infinity which have been constructed from his time to our own,” said Bertrand Russell in the twentieth century.

There is controversy in 20th and 21st century literature about whether Zeno developed any specific, new mathematical techniques. Most scholars say the conscious use of the method of indirect argumentation arose in both mathematics and Zeno’s philosophy independently of each other. See Hintikka (1978) for a discussion of this controversy about origins. Everyone agrees the method was Greek and not Babylonian, as was the method of proving something by deducing it from explicitly stated assumptions. G. E. L. Owen (Owen 1958, p. 222) argued that Zeno influenced Aristotle’s concept of there being no motion at an instant, which implies there is no instant when a body begins to move, nor an instant when a body changes its speed. Consequently, says Owen, Aristotle’s conception is an obstacle to a Newton-style concept of acceleration, and this hindrance is “Zeno’s major influence on the mathematics of science.” Other commentators consider Owen’s remark slightly harsh on Zeno, asking whether, if Zeno had not been born, Aristotle would have been likely to develop any other concept of motion.

Zeno’s paradoxes have received some explicit attention from scholars throughout later centuries. Pierre Gassendi in the early 17th century mentioned Zeno’s paradoxes as the reason to claim that the world’s atoms must not be infinitely divisible. Pierre Bayle’s 1696 article on Zeno drew the skeptical conclusion that, for the reasons given by Zeno, the concept of space is contradictory. In the early 19th century, Hegel suggested that Zeno’s paradoxes supported his view that reality is inherently contradictory.

Zeno’s paradoxes caused mistrust in infinites, and this mistrust has influenced the contemporary movements of constructivism, finitism, and nonstandard analysis, all of which affect the treatment of Zeno’s paradoxes. Dialetheism, the acceptance of true contradictions via a paraconsistent formal logic, provides a newer, although unpopular, response to Zeno’s paradoxes, but dialetheism was not created specifically in response to worries about Zeno’s paradoxes. With the introduction in the 20th century of thought experiments about supertasks, interesting philosophical research has been directed towards understanding what it means to complete a task.

Zeno’s paradoxes are often pointed to as a case study in how a philosophical problem can be solved, even though the solution took over two thousand years to materialize.

So, Zeno’s paradoxes have had a wide variety of impacts upon subsequent research. Little research today is involved directly in how to solve the paradoxes themselves, especially in the fields of mathematics and science, although discussion continues in philosophy, primarily on whether a continuous magnitude should be composed of discrete magnitudes, such as whether a line should be composed of points. If there are alternative treatments of Zeno’s paradoxes, then this raises the issue of whether there is a single solution to the paradoxes or several solutions or one best solution. The answer to whether the Standard Solution is the correct solution to Zeno’s paradoxes may also depend on whether the best physics of the future that reconciles the theories of quantum mechanics and general relativity will require us to assume spacetime is composed at its most basic level of points, or, instead, of regions or loops or something else.

From the perspective of the Standard Solution, the most significant lesson learned by researchers who have tried to solve Zeno’s paradoxes is that the way out requires revising many of our old theories and their concepts. We have to be willing to rank the virtues of preserving logical consistency and promoting scientific fruitfulness above the virtue of preserving our intuitions. Zeno played a significant role in causing this progressive trend.

7. References and Further Reading

  • Arntzenius, Frank. (2000) “Are there Really Instantaneous Velocities?”, The Monist 83, pp. 187-208.
    • Examines the possibility that a duration does not consist of points, that every part of time has a non-zero size, that real numbers cannot be used as coordinates of times, and that there are no instantaneous velocities at a point.
  • Barnes, J. (1982). The Presocratic Philosophers, Routledge & Kegan Paul: Boston.
    • A well respected survey of the philosophical contributions of the Pre-Socratics.
  • Barrow, John D. (2005). The Infinite Book: A Short Guide to the Boundless, Timeless and Endless, Pantheon Books, New York.
    • A popular book in science and mathematics introducing Zeno’s Paradoxes and other paradoxes regarding infinity.
  • Benacerraf, Paul (1962). “Tasks, Super-Tasks, and the Modern Eleatics,” The Journal of Philosophy, 59, pp. 765-784.
    • An original analysis of Thomson’s Lamp and supertasks.
  • Bergson, Henri (1946). Creative Mind, translated by M. L. Andison. Philosophical Library: New York.
    • Bergson demands the primacy of intuition in place of the objects of mathematical physics.
  • Black, Max (1950-1951). “Achilles and the Tortoise,” Analysis 11, pp. 91-101.
    • A challenge to the Standard Solution to Zeno’s paradoxes. Black argues that Achilles did not need to complete an infinite number of sub-tasks in order to catch the tortoise.
  • Cajori, Florian (1920). “The Purpose of Zeno’s Arguments on Motion,” Isis, vol. 3, no. 1, pp. 7-20.
    • An analysis of the debate regarding the point Zeno is making with his paradoxes of motion.
  • Cantor, Georg (1887). “Über die verschiedenen Ansichten in Bezug auf die actualunendlichen Zahlen.” Bihang till Kongl. Svenska Vetenskaps-Akademien Handlingar, Bd. 11 (1886-7), article 19. P. A. Norstedt & Söner: Stockholm.
    • A very early description of set theory and its relationship to old ideas about infinity.
  • Chihara, Charles S. (1965). “On the Possibility of Completing an Infinite Process,” Philosophical Review 74, no. 1, pp. 74-87.
    • An analysis of what we mean by “task.”
  • Copleston, Frederick, S.J. (1962). “The Dialectic of Zeno,” chapter 7 of A History of Philosophy, Volume I, Greece and Rome, Part I, Image Books: Garden City.
    • Copleston says Zeno’s goal is to challenge the Pythagoreans who denied empty space and accepted pluralism.
  • Dainton, Barry (2010). Time and Space, Second Edition, McGill-Queen’s University Press: Ithaca.
    • Chapters 16 and 17 discuss Zeno’s Paradoxes.
  • Dauben, J. (1990). Georg Cantor, Princeton University Press: Princeton.
    • Contains Kronecker’s threat to write an article showing that Cantor’s set theory has “no real significance.” Ludwig Wittgenstein was another vocal opponent of set theory.
  • De Boer, Jesse (1953). “A Critique of Continuity, Infinity, and Allied Concepts in the Natural Philosophy of Bergson and Russell,” in Return to Reason: Essays in Realistic Philosophy, John Wild, ed., Henry Regnery Company: Chicago, pp. 92-124.
    • A philosophical defense of Aristotle’s treatment of Zeno’s paradoxes.
  • Diels, Hermann and W. Kranz (1951). Die Fragmente der Vorsokratiker, sixth ed., Weidmannsche Buchhandlung: Berlin.
    • A standard edition of the pre-Socratic texts.
  • Dummett, Michael (2000). “Is Time a Continuum of Instants?,” Philosophy, 2000, Cambridge University Press: Cambridge, pp. 497-515.
    • Promoting a constructive foundation for mathematics, Dummett’s formalism implies there are no durationless instants, so times must have rational values rather than real values. Times have only the values that they can in principle be measured to have; and all measurements produce rational numbers within a margin of error.
  • Earman J. and J. D. Norton (1996). “Infinite Pains: The Trouble with Supertasks,” in Paul Benacerraf: the Philosopher and His Critics, A. Morton and S. Stich (eds.), Blackwell: Cambridge, MA, pp. 231-261.
    • A criticism of Thomson’s interpretation of his infinity machines and the supertasks involved, plus an introduction to the literature on the topic.
  • Feferman, Solomon (1998). In the Light of Logic, Oxford University Press, New York.
    • A discussion of the foundations of mathematics and an argument for semi-constructivism in the tradition of Kronecker and Weyl, that the mathematics used in physical science needs only the lowest level of infinity, the infinity that characterizes the whole numbers. Presupposes considerable knowledge of mathematical logic.
  • Freeman, Kathleen (1948). Ancilla to the Pre-Socratic Philosophers, Harvard University Press: Cambridge, MA. Reprinted in paperback in 1983.
    • One of the best sources in English of primary material on the Pre-Socratics.
  • Grünbaum, Adolf (1967). Modern Science and Zeno’s Paradoxes, Wesleyan University Press: Middletown, Connecticut.
    • A detailed defense of the Standard Solution to the paradoxes.
  • Grünbaum, Adolf (1970). “Modern Science and Zeno’s Paradoxes of Motion,” in (Salmon, 1970), pp. 200-250.
    • An analysis of arguments by Thomson, Chihara, Benacerraf and others regarding the Thomson Lamp and other infinity machines.
  • Hamilton, Edith and Huntington Cairns (1961). The Collected Dialogues of Plato Including the Letters, Princeton University Press: Princeton.
  • Harrison, Craig (1996). “The Three Arrows of Zeno: Cantorian and Non-Cantorian Concepts of the Continuum and of Motion,” Synthese, Volume 107, Number 2, pp. 271-292.
    • Considers smooth infinitesimal analysis as an alternative to the classical Cantorian real analysis of the Standard Solution.
  • Heath, T. L. (1921). A History of Greek Mathematics, Vol. I, Clarendon Press: Oxford. Reprinted 1981.
    • Promotes the minority viewpoint that Zeno had a direct influence on Greek mathematics, for example by eliminating the use of infinitesimals.
  • Hintikka, Jaakko, David Gruender and Evandro Agazzi. Theory Change, Ancient Axiomatics, and Galileo’s Methodology, D. Reidel Publishing Company, Dordrecht.
    • A collection of articles that discuss, among other issues, whether Zeno’s methods influenced the mathematicians of the time or whether the influence went in the other direction. See especially the articles by Karel Berka and Wilbur Knorr.
  • Kirk, G. S., J. E. Raven, and M. Schofield, eds. (1983). The Presocratic Philosophers: A Critical History with a Selection of Texts, Second Edition, Cambridge University Press: Cambridge.
    • A good source in English of primary material on the Pre-Socratics with detailed commentary on the controversies about how to interpret various passages.
  • Maddy, Penelope (1992). “Indispensability and Practice,” Journal of Philosophy, vol. 89, pp. 275-289.
    • Explores the implication of arguing that theories of mathematics are indispensable to good science, and that we are justified in believing in the mathematical entities used in those theories.
  • Matson, Wallace I (2001). “Zeno Moves!” pp. 87-108 in Essays in Ancient Greek Philosophy VI: Before Plato, ed. by Anthony Preus, State University of New York Press: Albany.
    • Matson supports Tannery’s non-classical and minority interpretation that Zeno’s purpose was to show only that the opponents of Parmenides are committed to absurdly denying motion, and that Zeno himself never denied motion, nor did Parmenides.
  • McCarty, D.C. (2005). “Intuitionism in Mathematics,” in The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, Oxford, pp. 356-86.
    • Argues that a declaration of death of the program of founding mathematics on an intuitionistic basis is premature.
  • McLaughlin, William I. (1994). “Resolving Zeno’s Paradoxes,” Scientific American, vol. 271, no. 5, Nov., pp. 84-90.
    • How Zeno’s paradoxes may be explained using a contemporary theory of Leibniz’s infinitesimals.
  • Owen, G.E.L. (1958). “Zeno and the Mathematicians,” Proceedings of the Aristotelian Society, New Series, vol. LVIII, pp. 199-222.
    • Argues that Zeno and Aristotle negatively influenced the development of the Renaissance concept of acceleration that was used so fruitfully in calculus.
  • Posy, Carl. (2005). “Intuitionism and Philosophy,” in The Oxford Handbook of Philosophy of Mathematics and Logic, edited by Stewart Shapiro, Oxford University Press, Oxford, pp. 318-54.
    • Contains a discussion of how the unsplitability of Brouwer’s intuitionistic continuum makes precise Aristotle’s notion that “you can’t cut a continuous medium without some of it clinging to the knife,” on pages 345-7.
  • Proclus (1987). Proclus’ Commentary on Plato’s Parmenides, translated by Glenn R. Morrow and John M. Dillon, Princeton University Press: Princeton.
    • A detailed list of every comment made by Proclus about Zeno is available with discussion starting on p. xxxix of the Introduction by John M. Dillon. Dillon focuses on Proclus’ comments which are not clearly derivable from Plato’s Parmenides, and concludes that Proclus had access to other sources for Zeno’s comments, most probably Zeno’s original book or some derivative of it. William Moerbeke’s overly literal translation in 1285 from Greek to Latin of Proclus’ earlier, but now lost, translation of Plato’s Parmenides is the key to figuring out the original Greek. (see p. xliv)
  • Rescher, Nicholas (2001). Paradoxes: Their Roots, Range, and Resolution, Carus Publishing Company: Chicago.
    • Pages 94-102 apply the Standard Solution to all of Zeno’s paradoxes. Rescher calls the Paradox of Alike and Unlike the “Paradox of Differentiation.”
  • Rovelli, Carlo (2017). Reality is Not What It Seems: The Journey to Quantum Gravity, Riverhead Books: New York.
    • Rovelli’s chapter 6 explains how the theory of loop quantum gravity provides a new solution to Zeno’s Paradoxes that is more in tune with the intuitions of Democritus because it rejects the assumption that a bit of space can always be subdivided.
  • Russell, Bertrand (1914). Our Knowledge of the External World as a Field for Scientific Method in Philosophy, Open Court Publishing Co.: Chicago.
    • Russell champions the use of contemporary real analysis and physics in resolving Zeno’s paradoxes.
  • Salmon, Wesley C., ed. (1970). Zeno’s Paradoxes, The Bobbs-Merrill Company, Inc.: Indianapolis and New York. Reprinted in paperback in 2001.
    • A collection of the most influential articles about Zeno’s Paradoxes from 1911 to 1965. Salmon provides an excellent annotated bibliography of further readings.
  • Szabo, Arpad (1978). The Beginnings of Greek Mathematics, D. Reidel Publishing Co.: Dordrecht.
    • Contains the argument that Parmenides discovered the method of indirect proof by using it against Anaximenes’ cosmogony, although it was better developed in prose by Zeno. Also argues that Greek mathematicians did not originate the idea but learned of it from Parmenides and Zeno (pp. 244-250). These arguments are challenged in Hintikka (1978).
  • Tannery, Paul (1885). “Le Concept Scientifique du continu: Zénon d’Élée et Georg Cantor,” pp. 385-410 of Revue Philosophique de la France et de l’Étranger, vol. 20, Les Presses Universitaires de France: Paris.
    • This mathematician gives the first argument that Zeno’s purpose was not to deny motion but rather to show only that the opponents of Parmenides are committed to denying motion.
  • Tannery, Paul (1887). Pour l’Histoire de la Science Hellène: de Thalès à Empédocle, Alcan: Paris. 2nd ed. 1930.
    • More development of the challenge to the classical interpretation of what Zeno’s purposes were in creating his paradoxes.
  • Thomson, James (1954-1955). “Tasks and Super-Tasks,” Analysis, XV, pp. 1-13.
    • A criticism of supertasks. The Thomson Lamp thought-experiment is used to challenge Russell’s characterization of Achilles as being able to complete an infinite number of tasks in a finite time.
  • Tiles, Mary (1989). The Philosophy of Set Theory: An Introduction to Cantor’s Paradise, Basil Blackwell: Oxford.
    • A philosophically oriented introduction to the foundations of real analysis and its impact on Zeno’s paradoxes.
  • Vlastos, Gregory (1967). “Zeno of Elea,” in The Encyclopedia of Philosophy, Paul Edwards (ed.), The Macmillan Company and The Free Press: New York.
    • A clear, detailed presentation of the paradoxes. Vlastos comments that Aristotle does not consider any other treatment of Zeno’s paradoxes than by recommending replacing Zeno’s actual infinities with potential infinities, so we are entitled to assert that Aristotle probably believed denying actual infinities is the only route to a coherent treatment of infinity. Vlastos also comments that “there is nothing in our sources that states or implies that any development in Greek mathematics (as distinct from philosophical opinions about mathematics) was due to Zeno’s influence.”
  • Wallace, David Foster. (2003). A Compact History of ∞, W. W. Norton and Company: New York.
    • A clear and sophisticated treatment of how a deeper understanding of infinity led to the solution to Zeno’s Paradoxes. Highly recommended.
  • White, M. J. (1992). The Continuous and the Discrete: Ancient Physical Theories from a Contemporary Perspective, Clarendon Press: Oxford.
    • A presentation of various attempts to defend finitism, neo-Aristotelian potential infinities, and the replacement of the infinite real number field with a finite field.
  • Wisdom, J. O. (1953). “Berkeley’s Criticism of the Infinitesimal,” The British Journal for the Philosophy of Science, Vol. 4, No. 13, pp. 22-25.
    • Wisdom clarifies the issue behind George Berkeley’s criticism (in 1734 in The Analyst) of the use of the infinitesimal (fluxion) by Newton and Leibniz. See also the references there to Wisdom’s other three articles on this topic in the journal Hermathena in 1939, 1941 and 1942.
  • Wolf, Robert S. (2005). A Tour Through Mathematical Logic, The Mathematical Association of America: Washington, DC.
    • Chapter 7 surveys nonstandard analysis, and Chapter 8 surveys constructive mathematics, including the contributions by Errett Bishop and Douglas Bridges.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

José Ortega y Gasset (1883—1955)

In the roughly 6,000 pages that Spanish philosopher José Ortega y Gasset wrote on the humanities, he covered a wide variety of topics. This breadth captures the kind of thinker he was: one who cannot be strictly assigned to any one school of philosophy. Ortega did not want to confine himself to any one area of study in his unending dialogue to better understand what was of central importance to him: what it means to be human. He wrote on philosophy, history, literary criticism, sociology, travel writing, the philosophy of life, phenomenology, society, politics, the press, and the novel, to name some of the varied topics he explored. He held various identities: he was a philosopher, educator, essayist, theorist, critic, editor, and politician. He did not strive to be a “professional philosopher”; rather, he aimed to be a “philosophical essayist.” While there were many reasons for this, one of central importance was his hope that with shorter texts he could reach more people. He wanted to have this dialogue not only with influential thinkers from the past, but with his readers as well. Ortega was not only one of the most important philosophers of the twentieth century from continental Europe, but he also had an important impact on Latin American philosophy, most especially in introducing existentialism and perspectivism.

Table of Contents

  1. Biography
  2. Philosophy of Life
    1. The Individual
    2. Society and The Revolt of the Masses
    3. The Mission of the University
  3. Perspectivism
    1. Ideas and Beliefs
  4. Theory of History
  5. Aesthetics
  6. Philosophy
  7. The History of Philosophy
  8. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

José Ortega y Gasset was born in 1883 in Madrid and died there in 1955 after spending many years of his life in various other countries. Throughout his life, Ortega was involved in the newspaper industry. From an early age he was exposed to what it took to run and write for a newspaper, which arguably had an important impact on his writing style. His grandfather was the founder of what was for a time a renowned daily paper, El Imparcial, for which Ortega wrote his first article in 1904—the same year he received his Doctorate in Philosophy from the Central University of Madrid. His dissertation was titled The Terrors of the Year One Thousand, and in it we see an early interest in a topic that he would explore profoundly: the philosophy of history. While he was finishing his dissertation, he also met his future wife, Rosa Spottorno, with whom he had three children.

Ortega spent time abroad in Germany, France, Argentina, and Portugal. Some of those years were spent in exile, as he was a staunch critic of Spanish politics across the spectrum. Though he wavered at times as to which political philosophy he was most vocal about, he was quite vociferous against both communism and fascism. Thus, it is not always clear what Ortega’s political views were, and his work was at times misappropriated by some important politicians of the day. For example, José Antonio Primo de Rivera, the son of the military dictator Miguel Primo de Rivera and founder of the Falange, the Spanish Fascist party, arguably distorted Ortega’s political philosophy to suit his own needs. For a time, Ortega supported Rivera, but he came to be vehemently opposed to any kind of one-man rule. He was also initially in support of the Falange and General Francisco Franco, but eventually became very disillusioned with them as well. During the Spanish Civil War of 1936-1939, however, he remained largely silent, probably to express his dissatisfaction with the aims of both sides. Still, given the ambiguities in his writings, they were misappropriated by both ends of the political spectrum in Spain. This confusion can also be seen in remarks such as his professing ‘socialist leanings for the love of aristocracy.’ Ortega retired from politics in 1933, as he was ultimately more interested in bringing about social and cultural change through education. After 1936, the great majority of his writing was of a philosophical nature.

His silence on certain issues, such as the Spanish Civil War and Hitler and the Nazis, also brought him some controversy. He did have bouts of depression, which may have coincided at times with this lack of commentary on the Second World War. At times he longed to be somewhere that offered some sort of neutrality, which was part of what drew him to Argentina. From 1936 until his death in 1955, he suffered from poor health, and his productivity declined dramatically. Still, his lack of outspoken criticism of Nazism has never been fully explained. War was one of the few central topics of his day on which he wrote little, presumably because he held that words cannot compete with weapons in a time of war.

He also had periods in which he leaned toward socialism. But essentially none of the traditional or dominant political views of the time would suffice, and ultimately he promoted his own version of a meritocracy, given his dissatisfaction with democracy, capitalism, bolshevism, and fascism, and his revulsion at the type of mass-person that had developed during his lifetime. Despite much confusion regarding his political views, they can perhaps best be summarized as the promotion of a cultured minority whose economic and intellectual benefits trickle down to the rest of society. The best classification may be a meritocratic version of liberalism centered on the select individual.

In 1905 he began the first of three periods of study in Germany, an eight-month stay at the University of Leipzig. In 1907 he returned to Germany with a stipend and began his studies at the University of Berlin. Six months later he went to the University of Marburg, and this experience was particularly influential. He was initially quite drawn to the Neo-Kantianism prevalent there through his studies under Hermann Cohen and Paul Natorp. This influence of idealism is quite prevalent in his first book, Meditations on Quixote, published in 1914 (though he would later critique idealism strongly). It was also during this time that he discovered Husserl’s phenomenology and its distinct concept of consciousness, which would have an important impact on his philosophical perspective, both as an influence and as a target of critique.

In 1910 he returned to the University of Madrid as a professor of metaphysics, and in that same year he married Rosa Spottorno. This new position was soon interrupted by a third trip to Germany in 1911, which was both an opportunity for a honeymoon and a chance to continue his studies in Marburg. Ortega and Spottorno’s first son, whom they named Miguel Alemán, was born during this last extended period abroad in Germany. Miguel’s second name translates as “German,” which shows Ortega’s great interest in the nation, which would come to serve as a model state for him in many ways. Ortega was firmly focused on his goal of modernizing Spain, which he saw as greatly behind many other European nations.

From 1932 until the beginning of the Spanish Civil War in 1936, he made a shift away from idealism to emphasize an “I” that is inextricably immersed in circumstances. He developed a new focus for his philosophy: that of historical reason, a position greatly influenced by one of his most admired thinkers, Wilhelm Dilthey. His objective was to develop a philosophy that was neither idealist nor realist.

The rise of the Spanish dictator Francisco Franco at the end of the Civil War in 1939 was the main reason for his voluntary exile in Argentina and Portugal until 1945. His return to Spain thereafter was not a peaceful one. He had made political enemies across the spectrum and, as a result, struggled to write and teach freely. He decided to continue to travel and lecture elsewhere. During these later years he also received two honorary doctorates: one from the University of Marburg and another from the University of Glasgow. Ortega suffered from poor health, especially in the last two decades of his life, which prevented him from traveling more extensively, including taking up an invitation to teach at Harvard. He did make his first trip to the United States in 1949, when he spoke at a conference in Aspen on Goethe. In 1951 he participated in a conference with Heidegger in Darmstadt. He gave his last lecture in Venice in 1955 before succumbing to stomach and liver cancer.

Ortega was a prolific writer—in total, his works cover about 6,000 pages. Most of these works span from the year 1914, when he published his first book, Meditations on Quixote, to the 1940s, but there were also several important posthumous publications. Ortega cannot be readily classified because he wrote about such a broad range of subjects that included philosophy, history, literary criticism, sociology, and travel writing. He was a philosopher, educator, essayist, theorist, critic, editor, and politician. He wrote on the philosophy of life; human life is central in his thought. Some of the varied topics he explored were the human body, history and its categories, phenomenology, society, politics, the press, and the novel. Always part of a newspaper and magazine family, Ortega was primarily an essayist—for this reason, some label him as being a writer of “philosophical journalism.” One of his main goals was to have a dialogue with his readers.

While he did not claim to adhere to any one philosophical movement, given the important role that history plays in his philosophy, we should certainly not ignore the influence of other thinkers on him. Influences on Ortega include Neo-Kantianism, which he studied in Marburg with Hermann Cohen and Paul Natorp, as well as the phenomenology of Edmund Husserl. Additional crucial influences include Wilhelm Dilthey especially, as well as Gottfried Leibniz, Friedrich Nietzsche, Johann Fichte, Georg Hegel, Franz Brentano, Georg Simmel, Benedetto Croce, R. G. Collingwood, and, in his native Spain, Miguel de Unamuno, Francisco Giner de los Ríos, Joaquín Costa, Azorín, Ramiro de Maeztu, and Pío Baroja, to note some key figures.

Ortega himself left a lasting impact on other important Spanish intellectuals, such as Ramón Pérez de Ayala, Eugenio d’Ors, Américo Castro, and Julián Marías. Several of his disciples, including Eugenio Imaz, José Gaos, Ignacio Ellacuría, Joaquín Xirau, Eduardo Nicol, José Ferrater Mora, María Zambrano, and Antonio Rodríguez Huéscar, emigrated to the Latin American countries of Mexico, Venezuela, El Salvador, and Argentina, continuing to add to the philosophical landscape abroad.

In Latin America, there were several important thinkers and historical figures directly influenced by Ortega, such as Samuel Ramos and Leopoldo Zea in Mexico, Luis Recasens Siches from Guatemala—and even the Puerto Rican politician Jaime Benítez (1908-2001) wanted his nation to be like an “Orteguian Weimar.”

2. Philosophy of Life

For Ortega, the activity of philosophy is intimately connected to human life, and metaphysics is a central source of study for how human beings address existential concerns. The first, and therefore radical, reality is the living self, and all else radiates from it. The task of the philosopher is to study this radical reality. In Ortega’s philosophy, metaphysics is a ‘construction of the world,’ and this construction is done within circumstances. The world is not given to us ready-made; it is an interpretation made within human life. No matter how unrefined or inaccurate our interpretations may be, we must make them. This interpretation is how human beings resolve the need to navigate their circumstances, a problem of absolute insecurity that must be solved. We may not be able to choose all our circumstances, but we are free in our actions, in the choice of possibilities that lie before us—this is what most strikingly makes us different, he says, from animals, plants, or stones. This emphasis on (limited) human freedom and choice is a principal reason why he is often classified as an “existentialist thinker.” Humans must make their world in order to save themselves and install themselves within it, he argues. And nobody but the sole individual can do this.

Metaphysics consists of individuals working out the radical orientation of their situation, because life is disorientation; life is a problem that needs to be solved. The human individual comes first: to understand “what things are,” he argues, we must first understand “what I am.” The radical reality is the life of each individual, which is each individual’s “I” inextricably living within circumstances. This distinction also marks a break with the earlier influence of phenomenology. For Ortega, human reality is not solipsistic, as many critics charged was the case with Husserl’s method (though Husserl did try to respond to this as a misreading of his view). Yet neither did Ortega fully reject phenomenology, which continued to inform his view that we are our beliefs—a central position elaborated further ahead.

The individual human life is the fundamental reality in Ortega’s philosophy. There is no absolute reason (or at least none that we can know); there is only reason that stems directly from the individual human life. We can never escape our life, just as we can never escape our shadow. There is no absolute, objective truth (or at least none that we can know); there is just the perspective of the individual human life. This of course invites the critique that it is contradictory to claim that there are no absolutes, since that very claim is itself an absolute. However, what Ortega seems to argue is not a denial of the possibility of objective truths or absolutes, but only that, at least for the time being, we cannot know anything beyond each of our own perspectives. Moreover, he is not a staunch relativist, as he argues that there is a hierarchy of perspectives. Life, which is always “my life,” what I call my “I,” is a vital activity that also always finds itself in circumstances. Some circumstances we can choose and others we cannot, but we are always ‘an I within circumstances’—hence his central dictum: “I am my self and my circumstance” (Yo soy yo y mi circunstancia). In a very existentialist theme, a pre-determined life is not given to us; what is given is the need to do something, and so life is always a ‘having to do something’ within inescapable circumstances. Thus, life is not merely the self; life is both the self and the circumstance. This is his position of “razón vital,” perhaps best translated as “reason from life’s point of view.” Everything we find is found in our individual lives, and the meaning we attach to things depends on what they mean in our lives (arguably a more pragmatist than existentialist or phenomenological stance). By the mid-1930s, he would develop this further, adding the importance of narrative reason to better contemplate what it means to be human. This he titled “razón vital e histórica,” or “historical and vital reason.”

Humankind is an entity that makes itself, he argued. Humans are beings that are not-yet-being, as we are constantly engaged in having to navigate through circumstances. He describes this navigation like being lost at sea, or like being shipwrecked, making life a ‘problem that needs to be solved.’ Life is a constant dialogue with our surroundings. Despite this emphasis on individuality, given humankind’s constant place within circumstances, we are also individuals living with others, so to live is to live-with. More specifically, there are two principal ways that we are living-with: in a coetaneous way as a generational group, and in a contemporaneous way in terms of being of the same historical period. Hence Ortega’s dictum: “Humankind has no nature, only history.” “Nature” refers to things, and “history” refers to humankind. Each human being is a biography of time and space; each human being has a personal and generational history. We can understand an individual only through his or her narrative. Life is defined as ‘places and dates’ immersed in systems of beliefs dominant among generations.

Ortega’s metaphysics thus consists of each human being oriented toward the future in radical disorientation; life is a problem that needs to be solved because in every instance we are faced with the need to make choices within certain circumstances. The human radical reality is the need to constantly decide who we are going to be, always within circumstances. Take, for example, an individual “I” in a room; the room is not literally a part of one’s “I,” but “I” am an “I in a room.” The “I in a room” has possibilities of choices of what to do, but in that moment those are limited to that room. He writes: “Let us save ourselves in the world, save ourselves in things” (What is Philosophy?). In every moment we are each confronted with many possibilities of being, and among those various possible selves we can always find one that is our authentic self, which is one’s vocation. One only truly lives when one’s vocation coincides with one’s true self. By vocation he is not referring strictly to one’s profession but also to our thoughts, opinions, and convictions. That is why a human life is future-oriented; it is a project, and he often symbolically refers to the human being as an archer trying to hit the bullseye of his or her authentic vocation—if, of course, they are being true to themselves. The human individual is not an “is”; rather, each is a becoming and an inheritor of an individual and collective past.

a. The Individual

Ortega argues that to live is to feel lost, yet one who accepts this is already closer to finding one’s self. An individual’s life is always “my life”; the vital activity of the “I” is always within circumstances. We can choose some of those circumstances, but we can never be in none at all. In every instant, we must choose what we are going to do, what we are going to be in the next instant, and this decision is not transferable. Life is composed of two inseparable dimensions, as he describes it. The first is the I and its circumstance, and as such, life is a problem. In the second dimension, we realize we must figure out what those circumstances are and try to resolve the problem. Solutions are authentic when the problem is authentic.

Each individual has an important historical element that factors in here because different time periods may have dominant ideas about how to solve problems. History is the investigation, therefore, into human lives to try to reconstruct the drama that is the life of each one of us—he often also uses the metaphor of our swimming as castaways in the world. The vital question historical study needs to inquire into is precisely the things that change a human’s life; it is not about historical variations themselves but rather what brings about that change. So, we need to ask: how, when, and why did life change?

Each individual exists in their own set of circumstances, though some overlap with those in the lives of others, and thus each individual is an effort to realize their individual “I.” Being faced with the constant need to choose means that living brings about a radical insecurity for each individual. An individual is not defined by body and soul, because those are “things”; rather, one is defined by their life. For this reason, he proclaims his famous thesis that ‘humans have no nature, only history.’ Thus, again, this is why history should be the focus in the study of human lives; history is the extended human drama. Human life, as he so often says, is a drama, and thus the individual becomes the “histrion” of their self—“histrion” referring to a “theatrical performer,” a usage that dates back to the ancient Greeks. In its etymological roots, from the Greek historia, we have in part “narrative,” and from histōr, “learned, wise human”—thus, for Ortega, to study and be aware of one’s narrative is the means by which we become learned and wise. To the extent that one lacks or ignores this historical knowledge, one falls away from living authentically, and when this increasingly manifests itself in a group of people, there is a parallel rise of barbarity and primitiveness, he argues. This, as is elaborated further ahead, is precisely what is at work in the revolt of the masses of his time.

For Ortega, the primitive human is overly socialized and lacks individuality. In a very existentialist theme in his philosophy, we live authentically via our individuality. Existentialist philosophers generally share the critique that the focus on the self, on human existence as a lived situation, had been lost in the history of philosophy. On this point Ortega agrees; however, he does not entirely share the critique that the rational, detached pursuit of objective knowledge that grew from the birth of modern philosophy, especially from Descartes onward, was altogether detrimental, since humanism was in part its result: the new science and human reason permitted humankind to recover faith and confidence in itself. Ortega does not deny that there are certain scientific facts that we must live with, but science, he says, “will not save humankind” (Man and Crisis). In other words, scientific studies can lead to scientific facts, but these should not extend beyond science—it is an error of perspective to reach beyond it. As is elaborated further ahead, the richest of the different types of perspectives is the one focused on the individual human life, as this is the radical reality.

We each live with a future-orientation, yet the future is problematic, and this is the paradoxical condition that is human life: in living for the future, all we have is the past to realize that potential. Ortega argues that a prophet is an inverse historian; one narrates to anticipate the future. An individual’s present is the result of all the human past—we must understand human life as such, just as one cannot understand the last chapter of a novel without having read the content that came before. This makes history supreme, the “superior science,” he argues, in response to the dominance of physics in his time, for understanding the fundamental, radical reality that is the human life. While Ortega believes that Einstein’s discoveries, for example, support his perspectivism and the view that reality can only be understood via perspectives, again his concern is when the sciences reach too far beyond science into the realm of what it means to be a human individual. Moreover, what had been largely forgotten in his time is how we are so fundamentally historical beings. Thus, a lack of historical knowledge results in a dangerous disorientation and is an important symptom of the crisis of his time: the hyper-civilization of savagery and barbarity that he defines as the “revolt or rebellion of the masses.”

b. Society and The Revolt of the Masses

Society, for Ortega, is not fully natural, as its origins are individualist. Society arises out of the memory of a remote past, so it can only be comprehended historically. But it is also the case for Ortega that an individual’s vocation can be realized only within a society. In other words, part of our circumstance is to always be with others in a reciprocal and dynamic relationship—here his views tend more toward an existential phenomenology, since the world we live in is an intersubjective one in which we are each both unique and social selves, living among a multitude of unique individuals. Ortega is quite critical of his time period, but he is detailed enough in his critiques to point toward a potential way of resolving them. In fact, Ortega is one of the first writers to detail something resembling the European Union as a possible solution. For example, he writes, “There is now coming for Europeans the time when Europe can convert itself into a national idea” (The Revolt of the Masses). He was quite concerned with the contemporary threat of politics swinging to either extreme, left or right, as a result of the crisis of the masses.

This served as part of his inspiration for studying in Germany, which he saw in many ways as a model state, right after he finished his doctorate. Contemplating an ideal future for his country of Spain became of great importance to him at an early age. In 1898, Spain lost its last colonies after losing the Spanish-American War. A group of Spanish intellectuals arose, appropriately called the “Generation of ’98,” to address how to heal the future of their country. A division resulted between those labeled the “Hispanizantes” and the “Europeazantes” (to which Ortega belonged), who looked to “Hispanicizing” or “Europeanizing” Spain, respectively—that is, to looking back to tradition or to looking to Europe as a model.

The most famed of his critiques are captured in his best-selling and highly prophetic book The Revolt of the Masses from 1930. One clear way in which he describes the main problem of not only Europe, but really the world over, is to imagine what happens in an elementary school classroom when the teacher leaves, even if just momentarily; the mob of children “breaks loose, kicks up its heels, and goes wild” (The Revolt of the Masses). The mob feels itself in control of its own destiny, a destiny previously guided by the schoolmaster, but this does not mean, of course, that these children suddenly know exactly what to do. The world is acting like these children, and often even worse: like spoiled children who are ignorant of the history behind all that they believe they have a right to, resulting in great disrespect. Without the direction of a select minority, such as the teacher or schoolmaster, the world is demoralized. The world is being chaotically taken over by the lowest common denominator: the barbarous mass-human.

This mob he calls “the mass-man.” The mass-man is distinguished from the minority both quantitatively and qualitatively—most important is the latter. While the minority consists of specially qualified individuals, the mass-man is not specially qualified, and he or she is content with that (there is an apparent influence here from Nietzsche’s distinction between “master” and “slave” moralities, but it is not the same). The mass-man sees him or herself just like everybody else and does not want to change that. The minority makes great demands on themselves, not because they see themselves as superior, but because they are striving to improve themselves, whereas the mass-man does not. Being a minority requires setting oneself apart from the majority, whereas nothing comparable is needed to define a majority, he argues. So, the distinction here is not about social classes; rather, it refers mostly to mentality. The problem, Ortega argues, is that this is essentially upside-down: the minorities feel mediocre, yet not a part of the mass, and the masses are acting as the superior ones, which is enabling them to replace the minorities. He calls this state a hyper-democracy, and it is for Ortega the great crisis of the time because in the process the masses are crushing that which is different, qualified, and excellent. The result is the sovereignty of the unqualified individual—and this is stronger than ever before in history, though this kind of crisis has happened before.

The masses have a debilitating ignorance of history, which is central to sustaining and advancing a civilization. History is necessary to learn what we need to avoid in the future—“We have a need of history in its entirety, not to fall back into it, but to see if we can escape from it” (The Revolt of the Masses). Civilization is not self-supporting, as nature is, yet the masses, in their lack of historical consciousness, think this to be the case. It is a rebellion of the masses because they are not accepting their own destiny and are, therefore, rebelling against themselves. The result, Ortega fears, is great decay in many areas; there will be, for example, a militarization of society, a bureaucratization of life, and a suspension of spontaneous social action by the state, with which the mass-man identifies himself. Everyone will be enslaved by the state. He saw this clearly in his nation of Spain, where regional particularism was dominant and demolished select individuality (he develops this theory of his country having become ‘spineless’ in another of his more successful books, titled Invertebrate Spain). As is further elaborated ahead, this can also be seen in the art trends of the time, as movements toward greater abstraction were having the effect of minimizing the number of people who could ‘understand’ it. Despite having critiqued the aesthetics of the time, Ortega also thought this could shift the balance away from the dominance of the masses and put them ‘back into their appropriate places.’ The arts, then, will help restore the select hierarchy.

c. The Mission of the University

Another part of his answer to this crisis of his times lies in the university: the mission of the university is, essentially, to teach culture. By “culture” he was referring to more than just scholarly knowledge; it was about being in society. Universities should aim to teach the “average” person to be a good professional in society. Science, very broadly understood here, has a different mission than the university, and not every student should be churned into a “scientist.” This does not mean that science and the university are not connected; it is just to emphasize that not all students are scientists. Ortega recognizes that science is necessary for the progression of society, but it should not be the focus of a university education. For this reason, university professors should be chosen for their teaching skills over their intellectual aptitudes, he argued. Again, students should be guided to follow the vocation of their authentic self. The self is what one potentially is; life is a project. Therefore, what one needs to know is what helps one realize their personal project (again, this is why ‘vital reason’ is ‘reason from life’s point of view’). Not everybody is aware of what their project is, and it is essential to human life to strive to figure out what it is and, ideally, realize it as close to fully as possible—the university can aid in this endeavor. But Ortega was also cognizant of the challenge of this endeavor, as perhaps the ‘right frame of mind, dispositions, moods, or tastes’ cannot be taught.

3. Perspectivism

As humankind is always in the situation of having to respond to a problem, we must know how to deal with and live with that problem—and this is precisely the meaning of the Spanish verb “saber,” or “to know” (as in concrete facts): to know what to do with the problems we face. Thus, Ortega’s epistemology can be summarized concisely as follows: the only knowledge possible is that which originates from an individual’s own perspective. Knowledge, he writes, is a “mutual state of assimilation” between a thing and the thinker’s process of thinking. When we confront an object in the world, we are only confronting a fragment of it, and this forces us to try to think about how to complete that object. Therefore, philosophizing is unavoidable in life (for some, at least, he argued), because it is part of this process of trying to complete what is a world of fragments. Further, this forms part of the foundation of his perspectivism: we can never see and thus understand the world from a complete perspective, only our own limited one. Perspective, then, is both a component of reality and what we use to organize reality. For example, when we look up at a skyscraper from the street level, we can never see the whole building, only a fragment of it from our limited perspective.

In his book What is Knowledge?, which consists of lectures from 1929-1930, we find some of his initial leanings toward idealism (from which he would come to increasingly move away), but even here he is not rejecting realism but rather making it subordinate to idealism. This is because, he argues, the existence of the world can only first be derived from the existence of thought. For the world to exist, thought must exist; the existence of the world is secondary to the existence of thought. Idealism, he argues, is the realism of thought. Ultimately, however, Ortega rejects both idealism and realism; neither suffices to explain the radical reality that is the individual human life, the coexistence of selves with things. While science begins with a method, philosophy begins with a problem—and this is what makes philosophy one of the greatest intellectual activities, because human life is a problem, and when we become aware of our ‘falling into this absolute abyss of uncertainty,’ we do philosophy. Science may be exact, but it is not sufficient for understanding what it means to be human. Philosophy may be inexact, but it brings us much closer to understanding what it means to be human because it is about contemplating the radical reality that is the life of each one of us. Because humankind constantly confronts problems, philosophy is a more natural intellectual task than any science. Thus, “each one knows as much as he [they] has doubted” (What is Knowledge?). Ortega measures “the narrowness or breadth of [their] wit by the scope of [their] capacity to doubt and feel at a loss” (What is Knowledge?).

Therefore, one does not need isolated certainties, argues Ortega; what is needed is a system of certainties that creates security in the face of insecurity. He argues: “the first act of full and incontrovertible knowledge” is the acknowledgement of life as the primordial and radical reality (What is Knowledge?). No one perspective is true or false (though there may be a pragmatic hierarchy of perspectives); each is affected by time and place. For example, it is not just about a visual field; there are many other fields that can be present and vary by time and place. Time itself can be experienced very differently, such as how Christmas Eve may seem the longest night of the year for young children, whereas their birthday party passes much too quickly. A subject informs an object, and vice versa; it is an abstraction to speak of one without the other. “To know,” therefore, is to know what to live with, deal with, and abide by in response to the circumstances we find ourselves in—this is a clear example of the existentialist thinking in his thought, as he emphasizes that the most important knowledge is that of the individual self who understands his or her circumstances well enough to know how to live with them, deal with them, and in response form principles to abide by. Science will not suffice to help us in this endeavor. It is only in being clear with ourselves that we can better navigate the drama that is the individual human life—again, a very existentialist theme. This does not require being highly educated—as this also runs its own risk of getting lost in scholarship, he urges—and the individual need not look far (though neither does this mean that individuals always make this effort). Part of what makes us human is our imagination; human life, then, is a work of imagination, as we are the novelists of our selves.

Ortega critiques the philosophy from the mid-nineteenth century to the early twentieth century as being “little more than a theory of knowledge” (What is Philosophy?) that still has not been able to answer the most fundamental question as to what knowledge is, in its complete meaning. He is especially critical of positivism. He argues that we must first understand what meaning the verb to know carries within itself before we can seriously consider a theory of knowledge. And just as life is a task, knowing is a task that humans impose on themselves—perhaps an impossible one—but we feel this need to know and impose this task on ourselves because we are aware of our ignorance. This awareness of our ignorance is the focus that an epistemological study should take, Ortega argued.

a. Ideas and Beliefs

An important connection between his metaphysics and his epistemology is his distinction between ideas and beliefs. A “belief,” he argues, is something we maintain without being conscious or aware of it. We do not think about our beliefs; we are inseparably united with them. Only when beliefs start to fail us do we begin to think about them, at which point they stop being beliefs and become “ideas.” Here we can see an influence of Husserl’s phenomenology in trying to ‘suspend’ habitual beliefs—and, Ortega adds, in trying to understand how history is moved by them. We do not question beliefs, because when we do, they stop being beliefs. In moments of crisis, when we question our beliefs, we are thinking about them, and so they again become “ideas.” We look at the world through our ideas, and some of them may become beliefs. The “real world” is a human interpretation, an idea that has established itself as a belief.

When we are left without beliefs, then we are left without a world, and this change then converts into a historical crisis—this is an important connection to his theory on history described further ahead. The human individual is always a coming-from-something and a going-to-something, but in moments of crisis, this duality becomes a conflict. The fifteenth century provides a clear example, as it marked a historical crisis of a coming-from the medieval lifestyle conflicting with a going-to a new ‘modern’ lifestyle. So, in a historical crisis such as this one, there is an antithesis of different modes of the same radical attitude; it is not like the coming-from and going-to of the seasons, like summer into fall. The individual of the fifteenth century was lost, without convictions (especially those stemming from Christianity), and as such, was living without authenticity—just as the individual of his day, he argues. The fifteenth century is a clear example of “the variations in the structure of human life from the drama that is living” (Man and Crisis). There was a crisis then of reason supplanting faith (as was also seen when faith supplanted reason in the shift into the start of the medieval period). In times such as these, as with Ortega’s own, there is a crisis of belief as to ‘who knows best.’

Beliefs are thus connected to their historical context, and as such, historical reason is the best tool for understanding both the ebb and flow of beliefs being shaken up and moments of historical crisis. Epochs may be defined by crises of beliefs. Ortega argued that he was living in precisely one of those times, and that it took the form of a “rebellion of the masses.” Beliefs in the form of faith in old Enlightenment ideals of confidence in science and progress were failing, and advances in technology were making this harder for people to see, precisely because technology puts science to work. Reality is a human reality, so science does not provide us with the reality. Instead, what science provides are some of the problems of reality.

4. Theory of History

Ortega believed that “philosophy of history” was a misnomer and preferred the term “theory” for his lengthy discussions on history, a topic of central importance in his thought. He objected to many labels, which adds to the difficulty of classifying him (he also objected to being called an ‘existentialist’ and even a ‘philosopher’). Much of Ortega’s theory on history is outlined in Man and People, Man and Crisis, History as a System, An Interpretation of Universal History, and Historical Reason. The use of the term “system” in his philosophical writings on history is at times misleading, because what he is referring to is a kind of pattern or trend that can be studied; it is not a teleological vision of history. History is defined by its systems of beliefs. As outlined in the section on ideas and beliefs, we hold ideas because we consciously think about them, and we are our beliefs because we do not consciously think about them. There are certain beliefs that are fundamental and other, secondary beliefs that are derived from those. To study human existence, whether of an individual, a society, or a historical age, we must outline what this system of beliefs is, because crises in beliefs, when they are brought to awareness and questioned, are what move history on any level (personal, generational, or societal). This system of beliefs has a hierarchized order that can help us understand our own lives, those of others today, and those of the past—and the more of these comparisons we compile, the more accurate the result will be. Changes in history are due largely to changes in beliefs. Part of this stems from his view that in these moments, some of us also become aware of the inauthentic living brought about by accepting prevailing beliefs without question. The activity of philosophy is part of this questioning.

History is of fundamental importance to all his philosophy. Human beings are historical beings. Knowledge must be considered in its historical context; “what is true is what is true now” (History as a System). This again raises the critique of what we can do without any objective, absolute knowledge (and also places him arguably in the pragmatist camp here). But Ortega responds that while science may not provide insight on the human element, vital historical reason can. He argues that we can best understand the human individual through historical reason, and not through logic or science. One of his most well-known dictums is that “humankind has no nature, only history.” For Ortega, “nature” refers to something that is fixed; for example, a stone can never be anything other than a stone. This is not the case with humankind, as life is “not given to us ready-made”; we do find ourselves suddenly in it, but then “we must make it for ourselves,” as “life is a task,” unique to each individual (History as a System). This “thrownness” in the world is another very existentialist theme (for which some debate exists about the chronology of the development of this philosophy between Ortega and Heidegger, especially considering they personally knew and respected each other). A human being is not a “thing”; rather, a human life is a drama, a happening, because we make ourselves as infinitely plastic beings. “Things” are objects of existence, but they do not live as humans do, and each human does so according to their own personal choices in response to the problems we face in navigating our circumstances. “Before us lie the diverse possibilities of being, but behind us lies what we have been. And what we have been acts negatively on what we can be,” he writes, and again this applies to any level of humanity, whether regarding individuals or states (History as a System). 
Thus, while we cannot know what someone or some collective entity will be, we can know what someone or some collective entity will not be. Those possibilities of being are challenged by the circumstances we find ourselves in, so, “to comprehend anything human, be it personal or collective, one must tell its history” (History as a System). In a general sense, humans are distinct in our possession of the concept of time; the human awareness of the inevitability of death makes this so.

We cannot speak of “progress” in a positive sense in the variable becoming of a human being, because an a priori affirmation of progress toward the better is an error; it is something that can only be confirmed a posteriori by historical reason. So, by “progress,” Ortega means simply an “accumulation of being, to store up reality” (History as a System). We have each inherited an accumulation of being, which is what further gives history its systematic quality, as he writes: “History is a system, the system of human experiences linked in a single, inexorable chain. Hence nothing can be truly clear in history until everything is clear” (History as a System). Since the ancient Greek period, history and reason have been largely opposed, and Ortega wants to reverse this—hence his use of the term “historical reason.” He is not referring to something extra-historical, but rather something substantive: the reality of the self underlying all, and all that has happened to that self. Nothing should be accepted as mere fact, he argues, as facts are fluid interpretations that are themselves also embedded in a historical context, so we must study how they have come about. Even “nature” is still just humankind’s “transitory interpretation” of the things around us (History as a System). As Nietzsche similarly argued, humankind is differentiated from animals because we have a consciousness of our own history and history in general. But again, the idea here is that the past is not really past; as Ortega argues, if we are to speak of some ‘thing’ it must have a presence; it must be present, so the past is active in the present. History tells us who we are through what we have done—only history can tell us this, not the physical sciences, hence again his call for the importance of “historical reason.” The physical sciences study phenomena that are independent, whereas humans have a consciousness of our historicity that is, therefore, not independent from our being.

Through history we try to comprehend the variations that persist in the human spirit, writes Ortega. These hierarchized variations are produced by a “vital sensitivity,” and those variations that are decisive become so through a generation. The theory of generations is fundamental to understanding Ortega’s philosophy on (not of) history, as he argues that previous philosophies on history had focused too much on either the individual or the collective, whereas historical life is a coexistence of the two. For Ortega, generations are divided into fifteen-year increments. Each generation captures a perspective of universal history and carries with it the perspectives that came prior. For each generation, life has two dimensions: first, what was already lived, and second, spontaneity. History can also be understood like cinematography: with each generation comes a new scene, but it is a film that has not come to an end. We are all always living within a generation and between generations—this is part of the human condition.

The two generations between the ages of thirty and sixty are of particular influence in the movement of history, as they generally represent the most historical activity, he argues. From the ages of thirty to forty-five we tend to find a stage of gestation, creation, and polemic. In the generational group from ages forty-five to sixty, we tend to find a stage of predominance, power, and authority. The first of these two stages prepares one for the next. But Ortega also posits that all historical actuality is primarily composed of three “todays,” which we can also think of as the family writ large: child, parent, grandparent. Life is not an ‘is’—it is something we must make; it is a task, and each age is a particular task. This is because historical study is not concerned with only individual lives, as every life is submerged in a collective life; this is one of our circumstances, that we are immersed in a set of collective beliefs that form the “spirit of the time.” This is very peculiar, he argues, because unlike individual beliefs that are personally held, collective beliefs that take the form of the “spirit of the time” are essentially held by the anonymous entity that is “society,” and they have vigor regardless of individual acceptance. From the moment we are born we begin absorbing the beliefs of our time. The realization that we are unavoidably assigned to a certain age group, or spirit of the time, and lifestyle is a melancholic experience that all ‘sensitive’ (philosophically-minded) individuals eventually have, he posits.

Ortega makes an important distinction between being “coeval” or “coetaneous” and being “contemporary.” The former refers to being of the same age, and the latter refers to being of the same historical time period. The former is that of one’s generation, which is so critical that he argues those of the same generation but different nations are more similar than those of the same nation but different generations. His methodology for studying history is grounded in projecting the structure of generations onto the past, as it is a generation that produces a crisis in beliefs that then leads to change and new beliefs (discussed above). He also defines a generation as a dynamic compromise between the masses and the individual, on which history hinges. Every moment of historical reality is the coexistence of generations. If all contemporaries were coetaneous, history would petrify and innovation would be lost, in part because each generation lives its time differently. Each generation, he writes, represents an essential and untransferable piece of historical time. Moreover, each generation also contains all the previous generations and, as such, is a perspective on universal history. We are the summary of the past. History intends to discover what human lives have been like, and by “human” he is not referring to body or soul, because individuals are not “things”; we are dramas. Because we are thrown into the world, this drama creates a radical insecurity that makes us feel shipwrecked, or headed for shipwreck, in life. We form interpretations of the circumstances we find ourselves thrown into and then must constantly make decisions based upon those. But we are not alone, of course; to live is to live together, to coexist. Yet it is precisely that reality of coexistence that makes us feel solitude; hence our attempt to avoid this loneliness through love. Ortega’s theory on history is therefore a combination of existential, phenomenological, and historicist elements.

5. Aesthetics

Ortega’s Phenomenology and Art provides a very phenomenological and existentialist philosophy of art. Art is not a gateway into an inner life, into inwardness. When an image is created of inwardness, it ceases to be inward, because inwardness cannot be an object. Thus, what art reveals through esthetic pleasure is what seems to be inwardness. Art is a kind of language that tells us about the execution of this process, but it does not tell us about things themselves. A key example he gives to understand this is the metaphor, which he considers an elementary esthetic object. A metaphor produces a “felt object,” but it is not, strictly speaking, the objects it includes. Art, he says, is de-creation because, as in the example of the metaphor, it creates a new felt object essentially out of the destruction of other objects. There is a connection to Brentano and Husserl here, in the experience that consciousness is always consciousness of an object (though, it has been noted, Ortega ultimately aims to redirect Husserl’s reduction away from pure consciousness and toward consciousness from the point of view of life).

In the example of painting, which he considers “the most hermetic of all the arts” (Phenomenology and Art), he further elaborates on the importance of an artist’s view of the occupation itself, of being an “artist.” The occupation one chooses is a very personal and important one; thus, style is greatly impacted by how an artist would answer the question as to what it means “to be a painter” (Phenomenology and Art). Art history is not just about changes in styles; it is also about the meaning of art itself. Most important is why a painter paints rather than how a painter paints, he argues (another very existentialist position).

Ortega’s philosophy on the art of his time is further developed in his essay The Dehumanization of Art. While the focus of his analysis in this text is the art of his time, his objective is to understand and work through some basic characteristics of art in general. As this was published in 1925, the art movements he often refers to are those tending toward abstraction, such as expressionism and cubism. He was quite critical of Picasso, for example, but this may have been primarily politically motivated. His ultimate judgment is that the art of his time has been “dehumanized” because it is an expression moving further away from the lived experience as it becomes more “modern.” This new art is “objectifying” things; it is objectifying the subjective, as an expression of an observed reality more remote from the lived human reality. After all, the “abstraction” in this art means precisely this: starting with some object in the real world and abstracting it (as opposed to art that is completely non-representational). Art becomes an unreality. In this we find his phenomenological leaning, calling to go back “to the things themselves” in art. This is arguably also part of the general existentialist call to avoid objectifying human individuals.

Still, this art can provide insight into his contemporary historical age, and there is value in that—hence his desire to better understand the art of his time, the art that divides the public into the elite few who understand and appreciate it and the majority who neither understand nor enjoy it. There is also value in how this may be used to ‘put the masses back into their place,’ because only an elite few understand ‘modern art.’ Perhaps this could serve as a test, Ortega argued: by observing how one views a work of abstract art, we can add a person’s ability to contemplate this art to our judgments about his or her place as part of the minority or the mass.

6. Philosophy

History, for Ortega, represented the “inconstant and changing,” whereas philosophy represented the “eternal and invariable”—and he called for the two to be united in his approach to philosophical study. History is human history; it is the reality of humankind. As a critic of his time, he also has much to say about the movements in philosophy of his time and in the history of philosophy. To the question, “what is philosophy?” Ortega answers: “it is a thing which is inevitable” (What is Philosophy?). Philosophy cannot be avoided. It is an activity, and in his many writings on the topic, he wants to take this activity itself and submit it to analysis. Philosophy must be read vertically, not horizontally, he urges. Philosophy is philosophizing, and philosophizing is a way of living. Therefore, the basic problem of philosophy that he wants to submit to analysis is to define that way of living, of being, of “our life.”

Ortega’s call for a rebirth of philosophy and his concern over too much reliance on modern science, especially physics, is one of the many reasons why he is often classified among the existentialist philosophers. In fact, for Ortega, the philosopher stands in contradistinction to any kind of scientist in navigating into the unknown, into problems (like other existentialists, he is fond of the metaphor of life as navigating a ship headed for shipwreck). Philosophy, he says, is a vertical excursion downward. In his discussions of what philosophy is, he draws several contrasts with science. For example, philosophy begins with the admission that the world may be a problem that cannot be solved, whereas the business of science is precisely about trying to solve problems. Yet his stance toward physics was not solely critical: he believed it also supported his perspectivism, as seen in the theory of relativity discovered by Albert Einstein. Still, Ortega is not a strict relativist. While an individual reality is relative to a time and place, each of those moments is an absolute position. Moreover, not all perspectives are equal; errors are committed, and there are hierarchies of perspectives.

The exclusive subject of philosophy is the fundamental being, which he defines as that which is lacking. Philosophy, he says, is self-contained and can be defined as a “science without suppositions,” which is another inheritance from Husserl’s phenomenology (What is Philosophy?). In fact, he takes issue with the term “philosophy” itself; better, perhaps, is to consider it a theory or a theoretic knowledge, he insists. A theory, he argues, is a web of concepts, and concepts represent ‘the content of the mind that can be put into words.’

7. The History of Philosophy

In his unfinished work, The Origin of Philosophy, Ortega outlines a reading of philosophy similar to his reading of history: it must be studied in its entirety. Just as one cannot read only the last chapter of a novel to understand it, one must read all the chapters that came before. His main objective in this work, then, is to recreate the origin of philosophy. In the history of philosophy, we find a great deal of inadequate philosophy, he argues, but it is part of our human condition to keep thinking nonetheless, and to realize that we have not thought everything out adequately. Hence, perhaps The Origin of Philosophy was meant to be unfinished because it cannot be otherwise. On a first reading, therefore, the history of philosophy is a history of errors. We need only think of what came after the Presocratics, the first on record to try to formalize philosophical thinking: the relativism of the Sophists and the skepticism of the Skeptics, for example, arose primarily as critiques of, or reactions against, what came before. By revealing the errors of earlier philosophy, Ortega argues, philosophers create another philosophy in that very process. That Ortega took up this focus precisely when he did, working on this text in the mid-1940s, when logical positivism and analytic philosophy had come to dominate the Anglo-American philosophical landscape, is itself an example of this pattern: the term “analytic philosophy” came about in part to separate those philosophers from “continental philosophy” (“continental” referring primarily to existentialist-like thinkers, such as Ortega, on the continent of Europe rather than the British Isles).

Error, he argues, seems to be more natural than truth. But he does not believe that philosophy is an absolute error; in errors there must be at least the possibility of some element of truth. Sometimes when we read philosophy, the opposite happens: we are initially struck by how it seems to resound with “truth.” What we have next, then, is a judgment that ‘such and such philosophy’ has merit and another does not. But each philosophy, he argues, contains elements of the others as “necessary steps in a dialectical series” (The Origin of Philosophy). The philosophical past, therefore, is both an accumulation of errors and of truths. He says: “our present philosophy is in great part the current resuscitation of all the yesterdays of philosophy” (The Origin of Philosophy). Philosophy is a vertical excursion downward because it is built upon philosophical predecessors, and as such it continues to function in and influence the present. When we think about the past, we bring it into the present; in other words, thinking about the past makes it more than just “in the past.” Again, he shares with Nietzsche this distinction between humankind and animals: we possess the past and are more than just its consequences; we are conscious of our past. We are also distinct in that we cannot possess the future, though we strive very hard to; modern science is intensely focused on improving our chances at prediction. The first philosopher, Thales, is given that title for being the first on record to think for himself and move away from mythological explanation, as famously demonstrated when he predicted a solar eclipse using what we would define as a kind of primitive science. In being able to predict more of the future, one can thus ‘eternalize oneself’ more. In this process one has also obtained a greater possession of the past.
“The dawn of historical reason,” as he refers to it, will arrive when that possession of the past has reached an unparalleled level of passion, urgency, and comprehension. Just as history broadly moves with crises of beliefs, this applies very explicitly to philosophy (as it is also the best way to contemplate the human lived situation). This also relates to his perspectivism and to the notion of hierarchies that are very much pragmatically founded. For Ortega, examples of particularly moving moments in the history of philosophy come from these great shifts in philosophical beliefs, such as those from the period of ancient Greece and from Descartes especially. For Ortega, the three most crucial belief positions in philosophy to examine via its history are realism, idealism, and skepticism. Ortega’s hope was that this would all, ideally, come closer to a full circle with the next belief position: that of his “razón vital e histórica,” or “historical and vital reason.”

Despite the challenges in understanding the wide breadth of writings of José Ortega y Gasset, perhaps it serves us best to read him in the context of his own methodology of historical and vital reason—as an individual, a man of his times, searching for nuggets of insight among a history of errors.

8. References and Further Reading

a. Primary Sources

  • Ortega’s Obras Completas are available digitally.
  • Ortega y Gasset, José. Obras Completas Vols. I-VI. Spain: Penguin Random House Grupo Editorial, 2017.
  • Ortega y Gasset, José. Obras Completas Vols. VII-X (posthumous works). Spain: Penguin Random House Grupo Editorial, 2017.
  • Ortega y Gasset, José. Meditations on Quixote. New York: W.W. Norton, 1961.
  • Ortega y Gasset, José. The Dehumanization of Art and Other Essays on Art, Culture, and Literature. Princeton: Princeton University Press, 2019.
  • Ortega y Gasset, José. Phenomenology and Art. New York: W.W. Norton, 1975.
  • Ortega y Gasset, José. Historical Reason. New York: W.W. Norton, 1984.
  • Ortega y Gasset, José. Toward a Philosophy of History. Chicago: University of Illinois Press, 2002.
  • Ortega y Gasset, José. History as a System and other Essays Toward a Philosophy of History. New York: W.W. Norton, 1961.
  • Ortega y Gasset, José. An Interpretation of Universal History. New York: W.W. Norton, 1973.
  • Ortega y Gasset, José. The Revolt of the Masses. New York: W.W. Norton, 1932.
  • Ortega y Gasset, José. What is Philosophy? New York: W.W. Norton, 1960.
  • Ortega y Gasset, José. The Origin of Philosophy. New York: W.W. Norton, 1967.
  • Ortega y Gasset, José. Man and Crisis. New York: W.W. Norton, 1958.
  • Ortega y Gasset, José. Man and People. New York: W.W. Norton, 1957.
  • Ortega y Gasset, José. Meditations on Hunting. New York: Charles Scribner’s Sons, 1972.
  • Ortega y Gasset, José. Psychological Investigations. New York: W.W. Norton, 1987.
  • Ortega y Gasset, José. Mission of the University. New York: W.W. Norton, 1966.
  • Ortega y Gasset, José. The Modern Theme. New York: W.W. Norton, 1933.
  • Ortega y Gasset, José. On Love: Aspects of a Single Theme. Cleveland: The World Publishing Company, 1957.
  • Ortega y Gasset, José. Some Lessons in Metaphysics. New York: W.W. Norton, 1969.
  • Ortega y Gasset, José. What is Knowledge? Albany: SUNY Press, 2001.
  • Ortega y Gasset, José. Concord and Liberty. New York: W.W. Norton, 1946.
  • Ortega y Gasset, José. Invertebrate Spain. New York: Howard Fertig, 1921.

b. Secondary Sources

  • Blas González, Pedro. Human Existence as Radical Reality: Ortega y Gasset’s Philosophy of Subjectivity. St. Paul: Paragon House, 2011.
  • Díaz, Janet Winecoff. The Major Theme of Existentialism in the Work of Jose Ortega y Gasset. Chapel Hill, NC: University of North Carolina Press, 1970.
  • Dobson, Andrew. An Introduction to the Politics and Philosophy of José Ortega y Gasset. Cambridge: Cambridge University Press, 1989.
  • Ferrater Mora, José. José Ortega y Gasset: An Outline of His Philosophy. New Haven, CT: Yale University Press, 1957.
  • Ferrater Mora, José. Three Spanish Philosophers: Unamuno, Ortega, and Ferrater Mora. Albany: State University of New York Press, 2003.
  • Graham, John T. A Pragmatist Philosophy of Life in Ortega y Gasset. Columbia: University of Missouri Press, 1994.
  • Graham, John T. The Social Thought of Ortega y Gasset: A Systematic Synthesis in Postmodernism and Interdisciplinarity. Columbia: University of Missouri Press, 2001.
  • Graham, John T. Theory of History in Ortega y Gasset: The Dawn of Historical Reason. Columbia: University of Missouri Press, 1997.
  • Gray, Rockwell. The Imperative of Modernity: An Intellectual Biography of José Ortega y Gasset. Berkeley: University of California Press, 1989.
  • Holmes, Oliver W. José Ortega y Gasset. A Philosophy of Man, Society, and History. Chicago: University of Chicago Press, 1971.
  • Huéscar, Antonio Rodríguez, and Jorge García-Gómez. José Ortega y Gasset’s Metaphysical Innovation: A Critique and Overcoming of Idealism. Albany: State University of New York Press, 1995.
  • McClintock, Robert. Man and His Circumstances: Ortega as Educator. New York: Teachers College Press, 1971.
  • Mermall, Thomas. The Rhetoric of Humanism: Spanish Culture after Ortega y Gasset. New York: Bilingual Press, 1976.
  • Raley, Harold C. José Ortega y Gasset: Philosopher of European Unity. University, Alabama: University of Alabama Press, 1971.
  • Sánchez Villaseñor, José. José Ortega y Gasset, Existentialist: A Critical Study of his Thought and his Sources. Chicago: Henry Regnery, 1949.
  • Silver, Philip W. Ortega as Phenomenologist: The Genesis of Meditations on Quixote. New York: Columbia University Press, 1978.
  • Sobrino, Oswald. Freedom and Circumstance: Philosophy in Ortega y Gasset. Charleston: Logon, 2011.


Author Information

Marnie Binder
Email: marnie.binder@csus.edu
California State University, Sacramento
U. S. A.

Nietzsche’s Ethics

The ethical thought of German philosopher Friedrich Nietzsche (1844–1900) can be divided into two main components. The first is critical: Nietzsche offers a wide-ranging critique of morality as it currently exists. The second is Nietzsche’s positive ethical philosophy, which focuses primarily on what constitutes health, vitality, and flourishing for certain individuals, the so-called “higher types”.

In the critical project, Nietzsche attacks the morality of his day from several different angles. He argues that the metaphysical foundations of morality do not hold up to scrutiny: the concepts of free will, conscious choice, and responsibility that underpin our understanding of morality are all vociferously critiqued, both on theoretical and on practical grounds. Nietzsche also objects to the content of our contemporary moral commitments. He rejects the idea that suffering is inherently bad and should be eradicated, and he denies that selflessness and compassion should be at the core of our moral code. Key components of Nietzsche’s critical project include his investigation of the history of the development of our moral commitments—the method of “genealogy”—as well as an analysis of the underlying psychological forces at work in our moral experiences and feelings. Ultimately, perhaps Nietzsche’s most serious objection to morality as it currently exists is his claim that it cannot help us to avoid the looming threat of nihilism.

In the positive project, Nietzsche offers a vision of what counts as a good and flourishing form of existence for certain people. This positive ethical vision is not open to everyone, but only to the so-called “higher types”—people whose psycho-physical nature makes them capable of coming to possess the traits and abilities that characterize health, vitality, and flourishing on Nietzsche’s account. The flourishing individual, according to Nietzsche, will be one who is autonomous, authentic, able to “create themselves,” and to affirm life. It is through such people, Nietzsche believes, that the threat of nihilism can be averted.

Table of Contents

  1. The Critical Project
    1. The Object of Nietzsche’s Attacks
    2. Rejection of an Otherworldly Basis for Value
    3. Attacks on the Metaphysical Basis of Moral Agency
    4. Attacks on the Content of Morality
    5. Genealogical Critique
    6. Psychological Critique
    7. The Threat of Nihilism
  2. The Positive Project
    1. Higher Types
    2. Autonomy
    3. Authenticity and Self-Creation
    4. Affirmation
  3. References and Further Reading
    1. Primary Texts
    2. Secondary Texts

1. The Critical Project

In 1981, the British philosopher Bernard Williams wrote that “[i]t is certain, even if not everyone has yet come to see it, that Nietzsche was the greatest moral philosopher of the past century. This was, above all, because he saw how totally problematical morality, as understood over many centuries, has become, and how complex a reaction that fact, when fully understood, requires.” As Williams’s remark suggests, the core of Nietzsche’s ethical thought is critical: Nietzsche seeks, in various ways, to undermine, critique, and problematize morality as we currently understand it. As Nietzsche himself puts it, “we need a critique of moral values, the value of these values should itself, for once, be examined” (On the Genealogy of Morality, Preface, 6). In speaking of “the value of these values”, Nietzsche is making use of two different senses of the notion of value. One is the set of values that is the object of the critique, the thing to be assessed and evaluated. The other is the standard by which we are to assess these values. In attacking moral values, then, Nietzsche is not setting himself against all possible evaluative systems. And as we shall see, Nietzsche does indeed go on to make many substantive evaluative claims of his own, both critical and positive, including many claims that are broadly ethical in nature. Nietzsche thus proposes to undertake what he calls a “revaluation of all values”, with the final product of this project being a new system of evaluations (see part 2., “The positive project”).

a. The Object of Nietzsche’s Attacks

Since Nietzsche’s critical project is not targeted towards all values as such, we should ask what, exactly, Nietzsche is attacking when he attacks “morality”. In fact, Nietzsche’s various attacks have multiple targets, which together form a family of overlapping worldviews, commitments, and practices. The Judeo-Christian moral-religious outlook is one broad target, but Nietzsche is also keen to attack the post-religious secular legacy of this moral code that he sees as dominant in his contemporary culture in Europe. He is concerned with Kantian morality, as well as the utilitarianism that was gaining prominence around Nietzsche’s time, especially in Britain. Aspects of his attacks are levelled against broadly Platonist metaphysical accounts, as well as the Christian inheritance of these accounts, which understand value as grounded in some otherworldly realm that is more real and true than the world we live in. Other parts of the critical project are directed towards certain particular evaluative commitments, such as a commitment to the centrality of pity or compassion (Mitleid), as exemplified in Schopenhauer’s ethics in particular, but which Nietzsche also sees as a point of thematic commonality between many different moral and religious worldviews. Nietzsche even criticizes evaluative systems that he envisages coming to be widely accepted in the future, such as the commitment to ease and comfort at all costs that he imagines the “last human being” endorsing (see section 1. g., “The threat of nihilism”).

Given this diversity, determining exactly what is under attack in Nietzsche’s critical project is best achieved through attention to the detail of the various attacks. In general, this article uses “morality” as a catch-all term to cover the multiple different objects of Nietzsche’s attacks, allowing the precise target of each attack to be clarified through the nature of the attack itself. The reader should note, then, that not all of Nietzsche’s attacks on “morality” will necessarily apply to each of the individual views and commitments that are gathered under this broad heading.

b. Rejection of an Otherworldly Basis for Value

Nietzsche rejects certain metaphysical accounts of the nature of value. These parts of Nietzsche’s position are not directly about the substantive evaluative content of moral worldviews, but rather the metaphysical presuppositions about the grounds of value that certain moral, and especially moral-theological, worldviews involve. In section 230 of Beyond Good and Evil, Nietzsche states that his philosophical work aims to “translate humanity back into nature”, to reject “the lures of the old metaphysical bird catchers who have been piping at him for far too long: ‘You are more! You are higher! You have a different origin!’”. Human beings, according to Nietzsche, are fundamentally a part of nature. This means that he rejects all accounts of morality that are grounded in a conception of human activity as answerable to a supernatural or otherworldly source of value. The idea of morality as grounded in the commands of God is thus rejected, as is the Platonist picture of a realm of ideal forms, including the “Form of the Good,” as the basis for value.

For the most part, Nietzsche does not go out of his way to argue against these sorts of metaphysical pictures of the nature of value. Instead, he tends to assume that his reader is already committed to a broadly naturalistic understanding of the world and the place of the human being within it. Nietzsche’s rejection of theological or Platonist accounts of the basis of value, then, tends to stand as a background assumption of his discussions, rather than as something he attempts to persuade his reader of directly.

The recurring motif of the “death of God” in Nietzsche’s writing is usefully illustrative here. In The Gay Science, Nietzsche describes a “madman” who is laughed at for announcing the death of God in the marketplace (section 125). But the laughter is not because people think that God is not dead but instead alive and well; rather, these people do not believe in God at all. The intellectual elite of Europe in Nietzsche’s day were, for the most part, atheists. Nietzsche’s insistent emphasis on the idea that “God is dead” is thus not intended as a particularly dramatic way of asserting the non-existence of God, and he does not expect the idea that God does not exist to come as a surprise to his reader. Rather, the problem that Nietzsche seeks to draw attention to is that his fellow atheists have failed to understand the cultural and spiritual significance of the widespread loss of belief in God, and thus of the associated metaphysical picture of the human being as created for a higher divine purpose (see section 1. g., “The threat of nihilism”).

Indeed, Nietzsche is often interested in the way in which aspects of these earlier supernatural worldviews, now largely abandoned, have nonetheless left traces within our current belief and evaluative systems—even within the modern naturalistic conception of the world that Nietzsche takes himself to be working within. Nietzsche writes:

New Battles. – After Buddha was dead, they still showed his shadow in a cave for centuries – a tremendous, gruesome shadow. God is dead; but given the way people are, there may still for millennia be caves in which they show his shadow. – And we – we must still defeat his shadow as well! (The Gay Science, 108)

Although Nietzsche clearly sets himself against supernaturalist accounts of value and of the place of the human being in the cosmos, the precise nature of his own naturalism, and the consequences of this naturalism for his own ethical project, is a topic of debate among commentators. This is complicated by the fact that Nietzsche often directs his attacks towards other naturalist accounts, sometimes simply under the heading of “naturalism,” in a way that can seem to suggest that he himself rejects naturalism. (See Leiter (2015), Clark and Dudrick (2012), and Riccardi (2021) for useful discussion of the nature of Nietzsche’s naturalism.)

In general, Nietzsche expects his reader to share his own basic naturalist orientation and rejection of supernatural metaphysics. However, he thinks that most people have failed to properly understand the full consequences of such commitments. The atheists of his day, thinks Nietzsche, have typically failed to understand the cultural impact that a loss of religious faith will have—perhaps because these cultural effects have not yet shown themselves clearly. Nietzsche also thinks that his contemporaries have not always grasped the ways in which an accurate picture of the nature of the human being will force us to revise or abandon many concepts that are key to our current understanding of morality—perhaps most strikingly, concepts of moral agency and responsibility (see the following section). Many of Nietzsche’s fellow naturalists suppose that we can abandon the supernatural trappings that have previously accompanied morality, and otherwise continue on with our evaluative commitments more or less as before. This, Nietzsche thinks, is not so.

c. Attacks on the Metaphysical Basis of Moral Agency

One family of arguments presented by Nietzsche attacks the metaphysical basis of moral agency. Again, the point here is not directly about the substantive evaluative content of particular moral systems, but rather their metaphysical presuppositions, especially those that have been thought to ground the concept of moral responsibility—notions of the freedom of the will, and the role of consciousness in determining human action.

First, Nietzsche attacks the idea of free will. Nietzsche writes:

The causa sui [cause of itself] is the best self-contradiction that has ever been conceived, a type of logical rape and abomination. But humanity’s excessive pride has got itself profoundly and horribly entangled with precisely this piece of nonsense. The longing for “freedom of the will” in the superlative metaphysical sense (which, unfortunately, still rules in the heads of the half-educated), the longing to bear the entire and ultimate responsibility for your actions yourself and to relieve God, world, ancestors, chance, and society of the burden—all this means nothing less than being that very causa sui and, with a courage greater than Münchhausen’s, pulling yourself by the hair from the swamp of nothingness up into existence. (Beyond Good and Evil, 21)

This passage appears to reject the idea of free will primarily on metaphysical grounds: for the will to be free would be for a thing to be causa sui, the cause of itself, and this is impossible. And so, to the extent that a moral worldview depends on the idea that we do have free will in this sense, then the foundations of such a worldview are undermined.

Some scholars, noting Nietzsche’s references to “pride” and “longing”, have suggested that the primary mode of Nietzsche’s attack on the idea of free will is practical rather than metaphysical. The real problem with the idea of free will, they argue, is that a belief in this idea is motivated by psychological weakness, and is thus not conducive to good psychic health and flourishing (see Janaway (2006)).

Others have argued that Nietzsche’s relationship to the traditional metaphysical debate about free will is not so much to deny that we have free will, but rather to deny the very coherence of the concept at work in this debate (see Kirwin (2017)). For Nietzsche goes on to call the notion of free will an “unconcept” or “nonconcept” (Unbegriff), insisting that just as we must let go of this notion, so too must we let go of “the reversal of this unconcept of ‘free will’: I mean the ‘un-free will’”.

This scholarly disagreement about the nature of Nietzsche’s attacks on the concept of free will also impacts how we understand parts of Nietzsche’s positive ethical vision. In particular, the question of whether we should understand that positive ethical vision to include an ideal of ‘freedom’ in some sense is hotly contested (see section 2. b., “Autonomy”).

Alongside these attacks on the notion of free will, Nietzsche also denies that human action is primarily a matter of conscious decision and control on the part of the agents themselves. We experience ourselves as consciously making decisions and acting on the basis of them, but this experience is, thinks Nietzsche, misleading. To begin with, our conscious self-awareness is only one small part of what is going on within the mind: “For the longest time, conscious thought was considered thought itself; only now does the truth dawn on us that by far the greatest part of the mind’s activity proceeds unconscious and unfelt” (The Gay Science, 333). Furthermore, Nietzsche thinks, it is unclear that this conscious part of the mind really plays any sort of role in determining our action, since “[a]ll of life would be possible without, as it were, seeing itself in the mirror and […] the predominant part of our lives actually unfolds without this mirroring” (The Gay Science, 354). Consciousness, says Nietzsche, is “basically superfluous” (ibid). These parts of Nietzsche’s account of human psychology have often been understood as a precursor to Freudian theories of the unconscious, as well as to recent empirical work establishing that our self-understanding of our own minds and activities is often far from accurate (see Leiter (2019)). Some scholars, while acknowledging Nietzsche’s downgrading of consciousness, have nonetheless argued that Nietzsche retains a robust picture of human agency (see Katsafanas (2016), and section 2. b., “Autonomy”).

Nietzsche’s rejection of free will and his denial of the idea that the conscious mind is the real source of action both appear to undermine the possibility of a person’s being morally responsible for their actions, at least as that notion has traditionally been understood. If moral responsibility requires free will in the sense rejected by Nietzsche, then there can be no moral responsibility. Some philosophers have argued that responsibility does not require free will in this sense, but they have generally done so by arguing that it is sufficient for responsibility that a person’s action follow from their intentions in the right sort of way. But Nietzsche’s attacks on the causal role of consciousness in human action seem to cause problems for this sort of approach as well. In undermining these metaphysical ideas about the nature of human action, then, Nietzsche takes himself to have done away with the notion of moral responsibility, thus removing a key underpinning of the system of morality.

d. Attacks on the Content of Morality

Nietzsche also raises objections to the normative content of morality—to the things it presents as valuable and disvaluable, and the actions it prescribes and proscribes. One particular focus of his attacks here is the centrality of Mitleid (variously translated as “pity” or “compassion”) to the moral codes he sees in his contemporary society. Nietzsche sometimes refers to Christianity as “the religion of pity,” and asserts that “[i]n the middle of our unhealthy modernity, nothing is less healthy than Christian pity” (The Antichrist, 7). But Nietzsche’s critique of pity is not limited to Christianity; indeed, he suggests that the “morality of pity” is really an outgrowth of Christianity, rather than properly part of Christianity itself:

[…] ‘On n’est bon que par la pitié: il faut donc qu’il y ait quelque pitié dans tous nos sentiments’ [one is only good through pity: so there must be some pity in all of our sentiments]—thus says morality today! […] That men today feel the sympathetic, disinterested, generally useful social actions to be the moral actions – this is perhaps the most general effect and conversion which Christianity has produced in Europe: although it was not its intention nor contained in its teaching. (Daybreak, 132)

Nietzsche connects the morality of pity to utilitarian and socialist movements, to thinkers in France influenced by the French revolution, and to Schopenhauer’s moral philosophy. (Interestingly, Nietzsche notes that Plato and Kant, who are elsewhere the target of his attacks on morality, do not hold pity in high esteem—On the Genealogy of Morality, Preface, 5.)

The morality of pity, thinks Nietzsche, is problematic in various ways. It emphasizes the eradication of suffering as the main moral goal—and yet suffering, thinks Nietzsche, is not inherently bad, and can indeed be an impetus to growth and creativity. (Nietzsche himself suffered from ill health throughout his life, and often seems to connect his own intellectual and creative achievements to these experiences.) Pity, thinks Nietzsche, both arises from and exacerbates a “softness of feeling” (On the Genealogy of Morality, Preface, 6), as opposed to the sort of strong and hardy psychological constitution that he admires. The morality of pity also prioritizes the wellbeing of “the herd” over that of those individuals who have the potential to achieve greatness. Some of Nietzsche’s attacks on the morality of pity take the form of a distinctive sort of psychological critique: what presents itself as a concern for the other person in fact has a darker, hidden, and more self-serving motive (see section 1. f., “Psychological critique”). Finally, Nietzsche believes that making pity central to our evaluative worldview will lead humanity towards nihilism (see section 1. g., “The threat of nihilism”).

The German word that Nietzsche uses is Mitleid, which can be translated as “pity” or as “compassion”. Some scholars have sought to emphasize the difference between these two concepts, and to interpret Nietzsche’s attacks on Mitleid through the lens of this distinction (Von Tevenar (2007)). The proposal is that pity focuses its attention on the sufferer’s condition rather than on the sufferer themselves, creating distance between the sufferer and pitier, and as a result can end up tinged with a sense of superiority and contempt on the part of the pitier. Compassion, by contrast, is understood to involve genuine other-regarding concern and thus to foster closeness between the two parties. When we read Nietzsche’s attacks on Mitleid in light of this distinction, some of his objections seem to apply primarily to pity, thus understood, while others seem to take compassion as their main target (see section 1. f., “Psychological critique” and section 1. g. “The threat of nihilism” for some further discussion).

Nietzsche’s various objections to Mitleid stand at the heart of his attack on the content of morality. But, as he explains, his concerns with this concept eventually lead him to a broader set of questions about morality. Nietzsche says:

This problem of the value of pity and of the morality of pity […] seems at first to be only an isolated phenomenon, a lone question mark; but whoever pauses over the question and learns to ask, will find what I found:—that a vast new panorama opens up for him, a possibility makes him giddy, mistrust, suspicion, and fear of every kind spring up, belief in morality, all morality, wavers. (On the Genealogy of Morality, Preface, 6)

More generally, then, Nietzsche holds that various traits, behaviors, and ideals that morality typically holds in high regard—humility, love of one’s neighbor, selflessness, equality, and so on—are all open for critique, and indeed all are on Nietzsche’s view found wanting. These values are, according to Nietzsche, “ascetic” or “life-denying”—they involve a devaluation of earthly existence, and indeed of those parts of human existence, such as struggle, suffering, hardship, and overcoming, that are capable of giving rise to greatness. It may be true that the more people possess the qualities that morality holds in high esteem, the easier and more pleasant life may be for the majority of people. But whether or not this is so does not really matter, for Nietzsche is not concerned with how things are for the majority of people. His interest is primarily in those individuals who have the potential for greatness—those “higher types” who are capable of great deeds and profound creative undertakings. And here, Nietzsche thinks, the characteristic values that morality holds in such esteem are not conducive to the health and flourishing of these individuals.

e. Genealogical Critique

One of the most important and influential components of Nietzsche’s critical project is his attempt to offer a ‘genealogy’ of morality, a certain sort of historical account of its various origins and development over time. This account is offered primarily in On the Genealogy of Morality, though other texts develop similar themes, especially Beyond Good and Evil. In the Genealogy, Nietzsche explicitly connects this historical investigation to his critical project:

[W]e need a critique of moral values, the value of these values should itself, for once, be examined­—and so we need to know about the conditions and circumstances under which the values grew up, developed and changed. (On the Genealogy of Morality, Preface, 6)

Scholars have puzzled over this claim. Why do we need to know about the historical origins of morality in order to assess its value here and now? Indeed, it has seemed to many that Nietzsche is here committing the “genetic fallacy”, wrongly inferring an assessment of a thing’s current meaning or value on the basis of its source or origins. But Nietzsche himself appears to be aware of the fallacy in question (see for example The Gay Science 345), and so we have reason to take seriously the project of the Genealogy and to try to understand it as part of Nietzsche’s critical project.

In fact, there are ways in which a thing’s source or origin can rightly affect our current assessment of it. For example, if you learn that the person who gave you a piece of information is untrustworthy, this does not automatically imply that the information is false, but it does undermine your original justification for accepting it, and gives you reason to reconsider your belief in it. It may be that Nietzsche’s genealogical project works in a similar sort of way. In seeking to understand morality as a historical phenomenon, Nietzsche’s approach already unsettles certain aspects of our understanding of morality’s nature and its claim to authority over us. If we had supposed that morality has a timeless or eternal nature (perhaps because it is bestowed by God, or because it is grounded in something like Plato’s Form of the Good—see section 1. b., “Rejection of an otherworldly basis for value”), then coming to understand it as instead a contingent product of human history and development may give us reason to question our commitment to it. Even if morality is not thereby shown to be bad or false, it does seem to be revealed as something that is properly open to questioning and critique.

Furthermore, part of Nietzsche’s point in developing his genealogical account is that certain human phenomena—here, morality, and its associated concepts and psychological trappings—are essentially historical, in the sense that one will not understand the thing itself as it exists here and now, and thus will not be able to give a proper critique, without understanding how it came to be. (Think of what would be needed for a person to properly understand the phenomenon of racial inequality in the present-day United States, for instance.) To fully comprehend the nature of morality, and thus to get it into view as the object of our critique, thinks Nietzsche, we will need to investigate its origins.

In the First Essay of the Genealogy, “‘Good and Evil,’ ‘Good and Bad,’” Nietzsche charts the emergence of two distinct systems of evaluation. The first is the aristocratic “master morality,” which begins from an evaluation of the aristocratic individual himself as “good,” which here indicates something noble, powerful, and strong. Within this moral code, the contrasting evaluation—“bad”—is largely an afterthought, and points to that which is not noble, namely the lowly, plebeian, ill-born masses. The opposing evaluative system, “slave morality,” develops in reaction to the subjugation of this lower class under the power of the masters. Led by a vengeful priestly caste (which Nietzsche connects to Judaism), this lower class enacts the “slave revolt in morality,” turning the aristocratic moral code on its head. Within the slave moral code, the primary evaluative term is “evil,” and it is applied to the masters and their characteristic traits of strength and power. The term “good” is then given meaning relative to this primary term, so that “good” now comes to mean meek, mild, and servile—qualities which the slave class possess of necessity, but which they now cast as the products of their own free choice. This evaluative system comes along with the promise that justice will ultimately be meted out in the afterlife: those who suffer and are oppressed on earth will receive their reward in heaven, while the evil masters will face an eternity of punishment in hell. In the resulting struggle between the two evaluative systems, it was the slave morality that eventually won out, and it is this moral code that Nietzsche takes to be dominant in the Europe of his day.

In the Second Essay, “‘Guilt,’ ‘Bad Conscience,’ and Related Matters,” Nietzsche explores the origins of the institution of punishment and of the feelings of guilt and bad conscience. Punishment, Nietzsche thinks, originally emerged from the economic idea of a creditor-debtor relationship. The idea eventually arises that an unpaid debt, or more generally an injury of some kind, can be repaid through pain caused to the debtor. It is from this idea that the institution of punishment comes into being. But punishment is not what gives rise to feelings of bad conscience. Instead, the origins of bad conscience, of the feeling of guilt, arise as a result of violent drives that would normally be directed outwards becoming internalized. When individuals come to live together in communities, certain natural violent tendencies must be reined in, and as a result they are turned inwards towards the self. It is the basic drive to assert power over others, now internalized and directed towards the self, that gives rise to the phenomenon of bad conscience.

In the Third Essay, “What Do Ascetic Ideals Mean?,” Nietzsche explores the multiple significances that ascetic ideals have had, and the purposes they have served, for different groups of people, including artists, philosophers, and priests. The diversity of meanings that Nietzsche finds in ascetic ideals is an important component of the account: one of the characteristic features of genealogy as a method of investigation is the idea that the object under scrutiny (the phenomenon of morality, for instance) will not have a single unified essence, meaning, or origin, but will rather be made up of multiple overlapping ideas which themselves change and shift over time. Nonetheless, ascetic ideals share in common the characteristic of being fundamentally life-denying, and thus, on Nietzsche’s account, not conducive to flourishing health. And although the narrative of the Genealogy so far has connected these ideals to the Judeo-Christian worldview and moral code, in the final part of the book we are told that the most recent evolution of the ascetic ideal comes in the form of science, with its unquestioning commitment to the value of truth. Nietzsche’s critique of morality thus leads even further than we might have expected. It is not only the Judeo-Christian moral code, nor even its later secular iterations that are under attack here. Rather, Nietzsche seeks to call into question something that his investigation has revealed to be an outgrowth of this moral code, namely a commitment to the value of truth at all costs. Even practices like science, then, embody the life-denying ascetic ideal; even the value of truth is to be called into question, evaluated—and found wanting.

In general, Nietzsche expects that when we consider the origins of morality that he presents us with, we will find them rather unsavory. For instance, once we realize that morality’s high valuation of pity, selflessness, and so on came to be out of the weakness, spite, and vengefulness of the subjugated slave class, this new knowledge will, Nietzsche hopes, serve to lessen the grip that these values have on us. Even if morality’s dark origins do not in themselves undermine the value of these ideals, the disquiet or even disgust that we may feel in attending to them can do important work in helping us to overcome our affective attachment to morality. Overcoming this attachment will pave the way for a more clear-eyed evaluation of these ideals as they exist today.

Nonetheless, the question remains just how far this sort of historical account can take us in assessing the value of morality itself. Even if the ideal of loving one’s neighbor, for instance, originally emerged out of a rather less wholesome desire for revenge, this seems not to undermine the value of the ideal itself. So long as loving one’s neighbor now does not involve such a desire for revenge, what, really, has been shown to be wrong with it? Nietzsche sometimes seems to be suggesting, however, that the historical origins of morality are not merely something that happened in the past. Instead, the dark motives that originally gave rise to morality have left their traces within our current psychological make-up, so that even today the ideal of loving one’s neighbor retains these elements of cruelty and revenge. (See section 1. f., “Psychological critique.”)

The Genealogy leaves behind a complex legacy. Scholars still disagree about what, exactly, the method of genealogy really is and what it can achieve. Nonetheless, Nietzsche’s approach has proved remarkably influential, perhaps most notably in relation to Foucault, who sought to offer his own genealogical accounts of various phenomena. The Genealogy also stands in a complex relationship to anti-Semitism. Nietzsche’s writings, including the Genealogy, often include remarks highly critical of anti-Semitism and of anti-Semitic movements of his time. Even so, that the book itself deals freely in anti-Semitic tropes and imagery seems undeniable.

f. Psychological Critique

Another distinctive component of Nietzsche’s critical project is his psychological analysis of moral feelings and behavior. Nietzsche frequently attempts to reveal ways in which our self-understanding of supposedly “moral” experiences can be highly inaccurate. Lurking behind seemingly compassionate responses to others, Nietzsche claims, we find a dark underside of self-serving thoughts, and even wanton cruelty. He suggests that feelings of sympathy [Mitgefühl] and compassion [Mitleid, also translated as “pity”] are secretly pleasurable, for we enjoy finding ourselves in a position of power and relative good fortune in relation to others who are suffering. These supposedly selfless, kind, and other-regarding feelings are thus really nothing of the sort.

Nietzsche’s psychological analysis of moral feelings and behaviors echoes the historical analysis he provides in the Genealogy (see section 1. e., “Genealogical critique”). Nietzsche often uses metaphors of “going underground” to represent investigations into the murky historical origins of morality as well as investigations into subconscious parts of the individual or collective psyche. It is not fully clear exactly how the two sorts of investigation are connected for Nietzsche, but he does seem to think that a person’s present psychic constitution can bear the imprint not only of their own personal history but also of historical events, forces, and struggles that affected their ancestors. If this is so, it seems plausible for Nietzsche to suppose that the subconscious motives at work in a person’s psyche could reflect the historical origins that Nietzsche traces for morality more generally, and that an investigation into one could at the same time illuminate the other.

Leaving aside this connection between psychological investigation and genealogy, when it comes to the detail of Nietzsche’s claims about what is really going on in specific instances of seemingly moral feelings, many commentators have found Nietzsche’s psychological assessments to be cuttingly insightful. As Philippa Foot puts it, “Nietzsche, with his devilish eye for hidden malice and self-aggrandizement and for acts of kindness motivated by the wish to still self-doubt, arouses a wry sense of familiarity in most of us”. Nietzsche does seem to have a knack for uncovering hidden motives, and for getting the reader to recognize these less wholesome parts of their own psyche. For instance, describing our responses when someone we admire is suffering, Nietzsche says:

We try to divine what it is that will ease his pain, and we give it to him; if he wants words of consolation, comforting looks, attentions, acts of service, presents—we give them; but above all, if he wants us to suffer at his suffering we give ourselves out to be suffering; in all this, however, we have the enjoyment of active gratitude—which, in short, is benevolent revenge. If he wants and takes nothing whatever from us, we go away chilled and saddened, almost offended […]. From all this it follows that, even in the most favourable case, there is something degrading in suffering and something elevating and productive of superiority in pitying. (Daybreak, 138)

Here, if the reader follows along imaginatively with Nietzsche’s story, they may indeed find themself feeling “chilled and saddened, almost offended” when supposing that the suffering person does not want their help—perhaps they even experience the feeling a split second before they read Nietzsche’s naming of those very feelings. They have been caught in the act, as it were, and made conscious of the secretly self-regarding nature of their supposedly compassionate responses to the suffering of others.

But even supposing that Nietzsche’s observations are correct about a great many real-world instances of purportedly moral phenomena—or even all of them—what sort of objection to morality does this really give us? After all, the problem here does not seem to be with the moral values or ideals themselves. Nietzsche’s objection here does not appear to directly target compassion itself (say) as a moral ideal, but rather the hypocrisy of those who understand themselves and others to be compassionate, but who are in reality anything but. Indeed, in a certain sense, the critique seems to depend on the idea that cruelty and self-serving attitudes are bad, and this evaluation is itself a core component of the morality that Nietzsche is supposed to be attacking.

There are various ways of making sense of Nietzsche’s psychological critique as part of his broader critique of morality. It may be that the uncovering of these hidden motives is merely intended to elicit an initial air of disquiet and an attitude of suspicion towards the whole system of morality—to force us to let go of our comfortable sense that all is well with morality as it currently exists. It seems likely, in addition, that Nietzsche’s main concern is not so much with moral values in the abstract (with the concept of compassion, say), but rather with their concrete historical and psychological reality—and this reality, Nietzsche suggests, is importantly not as it seems. Or perhaps the point is that human nature is always going to be driven by these more malicious feelings, so that a morality that fails to recognize this fact must be grounded in fantasy.

In general, the approach taken in Nietzsche’s psychological analysis of moral behavior seems to take the form of an internal critique. Nietzsche expects his reader to be moved, on the basis of their current evaluative commitments, by his unmasking project: the hypocrisy of a cruel and self-serving tendency that masquerades as kindness and compassion is likely to strike us as distasteful, unappealing, perhaps disgusting. And thus shaken from our initially uncritical approval of what had presented itself as kindness and compassion, we may find ourselves psychologically more disposed to embark on the deeper ‘revaluation’ project that Nietzsche wants us to undertake. When we do so, Nietzsche hopes to persuade us of the disvalue not only of cruel egoism that presents itself as compassion, but indeed of compassion itself as an ideal. For this ideal, he argues, is fundamentally life-denying, and as a result will lead to nihilism (see the following section). (For more on the precise form of Nietzsche’s objections to Mitleid—pity or compassion—see Von Tevenar (2007).)

g. The Threat of Nihilism

Perhaps Nietzsche’s main objection to our current moral outlook is the likelihood that it will lead to nihilism. Nietzsche says:

Precisely here I saw the great danger to mankind, its most sublime temptation and seduction—temptation to what? to nothingness?—precisely here I saw the beginning of the end, standstill, mankind looking back wearily, turning its will against life, and the onset of the final sickness becoming gently, sadly manifest: I understood the morality of compassion, casting around ever wider to catch even philosophers and make them ill, as the most uncanny symptom of our European culture which has itself become uncanny, as its detour to a new Buddhism? To a new Euro-Buddhism? to—nihilism? (On the Genealogy of Morality, Preface, 5)

The Europe of Nietzsche’s day is entering a post-religious age. What his contemporaries do not realize, Nietzsche thinks, is that following the “death of God,” humanity faces an imminent catastrophic loss of any sense of meaning. Nietzsche’s contemporaries have supposed that one can go on endorsing the basic evaluative worldview of the Judeo-Christian moral code in a secular age, by simply excising the supernatural metaphysical underpinnings and then continuing as before. But this, thinks Nietzsche, is not so. Without these underpinnings, the system as a whole will collapse.

The problem does not seem to be exactly the metaethical worry that the absence of a properly robust metaphysical grounding for one’s values might undermine the project of evaluation as such. After all, Nietzsche himself seems happy to endorse various evaluative judgments, and he does not take these to be grounded in any divine or otherworldly metaphysics. (However, see Reginster (2006) for discussion of nihilism as arising from an assumption that value must be so grounded.) Instead, the problem seems to arise from the specific content of our current moral worldview. In particular, as we have seen, this worldview embodies ascetic and life-denying values—human beings’ earthly, bodily existence is given a negative evaluative valence. In the religious version of these ascetic ideals, however, the supernatural component provided a higher purpose: earthly suffering was given meaning through the promise that it would be repaid in the afterlife. Shorn of this higher purpose, morality is left with no positive sense of meaning, and all that remains is the negative evaluation of suffering and earthly existence. The old Judeo-Christian morality thus evolves into a secular “morality of pity,” aiming only at alleviating suffering and discomfort for “the herd.”

In pursuing this negative goal, the morality of pity seeks at the same time to make people more equal—and thus, thinks Nietzsche, more homogenous and mediocre. In Thus Spoke Zarathustra, Nietzsche gives a striking portrayal of the endpoint of this process:

Behold! I show you the last human being.

‘What is love? What is creation? What is longing? What is a star?’—thus asks the last human being, blinking.

Then the earth has become small, and on it hops the last human being, who makes everything small. His kind is ineradicable, like the flea beetle; the last human being lives longest.

‘We invented happiness’—say the last human beings, blinking.

They abandoned the regions where it was hard to live: for one needs warmth. One still loves one’s neighbor and rubs up against him: for one needs warmth.

[…]

One has one’s little pleasure for the day and one’s little pleasure for the night: but one honors health.

‘We invented happiness’ say the last human beings, and they blink. (Thus Spoke Zarathustra, Zarathustra’s Prologue, 5)

The “last human being” (often translated as “last man”) is taken by scholars to be Nietzsche’s clearest representation of the nihilism that threatens to follow from the death of God. Without any sense of higher meaning, and valuing only the eradication of suffering, humanity will eventually become like this, concerned only with comfort, small pleasures, and an easy life. Nietzsche’s dark portrait of the vacuously blinking “last human being” is supposed to fill the reader with horror—if this is where our current moral system is leading us, it seems that we have good reason to join Nietzsche in his project of an attempted “revaluation of all values”.

2. The Positive Project

As we have seen, Nietzsche’s critical project aims to undermine or unsettle our commitment to our current moral values. These values are fundamentally life-denying, and as such they threaten to bring nihilism in the wake of the death of God. In place of this system of values, then, Nietzsche develops an alternative evaluative worldview.

Drawing on a distinction suggested by Bernard Williams, we might usefully characterize Nietzsche’s positive project as broadly “ethical” rather than “moral,” in that it is concerned more generally with questions about how to live and what counts as a good, flourishing, or healthy form of life for an individual, rather than with more narrowly “moral” questions about right and wrong, how one ought to treat others, what one’s obligations are, or when an action deserves punishment or reward. As a result of this focus on health and flourishing, some scholars have characterized Nietzsche’s positive ethical project as a form of virtue ethics.

a. Higher Types

Nietzsche is not, however, interested in developing a general account of what counts as flourishing or health for the human being as such. Indeed, he rejects the idea that there could be such a general account. For human beings are not, according to Nietzsche, sufficiently similar to one another to warrant any sort of one-size-fits-all ethical code. The primary distinction is between two broad character “types”: the so-called “higher” and “lower” types. Nietzsche’s concern in the positive project is to spell out what counts as flourishing for the higher types, and under what conditions this might be achieved.

The distinction between higher and lower types appears to be a matter of one’s basic and unalterable psycho-physical make-up. While Nietzsche sometimes speaks as though all people can be straightforwardly sorted into one or the other category, at other points things seem more complicated: it may be, for example, that certain higher or lower character traits can end up mixed together in a particular individual. Nietzsche does not limit the concept of “higher types” to any particular ethnic or geographic group. He mentions instances of this type occurring in many different societies and in many different parts of the world. The distinction itself seems, in addition, to be largely ahistorical, such that there always have been and (perhaps) always will be higher types.

However, the detail of what the higher type looks like does vary based on the particular historical context. For example, the infamous “blond beasts” mentioned in the Genealogy are likely examples of higher types, but Nietzsche does not advocate a return (even if such were possible) to this cheerfully unreflective mode of existence. In the wake of the slave revolt in morality, human beings have become more complicated and more intellectual, and this development—though problematically shot through with ascetic ideals—has opened up new and more refined modes of existence to the higher types. As a result, the individuals that Nietzsche points to as his contemporary examples of higher types—Goethe, Emerson, and of course Nietzsche himself—tend to express their greatness through intellectual and artistic endeavors rather than through plundering and bloodlust. (Napoleon stands as an exception, although Nietzsche seems to think of him as a striking, and also somewhat startling, throwback to an earlier mode of human existence.)

In general, the “higher type” designation seems to indicate a certain sort of potential that an individual possesses to achieve a certain state of being that Nietzsche takes to be valuable—a potential that may or may not end up being realized. The bulk of Nietzsche’s positive project, then, is concerned with spelling out what this state of being looks like, as well as what circumstances lead to its coming to fruition.

b. Autonomy

In recent years, commentators have focused on the notion of autonomy as a central component of Nietzsche’s ideal for the higher types. The autonomous individual, according to Nietzsche, is characterized primarily by self-mastery, which enables him (it appears, on Nietzsche’s account, to be invariably a “him”) to undertake great and difficult tasks—including, as we have seen, great intellectual and artistic endeavors.

This self-mastery, it seems, is primarily a matter of the arrangement of a person’s “drives”—the various and variously conflicting psychic forces that make up his being. What constitutes an ideal arrangement of drives for Nietzsche is not easy to pin down with precision, but some points seem clear. In the autonomous individual, the drives form a robust sort of a unity, with one or more of the most powerful drives co-opting others into their service, so that the individual is not being pulled in multiple different directions by different competing forces but instead forms a coherent whole. Not all forms of unity, however, will do the job. In Twilight of the Idols, Nietzsche offers a psychological portrait of Socrates, describing the “chaos and anarchy of [Socrates’] instincts” along with the “hypertrophy” of one particular drive—that of reason. In Socrates, according to Nietzsche, reason subjugates and tyrannizes over the other wild and unruly appetites, which are seen as dangerous alien forces that must be suppressed at all costs. The tyranny of reason does impose a unity of sorts, but Nietzsche does not seem impressed by the resulting figure of Socrates, whom he labels as “decadent”. The problem with Socrates’ drive formation may be formal—it may be that one drive merely tyrannizing over the others does not give us the right sort of unity; the controlling drive, we might suppose, ought instead to refine, sublimate, and transform the other drives to redirect them towards its purpose, rather than merely aiming to crush or extirpate them. Alternatively, the problem may be substantive: the issue might not be that one drive tyrannizes, but rather which drive is doing the tyrannizing in the case of Socrates. The tyranny of a less ascetic and life-denying drive might leave us with something that Nietzsche would be happy to think of as genuine self-mastery and hence autonomy. 
(For an interesting discussion of Nietzsche’s account of Socrates’ decadence, including the implicit references made to Plato’s city-soul analogy in the Republic, see Huddleston (2019). For Nietzsche’s drive-based psychology more generally, see Riccardi (2021), and for its relation to Nietzsche’s ideal, see Janaway (2012).)

A point of contention in the literature concerns whether or not the concept of “autonomy” (and related concepts of self-mastery and unity of drive formation) as Nietzsche uses it should be understood as connected to the concept of freedom. There are two related questions on the table here, which ought to be kept separate. The first is whether autonomy itself should be understood as a conception of freedom, so that to be autonomous is to be free in some sense. If so, then it seems that Nietzsche’s positive ethical vision includes freedom as an ideal that can be possessed by certain individuals who are capable of it. The second is whether or to what extent it is up to the individual to bring it about that he becomes autonomous—that is, whether or not the ideal of autonomy is an ideal that a higher type could pursue and achieve through their own agency. Let us consider the two questions in turn.

We have seen already that Nietzsche rejects a certain conception of freedom—the conception of “free will in the superlative metaphysical sense,” as he puts it (see section 1. c., “Attacks on the metaphysical basis of moral agency”). But several scholars have suggested that Nietzsche’s concept of autonomy is intended to offer an alternative picture of freedom, one that is not automatically granted to all as a metaphysical given, but which is rather the possession of the few. Ken Gemes (2009) thus marks a distinction between “deserts free will”—the sort of free will that could ground moral responsibility and thus a concept of desert, and which Nietzsche denies—and “agency free will” or autonomy, which Nietzsche grants certain individuals can come to possess. Several scholars have embraced Gemes’s distinction, and they and others have developed the idea that autonomy as freedom stands as a certain sort of ideal for Nietzsche (see Janaway (2006), May (2009), Richardson (2009), Kirwin (2017)). The thought is roughly that the autonomous individual is “free” because and insofar as he possesses certain sorts of agential abilities: having mastered himself, the autonomous agent is distinctively able to assert his will in the world, to make and honor certain sorts of commitment to himself or to others, to overcome resistance and obstacles to achieve his ends, and so on.

Against this school of thought, other scholars (most notably Brian Leiter) have argued that the picture of the autonomous individual that Nietzsche thinks so highly of does not give us in any meaningful sense a picture of freedom. On this reading, Nietzsche’s overall views on the question of freedom and free will are simple: none of us, not even those self-mastered higher types, can be said to be free. Commentators from this camp do not deny that Nietzsche approves of the individual whose drives form a particularly robust and powerful unity and who is thus “master of himself” and able to assert his will in the world. Their point is simply that these qualities do not amount to the individual’s being free in any meaningful sense.

One passage in particular has proven to be a point of controversy in the literature. In the Genealogy, Nietzsche introduces a character, the “Sovereign Individual,” who is described as the endpoint of a long historical process. The Sovereign Individual, Nietzsche says, is:

Like only to himself, having freed himself from the morality of custom, an autonomous, supra-moral individual (because ‘autonomous’ and ‘moral’ are mutually exclusive), in short, we find a man with his own, independent, enduring will, whose prerogative it is to promise—and in him a proud consciousness, quivering in every muscle, of what he has finally achieved and incorporated, an actual awareness of power and freedom, a feeling that man in general has reached completion. (On the Genealogy of Morality, II:2)

How should we interpret this passage? There are, broadly speaking, three types of reading open to us. On the first, Nietzsche is sincere in his rather bombastic praise of this character, and his talk of freedom here should be taken seriously: that the Sovereign Individual is described as “autonomous” and as in various respects “free” gives us reason to think that Nietzsche really does hold freedom as a positive ideal for the higher types (see Ridley (2009) for one instance of this sort of reading). On the second type of reading, Nietzsche’s praise is sincere, but his talk of “freedom” is in a certain sense disingenuous: it is an instance of “persuasive definition” (the term comes from Charles Stevenson, writing in a different context), in which Nietzsche seeks to use the word ‘freedom’ in rather a different way to its ordinary usage, while at the same time capitalizing on the emotional attachment he can reasonably expect his readers will have to the term (see Leiter (2011)). On the third type of reading, Nietzsche’s praise of this character is given in a sarcastic tone: after all, the main achievement of this “Sovereign Individual” appears to be that he is able to keep his promises and pay his debts; perhaps what we have here is not a genuinely autonomous Nietzschean ideal (whatever that amounts to), but rather just a self-important member of the petty-bourgeoisie (see Rukgaber (2012), Acampora (2006)). Scholars remain divided on the interpretation of this passage in particular, as well as on the general question of whether the ideal that Nietzsche offers of the self-mastered individual, constituted by a robust unity of drives, should be thought of as an ideal of freedom.

We can in addition consider a second question. Granting that Nietzsche does think highly of such an individual, and that autonomy in this sense represents an ethical ideal for Nietzsche, we can ask whether or not it is an ideal that the higher types can consciously aspire to and work towards. Nietzsche sometimes talks of this ideal state as a sort of “achievement,” and some commentators have as a result presented autonomy as something that one can choose to pursue, and thus can through one’s own efforts bring about (can “achieve” in this sense). But this strongly agential reading of the process of coming to be autonomous faces a problem. For this account seems to suggest that one can freely, in some sense, bring it about that one becomes autonomous. But if Nietzsche has a positive picture of what it is to be free (and thus to act freely) at all, that picture seems to be the picture of autonomy, the state that one is here trying to achieve. It would be a mistake, then, to suppose that one can freely pursue and achieve autonomy, since this would be to import an additional illicit concept of freedom into the picture—the freedom one exercises in freely choosing to become autonomous.

A more plausible account, and one that accords more closely with Nietzsche’s texts, would have the process of coming-to-autonomy be something that happens in some sense of its own accord, as a result of the interplay of external circumstance (including multi-generational historical processes) and facts about the individual’s inherent nature. Nietzsche often speaks of the growth of such an individual as occurring like the growth of a seed into a plant: the seed does not choose to grow into a mature plant or pursue it as a conscious goal; rather, if conditions are right, and the seed itself is healthy and well-formed, it will indeed grow and flourish. This, then, is how we should understand the process that results in a higher type’s “achieving” the ideal of autonomy. Whether or not that ideal, once achieved, should properly be thought of as a conception of freedom is a separate question. It does not follow from the fact that a condition is not freely pursued and reached that it cannot, once reached, count as a form of freedom.

c. Authenticity and Self-Creation

As the talk of seeds and plants suggests, a key component of Nietzsche’s positive ideal for the higher types involves a process of development into one’s “proper” or “true” or “natural” form. An acorn, given the right conditions, will grow into a particular type of thing—an oak tree—and as such it will have certain distinctive features: it will grow to a certain height, have leaves of a certain shape, and so on. Even when it was a small acorn, this is the form that is proper to it, to which it is in some sense “destined” to grow. “Destined” here does not mean “guaranteed,” for things may go wrong along the way, and the tree may end up stunted, withered, or barren. Nonetheless, if all goes well, the seed will develop into its proper form. Something like this seems to be what Nietzsche has in mind when he speaks of the importance of “becoming what one is.”

One very interesting feature of Nietzsche’s emphasis on this concept is the connection he draws to another concept that seems to be important to his positive ethical vision, namely the idea that one should “create oneself.” Contrasting himself and other higher types with “the many” who are concerned with “moral chatter,” Nietzsche says:

We, however, want to become who we are—human beings who are new, unique, incomparable, who give themselves laws, who create themselves! (The Gay Science, 335)

These two ideas—becoming who one is and creating oneself—seem on the face of it to stand in some tension with one another. For the notion of becoming who one is implies that one has a particular determinate essential nature, a nature that one will ideally come to fulfil, just as the acorn in the right conditions can grow to reveal its proper and fullest form, that of the oak tree. But the concept of creating oneself, by contrast, seems to conflict with this sort of essence-based destiny. The notions of creation and creativity that Nietzsche invokes here seem to imply that the endpoint of the process is not fixed ahead of time; instead, there seems to be scope for free choice, for different possible outcomes, perhaps even for arbitrariness.

We can bring the two notions into closer alignment by attending to Nietzsche’s own account of artistic creation. Nietzsche rejects the idea that the artist’s approach is one of “laissez-aller”, letting go; instead, he says:

Every artist knows how far removed this feeling of letting go is from his ‘most natural’ state, the free ordering, placing, disposing and shaping in the moment of ‘inspiration’ – he knows how strictly and subtly he obeys thousands of laws at this very moment, laws that defy conceptual formulation precisely because of their hardness and determinateness. (Beyond Good and Evil, 188)

Artistic creation, then, is precisely not about arbitrary choice, but is rather a sort of activity in accordance with necessity. (We can imagine an artist, having been asked why he chose to compose a painting in a particular way, replying: “I didn’t choose it—it had to be that way, otherwise the painting wouldn’t have worked!”) And indeed, immediately following the remark about human beings “creating themselves” in The Gay Science, Nietzsche continues:

To that end we must become the best students and discoverers of everything lawful and necessary in the world: we must become physicists in order to become creators in this sense – while hitherto all valuations and ideals have been built on ignorance of physics or in contradiction to it. (The Gay Science, 335)

Nietzsche wants us to understand the process of creation, then, as intimately connected to notions of necessity and law-governed activity. Just as the great artist is not making arbitrary choices but rather responding to his understanding of the unstated (and unstatable) aesthetic laws that govern how things must be done in this particular instance, so too the process of creation through which one creates oneself is not a matter of arbitrary choice but rather of necessity. What marks out an individual’s development as a process of self-creation will thus depend on whether the necessity derives from his own inner nature or from external sources. If the value system that an individual embraces (for instance) is merely a result of his being molded by his surrounding society, the worldview of which he accepts unquestioningly, then he will not count as having created himself, for his character has been shaped by forces outside of him and not by his own internal nature. If, on the other hand, an individual’s character emerges as a result of his own inner necessities, then he will count as having created himself. As we have already seen in the previous section, the idea will not be that a person makes a conscious choice to “create himself,” then going on to do so, for whether or not this process will take place is not a matter of conscious choice on the part of the individual. Nonetheless, the individual who creates himself has the principle of his own development, and his own character, within himself—within his inner nature. In this way, Nietzsche’s key concepts of authenticity (being who one is) and self-creation do indeed turn out to be intimately connected.

d. Affirmation

Perhaps the most fundamental part of Nietzsche’s positive ethical vision is his notion of “affirmation”. The flourishing individual, according to Nietzsche, will “say yes” to life—he will embrace and celebrate earthly existence, with all its suffering and hardships. Connected to this notion of affirmation are two other key Nietzschean concepts—amor fati, or love of (one’s) fate, and the notion of “eternal recurrence”:

My formula for human greatness is amor fati: that you do not want anything to be different, not forwards, not backwards, not for all eternity. (Ecce Homo, ‘Why I Am So Clever’, 10)

The notion of affirmation should be understood by way of contrast with the worldview of the morality that we have seen under attack in the critical part of Nietzsche’s project. Morality, as we have seen, involves a commitment to “life-denying” values: the earthly reality of human existence, and the suffering and pain it involves, is given a fundamentally negative evaluation, so that the only things that have a positive value are the promise of an afterlife in another world (in the religious iteration of the worldview), and the absence of suffering (in the secular version). The life-denying nature of these values is what threatens a descent into nihilism. Nietzsche’s positive ethical vision, by contrast, calls for an embracing of earthly life, including all of its suffering and pain.

The difficulty of Nietzsche’s ethical demand here should not be underestimated. To truly “say yes” to life, to “love one’s fate,” it is not enough simply to tolerate the difficulties and suffering for the sake of the greatness that comes along with them. Instead, one must actively love all aspects and moments of one’s life—to the extent of willing that one’s whole life, even the lowest lows, be repeated through all eternity. This is the notion of “eternal recurrence” or “eternal return”.

Some of Nietzsche’s unpublished remarks present the notion of eternal recurrence as a cosmological thesis to the effect that time is cyclical, so that everything that has happened will continue to repeat eternally. However, the emphasis within the published works is rather on eternal recurrence as a sort of test of affirmation: the point is to consider how one would react if one learnt that one’s life would repeat eternally—and this is the use of the concept that scholars have for the most part focused on. It is generally agreed that Nietzsche was not claiming that everything will in fact recur eternally.

This notion of eternal recurrence shows up in numerous places in the published works. In the Gay Science, Nietzsche says:

What if some day or night a demon were to steal into your loneliest loneliness and say to you: ‘This life as you now live it and have lived it you will have to live once again and innumerable times again; and there will be nothing new in it, but every pain and every joy and every thought and sigh and everything unspeakably small or great in your life must return to you, all in the same succession and sequence—even this spider and this moonlight between the trees, and even this moment and I myself. […]’ Would you not throw yourself down and gnash your teeth and curse the demon who spoke thus? Or have you once experienced a tremendous moment when you would have answered him: ‘You are a god, and never have I heard anything more divine.’ (The Gay Science, 341)

Eternal recurrence is also the central teaching of the prophet-like figure of Zarathustra in Thus Spoke Zarathustra (compare Nietzsche’s own discussion of Zarathustra in Ecce Homo). However, even Zarathustra himself finds it incredibly difficult to achieve the state of sincerely willing the eternal recurrence. Nietzsche seemed to think that this test of affirmation would be very difficult (perhaps impossible) for people, even truly great individuals, to pass. Nonetheless, this is the state of being that would be genuinely and fully opposed to the life-denying values of morality, and to the nihilism that follows in their wake.

3. References and Further Reading

This article draws primarily on Nietzsche’s published work from the 1880s. References to primary texts within the body of the article are to section numbers rather than page numbers.

a. Primary Texts

  • Daybreak.
  • The Gay Science.
  • Thus Spoke Zarathustra.
  • Beyond Good and Evil.
  • On the Genealogy of Morality.
  • Twilight of the Idols.
  • The Antichrist.
  • Ecce Homo.

b. Secondary Texts

  • Acampora, Christa Davis. “On Sovereignty and Overhumanity: Why It Matters How We Read Nietzsche’s Genealogy II:2.” In Christa Davis Acampora (ed.) Nietzsche’s On the Genealogy of Morals: Critical Essays. Lanham, MD: Rowman & Littlefield, pp. 147–162, 2006.
  • Clark, Maudemarie and David Dudrick. The Soul of Nietzsche’s Beyond Good and Evil. Cambridge: Cambridge University Press, 2012.
  • Foot, Philippa. “Nietzsche’s Immoralism.” In Richard Schacht (ed.) Nietzsche, Genealogy, Morality: Essays on Nietzsche’s On the Genealogy of Morals. Berkeley: University of California Press, 1994.
  • Foot, Philippa. Natural Goodness, Oxford: Oxford University Press, 2001.
  • Gemes, Ken. “Nietzsche on Free Will, Autonomy and the Sovereign Individual”. In Ken Gemes and Simon May (eds.) Nietzsche on Freedom and Autonomy. Oxford, New York: Oxford University Press, pp. 33–50, 2009.
  • Huddleston, Andrew. Nietzsche on the Decadence and Flourishing of Culture. Oxford: Oxford University Press, 2019.
  • Hurka, Thomas. “Nietzsche: Perfectionist.” In Brian Leiter and Neil Sinhababu (eds.), Nietzsche and Morality, Oxford: Oxford University Press, pp. 9–31, 2007.
  • Janaway, Christopher. “Nietzsche on Free Will, Autonomy and the Sovereign Individual.” Aristotelian Society Supplementary Volume 80, pp. 339–357, 2006.
  • Janaway, Christopher. Beyond Selflessness: Reading Nietzsche’s Genealogy. Oxford: Oxford University Press, 2007.
  • Janaway, Christopher. “Nietzsche on Morality, Drives, and Human Greatness.” In Christopher Janaway and Simon Robertson (eds.) Nietzsche, Naturalism, and Normativity. Oxford: Oxford University Press, pp. 183–201, 2012.
  • Katsafanas, Paul. The Nietzschean Self: Moral Psychology, Agency, and the Unconscious. Oxford: Oxford University Press, 2016.
  • Kirwin, Claire. “Pulling Oneself Up by the Hair: Understanding Nietzsche on Freedom.” Inquiry, vol. 61, pp. 82–99, 2017.
  • Leiter, Brian. Nietzsche on Morality, Second Edition, Oxford: Routledge, 2015 (First Edition published as Routledge Philosophy Guidebook to Nietzsche on Morality, Routledge, 2002).
  • Leiter, Brian. “Who Is the ‘Sovereign Individual’? Nietzsche on Freedom.” In Simon May (ed.), Nietzsche’s On the Genealogy of Morality: A Critical Guide. Cambridge: Cambridge University Press, pp. 101–119, 2011.
  • Leiter, Brian. Moral Psychology with Nietzsche, Oxford: Oxford University Press, 2019.
  • May, Simon. “Nihilism and the Free Self.” In Ken Gemes and Simon May (eds.) Nietzsche on Freedom and Autonomy. Oxford, New York: Oxford University Press, pp. 89–106, 2009.
  • May, Simon. (ed.) Nietzsche’s On the Genealogy of Morality: A Critical Guide. Cambridge: Cambridge University Press, 2011.
  • Reginster, Bernard. The Affirmation of Life: Nietzsche on Overcoming Nihilism, Cambridge, MA: Harvard University Press, 2006.
  • Riccardi, Mattia. Nietzsche’s Philosophical Psychology, Oxford: Oxford University Press, 2021.
  • Richardson, John. “Nietzsche’s Freedoms.” In Ken Gemes and Simon May (eds.) Nietzsche on Freedom and Autonomy. Oxford, New York: Oxford University Press, pp. 127–150, 2009.
  • Ridley, Aaron. “What the Sovereign Individual Promises.” In Ken Gemes and Simon May (eds.) Nietzsche on Freedom and Autonomy. Oxford, New York: Oxford University Press, pp. 181–196, 2009.
  • Rukgaber, Matthew. “The ‘Sovereign Individual’ and the ‘Ascetic Ideal’: On a Perennial Misreading of the Second Essay of Nietzsche’s On the Genealogy of Morality.” Journal of Nietzsche Studies, Vol. 43 (2), pp. 213–239, 2012.
  • Von Tevenar, Gudrun. “Nietzsche’s Objections to Pity and Compassion.” In Gudrun von Tevenar (ed.) Nietzsche and Ethics. Bern: Peter Lang, pp. 263–282, 2007.
  • Williams, Bernard. “Nietzsche on Tragedy, by M. S. Silk and J. P. Stern; Nietzsche: A Critical Life, by Ronald Hayman; Nietzsche, vol. 1, The Will to Power as Art, by Martin Heidegger, translated by David Farrell Krell, London Review of Books (1981).” Reprinted in his Essays and Reviews 1959–2002, Princeton: Princeton University Press, 2014.


Author Information

Claire Kirwin
Email: ckirwin@clemson.edu
Clemson University
U. S. A.

Contrary-to-Duty Paradox

A contrary-to-duty obligation is an obligation telling us what ought to be the case if something that is wrong is true. For example: ‘If you have done something bad, you should make amends’. Doing something bad is wrong, but if it is true that you did do something bad, it ought to be the case that you make amends. Here are some other examples: ‘If he is guilty, he should confess’, ‘If you have hurt your friend, you should apologise to her’, ‘If she will not keep her promise to him, she ought to call him’, ‘If the books are not returned by the due date, you must pay a fine’. Alternatively, we might say that a contrary-to-duty obligation is a conditional obligation where the condition (in the obligation) is forbidden, or where the condition is fulfilled only if a primary obligation is violated. In the first example, he should not be guilty; but if he is, he should confess. You should not have hurt your friend; but if you have, you should apologise. She should keep her promise to him; but if she will not, she ought to call him. The books ought to be returned by the due date; but if they are not, you must pay a fine.

Contrary-to-duty obligations are important in our moral and legal thinking. They turn up in discussions concerning guilt, blame, confession, restoration, reparation, punishment, repentance, retributive justice, compensation, apologies, damage control, and so forth. The rationale of a contrary-to-duty obligation is the fact that most of us do neglect our primary duties from time to time and yet it is reasonable to believe that we should make the best of a bad situation, or at least that it matters what we do when this is the case.

We want to find an adequate symbolisation of such obligations in some logical system. However, it has turned out to be difficult to do that. This is shown by the so-called contrary-to-duty (obligation) paradox, sometimes called the contrary-to-duty imperative paradox. The contrary-to-duty paradox arises when we try to formalise certain intuitively consistent sets of ordinary language sentences, sets that include at least one contrary-to-duty obligation sentence, by means of their formal counterparts available in various monadic deontic logics, such as the so-called Standard Deontic Logic and similar systems. In many of these systems the resulting sets are inconsistent in the sense that it is possible to deduce contradictions from them, or else they violate some other intuitively plausible condition, for example that the members of the sets should be independent of each other. This article discusses this paradox and some solutions that have been suggested in the literature.

Table of Contents

  1. The Contrary-to-Duty Paradox
  2. Solutions to the Paradox
    a. Quick Solutions
    b. Operator Solutions
    c. Connective Solutions
    d. Action or Agent Solutions
    e. Temporal Solutions
  3. References and Further Reading

1. The Contrary-to-Duty Paradox

Roderick Chisholm was one of the first philosophers to address the contrary-to-duty (obligation or imperative) paradox (Chisholm (1963)). Since then, many different versions of this puzzle have been mentioned in the literature (see, for instance, Powers (1967), Åqvist (1967, 2002), Forrester (1984), Prakken and Sergot (1996), Carmo and Jones (2002), and Rönnedal (2012, pp. 61–66) for some examples). Here we discuss a particular version of a contrary-to-duty (obligation) paradox that involves promises; we call this example ‘the promise (contrary-to-duty) paradox’. Most of the things we say about this particular example can be applied to other versions. But we should keep in mind that different contrary-to-duty paradoxes might require different solutions.

Scenario I: The promise (contrary-to-duty) paradox (After Prakken and Sergot (1996))

Consider the following scenario. It is Monday and you promise a friend to meet her on Friday to help her with some task. Suppose, further, that you always meet your friend on Saturdays. In this example the following sentences all seem to be true:

N-CTD

N1. (On Monday it is true that) You ought to keep your promise (and see your friend on Friday).

N2. (On Monday it is true that) It ought to be that if you keep your promise, you do not apologise (when you meet your friend on Saturday).

N3. (On Monday it is true that) If you do not keep your promise (that is, if you do not see your friend on Friday and help her out), you ought to apologise (when you meet her on Saturday).

N4. (On Monday it is true that) You do not keep your promise (on Friday).

Let N-CTD = {N1, N2, N3, N4}. N3 is a contrary-to-duty obligation (or expresses a contrary-to-duty obligation). If the condition is true, the primary obligation that you should keep your promise (expressed by N1) is violated. N-CTD seems to be consistent as it does not seem possible to derive any contradiction from this set. Nevertheless, if we try to formalise N-CTD in so-called Standard Deontic Logic, for instance, we immediately encounter some problems. Standard Deontic Logic is a well-known logical system described in most introductions to deontic logic (for example, Gabbay, Horty, Parent, van der Meyden and van der Torre (eds.) (2013, pp. 36–39)). It is basically a normal modal system of the kind KD (Chellas (1980)). In Åqvist (2002) this system is called OK+. For introductions to deontic logic, see Hilpinen (1971, 1981), Wieringa and Meyer (1993), McNamara (2010), and Gabbay et al. (2013). Consider the following symbolisation:

SDL-CTD

SDL1 Ok

SDL2 O(k → ¬a)

SDL3 ¬k → Oa

SDL4 ¬k

O is a sentential operator that takes a sentence as argument and gives a sentence as value. ‘Op’ is read ‘It ought to be (or it should be) the case that (or it is obligatory that) p’. ¬ is standard negation and → standard material implication, well known from ordinary propositional logic. In SDL-CTD, k is a symbolisation of ‘You keep your promise (meet your friend on Friday and help her with her task)’ and a abbreviates ‘You apologise (to your friend for not keeping your promise)’. In this symbolisation SDL1 is supposed to express a primary obligation and SDL3 a contrary-to-duty obligation telling us what ought to be the case if the primary obligation is violated. However, the set SDL-CTD = {SDL1, SDL2, SDL3, SDL4} is not consistent in Standard Deontic Logic. O¬a is entailed by SDL1 and SDL2, and from SDL3 and SDL4 we can derive Oa. Hence, we can deduce the following formula from SDL-CTD: Oa ∧ O¬a (‘It is obligatory that you apologise and it is obligatory that you do not apologise’), which directly contradicts the so-called axiom D, the schema ¬(OA ∧ O¬A). (∧ is the ordinary symbol for conjunction.) ¬(OA ∧ O¬A) is included in Standard Deontic Logic (usually as an axiom). Clearly, this sentence rules out explicit moral dilemmas. Since N-CTD seems to be consistent, while SDL-CTD is inconsistent, something must be wrong with our formalisation, with Standard Deontic Logic or with our intuitions. In a nutshell, this puzzle is the contrary-to-duty (obligation) paradox.
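
This semantic inconsistency can be checked mechanically. The following Python sketch (my own illustration; the function names are not drawn from the deontic logic literature) enumerates all pointed Kripke models with up to three worlds whose accessibility relation is serial, the frames that validate axiom D, and confirms that none satisfies SDL1–SDL4 together:

```python
from itertools import product

def O(p, w, R, worlds):
    """Op holds at w iff p holds at every world accessible from w."""
    return all(p(u) for u in worlds if (w, u) in R)

def satisfies_sdl_ctd(worlds, R, val, w0):
    # val[u] = (truth value of k at u, truth value of a at u)
    k = lambda u: val[u][0]
    a = lambda u: val[u][1]
    return (O(k, w0, R, worlds)                                       # SDL1: Ok
            and O(lambda u: not k(u) or not a(u), w0, R, worlds)      # SDL2: O(k -> ~a)
            and (k(w0) or O(a, w0, R, worlds))                        # SDL3: ~k -> Oa
            and not k(w0))                                            # SDL4: ~k

def exists_model(n):
    """Search every pointed model with n worlds and a serial relation."""
    worlds = range(n)
    pairs = [(w, u) for w in worlds for u in worlds]
    for bits in product([0, 1], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        if not all(any((w, u) in R for u in worlds) for w in worlds):
            continue  # axiom D corresponds to seriality of R
        for val in product([(x, y) for x in (0, 1) for y in (0, 1)], repeat=n):
            for w0 in worlds:
                if satisfies_sdl_ctd(worlds, R, val, w0):
                    return True
    return False

print(any(exists_model(n) for n in (1, 2, 3)))  # → False
```

The search comes up empty for the same reason the syntactic derivation goes through: any world accessible from the actual world would have to make k, ¬a, and a true at once, and seriality guarantees that at least one such world exists.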

2. Solutions to the Paradox

Many different solutions to the contrary-to-duty paradox have been suggested in the literature. We can try to find some alternative formalisation of N-CTD, we can try to develop some other kind of deontic logic or we can try to show why at least some of our intuitions about N-CTD are wrong. The various solutions can be divided into five categories: quick solutions, operator solutions, connective solutions, action or agent solutions, and temporal solutions, and these categories can be divided into several subcategories. Various answers to the puzzle are often presented as general solutions to all different kinds of contrary-to-duty paradoxes; and if some proposal takes care of all the different kinds, this is a strong reason to accept this solution. Having said that, it might be the case that the same approach cannot be used to solve all kinds of contrary-to-duty paradoxes.

a. Quick Solutions

In this section, we consider some quick responses to the contrary-to-duty paradox. There are at least three types of replies of this kind: (1) We can reject some axiom schemata or rules of inference in Standard Deontic Logic that are necessary to derive our contradiction. (2) We can try to find some alternative formalisation of N-CTD in monadic deontic logic. (3) We can bite the bullet and reject some of the original intuitions that seem to generate the paradox in the first place.

Few people endorse any of these solutions. Still, it is interesting to say a few words about them since they reveal some of the problems with finding an adequate symbolisation of contrary-to-duty obligations. If possible, we want to be able to solve these problems.

One way of avoiding the contrary-to-duty paradox in monomodal deontic systems is to give up the axiom D, ¬(OA ∧ O¬A) (‘It is not the case that it is obligatory that A and obligatory that not-A’). Without this axiom (or something equivalent), it is no longer possible to derive a contradiction from SDL1−SDL4. In the so-called smallest normal deontic system K (Standard Deontic Logic without the axiom D), for instance, SDL-CTD is consistent. Some might think that there are independent reasons for rejecting D since they think there are, or could be, genuine moral dilemmas. Yet, even if this were true (which is debatable), rejecting D does not seem to be a good solution to the contrary-to-duty paradox for several reasons.

Firstly, even if we reject axiom D, it is problematic to assume that a dilemma follows from N-CTD. We can still derive the sentence Oa ∧ O¬a from SDL-CTD in every normal deontic system, which says that it is obligatory that you apologise and it is obligatory that you do not apologise. And this proposition does not seem to follow from N-CTD. Ideally, we want our solution to the paradox to be dilemma-free in the sense that it is not possible to derive any dilemma of the form OA ∧ O¬A from our symbolisation of N-CTD.

Secondly, in every so-called normal deontic logic (even without the axiom D), we can derive the conclusion that everything is both obligatory and forbidden if there is at least one moral dilemma. This follows from the fact that FA (‘It is forbidden that A’) is equivalent to O¬A (‘It is obligatory that not-A’) and the fact that Oa ∧ O¬a entails Or for any r in every normal deontic system. This is clearly absurd. N-CTD does not seem to entail that everything is both obligatory and forbidden. Everything else equal, we want our solution to the contrary-to-duty paradox to avoid this consequence.

Thirdly, such a solution still has problems with the so-called pragmatic oddity (see below, this section).

In monomodal deontic logic, for instance Standard Deontic Logic, we can solve the contrary-to-duty paradox by finding some other formalisation of the sentences in N-CTD. Instead of SDL2 we can use k → O¬a and instead of SDL3 we can use O(¬k → a). Then we obtain three consistent alternative symbolisations of N-CTD. Nonetheless, these alternatives are not non-redundant (a set of sentences is non-redundant only if no member in the set follows from the rest). O(¬k → a) follows from Ok in every so-called normal deontic logic, including Standard Deontic Logic, and k → O¬a follows from ¬k by propositional logic. But, intuitively, N3 does not appear to follow from N1, and N2 does not appear to follow from N4. N-CTD seems to be non-redundant in that it seems to be the case that no member of this set is derivable from the others. Therefore, we want our symbolisation of N-CTD to be non-redundant.
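
The first of these redundancies can likewise be verified semantically. The sketch below (again my own illustration, using the same brute-force Kripke-model setup) confirms that every small pointed model satisfying Ok also satisfies O(¬k → a):

```python
from itertools import product

def O(p, w, R, worlds):
    """Op holds at w iff p holds at every world accessible from w."""
    return all(p(u) for u in worlds if (w, u) in R)

def entails_in_small_models(n):
    """Check, over every pointed model with n worlds, that Ok forces
    O(~k -> a), i.e. the redundancy noted in the text."""
    worlds = range(n)
    pairs = [(w, u) for w in worlds for u in worlds]
    for bits in product([0, 1], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        for val in product([(x, y) for x in (0, 1) for y in (0, 1)], repeat=n):
            k = lambda u: val[u][0]
            a = lambda u: val[u][1]
            for w0 in worlds:
                # ~k -> a is equivalent to k v a
                if O(k, w0, R, worlds) and not O(lambda u: k(u) or a(u), w0, R, worlds):
                    return False
    return True

print(entails_in_small_models(2))  # → True
```

The check is immediate: if every accessible world makes k true, then every accessible world makes k ∨ a true, which is why the entailment holds in all normal deontic logics, serial or not.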

The so-called pragmatic oddity is a problem for many possible solutions to the contrary-to-duty paradox, including our original symbolisation in Standard Deontic Logic, that is, SDL-CTD, the same symbolisation in the smallest normal deontic system K, and the one that uses k → O¬a instead of O(k → ¬a). In every normal deontic logic (with or without the axiom D), it is possible to derive the following sentence from SDL-CTD: O(k ∧ a), which says that it is obligatory that you keep your promise and apologise (for not keeping your promise). Several solutions that use bimodal alethic-deontic logic or counterfactual deontic logic (see Section 2c) as well as Castañeda’s solution (see Section 2d), for instance, also have this problem. The sentence O(k ∧ a) is not inconsistent, but it is certainly very odd, and it does not appear to follow from N-CTD that you should keep your promise and apologise. Hence, we do not want our formalisation of N-CTD to entail this counterintuitive conclusion or anything similar to it.
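
The pragmatic oddity can also be exhibited semantically. In the system K, where seriality is not required, SDL-CTD is satisfiable, but, as the following sketch (my own, over models of up to two worlds) indicates, every pointed model of SDL-CTD also satisfies O(k ∧ a):

```python
from itertools import product

def O(p, w, R, worlds):
    """Op holds at w iff p holds at every world accessible from w."""
    return all(p(u) for u in worlds if (w, u) in R)

def oddity_in_small_models(n):
    """True iff SDL-CTD has at least one pointed model with n worlds (no
    seriality demanded, as in system K) and every such model satisfies
    O(k & a), the pragmatic oddity."""
    worlds = range(n)
    pairs = [(w, u) for w in worlds for u in worlds]
    found_model = False
    for bits in product([0, 1], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        for val in product([(x, y) for x in (0, 1) for y in (0, 1)], repeat=n):
            k = lambda u: val[u][0]
            a = lambda u: val[u][1]
            for w0 in worlds:
                if (O(k, w0, R, worlds)                                   # SDL1
                        and O(lambda u: not k(u) or not a(u), w0, R, worlds)  # SDL2
                        and (k(w0) or O(a, w0, R, worlds))                # SDL3
                        and not k(w0)):                                   # SDL4
                    found_model = True
                    if not O(lambda u: k(u) and a(u), w0, R, worlds):
                        return False
    return found_model

print(oddity_in_small_models(2))  # → True
```

In fact the only K-models of SDL-CTD are ones in which the actual world has no accessible worlds at all, so O(k ∧ a) holds vacuously, which is one way of seeing why dropping axiom D does not remove the oddity.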

One final quick solution is to reject some intuition. The set of sentences N-CTD in natural language certainly seems to be consistent and non-redundant, it seems to be dilemma-free, and it does not seem to entail the pragmatic oddity or the proposition that everything is both obligatory and forbidden. One possible solution to the contrary-to-duty paradox, then, obviously, is to reject some of these intuitions about this set. If it is not consistent and non-redundant, for instance, there is nothing puzzling about the fact that our set of formalised sentences (for example SDL-CTD) lacks one or both of these properties. In fact, if this is the case, the symbolisation should be inconsistent and/or redundant.

The problem with this solution is, of course, that our intuitions seem reliable. N-CTD clearly seems to be consistent, non-redundant, and so forth. And we do not appear to have any independent reasons for rejecting these intuitions. It might be the case that sometimes when we use contrary-to-duty talk, we really are inconsistent or redundant, for instance. Still, that does not mean that we are always inconsistent or redundant. If N-CTD or some other set of this kind is consistent, non-redundant, and so on, we cannot use this kind of solution to solve all contrary-to-duty paradoxes. Furthermore, it seems that we should not reject our intuitions if there is some better way to solve the contrary-to-duty paradox. So, let us turn to the other solutions. (For more information on quick solutions to the contrary-to-duty paradox, see Rönnedal (2012, pp. 67–98).)

b. Operator Solutions

We shall begin by considering the operator solution. The basic idea behind this kind of solution is that the contrary-to-duty paradox, in some sense, involves different kinds of obligations or different kinds of ‘ought-statements’. Solutions of this type have, for example, been discussed by Åqvist (1967), Jones and Pörn (1985), and Carmo and Jones (2002).

In Standard Deontic Logic a formula of the form OA ∧ O¬A is derivable from SDL-CTD; but OA ∧ O¬A is not consistent with the axiom D. If, however, there are different kinds of obligations, symbolised by distinct obligation operators, it may be possible to formalise our contrary-to-duty scenarios so as to avoid a contradiction. Suppose, for example, that there are two obligation operators O1 and O2 that represent ideal and actual obligations, respectively. Then, it is possible that instead of O¬a ∧ Oa we may derive the formula O1¬a ∧ O2a from the symbolisation of our scenarios. But O1¬a ∧ O2a is not inconsistent with the axiom D; O1¬a ∧ O2a says that it is ‘ideally-obligatory’ that you do not apologise and it is ‘actually-obligatory’ that you apologise. If we cannot derive any formula of the form OA ∧ O¬A with one and the same obligation operator in both conjuncts, it is no longer possible to derive a contradiction from our formalisation. Furthermore, such a solution seems to be dilemma-free, and it does not seem to be possible to derive the conclusion that everything is both obligatory and forbidden from a set of sentences that introduces different kinds of obligations.
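The consistency claim can be witnessed by a concrete model. The bimodal model below is a hand-built illustration of our own (it is not taken from any of the cited authors): with two serial accessibility relations, one per obligation operator, O1¬a ∧ O2a is satisfiable, whereas Oa ∧ O¬a for a single operator obeying D is not.

```python
# A tiny bimodal Kripke model: R1 interprets the ideal obligation operator
# O1, R2 the actual obligation operator O2. Both relations are serial, so
# the axiom D holds for each operator separately.
worlds = ['w0', 'w1', 'w2']
R1 = {'w0': ['w1'], 'w1': ['w1'], 'w2': ['w2']}  # ideal accessibility
R2 = {'w0': ['w2'], 'w1': ['w1'], 'w2': ['w2']}  # actual accessibility
V = {'w0': {'a': False}, 'w1': {'a': False}, 'w2': {'a': True}}

def O(R, phi, w):
    # An obligation operator evaluated over its own accessibility relation.
    return all(phi(v) for v in R[w])

ideal_not_a = O(R1, lambda v: not V[v]['a'], 'w0')   # O1¬a holds at w0
actual_a = O(R2, lambda v: V[v]['a'], 'w0')          # O2a holds at w0
serial = all(R1[w] and R2[w] for w in worlds)        # axiom D for both
print(ideal_not_a and actual_a and serial)  # True
```

Because the two operators quantify over different sets of accessible worlds, no single world has to verify both a and ¬a, which is exactly how the operator solution escapes the contradiction.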

An example: Carmo and Jones’s operator solution

Perhaps the most sophisticated version of this kind of solution is presented by Carmo and Jones (2002). Let us now discuss their answer to the contrary-to-duty paradox to illustrate this basic approach. To understand their view, we must first explain some formal symbols. Carmo and Jones use a dyadic, conditional obligation operator O(…/…) to represent conditional obligations. Intuitively, ‘O(B/A)’ says that in any context in which A is a fixed or unalterable fact, it is obligatory that B, if this is possible. They use two kinds of monadic modal operators: ⊡ and ⟐, and □ and ◇. Intuitively, ⊡ is intended to capture that which—in a particular situation—is actually fixed, or unalterable, given (among other factors) what the agents concerned have decided to do and not to do. So, ⊡A says that it is fixed or unalterable that A. ⟐ is the dual (possibility operator) of ⊡. Intuitively, □ is intended to capture that which—in a particular situation—is not only actually fixed, but would still be fixed even if different decisions had been made, by the agents concerned, regarding how they were going to behave. So, □A says that it is necessary, fixed or unalterable that A, no matter what the agents concerned intend to do or not to do. ◇ is the dual (possibility operator) of □. They also introduce two kinds of derived obligation sentences, OaB and OiB, pertaining to actual obligations and ideal obligations, respectively. OaB is read ‘It is actually obligatory that B’ or ‘It actually ought to be the case that B’, and OiB is read ‘It is ideally obligatory that B’ or ‘It ideally ought to be the case that B’. T is (the constant) Verum; it is equivalent to some logically true sentence (such as, it is not the case that p and not-p). In short, we use the following symbols:

O(B/A) In any context in which A is fixed, it is obligatory that B, if this is possible.

OaB It is actually obligatory that B.

OiB It is ideally obligatory that B.

⟐A It is actually possible that A.

◇A It is potentially possible that A.

⊡A It is not actually possible that not-A.

□A It is not potentially possible that not-A.

T Verum

Before we consider Carmo and Jones’s actual solution to the contrary-to-duty paradoxes, let us say a few words about the formal properties of various sentences in their language. For more on the syntax and semantics of Carmo and Jones’s system, see Carmo and Jones (2002). □ (and ◇) is a normal modal operator of kind KT, and ⊡ (and ⟐) is a normal modal operator of kind KD (Chellas (1980)). □A is stronger than ⊡A, and ⟐A is stronger than ◇A. There is, according to Carmo and Jones, an intimate conceptual connection between the two notions of derived obligation, on the one hand, and the two notions of necessity/possibility, on the other. The system includes ⊡(A ↔ B) → (OaA ↔ OaB) and □(A ↔ B) → (OiA ↔ OiB), for example. The system also contains the following restricted forms of so-called factual detachment: (O(B/A) ∧ ⊡A ∧ ⟐B ∧ ⟐¬B) → OaB, and (O(B/A) ∧ □A ∧ ◇B ∧ ◇¬B) → OiB. We can now symbolise N-CTD in the following way:

O-CTD

O1 O(k/T)

O2 O(¬a/k)

O3 O(a/¬k)

O4 ¬k

We use the same propositional letters as in Section 1. Furthermore, we assume that the following ‘facts’ hold: ⊡¬k, ◇(k ∧ ¬a), ◇(k ∧ a), ¬a, ⟐a and ⟐¬a. In other words, we assume that you decide not to keep your promise, but that it is potentially possible for you to keep your promise and not apologise and potentially possible for you to keep your promise and apologise, and that you have not in fact apologised, although it is still actually possible that you apologise and actually possible that you do not apologise. From this, we can derive the following sentences in Carmo and Jones’s system: Oi(k ∧ ¬a) and Oaa; that is, ideally it ought to be that you keep your promise (and help your friend) and do not apologise, but it is actually obligatory that you apologise. Furthermore, the obligation to keep your promise is violated and the ideal obligation to keep your promise and not apologise is also violated. Still, we cannot derive any contradiction. From Oi(k ∧ ¬a) we cannot derive any actual obligation not to apologise. Consequently, we can avoid the contrary-to-duty paradox.
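The detachment of the actual obligation to apologise can be sketched schematically. The code below is our own bookkeeping illustration, not Carmo and Jones's formal system: it applies the actual-obligation detachment schema (from O(B/A), ⊡A, ⟐B and ⟐¬B, infer OaB) to O-CTD under the facts assumed above, with formulas represented as plain strings.

```python
# Schematic application of Carmo and Jones's actual-obligation detachment
# rule to O-CTD. Formulas are strings; the sets encode the assumed facts.
def neg(p):
    return p[1:] if p.startswith('¬') else '¬' + p

def detach_actual(cond_obligations, settled, actually_possible):
    """Return the Bs for which OaB is detached: O(B/A) with A settled (⊡A)
    and both B and ¬B still actually possible (⟐B and ⟐¬B)."""
    return {B for (B, A) in cond_obligations
            if A in settled
            and B in actually_possible
            and neg(B) in actually_possible}

cond = {('k', 'T'), ('¬a', 'k'), ('a', '¬k')}   # O1-O3 of O-CTD
settled = {'T', '¬k'}                           # ⊡T, ⊡¬k
act_poss = {'a', '¬a', '¬k'}                    # ⟐a, ⟐¬a, ⟐¬k
print(detach_actual(cond, settled, act_poss))   # {'a'}: only Oaa detaches
```

O1 does not detach an actual obligation because, once ⊡¬k holds, k is no longer actually possible; O2 does not detach because k is not settled. Only O3 fires, which mirrors the derivation of Oaa in the text.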

Arguments for Carmo and Jones’s operator solution

According to Carmo and Jones, any adequate solution to the contrary-to-duty paradox should satisfy certain requirements. The representation of N-CTD (and similar sets of sentences) should be: (i) consistent, and (ii) non-redundant, in the sense that the formalisations of the members of N-CTD should be logically independent. The solution should be (iii) applicable to (at least apparently) action- and timeless contrary-to-duty examples (see Section 2d and Section 2e for some examples). (iv) The logical structures of the two conditional obligations in N-CTD (and similar sets of sentences) should be analogous. Furthermore, we should have (v) the capacity to derive actual and (vi) ideal obligations (from (the representation of) N-CTD), (vii) the capacity to represent the fact that a violation of an obligation has occurred, and (viii) the capacity to avoid the pragmatic oddity (see Section 2a above for a description of this problem). Finally, (ix) the assignment of logical form to a sentence in a contrary-to-duty scenario should be independent of the assignment of logical form to the other sentences. Carmo and Jones’s solution satisfies all of these requirements. This is a good reason to accept their approach. Nevertheless, there are also some serious problems with the suggested solution. We now consider two puzzles.

Arguments against Carmo and Jones’s operator solution

Even though Carmo and Jones’s operator solution is quite interesting, it has not generated much discussion. In this section, we consider two arguments against their solution that have not been mentioned in the literature.

Argument 1. Carmo and Jones postulate several different unconditional operators. But ‘ought’ (and ‘obligatory’) does not seem to be ambiguous in the sense the solution suggests. The derived ‘ideal’ obligation to keep the promise and not to apologise does not seem to be of another kind than the derived ‘actual’ obligation to apologise. The ‘ideal’ obligation is an ordinary unconditional obligation to keep your promise and not apologise that holds as long as it is still possible for you to keep your promise and not apologise. And the ‘actual’ obligation is an ordinary unconditional obligation that becomes ‘actual’ as soon as it is settled that you will not keep your promise. Both obligations are unconditional and both obligations are action guiding. The ‘ought’ in the sentence ‘You ought to keep your promise and not apologise’ does not have another meaning than the ‘ought’ in the sentence ‘You ought to apologise’. The only difference between the obligations is that they are in force at different times. Or, at least, so it seems. Furthermore, if the conditional obligation sentences N2 and N3 should be symbolised in the same way, if they have the same logical form, as Carmo and Jones seem to think, it also seems reasonable to assume that the derived unconditional obligation sentences should be symbolised by the same kind of operator.

Argument 2. Carmo and Jones speak about two kinds of obligations: actual obligations and ideal obligations. But it is unclear which of these, if either, they think is action guiding. We have the following alternatives:

(i) Both actual and ideal obligations are action guiding.

(ii) Neither actual nor ideal obligations are action guiding.

(iii) Ideal but not actual obligations are action guiding.

(iv) Actual but not ideal obligations are action guiding.

Yet, all of these alternatives are problematic. It seems that (i) cannot be true. For in Carmo and Jones’s system, we can derive Oi(k ∧ ¬a) and Oaa from the symbolisation of N-CTD. Still, there is no possible world in which it is true both that you keep your promise and not apologise and that you apologise. How, then, can both actual and ideal obligations be action guiding? If we assume that neither actual nor ideal obligations are action guiding, we can avoid this problem, but then the value of Carmo and Jones’s solution is seriously limited. We want, in every situation, to know what we (actually) ‘ought to do’ in a sense of ‘ought to do’ that is action guiding. Nevertheless, according to (ii), neither ideal nor actual obligations are action guiding. In this reading of the text, Carmo and Jones’s system cannot give us any guidance; it does not tell us what we ‘ought to do’ in what seems to be the most interesting sense of this expression. True, the solution does say something about ideal and actual obligations, but why should we care about that? So, (ii) does not appear to be defensible. If it is the ideal and not the actual obligations that are supposed to be action guiding, it is unclear what the purpose of speaking about ‘actual’ obligations is. If actual obligations are supposed to have no influence on our behaviour, they seem to be redundant and serve no function. Moreover, if this is true, why should we call obligations of this kind ‘actual’? Hence, (iii) does not appear to be true either. The only reasonable alternative, therefore, seems to be to assume that it is the actual and not the ideal obligations that are action guiding. Yet, this assumption is also problematic, since it has some counterintuitive consequences. If you form the intention not to keep your promise, if you decide not to help your friend, your actual obligation is to apologise, according to Carmo and Jones. 
You have an ideal obligation to keep your promise and not apologise, but this obligation is not action guiding. So, it is not the case that you ought to keep your promise and not apologise in a sense that is supposed to have any influence on your behaviour. However, intuitively, it seems to be true that you ought to keep your promise and not apologise as long as you still can keep your promise; as long as this is still (potentially) possible, this seems to be your ‘actual’ obligation, the obligation that is action guiding. As long as you can help your friend (and not apologise), you do not seem to have an actual (action-guiding) obligation to apologise. The fact that you have decided not to keep your promise does not take away your (actual, action-guiding) obligation to keep your promise (and not apologise); you can still change your mind. We cannot avoid our obligations just by forming the intention not to fulfil them. This would make it too easy to get rid of one’s obligations. Consequently, it seems that (iv) is not true either. And if this is the case, Carmo and Jones’s solution is in deep trouble, despite its many real virtues.

c. Connective Solutions

We turn now to our second category of solutions to the contrary-to-duty paradox. In Section 1, we interpreted the English construction ‘if, then’ as material implication. But there are many other possible readings of this expression. According to the connective solutions to the contrary-to-duty paradox, ‘if, then’ should be interpreted in some other way, not as a material implication. The category includes at least four subcategories: (1) the modal (or strict implication) solution according to which ‘if, then’ should be interpreted as strict or necessary implication; (2) the counterfactual (or subjunctive) solution according to which ‘if, then’ should be interpreted as some kind of subjunctive or counterfactual conditional; (3) the non-monotonic solution according to which we should use some kind of non-monotonic logic to symbolise the expression ‘if, then’; and (4) the (primitive) dyadic deontic solution according to which we should develop a new kind of dyadic deontic logic with a primitive, two-place sentential operator that can be used to symbolise conditional norms.

According to the first solution, which we call the modal solution, ‘if, then’ should be interpreted as strict, that is, necessary implication, not as material implication. N2 should, for example, be symbolised in the following way: k => O¬a (or perhaps as O(k => ¬a)), and N3 in the following way: ¬k => Oa (or perhaps as O(¬k => a)), where => stands for strict implication and the propositional letters are interpreted as in Section 1. A => B is logically equivalent to □(A → B) in most modal systems. □ is a sentential operator that takes one sentence as argument and gives one sentence as value. ‘□A’ says that it is necessary that A. The set {Ok, k => O¬a, ¬k => Oa, ¬k} is consistent in some alethic deontic systems (systems that combine deontic and modal logic). So, if we use this symbolisation, it might be possible to avoid the contrary-to-duty paradox. A solution of this kind is discussed by Mott (1973), even though Mott seems to prefer the counterfactual solution. For more on this kind of approach and for some problems with it, see Rönnedal (2012, pp. 99–102).
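The consistency of this set can be witnessed by a small model. The alethic-deontic model below is a hand-built illustration of our own (not taken from Mott or Rönnedal): reading => as strict implication (truth of the material conditional at every world) and O over a serial deontic relation, all four sentences hold at w0.

```python
# A hand-built alethic-deontic model in which {Ok, k => O¬a, ¬k => Oa, ¬k}
# all hold at w0, with => read as strict implication.
worlds = ['w0', 'w1', 'w2']
V = {'w0': {'k': False, 'a': False},
     'w1': {'k': True,  'a': False},
     'w2': {'k': True,  'a': True}}
R = {'w0': ['w2'], 'w1': ['w1'], 'w2': ['w1']}   # serial deontic relation

def O(phi, w):
    return all(phi(v) for v in R[w])

def strict(antecedent, consequent):
    # A => B, i.e. □(A → B): at every world, A implies B.
    return all((not antecedent(w)) or consequent(w) for w in worlds)

Ok = O(lambda v: V[v]['k'], 'w0')                                          # Ok
N2 = strict(lambda w: V[w]['k'], lambda w: O(lambda v: not V[v]['a'], w))  # k => O¬a
N3 = strict(lambda w: not V[w]['k'], lambda w: O(lambda v: V[v]['a'], w))  # ¬k => Oa
N4 = not V['w0']['k']                                                      # ¬k at w0
print(all([Ok, N2, N3, N4]))  # True: the set is satisfiable
```

Note that in this model the only world deontically accessible from w0 verifies both k and a, which already hints at why the modal solution faces the pragmatic oddity.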

According to the second solution, the counterfactual solution, the expression ‘if, then’ should be interpreted as some kind of counterfactual or subjunctive implication. Mott (1973) and Niles (1997), for example, seem to defend a solution of this kind, while Tomberlin (1981) and Decew (1981), for instance, criticise it. We say more about the counterfactual solution below (in this section).

According to the third solution, the non-monotonic solution, we should use some kind of non-monotonic logic to symbolise the expression ‘if, then’. A solution of this kind has been discussed by Bonevac (1998). Bonevac introduces a new, non-monotonic, defeasible or generic conditional, >, a sentential operator that takes two sentences as arguments and gives one sentence as value. A > B is true in a possible world, w, if and only if B holds in all A-normal worlds relative to w. This conditional does not support ordinary modus ponens, that is, B does not follow from A and A > B. It only satisfies defeasible modus ponens: B follows non-monotonically from A and A > B in the absence of contrary information. If we symbolise N2 as O(k > ¬a) (or perhaps as k > O¬a), and N3 as ¬k > Oa (and N1 and N4 as in SDL-CTD), we can no longer derive a contradiction from this set in Bonevac’s system. O¬a follows non-monotonically from Ok and O(k > ¬a), and Oa follows non-monotonically from ¬k and ¬k > Oa. But from {Ok, O(k > ¬a), ¬k > Oa, ¬k} we can only derive Oa non-monotonically. According to Bonevac, so-called factual detachment takes precedence over so-called deontic detachment. Hence, we can avoid the contrary-to-duty paradox.
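The interplay of the two detachment principles can be sketched as follows. This is our own schematic illustration, not Bonevac's formal system: factual detachment fires whenever its antecedent is a fact, and a deontic detachment is withdrawn when it conflicts with a factual one.

```python
# A schematic sketch of non-monotonic detachment in which factual
# detachment takes precedence over deontic detachment.
def neg(p):
    return p[1:] if p.startswith('¬') else '¬' + p

def nonmonotonic_obligations(unconditional, wide, narrow, facts):
    """unconditional: contents of O-formulas, e.g. {'k'} for Ok.
    wide: pairs (A, B) for O(A > B), supporting deontic detachment of OB.
    narrow: pairs (A, B) for A > OB, supporting factual detachment of OB."""
    factual = {B for (A, B) in narrow if A in facts}
    deontic = {B for (A, B) in wide if A in unconditional}
    # Withdraw any deontic detachment that conflicts with a factual one.
    deontic = {B for B in deontic if neg(B) not in factual}
    return factual | deontic

derived = nonmonotonic_obligations(
    unconditional={'k'},        # Ok
    wide={('k', '¬a')},         # O(k > ¬a)
    narrow={('¬k', 'a')},       # ¬k > Oa
    facts={'¬k'})               # ¬k
print(derived)  # {'a'}: Oa survives, the conflicting O¬a is withdrawn
```

In this toy reconstruction the deontic detachment of O¬a is defeated by the factual detachment of Oa, matching the claim that only Oa is derivable non-monotonically from the whole set.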

A potential problem with this kind of solution is that it is not obvious that it can explain the difference between violation and defeat. If you will not see your friend and help her out, the obligation to keep your promise will be violated. It is not the case that this obligation is defeated, overridden or cancelled. The same seems to be true of the derived obligation that you should not apologise. If you do apologise, the derived (unconditional) obligation that you should not apologise is violated. It is not the case that one of the conditional norms in N-CTD defeats or overrides the other. Nor is it the case that they cancel each other out. Or, at least, so it seems. Ideally, we want our solution to reflect the idea that the primary obligation in a contrary-to-duty paradox has been violated and not defeated. Likewise, we want to be able to express the idea that the derived unconditional obligation not to apologise has been violated if you apologise. However, according to Bonevac, we cannot derive O¬a from {Ok, O(k > ¬a), ¬k > Oa, ¬k}, not even non-monotonically. This approach to the contrary-to-duty paradoxes does not appear to have generated much discussion. But the non-monotonic paradigm is interesting and Bonevac’s paper provides a fresh view of the paradox.

According to the fourth solution, the (pure) dyadic deontic solution, we should develop a new kind of dyadic deontic logic with a primitive, two-place sentential operator that can be used to symbolise conditional norms. Sometimes O(B/A) is used to symbolise such norms, sometimes O[A]B, and sometimes AOB. Here we use the following construction: O[A]B. ‘O[A]B’ is read ‘It is obligatory (or it ought to be the case) that B given A’. This has been one of the most popular solutions to the contrary-to-duty paradox and it has many attractive features. Nevertheless, we do not say anything more about it in this article, since we discuss a temporal version of the dyadic deontic solution in Section 2e. For more on this kind of approach and for some problems with it, see Åqvist (1984, 1987, 2002) and Rönnedal (2012, pp. 112–118). For more on dyadic deontic logic, see Rescher (1958), von Wright (1964), Danielsson (1968), Hansson (1969), van Fraassen (1972, 1973), Lewis (1974), von Kutschera (1974), Greenspan (1975), Cox (Al-Hibri) (1978), and van der Torre and Tan (1999). Semantic tableau systems for dyadic deontic logic are developed by Rönnedal (2009).

An example: The counterfactual solution

We now consider the counterfactual solution to the contrary-to-duty paradox and some arguments for and against this approach. Mott (1973) and Niles (1997), for example, are sympathetic to this kind of view, while Tomberlin (1981) and Decew (1981), for instance, criticise it. Some of the arguments in this section have previously been discussed in Rönnedal (2012, pp. 102–106). For more on combining counterfactual logic and deontic logic, see the Appendix, Section 7, in Rönnedal (2012), Rönnedal (2016) and Rönnedal (2019); the tableau systems that are used in this section are described in those works.

In a counterfactual deontic system, a system that combines counterfactual logic and deontic logic, we can symbolise the concept of a conditional obligation in at least four interesting ways: (A □→ OB), O(A □→ B), (A □⇒ OB) and O(A □⇒ B). □→ (and □⇒) is a two-place, sentential operator that takes two sentences as arguments and gives one sentence as value. ‘A □→ B’ (and ‘A □⇒ B’) is often read ‘If A were the case, then B would be the case’. (The differences between □→ and □⇒ are unimportant in this context and as such we focus on □→.) So, maybe we can use some of these formulas to symbolise contrary-to-duty obligation sentences and avoid the contrary-to-duty paradox. Let us now consider one possible formalisation of N-CTD that seems to be among the most plausible in counterfactual deontic logic. In the discussion of Argument 2 in this section (see below), we consider two more attempts.

CF-CTD

CF1 Ok

CF2 k □→ O¬a

CF3 ¬k □→ Oa

CF4 ¬k

Let CF-CTD = {CF1, CF2, CF3, CF4}. From CF3 and CF4 we can deduce Oa, but it is not possible to derive O¬a from CF1 and CF2, at least not in most reasonable counterfactual deontic systems. Hence, we cannot derive a contradiction in this way.
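The consistency of CF-CTD can be witnessed by a small model. The selection-function model below is a hand-built illustration of our own, under the standard closest-world truth-conditions for □→: all four members of CF-CTD hold at w0.

```python
# A hand-built selection-function model in which all of CF-CTD holds at w0.
V = {'w0': {'k': False, 'a': False},
     'w1': {'k': True,  'a': False},
     'w2': {'k': True,  'a': True}}
R = {'w0': ['w2'], 'w1': ['w1'], 'w2': ['w1']}     # serial deontic relation
f = {('k', 'w0'): ['w1'],      # closest k-worlds to w0
     ('¬k', 'w0'): ['w0']}     # closest ¬k-world to w0 is w0 itself

def O(phi, w):
    return all(phi(v) for v in R[w])

def cf(antecedent, consequent, w):
    # A □→ B at w: B holds at every closest A-world to w.
    return all(consequent(v) for v in f[(antecedent, w)])

CF1 = O(lambda v: V[v]['k'], 'w0')                               # Ok
CF2 = cf('k', lambda v: O(lambda u: not V[u]['a'], v), 'w0')     # k □→ O¬a
CF3 = cf('¬k', lambda v: O(lambda u: V[u]['a'], v), 'w0')        # ¬k □→ Oa
CF4 = not V['w0']['k']                                           # ¬k
print(all([CF1, CF2, CF3, CF4]))  # True: CF-CTD is satisfiable
```

Since ¬k is true at w0, centering makes w0 its own closest ¬k-world, so CF3 demands Oa at w0 itself; the model satisfies this by letting the deontically accessible world w2 verify both k and a.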

Arguments for the counterfactual solution

This solution to the contrary-to-duty paradox is attractive for many reasons. (1) CF-CTD is consistent, as we have already seen. (2) The set is non-redundant. CF3 does not seem to be derivable from CF1, and CF2 does not seem to be derivable from CF4 in any interesting counterfactual deontic logic. (3) The set is dilemma-free. We cannot derive Oa ∧ O¬a from CF-CTD, nor anything else of the form OA ∧ O¬A. (4) We cannot derive the proposition that everything is both obligatory and forbidden from CF-CTD. (5) We can easily express the idea that the primary obligation to keep the promise has been violated in counterfactual deontic logic. This is just the conjunction of CF1 and CF4. (6) All conditional obligations can be symbolised in the same way. CF2 has the same logical form as CF3. (7) We do not have to postulate several different kinds of unconditional obligations. The unconditional obligation to keep the promise is the same kind of obligation as the derived unconditional obligation to apologise. This is a problem for Carmo and Jones’s operator solution (see Section 2b above). (8) The counterfactual solution can take care of apparently actionless contrary-to-duty paradoxes. Such paradoxes are a problem for the action or agent solutions (see Section 2d). (9) The counterfactual solution can perhaps take care of apparently timeless contrary-to-duty paradoxes. Such paradoxes are a problem for the temporal solution (see Section 2e). (Whether or not this argument is successful is debatable.) (10) From CF3 and CF4 we can derive the formula Oa, which says that you should apologise, and, intuitively, it seems that this proposition follows from N3 and N4 (at least in some contexts). (11) In counterfactual deontic logic a conditional obligation can be expressed by a combination of a counterfactual conditional and an ordinary (unconditional) obligation. We do not have to introduce any new primitive dyadic deontic operators.
According to the dyadic and temporal dyadic deontic solutions (see above in this section and Section 2e below), we need some new primitive dyadic deontic operator to express conditional obligations.

Hence, the counterfactual solution to the contrary-to-duty paradox seems to be among the most plausible so far suggested in the literature. Nonetheless, it also has some serious problems. We now consider four arguments against this solution. For more on some problems, see Decew (1981) and Tomberlin (1981), and for some responses, see Niles (1997).

Arguments against the counterfactual solution

Argument 1. The symbol □→ has often been taken to represent conditional sentences in the subjunctive, not in the indicative form. That is, A □→ B is read ‘If it were the case that A, then it would be the case that B’, not ‘If A is the case, then B is the case’ (or ‘If A, then B’). So, the correct reading of k □→ O¬a seems to be ‘If you were to keep your promise, then it would be obligatory that you do not apologise’, and the correct reading of ¬k □→ Oa seems to be ‘If you were not to keep your promise, then it would be obligatory that you apologise’. If this is true, the formal sentences CF2 and CF3 do not correctly reflect the meaning of the English sentences N2 and N3, because the English sentences are not in the subjunctive form.

Here is a possible response to this argument. A □→ B might perhaps be used to symbolize indicative conditionals and not only subjunctive conditionals, and if this is the case, we can avoid this problem. Furthermore, maybe the formulation in natural language is not satisfactory. Maybe the English sentences in N-CTD are more naturally formulated in the subjunctive form. So, ‘It ought to be that if you keep your promise, you do not apologise’ is taken to mean the same thing as ‘If you were to keep your promise, then it would be obligatory that you do not apologise’; and ‘If you do not keep your promise, you ought to apologise’ is taken to say the same thing as ‘If you were not to keep your promise, then it would be obligatory that you apologise’. And if this is the case, the symbolisations might very well be reasonable. To decide whether this is the case or not, it seems that we have to do much more than just look at the surface structure of the relevant sentences. So, this argument—while interesting—does not seem to be conclusive.

Argument 2. In counterfactual deontic logic, N2 can be interpreted in (at least) two ways: k □→ O¬a (CF2) or O(k □→ ¬a) (CF2(b)). Faced with the choice between two plausible formalisations of a certain statement, we ought to choose the stronger one. CF2(b) is stronger than CF2. So, N2 should be symbolised by CF2(b) and not by CF2. Furthermore, CF2(b) corresponds better with the surface structure of N2 than CF2; in N2 the expression ‘It ought to be that’ has a wide and not a narrow scope. This means that N-CTD should be symbolised in the following way:

C2F-CTD

CF1 Ok

CF2(b) O(k □→ ¬a)

CF3 ¬k □→ Oa

CF4 ¬k

Let C2F-CTD = {CF1, CF2(b), CF3, CF4}. Yet, in this reading, the paradox is reinstated, for C2F-CTD is inconsistent in most plausible counterfactual deontic systems. (An argument of this kind against a similar contrary-to-duty paradox can be found in Tomberlin (1981).) Let us now prove this. (In the proofs below, we use some semantic tableau systems that are described in the Appendix, Section 7, in Rönnedal (2012); temporal versions of these systems can be found in Rönnedal (2016). All rules that are used in our deductions are explained in these works.) First, we establish a derived rule, rule DR8, which is used in our proofs. This rule is admissible in any counterfactual (deontic) system that contains the tableau rule Tc5.

Derivation of DR8.

(1) A □→ B, i

                          

(2) ¬(A → B), i [CUT]          (3) A → B, i [CUT]

(4) A, i [2, ¬→]

(5) ¬B, i [2, ¬→]

(6) irAi [4, Tc5]

(7) B, i [1, 6, □→]

(8) * [5, 7]

Now we are in a position to prove that C2F-CTD is inconsistent. To prove that a set of sentences A1, A2, …, An is inconsistent in a tableau system S, we construct an S-tableau which begins with every sentence in this set suffixed in an appropriate way, such as A1, 0, A2, 0, …, An, 0. If this tableau is closed, that is, if every branch in it is closed, the set is inconsistent in S. (‘MP’ stands for the derived tableau rule Modus Ponens.)

(1) Ok, 0

(2) O(k □→ ¬a), 0

(3) ¬k □→ Oa, 0

(4) ¬k, 0

(5) ¬k → Oa, 0 [3, DR8]

(6) Oa, 0 [4, 5, MP]

(7) 0s1 [T dD]

(8) k, 1 [1, 7, O]

(9) k □→ ¬a, 1 [2, 7, O]

(10) a, 1 [6, 7, O]

(11) k → ¬a, 1 [9, DR8]

(12) ¬a, 1 [8, 11, MP]

(13) * [10, 12]

So, the counterfactual solution is perhaps not so plausible after all. Nevertheless, this argument against this solution is problematic for at least two different reasons.

(i) It is not clear in what sense CF2(b) is ‘stronger’ than CF2. Tomberlin does not explicitly discuss what he means by this expression in this context. Usually one says that a formula A is (logically) stronger than a formula B in a system S if and only if A entails B but B does not entail A in S. In this sense, CF2(b) does not seem to be stronger than CF2 in any interesting counterfactual deontic logic. But perhaps one can understand ‘stronger’ in some other sense in this argument. CF2(b) is perhaps not logically stronger than CF2, but it is a more natural interpretation of N2 than CF2. Recall that N2 says that it ought to be that if you keep your promise, then you do not apologise. This suggests that the correct symbolisation of N2 is O(k □→ ¬a), not k □→ O¬a; in other words, the O-operator should have a wide and not a narrow scope.

(ii) Let us grant that O(k □→ ¬a) is stronger than k □→ O¬a in the sense that the former is more natural than the latter. Furthermore, it is plausible to assume that if two interpretations of a sentence are reasonable one should choose the stronger or more natural one (as a pragmatic rule and ceteris paribus). Hence, N2 should be symbolised as O(k □→ ¬a) and not as k □→ O¬a. Here is a possible counterargument. Both O(k □→ ¬a) and k □→ O¬a are reasonable interpretations of N2. So, ceteris paribus we ought to choose O(k □→ ¬a). But if we choose O(k □→ ¬a) the resulting set C2F-CTD is inconsistent. Thus, in this case, we cannot (or should not) choose O(k □→ ¬a) as a symbolisation of N2. We should instead choose the narrow scope interpretation k □→ O¬a. Furthermore, it is not obvious that N2 says something other than the following sentence: ‘If you keep your promise, it ought to be the case that you do not apologise’ (N2b). And here k □→ O¬a seems to be a more natural symbolisation. Even if N2 and N2b are not equivalent, N2b might perhaps express our original idea better than N2. Consequently, this argument does not seem to be conclusive. However, it does seem to show that C2F-CTD is not a plausible solution to the contrary-to-duty paradox.

What happens if we try some other formalisation of N3? Can we avoid this problem then? Let us consider one more attempt to symbolize N-CTD in counterfactual deontic logic.

C3F-CTD

CF1 Ok

CF2(b) O(k □→ ¬a)

CF3(b) O(¬k □→ a)

CF4 ¬k

Let C3F-CTD = {CF1, CF2(b), CF3(b), CF4}. In this set, N3 too is represented by a sentence in which the O-operator has wide scope. From this set we can derive O¬a from CF1 and CF2(b), but not Oa from CF3(b) and CF4. The set is not inconsistent.

Yet, this solution is problematic for another reason. All of the following sentences seem to be true: O(k □→ ¬a), k □→ O¬a, ¬k □→ Oa, but O(¬k □→ a) seems false. According to the standard truth-conditions for counterfactuals, A □→ B is true in a possible world w if and only if B is true in every possible world that is as close as (as similar as) possible to w in which A is true; and OA is true in a possible world w if and only if A is true in every possible world that is deontically accessible from w. If we think of the truth-conditions in this way, O(¬k □→ a) is true in w (our world) if and only if ¬k □→ a is true in all ideal worlds (in all possible worlds that are deontically accessible from w), that is, if and only if: in every ideal world w’ deontically accessible from w, a is true in all the worlds that are as close to w’ as possible in which ¬k is true. But in all ideal worlds you keep your promise, and in all ideal worlds, if you keep your promise, you do not apologise. From this it follows that in all ideal worlds you do not apologise. Accordingly, in all ideal worlds you keep your promise and do not apologise. Take an ideal world, say w’. In the closest ¬k world(s) to w’, ¬a seems to be true (since ¬a is true in w’). If this is correct, ¬k and ¬a are true in one of the closest ¬k worlds to w’. So, ¬k □→ a is not true in w’. Hence, O(¬k □→ a) is not true in w (in our world). In conclusion, if this argument is sound, we cannot avoid the contrary-to-duty paradox by using the symbolisation C3F-CTD.
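The argument can be run in a miniature model. The model below is an illustrative sketch of our own: the only ideal world i satisfies k ∧ ¬a, and the closest ¬k-world to i, c, keeps ¬a fixed, so ¬k □→ a fails at i and hence O(¬k □→ a) fails at the actual world w.

```python
# A tiny model of the argument: the ideal world verifies k ∧ ¬a, and the
# closest ¬k-world to it preserves ¬a, falsifying ¬k □→ a there.
V = {'w': {'k': False, 'a': False},
     'i': {'k': True,  'a': False},   # ideal world: promise kept, no apology
     'c': {'k': False, 'a': False}}   # closest ¬k-world to i: still no apology
R = {'w': ['i'], 'i': ['i'], 'c': ['c']}   # deontic accessibility
f = {('¬k', 'i'): ['c']}                   # closest ¬k-worlds to i

def O(phi, w):
    return all(phi(v) for v in R[w])

def nk_cf_a(v):
    # ¬k □→ a at v: a holds at every closest ¬k-world to v.
    return all(V[u]['a'] for u in f[('¬k', v)])

print(O(nk_cf_a, 'w'))  # False: O(¬k □→ a) does not hold at w
```

The failure is driven entirely by the closeness assumption the argument makes: since ¬a holds at the ideal world, the most similar ¬k-worlds also verify ¬a.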

Argument 3. We turn now to the pragmatic oddity. We have mentioned that this is a problem for some quick solutions and for the modal solution. It is also a problem for the counterfactual solution. In every counterfactual deontic system that includes the tableau rule Tc5 (see Rönnedal (2012, p. 160)), and hence the schema (A □→ B) → (A → B), the sentence O(k ∧ a) is derivable from CF-CTD. This is odd, since it does not seem to follow from N-CTD that it ought to be that you keep your promise and apologise (for not keeping your promise), and since (A □→ B) → (A → B) seems to hold in every reasonable counterfactual logic. The following semantic tableau shows that O(k ∧ a) is derivable from CF-CTD (in most counterfactual deontic systems).

(1) Ok, 0

(2) k □→ O¬a, 0

(3) ¬k □→ Oa, 0

(4) ¬k, 0

(5) ¬O(k ∧ a), 0

(6) P¬(k ∧ a), 0 [5, ¬O]

(7) 0s1 [6, P]

(8) ¬(k ∧ a), 1 [6, P]

(9) k, 1 [1, 7, O]

(10) ¬k → Oa, 0 [3, DR8]

                          

(11) ¬¬k, 0 [10, →]                           (12) Oa, 0 [10, →]

(13) * [4, 11]                                           (14) a, 1 [12, 7, O]

                            

(15) ¬k, 1 [8, ¬∧]            (16) ¬a, 1 [8, ¬∧]

(17) * [9, 15]                         (18) * [14, 16]
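The tableau's upshot can also be checked semantically. Once the schema (A □→ B) → (A → B) lets us weaken ¬k □→ Oa to ¬k → Oa, the premises Ok, ¬k → Oa and ¬k force O(k ∧ a) in ordinary Kripke semantics for O. The brute-force sketch below (our own toy encoding, assuming two worlds and an arbitrary deontic accessibility relation) confirms that no such model makes the premises true and O(k ∧ a) false.

```python
from itertools import product

# Brute-force check over all two-world deontic models (an illustrative
# sketch; the encoding of models is our own assumption, not the text's).
W = (0, 1)

def O(phi, w, acc):
    """OA at w: A is true at every world deontically accessible from w."""
    return all(phi(v) for v in acc[w])

counterexamples = 0
for kv in product((False, True), repeat=2):            # valuation of k
    for av in product((False, True), repeat=2):        # valuation of a
        for bits in product((False, True), repeat=4):  # accessibility relation
            acc = {w: [v for v in W if bits[2 * w + v]] for w in W}
            k = lambda v, kv=kv: kv[v]
            a = lambda v, av=av: av[v]
            w = 0
            premises = (O(k, w, acc)                 # Ok
                        and (kv[w] or O(a, w, acc))  # ¬k → Oa
                        and not kv[w])               # ¬k
            if premises and not O(lambda v: k(v) and a(v), w, acc):
                counterexamples += 1

print(counterexamples)  # 0: O(k ∧ a) holds in every model satisfying the premises
```

This is just the pragmatic oddity in semantic dress: whenever all deontically accessible worlds satisfy k and also satisfy a, they satisfy k ∧ a.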

Argument 4. According to the counterfactual solution, so-called factual detachment holds unrestrictedly, that is, OB always follows from A and A □→ OB. This view is criticised by Decew (1981). From the proposition that I will not keep my promise and the proposition that if I will not keep my promise I ought to apologise, it does not follow that I ought to apologise. For as long as I still can keep my promise I ought to keep it, and if I keep it, then I should not apologise. According to Decew, it is not enough that a condition is true; it must be ‘unalterable’ or ‘settled’ before we are justified in detaching the unconditional obligation. See also Greenspan (1975). If this is correct, the counterfactual solution cannot, in itself, solve all contrary-to-duty paradoxes.

d. Action or Agent Solutions

Now, let us turn to the action or agent solutions. A common idea behind most of these solutions is that we should make a distinction between what is obligatory (actions or so-called practitions) and the circumstances of obligations. We should combine deontic logic with some kind of action logic or dynamic logic, and when we do this, we can avoid the contrary-to-duty paradox. Three subcategories deserve to be mentioned: (1) Castañeda’s solution, (2) the Stit solution, and (3) the dynamic deontic solution.

Castañeda has developed a unique approach to deontic logic. According to him, any useful deontic calculus must contain two types of sentences even at the purely sentential level. One type is used to symbolise the indicative clauses—that speak about the conditions and not the actions that are considered obligatory—in a conditional obligation, and the other type is used to symbolise the infinitive clauses that speak about the actions that are considered obligatory and not the conditions. Castañeda thinks that the indicative components, but not the infinitive ones, allow a form of (internal) modus ponens. From N3 and N4 we can derive the conclusion that you ought to apologise, but from N1 and N2 we cannot derive the conclusion that you ought not to apologise. Hence, we avoid the contradiction. For more on this approach, see, for instance, Castañeda (1981). For a summary of some arguments against Castañeda’s solution, see Carmo and Jones (2002); see also Powers (1967).

According to the Stit solution, deontic logic should be combined with some kind of Stit (Seeing to it) logic. However, Stit logic is often combined with temporal logic. So, this approach can also be classified as a temporal solution. We say a few more words about this kind of view in Section 2e.

To illustrate this type of solution to the contrary-to-duty paradox, let us now discuss the dynamic deontic solution and some problems with this particular way of solving the puzzle.

An example: The dynamic deontic solution

According to the dynamic deontic proposal, we can solve the contrary-to-duty paradox if we combine deontic logic with dynamic logic. A view of this kind is suggested in Meyer (1988), which develops a dynamic deontic system. We will now consider this solution and some arguments for and against it. Dynamic deontic logic is concerned with what we ought to do rather than with what ought to be, and the sentences in N-CTD should be interpreted as telling us what you ought to do. The solution is criticised by Anglberger (2008).

Dynamic deontic logic introduces some new notions: α stands for some action, and the formula [α]A says that performance of the action α (necessarily) leads to a state (or states) where A holds, where A is any sentence and [α] is similar to an ordinary necessity-like modal operator (the so-called box). The truth-conditions of [α]A are as follows: [α]A is true in a possible world w if and only if all possible worlds w’ with Rα(w, w’) satisfy A. Rα is an accessibility relation Rα ⊆ W × W associated with α, where W is the set of possible worlds or states. Rα(w, w’) says that from w one (can) get into state w’ by performing α. Fα, to be read ‘the action α is forbidden’, can be defined as Fα ↔ [α]V (call this equivalence Def F; ↔ is ordinary material equivalence), where V is a special atomic formula denoting violation; in other words, an action is forbidden if and only if doing the action leads to a state of violation. Oα, to be read ‘the action α is obligatory’ or ‘it is obligatory to perform the action α’, can now be defined as Oα ↔ F(‐α) (call this equivalence Def O), where ‐α stands for the non-performance of α. Two further formulas should be explained: α ; β stands for ‘the performance of α followed by β’, and α & β stands for ‘the performance of α and β (simultaneously)’.
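These truth-conditions and definitions can be sketched in a few lines of Python. The model below is our own toy illustration (the states, transitions, and violation set are assumptions, not Meyer's examples): omitting the action leads to a violation state, so the action comes out obligatory by Def O.

```python
# An illustrative toy model for the dynamic deontic definitions above
# (states, transitions, and the violation set are our own assumptions).

V = {'s2'}  # states in which a violation obtains

# R[x] lists the pairs (w, w') such that performing x in w can lead to w'.
# 'alpha' stands for keeping the promise, '-alpha' for its non-performance.
R = {
    'alpha':  {('s0', 's1')},
    '-alpha': {('s0', 's2')},
}

def box(action, phi, w):
    """[α]A: A holds in every state reachable from w by performing α."""
    return all(phi(v) for (u, v) in R[action] if u == w)

def F(action, w):
    """Def F: Fα iff [α]V (doing α leads only to violation states)."""
    return box(action, lambda v: v in V, w)

def O(action, non_performance, w):
    """Def O: Oα iff F(-α) (not doing α is forbidden)."""
    return F(non_performance, w)

print(O('alpha', '-alpha', 's0'))  # True: omitting the act leads to violation
print(F('alpha', 's0'))            # False: performing it does not
```

Note that the complement action ‐α has to be supplied as a separate labelled transition here; a full treatment would derive it from Meyer's action algebra.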

The first three sentences in N-CTD can now be formalised in the following way in dynamic deontic logic:

DDLF-CTD

DDLF1 Oα

DDLF2 [α]O‐β

DDLF3 [‐α]Oβ

Let DDLF-CTD = {Oα, [α]O‐β, [‐α]Oβ}, where α stands for the act of keeping your promise (and helping your friend) and β for the act of apologising. In dynamic deontic logic, it is not possible to represent (the dynamic version of) N4, which states that the act of keeping your promise is not performed. This should perhaps make one wonder whether the formalisation is adequate (see Argument 1 below in this section). Yet, if we accept this fact, we can see that the representation solves the contrary-to-duty paradox. From DDLF-CTD it is not possible to derive a contradiction. So, in dynamic deontic logic we can solve the contrary-to-duty paradox.

Arguments for the dynamic solution

Meyer’s system is interesting and there seem to be independent reasons to want to combine deontic logic with some kind of action logic or dynamic logic. The symbolisations of the sentences in N-CTD seem intuitively plausible. DDLF-CTD is consistent; the set is dilemma-free and we cannot derive the proposition that everything is both obligatory and forbidden from it. We can assign formal sentences with analogous structures to all conditional obligations in N-CTD. We do not have to postulate several different types of unconditional obligations. Furthermore, from DDLF-CTD it is possible to derive O(α ; ‐β) ∧ [‐α](V ∧ Oβ), which says that it is obligatory to perform the sequence α (keeping your promise) followed by ‐β (not-apologising), and that if α has not been done (that is, if you do not keep your promise), one is in a state of violation and it is obligatory to do β; that is, it is obligatory to apologise. This conclusion is intuitively plausible. Nevertheless, there are also some potential and quite serious problems with this kind of solution.

Arguments against the dynamic solution

We consider four arguments against the dynamic solution to the contrary-to-duty paradox in this section. Versions of the second and the third can be found in Anglberger (2008). However, as far as we know, Argument 1 and Argument 4 have not been discussed in the literature before. According to the first argument, we cannot symbolise all premises in dynamic deontic logic, which is unsatisfactory. If we try to avoid this problem, we run into the pragmatic oddity once again. According to the second argument, the dynamic formalisations of the contrary-to-duty sets are not non-redundant. According to the third, it is provable in Meyer’s system PDeL + ¬O(α & ‐α) that no possible action is forbidden, which is clearly implausible. ‘¬O(α & ‐α)’ says that it is not obligatory to perform α and non-α. According to the fourth argument, there seem to be action- and/or agentless contrary-to-duty paradoxes, which seem impossible to solve in dynamic deontic logic.

Argument 1. We cannot symbolise all sentences in N-CTD in dynamic deontic logic; there is no plausible formalisation of N4. This is quite problematic. If the sentence N4 cannot be represented in dynamic deontic logic, how can we then claim that we have solved the paradox? Meyer suggests adding a predicate DONE that attaches to action names (Meyer (1988)). Then, DONE(α) says that action α has been performed. If we add this predicate, we can symbolise all sentences in N-CTD. Sentence N4 is rendered DONE(-α). Meyer appears to think that (DONE(α)→A) is derivable from [α]A. This seems plausible. Still, if we assume this, we can deduce a dynamic counterpart of the pragmatic oddity from our contrary-to-duty sets. To prove this, we use a lemma, Lemma 1, that is a theorem in dynamic deontic logic. α and β are interpreted as above.

Lemma 1. O(α & β) ↔ (OαOβ) [Theorem 19 in Meyer (1988)]

1. Oα   N1
2. [α]O‐β   N2
3. [‐α]Oβ   N3
4. DONE(‐α)   N4
5. [‐α]Oβ → (DONE(‐α) → Oβ)   Property of DONE
6. DONE(‐α) → Oβ   3, 5
7. Oβ   4, 6
8. Oα ∧ Oβ   1, 7
9. O(α & β) ↔ (Oα ∧ Oβ)   Instance of Lemma 1
10. O(α & β)   8, 9

But the conclusion 10 in this argument says that it is obligatory that you perform the act of keeping your promise and the act of apologising (for not keeping your promise), and this is counterintuitive.

Argument 2. Recall that the first three sentences in N-CTD are symbolised in the following way: DDLF1 Oα, DDLF2 [α]O‐β, and DDLF3 [‐α]Oβ. We will show that we can derive DDLF3 from DDLF1. It follows that the formalisation of N-CTD in dynamic deontic logic is not non-redundant. This is our second argument. The rules that are used in the proofs below are mentioned by Meyer (1988).

Lemma 2 Fα → F(α & β) [Theorem 16 in Meyer (1988)]

Lemma 3 α ; β = α & -(α ; ‐β)

1. α & -(α ; ‐β) = ‐ ‐α & -(α ; ‐β) [Act‐ ‐]
2. ‐ ‐α & -(α ; ‐β) = -(-α ∪ (α ; ‐β))  [Act-∪]
3. -(-α ∪ (α ; ‐β)) = ‐ ‐(α ; β) [Act-;]
4. ‐ ‐(α ; β) = α ; β [Act‐ ‐]
5. α & -(α ; ‐β) = α ; β [1–4]

Lemma 4 Fα → F(α ; β)

1. Fα → F(α & β) Lemma 2
2. Fα → F(α & -(α ; ‐β)) -(α ; ‐β)/β
3. Fα → F(α ; β) 2, Lemma 3

Lemma 5 Fα → [α]Fβ

1. Fα → F(α ; β) Lemma 4
2. [α]V → [α; β]V 1, Def F
3. [α]V → [α][β]V 2, (;)
4. Fα → [α]Fβ 3, Def F

Oα is equivalent to F‐α and [‐α]Oβ to [‐α]F‐β. F‐α → [‐α]F‐β is an instance of Lemma 5. So, DDLF3 in DDLF-CTD is derivable from DDLF1. Consequently, DDLF-CTD is not non-redundant.

Argument 3. Here is our third argument. This argument shows that if we add Axiom DD (¬O(α & ‐α)) to Meyer’s dynamic deontic logic PDeL, we can derive a sentence that, in effect, says that no possible action is forbidden. Axiom DD seems to be intuitively plausible, as it is a dynamic counterpart of the axiom D in Standard Deontic Logic that rules out moral dilemmas. Hence, this problem is quite serious. In the proof below, T is Verum and ⊥ is Falsum. T is equivalent to an arbitrary logical truth (for example, p or not-p) and ⊥ is equivalent to an arbitrary contradiction (for example, p and not-p). Obviously, T is equivalent to ¬⊥ and ⊥ is equivalent to ¬T. (Let us call these equivalences Def T and Def ⊥.) Furthermore, <α>β is equivalent to ¬[α]¬β (let us call this equivalence Def <>). So, <α> is similar to an ordinary possibility-like modal operator (the so-called diamond). []-nec (or N) is a fundamental rule in Meyer’s system. It says that if B is a theorem (in the system), then [α]B is also a theorem (in the system).

Axiom DD ¬O(α & ‐α) [DD is called NCO in Meyer (1988)]

Lemma 6 [α](AB) ↔ ([α]A ∧ [α]B) [Theorem 3 in Meyer (1988)]

1. Fα → [α]F‐β Lemma 5 ‐β/β
2. Fα → [α]F‐ ‐β Lemma 5 ‐ ‐β/β
3. Fα → [α]Oβ 1, Def O
4. Fα → [α]O‐β 2, Def O
5. Fα → ([α]Oβ ∧ [α]O‐β) 3, 4
6. [α](Oβ ∧ O‐β) ↔ ([α]Oβ ∧ [α]O‐β) Lemma 6 Oβ/A, O‐β/B
7. Fα → [α](Oβ ∧ O‐β) 5, 6, Replacement
8. O(β & ‐β) ↔ (Oβ ∧ O‐β) Lemma 1 β/α, ‐β/β
9. Fα → [α]O(β & ‐β) 7, 8
10. ¬O(β & ‐β) Axiom DD β/α
11. [α]¬O(β & ‐β) 10, []‐nec
12. Fα → ([α]O(β & ‐β) ∧ [α]¬O(β & ‐β)) 9, 11
13. [α](O(β & ‐β) ∧ ¬O(β & ‐β))↔([α]O(β & ‐β) ∧ [α]¬O(β & ‐β)) Lemma 6 O(β & ‐β)/A, ¬O(β & ‐β)/B
14. Fα → [α](O(β & ‐β) ∧ ¬O(β & ‐β)) 12, 13
15. Fα → [α]⊥ 14, Def ⊥
16. (Fα ∧ <α>T) → ([α]⊥ ∧ <α>T) 15
17. <α>T ↔ ¬[α]⊥ Def <>, Def T, ⊥
18. (Fα ∧ <α>T) → ([α]⊥ ∧ ¬[α]⊥) 16, 17
19. ¬(Fα ∧ <α>T) 18

In effect, 19 claims that no possible action is forbidden. As Anglberger points out, Fα → [α]⊥ (line 15) seems implausible, but it can be true: if α is an impossible action, the consequent, and hence the whole sentence, is true. Nonetheless, if α is possible, α cannot be forbidden. <α>T says that α is possible, in the sense that there is a way to execute α that leads to a state in which T holds. Clearly, 19 is implausible: we want to be able to say that at least some possible actions are forbidden. So, adding the intuitively plausible axiom DD to Meyer’s dynamic deontic logic PDeL is highly problematic.

Argument 4. The last argument against the dynamic solution to the contrary-to-duty paradox that we discuss seems to be a problem for most action or agent solutions. At least it is a problem for both the dynamic solution and the solution that uses some kind of Stit logic. Several examples of such (apparently) action- and/or agentless contrary-to-duty paradoxes have been mentioned in the literature, such as in Prakken and Sergot (1996). Here we consider one introduced by Rönnedal (2018).

Scenario II: Contrary-to-duty paradoxes involving (apparently) action- and/or agentless contrary-to-duty obligations (Rönnedal (2018))

Consider the following scenario. At t1, you are about to get into your car and drive somewhere. Then at t1 it ought to be the case that the doors are closed at t2, when you are in your car. If the doors are not closed, then a warning light ought to appear on the car instrument panel (at t3, a point in time as soon as possible after t2). It ought to be that if the doors are closed (at t2), then it is not the case that a warning light appears on the car instrument panel (at t3). Furthermore, the doors are not closed (at t2 when you are in the car). In this example, all of the following sentences seem to be true:

N2-CTD

AN1 (At t1) The doors ought to be closed (at t2).

AN2 (At t1) It ought to be that if the doors are closed (at t2), then it is not the case that a warning light appears on the car instrument panel (at t3).

AN3 (At t1) If the doors are not closed (at t2) then a warning light ought to appear on the car instrument panel (at t3).

AN4 (At t1 it is the case that at t2) The doors are not closed.

N2-CTD is similar to N-CTD. In this set, AN1 expresses a primary obligation (or ought), and AN3 expresses a contrary-to-duty obligation. The condition in AN3 is satisfied only if the primary obligation expressed by AN1 is violated. But AN3 does not seem to tell us anything about what you or someone else ought to do, and it does not seem to involve any particular agent. AN3 appears to be an action- and agentless contrary-to-duty obligation. It tells us something about what ought to be the case if the world is not as it ought to be according to AN1. It does not seem to be possible to find any plausible symbolisations of N2-CTD and similar paradoxes in dynamic deontic logic or any Stit logic.

Can someone who defends this kind of solution avoid this problem? Two strategies come to mind. One could argue that every kind of apparently action- and agentless contrary-to-duty paradox really involves some kind of action and agent when it is analysed properly. One could, for instance, claim that N2-CTD really includes an implicit agent. It is just that the agent is not a human being; the agent is the car or the warning system in the car. When analysed in detail, AN3 should be understood in the following way:

AN3(b) (At t1) If the doors are not closed (at t2) then the car or the warning system in the car ought to see to it that a warning light appears on the car instrument panel (at t3).

According to this response, one can always find some implicit agent and action in every apparently action- and/or agentless contrary-to-duty paradox. If this is the case, the problem might not be decisive for this kind of solution.

According to the second strategy, we simply deny that genuinely action- and/or agentless obligations are meaningful. If, for example, the sentences in N2-CTD are genuinely actionless and agentless, then they are meaningless and we cannot derive a contradiction from them. Hence, the paradox is solved. If, however, we can show that they involve some kind of actions and some kind of agent or agents, we can use the first strategy to solve them.

Whether any of these strategies is successful is, of course, debatable. There certainly seem to be genuinely action- and agentless obligations that are meaningful, and it seems prima facie unlikely that every apparently action- and agentless obligation can be reduced to an obligation that involves an action and an agent. Is it, for example, really plausible to think of the car or the warning system in the car as an acting agent that can have obligations? Does AN3 [(At t1) If the doors are not closed (at t2) then a warning light ought to appear on the car instrument panel (at t3)] say the same thing as AN3(b) [(At t1) If the doors are not closed (at t2) then the car or the warning system in the car ought to see to it that a warning light appears on the car instrument panel (at t3)]?

e. Temporal Solutions

In this section, we consider some temporal solutions to the contrary-to-duty paradox. The temporal approaches can be divided into three subcategories: (1) the pure temporal solution(s), (2) the temporal-action solution(s), and (3) the temporal dyadic deontic solution(s). All of these combine some kind of temporal logic with some kind of deontic logic. According to the temporal-action solutions, we should also add some kind of action logic to the other parts. Some of the first to construct systems that include both deontic and temporal elements were Montague (1968) and Chellas (1969).

According to the pure temporal solutions, we should use systems that combine ordinary so-called monadic deontic logic with some kind of temporal logic (perhaps together with a modal part) when we symbolise our contrary-to-duty obligations. See Rönnedal (2012, pp. 106–112) for more on some pure temporal solutions and on some problems with such approaches.

The idea of combining temporal logic, deontic logic and some kind of action logic has gained traction. A particularly interesting development is the so-called Stit (Seeing to it) paradigm. According to this paradigm, it is important to make a distinction between agentive and non-agentive sentences. A (deontic) Stit system is a system that includes one or several Stit (Seeing to it) operators that can be used to formalise various agentive sentences. The formula ‘[α: stit A]’ (‘[α: dstit A]’), for instance, says ‘agent α sees to it that A’ (‘agent α deliberately sees to it that A’). [α: (d)stit A] can be abbreviated as [α: A]. Some have argued that systems of this kind can be used to solve the contrary-to-duty paradox; see, for instance, Bartha (1993). According to the Stit approach, deontic constructions must take agentive sentences as complements; in a sentence of the form OA, A must be (or be equivalent to) a Stit sentence. A justification for this claim is, according to Bartha, that practical obligations, ‘ought to do’s’, should be connected to a specific action by a specific agent. The construction ‘agent α is obligated to see to it that A’ can now be defined in the following way: O[α: A] ⟺ L(¬[α: A] → S), where L says that ‘It is settled that’ and S says that ‘there is wrongdoing’ or ‘there is violation of the rules’ or something to that effect. Hence, α is obligated to see to it that A if and only if it is settled that if she does not see to it that A, then there is wrongdoing. In a logic of this kind, N-CTD can be symbolised in the following way: {O[α: k], O[α: [α: k] → [α:¬a]], O[α:¬[α: k] → [α: [α: a]]], ¬[α: k]}. And this set is consistent in Bartha’s system. For more on Stit logic and many relevant references, see Horty (2001), and Belnap, Perloff and Xu (2001).

An example: The temporal dyadic deontic solution

Here we consider, as an example of a temporal solution, the temporal dyadic deontic solution. We should perhaps not talk about ‘the’ temporal dyadic deontic solution, since there really are several different versions of this kind of view. However, let us focus on an example presented in Rönnedal (2018). What is common to all approaches of this kind is that they use some logical system that combines dyadic deontic logic with temporal logic to solve the contrary-to-duty paradox. Usually, the various systems also include a modal part with one or several necessity- and possibility-operators. Solutions of this kind are discussed by, for example, Åqvist (2003), van Eck (1982), Loewer and Belzer (1983), and Feldman (1986, 1990) (see also Åqvist and Hoepelman (1981) and Thomason (1981, 1981b)). Castañeda (1977) and Prakken and Sergot (1996) express some doubts about this kind of approach.

We first describe how the contrary-to-duty paradox can be solved in temporal alethic dyadic deontic logic of the kind introduced by Rönnedal (2018). Then, we consider some reasons why this solution is attractive. We end by mentioning a potential problem with this solution. In temporal alethic dyadic deontic logic, N-CTD can be symbolised in the following way:

F-CTD

F1. Rt1O[T]Rt2k

F2. Rt1O[Rt2k]Rt3¬a

F3. Rt1O[¬Rt2k]Rt3a

F4. Rt1Rt2¬k [⇔ Rt2¬k]

where k and a are interpreted as in SDL-CTD. R is a temporal operator; ‘Rt1A’ says that it is realised at time t1 (it is true on t1) that A, and so forth. t1 refers to the moment on Monday when you make your promise, t2 refers to the moment on Friday when you should keep your promise and t3 refers to the moment on Saturday when you should apologise if you do not keep your promise on Friday. O is a dyadic deontic sentential operator of the kind mentioned in Section 2c. ‘O[B]A’ says that it is obligatory that (it ought to be the case that) A given B. In dyadic deontic logic, an unconditional, monadic O-operator can be defined in terms of the dyadic deontic O-operator in the following way: OA =df O[T]A. According to this definition, it is unconditionally obligatory that A if and only if it is obligatory that A given Verum. All other symbols are interpreted as above. Accordingly, F1 is read as ‘It is true on Monday that you ought to keep your promise on Friday’. F2 is read as ‘It is true on Monday that it ought to be the case that you do not apologise on Saturday given that you keep your promise on Friday’. F3 is read as ‘It is true on Monday that it ought to be the case that you apologise on Saturday given that you do not keep your promise on Friday’. F4 is read as ‘It is true on Monday that it is true on Friday that you do not keep your promise’; in other words, ‘It is true on Friday that you do not keep your promise’. This rendering of N-CTD seems to be plausible.

In temporal (alethic) dyadic deontic logic, truth is relativised to world-moment pairs. This means that a sentence can be true in one possible world w at a particular time t even though it is false in some other possible world, say w’, at this time (that is, at t), or false in this world (that is, in w) at another time, say t’. Some (but not all) sentences are temporally settled. A temporally settled sentence satisfies the following condition: if it is true (in a possible world), it is true at every moment of time (in this possible world); and if it is false (in a possible world), it is false at every moment of time (in this possible world). All the sentences F1−F4 are temporally settled; O[T]Rt2k, O[Rt2k]Rt3¬a and O[¬Rt2k]Rt3a are examples of sentences that are not, as their truth values may vary from one moment of time to another (in one and the same possible world).

Rt1Rt2¬k is equivalent to Rt2¬k. For it is true on Monday that it is true on Friday that you do not keep your promise if and only if it is true on Friday that you do not keep your promise. Hence, from now on we use Rt2¬k as a symbolisation of N4. Note that it might be true on Monday that you will not keep your promise on Friday (in some possible world) even though this is not a settled fact, in other words, even though it is not historically necessary. In some possible worlds, you will keep your promise on Friday and in some possible worlds you will not. F4 is true at t1 (on Monday) in the possible worlds where you do not keep your promise at t2 (on Friday).

Let F-CTD = {F1, F2, F3, F4}. F-CTD is consistent in most interesting temporal alethic dyadic deontic systems (see Rönnedal (2018) for a rigorous proof of this claim). Hence, we can solve the contrary-to-duty paradox in temporal alethic dyadic deontic logic.
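The consistency claim can be illustrated with a toy countermodel. The sketch below uses a deliberately simplified semantics of our own devising (O[B]A holds when A is true at the deontically best B-worlds), not Rönnedal's full tableau systems; the two worlds, three moments, and ranking are assumptions. It checks that sentences of the shapes F1–F4 can all be true together at one world-moment pair.

```python
# A two-world, three-moment sketch showing the joint satisfiability of
# F1-F4 (illustrative only; the simplified dyadic semantics, worlds,
# moments, and ranking are our own assumptions, not Rönnedal's systems).

worlds = ('w1', 'w2')        # w1: the promise is kept; w2: it is broken
rank = {'w1': 0, 'w2': 1}    # lower number = deontically better

val = {
    ('w1', 't2'): {'k': True},  ('w1', 't3'): {'a': False},
    ('w2', 't2'): {'k': False}, ('w2', 't3'): {'a': True},
}

def k(w, t): return val[(w, t)].get('k', False)
def a(w, t): return val[(w, t)].get('a', False)
def neg(phi): return lambda w, t: not phi(w, t)
def top(w, t): return True

def Rt(t0, phi):
    """Rt0 A: A is true at moment t0 (in the world of evaluation)."""
    return lambda w, t: phi(w, t0)

def O(B, A):
    """O[B]A: A holds at the deontically best worlds at which B holds."""
    def formula(w, t):
        bworlds = [v for v in worlds if B(v, t)]
        best = min(rank[v] for v in bworlds)
        return all(A(v, t) for v in bworlds if rank[v] == best)
    return formula

F1 = Rt('t1', O(top, Rt('t2', k)))               # Rt1O[T]Rt2k
F2 = Rt('t1', O(Rt('t2', k), Rt('t3', neg(a))))  # Rt1O[Rt2k]Rt3¬a
F3 = Rt('t1', O(neg(Rt('t2', k)), Rt('t3', a)))  # Rt1O[¬Rt2k]Rt3a
F4 = Rt('t2', neg(k))                            # Rt2¬k

# All four sentences hold together at the actual world w2:
print(all(f('w2', 't1') for f in (F1, F2, F3, F4)))  # True
```

The primary obligation looks only at the best world (w1, where the promise is kept), while the contrary-to-duty obligation is evaluated only over the ¬k-worlds, which is why no contradiction arises.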

Arguments for the temporal alethic dyadic deontic solution

We now consider some reasons why the temporal alethic dyadic deontic solution to the contrary-to-duty paradox is attractive. We first briefly mention some features; then, we discuss some reasons in more detail. (1) F-CTD is consistent. (2) F-CTD is non-redundant. (3) F-CTD is dilemma-free. (4) It is not possible to derive the proposition that everything is both obligatory and forbidden from F-CTD. (5) F-CTD avoids the so-called pragmatic oddity. (6) The solution in temporal alethic dyadic deontic logic is applicable to (at least apparently) action- and agentless contrary-to-duty examples. (7) We can assign formal sentences with analogous structures to all conditional obligations in N-CTD in temporal alethic dyadic deontic logic. (8) We can express the idea that an obligation has been violated, and (9) we can symbolise higher order contrary-to-duty obligations in temporal alethic dyadic deontic logic. (10) In temporal alethic dyadic deontic logic we can derive ‘ideal’ obligations, and (11) we can derive ‘actual’ obligations (in certain circumstances). (12) We can avoid the so-called dilemma of commitment and detachment in temporal alethic dyadic deontic logic. All of these reasons are discussed in Rönnedal (2018). Now let us say a few more words about some of them.

Reason (I): F-CTD is dilemma-free. The solution in temporal alethic dyadic deontic logic is dilemma-free. The sentence Rt1O[T]Rt3¬a is derivable from F1 and F2 (in some systems) (see Reason V below), and from F3b and F4 we can deduce the formula Rt2O[T]Rt3a (in some systems under some circumstances) (see Reason VI below). Accordingly, we can derive the following sentence: Rt1O[T]Rt3¬a ∧ Rt2O[T]Rt3a (in certain systems). Rt1O[T]Rt3¬a says ‘On Monday [when you have not yet broken your promise] it ought to be the case that you do not apologise on Saturday’, and Rt2O[T]Rt3a says ‘On Friday [when you have broken your promise] it ought to be the case that you apologise on Saturday’. Despite this, O[T]Rt3¬a and O[T]Rt3a are not true at the same time. Neither Rt1O[T]Rt3¬a ∧ Rt1O[T]Rt3a nor Rt2O[T]Rt3¬a ∧ Rt2O[T]Rt3a is derivable from F-CTD in any interesting temporal alethic dyadic deontic system. Consequently, this is not a moral dilemma. Since N-CTD seems to be dilemma-free, we want our formalisation of N-CTD to be dilemma-free too; and F-CTD is, as we have seen, dilemma-free. This is one good reason to be attracted to the temporal alethic dyadic deontic solution.

Reason (II): F-CTD avoids the so-called pragmatic oddity. Neither O[T](Rt2k ∧ Rt3a), Rt1O[T](Rt2k ∧ Rt3a) nor Rt2O[T](Rt2k ∧ Rt3a) is derivable from F-CTD in any interesting temporal alethic dyadic deontic system. Hence, we can avoid the pragmatic oddity (see Section 2a above).

Reason (III): The solution in temporal alethic dyadic deontic logic is applicable to (at least apparently) actionless and agentless contrary-to-duty examples. In Section 2d, we considered an example of an (apparently) action- and agentless contrary-to-duty paradox. In temporal alethic dyadic deontic logic, it is easy to find plausible symbolisations of (apparently) action- and agentless contrary-to-duty obligations; the sentences in N2-CTD have the same logical form as the sentences in N-CTD. It follows that contrary-to-duty paradoxes of this kind can be solved in exactly the same way as we solved our original paradox.

Reason (IV): We can assign formal sentences with analogous structures to all conditional obligations in N-CTD in temporal alethic dyadic deontic logic. According to some deontic logicians, a formalisation of N-CTD is adequate only if the formal sentences assigned to N2 and N3 have the same (or analogous) logical form (see Carmo and Jones (2002)). The temporal alethic dyadic deontic solution satisfies this requirement. Not all solutions do that. F2 and F3 have the ‘same’ logical form and they can both be formalised using dyadic obligation.

Reason (V): We can derive ‘ideal’ obligations in temporal alethic dyadic deontic logic. N1 and N2 seem to entail that you ought not to apologise. Ideally you ought to keep your promise, and ideally it ought to be that if you keep your promise, then you do not apologise (for not keeping your promise). Accordingly, ideally you ought not to apologise. We want our formalisation of N-CTD to reflect this intuition. Rt1O[T]Rt3¬a is deducible from F1 (Rt1O[T]Rt2k) and F2 (Rt1O[Rt2k]Rt3¬a) in many temporal dyadic deontic systems. The tableau below proves this.

We use two derived rules in our deduction. These are also used in our next semantic tableau (see Reason VI below). According to the first derived rule, DR1, we may add ¬A, wit to any open branch in a tree that includes ¬RtA, witj. This rule is deducible in every system. According to the second derived rule, DR2, we may add O[T](A → B), witj to any open branch in a tree that contains O[A]B, witj. DR2 can be derived in every system that includes the rules TDα0 and TDα2. (All other special rules that we use in our deductions are described by Rönnedal (2018).)

(1) Rt1O[T]Rt2k, w0t0

(2) Rt1O[Rt2k]Rt3¬a, w0t0

(3) ¬Rt1O[T]Rt3¬a, w0t0

(4) ¬O[T]Rt3¬a, w0t1 [3, DR1]

(5) P[T]¬Rt3¬a, w0t1 [4, ¬O]

(6) sTw0w1t1 [5, P]

(7) ¬Rt3¬a, w1t1 [5, P]

(8) ¬¬a, w1t3 [7, DR1]

(9) O[T]Rt2k, w0t1 [1, Rt]

(10) Rt2k, w1t1 [9, 6, O]

(11) k, w1t2 [10, Rt]

(12) O[Rt2k]Rt3¬a, w0t1 [2, Rt]

(13) O[T](Rt2k → Rt3¬a), w0t1 [12, DR2]

(14) Rt2k → Rt3¬a, w1t1 [13, 6, O]

(15) ¬Rt2k, w1t1 [14, →]                   (16) Rt3¬a, w1t1 [14, →]

(17) ¬k, w1t2 [15, DR1]                     (18) ¬a, w1t3 [16, Rt]

(19) * [11, 17]                                          (20) * [8, 18]

Informally, Rt1O[T]Rt3¬a says that it is true at t1, that is, on Monday, that it ought to be the case that you will not apologise on Saturday when you meet your friend. For, ideally, you keep your promise on Friday. Yet, Rt2O[T]Rt3¬a does not follow from F1 and F2 (see Reason I above). On Friday, when you have broken your promise, and when it is no longer historically possible for you to keep your promise, it is not obligatory that you do not apologise on Saturday. On Friday, it is obligatory that you apologise when you meet your friend on Saturday (see Reason VI). Nevertheless, it is plausible to claim that it is true on Monday that it ought to be the case that you do not apologise on Saturday. For on Monday it is not a settled fact that you will not keep your promise; on Monday, it is still possible for you to keep your promise, which you ought to do. These conclusions correspond well with our intuitions about Scenario I.

According to the counterfactual solution (see Section 2c) to the contrary-to-duty paradoxes, we cannot derive any ‘ideal’ obligations of this kind. This is a potential problem for this solution.

Reason (VI): We can derive ‘actual’ obligations in temporal alethic dyadic deontic logic (in certain circumstances). N3 and N4 appear to entail that you ought to apologise. Ideally you ought to keep your promise, but if you do not keep your promise, you ought to apologise. As a matter of fact, you do not keep your promise. It follows that you should apologise. We want our symbolisation of N-CTD to reflect this intuition. Therefore, let us assume that the conditional (contrary-to-duty) obligation expressed by N3 is still in force at time t2; in other words, we assume that the following sentence is true:

F3b Rt2O[Rt2¬k]Rt3a.

Informally, F3b says that it is true at t2 (on Friday) that if you do not keep your promise on Friday, you ought to apologise on Saturday. Rt2O[T]Rt3a is derivable from F4 (Rt2¬k) and F3b in every tableau system that includes TDα0, TDα2, TDMO (the dyadic must-ought principle) and TBT (backward transfer) (see Rönnedal (2018)). According to Rt2O[T]Rt3a, it is true at t2 (on Friday), when you have broken your promise to your friend, that it ought to be the case that you apologise to your friend on Saturday when you meet her.

Note that Rt1O[T]Rt3a is not deducible from F3 (or F3b, or F3 and F3b) and F4 (see Reason I). According to Rt1O[T]Rt3a, it is true at t1, on Monday, that you should apologise to your friend on Saturday when you meet her. However, on Monday it is not yet a settled fact that you will not keep your promise to your friend; on Monday it is still open to you to keep your promise. Accordingly, it is not true on Monday that you should apologise on Saturday. Since it is true on Monday that you ought to keep your promise, and it ought to be that if you keep your promise then you do not apologise, it follows that it is true on Monday that it ought to be the case that you do not apologise on Saturday (see Reason V). These facts correspond well with our intuitions about Scenario I.

The following tableau proves that Rt2O[T]Rt3a is derivable from F3b and F4:

(1) Rt2¬k, w0t0

(2) Rt2O[Rt2¬k]Rt3a, w0t0

(3) ¬Rt2O[T]Rt3a, w0t0

(4) ¬O[T]Rt3a, w0t2 [3, DR1]

(5) P[T]¬Rt3a, w0t2 [4, ¬O]

(6) sTw0w1t2 [5, P]

(7) ¬Rt3a, w1t2 [5, P]

(8) ¬a, w1t3 [7, DR1]

(9) rw0w1t2 [6, TDMO]

(10) ¬k, w0t2 [1, Rt]

(11) O[Rt2¬k]Rt3a, w0t2 [2, Rt]

(12) O[T](Rt2¬k → Rt3a), w0t2 [11, DR2]

(13) Rt2¬k → Rt3a, w1t2 [6, 12, O]

(14) ¬Rt2¬k, w1t2 [13, →]               (15) Rt3a, w1t2 [13, →]

(16) ¬¬k, w1t2 [14, DR1]                 (17) a, w1t3 [15, Rt]

(18) k, w1t2 [16, ¬¬]                           (19) * [8, 17]

(20) k, w0t2 [9, 18, TBT]

(21) * [10, 20]

F3 and F3b are independent of each other (in most interesting temporal alethic dyadic deontic systems). Hence, one could argue that N3 should be symbolised by a conjunction of F3 and F3b. For we have assumed that the contrary-to-duty obligation to apologise, given that you do not keep your promise, is still in force at t2. It might be interesting to note that this does not affect the main results in this section. {F1, F2, F3, F3b, F4} is, for example, consistent, non-redundant, and so on. So, we can use such an alternative formalisation of N3 instead of F3. Moreover, note that the symbolisation of N2 can be modified in a similar way.

Reason (VII): In temporal alethic dyadic deontic logic we can avoid the so-called dilemma of commitment and detachment. (Factual) Detachment is an inference pattern that allows us to infer, or detach, an unconditional obligation from a conditional obligation together with this conditional obligation's condition. Thus, if detachment holds for the conditional (contrary-to-duty) obligation that you should apologise if you do not keep your promise, and if you in fact do not keep your promise, then we can derive the unconditional obligation that you should apologise.

van Eck (1982, p. 263) describes the so-called dilemma of commitment and detachment in the following way: (1) detachment should be possible, for we cannot take seriously a conditional obligation if it cannot, by way of detachment, lead to an unconditional obligation; and (2) detachment should not be possible, for if detachment is possible, the following kind of situation would be inconsistent—A, it ought to be the case that B given that A; and C, it ought to be the case that not-B given C. Yet, such a situation is not necessarily inconsistent.

In pure dyadic deontic logic, we cannot deduce the unconditional obligation that it is obligatory that A (OA) from the dyadic obligation that it is obligatory that A given B (O[B]A) together with B. Still, if this is true, how can we take such conditional obligations seriously? Hence, the dilemma of commitment and detachment is a problem for solutions to the contrary-to-duty paradox in pure dyadic deontic logic. In temporal alethic dyadic deontic logic, we can avoid this dilemma. We cannot always detach an unconditional obligation from a conditional obligation and its condition, but we can detach the unconditional obligation OB from O[A]B and A if A is non-future or historically necessary (in some interesting temporal alethic dyadic deontic systems). This seems to give us exactly the correct answer to the current problem. Detachment holds, but the rule does not hold unrestrictedly. We have seen above that Rt2O[T]Rt3a, but not Rt1O[T]Rt3a, is derivable from Rt2¬k and Rt2O[Rt2¬k]Rt3a in certain systems; that is, we can detach the former sentence, but not the latter. Nevertheless, we cannot conclude that a set of the following kind must be inconsistent: {A, O[A]B, C, O[C]¬B}. This seems to get us exactly what we want.
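The restricted detachment pattern can be sketched in Python as follows. This is a toy illustration under stated assumptions: formulas are nested tuples, times are integers, and "non-future" is approximated as "every R operator in the condition is indexed at or before the evaluation time". The encoding and function names are hypothetical, not Rönnedal's.

```python
def times(f):
    """Collect all time indices of R operators occurring in formula f."""
    if not isinstance(f, tuple):
        return []
    out = [f[1]] if f[0] == 'R' else []
    for part in f:
        if isinstance(part, tuple):
            out.extend(times(part))
    return out

def non_future(f, now):
    """Crude approximation: f refers to no time later than `now`."""
    return all(t <= now for t in times(f))

def detach(cond_obligation, fact, now):
    """From ('O', A, B) and fact A, return O[T]B if A is non-future at `now`."""
    _, a, b = cond_obligation
    if fact == a and non_future(a, now):
        return ('O', 'T', b)
    return None  # detachment blocked: condition is still future

# Contrary-to-duty obligation O[Rt2¬k]Rt3a with the fact Rt2¬k:
ctd = ('O', ('R', 2, ('not', 'k')), ('R', 3, 'a'))
fact = ('R', 2, ('not', 'k'))
print(detach(ctd, fact, now=2))  # ('O', 'T', ('R', 3, 'a')) — detached on Friday
print(detach(ctd, fact, now=1))  # None — blocked on Monday, condition still open
```

The sketch mirrors the point in the text: the unconditional obligation to apologise is detachable at t2 but not at t1, so detachment holds without holding unrestrictedly.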

All of these reasons show that the temporal dyadic deontic solution is very attractive. It avoids many of the problems with other solutions that have been suggested in the literature. However, even though the solution is quite attractive, it is not unproblematic. We will now consider one potential serious problem.

An argument against the temporal solutions

The following argument against the temporal dyadic deontic solution appears to be a problem for every other kind of temporal solution too. There seem to be timeless (or parallel) contrary-to-duty paradoxes. In a timeless (or parallel) contrary-to-duty paradox, all obligations seem, in some sense, to be in force simultaneously, and both the antecedent and consequent in the contrary-to-duty obligation appear to ‘refer’ to the same time (if indeed they refer to any time at all). Such paradoxes cannot be solved in temporal dyadic deontic logic or any other system of this kind. For a critique of temporal solutions to the contrary-to-duty paradoxes, see Castañeda (1977). Several (apparently) timeless (or parallel) contrary-to-duty paradoxes are mentioned by Prakken and Sergot (1996).

Here is one example.

Scenario III: The Dog Warning Sign Scenario (After Prakken and Sergot (1996))

Consider the following set of cottage regulations. It ought to be that there is no dog. It ought to be that if there is no dog, there is no warning sign. If there is a dog, it ought to be that there is a warning sign. Suppose further that there is a dog. Then all of the following sentences seem to be true:

TN-CTD

(TN1) It ought to be that there is no dog.

(TN2) It ought to be that if there is no dog, there is no warning sign.

(TN3) If there is a dog, it ought to be that there is a warning sign.

(TN4) There is a dog.

(TN1) expresses a primary obligation and (TN3) a contrary-to-duty obligation. The condition in (TN3) is fulfilled only if the primary obligation expressed by (TN1) is violated. Let TN-CTD = {TN1, TN2, TN3, TN4}. It seems possible that all of the sentences in TN-CTD could be true; the set does not seem to be inconsistent. Yet, if this is the case, TN-CTD poses a problem for all temporal solutions.

In this example, all obligations appear to be timeless or parallel; they appear to be in force simultaneously, and the antecedent and consequent in the contrary-to-duty obligation (TN3) seem to refer to one and the same time (or perhaps to no particular time at all). So, a natural symbolisation is the following:

FTN-CTD

(FTN1) O[T]¬d

(FTN2) O[¬d]¬w

(FTN3) O[d]w

(FTN4) d

where d stands for ‘There is a dog’ and w for ‘There is a warning sign’ and all other symbols are interpreted as above. Nevertheless, this set is inconsistent in many temporal alethic dyadic deontic systems. We prove this below. But first let us consider some derived rules that we use in our tableau derivation.

Derived rules

DR3 O[A]B => O[T](A → B)

DR4 O[A]B, O[A](B → C) => O[A]C

DR5 O[T](A → B), A => O[T]B, given that A is non-future.

According to DR3, if we have O[A]B, witj on an open branch in a tree we may add O[T](A → B), witj to this branch in this tree. The other derived rules are interpreted in a similar way. A is non-future as long as A does not include any operator that refers to the future.
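Continuing the tuple convention used informally above (a hypothetical encoding, not Rönnedal's notation), the derived-rule steps that drive the inconsistency proof can be replayed mechanically, ending in the clash between O[T]¬w and O[T]w:

```python
# Formulas as nested tuples: ('O', cond, body), ('imp', A, B), ('not', A);
# 'T' is the trivially true condition.
def dr3(ob):                         # O[A]B  =>  O[T](A -> B)
    _, a, b = ob
    return ('O', 'T', ('imp', a, b))

def dr4(ob1, ob2):                   # O[A]B, O[A](B -> C)  =>  O[A]C
    _, a1, b = ob1
    _, a2, (tag, ante, c) = ob2
    assert a1 == a2 and tag == 'imp' and ante == b
    return ('O', a1, c)

def dr5(ob, fact):                   # O[T](A -> B), A  =>  O[T]B  (A non-future)
    _, t, (tag, a, b) = ob
    assert t == 'T' and tag == 'imp' and a == fact
    return ('O', 'T', b)

ftn1 = ('O', 'T', ('not', 'd'))           # FTN1: O[T]¬d
ftn2 = ('O', ('not', 'd'), ('not', 'w'))  # FTN2: O[¬d]¬w
ftn3 = ('O', 'd', 'w')                    # FTN3: O[d]w
line5 = dr3(ftn2)                         # O[T](¬d -> ¬w)
line6 = dr3(ftn3)                         # O[T](d -> w)
line7 = dr4(ftn1, line5)                  # O[T]¬w
line8 = dr5(line6, 'd')                   # O[T]w, using FTN4: d
print(line7, line8)                       # the clash: O[T]¬w versus O[T]w
```

Since O[T]¬w and O[T]w cannot both hold at a world where some deontic alternative exists, the set FTN-CTD is inconsistent, exactly as the tableau below shows.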

We are now in a position to prove that the set of sentences FTN-CTD = {FTN1, FTN2, FTN3, FTN4} is inconsistent in every temporal dyadic deontic tableau system that includes the rules TDMO, TDα0–TDα4, TFT, and TBT (Rönnedal (2018)). Here is the tableau derivation:

(1) O[T]¬d, w0t0

(2) O[¬d]¬w, w0t0

(3) O[d]w, w0t0

(4) d, w0t0

(5) O[T](¬d → ¬w), w0t0 [2, DR3]

(6) O[T](d → w), w0t0 [3, DR3]

(7) O[T]¬w, w0t0 [1, 5, DR4]

(8) O[T]w, w0t0 [4, 6, DR5]

(9) T, w0t0 [Global Assumption]

(10) sTw0w1t0 [9, TDα3]

(11) ¬w, w1t0 [7, 10, O]

(12) w, w1t0 [8, 10, O]

(13) * [11, 12]

This is counterintuitive, since TN-CTD seems to be consistent. This is an example of a timeless (parallel) contrary-to-duty paradox.

Can we avoid this problem by introducing some temporal operators in our symbolisation of TN-CTD? One natural interpretation of the sentences in this set is as follows: (TN1) (At t1) It ought to be that there is no dog; (TN2) (At t1) It ought to be that if there is no dog (at t1), there is no warning sign (at t1); (TN3) (At t1) If there is a dog, then (at t1) it ought to be that there is a warning sign (at t1); and (TN4) (At t1) There is a dog.

Hence, an alternative symbolisation of the sentences in TN-CTD is the following:

F2TN-CTD

(F2TN1) Rt1O[T]Rt1¬d

(F2TN2) Rt1O[Rt1¬d]Rt1¬w

(F2TN3) Rt1O[Rt1d]Rt1w

(F2TN4) Rt1d

Yet, the set F2TN-CTD = {F2TN1, F2TN2, F2TN3, F2TN4} is also inconsistent. The proof is similar to the one above. So, this move does not help. And it does not seem to be the case that we can find any other plausible symbolisation of TN-CTD in temporal alethic dyadic deontic logic that is consistent. (TN2) cannot, for instance, plausibly be interpreted in the following way: (At t1) It ought to be that if there is no dog (at t2), there is no warning sign (at t3), where t1 is before t2 and t2 before t3. And (TN3) cannot plausibly be interpreted in the following way: (At t1) If there is a dog, then (at t2) it ought to be that there is a warning sign (at t3), where t1 is before t2 and t2 before t3.

Hence, (apparently) timeless contrary-to-duty paradoxes pose a real problem for the temporal dyadic deontic solution and other similar temporal solutions.

3. References and Further Reading

  • Anglberger, A. J. J. (2008). Dynamic Deontic Logic and Its Paradoxes. Studia Logica, Vol. 89, No. 3, pp. 427–435.
  • Åqvist, L. (1967). Good Samaritans, Contrary-to-duty Imperatives, and Epistemic Obligations. Noûs 1, pp. 361–379.
  • Åqvist, L. (1984). Deontic Logic. In D. Gabbay and F. Guenthner (eds.) Handbook of Philosophical Logic, Vol. II, D. Reidel, pp. 605–714.
  • Åqvist, L. (1987). Introduction to Deontic Logic and the Theory of Normative Systems. Naples, Bibliopolis.
  • Åqvist, L. (2002). Deontic Logic. In Gabbay and Guenthner (eds.) Handbook of Philosophical Logic, 2nd Edition, Vol. 8, Dordrecht/Boston/London: Kluwer Academic Publishers, pp. 147–264.
  • Åqvist, L. (2003). Conditionality and Branching Time in Deontic Logic: Further Remarks on the Alchourrón and Bulygin (1983) Example. In Segerberg and Sliwinski (eds.) (2003) Logic, law, morality: thirteen essays in practical philosophy in honour of Lennart Åqvist, Uppsala philosophical studies 51, Uppsala: Uppsala University, pp. 13–37.
  • Åqvist, L. and Hoepelman, J. (1981). Some theorems about a ‘tree’ system of deontic tense logic. In R. Hilpinen (ed.) New Studies in Deontic Logic, D. Reidel, Dordrecht, pp. 187–221.
  • Bartha, P. (1993). Conditional obligation, deontic paradoxes, and the logic of agency. Annals of Mathematics and Artificial Intelligence 9, (1993), pp. 1–23.
  • Belnap, N., Perloff, M. and Xu, M. (2001). Facing the Future: Agents and Choices in Our Indeterminist World. Oxford: Oxford University Press.
  • Bonevac, D. (1998). Against Conditional Obligation. Noûs, Vol. 32 (March), pp. 37–53.
  • Carmo, J. and Jones, A. J. I. (2002). Deontic Logic and Contrary-to-duties. In Gabbay and Guenthner (eds.) (2002) Handbook of Philosophical Logic, vol 8, pp. 265–343.
  • Castañeda, H. -N. (1977). Ought, Time, and the Deontic Paradoxes. The Journal of Philosophy, Vol. 74, No. 12, pp. 775–791.
  • Castañeda, H. -N. (1981). The paradoxes of deontic logic: the simplest solution to all of them in one fell swoop. In R. Hilpinen (ed.) New Studies in Deontic Logic, D. Reidel, Dordrecht, pp. 37–85.
  • Chellas, B. F. (1969). The Logical Form of Imperatives. Stanford: Perry Lane Press.
  • Chellas, B. F. (1980). Modal Logic: An Introduction. Cambridge: Cambridge University Press.
  • Chisholm, R. M. (1963). Contrary-to-duty Imperatives and Deontic Logic. Analysis 24, pp. 33–36.
  • Cox, Azizah Al-Hibri. (1978). Deontic Logic: A Comprehensive Appraisal and a New Proposal. University Press of America.
  • Danielsson, S. (1968). Preference and Obligation: Studies in the Logic of Ethics. Filosofiska föreningen, Uppsala.
  • Decew, J. W. (1981). Conditional Obligations and Counterfactuals. The Journal of Philosophical Logic 10, pp. 55–72.
  • Feldman, F. (1986). Doing The Best We Can: An Essay in Informal Deontic Logic. Dordrecht: D. Reidel Publishing Company.
  • Feldman, F. (1990). A Simpler Solution to the Paradoxes of Deontic Logic. Philosophical Perspectives, vol. 4, pp. 309–341.
  • Fisher, M. (1964). A contradiction in deontic logic?, Analysis, XXV, pp. 12–13.
  • Forrester, J. W. (1984). Gentle Murder, or the Adverbial Samaritan. Journal of Philosophy, Vol. LXXXI, No. 4, pp. 193–197.
  • Gabbay, D., Horty, J., Parent, X., van der Meyden, R. & van der Torre, L. (eds.). (2013). Handbook of Deontic Logic and Normative Systems. College Publications.
  • Greenspan, P. S. (1975). Conditional Oughts and Hypothetical Imperatives. The Journal of Philosophy, Vol. 72, No. 10 (May 22), pp. 259–276.
  • Hansson, B. (1969). An Analysis of Some Deontic Logics. Noûs 3, pp. 373–398. Reprinted in Hilpinen, R. (ed.) (1971). Deontic Logic: Introductory and Systematic Readings. Dordrecht: D. Reidel Publishing Company, pp. 121–147.
  • Hilpinen, R. (ed). (1971). Deontic Logic: Introductory and Systematic Readings. Dordrecht: D. Reidel Publishing Company.
  • Hilpinen, R. (ed). (1981). New Studies in Deontic Logic: Norms, Actions, and the Foundation of Ethics. Dordrecht: D. Reidel Publishing Company.
  • Horty, J. F. (2001). Agency and Deontic Logic. Oxford: Oxford University Press.
  • Jones, A. and Pörn, I. (1985). Ideality, sub-ideality and deontic logic. Synthese 65, pp. 275–290.
  • Lewis, D. (1974). Semantic analysis for dyadic deontic logic. In S. Stenlund, editor, Logical Theory and Semantical Analysis, pp. 1–14. D. Reidel Publishing Company, Dordrecht, Holland.
  • Loewer, B. and Belzer, M. (1983). Dyadic deontic detachment. Synthese 54, pp. 295–318.
  • McNamara, P. (2010). Deontic Logic. In E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy.
  • Montague, R. (1968). Pragmatics. In R. Klibansky (ed.) Contemporary Philosophy: Vol. 1: Logic and the Foundations of Mathematics, pp. 102–122, La Nuova Italia Editrice, Firenze, (1968).
  • Mott, P. L. (1973). On Chisholm’s paradox. Journal of Philosophical Logic 2, pp. 197–211.
  • Meyer, J.-J. C. (1988). A Different Approach to Deontic Logic: Deontic Logic Viewed as a Variant of Dynamic Logic. Notre Dame Journal of Formal Logic, Vol. 29, Number 1.
  • Niles, I. (1997). Rescuing the Counterfactual Solution to Chisholm’s Paradox. Philosophia, Vol. 25, pp. 351–371.
  • Powers, L. (1967). Some Deontic Logicians. Noûs 1, pp. 361–400.
  • Prakken, H. and Sergot, M. (1996). Contrary-to-duty obligations. Studia Logica, 57, pp. 91–115.
  • Rescher, N. (1958). An axiom system for deontic logic. Philosophical studies, Vol. 9, pp. 24–30.
  • Rönnedal, D. (2009). Dyadic Deontic Logic and Semantic Tableaux. Logic and Logical Philosophy, Vol. 18, No. 3–4, pp. 221–252.
  • Rönnedal, D. (2012). Extensions of Deontic Logic: An Investigation into some Multi-Modal Systems. Department of Philosophy, Stockholm University.
  • Rönnedal, D. (2016). Counterfactuals in Temporal Alethic-Deontic Logic. South American Journal of Logic. Vol. 2, n. 1, pp. 57–81.
  • Rönnedal, D. (2018). Temporal Alethic Dyadic Deontic Logic and the Contrary-to-Duty Obligation Paradox. Logic and Logical Philosophy. Vol. 27, No 1, pp. 3–52.
  • Rönnedal, D. (2019). Contrary-to-duty paradoxes and counterfactual deontic logic. Philosophia, 47 (4), pp. 1247–1282.
  • Thomason, R. H. (1981). Deontic Logic as Founded on Tense Logic. In R. Hilpinen (ed.) New Studies in Deontic Logic, D. Reidel, Dordrecht, pp. 165–176.
  • Thomason, R. H. (1981b). Deontic Logic and the Role of Freedom in Moral Deliberation. In R. Hilpinen (ed.) New Studies in Deontic Logic, D. Reidel, Dordrecht, pp. 177–186.
  • Tomberlin, J. E. (1981). Contrary-to-duty imperatives and conditional obligations. Noûs 15, pp. 357–375.
  • van Eck, J. (1982). A system of temporally relative modal and deontic predicate logic and its philosophical applications. Logique et Analyse, Vol 25, No 99, pp. 249–290, and No 100, pp. 339–381. Original publication, as dissertation, Groningen, University of Groningen, 1981.
  • van der Torre, L. W. N. and Tan, Y. H. (1999). Contrary-To-Duty Reasoning with Preference-based Dyadic Obligations. Annals of Mathematics and Artificial Intelligence 27, pp. 49–78.
  • Wieringa, R. J. & Meyer, J.-J. Ch. (1993). Applications of Deontic Logic in Computer Science: A Concise Overview. In J.-J. Meyer and R. Wieringa, editors, Deontic Logic in Computer Science: Normative System Specification, pp. 17–40. John Wiley & Sons, Chichester, England.
  • van Fraassen, C. (1972). The Logic of Conditional Obligation. Journal of Philosophical Logic 1, pp. 417–438.
  • van Fraassen, C. (1973). Values and the Heart’s Command. The Journal of Philosophy LXX, pp. 5–19.
  • von Kutschera, F. (1974). Normative Präferenzen und bedingte Gebote. In Lenk, H. & Berkemann, J. (eds.) (1974), pp. 137–165.
  • von Wright, G. H. (1964). A new system of deontic logic. Danish yearbook of philosophy, Vol. 1, pp. 173–182.

 

Author Information

Daniel Rönnedal
Email: daniel.ronnedal@philosophy.su.se
University of Stockholm
Sweden

The Compactness Theorem

The compactness theorem is a fundamental theorem for the model theory of classical propositional and first-order logic. As well as having importance in several areas of mathematics, such as algebra and combinatorics, it also helps to pinpoint the strength of these logics, which are the standard ones used in mathematics and arguably the most important ones in philosophy.

The main focus of this article is the many different proofs of the compactness theorem, applying different Choice-like principles before later calibrating the strength of these and the compactness theorems themselves over Zermelo-Fraenkel set theory ZF. Although the article’s focus is mathematical, much of the discussion keeps an eye on philosophical applications and implications.

We first introduce some standard logics, detailing whether the compactness theorem holds or fails for these. We also broach the neglected question of whether natural language is compact. Besides algebra and combinatorics, the compactness theorem also has implications for topology and foundations of mathematics, via its interaction with the Axiom of Choice. We detail these results as well as those of a philosophical nature, such as apparent ‘paradoxes’ and non-standard models of arithmetic and analysis. We then provide several different proofs of the compactness theorem based on different Choice-like principles.

In later sections, we discuss several variations of compactness in logics that allow for infinite conjunctions / disjunctions or generalised quantifiers, and in higher-order logics. The article concludes with a history of the compactness theorem and its many proofs, starting from those that use syntactic proofs before moving to the semantic proofs model theorists are more accustomed to today.


Author Information

A. C. Paseau
Email: alexander.paseau@philosophy.ox.ac.uk
University of Oxford
United Kingdom

and

Robert Leek
Email: r.leek@bham.ac.uk
University of Birmingham
United Kingdom